URL Redirection with Mod-Rewrite
Using Regular Expressions
Understanding How to Create Regular Expressions to Use in Mod-Rewrite URL Redirection
When using Mod-Rewrite for URL Redirection, it is very important to be able to use regular expressions. Why? Because if you can't, you will need a separate rule (or condition) for every page you want to rewrite. (On one Macwom's, Inc. site, this is over 500 pages... Not very practical.)
(If you absolutely do not want to use regular expressions, or have a site where learning them would be more work than rewriting a few URLs, they are not required for mod_rewrite to function.)
I guess I should start by describing a regular expression. (They aren't too scary once you get to know them.) A regular expression is basically a small piece of code that checks for patterns. The pattern can range from one that matches a single character to one that matches absolutely everything.
Regular Expression Pre-qualifier... these definitions describe how regular expressions are generally used in htaccess files. Most of them apply globally, but some may not.
There are some predefined 'terms' in regular expressions to make your life easier. (At least, they are supposed to make your life easier.) Here is a short list, with what each does in the mod-rewrite setting.
[ ] enclose a set of characters, any one of which may match at that position. (Used for determining the characters, or range of characters, to be matched.)
letter-letter (EG [a-z] matches any single lowercase alphabetical character in the range of a to z), so [c-e] will match any single character that is the lowercase letter c, d, or e.
LETTER-LETTER (EG [A-Z] matches any single capital alphabetical character in the range of A to Z), so [C-E] will match any single character that is the capital letter C, D, or E.
number-number (EG [0-9] matches any single digit in the range of 0 to 9), so [4-6] would match any single digit 4, 5, or 6.
^ has two purposes. When used as the first character inside of [ ], it designates 'not'. (EG [^0-9] would match any character that is not 0 to 9 and [^abc] would match any character that is not a lowercase a, b, or c.) When used outside of [ ] in mod-rewrite, it designates the beginning of a 'line'.
It is very important to understand and remember [dog] does not match the word 'dog', it matches any individual lowercase letter d, o, or g anywhere in the comparison. In the same way, [^dog] does not exclude the word 'dog' from matching, it excludes the lowercase letters d, o, or g from matching individually.
To match a 'word' or a group of characters in order, write the characters in sequence outside of [ ], and use () to group them, so (dog) would match the word dog, and not d, o, or g as a single character.
. (dot) matches any single character, except the end of a line.
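The difference between [ ], [^ ], ( ), and a leading ^ is easier to see when the patterns are run against a few strings. The following is a minimal sketch that uses Python's re module as a stand-in for the regular expression engine (an assumption made for illustration; these basic constructs behave the same way there). The helper name matches and the test strings are hypothetical.

```python
import re

# Python's re module stands in for the regex engine here (illustration only).

def matches(pattern, text):
    """Return True if the pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None

# [dog] matches any single d, o, or g, so the 'g' in 'pig' is enough.
print(matches(r"[dog]", "pig"))      # True

# (dog) needs the letters d, o, g together and in that order.
print(matches(r"(dog)", "pig"))      # False
print(matches(r"(dog)", "hotdog"))   # True

# [^dog] matches any single character that is NOT d, o, or g.
print(matches(r"[^dog]", "dog"))     # False: every character is excluded
print(matches(r"[^dog]", "dogs"))    # True: the 's' matches

# Outside of [ ], ^ anchors the match to the beginning of the text.
print(matches(r"^[c-e]", "dog"))     # True: starts with d
print(matches(r"^[c-e]", "fed"))     # False: starts with f
```

Note that [^dog] still matches 'dogs', because it only needs one character that is not d, o, or g.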
+ matches 1 or more of the characters or set of characters immediately before it. (EG a+ would match the lowercase letter 'a' 1 or more times, while [a-z]+ would match 1 or more lowercase letters from 'a to z'.)
? matches 0 or 1 of the characters or set of characters immediately before it. (EG a? would match the lowercase letter 'a' 0 or 1 time, while [a-z]? would match any lowercase letter from 'a to z' 0 or 1 time.)
* matches 0 or more of the characters or set of characters immediately before it, as many times as it can. (EG a* would match the lowercase letter 'a' 0 or more times.) Because it is also satisfied by nothing at all, it can match in places you did not intend, so use + whenever at least one character is required and keep * for when it is absolutely necessary.
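The quantifiers and the dot can be compared side by side in the same way. This is again only a sketch, with Python's re module standing in for the regular expression engine (an assumption for illustration); the helper name first_match and the sample strings are hypothetical.

```python
import re

# Python's re module stands in for the regex engine here (illustration only).

def first_match(pattern, text):
    """Return the first matched text, or None if nothing matches."""
    found = re.search(pattern, text)
    return found.group(0) if found else None

# + needs at least one 'a' and takes as many as it can.
print(repr(first_match(r"a+", "baaad")))       # 'aaa'
print(repr(first_match(r"a+", "bd")))          # None

# ? makes the character before it optional (0 or 1 time).
print(repr(first_match(r"colou?r", "color")))   # 'color'
print(repr(first_match(r"colou?r", "colour")))  # 'colour'

# * is satisfied by zero characters, so it "matches" even with no 'a' present.
print(repr(first_match(r"a*", "bd")))          # ''

# . matches any single character except a line ending.
print(repr(first_match(r"b.d", "bad")))        # 'bad'
print(repr(first_match(r"b.d", "b\nd")))       # None
```

The empty match returned for a* is the reason + is usually the safer choice when at least one character has to be present.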
These are the basic building blocks of regular expressions as used in htaccess and associated with mod-rewrite. By themselves, they do little, but when you put them together, they become very powerful.
Regular Expression Examples