Redirect all urls to lowercase

Discuss practical ways to rearrange URLs using mod_rewrite.

Post by Haran » Mon Jun 01, 2009 7:23 am

Hi guys,
The links in my CMS contain, by default, a lot of uppercase names and other ugly characters (commas, colons, double quotes, apostrophes, etc.) that are already indexed in Google and other search engines.
Now that I can edit the CMS code to make all the URLs lowercase and to remove or replace the other ugly characters, I need to redirect all the old URLs to the new ones.
Here is an example:

OLD: http://haran.local/EN/ITALY-holiday-gui ... HUB-REGCAM

NEW: http://haran.local/en/italy-holiday-gui ... HUB-REGCAM

Can I use a ruleset like the following one to make the URLs lowercase?

RewriteEngine On
RewriteMap lc int:tolower
RewriteCond %{REQUEST_URI} [.*]
RewriteRule (.*) ${lc:$1} [R=301]

What about the other characters? Should I use an external RewriteMap?



Thanks,
Haran

Post by Haran » Tue Jun 02, 2009 2:35 am

Let me be more specific. Given an old URL, I need to:

- get rid of the ugly characters by replacing them with hyphens (i.e. spaces and ') or nothing (i.e. ,;:")
- make everything lower case
- append a slash at the end of the URL
- redirect (301) the old URLs to the new ones

Please have a look at the following rewrite rule, which is currently working on the server.

Code: Select all
RewriteRule ^([^/]+)/([^/]+)-travel-([^/]+)/([^/]+)/([^/]+)/([^/]+)-([^/]+)$ /public/region.cgi?cat=$3&area1=$2&area2=$4&des=$5&cdo=$6&loc=$7&lng=$1 [L]


Thanks,
Haran

Post by richardk » Tue Jun 02, 2009 11:28 am

The following should lowercase all URLs.
Code: Select all
Options +FollowSymLinks

RewriteEngine On

RewriteMap lowercase int:tolower

RewriteRule [A-Z] ${lowercase:%{REQUEST_URI}} [R=301,L]

The RewriteMap directive has to be defined in the server configuration (httpd.conf, at server or virtual host level), not in a .htaccess file.
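
As a rough sketch of where that could go, assuming the site runs as a name-based virtual host (the DocumentRoot path below is a placeholder, not taken from the original setup):

```apache
<VirtualHost *:80>
    ServerName haran.local
    # Placeholder path; use the site's real document root.
    DocumentRoot /var/www/haran

    RewriteEngine On

    # RewriteMap is only valid in server config or virtual host context.
    RewriteMap lowercase int:tolower

    # Any uppercase letter in the path triggers a 301 to the lowercased URI.
    RewriteRule [A-Z] ${lowercase:%{REQUEST_URI}} [R=301,L]
</VirtualHost>
```

Once the map is defined there, rules in a .htaccess file can still reference ${lowercase:...}. Only the definition itself is restricted.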

Haran wrote:- get rid of the ugly characters by replacing them with hyphens (i.e. spaces and ') or nothing (i.e. ,;:")

While that is possible with mod_rewrite, it would probably be easier to route all the requests to a script that did it (and the rest) in one go. Are you using Perl?

Post by Haran » Tue Jun 02, 2009 11:56 pm

richardk wrote:While that is possible with mod_rewrite, it would probably be easier to route all the requests to a script that did it (and the rest) in one go. Are you using Perl?


Thanks richardk, your suggestion worked perfectly :)
The CMS I'm using is written in Perl.

Could you please give me an example (or a reference) for both of the solutions you suggested, mod_rewrite and the script?


Ale

Post by richardk » Wed Jun 03, 2009 12:58 pm

To use Perl
Code: Select all
Options +FollowSymLinks

RewriteEngine On

# If it includes any characters that are not listed
RewriteCond %{REQUEST_URI} [^a-z0-9/_-]
# and it's the correct format send the request to /redirect.cgi
RewriteRule ^[^/]+/[^/]+-travel-[^/]+/[^/]+/[^/]+/[^/]+-[^/]+$ /redirect.cgi [QSA,L]

then in /redirect.cgi (not tested)
Code: Select all
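#!/usr/bin/perl
use strict;
use warnings;
use CGI;

my $query = CGI->new;

# Work on the path only; keep any query string untouched.
my ($u, $qs) = split /\?/, $ENV{'REQUEST_URI'}, 2;
# Decode %xx escapes first so that e.g. %2C does not leave "2c" behind.
$u =~ s/%([0-9a-fA-F]{2})/chr(hex($1))/ge;
$u = lc($u);
# Spaces and apostrophes become hyphens; anything else not listed is dropped.
$u =~ s/[ ']/-/g;
# Braces as delimiters, because the character class contains a "/".
$u =~ s{[^a-z0-9/_-]}{}g;
$u .= "?$qs" if defined $qs && length $qs;
my $h = 'http://' . $ENV{'HTTP_HOST'};
# redirect() sends a 302 by default, so ask for a 301 explicitly.
print $query->redirect(-uri => $h . $u, -status => '301 Moved Permanently');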
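
To check the clean-up logic separately from the CGI plumbing, the same steps can be put into a small sub and run from the command line. This is only a sketch: the name clean_path and the sample path are made up for illustration, and it assumes spaces and apostrophes should become hyphens while the other unwanted characters are dropped, as asked earlier in the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: normalise a URL path along the lines discussed above.
sub clean_path {
    my ($path) = @_;
    $path =~ s/%([0-9a-fA-F]{2})/chr(hex($1))/ge;  # decode %xx escapes
    $path = lc $path;                              # lowercase everything
    $path =~ s/[ ']/-/g;                           # spaces, apostrophes -> hyphen
    $path =~ s{[^a-z0-9/_-]}{}g;                   # drop everything else
    return $path;
}

# Made-up sample path, only to exercise the sub.
print clean_path("/EN/ITALY-travel-GUIDE/Some%20Place,%20X/"), "\n";
# prints /en/italy-travel-guide/some-place-x/
```

Running a handful of real old URLs through a sub like this before enabling the redirect makes it easier to confirm that each one lands on a URL the CMS actually serves.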


Edit: and for the mod_rewrite, you can try
Code: Select all
Options +FollowSymLinks -MultiViews

RewriteEngine On

RewriteMap lowercase int:tolower

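# Match the correct format.
RewriteCond %{REQUEST_URI} ^/[^/]+/[^/]+-travel-[^/]+/[^/]+/[^/]+/[^/]+-[^/]+/?$
# Remove one unwanted character (hyphens and underscores are kept) and store the result in NU.
RewriteRule ^(.*)(%[0-9a-z]{2}|[^0-9a-z/_-])(.*)$ $1$3 [NC,E=NU:$1$3,QSA]

# Once only allowed characters remain, redirect to the lowercased URL.
RewriteCond %{ENV:NU} ^[0-9a-z/_-]+$ [NC]
RewriteRule . /${lowercase:%{ENV:NU}|lcfail} [R=301,L]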

