Complex rewrite problem

Post by kaos » Mon Dec 27, 2004 4:05 pm

Hi, I have a rewrite going on a site, but it's not fully doing the job. It's partly lifted from the net and partly figured out, but I just can't get my head round it properly. The rewrite is this:

Code:
RewriteEngine On
RewriteRule (.*)\.css $1.css [L]
RewriteRule ^$ /cgi-bin/page.cgi [L]
RewriteRule ^(.*).html /cgi-bin/page.cgi?g=$1.html [L]
RewriteRule ^ergo/([^/]+)/(.*)/$ /cgi-bin/jump.cgi?ID=$1 [L]
RewriteRule ^game/([^/]+)/(.*)/$ /cgi-bin/jump.cgi?ID=$1 [L]


It works for http://www.ergo247.com/Arts/index.html but not for http://www.ergo247.com/Arts or http://www.ergo247.com/Arts/

I need to figure out how to make it work for the last two URLs, with and without a trailing slash, as well as for one ending in index.html, and I just can't remember enough about rewriting to get it sorted out.

To complicate matters, it needs to cope with multiple directory levels, such as http://www.ergo247.com/Arts/something/ and http://www.ergo247.com/Arts/something/s ... something/ etc., also with and without a trailing slash as well as ending in index.html.

Some spidering bots are handling the site just fine, but some are failing with 404 errors when they request a URL that doesn't end in index.html :(

Can anyone help?

BTW, the webserver the site is on is down at the time of writing, unfortunately. It should hopefully come back up soon.

Post by Caterham » Tue Dec 28, 2004 4:27 am

Have a try with this:
Code:
RewriteEngine On
# prevent loop: do nothing, if these extensions are requested, last rule
RewriteRule (.+)\.(css|gif|jpg|js)$ - [L]
RewriteRule ^$ /cgi-bin/page.cgi [L]
RewriteRule ^ergo/([^/]+)/(.*)$ /cgi-bin/jump.cgi?ID=$1 [L]
RewriteRule ^game/([^/]+)/(.*)$ /cgi-bin/jump.cgi?ID=$1 [L]
# catch the rest, e.g. /bla/bla/index.html or /bla or /bla/bla/
# note: all files which should not be rewritten must be excluded, see 'prevent loop' above
RewriteRule ^(.+) /cgi-bin/page.cgi?g=$1.html [L]

kaos wrote: Some spidering bots are handling the site just fine but some are failing with 404 errors trying to access without url ending in index.html

Especially Yahoo.

hth,
Rob

Hi

Post by kaos » Tue Dec 28, 2004 5:26 pm

Thanks, but that seems to cause an internal server error.

Post by Caterham » Tue Dec 28, 2004 6:11 pm

Oh... try adding cgi to the line:
Code:
RewriteRule (.+)\.(css|gif|jpg|js|cgi)$ - [L]
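
For what it's worth, the internal server error presumably comes from the catch-all rule: in .htaccess context the rules run again on the rewritten URL, so cgi-bin/page.cgi itself matches ^(.+) and is rewritten over and over until Apache gives up. [L] only ends the current pass. A minimal alternative sketch, assuming the rules live in the document root's .htaccess and both scripts sit under /cgi-bin/, is to exclude that whole directory rather than the .cgi extension:

Code:
# Hypothetical alternative loop guard (assumes .htaccess in the document root)
# Leave anything already pointing into cgi-bin/ untouched and stop this pass
RewriteRule ^cgi-bin/ - [L]

This would go right after RewriteEngine On, ahead of the other rules.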

Thanks

Post by kaos » Wed Dec 29, 2004 9:06 am

Great help so far, thanks :)

That worked great for URLs ending with or without a forward slash, but not for ones ending in index.html, which gave an error. I mucked about with it a little and got it working like this:

Code:
RewriteEngine On
# prevent loop: do nothing, if these extensions are requested, last rule
RewriteRule (.+)\.(css|gif|jpg|js|cgi)$ - [L]
RewriteRule ^$ /cgi-bin/page.cgi [L]
RewriteRule ^ergo/([^/]+)/(.*)$ /cgi-bin/jump.cgi?ID=$1 [L]
RewriteRule ^game/([^/]+)/(.*)$ /cgi-bin/jump.cgi?ID=$1 [L]
# catch the rest, e.g. /bla/bla/index.html or /bla or /bla/bla/
# note: all files which should not be rewritten must be excluded, see 'prevent loop' above
RewriteRule ^(.*).html /cgi-bin/page.cgi?g=$1.html [L]
RewriteRule ^(.+) /cgi-bin/page.cgi?g=$1index.html [L]


I'm not sure if that's the best way; perhaps there's a cleaner way to do it?

Post by Caterham » Thu Dec 30, 2004 1:34 pm

I don't think there's a better way to rewrite this if you need index.html as the default when no file is requested.

But you might get a "wrong result" with the last rule if /dir is requested (without the trailing slash): your variable g would become dirindex.html instead of dir/index.html. You can fix this with a rewrite rule, see below.
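
To make the problem concrete, here is a hand trace of how the last rule in kaos's version builds g for a few request forms, assuming none of the earlier rules matched:

Code:
# RewriteRule ^(.+) /cgi-bin/page.cgi?g=$1index.html [L]
#   Arts/            ->  g=Arts/index.html
#   Arts/something/  ->  g=Arts/something/index.html
#   Arts             ->  g=Artsindex.html   (slash missing, wrong file)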

Code:
RewriteEngine On
# prevent loop: do nothing, if these extensions are requested, last rule
RewriteRule (.+)\.(css|gif|jpg|js|cgi)$ - [L]
RewriteRule ^$ /cgi-bin/page.cgi [L]
RewriteRule ^ergo/([^/]+)/(.*)$ /cgi-bin/jump.cgi?ID=$1 [L]
RewriteRule ^game/([^/]+)/(.*)$ /cgi-bin/jump.cgi?ID=$1 [L]
# catch the rest, e.g. /bla/bla/index.html or /bla or /bla/bla/
# note: all files which should not be rewritten must be excluded, see 'prevent loop' above
RewriteRule ^(.*)\.html$ /cgi-bin/page.cgi?g=$1.html [L]
# fix missing trailing slash
RewriteRule !\..{3,4}$ - [C]
RewriteCond %{REQUEST_URI} !^.*/$
RewriteRule ^(.+)$ $1/ [R=301,L]
RewriteRule ^(.+) /cgi-bin/page.cgi?g=$1index.html [L]
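
One caveat, hedged because it depends on the setup: in .htaccess context a relative redirect target such as $1/ can get the directory's filesystem path prefixed unless RewriteBase is set, so the 301 may end up pointing at the wrong URL. A minimal sketch of the same trailing-slash fix with a root-relative target, assuming the rules sit in the document root's .htaccess (the two conditions stand in for the chained rule above):

Code:
# Hypothetical variant of the trailing-slash fix (assumes .htaccess in the document root)
# Skip anything that looks like a file: a dot plus 2-4 characters at the end
RewriteCond %{REQUEST_URI} !\.[^./]{2,4}$
# Skip URLs that already end in a slash
RewriteCond %{REQUEST_URI} !/$
# Redirect /dir to /dir/; the leading slash makes the target root-relative
RewriteRule ^(.+)$ /$1/ [R=301,L]

Like the original, it belongs just before the final index.html rule, so that rule only ever sees paths that already end in a slash.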


Bob