Very severe mod_rewrite bug returns random documents

Discuss practical ways to rearrange URLs using mod_rewrite.

Postby Guest » Wed Jul 10, 2002 4:30 pm

I'm the webadmin for a large IC manufacturer. Recently, we installed mod_rewrite and a simple perl script (rewrite program) to serve up "search engine friendly" URLs.

http://my.company.com?datasheet=1234
becomes
http://my.company.com/datasheet/1234

This is so the majority of search engines continue indexing past the "?". It's worked fine for a week or so, but recently we found a very interesting (and very embarrassing) bug. Occasionally, something in mod_rewrite gets "confused" and begins returning HTML documents that other users have requested. The end result is hundreds of customers clicking on links within our site, and they ALL get seemingly random documents. Click a datasheet for an IC and get an appnote on a voltage regulator. The problem continues until someone (i.e. me) restarts the Apache process. That was the case until we wrote a background script that watches a given page for random content and restarts Apache automatically. Still, we get a few moments of embarrassment, and pretty angry visitors who have no idea why our site returns random pages.

We're not using VirtualHost in this case, have implemented the RewriteLock file, and the rewrite program (perl script) is pretty basic and is not the culprit. I've searched and searched the net and haven't found any reports similar to this. It only affects pages in our site which have their URLs rewritten via mod_rewrite, not standard static HTML documents (which don't match the regex and so don't get passed to the rewrite program).

We have a pretty busy site, about 200,000 hits per day, so I'm wondering if the problem starts once two visitors "collide" at the same moment, and the "pipe" back from mod_rewrite gets out of whack and starts serving customer A with request B, rather than customer A with request A, etc.

The relevant section from our httpd.conf:

RewriteEngine on
RewriteLock /export/home/local/apache/logs/rewrite.lock
RewriteMap sefmap prg:/export/home/local/apache/bin/rwmapd
RewriteRule ^(/.+.cfm/.+)$ ${sefmap:$1}
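
For reference, here is a minimal sketch of the contract a prg: map program has to honor: Apache starts it once, sends one lookup key per line on stdin, and expects exactly one line back on stdout per key (the string NULL when there is no match), written without buffering. This is a Python stand-in, not the Perl script described in this post, and the mapping scheme in lookup() is a made-up assumption. If a program like this ever wrote zero or two lines for a key, later answers could plausibly pair up with the wrong requests, though nothing in this thread confirms that is what happened here.

```python
import sys


def lookup(key):
    """Map '/page.cfm/name/value' to '/page.cfm?name=value'.

    The URL scheme is hypothetical; it only illustrates a one-key-in,
    one-value-out lookup. Returns 'NULL' when the key does not fit.
    """
    head, sep, tail = key.partition(".cfm/")
    parts = tail.split("/")
    if not sep or len(parts) != 2 or not all(parts):
        return "NULL"
    return f"{head}.cfm?{parts[0]}={parts[1]}"


def serve(stdin=sys.stdin, stdout=sys.stdout):
    """Answer every input line with exactly one output line, flushed at once."""
    for line in stdin:
        stdout.write(lookup(line.rstrip("\n")) + "\n")
        # Flush after each answer so Apache is never left waiting on a buffer.
        stdout.flush()
```

Calling serve() at the bottom of the file would turn this into the long-running program named in the RewriteMap line. The point of the design is that every code path, including failures, writes exactly one newline-terminated answer.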

Any ideas? Has anyone heard of this?
Guest
 

Postby Brett » Mon Aug 05, 2002 11:55 am

I've never seen anything like that, but what you can do is to set up datasheet as a CGI script and have it parse the REQUEST_URI in order to provide the appropriate output. No need for mod_rewrite. 8)
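
A rough sketch of that idea, in Python rather than Perl: the script pulls the datasheet number out of REQUEST_URI itself. It assumes the server is already configured to run a CGI script for /datasheet, that IDs are purely numeric, and that a plain-text placeholder body is good enough for illustration; the names datasheet_id and respond are made up for this example.

```python
import os
import re

# Assumed URL shape: /datasheet/<digits>, optional trailing slash or query.
DATASHEET_RE = re.compile(r"^/datasheet/(\d+)/?(?:\?.*)?$")


def datasheet_id(request_uri):
    """Return the datasheet number as a string, or None if the URI does not match."""
    match = DATASHEET_RE.match(request_uri)
    return match.group(1) if match else None


def respond(environ=os.environ):
    """Build a complete CGI response (headers, blank line, body) as a string."""
    ds = datasheet_id(environ.get("REQUEST_URI", ""))
    if ds is None:
        return ("Status: 404 Not Found\r\n"
                "Content-Type: text/plain\r\n\r\n"
                "No such datasheet\n")
    # A real script would look the document up here; this is a placeholder.
    return f"Content-Type: text/plain\r\n\r\nDatasheet {ds}\n"
```

Since each request is parsed inside its own CGI process, there is no shared pipe between requests to get out of step.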
Brett
 
Posts: 82
Joined: Tue Jul 10, 2001 4:00 pm
Location: yohost.com

Postby Tom Kagan » Tue Feb 04, 2003 9:28 pm

I don't think this is a problem with mod_rewrite. Two thoughts:

1. Your perl script is running under mod_perl and is not properly written to work in this fashion.

2. Your script is not sending proper cache control headers causing downline proxy servers to provide a wrong page to the user.

My guess is #2. Proxy caches usually cache URLs of the /foo/bar form and mostly don't do anything with ?foo&bar query URLs.
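
To show what I mean by cache control headers, here is a small Python sketch of a CGI response that asks caches not to reuse the page without checking back. The helper names are made up, and whether you want "no-cache" or something more permissive for public datasheets is a judgment call; treat the values as one possible choice, not a recommendation for your site.

```python
def no_cache_headers():
    """Headers asking HTTP/1.1 and older HTTP/1.0 caches to revalidate first."""
    return [
        ("Cache-Control", "no-cache, must-revalidate"),  # HTTP/1.1 caches
        ("Pragma", "no-cache"),                          # HTTP/1.0 caches
        ("Expires", "0"),                                # invalid date, treated as already expired
    ]


def cgi_response(body, content_type="text/html"):
    """Return headers, a blank line, and the body as one CGI response string."""
    headers = [("Content-Type", content_type)] + no_cache_headers()
    head = "".join(f"{name}: {value}\r\n" for name, value in headers)
    return head + "\r\n" + body
```

If the wrong pages stop once headers like these go out, that would point at a proxy rather than at mod_rewrite.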

Look at http://www.web-caching.com/ for more information on caching.
Tom Kagan
 
Posts: 2
Joined: Tue Jan 28, 2003 10:23 am
Location: New York, NY USA

