Blocking User Agents

Post by T.J. » Tue Apr 22, 2003 7:45 am

I am sure this has been posted before, but I searched and couldn't find any good answers.

I have a site with a lot of pictures, and people like to use grabbers and rippers to download the images. I want to prevent this at all costs. Using information from other forums and from this one, I came up with this:
Code:
# Match known grabbers and harvesters by the start of the User-Agent header.
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^webbandit [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} ^Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^MSproxy/2\.0 [OR]
RewriteCond %{HTTP_USER_AGENT} ^MSFrontPage [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^Extractorpro [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebEMailExtrac [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^teleport\ pro [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^combine [OR]
RewriteCond %{HTTP_USER_AGENT} ^UtilMind [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bloodhound [OR]
RewriteCond %{HTTP_USER_AGENT} ^e-collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^htmlgobble [OR]
RewriteCond %{HTTP_USER_AGENT} ^JavaBee [OR]
RewriteCond %{HTTP_USER_AGENT} ^Robofox [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetcher [OR]
RewriteCond %{HTTP_USER_AGENT} ^tarspider [OR]
RewriteCond %{HTTP_USER_AGENT} ^webmirror [OR]
RewriteCond %{HTTP_USER_AGENT} ^webvac [OR]
RewriteCond %{HTTP_USER_AGENT} ^w3mir [OR]
RewriteCond %{HTTP_USER_AGENT} ^JoBo [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus [OR]
RewriteCond %{HTTP_USER_AGENT} ^Java [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline [OR]
RewriteCond %{HTTP_USER_AGENT} ^Larbin [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Crescent [OR]
RewriteCond %{HTTP_USER_AGENT} ^Email [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/2\.0 [OR]
RewriteCond %{HTTP_USER_AGENT} ^Down2Web [OR]
RewriteCond %{HTTP_USER_AGENT} ^Microsoft

I also want to add the following code and have it all work together, but with a different rule for the user agents than the one I have for HTTP_REFERER.
Code:
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://localhost/.*$     [NC]
RewriteCond %{HTTP_REFERER} !^http://xx.x.x.*/.*$     [NC]
RewriteCond %{HTTP_REFERER} !^http://xx.xxx.xx.xx/.*$     [NC]
RewriteCond %{HTTP_REFERER} !^http://domain1.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://domain2.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www.domain2.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www.domain1.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://forums.domain3.com/.*$ [NC]
RewriteRule .*\.(gif|GIF|jpg|JPG|mpg|avi|mp3|bmp|png|swf)        http://www.domain1.com:8080/images/eyes.gif    [L,R]

RewriteEngine is On


My problem is that the HTTP_USER_AGENT part doesn't work. I want everything in one .htaccess file, and I want the HTTP_REFERER rule to be separate from the HTTP_USER_AGENT rule, which should simply forward to a different domain name.



Please help.
T.J.
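
One possible layout for the setup described above, as a minimal sketch rather than a tested configuration. It relies on the fact that RewriteCond lines apply only to the single RewriteRule that directly follows them, so each group of conditions needs its own rule, and the last condition in a group must not carry [OR]. The target `http://www.example.com/` is a placeholder assumption (the post does not name the domain to forward to), and only a few agents and referers from the lists above are shown.

```apache
RewriteEngine On

# Group 1: user-agent rule.
# These conditions belong only to the RewriteRule right below them.
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
# Last condition of the group: no [OR] here.
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [NC]
# Placeholder target; replace with the domain the agents should be sent to.
RewriteRule .* http://www.example.com/ [R,L]

# Group 2: referer rule, evaluated independently of group 1.
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?domain1\.com/ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?domain2\.com/ [NC]
RewriteRule \.(gif|jpg|png)$ http://www.domain1.com:8080/images/eyes.gif [NC,R,L]
```

If the user-agent conditions are left without a RewriteRule of their own, they run on into the HTTP_REFERER conditions and all of them end up attached to the image rule, which would be one explanation for the user-agent part appearing to do nothing.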
 

Post by T.J. » Tue Apr 22, 2003 8:44 am

I have also tried the following code, and it doesn't work either. I am testing using Internet Exploder and Teleport Pro.
Code:
RewriteEngine on
# (testing purposes) RewriteCond %{HTTP_USER_AGENT}  ^Mozilla        [OR]
RewriteCond %{HTTP_USER_AGENT}  ^Pockey                  [OR]
RewriteCond %{HTTP_USER_AGENT}  ^NetMechanic             [OR]
RewriteCond %{HTTP_USER_AGENT}  ^SuperBot                [OR]
RewriteCond %{HTTP_USER_AGENT}  ^QRVA                    [OR]
RewriteCond %{HTTP_USER_AGENT}  ^WebMiner                [OR]
RewriteCond %{HTTP_USER_AGENT}  ^WebCopier               [OR]
RewriteCond %{HTTP_USER_AGENT}  ^Web\ Downloader         [OR]
RewriteCond %{HTTP_USER_AGENT}  ^WebMirror               [OR]
RewriteCond %{HTTP_USER_AGENT}  ^Offline                 [OR]
RewriteCond %{HTTP_USER_AGENT}  ^WebZIP                  [OR]
RewriteCond %{HTTP_USER_AGENT}  ^WebReaper               [OR]
RewriteCond %{HTTP_USER_AGENT}  ^Anarchie                [OR]
RewriteCond %{HTTP_USER_AGENT}  ^Mass\ Down              [OR]
RewriteCond %{HTTP_USER_AGENT}  ^Slurp                   [OR]
RewriteCond %{HTTP_USER_AGENT}  ^BlackWidow              [OR]
RewriteCond %{HTTP_USER_AGENT}  ^WebStripper             [OR]
RewriteCond %{HTTP_USER_AGENT}  ^Wget                    [OR]
RewriteCond %{HTTP_USER_AGENT}  ^WebHook                 [OR]
RewriteCond %{HTTP_USER_AGENT}  ^Scooter                 [OR]
# Last condition in the chain, so it carries no [OR].
RewriteCond %{HTTP_USER_AGENT}  ^Teleport
# Serve the error page to any agent matched above.
RewriteRule ^.*$ /pbourke/errors/robots.html     [L]
T.J.
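
The long chain of conditions above could also be collapsed into a single alternation, which removes the risk of a missing or stray [OR]. This is only a sketch under assumptions: it lists a handful of the agents named in the thread, and it answers with 403 Forbidden via [F] instead of rewriting to the error page used above, which is a swapped-in choice and not the poster's rule.

```apache
RewriteEngine On

# One case-insensitive alternation instead of one RewriteCond per agent.
RewriteCond %{HTTP_USER_AGENT} ^(WebCopier|WebZIP|WebStripper|Wget|Teleport) [NC]
# "-" leaves the URL untouched; [F] makes the server answer 403 Forbidden.
RewriteRule .* - [F]
```

Because the rule sends a status code and does not rewrite to another local URL, the request is not processed by the rules a second time.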
 

Post by hulkster » Thu Jul 15, 2004 8:30 pm

This is a pretty old thread, but I would be curious whether anyone has a more up-to-date list of the User Agent strings of spiders that are typically used to slurp web sites. For instance, I didn't see curl listed, and libcurl is probably a good string to match.
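
A minimal sketch of what conditions for curl might look like, assuming the client sends its default header. The curl command-line tool identifies itself with a string beginning `curl/`; programs built on libcurl send whatever their author sets (possibly nothing), so the unanchored `libcurl` pattern is a guess that only catches agents that include that token. The 403 response via [F] is likewise an assumption, not something taken from the earlier posts.

```apache
RewriteEngine On

# Default User-Agent of the curl command-line tool starts with "curl/".
RewriteCond %{HTTP_USER_AGENT} ^curl/ [NC,OR]
# Unanchored: matches "libcurl" anywhere in the header, if it is present at all.
RewriteCond %{HTTP_USER_AGENT} libcurl [NC]
# "-" means no substitution; [F] returns 403 Forbidden.
RewriteRule .* - [F]
```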

Yeah, I know that the bad guys can easily change the user agent to anything, so this is just a hurdle, but I had a recent experience with a guy named Graeme Kellett who scraped my entire site and replicated it on his site - I'd say that qualifies as a scumbag move in my book!