Help - Stop Spidering a specific Directory

Postby midway » Mon Sep 10, 2007 2:28 pm

I would like to prevent spiders from crawling a download directory. Since not all spiders respect the robots.txt file, and people can discover a site's structure by reading it, I would like to use .htaccess to deny spiders while still allowing users access without a password.

Here is an example: let's say I have a PDF file in a download directory. I would like to keep that file name from being discovered by a spider.

I have searched this forum and found nothing that seems to do what I need, but I may have missed it because I'm new to mod_rewrite and .htaccess.

Please let me know if this is possible. Some sample code would be greatly appreciated :D

Thanks in advance for your help.

mw
midway
 
Posts: 2
Joined: Mon Sep 10, 2007 2:21 pm

Postby richardk » Mon Sep 10, 2007 2:59 pm

You can block by user agent. Try the following in a .htaccess file in the directory you would like to protect:
Code:
Options +FollowSymLinks

RewriteEngine On

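# Match the User-Agent header against the listed bot names, case-insensitively
RewriteCond %{HTTP_USER_AGENT} (googlebot|slurp|another|etc) [NC]
# Answer every request from a matching client with 403 Forbidden
RewriteRule .* - [F,L]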


You have to add the bad bots' user agents inside the (), separated by |.
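
For the PDF example from the question, a narrower variant might look like the sketch below. It assumes the goal is to block only requests ending in .pdf and to leave everything else in the directory alone. The two bot names are placeholders, not a vetted list. Keep in mind that the User-Agent header is supplied by the client, so a spider that sends a browser's user agent will still get through.

```apache
Options +FollowSymLinks

RewriteEngine On

# Placeholder bot names (assumption); replace with the user agents you want to block
RewriteCond %{HTTP_USER_AGENT} (googlebot|slurp) [NC]
# Only requests whose path ends in .pdf get 403 Forbidden; other files are untouched
RewriteRule \.pdf$ - [F,NC]
```

The [F] flag already stops rule processing for the request, so no separate [L] is needed here.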
richardk
 
Posts: 8800
Joined: Wed Dec 21, 2005 7:50 am

Thanks

Postby midway » Mon Sep 10, 2007 4:09 pm

Now it clicks. Thanks

mw
midway
 
Posts: 2
Joined: Mon Sep 10, 2007 2:21 pm
