Went through this a year or so ago. The Baidu spider was hammering a forum and I blocked it because after hundreds of thousands of hits they sent me a grand total of 1 visitor. Let it sit for a while and then removed the block to see what would happen.
Today I noticed that Baidu was viewing pages on my forum that should have been excluded through robots.txt. Specifically, Baidu was looking at the print view of threads, which is something I didn't want crawled.
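For reference, the robots.txt rule that was supposed to keep crawlers out of the print views looked something like this (the exact path depends on your forum software; /printthread.php here is just an illustration):

User-agent: *
Disallow: /printthread.php

Well-behaved crawlers honor this, but as the rest of this post shows, you can't count on it.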
So a bit of .htaccess work and Baidu is blocked again.
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} Baiduspider [NC]
RewriteRule .* - [F]
Now all they get is a 403 Forbidden error.