Answered

Specific site using HTTP 404 to circumvent ban detection

I've recently seen some important sites returning HTTP 404s instead of other status codes to ban IPs. This behavior appears only when I use Crawlera, not with other proxies or IPs using the same headers, randomized intervals, etc., and it circumvents Crawlera's ban detection.


Anyone else run into this, and are there any possible fixes down the road?




Best Answer

I've added a ban rule to handle these 404 cases, so Crawlera will retry the request with a different IP if it receives this response.


About 50-75% of pages need to be retried multiple times before succeeding.
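
For anyone who wants a client-side fallback as well, here is a minimal sketch of retrying when a 404 is suspected to be a disguised ban. It is not Crawlera's ban rule, just an illustration of the same idea. The names `fetch_with_retries`, `SUSPECTED_BAN_STATUSES`, and the `fetch` callable are hypothetical; `fetch` stands for whatever function performs one request through your proxy and returns `(status_code, body)`.

```python
from typing import Callable, Tuple

# Hypothetical: status codes this particular site is assumed to use as a disguised ban.
SUSPECTED_BAN_STATUSES = {404}


def fetch_with_retries(
    fetch: Callable[[str], Tuple[int, str]],
    url: str,
    max_attempts: int = 5,
) -> Tuple[int, str, int]:
    """Call fetch(url) until the status is not a suspected ban or attempts run out.

    `fetch` is a caller-supplied function that performs one request (for example,
    through a rotating proxy) and returns (status_code, body).
    Returns (status, body, attempts_used).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status, body = 0, ""
    for attempt in range(1, max_attempts + 1):
        status, body = fetch(url)
        if status not in SUSPECTED_BAN_STATUSES:
            return status, body, attempt
    # Every attempt looked like a ban; the 404 may also be genuine.
    return status, body, max_attempts
```

Note that a genuine 404 will also be retried until `max_attempts` is used up, so this probably only makes sense for sites already known to behave this way.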


