I've recently seen some important sites returning HTTP 404s instead of other status codes to ban IPs. This behavior appears only when I use Crawlera, not with other proxies or IPs using the same headers, randomized intervals, etc., and it circumvents Crawlera's ban detection.
Has anyone else run into this, and are there any possible fixes down the road?
0 Votes
nestor posted almost 7 years ago (Admin, Best Answer)
I've added a ban rule to handle these 404 cases, so Crawlera will retry the request with a different IP if it receives this response.
2 Votes
xingzhouliu posted almost 7 years ago
About 50-75% of pages need to be retried multiple times before succeeding.
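For anyone who wants a client-side fallback while pages still need several attempts, here is a minimal sketch of the same idea as the ban rule above: treat a 404 as a possible ban and retry. This is not Crawlera's implementation. The names `fetch_with_ban_retry`, `Fetcher`, and `fake_fetch` are made up for illustration, and it assumes each call to your own `fetch` function goes out through the proxy, so a retry may land on a different IP.

```python
import time
from typing import Callable, Iterable, Tuple

# Hypothetical type: a fetcher takes a URL and returns (status_code, body).
# In practice this would wrap whatever HTTP client you route through the proxy.
Fetcher = Callable[[str], Tuple[int, str]]


def fetch_with_ban_retry(
    fetch: Fetcher,
    url: str,
    ban_statuses: Iterable[int] = (404,),
    max_attempts: int = 5,
    base_delay: float = 0.0,
) -> Tuple[int, str, int]:
    """Call fetch(url) until the status is not a suspected ban or attempts run out.

    Returns (status, body, attempts_used). If every attempt looks like a ban,
    the last response is returned so the caller can decide what to do with it.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    banned = set(ban_statuses)
    status, body = 0, ""
    for attempt in range(1, max_attempts + 1):
        status, body = fetch(url)
        if status not in banned:
            return status, body, attempt
        # Optional exponential backoff between attempts (disabled when base_delay is 0).
        if attempt < max_attempts and base_delay > 0:
            time.sleep(base_delay * 2 ** (attempt - 1))
    return status, body, max_attempts


if __name__ == "__main__":
    # Simulated responses: two suspected bans, then a real page.
    responses = iter([(404, ""), (404, ""), (200, "<html>ok</html>")])

    def fake_fetch(url: str) -> Tuple[int, str]:
        return next(responses)

    print(fetch_with_ban_retry(fake_fetch, "https://example.com/page"))
    # prints (200, '<html>ok</html>', 3)
```

Keep in mind that a page that really doesn't exist will also return 404, so with this approach it costs `max_attempts` requests before you give up on it. A low cap is probably sensible.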