How to handle 302 redirects?

Posted over 4 years ago by aimering


I read that Crawlera treats a 302 redirect as a successful request, but what if it's actually an anti-spider response from the server? This happened to me when I tried to use the POST method, only to be rebuffed and redirected to an authentication page. Is there a way to manually ask Crawlera to retry with a new IP address when that happens?

2020-06-24 20:37:56 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET http://wenshuapp.court.gov.cn/authenticatin/require> from <POST http://wenshuapp.court.gov.cn/appinterface/rest.q4w/>
2020-06-24 20:38:00 [scrapy.core.engine] DEBUG: Crawled (401) <GET http://wenshuapp.court.gov.cn/authenticatin/require> (referer: None)

 


thriveni posted over 4 years ago Admin

Hello,


We can add this case as a ban rule, so that when Crawlera receives such a response it treats it as a ban and retries the request.

However, we would first need to check the flow of the pages and whether the redirects are valid: does the site require authentication, and does retrying the request return a successful response?
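
Until such a rule is in place, the same check can be approximated on the client side. The following is only a rough sketch, not Crawlera's actual ban-rule mechanism: the names `BAN_PATH_MARKERS`, `looks_like_ban` and `next_action` are made up for illustration, and the marker path is simply copied from the redirect target in the log above.

```python
from urllib.parse import urlparse

# Hypothetical marker list: path prefixes that identify the site's
# authentication wall. The value is taken from the redirect target in
# the log above; adjust it for the site being crawled.
BAN_PATH_MARKERS = ("/authenticatin/require",)

REDIRECT_STATUSES = (301, 302, 303, 307, 308)


def looks_like_ban(status, location, markers=BAN_PATH_MARKERS):
    """Return True if a redirect points at a known anti-bot page."""
    if status not in REDIRECT_STATUSES or not location:
        return False
    # Works for absolute URLs and for relative Location headers alike.
    path = urlparse(location).path
    return any(path.startswith(marker) for marker in markers)


def next_action(status, location, attempts, max_attempts=3):
    """Decide how to treat a response: 'accept', 'retry' or 'give_up'."""
    if not looks_like_ban(status, location):
        return "accept"
    return "retry" if attempts < max_attempts else "give_up"
```

In a Scrapy spider, the 302 can be made to reach the callback by setting `dont_redirect: True` and `handle_httpstatus_list: [302]` in the request's `meta`. The callback can then pass `response.status` and the decoded `Location` header to `next_action`, and on `"retry"` yield `response.request.replace(dont_filter=True)`. Whether the retried request goes out from a different IP depends on how the proxy is configured.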
