
How to capture redirected sites using the Crawlera proxy with CasperJS

I am having trouble using the click function in CasperJS on anything that takes me to a new page. On the initial site, capture() takes a screenshot and getHTML() returns the markup. But as soon as I click anything that takes me to another page, everything goes blank: capture() returns a blank screen and getHTML() returns '<html><head></head><body></body></html>'. I have attached a basic script that demonstrates the problem. I would love a workaround, or to know what I am doing wrong. If I comment out the proxy settings, the redirected page works, but that's not the point: I want to know how to use the proxy while progressing through a website.

Attachment: js (1.24 KB)
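The attached script is not shown here, but a minimal reproduction might look like the sketch below. The URL and selector are placeholders, and the script would be run through Crawlera with something like `casperjs --proxy=proxy.crawlera.com:8010 --proxy-auth=<APIKEY>: script.js`:

```javascript
// Hypothetical target page and link selector for this sketch.
var TARGET_URL = 'http://example.com/';
var NEXT_LINK = 'a.next-page';

// The guard lets this file load outside PhantomJS/CasperJS as well;
// under casperjs, `phantom` is defined and the script runs normally.
if (typeof phantom !== 'undefined') {
    var casper = require('casper').create({ verbose: true, logLevel: 'debug' });

    casper.start(TARGET_URL, function () {
        this.capture('before-click.png');   // works: the first page renders
        this.echo(this.getHTML());          // works: full markup
    });

    casper.thenClick(NEXT_LINK);            // navigate within the site

    casper.then(function () {
        this.capture('after-click.png');    // blank when the proxy is on
        this.echo(this.getHTML());          // '<html><head></head><body></body></html>'
    });

    casper.run();
}
```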

Hello,


You may need to disable cookies, as by default Crawlera manages cookies on a per-IP basis, so subsequent requests (which can go out through different IPs) will carry different cookies. You can disable cookie management and use sessions (which route multiple requests through a single IP), as described in https://support.scrapinghub.com/solution/articles/22000188409-does-crawlera-handle-cookies- .
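From CasperJS, those options are controlled with Crawlera's request headers. A sketch of what this could look like (header names are from the Crawlera docs; the URL is a placeholder, and note that after the first response Crawlera returns a session id in X-Crawlera-Session that you would reuse on later requests):

```javascript
// Crawlera request headers for this sketch.
var crawleraHeaders = {
    'X-Crawlera-Cookies': 'disable',  // stop Crawlera from managing cookies
    'X-Crawlera-Session': 'create'    // ask Crawlera to pin requests to one IP
};

// Guard so the file also loads outside PhantomJS/CasperJS.
if (typeof phantom !== 'undefined') {
    var casper = require('casper').create();

    casper.start();
    casper.then(function () {
        // customHeaders is the PhantomJS page-level way to add headers
        // to every outgoing request made by this page.
        this.page.customHeaders = crawleraHeaders;
    });

    casper.thenOpen('http://example.com/', function () {
        this.capture('first-page.png');
    });

    casper.run();
}
```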



Hello,


Crawlera by default does not follow redirects. Normally the redirect response codes 301 and 302 return only a Location header containing the final redirected URL.

You would need to follow the redirected URL yourself.
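In CasperJS you can watch for those responses and open the Location target yourself. A sketch, assuming the 'resource.received' event and the resource's redirectURL field behave as in the CasperJS/PhantomJS docs (the helper function and URL are hypothetical):

```javascript
// Helper: is this HTTP status a redirect we must follow ourselves?
function isRedirect(status) {
    return status === 301 || status === 302 || status === 303 ||
           status === 307 || status === 308;
}

// Guard so the file also loads outside PhantomJS/CasperJS.
if (typeof phantom !== 'undefined') {
    var casper = require('casper').create();

    // 'resource.received' fires for every response; PhantomJS sets
    // resource.redirectURL when the response carried a Location header.
    casper.on('resource.received', function (resource) {
        if (isRedirect(resource.status) && resource.redirectURL) {
            this.echo('Following redirect to ' + resource.redirectURL);
            this.thenOpen(resource.redirectURL);
        }
    });

    casper.start('http://example.com/old-path');
    casper.run();
}
```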



Regards,

Thriveni

Maybe I used the wrong term when I said 'redirected sites'. I want to know how to click a link or button that takes you to another page within the same website. For example, I might want to sign in and click the sign-in button, which takes me to another page. Or click on anything that takes you to a different part of the website.
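For within-site navigation like that, the usual CasperJS pattern is thenClick followed by an explicit wait before capturing, since the proxied page can take longer to load. A sketch with placeholder URL, form fields, and selectors (none of these names come from the actual site):

```javascript
// Hypothetical selectors for the sign-in flow in this sketch.
var SELECTORS = {
    loginForm: 'form#login',
    submit: 'button[type=submit]',
    landing: '#dashboard'   // element that only exists on the next page
};

// Guard so the file also loads outside PhantomJS/CasperJS.
if (typeof phantom !== 'undefined') {
    var casper = require('casper').create();

    casper.start('http://example.com/login', function () {
        // Fill the form first; field names are placeholders.
        this.fill(SELECTORS.loginForm, { user: 'me', pass: 'secret' }, false);
    });

    casper.thenClick(SELECTORS.submit);

    // Waiting on a selector that only exists on the destination page
    // is more robust than a fixed sleep before capturing.
    casper.waitForSelector(SELECTORS.landing, function () {
        this.capture('after-login.png');
        this.echo(this.getHTML());
    });

    casper.run();
}
```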

