Answered

Possible setup issue?

I just set up my Crawlera account and was testing the setup. I ran these two commands:

curl -x proxy.crawlera.com:8010 -U <API_KEY>: --cacert crawlera-ca.crt https://www.toysrus.com

curl -x proxy.crawlera.com:8010 -U <API_KEY>: --insecure https://www.toysrus.com

Both commands timed out with the error message "Timeout processing HTTP stream". The following command, against a different site, worked:

curl -x proxy.crawlera.com:8010 -U <API_KEY>: --cacert crawlera-ca.crt https://www.amazon.com

Do I have something set up incorrectly, or is toysrus.com just denying access?


Thanks in advance


Best Answer

Hi,


toysrus.com seems to expect certain request headers, such as Accept and Accept-Encoding. Make sure to include them in your curl request.


Thanks nestor! I'm pretty new to this. Can you briefly explain how you discovered that? Was it trial and error, or something else that I should have noticed? Or, if you don't have the time, is there a resource you would recommend to get me up to speed on diagnosing things like this?

Just for anyone else that runs into this, the following curl command worked:

curl -x proxy.crawlera.com:8010 -U <API_KEY>: --cacert crawlera-ca.crt https://www.toysrus.com -H "accept:text/html,application/xhtml+xml,application/xml" -H "accept-encoding:identity"

 

For more info, run curl in verbose mode (-v):

curl -vx proxy.crawlera.com:8010 -U <API_KEY>: --cacert crawlera-ca.crt "https://www.toysrus.com/product?productId=72337296" -H "accept:text/html,application/xhtml+xml,application/xml" -H "accept-encoding:identity"

 



Sorry for the late reply. I simply checked the request headers my browser sent to the website (visible in the browser's developer tools) and added them to the curl command.
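For anyone who wants to repeat that process, here is a minimal sketch of a small shell wrapper that takes headers copied from the developer tools and passes each one to curl with -H. The proxy options and the two header values are the ones from this thread. The function name crawlera_fetch and the variables HEADERS, CURL_BIN, and CRAWLERA_API_KEY are made up for this example and are not part of Crawlera or curl.

```shell
#!/bin/sh
# Headers copied from the browser's developer tools, one per line.
# These two are the ones that worked in this thread; a browser sends more.
HEADERS='accept: text/html,application/xhtml+xml,application/xml
accept-encoding: identity'

# CURL_BIN and CRAWLERA_API_KEY are hypothetical names used only in this sketch.
CURL_BIN=${CURL_BIN:-curl}

crawlera_fetch() {
  url=$1
  # Start with the proxy options used earlier in the thread.
  set -- -x proxy.crawlera.com:8010 -U "${CRAWLERA_API_KEY}:" --cacert crawlera-ca.crt
  # Append one -H option per non-empty header line.
  while IFS= read -r header; do
    if [ -n "$header" ]; then
      set -- "$@" -H "$header"
    fi
  done <<EOF
$HEADERS
EOF
  # Run curl (or whatever CURL_BIN names) with the assembled arguments.
  "$CURL_BIN" "$@" "$url"
}
```

With CRAWLERA_API_KEY set to your key, calling crawlera_fetch https://www.toysrus.com should be equivalent to the working command above. If a site still times out, adding more of the browser's headers to HEADERS, one at a time, is a reasonable way to find which ones it expects.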

