I'm trying to render this site, https://turbotax.intuit.com/personal-taxes/online/, using Scrapy and Splash (I have access to a Scrapinghub Splash instance), but I always receive an HTTP 401 error. The spider works on my PC with my local Splash instance. What is happening?
I receive this:
[scrapy.core.engine] Crawled (401) <GET https://turbotax.intuit.com/personal-taxes/online/ via https://qmmxre59-splash.scrapinghub.com/execute> (referer: None)
[scrapy.spidermiddlewares.httperror] Ignoring response <401 https://turbotax.intuit.com/personal-taxes/online/>: HTTP status code is not handled or not allowed
nestor posted almost 7 years ago Admin Best Answer
Just to mark this as answered: the problem was solved by setting the "http_user" attribute on the spider class, as shown here: https://support.scrapinghub.com/support/solutions/articles/22000188427-using-scrapy-with-splash
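For anyone landing here later: the 401 comes from the hosted Splash endpoint, which expects HTTP Basic credentials that a local Splash instance does not ask for. Below is a rough sketch of what setting "http_user" amounts to. The spider name, the placeholder key, and the empty "http_pass" are assumptions for illustration only, so check the linked article for the exact values your instance needs. The helper basic_auth_header is a hypothetical stand-in written with the standard library, not Scrapy's own function.

```python
import base64


def basic_auth_header(user: str, password: str = "") -> str:
    """Build an HTTP Basic Authorization header value for user:password."""
    raw = f"{user}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


# In the spider itself the credentials are declared as class attributes, e.g.
#
#     class TurboTaxSpider(scrapy.Spider):      # hypothetical spider name
#         name = "turbotax"
#         http_user = "YOUR_SPLASH_API_KEY"     # placeholder, not a real key
#         http_pass = ""                        # assumed empty; verify for your instance
#
# Scrapy's HttpAuthMiddleware reads these attributes and attaches an
# Authorization header of the following form to outgoing requests.
header = basic_auth_header("YOUR_SPLASH_API_KEY")
print(header)
```

Without that header the hosted instance answers 401, and Scrapy's HttpError spider middleware then drops the response, which matches the two log lines in the question.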