Answered

Can't make Splash work: receiving HTTP 401

I'm trying to render this site, https://turbotax.intuit.com/personal-taxes/online/, using Scrapy and Splash (I have access to a Scrapinghub Splash instance), but I always receive an HTTP 401 error. The same spider works on my PC with my local Splash instance. What is happening?


I receive this:

[scrapy.core.engine] Crawled (401) <GET https://turbotax.intuit.com/personal-taxes/online/ via https://qmmxre59-splash.scrapinghub.com/execute> (referer: None)
[scrapy.spidermiddlewares.httperror] Ignoring response <401 https://turbotax.intuit.com/personal-taxes/online/>: HTTP status code is not handled or not allowed

Best Answer

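Just to mark this as answered: it was solved by setting the "http_user" attribute on the spider class, as shown here: https://support.scrapinghub.com/support/solutions/articles/22000188427-using-scrapy-with-splash. The hosted Splash instance requires authentication, which would explain why the same spider worked against a local instance without it.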
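For anyone wondering what that attribute does: as far as I understand it, Scrapy's HttpAuthMiddleware reads "http_user" (and "http_pass", which defaults to an empty string) from the spider and sends them as HTTP Basic credentials. Below is a rough standard-library sketch of the header that results. The API key value and the helper name are placeholders of mine, and I'm assuming the hosted instance expects the API key as the username with an empty password; check the linked article for your setup.

```python
import base64

# In the spider, the fix amounts to class attributes like these
# (placeholder values, not real credentials):
#     http_user = "<your Splash API key>"
#     http_pass = ""   # optional; empty by default


def basic_auth_header(user: str, password: str = "") -> str:
    """Build the value of an HTTP Basic 'Authorization' header.

    Hypothetical helper for illustration only; Scrapy builds the
    header itself once http_user is set on the spider.
    """
    credentials = f"{user}:{password}".encode("latin-1")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Basic {token}"


API_KEY = "my-splash-api-key"  # assumption: placeholder API key
print("Authorization:", basic_auth_header(API_KEY))
```

Without that header the hosted endpoint answers 401 before the target page is ever rendered, which matches the log in the question.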
