Can't make Splash work: receiving HTTP 401

Posted over 7 years ago by Jerick Órdenes Sepúlveda

Answered

I'm trying to render this site, https://turbotax.intuit.com/personal-taxes/online/, using Scrapy and Splash (I have access to a Scrapinghub Splash instance), but I always receive an HTTP 401 error. The same spider works on my PC with my local Splash instance. What is happening?

I receive this:

[scrapy.core.engine] Crawled (401) <GET https://turbotax.intuit.com/personal-taxes/online/ via https://qmmxre59-splash.scrapinghub.com/execute> (referer: None)
[scrapy.spidermiddlewares.httperror] Ignoring response <401 https://turbotax.intuit.com/personal-taxes/online/>: HTTP status code is not handled or not allowed

Attachments (1): logturbotaxo....txt (4.22 KB)

nestor posted over 7 years ago Admin Best Answer

Just to mark this as answered: it was solved by setting the "http_user" attribute on the spider class, as shown here: https://support.scrapinghub.com/support/solutions/articles/22000188427-using-scrapy-with-splash
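
For context, here is a minimal sketch of why that setting would clear the 401. As far as I understand, Scrapy's built-in HttpAuthMiddleware reads "http_user" (and "http_pass") from the spider and sends them as an HTTP Basic Authorization header. The hosted Splash instance expects that header, while a local instance typically does not, which would explain why the spider worked locally. The snippet only reproduces the header with the standard library. The key value is a placeholder, and using the API key as the username with an empty password is an assumption based on the linked article.

```python
import base64


def basic_auth_header(user: str, password: str = "") -> str:
    """Build the value of an HTTP Basic Authorization header."""
    credentials = f"{user}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


# Placeholder, not a real key. On the spider this would be the class attribute
#     http_user = "YOUR_SPLASH_API_KEY"
header = basic_auth_header("YOUR_SPLASH_API_KEY")
print(header)
```

If the request still comes back as 401 after setting the attribute, it may be worth checking that the key belongs to the Splash instance named in the log line.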
