The recommended way to integrate Scrapy and Splash is using the scrapy-splash library. There are two ways to authenticate to your Splash instance when using it.
1. Using HttpAuthMiddleware
You can use Scrapy's HttpAuthMiddleware to attach your credentials to every request your spider sends, which suits spiders that route all of their requests through Splash. Simply add the following attribute to your spider class:
http_user = '<APIKEY>'
Here, <APIKEY> is your Splash API key (see "Where are my credentials?" below).
2. Using splash_headers
If you only want to make certain requests through Splash, you can send the authorization header manually using the splash_headers parameter of the SplashRequest object. See this example:
from w3lib.http import basic_auth_header
...
yield SplashRequest(
    'http://target.website.com/',
    splash_headers={'Authorization': basic_auth_header('<APIKEY>', '')},
)
Note that you have to build the HTTP Basic authorization header yourself, passing your API key as the username and an empty string as the password.
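To make the shape of that header concrete, here is a minimal standard-library sketch of how an HTTP Basic authorization value is put together. It is only an illustration and not the w3lib implementation: the function name build_basic_auth_header and the key "my-api-key" are made up for this example, and in a real spider you would keep using basic_auth_header from w3lib as shown above.

```python
import base64


def build_basic_auth_header(username: str, password: str = "") -> str:
    """Return an HTTP Basic Authorization header value.

    Hypothetical helper for illustration only; it is not part of
    Scrapy, scrapy-splash, or w3lib.
    """
    # Basic auth joins the username and password with a colon,
    # then base64-encodes the result.
    credentials = f"{username}:{password}".encode("latin-1")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


# "my-api-key" is a placeholder; the API key is the username and
# the password is left empty, as in the SplashRequest example.
header = build_basic_auth_header("my-api-key")
print(header)
```

Decoding the part after "Basic " gives back the API key followed by a colon, which shows that the key travels as the username with an empty password.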
Where are my credentials?
You can find the API key (user) and URL for your Splash instance on your organization's Splash > Setup page, as shown below:
If you haven't signed up for Splash yet, have a look at this article on how to do it.