Apart from setting:
SPLASH_APIKEY = SCRAPINGHUB_SPLASH_KEY
CRAWLERA_ENABLED = True
CRAWLERA_APIKEY = SCRAPINGHUB_CRAWLERA_KEY
and then making a scrapy.Spider with
def start_requests(self):
    yield SplashRequest(
        url=self.start_urls[0],
        endpoint='execute',
        callback=self.parse,
        args={
            'lua_source': self.script,
            'crawlera_user': self.settings['CRAWLERA_APIKEY'],
            'timeout': 3600,
        },
        cache_args=['lua_source'],
    )
what exactly is needed to connect to Crawlera to scrape stuff?
I am kind of lost at this point...
0 Votes
nestor posted almost 5 years ago Admin Best Answer
CRAWLERA_ENABLED should not be set to True. Doing so activates the scrapy-crawlera middleware and puts the requests in the wrong order: Scrapy > Crawlera > Splash > Website instead of Scrapy > Splash > Crawlera > Website.
Other than that, the rest looks fine. You can refer to this article for more information: https://support.scrapinghub.com/support/solutions/articles/22000188428-using-crawlera-with-splash-scrapy
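For illustration, here is a minimal sketch of what the corrected setup could look like. It is not taken from the linked article. The middleware entries follow the scrapy-splash README; the Splash URL, the environment-variable lookups, the proxy host and port, and the helper name build_splash_args are all assumptions made for this example, so check them against your own account details.

```python
import os

# --- settings.py sketch (placeholder values, not from the thread) ---

# Hypothetical Splash endpoint; replace with your own instance URL.
SPLASH_URL = os.environ.get("SPLASH_URL", "http://localhost:8050")

# Keys read from the environment rather than hard-coded (an assumption).
SPLASH_APIKEY = os.environ.get("SCRAPINGHUB_SPLASH_KEY", "")
CRAWLERA_APIKEY = os.environ.get("SCRAPINGHUB_CRAWLERA_KEY", "")

# Keep the scrapy-crawlera middleware off so Scrapy talks to Splash directly;
# Crawlera is used as a proxy from inside the Splash Lua script instead.
CRAWLERA_ENABLED = False

# Middleware setup as documented in the scrapy-splash README.
DOWNLOADER_MIDDLEWARES = {
    "scrapy_splash.SplashCookiesMiddleware": 723,
    "scrapy_splash.SplashMiddleware": 725,
    "scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware": 810,
}
SPIDER_MIDDLEWARES = {"scrapy_splash.SplashDeduplicateArgsMiddleware": 100}
DUPEFILTER_CLASS = "scrapy_splash.SplashAwareDupeFilter"

# --- Lua script run by Splash (sketch) ---
# The proxy host and port below are assumed values for illustration.
LUA_SOURCE = """
function main(splash)
    -- Route every request Splash makes through the Crawlera proxy.
    splash:on_request(function(request)
        request:set_proxy{
            host = "proxy.crawlera.com",
            port = 8010,
            username = splash.args.crawlera_user,
            password = "",
        }
    end)
    assert(splash:go(splash.args.url))
    return splash:html()
end
"""


def build_splash_args(crawlera_apikey, timeout):
    """Return the dict to pass as SplashRequest(..., args=...)."""
    return {
        "lua_source": LUA_SOURCE,
        "crawlera_user": crawlera_apikey,
        "timeout": timeout,
    }
```

In the spider, the result of build_splash_args(self.settings['CRAWLERA_APIKEY'], 3600) would take the place of the inline args dict in the SplashRequest call, with endpoint='execute' and cache_args=['lua_source'] left as they are.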