What exactly is needed to connect to Crawlera to scrape?
I am kind of lost at this point...
Best Answer
nestor said over 4 years ago:
CRAWLERA_ENABLED should not be set to True, as that activates the scrapy-crawlera middleware and changes the order of requests to Scrapy > Crawlera > Splash > Website instead of the intended Scrapy > Splash > Crawlera > Website.
Simone Gabbriellini
Apart from setting:
SPLASH_APIKEY = SCRAPINGHUB_SPLASH_KEY
CRAWLERA_ENABLED = True
CRAWLERA_APIKEY = SCRAPINGHUB_CRAWLERA_KEY
and then making a scrapy.Spider with:
    def start_requests(self):
        yield SplashRequest(
            url=self.start_urls[0],
            endpoint='execute',
            callback=self.parse,
            args={
                'lua_source': self.script,
                'crawlera_user': self.settings['CRAWLERA_APIKEY'],
                'timeout': 3600,
            },
            cache_args=['lua_source'],
        )
what exactly is needed to connect to Crawlera to scrape?
I am kind of lost at this point...
nestor
CRAWLERA_ENABLED should not be set to True, as that activates the scrapy-crawlera middleware and changes the order of requests to Scrapy > Crawlera > Splash > Website instead of the intended Scrapy > Splash > Crawlera > Website.
Other than that, the rest looks fine. You can refer to this article for more information: https://support.scrapinghub.com/support/solutions/articles/22000188428-using-crawlera-with-splash-scrapy
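For anyone landing here later, this is a rough sketch of what the corrected setup might look like. It is not taken from the linked article. The helper names build_settings and build_splash_args are hypothetical, the Lua script is a minimal illustration, and the proxy host and port in it are assumptions that should be checked against your own Crawlera account. The point it tries to show is that the middleware stays off and the Crawlera key travels to Splash as a script argument, so Splash (not Scrapy) talks to Crawlera.

```python
# Hypothetical Lua script: Splash itself forwards each outgoing request
# through Crawlera, giving the order Scrapy > Splash > Crawlera > Website.
# The proxy host and port are assumptions, not values from this thread.
LUA_SOURCE = """
function main(splash)
    splash:on_request(function(request)
        request:set_proxy{
            host = "proxy.crawlera.com",
            port = 8010,
            username = splash.args.crawlera_user,
            password = "",
        }
    end)
    assert(splash:go(splash.args.url))
    return splash:html()
end
"""


def build_settings(splash_key, crawlera_key):
    """Return the project settings from the question, with the fix applied."""
    return {
        "SPLASH_APIKEY": splash_key,
        # Left off on purpose: enabling it activates the scrapy-crawlera
        # middleware and puts Crawlera in front of Splash.
        "CRAWLERA_ENABLED": False,
        # Still stored here so the spider can read it and hand it to Splash.
        "CRAWLERA_APIKEY": crawlera_key,
    }


def build_splash_args(lua_source, crawlera_key, timeout=3600):
    """Return the dict intended for the args= parameter of SplashRequest."""
    return {
        "lua_source": lua_source,
        "crawlera_user": crawlera_key,
        "timeout": timeout,
    }


if __name__ == "__main__":
    # Placeholder keys, for illustration only.
    settings = build_settings("my-splash-key", "my-crawlera-key")
    args = build_splash_args(LUA_SOURCE, settings["CRAWLERA_APIKEY"])
    print(settings["CRAWLERA_ENABLED"], sorted(args))
```

In the spider, the result of build_splash_args(...) would be passed as args= to SplashRequest, with endpoint='execute' and cache_args=['lua_source'] exactly as in the question.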