Lilu Cao
Hi team,
I have been using my subscribed Scrapy instance for some time now, and the average request rate is about 500 per hour, which is a bit too slow for my business. How can I improve the request speed?
By the way, I am not considering adding more instances, because each of my websites already uses its own spider to scrape the data.
Looking forward to your kind response.
Best Answer
nestor said about 4 years ago
I would suggest you turn off AUTOTHROTTLE. It's not recommended to use Crawlera with AUTOTHROTTLE, since that leaves you without control of your concurrent requests and, most importantly, results in "double throttling".
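For reference, a minimal settings.py sketch along those lines, assuming the scrapy-crawlera middleware; the API key is a placeholder and the concurrency values are only examples (they should match the concurrent connections allowed by your Crawlera plan):

```python
# settings.py -- illustrative sketch; key and concurrency values are placeholders.

# Route requests through Crawlera (scrapy-crawlera package).
DOWNLOADER_MIDDLEWARES = {
    'scrapy_crawlera.CrawleraMiddleware': 610,
}
CRAWLERA_ENABLED = True
CRAWLERA_APIKEY = '<your Crawlera API key>'

# Disable AutoThrottle and Scrapy's own delay so Crawlera alone handles
# throttling, avoiding the "double throttling" described above.
AUTOTHROTTLE_ENABLED = False
DOWNLOAD_DELAY = 0

# Keep enough requests in flight to use the plan's concurrency;
# 10 is only an example value.
CONCURRENT_REQUESTS = 10
CONCURRENT_REQUESTS_PER_DOMAIN = 10
```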
Lilu Cao said about 4 years ago
Yes, as you can see, I had already turned it off before I posted this question. Any further help? As I described in another topic, 2000+ requests took almost 8 hours. Was that normal?
Regards,
Lilu Cao
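A rough back-of-envelope sketch of where a rate like that could come from, assuming effectively one request in flight at a time and an average response time of around 14 seconds (both assumed values, not figures from the thread):

```python
# Rough throughput estimate: requests per hour as a function of
# concurrency and average response time. All values are illustrative.

def requests_per_hour(concurrency, avg_response_seconds):
    return concurrency * 3600 / avg_response_seconds

print(2000 / 8)                   # observed: ~250 requests/hour (2000 requests in 8 hours)
print(requests_per_hour(1, 14))   # ~257 requests/hour with 1 request in flight
print(requests_per_hour(10, 14))  # ~2570 requests/hour if concurrency is raised to 10
```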