Amazon scraping takes 5-10 seconds per page load. Is this normal?
ethmz
started a topic
about 6 years ago
I'm using Crawlera to scrape Amazon, but nothing massive, just a few thousand pages a month.
It works fine, but each page takes about 5 to 10 seconds to complete. Is this normal? At 10 seconds per page, we could scrape only about 260,000 pages a month.
I read somewhere that to scrape the big sites I need an "Enterprise" plan. Does this mean that none of the C* plans are recommended if you want to scrape Amazon?
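For reference, the 260,000 figure can be checked with a quick calculation. This is only a sketch: it assumes a 30-day month, requests running back to back around the clock, and throughput that scales linearly with the number of connections. The function name and the `concurrency` parameter are made up for illustration and are not part of any Crawlera API.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def pages_per_month(seconds_per_page, concurrency=1, days=30):
    """Estimate pages fetched in a month at a fixed time per page.

    Assumes requests run continuously and that each extra connection
    adds the same throughput (an idealisation, not a guarantee).
    """
    return int(concurrency * days * SECONDS_PER_DAY / seconds_per_page)


if __name__ == "__main__":
    print(pages_per_month(10))  # one connection, 10 s per page
    print(pages_per_month(5))   # one connection, 5 s per page
```

With one connection at 10 seconds per page this gives 259,200 pages, which matches the rough 260,000 above. At 5 seconds per page it gives 518,400.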
Best Answer
surge
said
about 6 years ago
For the shared pool and such a popular target website, it is normal. Speed is not Crawlera's priority. The priority is to crawl politely and avoid getting banned (Crawlera FAQ, last paragraph). Generally, it is advised to make full use of the number of concurrent connections allowed by a given C* plan. And yes, there are other, more expensive options if the shared pool's capacity doesn't deliver the desired results.
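To illustrate what "making full use of concurrent connections" can look like, here is a minimal sketch using Python's standard thread pool. It is not Crawlera-specific: `crawl`, `slow_fetch`, and the `max_workers=10` value are hypothetical names and numbers chosen for the example. In practice `fetch` would be whatever function sends the request through the proxy, and `max_workers` should stay within the limit of your plan.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def crawl(urls, fetch, max_workers=10):
    """Fetch every URL with up to max_workers requests in flight at once.

    fetch is any callable that takes a URL and returns the page.
    Results come back in the same order as the input URLs.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))


def slow_fetch(url):
    # Stand-in for a real proxied request: sleeps to mimic a slow page load.
    time.sleep(0.1)
    return f"page for {url}"


if __name__ == "__main__":
    urls = [f"https://example.com/item/{i}" for i in range(20)]
    start = time.perf_counter()
    pages = crawl(urls, slow_fetch, max_workers=10)
    elapsed = time.perf_counter() - start
    print(len(pages), "pages in", round(elapsed, 2), "seconds")
```

Each individual page still takes as long as before, but with ten requests in flight the total time for a batch drops to roughly a tenth of the serial time, assuming the proxy and the target site keep up.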
chrisjanwust
I'm considering using Crawlera for the same purpose. Could anyone give an indication of the relationship between the package (C10, C50, C200) and speed (requests per minute)?
krisyoges2323
said
about 6 years ago
So there's no point in subscribing to a higher plan, right?