Amazon scraping takes 5 to 10 seconds per page load. Is this normal?
ethmz
started a topic
over 6 years ago
I'm using Crawlera to scrape Amazon, but nothing massive, just a few thousand pages a month.
It works fine, but each page takes about 5 to 10 seconds to complete. Is this normal? At 10 seconds per page, a single connection running non-stop can only scrape about 260,000 pages a month.
I read somewhere that to scrape the big guys I need an "Enterprise" plan. Does this mean that none of the C* plans are recommended if you want to scrape Amazon?
Best Answer
surge
said
over 6 years ago
For the shared pool and such a popular target website, it is normal. Speed is not Crawlera's priority. The priority is to crawl politely and avoid getting banned (Crawlera FAQ, last paragraph). Generally, it is advised to make full use of the number of concurrent connections allowed by a given C* plan. And yes, there are other, more expensive options, if the shared pool's capacity doesn't bring desired results.
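To illustrate the advice about using all allowed concurrent connections, here is a minimal sketch, not Crawlera's own client code. The `fetch` function is a hypothetical placeholder that only sleeps to mimic a slow proxied response, and the limit of 10 is an assumed value; substitute a real request function and whatever limit your plan actually allows.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(url, latency=0.05):
    # Hypothetical placeholder for a real proxied request.
    # It sleeps to mimic a slow response and returns a fake status code.
    time.sleep(latency)
    return url, 200


def crawl(urls, max_concurrency):
    # Keep at most `max_concurrency` requests in flight at once,
    # so slow individual responses overlap instead of adding up.
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        return list(pool.map(fetch, urls))


if __name__ == "__main__":
    urls = [f"https://example.com/page/{i}" for i in range(20)]
    start = time.perf_counter()
    results = crawl(urls, max_concurrency=10)  # assumed limit
    elapsed = time.perf_counter() - start
    print(f"{len(results)} pages in {elapsed:.2f} s")
```

With requests overlapping like this, per-page latency stays the same, but total throughput scales roughly with the number of connections in use.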
krisyoges2323
said
over 6 years ago
So there's no point in subscribing to a higher plan, right?
chrisjanwust
said
over 5 years ago
Considering using Crawlera for the same purposes. Could anyone give an indication of the relationship between their package (C10, C50, C200) and speed (requests per minute)?
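No measured figures appear in this thread, but a rough upper bound can be worked out from the numbers mentioned above. The sketch below assumes that the number in each plan name (C10, C50, C200) is its concurrent connection limit and that every connection is kept busy; both are assumptions, not confirmed here. The 5 and 10 second latencies come from the original question.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 30-day month


def requests_per_minute(concurrency, seconds_per_page):
    # Each connection finishes 60 / seconds_per_page requests per minute.
    return concurrency * 60 / seconds_per_page


def pages_per_month(concurrency, seconds_per_page):
    # Upper bound: assumes every connection is busy around the clock.
    return concurrency * SECONDS_PER_MONTH / seconds_per_page


# Assumed mapping of plan name to concurrent connection limit.
ASSUMED_PLANS = {"C10": 10, "C50": 50, "C200": 200}

for plan, concurrency in ASSUMED_PLANS.items():
    for latency in (5, 10):
        rpm = requests_per_minute(concurrency, latency)
        monthly = pages_per_month(concurrency, latency)
        print(f"{plan} at {latency} s/page: {rpm:.0f} req/min, "
              f"{monthly:,.0f} pages/month")
```

A single connection at 10 seconds per page gives 259,200 pages a month, which matches the roughly 260,000 figure in the original question. Real throughput will be lower whenever requests are retried or connections sit idle.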