I'm using Crawlera to scrape Amazon, but nothing massive, just a few thousand pages a month.
It works fine, but each page takes about 5 to 10 seconds to complete. Is this normal? At 10 seconds per page, we can scrape only about 260,000 pages a month.
I read somewhere that scraping the big sites requires an "Enterprise" plan. Does this mean that none of the C* plans are recommended if you want to scrape Amazon?
0 Votes
surge posted almost 7 years ago Admin Best Answer
For the shared pool and such a popular target website, this is normal. Speed is not Crawlera's priority; the priority is to crawl politely and avoid getting banned (see the last paragraph of the Crawlera FAQ). In general, make full use of the number of concurrent connections your C* plan allows. And yes, there are other, more expensive options if the shared pool's capacity doesn't deliver the desired results.
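To put rough numbers on the concurrency advice, here is a minimal back-of-the-envelope sketch. It assumes a 30-day month, a constant response time per page, and that every connection is kept busy the whole time, so treat the results as an upper bound. The function names are made up for illustration, and the concurrency values are just example inputs; check your own plan for its actual limit.

```python
# Rough throughput estimate. Assumes a 30-day month and that every
# connection fetches pages back to back with a constant response time.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 2,592,000 seconds


def pages_per_month(seconds_per_page: float, concurrency: int = 1) -> int:
    """Upper-bound number of pages fetched in a 30-day month."""
    if seconds_per_page <= 0 or concurrency < 1:
        raise ValueError("seconds_per_page must be > 0 and concurrency >= 1")
    return int(concurrency * SECONDS_PER_MONTH / seconds_per_page)


def requests_per_minute(seconds_per_page: float, concurrency: int = 1) -> float:
    """Upper-bound request rate for the given number of connections."""
    if seconds_per_page <= 0 or concurrency < 1:
        raise ValueError("seconds_per_page must be > 0 and concurrency >= 1")
    return concurrency * 60 / seconds_per_page


if __name__ == "__main__":
    # One request at a time, 10 s per page: the figure from the question.
    print(pages_per_month(10))            # 259200
    # Ten connections in parallel (example value), same response time.
    print(pages_per_month(10, 10))        # 2592000
    print(requests_per_minute(10, 10))    # 60.0
```

So the roughly 260,000 pages a month in the question corresponds to sending one request at a time. With the same response time, throughput scales about linearly with the number of connections you actually keep busy, as long as the pool can sustain it.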
0 Votes
3 Comments
chrisjanwust posted over 5 years ago
I'm considering using Crawlera for the same purpose. Could anyone give an indication of the relationship between the plan (C10, C50, C200) and speed (requests per minute)?
0 Votes
krisyoges2323 posted almost 7 years ago
So there's no point in subscribing to a higher plan, right?
0 Votes