Google search results crawl using Crawlera

Hello


I would like to subscribe to Crawlera to start SERP crawling. I average about 100,000 requests a month, sometimes with about 10 parallel requests per second.


I'm confused about the pricing plans, which say "Concurrent requests: 10" for the C10 plan, while the request limits section at https://doc.scrapinghub.com/crawlera.html#request-limits says "Crawlera’s default request limit is 5 requests per second (rps) for each website".

 

Also, I don't understand what you mean by "There is a default delay of 12 seconds between each request and a default delay of 1 second between requests through the same slave". How can the previous line say 5 requests per second while this line says a default delay of 12 seconds between each request?
What exactly does this delay mean?


If possible, I need a detailed explanation of "Request Limits", with some examples.


Thanks 


Best Answer

The request limit at https://doc.scrapinghub.com/crawlera.html#request-limits refers to Crawlera's throughput, meaning that you will achieve a maximum of 300 requests per minute (5 requests per second × 60 seconds). This is only the default value for most websites; not all websites have this request limit.
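
As a rough illustration of the arithmetic behind that figure, here is a small Python sketch. It assumes the documented default of 5 requests per second and the 100,000 requests per month mentioned in the question. The function names are made up for this example, and real throughput will vary per website.

```python
DEFAULT_RPS = 5  # default per-website request limit quoted from the docs


def requests_per_minute(rps):
    """Convert a per-second request limit into a per-minute ceiling."""
    return rps * 60


def min_hours_for(total_requests, rps):
    """Lower bound on crawl time in hours if the limit is fully used."""
    return total_requests / rps / 3600


print(requests_per_minute(DEFAULT_RPS))  # 300
print(round(min_hours_for(100_000, DEFAULT_RPS), 2))  # 5.56
```

So at the default limit, a month's worth of 100,000 requests would need at least roughly five and a half hours of crawling in total, assuming no extra throttling.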


Regarding the other line, it should say 12 seconds of delay when using the same slave. For example, when using Crawlera Sessions, there is a default delay of 12 seconds per request (added throttling).


I hope this clarifies the information.


May I ask how you set up such a Google SERP crawler using Crawlera without deploying any custom-written scripts or code?

A question to all of you: how exactly is it possible to set up SERP crawling and extract the maximum number of Google search results (titles + URLs) for each query without writing any code? Sorry for the newbie question, but I am not a developer and have this urgent task to do. I thought Crawlera could help right away with such a task.

Throttling may apply regardless of whether you use sessions. On a CXX plan you'll be on a shared pool, so there are other users crawling the same domain, and the throttling applies to the domain, not per user.

You can send 10 concurrent requests on the C10 plan, 50 on the C50 and so on.
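
To make the difference between a concurrency cap and a rate limit concrete, here is a minimal Python sketch of a client that never has more than 10 requests in flight. It is only an illustration under assumptions: `fake_fetch` is a hypothetical stand-in that sleeps instead of calling any real proxy, and `MAX_CONCURRENT` simply mirrors the C10 figure above.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT = 10  # assumed cap for a C10 plan, per the reply above

_lock = threading.Lock()
_in_flight = 0
peak_in_flight = 0


def fake_fetch(query):
    """Hypothetical stand-in for one request sent through the proxy."""
    global _in_flight, peak_in_flight
    with _lock:
        _in_flight += 1
        peak_in_flight = max(peak_in_flight, _in_flight)
    time.sleep(0.01)  # pretend to wait for the response
    with _lock:
        _in_flight -= 1
    return f"results for {query}"


def crawl(queries):
    """Run all queries, never more than MAX_CONCURRENT at once."""
    with ThreadPoolExecutor(max_workers=MAX_CONCURRENT) as pool:
        return list(pool.map(fake_fetch, queries))


results = crawl([f"query {i}" for i in range(50)])
print(len(results), peak_in_flight <= MAX_CONCURRENT)
```

The pool size limits how many requests are open at the same moment. How many complete per second still depends on response times and on any throttling applied to the domain.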

Thanks nestor, that means throttling applies only to sessions; since I'm crawling Google, this will not apply, right?

I can just send 5 requests at the same time to Google and get back the SERP pages.
