Error Code 429 Too Many Requests

Posted over 7 years ago by atomant

Answered

I'm scanning a major retailer that has ~400k SKUs and would like to be able to scrape all of these daily in the future.


Right now, I'm doing a small subset of that. After the first few days of scraping without errors, I'm now getting a lot of 429s. I'm using 4 Heroku worker dynos that scan 100 URLs in a row.


Is this just an issue of there not being enough proxy IPs? Or is there something I can do to get around these errors?

0 Votes

thriveni posted over 7 years ago Admin Best Answer

Yes, 429 errors are thrown when the parallel connection limit for the plan has been reached. The limit is cumulative across all domains.


Glad to know that you could resolve the issue.
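
To illustrate the point about the plan-wide limit, here is a minimal sketch of the arithmetic. It assumes the worst case, where every thread in every worker holds one open connection at the same time. The function names and the figure of 5 threads per dyno are hypothetical; only the 4 dynos and the 10-connection C10 limit come from this thread.

```python
PLAN_LIMIT = 10  # C10 plan from this thread: 10 concurrent connections


def total_connections(workers: int, threads_per_worker: int) -> int:
    """Worst-case open connections: one per thread, across all workers."""
    return workers * threads_per_worker


def max_threads_per_worker(plan_limit: int, workers: int) -> int:
    """Largest per-worker thread count that keeps the total within the plan."""
    return plan_limit // workers


# 4 single-threaded dynos fit within the limit.
single_threaded = total_connections(workers=4, threads_per_worker=1)

# 4 dynos with 5 threads each (hypothetical figure) exceed it.
multi_threaded = total_connections(workers=4, threads_per_worker=5)

# Per-dyno thread budget that stays within the plan.
budget = max_threads_per_worker(PLAN_LIMIT, workers=4)

print(single_threaded <= PLAN_LIMIT)  # within the plan
print(multi_threaded <= PLAN_LIMIT)   # over the plan
print(budget)
```

Because the limit is counted across all domains and all workers, the budget has to be split across every process that shares the same plan.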

1 Votes


4 Comments

Khoa Nguyen posted almost 7 years ago

Hi,
I am on a C10 plan and I have already set CONCURRENT_REQUESTS = 10, so why am I still getting 429 errors?

0 Votes

atomant posted over 7 years ago

I think I figured it out: each dyno had multiple threads.
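
One possible way to keep a multi-threaded worker inside its share of the plan is to gate every request behind a semaphore. The sketch below is only an illustration, not the fix that was actually applied here: `ConnectionCap` and `fake_fetch` are hypothetical names, `fake_fetch` stands in for a real proxied HTTP call, and the limit of 2 assumes a 10-connection plan shared by 4 dynos (10 // 4).

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class ConnectionCap:
    """Cap the number of in-flight requests within a single process."""

    def __init__(self, limit: int) -> None:
        self._sem = threading.BoundedSemaphore(limit)
        self._lock = threading.Lock()
        self.in_flight = 0
        self.peak = 0  # highest number of simultaneous requests observed

    def run(self, fetch, url):
        # Block here until one of the `limit` slots is free.
        with self._sem:
            with self._lock:
                self.in_flight += 1
                self.peak = max(self.peak, self.in_flight)
            try:
                return fetch(url)
            finally:
                with self._lock:
                    self.in_flight -= 1


def fake_fetch(url: str) -> int:
    """Hypothetical stand-in for a real request sent through the proxy."""
    time.sleep(0.01)
    return 200


# Assumed share for one dyno: 10 connections split across 4 dynos.
cap = ConnectionCap(limit=2)
urls = [f"https://example.com/sku/{i}" for i in range(20)]

# Even with 8 threads, at most 2 requests are in flight at any moment.
with ThreadPoolExecutor(max_workers=8) as pool:
    statuses = list(pool.map(lambda u: cap.run(fake_fetch, u), urls))

print(cap.peak)
```

Note that this only limits one process. Each dyno enforces its own share independently, so the shares across all dynos still have to add up to no more than the plan limit.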

0 Votes

atomant posted over 7 years ago

It looks like the 429 is coming from Crawlera, not the site? 


I currently have the C10 plan, which allows 10 concurrent connections. I'm using 4 dynos, so I'm not sure why this wouldn't work.

0 Votes
