How well does Crawlera work on a website protected by Cloudflare?

Posted almost 6 years ago by andreyuhai


Hello there,


I have been scraping a website that is protected by Cloudflare. I would like to subscribe to a Crawlera plan, but I have also seen some negative comments about Crawlera.


Since I am a student, I do not have much money and do not want to waste it.

Has anyone had experience using Crawlera to scrape a well-secured website, especially one protected by Cloudflare and showing captchas?


Cheers!

0 Votes

Adriana Anghel posted over 5 years ago Admin Best Answer

Cloudflare employs two or three different flavors of bot protection. Some of them can be handled with the `cfscrape` library: https://github.com/Anorov/cloudflare-scrape
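
To make that concrete, here is a minimal sketch of trying `cfscrape` and guessing which kind of page came back. `cfscrape.create_scraper()` is the library's documented entry point; the helper names `classify_response` and `fetch_with_cfscrape`, and the markers the classifier looks for (status codes, the `Server` header, the strings "jschl" and "captcha"), are assumptions chosen for illustration and may not match what a given site actually serves.

```python
def classify_response(status_code, headers, body):
    """Guess which kind of page came back, using heuristic markers only."""
    # Header names are case-insensitive, so normalise them first.
    lowered = {k.lower(): v for k, v in headers.items()}
    if not lowered.get("server", "").lower().startswith("cloudflare"):
        return "not-cloudflare"
    text = body.lower()
    # Assumed marker: captcha pages tend to come back as 403 and mention a captcha.
    if status_code == 403 and "captcha" in text:
        return "captcha"
    # Assumed marker: the JavaScript challenge page contains "jschl" form fields.
    if status_code in (429, 503) and "jschl" in text:
        return "js-challenge"
    return "ok"


def fetch_with_cfscrape(url):
    """Fetch url with cfscrape, which tries to solve the JavaScript challenge."""
    # Third-party package; imported lazily so the classifier works without it.
    import cfscrape

    scraper = cfscrape.create_scraper()  # behaves like a requests.Session
    response = scraper.get(url)
    kind = classify_response(response.status_code, response.headers, response.text)
    return kind, response
```

If `fetch_with_cfscrape` still reports "captcha", the site is probably serving the captcha flavor of protection, which `cfscrape` alone is not expected to get past.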

Cloudflare also uses reCAPTCHA to weed out bots; in such cases you may need something like the 2Captcha API together with Splash or headless Chrome to get around it. As an initial step, however, you can use Crawlera to assess the level of protection the site applies to requests coming from a particular region. For example, the URL https://nitrogensports.eu/dice/play redirects to a reCAPTCHA page when accessed from a non-US region, but the same URL can be accessed from a US-based IP without solving the reCAPTCHA.
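
The region check described above could be sketched roughly as follows, using the `requests` library. The proxy endpoint `proxy.crawlera.com:8010` with the API key as the username is how Crawlera is normally configured; the helper names (`build_proxies`, `looks_like_recaptcha`, `probe`) and the simple "captcha" substring test are assumptions for illustration. How the exit region is chosen depends on your plan and is not set here.

```python
def build_proxies(api_key, host="proxy.crawlera.com", port=8010):
    """Build a requests-style proxies dict; the API key is the proxy username."""
    proxy_url = f"http://{api_key}:@{host}:{port}/"
    return {"http": proxy_url, "https": proxy_url}


def looks_like_recaptcha(final_url, body):
    """Heuristic (an assumption): the final URL or the page mentions a captcha."""
    haystack = (final_url + " " + body).lower()
    return "captcha" in haystack


def probe(url, proxies=None, verify=True):
    """Fetch url, optionally through a proxy, and report where we ended up."""
    # Third-party package; imported lazily so the helpers above work without it.
    import requests

    response = requests.get(url, proxies=proxies, verify=verify, timeout=30)
    return {
        "status": response.status_code,
        "final_url": response.url,  # differs from url if we were redirected
        "captcha": looks_like_recaptcha(response.url, response.text),
    }
```

Calling `probe(url)` directly and then `probe(url, build_proxies("<your API key>"))` lets you compare what your own IP and a Crawlera IP receive. For HTTPS pages you may need to pass the path of Crawlera's CA certificate as `verify`.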

1 Votes

