rSelenium & crawlera

I have a project that scrapes sites using an R script with rSelenium. My R code talks to a Selenium server running in the pre-built Docker image selenium/standalone-firefox.

Can anyone point me to how I go about integrating Crawlera into this workflow?

Is it correct that I need to install crawlera-headless-proxy? I've tried that, but the crawlera-headless-proxy command never returns, and my IP address remains unchanged.

Is there a way to make a crawlera-headless-proxy Docker container work alongside a Selenium Docker container?

1 Comment

I replied on the ticket you created for the same issue; posting it here too so others can benefit.

Yes, you are correct: you will need to use crawlera-headless-proxy in order to use Selenium with Crawlera. The process flow changes to: R -> Selenium -> crawlera-headless-proxy -> request.
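To illustrate the new flow on the R side, here is a minimal sketch of pointing RSelenium's Firefox session at the headless proxy via the standard WebDriver proxy capability. The hostname `headless-proxy` and port 3128 are assumptions (3128 is the proxy's usual default; adjust to wherever your proxy actually listens), and `acceptInsecureCerts` is set because the proxy re-signs HTTPS traffic with its own certificate:

```r
library(RSelenium)

# Assumed address of the crawlera-headless-proxy instance:
# "headless-proxy" is a hypothetical container hostname, 3128 the default port.
proxy_addr <- "headless-proxy:3128"

caps <- list(
  proxy = list(
    proxyType = "manual",
    httpProxy = proxy_addr,
    sslProxy  = proxy_addr
  ),
  # The proxy intercepts TLS with its own CA, so allow its certificates
  acceptInsecureCerts = TRUE
)

remDr <- remoteDriver(
  remoteServerAddr = "localhost",  # the selenium/standalone-firefox container
  port = 4444L,
  browserName = "firefox",
  extraCapabilities = caps
)
remDr$open()
remDr$navigate("https://httpbin.org/ip")  # should now report a Crawlera IP
```

This only changes how the browser routes traffic; the rest of your scraping code stays the same.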

You can run crawlera-headless-proxy in a separate container, or install it directly on your system.
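For the separate-container route, the two containers just need to share a Docker network so Selenium can reach the proxy by name. A rough sketch, assuming the `scrapinghub/crawlera-headless-proxy` image name and its `-a` API-key flag (check the project's README for the exact image and options of your version):

```shell
# Put both containers on one network so they can resolve each other by name
docker network create scrape-net

# Assumed image name and API-key flag; verify against the headless-proxy docs
docker run -d --name headless-proxy --network scrape-net \
  scrapinghub/crawlera-headless-proxy -a YOUR_CRAWLERA_APIKEY

docker run -d --name selenium --network scrape-net -p 4444:4444 \
  selenium/standalone-firefox
```

With this layout, the browser inside the Selenium container can use `headless-proxy:3128` as its proxy address.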

Instructions are in the following link:
How to use crawlera with headless browsers. It also includes a Selenium example. Give it a try and let me know if there is any error; screenshots and logs would be useful.

Regarding crawlera-headless-proxy never returning: that is expected. It is running, but it prints no output until a request is sent through it. To start with, kindly run it with the -d option to enable debug mode.
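A quick way to confirm the proxy is actually forwarding through Crawlera is to send a single request through it and compare the reported IP with your own (3128 is assumed as the listen port; httpbin.org/ip is just a convenient echo service):

```shell
# Start in debug mode so every forwarded request is logged
crawlera-headless-proxy -d &

# Route one request through the proxy; a working setup reports a Crawlera IP
curl -x localhost:3128 http://httpbin.org/ip
```

If the IP in the response still matches your own machine's, the traffic is not going through the proxy and the browser's proxy settings are the first thing to check.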
