Crawlera Headless Proxy - Selenium Example Broken

Posted over 6 years ago by Bill Warner


Answered

Bill Warner

After following the instructions for https://github.com/scrapinghub/crawlera-headless-proxy/tree/master/examples/selenium, I received this response.

ubuntu@ip-172-31-88-226:~/crawlera-headless-proxy/examples/selenium$ pipenv run ./run-example.py

<html><head></head><body><pre style="word-wrap: break-word; white-space: pre-wrap;">Unauthorized Crawlera Header: "x-crawlera-profile"</pre></body></html>
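For anyone scripting around this, the error page can be detected in the page source that Selenium returns. This is a minimal sketch, assuming the error text always has the wording shown above; `rejected_crawlera_header` is a hypothetical helper, not part of the example repository.

```python
import html
import re
from typing import Optional

# Pattern for the error text quoted in this thread. The wording is an
# assumption taken from that single response; other errors may differ.
_UNAUTHORIZED_HEADER = re.compile(r'Unauthorized Crawlera Header:\s*"([^"]+)"')


def rejected_crawlera_header(page_source: str) -> Optional[str]:
    """Return the rejected header name if the page is this error, else None."""
    # Unescape first in case the quotes arrive as &quot; entities.
    match = _UNAUTHORIZED_HEADER.search(html.unescape(page_source))
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = (
        '<html><head></head><body><pre style="word-wrap: break-word; '
        'white-space: pre-wrap;">Unauthorized Crawlera Header: '
        '"x-crawlera-profile"</pre></body></html>'
    )
    print(rejected_crawlera_header(sample))  # prints: x-crawlera-profile
```

A caller could pass the browser's page source to this function and stop early with a clear message instead of parsing an error page as if it were content.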

Is there any additional configuration I'm missing? Here is the full list of steps I took to repro.

Set up EC2 Instance

ssh -i "dev-windows.pem"
sudo apt update

Set up Python and Pipenv

sudo apt install python3-pip python3-dev
sudo pip3 install pipenv

echo "PATH=$HOME/.local/bin:$PATH" >> ~/.bashrc
source ~/.bashrc

Set up Docker

sudo apt install docker.io
sudo systemctl start docker
sudo systemctl enable docker
docker --version

Set up Selenium Example

git clone https://github.com/scrapinghub/crawlera-headless-proxy.git
cd crawlera-headless-proxy/examples/selenium
sudo pipenv sync

Update docker-compose.yml with API key and scrapinghub/ prefix

version: "2"
services:
  hub:
    image: selenium/hub:3
    depends_on:
      - headless-proxy
    networks:
      default:
        aliases:
          - hub
    ports:
      - 4444:4444

  chrome:
    image: selenium/node-chrome:3
    depends_on:
      - hub
    environment:
      GRID_TIMEOUT: 180  # The default timeout of 30s might be too low for Selenium
      HUB_HOST: hub
    volumes:
      - /dev/shm:/dev/shm

  headless-proxy:
    image: scrapinghub/crawlera-headless-proxy
    networks:
      default:
        aliases:
          - proxy
    environment:
      CRAWLERA_HEADLESS_APIKEY: API_KEY_OMITTED
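Before running the example, it can help to confirm that the hub port published in the compose file (4444) accepts connections. This is a minimal sketch, assuming the hub is reachable on localhost; `wait_for_port` is a hypothetical helper, not part of the repository, and an open TCP port does not guarantee that the grid is fully ready.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError if the port is closed or unreachable.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)  # brief pause before retrying
    return False


if __name__ == "__main__":
    # localhost:4444 is assumed from the "4444:4444" port mapping above.
    print("hub reachable:", wait_for_port("localhost", 4444, timeout=5.0))
```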

Ran the example

pipenv run ./run-example.py

Thanks!

robotmammoth


peixoto posted over 6 years ago Admin Best Answer

Hi there!

The error you are receiving is actually expected.

Your current Crawlera plan is C10, which does not allow the use of the X-Crawlera-Profile header, as described here: https://doc.scrapinghub.com/crawlera.html#request-headers

You may remove the header or upgrade your plan to a higher tier in order to use it.
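As an illustration of the first option, the idea of dropping the header can be sketched in Python. This thread does not show where the headless proxy picks up the header, so the sketch below only demonstrates the filtering step on a plain dictionary; `strip_headers` and the sample values are hypothetical and do not reflect the proxy's own configuration format.

```python
from typing import Dict, Iterable

# Header named in the error message in this thread.
DISALLOWED = ("X-Crawlera-Profile",)


def strip_headers(
    headers: Dict[str, str], disallowed: Iterable[str] = DISALLOWED
) -> Dict[str, str]:
    """Return a copy of headers without the disallowed names (case-insensitive)."""
    blocked = {name.lower() for name in disallowed}
    return {k: v for k, v in headers.items() if k.lower() not in blocked}


if __name__ == "__main__":
    # Illustrative values only; not taken from the example's configuration.
    headers = {"X-Crawlera-Profile": "desktop", "Accept-Language": "en-US"}
    print(strip_headers(headers))  # prints: {'Accept-Language': 'en-US'}
```

The comparison is case-insensitive because the error message reports the header in lower case while the documentation writes it in mixed case.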

Hope this helps your project.
