Changing the Deploy Environment With Scrapy Cloud Stacks

Modified on Fri, 12 Feb, 2021 at 1:54 AM

You can select the runtime environment for your spiders from a list of predefined stacks. Each stack is a runtime environment that ships with specific versions of a set of packages.

For example, if you need your spiders to use specific versions of Scrapy and Python (let's say Scrapy 1.6 + Python 3), set the proper stack in your project's scrapinghub.yml file:

projects:
  default: 12345
stacks:
  default: scrapy:1.6-py3

What does the stack name mean?

Stack names consist of a name, a version and, in some cases, a release date:

scrapy:1.6-py3: contains Scrapy version 1.6 running on Python 3
scrapy:1.5: contains Scrapy 1.5 running on Python 2.7 (for the stacks up to version 2.0, the absence of -py3 suffix indicates that the stack runs on Python 2.7)
scrapy:2.0-20200325: contains Scrapy 2.0 running on Python 3. The date indicates when the stack was released.
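The naming rules above can be sketched as a small parser. This is only an illustration: parse_stack, python_line and the Stack tuple are hypothetical helpers, not part of any Scrapy Cloud tool, and the Python version rule assumes that "up to version 2.0" means Scrapy versions below 2.0.

```python
from typing import NamedTuple, Optional


class Stack(NamedTuple):
    name: str                    # e.g. "scrapy" or "hworker"
    version: Optional[str]       # e.g. "1.6"; None if the tag has no version
    py3_suffix: bool             # True if the tag contains "-py3"
    release_date: Optional[str]  # e.g. "20200325"; None if absent


def parse_stack(stack: str) -> Stack:
    """Split a stack name such as 'scrapy:1.6-py3' into its parts."""
    name, _, tag = stack.partition(":")
    version, py3, date = None, False, None
    for part in tag.split("-"):
        if part == "py3":
            py3 = True
        elif len(part) == 8 and part.isdigit():
            date = part  # eight digits are treated as a YYYYMMDD release date
        elif part:
            version = part
    return Stack(name, version, py3, date)


def python_line(stack: Stack) -> Optional[str]:
    """Guess the Python line of a scrapy stack (assumption: < 2.0 without -py3 is 2.7)."""
    if stack.name != "scrapy" or stack.version is None:
        return None  # the article gives no rule for other stacks
    if stack.py3_suffix:
        return "3"
    major = int(stack.version.split(".")[0])
    return "2.7" if major < 2 else "3"
```

For example, python_line(parse_stack("scrapy:1.5")) gives "2.7", while the same call for scrapy:2.0-20200325 gives "3".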

Where can I see which stack is used in my project?

Go to your project's Code & Deploys page, select the latest build, and check the value of the Stack property.

My stack is called hworker:20160708. What does that mean?

The hworker stack is used by default for organizations that were created before 2016-06-28 12:00 UTC and is maintained for compatibility reasons only. If you are getting this stack for your new projects, please define a more modern one (scrapy:2.3, for example) in your scrapinghub.yml file, as described above.

What's the default stack used for my deploys?

That depends on when your organization was created:

before 2016-06-28 12:00 UTC: hworker
after 2016-06-28 12:00 UTC: right now it's scrapy:2.3, but this is usually updated with each major Scrapy release
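As a rough sketch, the rule above is a single comparison against the cutoff timestamp. The default_stack function is hypothetical and only mirrors the two cases listed here. It treats an organization created exactly at the cutoff as "after" (an assumption, since the article does not cover that case), and it takes the current scrapy default as a parameter because that value changes over time.

```python
from datetime import datetime, timezone

# Cutoff taken from the article: 2016-06-28 12:00 UTC
CUTOFF = datetime(2016, 6, 28, 12, 0, tzinfo=timezone.utc)


def default_stack(org_created_at: datetime,
                  current_scrapy_default: str = "scrapy:2.3") -> str:
    """Return the default stack family for an organization created at the given time."""
    if org_created_at.tzinfo is None:
        # Naive datetimes cannot be compared with the UTC cutoff
        raise ValueError("org_created_at must be timezone-aware")
    if org_created_at < CUTOFF:
        return "hworker"
    return current_scrapy_default
```

Calling default_stack(datetime(2015, 1, 1, tzinfo=timezone.utc)) returns "hworker".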

Where can I find the list of available stacks?

There are two main types of stacks:

scrapy: features the latest stable version of Scrapy along with all the basic requirements that you need to run a full-featured Scrapy spider
- Check out the Scrapy stack releases
hworker: provides backward compatibility with the legacy Scrapy Cloud platform. This stack is used by default for organizations that were created before 2016-06-28 12:00 UTC, but it's not recommended for new organizations.
- Check out the Hworker stack releases

What packages are installed in a given stack?

To see the packages in a given stack, open the stack repository on GitHub, switch to the branch that corresponds to the version you're interested in, and look at its requirements.txt file.

For example, let's say that you want to know which packages are installed on the scrapy:2.3 stack:

Go to the Scrapy stack repository: https://github.com/scrapinghub/scrapinghub-stack-scrapy
Using the GitHub UI, select the branch called branch-2.3

Then check the dependencies listed in that branch's requirements.txt file
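If you prefer to jump straight to the file, the steps above can be sketched as a URL builder. The requirements_url helper is hypothetical. It relies on GitHub's usual /blob/<branch>/<path> URL layout and assumes that other versions follow the same branch-<version> naming as the branch-2.3 example, which you should verify in the repository.

```python
# Repository URL as given in the article
REPO = "https://github.com/scrapinghub/scrapinghub-stack-scrapy"


def requirements_url(version: str) -> str:
    """Build the GitHub URL of requirements.txt for a given stack version.

    Assumes the branch is named 'branch-<version>', as in the 2.3 example.
    """
    return f"{REPO}/blob/branch-{version}/requirements.txt"


print(requirements_url("2.3"))
```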

What if I need extra packages?

If your project depends on Python packages not shipped in any of the stacks, check out Deploying Python Dependencies for your Projects in Scrapy Cloud. If your project has non-Python dependencies (binary ones, for example), check out Deploying Custom Docker images on Scrapy Cloud.