In addition to Scrapy spiders, you can also run custom, standalone Python scripts on Scrapy Cloud. They need to be declared in the scripts section of your project's setup.py file.
⚠ Note that the deployed project still needs to be a Scrapy project. This is a limitation that will be removed in the future.
Here is a setup.py example for a project that ships a hello.py script:
from setuptools import setup, find_packages

setup(
    name='myproject',
    version='1.0',
    packages=find_packages(),
    scripts=['bin/hello.py'],
    entry_points={'scrapy': ['settings = myproject.settings']},
)
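The contents of bin/hello.py are not part of this example; a minimal sketch could look like the following (the script body is an assumption, shown only for illustration). Any arguments passed when the job is scheduled arrive as ordinary command-line arguments:

#!/usr/bin/env python
# Minimal sketch of bin/hello.py (assumed contents, for illustration only).
# Arguments passed when scheduling the job show up in sys.argv.
import sys

def main():
    print('Hello from Scrapy Cloud!')
    print('Arguments received:', sys.argv[1:])

if __name__ == '__main__':
    main()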
After you deploy your project, you will see the py:hello.py script on the Zyte dashboard, in the Run pop-up dialog and in the Add periodic job pop-up dialog.
It’s also possible to schedule a script via the Zyte API:
curl -u API_KEY: -X POST https://app.zyte.com/api/schedule.json -d "project=123" -d "spider=py:hello.py" -d "cmd_args=-a --loglevel=10 x y"
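The same HTTP call can also be made from Python with the requests library; the snippet below is just a sketch of the curl command above, reusing its placeholder API_KEY, project ID and arguments:

# Sketch of the schedule.json call above, using the requests library.
import requests

response = requests.post(
    'https://app.zyte.com/api/schedule.json',
    auth=('API_KEY', ''),  # API key as the username, empty password
    data={
        'project': 123,
        'spider': 'py:hello.py',
        'cmd_args': '-a --loglevel=10 x y',
    },
)
print(response.json())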
And with the python-scrapinghub library:
from scrapinghub import Connection

conn = Connection('API_KEY')
project = conn[123]
project.schedule('py:hello.py', cmd_args='-a --loglevel=10 x y')