In addition to Scrapy spiders, you can also run custom, standalone Python scripts on Scrapy Cloud. They need to be declared in the scripts section of your project's setup.py file.
⚠ Note: the deployed project must still be a Scrapy project. This is a limitation that will be removed in the future.
Here is a setup.py example for a project that ships a hello.py script:
from setuptools import setup, find_packages

setup(
    name='myproject',
    version='1.0',
    packages=find_packages(),
    scripts=['bin/hello.py'],
    entry_points={'scrapy': ['settings = myproject.settings']},
)
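For reference, a minimal bin/hello.py could look like the sketch below. The script body is illustrative, not part of the official example; any cmd_args supplied when the job is scheduled are expected to reach the script as ordinary command-line arguments.

#!/usr/bin/env python
"""Minimal standalone script shipped with the project (illustrative only)."""
import sys


def main():
    # Arguments passed via cmd_args at scheduling time are expected to
    # arrive here as regular command-line arguments.
    print('hello from Scrapy Cloud, args:', sys.argv[1:])


if __name__ == '__main__':
    main()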
After you deploy your project, you will see the py:hello.py script on the Zyte dashboard, in the Run pop-up dialog and in the Add periodic job pop-up dialog.
It’s also possible to schedule a script via the Scrapy Cloud API:
curl -u API_KEY: -X POST https://app.zyte.com/api/schedule.json -d "project=123" -d "spider=py:hello.py" -d "cmd_args=-a --loglevel=10 x y"
And with the python-scrapinghub library:
from scrapinghub import Connection
conn = Connection('API_KEY')
project = conn[123]
project.schedule('py:hello.py', cmd_args='-a --loglevel=10 x y')
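The example above uses the library's legacy Connection interface. With the newer ScrapinghubClient API from the same library, the call would look roughly like the sketch below; the cmd_args keyword is assumed to be accepted by jobs.run in the same way as on the legacy endpoint.

from scrapinghub import ScrapinghubClient

client = ScrapinghubClient('API_KEY')
project = client.get_project(123)
# Schedule the script; cmd_args is assumed to be forwarded to the
# script as its command-line arguments, as in the examples above.
job = project.jobs.run('py:hello.py', cmd_args='-a --loglevel=10 x y')
print(job.key)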