I have several spiders that run periodically on Scrapy Cloud, and I need to write the scraped data to multiple locations.
I can do this when I run the spiders on my local machine by setting FEEDS in each spider's custom_settings:

    custom_settings = {
        "FEEDS": {
            # One entry per output location; each has its own options.
            "s3://systems_data/systems_sample_results_page/FILE1.jsonl": {"format": "jsonlines"},
            "s3://systems_data/systems_historical_results/FILE2.jsonl": {"format": "jsonlines"},
        }
    }
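For reference, the same two-destination mapping can be built from a list of URIs, which keeps the local setup easy to extend. This is only a minimal sketch of the local configuration described above: `build_feeds` is a hypothetical helper name (not part of Scrapy), and it assumes every destination uses the same export format.

```python
def build_feeds(uris, fmt="jsonlines"):
    """Map each output URI to its own feed-options dict (hypothetical helper)."""
    return {uri: {"format": fmt} for uri in uris}


# The two S3 locations from the question.
FEEDS = build_feeds([
    "s3://systems_data/systems_sample_results_page/FILE1.jsonl",
    "s3://systems_data/systems_historical_results/FILE2.jsonl",
])

# Intended to be assigned as a class attribute on the spider.
custom_settings = {"FEEDS": FEEDS}
```

Each key in FEEDS is one output location, so adding a third destination is just one more URI in the list.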
In Scrapy Cloud, there is a custom setting for FEED_URI but not for FEEDS. As far as I can tell, FEED_URI only allows writing to one location.
How do I write scraped data to multiple locations in Scrapy Cloud?
Locally, I am using macOS, Scrapy 2.9.0, and Python 3.8.8.
Aaron McGarvey