I've got a spider on Scrapy Cloud that I request to start at a specific time, for example 2021-04-10 12:05:45 UTC.
However, my logs show that each time the spider doesn't even initiate until ~30 seconds later, and then runs for another ~15 seconds or so before it actually starts crawling. For my application, I really need to shorten this delay.
Is there any way to minimize this delay, especially the time between my start command and the process actually starting?
Thanks.
Log opened.
[scrapy.log] Scrapy 1.3.3 started
[scrapy.utils.log] Scrapy 1.3.3 started (bot: stocknews)
[scrapy.utils.log] Overridden settings: {'NEWSPIDER_MODULE': 'stocknews.spiders', 'STATS_CLASS': 'sh_scrapy.stats.HubStorageStatsCollector', 'LOG_LEVEL': 'INFO', 'SPIDER_MODULES': ['stocknews.spiders'], 'AUTOTHROTTLE_ENABLED': True, 'LOG_ENABLED': False, 'MEMUSAGE_LIMIT_MB': 950, 'TELNETCONSOLE_HOST': '0.0.0.0', 'BOT_NAME': 'stocknews', 'MEMUSAGE_ENABLED': True}
[scrapy_dotpersistence] Syncing .scrapy directory from s3://scrapinghub-app-dash-addons/org-125731/184920/dot-scrapy/sec/
[scrapy.middleware] Enabled extensions:
[scrapy.middleware] Enabled downloader middlewares:
[scrapy.middleware] Enabled spider middlewares:
[scrapy.middleware] Enabled item pipelines:
[scrapy.core.engine] Spider opened
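To make the numbers concrete, here is a rough sketch of how the two delays described above break down. Only the requested start time comes from my job. The other two timestamps are assumptions built from the approximate ~30 s and ~15 s figures, and `startup_delays` is just a hypothetical helper name, not part of Scrapy or Scrapy Cloud.

```python
from datetime import datetime, timedelta, timezone

# Time at which the job start was requested (from the question above).
requested = datetime(2021, 4, 10, 12, 5, 45, tzinfo=timezone.utc)

# Assumed timestamps, derived from the approximate figures above
# (~30 s until the process initiates, ~15 s more until crawling begins).
process_started = requested + timedelta(seconds=30)
crawl_started = process_started + timedelta(seconds=15)


def startup_delays(requested, process_started, crawl_started):
    """Split the total startup latency into its two phases, in seconds."""
    return {
        # Time between the start request and the process initiating.
        "queue_delay_s": (process_started - requested).total_seconds(),
        # Time between the process initiating and the first crawl activity.
        "init_delay_s": (crawl_started - process_started).total_seconds(),
        # End-to-end delay from request to actual crawling.
        "total_delay_s": (crawl_started - requested).total_seconds(),
    }


delays = startup_delays(requested, process_started, crawl_started)
for name, seconds in delays.items():
    print(f"{name}: {seconds:.0f}")
```

The first phase (`queue_delay_s`) is the one I most need to reduce.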