
Spider takes 30 seconds to start

Hello,


I've got a spider on Scrapy Cloud that I request to start at 2021-04-10 12:05:45 UTC.


However, my logs show that the spider doesn't even start until ~30 seconds later each time, and then runs for another ~15 seconds or so before it actually starts crawling. For my application, I really need to shorten this delay.

Is there any way to minimize this delay, especially the time between my start command and the process actually starting?

Thanks.



2021-04-10 12:06:21 INFO Log opened.
2021-04-10 12:06:21 INFO [scrapy.log] Scrapy 1.3.3 started
2021-04-10 12:06:21 INFO [scrapy.utils.log] Scrapy 1.3.3 started (bot: stocknews)
2021-04-10 12:06:21 INFO [scrapy.utils.log] Overridden settings: {'NEWSPIDER_MODULE': 'stocknews.spiders', 'STATS_CLASS': 'sh_scrapy.stats.HubStorageStatsCollector', 'LOG_LEVEL': 'INFO', 'SPIDER_MODULES': ['stocknews.spiders'], 'AUTOTHROTTLE_ENABLED': True, 'LOG_ENABLED': False, 'MEMUSAGE_LIMIT_MB': 950, 'TELNETCONSOLE_HOST': '0.0.0.0', 'BOT_NAME': 'stocknews', 'MEMUSAGE_ENABLED': True}
2021-04-10 12:06:21 INFO [scrapy_dotpersistence] Syncing .scrapy directory from s3://scrapinghub-app-dash-addons/org-125731/184920/dot-scrapy/sec/
2021-04-10 12:06:34 INFO [scrapy.middleware] Enabled extensions:
2021-04-10 12:06:34 INFO [scrapy.middleware] Enabled downloader middlewares:
2021-04-10 12:06:34 INFO [scrapy.middleware] Enabled spider middlewares:
['sh_scrapy.diskquota.DiskQuotaSpiderMiddleware',
 'sh_scrapy.middlewares.HubstorageSpiderMiddleware',
 'scrapy_deltafetch.middleware.DeltaFetch',
 'scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy_deltafetch.DeltaFetch',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2021-04-10 12:06:34 INFO [scrapy.middleware] Enabled item pipelines:
2021-04-10 12:06:34 INFO [scrapy.core.engine] Spider opened
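
To put exact numbers on the two gaps, here is a minimal sketch that computes them from the start time given in the post and the timestamps in the log above. The variable names and the `gap_seconds` helper are only illustrative; they are not part of Scrapy or Scrapy Cloud.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Timestamps copied from the post and the log above (all UTC).
requested = datetime.strptime("2021-04-10 12:05:45", FMT)      # start requested
log_opened = datetime.strptime("2021-04-10 12:06:21", FMT)     # "Log opened."
spider_opened = datetime.strptime("2021-04-10 12:06:34", FMT)  # "Spider opened"


def gap_seconds(start, end):
    """Return the whole number of seconds between two datetimes."""
    return int((end - start).total_seconds())


# Gap 1: from the start request until the job process writes its first log line.
startup_delay = gap_seconds(requested, log_opened)

# Gap 2: from the first log line until the engine reports "Spider opened".
init_delay = gap_seconds(log_opened, spider_opened)

total_delay = gap_seconds(requested, spider_opened)

print(f"request -> log opened:       {startup_delay} s")
print(f"log opened -> spider opened: {init_delay} s")
print(f"request -> spider opened:    {total_delay} s")
```

For this run the gaps come out to 36 s and 13 s, 49 s in total. The 13 s falls entirely between the scrapy_dotpersistence "Syncing .scrapy directory" line and the next log line, which suggests, but does not prove, that the sync accounts for most of the second gap.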
