sergei-sh

The spider scrapes quite a big website, and it stops after 24 hours of running, reporting "Received SIGTERM, shutting down gracefully. Send again to force". I'm sure I didn't stop it myself. This looks like a memory-leak problem, but I made an effort to check for that, and the following argues against it:
a) gc shows the number of tracked objects oscillating around the same values (roughly the 50,000-71,000 range) when checked periodically (in the spider_idle handler).
b) tracemalloc also shows some leaks at the top of its list, but their size doesn't grow; the largest one, in w3lib, oscillates within 1-10 MB between checks at the same place, compared with the first check.
c) The same spider running on my PC doesn't stop; moreover, it shows almost stable memory consumption over time (217 MB -> 218 MB growth over 12 hours).
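
For context, here is a simplified sketch of how checks like (a) and (b) might be wired up with Scrapy's spider_idle signal; the spider name and check_memory method below are illustrative, not the actual project code:

```python
import gc
import tracemalloc

import scrapy
from scrapy import signals


class BigSiteSpider(scrapy.Spider):
    # Illustrative skeleton; only the periodic memory check is shown.
    name = "bigsite"

    @classmethod
    def from_crawler(cls, crawler, *args, **kwargs):
        spider = super().from_crawler(crawler, *args, **kwargs)
        tracemalloc.start()
        # Run the check every time the spider goes idle.
        crawler.signals.connect(spider.check_memory, signal=signals.spider_idle)
        return spider

    def check_memory(self, spider):
        # Number of objects currently tracked by the garbage collector.
        self.logger.info("gc objects: %d", len(gc.get_objects()))
        # Top allocation sites by size, as reported by tracemalloc.
        snapshot = tracemalloc.take_snapshot()
        for stat in snapshot.statistics("lineno")[:5]:
            self.logger.info("tracemalloc: %s", stat)
```
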
Best Answer

nestor said almost 6 years ago

Hi Sergei,
There's a 24-hour job runtime limit on free accounts; to remove that limitation you'll have to purchase a Scrapy Cloud unit.

文 姜

Hey nestor,
I am having the same issue, but I am not using Scrapy Cloud; I use scrapyd to run my Scrapy spiders. Do you by any chance know of any other potential reasons?
Thanks,
Wendy

nestor said almost 6 years ago

Hard to say, Wendy, without looking at the logs. You could start with https://doc.scrapy.org/en/latest/topics/signals.html to find out the possible reasons; I would also suggest posting your question on Stack Overflow with the 'scrapy' tag.
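
As a starting point, here is a minimal sketch of hooking the spider_closed signal to log the close reason, which usually narrows down why a crawl ended (the class name is illustrative):

```python
import scrapy
from scrapy import signals


class DiagnosticSpider(scrapy.Spider):
    name = "diagnostic"

    @classmethod
    def from_crawler(cls, crawler, *args, **kwargs):
        spider = super().from_crawler(crawler, *args, **kwargs)
        # Log why the spider was closed.
        crawler.signals.connect(spider.on_spider_closed, signal=signals.spider_closed)
        return spider

    def on_spider_closed(self, spider, reason):
        # 'reason' is 'finished' for a normal end, 'shutdown' for a graceful
        # SIGTERM/engine stop, or e.g. 'closespider_timeout' if a CLOSESPIDER_*
        # limit kicked in.
        self.logger.info("Spider closed, reason: %s", reason)
```

If the logged reason is 'shutdown', that typically means something outside the Scrapy process (scrapyd, a supervisor, the host) sent the SIGTERM, so the scrapyd and system logs are the next place to look.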