The spider scrapes quite a big website, and it stops after 24 hours of running, reporting "Received SIGTERM, shutting down gracefully. Send again to force". I'm sure I didn't stop it myself. This looks like a memory-leak problem, but I put some effort into checking that, and the following argues against it (roughly how I run these checks is sketched below):
a) gc shows the number of objects oscillating around the same values (50,000-71,000 range) when checked periodically (in the spider_idle method)
b) tracemalloc also shows some leaks at the top of the list, but their size doesn't grow; the top-size leak (in w3lib) oscillates within 1-10 MB between checks at the same place, compared to the first check
c) the same spider running on my PC doesn't stop; moreover, it shows almost stable memory consumption over time (217 MB → 218 MB over 12 hours)
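For reference, this is roughly how I run those periodic checks, reduced to a minimal sketch (the spider name, start URL and parse logic are placeholders, not the real spider):

```python
import gc
import tracemalloc

import scrapy
from scrapy import signals


class BigSiteSpider(scrapy.Spider):
    # Placeholder name and start URL, for illustration only.
    name = "bigsite"
    start_urls = ["https://example.com/"]

    @classmethod
    def from_crawler(cls, crawler, *args, **kwargs):
        spider = super().from_crawler(crawler, *args, **kwargs)
        # Run the memory check whenever the spider goes idle.
        crawler.signals.connect(spider.check_memory, signal=signals.spider_idle)
        tracemalloc.start()
        spider.first_snapshot = tracemalloc.take_snapshot()
        return spider

    def check_memory(self, spider):
        # a) total number of objects tracked by the garbage collector
        self.logger.info("gc objects: %d", len(gc.get_objects()))
        # b) largest allocation growth compared to the first snapshot
        snapshot = tracemalloc.take_snapshot()
        for stat in snapshot.compare_to(self.first_snapshot, "lineno")[:5]:
            self.logger.info("tracemalloc: %s", stat)

    def parse(self, response):
        # Real crawling logic omitted.
        pass
```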
nestor (Admin) posted over 7 years ago · Best Answer
Hi Sergei,
There's a 24-hour job runtime limit on free accounts; to remove that limit you'll have to purchase a Scrapy Cloud unit.
3 Comments
nestor (Admin) posted over 7 years ago
Hard to say, Wendy, without looking at the logs. You could start with https://doc.scrapy.org/en/latest/topics/signals.html to find out the possible reasons, and I'd also suggest posting your question on Stack Overflow with the 'scrapy' tag.
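For example, a small extension along these lines (just a sketch; the ShutdownDiagnostics class and its module path below are made-up names, not part of Scrapy) logs the close reason that Scrapy reports, which usually tells a SIGTERM shutdown apart from, say, the memory-usage extension killing the job:

```python
from scrapy import signals


class ShutdownDiagnostics:
    """Log why the spider was closed ('finished', 'cancelled', 'shutdown', ...)."""

    @classmethod
    def from_crawler(cls, crawler):
        ext = cls()
        crawler.signals.connect(ext.spider_closed, signal=signals.spider_closed)
        return ext

    def spider_closed(self, spider, reason):
        # A SIGTERM shutdown is reported here as reason='shutdown', while the
        # built-in memory usage extension closes the spider with 'memusage_exceeded'.
        spider.logger.info("Spider closed, reason: %s", reason)
```

Enable it via the EXTENSIONS setting (e.g. "myproject.extensions.ShutdownDiagnostics": 500, with whatever module path you use in your project), then check the "Spider closed" line and the finish_reason entry in the crawl stats at the end of the log.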
文 姜 posted over 7 years ago
Hey nestor,
I am having the same issue, but I am not using Scrapy Cloud; I use scrapyd to run my Scrapy spiders. Do you by any chance know of any other potential reasons?
Thanks,
Wendy