The spider scrapes quite a big website, and it stops after 24 hours of running, reporting "Received SIGTERM, shutting down gracefully. Send again to force". I'm sure I didn't stop it myself. This looks like a memory-leak problem, but I put some effort into checking that, and the following argues against it (roughly how I run these checks is sketched below):
a) gc shows the number of objects oscillating around the same values (50,000-71,000 range) when checked periodically (in the spider_idle method)
b) tracemalloc also shows some leaks at the top of the list, but their size doesn't grow; the top-size leak (in w3lib) oscillates within 1-10 MB between checks at the same place, compared to the first check
c) the same spider running on my PC doesn't stop; moreover, it shows almost stable memory consumption over time (217 MB → 218 MB over 12 hours)
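For reference, this is roughly how I run those periodic checks, reduced to a minimal sketch (the spider name, start URL and parse logic are placeholders, not the real spider):

```python
import gc
import tracemalloc

import scrapy
from scrapy import signals


class BigSiteSpider(scrapy.Spider):
    # Placeholder name and start URL, for illustration only.
    name = "bigsite"
    start_urls = ["https://example.com/"]

    @classmethod
    def from_crawler(cls, crawler, *args, **kwargs):
        spider = super().from_crawler(crawler, *args, **kwargs)
        # Run the memory check whenever the spider goes idle.
        crawler.signals.connect(spider.check_memory, signal=signals.spider_idle)
        tracemalloc.start()
        spider.first_snapshot = tracemalloc.take_snapshot()
        return spider

    def check_memory(self, spider):
        # a) total number of objects tracked by the garbage collector
        self.logger.info("gc objects: %d", len(gc.get_objects()))
        # b) largest allocation growth compared to the first snapshot
        snapshot = tracemalloc.take_snapshot()
        for stat in snapshot.compare_to(self.first_snapshot, "lineno")[:5]:
            self.logger.info("tracemalloc: %s", stat)

    def parse(self, response):
        # Real crawling logic omitted.
        pass
```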
nestor (Admin) posted over 7 years ago · Best Answer
Hi Sergei,
There's a 24-hour job runtime limit on free accounts; to remove that limit you'll have to purchase a Scrapy Cloud unit.
3 Comments
nestor (Admin) posted over 7 years ago
Hard to say, Wendy, without looking at the logs. You could start with https://doc.scrapy.org/en/latest/topics/signals.html to find out the possible reasons, and I'd also suggest posting your question on Stack Overflow with the 'scrapy' tag.
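For example, a small extension along these lines (just a sketch; the ShutdownDiagnostics class and its module path below are made-up names, not part of Scrapy) logs the close reason that Scrapy reports, which usually tells a SIGTERM shutdown apart from, say, the memory-usage extension killing the job:

```python
from scrapy import signals


class ShutdownDiagnostics:
    """Log why the spider was closed ('finished', 'cancelled', 'shutdown', ...)."""

    @classmethod
    def from_crawler(cls, crawler):
        ext = cls()
        crawler.signals.connect(ext.spider_closed, signal=signals.spider_closed)
        return ext

    def spider_closed(self, spider, reason):
        # A SIGTERM shutdown is reported here as reason='shutdown', while the
        # built-in memory usage extension closes the spider with 'memusage_exceeded'.
        spider.logger.info("Spider closed, reason: %s", reason)
```

Enable it via the EXTENSIONS setting (e.g. "myproject.extensions.ShutdownDiagnostics": 500, with whatever module path you use in your project), then check the "Spider closed" line and the finish_reason entry in the crawl stats at the end of the log.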
文 姜 posted over 7 years ago
Hey nestor,
I am having the same issue, but I am not using Scrapy Cloud; I use scrapyd to run my Scrapy spiders. Do you by any chance know of any other potential reasons?
Thanks,
Wendy