My job outcome is repeatedly set to cancelled (stalled) after the scraping is over and the scrapy_dotpersistence addon syncs the .scrapy directory to S3:
[scrapy_dotpersistence] Syncing .scrapy directory to s3://scrapinghub-app-dash-addons/org-176226/[...]/dot-scrapy/immo[...]/
1090: 2017-12-26 17:50:02 INFO [scrapy.crawler] Received SIGTERM, shutting down gracefully. Send again to force
I tried to delete the httpcache folder in the console, but the syncing still takes over an hour and the job gets cancelled anyway.
How can I solve this issue? Can I "reset" the S3 folder directly?
0 Votes
nestor posted
almost 7 years ago
Admin
Best Answer
Jobs will get cancelled if they're not doing anything for an hour; you could add some log output every hour or so, so that the job doesn't get cancelled.
0 Votes
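To illustrate the suggestion above: since the stall happens after the spider has already closed, while the blocking S3 sync is running, any periodic log has to come from something that outlives the spider. A minimal sketch of such a heartbeat, written as a Scrapy extension with a daemon thread, could look like the following; the module path, class name, and the HEARTBEAT_INTERVAL setting are made up for this example.

# heartbeat.py -- minimal sketch of the "log something every hour" idea.
# A daemon thread keeps emitting log lines even while a blocking
# post-crawl task (such as the .scrapy S3 sync) is still running.
import logging
import threading

from scrapy import signals

logger = logging.getLogger(__name__)


class HeartbeatLogger:
    """Log a heartbeat message every `interval` seconds."""

    def __init__(self, interval):
        self.interval = interval

    @classmethod
    def from_crawler(cls, crawler):
        # HEARTBEAT_INTERVAL is a hypothetical setting name used only here.
        ext = cls(crawler.settings.getint("HEARTBEAT_INTERVAL", 1800))
        crawler.signals.connect(ext.engine_started, signal=signals.engine_started)
        return ext

    def engine_started(self):
        # Daemon thread: it keeps logging until the process exits, but it
        # will not keep the process alive on its own.
        threading.Thread(target=self._beat, daemon=True).start()

    def _beat(self):
        ticker = threading.Event()
        # Event.wait() returns False on timeout, so this logs forever
        # at the configured interval until the process ends.
        while not ticker.wait(self.interval):
            logger.info("heartbeat: job still alive")

Enabling it would then be a matter of registering the extension in the project settings, for example:

# settings.py (hypothetical project layout)
EXTENSIONS = {
    "myproject.extensions.heartbeat.HeartbeatLogger": 500,
}
HEARTBEAT_INTERVAL = 1800  # every 30 minutes, comfortably under the 1-hour stall limit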
4 Comments
chops posted
almost 7 years ago
I'm no longer facing this issue, because I've deleted the old project and created a new one.
Is it possible to use my own S3 credentials for scrapy_dotpersistence?
0 Votes
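On the question of supplying your own S3 credentials: the thread does not answer it, but as far as I recall the scrapy_dotpersistence extension reads its bucket and credentials from project settings. The setting names below are from memory and should be checked against the README of the version you run; treat them as assumptions rather than confirmed API.

# settings.py -- hedged sketch; verify setting names against the
# scrapy_dotpersistence README for your version before relying on them.
DOTSCRAPY_ENABLED = True

EXTENSIONS = {
    "scrapy_dotpersistence.DotScrapyPersistence": 0,
}

# Assumed setting names for pointing the addon at your own bucket:
ADDONS_S3_BUCKET = "my-own-bucket"           # your bucket name
ADDONS_AWS_ACCESS_KEY_ID = "..."             # your access key id
ADDONS_AWS_SECRET_ACCESS_KEY = "..."         # your secret key
ADDONS_AWS_USERNAME = "my-prefix"            # optional key prefix inside the bucket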
thriveni posted
almost 7 years ago
Admin
Do let us know if you are still facing the issue. I do not see any jobs getting stalled in the account.
0 Votes
chops posted
almost 7 years ago
Thanks for your answer. The spider is closed before the syncing starts. Where can I add this logging?
0 Votes
nestorposted
almost 7 years ago
Admin
Answer
Jobs will get cancelled if they're not doing anything for an hour; you could add some log output every hour or so, so that the job doesn't get cancelled.
0 Votes