
Job storage URL not being updated after job completes

I use the content from the storage URL (https://storage.scrapinghub.com/activity/nnnnnn/n/nnn?count=1&apikey=[api key]) to know when my task has finished. After starting the crawl, I poll the URL every few seconds and check the latest event.


I have just noticed that this isn't being updated in all cases when the job finishes.


It should say e.g.

{"job":"3****4/1/315","event":"job:completed","user":"jobrunner"}

when completed, but it is still reporting e.g.

{"job":"3****1/3/392","event":"job:started","user":"jobrunner"}

tens of minutes after the job has completed.
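
For reference, the polling I describe looks roughly like the sketch below. It is a minimal illustration, not my exact code: the function names (latest_event, fetch, wait_for_completion) and the interval and timeout values are made up for this post, and it assumes the response body is one JSON object per line, as in the samples above. The timeout is there so the poller gives up instead of waiting forever if "job:completed" never shows up.

```python
import json
import time
import urllib.request


def latest_event(body):
    """Return the "event" field of the first JSON object in the body, or None.

    Assumes one JSON object per line, as in the sample responses.
    """
    for line in body.splitlines():
        line = line.strip()
        if line:
            return json.loads(line).get("event")
    return None


def fetch(url):
    """Fetch the activity URL and return the response body as text."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8")


def wait_for_completion(url, fetch=fetch, interval=5.0, timeout=600.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll until the latest event is "job:completed".

    Returns True if the completed event is seen, False if `timeout`
    seconds pass without seeing it. `fetch`, `sleep` and `clock` are
    parameters only so the loop can be exercised without a network.
    """
    deadline = clock() + timeout
    while True:
        if latest_event(fetch(url)) == "job:completed":
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Calling wait_for_completion with the storage URL (project, spider and job IDs plus the API key filled in) returns False in the failing cases, because the latest event stays at "job:started".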


This all worked fine until 13th Sept, and now doesn't. Any ideas, please?


Could you provide real examples of jobs that are missing from the Activity API or Activity tab in the UI?

I have done some more research and re-posted this under a better title.
