The job outcome indicates whether the job succeeded or failed. By default, it contains the value of the spider close reason from Scrapy. It is available in the table of completed jobs.
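If you need to read the outcome programmatically, the python-scrapinghub client exposes it as the job's close_reason metadata field. A minimal sketch, assuming the python-scrapinghub package is installed; the API key and job key below are placeholders:

```python
from scrapinghub import ScrapinghubClient

# Connect with your Scrapy Cloud API key (placeholder value).
client = ScrapinghubClient("YOUR_API_KEY")

# A job key has the form <project_id>/<spider_id>/<job_id>.
job = client.get_job("123456/1/2")

# The outcome is stored in the job metadata as close_reason.
print(job.metadata.get("close_reason"))  # e.g. 'finished', 'cancelled', ...
```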
Available job outcomes
Here is a summary of the most common job outcomes:
finished
The job finished successfully. However, it may have produced errors, which you can inspect through the logs.
failed
The job failed to start, typically due to a bug in the spider’s code. Check the last lines of the job log for more information.
cancelled
The job was cancelled from the dashboard, through the API, or by the system after it became inactive and failed to produce anything (not even log entries) for an hour.
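As a sketch of cancelling a job through the API (again with placeholder credentials and job key), the python-scrapinghub client provides a cancel() method on job objects:

```python
from scrapinghub import ScrapinghubClient

client = ScrapinghubClient("YOUR_API_KEY")
job = client.get_job("123456/1/2")

# Request cancellation of a running job; its outcome becomes 'cancelled'.
job.cancel()
```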
cancelled (24h limit)
The job was cancelled because it exceeded the 24-hour running time limit imposed on free organizations. The number shown in the outcome may differ, for example if the limit changes in the future.
cancelled (stalled)
The job was cancelled because it was in the running state but did not produce any logs, requests, or items for an hour.
cancel_timeout
The job failed to shut down gracefully after cancellation, taking more than 5 minutes to stop.
shutdown
The spider was cancelled prematurely, typically from code. shutdown is the default close reason (outcome) that Scrapy uses in such cases. It is what you get, for example, when you cancel a Scrapy spider by pressing Ctrl-C.
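For illustration, a spider can also set its own close reason (and therefore the job outcome) by raising Scrapy's CloseSpider exception from a callback; the spider name, URL, and reason string below are made up:

```python
import scrapy
from scrapy.exceptions import CloseSpider


class ExampleSpider(scrapy.Spider):
    name = "example"
    start_urls = ["https://example.com"]

    def parse(self, response):
        # Raising CloseSpider stops the spider with the given close reason,
        # which Scrapy Cloud then reports as the job outcome.
        if response.status == 403:
            raise CloseSpider("banned_by_target_site")
        yield {"title": response.css("title::text").get()}
```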
memusage_exceeded
The job was consuming too much memory, exceeding the limit (1 GB per unit), and it was cancelled by the system. This typically happens with spiders that don’t use memory efficiently (keeping state or references that grow quickly over time), and it most often manifests in long runs that crawl many pages. This outcome is triggered by Scrapy’s Memory Usage extension.
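The extension is configured through Scrapy settings. On Scrapy Cloud the memory limit is managed by the platform per unit, but as a rough sketch with illustrative values:

```python
# settings.py (values are illustrative)
MEMUSAGE_ENABLED = True      # enable the Memory Usage extension
MEMUSAGE_LIMIT_MB = 950      # close the spider with memusage_exceeded above this
MEMUSAGE_WARNING_MB = 800    # send a warning notification when this is crossed
MEMUSAGE_NOTIFY_MAIL = ["dev@example.com"]  # optional notification recipients
```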
killed by oom
The job was killed because it tried to consume more memory than was available to the process. This may happen if Scrapy’s Memory Usage extension is disabled, or when memory usage grows so fast that the extension cannot gracefully finish the process and set the memusage_exceeded outcome. Available memory is proportional to the number of units used to run the job.
banned
The job was terminated because the spider got banned from the target website. This outcome is often set by the Zyte Smart Proxy Manager (formerly Crawlera) extension.
slybot_fewitems_scraped
This outcome is specific to Portia spiders. The job was cancelled because it wasn’t scraping enough items. This is used in Portia to prevent infinite crawling loops. See Minimum items threshold for more details.
closespider_*
The closespider_errorcount, closespider_pagecount, closespider_timeout, and closespider_itemcount outcomes are set by Scrapy’s CloseSpider extension. Refer to its documentation for more details.
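As a sketch, each of these outcomes corresponds to a threshold setting of the CloseSpider extension; the values below are illustrative:

```python
# settings.py (thresholds are illustrative)
CLOSESPIDER_TIMEOUT = 3600       # closespider_timeout after 3600 seconds of running
CLOSESPIDER_ITEMCOUNT = 10000    # closespider_itemcount after 10,000 scraped items
CLOSESPIDER_PAGECOUNT = 50000    # closespider_pagecount after 50,000 responses
CLOSESPIDER_ERRORCOUNT = 10      # closespider_errorcount after 10 errors
```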
project_deleted
The job was killed because its project was deleted from Scrapy Cloud.
Deprecated job outcomes
The following outcomes were used in Scrapy Cloud but have been removed and should no longer be set.
no_reason
The job finished successfully but did not set an outcome explicitly. For Scrapy jobs the outcome is taken from the spider close reason (which defaults to finished), but non-Scrapy jobs that did not set an outcome would get this one.