If viewing the logs is not enough, the Page Storage Addon can help you inspect the responses Scrapy Cloud receives during a job's crawl.


1 - Go to https://app.zyte.com/p/<PROJECT_ID>/addons/page_storage, enable the addon, and configure its settings:



Page storage mode:

  • Cache: Items expire after a month
  • Versioned Cache: Multiple copies are retained, and each one expires after a month


2 - Stored pages are found as collections at https://app.zyte.com/p/<PROJECT_ID>/collections/.






3 - Each stored page can be downloaded as a JSON object or viewed from Dash. To check the HTML in a browser, save the contents of the body field to a new file with an .html extension and open that file in any browser.


Fields available per stored page as JSON:


body: HTML code of the page
_encoding: 
cookies:
url: URL of the response
_jobid: ID of the job the response came from
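

Step 3 can also be scripted. The following is a minimal Python sketch, not an official tool. It assumes a stored page has already been downloaded to a local JSON file, that body holds the page HTML as text, and that _encoding (when present) names the character encoding of that text. The function name save_page_body and the file names are hypothetical.

```python
import json
from pathlib import Path


def save_page_body(page_json_path, html_path):
    """Write the body field of a downloaded stored page to an HTML file.

    Returns the url and _jobid fields so the caller can tell which
    response the file came from.
    """
    page = json.loads(Path(page_json_path).read_text(encoding="utf-8"))

    # Assumption: body is the page HTML as text, and _encoding names its
    # character encoding. Fall back to UTF-8 when _encoding is missing or empty.
    encoding = page.get("_encoding") or "utf-8"

    Path(html_path).write_text(page["body"], encoding=encoding)
    return page.get("url"), page.get("_jobid")
```

For example, save_page_body("page.json", "page.html") writes page.html, which can then be opened in any browser.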