
Scrapy Cloud Job Config

I am trying to create a CrawlSpider. I defined the spider class as:

class SiteSpider(Args[SiteParams], CrawlSpider):


SiteParams is a Pydantic model defining the spider's inputs.

I want to define CrawlSpider rules that use these input variables. How would I wire that together? It looks like CrawlSpider's __init__ is what compiles the rules, but that __init__ chain is probably also where the args get parsed. Also (maybe this is a separate question): if I want one of the inputs to be a JSON document rather than a string or number, are there any programmatic examples of calling Scrapy Cloud with such a configuration to start a job?

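On the second question, the closest I have gotten is the scrapinghub Python client (pip install scrapinghub). A sketch of what I am trying, where the API key, project ID, and the site_config argument name are all placeholders, and I am assuming the JSON document has to be serialized to a string because job arguments are strings:

```python
import json


def build_job_args(site_config: dict) -> dict:
    # Scrapy Cloud job arguments are plain strings, so the JSON document
    # is serialized here; the spider's Pydantic model can parse the string
    # back into a dict (e.g. with a validator or pydantic's Json type).
    return {"site_config": json.dumps(site_config)}


def start_job(api_key: str, project_id: int, spider: str, site_config: dict):
    # Imported lazily so the helper above is usable without the client
    # installed.
    from scrapinghub import ScrapinghubClient

    client = ScrapinghubClient(api_key)
    project = client.get_project(project_id)
    # jobs.run() schedules the spider with the given job arguments.
    return project.jobs.run(spider, job_args=build_job_args(site_config))
```

Is that the intended way to pass a JSON input, or is there a first-class mechanism for structured arguments?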