I'm frustrated, and I'm sure I'm missing something really dumb. I have a spider that I've used for about a year in some experiments to populate Solr through a pipeline. I stripped out the pipeline, moved the spider to a completely new experimental repo, deployed it successfully, and ran it successfully in Scrapinghub, but the job shows no items once it finishes.
Do I have to have the spider write JSON to stdout or something? Does Scrapinghub just pick up whatever the spider yields and ignore everything else? Help! What am I missing?
BTW - my spider / experimental repo is here: https://github.com/davidlday/scrapinghub-experiment/tree/init