We need a static URL for scraping results.

Posted almost 5 years ago by İsmail Aras

Answered

We need a static URL for scraping results. Right now the URL changes after every run. What is the solution for this?


<project_id>/<spider_id>/<job_id> 

How can I make <job_id> default to the latest job?



thriveni posted almost 5 years ago Admin Best Answer

You can use the Scrapinghub Jobs API or the python-scrapinghub library. The library interacts with Scrapy Cloud, so you can use it inside your spider to get the job list and pick the latest job.


Thanks,

Thriveni.
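As a rough sketch of this approach: the Jobs API can list a spider's finished jobs, and since job keys have the form `<project_id>/<spider_id>/<job_id>` with the trailing number increasing on every run, the largest `<job_id>` is the latest. The endpoint path and the `id` field below are assumptions based on the legacy Scrapinghub Jobs API, so verify them against the current docs; the API key, project number, and spider name are placeholders.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://app.scrapinghub.com/api"

def jobs_list_url(apikey, project, spider, state="finished"):
    """Build a Jobs API URL listing a spider's jobs in a given state."""
    query = urllib.parse.urlencode(
        {"project": project, "spider": spider, "state": state, "apikey": apikey}
    )
    return f"{API_ROOT}/jobs/list.json?{query}"

def latest_job_id(jobs):
    """Pick the newest job from a list of job dicts.

    Job ids look like '<project_id>/<spider_id>/<job_id>'; the trailing
    number increases with every run, so the largest one is the latest.
    """
    return max(jobs, key=lambda j: int(j["id"].rsplit("/", 1)[-1]))["id"]

def fetch_latest_job_id(apikey, project, spider):
    """Query the Jobs API and return the id of the latest finished job."""
    with urllib.request.urlopen(jobs_list_url(apikey, project, spider)) as resp:
        payload = json.load(resp)
    return latest_job_id(payload["jobs"])
```

With that id in hand you always have a stable way to reach the newest results, regardless of how the per-run URL changes.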



2 Comments


thriveni posted almost 5 years ago Admin

You can also fetch the data from the latest completed job in CSV format using the URL:

https://app.scrapinghub.com/api/items.csv?project=PROJECTNUMBER&spider=SPIDERNAME&include_headers=1&fields=FIELDNAME1,FIELDNAME2&apikey=APIKEY


You need to replace:


  • PROJECTNUMBER with your project number
  • SPIDERNAME with your spider name
  • FIELDNAME1, FIELDNAME2 with the names of the fields, in the order you want them to appear as CSV columns
  • APIKEY with your API key
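For example, the URL can be assembled and the CSV saved with Python's standard library. This is an illustrative sketch, not an official client; the `items_csv_url` and `download_csv` helpers are names I made up, and the placeholder values are the same ones listed above.

```python
import urllib.parse
import urllib.request

def items_csv_url(apikey, project, spider, fields):
    """Build the items.csv URL for a spider's latest completed job."""
    query = urllib.parse.urlencode(
        {
            "project": project,
            "spider": spider,
            "include_headers": 1,
            # Column order in the CSV follows the order of this list.
            "fields": ",".join(fields),
            "apikey": apikey,
        }
    )
    return f"https://app.scrapinghub.com/api/items.csv?{query}"

def download_csv(url, path):
    """Save the CSV response to a local file."""
    with urllib.request.urlopen(url) as resp, open(path, "wb") as out:
        out.write(resp.read())

# Placeholders: substitute your own values before running.
url = items_csv_url(
    "APIKEY", "PROJECTNUMBER", "SPIDERNAME", ["FIELDNAME1", "FIELDNAME2"]
)
```

Because the URL itself never changes, it can be handed to any scheduled downloader or spreadsheet import as a static link to the latest results.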



