We need a static URL for scraping results. Right now the URL changes after every run. What is the solution for this?
<project_id>/<spider_id>/<job_id>
How can we make <job_id> default to the latest job?
Best Answer
thriveni said over 3 years ago
You can use the Scrapinghub Jobs API and the python-scrapinghub library. The library talks to Scrapy Cloud, so you can use it in your spider to get the job list and pick the latest job.
Thanks,
Thriveni.
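A minimal sketch of that approach, assuming the python-scrapinghub package is installed and that job keys have the <project_id>/<spider_id>/<job_id> form shown in the question. The helper names latest_job_key and fetch_latest_items are made up for illustration; the API key, project id, and spider name are placeholders you would supply yourself.

```python
def latest_job_key(job_summaries):
    """Return the key with the highest job number, or None if there are no jobs."""
    keys = [summary["key"] for summary in job_summaries]
    if not keys:
        return None
    # Keys look like "<project_id>/<spider_id>/<job_id>"; compare job ids numerically.
    return max(keys, key=lambda key: int(key.rsplit("/", 1)[1]))


def fetch_latest_items(apikey, project_id, spider_name):
    """Yield the items of the most recent finished job of one spider."""
    # Imported here so the helper above works without the library installed.
    from scrapinghub import ScrapinghubClient  # pip install scrapinghub

    client = ScrapinghubClient(apikey)
    spider = client.get_project(project_id).spiders.get(spider_name)
    # Only finished jobs are considered, so a running job is never picked.
    key = latest_job_key(spider.jobs.iter(state="finished"))
    if key is None:
        return
    yield from client.get_job(key).items.iter()
```

Picking the highest job number explicitly avoids relying on the order in which the job list is returned.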
İsmail Aras
thriveni
You can also fetch the data from the latest completed job in CSV format using this URL:
https://app.scrapinghub.com/api/items.csv?project=PROJECTNUMBER&spider=SPIDERNAME&include_headers=1&fields=FIELDNAME1,FIELDNAME2&apikey=APIKEY
You need to replace:
- PROJECTNUMBER with your project number
- SPIDERNAME with your spider name
- FIELDNAME1,FIELDNAME2 with the names of the fields, in the order you want them to appear in the CSV columns
- APIKEY with your API key
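If you build this URL from a script, the substitution can be sketched as below. This is only an illustration: the function name latest_items_csv_url is hypothetical, and the endpoint and query parameters are copied from the URL above rather than checked against current documentation.

```python
from urllib.parse import urlencode

# Endpoint taken from the URL in the reply above.
ITEMS_CSV_ENDPOINT = "https://app.scrapinghub.com/api/items.csv"


def latest_items_csv_url(project, spider, fields, apikey):
    """Build the CSV export URL for the latest completed job of a spider."""
    query = urlencode(
        {
            "project": project,
            "spider": spider,
            "include_headers": 1,
            # Column order in the CSV follows the order of this list.
            "fields": ",".join(fields),
            "apikey": apikey,
        },
        safe=",",  # keep the commas between field names unescaped
    )
    return f"{ITEMS_CSV_ENDPOINT}?{query}"
```

The resulting URL does not contain a job id, so it stays the same from run to run.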