Answered

Scraping a large URL list

I have a large URL list (50k) in the form of a CSV file. Locally, my spider can open the CSV like any other file and crawl the URLs. Is it possible to parse URLs from a CSV on Scrapinghub? When I deploy my project as is, Scrapy Cloud does not know where to find the CSV. Any ideas would be welcome.


Best Answer

You need to declare the files in the package_data section of your setup.py file, as described in Deploying non-code files.
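
As a rough illustration, a setup.py with such a declaration might look like the sketch below. The project name `myproject`, the `resources/` folder, and the version string are assumptions made for this example, not details from the original question; adjust them to match your own project layout.

```python
# setup.py (sketch; "myproject" and "resources/*.csv" are hypothetical names)
from setuptools import setup, find_packages

setup(
    name="myproject",
    version="1.0",
    packages=find_packages(),
    # Ship every CSV under myproject/resources/ together with the code.
    package_data={
        "myproject": ["resources/*.csv"],
    },
    # Tells Scrapy where the project settings module lives.
    entry_points={"scrapy": ["settings = myproject.settings"]},
    zip_safe=False,
)
```

With this layout the CSV would sit at myproject/resources/urls.csv, inside the package directory, so that it is bundled when the project is deployed.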


Regards,

Thriveni Patil



Hi,

Can you share your code, please?

How do you access the CSV from within spiders? I am having a similar issue.
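
One possible way to read a packaged CSV from inside a spider is `pkgutil.get_data`, which returns the file contents as bytes regardless of where the package ends up being installed. The sketch below is only an illustration under some assumptions: the package name `myproject`, the path `resources/urls.csv`, and the helper names are hypothetical, and the CSV is assumed to have no header row with the URL in the first column.

```python
import csv
import io
import pkgutil


def urls_from_csv_bytes(data, column=0):
    """Return the non-empty values of one column from raw CSV bytes."""
    # "utf-8-sig" also strips a byte order mark if the file has one.
    text = data.decode("utf-8-sig")
    reader = csv.reader(io.StringIO(text))
    urls = []
    for row in reader:
        # Skip blank lines and rows that are too short.
        if len(row) > column and row[column].strip():
            urls.append(row[column].strip())
    return urls


def load_urls(package="myproject", resource="resources/urls.csv"):
    """Load URLs from a CSV shipped inside the package (hypothetical names)."""
    # pkgutil.get_data returns bytes, or None if the package cannot be located.
    data = pkgutil.get_data(package, resource)
    if data is None:
        raise FileNotFoundError(f"{resource} not found in package {package}")
    return urls_from_csv_bytes(data)
```

In a spider, `start_requests` could then loop over `load_urls()` and yield one `scrapy.Request(url)` per entry. Opening the file by a relative path tends to fail after deployment because the working directory differs from the local one, which is why the lookup goes through the package here.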
