QLMarketing
I have a large URL list (50k URLs) in a CSV file. Locally, I can open the CSV with my spider like any other file and crawl the URLs. Is it possible to parse URLs from a CSV on Scrapinghub? When I deploy my project as is, Scrapy Cloud does not know where to find the CSV. Any ideas would be welcome.
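For context, a minimal sketch of the local setup being described. The spider name, the urls.csv path, and the single-column CSV layout are all assumptions for illustration, not details from the thread:

import csv

import scrapy


class UrlListSpider(scrapy.Spider):
    # Hypothetical spider name; the CSV path below assumes the file
    # sits next to the project when the spider runs locally.
    name = "url_list"

    def start_requests(self):
        # Read one URL per row from the first column of the CSV.
        with open("urls.csv", newline="") as f:
            for row in csv.reader(f):
                if row:
                    yield scrapy.Request(row[0], callback=self.parse)

    def parse(self, response):
        # Placeholder: extract whatever fields the project needs.
        yield {"url": response.url, "status": response.status}

This works locally because the file is opened from the working directory, which is exactly what breaks after deployment, as the question notes.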
Best Answer
thriveni said over 5 years ago:
You need to declare the files in the package_data section of your setup.py file, as described in Deploying non-code files.
Regards,
Thriveni Patil
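For illustration, a minimal setup.py along those lines, assuming the Scrapy project package is named myproject and the CSV is kept at myproject/resources/urls.csv (both names are placeholders, not from the thread):

from setuptools import setup, find_packages

setup(
    name='project',
    version='1.0',
    packages=find_packages(),
    # Ship non-code files inside the deployed package; the glob is
    # relative to the myproject package directory.
    package_data={
        'myproject': ['resources/*.csv'],
    },
    entry_points={'scrapy': ['settings = myproject.settings']},
    zip_safe=False,
)

With the file declared here, deploying with shub bundles the CSV into the package that Scrapy Cloud installs, so it travels with the code.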
rafalf
Hi,
Can you share your code, please? How do you access the CSV from within spiders? I'm having a similar issue.
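The thread does not include a spider-side example, but one common pattern that fits the accepted answer is to load the packaged file with pkgutil.get_data, which resolves the path relative to the installed package rather than the working directory, so the same code runs locally and on Scrapy Cloud. The myproject and resources/urls.csv names are the same illustrative placeholders as above:

import csv
import io
import pkgutil

import scrapy


class UrlListSpider(scrapy.Spider):
    name = "url_list"

    def start_requests(self):
        # get_data returns the file contents as bytes, or None if the
        # path is not packaged (e.g. package_data was not declared).
        data = pkgutil.get_data("myproject", "resources/urls.csv")
        for row in csv.reader(io.StringIO(data.decode("utf-8"))):
            if row:
                yield scrapy.Request(row[0], callback=self.parse)

    def parse(self, response):
        yield {"url": response.url}

If get_data returns None, the usual cause is that the file was not declared in package_data or the path does not match the package layout.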