QLMarketing

I have a large URL list (50k) in the form of a CSV. Locally, I can open the CSV with my spider like any other file and crawl the URLs. Is it possible to parse URLs from a CSV on Scrapinghub? When I deploy my project as is, Scrapy Cloud does not know where to find the CSV. Any ideas would be welcome.
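Locally, the pattern is roughly this (a minimal sketch; urls.csv, the spider name, and the single-column layout stand in for my actual setup):

import csv

import scrapy


class UrlListSpider(scrapy.Spider):
    name = 'url_list'

    def start_requests(self):
        # One URL per row in the first column; schedule a request for each.
        with open('urls.csv', newline='') as f:
            for row in csv.reader(f):
                yield scrapy.Request(row[0], callback=self.parse)

    def parse(self, response):
        yield {'url': response.url, 'title': response.css('title::text').get()}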
Best Answer
thriveni said about 6 years ago
You need to declare the files in the package_data section of your setup.py file, as given in Deploying non-code files.

Regards,
Thriveni Patil
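For reference, a minimal setup.py along those lines (the package name myproject and the resources/urls.csv path are assumptions; adjust both to your project layout):

from setuptools import setup, find_packages

setup(
    name='project',
    version='1.0',
    packages=find_packages(),
    package_data={
        # Ship the CSV inside the Python package so it is included in the
        # egg that gets deployed to Scrapy Cloud.
        'myproject': ['resources/urls.csv'],
    },
    entry_points={'scrapy': ['settings = myproject.settings']},
)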
rafalf
Hi,
Can you share your code, please? How do you access the CSV from within spiders? I'm having a similar issue.
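One way to read the packaged file from inside a spider, assuming the package_data declaration shown above (the package and resource names are placeholders):

import csv
import io
import pkgutil

import scrapy


class UrlListSpider(scrapy.Spider):
    name = 'url_list'

    def start_requests(self):
        # pkgutil.get_data returns the packaged file's contents as bytes,
        # which works both locally and after deployment to Scrapy Cloud,
        # where the project runs from an egg rather than a source tree.
        data = pkgutil.get_data('myproject', 'resources/urls.csv')
        for row in csv.reader(io.StringIO(data.decode('utf-8'))):
            yield scrapy.Request(row[0], callback=self.parse)

    def parse(self, response):
        yield {'url': response.url}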