1 spider doesn't return anything on Scrapinghub, but works fine locally

Posted about 7 years ago by imrans

Answered
I have a project with 3 spiders. Locally on my machine, all 3 spiders work fine.

However, when I deploy the 3 Scrapy spiders on Scrapinghub, 1 of the spiders always fails to return anything (the other 2 spiders work fine).

Since all 3 spiders work fine locally, and 2 of them still work on Scrapinghub, I'm quite sure this is an issue on Scrapinghub (is that website blocking Scrapinghub?).

How can I debug this (file attached)?
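
One way to debug this: a minimal sketch (the spider name and URL below are placeholders, not from the attached project) that attaches an errback and lets non-200 responses through, so blocks and timeouts show up in the Scrapy Cloud job log instead of the run ending with nothing:

import scrapy


class DebugSpider(scrapy.Spider):
    # Hypothetical spider, not the project's actual code: its only purpose
    # is to make cloud-only failures visible in the job log.
    name = "debug_example"                 # placeholder name
    start_urls = ["https://example.com/"]  # placeholder URL

    custom_settings = {
        # Let non-200 responses reach parse() so their status codes are
        # logged instead of being filtered out silently.
        "HTTPERROR_ALLOW_ALL": True,
    }

    def start_requests(self):
        for url in self.start_urls:
            yield scrapy.Request(url, callback=self.parse, errback=self.on_error)

    def parse(self, response):
        # A 403/503 here that is a 200 locally usually means the site is
        # blocking the cloud provider's IP range.
        self.logger.info("Got %s for %s", response.status, response.url)

    def on_error(self, failure):
        # Timeouts and DNS errors land here instead of failing silently.
        self.logger.error("Request failed: %r", failure)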


vaz posted about 7 years ago Best Answer

Hi Scraper,


Could you describe a bit more about this issue?


  • The kind of errors you are experiencing (with details if possible)
  • The domains you are trying to crawl
  • Your project ID, so support can check it

Best,

Pablo

2 Comments

mindlessbrain posted almost 7 years ago

Hi,


I'm having a similar issue.


My spider tries to access a link that has a robots.txt file but gets a timeout error, as if USER_AGENT weren't set. I have set it to:


USER_AGENT = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36'


It works fine locally (and was working in the cloud a month ago), but when I deploy and run it in the cloud, I get a timeout error.
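
For reference, a minimal settings.py sketch showing where that user agent goes, plus timeout/retry settings (the numeric values are illustrative assumptions, not taken from the attached log):

# settings.py -- minimal sketch; timeout/retry values are illustrative
USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/55.0.2883.95 Safari/537.36"
)

ROBOTSTXT_OBEY = True    # keep fetching robots.txt (the request that times out)
DOWNLOAD_TIMEOUT = 60    # Scrapy's default is 180s; fail faster while testing
RETRY_ENABLED = True
RETRY_TIMES = 2          # retry transient timeouts a couple of times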


I've attached the log with details.


Thanks.

