'>' string is added as url when spider starts

Posted over 7 years ago by Jenny Palarca

Answered
Jenny Palarca

Hi, 


I am encountering this issue when running the spider:


[scrapy.core.scraper] Error downloading <GET http://www.yalwa.com>: Connection was refused by other side: 111: Connection refused.


As you can see, the string ">" appears to be treated as part of the start URL. How can I fix this?


My spider works when I run it on my local machine, so I am confused about why it is not working on Scrapinghub.


Can you help me please? 


Thank you.

nestor posted over 7 years ago Admin Best Answer

Hi,


The ">" at the end is a known bug in how the logs are displayed; it is not part of the URL being requested. The "Connection refused" error actually means that the target domain has blocked the Scrapy Cloud IP(s), so the solution would be to use Crawlera as a proxy.
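
For reference, routing requests through Crawlera could look roughly like the settings below. This is only a sketch, assuming the scrapy-crawlera plugin is installed in the project; the API key value is a placeholder rather than a real key, and the middleware priority of 610 is the value commonly used with that plugin, so adjust it to your setup.

```python
# settings.py (sketch; assumes the scrapy-crawlera plugin is installed)

# Register the Crawlera downloader middleware so requests go through the proxy.
DOWNLOADER_MIDDLEWARES = {
    "scrapy_crawlera.CrawleraMiddleware": 610,
}

# Turn the middleware on for this project.
CRAWLERA_ENABLED = True

# Placeholder value: replace with the API key from your own Crawlera account.
CRAWLERA_APIKEY = "<your-crawlera-api-key>"
```

With this in place the spider code itself should not need to change; the requests are sent through the proxy instead of directly from the Scrapy Cloud IPs.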
