Answered

## Spider not working

Hello Team,

I have created a spider to crawl data from konga.com, but I am getting the following error:

[scrapy.spidermiddlewares.httperror] Ignoring response <403 https://www.konga.com/category/laptops-5230>: HTTP status code is not handled or not allowed

Can anyone look into this issue and suggest a solution?

py
(830 Bytes)

Best Answer

Hello,



This looks like a ban from the target website. I see that you have a Crawlera subscription. Please enable the Crawlera addon through the UI or use the Crawlera middleware. Please refer to our article Using Crawlera with Scrapy to learn more about integrating Crawlera.
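
For reference, a minimal sketch of what enabling the middleware might look like in a project's settings.py. It assumes the scrapy-crawlera package is installed in the project; the API key shown is a placeholder, not a real value.

```python
# settings.py (sketch): enable the Crawlera downloader middleware.
# Assumption: the scrapy-crawlera package is installed in this project.
DOWNLOADER_MIDDLEWARES = {
    "scrapy_crawlera.CrawleraMiddleware": 610,
}

CRAWLERA_ENABLED = True

# Placeholder only; replace with the API key from your own Crawlera account.
CRAWLERA_APIKEY = "<your-crawlera-api-key>"
```

When the addon is enabled through the UI instead, these values are set in the addon settings rather than in the project file.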


Regards,

Thriveni


Thank you very much for the confirmation.

The Konga site is JavaScript-based, and Crawlera does not render JavaScript, so there is no data in "section._588b5_3MtNs > ul.b49ee_2pjyI > li.bbe45_3oExY > div > div.a2cf5_2S5q5 > div._4941f_1HCZm".


You would need to use a browser-rendering tool such as Selenium or Splash together with Crawlera to render the page.
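
As a rough illustration of the Splash option, the sketch below builds a request URL for Splash's render.html endpoint, which returns the page HTML after JavaScript has run. It assumes a Splash instance is reachable at localhost:8050; the helper name `splash_render_url` is made up for this example, and the sketch does not cover routing Splash traffic through Crawlera.

```python
from urllib.parse import urlencode

# Assumption: a Splash instance is running locally on its default port.
SPLASH_BASE = "http://localhost:8050"


def splash_render_url(page_url, wait=2.0, splash_base=SPLASH_BASE):
    """Build a render.html URL asking Splash to load page_url and wait `wait` seconds."""
    query = urlencode({"url": page_url, "wait": wait})
    return f"{splash_base}/render.html?{query}"


# Hypothetical usage: a spider would request this URL instead of the page itself.
rendered = splash_render_url("https://www.konga.com/category/laptops-5230")
print(rendered)
```

The spider's selectors would then run against the HTML returned by Splash rather than the raw response from the site.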


Please note that paying customers can get help from the Support team for a faster resolution. Navigate to Dashboard > Help > Contact Support to create a support ticket.

Hello Thriveni,


This is a critical situation.


Please help me make a proper spider file to crawl data from the Konga site. I have attached my spider code. Please review it and let me know if I have missed anything.

py
(950 Bytes)

Hello Team,


I have created a spider to crawl data from the Konga site, but it does not crawl any products, even though the spider finishes successfully.

Can anyone review my spider file and suggest what is missing in my code? (Spider file attached.)

py
(950 Bytes)

Hello,


The requests through Crawlera were successful this time. The most recent job's stats show a response code of 200: https://app.scrapinghub.com/p/442308/77/6/stats

Please check that the CSS/XPath selectors chosen for the items are correct. You can use Scrapy Shell to check whether the CSS/XPath expressions work as expected.
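
In Scrapy Shell this usually means running `scrapy shell "<url>"` and then trying `response.css("...")` with the selector. As a complementary, standard-library-only sketch, the check below reports which class names from a selector never appear in a downloaded HTML body; if they are missing from the raw response, the content is probably rendered by JavaScript. The sample HTML and the names `ClassCollector` and `missing_classes` are made up for illustration.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collect every CSS class name that appears in an HTML document."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def missing_classes(html, wanted):
    """Return the class names from `wanted` that never occur in `html`."""
    collector = ClassCollector()
    collector.feed(html)
    return sorted(set(wanted) - collector.classes)


# Hypothetical sample body; in practice this would be the text of the response.
sample_html = '<section class="_588b5_3MtNs"><ul class="b49ee_2pjyI"></ul></section>'
missing = missing_classes(sample_html, ["_588b5_3MtNs", "b49ee_2pjyI", "bbe45_3oExY"])
print(missing)
```

An empty result does not prove the selector matches, since this only checks class names and not their nesting, so the final check should still be done with the real selector.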

I have fixed the authorization error as per your suggestion, but I am still unable to crawl data. Please review the spider file and log messages, and suggest if I am missing anything.
py
(830 Bytes)

A 407 response (Proxy Authentication Required) indicates an authorization error. Please ensure the Crawlera API key is provided in the Crawlera addon settings.

The Crawlera API key can be found in the setup instructions of the Crawlera account, like here: https://app.scrapinghub.com/o/324974/crawlera/setup?username=ayza.
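
To make the 407 more concrete, here is a small sketch of the credential a proxy client is expected to send. It assumes Crawlera accepts HTTP Basic proxy authentication with the API key as the username and an empty password; the key `example-key` and the helper name `proxy_auth_header` are placeholders for illustration only.

```python
import base64


def proxy_auth_header(api_key):
    """Build a Basic Proxy-Authorization value: API key as username, empty password."""
    credentials = f"{api_key}:".encode("ascii")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Basic {token}"


# Placeholder key, not a real credential.
header = proxy_auth_header("example-key")
print(header)
```

If the key is missing or mistyped in the addon settings, the proxy cannot validate this header and responds with 407.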

I have enabled Crawlera using the configuration below (attached file), but I am still getting an error:

[scrapy.spidermiddlewares.httperror] Ignoring response <407 https://www.konga.com/category/laptops-5230>: HTTP status code is not handled or not allowed
