## Spider not working

Posted over 4 years ago by ayza

Answered

Hello Team,

I have created a spider to crawl data from the konga.com site, but I am getting an error like this:

[scrapy.spidermiddlewares.httperror] Ignoring response <403 https://www.konga.com/category/laptops-5230>: HTTP status code is not handled or not allowed


Can anyone look into this issue and suggest a solution?

0 Votes


thriveni posted over 4 years ago Admin Best Answer

Hello,



This seems to be a ban from the target website. I see that you have a Crawlera subscription. Please enable the Crawlera addon through the UI or use the Crawlera middleware. Please refer to our article "Using Crawlera with Scrapy" to learn more about integrating Crawlera.


Regards,

Thriveni

0 Votes


10 Comments


ayza posted over 4 years ago

I have enabled Crawlera using the configuration below (attached file), but I am still getting an error:

[scrapy.spidermiddlewares.httperror] Ignoring response <407 https://www.konga.com/category/laptops-5230>: HTTP status code is not handled or not allowed

0 Votes


thriveni posted over 4 years ago Admin

A 407 response indicates a proxy authentication error. Please ensure that the Crawlera API key is provided in the Crawlera addon settings.

The Crawlera API key can be found in the setup instructions of the Crawlera account, for example here: https://app.scrapinghub.com/o/324974/crawlera/setup?username=ayza.
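
For a project that uses the Crawlera middleware rather than the addon UI, the equivalent setup might look like the sketch below. It assumes the `scrapy-crawlera` package is installed and that these lines live in the project's `settings.py`; the API key value is a placeholder, not a real key.

```python
# settings.py (sketch): enable the scrapy-crawlera downloader middleware.
# Assumption: the scrapy-crawlera package is installed in the project.
DOWNLOADER_MIDDLEWARES = {
    "scrapy_crawlera.CrawleraMiddleware": 610,
}

# Route requests through Crawlera.
CRAWLERA_ENABLED = True

# Placeholder: replace with the API key shown on the Crawlera setup page.
# A missing or wrong key is a typical cause of 407 responses.
CRAWLERA_APIKEY = "<your-crawlera-api-key>"
```

If the key is set in both the addon UI and the project settings, it is worth checking that the two values match.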

0 Votes


ayza posted over 4 years ago

I have fixed the authorization error as per your suggestion, but I am still unable to crawl any data. Please review the spider file and the log messages, and let me know if I am missing anything.

0 Votes


thriveni posted over 4 years ago Admin

Hello,


The requests through Crawlera were successful this time. The stats of the most recent job show a response code of 200: https://app.scrapinghub.com/p/442308/77/6/stats

Please check whether the CSS/XPath selectors for the items are chosen correctly. You can use the Scrapy shell to check whether the CSS/XPath expressions work as expected.

0 Votes


ayza posted about 4 years ago

Hello Team,


I have created a spider to crawl data from the Konga site, but it does not crawl any products, even though the spider finishes successfully.

Can anyone review my spider file and suggest what is missing in my code? (Spider file attached.)

0 Votes


ayza posted about 4 years ago

Hello Thriveni,


It's a critical situation.


Please help me write a proper spider file to crawl data from the Konga site. I have attached my spider code. Please review it and let me know if I forgot anything.

0 Votes


thriveni posted about 4 years ago Admin

The Konga site is JavaScript-based, and Crawlera does not render JavaScript, so the response contains no data for the selector "section._588b5_3MtNs > ul.b49ee_2pjyI > li.bbe45_3oExY > div > div.a2cf5_2S5q5 > div._4941f_1HCZm".


You would need to use a headless browser such as Selenium or Splash together with Crawlera to render the page.
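
One way to confirm that the product markup is missing from the unrendered response is to count the elements carrying one of the classes from the selector. The sketch below uses only the Python standard library; `ClassCounter`, `count_class`, and both HTML strings are hypothetical illustrations, not the actual Konga markup.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags whose class attribute contains a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # A valueless class attribute is reported as None, hence the fallback.
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def count_class(html, class_name):
    """Return how many elements in `html` carry `class_name`."""
    parser = ClassCounter(class_name)
    parser.feed(html)
    parser.close()
    return parser.count


# Hypothetical raw response: an empty shell that JavaScript fills in later.
RAW_HTML = (
    '<html><body><div id="app"></div>'
    '<script src="/bundle.js"></script></body></html>'
)
# Hypothetical markup after a browser has executed the JavaScript.
RENDERED_HTML = (
    '<html><body><section class="_588b5_3MtNs"><ul class="b49ee_2pjyI">'
    '<li class="bbe45_3oExY"><div class="_4941f_1HCZm">Laptop</div></li>'
    "</ul></section></body></html>"
)

print(count_class(RAW_HTML, "bbe45_3oExY"))       # 0
print(count_class(RENDERED_HTML, "bbe45_3oExY"))  # 1
```

Running the same check on `response.text` inside the spider's callback would show whether the downloaded page contains the product list at all; a count of zero points to missing rendering rather than a wrong selector.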


Please note that paying customers can get help from the Support team and a faster resolution. Please navigate to Dashboard > Help > Contact Support to create a support ticket.

0 Votes


ayza posted about 4 years ago

Thank you very much for the confirmation.

0 Votes
