## Spider not working

Posted almost 5 years ago by ayza


Answered

ayza

Hello Team,

I have created a spider to crawl data from the konga.com site, but I am getting an error like this:

[scrapy.spidermiddlewares.httperror] Ignoring response <403 https://www.konga.com/category/laptops-5230>: HTTP status code is not handled or not allowed

Can anyone look at this issue and suggest a solution?

Attachments (1)

kongamobiles.py
830 Bytes

0 Votes

thriveni posted almost 5 years ago Admin Best Answer

Hello,

This seems to be a ban from the target website. I see that you have a Crawlera subscription. Please enable the Crawlera addon through the UI or use the Crawlera middleware. Please refer to our article Using Crawlera with Scrapy to learn more about integrating Crawlera.
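
For readers who use the middleware rather than the UI addon, the settings usually look something like the sketch below. It assumes the scrapy-crawlera package is installed in the project, and the API key shown is a placeholder, not a real value.

```python
# settings.py (a minimal sketch; assumes the scrapy-crawlera package is installed)

# Register the Crawlera downloader middleware.
DOWNLOADER_MIDDLEWARES = {
    "scrapy_crawlera.CrawleraMiddleware": 610,
}

# Send the spider's requests through Crawlera.
CRAWLERA_ENABLED = True

# Placeholder only: replace with the API key from your own Crawlera account.
CRAWLERA_APIKEY = "<your-crawlera-api-key>"
```

If the key is missing or wrong, the proxy is likely to reject requests instead of forwarding them to the target site.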

Regards,

Thriveni

0 Votes

10 Comments

ayza posted over 4 years ago

Thank you very much for confirmation.

0 Votes

thriveni posted over 4 years ago Admin

The Konga site is JavaScript-based. Crawlera does not render JavaScript, hence there is no data in "section._588b5_3MtNs > ul.b49ee_2pjyI > li.bbe45_3oExY > div > div.a2cf5_2S5q5 > div._4941f_1HCZm".

You would need to use some headless browser like Selenium or Splash with Crawlera to render the page.
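
One rough way to confirm that the data is missing from the unrendered page is to count how often a class from the selector appears in the raw HTML. The sketch below uses only the Python standard library; the two HTML strings are hypothetical stand-ins for a JavaScript shell page and a rendered page, not real responses from the site.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags whose class attribute contains a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def count_class(html, class_name):
    """Return how many elements in the raw HTML carry class_name."""
    parser = ClassCounter(class_name)
    parser.feed(html)
    parser.close()
    return parser.count


# Hypothetical response bodies, for illustration only.
js_shell = '<html><body><div id="app"></div><script src="/bundle.js"></script></body></html>'
rendered = '<html><body><ul class="b49ee_2pjyI"><li class="bbe45_3oExY">Laptop</li></ul></body></html>'

print(count_class(js_shell, "bbe45_3oExY"))   # no matching elements in the shell page
print(count_class(rendered, "bbe45_3oExY"))   # one matching element once rendered
```

A count of zero on the body the spider actually receives suggests the items are injected by JavaScript, which is when a rendering tool becomes necessary.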

Please note that paying customers can get help from the Support team and get a faster resolution. Please navigate to Dashboard > Help > Contact Support to create a Support ticket.

0 Votes

ayza posted over 4 years ago

Hello Thriveni,

It's a critical situation.

Please help me write a proper spider file to crawl data from the Konga site. I have attached my spider code. Please review it and let me know if I forgot anything.

Attachments (1)

kongamobiles.py
950 Bytes

0 Votes

ayza posted almost 5 years ago

Hello Team,

I have created a spider to crawl data from the Konga site, but it does not scrape any products, even though the spider finishes successfully.

Can anyone look at my spider file and suggest what is missing in my code? (Spider file attached.)

Attachments (1)

kongamobiles.py
950 Bytes

0 Votes

thriveni posted almost 5 years ago Admin

Hello,

The requests through Crawlera were successful this time. The recent job stats show a response code of 200: https://app.scrapinghub.com/p/442308/77/6/stats.

Please check that the CSS/XPath selectors for the items are chosen correctly. You can use the Scrapy shell to check whether the CSS/XPath expressions work as expected.

0 Votes

ayza posted almost 5 years ago

I have fixed the authorization error as per your suggestion, but I am still unable to crawl data. Please look at the spider file and log messages, and suggest if I am missing anything.

Attachments (2)

screenshot-a....png
155 KB

kongamobiles.py
830 Bytes

0 Votes

thriveni posted almost 5 years ago Admin

A 407 response indicates an authorization error (Proxy Authentication Required). Please ensure the Crawlera API key is provided in the Crawlera addon settings.

The Crawlera API key can be found in the setup instructions of the Crawlera account, for example here: https://app.scrapinghub.com/o/324974/crawlera/setup?username=ayza.
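
To illustrate why a missing key leads to a 407, here is a small sketch of how a Basic `Proxy-Authorization` value is built. It assumes the convention of sending the API key as the proxy username with an empty password; the helper name `proxy_auth_header` is hypothetical, and the addon or middleware normally does this for you.

```python
import base64


def proxy_auth_header(api_key: str) -> str:
    """Build a Basic Proxy-Authorization value: key as username, empty password."""
    if not api_key:
        # Without credentials the proxy has nothing to authenticate and answers 407.
        raise ValueError("API key is empty")
    token = base64.b64encode(f"{api_key}:".encode("ascii")).decode("ascii")
    return f"Basic {token}"


# "abc" is a dummy key for illustration, not a real credential.
print(proxy_auth_header("abc"))
```

If the addon settings leave the key blank, no such header can be sent, which matches the 407 seen in the log.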

0 Votes

ayza posted almost 5 years ago

I have enabled Crawlera using the configuration below (see the attached file), but I am still getting an error.

[scrapy.spidermiddlewares.httperror] Ignoring response <407 https://www.konga.com/category/laptops-5230>: HTTP status code is not handled or not allowed

Attachments (1)

screenshot-a....png
42.2 KB

0 Votes


aurish_hammad_hafeez posted almost 5 years ago Admin

Hi,

Please check https://www.konga.com/robots.txt
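
As a rough illustration of what checking robots.txt means in practice, the sketch below parses a set of rules with Python's standard `urllib.robotparser` and asks whether a URL may be fetched. The rules and URLs are hypothetical examples, not the real contents of the Konga robots.txt, and `is_allowed` is an illustrative helper name.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules for illustration; NOT the real konga.com robots.txt.
sample_rules = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(sample_rules, "https://example.com/private/page"))
print(is_allowed(sample_rules, "https://example.com/category/laptops"))
```

Scrapy projects also have a `ROBOTSTXT_OBEY` setting that controls whether the crawler respects these rules, so it is worth checking both the file and that setting.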

0 Votes