Answered

Scrapy version

I'm currently running a spider that crawls data from an API. However, the log shows that Zyte is using Scrapy 2.0, which is quite old compared to the newest version, 2.6.2, and this leads to conflicts when handling the response.

In this case, the response is a TextResponse that lacks the json() method available in the current version.

 

2022-08-15 03:48:19 ERROR [scrapy.core.scraper] Spider error processing <POST https://fptshop.com.vn/api-data/tin-tuc/News/GetListNews/tin-khuyen-mai?numberRecord=6> (referer: None)
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/scrapy/utils/defer.py", line 117, in iter_errback
    yield next(it)
  File "/usr/local/lib/python3.8/site-packages/scrapy/utils/python.py", line 345, in __next__
    return next(self.data)
  File "/usr/local/lib/python3.8/site-packages/scrapy/utils/python.py", line 345, in __next__
    return next(self.data)
  File "/usr/local/lib/python3.8/site-packages/scrapy/core/spidermw.py", line 64, in _evaluate_iterable
    for r in iterable:
  File "/usr/local/lib/python3.8/site-packages/sh_scrapy/middlewares.py", line 30, in process_spider_output
    for x in result:
  File "/usr/local/lib/python3.8/site-packages/scrapy/core/spidermw.py", line 64, in _evaluate_iterable
    for r in iterable:
  File "/usr/local/lib/python3.8/site-packages/scrapy/spidermiddlewares/offsite.py", line 29, in process_spider_output
    for x in result:
  File "/usr/local/lib/python3.8/site-packages/scrapy/core/spidermw.py", line 64, in _evaluate_iterable
    for r in iterable:
  File "/usr/local/lib/python3.8/site-packages/scrapy/spidermiddlewares/referer.py", line 338, in <genexpr>
    return (_set_referer(r) for r in result or ())
  File "/usr/local/lib/python3.8/site-packages/scrapy/core/spidermw.py", line 64, in _evaluate_iterable
    for r in iterable:
  File "/usr/local/lib/python3.8/site-packages/scrapy/spidermiddlewares/urllength.py", line 37, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/usr/local/lib/python3.8/site-packages/scrapy/core/spidermw.py", line 64, in _evaluate_iterable
    for r in iterable:
  File "/usr/local/lib/python3.8/site-packages/scrapy/spidermiddlewares/depth.py", line 58, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/usr/local/lib/python3.8/site-packages/scrapy/core/spidermw.py", line 64, in _evaluate_iterable
    for r in iterable:
  File "/app/__main__.egg/khuyenmai_day_dealsbot/spiders/fptshop.py", line 16, in parse
    data = response.json()
AttributeError: 'TextResponse' object has no attribute 'json'

 

Here is my spider code:

 

import scrapy
from ..items import KhuyenmaiDayDealsbotItem


class FptshopSpider(scrapy.Spider):
    name = 'fptshop'
    allowed_domains = ['fptshop.com.vn']
    start_urls = ['https://fptshop.com.vn/api-data/tin-tuc/News/GetListNews/tin-khuyen-mai?numberRecord=6']

    def start_requests(self):
        urls = ['https://fptshop.com.vn/api-data/tin-tuc/News/GetListNews/tin-khuyen-mai?numberRecord=6']
        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse, method='POST')

    def parse(self, response):
        data = response.json()
        deals = data['datas']
        for deal in deals:
            item = KhuyenmaiDayDealsbotItem()
            item['title'] = deal['title'].strip()
            item['short_description'] = deal['description'].strip()
            item['url'] = 'https://fptshop.com.vn/tin-tuc/tin-khuyen-mai/' + deal['titleAscii']
            item['store'] = 1
            item['activate'] = True
            yield item

 

 

Can you please update Scrapy, or what should I do instead?


Best Answer

Hello,


You can change the Scrapy version by specifying the desired Scrapy stack version in the scrapinghub.yml file.

Please refer to https://support.zyte.com/support/solutions/articles/22000200402-changing-the-deploy-environment-with-scrapy-cloud-stacks for information on recent stack versions and how to change stacks.
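
If changing the stack is not possible right away, a possible workaround is to decode the body yourself. The sketch below is only an illustration and is not taken from the linked article: `parse_json_body` is a hypothetical helper name, and it assumes that, as far as I know, `response.json()` only became available in Scrapy 2.2 while `response.text` is present on older versions.

```python
import json


def parse_json_body(response):
    """Decode a JSON response body on both old and new Scrapy versions.

    Hypothetical helper: `response` is expected to be a Scrapy TextResponse
    (or any object exposing `.text`, and optionally `.json()`).
    """
    # Newer Scrapy versions expose response.json(); use it when present.
    if hasattr(response, "json"):
        return response.json()
    # Older versions: fall back to parsing the decoded body text.
    return json.loads(response.text)
```

In the spider above, `data = response.json()` in `parse` would then become `data = parse_json_body(response)`, and the rest of the callback stays the same.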


Regards,

Thriveni



Thank you, this is exactly what I need!
