Creating a Scrapy spider

Modified on Wed, 3 Feb, 2021 at 6:43 AM

Here we will show you how to create your first Scrapy spider. We strongly recommend you also read the Scrapy tutorial for a more in-depth guide.


This guide assumes Scrapy is already installed. If it is not, please refer to the Scrapy installation guide first.


For this example, we will build a spider to scrape famous quotes from this website: http://quotes.toscrape.com/

We begin by creating a Scrapy project, which we will call quotes_crawler:


$ scrapy startproject quotes_crawler


Then we move into the new project directory and create a spider for quotes.toscrape.com:


$ cd quotes_crawler
$ scrapy genspider quotes-toscrape quotes.toscrape.com

Created spider 'quotes-toscrape' using template 'basic' in module:
quotes_crawler.spiders.quotes_toscrape


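Then we edit the spider. The command below opens it in the editor named by the EDITOR environment variable (or, if that is unset, the EDITOR setting); opening the generated file directly in any editor works just as well: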


$ scrapy edit quotes-toscrape


Here is the code:


import scrapy


class QuotesToScrapeSpider(scrapy.Spider):
    name = "quotes-toscrape"
    allowed_domains = ["quotes.toscrape.com"]
    start_urls = ["http://quotes.toscrape.com/"]

    def parse(self, response):
        # Each quote on the page sits in its own <div class="quote">.
        for quote in response.css("div.quote"):
            yield {
                "text": quote.css("span.text ::text").get(),
                "author": quote.css("small.author ::text").get(),
                "tags": quote.css("div.tags > a.tag ::text").getall(),
            }
        # Follow the "Next" link if there is one; the last page has none.
        next_page_url = response.css("nav > ul > li.next > a ::attr(href)").get()
        if next_page_url:
            # No callback is given, so the response is handled by parse() again.
            yield scrapy.Request(response.urljoin(next_page_url))


For more information about Scrapy please refer to the Scrapy documentation.
