Continuously sending scraped data to frontend


Is it possible to send a request to a scraper on Scrapinghub to start it, and have the results continuously returned to the frontend that called the scraper?


It sounds like you need to integrate two different systems: a crawler that scrapes the content from the web and produces a "database", and a consumer job that uses that data.

If you wish to run your Crawler inside Scrapy Cloud, we offer a few APIs that may help you to create this integration.

Using the Scrapy Cloud Jobs API you may control running jobs:

We also offer a set of Storage APIs that provide several endpoints for working with the results of jobs and spiders:
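As a rough illustration of how the two APIs mentioned above could be called, here is a minimal Python sketch that only builds the HTTP requests and does not send them. It assumes the `run.json` endpoint for starting a job and the `items` storage endpoint for reading results, with the API key passed as the HTTP basic auth username; the function names and the `job_key` format are illustrative, so check the current API documentation before relying on them.

```python
import base64
import urllib.parse
import urllib.request

# Assumed endpoints; verify them against the current API documentation.
RUN_URL = "https://app.scrapinghub.com/api/run.json"
ITEMS_URL = "https://storage.scrapinghub.com/items/{job_key}"


def _basic_auth(api_key):
    """The API key is sent as the basic auth username with an empty password."""
    token = base64.b64encode(f"{api_key}:".encode()).decode()
    return {"Authorization": f"Basic {token}"}


def build_run_request(api_key, project_id, spider):
    """Build (but do not send) a POST request that schedules a spider run."""
    data = urllib.parse.urlencode({"project": project_id, "spider": spider}).encode()
    return urllib.request.Request(
        RUN_URL, data=data, headers=_basic_auth(api_key), method="POST"
    )


def build_items_request(api_key, job_key):
    """Build (but do not send) a GET request for the items of one job.

    job_key is assumed to look like "<project>/<spider>/<job>".
    """
    url = ITEMS_URL.format(job_key=job_key) + "?format=json"
    return urllib.request.Request(url, headers=_basic_auth(api_key))
```

Passing either request to `urllib.request.urlopen` would perform the actual call; the response to the run request should contain the job identifier needed for the items request.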

If you need real-time data, I also recommend checking out ScrapyRT. It is not yet supported as a product and not yet integrated with Scrapy Cloud, but it's worth mentioning:

Let me know if these help your current project.

Actually, I need something better, because I need to process the scraped data on the backend with a deep learning model.
So a better solution is this: I start a process on the backend, which starts the scraper on Scrapinghub and processes the results as they are continuously returned. The frontend then asks the backend for the latest results every x seconds.

The business problem is the following.
On a hotel reviews website, the user opens a hotel profile. A browser extension is triggered and the scraper begins downloading all reviews for that hotel. A summary of the reviews is displayed immediately. As more reviews are scraped, the summary updates every few seconds.
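The backend process described above could be sketched roughly as follows. This is only a sketch under stated assumptions: `fetch_since` and `is_finished` are hypothetical callables standing in for whatever API returns the job's items and status, each review is assumed to carry a numeric `rating` field, and a running average stands in for the deep learning model.

```python
import time


class RunningSummary:
    """Placeholder for the model: keeps a running average of review ratings."""

    def __init__(self):
        self.count = 0
        self.rating_total = 0.0

    def update(self, reviews):
        for review in reviews:
            self.count += 1
            self.rating_total += float(review["rating"])  # assumed field name

    def snapshot(self):
        """Current summary; this is what the frontend would poll for."""
        average = self.rating_total / self.count if self.count else None
        return {"reviews_seen": self.count, "average_rating": average}


def consume(fetch_since, is_finished, summary, poll_seconds=2.0, sleep=time.sleep):
    """Poll for new items until the scraping job is finished and drained.

    fetch_since(offset) -> list of items starting at that index (hypothetical).
    is_finished() -> True once the scraping job has ended (hypothetical).
    """
    offset = 0
    while True:
        # Check the status before fetching, so items written just before
        # the job ended are still picked up by the fetch that follows.
        finished = is_finished()
        batch = fetch_since(offset)
        if batch:
            summary.update(batch)
            offset += len(batch)
        elif finished:
            return summary.snapshot()
        else:
            sleep(poll_seconds)
```

In a real service, `consume` would run in a background worker and the frontend's periodic request would simply return `summary.snapshot()`, so the displayed summary improves as more reviews arrive.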