Continuously sending scraped data to frontend

Posted about 6 years ago by Jakub Bares


Unanswered

Jakub Bares

Hey,

Is it possible to send a request to a scraper on Scrapinghub to start it, and have the results continuously returned to the frontend from which I called the scraper?

0 Votes

3 Comments

Jakub Bares posted about 6 years ago

Actually, I need something better.
I need to process the scraped data on the backend with a deep learning model.
So a better solution is to start a process on the backend that launches the scraper on Scrapinghub and processes the results as it continuously returns them. The frontend would then ask the backend for the final results every x seconds.
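A minimal sketch of that backend side, assuming the scraped reviews arrive as a Python iterable: a worker thread consumes reviews and updates a running summary, and the frontend polls snapshot() every x seconds. The ReviewSummary class and the numeric "score" field are hypothetical placeholders; the deep learning model would replace the averaging step.

```python
import threading


class ReviewSummary:
    """Running summary updated by a worker thread and polled by the frontend."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count = 0
        self._score_sum = 0.0
        self._done = False

    def add(self, review):
        # Placeholder for the model: only averages a hypothetical "score" field.
        with self._lock:
            self._count += 1
            self._score_sum += review["score"]

    def finish(self):
        with self._lock:
            self._done = True

    def snapshot(self):
        # What a polling endpoint would return to the frontend.
        with self._lock:
            mean = self._score_sum / self._count if self._count else None
            return {"reviews": self._count, "mean_score": mean, "done": self._done}


def consume(items, summary):
    # Process each review as soon as the scraper hands it over.
    for review in items:
        summary.add(review)
    summary.finish()


def start_worker(items, summary):
    worker = threading.Thread(target=consume, args=(items, summary), daemon=True)
    worker.start()
    return worker
```

The lock keeps each snapshot consistent even while the worker is in the middle of an update, so the frontend can poll at any interval.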

0 Votes

Jakub Bares posted about 6 years ago

The business problem is the following.
On a hotel review website, a user opens a hotel profile. A browser extension is triggered and the scraper begins downloading all reviews for the hotel. A summary of the reviews is displayed immediately. As more reviews are scraped, the summary updates every few seconds.

0 Votes

peixoto posted almost 6 years ago Admin

Hi,

It sounds like you need to integrate two different systems: a crawler that scrapes the content from the web and produces a "database", and a consumer job that uses that data.

If you wish to run your Crawler inside Scrapy Cloud, we offer a few APIs that may help you to create this integration.

Using the Scrapy Cloud Jobs API you may control running jobs: https://doc.scrapinghub.com/api/jobs.html#jobs-api

We also offer a set of Storage APIs that provide several endpoints for working with the results of jobs and spiders: https://doc.scrapinghub.com/scrapy-cloud.html#storage-scrapinghub-com
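As a rough illustration of how a consumer could combine the two APIs, the sketch below polls for new items while a job is still running. It does not call the real endpoints: fetch_items and job_finished are hypothetical callables that you would implement on top of the Storage and Jobs APIs, and the offset-based paging is an assumption about how you choose to read the items.

```python
import time


def stream_items(fetch_items, job_finished, poll_seconds=2.0, sleep=time.sleep):
    """Yield items from a running job as they become available.

    fetch_items(offset): hypothetical, returns the items stored from `offset` on.
    job_finished(): hypothetical, returns True once the job has stopped.
    """
    offset = 0
    while True:
        # Read the job state before fetching, so items written just before
        # the job finished are still picked up by a later fetch.
        finished = job_finished()
        batch = fetch_items(offset)
        for item in batch:
            yield item
        offset += len(batch)
        if finished and not batch:
            break
        if not batch:
            # Nothing new yet: wait before asking again.
            sleep(poll_seconds)
```

Because this is a generator, the consumer job can process each item as soon as it is yielded instead of waiting for the crawl to end.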

If you need real-time data, I also recommend checking out ScrapyRT. It is not yet supported as a product and not yet integrated with Scrapy Cloud, but it's worth mentioning: https://github.com/scrapinghub/scrapyrt
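For reference, ScrapyRT exposes an HTTP endpoint, crawl.json, that takes a spider name and a URL and returns the scraped items as JSON. Here is a small sketch of building such a request and reading the items from a response body. It assumes a ScrapyRT server running locally on its default port 9080; the spider name you pass is whatever your own project defines. As far as I know, the response is returned once the spider has finished handling the request, so it is better suited to single pages than to long crawls.

```python
import json
from urllib.parse import urlencode


def crawl_url(spider_name, start_url, base="http://localhost:9080"):
    # Build the GET URL for ScrapyRT's crawl.json endpoint.
    query = urlencode({"spider_name": spider_name, "url": start_url})
    return f"{base}/crawl.json?{query}"


def items_from_response(body):
    # The JSON response carries the scraped items under the "items" key.
    payload = json.loads(body)
    return payload.get("items", [])
```

You could fetch the URL with any HTTP client on the backend and pass the response text to items_from_response.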

Let me know if these help your current project.

0 Votes