Continuously sending scraped data to frontend

Posted over 5 years ago by Jakub Bares


Hey,

Is it possible to send a request to a scraper on Scrapinghub to start it, and have the results continuously returned to the frontend from where I called the scraper?

0 Votes


3 Comments


peixoto (Admin) posted over 5 years ago

Hi,


Sounds like you need to integrate two different systems: a crawler that scrapes content from the web and produces a "database", and a consumer job that uses that data.

If you wish to run your crawler inside Scrapy Cloud, we offer a few APIs that may help you create this integration.


Using the Scrapy Cloud Jobs API you may control running jobs: https://doc.scrapinghub.com/api/jobs.html#jobs-api
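For example, a minimal sketch of scheduling a run through that API could look roughly like this; the project ID, spider name, and API key below are placeholders for your own values:

```python
import requests

API_KEY = "YOUR_API_KEY"        # placeholder API key
PROJECT_ID = "123"              # placeholder project ID
SPIDER_NAME = "hotel_reviews"   # hypothetical spider name

# Schedule a run of the spider via the Jobs API.
resp = requests.post(
    "https://app.scrapinghub.com/api/run.json",
    auth=(API_KEY, ""),          # the API key is used as the HTTP auth username
    data={"project": PROJECT_ID, "spider": SPIDER_NAME},
)
resp.raise_for_status()
job_id = resp.json()["jobid"]    # e.g. "123/1/4" -- keep this to fetch results later
print("started job", job_id)
```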

We also offer a set of Storage APIs providing several endpoints for working with the results of jobs and spiders: https://doc.scrapinghub.com/scrapy-cloud.html#storage-scrapinghub-com
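Since items become available in storage while the job is still running, you can poll them incrementally. A rough sketch, assuming the job ID from the previous snippet and using the item offset key to resume where the last poll left off:

```python
import time
import requests

API_KEY = "YOUR_API_KEY"        # placeholder
JOB_ID = "123/1/4"              # the jobid returned when the run was scheduled

def job_state(job_id):
    # Ask the Jobs API whether the job is still running.
    project = job_id.split("/")[0]
    resp = requests.get(
        "https://app.scrapinghub.com/api/jobs/list.json",
        auth=(API_KEY, ""),
        params={"project": project, "job": job_id},
    )
    resp.raise_for_status()
    return resp.json()["jobs"][0]["state"]    # e.g. "pending", "running", "finished"

seen = 0
while True:
    # Fetch only items written after the ones already consumed, using the
    # item offset key ("project/spider/job/itemno") as the start point.
    resp = requests.get(
        f"https://storage.scrapinghub.com/items/{JOB_ID}",
        auth=(API_KEY, ""),
        params={"format": "json", "start": f"{JOB_ID}/{seen}"},
    )
    resp.raise_for_status()
    items = resp.json()
    for item in items:
        print(item)                           # replace with your own processing
    seen += len(items)
    if not items and job_state(JOB_ID) == "finished":
        break                                 # job done and everything consumed
    time.sleep(5)                             # poll every few seconds
```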


If you need real-time data, I also recommend checking out ScrapyRT. It is not yet supported as a product and not yet integrated into Scrapy Cloud, but it's worth mentioning: https://github.com/scrapinghub/scrapyrt
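For instance, assuming ScrapyRT is running with its defaults (port 9080) inside your Scrapy project, a single synchronous crawl request could look like this; the spider name and URL are placeholders:

```python
import requests

# Ask ScrapyRT to run the spider against one URL and return the items
# scraped during this request in the response body.
resp = requests.get(
    "http://localhost:9080/crawl.json",
    params={
        "spider_name": "hotel_reviews",        # hypothetical spider name
        "url": "http://example.com/hotel/1",   # start URL passed to the spider
    },
)
resp.raise_for_status()
data = resp.json()
print(data["status"], len(data.get("items", [])))
```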


Let me know if these help your current project.

0 Votes


Jakub Bares posted over 5 years ago

The business problem is the following.
On a hotel reviews website, a user opens a hotel profile. A browser extension gets triggered and the scraper begins downloading all reviews for the hotel. A summary of the reviews is displayed immediately, and as more reviews are scraped, the summary updates every few seconds.

0 Votes


Jakub Bares posted over 5 years ago

Actually, I need something better, because I need to process the scraped data on the backend with a deep learning model.
So the better solution is: I will start a process on the backend, from which I will start the scraper on Scrapinghub and process the results it continuously returns. Then the frontend will ask the backend for the latest results every x seconds.
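A minimal sketch of that shape, assuming Flask for the endpoint the frontend polls (an assumption; any web framework works). Here iter_new_items is a labeled placeholder for the incremental Storage API polling shown in the admin's reply, and score is a stand-in for the deep learning model:

```python
import threading
from flask import Flask, jsonify

app = Flask(__name__)
lock = threading.Lock()
summary = {"processed": 0, "last_score": None}   # latest aggregate the frontend polls

def score(item):
    # Hypothetical stand-in for the backend deep learning model.
    return len(item.get("review_text", ""))

def iter_new_items(job_id):
    # Placeholder for the incremental Storage API polling loop; here it
    # just yields two fake reviews so the sketch runs end to end.
    yield {"review_text": "Great stay, friendly staff."}
    yield {"review_text": "Noisy room, would not return."}

def consume(job_id):
    # Background loop: pull new items as the job produces them, run them
    # through the model, and update the shared summary.
    for item in iter_new_items(job_id):
        s = score(item)
        with lock:
            summary["processed"] += 1
            summary["last_score"] = s

@app.route("/summary")
def get_summary():
    # The frontend asks the backend for the latest results every x seconds here.
    with lock:
        return jsonify(summary)

if __name__ == "__main__":
    threading.Thread(target=consume, args=("123/1/4",), daemon=True).start()
    app.run(port=5000)
```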

0 Votes
