priceedge
Is there any way to define a webhook to be called when a spider/job is completed?
natevick
If not, could this be added to a feature request list for Scrapy Cloud?
vaz
Hi, I'm not sure I understand exactly, but could this be helpful?
https://helpdesk.scrapinghub.com/support/solutions/articles/22000200451-getting-notifications-on-certain-events
Best regards,
Pablo
priceedge
That article does not help with automation. The question was about a webhook to be called when the spider completes: the goal is to extract data automatically via the API without having to poll it constantly for changes.
A spider can take anywhere from 2 minutes to 24 hours. Polling the API for changes every 2 minutes adds up to a lot of calls, almost all of them returning an empty result.
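To put numbers on that, here is a sketch of the polling loop this forces on the client (`fetch_job_state` is a placeholder for the actual status call, which is not shown in this thread):

```python
import time


def wait_until_finished(fetch_job_state, interval_s=120, timeout_s=86400):
    """Poll fetch_job_state() until it reports "finished".

    With a 2-minute interval, a job that runs the full 24 hours costs
    up to 86400 / 120 = 720 API calls, almost all of which return an
    unchanged state. A webhook would replace all of them with a single
    inbound request.
    """
    deadline = time.monotonic() + timeout_s
    polls = 0
    while time.monotonic() < deadline:
        polls += 1
        if fetch_job_state() == "finished":
            return polls  # number of API calls spent waiting
        time.sleep(interval_s)
    raise TimeoutError("job did not finish within timeout")
```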
1 person likes this
Jason Z. said about 7 years ago
I was also trying to figure out whether this was possible, and one way to get webhook-like functionality is the spider's closed() method:
https://doc.scrapy.org/en/latest/topics/spiders.html?highlight=closed#scrapy.spiders.Spider.closed
closed() is called after the spider is done, so you can make a request to another URL there. And since you can pass arguments to spiders, you can provide a custom webhook URL to be notified when the spider is completed. Hope that helps.
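A minimal sketch of that approach (the payload shape and the example URLs are my own invention; the webhook URL is passed as an ordinary spider argument):

```python
import json
import urllib.request

try:
    import scrapy
    BaseSpider = scrapy.Spider
except ImportError:
    # Stand-in so the sketch can run without Scrapy installed; a real
    # project subclasses scrapy.Spider, whose default __init__ turns
    # `-a` spider arguments into attributes the same way.
    class BaseSpider:
        def __init__(self, **kwargs):
            self.__dict__.update(kwargs)


class NotifyingSpider(BaseSpider):
    # Run with: scrapy crawl notifying -a webhook_url=https://example.com/hook
    name = "notifying"
    start_urls = ["https://example.com"]

    def parse(self, response):
        yield {"title": response.css("title::text").get()}

    def closed(self, reason):
        # Scrapy calls closed(reason) when the spider finishes
        # (reason is e.g. "finished", "cancelled", or "shutdown").
        webhook_url = getattr(self, "webhook_url", None)
        if not webhook_url:
            return
        payload = json.dumps({"spider": self.name, "reason": reason}).encode()
        req = urllib.request.Request(
            webhook_url,
            data=payload,
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req, timeout=10)
```

The timeout keeps a dead webhook endpoint from hanging the spider's shutdown indefinitely.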
Kevin McIsaac said over 3 years ago
I support this idea for enhanced automation.
Specifically: when a job completes, its output is posted to a webhook. That would let me set a schedule and have the data uploaded to my platform automatically as soon as the job is done. It would be the simplest possible integration.
Engineer Styler said over 1 year ago
I was relying on the Spider.closed callback mentioned above, but the problem is that the job isn't finished when it is called, because the spider process itself hasn't exited yet. It's kind of strange for a platform like this not to have any webhook support.