I have a periodic job that collects items.
Let's assume that job number N found the items:
A,B,C,D
Let's assume that job number N+1 found the items:
B,C,D,Z,T
So we can see that item A is gone and items Z,T were found in run N+1.
Is there a way to calculate the difference in items between jobs N and N+1?
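For two jobs whose items have already been exported, the difference described above is a set difference taken in each direction. A minimal sketch, assuming every item can be reduced to a hashable key (the letters from the example here; in practice probably a URL or an ID field). `diff_jobs` is a hypothetical helper name, not part of Scrapy:

```python
def diff_jobs(previous_keys, current_keys):
    """Return (added, removed) between two runs, given iterables of item keys."""
    previous, current = set(previous_keys), set(current_keys)
    added = current - previous    # present in run N+1 only
    removed = previous - current  # present in run N only
    return added, removed


job_n = ["A", "B", "C", "D"]
job_n_plus_1 = ["B", "C", "D", "Z", "T"]

added, removed = diff_jobs(job_n, job_n_plus_1)
print(sorted(added))    # ['T', 'Z']
print(sorted(removed))  # ['A']
```

Unlike filtering during the crawl, this also reports items that disappeared (A in the example).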
0 Votes
vaz posted over 7 years ago Best Answer
Hey Balder-man,
I think it can be done through these steps:
1. Use the DeltaFetch addon to avoid repeated items between runs N and N+1 of the spider.
2. The difference will then be just the items collected in run N+1, for any N from 1 onward, because every item collected in run N+1 is absent from the previous run.
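When the project runs outside Scrapy Cloud, step 1 roughly corresponds to enabling the `scrapy-deltafetch` spider middleware in the project settings. A sketch, assuming the package is installed and that the order value of 100 shown in its README suits the project:

```python
# settings.py (fragment, assumes `pip install scrapy-deltafetch`)
SPIDER_MIDDLEWARES = {
    "scrapy_deltafetch.DeltaFetch": 100,
}
DELTAFETCH_ENABLED = True
```

DeltaFetch works by skipping requests to pages that yielded items in an earlier run, so it helps mainly when each item comes from its own page, and its state has to persist between runs.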
Best regards,
Pablo
0 Votes
3 Comments
Avishay Balderman posted about 7 years ago
Hi
I was reading https://blog.scrapinghub.com/2016/07/20/scrapy-tips-from-the-pros-july-2016/ and I am not sure it can work for me.
I want the spider to do all requests in run N but drop all items that were found in run N-1.
In my example above - only Z,T should be valid items since they are "new"
How can I do that?
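One possible way to get this behaviour is an item pipeline rather than DeltaFetch: every request is still made, and only the items are filtered. A minimal sketch, assuming each item has a unique field (hypothetically named `id` here) and that a small JSON file (hypothetically `seen_keys.json`) survives between runs. The class name, file name and field name are all illustrative:

```python
import json
import os

try:
    from scrapy.exceptions import DropItem
except ImportError:
    class DropItem(Exception):
        """Stand-in so the sketch also runs without Scrapy installed."""


class NewItemsOnlyPipeline:
    """Drop items whose key was already collected in the previous run."""

    # Hypothetical defaults: adjust the state file path and the key field.
    state_path = "seen_keys.json"
    key_field = "id"

    def open_spider(self, spider):
        self.current = set()
        if os.path.exists(self.state_path):
            with open(self.state_path) as f:
                self.previous = set(json.load(f))
        else:
            self.previous = set()  # first run: nothing to compare against

    def process_item(self, item, spider):
        key = item[self.key_field]
        self.current.add(key)  # remember it even when it gets dropped
        if key in self.previous:
            raise DropItem(f"already seen in previous run: {key}")
        return item

    def close_spider(self, spider):
        # Overwrite the state so the next run compares against this one.
        with open(self.state_path, "w") as f:
            json.dump(sorted(self.current), f)
```

The pipeline would be enabled through `ITEM_PIPELINES` in the project settings. On the first run nothing is dropped. On a hosted platform the local filesystem may not be kept between jobs, so the state file would need to live in whatever storage persists there.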
0 Votes
Avishay Balderman posted about 7 years ago
Thanks
I will check this addon
0 Votes