How should I use the "probability" field?

Modified on Wed, 3 Feb, 2021 at 11:30 AM

This value indicates how confident we are that a page is an individual Product or Article page, depending on whether pageType is "product" or "article". The closer the value is to 1, the higher the confidence. For example, when you're scraping products, "probability" is high on product pages and low on product list pages, blog pages, 404 error pages, etc. You can use this field to filter out non-product or non-article pages: keep only results whose probability is above a certain threshold. The recommended default threshold is 0.5 (i.e. use probability > 0.5), but you may choose a different one: a higher threshold filters more aggressively, while a lower one keeps more borderline pages.
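
As a rough illustration of the filtering described above, here is a minimal Python sketch. It assumes each result has already been parsed into a dictionary with a top-level "probability" key; the actual nesting in your responses may differ. The function name filter_by_probability, the "url" key, and the sample data are hypothetical and only there to make the example runnable.

```python
from typing import Any, Dict, Iterable, List

# Recommended default from this article; adjust to suit your data.
DEFAULT_THRESHOLD = 0.5


def filter_by_probability(
    items: Iterable[Dict[str, Any]],
    threshold: float = DEFAULT_THRESHOLD,
) -> List[Dict[str, Any]]:
    """Keep only items whose "probability" is strictly above the threshold.

    Assumption: every item is a dict with a top-level "probability" key.
    Items without that key are dropped rather than guessed at.
    """
    kept = []
    for item in items:
        probability = item.get("probability")
        if probability is not None and probability > threshold:
            kept.append(item)
    return kept


# Hypothetical sample data, not real extraction output.
sample_items = [
    {"url": "https://example.com/product/1", "probability": 0.97},
    {"url": "https://example.com/category/shoes", "probability": 0.08},
    {"url": "https://example.com/404", "probability": 0.01},
    {"url": "https://example.com/product/2", "probability": 0.5},
]

product_pages = filter_by_probability(sample_items)
```

Note that the comparison is strict (probability > 0.5), matching the recommendation above, so an item with a probability of exactly 0.5 is filtered out. Pass a different threshold argument if you prefer stricter or looser filtering.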
