AutoExtract FAQ

Why isn't data extracted correctly?
Web scraping is complex: there are bans, location-specific content, issues with remote websites, and misbehaving web pages. Like humans, any useful machine lea...
Wed, 3 Feb, 2021 at 11:30 AM
How do I use the API?
See https://docs.zyte.com/automatic-extraction.html
Wed, 3 Feb, 2021 at 11:30 AM
How should I use the "probability" field?
This value is an indicator of how confident we are that a page is an individual Product or Article page, depending on whether pageType is "product"...
Wed, 3 Feb, 2021 at 11:30 AM
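As an illustration of using the "probability" field, here is a minimal sketch that keeps only results whose probability meets a cutoff. The response shape (a list of dicts with the extracted item stored under a key named after the page type), the helper name filter_confident, and the 0.5 threshold are all assumptions made for this example; check the API documentation for the actual response format and choose a cutoff that suits your data.

```python
from typing import Any, Dict, List


def filter_confident(
    results: List[Dict[str, Any]],
    page_type: str,
    threshold: float = 0.5,  # assumed cutoff, not an official recommendation
) -> List[Dict[str, Any]]:
    """Return the extracted items whose probability is at least `threshold`.

    Assumes each result stores its item under the `page_type` key
    (for example "product" or "article") with a "probability" field.
    """
    kept = []
    for result in results:
        item = result.get(page_type) or {}
        probability = item.get("probability")
        # Skip results with no item or no probability rather than guessing.
        if probability is not None and probability >= threshold:
            kept.append(item)
    return kept


# Hypothetical sample data, not real API output.
sample = [
    {"product": {"name": "Blue kettle", "probability": 0.98}},
    {"product": {"name": "Category listing", "probability": 0.12}},
    {"error": "Downloader error"},
]
confident = filter_confident(sample, "product", threshold=0.5)
```

A stricter threshold trades coverage for precision: fewer pages pass, but those that do are more likely to be genuine single-item pages.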
What are the possible errors and how should my code handle them?
See https://docs.zyte.com/automatic-extraction.html#errors
Wed, 3 Feb, 2021 at 11:30 AM
What should I do if my request returns with HTTP status code 429 ("too many requests")?
This status code indicates that the service is too busy and that either a per-user or a system-level rate limit has been hit. The best thing to do is to continue sending reque...
Wed, 3 Feb, 2021 at 11:30 AM
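One common way for client code to cope with HTTP 429 is to retry the request with an increasing delay. The sketch below shows that general pattern only; it is not necessarily what the full answer recommends. The function name send_with_backoff, the `send` callable (which is assumed to return a status code and a body), and the delay values are hypothetical and chosen for illustration.

```python
import random
import time
from typing import Callable, Tuple


def send_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,      # assumed limit
    base_delay: float = 1.0,    # assumed starting delay, in seconds
    max_delay: float = 60.0,    # assumed upper bound on a single delay
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `send` until it returns a status other than 429.

    Waits base_delay * 2**attempt seconds (capped at max_delay, plus a
    little random jitter) between attempts. Returns the last response
    if every attempt was rate limited.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
    return status, body
```

Here `send` would wrap whatever HTTP client you use; passing a custom `sleep` makes the retry logic easy to test without real waiting.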
Can I pass custom cookies to be used to download a web page?
At present, the answer is no. That said, please be assured we are working on this feature, so if it's important to you, please reach out so that...
Wed, 3 Feb, 2021 at 11:30 AM
Is JavaScript executed?
We enable or disable JavaScript to get the best extraction result.
Wed, 3 Feb, 2021 at 11:30 AM
Do I have to request URLs against the API in a polite manner or will the API take care of scheduling requests in such a way it doesn't DDoS the site?
The API server rate-limits requests; we try to avoid causing any problems for target websites.
Wed, 3 Feb, 2021 at 11:30 AM
Are the content extraction techniques language agnostic?
Yes, the Automatic Extraction API works on pages in all languages and from all countries.
Wed, 3 Feb, 2021 at 11:30 AM