Apparently, there is a 1 MB limitation on serialized items. Is there a way to remove the limitation? I need around 6 MB at least.
0 Votes
nestor posted
over 6 years ago
Admin · Best Answer
There's no way to remove the limitation, but depending on your use case: one option is to split your items into several smaller ones, for example one item per page of a paginated list instead of a single accumulated item. Another is to enable the Page Storage addon and access the raw HTML pages from Collections (if you are storing raw HTML as an item). A third is to store your items in Amazon S3 using FeedExport.
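For the first option, here is a minimal sketch of splitting accumulated paginated data into one item per page, assuming a plain Scrapy spider (the spider name, URL, and CSS selectors are placeholders, not from this thread):

import scrapy

class ListingSpider(scrapy.Spider):
    # Hypothetical spider: yield one small item per results page instead
    # of accumulating every page into a single oversized item.
    name = "listing"
    start_urls = ["https://example.com/catalog?page=1"]  # placeholder URL

    def parse(self, response):
        # A shared key plus the page number lets the pieces be re-joined
        # downstream, while each serialized item stays well below 1 MB.
        yield {
            "listing_id": "catalog",
            "page": response.url.rsplit("=", 1)[-1],
            "rows": response.css("tr.result").getall(),  # placeholder selector
        }
        next_page = response.css("a.next::attr(href)").get()
        if next_page:
            yield response.follow(next_page, callback=self.parse)

Each yielded dict is stored as its own item, so the size check applies per page rather than to the whole accumulated list.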
1 Votes
3 Comments
SAI KATTA posted
over 2 years ago
Hi,
In my case the extracted data is assigned to a few variables and returned as JSON, something like:
Response = {
    Pagename:
    HTML content:
    Downloaded PDF:
}
All of these are returned as one item under the Zyte Items tab. Is the 1 MB limitation on the entire response, or on the page name, HTML content, and PDF content individually? Could you please confirm?
0 Votes
Mattia Ferrini posted
over 5 years ago
From time to time my scraper is not able to parse the HTML, so I am trying to get access to the raw HTML. I have enabled the Page Storage addon, but I get a warning that says "Page not saved, body too large: ".
Any workaround?
0 Votes
nestor posted
over 6 years ago
Admin · Answer
There's no way to remove the limitation, but depending on your use case: one option is to split your items into several smaller ones, for example one item per page of a paginated list instead of a single accumulated item. Another is to enable the Page Storage addon and access the raw HTML pages from Collections (if you are storing raw HTML as an item). A third is to store your items in Amazon S3 using FeedExport.
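For the Amazon S3 route, a sketch of the feed export configuration in settings.py (the bucket name and credentials are placeholders; recent Scrapy versions use the FEEDS setting shown here, while older releases used FEED_URI and FEED_FORMAT, and writing to S3 requires botocore to be installed):

# settings.py -- export items straight to S3 instead of item storage.
FEEDS = {
    "s3://my-bucket/%(name)s/%(time)s.jl": {  # placeholder bucket
        "format": "jsonlines",
    },
}
AWS_ACCESS_KEY_ID = "..."      # supply real credentials
AWS_SECRET_ACCESS_KEY = "..."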