Rejected message because it was too big: ITM {

Posted over 6 years ago by RestStep

Answered

Apparently, there is a 1 MB limitation on serialized items. Is there a way to remove the limitation? I need around 6 MB at least.

0 Votes

nestor posted over 6 years ago Admin Best Answer

There's no way to remove the limitation. Depending on your use case, there are a few workarounds. One is to split large items into several smaller ones, for example when an item accumulates data from a paginated list. Another is to enable the Page Storage addon and access the raw HTML pages from Collections (if you are storing raw HTML as an item). A third is to store your items in Amazon S3 using FeedExport.
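As a rough illustration of the first suggestion (splitting one large item into several), here is a minimal sketch. It assumes the item is a dict holding a long list of records, that the limit applies to the item's serialized size, and that JSON length is a fair stand-in for that size. The names `split_records`, `serialized_size`, `MAX_ITEM_BYTES`, and the `part` and `records` fields are made up for this example and are not part of any Zyte or Scrapy API.

```python
import json

# Assumed budget, kept a little under the 1 MB limit mentioned in the question.
MAX_ITEM_BYTES = 1_000_000


def serialized_size(obj):
    """Return the size in bytes of obj encoded as UTF-8 JSON."""
    return len(json.dumps(obj, ensure_ascii=False).encode("utf-8"))


def split_records(records, base_fields, max_bytes=MAX_ITEM_BYTES):
    """Yield dicts that each hold a slice of records and stay under max_bytes.

    base_fields is copied into every part so the parts can be regrouped later.
    A single record that is already larger than max_bytes is still yielded on
    its own; this sketch does not try to split inside a record.
    """
    chunk = []
    part = 0
    for record in records:
        candidate = {**base_fields, "part": part, "records": chunk + [record]}
        # Re-serializing on every step is simple but quadratic per chunk;
        # fine for a sketch, worth optimizing for very long lists.
        if chunk and serialized_size(candidate) > max_bytes:
            yield {**base_fields, "part": part, "records": chunk}
            part += 1
            chunk = [record]
        else:
            chunk.append(record)
    if chunk:
        yield {**base_fields, "part": part, "records": chunk}
```

In a Scrapy callback this could be used as `yield from split_records(rows, {"url": response.url})`, where `rows` is whatever list the spider has accumulated. Each yielded dict then becomes its own item.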

1 Votes


3 Comments

SAI KATTA posted over 2 years ago

Hi, in my case the extracted data is assigned to a few variables and returned as JSON. Does the 1 MB limitation apply to the entire JSON, or to each variable in the JSON? Could you please confirm? Response = { Pagename: , Html content: , Downloaded pdf: } All of these are returned as one item under the Zyte Items tab. Is the 1 MB limitation for the entire response, or for the page name, HTML content, and PDF content individually?

0 Votes

Mattia Ferrini posted over 5 years ago

From time to time, my scraper is not able to parse the HTML.

I am trying to get access to the raw HTML. I have enabled the Page Storage addon and I raise an error, but I get a warning that says "Page not saved, body too large: ".

Any workaround?


0 Votes
