
builtins.OSError: Disk quota exceeded

File "/usr/local/lib/python3.6/site-packages/twisted/internet/base.py", line 1272, in run
	    self.mainLoop()
	  File "/usr/local/lib/python3.6/site-packages/twisted/internet/base.py", line 1281, in mainLoop
	    self.runUntilCurrent()
	  File "/usr/local/lib/python3.6/site-packages/twisted/internet/base.py", line 902, in runUntilCurrent
	    call.func(*call.args, **call.kw)
	  File "/usr/local/lib/python3.6/site-packages/twisted/internet/task.py", line 671, in _tick
	    taskObj._oneWorkUnit()
	--- <exception caught here> ---
	  File "/usr/local/lib/python3.6/site-packages/twisted/internet/task.py", line 517, in _oneWorkUnit
	    result = next(self._iterator)
	  File "/usr/local/lib/python3.6/site-packages/scrapy/utils/defer.py", line 63, in <genexpr>
	    work = (callable(elem, *args, **named) for elem in iterable)
	  File "/usr/local/lib/python3.6/site-packages/scrapy/core/scraper.py", line 183, in _process_spidermw_output
	    self.crawler.engine.crawl(request=output, spider=spider)
	  File "/usr/local/lib/python3.6/site-packages/scrapy/core/engine.py", line 210, in crawl
	    self.schedule(request, spider)
	  File "/usr/local/lib/python3.6/site-packages/scrapy/core/engine.py", line 216, in schedule
	    if not self.slot.scheduler.enqueue_request(request):
	  File "/usr/local/lib/python3.6/site-packages/scrapy/core/scheduler.py", line 57, in enqueue_request
	    dqok = self._dqpush(request)
	  File "/usr/local/lib/python3.6/site-packages/scrapy/core/scheduler.py", line 86, in _dqpush
	    self.dqs.push(reqd, -request.priority)
	  File "/usr/local/lib/python3.6/site-packages/queuelib/pqueue.py", line 35, in push
	    q.push(obj) # this may fail (eg. serialization error)
	  File "/usr/local/lib/python3.6/site-packages/scrapy/squeues.py", line 16, in push
	    super(SerializableQueue, self).push(s)
	  File "/usr/local/lib/python3.6/site-packages/queuelib/queue.py", line 152, in push
	    self.f.write(string)
	builtins.OSError: [Errno 122] Disk quota exceeded
	

Hi Chandan,


The error is pretty straightforward: your spider is exceeding the available storage capacity.

If you use Scrapy Cloud units, that provides you with 2.5 GB of temporary storage while your spider runs.
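For reference, the traceback above shows the failure in the scheduler's on-disk request queue (_dqpush), which in a stock Scrapy project is enabled by the JOBDIR setting. As a rough sketch, and assuming you don't need pause/resume support, you can keep the pending-request queue in memory, or crawl breadth-first so less queue data sits on disk at once:

# settings.py -- a minimal sketch, not a drop-in fix; adapt to your project.

# The traceback shows the scheduler writing pending requests to its disk
# queue, which a plain Scrapy project only does when JOBDIR is set.
# If you don't need to pause and resume the crawl, leaving JOBDIR unset
# keeps the request queue in memory and avoids writing it to disk at all.
# JOBDIR = "crawls/my-spider"   # intentionally left unset in this sketch

# If you do keep a disk queue, crawling breadth-first keeps it shallower,
# so less queue data accumulates on disk at any one time.
DEPTH_PRIORITY = 1
SCHEDULER_DISK_QUEUE = "scrapy.squeues.PickleFifoDiskQueue"
SCHEDULER_MEMORY_QUEUE = "scrapy.squeues.FifoMemoryQueue"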

Keep in mind, though, that this disk space is not persisted. You'll need to use other techniques to persist data between jobs.
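One such technique, sketched below with a hypothetical bucket name and placeholder credentials, is to export scraped items to external storage via Scrapy's feed exports instead of relying on the job's local disk:

# settings.py -- hypothetical sketch; the bucket name and key values are
# placeholders, not real credentials.
# (S3 feed storage requires the botocore library to be installed.)
FEED_FORMAT = "jsonlines"
FEED_URI = "s3://my-example-bucket/%(name)s/%(time)s.jl"  # %(name)s and %(time)s are expanded by Scrapy
AWS_ACCESS_KEY_ID = "..."       # better supplied via project settings or environment variables
AWS_SECRET_ACCESS_KEY = "..."

Whatever backend you pick, the idea is the same: ship data out of the ephemeral job storage rather than accumulating it there.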



What is the way to overcome this issue?

Hi Ihor,


I'm sorry for the long delay in getting back to you.

The way to overcome this would be to assign more Scrapy Cloud units to your job. You can purchase as many units as you need.


Let me know if you still need help.
