Periodic Script / IOError: No such file or directory

Hi,

I am trying to run a periodic script that uses a JSON file from within my project. I tried this article (https://support.scrapinghub.com/support/solutions/articles/22000200416-deploying-non-code-files), but it is not working for me; the structure imported from Scrapinghub looks very different. The script works well until I need to call this file.


IOError: [Errno 2] No such file or directory  : 'resources/bmibmi-67d3f1f00f49.json'


With this in setup.py:

package_data={
    'project': ['resources/*.json']
},


Thanks a lot for your help.


Did you get an answer on this? I have sort of the same issue

Answer

I assume you guys are trying to open the file and that's why you are getting the error.

The bottom part of the article shows how to read the data with pkgutil, which means that data = contents of the file.

Hey Nestor, I don't need to open the file, I just need to pass the filename as a parameter. Without Scrapinghub it works fine, but when I try to upload it to the cloud the file is missing.

My code:

 

import scrapy
import gspread
from oauth2client.service_account import ServiceAccountCredentials

scope = ['https://spreadsheets.google.com/feeds',
         'https://www.googleapis.com/auth/drive']

credentials = ServiceAccountCredentials.from_json_keyfile_name("resources/pythonSheets-7e2e130f23ff.json", scope)

wks = gspread.authorize(credentials).open("pythonTest").worksheet("storeScraper")


Error:
 File "/tmp/unpacked-eggs/__main__.egg/store/spiders/sheetsTestTotalFeed.py", line 8, in <module>

    credentials = ServiceAccountCredentials.from_json_keyfile_name("resources/pythonSheets-7e2e130f24ff.json", scope)

  File "/app/python/lib/python2.7/site-packages/oauth2client/service_account.py", line 219, in from_json_keyfile_name

    with open(filename, 'r') as file_obj:

IOError: [Errno 2] No such file or directory: 'resources/pythonSheets-7e2e130f23ff.json'

{"message": "shub-image-info exit code: 1", "details": null, "error": "image_info_error"}



ServiceAccountCredentials.from_json_keyfile_name is trying to open the file. It's shown right there in the error log:


File "/app/python/lib/python2.7/site-packages/oauth2client/service_account.py", line 219, in from_json_keyfile_name

    with open(filename, 'r') as file_obj:


You need to use pkgutil to read the content of the file. pkgutil.get_data returns the file's contents directly, so no further open() call is needed; it is as if you had assigned the JSON content to the variable yourself.


Maybe try something like:


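import json
import pkgutil

from oauth2client.service_account import ServiceAccountCredentials

# get_data returns the raw bytes of the packaged file
data = pkgutil.get_data("myproject", "resources/yourfile.json")
# from_json_keyfile_dict expects a dict, so parse the JSON text first
keyfile_dict = json.loads(data.decode("UTF-8"))

credentials = ServiceAccountCredentials.from_json_keyfile_dict(keyfile_dict, scope)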
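
One thing to watch for with this approach: from_json_keyfile_dict wants a parsed dictionary, not a string, and pkgutil.get_data can return None when it cannot locate the package. A minimal sketch of a helper that handles both; load_keyfile_dict is a made-up name, and "myproject" and "resources/yourfile.json" are placeholders for your own package and file:

```python
import json
import pkgutil


def load_keyfile_dict(package, resource):
    """Read a JSON file shipped as package data and return it as a dict."""
    raw = pkgutil.get_data(package, resource)
    if raw is None:
        # pkgutil.get_data returns None when the package itself cannot be found.
        raise ValueError("package %r not found" % package)
    return json.loads(raw.decode("utf-8"))


# Hypothetical usage inside the spider module (names are placeholders):
# keyfile = load_keyfile_dict("myproject", "resources/yourfile.json")
# credentials = ServiceAccountCredentials.from_json_keyfile_dict(keyfile, scope)
```

Failing early with a clear message should be easier to debug than an AttributeError on None further down.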


 I added your solution:

import scrapy
import gspread
import pkgutil
from oauth2client.service_account import ServiceAccountCredentials
scope = ['https://spreadsheets.google.com/feeds',
         'https://www.googleapis.com/auth/drive']
data = pkgutil.get_data("myproject","pythonSheets-7e2e130f24ff.json")
data = data.decode("UTF-8")
credentials = ServiceAccountCredentials.from_json_keyfile_name(data, scope)

 Then I got this error:


  File "/tmp/unpacked-eggs/__main__.egg/store/spiders/sheetsTestTotalFeed.py", line 8, in <module>

    data = data.decode("UTF-8")

AttributeError: 'NoneType' object has no attribute 'decode'

{"message": "shub-image-info exit code: 1", "details": null, "error": "image_info_error"}


It still looks like it can't find the file


My setup file looks like this:

from setuptools import setup, find_packages

setup(
    name         = 'project',
    version      = '1.0',
    packages     = find_packages(),
    package_data={
        'geurboetiek_haarlem': ['*.json']
    },
    data_files = [('', ['pythonSheets-7e2e130f24ff.json'])],
    entry_points = {'scrapy': ['settings = geurboetiek_haarlem.settings']},
    zip_safe=False,
)

 

You have several discrepancies. Each of these uses a different project name:


data = pkgutil.get_data("myproject","pythonSheets-7e2e130f24ff.json")


setup(
    name         = 'project',


package_data={         'geurboetiek_haarlem':


Also, I'm not sure why you've removed the resources folder.
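
To narrow down which name is wrong: the "'NoneType' object has no attribute 'decode'" error means pkgutil.get_data returned None, which it does when it cannot find the package at all. A file missing from an existing package raises an IOError instead. A rough diagnostic sketch, assuming a top-level package name; describe_resource is a hypothetical helper, not part of pkgutil:

```python
import pkgutil


def describe_resource(package, resource):
    """Report why pkgutil can or cannot read a packaged file."""
    try:
        data = pkgutil.get_data(package, resource)
    except (IOError, OSError):
        # The package was found, but the file is not inside it.
        return "package found, file missing"
    if data is None:
        # The package name does not match any importable package.
        return "package not found"
    return "ok (%d bytes)" % len(data)
```

Calling it with each candidate name should show which one the deployed egg actually contains. Going by the traceback path (.../__main__.egg/store/spiders/...), the importable package may be called store, but that is only a guess from the path.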


I have the same issue.
I am trying to import a list of links and then use it as the list for start_urls.

data = pkgutil.get_data("quotetutorial", "resources/link_list.txt")
data = data.decode('utf-8').splitlines()
start_urls = data

Locally the code works fine. But when I try to deploy it, the same "No such file or directory" error shows up. I know you are pointing out that we are not supposed to open the file, but I do not understand what command to use in that case.
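
The pkgutil call in that snippet is already the right command, so the likely cause (an assumption, since the setup.py is not shown) is that the text file is not being shipped with the project. That would mean link_list.txt needs to live at quotetutorial/resources/link_list.txt and be listed in setup.py under package_data, e.g. 'quotetutorial': ['resources/*.txt']. A minimal sketch of the loading side; load_start_urls is a made-up helper name:

```python
import pkgutil


def load_start_urls(package, resource):
    """Return the non-empty lines of a text file shipped as package data."""
    raw = pkgutil.get_data(package, resource)
    if raw is None:
        # pkgutil.get_data returns None when the package cannot be found.
        raise ValueError("package %r not found" % package)
    lines = raw.decode("utf-8").splitlines()
    # Drop blank lines and surrounding whitespace so each entry is a clean URL.
    return [line.strip() for line in lines if line.strip()]


# Hypothetical usage in the spider class:
# start_urls = load_start_urls("quotetutorial", "resources/link_list.txt")
```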

