Periodic Script / IOError: No such file or directory

Posted about 7 years ago by Nicolas David

Answered

Nicolas David

Hi,

I am trying to run a periodic script that reads a JSON file from my project. I followed this article (https://support.scrapinghub.com/support/solutions/articles/22000200416-deploying-non-code-files), but it is not working for me; the structure imported from Scrapinghub looks very different. The script works well until it needs to open this file.

IOError: [Errno 2] No such file or directory : 'resources/bmibmi-67d3f1f00f49.json'

with this in setup.py :

package_data={
    'project': ['resources/*.json']
},

Thanks a lot for your help.

0 Votes

nestor posted over 6 years ago Admin Best Answer

I assume you guys are trying to open the file and that's why you are getting the error.

The bottom part of the article shows how to read the data with pkgutil, which means that data = contents of the file.

0 Votes

7 Comments

jochemtimmers posted over 6 years ago

Did you get an answer on this? I have sort of the same issue

0 Votes

jochemtimmers posted over 6 years ago

Hey Nestor, I don't need to open the file, I just need to pass the filename as a parameter. Without Scrapinghub it works fine, but when I try to upload it to the cloud the file is missing.

My code:

import scrapy
import gspread
from oauth2client.service_account import ServiceAccountCredentials

scope = ['https://spreadsheets.google.com/feeds',
         'https://www.googleapis.com/auth/drive']

credentials = ServiceAccountCredentials.from_json_keyfile_name("resources/pythonSheets-7e2e130f23ff.json", scope)
wks = gspread.authorize(credentials).open("pythonTest").worksheet("storeScraper")

Error:

File "/tmp/unpacked-eggs/__main__.egg/store/spiders/sheetsTestTotalFeed.py", line 8, in <module>
    credentials = ServiceAccountCredentials.from_json_keyfile_name("resources/pythonSheets-7e2e130f24ff.json", scope)
File "/app/python/lib/python2.7/site-packages/oauth2client/service_account.py", line 219, in from_json_keyfile_name
    with open(filename, 'r') as file_obj:
IOError: [Errno 2] No such file or directory: 'resources/pythonSheets-7e2e130f23ff.json'

{"message": "shub-image-info exit code: 1", "details": null, "error": "image_info_error"}

0 Votes

nestor posted over 6 years ago Admin

ServiceAccountCredentials.from_json_keyfile_name is trying to open the file. It's shown right there in the error log:

File "/app/python/lib/python2.7/site-packages/oauth2client/service_account.py", line 219, in from_json_keyfile_name
    with open(filename, 'r') as file_obj:

You need to use pkgutil to read the content of the file, without any additional reading, as if you had explicitly set the JSON content on the variable.

Maybe try something like:

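import json
import pkgutil

# get_data returns the raw bytes of the packaged file
data = pkgutil.get_data("myproject", "resources/yourfile.json")
# from_json_keyfile_dict expects a dict, so parse the JSON text first
keyfile_dict = json.loads(data.decode("UTF-8"))

credentials = ServiceAccountCredentials.from_json_keyfile_dict(keyfile_dict, scope)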
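
If a library only accepts a file path (the case jochemtimmers describes above), one possible workaround is to read the packaged bytes with pkgutil and write them to a temporary file, then pass that path along. This is only a minimal sketch for Python 3: the helper name `materialize_resource` is hypothetical (it is not part of Scrapy, shub, or oauth2client), and the package and resource names are whatever your own project uses.

```python
import os
import pkgutil
import tempfile


def materialize_resource(package, resource, suffix=".json"):
    """Copy a packaged data file to a temporary file and return its path.

    Hypothetical helper: `package` must be an importable package name and
    `resource` a '/'-separated path relative to that package.
    """
    data = pkgutil.get_data(package, resource)
    if data is None:
        # pkgutil.get_data returns None when the package cannot be located.
        raise IOError("package %r not found; cannot read %r" % (package, resource))
    # mkstemp creates the file and hands back an open descriptor plus its path.
    fd, path = tempfile.mkstemp(suffix=suffix)
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(data)
    return path
```

The returned path could then be handed to a call such as ServiceAccountCredentials.from_json_keyfile_name(path, scope). The temporary file is not deleted automatically, so remove it once the credentials have been loaded.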

0 Votes

jochemtimmers posted over 6 years ago

I added your solution:

import scrapy
import gspread
import pkgutil
from oauth2client.service_account import ServiceAccountCredentials
scope = ['https://spreadsheets.google.com/feeds',
         'https://www.googleapis.com/auth/drive']
data = pkgutil.get_data("myproject","pythonSheets-7e2e130f24ff.json")
data = data.decode("UTF-8")
credentials = ServiceAccountCredentials.from_json_keyfile_name(data, scope)

Then I got this error:

File "/tmp/unpacked-eggs/__main__.egg/store/spiders/sheetsTestTotalFeed.py", line 8, in <module>
    data = data.decode("UTF-8")
AttributeError: 'NoneType' object has no attribute 'decode'

{"message": "shub-image-info exit code: 1", "details": null, "error": "image_info_error"}

It still looks like it can't find the file.

My setup file looks like this:

from setuptools import setup, find_packages

setup(
    name         = 'project',
    version      = '1.0',
    packages     = find_packages(),
    package_data={
        'geurboetiek_haarlem': ['*.json']
    },
    data_files = [('', ['pythonSheets-7e2e130f24ff.json'])],
    entry_points = {'scrapy': ['settings = geurboetiek_haarlem.settings']},
    zip_safe=False,
)

0 Votes

nestor posted over 6 years ago Admin

You have several discrepancies in the names you are using:

data = pkgutil.get_data("myproject","pythonSheets-7e2e130f24ff.json")

setup(
    name = 'project',

package_data={ 'geurboetiek_haarlem':

Also, I'm not sure why you've removed the resources folder.
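
The AttributeError reported above is consistent with how pkgutil.get_data behaves: according to the Python documentation, it returns None when the package cannot be located or loaded. The sketch below tries to reproduce that on Python 3 with a throwaway package; `demo_pkg` and `wrong_name` are made-up names used only for illustration.

```python
import importlib
import os
import pkgutil
import sys
import tempfile

# Build a throwaway package on disk: demo_pkg/__init__.py plus
# demo_pkg/resources/key.json (all names are made up for this sketch).
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.makedirs(os.path.join(pkg_dir, "resources"))
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "resources", "key.json"), "w") as f:
    f.write('{"type": "service_account"}')

# Make the throwaway package importable.
sys.path.insert(0, root)
importlib.invalidate_caches()

# Real package name, path relative to the package: the file's bytes come back.
found = pkgutil.get_data("demo_pkg", "resources/key.json")

# A package name that cannot be imported: get_data returns None, so a
# following .decode() call fails with AttributeError.
missing = pkgutil.get_data("wrong_name", "resources/key.json")
```

Applied to the setup.py shown earlier, this suggests that the first argument to get_data should be the real package directory (apparently geurboetiek_haarlem, judging by the settings entry point), and that the JSON file should live inside that package so the package_data pattern can pick it up.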

0 Votes

shikhar_srivastava posted about 6 years ago

I have the same issue.
I am trying to import a list of links and then use it as the list for start_urls.

data = pkgutil.get_data("quotetutorial", "resources/link_list.txt")
data = data.decode('utf-8').splitlines()
start_urls = data

Locally the code works fine. But when I try to deploy it, the same "No such file or directory" error shows up. I know you are pointing out that we are not supposed to open the file, but I do not understand what command to use in that case.
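
For a plain text list of links, the pkgutil approach from earlier in the thread might be wrapped as below. This is a sketch under assumptions, not a confirmed fix: `load_start_urls` is a hypothetical helper name, and it assumes that the package directory contains an `__init__.py` and that setup.py lists a matching pattern such as 'resources/*.txt' under package_data (the examples in this thread only cover '*.json').

```python
import pkgutil


def load_start_urls(package, resource):
    """Return the non-empty lines of a packaged text file as a list.

    Hypothetical helper: `package` is the importable project package and
    `resource` is a '/'-separated path relative to it.
    """
    data = pkgutil.get_data(package, resource)
    if data is None:
        # get_data returns None when the package itself cannot be located.
        raise IOError("package %r not found; cannot read %r" % (package, resource))
    lines = data.decode("utf-8").splitlines()
    # Skip blank lines and surrounding whitespace so each entry is a clean URL.
    return [line.strip() for line in lines if line.strip()]
```

Inside the spider class this could be used as start_urls = load_start_urls("quotetutorial", "resources/link_list.txt"), with the package and file names taken from the post above.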

0 Votes