Deploy fails because settings.py cannot load a resource file

Posted almost 4 years ago by Lawrence Li

I want to ensure everything works using the Starter plan before I purchase the Professional plan.


The settings.py file in my Scrapy project opens a file (user_agent_list.txt) and reads a list of user agents from it to populate the USER_AGENT_LIST setting. The file lives in the resources directory, which is a subdirectory of the directory that contains settings.py. Here's my code snippet:

from pathlib import Path
from typing import TextIO

# Resolve resources/user_agent_list.txt relative to settings.py
user_agent_list_directory: Path = Path(__file__).parent / "resources"
user_agent_list_file: Path = user_agent_list_directory / "user_agent_list.txt"
file: TextIO
with user_agent_list_file.open() as file:
    USER_AGENT_LIST = [user_agent.rstrip('\n') for user_agent in file]

I know I could probably define USER_AGENT_LIST directly in my settings.py file, but reading the user agent list from a file would be a little cleaner.


I get the following error when I run the shub deploy <project number> command. Note that I've changed some strings to remove any identifying information:

Packing version 99be743-master
Deploying to Scrapy Cloud project "######"
Deploy log last 30 lines:
  File "/usr/local/lib/python3.8/site-packages/sh_scrapy/crawl.py", line 209, in shub_image_info
    _run_usercode(None, ['scrapy', 'shub_image_info'] + sys.argv[1:],
  File "/usr/local/lib/python3.8/site-packages/sh_scrapy/crawl.py", line 138, in _run_usercode
    settings = populate_settings(apisettings_func(), spider)
  File "/usr/local/lib/python3.8/site-packages/sh_scrapy/settings.py", line 243, in populate_settings
    return _populate_settings_base(apisettings, _load_default_settings, spider)
  File "/usr/local/lib/python3.8/site-packages/sh_scrapy/settings.py", line 172, in _populate_settings_base
    settings = get_project_settings().copy()
  File "/usr/local/lib/python3.8/site-packages/scrapy/utils/project.py", line 69, in get_project_settings
    settings.setmodule(settings_module_path, priority='project')
  File "/usr/local/lib/python3.8/site-packages/scrapy/settings/__init__.py", line 287, in setmodule
    module = import_module(module)
  File "/usr/local/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 783, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/tmp/unpacked-eggs/__main__.egg/my_project_name/settings.py", line 24, in <module>
    with user_agent_list_file.open() as file:
  File "/usr/local/lib/python3.8/pathlib.py", line 1213, in open
    return io.open(self, mode, buffering, encoding, errors, newline,
  File "/usr/local/lib/python3.8/pathlib.py", line 1069, in _opener
    return self._accessor.open(self, flags, mode)
NotADirectoryError: [Errno 20] Not a directory: '/tmp/unpacked-eggs/__main__.egg/my_project_name/settings.py/my_project_name/resources/user_agent_list.txt'
{"message": "shub-image-info exit code: 1", "details": null, "error": "image_info_error"}

{"status": "error", "message": "Internal error"}
Deploy log location: /tmp/shub_deploy_xzmb457v.log
Error: Deploy failed: b'{"status": "error", "message": "Internal error"}'

This appears to be the only issue with my deploy: when I replace the above code snippet with the following one, which sets USER_AGENT instead, the entire deploy works:

USER_AGENT = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'

Please help. Thanks!
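The NotADirectoryError in the traceback points at a path under __main__.egg, which suggests the project is packaged as an egg archive on Scrapy Cloud, so a path built from __file__ may not exist as a real file on disk. One possible workaround, sketched here under that assumption, is to read the file through the package loader with pkgutil.get_data from the standard library. The helper names parse_user_agents and load_user_agents are hypothetical, and the package name my_project_name is taken from the anonymized traceback:

```python
import pkgutil
from typing import List


def parse_user_agents(raw: bytes) -> List[str]:
    """Decode file contents and return one user agent per non-empty line."""
    return [line.strip() for line in raw.decode("utf-8").splitlines() if line.strip()]


def load_user_agents(package: str, resource: str) -> List[str]:
    """Read a data file via the package loader instead of the filesystem.

    pkgutil.get_data asks the loader that imported the package for the bytes,
    so it can also read files stored inside a zipped egg.
    """
    raw = pkgutil.get_data(package, resource)
    if raw is None:
        # get_data returns None when the package or its loader cannot supply data
        raise FileNotFoundError(f"{resource!r} not found in package {package!r}")
    return parse_user_agents(raw)


# Hypothetical use in settings.py (package name assumed from the traceback):
# USER_AGENT_LIST = load_user_agents("my_project_name", "resources/user_agent_list.txt")
```

For this to work, the text file would also have to be shipped inside the egg, for example by listing it under package_data in the project's setup.py. Whether that alone resolves the deploy error here is an assumption that has not been verified against this project.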
