
How to convert a Scrapy project into an executable with Pyinstaller?

I have been stuck for a few days trying to convert my Scrapy project into an .exe file with Pyinstaller.


Does anyone have ideas on this?


Thanks,





Special thanks to coltoneakins for answering this on Stack Overflow.


https://stackoverflow.com/questions/49085970/no-such-file-or-directory-error-using-pyinstaller-and-scrapy


You did not use Pyinstaller properly when you built your stand-alone program. Here is a short, layman's description of how Pyinstaller works: Pyinstaller bundles the Python interpreter, necessary DLLs (for Windows), your project's source code, and all the modules it can find into a folder or self-extracting executable. Pyinstaller does not include modules or files it cannot find in the final .exe (Windows), .app (macOS), folder, etc. that results when you run Pyinstaller.

So, here is what happened:

 FileNotFoundError: [Errno 2] No such file or directory: '/tmp/_MEIbxALM3/scrapy/VERSION'

You ran your frozen/stand-alone program. As soon as you did this, your program was 'extracted' to a new, temporary folder on your computer, /tmp/_MEIbxALM3/. This folder contains the Python interpreter, your program's source code, and the modules Pyinstaller managed to find (plus a couple of other necessary files).
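
To make the extraction step concrete, here is a minimal sketch of how a program can find out at run time whether it is frozen and where its bundled files live. It relies on the sys.frozen and sys._MEIPASS attributes that Pyinstaller's bootloader sets; the helper names bundle_dir and resource_path are hypothetical and not part of the original answer.

```python
import os
import sys


def bundle_dir():
    """Return the folder that holds the program's bundled files.

    In a Pyinstaller build, sys.frozen is set and sys._MEIPASS points at
    the folder the bundle was unpacked into (the _MEIxxxxxx directory for
    a one-file build). When run as a normal script, this sketch simply
    falls back to the current working directory.
    """
    if getattr(sys, "frozen", False) and hasattr(sys, "_MEIPASS"):
        return sys._MEIPASS
    return os.path.abspath(".")


def resource_path(relative_path):
    """Build an absolute path to a data file shipped with the program."""
    return os.path.join(bundle_dir(), relative_path)


if __name__ == "__main__":
    print("frozen:", getattr(sys, "frozen", False))
    print("bundle dir:", bundle_dir())
    print("scrapy.cfg expected at:", resource_path("scrapy.cfg"))
```

Printing these values from inside a frozen build is a quick way to see which temporary folder a FileNotFoundError path refers to.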

The Scrapy module is more than just a module. It is an entire framework. It has its own plain text files (besides Python files) that it uses. And, it imports a lot of modules itself.

The Scrapy framework especially does not get along with Pyinstaller because it uses many methods to import modules that Pyinstaller cannot 'see'. Also, Pyinstaller basically makes no attempt to include files in the final build that are not .py files unless you tell it to.
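
As an illustration of an import that Pyinstaller cannot 'see', the sketch below loads an object from a dotted-path string. This is a simplified stand-in written for this post, not Scrapy's actual code, and the function name load_from_string is hypothetical. The point is that the module name only exists as a string at run time, so a tool that scans source files for import statements has nothing to find.

```python
from importlib import import_module


def load_from_string(dotted_path):
    """Import and return the object named by a string like 'package.module.name'."""
    module_path, _, attribute = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError("expected a dotted path such as 'package.module.name'")
    # No 'import json' statement appears anywhere in this file, so a static
    # scan of the source gives no hint that the json module is needed.
    module = import_module(module_path)
    return getattr(module, attribute)


if __name__ == "__main__":
    dumps = load_from_string("json.dumps")
    print(dumps({"ok": True}))  # prints {"ok": true}
```

Modules that are only reached this way are the ones you have to name yourself, for example through hiddenimports.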

So, what really happened?

The text file 'VERSION' that exists in the 'normal' scrapy module on your computer (that you had installed with pip or pipenv) was not included in the copycat scrapy module in the build of your program. Scrapy needs this file; Python is giving you the FileNotFoundError because it simply was never included. So, you have to include the file in the build of your program with Pyinstaller.
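
If you want to see which files are at risk of being left out, a small sketch like the following lists everything inside an installed package that is not a Python source or bytecode file. The function name non_python_files is hypothetical, and it only inspects your normal (non-frozen) installation.

```python
import importlib.util
import os


def non_python_files(package_name):
    """List files inside an installed package that are not .py/.pyc/.pyo files."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        raise ValueError(f"{package_name!r} is not an importable package")
    found = []
    for root in spec.submodule_search_locations:
        for dirpath, dirnames, filenames in os.walk(root):
            # Skip bytecode caches; they never hold data files.
            dirnames[:] = [d for d in dirnames if d != "__pycache__"]
            for name in filenames:
                if not name.endswith((".py", ".pyc", ".pyo")):
                    full_path = os.path.join(dirpath, name)
                    found.append(os.path.relpath(full_path, root))
    return sorted(found)


if __name__ == "__main__":
    import sys

    # Pass a package name on the command line, for example: scrapy
    target = sys.argv[1] if len(sys.argv) > 1 else "json"
    for path in non_python_files(target):
        print(path)
```

Run against scrapy in the environment where it is installed, the listing should include the VERSION file named in the error above.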

How do you tell Pyinstaller where to find modules and files?

One suggested fix is to just copy the missing files from where they are installed on your computer into the build folder that Pyinstaller produces. This does work. But there is a better way, and Pyinstaller can do more of the work for you (preventing further ImportErrors and FileNotFoundErrors you may get). See below:

build.spec Files are Your Friend

Spec files are just Python files that Pyinstaller uses like a configuration file to tell it how to build your program. You can read more about them in the Pyinstaller documentation. Below is an example of a real build.spec file I used recently to build a Scrapy program with a GUI for Windows (my project's name is B.O.T. Bot):

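import os  # explicit import, so the spec does not rely on names Pyinstaller provides
import gooey

# Gooey's language and image folders are data files, so collect them as Trees.
gooey_root = os.path.dirname(gooey.__file__)
gooey_languages = Tree(os.path.join(gooey_root, 'languages'), prefix = 'gooey/languages')
gooey_images = Tree(os.path.join(gooey_root, 'images'), prefix = 'gooey/images')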
a = Analysis(['botbotgui.py'],
             pathex=['C:\\Users\\Colton\\.virtualenvs\\bot-bot-JBkeVQQB\\Scripts', 'C:\\Program Files (x86)\\Windows Kits\\10\\Redist\\ucrt\\DLLs\\x86'],
             hiddenimports=['botbot.spiders.spider'],
             hookspath=['.\\hooks\\'],
             runtime_hooks=None,
             datas=[('.\\spiders\\','.\\spiders\\'), ('.\\settings.py','.'),
                    ('.\\scrapy.cfg','.'), ('.\\items.py','.'), ('.\\itemloaders.py','.'),
                    ('.\\middlewares.py','.'), ('.\\pipelines.py','.')
                   ]
             )
pyz = PYZ(a.pure)

options = [('u', None, 'OPTION'), ('u', None, 'OPTION'), ('u', None, 'OPTION')]

exe = EXE(pyz,
          a.scripts,
          a.binaries,
          a.zipfiles,
          a.datas,
          options,
          gooey_languages, # Add them in to collected files
          gooey_images, # Same here.
          name='BOT_Bot_GUI',
          debug=False,
          strip=None,
          upx=True,
          console=False,
          windowed=True,
          icon=os.path.join(gooey_root, 'images', 'program_icon.ico'))

#coll = COLLECT(exe,
    #a.binaries,
    #a.zipfiles,
    #a.datas,
    #options,
    #gooey_languages, # Add them in to collected files
    #gooey_images, # Same here.
    #name='BOT_Bot_GUI',
    #debug=False,
    #strip=False,
    #upx=True,
    #console=False,
    #windowed=True,
    #icon=os.path.join(gooey_root, 'images', 'program_icon.ico'))

Uncomment the last region if you want to build a folder instead of a stand-alone .exe. This is a configuration file specific to my computer and project structure. So in your file, you would have to change a few things (for example, pathex to tell Pyinstaller where to find DLLs on Windows 10). But the premise is the same.

My project directory looks like this:

botbotgui.py  botbot.py  hooks  images  __init__.py  itemloaders.py  items.py  middlewares.py  pipelines.py  __pycache__  scrapy.cfg  settings.py  spiders

Pay special attention to the hooks/ directory. Using hooks will save you from a lot of headaches down the road. You can read more about Pyinstaller's hooks feature in the Pyinstaller documentation. In the hooks/ directory there is a hook file for Scrapy. It tells Pyinstaller to include many modules and files that it would otherwise have missed. This is the most important thing I have written here so far. If you do not do this step, you will keep getting ImportErrors every time you try to run a Scrapy program built using Pyinstaller. Scrapy imports MANY modules that Pyinstaller misses.

hook-scrapy.py (Note: Your hook file must be named just like this.):

from PyInstaller.utils.hooks import collect_submodules, collect_data_files

# This collects all dynamically imported scrapy modules and data files.
hiddenimports = (collect_submodules('scrapy') +
                 collect_submodules('scrapy.pipelines') +
                 collect_submodules('scrapy.extensions') +
                 collect_submodules('scrapy.utils')
)
datas = collect_data_files('scrapy')
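
To get a feel for what ends up in hiddenimports, here is a rough standard-library approximation of listing every submodule of a package. It is only a sketch for illustration: list_submodules is a hypothetical name, and Pyinstaller's own collect_submodules is implemented differently, so treat the output as a preview rather than an exact match.

```python
import importlib
import pkgutil


def list_submodules(package_name):
    """Return the package name plus the dotted names of all its submodules."""
    package = importlib.import_module(package_name)
    names = [package.__name__]
    # walk_packages recurses into subpackages (importing them to do so)
    # and yields one entry per module it finds on the package's path.
    for info in pkgutil.walk_packages(package.__path__, prefix=package.__name__ + "."):
        names.append(info.name)
    return names


if __name__ == "__main__":
    # 'json' is used here only because it ships with Python; with Scrapy
    # installed you could pass 'scrapy' instead.
    for name in list_submodules("json"):
        print(name)
```

Each printed name is a module that would be listed explicitly, which is why the hook does not depend on Pyinstaller spotting the imports in Scrapy's source.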

After you have finished writing a proper build.spec file, all you need to do is run Pyinstaller like this in your shell prompt:

pyinstaller build.spec

Pyinstaller should then spit out a proper build of your program that should work. Problem solved.


Where should I put this build.spec file?

