PEP: 273
Title: Import Modules from Zip Archives
Version: $Revision$
Last-Modified: $Date$
Author: jim@interet.com (James C. Ahlstrom)
Status: Draft
Type: Standards Track
Created: 11-Oct-2001
Post-History: 26-Oct-2001
Python-Version: 2.3
Abstract
This PEP adds the ability to import compiled Python modules
(*.py[co]) and packages from zip archives.
Specification
Currently, sys.path is a list of directory names as strings. If
this PEP is implemented, an item of sys.path can be a string
naming a zip file archive. The zip archive can contain a
subdirectory structure to support package imports. The zip
archive satisfies imports exactly as a subdirectory would.
The implementation is in C code in the Python core and works on
all supported Python platforms.
Any files may be present in the zip archive, but only *.pyc and
*.pyo files, including __init__.py[co], are available for import.
Zip import of *.py and dynamic modules (*.pyd, *.so) is disallowed.
Just as sys.path currently has default directory names, default
zip archive names are added too. Otherwise there is no way to
import all Python library files from an archive.
Reading compressed zip archives requires the zlib module. An
import of zlib will be attempted prior to any other imports. If
zlib is not available at that time, only uncompressed archives
will be readable, even if zlib subsequently becomes available.
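As an illustration of the behaviour specified above, the following minimal sketch builds an uncompressed archive holding one compiled module and puts the archive on sys.path. It relies on the zipfile and py_compile modules of a current Python, not on the C implementation this PEP describes. The module name modfoo_demo, its VALUE attribute, and the helper build_archive are hypothetical.

```python
import importlib
import os
import py_compile
import sys
import tempfile
import zipfile


def build_archive(workdir):
    """Compile a tiny module and store only its .pyc in an uncompressed zip."""
    source = os.path.join(workdir, "modfoo_demo.py")
    with open(source, "w") as f:
        f.write("VALUE = 42\n")
    compiled = os.path.join(workdir, "modfoo_demo.pyc")
    py_compile.compile(source, cfile=compiled, doraise=True)
    archive = os.path.join(workdir, "Archive.zip")
    # ZIP_STORED means no compression, so zlib is not needed to read it.
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_STORED) as zf:
        zf.write(compiled, "modfoo_demo.pyc")  # relative name inside the archive
    return archive


archive = build_archive(tempfile.mkdtemp())
sys.path.insert(0, archive)  # a sys.path item that names a zip archive
importlib.invalidate_caches()
modfoo_demo = importlib.import_module("modfoo_demo")
```

Only the compiled file is placed in the archive, matching the rule that *.py sources are not imported from zip files.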
Subdirectory Equivalence
The zip archive must be treated exactly as a subdirectory tree so
we can support package imports based on current and future rules.
Zip archive files must be created with relative path names. That
is, archive file names are of the form: file1, file2, dir1/file3,
dir2/dir3/file4.
Suppose sys.path contains "/A/B/SubDir" and "/C/D/E/Archive.zip",
and we are trying to import modfoo from the Q.R package. Then
import.c will generate a list of paths and extensions and will
look for the file. The list of generated paths does not change
for zip imports. Suppose import.c generates the path
"/A/B/SubDir/Q/R/modfoo.pyc". Then it will also generate the path
"/C/D/E/Archive.zip/Q/R/modfoo.pyc". Finding the SubDir path is
exactly equivalent to finding "Q/R/modfoo.pyc" in the archive.
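The path generation described above can be sketched as follows. This is an illustration only and not the logic of import.c: the helper candidate_paths is hypothetical, the dotted name Q.R.modfoo is inferred from the example path, and only the .pyc and .pyo suffixes are generated.

```python
import posixpath

SUFFIXES = (".pyc", ".pyo")  # assumption: only compiled-module suffixes


def candidate_paths(path_entries, dotted_name):
    """Return the file paths to try for dotted_name, in sys.path order."""
    parts = dotted_name.split(".")
    candidates = []
    for entry in path_entries:
        # The same join is used whether the entry is a directory or an archive.
        base = posixpath.join(entry, *parts)
        for suffix in SUFFIXES:
            candidates.append(base + suffix)
    return candidates


paths = candidate_paths(["/A/B/SubDir", "/C/D/E/Archive.zip"], "Q.R.modfoo")
```

The generated list has the same shape for both entries; the archive entry simply yields paths whose prefix is the archive name.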
Suppose you zip up /A/B/SubDir/* and all its subdirectories. Then
your zip file will satisfy imports just as your subdirectory did.
Well, not quite. You can't import dynamic modules from a zip
file. Dynamic modules have extensions like .dll, .pyd, and .so.
They are operating system dependent, and probably can't be loaded
except from a file. It might be possible to extract the dynamic
module from the zip file, write it to a plain file and load it.
But that would mean creating temporary files, and dealing with all
the dynload_*.c, and that's probably not a good idea.
You also can't import source files *.py from a zip archive. The
problem here is what to do with the compiled files. Python would
normally write these to the same directory as *.py, but surely we
don't want to write to the zip file. We could write to the
directory of the zip archive, but that would clutter it up, which
is not good if it is /usr/bin, for example. We could just fail to write
the compiled files, but that makes zip imports very slow, and the
user would probably not figure out what is wrong. It is probably
best for users to put *.pyc into zip archives in the first place,
and this PEP enforces that rule.
So the only imports zip archives support are *.pyc and *.pyo, plus
the import of __init__.py[co] for packages, and the search of the
subdirectory structure for the same.
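The rule above amounts to a filter on archive member names. A minimal sketch, assuming a hypothetical helper is_importable that looks only at the file extension:

```python
import posixpath

IMPORTABLE_SUFFIXES = (".pyc", ".pyo")


def is_importable(member_name):
    """True if an archive member may satisfy an import under this PEP."""
    return posixpath.splitext(member_name)[1] in IMPORTABLE_SUFFIXES
```

Package markers such as Q/__init__.pyc pass the filter, while source files and dynamic modules do not.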
Efficiency
The only way to find files in a zip archive is a linear search. So
for each zip file in sys.path, we search for its names once, and
put the names plus other relevant data into a static Python
dictionary. The key is the archive name from sys.path joined with
the file name (including any subdirectories) within the archive.
This is exactly the name generated by import.c, and makes lookup
easy.
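The dictionary described above can be sketched with the zipfile module. The names zip_directory and index_archive are hypothetical, and the tuple stored for each key is only an assumption about what "other relevant data" might contain; the C implementation may record different fields.

```python
import os
import tempfile
import zipfile

zip_directory = {}  # hypothetical name for the dictionary described above


def index_archive(archive):
    """Scan the archive's member list once and record every file name."""
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            # Key: archive name joined with the member name, using "/".
            key = archive + "/" + info.filename
            zip_directory[key] = (
                info.compress_type,
                info.file_size,
                info.header_offset,
            )


# Build a small archive to index; the member content is a placeholder.
payload = b"placeholder bytes, not a real compiled module"
archive = os.path.join(tempfile.mkdtemp(), "Archive.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("Q/R/modfoo.pyc", payload)
index_archive(archive)
```

After indexing, checking whether a generated path exists in the archive is a single dictionary lookup.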
zlib
Compressed zip archives require zlib for decompression. Prior to
any other imports, we attempt an import of zlib, and set a flag if
it is available. All compressed files are invisible unless this
flag is true.
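The flag described above can be sketched as follows. The names HAVE_ZLIB and is_visible are hypothetical, and the sketch is written in Python although the PEP's implementation is in C.

```python
import zipfile

# Attempt the zlib import once, up front, and remember the outcome.
try:
    import zlib  # noqa: F401
    HAVE_ZLIB = True
except ImportError:
    HAVE_ZLIB = False


def is_visible(compress_type, have_zlib=HAVE_ZLIB):
    """True if a member with this compression method can be imported."""
    if compress_type == zipfile.ZIP_STORED:
        return True  # uncompressed members never need zlib
    # Deflated members are visible only if the early zlib import succeeded.
    return compress_type == zipfile.ZIP_DEFLATED and have_zlib
```

Because the flag is computed once, a member that was invisible at startup stays invisible even if zlib becomes importable afterwards.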
It could happen that zlib becomes available later. For example, the
import of site.py might add the correct directory to sys.path so a
dynamic load succeeds. But compressed files will still be
invisible. It is unknown whether importing site.py can cause zlib
to appear, so maybe we're worrying about nothing.
On Windows and Linux, the early import of zlib succeeds without
site.py.
The problem here is the confusion that the reverse would cause, if
compressed files became visible once zlib appeared. Either a zip
file satisfies imports or it doesn't. It is silly to say that
site.py needs to be uncompressed, and that maybe imports will
succeed later. If you don't like this, create uncompressed zip
archives or make sure zlib is available, for example, as a
built-in module. Or we can write special search logic during zip
initialization.
Booting
Python imports site.py itself, and this imports os, nt, ntpath,
stat, and UserDict (nt and ntpath being the Windows variants). It
also imports sitecustomize.py, which may
import more modules. Zip imports must be available before site.py
is imported.
Just as there are default directories in sys.path, there must be
one or more default zip archives too.
The problem is what the name should be. The name should be linked
with the Python version, so the Python executable can correctly
find its corresponding libraries even when there are multiple
Python versions on the same machine.
This PEP suggests a zip archive name equal to the Python
interpreter path with extension ".zip" (e.g., /usr/bin/python.zip)
which is always prepended to sys.path. So a directory with python
and python.zip is complete. This would work fine on Windows, as
it is common to put supporting files in the directory of the
executable. But it may offend Unix fans, who dislike bin
directories being used for libraries. It might be fine to
generate different defaults for Windows and Unix if necessary, but
the code will be in C, and there is no sense getting complicated.
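The suggested naming rule can be sketched as below. Both helpers are hypothetical. Treating only a trailing ".exe" as a replaceable extension is an assumption of this sketch; the PEP does not say how names such as python2.3 should be handled.

```python
def default_zip_archive(executable):
    """Return the interpreter path with its extension replaced by ".zip"."""
    # Assumption: only a trailing ".exe" counts as an extension, so that a
    # name such as "python2.3" is kept whole.
    if executable.lower().endswith(".exe"):
        executable = executable[:-4]
    return executable + ".zip"


def with_default_archive(path, executable):
    """Return a copy of path with the default archive prepended."""
    return [default_zip_archive(executable)] + list(path)
```

For the example in the text, default_zip_archive("/usr/bin/python") gives "/usr/bin/python.zip", which with_default_archive places ahead of every other sys.path entry.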
Implementation
A C implementation is available as SourceForge patch 476047.
http://sourceforge.net/tracker/index.php?func=detail&aid=476047&group_id=5470&atid=305470
Copyright
This document has been placed in the public domain.
Local Variables:
mode: indented-text
indent-tabs-mode: nil
fill-column: 70
End: