Updated PEP 3147 with latest BDFL pronouncement.

Barry Warsaw 2010-03-03 14:11:24 +00:00
parent 1f032d497c
commit d142f19eb9
3 changed files with 60 additions and 74 deletions

@ -8,7 +8,7 @@ Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 2009-12-16 Created: 2009-12-16
Python-Version: 3.2 Python-Version: 3.2
Post-History: 2010-01-30, 2010-02-25 Post-History: 2010-01-30, 2010-02-25, 2010-03-03
Abstract Abstract
@ -75,7 +75,7 @@ Python release was added or removed from the distribution. Because of
the sheer number of packages available, this amount of work is the sheer number of packages available, this amount of work is
infeasible. infeasible.
Even C extensions can be source compatible across multiple versions of C extensions can be source compatible across multiple versions of
Python. Compiled extension modules are usually not compatible though, Python. Compiled extension modules are usually not compatible though,
and PEP 384 [7]_ has been proposed to address this by defining a and PEP 384 [7]_ has been proposed to address this by defining a
stable ABI for extension modules. stable ABI for extension modules.
@ -101,10 +101,9 @@ Python's import machinery is extended to write and search for byte
code cache files in a single directory inside every Python package code cache files in a single directory inside every Python package
directory. This directory will be called `__pycache__`. directory. This directory will be called `__pycache__`.
Further, pyc files will contain a magic string that Further, pyc files will contain a magic string that differentiates the
differentiates the Python version they were compiled for. This allows Python version they were compiled for. This allows multiple byte
multiple byte compiled cache files to co-exist for a single Python compiled cache files to co-exist for a single Python source file.
source file.
This scheme has the added benefit of reducing the clutter in a Python This scheme has the added benefit of reducing the clutter in a Python
package directory. package directory.
@ -112,8 +111,8 @@ package directory.
What would this look like in practice? What would this look like in practice?
Let's say we have a Python package named `alpha` which contains a Let's say we have a Python package named `alpha` which contains a
sub-package named `beta`. The source directory layout might look like sub-package named `beta`. The source directory layout before byte
this:: compilation might look like this::
alpha/ alpha/
__init__.py __init__.py
@ -144,6 +143,8 @@ following layout::
three.py three.py
four.py four.py
*Note: listing order may differ depending on the platform.*
Let's say that two new versions of Python are installed, one is Python Let's say that two new versions of Python are installed, one is Python
3.3 and another is Unladen Swallow. After byte compilation, the file 3.3 and another is Unladen Swallow. After byte compilation, the file
system would look like this:: system would look like this::
@ -240,23 +241,29 @@ It's possible that the `foo.py` file somehow got removed, while
leaving the cached pyc file still on the file system. If the leaving the cached pyc file still on the file system. If the
`__pycache__/foo.<magic>.pyc` file exists, but the `foo.py` file used `__pycache__/foo.<magic>.pyc` file exists, but the `foo.py` file used
to create it does not, Python will raise an `ImportError` when asked to create it does not, Python will raise an `ImportError` when asked
to import foo. In other words, by default, Python will not support to import foo. In other words, Python will not import a pyc file from
importing a module unless the source file exists. the cache directory unless the source file exists.
Python users who want to deploy sourceless imports are instructed to
create a custom importer that supports this behavior. Options include
importing pycs from a zip file, or locating pyc files where the py
source file would have existed. (See the Open Issues section for more
discussion.)
Case 4: legacy pyc files Case 4: legacy pyc files and source-less imports
------------------------ ------------------------------------------------
Python will ignore all legacy pyc files when a source file exists next
to it. In other words, if a `foo.pyc` file exists next to the
`foo.py` file, the pyc file will be ignored in all cases.
In order to continue to support source-less distributions though, if
the source file is missing, Python will import a lone pyc file if it
lives where the source file would have been.
Case 5: read-only file systems
------------------------------
When the source lives on a read-only file system, or the `__pycache__`
directory or pyc file cannot otherwise be written, all the same rules
apply.
Python will ignore all legacy pyc files. In other words, if a
`foo.pyc` file exists next to the `foo.py` file, it will be ignored in
all cases, including sourceless deployments. Python users wishing to
support this use case can create a custom importer.
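The lookup rules in the updated Cases 3 and 4 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the importer's actual implementation: the function name `resolve_module`, the returned labels, and the `magic_tag` argument are hypothetical, and the staleness check between the source file and its cached pyc is left out.

```python
import os


def resolve_module(directory, name, magic_tag):
    """Pick the file that would satisfy ``import name`` (hypothetical helper).

    Returns a (kind, path) tuple where kind is "cached", "source" or
    "legacy". Timestamp/staleness checks are deliberately omitted.
    """
    source = os.path.join(directory, name + ".py")
    cached = os.path.join(
        directory, "__pycache__", "{}.{}.pyc".format(name, magic_tag)
    )
    legacy = os.path.join(directory, name + ".pyc")

    if os.path.exists(source):
        # The source exists: a legacy foo.pyc next to it is ignored, and
        # only the tagged file in __pycache__ is considered.
        if os.path.exists(cached):
            return ("cached", cached)
        return ("source", source)

    # No source: a lone legacy pyc where the source would have been is
    # still importable, to keep source-less distributions working.
    if os.path.exists(legacy):
        return ("legacy", legacy)

    # A pyc inside __pycache__ without its source never satisfies the import.
    raise ImportError("No module named {!r}".format(name))
```

Under these assumptions, a directory holding only `__pycache__/foo.<magic>.pyc` raises `ImportError`, while a directory holding only `foo.pyc` resolves to the legacy file.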
Flow chart Flow chart
@ -273,7 +280,7 @@ Magic identifiers
pyc files inside of the `__pycache__` directories contain a magic pyc files inside of the `__pycache__` directories contain a magic
identifier in their file names. These are mnemonic tags for the identifier in their file names. These are mnemonic tags for the
actual magic numbers used by the importer. For example, for Python actual magic numbers used by the importer. For example, in Python
3.2, we could use the hexlified [10]_ magic number as a unique 3.2, we could use the hexlified [10]_ magic number as a unique
identifier:: identifier::
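As a rough illustration of the hexlified magic number idea, the sketch below builds a tagged pyc file name from the running interpreter's magic number. It is an assumption-laden example rather than the PEP's mechanism: `importlib.util.MAGIC_NUMBER` is only available in Python versions later than the 3.2 release targeted here, and the helper names `hexlified_magic` and `tagged_pyc_name` are hypothetical.

```python
import binascii
import importlib.util


def hexlified_magic():
    """Return the interpreter's magic number as a lowercase hex string."""
    return binascii.hexlify(importlib.util.MAGIC_NUMBER).decode("ascii")


def tagged_pyc_name(module_name):
    """Build a foo.<magic>.pyc style name (hypothetical naming helper)."""
    return "{}.{}.pyc".format(module_name, hexlified_magic())


if __name__ == "__main__":
    # The printed value depends on the interpreter running the script.
    print(tagged_pyc_name("foo"))
```

The resulting string changes whenever the byte code format changes, which is what lets several interpreters keep their cache files side by side.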
@ -402,8 +409,8 @@ possible to backport this PEP. However, in Python 3.2 (and possibly
2.7), this behavior will be turned on by default, and in fact, it will 2.7), this behavior will be turned on by default, and in fact, it will
replace the old behavior. Backports will need to support the old replace the old behavior. Backports will need to support the old
layout by default. We suggest supporting PEP 3147 through the use of layout by default. We suggest supporting PEP 3147 through the use of
an environment variable called `$PYTHONCACHEDIR` or the command line an environment variable called `$PYTHONENABLECACHEDIR` or the command
switch `-Xcachedir` to enable the feature. line switch `-Xenablecachedir` to enable the feature.
Alternatives Alternatives
@ -482,58 +489,40 @@ implementation remain in sync.
Open issues Open issues
=========== ===========
Byte code only packages __pycache__ vs. __cachepy__
----------------------- -----------------------------
Some users of Python distribute packages containing only the byte code Minor point, but __pycache__ sorts after __init__.py alphabetically so
files (pyc). The use cases for this are to make it more difficult for that might be a little jarring (see the directory layout examples
end-users to view the source code, and to reduce maintenance burdens above). It seems that `ls(1)` on Linux at least also sorts the files
when end users casually edit the source files. alphabetically, ignoring the leading underscores.
This PEP currently promotes no default support for bytecode-only Should we name the cache directory something like `__cachepy__` so
packages. The primary motivator for this is that we can reduce stat that it sorts before `__init__.py`? OTOH, many graphical file system
calls if the importer only looks for .py files, making Python start-up navigators sort directories before plain files anyway, so maybe it
and import faster. doesn't matter.
The question is how to balance the requirements of bytecode-only users Here is some sample `ls(1) -l` output. First, with `__pycache__`::
with the more universally beneficial faster start up times for
requiring source files? Should all Python users pay the extra stat
call penalty in the general case for a minority use case by default?
Evidence shows that the extra stats can be fairly costly to start up
time.
There are several ways out of this. Should we decide that it's % ls -l
important enough to support bytecode-only packages, the semantics total 8
would be as follows: -rw-rw-r-- 1 user user 0 2010-03-03 08:29 alpha.py
drwxrwxr-x 2 user user 4096 2010-03-03 08:28 beta/
-rw-rw-r-- 1 user user 0 2010-03-03 08:28 __init__.py
-rw-rw-r-- 1 user user 0 2010-03-03 08:28 one.py
drwxrwxr-x 2 user user 4096 2010-03-03 08:28 __pycache__/
-rw-rw-r-- 1 user user 0 2010-03-03 08:28 two.py
* If there is a traditional, non-magic-tagged .pyc file in the Now, with `__cachepy__`::
location where a .py file should be found, it will satisfy the
import.
* The `__file__` attribute of the module will point to the .pyc file.
* The `__cached__` attribute of the module will point to the .pyc file
too.
* The existence of a matching `__pycache__/foo.<magic>.pyc` file
without the source py file will *not* satisfy the import. This
means that if the source file is removed, the pyc file will be
ignored (unlike in today's implementation).
Other ways to satisfy the bytecode-only packagers' requirements would % ls -l
have less impact on the general Python user population, and include: total 8
-rw-rw-r-- 1 user user 0 2010-03-03 08:29 alpha.py
* Add a `-X` switch and/or environment variable to enable drwxrwxr-x 2 user user 4096 2010-03-03 08:28 beta/
the bytecode-only search algorithm. drwxrwxr-x 2 user user 4096 2010-03-03 08:28 __cachepy__/
* Let those who want more protection against casual py hackers package -rw-rw-r-- 1 user user 0 2010-03-03 08:28 __init__.py
their code in a zip file, which is supported today. Sub-options -rw-rw-r-- 1 user user 0 2010-03-03 08:28 one.py
include supporting pyc-only imports only in zip files, or still -rw-rw-r-- 1 user user 0 2010-03-03 08:28 two.py
requiring the py file for zip imports.
* Provide a custom importer supporting bytecode-only packages, which
would have to be enabled explicitly by the application. Either
Python would provide such a custom importer or it would be left to
third parties to implement.
* Add a marker to a package's `__init__.py` file to enable
bytecode-only imports for everything else in the package.
* Leave it to third-party tools such as py2exe [20]_ to build an
ecosystem and standards around source-less distributions.
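The ordering shown in the two `ls -l` listings for `__pycache__` and `__cachepy__` can be approximated with a small sketch. This assumes a sort that ignores leading and trailing underscores and letter case, which only loosely imitates the locale-aware collation `ls(1)` uses on Linux. The helper name `sort_ignoring_underscores` is hypothetical.

```python
def sort_ignoring_underscores(names):
    """Sort names while ignoring surrounding underscores and letter case.

    This is a loose stand-in for locale-aware collation, not a
    reimplementation of what ls(1) does.
    """
    return sorted(names, key=lambda n: n.strip("_").lower())


with_pycache = ["__init__.py", "__pycache__", "alpha.py", "beta", "one.py", "two.py"]
with_cachepy = ["__init__.py", "__cachepy__", "alpha.py", "beta", "one.py", "two.py"]

# __pycache__ lands after __init__.py and one.py ...
print(sort_ignoring_underscores(with_pycache))
# ... while __cachepy__ lands before __init__.py.
print(sort_ignoring_underscores(with_cachepy))
```

A plain byte-wise `sorted()` would instead place both underscore-prefixed names ahead of `alpha.py`, which is one reason listing order can differ between platforms.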
__cached__ vs. __compiled__ __cached__ vs. __compiled__
@ -592,9 +581,6 @@ References
.. [18] importlib: http://docs.python.org/3.1/library/importlib.html .. [18] importlib: http://docs.python.org/3.1/library/importlib.html
.. [19] http://mail.python.org/pipermail/python-dev/2010-March/098042.html
.. [20] py2exe: http://www.py2exe.org/
ACKNOWLEDGMENTS ACKNOWLEDGMENTS
=============== ===============