Updated PEP 3147 with latest BDFL pronouncement.
parent 1f032d497c
commit d142f19eb9
BIN
pep-3147-1.dia
Binary file not shown.
BIN
pep-3147-1.png
Binary file not shown.
Size before: 47 KiB. Size after: 64 KiB.
134
pep-3147.txt
@@ -8,7 +8,7 @@ Type: Standards Track
 Content-Type: text/x-rst
 Created: 2009-12-16
 Python-Version: 3.2
-Post-History: 2010-01-30, 2010-02-25
+Post-History: 2010-01-30, 2010-02-25, 2010-03-03
 
 
 Abstract
@@ -75,7 +75,7 @@ Python release was added or removed from the distribution. Because of
 the sheer number of packages available, this amount of work is
 infeasible.
 
-Even C extensions can be source compatible across multiple versions of
+C extensions can be source compatible across multiple versions of
 Python. Compiled extension modules are usually not compatible though,
 and PEP 384 [7]_ has been proposed to address this by defining a
 stable ABI for extension modules.
@@ -101,10 +101,9 @@ Python's import machinery is extended to write and search for byte
 code cache files in a single directory inside every Python package
 directory. This directory will be called `__pycache__`.
 
-Further, pyc files will contain a magic string that
-differentiates the Python version they were compiled for. This allows
-multiple byte compiled cache files to co-exist for a single Python
-source file.
+Further, pyc files will contain a magic string that differentiates the
+Python version they were compiled for. This allows multiple byte
+compiled cache files to co-exist for a single Python source file.
 
 This scheme has the added benefit of reducing the clutter in a Python
 package directory.
@@ -112,8 +111,8 @@ package directory.
 What would this look like in practice?
 
 Let's say we have a Python package named `alpha` which contains a
-sub-package name `beta`. The source directory layout might look like
-this::
+sub-package name `beta`. The source directory layout before byte
+compilation might look like this::
 
     alpha/
         __init__.py
@@ -144,6 +143,8 @@ following layout::
             three.py
             four.py
 
+*Note: listing order may differ depending on the platform.*
+
 Let's say that two new versions of Python are installed, one is Python
 3.3 and another is Unladen Swallow. After byte compilation, the file
 system would look like this::
@@ -240,23 +241,29 @@ It's possible that the `foo.py` file somehow got removed, while
 leaving the cached pyc file still on the file system. If the
 `__pycache__/foo.<magic>.pyc` file exists, but the `foo.py` file used
 to create it does not, Python will raise an `ImportError` when asked
-to import foo. In other words, by default, Python will not support
-importing a module unless the source file exists.
-
-Python users who want to deploy sourceless imports are instructed to
-create a custom importer that supports this behavior. Options include
-importing pycs from a zip file, or locating pyc files where the py
-source file would have existed. (See the Open Issues section for more
-discussion.)
+to import foo. In other words, Python will not import a pyc file from
+the cache directory unless the source file exists.
 
 
-Case 4: legacy pyc files
-------------------------
+Case 4: legacy pyc files and source-less imports
+------------------------------------------------
+
+Python will ignore all legacy pyc files when a source file exists next
+to it. In other words, if a `foo.pyc` file exists next to the
+`foo.py` file, the pyc file will be ignored in all cases
+
+In order to continue to support source-less distributions though, if
+the source file is missing, Python will import a lone pyc file if it
+lives where the source file would have been.
+
+
+Case 5: read-only file systems
+------------------------------
+
+When the source lives on a read-only file system, or the `__pycache__`
+directory or pyc file cannot otherwise be written, all the same rules
+apply.
 
-Python will ignore all legacy pyc files. In other words, if a
-`foo.pyc` file exists next to the `foo.py` file, it will be ignored in
-all cases, including sourceless deployments. Python users wishing to
-support this use case can create a custom importer.
 
 
 Flow chart
@@ -273,7 +280,7 @@ Magic identifiers
 
 pyc files inside of the `__pycache__` directories contain a magic
 identifier in their file names. These are mnemonic tags for the
-actual magic numbers used by the importer. For example, for Python
+actual magic numbers used by the importer. For example, in Python
 3.2, we could use the hexlified [10]_ magic number as a unique
 identifier::
 
@@ -402,8 +409,8 @@ possible to backport this PEP. However, in Python 3.2 (and possibly
 2.7), this behavior will be turned on by default, and in fact, it will
 replace the old behavior. Backports will need to support the old
 layout by default. We suggest supporting PEP 3147 through the use of
-an environment variable called `$PYTHONCACHEDIR` or the command line
-switch `-Xcachedir` to enable the feature.
+an environment variable called `$PYTHONENABLECACHEDIR` or the command
+line switch `-Xenablecachedir` to enable the feature.
 
 
 Alternatives
@@ -482,58 +489,40 @@ implementation remain in sync.
 Open issues
 ===========
 
-Byte code only packages
------------------------
+__pycache__ vs. __cachepy__
+-----------------------------
 
-Some users of Python distribute packages containing only the byte code
-files (pyc). The use cases for this are to make it more difficult for
-end-users to view the source code, and to reduce maintenance burdens
-when end users casually edit the source files.
+Minor point, but __pycache__ sorts after __init__.py alphabetically so
+that might be a little jarring (see the directory layout examples
+above). It seems that `ls(1)` on Linux at least also sorts the files
+alphabetically, ignoring the leading underscores.
 
-This PEP currently promote no default support for bytecode-only
-packages. The primary motivator for this are that we can reduce stat
-calls if the importer only looks for .py files, making Python start-up
-and import faster.
+Should we name the cache directory something like `__cachepy__` so
+that it sorts before `__init__.py`? OTOH, many graphical file system
+navigators sort directories before plain files anyway, so maybe it
+doesn't matter.
 
-The question is how to balance the requirements of bytecode-only users
-with the more universally beneficial faster start up times for
-requiring source files? Should all Python users pay the extra stat
-call penalty in the general case for a minority use case by default?
-Evidence shows that the extra stats can be fairly costly to start up
-time.
+Here are some sample `ls(1) -l` output. First, with `__pycache__`::
 
-There are several ways out of this. Should we decide that it's
-important enough to support bytecode-only packages, the semantics
-would be as follows:
+    % ls -l
+    total 8
+    -rw-rw-r-- 1 user user 0 2010-03-03 08:29 alpha.py
+    drwxrwxr-x 2 user user 4096 2010-03-03 08:28 beta/
+    -rw-rw-r-- 1 user user 0 2010-03-03 08:28 __init__.py
+    -rw-rw-r-- 1 user user 0 2010-03-03 08:28 one.py
+    drwxrwxr-x 2 user user 4096 2010-03-03 08:28 __pycache__/
+    -rw-rw-r-- 1 user user 0 2010-03-03 08:28 two.py
 
-* If there is a traditional, non-magic-tagged .pyc file in the
-  location where a .py file should be found, it will satisfy the
-  import.
-* The `__file__` attribute of the module will point to the .pyc file.
-* The `__cached__` attribute of the module will point to the .pyc file
-  too.
-* The existence of a matching `__pycached__/foo.<magic>.pyc` file
-  without the source py file will *not* satisfy the import. This
-  means that if the source file is removed, the pyc file will be
-  ignored (unlike in today's implementation).
+Now, with `__cachepy__`::
 
-Other ways to satisfy the bytecode-only packagers requirements would
-have less impact on the general Python user population, and include:
-
-* Add a `-X` switch and/or environment variable to enable
-  the bytecode-only search algorithm.
-* Let those who want more protection against casual py hackers package
-  their code in a zip file, which is supported today. Sub-options
-  include supporting pyc-only imports only in zip files, or still
-  requiring the py file for zip imports.
-* Provide a custom importer supporting bytecode-only packages, which
-  would have to be enabled explicitly by the application. Either
-  Python would provide such a custom importer or it would be left to
-  third parties to implement.
-* Add a marker to a package's `__init__.py` file to enable
-  bytecode-only imports for everything else in the package.
-* Leave it to third-party tools such as py2exe [20]_ to build an
-  ecosystem and standards around source-less distributions.
+    % ls -l
+    total 8
+    -rw-rw-r-- 1 user user 0 2010-03-03 08:29 alpha.py
+    drwxrwxr-x 2 user user 4096 2010-03-03 08:28 beta/
+    drwxrwxr-x 2 user user 4096 2010-03-03 08:28 __cachepy__/
+    -rw-rw-r-- 1 user user 0 2010-03-03 08:28 __init__.py
+    -rw-rw-r-- 1 user user 0 2010-03-03 08:28 one.py
+    -rw-rw-r-- 1 user user 0 2010-03-03 08:28 two.py
 
 
 __cached__ vs. __compiled__
@@ -592,9 +581,6 @@ References
 
 .. [18] importlib: http://docs.python.org/3.1/library/importlib.html
 
-.. [19] http://mail.python.org/pipermail/python-dev/2010-March/098042.html
-
-.. [20] py2exe: http://www.py2exe.org/
 
 ACKNOWLEDGMENTS
 ===============
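Note on the revised lookup rules: the Case 3 and Case 4 text in this commit can be read as a small decision procedure. The following is a minimal sketch of that reading, not the importer's actual code. The function name `find_module_file`, its `(kind, path)` return value, and the tag `abc123` are hypothetical names chosen for illustration, and the sketch deliberately skips timestamp checks and the writing of cache files.

```python
import os
import tempfile


def find_module_file(directory, name, magic_tag):
    """Pick the file that would satisfy ``import name`` in ``directory``.

    Hypothetical helper: returns a (kind, path) tuple where kind is
    "cached", "source", or "legacy", or raises ImportError.
    """
    source = os.path.join(directory, name + ".py")
    cached = os.path.join(
        directory, "__pycache__", "%s.%s.pyc" % (name, magic_tag))
    legacy = os.path.join(directory, name + ".pyc")

    if os.path.exists(source):
        # Source present: a legacy foo.pyc next to it is ignored.
        # (Staleness checks against the source are omitted here.)
        if os.path.exists(cached):
            return ("cached", cached)
        return ("source", source)

    # Source missing: the __pycache__ entry alone never satisfies the
    # import (Case 3), but a lone pyc where the source would have been
    # does (Case 4, source-less distributions).
    if os.path.exists(legacy):
        return ("legacy", legacy)
    raise ImportError("No module named %s" % name)


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    os.mkdir(os.path.join(root, "__pycache__"))
    # Only a cache entry exists for "foo": the import is refused.
    open(os.path.join(root, "__pycache__", "foo.abc123.pyc"), "w").close()
    try:
        find_module_file(root, "foo", "abc123")
    except ImportError as exc:
        print("refused:", exc)
    # A lone legacy pyc for "bar" is accepted.
    open(os.path.join(root, "bar.pyc"), "w").close()
    print(find_module_file(root, "bar", "abc123")[0])
```

Under this reading, the source-less case costs one extra existence check, and only after the source lookup has already failed.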