Updates to PYC Repository Directories, reflecting current thinking on the
approach.
parent
9d361d1e78
commit
d3b8603bd9
BIN
pep-3147-1.dia
Binary file not shown.
BIN
pep-3147-1.png
Binary file not shown.
Size before: 54 KiB. Size after: 47 KiB.
554
pep-3147.txt
|
@ -8,7 +8,7 @@ Type: Standards Track
|
||||||
Content-Type: text/x-rst
|
Content-Type: text/x-rst
|
||||||
Created: 2009-12-16
|
Created: 2009-12-16
|
||||||
Python-Version: 3.2
|
Python-Version: 3.2
|
||||||
Post-History:
|
Post-History: 2010-01-30, 2010-02-25
|
||||||
|
|
||||||
|
|
||||||
Abstract
|
Abstract
|
||||||
|
@ -17,49 +17,75 @@ Abstract
|
||||||
This PEP describes an extension to Python's import mechanism which
|
This PEP describes an extension to Python's import mechanism which
|
||||||
improves sharing of Python source code files among multiple installed
|
improves sharing of Python source code files among multiple installed
|
||||||
different versions of the Python interpreter. It does this by
|
different versions of the Python interpreter. It does this by
|
||||||
allowing many different byte compilation files (.pyc files) to be
|
allowing more than one byte compilation file (.pyc file) to be
|
||||||
co-located with the Python source file (.py file). The extension
|
co-located with the Python source file (.py file). The extension
|
||||||
described here can also be used to support different Python
|
described here can also be used to support different Python
|
||||||
compilation caches, such as JIT output that may be produced by an
|
compilation caches, such as JIT output that may be produced by an
|
||||||
Unladen Swallow [1]_ enabled C Python.
|
Unladen Swallow [1]_ enabled C Python.
|
||||||
|
|
||||||
|
|
||||||
|
Background
|
||||||
|
==========
|
||||||
|
|
||||||
|
CPython compiles its source code into "byte code", and for performance
|
||||||
|
reasons, it caches this byte code on the file system whenever the
|
||||||
|
source file changes. This makes loading of Python modules much
|
||||||
|
faster because the compilation phase can be bypassed. When your
|
||||||
|
source file is `foo.py`, CPython caches the byte code in a `foo.pyc`
|
||||||
|
file right next to the source.
|
||||||
|
|
||||||
|
Byte code files contain two 32-bit numbers followed by the marshaled
|
||||||
|
[2]_ code object. The 32-bit numbers represent a magic number and a
|
||||||
|
timestamp. The magic number changes whenever Python changes the byte
|
||||||
|
code format, e.g. by adding new byte codes to its virtual machine.
|
||||||
|
This ensures that pyc files built for previous versions of the VM
|
||||||
|
won't cause problems. The timestamp is used to make sure that the pyc
|
||||||
|
file is not older than the py file that was used to create it. When
|
||||||
|
either the magic number or timestamp does not match, the py file is
|
||||||
|
recompiled and a new pyc file is written.
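
As an illustration only, the header check described above can be sketched as follows. The helper names `parse_header` and `needs_recompile` are hypothetical, and the sketch assumes the 8-byte layout given in the text (a 4-byte magic number followed by a little-endian 32-bit timestamp). Later CPython releases use a longer header, so this is not a parser for current pyc files.

```python
import struct

def parse_header(data):
    """Split the 8-byte header described in the text into (magic, mtime)."""
    if len(data) < 8:
        raise ValueError("pyc header too short")
    magic = data[:4]
    # Assumption: the timestamp is stored as a little-endian unsigned 32-bit int.
    (mtime,) = struct.unpack("<L", data[4:8])
    return magic, mtime

def needs_recompile(data, interpreter_magic, source_mtime):
    """Return True when either the magic number or the timestamp does not match."""
    magic, mtime = parse_header(data)
    return magic != interpreter_magic or mtime != source_mtime
```

A cache file is reused only when both fields match. Any mismatch sends the import back to the source file.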
|
||||||
|
|
||||||
|
In practice, it is well known that pyc files are not compatible across
|
||||||
|
Python major releases. A reading of import.c [3]_ in the Python
|
||||||
|
source code proves that within recent memory, every new CPython major
|
||||||
|
release has bumped the pyc magic number.
|
||||||
|
|
||||||
|
|
||||||
Rationale
|
Rationale
|
||||||
=========
|
=========
|
||||||
|
|
||||||
Linux distributions such as Ubuntu [2]_ and Debian [3]_ provide more
|
Linux distributions such as Ubuntu [4]_ and Debian [5]_ provide more
|
||||||
than one Python version at the same time to their users. For example,
|
than one Python version at the same time to their users. For example,
|
||||||
Ubuntu 9.10 Karmic Koala can install Python 2.5, 2.6, and 3.1, with
|
Ubuntu 9.10 Karmic Koala users can install Python 2.5, 2.6, and 3.1,
|
||||||
Python 2.6 being the default.
|
with Python 2.6 being the default.
|
||||||
|
|
||||||
In order to ease the burden on operating system packagers for these
|
This causes a conflict for Python source files installed by the
|
||||||
distributions, the distribution packages do not contain Python version
|
system (including third party packages), because you cannot compile a
|
||||||
numbers [4]_; they are shared across all Python versions installed on
|
single Python source file for more than one Python version at a time.
|
||||||
the system. Putting Python version numbers in the packages would be a
|
Thus if your system wanted to install a `/usr/share/python/foo.py`, it
|
||||||
maintenance nightmare, since all the packages - *and their
|
could not create a `/usr/share/python/foo.pyc` file usable across all
|
||||||
dependencies* - would have to be updated every time a new Python
|
installed Python versions.
|
||||||
release was added or removed from the distribution. Because of the
|
|
||||||
sheer number of packages available, this amount of work is infeasible.
|
|
||||||
|
|
||||||
For pure Python modules, sharing is possible because upstream
|
Furthermore, in order to ease the burden on operating system packagers
|
||||||
maintainers typically support multiple versions of Python in a source
|
for these distributions, the distribution packages do not contain
|
||||||
compatible way. In practice though, it is well known that pyc files
|
Python version numbers [6]_; they are shared across all Python
|
||||||
are not compatible across Python major releases. A reading of
|
versions installed on the system. Putting Python version numbers in
|
||||||
import.c [5]_ in the Python source code proves that within recent
|
the packages would be a maintenance nightmare, since all the packages
|
||||||
memory, every new CPython major release has bumped the pyc magic
|
- *and their dependencies* - would have to be updated every time a new
|
||||||
number.
|
Python release was added or removed from the distribution. Because of
|
||||||
|
the sheer number of packages available, this amount of work is
|
||||||
|
infeasible.
|
||||||
|
|
||||||
Even C extensions can be source compatible across multiple versions of
|
Even C extensions can be source compatible across multiple versions of
|
||||||
Python. Compiled extension modules are usually not compatible though,
|
Python. Compiled extension modules are usually not compatible though,
|
||||||
and PEP 384 [6]_ has been proposed to address this by defining a
|
and PEP 384 [7]_ has been proposed to address this by defining a
|
||||||
stable ABI for extension modules.
|
stable ABI for extension modules.
|
||||||
|
|
||||||
Because the distributions cannot share pyc files, elaborate mechanisms
|
Because these distributions cannot share pyc files, elaborate
|
||||||
have been developed to put the resulting pyc files in non-shared
|
mechanisms have been developed to put the resulting pyc files in
|
||||||
locations while the source code is still shared. Examples include the
|
non-shared locations while the source code is still shared. Examples
|
||||||
symlink-based Debian regimes python-support [7]_ and python-central
|
include the symlink-based Debian regimes python-support [8]_ and
|
||||||
[8]_. These approaches make for much more complicated, fragile,
|
python-central [9]_. These approaches make for much more complicated,
|
||||||
inscrutable, and fragmented policies for delivering Python
|
fragile, inscrutable, and fragmented policies for delivering Python
|
||||||
applications to a wide range of users. Arguably more users get Python
|
applications to a wide range of users. Arguably more users get Python
|
||||||
from their operating system vendor than from upstream tarballs. Thus,
|
from their operating system vendor than from upstream tarballs. Thus,
|
||||||
solving this pyc sharing problem for CPython is a high priority for
|
solving this pyc sharing problem for CPython is a high priority for
|
||||||
|
@ -71,29 +97,106 @@ This PEP proposes a solution to this problem.
|
||||||
Proposal
|
Proposal
|
||||||
========
|
========
|
||||||
|
|
||||||
Python's import machinery is extended to search for byte code cache
|
Python's import machinery is extended to write and search for byte
|
||||||
files in a directory co-located with the source file, but with an
|
code cache files in a single directory inside every Python package
|
||||||
extension 'pyr'. The pyr directory contains individual files with the
|
directory. This directory will be called `__pycache__`.
|
||||||
cached byte compilation of the source code, identical to current pyc
|
|
||||||
and pyo files. The files inside the pyr directory retain their file
|
|
||||||
extensions, but the base name is replaced by the hexlified [10]_ magic
|
|
||||||
number of the Python version the byte code is compatible with.
|
|
||||||
|
|
||||||
The file extension pyr was chosen because 'r' is a mnemonic for
|
Further, pyc files will contain a magic string that
|
||||||
'repository', and there appears to be no prior uses of the extension
|
differentiates the Python version they were compiled for. This allows
|
||||||
[9]_.
|
multiple byte compiled cache files to co-exist for a single Python
|
||||||
|
source file.
|
||||||
|
|
||||||
For example, a module `foo` with source code in `foo.py` and byte
|
This scheme has the added benefit of reducing the clutter in a Python
|
||||||
compiled with Python 2.5, Python 2.6, Python 2.6 `-O`, Python 2.6
|
package directory.
|
||||||
`-U`, and Python 3.1 would have the following file system layout::
|
|
||||||
|
|
||||||
foo.py
|
What would this look like in practice?
|
||||||
foo.pyr/
|
|
||||||
f2b30a0d.pyc # Python 2.5
|
Let's say we have a Python package named `alpha` which contains a
|
||||||
f2d10a0d.pyc # Python 2.6
|
sub-package named `beta`. The source directory layout might look like
|
||||||
f2d10a0d.pyo # Python 2.6 -O
|
this::
|
||||||
f2d20a0d.pyc # Python 2.6 -U
|
|
||||||
0c4f0a0d.pyc # Python 3.1
|
alpha/
|
||||||
|
__init__.py
|
||||||
|
one.py
|
||||||
|
two.py
|
||||||
|
beta/
|
||||||
|
__init__.py
|
||||||
|
three.py
|
||||||
|
four.py
|
||||||
|
|
||||||
|
After byte compiling this package with Python 3.2, you would see the
|
||||||
|
following layout::
|
||||||
|
|
||||||
|
alpha/
|
||||||
|
__pycache__
|
||||||
|
__init__.cpython-32.pyc
|
||||||
|
one.cpython-32.pyc
|
||||||
|
two.cpython-32.pyc
|
||||||
|
__init__.py
|
||||||
|
one.py
|
||||||
|
two.py
|
||||||
|
beta/
|
||||||
|
__pycache__
|
||||||
|
__init__.cpython-32.pyc
|
||||||
|
three.cpython-32.pyc
|
||||||
|
four.cpython-32.pyc
|
||||||
|
__init__.py
|
||||||
|
three.py
|
||||||
|
four.py
|
||||||
|
|
||||||
|
Let's say that two new versions of Python are installed, one is Python
|
||||||
|
3.3 and another is Unladen Swallow. After byte compilation, the file
|
||||||
|
system would look like this::
|
||||||
|
|
||||||
|
alpha/
|
||||||
|
__pycache__
|
||||||
|
__init__.cpython-32.pyc
|
||||||
|
__init__.cpython-33.pyc
|
||||||
|
__init__.unladen-10.pyc
|
||||||
|
one.cpython-32.pyc
|
||||||
|
one.cpython-33.pyc
|
||||||
|
one.unladen-10.pyc
|
||||||
|
two.cpython-32.pyc
|
||||||
|
two.cpython-33.pyc
|
||||||
|
two.unladen-10.pyc
|
||||||
|
__init__.py
|
||||||
|
one.py
|
||||||
|
two.py
|
||||||
|
beta/
|
||||||
|
__pycache__
|
||||||
|
__init__.cpython-32.pyc
|
||||||
|
__init__.cpython-33.pyc
|
||||||
|
__init__.unladen-10.pyc
|
||||||
|
three.cpython-32.pyc
|
||||||
|
three.cpython-33.pyc
|
||||||
|
three.unladen-10.pyc
|
||||||
|
four.cpython-32.pyc
|
||||||
|
four.cpython-33.pyc
|
||||||
|
four.unladen-10.pyc
|
||||||
|
__init__.py
|
||||||
|
three.py
|
||||||
|
four.py
|
||||||
|
|
||||||
|
As you can see, as long as the Python version identifier string is
|
||||||
|
unique, any number of pyc files can co-exist. These identifier
|
||||||
|
strings are described in more detail below.
|
||||||
|
|
||||||
|
A nice property of this layout is that the `__pycache__` directories
|
||||||
|
can generally be ignored, such that a normal directory listing would
|
||||||
|
show something like this::
|
||||||
|
|
||||||
|
alpha/
|
||||||
|
__pycache__
|
||||||
|
__init__.py
|
||||||
|
one.py
|
||||||
|
two.py
|
||||||
|
beta/
|
||||||
|
__pycache__
|
||||||
|
__init__.py
|
||||||
|
three.py
|
||||||
|
four.py
|
||||||
|
|
||||||
|
This is much less cluttered than even today's Python.
|
||||||
|
|
||||||
|
|
||||||
Python behavior
|
Python behavior
|
||||||
|
@ -105,56 +208,105 @@ one of several situations. As per current Python rules, the term
|
||||||
interpreter's magic number, and the source file is not newer than the
|
interpreter's magic number, and the source file is not newer than the
|
||||||
`pyc` file.
|
`pyc` file.
|
||||||
|
|
||||||
When Python finds a `foo.py` file for which no `foo.pyc` file or
|
|
||||||
`foo.pyr` directory exists, Python will by default load the `foo.py`
|
|
||||||
file and write a `foo.pyc` file next to the source file. This is
|
|
||||||
unchanged from current behavior.
|
|
||||||
|
|
||||||
When the Python executable is given a `-R` flag, or the environment
|
Case 1: The first import
|
||||||
variable `$PYTHONPYR` is set, then Python will create a `foo.pyr`
|
------------------------
|
||||||
directory and write a `pyc` file to that directory with the hexlified
|
|
||||||
magic number as the base name.
|
|
||||||
|
|
||||||
If during import, Python finds an existing `pyc` file but no `pyr`
|
When Python is asked to import module `foo`, it searches for a
|
||||||
directory, and the `$PYTHONPYR` environment variable is not set, then
|
`foo.py` file (or `foo` package, but that's not important for this
|
||||||
the `pyc` file is loaded as normal and no `pyr` directory is created.
|
discussion) along its `sys.path`. When Python locates the `foo.py`
|
||||||
|
file it will look for a `__pycache__` directory in the directory where
|
||||||
|
it found the `foo.py`. If the `__pycache__` directory is missing,
|
||||||
|
Python will create it. Then it will parse and byte compile the
|
||||||
|
`foo.py` file and save the byte code in `__pycache__/foo.<magic>.pyc`,
|
||||||
|
where <magic> is defined by the Python implementation, but will be a
|
||||||
|
human readable string such as `cpython-32`.
|
||||||
|
|
||||||
If during import, Python finds a `pyr` directory with a matching `pyc`
|
|
||||||
file, *regardless of whether `$PYTHONPYR` is set or not*, then
|
|
||||||
`foo.pyr/<magic>.pyc` is loaded and import completes successfully.
|
|
||||||
Thus a matching `pyc` file inside a `pyr` directory always takes
|
|
||||||
precedence over a sibling `pyc` file.
|
|
||||||
|
|
||||||
If during import, Python finds a `pyr` directory that does not contain
|
Case 2: The second import
|
||||||
a matching `pyc` file, and no sibling `foo.pyc` file exists, Python
|
-------------------------
|
||||||
will load the source file and write a sibling `foo.pyc` file, unless
|
|
||||||
the `-R` flag is given in which case a `foo.pyr/<magic>.pyc` file will
|
|
||||||
be written.
|
|
||||||
|
|
||||||
Here is a flowchart illustrating the rules.
|
When Python is asked to import module `foo` a second time (in a
|
||||||
|
different process of course), it will again search for the `foo.py`
|
||||||
|
file along its `sys.path`. When Python locates the `foo.py` file, it
|
||||||
|
looks for a matching `__pycache__/foo.<magic>.pyc` and finding this,
|
||||||
|
it reads the byte code and continues as usual.
|
||||||
|
|
||||||
|
|
||||||
|
Case 3: __pycache__/foo.pyc with no source
|
||||||
|
------------------------------------------
|
||||||
|
|
||||||
|
It's possible that the `foo.py` file somehow got removed, while
|
||||||
|
leaving the cached pyc file still on the file system. If the
|
||||||
|
`__pycache__/foo.pyc` file exists, but the `foo.py` file used to
|
||||||
|
create it does not, Python will raise an `ImportError` when asked to
|
||||||
|
import foo. In other words, by default, Python will not support
|
||||||
|
importing a module unless the source file exists.
|
||||||
|
|
||||||
|
Python users who want to deploy sourceless imports are instructed to
|
||||||
|
create a custom importer that supports this behavior. Options include
|
||||||
|
importing pycs from a zip file, or locating pyc files where the py
|
||||||
|
source file would have existed.
|
||||||
|
|
||||||
|
|
||||||
|
Case 4: legacy pyc files
|
||||||
|
------------------------
|
||||||
|
|
||||||
|
Python will ignore all legacy pyc files. In other words, if a
|
||||||
|
`foo.pyc` file exists next to the `foo.py` file, it will be ignored in
|
||||||
|
all cases, including sourceless deployments. Python users wishing to
|
||||||
|
support this use case can create a custom importer.
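
The four cases can be summarized in a small decision function. This is a sketch rather than the importer itself: `import_action` is a hypothetical name, the file system is modeled as a plain set of relative paths, and the timestamp check is left out to keep the control flow visible.

```python
def import_action(existing, name="foo", tag="cpython-32"):
    """Describe what an import of `name` would do, given a set of existing paths."""
    source = name + ".py"
    cached = "__pycache__/{}.{}.pyc".format(name, tag)
    if source not in existing:
        # Cases 3 and 4: no source file means no import, whatever pyc files exist.
        return "ImportError"
    if cached in existing:
        # Case 2: a matching cache file is read and the compile step is skipped.
        return "load " + cached
    # Case 1: compile the source and write the cache file. A legacy sibling
    # foo.pyc is never consulted (Case 4).
    return "compile and write " + cached
```

Note that a sibling `foo.pyc` appears nowhere in the function. That absence is Case 4.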
|
||||||
|
|
||||||
|
|
||||||
|
Flow chart
|
||||||
|
==========
|
||||||
|
|
||||||
|
Here is a flow chart describing how modules are loaded:
|
||||||
|
|
||||||
.. image:: pep-3147-1.png
|
.. image:: pep-3147-1.png
|
||||||
:scale: 75
|
:scale: 75
|
||||||
|
|
||||||
|
|
||||||
Effects on non-conforming Python versions
|
Magic identifiers
|
||||||
=========================================
|
=================
|
||||||
|
|
||||||
Python implementations which don't know anything about `pyr`
|
pyc files inside of the `__pycache__` directories contain a magic
|
||||||
directories will ignore them. This means that they will read and
|
identifier in their file names. These are mnemonic tags for the
|
||||||
write `pyc` files as usual. A conforming implementation will still
|
actual magic numbers used by the importer. For example, for Python
|
||||||
prefer any existing `foo.pyr/<magic>.pyc` file over an existing
|
3.2, we could use the hexlified [10]_ magic number as a unique
|
||||||
sibling `pyc` file.
|
identifier::
|
||||||
|
|
||||||
The one possible conflicting state is where a sibling `pyc` file
|
>>> from binascii import hexlify
|
||||||
exists, but its magic number does not match.
|
>>> from imp import get_magic
|
||||||
|
>>> 'foo.{}.pyc'.format(hexlify(get_magic()).decode('ascii'))
|
||||||
|
'foo.580c0d0a.pyc'
|
||||||
|
|
||||||
In the default case, when Python finds a `pyc` file with a
|
This isn't particularly human friendly though. Instead, this PEP
|
||||||
non-matching magic number, it simply overwrites the `pyc` file with
|
proposes to add a mapping between internal magic numbers and a
|
||||||
the new byte code and magic number. In the absence of the `-R` flag,
|
user-friendly tag. Newer versions of Python can add to this mapping
|
||||||
this remains unchanged. When the `-R` flag was given, the
|
so that all later Pythons know the mapping between tags and magic
|
||||||
non-matching sibling `pyc` file is ignored - it is neither removed nor
|
numbers. By convention, the tag will contain the Python
|
||||||
overwritten - and a `foo.pyr/<magic>.pyc` file is written instead.
|
implementation name and version nickname, where the nickname is
|
||||||
|
generally the major version number and minor version number. Magic
|
||||||
|
numbers should not change between Python micro releases, but some
|
||||||
|
other convention can be used for changes in magic number between
|
||||||
|
pre-release development versions.
|
||||||
|
|
||||||
|
For example, CPython 3.2 would have a magic identifier tag of
|
||||||
|
`cpython-32` and write pyc files like this: `foo.cpython-32.pyc`.
|
||||||
|
When the `-O` flag is used, it would write `foo.cpython-32.pyo`. For
|
||||||
|
backports of this feature to Python 2, when the `-U` flag is used, a
|
||||||
|
file such as `foo.cpython-27u.pyc` can be written.
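
One possible shape for the mapping between magic numbers and tags is sketched below. The names `MAGIC_TO_TAG` and `cache_file_name` are hypothetical, the single table entry reuses the hexlified value shown earlier in this section, and falling back to the hexlified magic number for unknown values is an assumption of this sketch, not something the PEP specifies.

```python
# Hypothetical lookup table: raw magic number bytes -> user-friendly tag.
MAGIC_TO_TAG = {
    bytes.fromhex("580c0d0a"): "cpython-32",
}

def cache_file_name(module, magic, optimized=False):
    """Build the cache file name for a module, e.g. foo.cpython-32.pyc."""
    # Assumption: unknown magic numbers fall back to their hex form.
    tag = MAGIC_TO_TAG.get(magic, magic.hex())
    extension = "pyo" if optimized else "pyc"
    return "{}.{}.{}".format(module, tag, extension)
```

Newer interpreters would only ever add entries to such a table, so every later Python could still name the cache files of earlier versions.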
|
||||||
|
|
||||||
|
|
||||||
|
Alternative Python implementations
|
||||||
|
==================================
|
||||||
|
|
||||||
|
Alternative Python implementations such as Jython [11]_, IronPython
|
||||||
|
[12]_, PyPy [13]_, Pynie [14]_, and Unladen Swallow can also use the
|
||||||
|
`__pycache__` directory to store whatever compilation artifacts make
|
||||||
|
sense for their platforms. For example, Jython could store the class
|
||||||
|
file for the module in `__pycache__/foo.jython-32.class`.
|
||||||
|
|
||||||
|
|
||||||
Implementation strategy
|
Implementation strategy
|
||||||
|
@ -166,13 +318,97 @@ Vendors are free to backport the changes to earlier distributions as
|
||||||
they see fit.
|
they see fit.
|
||||||
|
|
||||||
|
|
||||||
|
Effects on existing code
|
||||||
|
========================
|
||||||
|
|
||||||
|
Adoption of this PEP will affect existing code and idioms, both inside
|
||||||
|
Python and outside. This section enumerates some of these effects.
|
||||||
|
|
||||||
|
|
||||||
|
__file__
|
||||||
|
---------
|
||||||
|
|
||||||
|
In Python 3, when you import a module, its `__file__` attribute points
|
||||||
|
to its source `py` file (in Python 2, it points to the `pyc` file). A
|
||||||
|
package's `__file__` points to the `py` file for its `__init__.py`.
|
||||||
|
E.g.::
|
||||||
|
|
||||||
|
>>> import foo
|
||||||
|
>>> foo.__file__
|
||||||
|
'foo.py'
|
||||||
|
# baz is a package
|
||||||
|
>>> import baz
|
||||||
|
>>> baz.__file__
|
||||||
|
'baz/__init__.py'
|
||||||
|
|
||||||
|
The implementation of this PEP would have to ensure that the same
|
||||||
|
directory level is returned from `__file__` as it currently is, so
|
||||||
|
that the common idiom above continues to work.
|
||||||
|
|
||||||
|
As part of this PEP, we will add a `__cached__` attribute to modules,
|
||||||
|
which will always point to the actual `pyc` file that was read or
|
||||||
|
written. When the environment variable `$PYTHONDONTWRITEBYTECODE` is
|
||||||
|
set, or the `-B` option is given, or if the source lives on a
|
||||||
|
read-only filesystem, then the `__cached__` attribute will point to
|
||||||
|
the location that the `pyc` file *would* have been written to if it
|
||||||
|
didn't exist. This location of course includes the `__pycache__`
|
||||||
|
subdirectory in its path.
|
||||||
|
|
||||||
|
For alternative Python implementations which do not support `pyc`
|
||||||
|
files, the `__cached__` attribute may point to whatever
|
||||||
|
version-specific binary file was read for the module code. E.g. on
|
||||||
|
Jython, this might be the `.class` file for the module:
|
||||||
|
`__pycache__/foo.jython-32.class`. Alternative implementations for
|
||||||
|
which this scheme does not make sense should set the `__cached__`
|
||||||
|
attribute to `None`.
|
||||||
|
|
||||||
|
|
||||||
|
File extension checks
|
||||||
|
---------------------
|
||||||
|
|
||||||
|
There exists some code which checks for files ending in `.pyc` and
|
||||||
|
simply chops off the last character to find the matching `.py` file.
|
||||||
|
This code will obviously fail once this PEP is implemented.
|
||||||
|
|
||||||
|
To support this use case, we'll add two new functions to the `imp`
|
||||||
|
module [15]_:
|
||||||
|
|
||||||
|
* `imp.source_from_cache(pyc_path)` -> `py_path`
|
||||||
|
* `imp.cache_from_source(py_path)` -> `pyc_path`
|
||||||
|
|
||||||
|
Alternative implementations are free to override these functions to
|
||||||
|
return reasonable values based on their own support for this PEP.
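
A minimal sketch of such a pair of path helpers follows, under stated assumptions: the `_sketch` names are hypothetical, paths use forward slashes, the tag defaults to `cpython-32`, and only `pyc` files are handled. It maps a source path to its cache path and back, and is not the `imp` implementation.

```python
import posixpath

def cache_from_source_sketch(py_path, tag="cpython-32"):
    """Map alpha/one.py to alpha/__pycache__/one.<tag>.pyc."""
    head, tail = posixpath.split(py_path)
    base, ext = posixpath.splitext(tail)
    if ext != ".py":
        raise ValueError("not a .py path: {!r}".format(py_path))
    return posixpath.join(head, "__pycache__", "{}.{}.pyc".format(base, tag))

def source_from_cache_sketch(pyc_path):
    """Map alpha/__pycache__/one.<tag>.pyc back to alpha/one.py."""
    head, tail = posixpath.split(pyc_path)
    parent, cache_dir = posixpath.split(head)
    parts = tail.split(".")
    if cache_dir != "__pycache__" or len(parts) != 3 or parts[2] != "pyc":
        raise ValueError("not a __pycache__ pyc path: {!r}".format(pyc_path))
    return posixpath.join(parent, parts[0] + ".py")
```

The old idiom of dropping the final character from a `.pyc` path yields `alpha/__pycache__/one.cpython-32.py` here, which is not the source file. Code that relies on it needs a helper of this kind instead.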
|
||||||
|
|
||||||
|
|
||||||
|
PEP 302 loaders
|
||||||
|
---------------
|
||||||
|
|
||||||
|
PEP 302 [16]_ defined loaders have a `.get_filename()` method which
|
||||||
|
points to the `__file__` for a module. As part of this PEP, we will
|
||||||
|
extend this API to include a new method `.get_paths()` which will
|
||||||
|
return a 2-tuple containing the path to the source file and the path
|
||||||
|
to where the matching `pyc` file is (or would be).
|
||||||
|
|
||||||
|
|
||||||
|
Backports
|
||||||
|
---------
|
||||||
|
|
||||||
|
For versions of Python earlier than 3.2 (and possibly 2.7), it is
|
||||||
|
possible to backport this PEP. However, in Python 3.2 (and possibly
|
||||||
|
2.7), this behavior will be turned on by default, and in fact, it will
|
||||||
|
replace the old behavior. Backports will need to support the old
|
||||||
|
layout by default. We suggest supporting PEP 3147 through the use of
|
||||||
|
an environment variable called `$PYTHONCACHEDIR` or the command line
|
||||||
|
switch `-Xcachedir` to enable the feature.
|
||||||
|
|
||||||
|
|
||||||
Alternatives
|
Alternatives
|
||||||
============
|
============
|
||||||
|
|
||||||
PEP 304
|
PEP 304
|
||||||
-------
|
-------
|
||||||
|
|
||||||
There is some overlap between the goals of this PEP and PEP 304 [12]_,
|
There is some overlap between the goals of this PEP and PEP 304 [17]_,
|
||||||
which has been withdrawn. However PEP 304 would allow a user to
|
which has been withdrawn. However PEP 304 would allow a user to
|
||||||
create a shadow file system hierarchy in which to store `pyc` files.
|
create a shadow file system hierarchy in which to store `pyc` files.
|
||||||
This concept of a shadow hierarchy for `pyc` files could be used to
|
This concept of a shadow hierarchy for `pyc` files could be used to
|
||||||
|
@ -197,37 +433,6 @@ location of the `pyc` file in the shadow directory, and it may not be
|
||||||
possible to find the `my.dat` file relative to the source directory
|
possible to find the `my.dat` file relative to the source directory
|
||||||
from there.
|
from there.
|
||||||
|
|
||||||
On the other hand, this PEP keeps all byte code artifacts co-located
|
|
||||||
with the source file. Some adjustment will have to be made for the
|
|
||||||
fact that the `pyc` file lives in a subdirectory. For example, in
|
|
||||||
current Python, when you import a module, its `__file__` attribute
|
|
||||||
points to its `pyc` file. A package's `__file__` points to the `pyc`
|
|
||||||
file for its `__init__.py`. E.g.::
|
|
||||||
|
|
||||||
>>> import foo
|
|
||||||
>>> foo.__file__
|
|
||||||
'foo.pyc'
|
|
||||||
# baz is a package
|
|
||||||
>>> import baz
|
|
||||||
>>> baz.__file__
|
|
||||||
'baz/__init__.pyc'
|
|
||||||
|
|
||||||
The implementation of this PEP would have to ensure that the same
|
|
||||||
directory level is returned from `__file__` as it does without the
|
|
||||||
`pyr` directory, so that the common idiom above continues to work::
|
|
||||||
|
|
||||||
>>> import foo
|
|
||||||
>>> foo.__file__
|
|
||||||
'foo.pyr'
|
|
||||||
# baz is a package
|
|
||||||
>>> import baz
|
|
||||||
>>> baz.__file__
|
|
||||||
'baz/__init__.pyr'
|
|
||||||
|
|
||||||
Note that some existing Python code only checks for `.py` and `.pyc`
|
|
||||||
file extensions (and possibly `.pyo`). These would have to be
|
|
||||||
extended to also check for `.pyr` extensions.
|
|
||||||
|
|
||||||
|
|
||||||
Fat byte compilation files
|
Fat byte compilation files
|
||||||
--------------------------
|
--------------------------
|
||||||
|
@ -240,8 +445,10 @@ parallel Python implementations could be supported fairly efficiently,
|
||||||
but with extension lookup tables available to scale `pyf` byte code
|
but with extension lookup tables available to scale `pyf` byte code
|
||||||
objects as large as necessary.
|
objects as large as necessary.
|
||||||
|
|
||||||
The fat byte compilation files were fairly complex, so the current
|
The fat byte compilation files were fairly complex, and inherently
|
||||||
simplification of using directories was suggested.
|
introduced difficult race conditions, so the current simplification of
|
||||||
|
using directories was suggested. The same problem applies to using
|
||||||
|
zip files as the fat pyc file format.
|
||||||
|
|
||||||
|
|
||||||
Multiple file extensions
|
Multiple file extensions
|
||||||
|
@ -256,49 +463,16 @@ approach makes it more difficult (and an ongoing task) to update any
|
||||||
tools that are dependent on the file extension.
|
tools that are dependent on the file extension.
|
||||||
|
|
||||||
|
|
||||||
Open questions
|
|
||||||
==============
|
|
||||||
|
|
||||||
* Are there any concurrency issues added by this PEP, above those that
|
|
||||||
already exist? For example, what if two Python processes attempt to
|
|
||||||
write the same `<magic>.pyc` file? Is that any different than two
|
|
||||||
Python processes trying to write to the same `foo.pyc` file?
|
|
||||||
Current thinking is that there isn't since the exclusive open
|
|
||||||
mechanism currently used, will still be used to open `pyc` files
|
|
||||||
inside a `pyr` directory.
|
|
||||||
|
|
||||||
* How do the imp [13]_ and importlib [14]_ modules need to be updated
|
|
||||||
to conform to the `pyr` directories?
|
|
||||||
|
|
||||||
* What about `py` source files that are compatible with most but not
|
|
||||||
all installed Python versions. We might need a way to say "this py
|
|
||||||
file should be hidden from Python versions X.Y or earlier". There
|
|
||||||
are three options:
|
|
||||||
|
|
||||||
- Use file system tricks to only share py files that are actually
|
|
||||||
sharable in all installed Python versions (e.g. different search
|
|
||||||
directories for Python X.Y and Python X.Z).
|
|
||||||
- Introduce Python syntax that is legal before __future__ imports
|
|
||||||
and is evaluated to determine if the py file is compatible,
|
|
||||||
raising an `ImportError('no module named foo')` if not.
|
|
||||||
- Add an optional metadata file co-located with the py file that
|
|
||||||
declares which Python versions it is compatible with.
|
|
||||||
|
|
||||||
How does this requirement interact with PEP 382 namespace packages [15]_?
|
|
||||||
|
|
||||||
* Are there any opportunities for also sharing extension modules
|
|
||||||
(.so/.dll files) in a `pyr` directory?
|
|
||||||
|
|
||||||
* Would a moratorium on byte code changes, similar to the language
|
|
||||||
moratorium described in PEP 3003 [16]_ be a better approach to
|
|
||||||
pursue, and would that solve the problem for vendors? At the time
|
|
||||||
of this writing, PEP 3003 is silent on the issue.
|
|
||||||
|
|
||||||
|
|
||||||
Reference implementation
|
Reference implementation
|
||||||
========================
|
========================
|
||||||
|
|
||||||
TBD
|
A pure-Python reference implementation will be written using
|
||||||
|
importlib [18]_, which may need some modifications to its API and
|
||||||
|
abstract base classes. Once the semantics are agreed upon and the
|
||||||
|
implementation details are settled, we'll port this to the C
|
||||||
|
implementation in `import.c`. We will have extensive tests that
|
||||||
|
guarantee that the pure-Python implementation and the built-in
|
||||||
|
implementation remain in sync.
|
||||||
|
|
||||||
|
|
||||||
References
|
References
|
||||||
|
@ -306,41 +480,45 @@ References
|
||||||
|
|
||||||
.. [1] PEP 3146
|
.. [1] PEP 3146
|
||||||
|
|
||||||
.. [2] Ubuntu: <http://www.ubuntu.com>
|
.. [2] The marshal module:
|
||||||
|
http://www.python.org/doc/current/library/marshal.html
|
||||||
|
|
||||||
.. [3] Debian: <http://www.debian.org>
|
.. [3] import.c:
|
||||||
|
|
||||||
.. [4] Debian Python Policy:
|
|
||||||
http://www.debian.org/doc/packaging-manuals/python-policy/
|
|
||||||
|
|
||||||
.. [5] import.c:
|
|
||||||
http://svn.python.org/view/python/branches/py3k/Python/import.c?view=markup
|
http://svn.python.org/view/python/branches/py3k/Python/import.c?view=markup
|
||||||
|
|
||||||
.. [6] PEP 384
|
.. [4] Ubuntu: <http://www.ubuntu.com>
|
||||||
|
|
||||||
.. [7] python-support:
|
.. [5] Debian: <http://www.debian.org>
|
||||||
|
|
||||||
|
.. [6] Debian Python Policy:
|
||||||
|
http://www.debian.org/doc/packaging-manuals/python-policy/
|
||||||
|
|
||||||
|
.. [7] PEP 384
|
||||||
|
|
||||||
|
.. [8] python-support:
|
||||||
http://wiki.debian.org/DebianPythonFAQ#Whatispython-support.3F
|
http://wiki.debian.org/DebianPythonFAQ#Whatispython-support.3F
|
||||||
|
|
||||||
.. [8] python-central:
|
.. [9] python-central:
|
||||||
http://wiki.debian.org/DebianPythonFAQ#Whatispython-central.3F
|
http://wiki.debian.org/DebianPythonFAQ#Whatispython-central.3F
|
||||||
|
|
||||||
.. [9] http://www.filesuffix.com/?m=search&e=pyr&submit=Search
|
|
||||||
|
|
||||||
.. [10] binascii.hexlify():
|
.. [10] binascii.hexlify():
|
||||||
http://www.python.org/doc/current/library/binascii.html#binascii.hexlify
|
http://www.python.org/doc/current/library/binascii.html#binascii.hexlify
|
||||||
|
|
||||||
.. [11] The marshal module:
|
.. [11] Jython: http://www.jython.org/
|
||||||
http://www.python.org/doc/current/library/marshal.html
|
|
||||||
|
|
||||||
.. [12] PEP 304:
|
.. [12] IronPython: http://ironpython.net/
|
||||||
|
|
||||||
.. [13] imp: http://www.python.org/doc/current/library/imp.html
|
.. [13] PyPy: http://codespeak.net/pypy/dist/pypy/doc/
|
||||||
|
|
||||||
.. [14] importlib: http://docs.python.org/3.1/library/importlib.html
|
.. [14] Pynie: http://code.google.com/p/pynie/
|
||||||
|
|
||||||
.. [15] PEP 382
|
.. [15] imp: http://www.python.org/doc/current/library/imp.html
|
||||||
|
|
||||||
.. [16] PEP 3003
|
.. [16] PEP 302
|
||||||
|
|
||||||
|
.. [17] PEP 304
|
||||||
|
|
||||||
|
.. [18] importlib: http://docs.python.org/3.1/library/importlib.html
|
||||||
|
|
||||||
|
|
||||||
ACKNOWLEDGMENTS
|
ACKNOWLEDGMENTS
|
||||||
|
@ -350,7 +528,7 @@ Barry Warsaw's original idea was for fat Python byte code files.
|
||||||
Martin von Loewis reviewed an early draft of the PEP and suggested the
|
Martin von Loewis reviewed an early draft of the PEP and suggested the
|
||||||
simplification to store traditional `pyc` and `pyo` files in a
|
simplification to store traditional `pyc` and `pyo` files in a
|
||||||
directory. Many other people reviewed early versions of this PEP and
|
directory. Many other people reviewed early versions of this PEP and
|
||||||
provided useful feedback including:
|
provided useful feedback including but not limited to:
|
||||||
|
|
||||||
* David Malcolm
|
* David Malcolm
|
||||||
* Josselin Mouette
|
* Josselin Mouette
|
||||||
|
@ -368,26 +546,6 @@ Copyright
|
||||||
This document has been placed in the public domain.
|
This document has been placed in the public domain.
|
||||||
|
|
||||||
|
|
||||||
Notes from python-dev
|
|
||||||
=====================
|
|
||||||
|
|
||||||
The python-dev discussion has been very fruitful. Here are some
|
|
||||||
in-progress notes from that thread which still needs to be reconciled
|
|
||||||
into the body of the PEP.
|
|
||||||
|
|
||||||
* Rarity of the use of this feature. Important for distros but
|
|
||||||
probably much less so for individual users (who may never even see
|
|
||||||
these things).
|
|
||||||
* Sibling vs folder-per-folder. Do performance measurements. Do stat
|
|
||||||
calls outweigh everything else? We need to do an analysis of the
|
|
||||||
current implementation as a baseline.
|
|
||||||
* Magic numbers in file names are magical; no one really knows the
|
|
||||||
mappings. Maybe we should use magic strings (with a lookup table?),
|
|
||||||
e.g. 'foo.cython-27.py'
|
|
||||||
* Modules should unambiguously name their __source__ and __cache__
|
|
||||||
file names. __file__ is ambiguous.
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
..
|
..
|
||||||
Local Variables:
|
Local Variables:
|
||||||
|
|