PEP 488: make the base case not change the bytecode file name
parent
5817306884
commit
87b93b66b6
120
pep-0488.txt
|
@ -17,8 +17,8 @@ Abstract
|
|||
This PEP proposes eliminating the concept of PYO files from Python.
|
||||
To continue the support of the separation of bytecode files based on
|
||||
their optimization level, this PEP proposes extending the PYC file
|
||||
name to include the optimization level in bytecode repository
|
||||
directory (i.e., the ``__pycache__`` directory).
|
||||
name to include the optimization level in the bytecode repository
|
||||
directory when it's called for (i.e., the ``__pycache__`` directory).
|
||||
|
||||
|
||||
Rationale
|
||||
|
@ -29,11 +29,11 @@ file is the bytecode file generated and read from when no
|
|||
optimization level is specified at interpreter startup (i.e., ``-O``
|
||||
is not specified). A PYO file represents the bytecode file that is
|
||||
read/written when **any** optimization level is specified (i.e., when
|
||||
``-O`` is specified, including ``-OO``). This means that while PYC
|
||||
``-O`` **or** ``-OO`` is specified). This means that while PYC
|
||||
files clearly delineate the optimization level used when they were
|
||||
generated -- namely no optimizations beyond the peepholer -- the same
|
||||
is not true for PYO files. Put in terms of optimization levels and
|
||||
the file extension:
|
||||
is not true for PYO files. To put this in terms of optimization
|
||||
levels and the file extension:
|
||||
|
||||
- 0: ``.pyc``
|
||||
- 1 (``-O``): ``.pyo``
|
||||
|
@ -62,7 +62,9 @@ extension for multiple optimization levels.
|
|||
|
||||
As for distributing bytecode-only modules, having to distribute both
|
||||
``.pyc`` and ``.pyo`` files is unnecessary for the common use-case
|
||||
of code obfuscation and smaller file deployments.
|
||||
of code obfuscation and smaller file deployments. This means that
|
||||
bytecode-only modules will only load from their non-optimized
|
||||
``.pyc`` file name.
|
||||
|
||||
|
||||
Proposal
|
||||
|
@ -73,15 +75,22 @@ eliminating the concept of PYO files and their accompanying ``.pyo``
|
|||
file extension. To allow for the optimization level to be unambiguous
|
||||
as well as to avoid having to regenerate optimized bytecode files
|
||||
needlessly in the ``__pycache__`` directory, the optimization level
|
||||
used to generate a PYC file will be incorporated into the bytecode
|
||||
file name. Currently bytecode file names are created by
|
||||
used to generate the bytecode file will be incorporated into the
|
||||
bytecode file name. When no optimization level is specified, the
|
||||
pre-PEP ``.pyc`` file name will be used (i.e., no change in file name
|
||||
semantics). This increases backwards-compatibility while also being
|
||||
more understanding of Python implementations which have no use for
|
||||
optimization levels (e.g., PyPy [10]_).
|
||||
|
||||
Currently bytecode file names are created by
|
||||
``importlib.util.cache_from_source()``, approximately using the
|
||||
following expression defined by PEP 3147 [3]_, [4]_, [5]_::
|
||||
|
||||
'{name}.{cache_tag}.pyc'.format(name=module_name,
|
||||
cache_tag=sys.implementation.cache_tag)
|
||||
|
||||
This PEP proposes to change the expression to::
|
||||
This PEP proposes to change the expression when an optimization
|
||||
level is specified to::
|
||||
|
||||
'{name}.{cache_tag}.opt-{optimization}.pyc'.format(
|
||||
name=module_name,
|
||||
|
@ -94,8 +103,8 @@ the cache tag was chosen to preserve lexicographic sort order of
|
|||
bytecode file names based on module name and cache tag which will
|
||||
not vary for a single interpreter. The "opt-" prefix was chosen over
|
||||
"o" so as to be somewhat self-documenting. The "opt-" prefix was
|
||||
chosen over "O" so as to not have any confusion with "0" while being
|
||||
so close to the interpreter version number.
|
||||
chosen over "O" so as to not have any confusion in case "0" was the
|
||||
leading prefix of the optimization level.
|
||||
|
||||
A period was chosen over a hyphen as a separator so as to distinguish
|
||||
clearly that the optimization level is not part of the interpreter
|
||||
|
@ -103,10 +112,8 @@ version as specified by the cache tag. It also lends to the use of
|
|||
the period in the file name to delineate semantically different
|
||||
concepts.
|
||||
|
||||
For example, the bytecode file name of ``importlib.cpython-35.pyc``
|
||||
would become ``importlib.cpython-35.opt-0.pyc``. If ``-OO`` had been
|
||||
passed to the interpreter then instead of
|
||||
``importlib.cpython-35.pyo`` the file name would be
|
||||
For example, if ``-OO`` had been passed to the interpreter then instead
|
||||
of ``importlib.cpython-35.pyo`` the file name would be
|
||||
``importlib.cpython-35.opt-2.pyc``.
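
The naming rule described above can be sketched as a small helper. This is only an illustration under stated assumptions, not the PEP's implementation: ``bytecode_file_name`` is a hypothetical name, and the cache tag is passed in explicitly instead of being read from ``sys.implementation.cache_tag``.

```python
def bytecode_file_name(module_name, cache_tag, optimization=''):
    """Hypothetical helper mirroring the two file name expressions."""
    optimization = str(optimization)
    if optimization == '':
        # Base case: the pre-PEP file name is kept unchanged.
        return '{name}.{cache_tag}.pyc'.format(
            name=module_name, cache_tag=cache_tag)
    # An optimization level was given: add the "opt-" part.
    return '{name}.{cache_tag}.opt-{optimization}.pyc'.format(
        name=module_name, cache_tag=cache_tag, optimization=optimization)


print(bytecode_file_name('importlib', 'cpython-35'))
print(bytecode_file_name('importlib', 'cpython-35', 2))
```

The first call prints ``importlib.cpython-35.pyc`` and the second prints ``importlib.cpython-35.opt-2.pyc``, matching the example in the text.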
|
||||
|
||||
It should be noted that this change in no way affects the performance
|
||||
|
@ -114,9 +121,15 @@ of import. Since the import system looks for a single bytecode file
|
|||
based on the optimization level of the interpreter already and
|
||||
generates a new bytecode file if it doesn't exist, the introduction
|
||||
of potentially more bytecode files in the ``__pycache__`` directory
|
||||
has no effect. The interpreter will continue to look for only a
|
||||
single bytecode file based on the optimization level and thus no
|
||||
increase in stat calls will occur.
|
||||
has no effect in terms of stat calls. The interpreter will continue
|
||||
to look for only a single bytecode file based on the optimization
|
||||
level and thus no increase in stat calls will occur.
|
||||
|
||||
The only potentially negative result of this PEP is the probable
|
||||
increase in the number of ``.pyc`` files and thus increase in storage
|
||||
use. But for platforms where this is an issue,
|
||||
``sys.dont_write_bytecode`` exists to turn off bytecode generation so
|
||||
that it can be controlled offline.
|
||||
|
||||
|
||||
Implementation
|
||||
|
@ -139,18 +152,18 @@ This PEP proposes changing the signature in Python 3.5 to::
|
|||
The introduced ``optimization`` keyword-only parameter will control
|
||||
what optimization level is specified in the file name. If the
|
||||
argument is ``None`` then the current optimization level of the
|
||||
interpreter will be assumed. Any argument given for ``optimization``
|
||||
will be passed to ``str()`` and must have ``str.isalnum()`` be true,
|
||||
else ``ValueError`` will be raised (this prevents invalid characters
|
||||
being used in the file name). If the empty string is passed in for
|
||||
``optimization`` then the addition of the optimization will be
|
||||
suppressed, reverting to the file name format which predates this
|
||||
PEP.
|
||||
interpreter will be assumed (including no optimization). Any argument
|
||||
given for ``optimization`` will be passed to ``str()`` and must have
|
||||
``str.isalnum()`` be true, else ``ValueError`` will be raised (this
|
||||
prevents invalid characters being used in the file name). If the
|
||||
empty string is passed in for ``optimization`` then the addition of
|
||||
the optimization will be suppressed, reverting to the file name
|
||||
format which predates this PEP.
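
The behaviour of the ``optimization`` parameter can be exercised as follows. This is a rough sketch that assumes Python 3.5 or later, where ``importlib.util.cache_from_source()`` accepts the keyword, and an interpreter whose ``sys.implementation.cache_tag`` is set (as on CPython). The source path ``spam.py`` is made up for illustration.

```python
import importlib.util
import os

# 'spam.py' is a made-up source path used only for illustration.
source = 'spam.py'

# Explicit level: the "opt-" part is added to the file name.
optimized = os.path.basename(
    importlib.util.cache_from_source(source, optimization=2))

# Empty string: the optimization part is suppressed.
plain = os.path.basename(
    importlib.util.cache_from_source(source, optimization=''))

# Values that fail str.isalnum() are rejected with ValueError.
try:
    importlib.util.cache_from_source(source, optimization='opt-2')
except ValueError:
    rejected = True
else:
    rejected = False

print(optimized, plain, rejected)
```

Only the base names are compared here because the directory part of the result depends on how the interpreter is configured.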
|
||||
|
||||
It is expected that beyond Python's own
|
||||
0-2 optimization levels, third-party code will use a hash of
|
||||
optimization names to specify the optimization level, e.g.
|
||||
``hashlib.sha256(','.join(['dead code elimination', 'constant folding'])).hexdigest()``.
|
||||
It is expected that beyond Python's own two optimization levels,
|
||||
third-party code will use a hash of optimization names to specify the
|
||||
optimization level, e.g.
|
||||
``hashlib.sha256(','.join(['no dead code', 'const folding'])).hexdigest()``.
|
||||
While this might lead to long file names, it is assumed that most
|
||||
users never look at the contents of the ``__pycache__`` directory and so
|
||||
this won't be an issue.
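
A hash-based optimization level of this kind might be computed as in the sketch below. On Python 3 ``hashlib.sha256()`` expects bytes, so the joined string is encoded first. The UTF-8 encoding and the ``importlib.cpython-35`` file name prefix are assumptions made for illustration.

```python
import hashlib

# Optimization names taken from the example in the text.
optimizations = ['no dead code', 'const folding']

# hashlib.sha256() accepts bytes, so the joined string is encoded first
# (UTF-8 is an assumption of this sketch).
level = hashlib.sha256(','.join(optimizations).encode('utf-8')).hexdigest()

# A hex digest contains only [0-9a-f], so it passes the str.isalnum() check.
file_name = 'importlib.cpython-35.opt-{}.pyc'.format(level)
print(len(level), level.isalnum())
print(file_name)
```

The digest is 64 characters long, which illustrates why such file names can become lengthy.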
|
||||
|
@ -238,15 +251,15 @@ Using the "opt-" prefix and placing the optimization level between
|
|||
the cache tag and file extension is not critical. All options which
|
||||
have been considered are:
|
||||
|
||||
* ``importlib.cpython-35.opt-0.pyc``
|
||||
* ``importlib.cpython-35.opt0.pyc``
|
||||
* ``importlib.cpython-35.o0.pyc``
|
||||
* ``importlib.cpython-35.O0.pyc``
|
||||
* ``importlib.cpython-35.0.pyc``
|
||||
* ``importlib.cpython-35-O0.pyc``
|
||||
* ``importlib.O0.cpython-35.pyc``
|
||||
* ``importlib.o0.cpython-35.pyc``
|
||||
* ``importlib.0.cpython-35.pyc``
|
||||
* ``importlib.cpython-35.opt-1.pyc``
|
||||
* ``importlib.cpython-35.opt1.pyc``
|
||||
* ``importlib.cpython-35.o1.pyc``
|
||||
* ``importlib.cpython-35.O1.pyc``
|
||||
* ``importlib.cpython-35.1.pyc``
|
||||
* ``importlib.cpython-35-O1.pyc``
|
||||
* ``importlib.O1.cpython-35.pyc``
|
||||
* ``importlib.o1.cpython-35.pyc``
|
||||
* ``importlib.1.cpython-35.pyc``
|
||||
|
||||
These were initially rejected either because they would change the
|
||||
sort order of bytecode files, possible ambiguity with the cache tag,
|
||||
|
@ -276,34 +289,6 @@ frees integrators from having to guess what users want and allows
|
|||
users to utilize the optimization level they want.
|
||||
|
||||
|
||||
Open Issues
|
||||
===========
|
||||
|
||||
Not specifying the optimization level when it is at 0
|
||||
-----------------------------------------------------
|
||||
|
||||
It has been suggested that for the common case of when the
|
||||
optimizations are at level 0 that the entire part of the file name
|
||||
relating to the optimization level be left out. This would allow for
|
||||
file names of ``.pyc`` files to go unchanged, potentially leading to
|
||||
less backwards-compatibility issues (although Python 3.5 introduces a
|
||||
new magic number for bytecode so all bytecode files will have to be
|
||||
regenerated regardless of the outcome of this PEP).
|
||||
|
||||
It would also allow a potentially redundant bit of information to be
|
||||
left out of the file name if an implementation of Python did not
|
||||
allow for optimizing bytecode. This would only occur, though, if the
|
||||
interpreter didn't support ``-O`` **and** didn't implement the ast
|
||||
module, else users could implement their own optimizations.
|
||||
|
||||
Arguments against allowing this special case are "explicit is better
|
||||
than implicit" and "special cases aren't special enough to break the
|
||||
rules".
|
||||
|
||||
At this point people have weakly supported this idea while no one has
|
||||
explicitly come out against it.
|
||||
|
||||
|
||||
References
|
||||
==========
|
||||
|
||||
|
@ -334,6 +319,9 @@ References
|
|||
.. [9] Informal poll of file name format options on Google+
|
||||
(https://plus.google.com/u/0/+BrettCannon/posts/fZynLNwHWGm)
|
||||
|
||||
.. [10] The PyPy Project
|
||||
(http://pypy.org/)
|
||||
|
||||
|
||||
Copyright
|
||||
=========
|
||||
|
|