new pep - initial feedback from python-dev incorporated.
parent
5e382b78a4
commit
ae27188998
|
@ -0,0 +1,289 @@
PEP: 304
Title: Controlling generation of bytecode files
Version: $Revision$
Last-Modified: $Date$
Author: Skip Montanaro
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 22-Jan-2003
Post-History:
|
||||
|
||||
Abstract
========

This PEP outlines a mechanism for controlling the generation and
location of compiled Python bytecode files. This idea originally
arose as a patch request [1]_ and evolved into a discussion thread on
the python-dev mailing list [2]_. The introduction of an environment
variable will allow people installing Python or Python-based
third-party packages to control whether or not bytecode files
should be generated, and if so, where they should be written.
|
||||
|
||||
Proposal
========

Add a new environment variable, PYTHONBYTECODEBASE, to the mix of
environment variables which Python understands. Its interpretation
is:

- If not present, Python bytecode is generated in exactly the same way
  as is currently done. sys.pythonbytecodebase is set to the root
  directory (either / on Unix or the root directory of the startup
  drive -- typically ``C:\`` -- on Windows).

- If present and it refers to an existing directory,
  sys.pythonbytecodebase is set to that directory and bytecode files
  are written into a directory structure rooted at that location.

- If present but empty, sys.pythonbytecodebase is set to None and
  generation of bytecode files is suppressed altogether.

- If present and it does not refer to an existing directory, a warning
  is displayed, sys.pythonbytecodebase is set to None and generation
  of bytecode files is suppressed altogether.

After startup, all runtime references are to sys.pythonbytecodebase,
not the PYTHONBYTECODEBASE environment variable. sys.path is not
modified.
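The four cases above can be sketched as a small function. This is
only an illustration under stated assumptions, not the interpreter's
startup code: the helper name ``resolve_bytecode_base`` and its
``root`` parameter are hypothetical, and only the existence of the
directory is checked here.

```python
import os
import warnings


def resolve_bytecode_base(environ, root=os.sep):
    """Return the value sys.pythonbytecodebase would take at startup.

    `environ` is any mapping shaped like os.environ.  `root` stands in
    for the root directory of the startup drive (hypothetical parameter).
    """
    value = environ.get("PYTHONBYTECODEBASE")
    if value is None:
        # Not present: bytecode is generated as it is today.
        return root
    if value == "":
        # Present but empty: suppress generation, no warning.
        return None
    base = os.path.abspath(value)
    if not os.path.isdir(base):
        # Present but not an existing directory: warn and suppress.
        warnings.warn("PYTHONBYTECODEBASE %r is not an existing directory" % value)
        return None
    return base
```

Passing a plain dictionary instead of ``os.environ`` makes each case
easy to try in isolation.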
|
||||
|
||||
Glossary
--------

- "bytecode base" refers to the current setting of
  sys.pythonbytecodebase.

- "augmented directory" refers to the directory formed from the
  bytecode base and the directory name of the source file.

- PYTHONBYTECODEBASE refers to the environment variable when necessary
  to distinguish it from "bytecode base".
|
||||
Locating bytecode files
-----------------------

When the interpreter is searching for a module, it will use sys.path
as usual. However, when a possible bytecode file is considered, an
extra probe for a bytecode file may be made. First, a check is made
for the bytecode file using the directory in sys.path which holds the
source file (the current behavior). If a valid bytecode file is not
found there (either one does not exist or exists but is out-of-date)
and the bytecode base is not None, a second probe is made using the
directory in sys.path prefixed appropriately by the bytecode base.
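The probe order described above might look like the following on
Unix. This is a rough sketch, not the import machinery itself: the
function name ``bytecode_candidates`` is hypothetical, the source path
is assumed to be absolute, and the validity check on each candidate
(exists and is up to date) is left to the caller.

```python
import posixpath


def bytecode_candidates(source_path, bytecode_base):
    """Return the bytecode paths to probe, in order, for an absolute Unix path."""
    source_dir, filename = posixpath.split(source_path)
    pyc_name = posixpath.splitext(filename)[0] + ".pyc"
    # First probe: beside the source file (the current behavior).
    candidates = [posixpath.join(source_dir, pyc_name)]
    if bytecode_base is not None:
        # Second probe: the same directory, prefixed by the bytecode base.
        augmented = posixpath.join(bytecode_base, source_dir.lstrip("/"))
        second = posixpath.join(augmented, pyc_name)
        if second not in candidates:
            candidates.append(second)
    return candidates
```

With a bytecode base of ``/`` both probes name the same file, so only
one candidate is returned and the behavior matches what happens today.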
|
||||
Writing bytecode files
----------------------

When the bytecode base is not None, a new bytecode file is written to
the appropriate augmented directory, never directly to a directory in
sys.path.
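A minimal sketch of the write step, assuming a modern Python where
``os.makedirs`` accepts ``exist_ok``. The helper name
``write_bytecode`` is hypothetical, a scratch directory stands in for
the bytecode base, and the payload is a placeholder rather than real
bytecode.

```python
import os
import tempfile


def write_bytecode(augdir, pyc_name, data):
    """Write `data` to `pyc_name` in `augdir`, creating directories as needed."""
    os.makedirs(augdir, exist_ok=True)
    target = os.path.join(augdir, pyc_name)
    with open(target, "wb") as f:
        f.write(data)
    return target


# A scratch directory stands in for the bytecode base.
base = tempfile.mkdtemp()
augdir = os.path.join(base, "usr", "lib", "python2.3")
written = write_bytecode(augdir, "urllib.pyc", b"placeholder, not real bytecode")
```

Nothing is ever written beside the source file; the only directory
touched is the one under the bytecode base.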
|
||||
|
||||
Defining augmented directories
------------------------------

Conceptually, the augmented directory for a bytecode file is the
directory in which the source file exists prefixed by the bytecode
base. In a Unix environment this would be::

    import os, sys
    pcb = os.path.abspath(sys.pythonbytecodebase)
    # Drop the leading separator so os.path.join() keeps pcb as the root.
    if sourcefile.startswith(os.sep):
        sourcefile = sourcefile[1:]
    augdir = os.path.join(pcb, os.path.dirname(sourcefile))

On Windows, which does not have a single-rooted directory tree, the
drive letter of the directory containing the source file is treated as
a directory component after removing the trailing colon. The
augmented directory is thus derived as::

    import os, sys
    pcb = os.path.abspath(sys.pythonbytecodebase)
    drive, base = os.path.splitdrive(os.path.dirname(sourcefile))
    drive = drive[:-1]              # "C:" becomes "C"
    if base.startswith("\\"):       # drop the leading separator
        base = base[1:]
    augdir = os.path.join(pcb, drive, base)
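Both derivations can be folded into one function that runs on any
platform by using the ``posixpath`` and ``ntpath`` modules directly.
This is a sketch only: the name ``augmented_dir`` and its ``windows``
flag are hypothetical, the bytecode base is assumed to be absolute
already, and Windows paths are assumed to use a drive letter (UNC
paths are not handled).

```python
import ntpath
import posixpath


def augmented_dir(bytecode_base, sourcefile, windows=False):
    """Return the augmented directory for `sourcefile` under `bytecode_base`."""
    if windows:
        drive, directory = ntpath.splitdrive(ntpath.dirname(sourcefile))
        # "C:" becomes the plain directory component "C".
        return ntpath.join(bytecode_base, drive.rstrip(":"), directory.lstrip("\\"))
    # Unix: drop the leading "/" so the join stays under the bytecode base.
    directory = posixpath.dirname(sourcefile).lstrip("/")
    return posixpath.join(bytecode_base, directory)


unix_dir = augmented_dir("/tmp", "/usr/lib/python2.3/urllib.py")
win_dir = augmented_dir("C:\\TEMP", "C:\\PYTHON22\\urllib.py", windows=True)
```

For these inputs ``unix_dir`` is ``/tmp/usr/lib/python2.3`` and
``win_dir`` is ``C:\TEMP\C\PYTHON22``.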
|
||||
Fixing the location of the bytecode base
----------------------------------------

During program startup, the value of the PYTHONBYTECODEBASE
environment variable is made absolute, checked for validity and added
to the sys module, effectively::

    import os, sys
    pcb = os.path.abspath(os.environ["PYTHONBYTECODEBASE"])
    try:
        # Check writability by creating and removing a scratch file.
        probe = os.path.join(pcb, "foo")
        open(probe, "w").close()
        os.unlink(probe)
        sys.pythonbytecodebase = pcb
    except (IOError, OSError):
        sys.pythonbytecodebase = None

This allows the user to specify the bytecode base as a relative path,
but not have it subject to changes to the current working directory.
(I can't imagine you'd want it to move around during program
execution.)

There is nothing special about sys.pythonbytecodebase. The user may
change it at runtime if she so chooses, but normally it will not be
modified.
|
||||
|
||||
Rationale
=========

In many environments it is not possible for non-root users to write
into directories containing Python source files. Most of the time,
this is not a problem as Python source is generally byte compiled
during installation. However, there are situations where bytecode
files are either missing or need to be updated. If the directory
containing the source file is not writable by the current user, a
performance penalty is incurred each time a program importing the
module is run. [3]_ Warning messages may also be generated in certain
circumstances. If the directory is writable, nearly simultaneous
attempts to write the bytecode file by two separate processes
may occur, resulting in file corruption. [4]_

In environments with ramdisks available, it may be desirable for
performance reasons to write bytecode files to a directory on such a
disk. Similarly, in environments where Python source code resides on
network file systems, it may be desirable to cache bytecode files on
local disks.
|
||||
|
||||
Alternatives
============

The only other alternative proposed so far [1]_ seems to be to add a
-R flag to the interpreter to disable writing bytecode files
altogether. This proposal subsumes that. Adding a command-line
option is certainly possible, but is probably not sufficient, as the
interpreter's command line is not readily available during
installation.
|
||||
|
||||
Issues
======

- Interpretation of a module's __file__ attribute. I believe the
  __file__ attribute of a module should reflect the true location of
  the bytecode file. If people want to locate a module's source code,
  they should use imp.find_module(module).

- Security - What if root has PYTHONBYTECODEBASE set? Yes, this can
  present a security risk, but so can many other things the root user
  does. The root user should probably not set PYTHONBYTECODEBASE
  except during installation. Still, perhaps this problem can be
  minimized. When running as root the interpreter should check to see
  if PYTHONBYTECODEBASE refers to a directory which is writable by
  anyone other than root. If so, it could raise an exception or
  warning and set sys.pythonbytecodebase to None. Or, see the next
  item.

- More security - What if PYTHONBYTECODEBASE refers to a general
  directory (say, /tmp)? In this case, perhaps loading of a
  preexisting bytecode file should occur only if the file is owned by
  the current user or root. (Does this matter on Windows?)
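The ownership test suggested in the last item could be sketched as
follows on a POSIX system. This is an assumption-laden illustration,
not a settled design: the helper name ``owned_by_user_or_root`` is
hypothetical, "current user" is taken to mean the effective user id,
and ``os.geteuid`` is not available on Windows.

```python
import os
import tempfile


def owned_by_user_or_root(path):
    """Return True if `path` is owned by the current user or by root (uid 0)."""
    owner = os.stat(path).st_uid
    return owner == 0 or owner == os.geteuid()


# A scratch file created by this process stands in for a bytecode file.
fd, scratch = tempfile.mkstemp()
os.close(fd)
trusted = owned_by_user_or_root(scratch)
os.unlink(scratch)
```

A bytecode file failing this test would simply be ignored, and the
source file compiled as if no bytecode file had been found.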
|
||||
|
||||
Examples
========

In the examples which follow, the urllib source code resides in
/usr/lib/python2.3/urllib.py and /usr/lib/python2.3 is in sys.path but
is not writable by the current user.

- The bytecode base is /tmp. /usr/lib/python2.3/urllib.pyc exists and
  is valid. When urllib is imported, the contents of
  /usr/lib/python2.3/urllib.pyc are used. The augmented directory is
  not consulted. No other bytecode file is generated.

- The bytecode base is /tmp. /usr/lib/python2.3/urllib.pyc exists,
  but is out-of-date. When urllib is imported, the generated bytecode
  file is written to urllib.pyc in the augmented directory.
  Intermediate directories will be created as needed.

- The bytecode base is None. No urllib.pyc file is found. When
  urllib is imported, no bytecode file is written.

- The bytecode base is /tmp. No urllib.pyc file is found. When
  urllib is imported, the generated bytecode file is written to the
  augmented directory, creating intermediate directories as needed.

- At startup, PYTHONBYTECODEBASE is /tmp/foobar, which does not exist.
  A warning is emitted, sys.pythonbytecodebase is set to None and no
  bytecode files are written during program execution unless
  sys.pythonbytecodebase is later changed to refer to a valid,
  writable directory.

- At startup, PYTHONBYTECODEBASE is set to /, which exists, but is not
  writable by the current user. A warning is emitted,
  sys.pythonbytecodebase is set to None and no bytecode files are
  written during program execution unless sys.pythonbytecodebase is
  later changed to refer to a valid, writable directory. Note that
  even though the augmented directory constructed for a particular
  bytecode file may be writable by the current user, what counts is
  that the bytecode base directory itself is writable.

- At startup PYTHONBYTECODEBASE is set to the empty string.
  sys.pythonbytecodebase is set to None. No warning is generated,
  however. If no urllib.pyc file is found when urllib is imported, no
  bytecode file is written.
|
||||
In the Windows examples which follow, the urllib source code resides
in ``C:\PYTHON22\urllib.py``. ``C:\PYTHON22`` is in sys.path but is
not writable by the current user.

- The bytecode base is set to ``C:\TEMP``. ``C:\PYTHON22\urllib.pyc``
  exists and is valid. When urllib is imported, the contents of
  ``C:\PYTHON22\urllib.pyc`` are used. The augmented directory is not
  consulted.

- The bytecode base is set to ``C:\TEMP``. ``C:\PYTHON22\urllib.pyc``
  exists, but is out-of-date. When urllib is imported, a new bytecode
  file is written to the augmented directory. Intermediate
  directories will be created as needed.

- At startup, PYTHONBYTECODEBASE is set to ``TEMP`` and the current
  working directory at application startup is ``H:\NET``. The
  potential bytecode base is thus ``H:\NET\TEMP``. If this directory
  exists and is writable by the current user, sys.pythonbytecodebase
  will be set to that value. If not, a warning will be emitted and
  sys.pythonbytecodebase will be set to None.

- The bytecode base is ``C:\TEMP``. No urllib.pyc file is found.
  When urllib is imported, the generated bytecode file is written to
  the augmented directory, creating intermediate directories as
  needed.
|
||||
References
==========

.. [1] patch 602345, Option for not writing py.[co] files, Klose
   (http://www.python.org/sf/602345)

.. [2] python-dev thread, Disable writing .py[co], Norwitz
   (http://mail.python.org/pipermail/python-dev/2003-January/032270.html)

.. [3] Debian bug report, Mailman is writing to /usr in cron, Wegner
   (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=96111)

.. [4] python-dev thread, Parallel pyc construction, Dubois
   (http://mail.python.org/pipermail/python-dev/2003-January/032060.html)
|
||||
|
||||
Copyright
=========

This document has been placed in the public domain.
|
||||
|
||||
|
||||
..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End: