PEP: 304
Title: Controlling generation of bytecode files
Version: $Revision$
Last-Modified: $Date$
Author: Skip Montanaro
Status: Active
Type: Draft
Content-Type: text/x-rst
Created: 22-Jan-2003
Post-History:


Abstract
========

This PEP outlines a mechanism for controlling the generation and
location of compiled Python bytecode files. This idea originally
arose as a patch request [1]_ and evolved into a discussion thread on
the python-dev mailing list [2]_. The introduction of an environment
variable will allow people installing Python or Python-based
third-party packages to control whether or not bytecode files
should be generated, and if so, where they should be written.


Proposal
========

Add a new environment variable, PYTHONBYTECODEBASE, to the mix of
environment variables which Python understands. Its interpretation
is:

- If not present, Python bytecode is generated in exactly the same way
  as is currently done. sys.pythonbytecodebase is set to the root
  directory (either / on Unix or the root directory of the startup
  drive -- typically ``C:\`` -- on Windows).

- If present and it refers to an existing directory,
  sys.pythonbytecodebase is set to that directory and bytecode files
  are written into a directory structure rooted at that location.

- If present but empty, sys.pythonbytecodebase is set to None and
  generation of bytecode files is suppressed altogether.

- If present and it does not refer to an existing directory, a warning
  is displayed, sys.pythonbytecodebase is set to None and generation
  of bytecode files is suppressed altogether.

After startup, all runtime references are to sys.pythonbytecodebase,
not the PYTHONBYTECODEBASE environment variable. sys.path is not
modified.

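The four cases above can be sketched as a small decision function.
This is only an illustration, not part of the proposal: the helper
name ``interpret_bytecode_base`` is hypothetical, the environment is
passed in as a mapping so the logic can be exercised without touching
``os.environ``, and the "startup drive" is assumed to be the drive of
the current working directory.

```python
import os
import warnings


def interpret_bytecode_base(environ):
    """Return the value sys.pythonbytecodebase would take at startup.

    Hypothetical helper: name and signature are illustrative only.
    """
    if "PYTHONBYTECODEBASE" not in environ:
        # Not present: behave as today; the base is the filesystem root.
        # Assumption: the startup drive is the drive of the current
        # working directory (empty on Unix, so this yields "/").
        drive, _ = os.path.splitdrive(os.getcwd())
        return drive + os.sep
    value = environ["PYTHONBYTECODEBASE"]
    if value == "":
        # Present but empty: suppress bytecode files, without a warning.
        return None
    if os.path.isdir(value):
        # Present and an existing directory: use it as the base.
        return os.path.abspath(value)
    # Present but not an existing directory: warn and suppress.
    warnings.warn(
        "PYTHONBYTECODEBASE is not an existing directory: %r" % value)
    return None
```

Calling ``interpret_bytecode_base(os.environ)`` once at startup and
storing the result would mirror the rule that later references go to
sys.pythonbytecodebase rather than to the environment variable.
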
Glossary
--------

- "bytecode base" refers to the current setting of
  sys.pythonbytecodebase.

- "augmented directory" refers to the directory formed from the
  bytecode base and the directory name of the source file.

- PYTHONBYTECODEBASE refers to the environment variable when necessary
  to distinguish it from "bytecode base".

Locating bytecode files
-----------------------

When the interpreter is searching for a module, it will use sys.path
as usual. However, when a possible bytecode file is considered, an
extra probe for a bytecode file may be made. First, a check is made
for the bytecode file using the directory in sys.path which holds the
source file (the current behavior). If a valid bytecode file is not
found there (either one does not exist or exists but is out-of-date)
and the bytecode base is not None, a second probe is made using the
directory in sys.path prefixed appropriately by the bytecode base.

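The two-probe search might look roughly like the following. It is a
sketch under stated assumptions: ``find_bytecode`` is a hypothetical
name, the bytecode file is assumed to be the source name with "c"
appended, and the validity check (exists and is not out-of-date) is
left to a caller-supplied ``is_valid`` function because this PEP does
not change how validity is decided.

```python
import os


def find_bytecode(source_path, bytecode_base, is_valid):
    """Return the path of a usable bytecode file, or None.

    Hypothetical helper; is_valid(path) is supplied by the caller and
    should return True only for an existing, up-to-date bytecode file.
    """
    local = os.path.abspath(source_path) + "c"   # urllib.py -> urllib.pyc
    # First probe: next to the source file (the current behavior).
    if is_valid(local):
        return local
    if bytecode_base is None:
        return None
    # Second probe: the same location prefixed by the bytecode base.
    # The drive letter, if any, becomes a directory component.
    drive, tail = os.path.splitdrive(local)
    augmented = os.path.join(os.path.abspath(bytecode_base),
                             drive.rstrip(":"), tail.lstrip(os.sep))
    return augmented if is_valid(augmented) else None
```

Passing ``os.path.isfile`` as ``is_valid`` is enough to try the search
order out; a real check would also compare the bytecode file against
the source.
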
Writing bytecode files
----------------------

When the bytecode base is not None, a new bytecode file is written to
the appropriate augmented directory, never directly to a directory in
sys.path.

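As an illustration of the write rule, the sketch below writes a
bytecode payload under the bytecode base and creates intermediate
directories as needed. It is not a proposed implementation:
``write_bytecode`` is a hypothetical name, only Unix-style paths are
handled, and the payload is an opaque ``bytes`` object.

```python
import os


def write_bytecode(data, source_path, bytecode_base):
    """Write data as the bytecode file for source_path.

    Returns the path written, or None when writing is suppressed.
    Hypothetical helper; Unix-style paths only.
    """
    if bytecode_base is None:
        return None                       # generation is suppressed
    source_path = os.path.abspath(source_path)
    # Augmented directory: the source directory prefixed by the base.
    augdir = os.path.join(os.path.abspath(bytecode_base),
                          os.path.dirname(source_path).lstrip(os.sep))
    os.makedirs(augdir, exist_ok=True)    # create intermediate directories
    target = os.path.join(augdir, os.path.basename(source_path) + "c")
    with open(target, "wb") as f:
        f.write(data)
    return target
```

With a bytecode base of /tmp and a source file of
/usr/lib/python2.3/urllib.py, the file written would be
/tmp/usr/lib/python2.3/urllib.pyc.
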
Defining augmented directories
------------------------------

Conceptually, the augmented directory for a bytecode file is the
directory in which the source file exists prefixed by the bytecode
base. In a Unix environment this would be::

    import os, sys
    pcb = os.path.abspath(sys.pythonbytecodebase)
    # Drop the leading separator so os.path.join() keeps pcb as the prefix.
    if sourcefile.startswith(os.sep):
        sourcefile = sourcefile[1:]
    augdir = os.path.join(pcb, os.path.dirname(sourcefile))

On Windows, which does not have a single-rooted directory tree, the
drive letter of the directory containing the source file is treated as
a directory component after removing the trailing colon. The
augmented directory is thus derived as::

    import os, sys
    pcb = os.path.abspath(sys.pythonbytecodebase)
    drive, base = os.path.splitdrive(os.path.dirname(sourcefile))
    drive = drive[:-1]  # "C:" becomes "C"
    # Drop the leading backslash so os.path.join() keeps pcb as the prefix.
    if base.startswith("\\"):
        base = base[1:]
    augdir = os.path.join(pcb, drive, base)

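Both derivations can be tried on any platform by calling the
``posixpath`` and ``ntpath`` modules directly instead of ``os.path``.
The function below is a sketch for experimentation only: the name
``augmented_dir`` and the ``windows`` flag are hypothetical, and both
arguments are assumed to be absolute paths already.

```python
import ntpath
import posixpath


def augmented_dir(bytecode_base, sourcefile, windows=False):
    """Return the augmented directory for sourcefile under bytecode_base.

    Hypothetical helper; both paths are assumed to be absolute.
    """
    if windows:
        drive, base = ntpath.splitdrive(ntpath.dirname(sourcefile))
        drive = drive.rstrip(":")         # "C:" becomes "C"
        return ntpath.join(bytecode_base, drive, base.lstrip("\\"))
    return posixpath.join(bytecode_base,
                          posixpath.dirname(sourcefile).lstrip("/"))


print(augmented_dir("/tmp", "/usr/lib/python2.3/urllib.py"))
# prints: /tmp/usr/lib/python2.3
print(augmented_dir("C:\\TEMP", "C:\\PYTHON22\\urllib.py", windows=True))
# prints: C:\TEMP\C\PYTHON22
```
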
Fixing the location of the bytecode base
----------------------------------------

During program startup, the value of the PYTHONBYTECODEBASE
environment variable is made absolute, checked for validity and added
to the sys module, effectively::

    import os, sys

    pcb = os.path.abspath(os.environ["PYTHONBYTECODEBASE"])
    try:
        # Check that the directory is writable by creating a scratch file.
        probe = os.path.join(pcb, "foo")
        open(probe, "w").close()
        os.unlink(probe)
        sys.pythonbytecodebase = pcb
    except IOError:
        sys.pythonbytecodebase = None

This allows the user to specify the bytecode base as a relative path,
but not have it subject to changes to the current working directory.
(I can't imagine you'd want it to move around during program
execution.)

There is nothing special about sys.pythonbytecodebase. The user may
change it at runtime if she so chooses, but normally it will not be
modified.

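To make the relative-path rule concrete, the sketch below resolves a
PYTHONBYTECODEBASE value against an explicit startup directory. It
uses ``ntpath`` so the Windows case from the Examples section can be
reproduced on any platform; ``absolute_base`` and its ``cwd``
parameter are hypothetical and stand in for what ``os.path.abspath``
does implicitly at startup.

```python
import ntpath


def absolute_base(value, cwd):
    """Resolve a possibly relative PYTHONBYTECODEBASE against cwd.

    Hypothetical helper using Windows path rules.
    """
    return ntpath.normpath(ntpath.join(cwd, value))


print(absolute_base("TEMP", "H:\\NET"))       # prints: H:\NET\TEMP
print(absolute_base("C:\\TEMP", "H:\\NET"))   # prints: C:\TEMP
```

Because the result is computed once and stored, a later change of the
working directory has no effect on the bytecode base.
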
Rationale
=========

In many environments it is not possible for non-root users to write
into directories containing Python source files. Most of the time,
this is not a problem as Python source is generally byte compiled
during installation. However, there are situations where bytecode
files are either missing or need to be updated. If the directory
containing the source file is not writable by the current user, a
performance penalty is incurred each time a program importing the
module is run. [3]_ Warning messages may also be generated in certain
circumstances. If the directory is writable, nearly simultaneous
attempts to write the bytecode file by two separate processes
may occur, resulting in file corruption. [4]_

In environments with ramdisks available, it may be desirable for
performance reasons to write bytecode files to a directory on such a
disk. Similarly, in environments where Python source code resides on
network file systems, it may be desirable to cache bytecode files on
local disks.


Alternatives
============

The only other alternative proposed so far [1]_ seems to be to add a
-R flag to the interpreter to disable writing bytecode files
altogether. This proposal subsumes that. Adding a command-line
option is certainly possible, but is probably not sufficient, as the
interpreter's command line is not readily available during
installation.


Issues
======

- Interpretation of a module's __file__ attribute. I believe the
  __file__ attribute of a module should reflect the true location of
  the bytecode file. If people want to locate a module's source code,
  they should use imp.find_module(module).

- Security - What if root has PYTHONBYTECODEBASE set? Yes, this can
  present a security risk, but so can many other things the root user
  does. The root user should probably not set PYTHONBYTECODEBASE
  except during installation. Still, perhaps this problem can be
  minimized. When running as root the interpreter should check to see
  if PYTHONBYTECODEBASE refers to a directory which is writable by
  anyone other than root. If so, it could raise an exception or
  warning and set sys.pythonbytecodebase to None. Or, see the next
  item.

- More security - What if PYTHONBYTECODEBASE refers to a general
  directory (say, /tmp)? In this case, perhaps loading of a
  preexisting bytecode file should occur only if the file is owned by
  the current user or root. (Does this matter on Windows?)

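The two security checks floated above could be approximated on Unix
with ``os.stat``. This is a rough sketch rather than a vetted policy:
the function names are hypothetical, only the classic permission bits
are examined (ACLs and the sticky bit are ignored), any group write
bit is treated as "writable by someone other than root", and nothing
here answers the Windows question.

```python
import os
import stat


def base_is_safe_for_root(path):
    """True if path is owned by root and not group- or world-writable.

    Hypothetical check; Unix permission bits only.
    """
    st = os.stat(path)
    others_can_write = bool(st.st_mode & (stat.S_IWGRP | stat.S_IWOTH))
    return st.st_uid == 0 and not others_can_write


def may_load_bytecode(path):
    """True if the file at path is owned by the current user or root.

    Hypothetical check; Unix only (relies on os.getuid).
    """
    return os.stat(path).st_uid in (os.getuid(), 0)
```

A startup routine running as root could call the first function and
fall back to a bytecode base of None when it returns False; the
second would be consulted before trusting a bytecode file found in a
shared directory such as /tmp.
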
Examples
========

In the examples which follow, the urllib source code resides in
/usr/lib/python2.3/urllib.py and /usr/lib/python2.3 is in sys.path but
is not writable by the current user.

- The bytecode base is /tmp. /usr/lib/python2.3/urllib.pyc exists and
  is valid. When urllib is imported, the contents of
  /usr/lib/python2.3/urllib.pyc are used. The augmented directory is
  not consulted. No other bytecode file is generated.

- The bytecode base is /tmp. /usr/lib/python2.3/urllib.pyc exists,
  but is out-of-date. When urllib is imported, the generated bytecode
  file is written to urllib.pyc in the augmented directory,
  /tmp/usr/lib/python2.3. Intermediate directories will be created as
  needed.

- The bytecode base is None. No urllib.pyc file is found. When
  urllib is imported, no bytecode file is written.

- The bytecode base is /tmp. No urllib.pyc file is found. When
  urllib is imported, the generated bytecode file is written to the
  augmented directory, creating intermediate directories as needed.

- At startup, PYTHONBYTECODEBASE is /tmp/foobar, which does not exist.
  A warning is emitted, sys.pythonbytecodebase is set to None and no
  bytecode files are written during program execution unless
  sys.pythonbytecodebase is later changed to refer to a valid,
  writable directory.

- At startup, PYTHONBYTECODEBASE is set to /, which exists, but is not
  writable by the current user. A warning is emitted,
  sys.pythonbytecodebase is set to None and no bytecode files are
  written during program execution unless sys.pythonbytecodebase is
  later changed to refer to a valid, writable directory. Note that
  even though the augmented directory constructed for a particular
  bytecode file may be writable by the current user, what counts is
  that the bytecode base directory itself is writable.

- At startup, PYTHONBYTECODEBASE is set to the empty string.
  sys.pythonbytecodebase is set to None. No warning is generated,
  however. If no urllib.pyc file is found when urllib is imported, no
  bytecode file is written.

In the Windows examples which follow, the urllib source code resides
in ``C:\PYTHON22\urllib.py``. ``C:\PYTHON22`` is in sys.path but is
not writable by the current user.

- The bytecode base is set to ``C:\TEMP``. ``C:\PYTHON22\urllib.pyc``
  exists and is valid. When urllib is imported, the contents of
  ``C:\PYTHON22\urllib.pyc`` are used. The augmented directory is not
  consulted.

- The bytecode base is set to ``C:\TEMP``. ``C:\PYTHON22\urllib.pyc``
  exists, but is out-of-date. When urllib is imported, a new bytecode
  file is written to the augmented directory,
  ``C:\TEMP\C\PYTHON22``. Intermediate directories will be created as
  needed.

- At startup, PYTHONBYTECODEBASE is set to ``TEMP`` and the current
  working directory at application startup is ``H:\NET``. The
  potential bytecode base is thus ``H:\NET\TEMP``. If this directory
  exists and is writable by the current user, sys.pythonbytecodebase
  will be set to that value. If not, a warning will be emitted and
  sys.pythonbytecodebase will be set to None.

- The bytecode base is ``C:\TEMP``. No urllib.pyc file is found.
  When urllib is imported, the generated bytecode file is written to
  the augmented directory, creating intermediate directories as
  needed.

References
==========

.. [1] patch 602345, Option for not writing py.[co] files, Klose
   (http://www.python.org/sf/602345)

.. [2] python-dev thread, Disable writing .py[co], Norwitz
   (http://mail.python.org/pipermail/python-dev/2003-January/032270.html)

.. [3] Debian bug report, Mailman is writing to /usr in cron, Wegner
   (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=96111)

.. [4] python-dev thread, Parallel pyc construction, Dubois
   (http://mail.python.org/pipermail/python-dev/2003-January/032060.html)


Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End: