python-peps/pep-0404.txt

659 lines
27 KiB
Plaintext
Raw Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

PEP: 404
Title: Python Virtual Environments
Version: $Revision$
Last-Modified: $Date$
Author: Carl Meyer <carl@oddbird.net>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 13-Jun-2011
Python-Version: 3.3
Post-History: 24-Oct-2011, 28-Oct-2011
Abstract
========
This PEP proposes to add to Python a mechanism for lightweight
"virtual environments" with their own site directories, optionally
isolated from system site directories. Each virtual environment has
its own Python binary (allowing creation of environments with various
Python versions) and can have its own independent set of installed
Python packages in its site directories, but shares the standard
library with the base installed Python.
Motivation
==========
The utility of Python virtual environments has already been well
established by the popularity of existing third-party
virtual-environment tools, primarily Ian Bicking's `virtualenv`_.
Virtual environments are already widely used for dependency management
and isolation, ease of installing and using Python packages without
system-administrator access, and automated testing of Python software
across multiple Python versions, among other uses.
Existing virtual environment tools suffer from lack of support from
the behavior of Python itself. Tools such as `rvirtualenv`_, which do
not copy the Python binary into the virtual environment, cannot
provide reliable isolation from system site directories. Virtualenv,
which does copy the Python binary, is forced to duplicate much of
Python's ``site`` module and manually symlink/copy an ever-changing
set of standard-library modules into the virtual environment in order
to perform a delicate boot-strapping dance at every startup.
(Virtualenv copies the binary because symlinking it does not provide
isolation, as Python dereferences a symlinked executable before
searching for ``sys.prefix``.)
The ``PYTHONHOME`` environment variable, Python's only existing
built-in solution for virtual environments, requires
copying/symlinking the entire standard library into every environment.
Copying the whole standard library is not a lightweight solution, and
cross-platform support for symlinks remains inconsistent (even on
Windows platforms that do support them, creating them often requires
administrator privileges).
A virtual environment mechanism integrated with Python and drawing on
years of experience with existing third-party tools can be lower
maintenance, more reliable, and more easily available to all Python
users.
.. _virtualenv: http://www.virtualenv.org
.. _rvirtualenv: https://github.com/kvbik/rvirtualenv
Specification
=============
When the Python binary is executed, it attempts to determine its
prefix (which it stores in ``sys.prefix``), which is then used to find
the standard library and other key files, and by the ``site`` module
to determine the location of the site-package directories. Currently
the prefix is found (assuming ``PYTHONHOME`` is not set) by first
walking up the filesystem tree looking for a marker file (``os.py``)
that signifies the presence of the standard library, and if none is
found, falling back to the build-time prefix hardcoded in the binary.
This PEP proposes to add a new first step to this search. If a
``pyvenv.cfg`` file is found either adjacent to the Python executable,
or one directory above it, this file is scanned for lines of the form
``key = value``. If a ``home`` key is found, this signifies that the
Python binary belongs to a virtual environment, and the value of the
``home`` key is the directory containing the Python executable used to
create this virtual environment.
In this case, prefix-finding continues as normal using the value of
the ``home`` key as the effective Python binary location, which
results in ``sys.prefix`` being set to the system installation prefix,
while ``sys.site_prefix`` is set to the directory containing
``pyvenv.cfg``.
(If ``pyvenv.cfg`` is not found or does not contain the ``home`` key,
prefix-finding continues normally, and ``sys.site_prefix`` will be
equal to ``sys.prefix``.)
The ``site`` and ``sysconfig`` standard-library modules are modified
such that site-package directories ("purelib" and "platlib", in
``sysconfig`` terms) are found relative to ``sys.site_prefix``, while
other directories (the standard library, include files) are still
found relative to ``sys.prefix``.
(Also, ``sys.site_exec_prefix`` is added, and handled similarly with
regard to ``sys.exec_prefix``.)
Thus, a Python virtual environment in its simplest form would consist
of nothing more than a copy or symlink of the Python binary
accompanied by a ``pyvenv.cfg`` file and a site-packages directory.
Isolation from system site-packages
-----------------------------------
By default, a virtual environment is entirely isolated from the
system-level site-packages directories.
If the ``pyvenv.cfg`` file also contains a key
``include-system-site-packages`` with a value of ``true`` (not case
sensitive), the ``site`` module will also add the system site
directories to ``sys.path`` after the virtual environment site
directories. Thus system-installed packages will still be importable,
but a package of the same name installed in the virtual environment
will take precedence.
:pep:`370` user-level site-packages are considered part of the system
site-packages for venv purposes: they are not available from an
isolated venv, but are available from an
``include-system-site-packages = true`` venv.
Creating virtual environments
-----------------------------
This PEP also proposes adding a new ``venv`` module to the standard
library which implements the creation of virtual environments. This
module can be executed using the ``-m`` flag::
python3 -m venv /path/to/new/virtual/environment
A ``pyvenv`` installed script is also provided to make this more
convenient::
pyvenv /path/to/new/virtual/environment
Running this command creates the target directory (creating any parent
directories that don't exist already) and places a ``pyvenv.cfg`` file
in it with a ``home`` key pointing to the Python installation the
command was run from. It also creates a ``bin/`` (or ``Scripts`` on
Windows) subdirectory containing a copy (or symlink) of the
``python3`` executable, and the ``pysetup3`` script from the
``packaging`` standard library module (to facilitate easy installation
of packages from PyPI into the new virtualenv). And it creates an
(initially empty) ``lib/pythonX.Y/site-packages`` (or
``Lib\site-packages`` on Windows) subdirectory.
If the target directory already exists an error will be raised, unless
the ``--clear`` option was provided, in which case the target
directory will be deleted and virtual environment creation will
proceed as usual.
The created ``pyvenv.cfg`` file also includes the
``include-system-site-packages`` key, set to ``true`` if ``venv`` is
run with the ``--system-site-packages`` option, ``false`` by default.
Multiple paths can be given to ``pyvenv``, in which case an identical
virtualenv will be created, according to the given options, at each
provided path.
The ``venv`` module also adds a ``pysetup3`` script into each venv.
In order to allow ``pysetup`` and other Python package managers to
install packages into the virtual environment the same way they would
install into a normal Python installation, and avoid special-casing
virtual environments in ``sysconfig`` beyond using ``sys.site_prefix``
in place of ``sys.prefix``, the internal virtual environment layout
mimics the layout of the Python installation itself on each platform.
So a typical virtual environment layout on a POSIX system would be::
pyvenv.cfg
bin/python3
bin/python
bin/pysetup3
lib/python3.3/site-packages/
While on a Windows system::
pyvenv.cfg
Scripts/python.exe
Scripts/python3.dll
Scripts/pysetup3.exe
Scripts/pysetup3-script.py
... other DLLs and pyds...
Lib/site-packages/
Third-party packages installed into the virtual environment will have
their Python modules placed in the ``site-packages`` directory, and
their executables placed in ``bin/`` or ``Scripts\``.
.. note::
On a normal Windows system-level installation, the Python binary
itself wouldn't go inside the "Scripts/" subdirectory, as it does
in the default venv layout. This is useful in a virtual
environment so that a user only has to add a single directory to
their shell PATH in order to effectively "activate" the virtual
environment.
.. note::
On Windows, it is necessary to also copy or symlink DLLs and pyd
files from compiled stdlib modules into the env, because if the
venv is created from a non-system-wide Python installation,
Windows won't be able to find the Python installation's copies of
those files when Python is run from the venv.
Copies versus symlinks
----------------------
The technique in this PEP works equally well in general with a copied
or symlinked Python binary (and other needed DLLs on Windows). Some
users prefer a copied binary (for greater isolation from system
changes) and some prefer a symlinked one (so that e.g. security
updates automatically propagate to virtual environments).
There are some cross-platform difficulties with symlinks:
* Not all Windows versions support symlinks, and even on those that
do, creating them often requires administrator privileges.
* On OSX framework builds of Python, sys.executable is just a stub
that executes the real Python binary. Symlinking this stub does not
work with the implementation in this PEP; it must be copied.
(Fortunately the stub is also small, so copying it is not an issue).
Because of these issues, this PEP proposes to copy the Python binary
by default, to maintain cross-platform consistency in the default
behavior.
The ``pyvenv`` script accepts a ``--symlink`` option. If this option
is provided, the script will attempt to symlink instead of copy. If a
symlink fails (e.g. because they are not supported by the platform, or
additional privileges are needed), the script will warn the user and
fall back to a copy.
On OSX framework builds, where a symlink of the executable would
succeed but create a non-functional virtual environment, the script
will fail with an error message that symlinking is not supported on
OSX framework builds.
API
---
The high-level method described above makes use of a simple API which
provides mechanisms for third-party virtual environment creators to
customize environment creation according to their needs.
The ``venv`` module contains an ``EnvBuilder`` class which accepts the
following keyword arguments on instantiation:
* ``system_site_packages`` - A Boolean value indicating that the
system Python site-packages should be available to the environment.
Defaults to ``False``.
* ``clear`` - A Boolean value which, if true, will delete any existing
target directory instead of raising an exception. Defaults to
``False``.
* ``use_symlinks`` - A Boolean value indicating whether to attempt to
symlink the Python binary (and any necessary DLLs or other binaries,
e.g. ``pythonw.exe``), rather than copying. Defaults to ``False``.
The returned env-builder is an object with a ``create`` method, which
takes as required argument the path (absolute or relative to the
current directory) of the target directory which is to contain the
virtual environment. The ``create`` method either creates the
environment in the specified directory, or raises an appropriate
exception.
Creators of third-party virtual environment tools are free to use the
provided ``EnvBuilder`` class as a base class.
The ``venv`` module also provides a module-level function as a
convenience::
def create(env_dir,
system_site_packages=False, clear=False, use_symlinks=False):
builder = EnvBuilder(
system_site_packages=system_site_packages,
clear=clear,
use_symlinks=use_symlinks)
builder.create(env_dir)
The ``create`` method of the ``EnvBuilder`` class illustrates the
hooks available for customization::
def create(self, env_dir):
"""
Create a virtualized Python environment in a directory.
:param env_dir: The target directory to create an environment in.
"""
env_dir = os.path.abspath(env_dir)
context = self.create_directories(env_dir)
self.create_configuration(context)
self.setup_python(context)
self.post_setup(context)
Each of the methods ``create_directories``, ``create_configuration``,
``setup_python``, and ``post_setup`` can be overridden. The functions
of these methods are:
* ``create_directories`` - creates the environment directory and all
necessary directories, and returns a context object. This is just a
holder for attributes (such as paths), for use by the other methods.
* ``create_configuration`` - creates the ``pyvenv.cfg`` configuration
file in the environment.
* ``setup_python`` - creates a copy of the Python executable (and,
under Windows, DLLs) in the environment.
* ``post_setup`` - A (no-op by default) hook method which can be
overridden in third party implementations to pre-install packages or
install scripts in the virtual environment.
In addition, ``EnvBuilder`` provides a utility method that can be
called from ``post_setup`` in subclasses to assist in installing
scripts into the virtual environment. The method ``install_scripts``
accepts as arguments the ``context`` object (see above) and a
bytestring. The bytestring should be a base64-encoded zip file
containing directories "common", "posix", "nt", each containing
scripts destined for the bin directory in the environment. The
contents of "common" and the directory corresponding to ``os.name``
are copied after doing some text replacement of placeholders:
* ``__VENV_DIR__`` is replaced with absolute path of the environment
directory.
* ``__VENV_NAME__`` is replaced with the environment name (final path
segment of environment directory).
* ``__VENV_BIN_NAME__`` is replaced with the name of the bin directory
(either ``bin`` or ``Scripts``).
* ``__VENV_PYTHON__`` is replaced with the absolute path of the
environment's executable.
The ``DistributeEnvBuilder`` subclass in the reference implementation
illustrates how the customization hook can be used in practice to
pre-install Distribute and shell activation scripts into the virtual
environment. It's not envisaged that ``DistributeEnvBuilder`` will be
actually added to Python core, but it makes the reference
implementation more immediately useful for testing and exploratory
purposes.
The "shell activation scripts" provided by ``DistributeEnvBuilder``
simply add the virtual environment's ``bin/`` (or ``Scripts\``)
directory to the front of the user's shell PATH. This is not strictly
necessary for use of a virtual environment (as an explicit path to the
venv's python binary or scripts can just as well be used), but it is
convenient.
This PEP does not propose that the ``venv`` module in core Python will
add such activation scripts by default, as they are shell-specific.
Adding activation scripts for the wide variety of possible shells is
an added maintenance burden, and is left to third-party extension
tools.
No doubt the process of PEP review will show up any customization
requirements which have not yet been considered.
Backwards Compatibility
=======================
Splitting the meanings of ``sys.prefix``
----------------------------------------
Any virtual environment tool along these lines (which attempts to
isolate site-packages, while still making use of the base Python's
standard library with no need for it to be symlinked into the virtual
environment) is proposing a split between two different meanings
(among others) that are currently both wrapped up in ``sys.prefix``:
the answers to the questions "Where is the standard library?" and
"Where is the site-packages location where third-party modules should
be installed?"
This split could be handled by introducing a new ``sys`` attribute for
either the former prefix or the latter prefix. Either option
potentially introduces some backwards-incompatibility with software
written to assume the other meaning for ``sys.prefix``. (Such
software should preferably be using the APIs in the ``site`` and
``sysconfig`` modules to answer these questions rather than using
``sys.prefix`` directly, in which case there is no
backwards-compatibility issue, but in practice ``sys.prefix`` is
sometimes used.)
The `documentation`__ for ``sys.prefix`` describes it as "A string
giving the site-specific directory prefix where the platform
independent Python files are installed," and specifically mentions the
standard library and header files as found under ``sys.prefix``. It
does not mention ``site-packages``.
__ http://docs.python.org/dev/library/sys.html#sys.prefix
This PEP currently proposes to leave ``sys.prefix`` pointing to the
base system installation (which is where the standard library and
header files are found), and introduce a new value in ``sys``
(``sys.site_prefix``) to point to the prefix for ``site-packages``.
This maintains the documented semantics of ``sys.prefix``, but risks
breaking isolation if third-party code uses ``sys.prefix`` rather than
``sys.site_prefix`` or the appropriate ``site`` API to find
site-packages directories.
The most notable case is probably `setuptools`_ and its fork
`distribute`_, which mostly use ``distutils``/``sysconfig`` APIs, but
do use ``sys.prefix`` directly to build up a list of site directories
for pre-flight checking where ``pth`` files can usefully be placed.
It would be trivial to modify these tools (currently only
`distribute`_ is Python 3 compatible) to check ``sys.site_prefix`` and
fall back to ``sys.prefix`` if it doesn't exist (for earlier versions
of Python). If Distribute is modified in this way and released before
Python 3.3 is released with the ``venv`` module, there would be no
likely reason for an older version of Distribute to ever be installed
in a virtual environment.
In terms of other third-party usage, a `Google Code Search`_ turns up
what appears to be a roughly even mix of usage between packages using
``sys.prefix`` to build up a site-packages path and packages using it
to e.g. eliminate the standard-library from code-execution tracing.
Either choice that's made here will require one or the other of these
uses to be updated.
.. _setuptools: http://peak.telecommunity.com/DevCenter/setuptools
.. _distribute: http://packages.python.org/distribute/
.. _Google Code Search: http://www.google.com/codesearch#search/&q=sys\.prefix&p=1&type=cs
Open Questions
==============
Naming of the new ``sys`` prefix attributes
-------------------------------------------
The name ``sys.site_prefix`` was chosen with the following
considerations in mind:
* Inasmuch as "site" has a meaning in Python, it means a combination
of Python version, standard library, and specific set of
site-packages. This is, fundamentally, what a venv is (although it
shares the standard library with its "base" site).
* It is the Python ``site`` module which implements adding
site-packages directories to ``sys.path``, so ``sys.site_prefix`` is
a prefix used (and set) primarily by the ``site`` module.
A concern has been raised that the term ``site`` in Python is already
overloaded and of unclear meaning, and this usage will increase the
overload.
One proposed alternative is ``sys.venv_prefix``, which has the
advantage of being clearly related to the venv implementation. The
downside of this proposal is that it implies the attribute is only
useful/relevant when in a venv and should be absent or ``None`` when
not in a venv. This imposes an unnecessary extra burden on code using
the attribute: ``sys.venv_prefix if sys.venv_prefix else sys.prefix``.
The prefix attributes are more usable and general if they are always
present and set, and split by meaning (stdlib vs site-packages,
roughly), rather than specifically tied to venv. Also, third-party
code should be encouraged to not know or care whether it is running in
a virtual environment or not; this option seems to work against that
goal.
Another option would be ``sys.local_prefix``, which has both the
advantage and disadvantage, depending on perspective, that it
introduces the new term "local" rather than drawing on existing
associations with the term "site".
Why not modify sys.prefix?
--------------------------
As discussed above under `Backwards Compatibility`_, this PEP proposes
to add ``sys.site_prefix`` as "the prefix relative to which
site-package directories are found". This maintains compatibility
with the documented meaning of ``sys.prefix`` (as the location
relative to which the standard library can be found), but means that
code assuming that site-packages directories are found relative to
``sys.prefix`` will not respect the virtual environment correctly.
Since it is unable to modify ``distutils``/``sysconfig``,
`virtualenv`_ is forced to instead re-point ``sys.prefix`` at the
virtual environment.
An argument could be made that this PEP should follow virtualenv's
lead here (and introduce something like ``sys.base_prefix`` to point
to the standard library and header files), since virtualenv already
does this and it doesn't appear to have caused major problems with
existing code.
Another argument in favor of this is that it would be preferable to
err on the side of greater, rather than lesser, isolation. Changing
``sys.prefix`` to point to the virtual environment and introducing a
new ``sys.base_prefix`` attribute would err on the side of greater
isolation in the face of existing code's use of ``sys.prefix``.
What about include files?
-------------------------
For example, ZeroMQ installs ``zmq.h`` and ``zmq_utils.h`` in
``$VE/include``, whereas SIP (part of PyQt4) installs sip.h by default
in ``$VE/include/pythonX.Y``. With virtualenv, everything works
because the PythonX.Y include is symlinked, so everything that's
needed is in ``$VE/include``. At the moment the reference
implementation doesn't do anything with include files, besides
creating the include directory; this might need to change, to
copy/symlink ``$VE/include/pythonX.Y``.
As in Python there's no abstraction for a site-specific include
directory, other than for platform-specific stuff, then the user
expectation would seem to be that all include files anyone could ever
want should be found in one of just two locations, with sysconfig
labels "include" & "platinclude".
There's another issue: what if includes are Python-version-specific?
For example, SIP installs by default into ``$VE/include/pythonX.Y``
rather than ``$VE/include``, presumably because there's
version-specific stuff in there - but even if that's not the case with
SIP, it could be the case with some other package. And the problem
that gives is that you can't just symlink the ``include/pythonX.Y``
directory, but actually have to provide a writable directory and
symlink/copy the contents from the system ``include/pythonX.Y``. Of
course this is not hard to do, but it does seem inelegant. OTOH it's
really because there's no supporting concept in ``Python/sysconfig``.
Interface with packaging tools
------------------------------
Some work will be needed in packaging tools (Python 3.3 packaging,
Distribute) to support implementation of this PEP. For example:
* How Distribute and packaging use ``sys.prefix`` and/or
``sys.site_prefix``. Clearly, in practice we'll need to use
Distribute for a while, until packages have migrated over to usage
of setup.cfg.
* How packaging and Distribute set up shebang lines in scripts which they
install in virtual environments.
Testability and Source Build Issues
-----------------------------------
Currently in the reference implementation, virtual environments must
be created with an installed Python, rather than a source build, as
the base installation. In order to be able to fully test the ``venv``
module in the Python regression test suite, some anomalies in how
sysconfig data is configured in source builds will need to be removed.
For example, ``sysconfig.get_paths()`` in a source build gives
(partial output)::
{
'include': '/home/vinay/tools/pythonv/Include',
'libdir': '/usr/lib ; or /usr/lib64 on a multilib system',
'platinclude': '/home/vinay/tools/pythonv',
'platlib': '/usr/local/lib/python3.3/site-packages',
'platstdlib': '/usr/local/lib/python3.3',
'purelib': '/usr/local/lib/python3.3/site-packages',
'stdlib': '/usr/local/lib/python3.3'
}
Need for ``install_name_tool`` on OSX?
--------------------------------------
`Virtualenv uses`_ ``install_name_tool``, a tool provided in the Xcode
developer tools, to modify the copied executable on OSX. We need
input from OSX developers on whether this is actually necessary in
this PEP's implementation of virtual environments, and if so, if there
is an alternative to ``install_name_tool`` that would allow ``venv``
to not require that Xcode is installed.
.. _Virtualenv uses: https://github.com/pypa/virtualenv/issues/168
Activation and Utility Scripts
------------------------------
Virtualenv provides shell "activation" scripts as a user convenience,
to put the virtual environment's Python binary first on the shell
PATH. This is a maintenance burden, as separate activation scripts
need to be provided and maintained for every supported shell. For
this reason, this PEP proposes to leave such scripts to be provided by
third-party extensions; virtual environments created by the core
functionality would be used by directly invoking the environment's
Python binary or scripts.
If we are going to rely on external code to provide these
conveniences, we need to check with existing third-party projects in
this space (virtualenv, zc.buildout) and ensure that the proposed API
meets their needs.
(Virtualenv would be fine with the proposed API; it would become a
relatively thin wrapper with a subclass of the env builder that adds
shell activation and automatic installation of ``pip`` inside the
virtual environment).
Provide a mode that is isolated only from user site packages?
-------------------------------------------------------------
Is there sufficient rationale for providing a mode that isolates the
venv from :pep:`370` user site packages, but not from the system-level
site-packages?
Other Python implementations?
-----------------------------
We should get feedback from Jython, IronPython, and PyPy about whether
there's anything in this PEP that they foresee as a difficulty for
their implementation.
Reference Implementation
========================
The in-progress reference implementation is found in `a clone of the
CPython Mercurial repository`_. To test it, build and install it (the
virtual environment tool currently does not run from a source tree).
From the installed Python, run ``bin/pyvenv /path/to/new/virtualenv``
to create a virtual environment.
The reference implementation (like this PEP!) is a work in progress.
.. _a clone of the CPython Mercurial repository: https://bitbucket.org/vinay.sajip/pythonv
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: