python-peps/pep-0382.txt

176 lines
7.0 KiB
Plaintext
Raw Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

PEP: 382
Title: Namespace Packages
Version: $Revision$
Last-Modified: $Date$
Author: Martin v. Löwis <martin@v.loewis.de>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 02-Apr-2009
Python-Version: 3.2
Post-History:
Abstract
========
Namespace packages are a mechanism for splitting a single Python
package across multiple directories on disk. In current Python
versions, an algorithm to compute the packages __path__ must be
formulated. With the enhancement proposed here, the import machinery
itself will construct the list of directories that make up the
package.
Terminology
===========
Within this PEP, the term package refers to Python packages as defined
by Python's import statement. The term distribution refers to
separately installable sets of Python modules as stored in the Python
package index, and installed by distutils or setuptools. The term
vendor package refers to groups of files installed by an operating
system's packaging mechanism (e.g. Debian or Redhat packages install
on Linux systems).
The term portion refers to a set of files in a single directory (possibly
stored in a zip file) that contribute to a namespace package.
Namespace packages today
========================
Python currently provides the pkgutil.extend_path to denote a package as
a namespace package. The recommended way of using it is to put::
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)
in the package's ``__init__.py``. Every distribution needs to provide
the same contents in its ``__init__.py``, so that extend_path is
invoked independent of which portion of the package gets imported
first. As a consequence, the package's ``__init__.py`` cannot
practically define any names as it depends on the order of the package
fragments on sys.path which portion is imported first. As a special
feature, extend_path reads files named ``<packagename>.pkg`` which
allow to declare additional portions.
setuptools provides a similar function pkg_resources.declare_namespace
that is used in the form::
import pkg_resources
pkg_resources.declare_namespace(__name__)
In the portion's __init__.py, no assignment to __path__ is necessary,
as declare_namespace modifies the package __path__ through sys.modules.
As a special feature, declare_namespace also supports zip files, and
registers the package name internally so that future additions to sys.path
by setuptools can properly add additional portions to each package.
setuptools allows declaring namespace packages in a distribution's
setup.py, so that distribution developers don't need to put the
magic __path__ modification into __init__.py themselves.
Rationale
=========
The current imperative approach to namespace packages has lead to
multiple slightly-incompatible mechanisms for providing namespace
packages. For example, pkgutil supports ``*.pkg`` files; setuptools
doesn't. Likewise, setuptools supports inspecting zip files, and
supports adding portions to its _namespace_packages variable, whereas
pkgutil doesn't.
In addition, the current approach causes problems for system vendors.
Vendor packages typically must not provide overlapping files, and an
attempt to install a vendor package that has a file already on disk
will fail or cause unpredictable behavior. As vendors might chose to
package distributions such that they will end up all in a single
directory for the namespace package, all portions would contribute
conflicting __init__.py files.
Specification
=============
Rather than using an imperative mechanism for importing packages, a
declarative approach is proposed here, as an extension to the existing
``*.pth`` mechanism available on the top-level python path.
The import statement is extended so that it directly considers
``*.pth`` files during import; a directory is considered a package if
it either contains a file named __init__.py, or a file whose name ends
with ".pth". Unlike .pth files on the top level, lines starting with
"import" are not supported in per-package .pth files.
In addition, the format of the ``*.pth`` file is extended: a line with
the single character ``*`` indicates that the entire sys.path will
be searched for portions of the namespace package at the time the
namespace packages is imported. For a sub-package, the package's
__path__ is searched (instead of sys.path).
Importing a package will immediately compute the package's __path__;
the ``*.pth`` files are not considered anymore after the initial
import. If a ``*.pth`` package contains an asterisk, this asterisk is
prepended to the package's __path__ to indicate that the package is a
namespace package (and that thus further extensions to sys.path might
also want to extend __path__). At most one such asterisk gets
prepended to the path. In addition, the (possibly dotted) names of all
namespace packages are added to the set sys.namespace_packages.
No other change to the importing mechanism is made; searching modules
(including __init__.py) will continue to stop at the first module
encountered. In summary, the process import a package foo works like
this:
1. sys.path is search for a directory foo, or a file foo.<ext>.
If a file is found and no directory, it is treated as a module, and imported.
2. if it is a directory, it checks for \*.pth files. If it finds
any, a package is created, and its __path__ is extended.
3. The __init__ module is imported; this import will search the
__path__ that got computed already.
4. If neither a \*.pth file nor an __init__.py was found, the
directory is skipped, and search for the module/package
continues.
Discussion
==========
With the addition of ``*.pth`` files to the import mechanism, namespace
packages can stop filling out the namespace package's __init__.py.
As a consequence, extend_path and declare_namespace become obsolete.
It is recommended that distributions put a file <distribution>.pth
into their namespace packages, with a single asterisk. This allows
vendor packages to install multiple portions of namespace package
into a single directory, with no risk of overlapping files.
Namespace packages can start providing non-trivial __init__.py
implementations; to do so, it is recommended that a single distribution
provides a portion with just the namespace package's __init__.py
(and potentially other modules that belong to the namespace package
proper).
The mechanism is mostly compatible with the existing namespace
mechanisms. extend_path will be adjusted to this specification;
any other mechanism might cause portions to get added twice to
__path__.
It has been proposed to also add this feature to Python 2.7. Given
that 2.x reaches its end-of-life, it is questionable whether the
addition of the feature would really do more good than harm (in
having users and tools starting to special-case 2.7). Prospective
users of this feature are encouraged to comment on this particular
question.
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: