PEP 421 changes by Eric + formatting/spelling fixes.

This commit is contained in:
Georg Brandl 2012-05-01 08:51:22 +02:00
parent d8dd373e41
commit 21b4644d46
1 changed files with 251 additions and 120 deletions

@ -13,11 +13,11 @@ Post-History: 26-April-2012
Abstract
========
This PEP introduces a new variable for the sys module: ``sys.implementation``.
The variable holds consolidated information about the implementation of
the running interpreter. Thus ``sys.implementation`` is the source to
which the standard library may look for implementation-specific
information.
This PEP introduces a new attribute for the ``sys`` module:
``sys.implementation``. The attribute holds consolidated information
about the implementation of the running interpreter. Thus
``sys.implementation`` is the source to which the standard library may
look for implementation-specific information.
The proposal in this PEP is in line with a broader emphasis on making
Python friendlier to alternate implementations. It describes the new
@ -34,121 +34,211 @@ this change is due to the emergence of Jython, IronPython, and PyPy as
viable alternate implementations of Python.
Consider, however, the nearly two decades of CPython-centric Python
(i.e. most of its existance). That focus had understandably contributed
to quite a few CPython-specific artifacts both in the standard library
and exposed in the interpreter. Though the core developers have made an
effort in recent years to address this, quite a few of the artifacts
remain.
(i.e. most of its existence). That focus had understandably
contributed to quite a few CPython-specific artifacts both in the
standard library and exposed in the interpreter. Though the core
developers have made an effort in recent years to address this, quite
a few of the artifacts remain.
Part of the solution is presented in this PEP: a single namespace on
Part of the solution is presented in this PEP: a single namespace in
which to consolidate implementation specifics. This will help focus
efforts to differentiate the implementation specifics from the language.
Additionally, it will foster a multiple-implementation mindset.
efforts to differentiate the implementation specifics from the
language. Additionally, it will foster a multiple-implementation
mindset.
Proposal
========
We will add ``sys.implementation``, in the sys module, as a namespace to
contain implementation-specific information.
We will add a new variable to the ``sys`` module, called
``sys.implementation``, as a mapping to contain
implementation-specific information.
The contents of this namespace will remain fixed during interpreter
The contents of this mapping will remain fixed during interpreter
execution and through the course of an implementation version. This
ensures behaviors don't change between versions which depend on variables
in ``sys.implementation``.
ensures that behaviors which depend on variables in
``sys.implementation`` don't change between versions.
``sys.implementation`` is a dictionary, as opposed to any form of "named"
tuple (a la ``sys.version_info``). This is partly because it doesn't
have meaning as a sequence, and partly because it's a potentially more
variable data structure.
The namespace will contain at least the variables described in the
`Required Variables`_ section below. However, implementations are free
to add other implementation information there. Some possible extra
variables are described in the `Other Possible Variables`_ section.
The mapping will contain at least the values described in the
`Required Variables`_ section below. However, implementations are
free to add other implementation information there. Some
*conceivable* extra variables are described in the `Other Possible
Variables`_ section.
This proposal takes a conservative approach in requiring only two
variables. As more become appropriate, they may be added with discretion.
variables. As more become appropriate, they may be added with
discretion.
Required Variables
--------------------
These are variables in ``sys.implementation`` on which the standard
library would rely, meaning they would need to be defined:
library would rely, meaning implementers must define them:
name
the name of the implementation (case sensitive).
**name**
This is the name of the implementation (case sensitive). Examples
include 'PyPy', 'Jython', 'IronPython', and 'CPython'.
version
the version of the implementation, as opposed to the version of the
language it implements. This would use a standard format, similar to
``sys.version_info`` (see `Version Format`_).
**version**
This is the version of the implementation, as opposed to the
version of the language it implements. This value conforms to the
format described in `Version Format`_.
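As a minimal sketch of how a consumer might check for these required
variables, assume a plain dict standing in for the proposed
``sys.implementation`` mapping. The names ``REQUIRED_VARIABLES`` and
``missing_variables``, and the sample values, are hypothetical and used
only for illustration:

```python
# Hypothetical stand-in for the proposed mapping; nothing here is part
# of the proposal itself.
REQUIRED_VARIABLES = ("name", "version")


def missing_variables(implementation):
    """Return the required variables that the mapping does not define."""
    return [key for key in REQUIRED_VARIABLES if key not in implementation]


# Illustrative values only.
complete = {"name": "CPython", "version": (3, 3, 0, "alpha", 2)}
incomplete = {"name": "Jython"}

print(missing_variables(complete))
print(missing_variables(incomplete))
```

An implementation that defines extra variables would still pass such a
check, since only the two required names are examined.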
Other Possible Variables
------------------------
These variables could be useful, but don't necessarily have a clear use
case presently:
These variables could be useful, but don't necessarily have a clear
use case presently. They are listed here as values to consider in the
future, with appropriate discussion, relative to clear use-cases.
Their descriptions are therefore intentionally unhindered by details.
cache_tag
a string used for the PEP 3147 cache tag (e.g. 'cpython33' for
CPython 3.3). The name and version from above could be used to
compose this, though an implementation may want something else.
However, module caching is not a requirement of implementations, nor
is the use of cache tags.
**cache_tag**
A string used for the PEP 3147 cache tag (e.g. 'cpython33' for
CPython 3.3). The name and version from above could be used to
compose this, though an implementation may want something else.
However, module caching is not a requirement of implementations,
nor is the use of cache tags.
repository
the implementation's repository URL.
**vcs_url**
The URL pointing to the main VCS repository for the implementation
project.
repository_revision
the revision identifier for the implementation.
**vcs_revision_id**
A value that identifies the VCS revision of the implementation
that is currently running.
build_toolchain
identifies the tools used to build the interpreter.
**build_toolchain**
Identifies the tools used to build the interpreter.
url (or website)
the URL of the implementation's site.
**homepage**
The URL of the implementation's website.
site_prefix
the preferred site prefix for this implementation.
**site_prefix**
The preferred site prefix for this implementation.
runtime
the run-time environment in which the interpreter is running.
**runtime**
The run-time environment in which the interpreter is running, as
in "Common Language *Runtime*" (.NET CLR) or "Java *Runtime*
Environment" (JRE).
gc_type
the type of garbage collection used.
**gc_type**
The type of garbage collection used, like "reference counting" or
"mark and sweep".
Version Format
--------------
XXX same as sys.version_info?
A main point of ``sys.implementation`` is to contain information that
will be used internally in the standard library. For a version
variable to be useful there, its value should be in a consistent
format across implementations.
XXX Subject to feedback
As such, the format of ``sys.implementation['version']`` must follow
that of ``sys.version_info``, which is effectively a named tuple. It
is a familiar format and generally consistent with normal version
format conventions.
Keep in mind, however, that ``sys.implementation['version']`` is the
version of the Python *implementation*, while ``sys.version_info``
(and friends) is the version of the Python language.
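The distinction between the two versions can be sketched as follows.
This assumes a dict standing in for ``sys.implementation`` and a named
tuple with the same field names as ``sys.version_info``. The type name
``ImplementationVersion``, the helper ``at_least``, and the version
numbers are hypothetical:

```python
from collections import namedtuple

# Same field names as sys.version_info; the type name is hypothetical.
ImplementationVersion = namedtuple(
    "ImplementationVersion", "major minor micro releaselevel serial")

# Illustrative values: an implementation at version 1.8 of its own
# release series that implements version 2.7 of the language.
implementation = {
    "name": "PyPy",
    "version": ImplementationVersion(1, 8, 0, "final", 0),
}
language_version = (2, 7, 2, "final", 0)


def at_least(implementation, major, minor):
    """Compare the implementation version like sys.version_info."""
    return implementation["version"][:2] >= (major, minor)


print(at_least(implementation, 1, 8))
print(implementation["version"] == language_version)
```

Because the value follows the ``sys.version_info`` format, the usual
tuple comparisons work unchanged on the implementation version.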
Rationale
=========
The status quo for implementation-specific information gives us that
information in a more fragile, harder to maintain way. It's spread out
over different modules or inferred from other information, as we see with
``platform.python_implementation()``.
information in a more fragile, harder to maintain way. It's spread
out over different modules or inferred from other information, as we
see with ``platform.python_implementation()``.
This PEP is the main alternative to that approach. It consolidates the
implementation-specific information into a single namespace and makes
explicit that which was implicit.
This PEP is the main alternative to that approach. It consolidates
the implementation-specific information into a single namespace and
makes explicit that which was implicit.
Why a Dictionary?
-----------------
A dictionary reflects a simple namespace. It maps names to values and
that's it. Really that's all we need.
The alternatives to a dictionary are considered separately here:
**"Named" Tuple**
The first alternative is a namedtuple or a structseq or some other
tuple type with dotted access (a la ``sys.version_info``). This type
is immutable and simple. It is a well established pattern for
implementation-specific variables in Python. Dotted access on a
namespace is also very convenient.
However, ``sys.implementation`` does not have meaning as a sequence.
Also, unlike other such variables, it has the potential for more
variation over time. Finally, generic lookup may favor dicts::
cache_tag = sys.implementation.get('cache_tag')
vs.
cache_tag = getattr(sys.implementation, 'cache_tag', None)
If a named tuple were used, we'd be very clear in the documentation
that the length and order of the value are not reliable. Iterability
would not be guaranteed.
**Concrete Class**
Another option would be to have a dedicated class, of which
``sys.implementation`` is an instance. This would facilitate the
dotted access of a "named" tuple, without exposing any confusion about
ordering and iteration.
One downside is that you lose the immutable aspect of a tuple, making
it less clear that ``sys.implementation`` should not be manipulated.
Another downside is that classes often imply the presence (or
possibility) of methods, which may be misleading in this case.
**Module**
Using a module instead of a dict is another option. It has similar
characteristics to an instance, but with a slight hint of immutability
(at least by convention). Such a module could be a stand-alone
submodule of ``sys`` or added on, like ``os.path``. Unlike a concrete
class, no new type would be necessary.
The downsides are similar to those of a concrete class.
Why a Part of ``sys``?
----------------------
The ``sys`` module should hold the new namespace because ``sys`` is
the depot for interpreter-centric variables and functions. Many
implementation-specific variables are already found in ``sys``.
Why Strict Constraints on Any of the Values?
--------------------------------------------
As already noted in `Version Format`_, values in
``sys.implementation`` are intended for use by the standard library.
Constraining those values, essentially specifying an API for them,
allows them to be used consistently, regardless of how they are
implemented otherwise.
With the single-namespace-under-sys so straightforward, no alternatives
have been considered for this PEP.
Discussion
==========
The topic of ``sys.implementation`` came up on the python-ideas list in
2009, where the reception was broadly positive [1]_. I revived the
discussion recently while working on a pure-python ``imp.get_tag()`` [2]_.
The messages in `issue #14673`_ are also relevant.
The topic of ``sys.implementation`` came up on the python-ideas list
in 2009, where the reception was broadly positive [1]_. I revived the
discussion recently while working on a pure-Python ``imp.get_tag()``
[2]_. The messages in `issue #14673`_ are also relevant.
Use-cases
@ -159,62 +249,66 @@ Use-cases
"explicit is better than implicit"
The platform module guesses the python implementation by looking for
clues in a couple different sys variables [3]_. However, this approach
is fragile. Beyond that, it's limited to those implementations that core
developers have blessed by special-casing them in the platform module.
The ``platform`` module guesses the Python implementation by looking
for clues in a couple of different ``sys`` variables [3]_. However, this
approach is fragile. Beyond that, it's limited to those
implementations that core developers have blessed by special-casing
them in the ``platform`` module.
With ``sys.implementation`` the various implementations would
*explicitly* set the values in their own version of the sys module.
*explicitly* set the values in their own version of the ``sys``
module.
Aside from the guessing, another concern is that the platform module is
part of the stdlib, which ideally would minimize implementation details
such as would be moved to ``sys.implementation``.
Aside from the guessing, another concern is that the ``platform``
module is part of the stdlib, which ideally would minimize
implementation details such as would be moved to
``sys.implementation``.
Any overlap between ``sys.implementation`` and the platform module would
simply defer to ``sys.implementation`` (with the same interface in
platform wrapping it).
Any overlap between ``sys.implementation`` and the ``platform`` module
would simply defer to ``sys.implementation`` (with the same interface
in ``platform`` wrapping it).
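A rough sketch of that deferral, under stated assumptions: the mapping
is passed in as a plain dict standing in for ``sys.implementation``, and
the function names ``guess_implementation`` and ``implementation_name``
are hypothetical. The guessing branch is a simplified imitation of the
clue-based approach, not the actual ``platform`` code:

```python
import sys


def guess_implementation():
    """Guess the implementation from clues in sys (simplified)."""
    if "IronPython" in sys.version:
        return "IronPython"
    if sys.platform.startswith("java"):
        return "Jython"
    if "PyPy" in sys.version:
        return "PyPy"
    return "CPython"


def implementation_name(implementation=None):
    """Prefer the explicit name; fall back to guessing if it is absent."""
    if implementation and "name" in implementation:
        return implementation["name"]
    return guess_implementation()


# An implementation the guessing code knows nothing about still works.
print(implementation_name({"name": "SomeNewPython"}))
```

The explicit branch needs no special-casing per implementation, which
is the point of the use case.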
Cache Tag Generation in Frozen Importlib
----------------------------------------
PEP 3147 defined the use of a module cache and cache tags for file names.
The importlib bootstrap code, frozen into the Python binary as of 3.3,
uses the cache tags during the import process. Part of the project to
bootstrap importlib has been to clean out of Lib/import.c any code that
did not need to be there.
PEP 3147 defined the use of a module cache and cache tags for file
names. The importlib bootstrap code, frozen into the Python binary as
of 3.3, uses the cache tags during the import process. Part of the
project to bootstrap importlib has been to clean out of
``Python/import.c`` any code that did not need to be there.
The cache tag defined in Lib/import.c was hard-coded to
The cache tag defined in ``Python/import.c`` was hard-coded to
``"cpython" MAJOR MINOR`` [4]_. For importlib the options are either
hard-coding it in the same way, or guessing the implementation in the
same way as does ``platform.python_implementation()``.
As long as the hard-coded tag is limited to CPython-specific code, it's
livable. However, inasmuch as other Python implementations use the
importlib code to work with the module cache, a hard-coded tag would
become a problem..
As long as the hard-coded tag is limited to CPython-specific code,
it's livable. However, inasmuch as other Python implementations use
the importlib code to work with the module cache, a hard-coded tag
would become a problem.
Directly using the platform module in this case is a non-starter. Any
module used in the importlib bootstrap must be built-in or frozen,
neither of which apply to the platform module. This is the point that
led to the recent interest in ``sys.implementation``.
Directly using the ``platform`` module in this case is a non-starter.
Any module used in the importlib bootstrap must be built-in or frozen,
neither of which applies to the ``platform`` module. This is the point
that led to the recent interest in ``sys.implementation``.
Regardless of how the implementation name is gotten, the version to use
for the cache tag is more likely to be the implementation version rather
than the language version. That implementation version is not readily
Regardless of the outcome for the implementation name used, another
problem relates to the version used in the cache tag. That version is
likely to be the implementation version rather than the language
version. However, the implementation version is not readily
identified anywhere in the standard library.
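One way the tag could be derived without hard-coding, sketched under
the assumption that ``sys.implementation`` is a dict-like mapping (a
plain dict stands in for it here). The helper name ``cache_tag`` and
the sample values are hypothetical. An explicit ``cache_tag`` entry,
one of the possible extra variables, takes precedence when present:

```python
def cache_tag(implementation):
    """Build a PEP 3147 style tag from the implementation details."""
    explicit = implementation.get("cache_tag")
    if explicit is not None:
        # The implementation chose its own tag.
        return explicit
    version = implementation["version"]
    # Lower-cased name followed by the major and minor version numbers.
    return "{}{}{}".format(
        implementation["name"].lower(), version[0], version[1])


# Illustrative values only.
cpython = {"name": "CPython", "version": (3, 3, 0, "alpha", 2)}
print(cache_tag(cpython))  # cpython33
```

Note that the version used is the implementation version, not the
language version.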
Implementation-Specific Tests
-----------------------------
XXX
http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l509
http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1246
http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1252
http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1275
Currently there are a number of implementation-specific tests in the
test suite under ``Lib/test``. The test support module
(`Lib/test/support.py`_) provides some functionality for dealing with
these tests. However, like the ``platform`` module, ``test.support``
must do some guessing that ``sys.implementation`` would render
unnecessary.
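A sketch of how such tests might be gated on an explicit name rather
than a guess. The decorator ``impl_only``, the test class, and the dict
standing in for ``sys.implementation`` are all hypothetical; only
``unittest.skipUnless`` is an existing standard library call:

```python
import unittest

# Hypothetical stand-in for the proposed mapping.
implementation = {"name": "CPython"}


def impl_only(name, implementation=implementation):
    """Skip the decorated test unless running on the named implementation."""
    return unittest.skipUnless(
        implementation["name"] == name,
        "requires the {} implementation".format(name))


class ImplementationTests(unittest.TestCase):

    @impl_only("CPython")
    def test_runs_on_cpython(self):
        self.assertTrue(True)

    @impl_only("Jython")
    def test_runs_on_jython(self):
        self.assertTrue(True)
```

With the stand-in above, the first test runs and the second is reported
as skipped.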
Jython's ``os.name`` Hack
@ -225,14 +319,8 @@ XXX
http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l512
Impact on CPython
=================
XXX
Feedback From Other Python Implementators
=========================================
Feedback From Other Python Implementers
=======================================
IronPython
----------
@ -253,28 +341,62 @@ XXX
Past Efforts
============
XXX PEP 3139
XXX PEP 399
PEP 3139
--------
This PEP from 2008 recommended a clean-up of the ``sys`` module in
part by extracting implementation-specific variables and functions
into a separate module. While PEP 3139 was rejected, its goals are
reflected to a large extent in PEP 421, which takes a much lighter
approach to the same idea.
PEP 399
-------
This informational PEP dictates policy regarding the standard library,
helping to make it friendlier to alternate implementations. PEP 421
is proposed in that same spirit.
Alternatives
============
With the single-namespace-under-sys so straightforward, no
alternatives have been considered for this PEP.
Open Issues
===========
* What are the long-term objectives for sys.implementation?
* What are the long-term objectives for ``sys.implementation``?
- pull in implementation detail from the main sys namespace and
elsewhere (PEP 3137 lite).
- possibly pull in implementation details from the main ``sys``
namespace and elsewhere (PEP 3139 lite).
* Alternatives to the approach dictated by this PEP?
* ``sys.implementation`` as a proper namespace rather than a dict. It
would be its own module or an instance of a concrete class.
* Do we really want to commit to using a dict for
``sys.implementation``?
Backward compatibility issues will make it difficult to change our
minds later.
The type we use ultimately depends on how general we expect the
consumption of ``sys.implementation`` to be. If its practicality is
oriented toward internal use then the data structure is not as
critical. However, ``sys.implementation`` is intended to have a
non-localized impact across the standard library and the interpreter.
It's better to *not* make hacking it become an attractive nuisance,
regardless of our intentions for usage.
* use (immutable?) nameddict (analogous to namedtuple/structseq)?
Implementation
==============
The implementatation of this PEP is covered in `issue #14673`_.
The implementation of this PEP is covered in `issue #14673`_.
References
@ -288,8 +410,17 @@ References
.. [4] http://hg.python.org/cpython/file/2f563908ebc5/Python/import.c#l121
.. [5] Examples of implementation-specific handling in test.support:
| http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l509
| http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1246
| http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1252
| http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1275
.. _issue #14673: http://bugs.python.org/issue14673
.. _Lib/test/support.py: http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py
Copyright
=========