PEP 421: edits from Eric.

Georg Brandl 2012-05-06 11:17:25 +02:00
parent ddc930b416
commit b5d1d94750
1 changed files with 211 additions and 129 deletions


@ -50,82 +50,61 @@ mindset.
Proposal
========
We will add a new variable to the ``sys`` module, called
``sys.implementation``, as a mapping to contain
We will add a new attribute to the ``sys`` module, called
``sys.implementation``, as an instance of a new type to contain
implementation-specific information.
The contents of this mapping will remain fixed during interpreter
The attributes of this object will remain fixed during interpreter
execution and through the course of an implementation version. This
ensures behaviors don't change between versions which depend on
variables in ``sys.implementation``.
The mapping will contain at least the values described in the
`Required Variables`_ section below. However, implementations are
free to add other implementation information there. Some
*conceivable* extra variables are described in the `Other Possible
Variables`_ section.
The object will have each of the attributes described in the `Required
Variables`_ section below. Any other per-implementation values may be
stored in ``sys.implementation.metadata``. However, nothing in the
standard library will rely on ``sys.implementation.metadata``.
Examples of possible metadata values are described in the `Example
Metadata Values`_ section.
This proposal takes a conservative approach in requiring only two
This proposal takes a conservative approach in requiring only four
variables. As more become appropriate, they may be added with
discretion.
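As a rough illustration of the proposal above, the following sketch reads the four attributes this draft requires. It assumes an interpreter that already provides ``sys.implementation`` (Python 3.3 or later). ``describe_implementation`` is a hypothetical helper, not part of the proposal, and the ``getattr`` defaults are there because a given interpreter may not define every attribute (``metadata`` in particular).

```python
import sys


def describe_implementation(impl=None):
    """Return the draft's four required attributes as a plain dict.

    Hypothetical helper: attributes the interpreter does not define
    fall back to a default instead of raising AttributeError.
    """
    if impl is None:
        impl = sys.implementation
    return {
        "name": getattr(impl, "name", None),
        "version": getattr(impl, "version", None),
        "cache_tag": getattr(impl, "cache_tag", None),
        "metadata": getattr(impl, "metadata", {}),
    }


info = describe_implementation()
print(info["name"], info["cache_tag"])
```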
Required Variables
--------------------
------------------
These are variables in ``sys.implementation`` on which the standard
library would rely, meaning implementers must define them:
library would rely (with the exception of ``metadata``), meaning
implementers must define them:
**name**
This is the name of the implementation (case sensitive). Examples
include 'PyPy', 'Jython', 'IronPython', and 'CPython'.
This is the common name of the implementation (case sensitive).
Examples include 'PyPy', 'Jython', 'IronPython', and 'CPython'.
**version**
This is the version of the implementation, as opposed to the
version of the language it implements. This value conforms to the
format described in `Version Format`_.
Other Possible Variables
------------------------
These variables could be useful, but don't necessarily have a clear
use case presently. They are listed here as values to consider in the
future, with appropriate discussion, relative to clear use-cases.
Their descriptions are therefore intentionally unhindered by details.
This is the version of the implementation, as opposed to the
version of the language it implements. This value conforms to the
format described in `Version Format`_.
**cache_tag**
A string used for the PEP 3147 cache tag (e.g. 'cpython33' for
CPython 3.3). The name and version from above could be used to
compose this, though an implementation may want something else.
However, module caching is not a requirement of implementations,
nor is the use of cache tags.
A string used for the PEP 3147 cache tag [#cachetag]_. It would
normally be a composite of the name and version (e.g. 'cpython-33'
for CPython 3.3). However, an implementation may explicitly use a
different cache tag. If ``cache_tag`` is set to None, it indicates
that module caching should be disabled.
**vcs_url**
The URL pointing to the main VCS repository for the implementation
project.
**metadata**
Any other values that an implementation wishes to specify,
particularly informational ones. Neither the standard library nor
the language specification will rely on implementation metadata.
Also see the list of `Example Metadata Values`_.
**vcs_revision_id**
A value that identifies the VCS revision of the implementation
that is currently running.
Adding New Required Attributes
------------------------------
**build_toolchain**
Identifies the tools used to build the interpreter.
**homepage**
The URL of the implementation's website.
**site_prefix**
The preferred site prefix for this implementation.
**runtime**
The run-time environment in which the interpreter is running, as
in "Common Language *Runtime*" (.NET CLR) or "Java *Runtime*
Environment".
**gc_type**
The type of garbage collection used, like "reference counting" or
"mark and sweep".
XXX PEP? something lighter?
Version Format
@ -136,82 +115,143 @@ will be used internally in the standard library. In order to
facilitate the usefulness of a version variable, its value should be
in a consistent format across implementations.
XXX Subject to feedback
As such, the format of ``sys.implementation.version`` must follow that
of ``sys.version_info``, which is effectively a named tuple. It is a
familiar format and generally consistent with normal version format
conventions.
As such, the format of ``sys.implementation['version']`` must follow
that of ``sys.version_info``, which is effectively a named tuple. It
is a familiar format and generally consistent with normal version
format conventions.
XXX The following is not exactly true:
Keep in mind, however, that ``sys.implementation['version']`` is the
Keep in mind, however, that ``sys.implementation.version`` is the
version of the Python *implementation*, while ``sys.version_info``
(and friends) is the version of the Python language.
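Because the version value follows the ``sys.version_info`` format, ordinary tuple comparison and dotted field access should both work on it. The sketch below assumes as much; ``ImplVersion`` is a hypothetical stand-in type and the version numbers are made up for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in with the same field names as sys.version_info.
ImplVersion = namedtuple(
    "ImplVersion", ["major", "minor", "micro", "releaselevel", "serial"]
)

# Made-up implementation version, unrelated to any real release.
impl_version = ImplVersion(1, 8, 0, "final", 0)

# Tuple comparison against a shorter prefix, as is common with
# sys.version_info.
at_least_1_8 = impl_version >= (1, 8)
older_than_2 = impl_version < (2,)
```

The same comparisons would apply to ``sys.implementation.version`` if it keeps this format.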
Example Metadata Values
-----------------------
These are the sorts of values an implementation may put into
``sys.implementation.metadata``. However, these names and
descriptions are only examples and are not being proposed here. If
they later have meaningful use cases, they can be added by following
the process described in `Adding New Required Attributes`_.
**vcs_url**
The URL pointing to the main VCS repository for the implementation
project.
**vcs_revision_id**
A value that identifies the VCS revision of the implementation that
is currently running.
**build_toolchain**
Identifies the tools used to build the interpreter.
**build_date**
The timestamp of when the interpreter was built.
**homepage**
The URL of the implementation's website.
**site_prefix**
The preferred site prefix for this implementation.
**runtime**
The run-time environment in which the interpreter is running, as
in "Common Language *Runtime*" (.NET CLR) or "Java *Runtime*
Environment".
**gc_type**
The type of garbage collection used, like "reference counting" or
"mark and sweep".
Rationale
=========
The status quo for implementation-specific information gives us that
information in a more fragile, harder to maintain way. It's spread
information in a more fragile, harder to maintain way. It is spread
out over different modules or inferred from other information, as we
see with ``platform.python_implementation()``.
see with `platform.python_implementation()`_.
This PEP is the main alternative to that approach. It consolidates
the implementation-specific information into a single namespace and
makes explicit that which was implicit.
Why a Dictionary?
-----------------
Why a Custom Type?
------------------
A dictionary reflects a simple namespace. It maps names to values and
that's it. Really that's all we need.
A dedicated class, of which ``sys.implementation`` is an instance, would
facilitate the dotted access of a "named" tuple. At the same time, it
allows us to avoid the problems of the other approaches (see below),
like confusion about ordering and iteration.
The alternatives to a dictionary are considered separately here:
**"Named" Tuple**
**Dictionary**
The first alternative is a namedtuple or a structseq or some other
A dictionary reflects a simple namespace with item access. It
maps names to values and that's all. It also reflects the more variable
nature of ``sys.implementation``.
However, a simple dictionary does not set expectations very well about
the nature of ``sys.implementation``. The custom type approach, with
a fixed set of required attributes, does a better job of this.
**Named Tuple**
Another close alternative is a namedtuple or a structseq or some other
tuple type with dotted access (a la ``sys.version_info``). This type
is immutable and simple. It is a well established pattern for
implementation- specific variables in Python. Dotted access on a
implementation-specific variables in Python. Dotted access on a
namespace is also very convenient.
However, sys.implementation does not have meaning as a sequence.
Also, unlike other such variables, it has the potential for more
variation over time. Finally, generic lookup may favor dicts::
Fallback lookup may favor dicts::
cache_tag = sys.implementation.get('cache_tag')
cache_tag = sys.implementation.get('cache_tag')
vs.
vs.
cache_tag = getattr(sys.implementation, 'cache_tag', None)
cache_tag = getattr(sys.implementation, 'cache_tag', None)
However, this is mitigated by having ``sys.implementation.metadata``.
One problem with using a named tuple is that ``sys.implementation`` does
not have meaning as a sequence. Also, unlike other similar ``sys``
variables, it has a far greater potential to change over time.
If a named tuple were used, we'd be very clear in the documentation
that the length and order of the value are not reliable. Iterability
would not be guaranteed.
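The difference between the two approaches can be sketched as follows. This is only an illustration: ``ImplTuple`` is a hypothetical named tuple, and ``types.SimpleNamespace`` stands in for the dedicated type, which this text does not specify.

```python
from collections import namedtuple
from types import SimpleNamespace

# Hypothetical named tuple carrying two of the proposed attributes.
ImplTuple = namedtuple("ImplTuple", ["name", "cache_tag"])
as_tuple = ImplTuple("CPython", "cpython-33")

# Stand-in for an instance of a dedicated type with dotted access only.
as_object = SimpleNamespace(name="CPython", cache_tag="cpython-33")

# A named tuple exposes length and position, which callers may come
# to rely on.
tuple_length = len(as_tuple)
first_item = as_tuple[0]

# The plain object offers dotted access but is not a sequence.
try:
    iter(as_object)
    object_is_iterable = True
except TypeError:
    object_is_iterable = False
```

Both objects answer ``.name`` and ``.cache_tag`` identically; only the tuple also commits to an ordering.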
**Concrete Class**
Another option would be to have a dedicated class, of which
``sys.implementation`` is an instance. This would facilitate the
dotted access of a "named" tuple, without exposing any confusion about
ordering and iteration.
One downside is that you lose the immutable aspect of a tuple, making
it less clear that ``sys.implementation`` should not be manipulated.
Another downside is that classes often imply the presence (or
possibility) of methods, which may be misleading in this case.
**Module**
Using a module instead of a dict is another option. It has similar
characteristics to an instance, but with a slight hint of immutability
(at least by convention). Such a module could be a stand-alone sub-
module of ``sys`` or added on, like ``os.path``. Unlike a concrete
class, no new type would be necessary.
class, no new type would be necessary. This is a pretty close fit to
what we need.
The downsides are similar to those of a concrete class.
The downside is that the module type is much less conducive to
extension, making it more difficult to address the weaknesses of using
an instance of a concrete class.
Why metadata?
-------------
``sys.implementation.metadata`` will hold any optional, strictly-
informational, or per-implementation data. This allows us to restrict
``sys.implementation`` to the required attributes. In that way, its
type can reflect the more stable namespace and
``sys.implementation.metadata`` (as a dict) can reflect the less
certain namespace.
``sys.implementation.metadata`` is the place an implementation can put
values that must be built-in, "without having to pollute the main sys
namespace" [#Nick]_.
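A minimal sketch of the split described above, assuming ``metadata`` is a plain dict. The object below is a hypothetical stand-in built with ``types.SimpleNamespace``, and every value in it, including the URL, is a placeholder.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the object proposed in this draft.
impl = SimpleNamespace(
    name="CPython",
    version=(3, 3, 0, "alpha", 3),
    cache_tag="cpython-33",
    metadata={"homepage": "http://example.org/"},  # placeholder value
)

# Required attributes: dotted access, expected to exist.
name = impl.name

# Optional values: dict lookup with a fallback, since nothing
# guarantees that a given key is present.
homepage = impl.metadata.get("homepage")
gc_type = impl.metadata.get("gc_type", "unknown")
```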
Why a Part of ``sys``?
@ -229,40 +269,41 @@ As already noted in `Version Format`_, values in
``sys.implementation`` are intended for use by the standard library.
Constraining those values, essentially specifying an API for them,
allows them to be used consistently, regardless of how they are
implemented otherwise.
otherwise implemented.
Discussion
==========
The topic of ``sys.implementation`` came up on the python-ideas list
in 2009, where the reception was broadly positive [1]_. I revived the
discussion recently while working on a pure-python ``imp.get_tag()``
[2]_. The messages in `issue #14673`_ are also relevant.
in 2009, where the reception was broadly positive [#original]_. I
revived the discussion recently while working on a pure-python
``imp.get_tag()`` [#revived]_. Discussion has been ongoing
[#feedback]_. The messages in `issue #14673`_ are also relevant.
Use-cases
=========
``platform.python_implementation()``
------------------------------------
platform.python_implementation()
--------------------------------
"explicit is better than implicit"
The ``platform`` module guesses the python implementation by looking
for clues in a couple different ``sys`` variables [3]_. However, this
approach is fragile. Beyond that, it's limited to those
implementations that core developers have blessed by special-casing
them in the ``platform`` module.
The ``platform`` module determines the Python implementation by
looking for clues in a couple of different ``sys`` variables [#guess]_.
However, this approach is fragile, requiring changes to the standard
library each time an implementation changes. Beyond that, support in
``platform`` is limited to those implementations that core developers
have blessed by special-casing them in the ``platform`` module.
With ``sys.implementation`` the various implementations would
*explicitly* set the values in their own version of the ``sys``
module.
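One way calling code might prefer the explicit value while still running on interpreters that lack it is sketched below. ``implementation_name`` is a hypothetical helper; ``platform.python_implementation()`` is the existing fallback. The two sources are not guaranteed to agree on capitalisation.

```python
import platform
import sys


def implementation_name():
    """Prefer the explicit sys.implementation.name, else fall back."""
    impl = getattr(sys, "implementation", None)
    if impl is not None:
        return impl.name
    # Older interpreters: rely on the guess made by the platform module.
    return platform.python_implementation()


print(implementation_name())
```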
Aside from the guessing, another concern is that the ``platform``
module is part of the stdlib, which ideally would minimize
implementation details such as would be moved to
``sys.implementation``.
Another concern is that the ``platform`` module is part of the stdlib,
which ideally would minimize implementation details of the kind that
would be moved to ``sys.implementation``.
Any overlap between ``sys.implementation`` and the ``platform`` module
would simply defer to ``sys.implementation`` (with the same interface
@ -275,18 +316,18 @@ Cache Tag Generation in Frozen Importlib
PEP 3147 defined the use of a module cache and cache tags for file
names. The importlib bootstrap code, frozen into the Python binary as
of 3.3, uses the cache tags during the import process. Part of the
project to bootstrap importlib has been to clean out of
`Python/import.c` any code that did not need to be there.
project to bootstrap importlib has been to clean code out of
`Python/import.c`_ that did not need to be there any longer.
The cache tag defined in `Python/import.c` was hard-coded to
``"cpython" MAJOR MINOR`` [4]_. For importlib the options are either
hard-coding it in the same way, or guessing the implementation in the
same way as does ``platform.python_implementation()``.
The cache tag defined in ``Python/import.c`` was hard-coded to
``"cpython" MAJOR MINOR`` [#cachetag]_. For importlib the options are
either hard-coding it in the same way, or guessing the implementation
in the same way as does ``platform.python_implementation()``.
As long as the hard-coded tag is limited to CPython-specific code,
it's livable. However, inasmuch as other Python implementations use
the importlib code to work with the module cache, a hard-coded tag
would become a problem..
As long as the hard-coded tag is limited to CPython-specific code, it
is livable. However, inasmuch as other Python implementations use the
importlib code to work with the module cache, a hard-coded tag would
become a problem.
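To make the alternative concrete, here is a sketch of deriving the tag from an implementation's name and version instead of hard-coding it. Both functions are hypothetical helpers, not importlib APIs, and treating ``None`` as "caching disabled" follows the ``cache_tag`` description in this draft.

```python
def default_cache_tag(name, version):
    """Compose a tag such as 'cpython-33' from a name and a version tuple."""
    return "{}-{}{}".format(name.lower(), version[0], version[1])


def cached_file_name(module_name, cache_tag):
    """Return a PEP 3147 style file name, or None when caching is disabled."""
    if cache_tag is None:
        return None
    return "{}.{}.pyc".format(module_name, cache_tag)


tag = default_cache_tag("CPython", (3, 3, 0, "final", 0))
print(cached_file_name("spam", tag))  # prints "spam.cpython-33.pyc"
```

With such a scheme, another implementation would only need to supply its own name and version (or its own tag) rather than patch the shared import code.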
Directly using the ``platform`` module in this case is a non-starter.
Any module used in the importlib bootstrap must be built-in or frozen,
@ -314,9 +355,11 @@ unnecessary.
Jython's ``os.name`` Hack
-------------------------
XXX
http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l512
In Jython, ``os.name`` is set to 'java' to accommodate special
treatment of the java environment in the standard library [#os_name]_
[#javatest]_. Unfortunately it masks the os name that would otherwise
go there. ``sys.implementation`` would help obviate the need for this
special case.
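A check of this kind could then be written against the implementation name rather than ``os.name``. The following is a sketch only: ``is_jython`` is a hypothetical helper, and the lowercase comparison is an assumption about how the name would be spelled.

```python
import os
import sys


def is_jython():
    """Detect Jython without depending on the os.name override."""
    impl = getattr(sys, "implementation", None)
    if impl is not None:
        return impl.name.lower() == "jython"
    # Fallback for interpreters without sys.implementation.
    return os.name == "java"
```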
Feedback From Other Python Implementers
@ -362,8 +405,8 @@ is proposed in that same spirit.
Alternatives
============
With the single-namespace-under-sys so straightforward, no
alternatives have been considered for this PEP.
Since the single-namespace-under-sys approach is relatively
straightforward, no alternatives have been considered for this PEP.
Open Issues
@ -374,6 +417,16 @@ Open Issues
- possibly pull in implementation details from the main ``sys``
namespace and elsewhere (PEP 3137 lite).
* What is the process for introducing new required variables? PEP?
* Is the ``sys.version_info`` format the right one here?
* Should ``sys.implementation.hexversion`` be part of the PEP?
* Does ``sys.(version|version_info|hexversion)`` need to better
reflect the version of the language spec? Micro version, series,
and release seem like implementation-specific values.
* Alternatives to the approach dictated by this PEP?
* Do we really want to commit to using a dict for
@ -387,10 +440,19 @@ Open Issues
oriented toward internal use then the data structure is not as
critical. However, ``sys.implementation`` is intended to have a
non-localized impact across the standard library and the
interpreter. It's better to *not* make hacking it become an
interpreter. It is better *not* to let hacking it become an
attractive nuisance, regardless of our intentions for usage.
* use (immutable?) nameddict (analogous to namedtuple/structseq)?
* Should ``sys.implementation`` and its values be immutable? A benefit
of an immutable type is it communicates that the value is not
expected to change and should not be manipulated.
* Should ``sys.implementation`` be strictly disallowed to have methods?
Classes often imply the presence (or possibility) of methods, which
may be misleading in this case.
* Should ``sys.implementation`` implement the
``collections.abc.Mapping`` interface?
Implementation
@ -402,25 +464,45 @@ The implementation of this PEP is covered in `issue #14673`_.
References
==========
.. [1] http://mail.python.org/pipermail/python-dev/2009-October/092893.html
.. [#original] The 2009 sys.implementation discussion:
http://mail.python.org/pipermail/python-dev/2009-October/092893.html
.. [2] http://mail.python.org/pipermail/python-ideas/2012-April/014878.html
.. [#revived] The initial 2012 discussion:
http://mail.python.org/pipermail/python-ideas/2012-April/014878.html
.. [3] http://hg.python.org/cpython/file/2f563908ebc5/Lib/platform.py#l1247
.. [#feedback] Feedback on the PEP:
http://mail.python.org/pipermail/python-ideas/2012-April/014954.html
.. [4] http://hg.python.org/cpython/file/2f563908ebc5/Python/import.c#l121
.. [#guess] The ``platform`` code which divines the implementation name:
http://hg.python.org/cpython/file/2f563908ebc5/Lib/platform.py#l1247
.. [5] Examples of implementation-specific handling in test.support:
.. [#cachetag] The definition for cache tags in PEP 3147:
http://www.python.org/dev/peps/pep-3147/#id53
| http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l509
| http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1246
| http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1252
| http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1275
.. [#tag_impl] The original implementation of the cache tag in CPython:
http://hg.python.org/cpython/file/2f563908ebc5/Python/import.c#l121
.. [#tests] Examples of implementation-specific handling in test.support:
* http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l509
* http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1246
* http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1252
* http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1275
.. [#os_name] The standard library entry for os.name:
http://docs.python.org/3.3/library/os.html#os.name
.. [#javatest] The use of ``os.name`` as 'java' in the stdlib test suite:
http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l512
.. [#Nick] Nick Coghlan's proposal for ``sys.implementation.metadata``:
http://mail.python.org/pipermail/python-ideas/2012-May/014984.html
.. _issue #14673: http://bugs.python.org/issue14673
.. _Lib/test/support.py: http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py
.. _Python/import.c: http://hg.python.org/cpython/file/2f563908ebc5/Python/import.c
Copyright
=========