PEP: 421
Title: Adding sys.implementation
Version: $Revision$
Last-Modified: $Date$
Author: Eric Snow <ericsnowcurrently@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 26-April-2012
Post-History: 26-April-2012


Abstract
========

This PEP introduces a new attribute for the ``sys`` module:
``sys.implementation``. The attribute holds consolidated information
about the implementation of the running interpreter. Thus
``sys.implementation`` is the source to which the standard library may
look for implementation-specific information.

The proposal in this PEP is in line with a broader emphasis on making
Python friendlier to alternate implementations. It describes the new
variable and the constraints on what that variable contains. The PEP
also explains some immediate use cases for ``sys.implementation``.


Motivation
==========

For a number of years now, the distinction between Python-the-language
and CPython (the reference implementation) has been growing. Most of
this change is due to the emergence of Jython, IronPython, and PyPy as
viable alternate implementations of Python.

Consider, however, the nearly two decades of CPython-centric Python
(i.e. most of its existence). That focus has understandably
contributed to quite a few CPython-specific artifacts, both in the
standard library and exposed in the interpreter. Though the core
developers have made an effort in recent years to address this, quite
a few of the artifacts remain.

Part of the solution is presented in this PEP: a single namespace in
which to consolidate implementation specifics. This will help focus
efforts to differentiate the implementation specifics from the
language. Additionally, it will foster a multiple-implementation
mindset.


Proposal
========

We will add a new attribute to the ``sys`` module, called
``sys.implementation``, as an instance of a new type to contain
implementation-specific information.

The attributes of this object will remain fixed during interpreter
execution and through the course of an implementation version. This
ensures that behaviors which depend on variables in
``sys.implementation`` do not change between versions.

The object will have each of the attributes described in the `Required
Variables`_ section below. Any other per-implementation values may be
stored in ``sys.implementation.metadata``. However, nothing in the
standard library will rely on ``sys.implementation.metadata``.
Examples of possible metadata values are described in the `Example
Metadata Values`_ section.

This proposal takes a conservative approach in requiring only four
variables. As more become appropriate, they may be added with
discretion.


Required Variables
------------------

These are the variables in ``sys.implementation`` that implementers
must define. The standard library would rely on all of them except
``metadata``:

**name**
   This is the common name of the implementation (case sensitive).
   Examples include 'PyPy', 'Jython', 'IronPython', and 'CPython'.

**version**
   This is the version of the implementation, as opposed to the
   version of the language it implements. This value conforms to the
   format described in `Version Format`_.

**cache_tag**
   A string used for the PEP 3147 cache tag [#cachetag]_. It would
   normally be a composite of the name and version (e.g. 'cpython-33'
   for CPython 3.3). However, an implementation may explicitly use a
   different cache tag. If ``cache_tag`` is set to ``None``, it
   indicates that module caching should be disabled.

**metadata**
   Any other values that an implementation wishes to specify,
   particularly informational ones. Neither the standard library nor
   the language specification will rely on implementation metadata.
   Also see the list of `Example Metadata Values`_.
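
As a rough illustration of the four required variables, the following
sketch builds a stand-in object. The class name ``Implementation``,
the tuple type ``VersionInfo``, and every value shown are assumptions
made for this example; none of them is specified by this PEP::

    from collections import namedtuple

    # Hypothetical version type, mirroring the fields of sys.version_info.
    VersionInfo = namedtuple(
        'VersionInfo', 'major minor micro releaselevel serial')


    class Implementation:
        """Hypothetical stand-in for the type of sys.implementation."""

        def __init__(self, name, version, cache_tag, metadata=None):
            self.name = name
            self.version = version
            self.cache_tag = cache_tag
            # Optional, informational values live in a plain dict.
            self.metadata = dict(metadata or {})


    # Illustrative values only, as CPython 3.3 might define them.
    implementation = Implementation(
        name='CPython',
        version=VersionInfo(3, 3, 0, 'final', 0),
        cache_tag='cpython-33',
        metadata={'gc_type': 'reference counting'},
    )

    print(implementation.name, implementation.cache_tag)

Only ``name``, ``version``, and ``cache_tag`` would be consulted by the
standard library; the ``metadata`` dict is purely informational.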


Adding New Required Attributes
------------------------------

XXX PEP? something lighter?


Version Format
--------------

A main point of ``sys.implementation`` is to contain information that
will be used internally in the standard library. To make a version
variable useful, its value should be in a consistent format across
implementations.

As such, the format of ``sys.implementation.version`` must follow that
of ``sys.version_info``, which is effectively a named tuple. It is a
familiar format and generally consistent with normal version format
conventions.

XXX The following is not exactly true:

Keep in mind, however, that ``sys.implementation.version`` is the
version of the Python *implementation*, while ``sys.version_info``
(and friends) is the version of the Python language.
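
Because the value follows the ``sys.version_info`` layout, ordinary
tuple comparison applies to it. A minimal sketch, in which the
``VersionInfo`` type and the version number are made up for
illustration::

    from collections import namedtuple

    # Hypothetical type with the same five fields as sys.version_info.
    VersionInfo = namedtuple(
        'VersionInfo', 'major minor micro releaselevel serial')

    # Made-up implementation version, e.g. for a "1.8.0 final" release.
    impl_version = VersionInfo(1, 8, 0, 'final', 0)

    # Comparisons work element by element, as with sys.version_info.
    is_at_least_1_8 = impl_version >= (1, 8)
    is_2_series = impl_version.major == 2

    print(is_at_least_1_8, is_2_series)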


Example Metadata Values
-----------------------

These are the sorts of values an implementation may put into
``sys.implementation.metadata``. However, these names and
descriptions are only examples and are not being proposed here. If
they later have meaningful use cases, they can be added by following
the process described in `Adding New Required Attributes`_.

**vcs_url**
   The URL pointing to the main VCS repository for the implementation
   project.

**vcs_revision_id**
   A value that identifies the VCS revision of the implementation that
   is currently running.

**build_toolchain**
   Identifies the tools used to build the interpreter.

**build_date**
   The timestamp of when the interpreter was built.

**homepage**
   The URL of the implementation's website.

**site_prefix**
   The preferred site prefix for this implementation.

**runtime**
   The run-time environment in which the interpreter is running, as
   in "Common Language *Runtime*" (.NET CLR) or "Java *Runtime*
   Environment".

**gc_type**
   The type of garbage collection used, like "reference counting" or
   "mark and sweep".


Rationale
=========

Today, implementation-specific information is exposed in a fragile,
hard-to-maintain way. It is spread out over different modules or
inferred from other information, as we see with
`platform.python_implementation()`_.

This PEP is the main alternative to that approach. It consolidates
the implementation-specific information into a single namespace and
makes explicit that which was implicit.


Why a Custom Type?
------------------

A dedicated class, of which ``sys.implementation`` is an instance, would
facilitate the dotted access of a "named" tuple. At the same time, it
allows us to avoid the problems of the other approaches (see below),
like confusion about ordering and iteration.

The alternatives to a custom type are considered separately here:

**Dictionary**

A dictionary reflects a simple namespace with item access. It
maps names to values and that's all. It also reflects the more variable
nature of ``sys.implementation``.

However, a simple dictionary does not set expectations very well about
the nature of ``sys.implementation``. The custom type approach, with
a fixed set of required attributes, does a better job of this.

**Named Tuple**

Another close alternative is a namedtuple or a structseq or some other
tuple type with dotted access (a la ``sys.version_info``). This type
is immutable and simple. It is a well established pattern for
implementation-specific variables in Python. Dotted access on a
namespace is also very convenient.

Fallback lookup may favor dicts::

   cache_tag = sys.implementation.get('cache_tag')

versus::

   cache_tag = getattr(sys.implementation, 'cache_tag', None)

However, this is mitigated by having ``sys.implementation.metadata``.

One problem with using a named tuple is that ``sys.implementation`` does
not have meaning as a sequence. Also, unlike other similar ``sys``
variables, it has a far greater potential to change over time.

If a named tuple were used, we'd be very clear in the documentation
that the length and order of the value are not reliable. Iterability
would not be guaranteed.
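
The ordering concern can be sketched with two hypothetical layouts of
the same named tuple. The type names ``ImplV1`` and ``ImplV2``, their
field sets, and the values are all invented for this example::

    from collections import namedtuple

    # Hypothetical layout in one release...
    ImplV1 = namedtuple('ImplV1', 'name version cache_tag')
    # ...and in a later release that inserts a new field.
    ImplV2 = namedtuple('ImplV2', 'name version hexversion cache_tag')

    v1 = ImplV1('CPython', (3, 3, 0), 'cpython-33')
    v2 = ImplV2('CPython', (3, 4, 0), 0x030400f0, 'cpython-34')

    # Positional access silently changes meaning between the layouts.
    by_index_v1 = v1[2]
    by_index_v2 = v2[2]

    # Attribute access keeps working regardless of field order.
    by_name_v1 = v1.cache_tag
    by_name_v2 = v2.cache_tag

Code that indexes or unpacks the tuple would break when a field is
inserted, while code that uses attribute names would not.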

**Module**

Using a module instead of a custom type is another option. It has
similar characteristics to an instance, but with a slight hint of
immutability (at least by convention). Such a module could be a
stand-alone sub-module of ``sys`` or added on, like ``os.path``.
Unlike a concrete class, no new type would be necessary. This is a
pretty close fit to what we need.

The downside is that the module type is much less conducive to
extension, making it more difficult to address the weaknesses of using
an instance of a concrete class.


Why metadata?
-------------

``sys.implementation.metadata`` will hold any optional, strictly
informational, or per-implementation data. This allows us to restrict
``sys.implementation`` to the required attributes. In that way, its
type can reflect the more stable namespace and
``sys.implementation.metadata`` (as a dict) can reflect the less
certain namespace.

``sys.implementation.metadata`` is the place an implementation can put
values that must be built-in, "without having to pollute the main sys
namespace" [#Nick]_.


Why a Part of ``sys``?
----------------------

The ``sys`` module should hold the new namespace because ``sys`` is
the depot for interpreter-centric variables and functions. Many
implementation-specific variables are already found in ``sys``.


Why Strict Constraints on Any of the Values?
--------------------------------------------

As already noted in `Version Format`_, values in
``sys.implementation`` are intended for use by the standard library.
Constraining those values, essentially specifying an API for them,
allows them to be used consistently, regardless of how they are
otherwise implemented.


Discussion
==========

The topic of ``sys.implementation`` came up on the python-ideas list
in 2009, where the reception was broadly positive [#original]_. I
revived the discussion recently while working on a pure-Python
``imp.get_tag()`` [#revived]_. Discussion has been ongoing
[#feedback]_. The messages in `issue #14673`_ are also relevant.


Use-cases
=========

platform.python_implementation()
--------------------------------

"explicit is better than implicit"

The ``platform`` module determines the Python implementation by
looking for clues in a couple of different ``sys`` variables [#guess]_.
However, this approach is fragile, requiring changes to the standard
library each time an implementation changes. Beyond that, support in
``platform`` is limited to those implementations that core developers
have blessed by special-casing them in the ``platform`` module.

With ``sys.implementation`` the various implementations would
*explicitly* set the values in their own version of the ``sys``
module.

Another concern is that the ``platform`` module is part of the stdlib,
which ideally would contain as few implementation details as possible.
Such details would instead move to ``sys.implementation``.

Any overlap between ``sys.implementation`` and the ``platform`` module
would simply defer to ``sys.implementation`` (with the same interface
in ``platform`` wrapping it).
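
One way such wrapping could look is sketched here. The helper name
``implementation_name`` is hypothetical, and the fallback to
``platform.python_implementation()`` is an assumption made so that the
example also runs on interpreters that lack ``sys.implementation``::

    import platform
    import sys


    def implementation_name():
        """Return the implementation name, preferring the explicit value."""
        impl = getattr(sys, 'implementation', None)
        name = getattr(impl, 'name', None)
        if name is not None:
            # The implementation stated its own name; no guessing needed.
            return name
        # Otherwise fall back to the existing heuristic.
        return platform.python_implementation()


    print(implementation_name())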


Cache Tag Generation in Frozen Importlib
----------------------------------------

PEP 3147 defined the use of a module cache and cache tags for file
names. The importlib bootstrap code, frozen into the Python binary as
of 3.3, uses the cache tags during the import process. Part of the
project to bootstrap importlib has been to clean code out of
`Python/import.c`_ that did not need to be there any longer.

The cache tag defined in ``Python/import.c`` was hard-coded to
``"cpython" MAJOR MINOR`` [#tag_impl]_. For importlib the options are
either hard-coding it in the same way, or guessing the implementation
in the same way as does ``platform.python_implementation()``.

As long as the hard-coded tag is limited to CPython-specific code, it
is livable. However, inasmuch as other Python implementations use the
importlib code to work with the module cache, a hard-coded tag would
become a problem.

Directly using the ``platform`` module in this case is a non-starter.
Any module used in the importlib bootstrap must be built-in or frozen,
neither of which applies to the ``platform`` module. This is the point
that led to the recent interest in ``sys.implementation``.

Regardless of the outcome for the implementation name used, another
problem relates to the version used in the cache tag. That version is
likely to be the implementation version rather than the language
version. However, the implementation version is not readily
identified anywhere in the standard library.
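
A sketch of how import machinery might consume a cache tag follows.
The function ``cached_path`` is hypothetical and only approximates the
``__pycache__`` layout from PEP 3147. The tag is passed in explicitly
so that nothing here depends on a particular interpreter::

    import os.path


    def cached_path(source_path, cache_tag):
        """Return a PEP 3147 style cache path, or None if caching is off."""
        if cache_tag is None:
            # A cache_tag of None means module caching is disabled.
            return None
        directory, filename = os.path.split(source_path)
        base, _ext = os.path.splitext(filename)
        return os.path.join(
            directory, '__pycache__', '{}.{}.pyc'.format(base, cache_tag))


    print(cached_path(os.path.join('pkg', 'spam.py'), 'cpython-33'))
    print(cached_path(os.path.join('pkg', 'spam.py'), None))

With the tag supplied by the implementation itself, the same code would
serve any implementation without special cases.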


Implementation-Specific Tests
-----------------------------

Currently there are a number of implementation-specific tests in the
test suite under ``Lib/test``. The test support module
(`Lib/test/support.py`_) provides some functionality for dealing with
these tests [#tests]_. However, like the ``platform`` module,
``test.support`` must do some guessing that ``sys.implementation``
would render unnecessary.
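
A hedged sketch of what a guess-free helper could look like is given
here. ``requires_implementation`` is a hypothetical decorator, not an
existing ``test.support`` API, and the case-insensitive comparison of
names is an assumption of this example::

    import sys
    import unittest


    def requires_implementation(name):
        """Skip a test unless it runs on the named implementation."""
        impl = getattr(sys, 'implementation', None)
        current = getattr(impl, 'name', '')
        return unittest.skipUnless(
            current.lower() == name.lower(),
            'requires the {} implementation'.format(name))


    class RefcountTests(unittest.TestCase):

        @requires_implementation('CPython')
        def test_getrefcount_exists(self):
            # sys.getrefcount() is a CPython implementation detail.
            self.assertTrue(hasattr(sys, 'getrefcount'))

On any other implementation the test is reported as skipped rather than
failed.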


Jython's ``os.name`` Hack
-------------------------

In Jython, ``os.name`` is set to 'java' to accommodate special
treatment of the Java environment in the standard library [#os_name]_
[#javatest]_. Unfortunately it masks the OS name that would otherwise
go there. ``sys.implementation`` would help obviate the need for this
special case.


Feedback From Other Python Implementers
=======================================

IronPython
----------

XXX

Jython
------

XXX

PyPy
----

XXX


Past Efforts
============

PEP 3139
--------

This PEP from 2008 recommended a clean-up of the ``sys`` module in
part by extracting implementation-specific variables and functions
into a separate module. While PEP 3139 was rejected, its goals are
reflected in PEP 421 to a large extent, though with a much lighter
approach.


PEP 399
-------

This informational PEP dictates policy regarding the standard library,
helping to make it friendlier to alternate implementations. PEP 421
is proposed in that same spirit.


Alternatives
============

Since the single-namespace-under-sys approach is relatively
straightforward, no alternatives have been considered for this PEP.


Open Issues
===========

* What are the long-term objectives for ``sys.implementation``?

  - possibly pull in implementation details from the main ``sys``
    namespace and elsewhere (PEP 3139 lite).

* What is the process for introducing new required variables? PEP?

* Is the ``sys.version_info`` format the right one here?

* Should ``sys.implementation.hexversion`` be part of the PEP?

* Does ``sys.(version|version_info|hexversion)`` need to better
  reflect the version of the language spec? Micro version, series,
  and release seem like implementation-specific values.

* Are there alternatives to the approach dictated by this PEP?

* Do we really want to commit to using a custom type for
  ``sys.implementation``?

  Backward compatibility issues will make it difficult to change our
  minds later.

  The type we use ultimately depends on how general we expect the
  consumption of ``sys.implementation`` to be. If its practicality is
  oriented toward internal use then the data structure is not as
  critical. However, ``sys.implementation`` is intended to have a
  non-localized impact across the standard library and the
  interpreter. It is better *not* to make hacking it an attractive
  nuisance, regardless of our intentions for usage.

* Should ``sys.implementation`` and its values be immutable? A benefit
  of an immutable type is that it communicates that the value is not
  expected to change and should not be manipulated.

* Should ``sys.implementation`` be strictly disallowed to have methods?
  Classes often imply the presence (or possibility) of methods, which
  may be misleading in this case.

* Should ``sys.implementation`` implement the
  ``collections.abc.Mapping`` interface?


Implementation
==============

The implementation of this PEP is covered in `issue #14673`_.


References
==========

.. [#original] The 2009 sys.implementation discussion:
   http://mail.python.org/pipermail/python-dev/2009-October/092893.html

.. [#revived] The initial 2012 discussion:
   http://mail.python.org/pipermail/python-ideas/2012-April/014878.html

.. [#feedback] Feedback on the PEP:
   http://mail.python.org/pipermail/python-ideas/2012-April/014954.html

.. [#guess] The ``platform`` code which divines the implementation name:
   http://hg.python.org/cpython/file/2f563908ebc5/Lib/platform.py#l1247

.. [#cachetag] The definition for cache tags in PEP 3147:
   http://www.python.org/dev/peps/pep-3147/#id53

.. [#tag_impl] The original implementation of the cache tag in CPython:
   http://hg.python.org/cpython/file/2f563908ebc5/Python/import.c#l121

.. [#tests] Examples of implementation-specific handling in test.support:

   * http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l509
   * http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1246
   * http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1252
   * http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1275

.. [#os_name] The standard library entry for os.name:
   http://docs.python.org/3.3/library/os.html#os.name

.. [#javatest] The use of ``os.name`` as 'java' in the stdlib test suite:
   http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l512

.. [#Nick] Nick Coghlan's proposal for ``sys.implementation.metadata``:
   http://mail.python.org/pipermail/python-ideas/2012-May/014984.html

.. _issue #14673: http://bugs.python.org/issue14673

.. _Lib/test/support.py: http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py

.. _Python/import.c: http://hg.python.org/cpython/file/2f563908ebc5/Python/import.c


Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: