Merge from upstream.

Carl Meyer 2012-03-05 17:08:36 -07:00
commit 322d7df21a
46 changed files with 5312 additions and 1316 deletions


@ -18,7 +18,7 @@ all: pep-0000.txt $(TARGETS)
$(TARGETS): pep2html.py $(TARGETS): pep2html.py
pep-0000.txt: $(wildcard pep-????.txt) pep-0000.txt: $(wildcard pep-????.txt) $(wildcard pep0/*.py)
$(PYTHON) genpepindex.py . $(PYTHON) genpepindex.py .
install: install:


@ -461,8 +461,7 @@ References and Footnotes
======================== ========================
.. [1] This historical record is available by the normal hg commands .. [1] This historical record is available by the normal hg commands
for retrieving older revisions. For those without direct access to for retrieving older revisions, and can also be browsed via HTTP here:
the hg repo, you can browse the current and past PEP revisions here:
http://hg.python.org/peps/ http://hg.python.org/peps/
.. [2] PEP 2, Procedure for Adding New Modules, Faassen .. [2] PEP 2, Procedure for Adding New Modules, Faassen


@ -72,7 +72,8 @@ Code lay-out
} }
- Code structure: one space between keywords like 'if', 'for' and - Code structure: one space between keywords like 'if', 'for' and
the following left paren; no spaces inside the paren; braces as the following left paren; no spaces inside the paren; braces may be
omitted where C permits but when present, they should be formatted as
shown: shown:
if (mro != NULL) { if (mro != NULL) {


@ -840,12 +840,6 @@ Programming Recommendations
Worse: if greeting is True: Worse: if greeting is True:
Rules that apply only to the standard library
- Do not use function type annotations in the standard library.
These are reserved for users and third-party modules. See
PEP 3107 and the bug 10899 for details.
References References


@ -3,8 +3,8 @@ Title: Feature Requests
Version: $Revision$ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Jeremy Hylton <jeremy@alum.mit.edu> Author: Jeremy Hylton <jeremy@alum.mit.edu>
Status: Active Status: Final
Type: Informational Type: Process
Created: 12-Sep-2000 Created: 12-Sep-2000
Post-History: Post-History:


@ -62,11 +62,10 @@ How to Make A Release
http://hg.python.org/release/ http://hg.python.org/release/
We use the following conventions in the examples below. Where a release We use the following conventions in the examples below. Where a release
number is given, it is of the form X.YaZ, e.g. 2.6a3 for Python 2.6 alpha number is given, it is of the form X.Y.ZaN, e.g. 3.3.0a3 for Python 3.3.0
3, where "a" == alpha, "b" == beta, "rc" == release candidate. If a micro alpha 3, where "a" == alpha, "b" == beta, "rc" == release candidate.
release number is used, then we'll say X.Y.MaZ.
Release tags are named "vX.YaZ". The branch name for minor release Release tags are named "vX.Y.ZaN". The branch name for minor release
maintenance branches is "X.Y". maintenance branches is "X.Y".
This helps by performing several automatic editing steps, and guides you This helps by performing several automatic editing steps, and guides you
@ -156,7 +155,7 @@ How to Make A Release
___ Bump version numbers via the release script. ___ Bump version numbers via the release script.
$ .../release/release.py --bump X.YaZ $ .../release/release.py --bump X.Y.ZaN
This automates updating various release numbers, but you will have to This automates updating various release numbers, but you will have to
modify a few files manually. If your $EDITOR environment variable is modify a few files manually. If your $EDITOR environment variable is
@ -197,9 +196,9 @@ How to Make A Release
alpha or beta releases. Note that Andrew Kuchling often takes care of alpha or beta releases. Note that Andrew Kuchling often takes care of
this. this.
___ Tag the release for X.YaZ. ___ Tag the release for X.Y.ZaN.
$ .../release/release.py --tag X.YaZ $ .../release/release.py --tag X.Y.ZaN
___ If this is a final major release, branch the tree for X.Y. ___ If this is a final major release, branch the tree for X.Y.
@ -309,10 +308,10 @@ How to Make A Release
___ Use the release script to create the source gzip and bz2 tarballs, md5 ___ Use the release script to create the source gzip and bz2 tarballs, md5
checksums, documentation tar and zip files, and gpg signature files. checksums, documentation tar and zip files, and gpg signature files.
$ .../release/release.py --export X.YaZ $ .../release/release.py --export X.Y.ZaN
This will leave all the relevant files in a subdirectory called This will leave all the relevant files in a subdirectory called
'X.YaZ/src', and the built docs in 'X.YaZ/docs' (for final releases). 'X.Y.ZaN/src', and the built docs in 'X.Y.ZaN/docs' (for final releases).
___ scp or rsync all the files to your home directory on dinsdale.python.org. ___ scp or rsync all the files to your home directory on dinsdale.python.org.
@ -361,7 +360,7 @@ How to Make A Release
Python-3.2.tgz, along with a "prev" subdirectory containing Python-3.2.tgz, along with a "prev" subdirectory containing
Python-3.2a1.msi, Python-3.2a1.tgz, Python-3.2a1.tar.bz2, etc. Python-3.2a1.msi, Python-3.2a1.tgz, Python-3.2a1.tar.bz2, etc.
___ On dinsdale, cd /data/ftp.python.org/pub/python/X.Y[.Z] ___ On dinsdale, cd /data/ftp.python.org/pub/python/X.Y.Z
creating it if necessary. Make sure it is owned by group 'webmaster' creating it if necessary. Make sure it is owned by group 'webmaster'
and group-writable. and group-writable.
@ -383,14 +382,14 @@ How to Make A Release
___ md5sum the files and make sure they got uploaded intact. ___ md5sum the files and make sure they got uploaded intact.
___ If this is a final release: Move the doc zips and tarballs to ___ If this is a final release: Move the doc zips and tarballs to
/data/ftp.python.org/pub/python/doc/X.Y[.Z] creating the directory /data/ftp.python.org/pub/python/doc/X.Y.Z creating the directory
if necessary, and adapt the "current" symlink in .../doc to point to if necessary, and adapt the "current" symlink in .../doc to point to
that directory. Note though that if you're releasing a maintenance that directory. Note though that if you're releasing a maintenance
release for an older version, don't change the current link. release for an older version, don't change the current link.
___ If this is a final release (even a maintenance release), also unpack ___ If this is a final release (even a maintenance release), also unpack
the HTML docs to the HTML docs to
/data/ftp.python.org/pub/docs.python.org/release/X.Y[.Z]. /data/ftp.python.org/pub/docs.python.org/release/X.Y.Z.
___ Let the DE check if the docs are built and work all right. ___ Let the DE check if the docs are built and work all right.
@ -513,7 +512,7 @@ How to Make A Release
___ Do the guided post-release steps with the release script. ___ Do the guided post-release steps with the release script.
$ .../release/release.py --done X.YaZ $ .../release/release.py --done X.Y.ZaN
Review and commit these changes. Review and commit these changes.
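
To make the X.Y.ZaN convention from the hunks above concrete, here is a minimal Python sketch that splits a release string into its parts and derives the tag and maintenance branch names. The regular expression and the helper names (`parse_release`, `tag_name`, `maintenance_branch`) are illustrative assumptions; they are not taken from the release script.

```python
import re

# Hypothetical pattern for the X.Y.ZaN form described above:
# X.Y.Z, optionally followed by a level marker ("a", "b", "rc") and a serial N.
RELEASE_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:(a|b|rc)(\d+))?$")

LEVELS = {"a": "alpha", "b": "beta", "rc": "release candidate"}


def parse_release(version):
    """Split a release string such as '3.3.0a3' into its components."""
    match = RELEASE_RE.match(version)
    if match is None:
        raise ValueError("not of the form X.Y.ZaN: %r" % version)
    major, minor, micro, level, serial = match.groups()
    return {
        "major": int(major),
        "minor": int(minor),
        "micro": int(micro),
        # Assumption: a string without a level marker denotes a final release.
        "level": LEVELS.get(level, "final"),
        "serial": int(serial) if serial else 0,
    }


def tag_name(version):
    """Release tags are named 'vX.Y.ZaN'."""
    parse_release(version)  # validate before building the tag
    return "v" + version


def maintenance_branch(version):
    """Maintenance branches are named 'X.Y'."""
    info = parse_release(version)
    return "%d.%d" % (info["major"], info["minor"])
```

With this sketch, the old two-part form such as `2.6a3` is rejected, which mirrors the move from X.YaZ to X.Y.ZaN in the text.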


@ -3,8 +3,8 @@ Title: Externally Maintained Packages
Version: $Revision$ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Brett Cannon <brett@python.org> Author: Brett Cannon <brett@python.org>
Status: Active Status: Final
Type: Informational Type: Process
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 30-May-2006 Created: 30-May-2006
Post-History: Post-History:
@ -13,7 +13,7 @@ Post-History:
.. warning:: No new modules are to be added to this PEP. It has been .. warning:: No new modules are to be added to this PEP. It has been
deemed dangerous to codify external maintenance of any deemed dangerous to codify external maintenance of any
code checked into Python's code repository. Code code checked into Python's code repository. Code
contributers should expect Python's development contributors should expect Python's development
methodology to be used for any and all code checked into methodology to be used for any and all code checked into
Python's code repository. Python's code repository.
@ -66,11 +66,8 @@ ElementTree
:Contact person: :Contact person:
Fredrik Lundh Fredrik Lundh
Patches should not be directly applied to Python HEAD, but instead Fredrik has ceded ElementTree maintenance to the core Python development
reported to the Python tracker [#python-tracker]_ (critical bug fixes team [#element-tree]_.
are the exception). Bugs should also be reported to the Python
tracker. Both bugs and patches should be assigned to Fredrik Lundh.
Expat XML parser Expat XML parser
---------------- ----------------
@ -83,6 +80,7 @@ Expat XML parser
:Contact person: :Contact person:
None None
Optik Optik
----- -----
@ -93,7 +91,9 @@ Optik
:Contact person: :Contact person:
Greg Ward Greg Ward
External development seems to have ceased. External development seems to have ceased. For new applications, optparse
itself has been largely superseded by argparse.
wsgiref wsgiref
------- -------
@ -104,16 +104,16 @@ wsgiref
:Contact Person: :Contact Person:
Phillip J. Eby Phillip J. Eby
Bugs and patches should pass through the Web-SIG mailing list [#web-sig]_ This module is maintained in the standard library, but significant bug
before being applied to HEAD. External maintenance seems to have reports and patches should pass through the Web-SIG mailing list
ceased. [#web-sig]_ for discussion.
References References
========== ==========
.. [#python-tracker] Python tracker .. [#element-tree] Fredrik's handing over of ElementTree
(http://sourceforge.net/tracker/?group_id=5470) (http://mail.python.org/pipermail/python-dev/2012-February/116389.html)
.. [#web-sig] Web-SIG mailing list .. [#web-sig] Web-SIG mailing list
(http://mail.python.org/mailman/listinfo/web-sig) (http://mail.python.org/mailman/listinfo/web-sig)


@ -63,9 +63,15 @@ Release Schedule
Nov 06 2008: Python 3.0rc2 released Nov 06 2008: Python 3.0rc2 released
Nov 21 2008: Python 3.0rc3 released Nov 21 2008: Python 3.0rc3 released
Dec 03 2008: Python 3.0 final released Dec 03 2008: Python 3.0 final released
Dec 04 2008: Python 2.6.1 final release Dec 04 2008: Python 2.6.1 final released
Apr 14 2009: Python 2.6.2 final release Apr 14 2009: Python 2.6.2 final released
Oct 02 2009: Python 2.6.3 final release Oct 02 2009: Python 2.6.3 final released
Oct 25 2009: Python 2.6.4 final released
Mar 19 2010: Python 2.6.5 final released
Aug 24 2010: Python 2.6.6 final released
Jun 03 2011: Python 2.6.7 final released (security-only)
Python 2.6.8 (security-only) planned for Feb 10-17 2012
See the public `Google calendar`_ See the public `Google calendar`_


@ -44,7 +44,7 @@ be inferred once and stored without changes to the signature object
representation affecting the function it represents (but this is an representation affecting the function it represents (but this is an
`Open Issues`_). `Open Issues`_).
Indirecation of signature introspection can also occur. If a Indirection of signature introspection can also occur. If a
decorator took a decorated function's signature object and set it on decorator took a decorated function's signature object and set it on
the decorating function then introspection could be redirected to what the decorating function then introspection could be redirected to what
is actually expected instead of the typical ``*args, **kwargs`` is actually expected instead of the typical ``*args, **kwargs``


@ -13,10 +13,10 @@ Python-Version: 2.7
Abstract Abstract
======== ========
This document describes the development and release schedule for Python 2.7. This document describes the development and release schedule for
The schedule primarily concerns itself with PEP-sized items. Small features may Python 2.7. The schedule primarily concerns itself with PEP-sized
be added up to and including the first beta release. Bugs may be fixed until items. Small features may be added up to and including the first beta
the final release. release. Bugs may be fixed until the final release.
Release Manager and Crew Release Manager and Crew


@ -7,7 +7,7 @@ Author: Brett Cannon <brett@python.org>,
Alexandre Vassalotti <alexandre@peadrop.com>, Alexandre Vassalotti <alexandre@peadrop.com>,
Barry Warsaw <barry@python.org>, Barry Warsaw <barry@python.org>,
Dirkjan Ochtman <dirkjan@ochtman.nl> Dirkjan Ochtman <dirkjan@ochtman.nl>
Status: Active Status: Final
Type: Process Type: Process
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 07-Nov-2008 Created: 07-Nov-2008


@ -3,7 +3,7 @@ Title: Python 3.1 Release Schedule
Version: $Revision$ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Benjamin Peterson <benjamin@python.org> Author: Benjamin Peterson <benjamin@python.org>
Status: Active Status: Final
Type: Informational Type: Informational
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 8-Feb-2009 Created: 8-Feb-2009


@ -3,7 +3,7 @@ Title: Syntax for Delegating to a Subgenerator
Version: $Revision$ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Gregory Ewing <greg.ewing@canterbury.ac.nz> Author: Gregory Ewing <greg.ewing@canterbury.ac.nz>
Status: Accepted Status: Final
Type: Standards Track Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 13-Feb-2009 Created: 13-Feb-2009


@ -18,7 +18,7 @@ package across multiple directories on disk. In current Python
versions, an algorithm to compute the package's __path__ must be versions, an algorithm to compute the package's __path__ must be
formulated. With the enhancement proposed here, the import machinery formulated. With the enhancement proposed here, the import machinery
itself will construct the list of directories that make up the itself will construct the list of directories that make up the
package. package. An implementation of this PEP is available at [1]_.
Terminology Terminology
=========== ===========
@ -94,11 +94,15 @@ declarative approach is proposed here: A directory whose name ends
with ``.pyp`` (for Python package) contains a portion of a package. with ``.pyp`` (for Python package) contains a portion of a package.
The import statement is extended so that it computes the package's The import statement is extended so that it computes the package's
``__path__`` attribute for a package named ``P`` as consisting of all ``__path__`` attribute for a package named ``P`` as consisting of
directories named ``P.pyp``, plus optionally a single directory named optionally a single directory named ``P`` containing a file
``P`` containing a file ``__init__.py``. If either of these are ``__init__.py``, plus all directories named ``P.pyp``, in the order in
found on the path of the parent package (or sys.path for a top-level which they are found in the parent package's ``__path__`` (or
package), search for additional portions of the package continues. ``sys.path``). If either of these are found, search for additional
portions of the package continues.
A directory may contain both a package in the ``P/__init__.py`` and
the ``P.pyp`` form.
No other change to the importing mechanism is made; searching modules No other change to the importing mechanism is made; searching modules
(including __init__.py) will continue to stop at the first module (including __init__.py) will continue to stop at the first module
@ -129,27 +133,22 @@ Finders need to support looking for \*.pth files in step 1 of above
algorithm. To do so, a finder used as a path hook must support a algorithm. To do so, a finder used as a path hook must support a
method: method:
finder.is_package_portion(fullname) finder.find_package_portion(fullname)
This method will be called in the same manner as find_module, and it This method will be called in the same manner as find_module, and it
must return True if there is a package portion with that name; False must return a string to be added to the package's ``__path__``.
if fullname indicates a subpackage/submodule that is not a package If the finder doesn't find a portion of the package, it shall return
portion; otherwise, it shall raise an ImportError. ``None``. Raising ``AttributeError`` from the above call will be treated
as non-conformance with this PEP, and the exception will be ignored.
All other exceptions are reported.
If any \*.pyp directories are found, but no loader was returned from A finder may report both success from ``find_module`` and from
find_module, a package is created and initialized with the path. ``find_package_portion``, allowing for both a package containing
an ``__init__.py`` and a portion of the same package.
If a loader was return, but no \*.pyp directories, load_module is
called as defined in PEP 302.
If both \*.pyp directories where found, and a loader was returned, a
new method is called on the loader:
loader.load_module_with_path(load_module, path)
where the path parameter is the list that will become the __path__
attribute of the new package.
All strings returned from ``find_package_portion``, along with all
path names of ``.pyp`` directories are added to the new package's
``__path__``.
Discussion Discussion
========== ==========
@ -181,7 +180,7 @@ noticed that Jython could easily mistake a directory that is a Java
package as being a Python package, if there is no need to declare package as being a Python package, if there is no need to declare
Python packages. Python packages.
packages can stop filling out the namespace package's __init__.py. As Packages can stop filling out the namespace package's __init__.py. As
a consequence, extend_path and declare_namespace become obsolete. a consequence, extend_path and declare_namespace become obsolete.
Namespace packages can start providing non-trivial __init__.py Namespace packages can start providing non-trivial __init__.py
@ -195,14 +194,11 @@ mechanisms. extend_path will be adjusted to this specification;
any other mechanism might cause portions to get added twice to any other mechanism might cause portions to get added twice to
__path__. __path__.
It has been proposed to also add this feature to Python 2.7. Given References
that 2.x reaches its end-of-life, it is questionable whether the ==========
addition of the feature would really do more good than harm (in having
users and tools starting to special-case 2.7). Prospective users of
this feature are encouraged to comment on this particular question. In
addition, Python 2.7 is in bug-fix mode now, so adding new features to
it would be a violation of the established policies.
.. [1] PEP 382 branch
(http://hg.python.org/features/pep-382-2#pep-382)
Copyright Copyright
========= =========
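
The revised ``__path__`` rule in the PEP 382 hunks above can be sketched in Python as follows. This is only an illustration under stated assumptions: `compute_package_path` is a hypothetical helper, it looks at plain filesystem directories only, and it leaves out the finder protocol (``find_package_portion``). Placing the regular package before the ``.pyp`` portion when both sit in the same parent entry is also an assumption, since the text does not fix that order.

```python
import os


def compute_package_path(name, parent_path):
    """Collect the __path__ entries for the package called `name`.

    Hypothetical helper: walks `parent_path` (the parent package's
    __path__, or sys.path for a top-level package) in order and gathers
    at most one regular package directory plus every `name.pyp` portion.
    """
    path = []
    have_regular_package = False
    for entry in parent_path:
        regular = os.path.join(entry, name)
        portion = os.path.join(entry, name + ".pyp")
        # At most a single directory P containing __init__.py is used.
        if (not have_regular_package
                and os.path.isfile(os.path.join(regular, "__init__.py"))):
            path.append(regular)
            have_regular_package = True
        # Every P.pyp directory contributes a portion; the search continues.
        if os.path.isdir(portion):
            path.append(portion)
    return path
```

A call such as `compute_package_path("P", sys.path)` would then return the list that becomes the new package's ``__path__`` in this simplified model.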


@ -89,7 +89,7 @@ affected by this specification. They are neither enhanced nor
deprecated. deprecated.
External libraries that operate on file names (such as GUI file External libraries that operate on file names (such as GUI file
chosers) should also encode them according to the PEP. choosers) should also encode them according to the PEP.
Discussion Discussion
========== ==========


@ -5,7 +5,7 @@ Last-Modified: $Date$
Author: Dirkjan Ochtman <dirkjan@ochtman.nl>, Author: Dirkjan Ochtman <dirkjan@ochtman.nl>,
Antoine Pitrou <solipsis@pitrou.net>, Antoine Pitrou <solipsis@pitrou.net>,
Georg Brandl <georg@python.org> Georg Brandl <georg@python.org>
Status: Active Status: Final
Type: Process Type: Process
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 25-May-2009 Created: 25-May-2009
@ -62,15 +62,6 @@ The current schedule for conversion milestones:
switch over to the new repository. switch over to the new repository.
Todo list
=========
The current list of issues to resolve at various steps in the
conversion is kept `in the pymigr repo`_.
.. _in the pymigr repo: http://hg.python.org/pymigr/file/tip/todo.txt
Transition plan Transition plan
=============== ===============
@ -158,14 +149,13 @@ Author map
In order to provide user names the way they are common in hg (in the In order to provide user names the way they are common in hg (in the
'First Last <user@example.org>' format), we need an author map to map 'First Last <user@example.org>' format), we need an author map to map
cvs and svn user names to real names and their email addresses. We cvs and svn user names to real names and their email addresses. We
have a complete version of such a map in my `migration tools have a complete version of such a map in the migration tools
repository`_. The email addresses in it might be out of date; that's repository (not publicly accessible to avoid leaking addresses to
harvesters). The email addresses in it might be out of date; that's
bound to happen, although it would be nice to try and have as many bound to happen, although it would be nice to try and have as many
people as possible review it for addresses that are out of date. The people as possible review it for addresses that are out of date. The
current version also still seems to contain some encoding problems. current version also still seems to contain some encoding problems.
.. _migration tools repository: http://hg.python.org/pymigr/
Generating .hgignore Generating .hgignore
-------------------- --------------------
@ -313,15 +303,13 @@ hgwebdir
A more or less stock hgwebdir installation should be set up. We might A more or less stock hgwebdir installation should be set up. We might
want to come up with a style to match the Python website. want to come up with a style to match the Python website.
A `small WSGI application`_ has been written that can look up A small WSGI application has been written that can look up
Subversion revisions and redirect to the appropriate hgweb page for Subversion revisions and redirect to the appropriate hgweb page for
the given changeset, regardless in which repository the converted the given changeset, regardless in which repository the converted
revision ended up (since one big Subversion repository is converted revision ended up (since one big Subversion repository is converted
into several Mercurial repositories). It can also look up Mercurial into several Mercurial repositories). It can also look up Mercurial
changesets by their hexadecimal ID. changesets by their hexadecimal ID.
.. _small WSGI application: http://hg.python.org/pymigr/file/tip/hglookup.py
roundup roundup
------- -------


@ -64,6 +64,15 @@ syntax and no additions to the builtins may be made.
No large-scale changes have been recorded yet. No large-scale changes have been recorded yet.
Bugfix Releases
===============
- 3.2.1: released July 10, 2011
- 3.2.2: released September 4, 2011
- 3.2.3: planned February 10-17, 2012
References References
========== ==========


@ -189,7 +189,7 @@ PyUnicode_2BYTE_KIND (2), or PyUnicode_4BYTE_KIND (3). PyUnicode_DATA
gives the void pointer to the data. Access to individual characters gives the void pointer to the data. Access to individual characters
should use PyUnicode_{READ|WRITE}[_CHAR]: should use PyUnicode_{READ|WRITE}[_CHAR]:
- PyUnciode_READ(kind, data, index) - PyUnicode_READ(kind, data, index)
- PyUnicode_WRITE(kind, data, index, value) - PyUnicode_WRITE(kind, data, index, value)
- PyUnicode_READ_CHAR(unicode, index) - PyUnicode_READ_CHAR(unicode, index)
@ -372,6 +372,25 @@ the iobench, stringbench, and json benchmarks see typically
slowdowns of 1% to 30%; for specific benchmarks, speedups may slowdowns of 1% to 30%; for specific benchmarks, speedups may
happen, as may significantly larger slowdowns. happen, as may significantly larger slowdowns.
In actual measurements of a Django application ([2]_), significant
reductions of memory usage could be found. For example, the storage
for Unicode objects reduced to 2216807 bytes, down from 6378540 bytes
for a wide Unicode build, and down from 3694694 bytes for a narrow
Unicode build (all on a 32-bit system). This reduction came from the
prevalence of ASCII strings in this application; out of 36,000 strings
(with 1,310,000 chars), 35713 were ASCII strings (with 1,300,000
chars). The sources for these strings were not further analysed;
many of them likely originate from identifiers in the library, and
string constants in Django's source code.
In comparison to Python 2, both Unicode and byte strings need to be
accounted for. In the test application, Unicode and byte strings combined
had a length of 2,046,000 units (bytes/chars) in 2.x, and 2,200,000
units in 3.x. On a 32-bit system, where the 2.x build used 32-bit
wchar_t/Py_UNICODE, the 2.x test used 3,620,000 bytes, and the 3.x
build 3,340,000 bytes. This reduction in 3.x using the PEP compared
to 2.x only occurs when comparing with a wide unicode build.
Porting Guidelines Porting Guidelines
================== ==================
@ -435,6 +454,8 @@ References
.. [1] PEP 393 branch .. [1] PEP 393 branch
https://bitbucket.org/t0rsten/pep-393 https://bitbucket.org/t0rsten/pep-393
.. [2] Django measurement results
http://www.dcl.hpi.uni-potsdam.de/home/loewis/djmemprof/
Copyright Copyright
========= =========
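
The memory figures quoted in the Django measurement paragraph above can be turned into relative savings with a few lines of Python. The byte counts and string counts are copied from that paragraph; the constant names and the `reduction_percent` helper are illustrative assumptions.

```python
# Figures quoted in the measurement paragraph above (32-bit system).
PEP393_BYTES = 2216807   # storage for Unicode objects with the PEP
WIDE_BYTES = 6378540     # wide Unicode build
NARROW_BYTES = 3694694   # narrow Unicode build


def reduction_percent(before, after):
    """Relative saving of `after` compared with `before`, in percent."""
    return 100.0 * (before - after) / before


saving_vs_wide = reduction_percent(WIDE_BYTES, PEP393_BYTES)
saving_vs_narrow = reduction_percent(NARROW_BYTES, PEP393_BYTES)

# Share of ASCII strings among the roughly 36,000 strings counted.
ascii_share = 100.0 * 35713 / 36000

print("saving vs wide build:   %.1f%%" % saving_vs_wide)
print("saving vs narrow build: %.1f%%" % saving_vs_narrow)
print("ASCII strings:          %.1f%%" % ascii_share)
```

The savings come out at roughly 65% against the wide build and roughly 40% against the narrow build, with about 99% of the strings being ASCII, which is consistent with the explanation given in the text.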


@ -4,11 +4,12 @@ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Kerrick Staley <mail@kerrickstaley.com>, Author: Kerrick Staley <mail@kerrickstaley.com>,
Nick Coghlan <ncoghlan@gmail.com> Nick Coghlan <ncoghlan@gmail.com>
Status: Draft Status: Active
Type: Informational Type: Informational
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 02-Mar-2011 Created: 02-Mar-2011
Post-History: 04-Mar-2011, 20-Jul-2011 Post-History: 04-Mar-2011, 20-Jul-2011, 16-Feb-2012
Resolution: http://mail.python.org/pipermail/python-dev/2012-February/116594.html
Abstract Abstract
@ -39,7 +40,7 @@ Recommendation
Python as either ``python2`` or ``python3``. Python as either ``python2`` or ``python3``.
* For the time being, it is recommended that ``python`` should refer to * For the time being, it is recommended that ``python`` should refer to
``python2`` (however, some distributions have already chosen otherwise; see ``python2`` (however, some distributions have already chosen otherwise; see
Notes below). the `Rationale`_ and `Migration Notes`_ below).
* The Python 2.x ``idle``, ``pydoc``, and ``python-config`` commands should * The Python 2.x ``idle``, ``pydoc``, and ``python-config`` commands should
likewise be available as ``idle2``, ``pydoc2``, and ``python2-config``, likewise be available as ``idle2``, ``pydoc2``, and ``python2-config``,
with the original commands invoking these versions by default, but possibly with the original commands invoking these versions by default, but possibly
@ -48,7 +49,7 @@ Recommendation
* In order to tolerate differences across platforms, all new code that needs * In order to tolerate differences across platforms, all new code that needs
to invoke the Python interpreter should not specify ``python``, but rather to invoke the Python interpreter should not specify ``python``, but rather
should specify either ``python2`` or ``python3`` (or the more specific should specify either ``python2`` or ``python3`` (or the more specific
``python2.x`` and ``python3.x`` versions; see the Notes). ``python2.x`` and ``python3.x`` versions; see the `Migration Notes`_).
This distinction should be made in shebangs, when invoking from a shell This distinction should be made in shebangs, when invoking from a shell
script, when invoking via the system() call, or when invoking in any other script, when invoking via the system() call, or when invoking in any other
context. context.
@ -59,29 +60,48 @@ Recommendation
``sys.executable`` to avoid hardcoded assumptions regarding the ``sys.executable`` to avoid hardcoded assumptions regarding the
interpreter location remains the preferred approach. interpreter location remains the preferred approach.
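
As a small illustration of the ``sys.executable`` approach mentioned in the bullet above, the following sketch starts a child process with the same interpreter that is running the parent, without relying on ``python``, ``python2`` or ``python3`` being on the PATH. The helper name `run_with_same_interpreter` is hypothetical.

```python
import subprocess
import sys


def run_with_same_interpreter(code):
    """Run `code` in a child process using the current interpreter.

    Hypothetical helper: sys.executable holds the path of the running
    interpreter, so the command name on PATH does not matter.
    """
    output = subprocess.check_output([sys.executable, "-c", code])
    return output.decode("ascii").strip()


# The child reports the same major version as the parent process.
child_major = run_with_same_interpreter(
    "import sys; print(sys.version_info[0])"
)
```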
These recommendations are the outcome of the relevant python-dev discussion in These recommendations are the outcome of the relevant python-dev discussions
March and July 2011 [1][2] (NOTE: More accurately, they will be such once the in March and July 2011 ([1]_, [2]_) and February 2012 ([4]_).
"Draft" status disappears from the PEP header, it has been moved into the
"Other Informational PEP" section in PEP 0 and this note has been deleted)
Rationale Rationale
========= =========
This is needed as, even though the majority of distributions still alias the This recommendation is needed as, even though the majority of distributions
``python`` command to Python 2, some now alias it to Python 3. Some of still alias the ``python`` command to Python 2, some now alias it to
the former also do not provide a ``python2`` command; hence, there is Python 3 ([5]_). As some of the former distributions do not yet provide a
currently no way for Python 2 code (or any code that invokes the Python 2 ``python2`` command by default, there is currently no way for Python 2 code
interpreter directly rather than via ``sys.executable``) to reliably run on (or any code that invokes the Python 2 interpreter directly rather than via
all Unix-like systems without modification, as the ``python`` command will ``sys.executable``) to reliably run on all Unix-like systems without
invoke the wrong interpreter version on some systems, and the ``python2`` modification, as the ``python`` command will invoke the wrong interpreter
command will fail completely on others. The recommendations in this PEP version on some systems, and the ``python2`` command will fail completely
provide a very simple mechanism to restore cross-platform support, with on others. The recommendations in this PEP provide a very simple mechanism
minimal additional work required on the part of distribution maintainers. to restore cross-platform support, with minimal additional work required
on the part of distribution maintainers.
Notes Future Changes to this Recommendation
===== =====================================
It is anticipated that there will eventually come a time where the third
party ecosystem surrounding Python 3 is sufficiently mature for this
recommendation to be updated to suggest that the ``python`` symlink
refer to ``python3`` rather than ``python2``.
This recommendation will be periodically reviewed over the next few years,
and updated when the core development team judges it appropriate. As a
point of reference, regular maintenance releases for the Python 2.7 series
will continue until at least 2015.
Migration Notes
===============
This section does not contain any official recommendations from the core
CPython developers. It's merely a collection of notes regarding various
aspects of migrating to Python 3 as the default version of Python for a
system. They will hopefully be helpful to any distributions considering
making such a change.
* Distributions that only include ``python3`` in their base install (i.e. * Distributions that only include ``python3`` in their base install (i.e.
they do not provide ``python2`` by default) along with those that are they do not provide ``python2`` by default) along with those that are
@ -107,7 +127,7 @@ Notes
rather being provided as a separate binary file. rather being provided as a separate binary file.
* It is suggested that even distribution-specific packages follow the * It is suggested that even distribution-specific packages follow the
``python2``/``python3`` convention, even in code that is not intended to ``python2``/``python3`` convention, even in code that is not intended to
operate on other distributions. This will prevent problems if the operate on other distributions. This will reduce problems if the
distribution later decides to change the version of the Python interpreter distribution later decides to change the version of the Python interpreter
that the ``python`` command invokes, or if a sysadmin installs a custom that the ``python`` command invokes, or if a sysadmin installs a custom
``python`` command with a different major version than the distribution ``python`` command with a different major version than the distribution
@ -120,16 +140,14 @@ Notes
versa. That way, if a sysadmin does decide to replace the installed versa. That way, if a sysadmin does decide to replace the installed
``python`` file, they can do so without inadvertently deleting the ``python`` file, they can do so without inadvertently deleting the
previously installed binary. previously installed binary.
* As an alternative to the recommendation presented above, some distributions
may choose to leave the ``python`` command itself undefined, leaving
sysadmins and users with the responsibility to choose their own preferred
version to be made available as the ``python`` command.
* If the Python 2 interpreter becomes uncommon, scripts should nevertheless * If the Python 2 interpreter becomes uncommon, scripts should nevertheless
continue to use the ``python3`` convention rather than just ``python``. This continue to use the ``python3`` convention rather than just ``python``. This
will ease transition in the event that yet another major version of Python will ease transition in the event that yet another major version of Python
is released. is released.
* If these conventions are adhered to, it will be the case that the ``python`` * If these conventions are adhered to, it will become the case that the
command is only executed in an interactive manner. ``python`` command is only executed in an interactive manner as a user
convenience, or to run scripts that are source compatible with both Python
2 and Python 3.
Backwards Compatibility Backwards Compatibility
@ -147,25 +165,38 @@ Python 3 interpreter.
Application to the CPython Reference Interpreter Application to the CPython Reference Interpreter
================================================ ================================================
While technically a new feature, the ``make install`` command in the 2.7 While technically a new feature, the ``make install`` and ``make bininstall``
version of CPython will be adjusted to create the ``python2.7``, ``idle2.7``, commands in the 2.7 version of CPython will be adjusted to create the
``pydoc2.7``, and ``python2.7-config`` binaries, with ``python2``, ``idle2``, following chains of symbolic links in the relevant ``bin`` directory (the
``pydoc2``, and ``python2-config`` as hard links to the respective binaries, final item listed in the chain is the actual installed binary, preceding
and ``python``, ``idle``, ``pydoc``, and ``python-config`` as symbolic links items are relative symbolic links)::
to the respective hard links. This feature will first appear in CPython
2.7.3.
The ``make install`` command in the CPython 3.x series will similarly install python -> python2 -> python2.7
the ``python3.x``, ``idle3.x``, ``pydoc3.x``, and ``python3.x-config`` python-config -> python2-config -> python2.7-config
binaries (with appropriate ``x``), and ``python3``, ``idle3``, ``pydoc3``,
and ``python3-config`` as hard links. This feature will first appear in
CPython 3.3.
Similar adjustments will be made to the Mac OS X binary installer. Similar adjustments will be made to the Mac OS X binary installer.
As implementation of these features in the default installers does not alter This feature will first appear in the default installation process in
the recommendations in this PEP, the implementation progress is managed on the CPython 2.7.3.
tracker as issue <TBD>.
The installation commands in the CPython 3.x series already create the
appropriate symlinks. For example, CPython 3.2 creates::
python3 -> python3.2
idle3 -> idle3.2
pydoc3 -> pydoc3.2
python3-config -> python3.2-config
And CPython 3.3 will create::
python3 -> python3.3
idle3 -> idle3.3
pydoc3 -> pydoc3.3
python3-config -> python3.3-config
pysetup3 -> pysetup3.3
The implementation progress of these features in the default installers is
managed on the tracker as issue #12627 ([3]_).
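The symlink chains listed above can be reproduced in a scratch directory to see how a name such as ``python`` resolves. This is only an illustrative sketch: ``create_link_chain`` is a hypothetical helper, the "binary" is an empty placeholder file, and it assumes a platform where ``os.symlink`` is permitted.

```python
import os
import tempfile


def create_link_chain(bin_dir, chain):
    """Create relative symlinks so each name points at the next one.

    ``chain`` is ordered like the listings above; its last entry is the
    real file, created here as an empty placeholder.
    """
    open(os.path.join(bin_dir, chain[-1]), "w").close()
    for link_name, target in zip(chain, chain[1:]):
        os.symlink(target, os.path.join(bin_dir, link_name))


bin_dir = tempfile.mkdtemp()  # stand-in for the real ``bin`` directory
create_link_chain(bin_dir, ["python", "python2", "python2.7"])
create_link_chain(bin_dir, ["python3", "python3.3"])

first_hop = os.readlink(os.path.join(bin_dir, "python"))
resolved = os.path.basename(os.path.realpath(os.path.join(bin_dir, "python")))
print("python ->", first_hop, "->", resolved)
```

Because the links are relative, the whole directory can be relocated without breaking the chain.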
Impact on PYTHON* Environment Variables Impact on PYTHON* Environment Variables
@ -192,12 +223,20 @@ address this issue.
References References
========== ==========
[1] Support the /usr/bin/python2 symlink upstream (with bonus grammar class!) .. [1] Support the /usr/bin/python2 symlink upstream (with bonus grammar class!)
(http://mail.python.org/pipermail/python-dev/2011-March/108491.html) (http://mail.python.org/pipermail/python-dev/2011-March/108491.html)
[2] Rebooting PEP 394 (aka Support the /usr/bin/python2 symlink upstream) .. [2] Rebooting \PEP 394 (aka Support the /usr/bin/python2 symlink upstream)
(http://mail.python.org/pipermail/python-dev/2011-July/112322.html) (http://mail.python.org/pipermail/python-dev/2011-July/112322.html)
.. [3] Implement \PEP 394 in the CPython Makefile
(http://bugs.python.org/issue12627)
.. [4] \PEP 394 request for pronouncement (python2 symlink in \*nix systems)
(http://mail.python.org/pipermail/python-dev/2012-February/116435.html)
.. [5] Arch Linux announcement that their "python" link now refers to Python 3
(https://www.archlinux.org/news/python-is-now-python-3/)
Copyright Copyright
=========== ===========

View File

@ -1,5 +1,5 @@
PEP: 395 PEP: 395
Title: Module Aliasing Title: Qualified Names for Modules
Version: $Revision$ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan@gmail.com> Author: Nick Coghlan <ncoghlan@gmail.com>
@ -8,19 +8,41 @@ Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 4-Mar-2011 Created: 4-Mar-2011
Python-Version: 3.3 Python-Version: 3.3
Post-History: 5-Mar-2011 Post-History: 5-Mar-2011, 19-Nov-2011
Abstract Abstract
======== ========
This PEP proposes new mechanisms that eliminate some longstanding traps for This PEP proposes new mechanisms that eliminate some longstanding traps for
the unwary when dealing with Python's import system, the pickle module and the unwary when dealing with Python's import system, as well as serialisation
introspection interfaces. and introspection of functions and classes.
It builds on the "Qualified Name" concept defined in PEP 3155. It builds on the "Qualified Name" concept defined in PEP 3155.
Relationship with Other PEPs
----------------------------
This PEP builds on the "qualified name" concept introduced by PEP 3155, and
also shares in that PEP's aim of fixing some ugly corner cases when dealing
with serialisation of arbitrary functions and classes.
It also builds on PEP 366, which took initial tentative steps towards making
explicit relative imports from the main module work correctly in at least
*some* circumstances.
This PEP is also affected by the two competing "namespace package" PEPs
(PEP 382 and PEP 402). This PEP would require some minor adjustments to
accommodate PEP 382, but has some critical incompatibilities with respect to
the implicit namespace package mechanism proposed in PEP 402.
Finally, PEP 328 eliminated implicit relative imports from imported modules.
This PEP proposes that the de facto implicit relative imports from main
modules that are provided by the current initialisation behaviour for
``sys.path[0]`` also be eliminated.
What's in a ``__name__``? What's in a ``__name__``?
========================= =========================
@ -41,71 +63,173 @@ The key use cases identified for this module attribute are:
Traps for the Unwary Traps for the Unwary
==================== ====================
The overloading of the semantics of ``__name__`` have resulted in several The overloading of the semantics of ``__name__``, along with some historically
traps for the unwary. These traps can be quite annoying in practice, as associated behaviour in the initialisation of ``sys.path[0]``, has resulted in
they are highly unobvious and can cause quite confusing behaviour. A lot of several traps for the unwary. These traps can be quite annoying in practice,
the time, you won't even notice them, which just makes them all the more as they are highly unobvious (especially to beginners) and can cause quite
surprising when they do come up. confusing behaviour.
Why are my imports broken?
--------------------------
There's a general principle that applies when modifying ``sys.path``: *never*
put a package directory directly on ``sys.path``. The reason this is
problematic is that every module in that directory is now potentially
accessible under two different names: as a top level module (since the
package directory is on ``sys.path``) and as a submodule of the package (if
the higher level directory containing the package itself is also on
``sys.path``).
As an example, Django (up to and including version 1.3) is guilty of setting
up exactly this situation for site-specific applications - the application
ends up being accessible as both ``app`` and ``site.app`` in the module
namespace, and these are actually two *different* copies of the module. This
is a recipe for confusion if there is any meaningful mutable module level
state, so this behaviour is being eliminated from the default site set up in
version 1.4 (site-specific apps will always be fully qualified with the site
name).
However, it's hard to blame Django for this, when the same part of Python
responsible for setting ``__name__ = "__main__"`` in the main module commits
the exact same error when determining the value for ``sys.path[0]``.
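The "two names, two copies" effect is easy to reproduce. The sketch below builds a throwaway package (``demo_site_pkg`` and ``demo_app_mod`` are invented names, not Django's) and deliberately places both the package directory and its parent on ``sys.path``.

```python
import importlib
import os
import sys
import tempfile


def import_under_two_names():
    """Build a throwaway package and import one submodule under two names."""
    root = tempfile.mkdtemp()
    pkg_dir = os.path.join(root, "demo_site_pkg")
    os.mkdir(pkg_dir)
    open(os.path.join(pkg_dir, "__init__.py"), "w").close()
    with open(os.path.join(pkg_dir, "demo_app_mod.py"), "w") as f:
        f.write("registry = []\n")

    # The mistake under discussion: the package directory itself is on
    # sys.path alongside the directory that contains the package.
    sys.path[:0] = [root, pkg_dir]
    importlib.invalidate_caches()
    try:
        qualified = importlib.import_module("demo_site_pkg.demo_app_mod")
        top_level = importlib.import_module("demo_app_mod")
    finally:
        sys.path.remove(root)
        sys.path.remove(pkg_dir)
    return qualified, top_level


qualified, top_level = import_under_two_names()
qualified.registry.append("registered once")
print(qualified is top_level)  # False: two separate module objects
print(top_level.registry)      # []: the second copy never saw the append
```

Both module objects were loaded from the same source file, yet mutable state stored in one is invisible to the other.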
The impact of this can be seen relatively frequently if you follow the
"python" and "import" tags on Stack Overflow. When I had the time to follow
it myself, I regularly encountered people struggling to understand the
behaviour of straightforward package layouts like the following (I actually
use package layouts along these lines in my own projects)::
project/
setup.py
example/
__init__.py
foo.py
tests/
__init__.py
test_foo.py
While I would often see it without the ``__init__.py`` files first, that's a
trivial fix to explain. What's hard to explain is that all of the following
ways to invoke ``test_foo.py`` *probably won't work* due to broken imports
(either failing to find ``example`` for absolute imports, complaining
about relative imports in a non-package or beyond the toplevel package for
explicit relative imports, or issuing even more obscure errors if some other
submodule happens to shadow the name of a top-level module, such as an
``example.json`` module that handled serialisation or an
``example.tests.unittest`` test runner)::
# These commands will most likely *FAIL*, even if the code is correct
# working directory: project/example/tests
./test_foo.py
python test_foo.py
python -m example.tests.test_foo
python -c "from example.tests.test_foo import main; main()"
# working directory: project/example
tests/test_foo.py
python tests/test_foo.py
python -m example.tests.test_foo
python -c "from example.tests.test_foo import main; main()"
# working directory: project
example/tests/test_foo.py
python example/tests/test_foo.py
# working directory: project/..
project/example/tests/test_foo.py
python project/example/tests/test_foo.py
# The -m and -c approaches don't work from here either, but the failure
# to find 'example' correctly is easier to explain in this case
That's right, that long list is of all the methods of invocation that will
almost certainly *break* if you try them, and the error messages won't make
any sense if you're not already intimately familiar not only with the way
Python's import system works, but also with how it gets initialised.
For a long time, the only way to get ``sys.path`` right with that kind of
setup was to either set it manually in ``test_foo.py`` itself (hardly
something that novices, or even many veteran Python programmers, are going to
know how to do) or else to make sure to import the module instead of
executing it directly::
# working directory: project
python -c "from example.tests.test_foo import main; main()"
Since the implementation of PEP 366 (which defined a mechanism that allows
relative imports to work correctly when a module inside a package is executed
via the ``-m`` switch), the following also works properly::
# working directory: project
python -m example.tests.test_foo
The fact that most methods of invoking Python code from the command line
break when that code is inside a package, and the two that do work are highly
sensitive to the current working directory, is all thoroughly confusing for a
beginner. I personally believe it is one of the key factors leading
to the perception that Python packages are complicated and hard to get right.
This problem isn't even limited to the command line - if ``test_foo.py`` is
open in Idle and you attempt to run it by pressing F5, or if you try to run
it by clicking on it in a graphical filebrowser, then it will fail in just
the same way it would if run directly from the command line.
There's a reason the general "no package directories on ``sys.path``"
guideline exists, and the fact that the interpreter itself doesn't follow
it when determining ``sys.path[0]`` is the root cause of all sorts of grief.
In the past, this couldn't be fixed due to backwards compatibility concerns.
However, scripts potentially affected by this problem will *already* require
fixes when porting to Python 3.x (due to the elimination of implicit
relative imports when importing modules normally). This provides a convenient
opportunity to implement a corresponding change in the initialisation
semantics for ``sys.path[0]``.
Importing the main module twice Importing the main module twice
------------------------------- -------------------------------
The most venerable of these traps is the issue of (effectively) importing Another venerable trap is the issue of importing ``__main__`` twice. This
``__main__`` twice. This occurs when the main module is also imported under occurs when the main module is also imported under its real name, effectively
its real name, effectively creating two instances of the same module under creating two instances of the same module under different names.
different names.
This problem used to be significantly worse due to implicit relative imports If the state stored in ``__main__`` is significant to the correct operation
from the main module, but the switch to allowing only absolute imports and of the program, or if there is top-level code in the main module that has
explicit relative imports means this issue is now restricted to affecting the non-idempotent side effects, then this duplication can cause obscure and
main module itself. surprising errors.
Why are my relative imports broken?
-----------------------------------
PEP 366 defines a mechanism that allows relative imports to work correctly
when a module inside a package is executed via the ``-m`` switch.
Unfortunately, many users still attempt to directly execute scripts inside
packages. While this no longer silently does the wrong thing by
creating duplicate copies of peer modules due to implicit relative imports, it
now fails noisily at the first explicit relative import, even though the
interpreter actually has sufficient information available on the filesystem to
make it work properly.
<TODO: Anyone want to place bets on how many Stack Overflow links I could find
to put here if I really went looking?>
In a bit of a pickle In a bit of a pickle
-------------------- --------------------
Something many users may not realise is that the ``pickle`` module serialises Something many users may not realise is that the ``pickle`` module sometimes
objects based on the ``__name__`` of the containing module. So objects relies on the ``__module__`` attribute when serialising instances of arbitrary
defined in ``__main__`` are pickled that way, and won't be unpickled classes. So instances of classes defined in ``__main__`` are pickled that way,
correctly by another python instance that only imported that module instead and won't be unpickled correctly by another python instance that only imported
of running it directly. This behaviour is the underlying reason for the that module instead of running it directly. This behaviour is the underlying
advice from many Python veterans to do as little as possible in the reason for the advice from many Python veterans to do as little as possible
``__main__`` module in any application that involves any form of object in the ``__main__`` module in any application that involves any form of
serialisation and persistence. object serialisation and persistence.
Similarly, when creating a pseudo-module\*, pickles rely on the name of the Similarly, when creating a pseudo-module (see next paragraph), pickles rely
module where a class is actually defined, rather than the officially on the name of the module where a class is actually defined, rather than the
documented location for that class in the module hierarchy. officially documented location for that class in the module hierarchy.
While this PEP focuses specifically on ``pickle`` as the principal For the purposes of this PEP, a "pseudo-module" is a package designed like
serialisation scheme in the standard library, this issue may also affect
other mechanisms that support serialisation of arbitrary class instances.
\*For the purposes of this PEP, a "pseudo-module" is a package designed like
the Python 3.2 ``unittest`` and ``concurrent.futures`` packages. These the Python 3.2 ``unittest`` and ``concurrent.futures`` packages. These
packages are documented as if they were single modules, but are in fact packages are documented as if they were single modules, but are in fact
internally implemented as a package. This is *supposed* to be an internally implemented as a package. This is *supposed* to be an
implementation detail that users and other implementations don't need to worry implementation detail that users and other implementations don't need to
about, but, thanks to ``pickle`` (and serialisation in general), the details worry about, but, thanks to ``pickle`` (and serialisation in general),
are exposed and effectively become part of the public API. the details are often exposed and can effectively become part of the public
API.
While this PEP focuses specifically on ``pickle`` as the principal
serialisation scheme in the standard library, this issue may also affect
other mechanisms that support serialisation of arbitrary class instances
and rely on ``__module__`` attributes to determine how to handle
deserialisation.
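The dependence on ``__module__`` can be seen by inspecting a pickle directly. In this sketch, ``demo_settings_mod`` is an invented module that is registered in ``sys.modules`` while pickling and then removed, which roughly imitates unpickling in a process where the defining module is not available under the recorded name.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the module in which a class was defined.
demo = types.ModuleType("demo_settings_mod")
exec("class Settings:\n    pass\n", demo.__dict__)
sys.modules[demo.__name__] = demo

# The pickle stores the class by reference: module name plus class name.
payload = pickle.dumps(demo.Settings())
module_name_in_payload = demo.__name__.encode() in payload

# Imitate another process in which that module name cannot be imported.
del sys.modules[demo.__name__]
error = None
try:
    pickle.loads(payload)
except ImportError as exc:
    error = exc

print(module_name_in_payload, type(error).__name__)
```

The same failure mode applies when the recorded name is ``__main__`` and the unpickling process has a different main module.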
Where's the source? Where's the source?
@ -134,15 +258,73 @@ that simply aren't valid whenever the main module isn't an ordinary directly
executed script or top-level module. Packages and non-top-level modules executed script or top-level module. Packages and non-top-level modules
executed via the ``-m`` switch, as well as directly executed zipfiles or executed via the ``-m`` switch, as well as directly executed zipfiles or
directories, are likely to make multiprocessing on Windows do the wrong thing directories, are likely to make multiprocessing on Windows do the wrong thing
(either quietly or noisily) when spawning a new process. (either quietly or noisily, depending on application details) when spawning a
new process.
While this issue currently only affects Windows directly, it also impacts While this issue currently only affects Windows directly, it also impacts
any proposals to provide Windows-style "clean process" invocation via the any proposals to provide Windows-style "clean process" invocation via the
multiprocessing module on other platforms. multiprocessing module on other platforms.
Proposed Changes Qualified Names for Modules
================ ===========================
To make it feasible to fix these problems once and for all, it is proposed
to add a new module level attribute: ``__qualname__``. This abbreviation of
"qualified name" is taken from PEP 3155, where it is used to store the naming
path to a nested class or function definition relative to the top level
module.
For modules, ``__qualname__`` will normally be the same as ``__name__``, just
as it is for top-level functions and classes in PEP 3155. However, it will
differ in some situations so that the above problems can be addressed.
Specifically, whenever ``__name__`` is modified for some other purpose (such
as to denote the main module), then ``__qualname__`` will remain unchanged,
allowing code that needs it to access the original unmodified value.
If a module loader does not initialise ``__qualname__`` itself, then the
import system will add it automatically (setting it to the same value as
``__name__``).
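The fallback rule can be sketched in a few lines. ``ensure_module_qualname`` is a hypothetical helper written for illustration only; it is not an existing import system API, and module objects in released Python versions do not carry a ``__qualname__`` attribute by default.

```python
import types


def ensure_module_qualname(module):
    """Default ``__qualname__`` to ``__name__`` when the loader left it unset."""
    if not hasattr(module, "__qualname__"):
        module.__qualname__ = module.__name__
    return module.__qualname__


# An ordinary imported module: both names agree.
plain = types.ModuleType("example.foo")

# A module run as the main module: __name__ was overwritten, but the
# qualified name set by the (hypothetical) loader is preserved.
main_like = types.ModuleType("__main__")
main_like.__qualname__ = "example.tests.test_foo"

print(ensure_module_qualname(plain))
print(ensure_module_qualname(main_like))
```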
Alternative Names
-----------------
Two alternative names were also considered for the new attribute: "full name"
(``__fullname__``) and "implementation name" (``__implname__``).
Either of those would actually be valid for the use case in this PEP.
However, as a meta-issue, PEP 3155 is *also* adding a new attribute (for
functions and classes) that is "like ``__name__``, but different in some cases
where ``__name__`` is missing necessary information" and those terms aren't
accurate for the PEP 3155 function and class use case.
PEP 3155 deliberately omits the module information, so the term "full name"
is simply untrue, and "implementation name" implies that it may specify an
object other than that specified by ``__name__``, and that is never the
case for PEP 3155 (in that PEP, ``__name__`` and ``__qualname__`` always
refer to the same function or class, it's just that ``__name__`` is
insufficient to accurately identify nested functions and classes).
Since it seems needlessly inconsistent to add *two* new terms for attributes
that only exist because backwards compatibility concerns keep us from
changing the behaviour of ``__name__`` itself, this PEP instead chose to
adopt the PEP 3155 terminology.
If the relative inscrutability of "qualified name" and ``__qualname__``
encourages interested developers to look them up at least once rather than
assuming they know what they mean just from the name and guessing wrong,
that's not necessarily a bad outcome.
Besides, 99% of Python developers should never need to even care these extra
attributes exist - they're really an implementation detail to let us fix a
few problematic behaviours exhibited by imports, pickling and introspection,
not something people are going to be dealing with on a regular basis.
Eliminating the Traps
=====================
The following changes are interrelated and make the most sense when The following changes are interrelated and make the most sense when
considered together. They collectively either completely eliminate the traps considered together. They collectively either completely eliminate the traps
@ -150,104 +332,362 @@ for the unwary noted above, or else provide straightforward mechanisms for
dealing with them. dealing with them.
A rough draft of some of the concepts presented here was first posted on the A rough draft of some of the concepts presented here was first posted on the
python-ideas list [1], but they have evolved considerably since first being python-ideas list ([1]_), but they have evolved considerably since first being
discussed in that thread. discussed in that thread. Further discussion has subsequently taken place on
the import-sig mailing list ([2]_, [3]_).
Fixing main module imports inside packages
------------------------------------------
To eliminate this trap, it is proposed that an additional filesystem check be
performed when determining a suitable value for ``sys.path[0]``. This check
will look for Python's explicit package directory markers and use them to find
the appropriate directory to add to ``sys.path``.
The current algorithm for setting ``sys.path[0]`` in relevant cases is roughly
as follows::
# Interactive prompt, -m switch, -c switch
sys.path.insert(0, '')
::
# Valid sys.path entry execution (i.e. directory and zip execution)
sys.path.insert(0, sys.argv[0])
::
# Direct script execution
sys.path.insert(0, os.path.dirname(sys.argv[0]))
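The current behaviour for the ``-c`` switch and for direct script execution can be observed by asking a fresh interpreter to report its own ``sys.path[0]``. This is a rough sketch: ``run_python`` and ``show_path0.py`` are invented names, and it assumes an interpreter that is not running in isolated or "safe path" mode.

```python
import os
import subprocess
import sys
import tempfile

# Brackets make an empty sys.path[0] visible in the output.
PRINT_PATH0 = "import sys; print('[' + sys.path[0] + ']')"


def run_python(args, cwd):
    """Run a fresh interpreter and return what it printed."""
    env = dict(os.environ)
    env.pop("PYTHONSAFEPATH", None)  # newer versions can suppress sys.path[0]
    result = subprocess.run([sys.executable] + args, cwd=cwd, env=env,
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()


workdir = tempfile.mkdtemp()
script = os.path.join(workdir, "show_path0.py")
with open(script, "w") as f:
    f.write(PRINT_PATH0 + "\n")

from_dash_c = run_python(["-c", PRINT_PATH0], cwd=workdir)
from_script = run_python([script], cwd=workdir)
print("-c switch:", from_dash_c)
print("script   :", from_script)
```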
It is proposed that this initialisation process be modified to take
package details stored on the filesystem into account::
# Interactive prompt, -m switch, -c switch
in_package, path_entry, _ignored = split_path_module(os.getcwd(), '')
if in_package:
sys.path.insert(0, path_entry)
else:
sys.path.insert(0, '')
# Start interactive prompt or run -c command as usual
# __main__.__qualname__ is set to "__main__"
# The -m switch uses the same sys.path[0] calculation, but:
# modname is the argument to the -m switch
# modname is passed to ``runpy._run_module_as_main()`` as usual
# __main__.__qualname__ is set to modname
::
# Valid sys.path entry execution (i.e. directory and zip execution)
modname = "__main__"
_ignored, path_entry, modname = split_path_module(sys.argv[0], modname)
sys.path.insert(0, path_entry)
# modname (possibly adjusted) is passed to ``runpy._run_module_as_main()``
# __main__.__qualname__ is set to modname
::
# Direct script execution
in_package, path_entry, modname = split_path_module(sys.argv[0])
sys.path.insert(0, path_entry)
if in_package:
# Pass modname to ``runpy._run_module_as_main()``
else:
# Run script directly
# __main__.__qualname__ is set to modname
The ``split_path_module()`` supporting function used in the above pseudo-code
would have the following semantics::
def _splitmodname(fspath):
    path_entry, fname = os.path.split(fspath)
    modname = os.path.splitext(fname)[0]
    return path_entry, modname

def _is_package_dir(fspath):
    # A directory is a package if it contains an __init__ file with any
    # of the suffixes recognised by the import system
    return any(os.path.exists(os.path.join(fspath, "__init__" + info[0]))
               for info in imp.get_suffixes())

def split_path_module(fspath, modname=None):
    """Given a filesystem path and a relative module name, determine an
    appropriate sys.path entry and a fully qualified module name.
    Returns a 3-tuple of (package_depth, fspath, modname). A reported
    package depth of 0 indicates that this would be a top level import.
    If no relative module name is given, it is derived from the final
    component in the supplied path with the extension stripped.
    """
    if modname is None:
        fspath, modname = _splitmodname(fspath)
    package_depth = 0
    while _is_package_dir(fspath):
        fspath, pkg = _splitmodname(fspath)
        modname = pkg + '.' + modname
        package_depth += 1
    return package_depth, fspath, modname
This PEP also proposes that the ``split_path_module()`` functionality be
exposed directly to Python users via the ``runpy`` module.
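For experimentation, the semantics described above can be exercised against a temporary copy of the example layout. The sketch below makes two assumptions that are not part of the proposal: it uses ``importlib.machinery.all_suffixes()`` to enumerate the ``__init__`` suffixes (the ``imp`` module is no longer available in recent Python versions), and it counts the package depth explicitly. ``build_demo_layout`` is a helper invented for the demonstration.

```python
import importlib.machinery
import os
import tempfile


def _splitmodname(fspath):
    path_entry, fname = os.path.split(fspath)
    modname = os.path.splitext(fname)[0]
    return path_entry, modname


def _is_package_dir(fspath):
    # Assumption: any importable suffix on an __init__ file marks a package.
    return any(os.path.exists(os.path.join(fspath, "__init__" + suffix))
               for suffix in importlib.machinery.all_suffixes())


def split_path_module(fspath, modname=None):
    """Return (package_depth, sys.path entry, qualified module name)."""
    if modname is None:
        fspath, modname = _splitmodname(fspath)
    package_depth = 0
    while _is_package_dir(fspath):
        fspath, pkg = _splitmodname(fspath)
        modname = pkg + "." + modname
        package_depth += 1
    return package_depth, fspath, modname


def build_demo_layout():
    """Create project/example/tests/test_foo.py with package markers."""
    project = tempfile.mkdtemp()
    tests = os.path.join(project, "example", "tests")
    os.makedirs(tests)
    for directory in (os.path.join(project, "example"), tests):
        open(os.path.join(directory, "__init__.py"), "w").close()
    script = os.path.join(tests, "test_foo.py")
    open(script, "w").close()
    return project, tests, script


project, tests, script = build_demo_layout()
from_script = split_path_module(script)     # direct script execution case
from_cwd = split_path_module(tests, "")     # interactive prompt / -c / -m case
print(from_script[0], from_script[2])
print(from_cwd[0], repr(from_cwd[2]))
```

Note that an empty relative module name leaves a trailing dot on the result (``example.tests.``).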
With this fix in place, and the same simple package layout described earlier,
*all* of the following commands would invoke the test suite correctly::
# working directory: project/example/tests
./test_foo.py
python test_foo.py
python -m example.tests.test_foo
python -c "from .test_foo import main; main()"
python -c "from ..tests.test_foo import main; main()"
python -c "from example.tests.test_foo import main; main()"
# working directory: project/example
tests/test_foo.py
python tests/test_foo.py
python -m example.tests.test_foo
python -c "from .tests.test_foo import main; main()"
python -c "from example.tests.test_foo import main; main()"
# working directory: project
example/tests/test_foo.py
python example/tests/test_foo.py
python -m example.tests.test_foo
python -c "from example.tests.test_foo import main; main()"
# working directory: project/..
project/example/tests/test_foo.py
python project/example/tests/test_foo.py
# The -m and -c approaches still don't work from here, but the failure
# to find 'example' correctly is pretty easy to explain in this case
With these changes, clicking Python modules in a graphical file browser
should always execute them correctly, even if they live inside a package.
Depending on the details of how it invokes the script, Idle would likely also
be able to run ``test_foo.py`` correctly with F5, without needing any Idle
specific fixes.
Optional addition: command line relative imports
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
With the above changes in place, it would be a fairly minor addition to allow
explicit relative imports as arguments to the ``-m`` switch::
# working directory: project/example/tests
python -m .test_foo
python -m ..tests.test_foo
# working directory: project/example/
python -m .tests.test_foo
With this addition, system initialisation for the ``-m`` switch would change
as follows::
# -m switch (permitting explicit relative imports)
in_package, path_entry, pkg_name = split_path_module(os.getcwd(), '')
qualname = <<arguments to -m switch>>
if qualname.startswith('.'):
    modname = qualname
    while modname.startswith('.'):
        modname = modname[1:]
        pkg_name, sep, _ignored = pkg_name.rpartition('.')
        if not sep:
            raise ImportError("Attempted relative import beyond top level package")
    qualname = pkg_name + '.' + modname
if in_package:
    sys.path.insert(0, path_entry)
else:
    sys.path.insert(0, '')
# qualname is passed to ``runpy._run_module_as_main()``
# __main__.__qualname__ is set to qualname
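The name resolution step can be tried out with the standard library's ``importlib.util.resolve_name()``, which applies the same leading-dot rules as explicit relative imports. This is a sketch rather than the proposed implementation: ``resolve_dash_m_argument`` is a hypothetical helper, and ``cwd_package`` stands for the package name derived from the current directory (without the trailing dot that the pseudo-code above relies on).

```python
import importlib.util


def resolve_dash_m_argument(argument, cwd_package):
    """Turn a possibly relative ``-m`` argument into a qualified name."""
    if not argument.startswith("."):
        return argument  # already absolute, nothing to resolve
    if not cwd_package:
        raise ImportError("Attempted relative import outside any package")
    return importlib.util.resolve_name(argument, cwd_package)


# working directory: project/example/tests
from_tests_one_dot = resolve_dash_m_argument(".test_foo", "example.tests")
from_tests_two_dots = resolve_dash_m_argument("..tests.test_foo", "example.tests")
# working directory: project/example
from_example = resolve_dash_m_argument(".tests.test_foo", "example")
print(from_tests_one_dot, from_tests_two_dots, from_example)
```

All three invocations resolve to the same qualified name, ``example.tests.test_foo``.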
Compatibility with PEP 382
~~~~~~~~~~~~~~~~~~~~~~~~~~
Making this proposal compatible with the PEP 382 namespace packaging PEP is
trivial. The semantics of ``_is_package_dir()`` are merely changed to be::
def _is_package_dir(fspath):
    return (fspath.endswith(".pyp") or
            any(os.path.exists(os.path.join(fspath, "__init__" + info[0]))
                for info in imp.get_suffixes()))
Incompatibility with PEP 402
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
PEP 402 proposes the elimination of explicit markers in the file system for
Python packages. This fundamentally breaks the proposed concept of being able
to take a filesystem path and a Python module name and work out an unambiguous
mapping to the Python module namespace. Instead, the appropriate mapping
would depend on the current values in ``sys.path``, rendering it impossible
to ever fix the problems described above with the calculation of
``sys.path[0]`` when the interpreter is initialised.
While some aspects of this PEP could probably be salvaged if PEP 402 were
adopted, the core concept of making import semantics from main and other
modules more consistent would no longer be feasible.
This incompatibility is discussed in more detail in the relevant import-sig
threads ([2]_, [3]_).
Potential incompatibilities with scripts stored in packages
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The proposed change to ``sys.path[0]`` initialisation *may* break some
existing code. Specifically, it will break scripts stored in package
directories that rely on the implicit relative imports from ``__main__`` in
order to run correctly under Python 3.
While such scripts could be imported in Python 2 (due to implicit relative
imports) it is already the case that they cannot be imported in Python 3,
as implicit relative imports are no longer permitted when a module is
imported.
By disallowing implicit relative imports from the main module as well,
such modules won't even work as scripts with this PEP. Switching them
over to explicit relative imports will then get them working again as
both executable scripts *and* as importable modules.
To support earlier versions of Python, a script could be written to use
different forms of import based on the Python version::
import sys
if __name__ == "__main__" and sys.version_info < (3, 3):
    import peer # Implicit relative import
else:
    from . import peer # Explicit relative import
Fixing dual imports of the main module Fixing dual imports of the main module
-------------------------------------- --------------------------------------
Two simple changes are proposed to fix this problem: Given the above proposal to get ``__qualname__`` consistently set correctly
in the main module, one simple change is proposed to eliminate the problem
of dual imports of the main module: the addition of a ``sys.meta_path`` hook
that detects attempts to import ``__main__`` under its real name and returns
the original main module instead::
1. In ``runpy``, modify the implementation of the ``-m`` switch handling to class AliasImporter:
install the specified module in ``sys.modules`` under both its real name def __init__(self, module, alias):
and the name ``__main__``. (Currently it is only installed as the latter) self.module = module
2. When directly executing a module, install it in ``sys.modules`` under self.alias = alias
``os.path.splitext(os.path.basename(__file__))[0]`` as well as under
``__main__``.
With the main module also stored under its "real" name, attempts to import it def __repr__(self):
will pick it up from the ``sys.modules`` cache rather than reimporting it fmt = "{0.__class__.__name__}({0.module.__name__}, {0.alias})"
under the new name. return fmt.format(self)
    def find_module(self, fullname, path=None):
        if path is None and fullname == self.alias:
            return self
        return None
Fixing direct execution inside packages def load_module(self, fullname):
--------------------------------------- if fullname != self.alias:
            raise ImportError("{!r} cannot load {!r}".format(self, fullname))
        return self.module
To fix this problem, it is proposed that an additional filesystem check be This metapath hook would be added automatically during import system
performed before proceeding with direct execution of a ``PY_SOURCE`` or initialisation based on the following logic::
``PY_COMPILED`` file that has been named on the command line.
This additional check would look for an ``__init__`` file that is a peer to main = sys.modules["__main__"]
the specified file with a matching extension (either ``.py``, ``.pyc`` or if main.__name__ != main.__qualname__:
``.pyo``, depending on what was passed on the command line). sys.meta_path.append(AliasImporter(main, main.__qualname__))
If this check fails to find anything, direct execution proceeds as usual. This is probably the least important proposal in the PEP - it just
closes off the last mechanism that is likely to lead to module duplication
If, however, it finds something, execution is handed over to a after the configuration of ``sys.path[0]`` at interpreter startup is
helper function in the ``runpy`` module that ``runpy.run_path`` also invokes addressed.
in the same circumstances. That function will walk back up the
directory hierarchy from the supplied path, looking for the first directory
that doesn't contain an ``__init__`` file. Once that directory is found, it
will be set to ``sys.path[0]``, ``sys.argv[0]`` will be set to ``-m`` and
``runpy._run_module_as_main`` will be invoked with the appropriate module
name (as calculated based on the original filename and the directories
traversed while looking for a directory without an ``__init__`` file).
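The directory walk described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the ``runpy`` implementation: the helper name ``split_path_module`` is borrowed from the name used elsewhere in this text, its signature and return value are assumptions, and only ``__init__.py`` markers are checked (the matching ``.pyc``/``.pyo`` cases are omitted for brevity).

```python
import os


def split_path_module(path):
    """Return (sys_path_entry, module_name) for a script path.

    Assumed helper: walks up from the file for as long as each directory
    contains an __init__.py, collecting package names along the way.
    """
    path = os.path.abspath(path)
    dirpath, filename = os.path.split(path)
    parts = [os.path.splitext(filename)[0]]
    while os.path.isfile(os.path.join(dirpath, "__init__.py")):
        dirpath, pkg = os.path.split(dirpath)
        if not pkg:  # reached the filesystem root
            break
        parts.append(pkg)
    # dirpath is now the first directory without an __init__ file
    return dirpath, ".".join(reversed(parts))
```

The first element of the result is what would become ``sys.path[0]``; the second is the module name that would be handed to the ``-m`` machinery.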
The two current PEPs for namespace packages (PEP 382 and PEP 402) would both
affect this part of the proposal. For PEP 382 (with its current suggestion of
"*.pyp" package directories), this check would instead just walk up the
supplied path, looking for the first non-package directory (this would not
require any filesystem stat calls). Since PEP 402 deliberately omits explicit
directory markers, it would need an alternative approach, based on checking
the supplied path against the contents of ``sys.path``. In both cases, the
direct execution behaviour can still be corrected.
Fixing pickling without breaking introspection Fixing pickling without breaking introspection
---------------------------------------------- ----------------------------------------------
To fix this problem, it is proposed to add a new optional module level To fix this problem, it is proposed to make use of the new module level
attribute: ``__qname__``. This abbreviation of "qualified name" is taken ``__qualname__`` attributes to determine the real module location when
from PEP 3155, where it is used to store the naming path to a nested class ``__name__`` has been modified for any reason.
or function definition relative to the top level module. By default,
``__qname__`` will be the same as ``__name__``, which covers the typical
case where there is a one-to-one correspondence between the documented API
and the actual module implementation.
Functions and classes will gain a corresponding ``__qmodule__`` attribute In the main module, ``__qualname__`` will automatically be set to the main
that refers to their module's ``__qname__``. module's "real" name (as described above) by the interpreter.
Pseudo-modules that adjust ``__name__`` to point to the public namespace will Pseudo-modules that adjust ``__name__`` to point to the public namespace will
leave ``__qname__`` untouched, so the implementation location remains readily leave ``__qualname__`` untouched, so the implementation location remains readily
accessible for introspection. accessible for introspection.
In the main module, ``__qname__`` will automatically be set to the main If ``__name__`` is adjusted at the top of a module, then this will
module's "real" name (as described above under the fix to prevent duplicate automatically adjust the ``__module__`` attribute for all functions and
imports of the main module) by the interpreter. classes subsequently defined in that module.
At the interactive prompt, both ``__name__`` and ``__qname__`` will be set Since multiple submodules may be set to use the same "public" namespace,
to ``"__main__"``. functions and classes will be given a new ``__qualmodule__`` attribute
that refers to the ``__qualname__`` of their module.
These changes on their own will fix most pickling and serialisation problems, This isn't strictly necessary for functions (you could find out their
but one additional change is needed to fix the problem with serialisation of module's qualified name by looking in their globals dictionary), but it is
items in ``__main__``: as a slight adjustment to the definition process for needed for classes, since they don't hold a reference to the globals of
functions and classes, in the ``__name__ == "__main__"`` case, the module their defining module. Once a new attribute is added to classes, it is
``__qname__`` attribute will be used to set ``__module__``. more convenient to keep the API consistent and add a new attribute to
functions as well.
These changes mean that adjusting ``__name__`` (and, either directly or
indirectly, the corresponding function and class ``__module__`` attributes)
becomes the officially sanctioned way to implement a namespace as a package,
while exposing the API as if it were still a single module.
All serialisation code that currently uses ``__name__`` and ``__module__``
attributes will then avoid exposing implementation details by default.
To correctly handle serialisation of items from the main module, the class
and function definition logic will be updated to also use ``__qualname__``
for the ``__module__`` attribute in the case where ``__name__ == "__main__"``.
With ``__name__`` and ``__module__`` being officially blessed as being used
for the *public* names of things, the introspection tools in the standard
library will be updated to use ``__qualname__`` and ``__qualmodule__``
where appropriate. For example:
- ``pydoc`` will report both public and qualified names for modules
- ``inspect.getsource()`` (and similar tools) will use the qualified names
that point to the implementation of the code
- additional ``pydoc`` and/or ``inspect`` APIs may be provided that report
all modules with a given public ``__name__``.
``pydoc`` and ``inspect`` would also be updated appropriately to:
- use ``__qname__`` instead of ``__name__`` and ``__qmodule__`` instead of
``__module__`` where appropriate (e.g. ``inspect.getsource()`` would prefer
the qualified variants)
- report both the public names and the qualified names for affected objects
Fixing multiprocessing on Windows Fixing multiprocessing on Windows
--------------------------------- ---------------------------------
With ``__qname__`` now available to tell ``multiprocessing`` the real With ``__qualname__`` now available to tell ``multiprocessing`` the real
name of the main module, it should be able to simply include it in the name of the main module, it will be able to simply include it in the
serialised information passed to the child process, eliminating the serialised information passed to the child process, eliminating the
need for dubious reverse engineering of the ``__file__`` attribute. need for the current dubious introspection of the ``__file__`` attribute.
For older Python versions, ``multiprocessing`` could be improved by applying
the ``split_path_module()`` algorithm described above when attempting to
work out how to execute the main module based on its ``__file__`` attribute.
Explicit relative imports
=========================
This PEP proposes that ``__package__`` be unconditionally defined in the
main module as ``__qualname__.rpartition('.')[0]``. Aside from that, it
proposes that the behaviour of explicit relative imports be left alone.
In particular, if ``__package__`` is not set in a module when an explicit
relative import occurs, the automatically cached value will continue to be
derived from ``__name__`` rather than ``__qualname__``. This minimises any
backwards incompatibilities with existing code that deliberately manipulates
relative imports by adjusting ``__name__`` rather than setting ``__package__``
directly.
This PEP does *not* propose that ``__package__`` be deprecated. While it is
technically redundant following the introduction of ``__qualname__``, it just
isn't worth the hassle of deprecating it within the lifetime of Python 3.x.
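The ``rpartition`` expression used to derive ``__package__`` behaves as sketched below. The helper name ``package_from_qualname`` is hypothetical and exists only for this illustration; the module names are made up.

```python
def package_from_qualname(qualname):
    # Everything before the last dot; an empty string for a top-level module
    return qualname.rpartition(".")[0]


print(package_from_qualname("pkg.sub.mod"))  # pkg.sub
print(repr(package_from_qualname("mod")))    # ''
```

A top-level main module therefore ends up with an empty ``__package__``, while a module run from inside a package gets the name of its containing package.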
Reference Implementation Reference Implementation
@ -262,6 +702,14 @@ References
.. [1] Module aliases and/or "real names" .. [1] Module aliases and/or "real names"
(http://mail.python.org/pipermail/python-ideas/2011-January/008983.html) (http://mail.python.org/pipermail/python-ideas/2011-January/008983.html)
.. [2] PEP 395 (Module aliasing) and the namespace PEPs
(http://mail.python.org/pipermail/import-sig/2011-November/000382.html)
.. [3] Updated PEP 395 (aka "Implicit Relative Imports Must Die!")
(http://mail.python.org/pipermail/import-sig/2011-November/000397.html)
.. [4] Elaboration of compatibility problems between this PEP and PEP 402
(http://mail.python.org/pipermail/import-sig/2011-November/000403.html)
Copyright Copyright
========= =========


@ -57,29 +57,40 @@ of the crew.
Features for 3.3 Features for 3.3
================ ================
Implemented / Final PEPs:
* PEP 380: Syntax for Delegating to a Subgenerator
* PEP 393: Flexible String Representation
* PEP 399: Pure Python/C Accelerator Module Compatibility Requirements
* PEP 409: Suppressing exception context
* PEP 414: Explicit Unicode Literal for Python 3.3
* PEP 3151: Reworking the OS and IO exception hierarchy
* PEP 3155: Qualified name for classes and functions
Other final large-scale changes:
* Addition of the "packaging" module, deprecating "distutils"
* Addition of the "faulthandler" module
* Addition of the "lzma" module, and lzma/xz support in tarfile
Candidate PEPs: Candidate PEPs:
* PEP 362: Function Signature Object * PEP 362: Function Signature Object
* PEP 380: Syntax for Delegating to a Subgenerator
* PEP 382: Namespace Packages * PEP 382: Namespace Packages
* PEP 393: Flexible String Representation
* PEP 395: Module Aliasing * PEP 395: Module Aliasing
* PEP 397: Python launcher for Windows * PEP 397: Python launcher for Windows
* PEP 3143: Standard daemon process library * PEP 3143: Standard daemon process library
* PEP 3151: Reworking the OS and IO exception hierarchy
(Note that these are not accepted yet and even if they are, they might (Note that these are not accepted yet and even if they are, they might
not be finished in time for Python 3.3.) not be finished in time for Python 3.3.)
Other planned large-scale changes: Other planned large-scale changes:
* Addition of the "packaging" module, replacing "distutils" * Addition of the "regex" module
* Implementing ``__import__`` using importlib
* Email version 6 * Email version 6
* Implementing ``__import__`` using importlib
* A standard event-loop interface (PEP by Jim Fulton pending) * A standard event-loop interface (PEP by Jim Fulton pending)
* Adding the faulthandler module.
* Breaking out standard library and docs in separate repos? * Breaking out standard library and docs in separate repos?
* A PEP on supplementing C modules with equivalent Python modules?
Copyright Copyright


@ -2,7 +2,7 @@ PEP: 400
Title: Deprecate codecs.StreamReader and codecs.StreamWriter Title: Deprecate codecs.StreamReader and codecs.StreamWriter
Version: $Revision$ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Victor Stinner <victor.stinner@haypocalc.com> Author: Victor Stinner <victor.stinner@gmail.com>
Status: Draft Status: Draft
Type: Standards Track Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst


@ -1,13 +1,13 @@
PEP: 403 PEP: 403
Title: Prefix syntax for post function definition operations Title: Statement local functions and classes
Version: $Revision$ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan@gmail.com> Author: Nick Coghlan <ncoghlan@gmail.com>
Status: Withdrawn Status: Deferred
Type: Standards Track Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 2011-10-13 Created: 2011-10-13
Python-Version: 3.x Python-Version: 3.4
Post-History: 2011-10-13 Post-History: 2011-10-13
Resolution: TBD Resolution: TBD
@ -15,33 +15,23 @@ Resolution: TBD
Abstract Abstract
======== ========
This PEP proposes the addition of ``postdef`` as a new function prefix This PEP proposes the addition of a new ``in`` statement that accepts a
syntax (analogous to decorators) that permits the execution of a single simple statement local function or class definition.
statement (potentially including substatements separated by semi-colons) after
In addition, the new syntax would allow the 'def' keyword to be used to refer The statement accepts a single simple statement that can make a forward
to the function being defined without needing to repeat the name. reference to a trailing function or class definition.
When the 'postdef' prefix syntax is used, the associated statement would be This new statement is designed to be used whenever a "one-shot" function or
executed *in addition to* the normal local name binding implicit in function class is needed, and placing the function or class definition before the
definitions. Any name collisions are expected to be minor, analogous to those statement that uses it actually makes the code harder to read. It also
encountered with ``for`` loop iteration variables. avoids any name shadowing concerns by making sure the new name is visible
only to the statement in the ``in`` clause.
This PEP is based heavily on many of the ideas in PEP 3150 (Statement Local This PEP is based heavily on many of the ideas in PEP 3150 (Statement Local
Namespaces) so some elements of the rationale will be familiar to readers of Namespaces) so some elements of the rationale will be familiar to readers of
that PEP. That PEP has now been withdrawn in favour of this one. that PEP. That PEP has now been withdrawn in favour of this one.
PEP Withdrawal
==============
The python-ideas thread discussing this PEP [1]_ persuaded me that it was
essentially an unnecessarily cryptic, wholly inferior version of PEP 3150's
statement local namespaces. The discussion also resolved some of my concerns
with PEP 3150, so I am withdrawing this more limited version of the idea in
favour of resurrecting the original concept.
Basic Examples Basic Examples
============== ==============
@ -49,14 +39,14 @@ Before diving into the long history of this problem and the detailed
rationale for this specific proposed solution, here are a few simple rationale for this specific proposed solution, here are a few simple
examples of the kind of code it is designed to simplify. examples of the kind of code it is designed to simplify.
As a trivial example, weakref callbacks could be defined as follows:: As a trivial example, a weakref callback could be defined as follows::
postdef x = weakref.ref(target, def) in x = weakref.ref(target, report_destruction)
def report_destruction(obj): def report_destruction(obj):
print("{} is being destroyed".format(obj)) print("{} is being destroyed".format(obj))
This contrasts with the current repetitive "out of order" syntax for this This contrasts with the current (conceptually) "out of order" syntax for
operation:: this operation::
def report_destruction(obj): def report_destruction(obj):
print("{} is being destroyed".format(obj)) print("{} is being destroyed".format(obj))
@ -66,11 +56,19 @@ operation::
That structure is OK when you're using the callable multiple times, but That structure is OK when you're using the callable multiple times, but
it's irritating to be forced into it for one-off operations. it's irritating to be forced into it for one-off operations.
If the repetition of the name seems especially annoying, then a throwaway
name like ``f`` can be used instead::
    in x = weakref.ref(target, f)
    def f(obj):
        print("{} is being destroyed".format(obj))
Similarly, a sorted operation on a particularly poorly defined type could Similarly, a sorted operation on a particularly poorly defined type could
now be defined as:: now be defined as::
postdef sorted_list = sorted(original, key=def) in sorted_list = sorted(original, key=f)
def force_sort(item): def f(item):
try: try:
return item.calc_sort_order() return item.calc_sort_order()
except NotSortableError: except NotSortableError:
@ -88,32 +86,41 @@ Rather than::
And early binding semantics in a list comprehension could be attained via:: And early binding semantics in a list comprehension could be attained via::
postdef funcs = [def(i) for i in range(10)] in funcs = [adder(i) for i in range(10)]
def make_incrementor(i): def adder(i):
postdef return def return lambda x: x + i
def incrementor(x):
return x + i
Proposal Proposal
======== ========
This PEP proposes the addition of an optional block prefix clause to the This PEP proposes the addition of a new ``in`` statement that is a variant
syntax for function and class definitions. of the existing class and function definition syntax.
This block prefix would be introduced by a leading ``postdef`` and would be The new ``in`` clause replaces the decorator lines, and allows forward
allowed to contain any simple statement (including those that don't references to the trailing function or class definition.
make any sense in that context - while such code would be legal,
there wouldn't be any point in writing it). This permissive structure is
easier to define and easier to explain, but a more restrictive approach that
only permits operations that "make sense" would also be possible (see PEP
3150 for a list of possible candidates).
The function definition keyword ``def`` would be repurposed inside the block prefix The trailing function or class definition is always named - the name of
to refer to the function being defined. the trailing definition is then used to make the forward reference from the
preceding statement.
When a block prefix is provided, the standard local name binding implicit The ``in`` clause is allowed to contain any simple statement (including those
in the function definition still takes place. that don't make any sense in that context, such as ``pass`` - while such code
would be legal, there wouldn't be any point in writing it). This permissive
structure is easier to define and easier to explain, but a more restrictive
approach that only permits operations that "make sense" would also be
possible (see PEP 3150 for a list of possible candidates).
The ``in`` statement will not create a new scope - all name binding
operations aside from the trailing function or class definition will affect
the containing scope.
The name used in the trailing function or class definition is only visible
from the associated ``in`` clause, and behaves as if it was an ordinary
variable defined in that scope. If any nested scopes are created in either
the ``in`` clause or the trailing function or class definition, those scopes
will see the trailing function or class definition rather than any other
bindings for that name in the containing scope.
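For comparison, the closest spelling available in current Python defines the helper first and then unbinds its name by hand. The sketch below is only an approximation of the proposed semantics: ``Target``, ``track`` and ``log`` are hypothetical names introduced for the illustration, and the explicit ``del`` stands in for the rule that the trailing definition's name is never bound in the containing scope.

```python
import weakref


class Target:
    """Placeholder class; instances support weak references."""


def track(target, log):
    # Today's ordering: define the callback, use it, then unbind the name.
    def report_destruction(ref):
        # weakref callbacks receive the (now dead) weak reference object
        log.append("destroyed")

    x = weakref.ref(target, report_destruction)
    del report_destruction  # under the proposal this name would not leak here
    return x


log = []
target = Target()
x = track(target, log)
```

The weak reference keeps the callback alive, so removing the local name does not affect its behaviour.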
Background Background
@ -125,10 +132,11 @@ block functionality for me to finally understand why this bugs people
so much: Python's demand that the function be named and introduced so much: Python's demand that the function be named and introduced
before the operation that needs it breaks the developer's flow of thought. before the operation that needs it breaks the developer's flow of thought.
They get to a point where they go "I need a one-shot operation that does They get to a point where they go "I need a one-shot operation that does
<X>", and instead of being able to just *say* that, they instead have to back <X>", and instead of being able to just *say* that directly, they instead
up, name a function to do <X>, then call that function from the operation have to back up, name a function to do <X>, then call that function from
they actually wanted to do in the first place. Lambda expressions can help the operation they actually wanted to do in the first place. Lambda
sometimes, but they're no substitute for being able to use a full suite. expressions can help sometimes, but they're no substitute for being able to
use a full suite.
Ruby's block syntax also heavily inspired the style of the solution in this Ruby's block syntax also heavily inspired the style of the solution in this
PEP, by making it clear that even when limited to *one* anonymous function per PEP, by making it clear that even when limited to *one* anonymous function per
@ -144,13 +152,19 @@ the heavy lifting:
However, adopting Ruby's block syntax directly won't work for Python, since However, adopting Ruby's block syntax directly won't work for Python, since
the effectiveness of Ruby's blocks relies heavily on various conventions in the effectiveness of Ruby's blocks relies heavily on various conventions in
the way functions are *defined* (specifically, Ruby's ``yield`` syntax to the way functions are *defined* (specifically, using Ruby's ``yield`` syntax
call blocks directly and the ``&arg`` mechanism to accept a block as a to call blocks directly and the ``&arg`` mechanism to accept a block as a
function's final argument). function's final argument).
Since Python has relied on named functions for so long, the signatures of Since Python has relied on named functions for so long, the signatures of
APIs that accept callbacks are far more diverse, thus requiring a solution APIs that accept callbacks are far more diverse, thus requiring a solution
that allows anonymous functions to be slotted in at the appropriate location. that allows one-shot functions to be slotted in at the appropriate location.
The approach taken in this PEP is to retain the requirement to name the
function explicitly, but allow the relative order of the definition and the
statement that references it to be changed to match the developer's flow of
thought. The rationale is essentially the same as that used when introducing
decorators, but covering a broader set of applications.
Relation to PEP 3150 Relation to PEP 3150
@ -166,8 +180,9 @@ with something else (like assigning the result of the function to a value).
This PEP also achieves most of the other effects described in PEP 3150 This PEP also achieves most of the other effects described in PEP 3150
without introducing a new brainbending kind of scope. All of the complex without introducing a new brainbending kind of scope. All of the complex
scoping rules in PEP 3150 are replaced in this PEP with the simple ``def`` scoping rules in PEP 3150 are replaced in this PEP with allowing a forward
reference to the associated function definition. reference to the associated function or class definition without creating an
actual name binding in the current scope.
Keyword Choice Keyword Choice
@ -176,53 +191,49 @@ Keyword Choice
The proposal definitely requires *some* kind of prefix to avoid parsing The proposal definitely requires *some* kind of prefix to avoid parsing
ambiguity and backwards compatibility problems with existing constructs. ambiguity and backwards compatibility problems with existing constructs.
It also needs to be clearly highlighted to readers, since it declares that It also needs to be clearly highlighted to readers, since it declares that
the following piece of code is going to be executed out of order. the following piece of code is going to be executed only after the trailing
function or class definition has been executed.
The 'postdef' keyword was chosen as a literal explanation of exactly what The ``in`` keyword was chosen as an existing keyword that can be used to
the new clause does: execute the specified statement *after* the associated denote the concept of a forward reference.
function definition, even though it is physically written *before* the
definition in the source code. For functions, the construct is intended to be read as "in <this statement
that references NAME> define NAME as a function that does <operation>".
The mapping to English prose isn't as obvious for the class definition case,
but the concept remains the same.
Requirement to Name Functions Better Debugging Support for Functions and Classes with Short Names
============================= ===================================================================
One of the objections to widespread use of lambda expressions is that they One of the objections to widespread use of lambda expressions is that they
have an atrocious effect on traceback intelligibility and other aspects of have a negative effect on traceback intelligibility and other aspects of
introspection. Accordingly, this PEP requires that even throwaway functions introspection. Similar objections are raised regarding constructs that
be given some kind of name. promote short, cryptic function names (including this one, which requires
that the name of the trailing definition be supplied at least twice).
To help encourage the use of meaningful names without users having to repeat However, the introduction of qualified names in PEP 3155 means that even
themselves, the PEP suggests the provision of the ``def`` shorthand reference anonymous classes and functions will now have different representations if
to the current function from the ``postdef`` clause. they occur in different scopes. For example::
    >>> def f():
    ...     return lambda: y
    ...
    >>> f()
    <function f.<locals>.<lambda> at 0x7f6f46faeae0>
Anonymous functions (or functions that share a name) within the *same* scope
will still share representations (aside from the object ID), but this is
still a major improvement over the historical situation where everything
*except* the object ID was identical.
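Both halves of that claim can be checked directly on a Python version that implements PEP 3155 (3.3 or later). The function names below are placeholders chosen for the illustration.

```python
def f():
    # Two anonymous functions defined in the same scope
    return (lambda: None), (lambda: None)


def g():
    return lambda: None


first, second = f()
other = g()

print(first.__qualname__)   # f.<locals>.<lambda>
print(second.__qualname__)  # f.<locals>.<lambda>
print(other.__qualname__)   # g.<locals>.<lambda>
```

The two lambdas from ``f`` are distinct objects that share a qualified name, while the lambda from ``g`` is distinguishable by its enclosing scope.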
Syntax Change Syntax Change
============= =============
Current::
atom: ('(' [yield_expr|testlist_comp] ')' |
'[' [testlist_comp] ']' |
'{' [dictorsetmaker] '}' |
NAME | NUMBER | STRING+ | '...' | 'None' | 'True' | 'False')
Changed::
atom: ('(' [yield_expr|testlist_comp] ')' |
'[' [testlist_comp] ']' |
'{' [dictorsetmaker] '}' |
NAME | NUMBER | STRING+ | '...' | 'None' | 'True' | 'False' | 'def')
New:: New::
blockprefix: 'postdef' simple_stmt in_stmt: 'in' simple_stmt (classdef|funcdef)
block: blockprefix funcdef
The above is the general idea, but I suspect that the change to the 'atom'
definition may cause an ambiguity problem in the parser when it comes to
detecting function definitions. So the actual implementation may need to be
more complex than that.
Grammar: http://hg.python.org/cpython/file/default/Grammar/Grammar Grammar: http://hg.python.org/cpython/file/default/Grammar/Grammar
@ -230,14 +241,18 @@ Grammar: http://hg.python.org/cpython/file/default/Grammar/Grammar
Possible Implementation Strategy Possible Implementation Strategy
================================ ================================
This proposal has one titanic advantage over PEP 3150: implementation This proposal has at least one titanic advantage over PEP 3150:
should be relatively straightforward. implementation should be relatively straightforward.
The post definition statement can be incorporated into the AST for the The AST for the ``in`` statement will include both the function or class
function node and simply visited out of sequence. definition and the statement that references it, so it should just be a
matter of emitting the two operations out of order and using a hidden
variable to link up any references.
The one potentially tricky part is working out how to allow the dual The one potentially tricky part is changing the meaning of the references to
use of 'def' without rewriting half the grammar definition. the statement local function or namespace while within the scope of the
``in`` statement, but that shouldn't be too hard to address by maintaining
some additional state within the compiler.
More Examples More Examples
@ -253,7 +268,7 @@ Calculating attributes without polluting the local namespace (from os.py)::
del _createenviron del _createenviron
# Becomes: # Becomes:
postdef environ = def() in environ = _createenviron()
def _createenviron(): def _createenviron():
... # 27 line function ... # 27 line function
@ -263,17 +278,25 @@ Loop early binding::
funcs = [(lambda x, i=i: x + i) for i in range(10)] funcs = [(lambda x, i=i: x + i) for i in range(10)]
# Becomes: # Becomes:
postdef funcs = [def(i) for i in range(10)] in funcs = [adder(i) for i in range(10)]
def make_incrementor(i): def adder(i):
return lambda x: x + i return lambda x: x + i
# Or even: # Or even:
postdef funcs = [def(i) for i in range(10)] in funcs = [adder(i) for i in range(10)]
def make_incrementor(i): def adder(i):
postdef return def in return incr
def incrementor(x): def incr(x):
return x + i return x + i
A trailing class can be used as a statement local namespace::
    # Evaluate subexpressions only once
    in c = math.sqrt(x.a*x.a + x.b*x.b)
    class x:
        a = calculate_a()
        b = calculate_b()
Reference Implementation Reference Implementation
======================== ========================
@ -288,15 +311,36 @@ Huge thanks to Gary Bernhardt for being blunt in pointing out that I had no
idea what I was talking about in criticising Ruby's blocks, kicking off a idea what I was talking about in criticising Ruby's blocks, kicking off a
rather enlightening process of investigation. rather enlightening process of investigation.
Even though this PEP has been withdrawn, the process of writing and arguing
in its favour has been quite influential on the future direction of PEP 3150. Rejected Concepts
=================
A previous incarnation of this PEP (see [1]_) proposed a much uglier syntax
that (quite rightly) was not well received. The current proposal is
significantly easier both to read and write.
A more recent variant always used ``...`` for forward references, along
with genuinely anonymous function and class definitions. However, this
degenerated quickly into a mass of unintelligible dots in more complex
cases::
    in funcs = [...(i) for i in range(10)]
    def ...(i):
        in return ...
        def ...(x):
            return x + i

    in c = math.sqrt(....a*....a + ....b*....b)
    class ...:
        a = calculate_a()
        b = calculate_b()
References References
========== ==========
[1] Start of python-ideas thread: .. [1] Start of python-ideas thread:
http://mail.python.org/pipermail/python-ideas/2011-October/012276.html http://mail.python.org/pipermail/python-ideas/2011-October/012276.html
Copyright Copyright


@ -1,536 +1,175 @@
PEP: 404 PEP: 404
Title: Python Virtual Environments Title: Python 2.8 Un-release Schedule
Version: $Revision$ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Carl Meyer <carl@oddbird.net> Author: Barry Warsaw <barry@python.org>
Status: Draft Status: Final
Type: Standards Track Type: Informational
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 13-Jun-2011 Created: 2011-11-09
Python-Version: 3.3 Python-Version: 2.8
Post-History: 24-Oct-2011, 28-Oct-2011
Abstract Abstract
======== ========
This PEP proposes to add to Python a mechanism for lightweight This document describes the un-development and un-release schedule for Python
"virtual environments" with their own site directories, optionally 2.8.
isolated from system site directories. Each virtual environment has
its own Python binary (allowing creation of environments with various
Python versions) and can have its own independent set of installed
Python packages in its site directories, but shares the standard
library with the base installed Python.
Motivation
==========
The utility of Python virtual environments has already been well
established by the popularity of existing third-party
virtual-environment tools, primarily Ian Bicking's `virtualenv`_.
Virtual environments are already widely used for dependency management
and isolation, ease of installing and using Python packages without
system-administrator access, and automated testing of Python software
across multiple Python versions, among other uses.
Existing virtual environment tools suffer from lack of support from
the behavior of Python itself. Tools such as `rvirtualenv`_, which do
not copy the Python binary into the virtual environment, cannot
provide reliable isolation from system site directories. Virtualenv,
which does copy the Python binary, is forced to duplicate much of
Python's ``site`` module and manually symlink/copy an ever-changing
set of standard-library modules into the virtual environment in order
to perform a delicate boot-strapping dance at every startup.
(Virtualenv must copy the binary in order to provide isolation, as
Python dereferences a symlinked executable before searching for
``sys.prefix``.)
The ``PYTHONHOME`` environment variable, Python's only existing
built-in solution for virtual environments, requires
copying/symlinking the entire standard library into every environment.
Copying the whole standard library is not a lightweight solution, and
cross-platform support for symlinks remains inconsistent (even on
Windows platforms that do support them, creating them often requires
administrator privileges).
A virtual environment mechanism integrated with Python and drawing on
years of experience with existing third-party tools can be lower
maintenance, more reliable, and more easily available to all Python
users.
.. _virtualenv: http://www.virtualenv.org
.. _rvirtualenv: https://github.com/kvbik/rvirtualenv
Specification
=============
When the Python binary is executed, it attempts to determine its
prefix (which it stores in ``sys.prefix``), which is then used to find
the standard library and other key files, and by the ``site`` module
to determine the location of the site-package directories. Currently
the prefix is found (assuming ``PYTHONHOME`` is not set) by first
walking up the filesystem tree looking for a marker file (``os.py``)
that signifies the presence of the standard library, and if none is
found, falling back to the build-time prefix hardcoded in the binary.
This PEP proposes to add a new first step to this search. If a
``pyvenv.cfg`` file is found either adjacent to the Python executable,
or one directory above it, this file is scanned for lines of the form
``key = value``. If a ``home`` key is found, this signifies that the
Python binary belongs to a virtual environment, and the value of the
``home`` key is the directory containing the Python executable used to
create this virtual environment.
In this case, prefix-finding continues as normal using the value of
the ``home`` key as the effective Python binary location, which finds
the prefix of the base installation. ``sys.base_prefix`` is set to
this value, while ``sys.prefix`` is set to the directory containing
``pyvenv.cfg``.
(If ``pyvenv.cfg`` is not found or does not contain the ``home`` key,
prefix-finding continues normally, and ``sys.prefix`` will be equal to
``sys.base_prefix``.)
The ``site`` and ``sysconfig`` standard-library modules are modified
such that the standard library and header files are found relative
to ``sys.base_prefix``, while site-package directories ("purelib" and
"platlib", in ``sysconfig`` terms) are still found relative to
``sys.prefix``.
(Also, ``sys.base_exec_prefix`` is added, and handled similarly with
regard to ``sys.exec_prefix``.)
Thus, a Python virtual environment in its simplest form would consist
of nothing more than a copy or symlink of the Python binary
accompanied by a ``pyvenv.cfg`` file and a site-packages directory.
Isolation from system site-packages
-----------------------------------
By default, a virtual environment is entirely isolated from the
system-level site-packages directories.
If the ``pyvenv.cfg`` file also contains a key
``include-system-site-packages`` with a value of ``true`` (not case
sensitive), the ``site`` module will also add the system site
directories to ``sys.path`` after the virtual environment site
directories. Thus system-installed packages will still be importable,
but a package of the same name installed in the virtual environment
will take precedence.
:pep:`370` user-level site-packages are considered part of the system
site-packages for venv purposes: they are not available from an
isolated venv, but are available from an
``include-system-site-packages = true`` venv.
Creating virtual environments
-----------------------------
This PEP also proposes adding a new ``venv`` module to the standard
library which implements the creation of virtual environments. This
module can be executed using the ``-m`` flag::
python3 -m venv /path/to/new/virtual/environment
A ``pyvenv`` installed script is also provided to make this more
convenient::
pyvenv /path/to/new/virtual/environment
Running this command creates the target directory (creating any parent
directories that don't exist already) and places a ``pyvenv.cfg`` file
in it with a ``home`` key pointing to the Python installation the
command was run from. It also creates a ``bin/`` (or ``Scripts`` on
Windows) subdirectory containing a copy (or symlink) of the
``python3`` executable, and the ``pysetup3`` script from the
``packaging`` standard library module (to facilitate easy installation
of packages from PyPI into the new virtualenv). And it creates an
(initially empty) ``lib/pythonX.Y/site-packages`` (or
``Lib\site-packages`` on Windows) subdirectory.
If the target directory already exists an error will be raised, unless
the ``--clear`` option was provided, in which case the target
directory will be deleted and virtual environment creation will
proceed as usual.
The created ``pyvenv.cfg`` file also includes the
``include-system-site-packages`` key, set to ``true`` if ``venv`` is
run with the ``--system-site-packages`` option, ``false`` by default.
Multiple paths can be given to ``pyvenv``, in which case an identical
virtualenv will be created, according to the given options, at each
provided path.
The ``venv`` module also provides "shell activation scripts" for POSIX
and Windows systems which simply add the virtual environment's ``bin``
(or ``Scripts``) directory to the front of the user's shell PATH.
This is not strictly necessary for use of a virtual environment (as an
explicit path to the venv's python binary or scripts can just as well
be used), but it is convenient.
The ``venv`` module also adds a ``pysetup3`` script into each venv.
In order to allow ``pysetup`` and other Python package managers to
install packages into the virtual environment the same way they would
install into a normal Python installation, and avoid special-casing
virtual environments in ``sysconfig`` beyond using ``sys.base_prefix``
in place of ``sys.prefix``, the internal virtual environment layout
mimics the layout of the Python installation itself on each platform.
So a typical virtual environment layout on a POSIX system would be::
pyvenv.cfg
bin/python3
bin/python
bin/pysetup3
include/
lib/python3.3/site-packages/
While on a Windows system::
pyvenv.cfg
Scripts/python.exe
Scripts/python3.dll
Scripts/pysetup3.exe
Scripts/pysetup3-script.py
... other DLLs and pyds...
Include/
Lib/site-packages/
Third-party packages installed into the virtual environment will have
their Python modules placed in the ``site-packages`` directory, and
their executables placed in ``bin/`` or ``Scripts\``.
.. note::
On a normal Windows system-level installation, the Python binary
itself wouldn't go inside the "Scripts/" subdirectory, as it does
in the default venv layout. This is useful in a virtual
environment so that a user only has to add a single directory to
their shell PATH in order to effectively "activate" the virtual
environment.
.. note::
On Windows, it is necessary to also copy or symlink DLLs and pyd
files from compiled stdlib modules into the env, because if the
venv is created from a non-system-wide Python installation,
Windows won't be able to find the Python installation's copies of
those files when Python is run from the venv.
Copies versus symlinks
----------------------
The technique in this PEP works equally well in general with a copied
or symlinked Python binary (and other needed DLLs on Windows). Some
users prefer a copied binary (for greater isolation from system
changes) and some prefer a symlinked one (so that e.g. security
updates automatically propagate to virtual environments).
There are some cross-platform difficulties with symlinks:
* Not all Windows versions support symlinks, and even on those that
do, creating them often requires administrator privileges.
* On OSX framework builds of Python, sys.executable is just a stub
that executes the real Python binary. Symlinking this stub does not
work with the implementation in this PEP; it must be copied.
(Fortunately the stub is also small, so copying it is not an issue).
Because of these issues, this PEP proposes to copy the Python binary
by default, to maintain cross-platform consistency in the default
behavior.
The ``pyvenv`` script accepts a ``--symlink`` option. If this option
is provided, the script will attempt to symlink instead of copy. If a
symlink fails (e.g. because they are not supported by the platform, or
additional privileges are needed), the script will warn the user and
fall back to a copy.
On OSX framework builds, where a symlink of the executable would
succeed but create a non-functional virtual environment, the script
will fail with an error message that symlinking is not supported on
OSX framework builds.
API
---
The high-level method described above makes use of a simple API which
provides mechanisms for third-party virtual environment creators to
customize environment creation according to their needs.
The ``venv`` module contains an ``EnvBuilder`` class which accepts the
following keyword arguments on instantiation:
* ``system_site_packages`` - A Boolean value indicating that the
system Python site-packages should be available to the environment.
Defaults to ``False``.
* ``clear`` - A Boolean value which, if true, will delete any existing
target directory instead of raising an exception. Defaults to
``False``.
* ``symlinks`` - A Boolean value indicating whether to attempt to
symlink the Python binary (and any necessary DLLs or other binaries,
e.g. ``pythonw.exe``), rather than copying. Defaults to ``False``.
The instantiated env-builder has a ``create`` method, which takes as
required argument the path (absolute or relative to the current
directory) of the target directory which is to contain the virtual
environment. The ``create`` method either creates the environment in
the specified directory, or raises an appropriate exception.
The ``venv`` module also provides a module-level ``create`` function
as a convenience::
    def create(env_dir,
               system_site_packages=False, clear=False, symlinks=False):
        # Pass the options through under the keyword names that
        # EnvBuilder accepts (see the list above).
        builder = EnvBuilder(
            system_site_packages=system_site_packages,
            clear=clear,
            symlinks=symlinks)
        builder.create(env_dir)
Creators of third-party virtual environment tools are free to use the
provided ``EnvBuilder`` class as a base class.
The ``create`` method of the ``EnvBuilder`` class illustrates the
hooks available for customization::
    def create(self, env_dir):
        """
        Create a virtualized Python environment in a directory.
        :param env_dir: The target directory to create an environment in.
        """
        env_dir = os.path.abspath(env_dir)
        context = self.create_directories(env_dir)
        self.create_configuration(context)
        self.setup_python(context)
        self.post_setup(context)
Each of the methods ``create_directories``, ``create_configuration``,
``setup_python``, and ``post_setup`` can be overridden. The functions
of these methods are:
* ``create_directories`` - creates the environment directory and all
necessary directories, and returns a context object. This is just a
holder for attributes (such as paths), for use by the other methods.
* ``create_configuration`` - creates the ``pyvenv.cfg`` configuration
file in the environment.
* ``setup_python`` - creates a copy of the Python executable (and,
under Windows, DLLs) in the environment.
* ``post_setup`` - A (no-op by default) hook method which can be
overridden in third party subclasses to pre-install packages or
install scripts in the virtual environment.
In addition, ``EnvBuilder`` provides a utility method that can be
called from ``post_setup`` in subclasses to assist in installing
custom scripts into the virtual environment. The method
``install_scripts`` accepts as arguments the ``context`` object (see
above) and a path to a directory. The directory should contain
subdirectories "common", "posix", "nt", each containing scripts
destined for the bin directory in the environment. The contents of
"common" and the directory corresponding to ``os.name`` are copied
after doing some text replacement of placeholders:
* ``__VENV_DIR__`` is replaced with absolute path of the environment
directory.
* ``__VENV_NAME__`` is replaced with the environment name (final path
segment of environment directory).
* ``__VENV_BIN_NAME__`` is replaced with the name of the bin directory
(either ``bin`` or ``Scripts``).
* ``__VENV_PYTHON__`` is replaced with the absolute path of the
environment's executable.
The ``DistributeEnvBuilder`` subclass in the reference implementation
illustrates how the customization hook can be used in practice to
pre-install Distribute into the virtual environment. It's not
envisaged that ``DistributeEnvBuilder`` will be actually added to
Python core, but it makes the reference implementation more
immediately useful for testing and exploratory purposes.
Backwards Compatibility
=======================
Splitting the meanings of ``sys.prefix``
----------------------------------------
Any virtual environment tool along these lines (which attempts to
isolate site-packages, while still making use of the base Python's
standard library with no need for it to be symlinked into the virtual
environment) is proposing a split between two different meanings
(among others) that are currently both wrapped up in ``sys.prefix``:
the answers to the questions "Where is the standard library?" and
"Where is the site-packages location where third-party modules should
be installed?"
This split could be handled by introducing a new ``sys`` attribute for
either the former prefix or the latter prefix. Either option
potentially introduces some backwards-incompatibility with software
written to assume the other meaning for ``sys.prefix``. (Such
software should preferably be using the APIs in the ``site`` and
``sysconfig`` modules to answer these questions rather than using
``sys.prefix`` directly, in which case there is no
backwards-compatibility issue, but in practice ``sys.prefix`` is
sometimes used.)
The `documentation`__ for ``sys.prefix`` describes it as "A string
giving the site-specific directory prefix where the platform
independent Python files are installed," and specifically mentions the
standard library and header files as found under ``sys.prefix``. It
does not mention ``site-packages``.
__ http://docs.python.org/dev/library/sys.html#sys.prefix
Maintaining this documented definition would mean leaving
``sys.prefix`` pointing to the base system installation (which is
where the standard library and header files are found), and
introducing a new value in ``sys`` (something like
``sys.site_prefix``) to point to the prefix for ``site-packages``.
This would maintain the documented semantics of ``sys.prefix``, but
risk breaking isolation if third-party code uses ``sys.prefix`` rather
than ``sys.site_prefix`` or the appropriate ``site`` API to find
site-packages directories.
The most notable case is probably `setuptools`_ and its fork
`distribute`_, which mostly use ``distutils``/``sysconfig`` APIs, but
do use ``sys.prefix`` directly to build up a list of site directories
for pre-flight checking where ``pth`` files can usefully be placed.
Otherwise, a `Google Code Search`_ turns up what appears to be a
roughly even mix of usage between packages using ``sys.prefix`` to
build up a site-packages path and packages using it to e.g. eliminate
the standard-library from code-execution tracing.
Although it requires modifying the documented definition of
``sys.prefix``, this PEP prefers to have ``sys.prefix`` point to the
virtual environment (where ``site-packages`` is found), and introduce
``sys.base_prefix`` to point to the standard library and Python header
files. Rationale for this choice:
* It is preferable to err on the side of greater isolation of the
virtual environment.
* Virtualenv already modifies ``sys.prefix`` to point at the virtual
environment, and in practice this has not been a problem.
* No modification is required to setuptools/distribute.
.. _setuptools: http://peak.telecommunity.com/DevCenter/setuptools
.. _distribute: http://packages.python.org/distribute/
.. _Google Code Search: http://www.google.com/codesearch#search/&q=sys\.prefix&p=1&type=cs
Open Questions
==============
What about include files?
-------------------------
For example, ZeroMQ installs ``zmq.h`` and ``zmq_utils.h`` in
``$VE/include``, whereas SIP (part of PyQt4) installs sip.h by default
in ``$VE/include/pythonX.Y``. With virtualenv, everything works
because the PythonX.Y include is symlinked, so everything that's
needed is in ``$VE/include``. At the moment the reference
implementation doesn't do anything with include files, besides
creating the include directory; this might need to change, to
copy/symlink ``$VE/include/pythonX.Y``.
Since Python has no abstraction for a site-specific include
directory, other than for platform-specific files, the user
expectation would seem to be that all include files anyone could ever
want should be found in one of just two locations, with sysconfig
labels "include" and "platinclude".
There's another issue: what if includes are Python-version-specific?
For example, SIP installs by default into ``$VE/include/pythonX.Y``
rather than ``$VE/include``, presumably because there's
version-specific stuff in there - but even if that's not the case with
SIP, it could be the case with some other package. And the problem
that gives is that you can't just symlink the ``include/pythonX.Y``
directory, but actually have to provide a writable directory and
symlink/copy the contents from the system ``include/pythonX.Y``. Of
course this is not hard to do, but it does seem inelegant. OTOH it's
really because there's no supporting concept in ``Python/sysconfig``.
Testability and Source Build Issues
-----------------------------------
Currently in the reference implementation, virtual environments must
be created with an installed Python, rather than a source build, as
the base installation. In order to be able to fully test the ``venv``
module in the Python regression test suite, some anomalies in how
sysconfig data is configured in source builds will need to be removed.
For example, ``sysconfig.get_paths()`` in a source build gives
(partial output)::
{
'include': '/home/vinay/tools/pythonv/Include',
'libdir': '/usr/lib ; or /usr/lib64 on a multilib system',
'platinclude': '/home/vinay/tools/pythonv',
'platlib': '/usr/local/lib/python3.3/site-packages',
'platstdlib': '/usr/local/lib/python3.3',
'purelib': '/usr/local/lib/python3.3/site-packages',
'stdlib': '/usr/local/lib/python3.3'
}
Need for ``install_name_tool`` on OSX?
--------------------------------------
`Virtualenv uses`_ ``install_name_tool``, a tool provided in the Xcode
developer tools, to modify the copied executable on OSX. We need
input from OSX developers on whether this is actually necessary in
this PEP's implementation of virtual environments, and if so, if there
is an alternative to ``install_name_tool`` that would allow ``venv``
to not require that Xcode is installed.
.. _Virtualenv uses: https://github.com/pypa/virtualenv/issues/168
Provide a mode that is isolated only from user site packages?
-------------------------------------------------------------
Is there sufficient rationale for providing a mode that isolates the
venv from :pep:`370` user site packages, but not from the system-level
site-packages?
Other Python implementations?
-----------------------------
We should get feedback from Jython, IronPython, and PyPy about whether
there's anything in this PEP that they foresee as a difficulty for
their implementation.
Reference Implementation
========================
The in-progress reference implementation is found in `a clone of the
CPython Mercurial repository`_. To test it, build and install it (the
virtual environment tool currently does not run from a source tree).
From the installed Python, run ``bin/pyvenv /path/to/new/virtualenv``
to create a virtual environment.
The reference implementation (like this PEP!) is a work in progress.
.. _a clone of the CPython Mercurial repository: https://bitbucket.org/vinay.sajip/pythonv

Un-release Manager and Crew
===========================
============================ ==================
Position                     Name
============================ ==================
2.8 Un-release Manager       Cardinal Biggles
============================ ==================
Un-release Schedule
===================
The current un-schedule is:
- 2.8 final Never
Official pronouncement
======================
Rule number six: there is *no* official Python 2.8 release. There never will
be an official Python 2.8 release. It is an ex-release. Python 2.7
is the end of the Python 2 line of development.
Upgrade path
============
The official upgrade path from Python 2.7 is to Python 3.
And Now For Something Completely Different
==========================================
In all seriousness, there are important reasons why there won't be an
official Python 2.8 release, and why you should plan to migrate
instead to Python 3.
Python is (as of this writing) more than 20 years old, and Guido and the
community have learned a lot in those intervening years. Guido's
original concept for Python 3 was to make changes to the language
primarily to remove the warts that had grown in the preceding
versions. Python 3 was not to be a complete redesign, but instead an
evolution of the language, and while maintaining full backward
compatibility with Python 2 was explicitly off-the-table, neither were
gratuitous changes in syntax or semantics acceptable. In most cases,
Python 2 code can be translated fairly easily to Python 3, sometimes
entirely mechanically by such tools as `2to3`_ (there's also a non-trivial
subset of the language that will run without modification on both 2.7 and
3.x).
Because maintaining multiple versions of Python is a significant drag
on the resources of the Python developers, and because the
improvements to the language and libraries embodied in Python 3 are so
important, it was decided to end the Python 2 lineage with Python
2.7. Thus, all new development occurs in the Python 3 line of
development, and there will never be an official Python 2.8 release.
Python 2.7 will however be maintained for longer than the usual period
of time.
Here are some highlights of the significant improvements in Python 3.
You can read in more detail on the differences_ between Python 2 and
Python 3. There are also many good guides on porting_ from Python 2
to Python 3.
Strings and bytes
-----------------
Python 2's basic original strings are called 8-bit strings, and
they play a dual role in Python 2 as both ASCII text and as byte
sequences. While Python 2 also has a unicode string type, the
fundamental ambiguity of the core string type, coupled with Python 2's
default behavior of supporting automatic coercion from 8-bit strings
to unicode objects when the two are combined, often leads to
``UnicodeError``\ s. Python 3's standard string type is Unicode based, and
Python 3 adds a dedicated bytes type, but critically, no automatic coercion
between bytes and unicode strings is provided. The closest the language gets
to implicit coercion are a few text-based APIs that assume a default
encoding (usually UTF-8) if no encoding is explicitly stated. Thus, the core
interpreter, its I/O libraries, module names, etc. are clear in their
distinction between unicode strings and bytes. Python 3's unicode
support even extends to the filesystem, so that non-ASCII file names are
natively supported.
This string/bytes clarity is often a source of difficulty in
transitioning existing code to Python 3, because many third party
libraries and applications are themselves ambiguous in this
distinction. Once migrated though, most ``UnicodeError``\ s can be
eliminated.
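
A minimal Python 3 sketch of the separation described above. The sample
text and variable names are illustrative only:

```python
# Python 3: text (str) and binary data (bytes) are distinct types.
text = "caf\u00e9"              # a Unicode string
data = text.encode("utf-8")     # explicit encoding yields bytes

# Mixing the two is not coerced implicitly; it raises TypeError.
try:
    combined = text + data
except TypeError:
    combined = None

# Conversion has to be spelled out with an explicit codec.
round_tripped = data.decode("utf-8")
```

Code ported from Python 2 typically needs such explicit ``encode`` and
``decode`` calls at the points where text meets binary I/O.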
Numbers
-------
Python 2 has two basic integer types, a native machine-sized ``int``
type, and an arbitrary length ``long`` type. These have been merged in
Python 3 into a single ``int`` type analogous to Python 2's ``long``
type.
In addition, the ``/`` operator applied to two integers now performs true
division and always returns a floating point number; floor division remains
available through the ``//`` operator.
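
A short Python 3 sketch of the integer behaviour described in this
section (the variable names are illustrative only):

```python
# Python 3 division semantics for integers.
true_quotient = 7 / 2      # true division yields a float
exact_quotient = 4 / 2     # still a float, even when the result is whole
floor_quotient = 7 // 2    # floor division keeps the int type

# There is a single arbitrary-precision int type; no separate long.
big = 2 ** 100
```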
Classes
-------
Python 2 has two core class hierarchies, often called *classic
classes* and *new-style classes*. The latter allow for such things as
inheriting from the builtin basic types, support descriptor based tools
like the ``property`` builtin and provide a generally more sane and coherent
system for dealing with multiple inheritance. Python 3 provided the
opportunity to completely drop support for classic classes, so all classes
in Python 3 automatically use the new-style semantics (although that's a
misnomer now). There is no need to explicitly inherit from ``object`` or set
the default metatype to enable them (in fact, setting a default metatype at
the module level is no longer supported - the default metatype is always
``type``).
The mechanism for explicitly specifying a metaclass has also changed to use
a ``metaclass`` keyword argument in the class header line rather than a
``__metaclass__`` magic attribute in the class body.
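
The Python 3 spelling can be sketched as follows. ``Registry``,
``Plugin`` and ``Plain`` are hypothetical names used only for
illustration:

```python
class Registry(type):
    """Hypothetical metaclass that records every class it creates."""
    created = []

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        mcls.created.append(name)
        return cls


# Python 3 spelling: the metaclass is a keyword argument in the class header.
class Plugin(metaclass=Registry):
    pass


# A class without an explicit base is still a new-style class.
class Plain:
    pass
```

In Python 2 the same intent would have been written as a
``__metaclass__ = Registry`` assignment inside the body of ``Plugin``.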
Multiple spellings
------------------
There are many cases in Python 2 where multiple spellings of some
constructs exist, such as ``repr()`` and *backticks*, or the two
inequality operators ``!=`` and ``<>``. In all cases, Python 3 has chosen
exactly one spelling and removed the other (e.g. ``repr()`` and ``!=``
were kept).
Imports
-------
In Python 3, implicit relative imports within packages are no longer
available - only absolute imports and explicit relative imports are
supported. In addition, star imports (e.g. ``from x import *``) are only
permitted in module level code.
Also, some areas of the standard library have been reorganized to make
the naming scheme more intuitive. Some rarely used builtins have been
relocated to standard library modules.
Iterators and views
-------------------
Many APIs, which in Python 2 returned concrete lists, in Python 3 now
return iterators or lightweight *views*.
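
A brief Python 3 sketch of what this means in practice (the
``inventory`` data is made up for illustration):

```python
inventory = {"spam": 3, "eggs": 12}

# dict.keys() returns a view that tracks later changes to the dict.
names = inventory.keys()
inventory["ham"] = 1
names_after_insert = sorted(names)

# map() produces a lazy iterator instead of a list.
squares = map(lambda n: n * n, range(4))
is_list = isinstance(squares, list)

# Wrap in list() when a concrete list is actually needed.
square_list = list(squares)
```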
Copyright Copyright
@ -539,13 +178,17 @@ Copyright
This document has been placed in the public domain. This document has been placed in the public domain.
.. _`2to3`: http://docs.python.org/library/2to3.html
.. _differences: http://docs.python.org/release/3.0.1/whatsnew/3.0.html
.. _porting: http://python3porting.com/
.. ..
Local Variables: Local Variables:
mode: indented-text mode: indented-text
indent-tabs-mode: nil indent-tabs-mode: nil
sentence-end-double-space: t sentence-end-double-space: t
fill-column: 70 fill-column: 70
coding: utf-8 coding: utf-8
End: End:

551
pep-0405.txt Normal file

@ -0,0 +1,551 @@
PEP: 405
Title: Python Virtual Environments
Version: $Revision$
Last-Modified: $Date$
Author: Carl Meyer <carl@oddbird.net>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 13-Jun-2011
Python-Version: 3.3
Post-History: 24-Oct-2011, 28-Oct-2011
Abstract
========
This PEP proposes to add to Python a mechanism for lightweight
"virtual environments" with their own site directories, optionally
isolated from system site directories. Each virtual environment has
its own Python binary (allowing creation of environments with various
Python versions) and can have its own independent set of installed
Python packages in its site directories, but shares the standard
library with the base installed Python.
Motivation
==========
The utility of Python virtual environments has already been well
established by the popularity of existing third-party
virtual-environment tools, primarily Ian Bicking's `virtualenv`_.
Virtual environments are already widely used for dependency management
and isolation, ease of installing and using Python packages without
system-administrator access, and automated testing of Python software
across multiple Python versions, among other uses.
Existing virtual environment tools suffer from lack of support from
the behavior of Python itself. Tools such as `rvirtualenv`_, which do
not copy the Python binary into the virtual environment, cannot
provide reliable isolation from system site directories. Virtualenv,
which does copy the Python binary, is forced to duplicate much of
Python's ``site`` module and manually symlink/copy an ever-changing
set of standard-library modules into the virtual environment in order
to perform a delicate boot-strapping dance at every startup.
(Virtualenv must copy the binary in order to provide isolation, as
Python dereferences a symlinked executable before searching for
``sys.prefix``.)
The ``PYTHONHOME`` environment variable, Python's only existing
built-in solution for virtual environments, requires
copying/symlinking the entire standard library into every environment.
Copying the whole standard library is not a lightweight solution, and
cross-platform support for symlinks remains inconsistent (even on
Windows platforms that do support them, creating them often requires
administrator privileges).
A virtual environment mechanism integrated with Python and drawing on
years of experience with existing third-party tools can be lower
maintenance, more reliable, and more easily available to all Python
users.
.. _virtualenv: http://www.virtualenv.org
.. _rvirtualenv: https://github.com/kvbik/rvirtualenv
Specification
=============
When the Python binary is executed, it attempts to determine its
prefix (which it stores in ``sys.prefix``), which is then used to find
the standard library and other key files, and by the ``site`` module
to determine the location of the site-package directories. Currently
the prefix is found (assuming ``PYTHONHOME`` is not set) by first
walking up the filesystem tree looking for a marker file (``os.py``)
that signifies the presence of the standard library, and if none is
found, falling back to the build-time prefix hardcoded in the binary.
This PEP proposes to add a new first step to this search. If a
``pyvenv.cfg`` file is found either adjacent to the Python executable,
or one directory above it, this file is scanned for lines of the form
``key = value``. If a ``home`` key is found, this signifies that the
Python binary belongs to a virtual environment, and the value of the
``home`` key is the directory containing the Python executable used to
create this virtual environment.
In this case, prefix-finding continues as normal using the value of
the ``home`` key as the effective Python binary location, which finds
the prefix of the base installation. ``sys.base_prefix`` is set to
this value, while ``sys.prefix`` is set to the directory containing
``pyvenv.cfg``.
(If ``pyvenv.cfg`` is not found or does not contain the ``home`` key,
prefix-finding continues normally, and ``sys.prefix`` will be equal to
``sys.base_prefix``.)
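
The lookup described above can be sketched in Python. This is only an
illustration of the proposed search, not the interpreter's startup code:
the helper names ``read_pyvenv_cfg`` and ``find_venv_home`` are
hypothetical, and so are the choices to skip lines without ``=`` and to
let the first ``pyvenv.cfg`` found decide the outcome:

```python
import os


def read_pyvenv_cfg(path):
    """Parse ``key = value`` lines from a pyvenv.cfg file into a dict."""
    config = {}
    with open(path, encoding="utf-8") as cfg:
        for line in cfg:
            key, sep, value = line.partition("=")
            if sep:  # assumption: ignore lines not of the form key = value
                config[key.strip()] = value.strip()
    return config


def find_venv_home(executable):
    """Return (venv_dir, home) if the executable belongs to a venv, else None.

    Looks for pyvenv.cfg next to the executable, then one directory above.
    """
    exe_dir = os.path.dirname(os.path.abspath(executable))
    for candidate_dir in (exe_dir, os.path.dirname(exe_dir)):
        cfg_path = os.path.join(candidate_dir, "pyvenv.cfg")
        if os.path.isfile(cfg_path):
            home = read_pyvenv_cfg(cfg_path).get("home")
            # Assumption: the first pyvenv.cfg found decides the outcome.
            return (candidate_dir, home) if home is not None else None
    return None
```

Under this sketch, ``venv_dir`` corresponds to the value that would
become ``sys.prefix``, while ``home`` is the starting point for finding
``sys.base_prefix``.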
The ``site`` and ``sysconfig`` standard-library modules are modified
such that the standard library and header files are found relative
to ``sys.base_prefix``, while site-package directories ("purelib" and
"platlib", in ``sysconfig`` terms) are still found relative to
``sys.prefix``.
(Also, ``sys.base_exec_prefix`` is added, and handled similarly with
regard to ``sys.exec_prefix``.)
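
One consequence of this split is that code can tell whether it is running inside a virtual environment by comparing the two prefixes. A minimal sketch, in which the function name ``in_virtual_environment`` is hypothetical:

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a venv."""
    # getattr() keeps this working on interpreters that predate
    # sys.base_prefix, where the two prefixes are never split.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return sys.prefix != base_prefix


print(in_virtual_environment())
```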
Thus, a Python virtual environment in its simplest form would consist
of nothing more than a copy or symlink of the Python binary
accompanied by a ``pyvenv.cfg`` file and a site-packages directory.
Isolation from system site-packages
-----------------------------------
By default, a virtual environment is entirely isolated from the
system-level site-packages directories.
If the ``pyvenv.cfg`` file also contains a key
``include-system-site-packages`` with a value of ``true`` (not case
sensitive), the ``site`` module will also add the system site
directories to ``sys.path`` after the virtual environment site
directories. Thus system-installed packages will still be importable,
but a package of the same name installed in the virtual environment
will take precedence.
:pep:`370` user-level site-packages are considered part of the system
site-packages for venv purposes: they are not available from an
isolated venv, but are available from an
``include-system-site-packages = true`` venv.
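
The ordering rule can be sketched as follows. This is not the ``site`` module's code; ``site_directories`` is a hypothetical helper that takes the already-parsed ``pyvenv.cfg`` contents as a dict.

```python
def site_directories(venv_site_dirs, system_site_dirs, config):
    """Return the site directories to add to sys.path, in order."""
    flag = config.get("include-system-site-packages", "false")
    dirs = list(venv_site_dirs)
    if flag.strip().lower() == "true":
        # System (and PEP 370 user) directories come after the venv's
        # own, so a package installed in the venv shadows a system copy.
        dirs.extend(system_site_dirs)
    return dirs
```

Because the venv directories come first, an import resolves to the venv's copy of a package whenever both locations provide one.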
Creating virtual environments
-----------------------------
This PEP also proposes adding a new ``venv`` module to the standard
library which implements the creation of virtual environments. This
module can be executed using the ``-m`` flag::
python3 -m venv /path/to/new/virtual/environment
A ``pyvenv`` installed script is also provided to make this more
convenient::
pyvenv /path/to/new/virtual/environment
Running this command creates the target directory (creating any parent
directories that don't exist already) and places a ``pyvenv.cfg`` file
in it with a ``home`` key pointing to the Python installation the
command was run from. It also creates a ``bin/`` (or ``Scripts`` on
Windows) subdirectory containing a copy (or symlink) of the
``python3`` executable, and the ``pysetup3`` script from the
``packaging`` standard library module (to facilitate easy installation
of packages from PyPI into the new virtualenv). And it creates an
(initially empty) ``lib/pythonX.Y/site-packages`` (or
``Lib\site-packages`` on Windows) subdirectory.
If the target directory already exists, an error will be raised, unless
the ``--clear`` option was provided, in which case the target
directory will be deleted and virtual environment creation will
proceed as usual.
The created ``pyvenv.cfg`` file also includes the
``include-system-site-packages`` key, set to ``true`` if ``venv`` is
run with the ``--system-site-packages`` option, ``false`` by default.
Multiple paths can be given to ``pyvenv``, in which case an identical
virtualenv will be created, according to the given options, at each
provided path.
The ``venv`` module also provides "shell activation scripts" for POSIX
and Windows systems which simply add the virtual environment's ``bin``
(or ``Scripts``) directory to the front of the user's shell PATH.
This is not strictly necessary for use of a virtual environment (as an
explicit path to the venv's python binary or scripts can just as well
be used), but it is convenient.
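
What activation does to ``PATH`` can be shown with a small Python sketch. The real activation scripts are shell scripts; ``activated_path`` is a hypothetical helper used only to make the string manipulation concrete.

```python
import ntpath
import posixpath


def activated_path(venv_dir, current_path, windows=False):
    """Return PATH with the venv's script directory placed at the front."""
    pathmod = ntpath if windows else posixpath
    bin_dir = pathmod.join(venv_dir, "Scripts" if windows else "bin")
    if not current_path:
        return bin_dir
    return bin_dir + pathmod.pathsep + current_path


print(activated_path("/envs/demo", "/usr/bin:/bin"))
# /envs/demo/bin:/usr/bin:/bin
```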
The ``venv`` module also adds a ``pysetup3`` script into each venv.
In order to allow ``pysetup`` and other Python package managers to
install packages into the virtual environment the same way they would
install into a normal Python installation, and avoid special-casing
virtual environments in ``sysconfig`` beyond using ``sys.base_prefix``
in place of ``sys.prefix``, the internal virtual environment layout
mimics the layout of the Python installation itself on each platform.
So a typical virtual environment layout on a POSIX system would be::
pyvenv.cfg
bin/python3
bin/python
bin/pysetup3
include/
lib/python3.3/site-packages/
While on a Windows system::
pyvenv.cfg
Scripts/python.exe
Scripts/python3.dll
Scripts/pysetup3.exe
Scripts/pysetup3-script.py
... other DLLs and pyds...
Include/
Lib/site-packages/
Third-party packages installed into the virtual environment will have
their Python modules placed in the ``site-packages`` directory, and
their executables placed in ``bin/`` or ``Scripts\``.
.. note::
On a normal Windows system-level installation, the Python binary
itself wouldn't go inside the "Scripts/" subdirectory, as it does
in the default venv layout. This is useful in a virtual
environment so that a user only has to add a single directory to
their shell PATH in order to effectively "activate" the virtual
environment.
.. note::
On Windows, it is necessary to also copy or symlink DLLs and pyd
files from compiled stdlib modules into the env, because if the
venv is created from a non-system-wide Python installation,
Windows won't be able to find the Python installation's copies of
those files when Python is run from the venv.
Copies versus symlinks
----------------------
The technique in this PEP works equally well in general with a copied
or symlinked Python binary (and other needed DLLs on Windows). Some
users prefer a copied binary (for greater isolation from system
changes) and some prefer a symlinked one (so that e.g. security
updates automatically propagate to virtual environments).
There are some cross-platform difficulties with symlinks:
* Not all Windows versions support symlinks, and even on those that
do, creating them often requires administrator privileges.
* On OSX framework builds of Python, sys.executable is just a stub
that executes the real Python binary. Symlinking this stub does not
work with the implementation in this PEP; it must be copied.
(Fortunately the stub is also small, so copying it is not an issue).
Because of these issues, this PEP proposes to copy the Python binary
by default, to maintain cross-platform consistency in the default
behavior.
The ``pyvenv`` script accepts a ``--symlink`` option. If this option
is provided, the script will attempt to symlink instead of copy. If a
symlink fails (e.g. because they are not supported by the platform, or
additional privileges are needed), the script will warn the user and
fall back to a copy.
On OSX framework builds, where a symlink of the executable would
succeed but create a non-functional virtual environment, the script
will fail with an error message that symlinking is not supported on
OSX framework builds.
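
The warn-and-fall-back behaviour of ``--symlink`` could look roughly like the sketch below. The name ``link_or_copy`` is hypothetical, and the OSX framework-build check is left out because this section does not say how such a build is detected.

```python
import os
import shutil
import warnings


def link_or_copy(src, dst, symlinks=False):
    """Place ``src`` at ``dst``; return "symlink" or "copy" for what was done."""
    if symlinks:
        try:
            os.symlink(src, dst)
            return "symlink"
        except (OSError, NotImplementedError):
            # Unsupported platform or missing privileges: warn, then copy.
            warnings.warn("could not symlink %r; copying instead" % (src,))
    # copy2 preserves permission bits, so an executable stays executable.
    shutil.copy2(src, dst)
    return "copy"
```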
API
---
The high-level method described above makes use of a simple API which
provides mechanisms for third-party virtual environment creators to
customize environment creation according to their needs.
The ``venv`` module contains an ``EnvBuilder`` class which accepts the
following keyword arguments on instantiation:
* ``system_site_packages`` - A Boolean value indicating that the
system Python site-packages should be available to the environment.
Defaults to ``False``.
* ``clear`` - A Boolean value which, if true, will delete any existing
target directory instead of raising an exception. Defaults to
``False``.
* ``symlinks`` - A Boolean value indicating whether to attempt to
symlink the Python binary (and any necessary DLLs or other binaries,
e.g. ``pythonw.exe``), rather than copying. Defaults to ``False``.
The instantiated env-builder has a ``create`` method, which takes as
required argument the path (absolute or relative to the current
directory) of the target directory which is to contain the virtual
environment. The ``create`` method either creates the environment in
the specified directory, or raises an appropriate exception.
The ``venv`` module also provides a module-level ``create`` function
as a convenience::
    def create(env_dir,
               system_site_packages=False, clear=False, use_symlinks=False):
        builder = EnvBuilder(
            system_site_packages=system_site_packages,
            clear=clear,
            # EnvBuilder's keyword argument is named ``symlinks``
            symlinks=use_symlinks)
        builder.create(env_dir)
Creators of third-party virtual environment tools are free to use the
provided ``EnvBuilder`` class as a base class.
The ``create`` method of the ``EnvBuilder`` class illustrates the
hooks available for customization::
    def create(self, env_dir):
        """
        Create a virtualized Python environment in a directory.

        :param env_dir: The target directory to create an environment in.
        """
        env_dir = os.path.abspath(env_dir)
        context = self.create_directories(env_dir)
        self.create_configuration(context)
        self.setup_python(context)
        self.post_setup(context)
Each of the methods ``create_directories``, ``create_configuration``,
``setup_python``, and ``post_setup`` can be overridden. The functions
of these methods are:
* ``create_directories`` - creates the environment directory and all
necessary directories, and returns a context object. This is just a
holder for attributes (such as paths), for use by the other methods.
* ``create_configuration`` - creates the ``pyvenv.cfg`` configuration
file in the environment.
* ``setup_python`` - creates a copy of the Python executable (and,
under Windows, DLLs) in the environment.
* ``post_setup`` - A (no-op by default) hook method which can be
overridden in third party subclasses to pre-install packages or
install scripts in the virtual environment.
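
To make the hook sequence concrete, here is a toy stand-in for ``EnvBuilder`` with a subclass that overrides ``post_setup``. Both classes are hypothetical and written only for illustration: the stand-in follows the method names used in this PEP, creates just the directories and ``pyvenv.cfg``, and skips copying the interpreter.

```python
import os
import sys
import types


class SketchEnvBuilder:
    """Toy stand-in for EnvBuilder showing only the order of the hooks."""

    def __init__(self, system_site_packages=False):
        self.system_site_packages = system_site_packages

    def create(self, env_dir):
        env_dir = os.path.abspath(env_dir)
        context = self.create_directories(env_dir)
        self.create_configuration(context)
        self.setup_python(context)
        self.post_setup(context)

    def create_directories(self, env_dir):
        bin_name = "Scripts" if os.name == "nt" else "bin"
        bin_path = os.path.join(env_dir, bin_name)
        os.makedirs(bin_path, exist_ok=True)
        # The context is just a holder for attributes used by later hooks.
        return types.SimpleNamespace(env_dir=env_dir, bin_path=bin_path)

    def create_configuration(self, context):
        flag = "true" if self.system_site_packages else "false"
        with open(os.path.join(context.env_dir, "pyvenv.cfg"), "w") as f:
            f.write("home = %s\n" % os.path.dirname(sys.executable))
            f.write("include-system-site-packages = %s\n" % flag)

    def setup_python(self, context):
        pass  # copying the interpreter is omitted from this sketch

    def post_setup(self, context):
        pass  # no-op by default


class NotesEnvBuilder(SketchEnvBuilder):
    """Third-party style subclass that adds a file after setup."""

    def post_setup(self, context):
        with open(os.path.join(context.env_dir, "NOTES.txt"), "w") as f:
            f.write("created by NotesEnvBuilder\n")
```

Calling ``NotesEnvBuilder().create("demo-env")`` runs the four hooks in order and leaves ``pyvenv.cfg`` and ``NOTES.txt`` in ``demo-env``.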
In addition, ``EnvBuilder`` provides a utility method that can be
called from ``post_setup`` in subclasses to assist in installing
custom scripts into the virtual environment. The method
``install_scripts`` accepts as arguments the ``context`` object (see
above) and a path to a directory. The directory should contain
subdirectories "common", "posix", "nt", each containing scripts
destined for the bin directory in the environment. The contents of
"common" and the directory corresponding to ``os.name`` are copied
after doing some text replacement of placeholders:
* ``__VENV_DIR__`` is replaced with absolute path of the environment
directory.
* ``__VENV_NAME__`` is replaced with the environment name (final path
segment of environment directory).
* ``__VENV_BIN_NAME__`` is replaced with the name of the bin directory
(either ``bin`` or ``Scripts``).
* ``__VENV_PYTHON__`` is replaced with the absolute path of the
environment's executable.
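
The placeholder substitution amounts to plain string replacement. A minimal sketch, assuming the values are passed in directly rather than read from the ``context`` object; ``replace_placeholders`` is a hypothetical name:

```python
import os


def replace_placeholders(text, env_dir, bin_name, env_exe):
    """Substitute the four ``__VENV_*__`` placeholders in a script's text."""
    env_dir = os.path.abspath(env_dir)
    replacements = {
        "__VENV_DIR__": env_dir,
        "__VENV_NAME__": os.path.basename(env_dir),
        "__VENV_BIN_NAME__": bin_name,
        "__VENV_PYTHON__": env_exe,
    }
    for placeholder, value in replacements.items():
        text = text.replace(placeholder, value)
    return text
```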
The ``DistributeEnvBuilder`` subclass in the reference implementation
illustrates how the customization hook can be used in practice to
pre-install Distribute into the virtual environment. It's not
envisaged that ``DistributeEnvBuilder`` will be actually added to
Python core, but it makes the reference implementation more
immediately useful for testing and exploratory purposes.
Backwards Compatibility
=======================
Splitting the meanings of ``sys.prefix``
----------------------------------------
Any virtual environment tool along these lines (which attempts to
isolate site-packages, while still making use of the base Python's
standard library with no need for it to be symlinked into the virtual
environment) is proposing a split between two different meanings
(among others) that are currently both wrapped up in ``sys.prefix``:
the answers to the questions "Where is the standard library?" and
"Where is the site-packages location where third-party modules should
be installed?"
This split could be handled by introducing a new ``sys`` attribute for
either the former prefix or the latter prefix. Either option
potentially introduces some backwards-incompatibility with software
written to assume the other meaning for ``sys.prefix``. (Such
software should preferably be using the APIs in the ``site`` and
``sysconfig`` modules to answer these questions rather than using
``sys.prefix`` directly, in which case there is no
backwards-compatibility issue, but in practice ``sys.prefix`` is
sometimes used.)
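
As an example of the preferred approach, the two questions can be answered through ``sysconfig`` without touching ``sys.prefix`` at all. The path names ``purelib`` and ``stdlib`` are the ones ``sysconfig`` already defines:

```python
import sysconfig

# Where third-party pure-Python packages are installed.
site_packages = sysconfig.get_path("purelib")

# Where the standard library lives.
standard_library = sysconfig.get_path("stdlib")

print("site-packages:", site_packages)
print("standard library:", standard_library)
```

Code written this way keeps working whichever of the two meanings ``sys.prefix`` ends up carrying.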
The `documentation`__ for ``sys.prefix`` describes it as "A string
giving the site-specific directory prefix where the platform
independent Python files are installed," and specifically mentions the
standard library and header files as found under ``sys.prefix``. It
does not mention ``site-packages``.
__ http://docs.python.org/dev/library/sys.html#sys.prefix
Maintaining this documented definition would mean leaving
``sys.prefix`` pointing to the base system installation (which is
where the standard library and header files are found), and
introducing a new value in ``sys`` (something like
``sys.site_prefix``) to point to the prefix for ``site-packages``.
This would maintain the documented semantics of ``sys.prefix``, but
risk breaking isolation if third-party code uses ``sys.prefix`` rather
than ``sys.site_prefix`` or the appropriate ``site`` API to find
site-packages directories.
The most notable case is probably `setuptools`_ and its fork
`distribute`_, which mostly use ``distutils``/``sysconfig`` APIs, but
do use ``sys.prefix`` directly to build up a list of site directories
for pre-flight checking where ``pth`` files can usefully be placed.
Otherwise, a `Google Code Search`_ turns up what appears to be a
roughly even mix of usage between packages using ``sys.prefix`` to
build up a site-packages path and packages using it to e.g. eliminate
the standard-library from code-execution tracing.
Although it requires modifying the documented definition of
``sys.prefix``, this PEP prefers to have ``sys.prefix`` point to the
virtual environment (where ``site-packages`` is found), and introduce
``sys.base_prefix`` to point to the standard library and Python header
files. Rationale for this choice:
* It is preferable to err on the side of greater isolation of the
virtual environment.
* Virtualenv already modifies ``sys.prefix`` to point at the virtual
environment, and in practice this has not been a problem.
* No modification is required to setuptools/distribute.
.. _setuptools: http://peak.telecommunity.com/DevCenter/setuptools
.. _distribute: http://packages.python.org/distribute/
.. _Google Code Search: http://www.google.com/codesearch#search/&q=sys\.prefix&p=1&type=cs
Open Questions
==============
What about include files?
-------------------------
For example, ZeroMQ installs ``zmq.h`` and ``zmq_utils.h`` in
``$VE/include``, whereas SIP (part of PyQt4) installs ``sip.h`` by default
in ``$VE/include/pythonX.Y``. With virtualenv, everything works
because the PythonX.Y include is symlinked, so everything that's
needed is in ``$VE/include``. At the moment the reference
implementation doesn't do anything with include files, besides
creating the include directory; this might need to change, to
copy/symlink ``$VE/include/pythonX.Y``.
Since Python has no abstraction for a site-specific include directory
(other than for platform-specific files), users would seem to expect
that every include file they could ever want is found in one of just
two locations, with sysconfig labels "include" and "platinclude".
There's another issue: what if includes are Python-version-specific?
For example, SIP installs by default into ``$VE/include/pythonX.Y``
rather than ``$VE/include``, presumably because there's
version-specific stuff in there - but even if that's not the case with
SIP, it could be the case with some other package. And the problem
that gives is that you can't just symlink the ``include/pythonX.Y``
directory, but actually have to provide a writable directory and
symlink/copy the contents from the system ``include/pythonX.Y``. Of
course this is not hard to do, but it does seem inelegant. On the other
hand, the inelegance really stems from the lack of a supporting concept
in ``Python/sysconfig``.
Testability and Source Build Issues
-----------------------------------
Currently in the reference implementation, virtual environments must
be created with an installed Python, rather than a source build, as
the base installation. In order to be able to fully test the ``venv``
module in the Python regression test suite, some anomalies in how
sysconfig data is configured in source builds will need to be removed.
For example, ``sysconfig.get_paths()`` in a source build gives
(partial output)::
{
'include': '/home/vinay/tools/pythonv/Include',
'libdir': '/usr/lib ; or /usr/lib64 on a multilib system',
'platinclude': '/home/vinay/tools/pythonv',
'platlib': '/usr/local/lib/python3.3/site-packages',
'platstdlib': '/usr/local/lib/python3.3',
'purelib': '/usr/local/lib/python3.3/site-packages',
'stdlib': '/usr/local/lib/python3.3'
}
Need for ``install_name_tool`` on OSX?
--------------------------------------
`Virtualenv uses`_ ``install_name_tool``, a tool provided in the Xcode
developer tools, to modify the copied executable on OSX. We need
input from OSX developers on whether this is actually necessary in
this PEP's implementation of virtual environments, and if so, if there
is an alternative to ``install_name_tool`` that would allow ``venv``
to not require that Xcode is installed.
.. _Virtualenv uses: https://github.com/pypa/virtualenv/issues/168
Provide a mode that is isolated only from user site packages?
-------------------------------------------------------------
Is there sufficient rationale for providing a mode that isolates the
venv from :pep:`370` user site packages, but not from the system-level
site-packages?
Other Python implementations?
-----------------------------
We should get feedback from Jython, IronPython, and PyPy about whether
there's anything in this PEP that they foresee as a difficulty for
their implementation.
Reference Implementation
========================
The in-progress reference implementation is found in `a clone of the
CPython Mercurial repository`_. To test it, build and install it (the
virtual environment tool currently does not run from a source tree).
From the installed Python, run ``bin/pyvenv /path/to/new/virtualenv``
to create a virtual environment.
The reference implementation (like this PEP!) is a work in progress.
.. _a clone of the CPython Mercurial repository: https://bitbucket.org/vinay.sajip/pythonv
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

274
pep-0406.txt Normal file

@ -0,0 +1,274 @@
PEP: 406
Title: Improved Encapsulation of Import State
Version: $Revision$
Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan@gmail.com>, Greg Slodkowicz <jergosh@gmail.com>
Status: Deferred
Type: Standards Track
Content-Type: text/x-rst
Created: 4-Jul-2011
Python-Version: 3.4
Post-History: 31-Jul-2011, 13-Nov-2011, 4-Dec-2011
Abstract
========
This PEP proposes the introduction of a new 'ImportEngine' class as part of
``importlib`` which would encapsulate all state related to importing modules
into a single object. Creating new instances of this object would then provide
an alternative to completely replacing the built-in implementation of the
import statement, by overriding the ``__import__()`` function. To work with
the builtin import functionality and importing via import engine objects,
this PEP proposes a context management based approach to temporarily replacing
the global import state.
The PEP also proposes inclusion of a ``GlobalImportEngine`` subclass and a
globally accessible instance of that class, which "writes through" to the
process global state. This provides a backwards compatible bridge between the
proposed encapsulated API and the legacy process global state, and allows
straightforward support for related state updates (e.g. selectively
invalidating path cache entries when ``sys.path`` is modified).
PEP Deferral
============
The import system is already seeing substantial changes in Python 3.3, to
natively handle packages split across multiple directories (PEP 382) and
(potentially) to make the import semantics in the main module better match
those in other modules (PEP 395).
Accordingly, the proposal in this PEP will not be seriously considered until
Python 3.4 at the earliest.
Rationale
=========
Currently, most state related to the import system is stored as module level
attributes in the ``sys`` module. The one exception is the import lock, which
is not accessible directly, but only via the related functions in the ``imp``
module. The current process global import state comprises:
* sys.modules
* sys.path
* sys.path_hooks
* sys.meta_path
* sys.path_importer_cache
* the import lock (imp.lock_held()/acquire_lock()/release_lock())
Isolating this state would allow multiple import states to be
conveniently stored within a process. Placing the import functionality
in a self-contained object would also allow subclassing to add additional
features (e.g. module import notifications or fine-grained control
over which modules can be imported). The engine would also be
subclassed to make it possible to use the import engine API to
interact with the existing process-global state.
The namespace PEPs (especially PEP 402) raise a potential need for
*additional* process global state, in order to correctly update package paths
as ``sys.path`` is modified.
Finally, providing a coherent object for all this state makes it feasible to
also provide context management features that allow the import state to be
temporarily substituted.
Proposal
========
We propose introducing an ImportEngine class to encapsulate import
functionality. This includes an ``__import__()`` method which can
be used as an alternative to the built-in ``__import__()`` when
desired and also an ``import_module()`` method, equivalent to
``importlib.import_module()`` [3]_.
Since there are global import state invariants that are assumed and should be
maintained, we introduce a ``GlobalImportEngine`` class with an interface
identical to ``ImportEngine`` but directly accessing the current global import
state. This can be easily implemented using class properties.
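
A minimal sketch of the property-based approach follows. The names ``_sys_property`` and ``WriteThroughState`` are hypothetical, and only the five ``sys`` attributes are covered; the import lock is left out.

```python
import sys


def _sys_property(name):
    """Build a property that reads and writes the named ``sys`` attribute."""
    def getter(self):
        return getattr(sys, name)

    def setter(self, value):
        setattr(sys, name, value)

    return property(getter, setter)


class WriteThroughState:
    """Every attribute access goes straight to the process global state."""
    modules = _sys_property("modules")
    path = _sys_property("path")
    path_hooks = _sys_property("path_hooks")
    meta_path = _sys_property("meta_path")
    path_importer_cache = _sys_property("path_importer_cache")
```

Reading ``WriteThroughState().path`` returns the very same list object as ``sys.path``, and assigning to it rebinds ``sys.path``.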
Specification
=============
ImportEngine API
~~~~~~~~~~~~~~~~
The proposed extension consists of the following objects:
``importlib.engine.ImportEngine``
``from_engine(self, other)``
Create a new import object from another ImportEngine instance. The
new object is initialised with a copy of the state in ``other``. When
called on ``importlib.engine.sysengine``, ``from_engine()`` can be
used to create an ``ImportEngine`` object with a **copy** of the
global import state.
``__import__(self, name, globals={}, locals={}, fromlist=[], level=0)``
Reimplementation of the builtin ``__import__()`` function. The
import of a module will proceed using the state stored in the
ImportEngine instance rather than the global import state. For full
documentation of ``__import__`` functionality, see [2]_.
``__import__()`` from ``ImportEngine`` and its subclasses can be used
to customise the behaviour of the ``import`` statement by replacing
``builtins.__import__`` with ``ImportEngine().__import__``.
``import_module(name, package=None)``
A reimplementation of ``importlib.import_module()`` which uses the
import state stored in the ImportEngine instance. See [3]_ for a full
reference.
``modules, path, path_hooks, meta_path, path_importer_cache``
Instance-specific versions of their process global ``sys`` equivalents
``importlib.engine.GlobalImportEngine(ImportEngine)``
Convenience class to provide engine-like access to the global state.
Provides ``__import__()``, ``import_module()`` and ``from_engine()``
methods like ``ImportEngine`` but writes through to the global state
in ``sys``.
To support various namespace package mechanisms, when ``sys.path`` is altered,
tools like ``pkgutil.extend_path`` should be used to also modify other parts
of the import state (in this case, package ``__path__`` attributes). The path
importer cache should also be invalidated when a variety of changes are made.
The ``ImportEngine`` API will provide convenience methods that automatically
make related import state updates as part of a single operation.
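
The copy semantics of ``from_engine()`` can be sketched with a small state holder. ``ImportStateSketch`` is hypothetical and carries no import logic; making ``from_engine`` a classmethod and using shallow copies are assumptions of the sketch, since the precise API is not fixed here.

```python
import sys


class ImportStateSketch:
    """Hypothetical holder for the five state attributes listed above."""

    def __init__(self):
        self.modules = {}
        self.path = []
        self.path_hooks = []
        self.meta_path = []
        self.path_importer_cache = {}

    @classmethod
    def from_engine(cls, other):
        """Build a new object from shallow copies of ``other``'s containers."""
        new = cls()
        new.modules = dict(other.modules)
        new.path = list(other.path)
        new.path_hooks = list(other.path_hooks)
        new.meta_path = list(other.meta_path)
        new.path_importer_cache = dict(other.path_importer_cache)
        return new


# The ``sys`` module exposes the same five attribute names, so it can
# stand in for the proposed ``sysengine`` object in this sketch.
isolated = ImportStateSketch.from_engine(sys)
isolated.path.append("/hypothetical/extra/dir")
```

After the last line, ``sys.path`` is unchanged: the new entry exists only in the copied state.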
Global variables
~~~~~~~~~~~~~~~~
``importlib.engine.sysengine``
A precreated instance of ``GlobalImportEngine``. Intended for use by
importers and loaders that have been updated to accept optional ``engine``
parameters and with ``ImportEngine.from_engine(sysengine)`` to start with
a copy of the process global import state.
No changes to finder/loader interfaces
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Rather than attempting to update the PEP 302 APIs to accept additional state,
this PEP proposes that ``ImportEngine`` support the context management
protocol (similar to the context substitution mechanisms in the ``decimal``
module).
The context management mechanism for ``ImportEngine`` would:
* On entry:
* Acquire the import lock
* Substitute the global import state with the import engine's own state
* On exit:
* Restore the previous global import state
* Release the import lock
The precise API for this is TBD (but will probably use a distinct context
management object, along the lines of that created by
``decimal.localcontext``).
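
Since the API is still to be decided, the following is only a sketch of the entry and exit steps. ``substituted_import_state`` is a hypothetical name, a private ``threading.RLock`` stands in for the interpreter's import lock, and ``sys.modules`` is deliberately left alone.

```python
import contextlib
import sys
import threading

# Stand-in for the interpreter's import lock (an assumption of this sketch).
_sketch_lock = threading.RLock()

_SWAPPED = ("path", "path_hooks", "meta_path", "path_importer_cache")


@contextlib.contextmanager
def substituted_import_state(state):
    """Temporarily install the values in ``state`` (a dict) on ``sys``."""
    with _sketch_lock:
        # On entry: remember the global state, then substitute it.
        saved = {name: getattr(sys, name) for name in _SWAPPED}
        try:
            for name in _SWAPPED:
                setattr(sys, name, state[name])
            yield
        finally:
            # On exit: restore the previous global state, even on error.
            for name, value in saved.items():
                setattr(sys, name, value)
```

Used as ``with substituted_import_state(state): ...``, imports inside the block see the substituted ``sys.path`` and related attributes, and the originals are back in place afterwards.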
Open Issues
===========
API design for falling back to global import state
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The current proposal relies on the ``from_engine()`` API to fall back to the
global import state. It may be desirable to offer a variant that instead falls
back to the global import state dynamically.
However, one big advantage of starting with an "as isolated as possible"
design is that it becomes possible to experiment with subclasses that blur
the boundaries between the engine instance state and the process global state
in various ways.
Builtin and extension modules must be process global
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Due to platform limitations, only one copy of each builtin and extension
module can readily exist in each process. Accordingly, it is impossible for
each ``ImportEngine`` instance to load such modules independently.
The simplest solution is for ``ImportEngine`` to refuse to load such modules,
raising ``ImportError``. ``GlobalImportEngine`` would be able to load them
normally.
``ImportEngine`` will still return such modules from a prepopulated module
cache - it's only loading them directly which causes problems.
Scope of substitution
~~~~~~~~~~~~~~~~~~~~~
Related to the previous open issue is the question of what state to substitute
when using the context management API. It is currently the case that replacing
``sys.modules`` can be unreliable due to cached references and there's the
underlying fact that having independent copies of some modules is simply
impossible due to platform limitations.
As part of this PEP, it will be necessary to document explicitly:
* Which parts of the global import state can be substituted (and declare code
which caches references to that state without dealing with the substitution
case buggy)
* Which parts must be modified in-place (and hence are not substituted by the
``ImportEngine`` context management API, or otherwise scoped to
``ImportEngine`` instances)
Reference Implementation
========================
A reference implementation [4]_ for an earlier draft of this PEP, based on
Brett Cannon's importlib has been developed by Greg Slodkowicz as part of the
2011 Google Summer of Code. Note that the current implementation avoids
modifying existing code, and hence duplicates a lot of things unnecessarily.
An actual implementation would just modify any such affected code in place.
That earlier draft of the PEP proposed changing the PEP 302 APIs to support passing
in an optional engine instance. This had the (serious) downside of not correctly
affecting further imports from the imported module, hence the change to the
context management based proposal for substituting the global state.
References
==========
.. [1] PEP 302, New Import Hooks, J van Rossum, Moore
(http://www.python.org/dev/peps/pep-0302)
.. [2] __import__() builtin function, The Python Standard Library documentation
(http://docs.python.org/library/functions.html#__import__)
.. [3] Importlib documentation, Cannon
(http://docs.python.org/dev/library/importlib)
.. [4] Reference implementation
(https://bitbucket.org/jergosh/gsoc_import_engine/src/default/Lib/importlib/engine.py)
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

179
pep-0407.txt Normal file

@ -0,0 +1,179 @@
PEP: 407
Title: New release cycle and introducing long-term support versions
Version: $Revision$
Last-Modified: $Date$
Author: Antoine Pitrou <solipsis@pitrou.net>,
Georg Brandl <georg@python.org>,
Barry Warsaw <barry@python.org>
Status: Draft
Type: Process
Content-Type: text/x-rst
Created: 2012-01-12
Post-History: http://mail.python.org/pipermail/python-dev/2012-January/115838.html
Resolution: TBD
Abstract
========
Finding a release cycle for an open-source project is a delicate
exercise in managing mutually contradicting constraints: developer
manpower, availability of release management volunteers, ease of
maintenance for users and third-party packagers, quick availability of
new features (and behavioural changes), availability of bug fixes
without pulling in new features or behavioural changes.
The current release cycle errs on the conservative side. It is
adequate for people who value stability over reactivity. This PEP is
an attempt to keep the stability that has become a Python trademark,
while offering a more fluid release of features, by introducing the
notion of long-term support versions.
Scope
=====
This PEP doesn't try to change the maintenance period or release
scheme for the 2.7 branch. Only 3.x versions are considered.
Proposal
========
Under the proposed scheme, there would be two kinds of feature
versions (sometimes dubbed "minor versions", for example 3.2 or 3.3):
normal feature versions and long-term support (LTS) versions.
Normal feature versions would get either zero or at most one bugfix
release; the latter only if needed to fix critical issues. Security
fix handling for these branches needs to be decided.
LTS versions would get regular bugfix releases until the next LTS
version is out. They then would go into security fixes mode, up to a
termination date at the release manager's discretion.
Periodicity
-----------
A new feature version would be released every X months. We
tentatively propose X = 6 months.
LTS versions would be one out of N feature versions. We tentatively
propose N = 4.
With these figures, a new LTS version would be out every 24 months,
and remain supported until the next LTS version 24 months later. This
is roughly similar to today's 18-month bugfix cycle for every feature
version.
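
The arithmetic behind these figures can be checked with a few lines of Python. ``release_schedule`` is a hypothetical helper, and treating the first release in the sequence as an LTS version is an assumption made only to fix a starting point.

```python
def release_schedule(count, months_between=6, lts_every=4):
    """Return (months_after_first_release, is_lts) for ``count`` releases."""
    return [
        (index * months_between, index % lts_every == 0)
        for index in range(count)
    ]


schedule = release_schedule(9)
lts_months = [month for month, is_lts in schedule if is_lts]
print(lts_months)  # [0, 24, 48]
```

With X = 6 and N = 4, consecutive LTS versions are 6 * 4 = 24 months apart, with three normal feature versions in between.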
Pre-release versions
--------------------
More frequent feature releases imply a smaller number of disruptive
changes per release. Therefore, the number of pre-release builds
(alphas and betas) can be brought down considerably. Two alpha builds
and a single beta build would probably be enough in the regular case.
The number of release candidates depends, as usual, on the number of
last-minute fixes before final release.
Effects
=======
Effect on development cycle
---------------------------
More feature releases might mean more stress on the development and
release management teams. This is quantitatively alleviated by the
smaller number of pre-release versions; and qualitatively by the
smaller number of disruptive changes (meaning less potential for
breakage). The shorter feature freeze period (after the first beta
build until the final release) is easier to accept. The rush for
adding features just before feature freeze should also be much
smaller.
Effect on bugfix cycle
----------------------
The effect on fixing bugs should be minimal with the proposed figures.
The same number of branches would be simultaneously open for bugfix
maintenance (two until 2.x is terminated, then one).
Effect on workflow
------------------
The workflow for new features would be the same: developers would only
commit them on the ``default`` branch.
The workflow for bug fixes would be slightly updated: developers would
commit bug fixes to the current LTS branch (for example ``3.3``) and
then merge them into ``default``.
If some critical fixes are needed to a non-LTS version, they can be
grafted from the current LTS branch to the non-LTS branch, just like
fixes are ported from 3.x to 2.7 today.
Effect on the community
-----------------------
People who value stability can just synchronize on the LTS releases
which, with the proposed figures, would give a similar support cycle
(both in duration and in stability).
People who value reactivity and access to new features (without taking
the risk of installing alpha versions or Mercurial snapshots) would get
much more value from the new release cycle than currently.
People who want to contribute new features or improvements would be
more motivated to do so, knowing that their contributions will be more
quickly available to normal users. Also, a shorter feature freeze
period makes it less cumbersome to interact with contributors of
features.
Discussion
==========
These are open issues that should be worked out during discussion:
* Decide on X (months between feature releases) and N (feature releases
per LTS release) as defined above.
* For given values of X and N, is the no-bugfix-releases policy for
non-LTS versions feasible?
* What is the policy for security fixes?
* Restrict new syntax and similar changes (i.e. everything that was
prohibited by PEP 3003) to LTS versions?
* What is the effect on packagers such as Linux distributions?
* How will release version numbers or other identifying and marketing
material make it clear to users which versions are normal feature
releases and which are LTS releases? How do we manage user
expectations?
* Does the faster release cycle mean we could some day reach 3.10 and
above? Some people expressed a tacit expectation that version numbers
always fit in one decimal digit.
A community poll or survey to collect opinions from the greater Python
community would be valuable before making a final decision.
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

317
pep-0408.txt Normal file

@ -0,0 +1,317 @@
PEP: 408
Title: Standard library __preview__ package
Version: $Revision$
Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan@gmail.com>,
Eli Bendersky <eliben@gmail.com>
Status: Rejected
Type: Standards Track
Content-Type: text/x-rst
Created: 2012-01-07
Python-Version: 3.3
Post-History: 2012-01-27
Resolution: http://mail.python.org/pipermail/python-dev/2012-January/115962.html
Abstract
========
The process of including a new module into the Python standard library is
hindered by the API lock-in and promise of backward compatibility implied by
a module being formally part of Python. This PEP proposes a transitional
state for modules - inclusion in a special ``__preview__`` package for the
duration of a minor release (roughly 18 months) prior to full acceptance into
the standard library. On one hand, this state provides the module with the
benefits of being formally part of the Python distribution. On the other hand,
the core development team explicitly states that no promises are made with
regard to the module's eventual full inclusion into the standard library,
or to the stability of its API, which may change for the next release.
PEP Rejection
=============
Based on his experience with a similar "labs" namespace in Google App Engine,
Guido has rejected this PEP [3] in favour of the simpler alternative of
explicitly marking provisional modules as such in their documentation.
If a module is otherwise considered suitable for standard library inclusion,
but some concerns remain regarding maintainability or certain API details,
then the module can be accepted on a provisional basis. While it is considered
an unlikely outcome, such modules *may* be removed from the standard library
without a deprecation period if the lingering concerns prove well-founded.
As part of the same announcement, Guido explicitly accepted Matthew
Barnett's 'regex' module [4] as a provisional addition to the standard
library for Python 3.3 (using the 'regex' name, rather than as a drop-in
replacement for the existing 're' module).
Proposal - the __preview__ package
==================================
Whenever the Python core development team decides that a new module should be
included into the standard library, but isn't entirely sure about whether the
module's API is optimal, the module can be placed in a special package named
``__preview__`` for a single minor release.
In the next minor release, the module may either be "graduated" into the
standard library (and occupy its natural place within its namespace, leaving the
``__preview__`` package), or be rejected and removed entirely from the Python
source tree. If the module ends up graduating into the standard library after
spending a minor release in ``__preview__``, its API may be changed according
to accumulated feedback. The core development team explicitly makes no
guarantees about API stability and backward compatibility of modules in
``__preview__``.
Entry into the ``__preview__`` package marks the start of a transition of the
module into the standard library. It means that the core development team
assumes responsibility for the module, similarly to any other module in the
standard library.
Which modules should go through ``__preview__``
-----------------------------------------------
We expect most modules proposed for addition into the Python standard library
to go through a minor release in ``__preview__``. There may, however, be some
exceptions, such as modules that use a pre-defined API (for example ``lzma``,
which generally follows the API of the existing ``bz2`` module), or modules
with an API that has wide acceptance in the Python development community.
In any case, modules that are proposed to be added to the standard library,
whether via ``__preview__`` or directly, must fulfill the acceptance conditions
set by PEP 2.
It is important to stress that the aim of this proposal is not to make the
process of adding new modules to the standard library more difficult. On the
contrary, it tries to provide a means to add *more* useful libraries. Modules
which are obvious candidates for entry can be added as before. Modules which
due to uncertainties about the API could be stalled for a long time now have
a means to still be distributed with Python, via an incubation period in the
``__preview__`` package.
Criteria for "graduation"
-------------------------
In principle, most modules in the ``__preview__`` package should eventually
graduate to the stable standard library. Some reasons for not graduating are:
* The module may prove to be unstable or fragile, without sufficient developer
support to maintain it.
* A much better alternative module may be found during the preview release.
Essentially, the decision will be made by the core developers on a per-case
basis. The point to emphasize here is that a module's appearance in the
``__preview__`` package in some release does not guarantee it will continue
being part of Python in the next release.
Example
-------
Suppose the ``example`` module is a candidate for inclusion in the standard
library, but some Python developers aren't convinced that it presents the best
API for the problem it intends to solve. The module can then be added to the
``__preview__`` package in release ``3.X``, importable via::
from __preview__ import example
Assuming the module is then promoted to the standard library proper in
release ``3.X+1``, it will be moved to a permanent location in the library::
import example
And importing it from ``__preview__`` will no longer work.
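Had this proposal been adopted, code meant to run on both release ``3.X`` and
release ``3.X+1`` would have needed to try both locations. The following is a
minimal sketch of such a fallback. ``import_first`` is a hypothetical helper
written for this illustration (it is not part of the standard library), and
``example`` is the placeholder module named above.

```python
import importlib


def import_first(*names):
    """Return the first module in *names* that can be imported."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Also covers ModuleNotFoundError, which subclasses ImportError.
            continue
    raise ImportError("none of %r could be imported" % (names,))


# Hypothetical call for the module discussed above. It is left commented out
# because neither location exists in a real Python installation:
# example = import_first("example", "__preview__.example")
```

Listing the permanent location first means the fallback stops being exercised
as soon as the module graduates.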
Rationale
=========
Benefits for the core development team
--------------------------------------
Currently, the core developers are really reluctant to add new interfaces to
the standard library. This is because as soon as they're published in a
release, API design mistakes get locked in due to backward compatibility
concerns.
By gating all major API additions through some kind of a preview mechanism
for a full release, we get one full release cycle of community feedback
before we lock in the APIs with our standard backward compatibility guarantee.
We can also start integrating preview modules with the rest of the standard
library early, so long as we make it clear to packagers that the preview
modules should not be considered optional. The only difference between preview
APIs and the rest of the standard library is that preview APIs are explicitly
exempted from the usual backward compatibility guarantees.
Essentially, the ``__preview__`` package is intended to lower the risk of
locking in minor API design mistakes for extended periods of time. Currently,
this concern can block new additions, even when the core development team
consensus is that a particular addition is a good idea in principle.
Benefits for end users
----------------------
For future end users, the broadest benefit lies in a better "out-of-the-box"
experience - rather than being told "oh, the standard library tools for task X
are horrible, download this 3rd party library instead", those superior tools
are more likely to be just an import away.
For environments where developers are required to conduct due diligence on
their upstream dependencies (severely harming the cost-effectiveness of, or
even ruling out entirely, much of the material on PyPI), the key benefit lies
in ensuring that anything in the ``__preview__`` package is clearly under
python-dev's aegis from at least the following perspectives:
* Licensing: Redistributed by the PSF under a Contributor Licensing Agreement.
* Documentation: The documentation of the module is published and organized via
the standard Python documentation tools (i.e. ReST source, output generated
with Sphinx and published on http://docs.python.org).
* Testing: The module test suites are run on the python.org buildbot fleet
and results published via http://www.python.org/dev/buildbot.
* Issue management: Bugs and feature requests are handled on
http://bugs.python.org
* Source control: The master repository for the software is published
on http://hg.python.org.
Candidates for inclusion into __preview__
=========================================
For Python 3.3, there are a number of clear current candidates:
* ``regex`` (http://pypi.python.org/pypi/regex)
* ``daemon`` (PEP 3143)
* ``ipaddr`` (PEP 3144)
Other possible future use cases include:
* Improved HTTP modules (e.g. ``requests``)
* HTML 5 parsing support (e.g. ``html5lib``)
* Improved URL/URI/IRI parsing
* A standard image API (PEP 368)
* Encapsulation of the import state (PEP 406)
* Standard event loop API (PEP 3153)
* A binary version of WSGI for Python 3 (e.g. PEP 444)
* Generic function support (e.g. ``simplegeneric``)
Relationship with PEP 407
=========================
PEP 407 proposes a change to the core Python release cycle to permit interim
releases every 6 months (perhaps limited to standard library updates). If
such a change to the release cycle is made, the following policy for the
``__preview__`` namespace is suggested:
* For long term support releases, the ``__preview__`` namespace would always
be empty.
* New modules would be accepted into the ``__preview__`` namespace only in
interim releases that immediately follow a long term support release.
* All modules added will either be migrated to their final location in the
standard library or dropped entirely prior to the next long term support
release.
Rejected alternatives and variations
====================================
Using ``__future__``
--------------------
Python already has a "forward-looking" namespace in the form of the
``__future__`` module, so it's reasonable to ask why that can't be re-used for
this new purpose.
There are two reasons why doing so is not appropriate:
1. The ``__future__`` module is actually linked to a separate compiler
directives feature that can actually change the way the Python interpreter
compiles a module. We don't want that for the preview package - we just want
an ordinary Python package.
2. The ``__future__`` module comes with an express promise that names will be
maintained in perpetuity, long after the associated features have become the
compiler's default behaviour. Again, this is precisely the opposite of what is
intended for the preview package - it is almost certain that all names added to
the preview will be removed at some point, most likely due to their being moved
to a permanent home in the standard library, but also potentially due to their
being reverted to third party package status (if community feedback suggests the
proposed addition is irredeemably broken).
Versioning the package
----------------------
One proposed alternative [1]_ was to add explicit versioning to the
``__preview__`` package, i.e. ``__preview34__``. We think that it's better to
simply define that a module being in ``__preview__`` in Python 3.X will either
graduate to the normal standard library namespace in Python 3.X+1 or will
disappear from the Python source tree altogether. Versioning the ``__preview__``
package complicates the process and does not align well with the main intent of
this proposal.
Using a package name without leading and trailing underscores
-------------------------------------------------------------
It was proposed [1]_ to use a package name like ``preview`` or ``exp``, instead
of ``__preview__``. This was rejected in the discussion due to the special
meaning a "dunder" package name (that is, a name *with* leading and
trailing double-underscores) conveys in Python. Besides, a non-dunder name
would suggest normal standard library API stability guarantees, which is not
the intention of the ``__preview__`` package.
Preserving pickle compatibility
-------------------------------
A pickled class instance based on a module in ``__preview__`` in release 3.X
won't be unpickle-able in release 3.X+1, where the module won't be in
``__preview__``. Special code may be added to make this work, but this goes
against the intent of this proposal, since it implies backward compatibility.
Therefore, this PEP does not propose to preserve pickle compatibility.
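The incompatibility follows from how pickle records classes: by module name
and class name, not by value. The sketch below shows the module path embedded
in the serialized bytes. ``fractions.Fraction`` is used only as a convenient
picklable standard library class; it has no connection to ``__preview__``.

```python
import pickle
from fractions import Fraction

payload = pickle.dumps(Fraction(1, 2))

# The pickle refers to the class by module name and class name, so the
# defining module's import path is part of the serialized data.
module_name_in_payload = b"fractions" in payload

# Unpickling imports that module by name, which is why moving a module
# out of a package breaks previously written pickles.
restored = pickle.loads(payload)
```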
Credits
=======
Dj Gilcrease initially proposed the idea of having a ``__preview__`` package
in Python [2]_. Although his original proposal uses the name
``__experimental__``, we feel that ``__preview__`` conveys the meaning of this
package in a better way.
References
==========
.. [1] Discussed in this thread:
   http://mail.python.org/pipermail/python-ideas/2012-January/013246.html
.. [2] http://mail.python.org/pipermail/python-ideas/2011-August/011278.html
.. [3] Guido's decision:
   http://mail.python.org/pipermail/python-dev/2012-January/115962.html
.. [4] Proposal for inclusion of regex: http://bugs.python.org/issue2636
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

201
pep-0409.txt Normal file

@ -0,0 +1,201 @@
PEP: 409
Title: Suppressing exception context
Version: $Revision$
Last-Modified: $Date$
Author: Ethan Furman <ethan@stoneleaf.us>
Status: Accepted
Type: Standards Track
Content-Type: text/x-rst
Created: 26-Jan-2012
Post-History: 30-Aug-2002, 01-Feb-2012, 03-Feb-2012
Resolution: http://mail.python.org/pipermail/python-dev/2012-February/116136.html
Abstract
========
One of the open issues from PEP 3134 is suppressing context: currently
there is no way to do it. This PEP proposes one.
Rationale
=========
There are two basic ways to generate exceptions:
1) Python does it (buggy code, missing resources, ending loops, etc.)
2) manually (with a raise statement)
When writing libraries, or even just custom classes, it can become
necessary to raise exceptions; moreover it can be useful, even
necessary, to change from one exception to another. To take an example
from my dbf module::
    try:
        value = int(value)
    except Exception:
        raise DbfError(...)
Whatever the original exception was (``ValueError``, ``TypeError``, or
something else) is irrelevant. The exception from this point on is a
``DbfError``, and the original exception is of no value. However, if
this exception is printed, we would currently see both.
Alternatives
============
Several possibilities have been put forth:
* ``raise as NewException()``
Reuses the ``as`` keyword; can be confusing since we are not really
reraising the originating exception
* ``raise NewException() from None``
Follows existing syntax of explicitly declaring the originating
exception
* ``exc = NewException(); exc.__context__ = None; raise exc``
A very verbose way of spelling the previous option
* ``raise NewException.no_context(...)``
Make context suppression a class method.
All of the above options will require changes to the core.
Proposal
========
I propose going with the second option::
raise NewException from None
It has the advantage of using the existing pattern of explicitly setting
the cause::
raise KeyError() from NameError()
but because the cause is ``None`` the previous context is not displayed
by the default exception printing routines.
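The difference in the printed output can be sketched as follows, assuming a
Python version that accepts ``raise ... from None`` (3.3 or later).
``DbfError`` here is a stand-in class defined for the illustration, and
``convert`` and ``render`` are hypothetical helper names.

```python
import traceback


class DbfError(Exception):
    """Stand-in for the dbf module's error type (defined here for the demo)."""


def convert(value, suppress):
    try:
        return int(value)
    except Exception:
        if suppress:
            # Explicitly drop the original exception from the printed output.
            raise DbfError("invalid value: %r" % (value,)) from None
        # Implicit chaining: the original exception becomes __context__.
        raise DbfError("invalid value: %r" % (value,))


def render(suppress):
    """Return the traceback text the default printing routine would show."""
    try:
        convert("abc", suppress)
    except DbfError as exc:
        return "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__))


chained = render(False)
suppressed = render(True)
```

``chained`` contains both tracebacks joined by the "During handling of the
above exception" line, while ``suppressed`` contains only the ``DbfError``
traceback.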
Implementation Discussion
=========================
Currently, ``None`` is the default for both ``__context__`` and ``__cause__``.
In order to support ``raise ... from None`` (which would set ``__cause__`` to
``None``) we need a different default value for ``__cause__``. Several ideas
were put forth on how to implement this at the language level:
* Overwrite the previous exception information (side-stepping the issue and
leaving ``__cause__`` at ``None``).
Rejected as this can seriously hinder debugging due to
`poor error messages`_.
* Use one of the boolean values in ``__cause__``: ``False`` would be the
default value, and would be replaced when ``from ...`` was used with the
explicitly chained exception or ``None``.
Rejected as this encourages the use of two different object types for
``__cause__`` with one of them (boolean) not allowed to have the full range
of possible values (``True`` would never be used).
* Create a special exception class, ``__NoException__``.
Rejected as possibly confusing, possibly being mistakenly raised by users,
and not being a truly unique value as ``None``, ``True``, and ``False`` are.
* Use ``Ellipsis`` as the default value (the ``...`` singleton).
Accepted.
Ellipses are commonly used in English as place holders when words are
omitted. This works in our favor here as a signal that ``__cause__`` is
omitted, so look in ``__context__`` for more details.
Ellipsis is not an exception, so cannot be raised.
There is only one Ellipsis, so no unused values.
Error information is not thrown away, so custom code can trace the entire
exception chain even if the default code does not.
Language Details
================
To support ``raise Exception from None``, ``__context__`` will stay as it is,
but ``__cause__`` will start out as ``Ellipsis`` and will change to ``None``
when the ``raise Exception from None`` method is used.
============================================ ================== =======================================
form                                         __context__        __cause__
============================================ ================== =======================================
raise                                        ``None``           ``Ellipsis``
reraise                                      previous exception ``Ellipsis``
reraise from ``None`` | ``ChainedException`` previous exception ``None`` | explicitly chained exception
============================================ ================== =======================================
The default exception printing routine will then:
* If ``__cause__`` is ``Ellipsis`` the ``__context__`` (if any) will be
printed.
* If ``__cause__`` is ``None`` the ``__context__`` will not be printed.
* If ``__cause__`` is anything else, ``__cause__`` will be printed.
In both of the latter cases the exception chain will stop being followed.
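The three rules can be written as a small decision function. This is only a
sketch of the behaviour as described in this PEP, not code from any CPython
release; ``exception_to_print_next`` is a hypothetical name, and the function
takes the two attribute values as plain arguments so that it can be run
without a modified interpreter.

```python
def exception_to_print_next(cause, context):
    """Apply the display rules above to one exception's attributes.

    Returns the exception that would be printed before the current one,
    or None if nothing more is printed.
    """
    if cause is Ellipsis:       # default: no ``from`` clause was used
        return context          # may itself be None
    if cause is None:           # ``raise ... from None``
        return None
    return cause                # ``raise ... from SomeException()``


original = ValueError("bad literal")
explicit = KeyError("lookup failed")

implicit_result = exception_to_print_next(Ellipsis, original)
suppressed_result = exception_to_print_next(None, original)
explicit_result = exception_to_print_next(explicit, original)
```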
Because the default value for ``__cause__`` is now ``Ellipsis`` and ``raise
Exception from Cause`` is simply syntactic sugar for::
_exc = NewException()
_exc.__cause__ = Cause()
raise _exc
``Ellipsis``, as well as ``None``, is now allowed as a cause::
raise Exception from Ellipsis
Patches
=======
There is a patch for CPython implementing this attached to `Issue 6210`_.
References
==========
Discussion and refinements in this `thread on python-dev`_.
.. _poor error messages:
http://bugs.python.org/msg152294
.. _issue 6210:
http://bugs.python.org/issue6210
.. _Thread on python-dev:
http://mail.python.org/pipermail/python-dev/2012-January/115838.html
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

547
pep-0410.txt Normal file

@ -0,0 +1,547 @@
PEP: 410
Title: Use decimal.Decimal type for timestamps
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner@gmail.com>
Status: Rejected
Type: Standards Track
Content-Type: text/x-rst
Created: 01-February-2012
Python-Version: 3.3
Resolution: http://mail.python.org/pipermail/python-dev/2012-February/116837.html
Rejection Notice
================
This PEP is rejected.
See http://mail.python.org/pipermail/python-dev/2012-February/116837.html.
Abstract
========
Decimal becomes the official type for high-resolution timestamps to make Python
support new functions using a nanosecond resolution without loss of precision.
Rationale
=========
Python 2.3 introduced float timestamps to support sub-second resolutions.
os.stat() uses float timestamps by default since Python 2.5. Python 3.3
introduced functions supporting nanosecond resolutions:
* os module: futimens(), utimensat()
* time module: clock_gettime(), clock_getres(), monotonic(), wallclock()
os.stat() reads nanosecond timestamps but returns timestamps as float.
The Python float type uses binary64 format of the IEEE 754 standard. With a
resolution of one nanosecond (10\ :sup:`-9`), float timestamps lose precision
for values bigger than 2\ :sup:`24` seconds (194 days: 1970-07-14 for an Epoch
timestamp).
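A short check of this loss of precision, using only the standard library: at
2\ :sup:`24` seconds, adding one nanosecond to a float leaves it unchanged,
while the same addition with ``decimal.Decimal`` is exact. The variable names
are chosen for this illustration.

```python
from decimal import Decimal

one_ns = 1e-9
t = float(2 ** 24)              # 16777216 seconds after the Epoch

# Near 2**24, adjacent binary64 values are 2**-28 s (about 3.7 ns) apart,
# so adding a single nanosecond rounds back to the same float.
float_step_lost = (t + one_ns == t)

# Decimal keeps the extra digit.
d = Decimal(2 ** 24) + Decimal("1e-9")
decimal_step_kept = (d == Decimal("16777216.000000001"))
```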
Nanosecond resolution is required to set the exact modification time on
filesystems supporting nanosecond timestamps (e.g. ext4, btrfs, NTFS, ...). It
also helps to compare the modification time to check if a file is newer than
another file. Use cases: copy the modification time of a file using
shutil.copystat(), create a TAR archive with the tarfile module, manage a
mailbox with the mailbox module, etc.
An arbitrary resolution is preferred over a fixed resolution (like nanosecond)
to not have to change the API when a better resolution is required. For
example, the NTP protocol uses units of 2\ :sup:`-32` second
(approximately 2.3 × 10\ :sup:`-10` second), whereas the NTP protocol version
4 uses units of 2\ :sup:`-64` second (5.4 × 10\ :sup:`-20` second).
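The two figures in parentheses are the reciprocals of 2\ :sup:`32` and
2\ :sup:`64`, which can be checked directly (the variable names are chosen
for this illustration):

```python
# Size in seconds of one unit of the 32-bit and 64-bit NTP fraction fields.
ntp_unit = 2.0 ** -32       # about 2.3e-10 s
ntpv4_unit = 2.0 ** -64     # about 5.4e-20 s
```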
.. note::
With a resolution of 1 microsecond (10\ :sup:`-6`), float timestamps lose
precision for values bigger than 2\ :sup:`33` seconds (272 years: 2242-03-16
for an Epoch timestamp). With a resolution of 100 nanoseconds
(10\ :sup:`-7`, resolution used on Windows), float timestamps lose precision
for values bigger than 2\ :sup:`29` seconds (17 years: 1987-01-05 for an
Epoch timestamp).
Specification
=============
Add decimal.Decimal as a new type for timestamps. Decimal supports any
timestamp resolution, supports arithmetic operations and is comparable. It is
possible to coerce a Decimal to float, even if the conversion may lose
precision. The clock resolution can also be stored in a Decimal object.
Add an optional *timestamp* argument to:
* os module: fstat(), fstatat(), lstat(), stat() (st_atime,
st_ctime and st_mtime fields of the stat structure),
sched_rr_get_interval(), times(), wait3() and wait4()
* resource module: ru_utime and ru_stime fields of getrusage()
* signal module: getitimer(), setitimer()
* time module: clock(), clock_gettime(), clock_getres(),
monotonic(), time() and wallclock()
The *timestamp* argument value can be float or Decimal; float is still the
default for backward compatibility. The following functions support Decimal as
input:
* datetime module: date.fromtimestamp(), datetime.fromtimestamp() and
datetime.utcfromtimestamp()
* os module: futimes(), futimesat(), lutimes(), utime()
* select module: epoll.poll(), kqueue.control(), select()
* signal module: setitimer(), sigtimedwait()
* time module: ctime(), gmtime(), localtime(), sleep()
The os.stat_float_times() function is deprecated: use an explicit cast using
int() instead.
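The *timestamp* argument described in this section is a proposal and cannot
be called in a released Python. As a rough sketch of the kind of value it
would return for ``os.stat()``, the following builds a Decimal from the
integer ``st_mtime_ns`` field, assuming a Python version whose stat result
provides that field. ``mtime_as_decimal`` is a hypothetical helper name.

```python
import os
import tempfile
from decimal import Decimal


def mtime_as_decimal(path):
    """Return the modification time of *path* as a Decimal number of seconds."""
    ns = os.stat(path).st_mtime_ns      # integer nanoseconds since the Epoch
    # Shift the decimal point by nine places; a 19-digit value fits well
    # within the default 28-digit context, so nothing is rounded.
    return Decimal(ns).scaleb(-9)


with tempfile.NamedTemporaryFile() as tmp:
    stamp = mtime_as_decimal(tmp.name)
    expected_ns = os.stat(tmp.name).st_mtime_ns
```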
.. note::
The decimal module is implemented in Python and is slower than float, but
there is a new C implementation which is almost ready for inclusion in
CPython.
Backwards Compatibility
=======================
The default timestamp type (float) is unchanged, so there is no impact on
backward compatibility or on performance. The new timestamp type,
decimal.Decimal, is only returned when requested explicitly.
Objection: clock accuracy
=========================
Computer clocks and operating systems are inaccurate and fail to provide
nanosecond accuracy in practice. A nanosecond is what it takes to execute a
couple of CPU instructions. Even on a real-time operating system, a
nanosecond-precise measurement is already obsolete when it starts being
processed by the higher-level application. A single cache miss in the CPU will
make the precision worthless.
.. note::
Linux *actually* is able to measure time in nanosecond precision, even
though it is not able to keep its clock synchronized to UTC with a
nanosecond accuracy.
Alternatives: Timestamp types
=============================
To support timestamps with an arbitrary or nanosecond resolution, the following
types have been considered:
* decimal.Decimal
* number of nanoseconds
* 128-bit float
* datetime.datetime
* datetime.timedelta
* tuple of integers
* timespec structure
Criteria:
* Doing arithmetic on timestamps must be possible
* Timestamps must be comparable
* An arbitrary resolution, or at least a resolution of one nanosecond without
losing precision
* It should be possible to coerce the new timestamp to float for backward
compatibility
A resolution of one nanosecond is enough to support all current C functions.
The best resolution used by operating systems is one nanosecond. In practice,
most clock accuracy is closer to microseconds than nanoseconds. So it sounds
reasonable to use a fixed resolution of one nanosecond.
Number of nanoseconds (int)
---------------------------
A nanosecond resolution is enough for all current C functions and so a
timestamp can simply be a number of nanoseconds, an integer, not a float.
The number of nanoseconds format has been rejected because it would require
adding new specialized functions for this format: it is not possible to
differentiate a number of nanoseconds from a number of seconds just by checking
the object type.
128-bit float
-------------
Add a new IEEE 754-2008 quad-precision binary float type. The IEEE 754-2008
quad-precision float has 1 sign bit, 15 bits of exponent and 112 bits of
mantissa. 128-bit float is supported by the GCC (4.3), Clang and ICC compilers.
Python must be portable and so cannot rely on a type only available on some
platforms. For example, Visual C++ 2008 doesn't support 128-bit float, whereas
it is used to build the official Windows executables. Another example: GCC 4.3
does not support __float128 in 32-bit mode on x86 (but GCC 4.4 does).
There is also a license issue: GCC uses the MPFR library for 128-bit float,
a library distributed under the GNU LGPL license. This license is not compatible
with the Python license.
.. note::
The x87 floating point unit of Intel CPU supports 80-bit floats. This format
is not supported by the SSE instruction set, which is now preferred over
x87, especially on x86_64. Other CPU vendors don't support 80-bit float.
datetime.datetime
-----------------
The datetime.datetime type is the natural choice for a timestamp because it is
clear that this type contains a timestamp, whereas int, float and Decimal are
raw numbers. It is an absolute timestamp and so is well defined. It gives
direct access to the year, month, day, hours, minutes and seconds. It has
methods related to time like methods to format the timestamp as string (e.g.
datetime.datetime.strftime).
The major issue is that, except for os.stat(), time.time() and
time.clock_gettime(time.CLOCK_REALTIME), all time functions have an unspecified
starting point and no timezone information, and so cannot be converted to
datetime.datetime.
datetime.datetime also has issues with timezones. For example, a datetime object
without timezone (naive) and a datetime with a timezone (aware) cannot be
compared. There is also an ordering issue with daylight saving time (DST) in
the duplicate hour of switching from DST to normal time.
datetime.datetime has been rejected because it cannot be used for functions
using an unspecified starting point like os.times() or time.clock().
For time.time() and time.clock_gettime(time.CLOCK_REALTIME): it is already
possible to get the current time as a datetime.datetime object using::
datetime.datetime.now(datetime.timezone.utc)
For os.stat(), it is simple to create a datetime.datetime object from a
decimal.Decimal timestamp in the UTC timezone::
datetime.datetime.fromtimestamp(value, datetime.timezone.utc)
.. note::
datetime.datetime only supports microsecond resolution, but can be enhanced
to support nanosecond.
datetime.timedelta
------------------
datetime.timedelta is the natural choice for a relative timestamp because it is
clear that this type contains a timestamp, whereas int, float and Decimal are
raw numbers. It can be used with datetime.datetime to get an absolute timestamp
when the starting point is known.
datetime.timedelta has been rejected because it cannot be coerced to float and
has a fixed resolution. One new standard timestamp type is enough, and Decimal is
preferred over datetime.timedelta. Converting a datetime.timedelta to float
requires an explicit call to the datetime.timedelta.total_seconds() method.
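The coercion point can be seen with a short check using only the standard
library: ``float()`` refuses a timedelta, while ``total_seconds()`` gives the
value. The variable names are chosen for this illustration.

```python
import datetime

delta = datetime.timedelta(seconds=1, microseconds=500000)

# The explicit conversion works.
seconds = delta.total_seconds()

# Implicit coercion through float() is not supported.
try:
    float(delta)
except TypeError:
    coercible = False
else:
    coercible = True
```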
.. note::
datetime.timedelta only supports microsecond resolution, but can be enhanced
to support nanosecond.
.. _tuple:
Tuple of integers
-----------------
To expose C functions in Python, a tuple of integers is the natural choice to
store a timestamp because the C language uses structures with integer fields
(e.g. timeval and timespec structures). Using only integers avoids the loss of
precision (Python supports integers of arbitrary length). Creating and parsing
a tuple of integers is simple and fast.
Depending on the exact format of the tuple, the precision can be arbitrary or
fixed. The precision can be chosen so that the loss of precision is smaller than
an arbitrary limit like one nanosecond.
Different formats have been proposed:
* A: (numerator, denominator)

  * value = numerator / denominator
  * resolution = 1 / denominator
  * denominator > 0

* B: (seconds, numerator, denominator)

  * value = seconds + numerator / denominator
  * resolution = 1 / denominator
  * 0 <= numerator < denominator
  * denominator > 0

* C: (intpart, floatpart, base, exponent)

  * value = intpart + floatpart / base\ :sup:`exponent`
  * resolution = 1 / base\ :sup:`exponent`
  * 0 <= floatpart < base\ :sup:`exponent`
  * base > 0
  * exponent >= 0

* D: (intpart, floatpart, exponent)

  * value = intpart + floatpart / 10\ :sup:`exponent`
  * resolution = 1 / 10\ :sup:`exponent`
  * 0 <= floatpart < 10\ :sup:`exponent`
  * exponent >= 0

* E: (sec, nsec)

  * value = sec + nsec × 10\ :sup:`-9`
  * resolution = 10\ :sup:`-9` (nanosecond)
  * 0 <= nsec < 10\ :sup:`9`

All formats support an arbitrary resolution, except for format (E).
The format (D) may not be able to store the exact value (it may lose precision)
if the clock frequency is arbitrary and cannot be expressed as a power of 10.
The format (C) has a similar issue, but in such case, it is possible to use
base=frequency and exponent=1.
The formats (C), (D) and (E) allow optimization for conversion to float if the
base is 2 and to decimal.Decimal if the base is 10.
The format (A) is a simple fraction. It supports arbitrary precision, is simple
(only two fields), only requires a simple division to get the floating point
value, and is already used by float.as_integer_ratio().
To simplify the implementation (especially the C implementation to avoid
integer overflow), a numerator bigger than the denominator can be accepted.
The tuple may be normalized later.
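
As a rough illustration of formats (A) and (B), the sketch below converts a fraction to float or Decimal and normalizes a (seconds, numerator, denominator) tuple. The helper names ``normalize``, ``fraction_to_float`` and ``fraction_to_decimal`` are hypothetical and not part of any proposed API; note that the Decimal result is rounded to the current decimal context precision.

```python
from decimal import Decimal


def normalize(seconds, numerator, denominator):
    """Normalize format (B) so that 0 <= numerator < denominator.

    Hypothetical helper: a numerator larger than the denominator (or a
    negative one) is folded into the seconds field.
    """
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    extra_seconds, numerator = divmod(numerator, denominator)
    return (seconds + extra_seconds, numerator, denominator)


def fraction_to_float(numerator, denominator):
    """Format (A) to float: a single true division."""
    return numerator / denominator


def fraction_to_decimal(numerator, denominator):
    """Format (A) to Decimal, rounded to the current context precision."""
    return Decimal(numerator) / Decimal(denominator)
```

For example, ``normalize(1, 5, 2)`` folds the two whole seconds contained in 5/2 into the seconds field, and ``(0.25).as_integer_ratio()`` yields a tuple that can be passed straight to ``fraction_to_float``.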
Tuples of integers have been rejected because they don't support arithmetic
operations.

.. note::

   On Windows, the ``QueryPerformanceCounter()`` clock uses the frequency of
   the processor which is an arbitrary number and so may not be a power of 2 or
   10. The frequency can be read using ``QueryPerformanceFrequency()``.

timespec structure
------------------
timespec is the C structure used to store a timestamp with nanosecond
resolution. Python can use a type with the same structure: (seconds,
nanoseconds). For convenience, arithmetic operations on timespec are supported.
Example of an incomplete timespec type supporting addition, subtraction and
coercion to float::

    class timespec(tuple):
        def __new__(cls, sec, nsec):
            if not isinstance(sec, int):
                raise TypeError
            if not isinstance(nsec, int):
                raise TypeError
            asec, nsec = divmod(nsec, 10 ** 9)
            sec += asec
            obj = tuple.__new__(cls, (sec, nsec))
            obj.sec = sec
            obj.nsec = nsec
            return obj

        def __float__(self):
            return self.sec + self.nsec * 1e-9

        def total_nanoseconds(self):
            return self.sec * 10 ** 9 + self.nsec

        def __add__(self, other):
            if not isinstance(other, timespec):
                raise TypeError
            ns_sum = self.total_nanoseconds() + other.total_nanoseconds()
            return timespec(*divmod(ns_sum, 10 ** 9))

        def __sub__(self, other):
            if not isinstance(other, timespec):
                raise TypeError
            ns_diff = self.total_nanoseconds() - other.total_nanoseconds()
            return timespec(*divmod(ns_diff, 10 ** 9))

        def __str__(self):
            if self.sec < 0 and self.nsec:
                sec = abs(1 + self.sec)
                nsec = 10**9 - self.nsec
                return '-%i.%09u' % (sec, nsec)
            else:
                return '%i.%09u' % (self.sec, self.nsec)

        def __repr__(self):
            return '<timespec(%s, %s)>' % (self.sec, self.nsec)

The timespec type is similar to the format (E) of tuples of integers, except
that it supports arithmetic and coercion to float.
The timespec type was rejected because it only supports nanosecond resolution
and requires implementing each arithmetic operation, whereas the Decimal type
is already implemented and well tested.
Alternatives: API design
========================
Add a string argument to specify the return type
------------------------------------------------
Add a string argument to functions returning timestamps, for example:
time.time(format="datetime"). A string is more extensible than a type: it is
possible to request a format that has no type, like a tuple of integers.
This API was rejected because it would require implicitly importing modules to
instantiate objects (e.g. import datetime to create datetime.datetime).
Importing a module may raise an exception and may be slow; such behaviour is
unexpected and surprising.
Add a global flag to change the timestamp type
----------------------------------------------
A global flag like os.stat_decimal_times(), similar to os.stat_float_times(),
can be added to set the timestamp type globally.
A global flag may cause issues with libraries and applications expecting float
instead of Decimal. Decimal is not fully compatible with float. float+Decimal
raises a TypeError for example. The os.stat_float_times() case is different
because an int can be coerced to float and int+float gives float.
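
The incompatibility mentioned above can be checked directly. The following is a small sketch, with ``can_add`` being a hypothetical helper used only for this illustration:

```python
from decimal import Decimal


def can_add(a, b):
    """Return True if a + b is supported, False if it raises TypeError."""
    try:
        a + b
    except TypeError:
        return False
    return True


# int and float mix freely, which is why os.stat_float_times() was safe.
int_plus_float = can_add(1, 1.5)
# float and Decimal do not mix: the addition raises TypeError.
float_plus_decimal = can_add(1.5, Decimal("1.5"))
# int and Decimal are compatible.
int_plus_decimal = can_add(1, Decimal("1.5"))
```

Code that adds a float offset to a timestamp would therefore start failing if a global flag silently switched the timestamp type to Decimal.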
Add a protocol to create a timestamp
------------------------------------
Instead of hard coding how timestamps are created, a new protocol can be added
to create a timestamp from a fraction.
For example, time.time(timestamp=type) would call the class method
type.__fromfraction__(numerator, denominator) to create a timestamp object of
the specified type. If the type doesn't support the protocol, a fallback is
used: type(numerator) / type(denominator).
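
A minimal sketch of how such a protocol could behave, assuming a hypothetical ``make_timestamp()`` function standing in for the lookup that time.time(timestamp=type) would perform, and a hypothetical ``NanoTimestamp`` class that implements ``__fromfraction__``:

```python
from decimal import Decimal


class NanoTimestamp:
    """Hypothetical type storing a timestamp as an integer nanosecond count."""

    def __init__(self, nanoseconds):
        self.nanoseconds = nanoseconds

    @classmethod
    def __fromfraction__(cls, numerator, denominator):
        # Round down to a whole number of nanoseconds.
        return cls(numerator * 10 ** 9 // denominator)


def make_timestamp(numerator, denominator, timestamp=float):
    """Create a timestamp of the requested type from a fraction."""
    factory = getattr(timestamp, "__fromfraction__", None)
    if factory is not None:
        return factory(numerator, denominator)
    # Fallback described above: type(numerator) / type(denominator).
    return timestamp(numerator) / timestamp(denominator)
```

With this sketch, ``make_timestamp(3, 2)`` and ``make_timestamp(3, 2, Decimal)`` use the fallback, while ``make_timestamp(3, 2, NanoTimestamp)`` goes through the class method.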
A variant is to use a "converter" callback to create a timestamp. Example
creating a float timestamp::

    def timestamp_to_float(numerator, denominator):
        return float(numerator) / float(denominator)

Common converters can be provided by time, datetime and other modules, or maybe
a specific "hires" module. Users can define their own converters.
Such a protocol has a limitation: the timestamp structure has to be decided
once and cannot be changed later. For example, adding a timezone or the
absolute start of the timestamp would break the API.
The protocol proposal was considered excessive given the requirements, but
the specific syntax proposed (time.time(timestamp=type)) allows it to be
introduced later if compelling use cases are discovered.
.. note::

   Other formats may be used instead of a fraction: see the tuple of integers
   section for example.

Add new fields to os.stat
-------------------------
To get the creation, modification and access time of a file with nanosecond
resolution, three fields can be added to the os.stat() structure.
The new fields can be timestamps with nanosecond resolution (e.g. Decimal) or
the nanosecond part of each timestamp (int).
If the new fields are timestamps with nanosecond resolution, populating the
extra fields would be time consuming. Any call to os.stat() would be slower,
even if os.stat() is only called to check if a file exists. A parameter can be
added to os.stat() to make these fields optional, but the structure would then
have a variable number of fields.
If the new fields only contain the fractional part (nanoseconds), os.stat()
would be efficient. These fields would always be present and so set to zero if
the operating system does not support sub-second resolution. Splitting a
timestamp into two parts, seconds and nanoseconds, is similar to the timespec
type and tuple of integers, and so has the same drawbacks.
Adding new fields to the os.stat() structure does not solve the nanosecond
issue in other modules (e.g. the time module).
Add a boolean argument
----------------------
Because we only need one new type (Decimal), a simple boolean flag can be
added. Example: time.time(decimal=True) or time.time(hires=True).
Such a flag would require a hidden import, which is considered bad
practice.
The boolean argument API was rejected because it is not "pythonic". Changing
the return type with a parameter value is preferred over a boolean parameter (a
flag).
Add new functions
-----------------
Add new functions for each type, examples:
* time.clock_decimal()
* time.time_decimal()
* os.stat_decimal()
* os.stat_timespec()
* etc.
Adding a new function for each function creating timestamps would duplicate a
lot of code and would be a pain to maintain.
Add a new hires module
----------------------
Add a new module called "hires" with the same API as the time module, except
that it would return timestamps with high resolution, e.g. decimal.Decimal.
Adding a new module avoids linking low-level modules like time or os to the
decimal module.
This idea was rejected because it requires duplicating most of the code of the
time module, would be a pain to maintain, and timestamps are used in modules
other than the time module. Examples: signal.sigtimedwait(), select.select(),
resource.getrusage(), os.stat(), etc. Duplicating the code of each module is
not acceptable.
Links
=====
Python:
* `Issue #7652: Merge C version of decimal into py3k <http://bugs.python.org/issue7652>`_ (cdecimal)
* `Issue #11457: os.stat(): add new fields to get timestamps as Decimal objects with nanosecond resolution <http://bugs.python.org/issue11457>`_
* `Issue #13882: PEP 410: Use decimal.Decimal type for timestamps <http://bugs.python.org/issue13882>`_
* `[Python-Dev] Store timestamps as decimal.Decimal objects <http://mail.python.org/pipermail/python-dev/2012-January/116025.html>`_
Other languages:
* Ruby (1.9.3), the `Time class <http://ruby-doc.org/core-1.9.3/Time.html>`_
  supports picosecond resolution (10\ :sup:`-12`)
* .NET framework, `DateTime type <http://msdn.microsoft.com/en-us/library/system.datetime.ticks.aspx>`_:
  number of 100-nanosecond intervals that have elapsed since 12:00:00
  midnight, January 1, 0001. DateTime.Ticks uses a signed 64-bit integer.
* Java (1.5), `System.nanoTime() <http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/System.html#nanoTime()>`_:
  wallclock with an unspecified starting point as a number of nanoseconds; uses
  a signed 64-bit integer (long).
* Perl, `Time::HiRes module <http://perldoc.perl.org/Time/HiRes.html>`_:
  uses float, so it has the same loss of precision issue at nanosecond
  resolution as Python float timestamps
Copyright
=========
This document has been placed in the public domain.

208
pep-0411.txt Normal file

@ -0,0 +1,208 @@
PEP: 411
Title: Provisional packages in the Python standard library
Version: $Revision$
Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan@gmail.com>,
Eli Bendersky <eliben@gmail.com>
Status: Draft
Type: Informational
Content-Type: text/x-rst
Created: 2012-02-10
Python-Version: 3.3
Post-History: 2012-02-10
Abstract
========
The process of including a new package into the Python standard library is
hindered by the API lock-in and promise of backward compatibility implied by
a package being formally part of Python. This PEP describes a methodology
for marking a standard library package "provisional" for the period of a single
minor release. A provisional package may have its API modified prior to
"graduating" into a "stable" state. On one hand, this state provides the
package with the benefits of being formally part of the Python distribution.
On the other hand, the core development team explicitly states that no promises
are made with regard to the stability of the package's API, which may
change for the next release. While it is considered an unlikely outcome,
such packages may even be removed from the standard library without a
deprecation period if the concerns regarding their API or maintenance prove
well-founded.
Proposal - a documented provisional state
=========================================
Whenever the Python core development team decides that a new package should be
included into the standard library, but isn't entirely sure about whether the
package's API is optimal, the package can be included and marked as
"provisional".
In the next minor release, the package may either be "graduated" into a normal
"stable" state in the standard library, remain in provisional state, or be
rejected and removed entirely from the Python source tree. If the package ends
up graduating into the stable state after being provisional, its API may
be changed according to accumulated feedback. The core development team
explicitly makes no guarantees about API stability and backward compatibility
of provisional packages.
Marking a package provisional
-----------------------------
A package will be marked provisional by a notice in its documentation page and
its docstring. The following paragraph will be added as a note at the top of
the documentation page:

    The <X> package has been included in the standard library on a
    provisional basis. Backwards incompatible changes (up to and including
    removal of the package) may occur if deemed necessary by the core
    developers.

The phrase "provisional basis" will then be a link to the glossary term
"provisional package", defined as:

    A provisional package is one which has been deliberately excluded from the
    standard library's normal backwards compatibility guarantees. While major
    changes to such packages are not expected, as long as they are marked
    provisional, backwards incompatible changes (up to and including removal of
    the package) may occur if deemed necessary by core developers. Such changes
    will not be made gratuitously - they will occur only if serious flaws are
    uncovered that were missed prior to the inclusion of the package.

    This process allows the standard library to continue to evolve over time,
    without locking in problematic design errors for extended periods of time.
    See PEP 411 for more details.

The following will be added to the start of the package's docstring:

    The API of this package is currently provisional. Refer to the
    documentation for details.

Moving a package from the provisional to the stable state simply implies
removing these notes from its documentation page and docstring.
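
As an illustration only, the docstring marker could be detected programmatically along these lines. The ``is_provisional()`` helper and the ``example_pkg`` module are hypothetical and are not proposed by this PEP:

```python
import types

# The note this PEP proposes for the start of a provisional package docstring.
PROVISIONAL_NOTE = (
    "The API of this package is currently provisional. Refer to the\n"
    "documentation for details."
)


def is_provisional(module):
    """Return True if the module docstring starts with the provisional note."""
    doc = module.__doc__ or ""
    # Collapse whitespace so that line wrapping in the docstring is ignored.
    collapsed_doc = " ".join(doc.split())
    collapsed_note = " ".join(PROVISIONAL_NOTE.split())
    return collapsed_doc.startswith(collapsed_note)


# Stand-in for a real provisional package (hypothetical name).
example_pkg = types.ModuleType(
    "example_pkg", PROVISIONAL_NOTE + "\n\nA hypothetical package."
)
```

Graduating the package then corresponds to deleting the note, after which ``is_provisional(example_pkg)`` would return False.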
Which packages should go through the provisional state
------------------------------------------------------
We expect most packages proposed for addition into the Python standard library
to go through a minor release in the provisional state. There may, however,
be some exceptions, such as packages that use a pre-defined API (for example
``lzma``, which generally follows the API of the existing ``bz2`` package),
or packages with an API that has wide acceptance in the Python development
community.
In any case, packages that are proposed to be added to the standard library,
whether via the provisional state or directly, must fulfill the acceptance
conditions set by PEP 2.
Criteria for "graduation"
-------------------------
In principle, most provisional packages should eventually graduate to the
stable standard library. Some reasons for not graduating are:
* The package may prove to be unstable or fragile, without sufficient developer
support to maintain it.
* A much better alternative package may be found during the preview release.
Essentially, the decision will be made by the core developers on a per-case
basis. The point to emphasize here is that a package's inclusion in the
standard library as "provisional" in some release does not guarantee it will
continue being part of Python in the next release.
Rationale
=========
Benefits for the core development team
--------------------------------------
Currently, the core developers are really reluctant to add new interfaces to
the standard library. This is because as soon as they're published in a
release, API design mistakes get locked in due to backward compatibility
concerns.
By gating all major API additions through some kind of a provisional mechanism
for a full release, we get one full release cycle of community feedback
before we lock in the APIs with our standard backward compatibility guarantee.
We can also start integrating provisional packages with the rest of the standard
library early, so long as we make it clear to packagers that the provisional
packages should not be considered optional. The only difference between
provisional APIs and the rest of the standard library is that provisional APIs
are explicitly exempted from the usual backward compatibility guarantees.
Benefits for end users
----------------------
For future end users, the broadest benefit lies in a better "out-of-the-box"
experience - rather than being told "oh, the standard library tools for task X
are horrible, download this 3rd party library instead", those superior tools
are more likely to be just an import away.
For environments where developers are required to conduct due diligence on
their upstream dependencies (severely harming the cost-effectiveness of, or
even ruling out entirely, much of the material on PyPI), the key benefit lies
in ensuring that all packages in the provisional state are clearly under
python-dev's aegis from at least the following perspectives:
* Licensing: Redistributed by the PSF under a Contributor Licensing Agreement.
* Documentation: The documentation of the package is published and organized via
the standard Python documentation tools (i.e. ReST source, output generated
with Sphinx and published on http://docs.python.org).
* Testing: The package test suites are run on the python.org buildbot fleet
and results published via http://www.python.org/dev/buildbot.
* Issue management: Bugs and feature requests are handled on
http://bugs.python.org
* Source control: The master repository for the software is published
on http://hg.python.org.
Candidates for provisional inclusion into the standard library
==============================================================
For Python 3.3, there are a number of clear current candidates:
* ``regex`` (http://pypi.python.org/pypi/regex) - approved by Guido [#]_.
* ``daemon`` (PEP 3143)
* ``ipaddr`` (PEP 3144)
Other possible future use cases include:
* Improved HTTP modules (e.g. ``requests``)
* HTML 5 parsing support (e.g. ``html5lib``)
* Improved URL/URI/IRI parsing
* A standard image API (PEP 368)
* Improved encapsulation of import state (PEP 406)
* Standard event loop API (PEP 3153)
* A binary version of WSGI for Python 3 (e.g. PEP 444)
* Generic function support (e.g. ``simplegeneric``)
Rejected alternatives and variations
====================================
See PEP 408.
References
==========
.. [#] http://mail.python.org/pipermail/python-dev/2012-January/115962.html
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

181
pep-0412.txt Normal file

@ -0,0 +1,181 @@
PEP: 412
Title: Key-Sharing Dictionary
Version: $Revision$
Last-Modified: $Date$
Author: Mark Shannon <mark@hotpy.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 08-Feb-2012
Python-Version: 3.3 or 3.4
Post-History: 08-Feb-2012
Abstract
========
This PEP proposes a change in the implementation of the builtin dictionary
type ``dict``. The new implementation allows dictionaries which are used as
attribute dictionaries (the ``__dict__`` attribute of an object) to share
keys with other attribute dictionaries of instances of the same class.
Motivation
==========
The current dictionary implementation uses more memory than is necessary
when used as a container for object attributes as the keys are
replicated for each instance rather than being shared across many instances
of the same class.
Despite this, the current dictionary implementation is finely tuned and
performs very well as a general-purpose mapping object.
By separating the keys (and hashes) from the values it is possible to share
the keys between multiple dictionaries and improve memory use.
By ensuring that keys are separated from the values only when beneficial,
it is possible to retain the high-performance of the current dictionary
implementation when used as a general-purpose mapping object.
Behaviour
=========
The new dictionary behaves in the same way as the old implementation.
It fully conforms to the Python API, the C API and the ABI.
Performance
===========
Memory Usage
------------
Reduction in memory use is directly related to the number of dictionaries
with shared keys in existence at any time. These dictionaries are typically
half the size of the current dictionary implementation.
Benchmarking shows that memory use is reduced by 10% to 20% for
object-oriented programs with no significant change in memory use
for other programs.
Speed
-----
The performance of the new implementation is dominated by memory locality
effects. When keys are not shared (for example in module dictionaries
and dictionaries explicitly created by dict() or {}) then performance is
unchanged (within a percent or two) from the current implementation.
For the shared keys case, the new implementation tends to separate keys
from values, but reduces total memory usage. This will improve performance
in many cases as the effects of reduced memory usage outweigh the loss of
locality, but some programs may show a small slowdown.
Benchmarking shows no significant change of speed for most benchmarks.
Object-oriented benchmarks show small speedups when they create large
numbers of objects of the same class (the gcbench benchmark shows a 10%
speedup; this is likely to be an upper limit).
Implementation
==============
Both the old and new dictionaries consist of a fixed-sized dict struct and
a re-sizeable table.
In the new dictionary the table can be further split into a keys table and
values array.
The keys table holds the keys and hashes and (for non-split tables) the
values as well. It differs from the original implementation only in that it
contains a number of fields that were previously in the dict struct.
If a table is split, the values in the keys table are ignored; instead, the
values are held in a separate array.
Split-Table dictionaries
------------------------
When dictionaries are created to fill the __dict__ slot of an object, they are
created in split form. The keys table is cached in the type, potentially
allowing all attribute dictionaries of instances of one class to share keys.
In the event of the keys of these dictionaries starting to diverge,
individual dictionaries will lazily convert to the combined-table form.
This ensures good memory use in the common case, and correctness in all cases.
When resizing a split dictionary it is converted to a combined table.
If the resize is the result of storing an instance attribute, and there is
only one instance of the class, then the dictionary will be re-split immediately.
Since most OO code will set attributes in the __init__ method, all attributes
will be set before a second instance is created and no more resizing will be
necessary as all further instance dictionaries will have the correct size.
For more complex use patterns, it is impossible to know what is the best
approach, so the implementation allows extra insertions up to the point
of a resize when it reverts to the combined table (non-shared keys).
A deletion from a split dictionary does not change the keys table; it simply
removes the value from the values array.
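
The split-table idea can be modelled in pure Python. The toy classes below are an illustration only, not the C implementation: ``SharedKeys`` and ``SplitDict`` are hypothetical names, hashing and resizing are left out, and conversion to the combined form is not modelled.

```python
class SharedKeys:
    """Toy keys table shared by all attribute dictionaries of one class."""

    def __init__(self):
        self.index = {}  # key -> slot number used in every values array

    def slot_for(self, key):
        # Assign the next free slot the first time a key is seen.
        return self.index.setdefault(key, len(self.index))


class SplitDict:
    """Toy split-table dictionary: keys are shared, values are private."""

    _MISSING = object()

    def __init__(self, shared):
        self.shared = shared
        self.values = []

    def __setitem__(self, key, value):
        slot = self.shared.slot_for(key)
        # Grow the private values array until it covers the slot.
        while len(self.values) <= slot:
            self.values.append(self._MISSING)
        self.values[slot] = value

    def __getitem__(self, key):
        slot = self.shared.index.get(key)
        if (slot is None or slot >= len(self.values)
                or self.values[slot] is self._MISSING):
            raise KeyError(key)
        return self.values[slot]

    def __delitem__(self, key):
        self[key]  # raises KeyError if the key has no value here
        # Deletion only clears the value; the shared keys table is untouched.
        self.values[self.shared.index[key]] = self._MISSING
```

Two ``SplitDict`` objects built on the same ``SharedKeys`` store each key once, and deleting a key from one of them leaves the other unaffected.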
Combined-Table dictionaries
---------------------------
Explicit dictionaries (dict() or {}), module dictionaries and most other
dictionaries are created as combined-table dictionaries.
A combined-table dictionary never becomes a split-table dictionary.
Combined tables are laid out in much the same way as the tables in the old
dictionary, resulting in very similar performance.
Reference Implementation
========================
The new dictionary implementation is available at [1]_.
Pros and Cons
=============
Pros
----
Significant memory savings for object-oriented applications.
Small improvement to speed for programs which create lots of similar objects.
Cons
----
Change to data structures:
Third party modules which meddle with the internals of the dictionary
implementation will break.
Changes to repr() output and iteration order:
For most cases, this will be unchanged.
However for some split-table dictionaries the iteration order will
change.
Neither of these cons should be a problem.
Modules which meddle with the internals of the dictionary
implementation are already broken and should be fixed to use the API.
The iteration order of dictionaries was never defined and has always been
arbitrary; it is different for Jython and PyPy.
Alternative Implementation
--------------------------
An alternative implementation for split tables, which could save even more
memory, is to store an index in the value field of the keys table (instead
of ignoring the value field). This index would explicitly state where in the
value array to look. The value array would then only require one field for each
usable slot in the key table, rather than each slot in the key table.
This "indexed" version would reduce the size of the value array by about
one third. The keys table would need an extra "values_size" field, increasing
the size of combined dicts by one word.
The extra indirection adds more complexity to the code, potentially reducing
performance a little.
The "indexed" version will not be included in this implementation,
but should be considered deferred rather than rejected,
pending further experimentation.
References
==========
.. [1] Reference Implementation:
https://bitbucket.org/markshannon/cpython_new_dict
Copyright
=========
This document has been placed in the public domain.

919
pep-0413.txt Normal file

@ -0,0 +1,919 @@
PEP: 413
Title: Faster evolution of the Python Standard Library
Version: $Revision$
Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan@gmail.com>
Status: Draft
Type: Process
Content-Type: text/x-rst
Created: 2012-02-24
Post-History: 2012-02-24, 2012-02-25
Resolution: TBD
Abstract
========
This PEP proposes the adoption of a separate versioning scheme for the
standard library (distinct from, but coupled to, the existing language
versioning scheme) that allows accelerated releases of the Python standard
library, while maintaining (or even slowing down) the current rate of
change in the core language definition.
Like PEP 407, it aims to adjust the current balance between measured
change that allows the broader community time to adapt and being able to
keep pace with external influences that evolve more rapidly than the current
release cycle can handle (this problem is particularly notable for
standard library elements that relate to web technologies).
However, it's more conservative in its aims than PEP 407, seeking to
restrict the increased pace of development to builtin and standard library
interfaces, without affecting the rate of change for other elements such
as the language syntax and version numbering as well as the CPython
binary API and bytecode format.
Rationale
=========
To quote the PEP 407 abstract:
Finding a release cycle for an open-source project is a delicate exercise
in managing mutually contradicting constraints: developer manpower,
availability of release management volunteers, ease of maintenance for
users and third-party packagers, quick availability of new features (and
behavioural changes), availability of bug fixes without pulling in new
features or behavioural changes.
The current release cycle errs on the conservative side. It is adequate
for people who value stability over reactivity. This PEP is an attempt to
keep the stability that has become a Python trademark, while offering a
more fluid release of features, by introducing the notion of long-term
support versions.
I agree with the PEP 407 authors that the current release cycle of the
*standard library* is too slow to effectively cope with the pace of change
in some key programming areas (specifically, web protocols and related
technologies, including databases, templating and serialisation formats).
However, I have written this competing PEP because I believe that the
approach proposed in PEP 407 of offering full, potentially binary
incompatible releases of CPython every 6 months places too great a burden
on the wider Python ecosystem.
Under the current CPython release cycle, distributors of key binary
extensions will often support Python releases even after the CPython branches
enter "security fix only" mode (for example, Twisted currently ships binaries
for 2.5, 2.6 and 2.7, NumPy and SciPy support those 3 along with 3.1 and 3.2,
PyGame adds a 2.4 binary release, wxPython provides both 32-bit and 64-bit
binaries for 2.6 and 2.7, etc).
If CPython were to triple (or more) its rate of releases, the developers of
those libraries (many of which are even more resource starved than CPython)
would face an unpalatable choice: either adopt the faster release cycle
themselves (up to 18 simultaneous binary releases for PyGame!), drop
older Python versions more quickly, or else tell their users to stick to the
CPython LTS releases (thus defeating the entire point of speeding up the
CPython release cycle in the first place).
Similarly, many support tools for Python (e.g. syntax highlighters) can take
quite some time to catch up with language level changes.
At a cultural level, the Python community is also accustomed to a certain
meaning for Python version numbers - they're linked to deprecation periods,
support periods, all sorts of things. PEP 407 proposes that collective
knowledge all be swept aside, without offering a compelling rationale for why
such a course of action is actually *necessary* (aside from, perhaps, making
the lives of the CPython core developers a little easier at the expense of
everyone else).
However, if we go back to the primary rationale for increasing the pace of
change (i.e. more timely support for web protocols and related technologies),
we can note that those only require *standard library* changes. That means
many (perhaps even most) of the negative effects on the wider community can
be avoided by explicitly limiting which parts of CPython are affected by the
new release cycle, and allowing other parts to evolve at their current, more
sedate, pace.
Proposal
========
This PEP proposes the introduction of a new kind of CPython release:
"standard library releases". As with PEP 407, this will give CPython 3 kinds
of release:
* Language release: "x.y.0"
* Maintenance release: "x.y.z" (where z > 0)
* Standard library release: "x.y (xy.z)" (where z > 0)
Under this scheme, an unqualified version reference (such as "3.3") would
always refer to the most recent corresponding language or maintenance
release. It will never be used without qualification to refer to a standard
library release (at least, not by python-dev - obviously, we can only set an
example, not force the rest of the Python ecosystem to go along with it).
Language releases will continue as they are now, as new versions of the
Python language definition, along with a new version of the CPython
interpreter and the Python standard library. Accordingly, a language
release may contain any and all of the following changes:
* new language syntax
* new standard library changes (see below)
* new deprecation warnings
* removal of previously deprecated features
* changes to the emitted bytecode
* changes to the AST
* any other significant changes to the compilation toolchain
* changes to the core interpreter eval loop
* binary incompatible changes to the C ABI (although the PEP 384 stable ABI
must still be preserved)
* bug fixes
Maintenance releases will also continue as they do today, being strictly
limited to bug fixes for the corresponding language release. No new features
or radical internal changes are permitted.
The new standard library releases will occur in parallel with each
maintenance release and will be qualified with a new version identifier
documenting the standard library version. Standard library releases may
include the following changes:
* new features in pure Python modules
* new features in C extension modules (subject to PEP 399 compatibility
requirements)
* new features in language builtins (provided the C ABI remains unaffected)
* bug fixes from the corresponding maintenance release
Standard library version identifiers are constructed by combining the major
and minor version numbers for the Python language release into a single two
digit number and then appending a sequential standard library version
identifier.
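
The construction rule can be sketched as follows. The function names are hypothetical, and the restriction to single-digit major and minor numbers is an assumption drawn from the "two digit number" wording above:

```python
def stdlib_version(major, minor, serial):
    """Build a standard library version identifier such as "33.1"."""
    # Assumption: both numbers are single digits, so the combined
    # language version is always exactly two digits.
    if not (0 <= major <= 9 and 0 <= minor <= 9):
        raise ValueError("major and minor must be single digits")
    if serial < 0:
        raise ValueError("serial must not be negative")
    return "%d%d.%d" % (major, minor, serial)


def release_label(major, minor, serial):
    """Format a standard library release as "x.y (xy.z)"."""
    return "%d.%d (%s)" % (major, minor, stdlib_version(major, minor, serial))
```

For instance, the first standard library release for Python 3.3 is labelled by ``release_label(3, 3, 1)``.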
Release Cycle
-------------
When maintenance releases are created, *two* new versions of Python would
actually be published on python.org (using the first 3.3 maintenance release,
planned for February 2013 as an example)::

    3.3.1       # Maintenance release
    3.3 (33.1)  # Standard library release

A further 6 months later, the next 3.3 maintenance release would again be
accompanied by a new standard library release::

    3.3.2       # Maintenance release
    3.3 (33.2)  # Standard library release

Again, the standard library release would be binary compatible with the
previous language release, merely offering additional features at the
Python level.
Finally, 18 months after the release of 3.3, a new language release would
be made around the same time as the final 3.3 maintenance and standard
library releases::

    3.3.3       # Maintenance release
    3.3 (33.3)  # Standard library release
    3.4.0       # Language release

The 3.4 release cycle would then follow a similar pattern to that for 3.3::

    3.4.1       # Maintenance release
    3.4 (34.1)  # Standard library release

    3.4.2       # Maintenance release
    3.4 (34.2)  # Standard library release

    3.4.3       # Maintenance release
    3.4 (34.3)  # Standard library release
    3.5.0       # Language release

Programmatic Version Identification
-----------------------------------
To expose the new version details programmatically, this PEP proposes the
addition of a new ``sys.stdlib_info`` attribute that records the new
standard library version above and beyond the underlying interpreter
version. Using the initial Python 3.3 release as an example::
sys.stdlib_info(python=33, version=0, releaselevel='final', serial=0)
This information would also be included in the ``sys.version`` string::
Python 3.3.0 (33.0, default, Feb 17 2012, 23:03:41)
[GCC 4.6.1]
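Since ``sys.stdlib_info`` is only a proposal, code that wants to experiment with it has to tolerate its absence. The sketch below assumes the four field layout shown above; the ``StdlibInfo`` tuple and the fallback rule are illustrative assumptions, not part of this PEP:

```python
import sys
from collections import namedtuple

# Hypothetical stand-in mirroring the proposed attribute's fields.
StdlibInfo = namedtuple("StdlibInfo", "python version releaselevel serial")


def get_stdlib_info():
    """Return sys.stdlib_info if it exists, else a best-effort fallback."""
    info = getattr(sys, "stdlib_info", None)
    if info is not None:
        return info
    vi = sys.version_info
    # Assumption: an interpreter without the attribute is treated as
    # the initial (".0") standard library release of its language
    # version. Concatenating major and minor only yields a two digit
    # number while the minor version is a single digit.
    python = int("{}{}".format(vi.major, vi.minor))
    return StdlibInfo(python, 0, vi.releaselevel, vi.serial)


info = get_stdlib_info()
print("{}.{}".format(info.python, info.version))
```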
Security Fixes and Other "Out of Cycle" Releases
------------------------------------------------
For maintenance releases, the process of handling out-of-cycle releases (for
example, to fix a security issue or resolve a critical bug in a new release)
remains the same as it is now: the micro version number is incremented and a
new release is made incorporating the required bug fixes, as well as any
other bug fixes that have been committed since the previous release.
For standard library releases, the process is essentially the same, but the
corresponding "What's New?" document may require some tidying up for the
release (as the standard library release may incorporate new features,
not just bug fixes).
User Scenarios
==============
The versioning scheme proposed above is based on a number of user scenarios
that are likely to be encountered if this scheme is adopted. In each case,
the scenario is described for the status quo (i.e. slow release cycle),
the versioning scheme in this PEP and the freewheeling minor version number
scheme proposed in PEP 407.
To give away the ending, the point of using a separate version number is that
for almost all scenarios, the important number is the *language* version, not
the standard library version. Most users won't even need to care that the
standard library version number exists. In the two identified cases where
it matters, providing it as a separate number is actually clearer and more
explicit than embedding the two different kinds of number into a single
sequence and then tagging some of the numbers in the unified sequence as
special.
Novice user, downloading Python from python.org in March 2013
-------------------------------------------------------------
**Status quo:** must choose between 3.3 and 2.7
**This PEP:** must choose between 3.3 (33.1), 3.3 and 2.7.
**PEP 407:** must choose between 3.4, 3.3 (LTS) and 2.7.
**Verdict:** explaining the meaning of a Long Term Support release is about as
complicated as explaining the meaning of the proposed standard library release
version numbers. I call this a tie.
Novice user, attempting to judge currency of third party documentation
----------------------------------------------------------------------
**Status quo:** minor version differences indicate 18-24 months of
language evolution
**This PEP:** same as status quo for language core, standard library version
numbers indicate 6 months of standard library evolution.
**PEP 407:** minor version differences indicate 18-24 months of language
evolution up to 3.3, then 6 months of language evolution thereafter.
**Verdict:** Since language changes and deprecations can have a much bigger
effect on the accuracy of third party documentation than the addition of new
features to the standard library, I'm calling this a win for the scheme
in this PEP.
Novice user, looking for an extension module binary release
-----------------------------------------------------------
**Status quo:** look for the binary corresponding to the Python version you are
running.
**This PEP:** same as status quo.
**PEP 407 (full releases):** same as status quo, but corresponding binary version
is more likely to be missing (or, if it does exist, has to be found amongst
a much larger list of alternatives).
**PEP 407 (ABI updates limited to LTS releases):** all binary release pages will
need to tell users that Python 3.3, 3.4 and 3.5 all need the 3.3 binary.
**Verdict:** I call this a clear win for the scheme in this PEP. Absolutely
nothing changes from the current situation, since the standard library
version is actually irrelevant in this case (only binary extension
compatibility is important).
Extension module author, deciding whether or not to make a binary release
-------------------------------------------------------------------------
**Status quo:** unless using the PEP 384 stable ABI, a new binary release is
needed every time the minor version number changes.
**This PEP:** same as status quo.
**PEP 407 (full releases):** same as status quo, but becomes a far more
frequent occurrence.
**PEP 407 (ABI updates limited to LTS releases):** before deciding, must first
look up whether the new release is an LTS release or an interim release. If
it is an LTS release, then a new build is necessary.
**Verdict:** I call this another clear win for the scheme in this PEP. As with
the end user facing side of this problem, the standard library version is
actually irrelevant in this case. Moving that information out to a
separate number avoids creating unnecessary confusion.
Python developer, deciding priority of eliminating a Deprecation Warning
------------------------------------------------------------------------
**Status quo:** code that triggers deprecation warnings is not guaranteed to
run on a version of Python with a higher minor version number.
**This PEP:** same as status quo
**PEP 407:** unclear, as the PEP doesn't currently spell this out. Assuming the
deprecation cycle is linked to LTS releases, then upgrading to a non-LTS
release is safe but upgrading to the next LTS release may require avoiding
the deprecated construct.
**Verdict:** another clear win for the scheme in this PEP since, once again, the
standard library version is irrelevant in this scenario.
Alternative interpreter implementor, updating with new features
---------------------------------------------------------------
**Status quo:** new Python versions arrive infrequently, but are a mish-mash of
standard library updates and core language definition and interpreter
changes.
**This PEP:** standard library updates, which are easier to integrate, are
made available more frequently in a form that is clearly and explicitly
compatible with the previous version of the language definition. This means
that, once an alternative implementation catches up to Python 3.3, they
should have a much easier time incorporating standard library features as
they happen (especially pure Python changes), leaving minor version number
updates as the only task that requires updates to their core compilation and
execution components.
**PEP 407 (full releases):** same as status quo, but becomes a far more
frequent occurrence.
**PEP 407 (language updates limited to LTS releases):** unclear, as the PEP
doesn't currently spell out a specific development strategy. Assuming a
3.3 compatibility branch is adopted (as proposed in this PEP), then the
outcome would be much the same, but the version number signalling would be
slightly less clear (since you would have to check to see if a particular
release was an LTS release or not).
**Verdict:** while not as clear cut as some previous scenarios, I'm still
calling this one in favour of the scheme in this PEP. Explicit is better than
implicit, and the scheme in this PEP makes a clear split between the two
different kinds of update rather than adding a separate "LTS" tag to an
otherwise ordinary release number. Tagging a particular version as being
special is great for communicating with version control systems and associated
automated tools, but it's a lousy way to communicate information to other
humans.
Python developer, deciding their minimum version dependency
-----------------------------------------------------------
**Status quo:** look for "version added" or "version changed" markers in the
documentation, check against ``sys.version_info``
**This PEP:** look for "version added" or "version changed" markers in the
documentation. If written as a bare Python version, such as "3.3", check
against ``sys.version_info``. If qualified with a standard library version,
such as "3.3 (33.1)", check against ``sys.stdlib_info``.
**PEP 407:** same as status quo
**Verdict:** the scheme in this PEP actually allows third party libraries to be
more explicit about their rate of adoption of standard library features. More
conservative projects will likely pin their dependency to the language
version and avoid features added in the standard library releases. Faster
moving projects could instead declare their dependency on a particular
standard library version. However, since PEP 407 does have the advantage of
preserving the status quo, I'm calling this one for PEP 407 (albeit with a
slim margin).
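To make the two kinds of dependency check concrete, here is a minimal sketch. It takes the version tuples as arguments so it can run without the proposed ``sys.stdlib_info`` attribute; the function name and its tuple based signature are assumptions made for this example:

```python
def meets_requirement(version_info, stdlib_info, min_python, min_stdlib=None):
    """Check version tuples against a minimum requirement.

    version_info: tuple shaped like sys.version_info, e.g. (3, 3, 1, ...)
    stdlib_info:  tuple shaped like the proposed sys.stdlib_info,
                  e.g. (33, 1, 'final', 0)
    min_python:   (major, minor) language requirement
    min_stdlib:   optional (python, version) standard library requirement
    """
    if tuple(version_info[:2]) < tuple(min_python):
        return False
    if min_stdlib is None:
        # A bare language version such as "3.3" was requested.
        return True
    # A qualified version such as "3.3 (33.1)" was requested. Tuple
    # comparison also accepts later language releases, e.g. (34, 0).
    return tuple(stdlib_info[:2]) >= tuple(min_stdlib)


# A 3.3.1 interpreter shipping standard library release 33.1.
version_info = (3, 3, 1, "final", 0)
stdlib_info = (33, 1, "final", 0)
print(meets_requirement(version_info, stdlib_info, (3, 3)))           # True
print(meets_requirement(version_info, stdlib_info, (3, 3), (33, 2)))  # False
```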
Python developers, attempting to reproduce a tracker issue
----------------------------------------------------------
**Status quo:** if not already provided, ask the reporter which version of
Python they're using. This is often done by asking for the first two lines
displayed by the interactive prompt or the value of ``sys.version``.
**This PEP:** same as the status quo (as ``sys.version`` will be updated to
also include the standard library version), but may be needed on additional
occasions (where the user knew enough to state their Python version, but that
proved to be insufficient to reproduce the fault).
**PEP 407:** same as the status quo
**Verdict:** another marginal win for PEP 407. The new standard library version
*is* an extra piece of information that users may need to pass back to
developers when reporting issues with Python libraries (or Python itself,
on our own tracker). However, by including it in ``sys.version``, many
fault reports will already include it, and it is easy to request if needed.
CPython release managers, handling a security fix
-------------------------------------------------
**Status quo:** create a new maintenance release incorporating the security
fix and any other bug fixes under source control. Also create source releases
for any branches open solely for security fixes.
**This PEP:** same as the status quo for maintenance branches. Also create a
new standard library release (potentially incorporating new features along
with the security fix). For security branches, create source releases for
both the former maintenance branch and the standard library update branch.
**PEP 407:** same as the status quo for maintenance and security branches,
but handling security fixes for non-LTS releases is currently an open
question.
**Verdict:** until PEP 407 is updated to actually address this scenario, a
clear win for this PEP.
Effects
=======
Effect on development cycle
---------------------------
Similar to PEP 407, this PEP will break up the delivery of new features into
more discrete chunks. Instead of a whole raft of changes landing all at once
in a language release, each language release will be limited to 6 months'
worth of standard library changes, as well as any changes associated with
new syntax.
Effect on workflow
------------------
This PEP proposes the creation of a single additional branch for use in the
normal workflow. After the release of 3.3, the following branches would be
in use::
2.7 # Maintenance branch, no change
3.3 # Maintenance branch, as for 3.2
3.3-compat # New branch, backwards compatible changes
default # Language changes, standard library updates that depend on them
When working on a new feature, developers will need to decide whether or not
it is an acceptable change for a standard library release. If so, then it
should be checked in on ``3.3-compat`` and then merged to ``default``.
Otherwise it should be checked in directly to ``default``.
The "version added" and "version changed" markers for any changes made on
the ``3.3-compat`` branch would need to be flagged with both the language
version and the standard library version. For example: "3.3 (33.1)".
Any changes made directly on the ``default`` branch would just be flagged
with "3.4" as usual.
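Documentation tooling would need to tell the two marker forms apart. A minimal sketch of such parsing follows, assuming markers are written exactly as in the examples above; the function name and the regular expression are hypothetical:

```python
import re

# Matches "3.4" as well as "3.3 (33.1)".
_MARKER = re.compile(r"^(\d+)\.(\d+)(?:\s+\((\d+)\.(\d+)\))?$")


def parse_version_marker(text):
    """Split a marker into (language, stdlib) version tuples.

    '3.3 (33.1)' gives ((3, 3), (33, 1)); a bare '3.4' gives
    ((3, 4), None).
    """
    match = _MARKER.match(text.strip())
    if match is None:
        raise ValueError("unrecognised version marker: {!r}".format(text))
    major, minor, python, version = match.groups()
    language = (int(major), int(minor))
    stdlib = None if python is None else (int(python), int(version))
    return language, stdlib


print(parse_version_marker("3.3 (33.1)"))
print(parse_version_marker("3.4"))
```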
The ``3.3-compat`` branch would be closed to normal development at the
same time as the ``3.3`` maintenance branch. The ``3.3-compat`` branch would
remain open for security fixes for the same period of time as the ``3.3``
maintenance branch.
Effect on bugfix cycle
----------------------
The effect on the bug fix workflow is essentially the same as that on the
workflow for new features - there is one additional branch to pass through
before the change reaches the ``default`` branch.
If critical bugs are found in a maintenance release, then new maintenance and
standard library releases will be created to resolve the problem. The final
part of the version number will be incremented for both the language version
and the standard library version.
If critical bugs are found in a standard library release that do not affect
the associated maintenance release, then only a new standard library release
will be created and only the standard library's version number will be
incremented.
Note that in these circumstances, the standard library release *may* include
additional features, rather than just containing the bug fix. It is
assumed that anyone that cares about receiving *only* bug fixes without any
new features mixed in will already be relying strictly on the maintenance
releases rather than using the new standard library releases.
Effect on the community
-----------------------
PEP 407 has this to say about the effects on the community:
    People who value stability can just synchronize on the LTS releases which,
    with the proposed figures, would give a similar support cycle (both in
    duration and in stability).
I believe this statement is just plain wrong. Life isn't that simple. Instead,
developers of third party modules and frameworks will come under pressure to
support the full pace of the new release cycle with binary updates, teachers
and book authors will receive complaints that they're only covering an "old"
version of Python ("You're only using 3.3, the latest is 3.5!"), etc.
As the minor version number starts climbing 3 times faster than it has in the
past, I believe perceptions of language stability would also fall (whether
such opinions were justified or not).
I believe isolating the increased pace of change to the standard library,
and clearly delineating it with a separate version number will greatly
reassure the rest of the community that no, we're not suddenly
asking them to triple their own rate of development. Instead, we're merely
going to ship standard library updates for the next language release in
6-monthly installments rather than delaying them all until the next language
definition update, even those changes that are backwards compatible with the
previously released version of Python.
The community benefits listed in PEP 407 are equally applicable to this PEP,
at least as far as the standard library is concerned:
    People who value reactivity and access to new features (without taking the
    risk to install alpha versions or Mercurial snapshots) would get much more
    value from the new release cycle than currently.

    People who want to contribute new features or improvements would be more
    motivated to do so, knowing that their contributions will be more quickly
    available to normal users.
If the faster release cycle encourages more people to focus on contributing
to the standard library rather than proposing changes to the language
definition, I don't see that as a bad thing.
Handling News Updates
=====================
What's New?
-----------
The "What's New" documents would be split out into separate documents for
standard library releases and language releases. So, during the 3.3 release
cycle, we would see:
* What's New in Python 3.3?
* What's New in the Python Standard Library 33.1?
* What's New in the Python Standard Library 33.2?
* What's New in the Python Standard Library 33.3?
And then finally, we would see the next language release:
* What's New in Python 3.4?
For the benefit of users that ignore standard library releases, the 3.4
What's New would link back to the What's New documents for each of the
standard library releases in the 3.3 series.
NEWS
----
Merge conflicts on the NEWS file are already a hassle. Since this PEP
proposes introduction of an additional branch into the normal workflow,
resolving this becomes even more critical. While Mercurial phases may
help to some degree, it would be good to eliminate the problem entirely.
One suggestion from Barry Warsaw is to adopt a non-conflicting
separate-files-per-change approach, similar to that used by Twisted [2]_.
Given that the current manually updated NEWS file will be used for the 3.3.0
release, one possible layout for such an approach might look like::
    Misc/
      NEWS  # Now autogenerated from news_entries
      news_entries/
        3.3/
          NEWS  # Original 3.3 NEWS file
          maint.1/  # Maintenance branch changes
            core/
              <news entries>
            builtins/
              <news entries>
            extensions/
              <news entries>
            library/
              <news entries>
            documentation/
              <news entries>
            tests/
              <news entries>
          compat.1/  # Compatibility branch changes
            builtins/
              <news entries>
            extensions/
              <news entries>
            library/
              <news entries>
            documentation/
              <news entries>
            tests/
              <news entries>
          # Add maint.2, compat.2 etc as releases are made
        3.4/
          core/
            <news entries>
          builtins/
            <news entries>
          extensions/
            <news entries>
          library/
            <news entries>
          documentation/
            <news entries>
          tests/
            <news entries>
          # Add maint.1, compat.1 etc as releases are made
Putting the version information in the directory hierarchy isn't strictly
necessary (since the NEWS file generator could figure it out from the version
history), but does make it easier for *humans* to keep the different versions
in order.
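A generator for such a layout could be quite small. The sketch below is only one possible shape: the one file per entry convention, the output format and the function name are all assumptions made for illustration, and it does not reflect how Twisted's tooling works:

```python
import os
import tempfile


def generate_news(entries_dir):
    """Concatenate per-change entry files into a single NEWS text.

    Each directory that holds entry files becomes a heading and each
    file contributes one bullet.
    """
    lines = []
    for root, dirs, files in os.walk(entries_dir):
        dirs.sort()  # Visit subdirectories in a stable order
        # Skip a pre-existing manually maintained NEWS file.
        entries = sorted(name for name in files if name != "NEWS")
        if not entries:
            continue
        lines.append(os.path.relpath(root, entries_dir))
        for name in entries:
            with open(os.path.join(root, name)) as f:
                lines.append("- " + f.read().strip())
        lines.append("")
    return "\n".join(lines)


# Demonstration on a throwaway tree holding a single made-up entry.
with tempfile.TemporaryDirectory() as tmp:
    section = os.path.join(tmp, "3.3", "compat.1", "library")
    os.makedirs(section)
    with open(os.path.join(section, "example-entry.txt"), "w") as f:
        f.write("Hypothetical entry text.\n")
    print(generate_news(tmp))
```

Because every change adds a new file rather than editing a shared one, merges between branches never touch the same lines.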
Other benefits of reduced version coupling
==========================================
Slowing down the language release cycle
---------------------------------------
The current release cycle is a compromise between the desire for stability
in the core language definition and C extension ABI, and the desire to get
new features (most notably standard library updates) into users' hands more
quickly.
With the standard library release cycle decoupled (to some degree) from that
of the core language definition, there is an opportunity to actually
*slow down* the rate of change in the language definition. The language
moratorium for Python 3.2 effectively slowed that cycle down to *more than 3
years* (3.1: June 2009, 3.3: August 2012) without causing any major
problems or complaints.
The NEWS file management scheme described above is actually designed to
allow us the flexibility to slow down language releases at the same time
as standard library releases become more frequent.
As a simple example, if a full two years was allowed between 3.3 and 3.4,
the 3.3 release cycle would end up looking like::
3.2.4 # Maintenance release
3.3.0 # Language release
3.3.1 # Maintenance release
3.3 (33.1) # Standard library release
3.3.2 # Maintenance release
3.3 (33.2) # Standard library release
3.3.3 # Maintenance release
3.3 (33.3) # Standard library release
3.3.4 # Maintenance release
3.3 (33.4) # Standard library release
3.4.0 # Language release
The elegance of the proposed branch structure and NEWS entry layout is that
this decision wouldn't really need to be made until shortly before the planned
3.4 release date. At that point, the decision could be made to postpone the
3.4 release and keep the ``3.3`` and ``3.3-compat`` branches open after the
3.3.3 maintenance release and the 3.3 (33.3) standard library release, thus
adding another standard library release to the cycle. The choice between
another standard library release or a full language release would then be
available every 6 months after that.
Further increasing the pace of standard library development
-----------------------------------------------------------
As noted in the previous section, one benefit of the scheme proposed in this
PEP is that it largely decouples the language release cycle from the
standard library release cycle. The standard library could be updated every
3 months, or even once a month, without having any flow-on effects on the
language version numbering or the perceived stability of the core language.
While that pace of development isn't practical as long as the binary
installer creation for Windows and Mac OS X involves several manual steps
(including manual testing) and for as long as we don't have separate
"<branch>-release" trees that only receive versions that have been marked as
good by the stable buildbots, it's still a useful criterion to keep in mind
when considering proposed new versioning schemes: what if we eventually want
to make standard library releases even *faster* than every 6 months?
If the practical issues were ever resolved, then the separate standard
library versioning scheme in this PEP could handle it. The tagged version
number approach proposed in PEP 407 could not (at least, not without a lot
of user confusion and uncertainty).
Other Questions
===============
Why not use the major version number?
-------------------------------------
The simplest and most logical solution would actually be to map the
major.minor.micro version numbers to the language version, stdlib version
and maintenance release version respectively.
Instead of releasing Python 3.3.0, we would instead release Python 4.0.0
and the release cycle would look like::
4.0.0 # Language release
4.0.1 # Maintenance release
4.1.0 # Standard library release
4.0.2 # Maintenance release
4.2.0 # Standard library release
4.0.3 # Maintenance release
4.3.0 # Standard library release
5.0.0 # Language release
However, the ongoing pain of the Python 2 -> Python 3 transition (and
associated workarounds like the ``python3`` and ``python2`` symlinks to
refer directly to the desired release series) means that this simple option
isn't viable for historical reasons.
One way that this simple approach *could* be made to work is to merge the
current major and minor version numbers directly into a 2-digit major
version number::
33.0.0 # Language release
33.0.1 # Maintenance release
33.1.0 # Standard library release
33.0.2 # Maintenance release
33.2.0 # Standard library release
33.0.3 # Maintenance release
33.3.0 # Standard library release
34.0.0 # Language release
Why not use a four part version number?
---------------------------------------
Another simple versioning scheme would just add a "standard library" version
into the existing versioning scheme::
3.3.0.0 # Language release
3.3.0.1 # Maintenance release
3.3.1.0 # Standard library release
3.3.0.2 # Maintenance release
3.3.2.0 # Standard library release
3.3.0.3 # Maintenance release
3.3.3.0 # Standard library release
3.4.0.0 # Language release
However, this scheme isn't viable due to backwards compatibility constraints
on the ``sys.version_info`` structure.
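The constraint is easy to demonstrate. Existing code routinely assumes that ``sys.version_info`` has exactly five fields, with the micro version in third position. The ``four_part`` tuple below is a hypothetical layout invented for this example:

```python
import sys

# Existing code commonly relies on sys.version_info having exactly
# five fields in this order.
major, minor, micro, releaselevel, serial = sys.version_info

# Tuple comparisons also assume the third field is the maintenance
# (micro) release number.
print(sys.version_info >= (major, minor, micro))

# A hypothetical six field layout, with the standard library version
# inserted before the micro version, would break both patterns.
four_part = (3, 3, 1, 0, "final", 0)
try:
    major, minor, micro, releaselevel, serial = four_part
except ValueError as exc:
    print("unpacking failed:", exc)
```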
Why not use a date-based versioning scheme?
-------------------------------------------
Earlier versions of this PEP proposed a date-based versioning scheme for
the standard library. However, such a scheme made it very difficult to
handle out-of-cycle releases to fix security issues and other critical
bugs in standard library releases, as it required the following steps:
1. Change the release version number to the date of the current month.
2. Update the What's New, NEWS and documentation to refer to the new release
number.
3. Make the new release.
With the sequential scheme now proposed, such releases should at most require
a little tidying up of the What's New document before making the release.
Why isn't PEP 384 enough?
-------------------------
PEP 384 introduced the notion of a "Stable ABI" for CPython, a limited
subset of the full C ABI that is guaranteed to remain stable. Extensions
built against the stable ABI should be able to support all subsequent
Python versions with the same binary.
This will help new projects to avoid coupling their C extension modules too
closely to a specific version of CPython. For existing modules, however,
migrating to the stable ABI can involve quite a lot of work (especially for
extension modules that define a lot of classes). With limited development
resources available, any time spent on such a change is time that could
otherwise have been spent working on features that offer more direct benefits
to end users.
There are also other benefits to separate versioning (as described above)
that are not directly related to the question of binary compatibility with
third party C extensions.
Why no binary compatible additions to the C ABI in standard library releases?
-----------------------------------------------------------------------------
There's a case to be made that *additions* to the CPython C ABI could
reasonably be permitted in standard library releases. This would give C
extension authors the same freedom as any other package or module author
to depend either on a particular language version or on a standard library
version.
The PEP currently associates the interpreter version with the language
version, and therefore limits major interpreter changes (including C ABI
additions) to the language releases.
An alternative, internally consistent, approach would be to link the
interpreter version with the standard library version, with only changes that
may affect backwards compatibility limited to language releases.
Under such a scheme, the following changes would be acceptable in standard
library releases:
* Standard library updates
* new features in pure Python modules
* new features in C extension modules (subject to PEP 399 compatibility
requirements)
* new features in language builtins
* Interpreter implementation updates
* binary compatible additions to the C ABI
* changes to the compilation toolchain that do not affect the AST or alter
the bytecode magic number
* changes to the core interpreter eval loop
* bug fixes from the corresponding maintenance release
And the following changes would be acceptable in language releases:
* new language syntax
* any updates acceptable in a standard library release
* new deprecation warnings
* removal of previously deprecated features
* changes to the AST
* changes to the emitted bytecode that require altering the magic number
* binary incompatible changes to the C ABI (although the PEP 384 stable ABI
must still be preserved)
While such an approach could probably be made to work, there does not appear
to be a compelling justification for it, and the approach currently described
in the PEP is simpler and easier to explain.
Why not separate out the standard library entirely?
---------------------------------------------------
A concept that is occasionally discussed is the idea of making the standard
library truly independent from the CPython reference implementation.
My personal opinion is that actually making such a change would involve a
lot of work for next to no pay-off. CPython without the standard library is
useless (the build chain won't even run, let alone the test suite). You also
can't create a standalone pure Python standard library, because too
many "standard library modules" are actually tightly linked in to the
internal details of their respective interpreters (for example, the builtins,
``weakref``, ``gc``, ``sys``, ``inspect``, ``ast``).
Creating a separate CPython development branch that is kept compatible with
the previous language release, and making releases from that branch that are
identified with a separate standard library version number should provide
most of the benefits of a separate standard library repository with only a
fraction of the pain.
Acknowledgements
================
Thanks go to the PEP 407 authors for starting this discussion, as well as
to those authors and Larry Hastings for initial discussions of the proposal
made in this PEP.
References
==========
.. [1] PEP 407: New release cycle and introducing long-term support versions
http://www.python.org/dev/peps/pep-0407/
.. [2] Twisted's "topfiles" approach to NEWS generation
http://twistedmatrix.com/trac/wiki/ReviewProcess#Newsfiles
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

pep-0414.txt Normal file

@ -0,0 +1,415 @@
PEP: 414
Title: Explicit Unicode Literal for Python 3.3
Version: $Revision$
Last-Modified: $Date$
Author: Armin Ronacher <armin.ronacher@active-4.com>,
Nick Coghlan <ncoghlan@gmail.com>
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 15-Feb-2012
Post-History: 28-Feb-2012, 04-Mar-2012
Resolution: http://mail.python.org/pipermail/python-dev/2012-February/116995.html
Abstract
========
This document proposes the reintegration of an explicit unicode literal
from Python 2.x to the Python 3.x language specification, in order to
reduce the volume of changes needed when porting Unicode-aware
Python 2 applications to Python 3.
BDFL Pronouncement
==================
This PEP has been formally accepted for Python 3.3:
I'm accepting the PEP. It's about as harmless as they come. Make it so.
Proposal
========
This PEP proposes that Python 3.3 restore support for Python 2's Unicode
literal syntax, substantially increasing the number of lines of existing
Python 2 code in Unicode aware applications that will run without modification
on Python 3.
Specifically, the Python 3 definition for string literal prefixes will be
expanded to allow::
"u" | "U" | "ur" | "UR" | "Ur" | "uR"
in addition to the currently supported::
"r" | "R"
The following will all denote ordinary Python 3 strings::
'text'
"text"
'''text'''
"""text"""
u'text'
u"text"
u'''text'''
u"""text"""
U'text'
U"text"
U'''text'''
U"""text"""
Combination of the unicode prefix with the raw string prefix will also be
supported, just as it was in Python 2.
No changes are proposed to Python 3's actual Unicode handling, only to the
acceptable forms for string literals.
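A short check makes the "no semantic change" point concrete. This sketch assumes an interpreter that accepts the prefix (Python 3.3 or later) and deliberately covers only the plain ``u`` and ``U`` forms:

```python
# Requires Python 3.3 or later, where the "u" prefix is accepted again.
plain = 'text'
prefixed = u'text'
upper = U"text"

# The prefix changes nothing: all three are ordinary str objects
# with identical contents.
assert plain == prefixed == upper
assert type(prefixed) is str
print(type(prefixed).__name__)  # str
```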
Author's Note
=============
This PEP was originally written by Armin Ronacher, and Guido's approval was
given based on that version.
The currently published version has been rewritten by Nick Coghlan to
include additional historical details and rationale that were taken into
account when Guido made his decision, but were not explicitly documented in
Armin's version of the PEP.
Readers should be aware that many of the arguments in this PEP are *not*
technical ones. Instead, they relate heavily to the *social* and *personal*
aspects of software development.
Rationale
=========
With the release of a Python 3 compatible version of the Web Services Gateway
Interface (WSGI) specification (PEP 3333) for Python 3.2, many parts of the
Python web ecosystem have been making a concerted effort to support Python 3
without adversely affecting their existing developer and user communities.
One major item of feedback from key developers in those communities, including
Chris McDonough (WebOb, Pyramid), Armin Ronacher (Flask, Werkzeug), Jacob
Kaplan-Moss (Django) and Kenneth Reitz (``requests``) is that the requirement
to change the spelling of *every* Unicode literal in an application
(regardless of how that is accomplished) is a key stumbling block for porting
efforts.
In particular, unlike many of the other Python 3 changes, it isn't one that
framework and library authors can easily handle on behalf of their users. Most
of those users couldn't care less about the "purity" of the Python language
specification; they just want their websites and applications to work as well
as possible.
While it is the Python web community that has been most vocal in highlighting
this concern, it is expected that other highly Unicode aware domains (such as
GUI development) may run into similar issues as they (and their communities)
start making concerted efforts to support Python 3.
Common Objections
=================
Complaint: This PEP may harm adoption of Python 3.2
---------------------------------------------------
This complaint is interesting, as it carries within it a tacit admission that
this PEP *will* make it easier to port Unicode aware Python 2 applications to
Python 3.
There are many existing Python communities that are prepared to put up with
the constraints imposed by the existing suite of porting tools, or to update
their Python 2 code bases sufficiently that the problems are minimised.
This PEP is not for those communities. Instead, it is designed specifically to
help people that *don't* want to put up with those difficulties.
However, since the proposal is for a comparatively small tweak to the language
syntax with no semantic changes, it is feasible to support it as a third
party import hook. While such an import hook imposes some import time
overhead, and requires additional steps from each application that needs it
to get the hook in place, it allows applications that target Python 3.2
to use libraries and frameworks that would otherwise only run on Python 3.3+
due to their use of unicode literal prefixes.
One such import hook project is Vinay Sajip's ``uprefix`` [4]_.
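As a rough illustration of the source translation such a hook has to perform,
the sketch below strips the ``u``/``U`` prefix from string literals with the
standard ``tokenize`` module. It is *not* the ``uprefix`` implementation; the
function name ``strip_unicode_prefix`` is hypothetical, and a real hook would
additionally need to plug this translation into the import machinery.

```python
import io
import tokenize


def strip_unicode_prefix(source):
    """Return `source` with the u/U prefix removed from string literals.

    Hypothetical helper for illustration only. Whitespace in the result
    may differ from the input, but the token stream is otherwise unchanged.
    """
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        text = tok.string
        # A STRING token that starts with u/U carries the legacy prefix.
        if tok.type == tokenize.STRING and text[:1] in ("u", "U"):
            text = text[1:]
        result.append((tok.type, text))
    # With (type, string) pairs, untokenize() regenerates equivalent source.
    return tokenize.untokenize(result)


translated = strip_unicode_prefix('x = u"abc"\ny = U\'def\'\nz = b"raw"\n')
namespace = {}
exec(translated, namespace)
```

Working on tokens rather than raw text means that a ``u`` appearing inside a
string, or a variable that happens to be named ``u``, is left untouched.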
For those that prefer to translate their code in advance rather than
converting on the fly at import time, Armin Ronacher is working on a hook
that runs at install time rather than during import [5]_.
Combining the two approaches is of course also possible. For example, the
import hook could be used for rapid edit-test cycles during local
development, but the install hook for continuous integration tasks and
deployment on Python 3.2.
The approaches described in this section may prove useful, for example, for
applications that wish to target Python 3 on the Ubuntu 12.04 LTS release,
which will ship with Python 2.7 and 3.2 as officially supported Python
versions.
Complaint: Python 3 shouldn't be made worse just to support porting from Python 2
---------------------------------------------------------------------------------
This is indeed one of the key design principles of Python 3. However, one of
the key design principles of Python as a whole is that "practicality beats
purity". If we're going to impose a significant burden on third party
developers, we should have a solid rationale for doing so.
In most cases, the rationale for backwards incompatible Python 3 changes is
either to improve code correctness (for example, stricter default separation
of binary and text data and integer division upgrading to floats when
necessary), reduce typical memory usage (for example, increased usage of
iterators and views over concrete lists), or to remove distracting nuisances
that make Python code harder to read without increasing its expressiveness
(for example, the comma based syntax for naming caught exceptions). Changes
backed by such reasoning are *not* going to be reverted, regardless of
objections from Python 2 developers attempting to make the transition to
Python 3.
In many cases, Python 2 offered two ways of doing things for historical reasons.
For example, inequality could be tested with both ``!=`` and ``<>`` and integer
literals could be specified with an optional ``L`` suffix. Such redundancies
have been eliminated in Python 3, which reduces the overall size of the
language and improves consistency across developers.
In the original Python 3 design (up to and including Python 3.2), the explicit
prefix syntax for unicode literals was deemed to fall into this category, as it
is completely unnecessary in Python 3. However, the difference between those
other cases and unicode literals is that the unicode literal prefix is *not*
redundant in Python 2 code: it is a programmatically significant distinction
that needs to be preserved in some fashion to avoid losing information.
While porting tools were created to help with the transition (see next section),
its removal still creates an additional burden on heavy users of unicode strings in
Python 2, solely so that future developers learning Python 3 don't need to be
told "For historical reasons, string literals may have an optional ``u`` or
``U`` prefix. Never use this yourselves, it's just there to help with porting
from an earlier version of the language."
Plenty of students learning Python 2 received similar warnings regarding string
exceptions without being confused or irreparably stunted in their growth as
Python developers. It will be the same with this feature.
This point is further reinforced by the fact that Python 3 *still* allows the
uppercase variants of the ``B`` and ``R`` prefixes for bytes literals and raw
bytes and string literals. If the potential for confusion due to string prefix
variants is that significant, where was the outcry asking that these
redundant prefixes be removed along with all the other redundancies that were
eliminated in Python 3?
Just as support for string exceptions was eliminated from Python 2 using the
normal deprecation process, support for redundant string prefix characters
(specifically, ``B``, ``R``, ``u``, ``U``) may eventually be eliminated
from Python 3, regardless of the current acceptance of this PEP. However,
such a change will likely only occur once third party libraries supporting
Python 2.7 are about as common as libraries supporting Python 2.2 or 2.3 are
today.
Complaint: The WSGI "native strings" concept is an ugly hack
------------------------------------------------------------
One reason the removal of unicode literals has provoked such concern amongst
the web development community is that the updated WSGI specification had to
make a few compromises to minimise the disruption for existing web servers
that provide a WSGI-compatible interface (this was deemed necessary in order
to make the updated standard a viable target for web application authors and
web framework developers).
One of those compromises is the concept of a "native string". WSGI defines
three different kinds of string:
* text strings: handled as ``unicode`` in Python 2 and ``str`` in Python 3
* native strings: handled as ``str`` in both Python 2 and Python 3
* binary data: handled as ``str`` in Python 2 and ``bytes`` in Python 3
Some developers consider WSGI's "native strings" to be an ugly hack, as they
are *explicitly* documented as being used solely for ``latin-1`` decoded
"text", regardless of the actual encoding of the underlying data. Using this
approach bypasses many of the updates to Python 3's data model that are
designed to encourage correct handling of text encodings. However, it
generally works due to the specific details of the problem domain - web server
and web framework developers are some of the individuals *most* aware of how
blurry the line can get between binary data and text when working with HTTP
and related protocols, and how important it is to understand the implications
of the encodings in use when manipulating encoded text data. At the
*application* level most of these details are hidden from the developer by
the web frameworks and support libraries (both in Python 2 *and* in Python 3).
In practice, native strings are a useful concept because there are some APIs
(both in the standard library and in third party frameworks and packages) and
some internal interpreter details that are designed primarily to work with
``str``. These components often don't support ``unicode`` in Python 2
or ``bytes`` in Python 3, or, if they do, require additional encoding details
and/or impose constraints that don't apply to the ``str`` variants.
Some examples of interfaces that are best handled by using actual ``str``
instances are:
* Python identifiers (as attributes, dict keys, class names, module names,
import references, etc)
* URLs for the most part as well as HTTP headers in urllib/http servers
* WSGI environment keys and CGI-inherited values
* Python source code for dynamic compilation and AST hacks
* Exception messages
* ``__repr__`` return value
* preferred filesystem paths
* preferred OS environment
In Python 2.6 and 2.7, these distinctions are most naturally expressed as
follows:
* ``u""``: text string (``unicode``)
* ``""``: native string (``str``)
* ``b""``: binary data (``str``, also aliased as ``bytes``)
In Python 3, the ``latin-1`` decoded native strings are not distinguished
from any other text strings:
* ``""``: text string (``str``)
* ``""``: native string (``str``)
* ``b""``: binary data (``bytes``)
If ``from __future__ import unicode_literals`` is used to modify the behaviour
of Python 2, then, along with an appropriate definition of ``n()``, the
distinction can be expressed as:
* ``""``: text string
* ``n("")``: native string
* ``b""``: binary data
(While ``n=str`` works for simple cases, it can sometimes have problems
due to non-ASCII source encodings.)
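One possible definition of ``n()`` is sketched below. The choice of
``latin-1`` is an assumption made here to match the WSGI convention described
above; a project could reasonably pick a different encoding or simply use
``n = str``.

```python
import sys


def n(text):
    """Return `text` as the native string type (``str`` on 2.x and 3.x).

    Illustrative sketch: on Python 2 (with ``unicode_literals`` in effect)
    the literal arrives as ``unicode`` and is encoded; on Python 3 it is
    already ``str`` and is returned unchanged.
    """
    if sys.version_info[0] >= 3:
        return text
    return text.encode("latin-1")  # assumption: WSGI-style latin-1


header_name = n("Content-Type")
```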
In the common subset of Python 2 and Python 3 (with appropriate
specification of a source encoding and definitions of the ``u()`` and ``b()``
helper functions), they can be expressed as:
* ``u("")``: text string
* ``""``: native string
* ``b("")``: binary data
That last approach is the only variant that supports Python 2.5 and earlier.
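A minimal sketch of such helper functions follows. It is loosely modelled on
what compatibility libraries such as ``six`` provide, but the exact
definitions here are assumptions for illustration, not a copy of any
library's code.

```python
import sys

PY3 = sys.version_info[0] >= 3

if PY3:
    def u(text):
        """Text literal: a native string is already text on Python 3."""
        return text

    def b(data):
        """Binary literal: encode the native string to ``bytes``."""
        return data.encode("latin-1")
else:
    def u(text):
        """Text literal: decode the native ``str`` to ``unicode``."""
        return text.decode("unicode_escape")

    def b(data):
        """Binary literal: ``str`` is already a byte string on Python 2."""
        return data


greeting = u("hello")
payload = b("hello")
```

The cost of this style is visible even in a small example: every text and
binary literal has to be wrapped in a function call.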
Of all the alternatives, the format currently supported in Python 2.6 and 2.7
is by far the cleanest approach that clearly distinguishes the three desired
kinds of behaviour. With this PEP, that format will also be supported in
Python 3.3+. It will also be supported in Python 3.1 and 3.2 through the use
of import and install hooks. While it is significantly less likely, it is
also conceivable that the hooks could be adapted to allow the use of the
``b`` prefix on Python 2.5.
Complaint: The existing tools should be good enough for everyone
----------------------------------------------------------------
A commonly expressed sentiment from developers that have already successfully
ported applications to Python 3 is along the lines of "if you think it's hard,
you're doing it wrong" or "it's not that hard, just try it!". While it is no
doubt unintentional, these responses all have the effect of telling the
people that are pointing out inadequacies in the current porting toolset
"there's nothing wrong with the porting tools, you just suck and don't know
how to use them properly".
These responses are a case of completely missing the point of what people are
complaining about. The feedback that resulted in this PEP isn't due to people
complaining that ports aren't possible. Instead, the feedback is coming from
people that have successfully *completed* ports and are objecting that they
found the experience thoroughly *unpleasant* for the class of application that
they needed to port (specifically, Unicode aware web frameworks and support
libraries).
This is a subjective appraisal, and it's the reason why the Python 3
porting tools ecosystem is a case where the "one obvious way to do it"
philosophy emphatically does *not* apply. While it was originally intended that
"develop in Python 2, convert with ``2to3``, test both" would be the standard
way to develop for both versions in parallel, in practice, the needs of
different projects and developer communities have proven to be sufficiently
diverse that a variety of approaches have been devised, allowing each group
to select an approach that best fits their needs.
Lennart Regebro has produced an excellent overview of the available migration
strategies [2]_, and a similar review is provided in the official porting
guide [3]_. (Note that the official guidance has softened to "it depends on
your specific situation" since Lennart wrote his overview).
However, both of those guides are written from the founding assumption that
all of the developers involved are *already* committed to the idea of
supporting Python 3. They make no allowance for the *social* aspects of such a
change when you're interacting with a user base that may not be especially
tolerant of disruptions without a clear benefit, or are trying to persuade
Python 2 focused upstream developers to accept patches that are solely about
improving Python 3 forward compatibility.
With the current porting toolset, *every* migration strategy will result in
changes to *every* Unicode literal in a project. No exceptions. They will
be converted to either an unprefixed string literal (if the project decides to
adopt the ``unicode_literals`` import) or else to a converter call like
``u("text")``.
If the ``unicode_literals`` import approach is employed, but is not adopted
across the entire project at the same time, then the meaning of a bare string
literal may become annoyingly ambiguous. This problem can be particularly
pernicious for *aggregated* software, like a Django site - in such a situation,
some files may end up using the ``unicode_literals`` import and others may not,
creating definite potential for confusion.
While these problems are clearly solvable at a technical level, they're a
completely unnecessary distraction at the social level. Developer energy should
be reserved for addressing *real* technical difficulties associated with the
Python 3 transition (like distinguishing their 8-bit text strings from their
binary data). They shouldn't be punished with additional code changes (even
automated ones) solely due to the fact that they have *already* explicitly
identified their Unicode strings in Python 2.
Armin Ronacher has created an experimental extension to 2to3 which only
modernizes Python code to the extent that it runs on Python 2.7 or later with
support from the cross-version compatibility ``six`` library. This tool is
available as ``python-modernize`` [1]_. Currently, the deltas generated by
this tool will affect every Unicode literal in the converted source. This
will create legitimate concerns amongst upstream developers asked to accept
such changes, and amongst framework *users* being asked to change their
applications.
However, by eliminating the noise from changes to the Unicode literal syntax,
many projects could be cleanly and (comparatively) non-controversially made
forward compatible with Python 3.3+ just by running ``python-modernize`` and
applying the recommended changes.
References
==========
.. [1] Python-Modernize
(http://github.com/mitsuhiko/python-modernize)
.. [2] Porting to Python 3: Migration Strategies
(http://python3porting.com/strategies.html)
.. [3] Porting Python 2 Code to Python 3
(http://docs.python.org/howto/pyporting.html)
.. [4] uprefix import hook project
(https://bitbucket.org/vinay.sajip/uprefix)
.. [5] install hook to remove unicode string prefix characters
(https://github.com/mitsuhiko/unicode-literals-pep/tree/master/install-hook)
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:

88
pep-0415.txt Normal file

@ -0,0 +1,88 @@
PEP: 415
Title: Implementing PEP 409 differently
Version: $Revision$
Last-Modified: $Date$
Author: Benjamin Peterson <benjamin@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 26-Feb-2012
Post-History: 26-Feb-2012
Abstract
========
PEP 409 allows PEP 3134 exception contexts and causes to be suppressed when the
exception is printed. This is done using the ``raise exc from None``
syntax. This PEP proposes to implement context and cause suppression
differently.
Rationale
=========
PEP 409 changes ``__cause__`` to be ``Ellipsis`` by default. Then if
``__cause__`` is set to ``None`` by ``raise exc from None``, no context or cause
will be printed should the exception be uncaught.
The main problem with this scheme is it complicates the role of
``__cause__``. ``__cause__`` should indicate the cause of the exception, not
whether ``__context__`` should be printed or not. This use of ``__cause__`` is
also not easily extended in the future. For example, we may someday want to
allow the programmer to select which of ``__context__`` and ``__cause__`` will
be printed. The PEP 409 implementation is not amenable to this.
The use of ``Ellipsis`` is a hack. Before PEP 409, ``Ellipsis`` was used
exclusively in extended slicing. Extended slicing has nothing to do with
exceptions, so it's not clear to someone inspecting an exception object why
``__cause__`` should be set to ``Ellipsis``. Using ``Ellipsis`` by default for
``__cause__`` makes it asymmetrical with ``__context__``.
Proposal
========
A new attribute on ``BaseException``, ``__suppress_context__``, will
be introduced. Whenever ``__cause__`` is set, ``__suppress_context__``
will be set to ``True``. In particular, ``raise exc from cause``
syntax will set ``exc.__suppress_context__`` to ``True``. Exception
printing code will check for that attribute to determine whether
context and cause will be printed. ``__cause__`` will return to its
original purpose and values.
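The check performed by exception printing code could look roughly like the
sketch below. The helper name ``exception_to_chain`` is hypothetical; the
logic is one reading of the rule above (an explicit cause wins, and the
context is shown only when it has not been suppressed). The demonstration
assumes an interpreter that implements this proposal, as CPython 3.3 and
later do.

```python
def exception_to_chain(exc):
    """Return (label, exception) to print before `exc`, or None.

    Hypothetical helper, not part of the standard library.
    """
    if exc.__cause__ is not None:
        return ("cause", exc.__cause__)
    if exc.__context__ is not None and not exc.__suppress_context__:
        return ("context", exc.__context__)
    return None


def fail(suppress):
    """Raise ValueError while handling KeyError, optionally suppressing it."""
    try:
        {}["missing"]
    except KeyError:
        if suppress:
            raise ValueError("bad input") from None
        raise ValueError("bad input")


chains = {}
for suppress in (False, True):
    try:
        fail(suppress)
    except ValueError as exc:
        chains[suppress] = exception_to_chain(exc)
```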
There is precedent for ``__suppress_context__`` with the
``print_line_and_file`` exception attribute.
To summarize, ``raise exc from cause`` will be equivalent to::
    exc.__cause__ = cause
    raise exc
where ``exc.__cause__ = cause`` implicitly sets
``exc.__suppress_context__``.
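The effect on the exception attributes can be observed directly. This is a
small demonstration, assuming an interpreter that implements this proposal
(CPython 3.3 and later do); the function ``convert`` and the variable names
are made up for the example.

```python
def convert(text):
    """Convert `text` to int, hiding the original ValueError from tracebacks."""
    try:
        return int(text)
    except ValueError:
        raise RuntimeError("not a number: %r" % (text,)) from None


try:
    convert("spam")
except RuntimeError as exc:
    caught = exc

# The context is still recorded; it is only hidden when printing.
hidden_context = caught.__context__

# Assigning __cause__ by hand sets the flag implicitly as well.
manual = RuntimeError("manual")
flag_before = manual.__suppress_context__
manual.__cause__ = KeyError("k")
flag_after = manual.__suppress_context__
```

Unlike the PEP 409 scheme, ``__cause__`` is simply ``None`` here, and the
original ``ValueError`` remains available as ``__context__`` for debugging.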
Patches
=======
There is a patch on `Issue 14133`_.
References
==========
.. _issue 14133:
http://bugs.python.org/issue14133
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

134
pep-0416.txt Normal file

@ -0,0 +1,134 @@
PEP: 416
Title: Add a frozendict builtin type
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 29-Feb-2012
Python-Version: 3.3
Abstract
========
Add a new frozendict builtin type.
Rationale
=========
A frozendict is a read-only mapping: a key cannot be added nor removed, and a
key is always mapped to the same value. However, frozendict values can be
mutable (not hashable). A frozendict is hashable and so immutable if and only
if all values are hashable (immutable).
Use cases:
* frozendict lookup can be done at compile time instead of runtime because the
mapping is read-only. frozendict can be used instead of a preprocessor to
remove conditional code at compilation, like code specific to a debug build.
* hashable frozendict can be used as a key of a mapping or as a member of set.
frozendict can be used to implement a cache.
* frozendict avoids the need for a lock when the frozendict is shared
by multiple threads or processes, especially hashable frozendict. It would
also help to prevent coroutines (generators + greenlets) from modifying
global state.
* frozendict helps to implement read-only object proxies for security modules.
For example, it would be possible to use frozendict type for __builtins__
mapping or type.__dict__. This is possible because frozendict is compatible
with the PyDict C API.
* frozendict avoids the need of a read-only proxy in some cases. frozendict is
faster than a proxy because getting an item in a frozendict is a fast lookup
whereas a proxy requires a function call.
* a frozendict can be used as the default value of a function argument, which
avoids the problem of mutable default arguments.
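The default argument use case can be illustrated with a read-only mapping
that already exists. Since frozendict is only proposed here, the sketch below
uses ``types.MappingProxyType`` as a stand-in; the function ``render`` and
the name ``_NO_OPTIONS`` are invented for the example.

```python
from types import MappingProxyType

# Read-only empty mapping, standing in for a frozendict default value.
_NO_OPTIONS = MappingProxyType({})


def render(text, options=_NO_OPTIONS):
    """Return `text`, upper-cased if options["upper"] is true."""
    merged = dict(options)            # private, mutable copy
    merged.setdefault("upper", False)
    return text.upper() if merged["upper"] else text


# The shared default cannot be modified by accident.
try:
    _NO_OPTIONS["upper"] = True
except TypeError:
    default_is_read_only = True
else:
    default_is_read_only = False
```

With a plain ``{}`` default, any code path that mutated ``options`` in place
would silently change the behaviour of every later call.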
Constraints
===========
* frozendict has to implement the Mapping abstract base class
* frozendict keys and values can be unorderable
* a frozendict is hashable if all keys and values are hashable
* frozendict hash does not depend on the items creation order
Implementation
==============
* Add a PyFrozenDictObject structure based on PyDictObject with an extra
"Py_hash_t hash;" field
* frozendict.__hash__() is implemented using hash(frozenset(self.items())) and
caches the result in its private hash attribute
* Register frozendict as a collections.abc.Mapping
* frozendict can be used with PyDict_GetItem(), but PyDict_SetItem() and
PyDict_DelItem() raise a TypeError
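The Python-level behaviour described by these points can be approximated in
pure Python. The class below is only a sketch under that assumption: it is
named ``FrozenDict`` to avoid suggesting it is the proposed builtin, it is
not based on ``PyDictObject``, it cannot be used with the PyDict C API, and
it does not prevent deliberate tampering with its private attributes.

```python
from collections.abc import Mapping


class FrozenDict(Mapping):
    """Read-only mapping with a cached, order-independent hash (sketch)."""

    __slots__ = ("_data", "_hash")

    def __init__(self, *args, **kwargs):
        self._data = dict(*args, **kwargs)
        self._hash = None

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __hash__(self):
        # Raises TypeError if any value is unhashable; cached otherwise.
        if self._hash is None:
            self._hash = hash(frozenset(self._data.items()))
        return self._hash

    def __repr__(self):
        return "FrozenDict(%r)" % (self._data,)


first = FrozenDict(x=1, y=2)
second = FrozenDict([("y", 2), ("x", 1)])   # same items, other order
cache = {first: "hit"}
```

Inheriting from ``Mapping`` supplies ``keys()``, ``items()``, ``get()``,
``__contains__`` and ``__eq__``; because no ``__setitem__`` is defined, item
assignment raises ``TypeError``.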
Recipe: hashable dict
======================
To ensure that a frozendict is hashable, values can be checked
before creating the frozendict::

    def hashabledict(*args, **kw):
        # ensure that all values are hashable
        for value in dict(*args, **kw).values():
            if isinstance(value, (int, str, bytes, float, frozenset, complex)):
                # avoid computing the hash (which may be slow) for builtin
                # types known to be hashable for any value
                continue
            hash(value)  # raises TypeError if the value is not hashable
        # don't check the keys: frozendict already checks them
        return frozendict(*args, **kw)
Objections
==========
*namedtuple may fit the requirements of a frozendict.*
A namedtuple is not a mapping, it does not implement the Mapping abstract base
class.
*"frozendict can be implemented in Python using descriptors" and "frozendict
just needs to be practically constant."*
If frozendict is used to harden Python (security purpose), it must be
implemented in C. A type implemented in C is also faster.
*PEP 351 was rejected.*
PEP 351 tries to freeze an object and so may convert a mutable object to an
immutable object (using a different type). frozendict doesn't convert anything:
hash(frozendict) raises a TypeError if a value is not hashable. Freezing an
object is not the purpose of this PEP.
Links
=====
* PEP 412: Key-Sharing Dictionary
(`issue #13903 <http://bugs.python.org/issue13903>`_)
* PEP 351: The freeze protocol
* `The case for immutable dictionaries; and the central misunderstanding of PEP 351 <http://www.cs.toronto.edu/~tijmen/programming/immutableDictionaries.html>`_
* `Frozen dictionaries (Python recipe 414283) <http://code.activestate.com/recipes/414283-frozen-dictionaries/>`_
by Oren Tirosh
* Python security modules implementing read-only object proxies using a C
extension:
* `pysandbox <https://github.com/haypo/pysandbox/>`_
* `mxProxy <http://www.egenix.com/products/python/mxBase/mxProxy/>`_
* `zope.proxy <http://pypi.python.org/pypi/zope.proxy>`_
* `zope.security <http://pypi.python.org/pypi/zope.security>`_
Copyright
=========
This document has been placed in the public domain.


@ -3,7 +3,7 @@ Title: Python 3000
Version: $Revision$ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Guido van Rossum <guido@python.org> Author: Guido van Rossum <guido@python.org>
Status: Active Status: Final
Type: Process Type: Process
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 05-Apr-2006 Created: 05-Apr-2006


@ -3,7 +3,7 @@ Title: Procedure for Backwards-Incompatible Changes
Version: $Revision$ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Steven Bethard <steven.bethard@gmail.com> Author: Steven Bethard <steven.bethard@gmail.com>
Status: Draft Status: Final
Type: Process Type: Process
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 27-Mar-2006 Created: 27-Mar-2006


@ -3,7 +3,7 @@ Title: Python Language Moratorium
Version: $Revision$ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Brett Cannon, Jesse Noller, Guido van Rossum Author: Brett Cannon, Jesse Noller, Guido van Rossum
Status: Active Status: Final
Type: Process Type: Process
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 21-Oct-2009 Created: 21-Oct-2009


@ -3,7 +3,7 @@ Title: Things that will Not Change in Python 3000
Version: $Revision$ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Georg Brandl <georg@python.org> Author: Georg Brandl <georg@python.org>
Status: Active Status: Final
Type: Process Type: Process
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 04-Apr-2006 Created: 04-Apr-2006


@ -3,8 +3,8 @@ Title: Miscellaneous Python 3.0 Plans
Version: $Revision$ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Brett Cannon <brett@python.org> Author: Brett Cannon <brett@python.org>
Status: Active Status: Final
Type: Informational Type: Process
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 20-Aug-2004 Created: 20-Aug-2004
Post-History: Post-History:


@ -1,382 +1,160 @@
PEP: 3144 PEP: 3144
Title: IP Address Manipulation Library for the Python Standard Library Title: IP Address Manipulation Library for the Python Standard Library
Version: $Revision$ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Peter Moody <peter@hda3.com> Author: Peter Moody <pmoody@google.com>
Discussions-To: ipaddr-py-dev@googlegroups.com Discussions-To: <ipaddr-py-dev@googlegroups.com>
Status: Draft Status: Draft
Type: Standards Track Type: Standards Track
Content-Type: text/plain Content-Type: text/plain
Created: 13-Aug-2009 Created: 6-Feb-2012
Python-Version: 3.2 Python-Version: 3.3
Abstract: Abstract:
This PEP proposes a design for a lightweight ip address manipulation module This PEP proposes a design for an IP address manipulation module for
for python. python.
Motivation: Motivation:
Many network administrators use python in their day to day jobs. Finding a Several very good IP address modules for python already exist.
library to assist with the common ip address manipulation tasks is easy. The truth is that all of them struggle with the balance between
Finding a good library for performing those tasks can be somewhat more adherence to Pythonic principles and the shorthand upon which
difficult. For this reason, I (like many before me) scratched an itch and network engineers and administrators rely. I believe ipaddr
wrote my own with an emphasis on being easy to understand and fast for the strikes the right balance.
most common operations.
For context, a previous version of this library was up for inclusion in
python 3.1, see issue 3959 [1] for more information.
Rationale: Rationale:
ipaddr was designed with a few basic principles in mind: The existence of several Python IP address manipulation modules is
evidence of an outstanding need for the functionality this module
seeks to provide.
- IPv4 and IPv6 objects are distinct.
- IP addresses and IP networks are distinct.
- the library should be useful and the assumptions obvious to the network
programmer.
- IP networks should be treated as lists (as opposed to some other
python intrinsic) in so far as it makes sense.
- the library should be lightweight and fast without sacrificing
expected functionality.
- Distinct IPV4 and IPV6 objects. Background:
While there are many similarities, IPV4 and IPV6 objects are fundamentally PEP 3144 and ipaddr have been up for inclusion before. The
different. The similarities allow for easy abstraction of certain version of the library specified here is backwards incompatible
operations which affect the bits from both in the same manner, but their with the version on PyPI and the one which was discussed before.
differences mean attempts to combine them into one object yield unexpected In order to avoid confusing users of the current ipaddr, I've
results. According to Vint Cerf, "I have seen a substantial amount of renamed this version of the library "ipaddress".
traffic about IPv4 and IPv6 comparisons and the general consensus is that
these are not comparable." (Vint Cerf [2]). For python versions >= 3.0,
this means that (<, >, <=, >=) comparison operations between IPv4 and IPv6
objects raise a TypeError per the Ordering Comparisons [3].
- Distinct network and address objects. The main differences between ipaddr and ipaddress are:
An IPV4 address is a single 32 bit number while the IPV4 address assigned * ipaddress *Network classes are equivalent to the ipaddr *Network
to a networked computer is a 32 bit address and associated network. class counterparts with the strict flag set to True.
Similarly, an IPV6 address is a 128 bit number while an IPV6 address
assigned to a networked computer is a 128 bit number and associated network
information. The similarities leads to easy abstraction of some methods
and properties, but there are obviously a number of address/network
specific properties which require they be distinct. For instance, IP
networks contain a network address (the base address of the network),
broadcast address (the upper end of the network, also the address to
which every machine on a given network is supposed listen, hence the name
broadcast), supernetworks and subnetworks, etc. The individual property
addresses in an IP network obviously don't have the same properties,
they're simply 32 or 128 bit numbers.
- Principal of least confusion for network programmers. * ipaddress *Interface classes are equivalent to the ipaddr
*Network class counterparts with the strict flag set to False.
It should be understood that, above all, this module is designed with the * The factory functions in ipaddress were renamed to disambiguate
network administrator in mind. In practice, this means that a number of them from classes.
assumptions are made with regards to common usage and the library prefers
the usefulness of accepted practice over strict adherence to RFCs. For
example, ipaddr accepts '192.168.1.1/24' as a network definition because
this is a very common way of describing an address + netmask despite the
fact that 192.168.1.1 is actually an IP address on the network
192.168.1.0/24. Strict adherence would require that networks have all of
the host bits masked to zero, which would require two objects to describe
that IP + network. In practice, a looser interpretation of a network is
a very useful if common abstraction, so ipaddr prefers to make this
available. For the developer who is concerned with strict adherence,
ipaddr provides an optional 'strict' boolean argument to the
IPv(4|6)Network constructors which guarantees that all host bits are masked
down.
- Treat network elements as lists (in so far as it's possible). * A few attributes were renamed to disambiguate their purpose as
well. (eg. network, network_address)
Treating IP networks as lists is a natural extension from viewing the
network as a series of individual ip addresses. Most of the standard list
methods should be implemented and should behave in a manner that would be
consistent if the IP network object were actually a list of strings or
integers. The methods which actually modify a lists contents don't extend
as well to this model (__add__, __iadd__, __sub__, __isub__, etc) but
others (__contains__, __iter__, etc) work quite nicely. It should be noted
that __len__ doesn't work as expected since python internals has this
limited to a 32 bit integer and it would need to be at least 128 bits to
work with IPV6.
- Lightweight.
While some network programmers will undoubtedly want more than this library
provides, keeping the functionality to strictly what's required from a IP
address manipulation module is critical to keeping the code fast, easily
comprehensible and extensible. It is a goal to provide enough options in
terms of functionality to allow the developer to easily do their work
without needlessly cluttering the library. Finally, It's important to note
that this design doesn't prevent subclassing or otherwise extending to meet
the unforeseen needs.
Specification: Specification:
A slightly more detailed look at the library follows. The ipaddr module defines a total of 6 new public classes, 3 for
manipulating IPv4 objects and 3 for manipulating IPv6 objects.
The classes are as follows:
- Design IPv4Address/IPv6Address - These define individual addresses, for
example the IPv4 address returned by an A record query for
www.google.com (74.125.224.84) or the IPv6 address returned by a
AAAA record query for ipv6.google.com (2001:4860:4001:801::1011).
ipaddr has four main classes most people will use: IPv4Network/IPv6Network - These define networks or groups of
addresses, for example the IPv4 network reserved for multicast use
(224.0.0.0/4) or the IPv6 network reserved for multicast
(ff00::/8, wow, that's big).
1. IPv4Address. (eg, '192.168.1.1') IPv4Interface/IPv6Interface - These hybrid classes refer to an
2. IPv4Network (eg, '192.168.0.0/16') individual address on a given network. For example, the IPV4
3. IPv6Address (eg, '::1') address 192.0.2.1 on the network 192.0.2.0/24 could be referred to
4. IPv6Network (eg, '2001::/32') as 192.0.2.1/24. Likewise, the IPv6 address 2001:DB8::1 on the
network 2001:DB8::/96 could be referred to as 2001:DB8::1/96.
It's very common to refer to addresses assigned to computer
network interfaces like this, hence the Interface name.
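The three kinds of object can be tried out as follows. This sketch uses the
``ipaddress`` module as it later shipped in the Python standard library; the
draft API described in this PEP may differ in detail, so the factory function
names used below should be read as an assumption relative to this text.

```python
import ipaddress

address = ipaddress.ip_address("192.0.2.1")        # IPv4Address
network = ipaddress.ip_network("192.0.2.0/24")     # IPv4Network
interface = ipaddress.ip_interface("192.0.2.1/24")  # IPv4Interface

address_in_network = address in network
interface_parts = (interface.ip, interface.network)

# Network objects are strict by default: host bits must be zero.
try:
    ipaddress.ip_network("192.0.2.1/24")
except ValueError:
    strict_rejected = True
else:
    strict_rejected = False

# Passing strict=False masks the host bits instead of raising.
loose = ipaddress.ip_network("192.0.2.1/24", strict=False)
```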
Most of the operations a network administrator performs on networks are All IPv4 classes share certain characteristics and methods; the
similar for both IPv4 and IPv6 networks. Ie. finding subnets, supernets, number of bits needed to represent them, whether or not they
determining if an address is contained in a given network, etc. Similarly, belong to certain special IPv4 network ranges, etc. Similarly,
both addresses and networks (of the same ip version!) have much in common; all IPv6 classes share characteristics and methods.
the process for turning a given 32 or 128 bit number into a human readable
string notation, determining if the ip is within the valid specified range,
etc. Finally, there are some pythonic abstractions which are valid for all
addresses and networks, both IPv4 and IPv6. In short, there is common
functionality shared between (ipaddr class names in parentheses):
1. all IP addresses and networks, both IPv4 and IPv6. (_IPAddrBase) ipaddr makes extensive use of inheritance to avoid code
duplication as much as possible. The parent classes are private,
but they are outlined here:
2. all IP addresses of both versions. (_BaseIP) _IPAddrBase - Provides methods common to all ipaddr objects.
3. all IP networks of both version. (_BaseNet) _BaseAddress - Provides methods common to IPv4Address and
IPv6Address.
4. all IPv4 objects, both addresses and networks. (_BaseV4) _BaseInterface - Provides methods common to IPv4Interface and
IPv6Interface, as well as IPv4Network and IPv6Network (ipaddr
treats the Network classes as a special case of Interface).
5. all IPv6 objects, both addresses and networks. (_BaseV6) _BaseV4 - Provides methods and variables (eg, _max_prefixlen)
common to all IPv4 classes.
Seeing this as a clear hierarchy is important for recognizing how much _BaseV6 - Provides methods and variables common to all IPv6
code is common between the four main classes. For this reason, ipaddr uses classes.
class inheritance to abstract out as much common code as possible and
appropriate. This lack of duplication and very clean layout also makes
the job of the developer much easier should they need to debug code (either
theirs or mine).
Knowing that there might be cases where the developer doesn't so much care Comparisons between objects of differing IP versions result in a
as to the types of IP they might be receiving, ipaddr comes with two TypeError [1]. Additionally, comparisons of objects with
important helper functions, IPAddress() and IPNetwork(). These, as you different _Base parent classes result in a TypeError. The effect
might guess, return the appropriately typed address or network objects for of the _Base parent class limitation is that IPv4Interface's can
the given argument. be compared to IPv4Network's and IPv6Interface's can be compared
to IPv6Network's.
Finally, as mentioned earlier, there is no meaningful natural ordering
between IPv4 and IPv6 addresses and networks [2]. Rather than invent a
standard, ipaddr follows Ordering Comparisons [3] and raises a TypeError
when asked to compare objects of differing IP versions. In practice, there
are many ways a programmer may wish to order the addresses, so this
shouldn't pose a problem for the developer who can easily write:
v4 = [x for x in mixed_list if x._version == 4]
v6 = [x for x in mixed_list if x._version == 6]
# perform operations on v4 and v6 here.
return v4_return + v6_return
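A runnable variant of the split-by-version idea is sketched below. It assumes the standard library's ipaddress module and its public version attribute instead of ipaddr's private _version, and the sample addresses are made up for illustration.

```python
import ipaddress

# Hypothetical sample data mixing both IP versions.
mixed_list = [ipaddress.ip_address(text)
              for text in ('10.0.0.2', '::1', '10.0.0.1', '2001:db8::1')]

# Ordering across versions is refused rather than guessed at.
try:
    sorted(mixed_list)
    mixed_sort_failed = False
except TypeError:
    mixed_sort_failed = True

# Splitting by version first gives two lists that sort without trouble.
v4 = sorted(x for x in mixed_list if x.version == 4)
v6 = sorted(x for x in mixed_list if x.version == 6)
ordered = v4 + v6

print(mixed_sort_failed)          # True
print([str(x) for x in ordered])  # ['10.0.0.1', '10.0.0.2', '::1', '2001:db8::1']
```

Putting IPv4 before IPv6 here is one arbitrary policy among many, which is the point the text makes.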
- Multiple ways of displaying an IP Address.
Not everyone will want to display the same information in the same format;
IP addresses in Cisco syntax are represented by network/hostmask, Juniper's
are (network/IP)/prefixlength and iptables' are (network/IP)/(prefixlength/
netmask). The ipaddr library provides multiple ways to display an address.
In [1]: IPNetwork('1.1.1.1').with_prefixlen
Out[1]: '1.1.1.1/32'
In [1]: IPNetwork('1.1.1.1').with_netmask
Out[1]: '1.1.1.1/255.255.255.255'
In [1]: IPNetwork('1.1.1.1').with_hostmask
Out[1]: '1.1.1.1/0.0.0.0'
The same applies to IPv6. Note that netmasks and hostmasks are not
commonly used in IPv6; the methods exist for compatibility with
IPv4.
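The three display forms can be collected in one place, as in the sketch below. It assumes the standard library's ipaddress module, whose network objects expose the same with_prefixlen, with_netmask and with_hostmask attribute names; display_forms() is a hypothetical helper, not part of either library.

```python
import ipaddress


def display_forms(text):
    """Return the three textual forms of a network given as a string."""
    network = ipaddress.ip_network(text)
    return {
        'prefixlen': network.with_prefixlen,
        'netmask': network.with_netmask,
        'hostmask': network.with_hostmask,
    }


forms = display_forms('192.0.2.0/24')
for name, value in forms.items():
    print(name, value)
# prefixlen 192.0.2.0/24
# netmask 192.0.2.0/255.255.255.0
# hostmask 192.0.2.0/0.0.0.255
```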
- Lazy evaluation combined with aggressive caching of network elements.
(The following example is for IPv4Network objects, but the exact same
properties apply to IPv6Network objects.)
As mentioned, an IP network object is defined by a number of properties.
The object
In [1]: IPv4Network('1.1.1.0/24')
has a number of IPv4Address properties
In [1]: o = IPv4Network('1.1.1.0/24')
In [2]: o.network
Out[2]: IPv4Address('1.1.1.0')
In [3]: o.broadcast
Out[3]: IPv4Address('1.1.1.255')
In [4]: o.hostmask
Out[4]: IPv4Address('0.0.0.255')
If we were to compute them all at object creation time, we would incur a
non-negligible performance hit. Since these properties are required to
define the object completely but their values aren't always of interest to
the programmer, their computation should be done only when requested.
However, in order to avoid the performance hit in the case where one
attribute for a particular object is requested repeatedly (and continuously
recomputed), the results of the computation should be cached.
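The compute-on-demand-then-cache pattern can be sketched as follows. This is not ipaddr's actual implementation: LazyNetwork is a hypothetical stand-in that stores addresses as plain integers, and it relies on functools.cached_property, which requires Python 3.8 or later.

```python
import functools


class LazyNetwork:
    """Hypothetical stand-in for a network object; not ipaddr's real class."""

    def __init__(self, network_int, prefixlen):
        self.network_int = network_int
        self.prefixlen = prefixlen
        self.hostmask_computations = 0  # counts how often hostmask is computed

    @functools.cached_property
    def hostmask(self):
        # Runs on first access only; the result is then stored on the instance.
        self.hostmask_computations += 1
        return (1 << (32 - self.prefixlen)) - 1

    @functools.cached_property
    def broadcast(self):
        return self.network_int | self.hostmask


net = LazyNetwork(0x01010100, 24)   # 1.1.1.0/24 as an integer
print(net.hostmask_computations)    # 0, nothing computed at creation time
print(hex(net.broadcast))           # 0x10101ff, i.e. 1.1.1.255
print(net.hostmask, net.hostmask)   # 255 255
print(net.hostmask_computations)    # 1, later reads hit the cache
```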
- Address list summarization.
ipaddr supports easy summarization of lists of possibly contiguous
addresses, as this is something network administrators constantly find
themselves doing. This currently works in a number of ways.
1. collapse_address_list([list]):
Given a list of networks, ipaddr will collapse the list into the smallest
possible list of networks that wholly contain the addresses supplied.
In [1]: collapse_address_list([IPNetwork('1.1.0.0/24'),
...: IPNetwork('1.1.1.0/24')])
Out[1]: [IPv4Network('1.1.0.0/23')]
more elaborately:
In [1]: collapse_address_list([IPNetwork(x) for x in
...: IPNetwork('1.1.0.0/23')])
Out[1]: [IPv4Network('1.1.0.0/23')]
2. summarize_address_range(first, last):
Given a start and end address, ipaddr will provide the smallest number of
networks to cover the given range.
In [1]: summarize_address_range(IPv4Address('1.1.1.0'),
...: IPv4Address('2.2.2.0'))
Out[1]:
[IPv4Network('1.1.1.0/24'),
IPv4Network('1.1.2.0/23'),
IPv4Network('1.1.4.0/22'),
IPv4Network('1.1.8.0/21'),
IPv4Network('1.1.16.0/20'),
IPv4Network('1.1.32.0/19'),
IPv4Network('1.1.64.0/18'),
IPv4Network('1.1.128.0/17'),
IPv4Network('1.2.0.0/15'),
IPv4Network('1.4.0.0/14'),
IPv4Network('1.8.0.0/13'),
IPv4Network('1.16.0.0/12'),
IPv4Network('1.32.0.0/11'),
IPv4Network('1.64.0.0/10'),
IPv4Network('1.128.0.0/9'),
IPv4Network('2.0.0.0/15'),
IPv4Network('2.2.0.0/23'),
IPv4Network('2.2.2.0/32')]
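Both summarization styles can be reproduced with the sketch below. It assumes the standard library's ipaddress module, where the first function is named collapse_addresses() rather than collapse_address_list(), and where both functions return iterators. The address range is chosen to keep the output short.

```python
import ipaddress

# 1. Collapse two adjacent /24 networks into a single /23.
adjacent = [ipaddress.ip_network('1.1.0.0/24'),
            ipaddress.ip_network('1.1.1.0/24')]
collapsed = list(ipaddress.collapse_addresses(adjacent))
print(collapsed)
# [IPv4Network('1.1.0.0/23')]

# 2. Cover an arbitrary first..last range with the fewest networks.
first = ipaddress.ip_address('192.0.2.0')
last = ipaddress.ip_address('192.0.2.130')
summary = list(ipaddress.summarize_address_range(first, last))
print(summary)
# [IPv4Network('192.0.2.0/25'), IPv4Network('192.0.2.128/31'),
#  IPv4Network('192.0.2.130/32')]
```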
- Address Exclusion.
Used somewhat less often, but all the more annoying, is the case where a
programmer would want "all of the addresses in a network *except* these".
ipaddr performs this exclusion equally well for IPv4 and IPv6 networks
and collapses the resulting address list.
In [1]: IPNetwork('1.1.0.0/15').address_exclude(IPNetwork('1.1.1.0/24'))
Out[1]:
[IPv4Network('1.0.0.0/16'),
IPv4Network('1.1.0.0/24'),
IPv4Network('1.1.2.0/23'),
IPv4Network('1.1.4.0/22'),
IPv4Network('1.1.8.0/21'),
IPv4Network('1.1.16.0/20'),
IPv4Network('1.1.32.0/19'),
IPv4Network('1.1.64.0/18'),
IPv4Network('1.1.128.0/17')]
In [1]: IPNetwork('::1/96').address_exclude(IPNetwork('::1/112'))
Out[1]:
[IPv6Network('::1:0/112'),
IPv6Network('::2:0/111'),
IPv6Network('::4:0/110'),
IPv6Network('::8:0/109'),
IPv6Network('::10:0/108'),
IPv6Network('::20:0/107'),
IPv6Network('::40:0/106'),
IPv6Network('::80:0/105'),
IPv6Network('::100:0/104'),
IPv6Network('::200:0/103'),
IPv6Network('::400:0/102'),
IPv6Network('::800:0/101'),
IPv6Network('::1000:0/100'),
IPv6Network('::2000:0/99'),
IPv6Network('::4000:0/98'),
IPv6Network('::8000:0/97')]
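A smaller exclusion is easier to check by hand. The sketch below assumes the standard library's ipaddress module, whose address_exclude() returns an iterator in no guaranteed order, so the result is sorted before printing.

```python
import ipaddress

whole = ipaddress.ip_network('192.0.2.0/28')   # 16 addresses
hole = ipaddress.ip_network('192.0.2.1/32')    # the one address to leave out

remaining = sorted(whole.address_exclude(hole))
print([str(n) for n in remaining])
# ['192.0.2.0/32', '192.0.2.2/31', '192.0.2.4/30', '192.0.2.8/29']

# The pieces cover everything except the excluded address.
print(sum(n.num_addresses for n in remaining))   # 15
```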
- IPv6 address compression.
By default, IPv6 addresses are compressed internally (see the method
_BaseV6._compress_hextets), but ipaddr makes both the compressed and the
exploded representations available.
In [1]: IPNetwork('::1').compressed
Out[1]: '::1/128'
In [2]: IPNetwork('::1').exploded
Out[2]: '0000:0000:0000:0000:0000:0000:0000:0001/128'
In [3]: IPv6Address('::1').exploded
Out[3]: '0000:0000:0000:0000:0000:0000:0000:0001'
In [4]: IPv6Address('::1').compressed
Out[4]: '::1'
(the same methods exist for IPv4 networks and addresses, but they're
just stubs for returning the normal __str__ representation).
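The two representations can be compared side by side, as sketched below. The sketch assumes the standard library's ipaddress module, which exposes the same compressed and exploded attribute names.

```python
import ipaddress

addr = ipaddress.ip_address('2001:db8::1')
print(addr.compressed)   # 2001:db8::1
print(addr.exploded)     # 2001:0db8:0000:0000:0000:0000:0000:0001

loopback_net = ipaddress.ip_network('::1/128')
print(loopback_net.exploded)
# 0000:0000:0000:0000:0000:0000:0000:0001/128

# For IPv4 both forms are simply the usual dotted-quad string.
v4 = ipaddress.ip_address('192.0.2.1')
print(v4.compressed == v4.exploded == str(v4))   # True
```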
- Most other common operations.
It is a design goal to support all of the common operations expected from
an IP address manipulation module. As such, finding supernets, subnets,
address and network containment etc. are all supported.
Reference Implementation: Reference Implementation:
A reference implementation is available at: The current reference implementation can be found at:
http://ipaddr-py.googlecode.com/svn/trunk http://code.google.com/p/ipaddr-py/downloads/detail?name=3144.tar.gz
More information about using the reference implementation can be
found at: http://code.google.com/p/ipaddr-py/wiki/Using3144
References: References:
[1] http://bugs.python.org/issue3959
[2] Appealing to authority is a logical fallacy, but Vint Cerf is an [1] Appealing to authority is a logical fallacy, but Vint Cerf is an
authority who can't be ignored. Full text of the email follows: authority who can't be ignored. Full text of the email
follows:
""" """
I have seen a substantial amount of traffic about IPv4 and IPv6 I have seen a substantial amount of traffic about IPv4 and
comparisons and the general consensus is that these are not comparable. IPv6 comparisons and the general consensus is that these are
not comparable.
If we were to take a very simple minded view, we might treat these as If we were to take a very simple minded view, we might treat
pure integers in which case there is an ordering but not a useful one. these as pure integers in which case there is an ordering but
not a useful one.
In the IPv4 world, "length" is important because we take longest (most In the IPv4 world, "length" is important because we take
specific) address first for routing. Length is determine by the mask, longest (most specific) address first for routing. Length is
as you know. determine by the mask, as you know.
Assuming that the same style of argument works in IPv6, we would have Assuming that the same style of argument works in IPv6, we
to conclude that treating an IPv6 value purely as an integer for would have to conclude that treating an IPv6 value purely as
comparison with IPv4 would lead to some really strange results. an integer for comparison with IPv4 would lead to some really
strange results.
All of IPv4 space would lie in the host space of 0::0/96 prefix of All of IPv4 space would lie in the host space of 0::0/96
IPv6. For any useful interpretation of IPv4, this is a non-starter. prefix of IPv6. For any useful interpretation of IPv4, this is
a non-starter.
I think the only sensible conclusion is that IPv4 values and IPv6 values I think the only sensible conclusion is that IPv4 values and
should be treated as non-comparable. IPv6 values should be treated as non-comparable.
Vint Vint
""" """
[3] http://docs.python.org/dev/3.0/whatsnew/3.0.html#ordering-comparisons
Copyright: Copyright:

View File

@ -3,7 +3,7 @@ Title: Statement local namespaces (aka "given" clause)
Version: $Revision$ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan@gmail.com> Author: Nick Coghlan <ncoghlan@gmail.com>
Status: Deferred Status: Withdrawn
Type: Standards Track Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 2010-07-09 Created: 2010-07-09
@ -47,45 +47,21 @@ Haskell. It avoids some problems that have been identified in past proposals,
but has not yet itself been subject to the test of implementation. but has not yet itself been subject to the test of implementation.
PEP Deferral PEP Withdrawal
============ ==============
Despite the lifting of the language moratorium (PEP 3003) for Python 3.3, I've had a complicated history with this PEP. For a long time I left it in
this PEP currently remains in a Deferred state. This idea, if implemented, the Deferred state because I wasn't convinced the additional complexity was
will potentially have a deep and pervasive effect on the way people write worth the payoff. Then, briefly, I became more enamoured of the idea and
Python code. only left it at Deferred because I didn't really have time to pursue it.
When this PEP was first put forward, even I, as the PEP author, was not I'm now withdrawing it, as, the longer I reflect on the topic, the more I
convinced it was a good idea. Instead, I was simply writing it as a way to feel this approach is simply far too intrusive and complicated to ever be
avoid endlessly rehashing similar topics on python-ideas. When someone a good idea for Python as a language.
broached the subject, they could be pointed at this PEP and told "Come back
when you've read and understood the arguments presented there". Subsequent
discussions (most notably, those surrounding PEP 403's attempt at a more
restricted version of the idea) have convinced me that the idea is valuable
and will help address a number of situations where developers feel that
Python "gets in the way" instead of "matching the way they think". For me,
it is this aspect of "let people express what they're thinking, rather than
forcing them to think differently due to Python's limitations" that finally
allowed the idea to clear the "status quo wins a stalemate" bar ([5]_).
However, while I now think the idea is worthwhile, I don't think there is I've also finally found a couple of syntax proposals for PEP 403 that
sufficient time left in the 3.3 release cycle for the idea to mature. A read quite nicely and address the same set of use cases as this PEP
reference implementation is needed, and people need time to experiment with while remaining significantly simpler.
that implementation and offer feedback on whether or not it helps with
programming paradigms that are currently somewhat clumsy in Python (like
callback programming). Even if a PEP co-author volunteered immediately to
work on the implementation and incorporate feedback into the PEP text, I feel
targeting 3.3 would be unnecessarily rushing things. So, I've marked this
PEP as a candidate for 3.4 rather than 3.3.
Once that process is complete, Guido van Rossum (or his delegate) will need
to be sufficiently convinced of the idea's merit and accept the PEP. Such
acceptance will require not only a fully functional reference implementation
for CPython (as already mentioned), but also indications from the other three
major Python implementations (PyPy, Jython, IronPython) that they consider
it feasible to implement the proposed semantics once they reach the point of
targeting 3.4 compatibility. Input from related projects with a vested
interest in Python's syntax (e.g. Cython) will also be valuable.
Proposal Proposal

View File

@ -71,27 +71,20 @@ arguments can not be pickled (or, rather, unpickled) [3]_. Both a new
special method (``__getnewargs_ex__`` ?) and a new opcode (NEWOBJEX ?) special method (``__getnewargs_ex__`` ?) and a new opcode (NEWOBJEX ?)
are needed. are needed.
Serializing more callable objects Serializing more "lookupable" objects
--------------------------------- -------------------------------------
Currently, only module-global functions are serializable. For some kinds of objects, it only makes sense to serialize them by name
Multiprocessing has custom support for pickling other callables such (for example classes and functions). By default, pickle is only able to
as bound methods [4]_. This support could be folded in the protocol, serialize module-global functions and classes by name. Supporting other
and made more efficient through a new GETATTR opcode. kinds of objects, such as unbound methods [4]_, is a common request.
Actually, third-party support for some of them, such as bound methods,
is implemented in the multiprocessing module [5]_.
Serializing "pseudo-global" objects :pep:`3155` now makes it possible to look up many more objects by name.
----------------------------------- Generalizing the GLOBAL opcode to accept dot-separated names, or adding
a special GETATTR opcode, would allow the standard pickle implementation
Objects which are not module-global, but should be treated in a to support, in an efficient way, all those kinds of objects.
similar fashion -- such as unbound methods [5]_ or nested classes --
cannot currently be pickled (or, rather, unpickled) because the pickle
protocol does not correctly specify how to retrieve them. One
solution would be through the addition of a ``__namespace__`` (or
``__qualname__``) attribute to all class and function objects, specifying the
full "path" by which they can be retrieved. For globals, this would
generally be ``"{}.{}".format(obj.__module__, obj.__name__)``. Then a
new opcode can resolve that path and push the object on the stack,
similarly to the GLOBAL opcode.
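The path resolution such an opcode would have to perform can be sketched in plain Python. This is only an illustration of the lookup step, not the pickle protocol's implementation: resolve_dotted() and the Outer/Inner classes are hypothetical names introduced here.

```python
import functools


def resolve_dotted(root, dotted_name):
    """Hypothetical helper: follow a dot-separated name from root via getattr."""
    parts = dotted_name.split('.')
    if '<locals>' in parts:
        # A function's local namespace cannot be reached from the outside.
        raise LookupError('cannot resolve local object %r' % dotted_name)
    return functools.reduce(getattr, parts, root)


class Outer:
    class Inner:
        def method(self):
            return 'called'


found = resolve_dotted(Outer, 'Inner.method')
print(found is Outer.Inner.method)   # True
print(found(Outer.Inner()))          # called
```

A real unpickler would start the walk from an imported module rather than from a class, but the attribute-by-attribute traversal is the same.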
Binary encoding for all opcodes Binary encoding for all opcodes
------------------------------- -------------------------------
@ -131,12 +124,12 @@ References
.. [3] "pickle/copyreg doesn't support keyword only arguments in __new__": .. [3] "pickle/copyreg doesn't support keyword only arguments in __new__":
http://bugs.python.org/issue4727 http://bugs.python.org/issue4727
.. [4] Lib/multiprocessing/forking.py: .. [4] "pickle should support methods":
http://hg.python.org/cpython/file/baea9f5f973c/Lib/multiprocessing/forking.py#l54
.. [5] "pickle should support methods":
http://bugs.python.org/issue9276 http://bugs.python.org/issue9276
.. [5] Lib/multiprocessing/forking.py:
http://hg.python.org/cpython/file/baea9f5f973c/Lib/multiprocessing/forking.py#l54
Copyright Copyright
========= =========

View File

@ -3,13 +3,13 @@ Title: Qualified name for classes and functions
Version: $Revision$ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Antoine Pitrou <solipsis@pitrou.net> Author: Antoine Pitrou <solipsis@pitrou.net>
Status: Draft Status: Final
Type: Standards Track Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 2011-10-29 Created: 2011-10-29
Python-Version: 3.3 Python-Version: 3.3
Post-History: Post-History:
Resolution: TBD Resolution: http://mail.python.org/pipermail/python-dev/2011-November/114545.html
Rationale Rationale
@ -59,15 +59,16 @@ objects came up several times. It also limits pickling support [1]_.
Proposal Proposal
======== ========
This PEP proposes the addition of a ``__qname__`` attribute to This PEP proposes the addition of a ``__qualname__`` attribute to
functions and classes. For top-level functions and classes, the functions and classes. For top-level functions and classes, the
``__qname__`` attribute is equal to the ``__name__`` attribute. For ``__qualname__`` attribute is equal to the ``__name__`` attribute. For
nested classes, methods, and nested functions, the ``__qname__`` nested classes, methods, and nested functions, the ``__qualname__``
attribute contains a dotted path leading to the object from the module attribute contains a dotted path leading to the object from the module
top-level. top-level. A function's local namespace is represented in that dotted
path by a component named ``<locals>``.
The repr() and str() of functions and classes are modified to use The repr() and str() of functions and classes are modified to use
``__qname__`` rather than ``__name__``. ``__qualname__`` rather than ``__name__``.
Example with nested classes Example with nested classes
--------------------------- ---------------------------
@ -77,13 +78,13 @@ Example with nested classes
... class D: ... class D:
... def g(): pass ... def g(): pass
... ...
>>> C.__qname__ >>> C.__qualname__
'C' 'C'
>>> C.f.__qname__ >>> C.f.__qualname__
'C.f' 'C.f'
>>> C.D.__qname__ >>> C.D.__qualname__
'C.D' 'C.D'
>>> C.D.g.__qname__ >>> C.D.g.__qualname__
'C.D.g' 'C.D.g'
Example with nested functions Example with nested functions
@ -93,10 +94,10 @@ Example with nested functions
... def g(): pass ... def g(): pass
... return g ... return g
... ...
>>> f.__qname__ >>> f.__qualname__
'f' 'f'
>>> f().__qname__ >>> f().__qualname__
'f.g' 'f.<locals>.g'
Limitations Limitations
@ -107,16 +108,52 @@ dotted path will not be walkable programmatically as a function's
namespace is not available from the outside. It will still be more namespace is not available from the outside. It will still be more
helpful to the human reader than the bare ``__name__``. helpful to the human reader than the bare ``__name__``.
Like the ``__name__`` attribute, the ``__qname__`` attribute is computed Like the ``__name__`` attribute, the ``__qualname__`` attribute is computed
statically and it will not automatically follow rebinding. statically and it will not automatically follow rebinding.
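The static behaviour can be observed with a short sketch, assuming Python 3.3 or later and the final ``__qualname__`` spelling. The names C, D, outer, inner and alias are illustrative only, and the printed values assume the code runs at module top level.

```python
class C:
    def f(self):
        pass

    class D:
        def g(self):
            pass


def outer():
    def inner():
        pass
    return inner


# Rebinding an object to a new name does not change what was recorded.
alias = C.D

print(C.D.g.__qualname__)    # C.D.g
print(outer().__qualname__)  # outer.<locals>.inner
print(alias.__qualname__)    # C.D
print(alias.__name__)        # D
```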
Discussion
==========
Excluding the module name
-------------------------
Like ``__name__``, ``__qualname__`` doesn't include the module name. This
makes it independent of module aliasing and rebinding, and also allows it
to be computed at compile time.
Reviving unbound methods
------------------------
Reviving unbound methods would only solve a fraction of the problems this
PEP solves, at a higher price (an additional object type and an additional
indirection, rather than an additional attribute).
Naming choice
=============
"Qualified name" is the best approximation, as a short phrase, of what the
additional attribute is about. It is not a "full name" or "fully qualified
name" since it (deliberately) does not include the module name. Calling
it a "path" would risk confusion with filesystem paths and the ``__file__``
attribute.
The first proposal for the attribute name was to call it ``__qname__`` but
many people (who are not aware of previous use of such jargon in e.g. the
XML specification [2]_) found it obscure and non-obvious, which is why the
slightly less short and more explicit ``__qualname__`` was finally chosen.
References References
========== ==========
.. [1] "pickle should support methods": .. [1] "pickle should support methods":
http://bugs.python.org/issue9276 http://bugs.python.org/issue9276
.. [2] "QName" entry in Wikipedia:
http://en.wikipedia.org/wiki/QName
Copyright Copyright
========= =========

View File

@ -36,15 +36,15 @@ def sort_peps(peps):
for pep in peps: for pep in peps:
# Order of 'if' statement important. Key Status values take precedence # Order of 'if' statement important. Key Status values take precedence
# over Type value, and vice-versa. # over Type value, and vice-versa.
if pep.type_ == 'Process': if pep.status == 'Draft':
if pep.status in ("Active", "Draft"): open_.append(pep)
elif pep.type_ == 'Process':
if pep.status == "Active":
meta.append(pep) meta.append(pep)
elif pep.status in ("Withdrawn", "Rejected"): elif pep.status in ("Withdrawn", "Rejected"):
dead.append(pep) dead.append(pep)
else: else:
historical.append(pep) historical.append(pep)
elif pep.status == 'Draft':
open_.append(pep)
elif pep.status == 'Deferred': elif pep.status == 'Deferred':
deferred.append(pep) deferred.append(pep)
elif pep.status in ('Rejected', 'Withdrawn', elif pep.status in ('Rejected', 'Withdrawn',
@ -169,7 +169,7 @@ def write_pep0(peps, output=sys.stdout):
print>>output, unicode(pep) print>>output, unicode(pep)
print>>output print>>output
print>>output print>>output
print>>output, u" Numerical Index" print>>output, u"Numerical Index"
print>>output print>>output
write_column_headers(output) write_column_headers(output)
prev_pep = 0 prev_pep = 0