Convert PEPs 261, 267, 325, 358, 361 (#204)

* Convert PEPs 261, 267, 325, 358, 361

* Fixes to PEP 261 and PEP 361
Mariatta 2017-02-10 14:19:22 -08:00 committed by GitHub
parent c5881cf2b5
commit 9c9560962a
5 changed files with 1060 additions and 986 deletions


@ -5,31 +5,32 @@ Last-Modified: $Date$
Author: Paul Prescod <paul@prescod.net> Author: Paul Prescod <paul@prescod.net>
Status: Final Status: Final
Type: Standards Track Type: Standards Track
Content-Type: text/x-rst
Created: 27-Jun-2001 Created: 27-Jun-2001
Python-Version: 2.2 Python-Version: 2.2
Post-History: 27-Jun-2001 Post-History: 27-Jun-2001
Abstract Abstract
========
Python 2.1 unicode characters can have ordinals only up to 2**16 -1. Python 2.1 unicode characters can have ordinals only up to 2**16 -1.
This range corresponds to a range in Unicode known as the Basic This range corresponds to a range in Unicode known as the Basic
Multilingual Plane. There are now characters in Unicode that live Multilingual Plane. There are now characters in Unicode that live
on other "planes". The largest addressable character in Unicode on other "planes". The largest addressable character in Unicode
has the ordinal 17 * 2**16 - 1 (0x10ffff). For readability, we has the ordinal 17 * 2**16 - 1 (0x10ffff). For readability, we
will call this TOPCHAR and call characters in this range "wide will call this TOPCHAR and call characters in this range "wide
characters". characters".
Glossary Glossary
========
Character Character
Used by itself, means the addressable units of a Python Used by itself, means the addressable units of a Python
Unicode string. Unicode string.
Code point Code point
A code point is an integer between 0 and TOPCHAR. A code point is an integer between 0 and TOPCHAR.
If you imagine Unicode as a mapping from integers to If you imagine Unicode as a mapping from integers to
characters, each integer is a code point. But the characters, each integer is a code point. But the
@ -38,59 +39,56 @@ Glossary
be used for characters. Some are guaranteed never be used for characters. Some are guaranteed never
to be used for characters. to be used for characters.
Codec Codec
A set of functions for translating between physical A set of functions for translating between physical
encodings (e.g. on disk or coming in from a network) encodings (e.g. on disk or coming in from a network)
into logical Python objects. into logical Python objects.
Encoding Encoding
Mechanism for representing abstract characters in terms of Mechanism for representing abstract characters in terms of
physical bits and bytes. Encodings allow us to store physical bits and bytes. Encodings allow us to store
Unicode characters on disk and transmit them over networks Unicode characters on disk and transmit them over networks
in a manner that is compatible with other Unicode software. in a manner that is compatible with other Unicode software.
Surrogate pair Surrogate pair
Two physical characters that represent a single logical Two physical characters that represent a single logical
character. Part of a convention for representing 32-bit character. Part of a convention for representing 32-bit
code points in terms of two 16-bit code points. code points in terms of two 16-bit code points.
Unicode string Unicode string
A Python type representing a sequence of code points with A Python type representing a sequence of code points with
"string semantics" (e.g. case conversions, regular "string semantics" (e.g. case conversions, regular
expression compatibility, etc.) Constructed with the expression compatibility, etc.) Constructed with the
unicode() function. ``unicode()`` function.
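The "surrogate pair" entry above can be made concrete with a little arithmetic. The following is a minimal sketch of the standard UTF-16 convention; the helper names ``to_surrogate_pair`` and ``from_surrogate_pair`` are illustrative only and do not appear in the PEP.

```python
def to_surrogate_pair(code_point):
    """Split a code point above 0xFFFF into a (high, low) surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not in the range 0x10000..0x10FFFF")
    offset = code_point - 0x10000        # leaves a 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits select the high surrogate
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits select the low surrogate
    return high, low


def from_surrogate_pair(high, low):
    """Recombine a (high, low) surrogate pair into the original code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
```

The two functions are inverses of each other across the whole range from 0x10000 to TOPCHAR.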
Proposed Solution Proposed Solution
=================
One solution would be to merely increase the maximum ordinal One solution would be to merely increase the maximum ordinal
to a larger value. Unfortunately the only straightforward to a larger value. Unfortunately the only straightforward
implementation of this idea is to use 4 bytes per character. implementation of this idea is to use 4 bytes per character.
This has the effect of doubling the size of most Unicode This has the effect of doubling the size of most Unicode
strings. In order to avoid imposing this cost on every strings. In order to avoid imposing this cost on every
user, Python 2.2 will allow the 4-byte implementation as a user, Python 2.2 will allow the 4-byte implementation as a
build-time option. Users can choose whether they care about build-time option. Users can choose whether they care about
wide characters or prefer to preserve memory. wide characters or prefer to preserve memory.
The 4-byte option is called "wide Py_UNICODE". The 2-byte option The 4-byte option is called ``wide Py_UNICODE``. The 2-byte option
is called "narrow Py_UNICODE". is called ``narrow Py_UNICODE``.
Most things will behave identically in the wide and narrow worlds. Most things will behave identically in the wide and narrow worlds.
* unichr(i) for 0 <= i < 2**16 (0x10000) always returns a * unichr(i) for 0 <= i < 2**16 (0x10000) always returns a
length-one string. length-one string.
* unichr(i) for 2**16 <= i <= TOPCHAR will return a * unichr(i) for 2**16 <= i <= TOPCHAR will return a
length-one string on wide Python builds. On narrow builds it will length-one string on wide Python builds. On narrow builds it will
raise ValueError. raise ``ValueError``.
ISSUE ISSUE
Python currently allows \U literals that cannot be Python currently allows ``\U`` literals that cannot be
represented as a single Python character. It generates two represented as a single Python character. It generates two
Python characters known as a "surrogate pair". Should this Python characters known as a "surrogate pair". Should this
be disallowed on future narrow Python builds? be disallowed on future narrow Python builds?
@ -135,28 +133,30 @@ Proposed Solution
careful of these characters which are disallowed by the careful of these characters which are disallowed by the
Unicode specification. Unicode specification.
* ord() is always the inverse of unichr() * ``ord()`` is always the inverse of ``unichr()``
* There is an integer value in the sys module that describes the * There is an integer value in the sys module that describes the
largest ordinal for a character in a Unicode string on the current largest ordinal for a character in a Unicode string on the current
interpreter. sys.maxunicode is 2**16-1 (0xffff) on narrow builds interpreter. ``sys.maxunicode`` is 2**16-1 (0xffff) on narrow builds
of Python and TOPCHAR on wide builds. of Python and TOPCHAR on wide builds.
ISSUE: Should there be distinct constants for accessing ISSUE:
Should there be distinct constants for accessing
TOPCHAR and the real upper bound for the domain of TOPCHAR and the real upper bound for the domain of
unichr (if they differ)? There has also been a unichr (if they differ)? There has also been a
suggestion of sys.unicodewidth which can take the suggestion of sys.unicodewidth which can take the
values 'wide' and 'narrow'. values 'wide' and 'narrow'.
* every Python Unicode character represents exactly one Unicode code * every Python Unicode character represents exactly one Unicode code
point (i.e. Python Unicode Character = Abstract Unicode character). point (i.e. Python Unicode Character = Abstract Unicode character).
* codecs will be upgraded to support "wide characters" * codecs will be upgraded to support "wide characters"
(represented directly in UCS-4, and as variable-length sequences (represented directly in UCS-4, and as variable-length sequences
in UTF-8 and UTF-16). This is the main part of the implementation in UTF-8 and UTF-16). This is the main part of the implementation
left to be done. left to be done.
* There is a convention in the Unicode world for encoding a 32-bit * There is a convention in the Unicode world for encoding a 32-bit
code point in terms of two 16-bit code points. These are known code point in terms of two 16-bit code points. These are known
as "surrogate pairs". Python's codecs will adopt this convention as "surrogate pairs". Python's codecs will adopt this convention
and encode 32-bit code points as surrogate pairs on narrow Python and encode 32-bit code points as surrogate pairs on narrow Python
@ -174,64 +174,68 @@ Proposed Solution
fixed-width characters and does not have to worry about fixed-width characters and does not have to worry about
surrogates. surrogates.
Con: Con:
No clear proposal of how to communicate this to codecs. No clear proposal of how to communicate this to codecs.
* there are no restrictions on constructing strings that use * there are no restrictions on constructing strings that use
code points "reserved for surrogates" improperly. These are code points "reserved for surrogates" improperly. These are
called "isolated surrogates". The codecs should disallow reading called "isolated surrogates". The codecs should disallow reading
these from files, but you could construct them using string these from files, but you could construct them using string
literals or unichr(). literals or ``unichr()``.
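A program could use ``sys.maxunicode`` to decide at run time which kind of build it is on. The sketch below assumes only the behaviour described in the list above; ``is_wide_build`` and ``fits_in_one_character`` are hypothetical helper names, and ``TOPCHAR`` is the constant from the Abstract.

```python
import sys

TOPCHAR = 17 * 2**16 - 1   # 0x10ffff, as defined in the Abstract


def is_wide_build():
    """True when a single character can hold any ordinal above 0xFFFF."""
    return sys.maxunicode > 0xFFFF


def fits_in_one_character(code_point):
    """True when the code point is representable as one Python character."""
    return 0 <= code_point <= sys.maxunicode
```

On a narrow build ``fits_in_one_character(0x10000)`` would be false, which matches the case where ``unichr()`` raises ``ValueError``.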
Implementation Implementation
==============
There is a new define: There is a new define::
#define Py_UNICODE_SIZE 2 #define Py_UNICODE_SIZE 2
To test whether UCS2 or UCS4 is in use, the derived macro To test whether UCS2 or UCS4 is in use, the derived macro
Py_UNICODE_WIDE should be used, which is defined when UCS-4 is in ``Py_UNICODE_WIDE`` should be used, which is defined when UCS-4 is in
use. use.
There is a new configure option: There is a new configure option:
--enable-unicode=ucs2 configures a narrow Py_UNICODE, and uses ===================== ==========================================
--enable-unicode=ucs2 configures a narrow Py_UNICODE, and uses
wchar_t if it fits wchar_t if it fits
--enable-unicode=ucs4 configures a wide Py_UNICODE, and uses --enable-unicode=ucs4 configures a wide Py_UNICODE, and uses
wchar_t if it fits wchar_t if it fits
--enable-unicode same as "=ucs2" --enable-unicode same as "=ucs2"
--disable-unicode entirely remove the Unicode functionality. --disable-unicode entirely remove the Unicode functionality.
===================== ==========================================
It is also proposed that one day --enable-unicode will just It is also proposed that one day ``--enable-unicode`` will just
default to the width of your platform's wchar_t. default to the width of your platform's ``wchar_t``.
Windows builds will be narrow for a while based on the fact that Windows builds will be narrow for a while based on the fact that
there have been few requests for wide characters, those requests there have been few requests for wide characters, those requests
are mostly from hard-core programmers with the ability to buy are mostly from hard-core programmers with the ability to buy
their own Python and Windows itself is strongly biased towards their own Python and Windows itself is strongly biased towards
16-bit characters. 16-bit characters.
Notes Notes
=====
This PEP does NOT imply that people using Unicode need to use a This PEP does NOT imply that people using Unicode need to use a
4-byte encoding for their files on disk or sent over the network. 4-byte encoding for their files on disk or sent over the network.
It only allows them to do so. For example, ASCII is still a It only allows them to do so. For example, ASCII is still a
legitimate (7-bit) Unicode-encoding. legitimate (7-bit) Unicode-encoding.
It has been proposed that there should be a module that handles It has been proposed that there should be a module that handles
surrogates in narrow Python builds for programmers. If someone surrogates in narrow Python builds for programmers. If someone
wants to implement that, it will be another PEP. It might also be wants to implement that, it will be another PEP. It might also be
combined with features that allow other kinds of character-, combined with features that allow other kinds of character-,
word- and line- based indexing. word- and line- based indexing.
Rejected Suggestions Rejected Suggestions
====================
More or less the status-quo More or less the status-quo
We could officially say that Python characters are 16-bit and We could officially say that Python characters are 16-bit and
require programmers to implement wide characters in their require programmers to implement wide characters in their
@ -241,7 +245,7 @@ Rejected Suggestions
abstracted pseudo-strings would not be legal as input to the abstracted pseudo-strings would not be legal as input to the
regular expression engine. regular expression engine.
"Space-efficient Unicode" type "Space-efficient Unicode" type
Another class of solution is to use some efficient storage Another class of solution is to use some efficient storage
internally but present an abstraction of wide characters to internally but present an abstraction of wide characters to
@ -253,32 +257,35 @@ Rejected Suggestions
narrow Python. Guido is not willing to undertake the narrow Python. Guido is not willing to undertake the
implementation right now. implementation right now.
Two types Two types
We could introduce a 32-bit Unicode type alongside the 16-bit We could introduce a 32-bit Unicode type alongside the 16-bit
type. There is a lot of code that expects there to be only a type. There is a lot of code that expects there to be only a
single Unicode type. single Unicode type.
This PEP represents the least-effort solution. Over the next This PEP represents the least-effort solution. Over the next
several years, 32-bit Unicode characters will become more common several years, 32-bit Unicode characters will become more common
and that may either convince us that we need a more sophisticated and that may either convince us that we need a more sophisticated
solution or (on the other hand) convince us that simply solution or (on the other hand) convince us that simply
mandating wide Unicode characters is an appropriate solution. mandating wide Unicode characters is an appropriate solution.
Right now the two options on the table are do nothing or do Right now the two options on the table are do nothing or do
this. this.
References References
==========
Unicode Glossary: http://www.unicode.org/glossary/ Unicode Glossary: http://www.unicode.org/glossary/
Copyright Copyright
=========
This document has been placed in the public domain. This document has been placed in the public domain.
Local Variables: ..
mode: indented-text Local Variables:
indent-tabs-mode: nil mode: indented-text
End: indent-tabs-mode: nil
End:


@ -5,75 +5,81 @@ Last-Modified: $Date$
Author: jeremy@alum.mit.edu (Jeremy Hylton) Author: jeremy@alum.mit.edu (Jeremy Hylton)
Status: Deferred Status: Deferred
Type: Standards Track Type: Standards Track
Content-Type: text/x-rst
Created: 23-May-2001 Created: 23-May-2001
Python-Version: 2.2 Python-Version: 2.2
Post-History: Post-History:
Deferral
While this PEP is a nice idea, no-one has yet emerged to do the work of Deferral
hashing out the differences between this PEP, PEP 266 and PEP 280. ========
Hence, it is being deferred.
While this PEP is a nice idea, no-one has yet emerged to do the work of
hashing out the differences between this PEP, PEP 266 and PEP 280.
Hence, it is being deferred.
Abstract Abstract
========
This PEP proposes a new implementation of global module namespaces This PEP proposes a new implementation of global module namespaces
and the builtin namespace that speeds name resolution. The and the builtin namespace that speeds name resolution. The
implementation would use an array of object pointers for most implementation would use an array of object pointers for most
operations in these namespaces. The compiler would assign indices operations in these namespaces. The compiler would assign indices
for global variables and module attributes at compile time. for global variables and module attributes at compile time.
The current implementation represents these namespaces as The current implementation represents these namespaces as
dictionaries. A global name incurs a dictionary lookup each time dictionaries. A global name incurs a dictionary lookup each time
it is used; a builtin name incurs two dictionary lookups, a failed it is used; a builtin name incurs two dictionary lookups, a failed
lookup in the global namespace and a second lookup in the builtin lookup in the global namespace and a second lookup in the builtin
namespace. namespace.
This implementation should speed Python code that uses This implementation should speed Python code that uses
module-level functions and variables. It should also eliminate module-level functions and variables. It should also eliminate
awkward coding styles that have evolved to speed access to these awkward coding styles that have evolved to speed access to these
names. names.
The implementation is complicated because the global and builtin The implementation is complicated because the global and builtin
namespaces can be modified dynamically in ways that are impossible namespaces can be modified dynamically in ways that are impossible
for the compiler to detect. (Example: A module's namespace is for the compiler to detect. (Example: A module's namespace is
modified by a script after the module is imported.) As a result, modified by a script after the module is imported.) As a result,
the implementation must maintain several auxiliary data structures the implementation must maintain several auxiliary data structures
to preserve these dynamic features. to preserve these dynamic features.
Introduction Introduction
============
This PEP proposes a new implementation of attribute access for This PEP proposes a new implementation of attribute access for
module objects that optimizes access to module variables known at module objects that optimizes access to module variables known at
compile time. The module will store these variables in an array compile time. The module will store these variables in an array
and provide an interface to look up attributes using array offsets. and provide an interface to look up attributes using array offsets.
For globals, builtins, and attributes of imported modules, the For globals, builtins, and attributes of imported modules, the
compiler will generate code that uses the array offsets for fast compiler will generate code that uses the array offsets for fast
access. access.
[describe the key parts of the design: dlict, compiler support, [describe the key parts of the design: dlict, compiler support,
stupid name trick workarounds, optimization of other module's stupid name trick workarounds, optimization of other module's
globals] globals]
The implementation will preserve existing semantics for module The implementation will preserve existing semantics for module
namespaces, including the ability to modify module namespaces at namespaces, including the ability to modify module namespaces at
runtime in ways that affect the visibility of builtin names. runtime in ways that affect the visibility of builtin names.
DLict design DLict design
============
The namespaces are implemented using a data structure that has The namespaces are implemented using a data structure that has
sometimes gone under the name dlict. It is a dictionary that has sometimes gone under the name ``dlict``. It is a dictionary that has
numbered slots for some dictionary entries. The type must be numbered slots for some dictionary entries. The type must be
implemented in C to achieve acceptable performance. The new implemented in C to achieve acceptable performance. The new
type-class unification work should make this fairly easy. The type-class unification work should make this fairly easy. The
DLict will presumably be a subclass of dictionary with an ``DLict`` will presumably be a subclass of dictionary with an
alternate storage module for some keys. alternate storage module for some keys.
A Python implementation is included here to illustrate the basic A Python implementation is included here to illustrate the basic
design: design::
"""A dictionary-list hybrid""" """A dictionary-list hybrid"""
@ -183,102 +189,108 @@ DLict design
Compiler issues Compiler issues
===============
The compiler currently collects the names of all global variables The compiler currently collects the names of all global variables
in a module. These are names bound at the module level or bound in a module. These are names bound at the module level or bound
in a class or function body that declares them to be global. in a class or function body that declares them to be global.
The compiler would assign indices for each global name and add the The compiler would assign indices for each global name and add the
names and indices of the globals to the module's code object. names and indices of the globals to the module's code object.
Each code object would then be bound irrevocably to the module it Each code object would then be bound irrevocably to the module it
was defined in. (Not sure if there are some subtle problems with was defined in. (Not sure if there are some subtle problems with
this.) this.)
For attributes of imported modules, the module will store an For attributes of imported modules, the module will store an
indirection record. Internally, the module will store a pointer indirection record. Internally, the module will store a pointer
to the defining module and the offset of the attribute in the to the defining module and the offset of the attribute in the
defining module's global variable array. The offset would be defining module's global variable array. The offset would be
initialized the first time the name is looked up. initialized the first time the name is looked up.
Runtime model Runtime model
=============
The PythonVM will be extended with new opcodes to access globals The PythonVM will be extended with new opcodes to access globals
and module attributes via a module-level array. and module attributes via a module-level array.
A function object would need to point to the module that defined A function object would need to point to the module that defined
it in order to provide access to the module-level global array. it in order to provide access to the module-level global array.
For module attributes stored in the dlict (call them static For module attributes stored in the ``dlict`` (call them static
attributes), the get/delattr implementation would need to track attributes), the get/delattr implementation would need to track
access to these attributes using the old by-name interface. If a access to these attributes using the old by-name interface. If a
static attribute is updated dynamically, e.g. static attribute is updated dynamically, e.g.::
mod.__dict__["foo"] = 2 mod.__dict__["foo"] = 2
The implementation would need to update the array slot instead of The implementation would need to update the array slot instead of
the backup dict. the backup dict.
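The routing described above might look roughly like the sketch below. It is a simplified stand-in, not the DLict implementation from the PEP; the class ``SlotNamespace`` and its method names are invented for illustration.

```python
class SlotNamespace:
    """Hypothetical dict-list hybrid: names known in advance live in slots."""

    _UNUSED = object()  # marks a slot that currently holds no binding

    def __init__(self, static_names):
        # Names known at "compile time" get a fixed array index.
        self._index = {name: i for i, name in enumerate(static_names)}
        self._slots = [self._UNUSED] * len(static_names)
        self._backup = {}  # ordinary dict for names discovered later

    def slot_of(self, name):
        """Return the array index assigned to a static name."""
        return self._index[name]

    def get_slot(self, i):
        """Fast path: fetch a value by array index."""
        value = self._slots[i]
        if value is self._UNUSED:
            raise KeyError(i)
        return value

    def __setitem__(self, name, value):
        # By-name update: static names go to their slot, others to the dict.
        i = self._index.get(name)
        if i is None:
            self._backup[name] = value
        else:
            self._slots[i] = value

    def __getitem__(self, name):
        i = self._index.get(name)
        if i is None:
            return self._backup[name]
        value = self._slots[i]
        if value is self._UNUSED:
            raise KeyError(name)
        return value
```

With this layout, an assignment through the by-name interface to a static name is immediately visible to code that reads the slot by index.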
Backwards compatibility Backwards compatibility
=======================
The dlict will need to maintain meta-information about whether a The ``dlict`` will need to maintain meta-information about whether a
slot is currently used or not. It will also need to maintain a slot is currently used or not. It will also need to maintain a
pointer to the builtin namespace. When a name is not currently pointer to the builtin namespace. When a name is not currently
used in the global namespace, the lookup will have to fail over to used in the global namespace, the lookup will have to fail over to
the builtin namespace. the builtin namespace.
In the reverse case, each module may need a special accessor In the reverse case, each module may need a special accessor
function for the builtin namespace that checks to see if a global function for the builtin namespace that checks to see if a global
shadowing the builtin has been added dynamically. This check shadowing the builtin has been added dynamically. This check
would only occur if there was a dynamic change to the module's would only occur if there was a dynamic change to the module's
dlict, i.e. when a name is bound that wasn't discovered at ``dlict``, i.e. when a name is bound that wasn't discovered at
compile-time. compile-time.
These mechanisms would have little if any cost for the common case These mechanisms would have little if any cost for the common case
where a module's global namespace is not modified in strange where a module's global namespace is not modified in strange
ways at runtime. They would add overhead for modules that did ways at runtime. They would add overhead for modules that did
unusual things with global names, but this is an uncommon practice unusual things with global names, but this is an uncommon practice
and probably one worth discouraging. and probably one worth discouraging.
It may be desirable to disable dynamic additions to the global It may be desirable to disable dynamic additions to the global
namespace in some future version of Python. If so, the new namespace in some future version of Python. If so, the new
implementation could provide warnings. implementation could provide warnings.
Related PEPs Related PEPs
============
PEP 266, Optimizing Global Variable/Attribute Access, proposes a PEP 266, Optimizing Global Variable/Attribute Access, proposes a
different mechanism for optimizing access to global variables as different mechanism for optimizing access to global variables as
well as attributes of objects. The mechanism uses two new opcodes well as attributes of objects. The mechanism uses two new opcodes
TRACK_OBJECT and UNTRACK_OBJECT to create a slot in the local ``TRACK_OBJECT`` and ``UNTRACK_OBJECT`` to create a slot in the local
variables array that aliases the global or object attribute. If variables array that aliases the global or object attribute. If
the object being aliased is rebound, the rebind operation is the object being aliased is rebound, the rebind operation is
responsible for updating the aliases. responsible for updating the aliases.
The object tracking approach applies to a wider range of The object tracking approach applies to a wider range of
objects than just modules. It may also have a higher runtime cost, objects than just modules. It may also have a higher runtime cost,
because each function that uses a global or object attribute must because each function that uses a global or object attribute must
execute extra opcodes to register its interest in an object and execute extra opcodes to register its interest in an object and
unregister on exit; the cost of registration is unclear, but unregister on exit; the cost of registration is unclear, but
presumably involves a dynamically resizable data structure to hold presumably involves a dynamically resizable data structure to hold
a list of callbacks. a list of callbacks.
The implementation proposed here avoids the need for registration, The implementation proposed here avoids the need for registration,
because it does not create aliases. Instead it allows functions because it does not create aliases. Instead it allows functions
that reference a global variable or module attribute to retain a that reference a global variable or module attribute to retain a
pointer to the location where the original binding is stored. A pointer to the location where the original binding is stored. A
second advantage is that the initial lookup is performed once per second advantage is that the initial lookup is performed once per
module rather than once per function call. module rather than once per function call.
Copyright Copyright
=========
This document has been placed in the public domain. This document has been placed in the public domain.
Local Variables: ..
mode: indented-text Local Variables:
indent-tabs-mode: nil mode: indented-text
End: indent-tabs-mode: nil
End:


@ -5,73 +5,78 @@ Last-Modified: $Date$
Author: Samuele Pedroni <pedronis@python.org> Author: Samuele Pedroni <pedronis@python.org>
Status: Rejected Status: Rejected
Type: Standards Track Type: Standards Track
Content-Type: text/plain Content-Type: text/x-rst
Created: 25-Aug-2003 Created: 25-Aug-2003
Python-Version: 2.4 Python-Version: 2.4
Post-History: Post-History:
Abstract Abstract
========
Generators allow for natural coding and abstraction of traversal Generators allow for natural coding and abstraction of traversal
over data. Currently if external resources needing proper timely over data. Currently if external resources needing proper timely
release are involved, generators are unfortunately not adequate. release are involved, generators are unfortunately not adequate.
The typical idiom for timely release is not supported, a yield The typical idiom for timely release is not supported, a yield
statement is not allowed in the try clause of a try-finally statement is not allowed in the try clause of a try-finally
statement inside a generator. The finally clause execution can be statement inside a generator. The finally clause execution can be
neither guaranteed nor enforced. neither guaranteed nor enforced.
This PEP proposes that the built-in generator type implement a
close method and destruction semantics, such that the restriction
on yield placement can be lifted, expanding the applicability of
generators.
This PEP proposes that the built-in generator type implement a
close method and destruction semantics, such that the restriction
on yield placement can be lifted, expanding the applicability of
generators.
Pronouncement Pronouncement
=============
Rejected in favor of PEP 342 which includes substantially all of
the requested behavior in a more refined form.
Rejected in favor of PEP 342 which includes substantially all of
the requested behavior in a more refined form.
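For readers on a current Python, the behavior that PEP 342 eventually delivered can be seen in a short sketch. The generator ``numbered`` and the ``log`` list are made up for this illustration; the only real API used is the generator ``close()`` method.

```python
def numbered(log):
    """Yield 0, 1, 2, ... and record when cleanup runs."""
    try:
        n = 0
        while True:
            yield n          # yield inside try/finally is allowed today
            n += 1
    finally:
        # Runs when the generator is closed or garbage-collected.
        log.append("released")


log = []
gen = numbered(log)
first = next(gen)            # advance to the first yield
gen.close()                  # raises GeneratorExit at the paused yield
```

After ``close()`` returns, the ``finally`` clause has already run, which is the timely-release guarantee this PEP was asking for.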
Rationale Rationale
=========
Python generators allow for natural coding of many data traversal Python generators allow for natural coding of many data traversal
scenarios. Their instantiation produces iterators, scenarios. Their instantiation produces iterators,
i.e. first-class objects abstracting traversal (with all the i.e. first-class objects abstracting traversal (with all the
advantages of first-classness). In this respect they match in advantages of first-classness). In this respect they match in
power and offer some advantages over the approach using iterator power and offer some advantages over the approach using iterator
methods taking a (smalltalkish) block. On the other hand, given methods taking a (smalltalkish) block. On the other hand, given
current limitations (no yield allowed in a try clause of a current limitations (no yield allowed in a try clause of a
try-finally inside a generator) the latter approach seems better try-finally inside a generator) the latter approach seems better
suited to encapsulating not only traversal but also exception suited to encapsulating not only traversal but also exception
handling and proper resource acquisition and release. handling and proper resource acquisition and release.
Let's consider an example (for simplicity, files in read-mode are Let's consider an example (for simplicity, files in read-mode are
used): used)::
def all_lines(index_path): def all_lines(index_path):
for path in file(index_path, "r"): for path in file(index_path, "r"):
for line in file(path.strip(), "r"): for line in file(path.strip(), "r"):
yield line yield line
this is short and to the point, but the try-finally for timely this is short and to the point, but the try-finally for timely
closing of the files cannot be added. (While instead of a path, a closing of the files cannot be added. (While instead of a path, a
file, whose closing then would be responsibility of the caller, file, whose closing then would be responsibility of the caller,
could be passed in as argument, the same is not applicable for the could be passed in as argument, the same is not applicable for the
files opened depending on the contents of the index). files opened depending on the contents of the index).
If we want timely release, we have to sacrifice the simplicity and If we want timely release, we have to sacrifice the simplicity and
directness of the generator-only approach: (e.g.) directness of the generator-only approach: (e.g.)::
class AllLines: class AllLines:
def __init__(self,index_path): def __init__(self, index_path):
self.index_path = index_path self.index_path = index_path
self.index = None self.index = None
self.document = None self.document = None
def __iter__(self): def __iter__(self):
self.index = file(self.index_path,"r") self.index = file(self.index_path, "r")
for path in self.index: for path in self.index:
self.document = file(path.strip(),"r") self.document = file(path.strip(), "r")
for line in self.document: for line in self.document:
yield line yield line
self.document.close() self.document.close()
@ -83,7 +88,7 @@ Rationale
if self.document: if self.document:
self.document.close() self.document.close()
to be used as: to be used as::
all_lines = AllLines("index.txt") all_lines = AllLines("index.txt")
try: try:
@ -92,22 +97,22 @@ Rationale
finally: finally:
all_lines.close() all_lines.close()
The more convoluted solution implementing timely release seems The more convoluted solution implementing timely release seems
to offer a precious hint. What we have done is encapsulate our to offer a precious hint. What we have done is encapsulate our
traversal in an object (iterator) with a close method. traversal in an object (iterator) with a close method.
This PEP proposes that generators should grow such a close method This PEP proposes that generators should grow such a close method
with such semantics that the example could be rewritten as: with such semantics that the example could be rewritten as::
# Today this is not valid Python: yield is not allowed between # Today this is not valid Python: yield is not allowed between
# try and finally, and generator type instances support no # try and finally, and generator type instances support no
# close method. # close method.
def all_lines(index_path): def all_lines(index_path):
index = file(index_path,"r") index = file(index_path, "r")
try: try:
for path in index: for path in index:
document = file(path.strip(),"r") document = file(path.strip(), "r")
try: try:
for line in document: for line in document:
yield line yield line
@ -123,77 +128,78 @@ Rationale
finally: finally:
all.close() # close on generator all.close() # close on generator
Currently PEP 255 [1] disallows yield inside a try clause of a Currently PEP 255 [1]_ disallows yield inside a try clause of a
try-finally statement, because the execution of the finally clause try-finally statement, because the execution of the finally clause
cannot be guaranteed as required by try-finally semantics. cannot be guaranteed as required by try-finally semantics.
The semantics of the proposed close method should be such that The semantics of the proposed close method should be such that
while the finally clause execution still cannot be guaranteed, it while the finally clause execution still cannot be guaranteed, it
can be enforced when required. Specifically, the close method can be enforced when required. Specifically, the close method
behavior should trigger the execution of the finally clauses behavior should trigger the execution of the finally clauses
inside the generator, either by forcing a return in the generator inside the generator, either by forcing a return in the generator
frame or by throwing an exception in it. In situations requiring frame or by throwing an exception in it. In situations requiring
timely resource release, close could then be explicitly invoked. timely resource release, close could then be explicitly invoked.
The semantics of generator destruction on the other hand should be The semantics of generator destruction on the other hand should be
extended in order to implement a best-effort policy for the extended in order to implement a best-effort policy for the
general case. Specifically, destruction should invoke close(). general case. Specifically, destruction should invoke ``close()``.
The best-effort limitation comes from the fact that the The best-effort limitation comes from the fact that the
destructor's execution is not guaranteed in the first place. destructor's execution is not guaranteed in the first place.
This seems to be a reasonable compromise, the resulting global This seems to be a reasonable compromise, the resulting global
behavior being similar to that of files and closing. behavior being similar to that of files and closing.
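The behavior proposed above can be tried out in later Python versions, where generators did gain a ``close()`` method and ``yield`` became legal inside ``try``/``finally`` (PEP 342, Python 2.5). The following is a minimal sketch rather than the proposal's own code; ``numbered`` and the ``log`` list are hypothetical names used only for illustration.

```python
def numbered(items, log):
    """Yield (index, item) pairs; record acquire/release in `log` (illustrative)."""
    log.append("acquire")
    try:
        for index, item in enumerate(items):
            yield index, item
    finally:
        # Runs on normal exhaustion and also when close() is called.
        log.append("release")


log = []
gen = numbered(["a", "b", "c"], log)
first = next(gen)   # the generator is now suspended inside the try block
gen.close()         # triggers the finally clause
gen.close()         # closing an already terminated generator is a no-op
```

The explicit ``close()`` call gives the timely release discussed above; without it, cleanup would be left to generator destruction, which is best-effort only.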
Possible Semantics Possible Semantics
==================
The built-in generator type should have a close method The built-in generator type should have a close method
implemented, which can then be invoked as: implemented, which can then be invoked as::
gen.close() gen.close()
where gen is an instance of the built-in generator type. where ``gen`` is an instance of the built-in generator type.
Generator destruction should also invoke close method behavior. Generator destruction should also invoke close method behavior.
If a generator is already terminated, close should be a no-op. If a generator is already terminated, close should be a no-op.
Otherwise, there are two alternative solutions, Return or Otherwise, there are two alternative solutions, Return or
Exception Semantics: Exception Semantics:
A - Return Semantics: The generator should be resumed, generator A - Return Semantics: The generator should be resumed, generator
execution should continue as if the instruction at the re-entry execution should continue as if the instruction at the re-entry
point is a return. Consequently, finally clauses surrounding the point is a return. Consequently, finally clauses surrounding the
re-entry point would be executed, in the case of a then allowed re-entry point would be executed, in the case of a then allowed
try-yield-finally pattern. try-yield-finally pattern.
Issues: is it important to be able to distinguish forced Issues: is it important to be able to distinguish forced
termination by close, normal termination, exception propagation termination by close, normal termination, exception propagation
from generator or generator-called code? In the normal case it from generator or generator-called code? In the normal case it
seems not: finally clauses should work the same in all seems not: finally clauses should work the same in all
these cases. Still, these semantics could make such a distinction these cases. Still, these semantics could make such a distinction
hard. hard.
Except clauses, as with a normal return, are not executed. Such Except clauses, as with a normal return, are not executed. Such
clauses in legacy generators expect to be executed for exceptions clauses in legacy generators expect to be executed for exceptions
raised by the generator or by code called from it, so not executing raised by the generator or by code called from it, so not executing
them in the close case seems correct. them in the close case seems correct.
B - Exception Semantics: The generator should be resumed and B - Exception Semantics: The generator should be resumed and
execution should continue as if a special-purpose exception execution should continue as if a special-purpose exception
(e.g. CloseGenerator) has been raised at the re-entry point. The close (e.g. CloseGenerator) has been raised at the re-entry point. The close
implementation should consume this exception and not propagate it implementation should consume this exception and not propagate it
further. further.
Issues: should StopIteration be reused for this purpose? Probably Issues: should ``StopIteration`` be reused for this purpose? Probably
not. We would like close to be a harmless operation for legacy not. We would like close to be a harmless operation for legacy
generators, which could contain code catching StopIteration to generators, which could contain code catching ``StopIteration`` to
deal with other generators/iterators. deal with other generators/iterators.
In general, with exception semantics, it is unclear what to do if In general, with exception semantics, it is unclear what to do if
the generator does not terminate or we do not receive the special the generator does not terminate or we do not receive the special
exception propagated back. Other different exceptions should exception propagated back. Other different exceptions should
probably be propagated, but consider this possible legacy probably be propagated, but consider this possible legacy
generator code: generator code::
try: try:
... ...
@ -202,85 +208,91 @@ Possible Semantics
except: # or except Exception:, etc except: # or except Exception:, etc
raise Exception("boom") raise Exception("boom")
If close is invoked with the generator suspended after the yield, If close is invoked with the generator suspended after the yield,
the except clause would catch our special purpose exception, so we the except clause would catch our special purpose exception, so we
would get a different exception propagated back, which in this would get a different exception propagated back, which in this
case ought to be reasonably consumed and ignored but in general case ought to be reasonably consumed and ignored but in general
should be propagated, but separating these scenarios seems hard. should be propagated, but separating these scenarios seems hard.
The exception approach has the advantage of letting the generator The exception approach has the advantage of letting the generator
distinguish between termination cases and have more control. On distinguish between termination cases and have more control. On
the other hand, clear-cut semantics seem harder to define. the other hand, clear-cut semantics seem harder to define.
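The legacy-generator concern under Exception Semantics can be reproduced in current Python, where ``close()`` raises ``GeneratorExit`` at the suspended ``yield``. The sketch below is illustrative only: ``legacy_bare``, ``legacy_narrow`` and ``close_outcome`` are hypothetical names, and ``GeneratorExit`` stands in for the "special-purpose exception" that this text leaves unnamed.

```python
def legacy_bare():
    try:
        yield 1
    except:  # a bare except also catches the exception injected by close()
        raise Exception("boom")


def legacy_narrow():
    try:
        yield 1
    except Exception:  # GeneratorExit derives from BaseException, so it passes through
        raise Exception("boom")


def close_outcome(make_gen):
    """Suspend a fresh generator after its first yield, then close it."""
    gen = make_gen()
    next(gen)
    try:
        gen.close()
    except Exception as exc:
        return "propagated: %s" % exc
    return "closed quietly"


bare_result = close_outcome(legacy_bare)
narrow_result = close_outcome(legacy_narrow)
```

With the bare ``except`` the converted exception escapes from ``close()``, which is the hard-to-separate scenario described above. With ``except Exception`` the close completes silently.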
Remarks Remarks
=======
If this proposal is accepted, it should become common practice to If this proposal is accepted, it should become common practice to
document whether a generator acquires resources, so that its close document whether a generator acquires resources, so that its close
method ought to be called. If a generator is no longer used, method ought to be called. If a generator is no longer used,
calling close should be harmless. calling close should be harmless.
On the other hand, in the typical scenario the code that On the other hand, in the typical scenario the code that
instantiated the generator should call close if required by it. instantiated the generator should call close if required by it.
Generic code dealing with iterators/generators instantiated Generic code dealing with iterators/generators instantiated
elsewhere should typically not be littered with close calls. elsewhere should typically not be littered with close calls.
The rare case of code that has acquired ownership of and needs to The rare case of code that has acquired ownership of and needs to
properly deal with all of iterators, generators and generators properly deal with all of iterators, generators and generators
acquiring resources that need timely release, is easily solved: acquiring resources that need timely release, is easily solved::
if hasattr(iterator, 'close'): if hasattr(iterator, 'close'):
iterator.close() iterator.close()
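The same check can be wrapped in a small helper for code that owns arbitrary iterators. This is only a sketch; ``close_if_possible`` is a hypothetical name, not part of this proposal or of the standard library.

```python
def close_if_possible(iterator):
    """Call iterator.close() when such a method exists; report whether it did."""
    close = getattr(iterator, "close", None)
    if callable(close):
        close()
        return True
    return False


def countdown(n):
    while n > 0:
        yield n
        n -= 1


closed_generator = close_if_possible(countdown(3))    # generators have close()
closed_list_iter = close_if_possible(iter([1, 2, 3])) # list iterators do not
```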
Open Issues Open Issues
===========
Definitive semantics ought to be chosen. Currently Guido favors Definitive semantics ought to be chosen. Currently Guido favors
Exception Semantics. If the generator yields a value instead of Exception Semantics. If the generator yields a value instead of
terminating, or propagating back the special exception, a special terminating, or propagating back the special exception, a special
exception should be raised again on the generator side. exception should be raised again on the generator side.
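For comparison, the sketch below shows how Python as eventually shipped treats a generator that yields a value instead of terminating when closed: the caller of ``close()`` receives a ``RuntimeError``. This reflects later interpreter behavior and is not necessarily the resolution intended here; ``stubborn`` is a hypothetical name.

```python
def stubborn():
    try:
        yield "first"
    except GeneratorExit:
        # Deliberately misbehave: yield again instead of terminating.
        yield "ignored close"


message = None
gen = stubborn()
next(gen)
try:
    gen.close()
except RuntimeError as exc:
    message = str(exc)
```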
It is still unclear whether spuriously converted special It is still unclear whether spuriously converted special
exceptions (as discussed in Possible Semantics) are a problem and exceptions (as discussed in Possible Semantics) are a problem and
what to do about them. what to do about them.
Implementation issues should be explored. Implementation issues should be explored.
Alternative Ideas Alternative Ideas
=================
The idea that the yield placement limitation should be removed and The idea that the yield placement limitation should be removed and
that generator destruction should trigger execution of finally that generator destruction should trigger execution of finally
clauses has been proposed more than once. Alone it cannot clauses has been proposed more than once. Alone it cannot
guarantee that timely release of resources acquired by a generator guarantee that timely release of resources acquired by a generator
can be enforced. can be enforced.
PEP 288 [2] proposes a more general solution, allowing custom PEP 288 [2]_ proposes a more general solution, allowing custom
exception passing to generators. The proposal in this PEP exception passing to generators. The proposal in this PEP
addresses more directly the problem of resource release. Were PEP addresses more directly the problem of resource release. Were PEP
288 implemented, Exception Semantics for close could be layered 288 implemented, Exception Semantics for close could be layered
on top of it; on the other hand, PEP 288 should make a separate on top of it; on the other hand, PEP 288 should make a separate
case for the more general functionality. case for the more general functionality.
References References
==========
[1] PEP 255 Simple Generators .. [1] PEP 255 Simple Generators
http://www.python.org/dev/peps/pep-0255/ http://www.python.org/dev/peps/pep-0255/
[2] PEP 288 Generators Attributes and Exceptions .. [2] PEP 288 Generators Attributes and Exceptions
http://www.python.org/dev/peps/pep-0288/ http://www.python.org/dev/peps/pep-0288/
Copyright Copyright
=========
This document has been placed in the public domain. This document has been placed in the public domain.
Local Variables: ..
mode: indented-text Local Variables:
indent-tabs-mode: nil mode: indented-text
sentence-end-double-space: t indent-tabs-mode: nil
fill-column: 70 sentence-end-double-space: t
End: fill-column: 70
End:

@ -5,62 +5,66 @@ Last-Modified: $Date$
Author: Neil Schemenauer <nas@arctrix.com>, Guido van Rossum <guido@python.org> Author: Neil Schemenauer <nas@arctrix.com>, Guido van Rossum <guido@python.org>
Status: Final Status: Final
Type: Standards Track Type: Standards Track
Content-Type: text/plain Content-Type: text/x-rst
Created: 15-Feb-2006 Created: 15-Feb-2006
Python-Version: 2.6, 3.0 Python-Version: 2.6, 3.0
Post-History: Post-History:
Update Update
======
This PEP has partially been superseded by PEP 3137. This PEP has partially been superseded by PEP 3137.
Abstract Abstract
========
This PEP outlines the introduction of a raw bytes sequence type. This PEP outlines the introduction of a raw bytes sequence type.
Adding the bytes type is one step in the transition to Adding the bytes type is one step in the transition to
Unicode-based str objects which will be introduced in Python 3.0. Unicode-based str objects which will be introduced in Python 3.0.
The PEP describes how the bytes type should work in Python 2.6, as The PEP describes how the bytes type should work in Python 2.6, as
well as how it should work in Python 3.0. (Occasionally there are well as how it should work in Python 3.0. (Occasionally there are
differences because in Python 2.6, we have two string types, str differences because in Python 2.6, we have two string types, str
and unicode, while in Python 3.0 we will only have one string and unicode, while in Python 3.0 we will only have one string
type, whose name will be str but whose semantics will be like the type, whose name will be str but whose semantics will be like the
2.6 unicode type.) 2.6 unicode type.)
Motivation Motivation
==========
Python's current string objects are overloaded. They serve to hold Python's current string objects are overloaded. They serve to hold
both sequences of characters and sequences of bytes. This both sequences of characters and sequences of bytes. This
overloading of purpose leads to confusion and bugs. In future overloading of purpose leads to confusion and bugs. In future
versions of Python, string objects will be used for holding versions of Python, string objects will be used for holding
character data. The bytes object will fulfil the role of a byte character data. The bytes object will fulfil the role of a byte
container. Eventually the unicode type will be renamed to str container. Eventually the unicode type will be renamed to str
and the old str type will be removed. and the old str type will be removed.
Specification Specification
=============
A bytes object stores a mutable sequence of integers that are in A bytes object stores a mutable sequence of integers that are in
the range 0 to 255. Unlike string objects, indexing a bytes the range 0 to 255. Unlike string objects, indexing a bytes
object returns an integer. Assigning or comparing an object that object returns an integer. Assigning or comparing an object that
is not an integer to an element causes a TypeError exception. is not an integer to an element causes a ``TypeError`` exception.
Assigning an element to a value outside the range 0 to 255 causes Assigning an element to a value outside the range 0 to 255 causes
a ValueError exception. The .__len__() method of bytes returns a ``ValueError`` exception. The ``.__len__()`` method of bytes returns
the number of integers stored in the sequence (i.e. the number of the number of integers stored in the sequence (i.e. the number of
bytes). bytes).
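The element rules above can be checked in Python 3, assuming ``bytearray`` as a stand-in: after PEP 3137 the mutable type described here is spelled ``bytearray``, while ``bytes`` became immutable. A minimal sketch:

```python
data = bytearray([10, 20, 30])

first = data[0]          # indexing returns an int, not a one-character string
data[1] = 255            # in-range assignment is allowed because the type is mutable

range_error = False
try:
    data[0] = 256        # outside the range 0 to 255
except ValueError:
    range_error = True

type_error = False
try:
    data[0] = "a"        # not an integer
except TypeError:
    type_error = True

length = len(data)       # number of stored integers, i.e. number of bytes
```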
The constructor of the bytes object has the following signature: The constructor of the bytes object has the following signature::
bytes([initializer[, encoding]]) bytes([initializer[, encoding]])
If no arguments are provided then a bytes object containing zero If no arguments are provided then a bytes object containing zero
elements is created and returned. The initializer argument can be elements is created and returned. The initializer argument can be
a string (in 2.6, either str or unicode), an iterable of integers, a string (in 2.6, either str or unicode), an iterable of integers,
or a single integer. The pseudo-code for the constructor or a single integer. The pseudo-code for the constructor
(optimized for clear semantics, not for speed) is: (optimized for clear semantics, not for speed) is::
def bytes(initializer=0, encoding=None): def bytes(initializer=0, encoding=None):
if isinstance(initializer, int): # In 2.6, int -> (int, long) if isinstance(initializer, int): # In 2.6, int -> (int, long)
@ -88,32 +92,32 @@ Specification
new[i] = c new[i] = c
return new return new
The .__repr__() method returns a string that can be evaluated to The ``.__repr__()`` method returns a string that can be evaluated to
generate a new bytes object containing a bytes literal: generate a new bytes object containing a bytes literal::
>>> bytes([10, 20, 30]) >>> bytes([10, 20, 30])
b'\n\x14\x1e' b'\n\x14\x1e'
The object has a .decode() method equivalent to the .decode() The object has a ``.decode()`` method equivalent to the ``.decode()``
method of the str object. The object has a classmethod .fromhex() method of the str object. The object has a classmethod ``.fromhex()``
that takes a string of characters from the set [0-9a-fA-F ] and that takes a string of characters from the set ``[0-9a-fA-F ]`` and
returns a bytes object (similar to binascii.unhexlify). For returns a bytes object (similar to binascii.unhexlify). For
example: example::
>>> bytes.fromhex('5c5350ff') >>> bytes.fromhex('5c5350ff')
b'\\SP\xff' b'\\SP\xff'
>>> bytes.fromhex('5c 53 50 ff') >>> bytes.fromhex('5c 53 50 ff')
b'\\SP\xff' b'\\SP\xff'
The object has a .hex() method that does the reverse conversion The object has a ``.hex()`` method that does the reverse conversion
(similar to binascii.hexlify): (similar to binascii.hexlify)::
>>> bytes([92, 83, 80, 255]).hex() >>> bytes([92, 83, 80, 255]).hex()
'5c5350ff' '5c5350ff'
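The round trip between the two hex methods, and their ``binascii`` counterparts, can be sketched as follows. The snippet assumes Python 3.5 or later, where ``bytes.hex()`` is available.

```python
import binascii

raw = bytes.fromhex("5c 53 50 ff")   # spaces between byte pairs are accepted
hex_text = raw.hex()                 # reverse conversion, returns a str

# The binascii functions do the same work, but hexlify returns bytes.
same_raw = binascii.unhexlify("5c5350ff")
same_hex = binascii.hexlify(raw)
```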
The bytes object has some methods similar to list methods, and The bytes object has some methods similar to list methods, and
others similar to str methods. Here is a complete list of others similar to str methods. Here is a complete list of
methods, with their approximate signatures: methods, with their approximate signatures::
.__add__(bytes) -> bytes .__add__(bytes) -> bytes
.__contains__(int | bytes) -> bool .__contains__(int | bytes) -> bool
@ -162,17 +166,18 @@ Specification
.rsplit(bytes) -> list[bytes] .rsplit(bytes) -> list[bytes]
.translate(bytes, [bytes]) -> bytes .translate(bytes, [bytes]) -> bytes
Note the conspicuous absence of .isupper(), .upper(), and friends. Note the conspicuous absence of ``.isupper()``, ``.upper()``, and friends.
(But see "Open Issues" below.) There is no .__hash__() because (But see "Open Issues" below.) There is no ``.__hash__()`` because
the object is mutable. There is no use case for a .sort() method. the object is mutable. There is no use case for a ``.sort()`` method.
The bytes type also supports the buffer interface, supporting The bytes type also supports the buffer interface, supporting
reading and writing binary (but not character) data. reading and writing binary (but not character) data.
Out of Scope Issues Out of Scope Issues
===================
* Python 3k will have a much different I/O subsystem. Deciding * Python 3k will have a much different I/O subsystem. Deciding
how that I/O subsystem will work and interact with the bytes how that I/O subsystem will work and interact with the bytes
object is out of the scope of this PEP. The expectation however object is out of the scope of this PEP. The expectation however
is that binary I/O will read and write bytes, while text I/O is that binary I/O will read and write bytes, while text I/O
@ -180,96 +185,100 @@ Out of Scope Issues
interface, the existing binary I/O operations in Python 2.6 will interface, the existing binary I/O operations in Python 2.6 will
support bytes objects. support bytes objects.
* It has been suggested that a special method named .__bytes__() * It has been suggested that a special method named ``.__bytes__()``
be added to the language to allow objects to be converted into be added to the language to allow objects to be converted into
byte arrays. This decision is out of scope. byte arrays. This decision is out of scope.
* A bytes literal of the form b"..." is also proposed. This is * A bytes literal of the form ``b"..."`` is also proposed. This is
the subject of PEP 3112. the subject of PEP 3112.
Open Issues Open Issues
===========
* The .decode() method is redundant since a bytes object b can * The ``.decode()`` method is redundant since a bytes object ``b`` can
also be decoded by calling unicode(b, <encoding>) (in 2.6) or also be decoded by calling ``unicode(b, <encoding>)`` (in 2.6) or
str(b, <encoding>) (in 3.0). Do we need encode/decode methods ``str(b, <encoding>)`` (in 3.0). Do we need encode/decode methods
at all? In a sense the spelling using a constructor is cleaner. at all? In a sense the spelling using a constructor is cleaner.
* Need to specify the methods still more carefully. * Need to specify the methods still more carefully.
* Pickling and marshalling support need to be specified. * Pickling and marshalling support need to be specified.
* Should all those list methods really be implemented? * Should all those list methods really be implemented?
* A case could be made for supporting .ljust(), .rjust(), * A case could be made for supporting ``.ljust()``, ``.rjust()``,
.center() with a mandatory second argument. ``.center()`` with a mandatory second argument.
* A case could be made for supporting .split() with a mandatory * A case could be made for supporting ``.split()`` with a mandatory
argument. argument.
* A case could even be made for supporting .islower(), .isupper(), * A case could even be made for supporting ``.islower()``, ``.isupper()``,
.isspace(), .isalpha(), .isalnum(), .isdigit() and the ``.isspace()``, ``.isalpha()``, ``.isalnum()``, ``.isdigit()`` and the
corresponding conversions (.lower() etc.), using the ASCII corresponding conversions (``.lower()`` etc.), using the ASCII
definitions for letters, digits and whitespace. If this is definitions for letters, digits and whitespace. If this is
accepted, the cases for .ljust(), .rjust(), .center() and accepted, the cases for ``.ljust()``, ``.rjust()``, ``.center()`` and
.split() become much stronger, and they should have default ``.split()`` become much stronger, and they should have default
arguments as well, using an ASCII space or all ASCII whitespace arguments as well, using an ASCII space or all ASCII whitespace
(for .split()). (for ``.split()``).
Frequently Asked Questions Frequently Asked Questions
==========================
Q: Why have the optional encoding argument when the encode method of Q: Why have the optional encoding argument when the encode method of
Unicode objects does the same thing? Unicode objects does the same thing?
A: In the current version of Python, the encode method returns a str A: In the current version of Python, the encode method returns a str
object and we cannot change that without breaking code. The object and we cannot change that without breaking code. The
construct bytes(s.encode(...)) is expensive because it has to construct ``bytes(s.encode(...))`` is expensive because it has to
copy the byte sequence multiple times. Also, Python generally copy the byte sequence multiple times. Also, Python generally
provides two ways of converting an object of type A into an provides two ways of converting an object of type A into an
object of type B: ask an A instance to convert itself to a B, or object of type B: ask an A instance to convert itself to a B, or
ask the type B to create a new instance from an A. Depending on ask the type B to create a new instance from an A. Depending on
what A and B are, both APIs make sense; sometimes reasons of what A and B are, both APIs make sense; sometimes reasons of
decoupling require that A can't know about B, in which case you decoupling require that A can't know about B, in which case you
have to use the latter approach; sometimes B can't know about A, have to use the latter approach; sometimes B can't know about A,
in which case you have to use the former. in which case you have to use the former.
Q: Why does bytes ignore the encoding argument if the initializer is Q: Why does bytes ignore the encoding argument if the initializer is
a str? (This only applies to 2.6.) a str? (This only applies to 2.6.)
A: There is no sane meaning that the encoding can have in that case. A: There is no sane meaning that the encoding can have in that case.
str objects *are* byte arrays and they know nothing about the str objects *are* byte arrays and they know nothing about the
encoding of character data they contain. We need to assume that encoding of character data they contain. We need to assume that
the programmer has provided a str object that already uses the the programmer has provided a str object that already uses the
desired encoding. If you need something other than a pure copy of desired encoding. If you need something other than a pure copy of
the bytes then you need to first decode the string. For example: the bytes then you need to first decode the string. For example::
bytes(s.decode(encoding1), encoding2) bytes(s.decode(encoding1), encoding2)
Q: Why not have the encoding argument default to Latin-1 (or some Q: Why not have the encoding argument default to Latin-1 (or some
other encoding that covers the entire byte range) rather than other encoding that covers the entire byte range) rather than
ASCII? ASCII?
A: The system default encoding for Python is ASCII. It seems least A: The system default encoding for Python is ASCII. It seems least
confusing to use that default. Also, in Py3k, using Latin-1 as confusing to use that default. Also, in Py3k, using Latin-1 as
the default might not be what users expect. For example, they the default might not be what users expect. For example, they
might prefer a Unicode encoding. Any default will not always might prefer a Unicode encoding. Any default will not always
work as expected. At least ASCII will complain loudly if you try work as expected. At least ASCII will complain loudly if you try
to encode non-ASCII data. to encode non-ASCII data.
Copyright Copyright
=========
This document has been placed in the public domain. This document has been placed in the public domain.
Local Variables: ..
mode: indented-text Local Variables:
indent-tabs-mode: nil mode: indented-text
sentence-end-double-space: t indent-tabs-mode: nil
fill-column: 70 sentence-end-double-space: t
coding: utf-8 fill-column: 70
End: coding: utf-8
End:

@ -5,285 +5,319 @@ Last-Modified: $Date$
Author: Neal Norwitz, Barry Warsaw Author: Neal Norwitz, Barry Warsaw
Status: Final Status: Final
Type: Informational Type: Informational
Content-Type: text/x-rst
Created: 29-June-2006 Created: 29-June-2006
Python-Version: 2.6 and 3.0 Python-Version: 2.6 and 3.0
Post-History: 17-Mar-2008 Post-History: 17-Mar-2008
Abstract Abstract
========
This document describes the development and release schedule for This document describes the development and release schedule for
Python 2.6 and 3.0. The schedule primarily concerns itself with Python 2.6 and 3.0. The schedule primarily concerns itself with
PEP-sized items. Small features may be added up to and including PEP-sized items. Small features may be added up to and including
the first beta release. Bugs may be fixed until the final the first beta release. Bugs may be fixed until the final
release. release.
There will be at least two alpha releases, two beta releases, and There will be at least two alpha releases, two beta releases, and
one release candidate. The releases are planned for October 2008. one release candidate. The releases are planned for October 2008.
Python 2.6 is not only the next advancement in the Python 2 Python 2.6 is not only the next advancement in the Python 2
series, it is also a transitional release, helping developers series, it is also a transitional release, helping developers
begin to prepare their code for Python 3.0. As such, many begin to prepare their code for Python 3.0. As such, many
features are being backported from Python 3.0 to 2.6. Thus, it features are being backported from Python 3.0 to 2.6. Thus, it
makes sense to release both versions at the same time. The makes sense to release both versions at the same time. The
precedent for this was set with the Python 1.6 and 2.0 releases. precedent for this was set with the Python 1.6 and 2.0 releases.
Until rc, we will be releasing Python 2.6 and 3.0 in lockstep, on Until rc, we will be releasing Python 2.6 and 3.0 in lockstep, on
a monthly release cycle. The releases will happen on the first a monthly release cycle. The releases will happen on the first
Wednesday of every month through the beta testing cycle. Because Wednesday of every month through the beta testing cycle. Because
Python 2.6 is ready sooner, and because we have outside deadlines Python 2.6 is ready sooner, and because we have outside deadlines
we'd like to meet, we've decided to split the rc releases. Thus we'd like to meet, we've decided to split the rc releases. Thus
Python 2.6 final is currently planned to come out two weeks before Python 2.6 final is currently planned to come out two weeks before
Python 3.0 final. Python 3.0 final.
Release Manager and Crew Release Manager and Crew
========================
2.6/3.0 Release Manager: Barry Warsaw - 2.6/3.0 Release Manager: Barry Warsaw
Windows installers: Martin v. Loewis - Windows installers: Martin v. Loewis
Mac installers: Ronald Oussoren - Mac installers: Ronald Oussoren
Documentation: Georg Brandl - Documentation: Georg Brandl
RPMs: Sean Reifschneider - RPMs: Sean Reifschneider

Release Lifespan
================

Python 3.0 is no longer being maintained for any purpose.

Python 2.6.9 is the final security-only source-only maintenance
release of the Python 2.6 series. With its release on October 29,
2013, all official support for Python 2.6 has ended. Python 2.6
is no longer being maintained for any purpose.

Release Schedule
================

- Feb 29 2008: Python 2.6a1 and 3.0a3 are released
- Apr 02 2008: Python 2.6a2 and 3.0a4 are released
- May 08 2008: Python 2.6a3 and 3.0a5 are released
- Jun 18 2008: Python 2.6b1 and 3.0b1 are released
- Jul 17 2008: Python 2.6b2 and 3.0b2 are released
- Aug 20 2008: Python 2.6b3 and 3.0b3 are released
- Sep 12 2008: Python 2.6rc1 is released
- Sep 17 2008: Python 2.6rc2 and 3.0rc1 released
- Oct 01 2008: Python 2.6 final released
- Nov 06 2008: Python 3.0rc2 released
- Nov 21 2008: Python 3.0rc3 released
- Dec 03 2008: Python 3.0 final released
- Dec 04 2008: Python 2.6.1 final released
- Apr 14 2009: Python 2.6.2 final released
- Oct 02 2009: Python 2.6.3 final released
- Oct 25 2009: Python 2.6.4 final released
- Mar 19 2010: Python 2.6.5 final released
- Aug 24 2010: Python 2.6.6 final released
- Jun 03 2011: Python 2.6.7 final released (security-only)
- Apr 10 2012: Python 2.6.8 final released (security-only)
- Oct 29 2013: Python 2.6.9 final released (security-only)

Completed features for 3.0
==========================

See PEP 3000 [pep3000]_ and PEP 3100 [pep3100]_ for details on the
Python 3.0 project.

Completed features for 2.6
==========================

PEPs:

- 352: Raising a string exception now triggers a TypeError.
  Attempting to catch a string exception raises DeprecationWarning.
  BaseException.message has been deprecated. [pep352]_
- 358: The "bytes" Object [pep358]_
- 366: Main module explicit relative imports [pep366]_
- 370: Per user site-packages directory [pep370]_
- 3112: Bytes literals in Python 3000 [pep3112]_
- 3127: Integer Literal Support and Syntax [pep3127]_
- 371: Addition of the multiprocessing package [pep371]_
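
As a rough illustration of three of the PEPs listed above (352, 3112
and 3127), the following sketch shows the bytes literal prefix, the new
octal and binary integer literals, and the TypeError triggered by
raising a string. The names ``payload``, ``octal_value``,
``binary_value`` and ``raise_string`` are made up for this example and
do not come from the PEPs.

```python
# Illustrative sketch only; all names here are hypothetical.

# PEP 3112: the b"" prefix marks a bytes literal.
payload = b"spam"

# PEP 3127: explicit octal (0o) and binary (0b) integer literals.
octal_value = 0o17    # 15 in decimal
binary_value = 0b101  # 5 in decimal


def raise_string():
    """Try to raise a string and report the resulting exception type."""
    try:
        raise "this is not an exception instance"
    except TypeError as exc:
        # PEP 352: string exceptions are rejected with a TypeError.
        return type(exc).__name__


print(octal_value, binary_value, raise_string())
```

The same source is accepted by Python 2.6 and by Python 3, which is the
point of backporting these syntax forms to the transitional release.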

New modules in the standard library:

- json
- new enhanced turtle module
- ast

Deprecated modules and functions in the standard library:

- buildtools
- cfmfile
- commands.getstatus()
- macostools.touched()
- md5
- MimeWriter
- mimify
- popen2, os.popen[234]()
- posixfile
- sets
- sha

Modules removed from the standard library:

- gopherlib
- rgbimg
- macfs

Warnings for features removed in Py3k:

- builtins: apply, callable, coerce, dict.has_key, execfile,
  reduce, reload
- backticks and <>
- float args to xrange
- coerce and all its friends
- comparing by default comparison
- {}.has_key()
- file.xreadlines
- softspace removal for print() function
- removal of modules because of PEP 4/3100/3108

Other major features:

- with/as will be keywords
- a __dir__() special method to control dir() was added [1]
- AtheOS support stopped.
- warnings module implemented in C
- compile() takes an AST and can convert to byte code
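
Two of the features listed above can be sketched briefly: a
``__dir__()`` method that controls what ``dir()`` reports, and
``compile()`` accepting an AST produced by the new ``ast`` module. This
is a minimal example under assumed names (``Record``, ``namespace`` and
the ``"<ast>"`` filename are hypothetical), not code taken from the
release itself.

```python
import ast


class Record(object):
    """Hypothetical class that customizes what dir() reports."""

    def __dir__(self):
        # dir() sorts whatever names this method returns.
        return ["value", "name"]


# Parse source text into an AST, then hand the tree to compile().
tree = ast.parse("result = 6 * 7", mode="exec")
code = compile(tree, "<ast>", "exec")

# Run the resulting code object in a fresh namespace.
namespace = {}
exec(code, namespace)

print(dir(Record()), namespace["result"])
```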

Possible features for 2.6
=========================

New features *should* be implemented prior to alpha2, particularly
any C modifications or behavioral changes. New features *must* be
implemented prior to beta1 or will require Release Manager approval.

The following PEPs are being worked on for inclusion in 2.6: None.

Each non-trivial feature listed here that is not a PEP must be
discussed on python-dev. Other enhancements include:

- distutils replacement (requires a PEP)

New modules in the standard library:

- winerror
  http://python.org/sf/1505257
  (Patch rejected, module should be written in C)

- setuptools
  BDFL pronouncement for inclusion in 2.5:
  http://mail.python.org/pipermail/python-dev/2006-April/063964.html

  PJE's withdrawal from 2.5 for inclusion in 2.6:
  http://mail.python.org/pipermail/python-dev/2006-April/064145.html

Modules to gain a DeprecationWarning (as specified for Python 2.6
or through negligence):

- rfc822
- mimetools
- multifile
- compiler package (or a Py3K warning instead?)

- Convert Parser/\*.c to use the C warnings module rather than printf

- Add warnings for Py3k features removed:

  * __getslice__/__setslice__/__delslice__
  * float args to PyArg_ParseTuple
  * __cmp__?
  * other comparison changes?
  * int division?
  * All PendingDeprecationWarnings (e.g. exceptions)
  * using zip() result as a list
  * the exec statement (use function syntax)
  * function attributes that start with func_* (should use __*__)
  * the L suffix for long literals
  * renaming of __nonzero__ to __bool__
  * multiple inheritance with classic classes? (MRO might change)
  * properties and classic classes? (instance attrs shadow property)

- use __bool__ method if available and there's no __nonzero__

- Check the various bits of code in Demo/ and Tools/ all still work,
  update or remove the ones that don't.

- All modules in Modules/ should be updated to be ssize_t clean.

- All of Python (including Modules/) should compile cleanly with g++

- Start removing deprecated features and generally moving towards Py3k

- Replace all old style tests (operate on import) with unittest or doctest

- Add tests for all untested modules

- Document undocumented modules/features

- bdist_deb in distutils package
  http://mail.python.org/pipermail/python-dev/2006-February/060926.html

- bdist_egg in distutils package

- pure python pgen module
  (Owner: Guido)
  Deferral to 2.6:
  http://mail.python.org/pipermail/python-dev/2006-April/064528.html

- Remove the fpectl module?

Deferred until 2.7
==================

None

Open issues
===========

How should import warnings be handled?

- http://mail.python.org/pipermail/python-dev/2006-June/066345.html
- http://python.org/sf/1515609
- http://python.org/sf/1515361

References
==========

.. [1] Adding a __dir__() magic method
   http://mail.python.org/pipermail/python-dev/2006-July/067139.html

.. [pep352] PEP 352 (Required Superclass for Exceptions)
   http://www.python.org/dev/peps/pep-0352

.. [pep358] PEP 358 (The "bytes" Object)
   http://www.python.org/dev/peps/pep-0358

.. [pep366] PEP 366 (Main module explicit relative imports)
   http://www.python.org/dev/peps/pep-0366

.. [pep367] PEP 367 (New Super)
   http://www.python.org/dev/peps/pep-0367

.. [pep370] PEP 370 (Per user site-packages directory)
   http://www.python.org/dev/peps/pep-0370

.. [pep371] PEP 371 (Addition of the multiprocessing package)
   http://www.python.org/dev/peps/pep-0371

.. [pep3000] PEP 3000 (Python 3000)
   http://www.python.org/dev/peps/pep-3000

.. [pep3100] PEP 3100 (Miscellaneous Python 3.0 Plans)
   http://www.python.org/dev/peps/pep-3100

.. [pep3112] PEP 3112 (Bytes literals in Python 3000)
   http://www.python.org/dev/peps/pep-3112

.. [pep3127] PEP 3127 (Integer Literal Support and Syntax)
   http://www.python.org/dev/peps/pep-3127

.. _Google calendar: http://www.google.com/calendar/ical/b6v58qvojllt0i6ql654r1vh00%40group.calendar.google.com/public/basic.ics

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: