Convert PEPs 261, 267, 325, 358, 361 (#204)
* Convert PEPs 261, 267, 325, 358, 361
* Fixes to PEP 261 and PEP 361
This commit is contained in: parent c5881cf2b5, commit 9c9560962a
pep-0261.txt | 391

@@ -5,280 +5,287 @@ Last-Modified: $Date$
Author: Paul Prescod <paul@prescod.net>
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 27-Jun-2001
Python-Version: 2.2
Post-History: 27-Jun-2001

Abstract
========

Python 2.1 unicode characters can have ordinals only up to 2**16 - 1.
This range corresponds to a range in Unicode known as the Basic
Multilingual Plane. There are now characters in Unicode that live
on other "planes". The largest addressable character in Unicode
has the ordinal 17 * 2**16 - 1 (0x10ffff). For readability, we
will call this TOPCHAR and call characters in this range "wide
characters".

Glossary
========

Character
   Used by itself, means the addressable units of a Python
   Unicode string.

Code point
   A code point is an integer between 0 and TOPCHAR.
   If you imagine Unicode as a mapping from integers to
   characters, each integer is a code point. But the
   integers between 0 and TOPCHAR that do not map to
   characters are also code points. Some will someday
   be used for characters. Some are guaranteed never
   to be used for characters.

Codec
   A set of functions for translating between physical
   encodings (e.g. on disk or coming in from a network) and
   logical Python objects.

Encoding
   Mechanism for representing abstract characters in terms of
   physical bits and bytes. Encodings allow us to store
   Unicode characters on disk and transmit them over networks
   in a manner that is compatible with other Unicode software.

Surrogate pair
   Two physical characters that represent a single logical
   character. Part of a convention for representing 32-bit
   code points in terms of two 16-bit code points.

Unicode string
   A Python type representing a sequence of code points with
   "string semantics" (e.g. case conversions, regular
   expression compatibility, etc.) Constructed with the
   ``unicode()`` function.

Proposed Solution
=================

One solution would be to merely increase the maximum ordinal
to a larger value. Unfortunately the only straightforward
implementation of this idea is to use 4 bytes per character.
This has the effect of doubling the size of most Unicode
strings. In order to avoid imposing this cost on every
user, Python 2.2 will allow the 4-byte implementation as a
build-time option. Users can choose whether they care about
wide characters or prefer to preserve memory.

The 4-byte option is called ``wide Py_UNICODE``. The 2-byte option
is called ``narrow Py_UNICODE``.

Most things will behave identically in the wide and narrow worlds;
a short illustrative sketch follows the list below.

* unichr(i) for 0 <= i < 2**16 (0x10000) always returns a
  length-one string.

* unichr(i) for 2**16 <= i <= TOPCHAR will return a
  length-one string on wide Python builds. On narrow builds it will
  raise ``ValueError``.

  ISSUE

     Python currently allows ``\U`` literals that cannot be
     represented as a single Python character. It generates two
     Python characters known as a "surrogate pair". Should this
     be disallowed on future narrow Python builds?

  Pro:

     Python already allows the construction of a surrogate pair
     for a large unicode literal character escape sequence.
     This is basically designed as a simple way to construct
     "wide characters" even in a narrow Python build. It is also
     somewhat logical considering that the Unicode-literal syntax
     is basically a short-form way of invoking the unicode-escape
     codec.

  Con:

     Surrogates could be easily created this way but the user
     still needs to be careful about slicing, indexing, printing
     etc. Therefore, some have suggested that Unicode
     literals should not support surrogates.

  ISSUE

     Should Python allow the construction of characters that do
     not correspond to Unicode code points? Unassigned Unicode
     code points should obviously be legal (because they could
     be assigned at any time). But code points above TOPCHAR are
     guaranteed never to be used by Unicode. Should we allow access
     to them anyhow?

  Pro:

     If a Python user thinks they know what they're doing why
     should we try to prevent them from violating the Unicode
     spec? After all, we don't stop 8-bit strings from
     containing non-ASCII characters.

  Con:

     Codecs and other Unicode-consuming code will have to be
     careful of these characters which are disallowed by the
     Unicode specification.

* ``ord()`` is always the inverse of ``unichr()``

* There is an integer value in the sys module that describes the
  largest ordinal for a character in a Unicode string on the current
  interpreter. ``sys.maxunicode`` is 2**16-1 (0xffff) on narrow builds
  of Python and TOPCHAR on wide builds.

  ISSUE:

     Should there be distinct constants for accessing
     TOPCHAR and the real upper bound for the domain of
     unichr (if they differ)? There has also been a
     suggestion of sys.unicodewidth which can take the
     values 'wide' and 'narrow'.

* every Python Unicode character represents exactly one Unicode code
  point (i.e. Python Unicode Character = Abstract Unicode character).

* codecs will be upgraded to support "wide characters"
  (represented directly in UCS-4, and as variable-length sequences
  in UTF-8 and UTF-16). This is the main part of the implementation
  left to be done.

* There is a convention in the Unicode world for encoding a 32-bit
  code point in terms of two 16-bit code points. These are known
  as "surrogate pairs". Python's codecs will adopt this convention
  and encode 32-bit code points as surrogate pairs on narrow Python
  builds.

  ISSUE

     Should there be a way to tell codecs not to generate
     surrogates and instead treat wide characters as
     errors?

  Pro:

     I might want to write code that works only with
     fixed-width characters and does not have to worry about
     surrogates.

  Con:

     No clear proposal of how to communicate this to codecs.

* there are no restrictions on constructing strings that use
  code points "reserved for surrogates" improperly. These are
  called "isolated surrogates". The codecs should disallow reading
  these from files, but you could construct them using string
  literals or ``unichr()``.

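A short illustrative sketch (not part of the PEP itself) of the
behaviour described in the bullets above, assuming a Python 2.2
interpreter; the variable names are invented for the example::

    import sys

    # sys.maxunicode reveals the build: 0xffff on narrow builds,
    # 0x10ffff (TOPCHAR) on wide builds.
    if sys.maxunicode == 0xffff:
        print "narrow Py_UNICODE build"
    else:
        print "wide Py_UNICODE build"

    # Below 2**16, unichr() behaves the same on both builds.
    assert len(unichr(0x263A)) == 1

    # At or above 2**16 the builds differ: a length-one string on a
    # wide build, ValueError on a narrow build.
    try:
        assert len(unichr(0x10000)) == 1   # wide build
    except ValueError:
        pass                               # narrow build

    # A \U literal beyond the BMP is a single character on a wide
    # build and a surrogate pair (two characters) on a narrow build.
    assert len(u"\U00010000") in (1, 2)

    # ord() is the inverse of unichr().
    assert ord(unichr(0x41)) == 0x41
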
Implementation
==============

There is a new define::

    #define Py_UNICODE_SIZE 2

To test whether UCS2 or UCS4 is in use, the derived macro
``Py_UNICODE_WIDE`` should be used, which is defined when UCS-4 is in
use.

There is a new configure option:

=====================  ==========================================
--enable-unicode=ucs2  configures a narrow Py_UNICODE, and uses
                       wchar_t if it fits
--enable-unicode=ucs4  configures a wide Py_UNICODE, and uses
                       wchar_t if it fits
--enable-unicode       same as "=ucs2"
--disable-unicode      entirely remove the Unicode functionality.
=====================  ==========================================

It is also proposed that one day ``--enable-unicode`` will just
default to the width of your platform's ``wchar_t``.

Windows builds will be narrow for a while based on the fact that
there have been few requests for wide characters, those requests
are mostly from hard-core programmers with the ability to buy
their own Python, and Windows itself is strongly biased towards
16-bit characters.

Notes
=====

This PEP does NOT imply that people using Unicode need to use a
4-byte encoding for their files on disk or sent over the network.
It only allows them to do so. For example, ASCII is still a
legitimate (7-bit) Unicode-encoding.

It has been proposed that there should be a module that handles
surrogates in narrow Python builds for programmers. If someone
wants to implement that, it will be another PEP. It might also be
combined with features that allow other kinds of character-,
word- and line- based indexing.

Rejected Suggestions
====================

More or less the status-quo

   We could officially say that Python characters are 16-bit and
   require programmers to implement wide characters in their
   application logic by combining surrogate pairs. This is a heavy
   burden because emulating 32-bit characters is likely to be
   very inefficient if it is coded entirely in Python. Plus these
   abstracted pseudo-strings would not be legal as input to the
   regular expression engine.

"Space-efficient Unicode" type

   Another class of solution is to use some efficient storage
   internally but present an abstraction of wide characters to
   the programmer. Any of these would require a much more complex
   implementation than the accepted solution. For instance consider
   the impact on the regular expression engine. In theory, we could
   move to this implementation in the future without breaking Python
   code. A future Python could "emulate" wide Python semantics on
   narrow Python. Guido is not willing to undertake the
   implementation right now.

Two types

   We could introduce a 32-bit Unicode type alongside the 16-bit
   type. There is a lot of code that expects there to be only a
   single Unicode type.

This PEP represents the least-effort solution. Over the next
several years, 32-bit Unicode characters will become more common
and that may either convince us that we need a more sophisticated
solution or (on the other hand) convince us that simply
mandating wide Unicode characters is an appropriate solution.
Right now the two options on the table are do nothing or do
this.

References
==========

Unicode Glossary: http://www.unicode.org/glossary/


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   End:

pep-0267.txt | 426

@@ -5,280 +5,292 @@ Last-Modified: $Date$
Author: jeremy@alum.mit.edu (Jeremy Hylton)
Status: Deferred
Type: Standards Track
Content-Type: text/x-rst
Created: 23-May-2001
Python-Version: 2.2
Post-History:

Deferral
========

While this PEP is a nice idea, no-one has yet emerged to do the work of
hashing out the differences between this PEP, PEP 266 and PEP 280.
Hence, it is being deferred.


Abstract
========

This PEP proposes a new implementation of global module namespaces
and the builtin namespace that speeds name resolution. The
implementation would use an array of object pointers for most
operations in these namespaces. The compiler would assign indices
for global variables and module attributes at compile time.

The current implementation represents these namespaces as
dictionaries. A global name incurs a dictionary lookup each time
it is used; a builtin name incurs two dictionary lookups, a failed
lookup in the global namespace and a second lookup in the builtin
namespace.

This implementation should speed Python code that uses
module-level functions and variables. It should also eliminate
awkward coding styles that have evolved to speed access to these
names.

The implementation is complicated because the global and builtin
namespaces can be modified dynamically in ways that are impossible
for the compiler to detect. (Example: A module's namespace is
modified by a script after the module is imported.) As a result,
the implementation must maintain several auxiliary data structures
to preserve these dynamic features.

Introduction
============

This PEP proposes a new implementation of attribute access for
module objects that optimizes access to module variables known at
compile time. The module will store these variables in an array
and provide an interface to look up attributes using array offsets.
For globals, builtins, and attributes of imported modules, the
compiler will generate code that uses the array offsets for fast
access.

[describe the key parts of the design: dlict, compiler support,
stupid name trick workarounds, optimization of other module's
globals]

The implementation will preserve existing semantics for module
namespaces, including the ability to modify module namespaces at
runtime in ways that affect the visibility of builtin names.

DLict design
============

The namespaces are implemented using a data structure that has
sometimes gone under the name ``dlict``. It is a dictionary that has
numbered slots for some dictionary entries. The type must be
implemented in C to achieve acceptable performance. The new
type-class unification work should make this fairly easy. The
``DLict`` will presumably be a subclass of dictionary with an
alternate storage module for some keys.

A Python implementation is included here to illustrate the basic
design::

"""A dictionary-list hybrid"""
|
||||
"""A dictionary-list hybrid"""
|
||||
|
||||
import types
|
||||
import types
|
||||
|
||||
class DLict:
|
||||
def __init__(self, names):
|
||||
assert isinstance(names, types.DictType)
|
||||
self.names = {}
|
||||
self.list = [None] * size
|
||||
self.empty = [1] * size
|
||||
self.dict = {}
|
||||
self.size = 0
|
||||
class DLict:
|
||||
def __init__(self, names):
|
||||
assert isinstance(names, types.DictType)
|
||||
self.names = {}
|
||||
self.list = [None] * size
|
||||
self.empty = [1] * size
|
||||
self.dict = {}
|
||||
self.size = 0
|
||||
|
||||
def __getitem__(self, name):
|
||||
i = self.names.get(name)
|
||||
if i is None:
|
||||
return self.dict[name]
|
||||
def __getitem__(self, name):
|
||||
i = self.names.get(name)
|
||||
if i is None:
|
||||
return self.dict[name]
|
||||
if self.empty[i] is not None:
|
||||
raise KeyError, name
|
||||
return self.list[i]
|
||||
|
||||
def __setitem__(self, name, val):
|
||||
i = self.names.get(name)
|
||||
if i is None:
|
||||
self.dict[name] = val
|
||||
else:
|
||||
self.empty[i] = None
|
||||
self.list[i] = val
|
||||
self.size += 1
|
||||
|
||||
def __delitem__(self, name):
|
||||
i = self.names.get(name)
|
||||
if i is None:
|
||||
del self.dict[name]
|
||||
else:
|
||||
if self.empty[i] is not None:
|
||||
raise KeyError, name
|
||||
return self.list[i]
|
||||
self.empty[i] = 1
|
||||
self.list[i] = None
|
||||
self.size -= 1
|
||||
|
||||
def __setitem__(self, name, val):
|
||||
i = self.names.get(name)
|
||||
if i is None:
|
||||
self.dict[name] = val
|
||||
else:
|
||||
self.empty[i] = None
|
||||
self.list[i] = val
|
||||
self.size += 1
|
||||
def keys(self):
|
||||
if self.dict:
|
||||
return self.names.keys() + self.dict.keys()
|
||||
else:
|
||||
return self.names.keys()
|
||||
|
||||
def __delitem__(self, name):
|
||||
i = self.names.get(name)
|
||||
if i is None:
|
||||
del self.dict[name]
|
||||
else:
|
||||
if self.empty[i] is not None:
|
||||
raise KeyError, name
|
||||
def values(self):
|
||||
if self.dict:
|
||||
return self.names.values() + self.dict.values()
|
||||
else:
|
||||
return self.names.values()
|
||||
|
||||
def items(self):
|
||||
if self.dict:
|
||||
return self.names.items()
|
||||
else:
|
||||
return self.names.items() + self.dict.items()
|
||||
|
||||
def __len__(self):
|
||||
return self.size + len(self.dict)
|
||||
|
||||
def __cmp__(self, dlict):
|
||||
c = cmp(self.names, dlict.names)
|
||||
if c != 0:
|
||||
return c
|
||||
c = cmp(self.size, dlict.size)
|
||||
if c != 0:
|
||||
return c
|
||||
for i in range(len(self.names)):
|
||||
c = cmp(self.empty[i], dlict.empty[i])
|
||||
if c != 0:
|
||||
return c
|
||||
if self.empty[i] is None:
|
||||
c = cmp(self.list[i], dlict.empty[i])
|
||||
if c != 0:
|
||||
return c
|
||||
return cmp(self.dict, dlict.dict)
|
||||
|
||||
def clear(self):
|
||||
self.dict.clear()
|
||||
for i in range(len(self.names)):
|
||||
if self.empty[i] is None:
|
||||
self.empty[i] = 1
|
||||
self.list[i] = None
|
||||
self.size -= 1
|
||||
|
||||
def keys(self):
|
||||
if self.dict:
|
||||
return self.names.keys() + self.dict.keys()
|
||||
else:
|
||||
return self.names.keys()
|
||||
def update(self):
|
||||
pass
|
||||
|
||||
def values(self):
|
||||
if self.dict:
|
||||
return self.names.values() + self.dict.values()
|
||||
else:
|
||||
return self.names.values()
|
||||
def load(self, index):
|
||||
"""dlict-special method to support indexed access"""
|
||||
if self.empty[index] is None:
|
||||
return self.list[index]
|
||||
else:
|
||||
raise KeyError, index # XXX might want reverse mapping
|
||||
|
||||
def items(self):
|
||||
if self.dict:
|
||||
return self.names.items()
|
||||
else:
|
||||
return self.names.items() + self.dict.items()
|
||||
def store(self, index, val):
|
||||
"""dlict-special method to support indexed access"""
|
||||
self.empty[index] = None
|
||||
self.list[index] = val
|
||||
|
||||
def __len__(self):
|
||||
return self.size + len(self.dict)
|
||||
|
||||
def __cmp__(self, dlict):
|
||||
c = cmp(self.names, dlict.names)
|
||||
if c != 0:
|
||||
return c
|
||||
c = cmp(self.size, dlict.size)
|
||||
if c != 0:
|
||||
return c
|
||||
for i in range(len(self.names)):
|
||||
c = cmp(self.empty[i], dlict.empty[i])
|
||||
if c != 0:
|
||||
return c
|
||||
if self.empty[i] is None:
|
||||
c = cmp(self.list[i], dlict.empty[i])
|
||||
if c != 0:
|
||||
return c
|
||||
return cmp(self.dict, dlict.dict)
|
||||
|
||||
def clear(self):
|
||||
self.dict.clear()
|
||||
for i in range(len(self.names)):
|
||||
if self.empty[i] is None:
|
||||
self.empty[i] = 1
|
||||
self.list[i] = None
|
||||
|
||||
def update(self):
|
||||
pass
|
||||
|
||||
def load(self, index):
|
||||
"""dlict-special method to support indexed access"""
|
||||
if self.empty[index] is None:
|
||||
return self.list[index]
|
||||
else:
|
||||
raise KeyError, index # XXX might want reverse mapping
|
||||
|
||||
def store(self, index, val):
|
||||
"""dlict-special method to support indexed access"""
|
||||
self.empty[index] = None
|
||||
self.list[index] = val
|
||||
|
||||
def delete(self, index):
|
||||
"""dlict-special method to support indexed access"""
|
||||
self.empty[index] = 1
|
||||
self.list[index] = None
|
||||
def delete(self, index):
|
||||
"""dlict-special method to support indexed access"""
|
||||
self.empty[index] = 1
|
||||
self.list[index] = None
|
||||
|
||||
|
||||
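A brief usage sketch (not part of the original PEP) of the ``DLict``
above, assuming the constructor receives the compile-time mapping of
names to slot indices; the names and values are invented::

    # Hypothetical compile-time assignment of global names to slots.
    names = {"x": 0, "y": 1}
    d = DLict(names)

    d["x"] = 42           # known name: stored in indexed slot 0
    d["temp"] = "spam"    # unknown name: falls back to the backup dict

    print d.load(0)       # fast indexed access, as generated code would use
    d.store(1, 3.14)      # indexed store, equivalent to binding "y"
    print d["y"]          # by-name lookup sees the slot value: 3.14
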
Compiler issues
===============

The compiler currently collects the names of all global variables
in a module. These are names bound at the module level or bound
in a class or function body that declares them to be global.

The compiler would assign indices for each global name and add the
names and indices of the globals to the module's code object.
Each code object would then be bound irrevocably to the module it
was defined in. (Not sure if there are some subtle problems with
this.)

For attributes of imported modules, the module will store an
indirection record. Internally, the module will store a pointer
to the defining module and the offset of the attribute in the
defining module's global variable array. The offset would be
initialized the first time the name is looked up.

Runtime model
=============

The PythonVM will be extended with new opcodes to access globals
and module attributes via a module-level array.

A function object would need to point to the module that defined
it in order to provide access to the module-level global array.

For module attributes stored in the ``dlict`` (call them static
attributes), the get/delattr implementation would need to track
access to these attributes using the old by-name interface. If a
static attribute is updated dynamically, e.g.::

    mod.__dict__["foo"] = 2

The implementation would need to update the array slot instead of
the backup dict.

Backwards compatibility
=======================

The ``dlict`` will need to maintain meta-information about whether a
slot is currently used or not. It will also need to maintain a
pointer to the builtin namespace. When a name is not currently
used in the global namespace, the lookup will have to fail over to
the builtin namespace.

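A minimal sketch (not part of the PEP) of such a failover lookup,
assuming the ``DLict`` from the earlier section and the builtin
namespace dictionary; the function name is invented::

    import __builtin__

    def lookup_global(dlict, name):
        # Try the module's dlict first (indexed slot or backup dict),
        # then fail over to the builtin namespace.
        try:
            return dlict[name]
        except KeyError:
            return __builtin__.__dict__[name]
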
In the reverse case, each module may need a special accessor
function for the builtin namespace that checks to see if a global
shadowing the builtin has been added dynamically. This check
would only occur if there was a dynamic change to the module's
``dlict``, i.e. when a name is bound that wasn't discovered at
compile-time.

These mechanisms would have little if any cost for the common case
where a module's global namespace is not modified in strange
ways at runtime. They would add overhead for modules that did
unusual things with global names, but this is an uncommon practice
and probably one worth discouraging.

It may be desirable to disable dynamic additions to the global
namespace in some future version of Python. If so, the new
implementation could provide warnings.

Related PEPs
============

PEP 266, Optimizing Global Variable/Attribute Access, proposes a
different mechanism for optimizing access to global variables as
well as attributes of objects. The mechanism uses two new opcodes
``TRACK_OBJECT`` and ``UNTRACK_OBJECT`` to create a slot in the local
variables array that aliases the global or object attribute. If
the object being aliased is rebound, the rebind operation is
responsible for updating the aliases.

The object tracking approach applies to a wider range of
objects than just modules. It may also have a higher runtime cost,
because each function that uses a global or object attribute must
execute extra opcodes to register its interest in an object and
unregister on exit; the cost of registration is unclear, but
presumably involves a dynamically resizable data structure to hold
a list of callbacks.

The implementation proposed here avoids the need for registration,
because it does not create aliases. Instead it allows functions
that reference a global variable or module attribute to retain a
pointer to the location where the original binding is stored. A
second advantage is that the initial lookup is performed once per
module rather than once per function call.

Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   End:

pep-0325.txt | 414

@@ -5,282 +5,294 @@ Last-Modified: $Date$
Author: Samuele Pedroni <pedronis@python.org>
Status: Rejected
Type: Standards Track
Content-Type: text/x-rst
Created: 25-Aug-2003
Python-Version: 2.4
Post-History:

Abstract
========

Generators allow for natural coding and abstraction of traversal
over data. Currently if external resources needing proper timely
release are involved, generators are unfortunately not adequate.
The typical idiom for timely release is not supported: a yield
statement is not allowed in the try clause of a try-finally
statement inside a generator. The finally clause execution can be
neither guaranteed nor enforced.

This PEP proposes that the built-in generator type implement a
close method and destruction semantics, such that the restriction
on yield placement can be lifted, expanding the applicability of
generators.


Pronouncement
=============

Rejected in favor of PEP 342 which includes substantially all of
the requested behavior in a more refined form.

Rationale
=========

Python generators allow for natural coding of many data traversal
scenarios. Their instantiation produces iterators,
i.e. first-class objects abstracting traversal (with all the
advantages of first-classness). In this respect they match in
power and offer some advantages over the approach using iterator
methods taking a (smalltalkish) block. On the other hand, given
current limitations (no yield allowed in a try clause of a
try-finally inside a generator) the latter approach seems better
suited to encapsulating not only traversal but also exception
handling and proper resource acquisition and release.

Let's consider an example (for simplicity, files in read-mode are
used)::

    def all_lines(index_path):
        for path in file(index_path, "r"):
            for line in file(path.strip(), "r"):
                yield line

this is short and to the point, but the try-finally for timely
closing of the files cannot be added. (While instead of a path, a
file, whose closing then would be responsibility of the caller,
could be passed in as argument, the same is not applicable for the
files opened depending on the contents of the index).

If we want timely release, we have to sacrifice the simplicity and
directness of the generator-only approach: (e.g.)::

    class AllLines:

        def __init__(self, index_path):
            self.index_path = index_path
            self.index = None
            self.document = None

        def __iter__(self):
            self.index = file(self.index_path, "r")
            for path in self.index:
                self.document = file(path.strip(), "r")
                for line in self.document:
                    yield line
                self.document.close()
                self.document = None

        def close(self):
            if self.index:
                self.index.close()
            if self.document:
                self.document.close()

to be used as::

    all_lines = AllLines("index.txt")
    try:
        for line in all_lines:
            ...
    finally:
        all_lines.close()

The more convoluted solution implementing timely release seems
to offer a precious hint. What we have done is encapsulate our
traversal in an object (iterator) with a close method.

This PEP proposes that generators should grow such a close method
with such semantics that the example could be rewritten as::

    # Today this is not valid Python: yield is not allowed between
    # try and finally, and generator type instances support no
    # close method.

    def all_lines(index_path):
        index = file(index_path, "r")
        try:
            for path in index:
                document = file(path.strip(), "r")
                try:
                    for line in document:
                        yield line
                finally:
                    document.close()
        finally:
            index.close()

    all = all_lines("index.txt")
    try:
        for line in all:
            ...
    finally:
        all.close() # close on generator

Currently PEP 255 [1]_ disallows yield inside a try clause of a
try-finally statement, because the execution of the finally clause
cannot be guaranteed as required by try-finally semantics.

The semantics of the proposed close method should be such that
while the finally clause execution still cannot be guaranteed, it
can be enforced when required. Specifically, the close method
behavior should trigger the execution of the finally clauses
inside the generator, either by forcing a return in the generator
frame or by throwing an exception in it. In situations requiring
timely resource release, close could then be explicitly invoked.

The semantics of generator destruction on the other hand should be
extended in order to implement a best-effort policy for the
general case. Specifically, destruction should invoke ``close()``.
The best-effort limitation comes from the fact that the
destructor's execution is not guaranteed in the first place.

This seems to be a reasonable compromise, the resulting global
behavior being similar to that of files and closing.

Possible Semantics
==================

The built-in generator type should have a close method
implemented, which can then be invoked as::

    gen.close()

where ``gen`` is an instance of the built-in generator type.
Generator destruction should also invoke close method behavior.

If a generator is already terminated, close should be a no-op.

Otherwise, there are two alternative solutions, Return or
Exception Semantics:

A - Return Semantics: The generator should be resumed, generator
execution should continue as if the instruction at the re-entry
point is a return. Consequently, finally clauses surrounding the
re-entry point would be executed, in the case of a then allowed
try-yield-finally pattern.

Issues: is it important to be able to distinguish forced
termination by close, normal termination, exception propagation
from generator or generator-called code? In the normal case it
seems not, finally clauses should be there to work the same in all
these cases, still this semantics could make such a distinction
hard.

Except-clauses, like by a normal return, are not executed, such
clauses in legacy generators expect to be executed for exceptions
raised by the generator or by code called from it. Not executing
them in the close case seems correct.

B - Exception Semantics: The generator should be resumed and
execution should continue as if a special-purpose exception
(e.g. CloseGenerator) has been raised at re-entry point. Close
implementation should consume and not propagate further this
exception.

Issues: should ``StopIteration`` be reused for this purpose? Probably
not. We would like close to be a harmless operation for legacy
generators, which could contain code catching ``StopIteration`` to
deal with other generators/iterators.

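For illustration (this example is not part of the PEP), a sketch of
the kind of legacy generator such reuse would break; the names are
invented, and the code is valid under PEP 255 because the yield sits
inside try/except rather than try/finally::

    def with_sentinel(source):
        # Legacy generator: drives another iterator by hand and
        # catches StopIteration itself, then pads with a sentinel.
        while 1:
            try:
                yield source.next()
            except StopIteration:
                yield None
                return

If close were implemented by raising ``StopIteration`` at the
suspended yield, this generator's own except clause would swallow it
and yield again instead of terminating.
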
In general, with exception semantics, it is unclear what to do if
|
||||
the generator does not terminate or we do not receive the special
|
||||
exception propagated back. Other different exceptions should
|
||||
probably be propagated, but consider this possible legacy
|
||||
generator code:
|
||||
In general, with exception semantics, it is unclear what to do if
|
||||
the generator does not terminate or we do not receive the special
|
||||
exception propagated back. Other different exceptions should
|
||||
probably be propagated, but consider this possible legacy
|
||||
generator code::
|
||||
|
||||
try:
|
||||
...
|
||||
yield ...
|
||||
...
|
||||
except: # or except Exception:, etc
|
||||
raise Exception("boom")
|
||||
try:
|
||||
...
|
||||
yield ...
|
||||
...
|
||||
except: # or except Exception:, etc
|
||||
raise Exception("boom")
|
||||
|
||||
If close is invoked with the generator suspended after the yield,
|
||||
the except clause would catch our special purpose exception, so we
|
||||
would get a different exception propagated back, which in this
|
||||
case ought to be reasonably consumed and ignored but in general
|
||||
should be propagated, but separating these scenarios seems hard.
|
||||
If close is invoked with the generator suspended after the yield,
|
||||
the except clause would catch our special purpose exception, so we
|
||||
would get a different exception propagated back, which in this
|
||||
case ought to be reasonably consumed and ignored but in general
|
||||
should be propagated, but separating these scenarios seems hard.
|
||||
|
||||
The exception approach has the advantage to let the generator
|
||||
distinguish between termination cases and have more control. On
|
||||
the other hand, clear-cut semantics seem harder to define.
|
||||
The exception approach has the advantage to let the generator
|
||||
distinguish between termination cases and have more control. On
|
||||
the other hand, clear-cut semantics seem harder to define.
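
To make Exception Semantics concrete, here is a minimal sketch of a
close helper, assuming a PEP 288-style primitive for raising an
exception inside a suspended generator (spelled ``throw()`` below);
``CloseGenerator`` and ``close_generator`` are hypothetical names used
only for illustration, not part of this specification::

    class CloseGenerator(Exception):
        """Special-purpose exception raised at the re-entry point."""

    def close_generator(gen):
        # Resume the generator as if CloseGenerator were raised at the
        # yield where it is currently suspended.
        try:
            gen.throw(CloseGenerator)
        except (CloseGenerator, StopIteration):
            # The special exception came back unchanged, or the generator
            # terminated: consume it and do not propagate further.
            pass
        else:
            # The generator yielded another value instead of terminating.
            raise RuntimeError("generator ignored CloseGenerator")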


Remarks
=======

If this proposal is accepted, it should become common practice to
document whether a generator acquires resources, so that its close
method ought to be called.  If a generator is no longer used,
calling close should be harmless.

On the other hand, in the typical scenario the code that
instantiated the generator should call close if required by it.
Generic code dealing with iterators/generators instantiated
elsewhere should typically not be littered with close calls.

The rare case of code that has acquired ownership of, and needs to
properly deal with, all of iterators, generators, and generators
acquiring resources that need timely release, is easily solved::

    if hasattr(iterator, 'close'):
        iterator.close()
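
For example, a consumer that owns the iterator can combine this check
with ``try``/``finally`` so that release happens even if iteration
fails part-way; ``make_iterator`` and ``handle`` below are hypothetical
stand-ins::

    def consume(make_iterator, handle):
        it = make_iterator()
        try:
            for item in it:
                handle(item)
        finally:
            # Release any resources held by a still-suspended generator.
            if hasattr(it, 'close'):
                it.close()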


Open Issues
===========

Definitive semantics ought to be chosen.  Currently Guido favors
Exception Semantics.  If the generator yields a value instead of
terminating or propagating back the special exception, a special
exception should be raised again on the generator side.

It is still unclear whether spuriously converted special
exceptions (as discussed in Possible Semantics) are a problem and
what to do about them.

Implementation issues should be explored.


Alternative Ideas
=================

The idea that the yield placement limitation should be removed and
that generator destruction should trigger execution of finally
clauses has been proposed more than once.  Alone it cannot
guarantee that timely release of resources acquired by a generator
can be enforced.

PEP 288 [2]_ proposes a more general solution, allowing custom
exception passing to generators.  The proposal in this PEP
addresses more directly the problem of resource release.  Were PEP
288 implemented, Exception Semantics for close could be layered
on top of it; on the other hand, PEP 288 should make a separate
case for the more general functionality.


References
==========

.. [1] PEP 255 Simple Generators
       http://www.python.org/dev/peps/pep-0255/

.. [2] PEP 288 Generators Attributes and Exceptions
       http://www.python.org/dev/peps/pep-0288/


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End:


pep-0358.txt

Author: Neil Schemenauer <nas@arctrix.com>, Guido van Rossum <guido@python.org>
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 15-Feb-2006
Python-Version: 2.6, 3.0
Post-History:


Update
======

This PEP has partially been superseded by PEP 3137.


Abstract
========

This PEP outlines the introduction of a raw bytes sequence type.
Adding the bytes type is one step in the transition to
Unicode-based str objects which will be introduced in Python 3.0.

The PEP describes how the bytes type should work in Python 2.6, as
well as how it should work in Python 3.0.  (Occasionally there are
differences because in Python 2.6, we have two string types, str
and unicode, while in Python 3.0 we will only have one string
type, whose name will be str but whose semantics will be like the
2.6 unicode type.)


Motivation
==========

Python's current string objects are overloaded.  They serve to hold
both sequences of characters and sequences of bytes.  This
overloading of purpose leads to confusion and bugs.  In future
versions of Python, string objects will be used for holding
character data.  The bytes object will fulfil the role of a byte
container.  Eventually the unicode type will be renamed to str
and the old str type will be removed.


Specification
=============

A bytes object stores a mutable sequence of integers that are in
the range 0 to 255.  Unlike string objects, indexing a bytes
object returns an integer.  Assigning or comparing an object that
is not an integer to an element causes a ``TypeError`` exception.
Assigning an element to a value outside the range 0 to 255 causes
a ``ValueError`` exception.  The ``.__len__()`` method of bytes returns
the number of integers stored in the sequence (i.e. the number of
bytes).

The constructor of the bytes object has the following signature::

    bytes([initializer[, encoding]])

If no arguments are provided then a bytes object containing zero
elements is created and returned.  The initializer argument can be
a string (in 2.6, either str or unicode), an iterable of integers,
or a single integer.  The pseudo-code for the constructor
(optimized for clear semantics, not for speed) is::

    def bytes(initializer=0, encoding=None):
        if isinstance(initializer, int): # In 2.6, int -> (int, long)
            initializer = [0]*initializer
        elif isinstance(initializer, basestring):
            if isinstance(initializer, unicode): # In 3.0, "if True"
                if encoding is None:
                    # In 3.0, raise TypeError("explicit encoding required")
                    encoding = sys.getdefaultencoding()
                initializer = initializer.encode(encoding)
            initializer = [ord(c) for c in initializer]
        else:
            if encoding is not None:
                raise TypeError("no encoding allowed for this initializer")
            tmp = []
            for c in initializer:
                if not isinstance(c, int):
                    raise TypeError("initializer must be iterable of ints")
                if not 0 <= c < 256:
                    raise ValueError("initializer element out of range")
                tmp.append(c)
            initializer = tmp
        new = <new bytes object of length len(initializer)>
        for i, c in enumerate(initializer):
            new[i] = c
        return new
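
As a quick illustration of these rules, the following session sketches
the intended 2.6 behaviour; it is inferred from the pseudo-code above
rather than copied from a released implementation::

    >>> bytes()                    # no arguments: empty sequence
    b''
    >>> bytes(3)                   # integer initializer: that many zero bytes
    b'\x00\x00\x00'
    >>> bytes([65, 66, 67])        # iterable of ints in the range 0 to 255
    b'ABC'
    >>> bytes(u'abc', 'ascii')     # unicode initializer with explicit encoding
    b'abc'
    >>> bytes([256])               # out-of-range element
    Traceback (most recent call last):
      ...
    ValueError: initializer element out of range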

The ``.__repr__()`` method returns a string that can be evaluated to
generate a new bytes object containing a bytes literal::

    >>> bytes([10, 20, 30])
    b'\n\x14\x1e'

The object has a ``.decode()`` method equivalent to the ``.decode()``
method of the str object.  The object has a classmethod ``.fromhex()``
that takes a string of characters from the set ``[0-9a-fA-F ]`` and
returns a bytes object (similar to binascii.unhexlify).  For
example::

    >>> bytes.fromhex('5c5350ff')
    b'\\SP\xff'
    >>> bytes.fromhex('5c 53 50 ff')
    b'\\SP\xff'

The object has a ``.hex()`` method that does the reverse conversion
(similar to binascii.hexlify)::

    >>> bytes([92, 83, 80, 255]).hex()
    '5c5350ff'

The bytes object has some methods similar to list methods, and
others similar to str methods.  Here is a complete list of
methods, with their approximate signatures::

    .__add__(bytes) -> bytes
    .__contains__(int | bytes) -> bool
    .__delitem__(int | slice) -> None
    .__delslice__(int, int) -> None
    .__eq__(bytes) -> bool
    .__ge__(bytes) -> bool
    .__getitem__(int | slice) -> int | bytes
    .__getslice__(int, int) -> bytes
    .__gt__(bytes) -> bool
    .__iadd__(bytes) -> bytes
    .__imul__(int) -> bytes
    .__iter__() -> iterator
    .__le__(bytes) -> bool
    .__len__() -> int
    .__lt__(bytes) -> bool
    .__mul__(int) -> bytes
    .__ne__(bytes) -> bool
    .__reduce__(...) -> ...
    .__reduce_ex__(...) -> ...
    .__repr__() -> str
    .__reversed__() -> bytes
    .__rmul__(int) -> bytes
    .__setitem__(int | slice, int | iterable[int]) -> None
    .__setslice__(int, int, iterable[int]) -> None
    .append(int) -> None
    .count(int) -> int
    .decode(str) -> str | unicode # in 3.0, only str
    .endswith(bytes) -> bool
    .extend(iterable[int]) -> None
    .find(bytes) -> int
    .index(bytes | int) -> int
    .insert(int, int) -> None
    .join(iterable[bytes]) -> bytes
    .partition(bytes) -> (bytes, bytes, bytes)
    .pop([int]) -> int
    .remove(int) -> None
    .replace(bytes, bytes) -> bytes
    .rpartition(bytes) -> (bytes, bytes, bytes)
    .split(bytes) -> list[bytes]
    .startswith(bytes) -> bool
    .reverse() -> None
    .rfind(bytes) -> int
    .rindex(bytes | int) -> int
    .rsplit(bytes) -> list[bytes]
    .translate(bytes, [bytes]) -> bytes
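
To make the mutable, list-like side of this interface concrete, here
is a short sketch; because the type specified here never shipped in
exactly this form, the example uses Python 3's ``bytearray``, whose
behaviour closely matches the mutable semantics listed above::

    buf = bytearray(b'SP')
    buf.append(0xFF)           # list-style mutation with an int in 0..255
    buf.extend([92, 92])       # extend from an iterable of ints
    buf[0:2] = b'sp'           # slice assignment from another byte sequence
    print(buf)                 # bytearray(b'sp\xff\\\\')
    print(buf.find(b'\xff'))   # str-style search -> 2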

Note the conspicuous absence of ``.isupper()``, ``.upper()``, and friends.
(But see "Open Issues" below.)  There is no ``.__hash__()`` because
the object is mutable.  There is no use case for a ``.sort()`` method.

The bytes type also supports the buffer interface, supporting
reading and writing binary (but not character) data.


Out of Scope Issues
===================

* Python 3k will have a much different I/O subsystem.  Deciding
  how that I/O subsystem will work and interact with the bytes
  object is out of the scope of this PEP.  The expectation however
  is that binary I/O will read and write bytes, while text I/O
  will read strings.  Since the bytes type supports the buffer
  interface, the existing binary I/O operations in Python 2.6 will
  support bytes objects.

* It has been suggested that a special method named ``.__bytes__()``
  be added to the language to allow objects to be converted into
  byte arrays.  This decision is out of scope.

* A bytes literal of the form ``b"..."`` is also proposed.  This is
  the subject of PEP 3112.


Open Issues
===========

* The ``.decode()`` method is redundant since a bytes object ``b`` can
  also be decoded by calling ``unicode(b, <encoding>)`` (in 2.6) or
  ``str(b, <encoding>)`` (in 3.0).  Do we need encode/decode methods
  at all?  In a sense the spelling using a constructor is cleaner.

* Need to specify the methods still more carefully.

* Pickling and marshalling support need to be specified.

* Should all those list methods really be implemented?

* A case could be made for supporting ``.ljust()``, ``.rjust()``,
  ``.center()`` with a mandatory second argument.

* A case could be made for supporting ``.split()`` with a mandatory
  argument.

* A case could even be made for supporting ``.islower()``, ``.isupper()``,
  ``.isspace()``, ``.isalpha()``, ``.isalnum()``, ``.isdigit()`` and the
  corresponding conversions (``.lower()`` etc.), using the ASCII
  definitions for letters, digits and whitespace.  If this is
  accepted, the cases for ``.ljust()``, ``.rjust()``, ``.center()`` and
  ``.split()`` become much stronger, and they should have default
  arguments as well, using an ASCII space or all ASCII whitespace
  (for ``.split()``).


Frequently Asked Questions
==========================

Q: Why have the optional encoding argument when the encode method of
Unicode objects does the same thing?

A: In the current version of Python, the encode method returns a str
object and we cannot change that without breaking code.  The
construct ``bytes(s.encode(...))`` is expensive because it has to
copy the byte sequence multiple times.  Also, Python generally
provides two ways of converting an object of type A into an
object of type B: ask an A instance to convert itself to a B, or
ask the type B to create a new instance from an A.  Depending on
what A and B are, both APIs make sense; sometimes reasons of
decoupling require that A can't know about B, in which case you
have to use the latter approach; sometimes B can't know about A,
in which case you have to use the former.


Q: Why does bytes ignore the encoding argument if the initializer is
a str?  (This only applies to 2.6.)

A: There is no sane meaning that the encoding can have in that case.
str objects *are* byte arrays and they know nothing about the
encoding of character data they contain.  We need to assume that
the programmer has provided a str object that already uses the
desired encoding.  If you need something other than a pure copy of
the bytes then you need to first decode the string.  For example::

    bytes(s.decode(encoding1), encoding2)
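
So transcoding a byte string from one encoding to another would look
roughly like this under the 2.6 semantics described here (``latin-1``
and ``utf-8`` are merely example encodings)::

    >>> s = 'caf\xe9'                        # str holding Latin-1 encoded data
    >>> bytes(s.decode('latin-1'), 'utf-8')  # decode, then re-encode as UTF-8
    b'caf\xc3\xa9'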


Q: Why not have the encoding argument default to Latin-1 (or some
other encoding that covers the entire byte range) rather than
ASCII?

A: The system default encoding for Python is ASCII.  It seems least
confusing to use that default.  Also, in Py3k, using Latin-1 as
the default might not be what users expect.  For example, they
might prefer a Unicode encoding.  Any default will not always
work as expected.  At least ASCII will complain loudly if you try
to encode non-ASCII data.


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:


pep-0361.txt

Author: Neal Norwitz, Barry Warsaw
Status: Final
Type: Informational
Content-Type: text/x-rst
Created: 29-June-2006
Python-Version: 2.6 and 3.0
Post-History: 17-Mar-2008

Abstract
========

This document describes the development and release schedule for
Python 2.6 and 3.0.  The schedule primarily concerns itself with
PEP-sized items.  Small features may be added up to and including
the first beta release.  Bugs may be fixed until the final
release.

There will be at least two alpha releases, two beta releases, and
one release candidate.  The releases are planned for October 2008.

Python 2.6 is not only the next advancement in the Python 2
series, it is also a transitional release, helping developers
begin to prepare their code for Python 3.0.  As such, many
features are being backported from Python 3.0 to 2.6.  Thus, it
makes sense to release both versions at the same time.  The
precedent for this was set with the Python 1.6 and 2.0 releases.

Until rc, we will be releasing Python 2.6 and 3.0 in lockstep, on
a monthly release cycle.  The releases will happen on the first
Wednesday of every month through the beta testing cycle.  Because
Python 2.6 is ready sooner, and because we have outside deadlines
we'd like to meet, we've decided to split the rc releases.  Thus
Python 2.6 final is currently planned to come out two weeks before
Python 3.0 final.

Release Manager and Crew
========================

- 2.6/3.0 Release Manager: Barry Warsaw
- Windows installers: Martin v. Loewis
- Mac installers: Ronald Oussoren
- Documentation: Georg Brandl
- RPMs: Sean Reifschneider

Release Lifespan
================

Python 3.0 is no longer being maintained for any purpose.

Python 2.6.9 is the final security-only source-only maintenance
release of the Python 2.6 series.  With its release on October 29,
2013, all official support for Python 2.6 has ended.  Python 2.6
is no longer being maintained for any purpose.

Release Schedule
================

- Feb 29 2008: Python 2.6a1 and 3.0a3 are released
- Apr 02 2008: Python 2.6a2 and 3.0a4 are released
- May 08 2008: Python 2.6a3 and 3.0a5 are released
- Jun 18 2008: Python 2.6b1 and 3.0b1 are released
- Jul 17 2008: Python 2.6b2 and 3.0b2 are released
- Aug 20 2008: Python 2.6b3 and 3.0b3 are released
- Sep 12 2008: Python 2.6rc1 is released
- Sep 17 2008: Python 2.6rc2 and 3.0rc1 released
- Oct 01 2008: Python 2.6 final released
- Nov 06 2008: Python 3.0rc2 released
- Nov 21 2008: Python 3.0rc3 released
- Dec 03 2008: Python 3.0 final released
- Dec 04 2008: Python 2.6.1 final released
- Apr 14 2009: Python 2.6.2 final released
- Oct 02 2009: Python 2.6.3 final released
- Oct 25 2009: Python 2.6.4 final released
- Mar 19 2010: Python 2.6.5 final released
- Aug 24 2010: Python 2.6.6 final released
- Jun 03 2011: Python 2.6.7 final released (security-only)
- Apr 10 2012: Python 2.6.8 final released (security-only)
- Oct 29 2013: Python 2.6.9 final released (security-only)

Completed features for 3.0
==========================

See PEP 3000 [pep3000]_ and PEP 3100 [pep3100]_ for details on the
Python 3.0 project.


Completed features for 2.6
==========================

PEPs:

- 352: Raising a string exception now triggers a TypeError.
  Attempting to catch a string exception raises DeprecationWarning.
  BaseException.message has been deprecated. [pep352]_
- 358: The "bytes" Object [pep358]_
- 366: Main module explicit relative imports [pep366]_
- 370: Per user site-packages directory [pep370]_
- 3112: Bytes literals in Python 3000 [pep3112]_
- 3127: Integer Literal Support and Syntax [pep3127]_
- 371: Addition of the multiprocessing package [pep371]_

New modules in the standard library:

- json
- new enhanced turtle module
- ast

Deprecated modules and functions in the standard library:

- buildtools
- cfmfile
- commands.getstatus()
- macostools.touched()
- md5
- MimeWriter
- mimify
- popen2, os.popen[234]()
- posixfile
- sets
- sha

Modules removed from the standard library:

- gopherlib
- rgbimg
- macfs

Warnings for features removed in Py3k:

- builtins: apply, callable, coerce, dict.has_key, execfile,
  reduce, reload
- backticks and <>
- float args to xrange
- coerce and all its friends
- comparing by default comparison
- {}.has_key()
- file.xreadlines
- softspace removal for print() function
- removal of modules because of PEP 4/3100/3108

Other major features:

- with/as will be keywords
- a __dir__() special method to control dir() was added [1]_
- AtheOS support stopped.
- warnings module implemented in C
- compile() takes an AST and can convert to byte code

Possible features for 2.6
=========================

New features *should* be implemented prior to alpha2, particularly
any C modifications or behavioral changes.  New features *must* be
implemented prior to beta1 or will require Release Manager approval.

The following PEPs are being worked on for inclusion in 2.6: None.

Each non-trivial feature listed here that is not a PEP must be
discussed on python-dev.  Other enhancements include:

- distutils replacement (requires a PEP)

New modules in the standard library:

- winerror
  http://python.org/sf/1505257
  (Patch rejected, module should be written in C)

- setuptools
  BDFL pronouncement for inclusion in 2.5:
  http://mail.python.org/pipermail/python-dev/2006-April/063964.html

  PJE's withdrawal from 2.5 for inclusion in 2.6:
  http://mail.python.org/pipermail/python-dev/2006-April/064145.html

Modules to gain a DeprecationWarning (as specified for Python 2.6
or through negligence):

- rfc822
- mimetools
- multifile
- compiler package (or a Py3K warning instead?)

- Convert Parser/\*.c to use the C warnings module rather than printf

- Add warnings for Py3k features removed:

  * __getslice__/__setslice__/__delslice__
  * float args to PyArgs_ParseTuple
  * __cmp__?
  * other comparison changes?
  * int division?
  * All PendingDeprecationWarnings (e.g. exceptions)
  * using zip() result as a list
  * the exec statement (use function syntax)
  * function attributes that start with func_* (should use __*__)
  * the L suffix for long literals
  * renaming of __nonzero__ to __bool__
  * multiple inheritance with classic classes? (MRO might change)
  * properties and classic classes? (instance attrs shadow property)

- use __bool__ method if available and there's no __nonzero__

- Check the various bits of code in Demo/ and Tools/ all still work,
  update or remove the ones that don't.

- All modules in Modules/ should be updated to be ssize_t clean.

- All of Python (including Modules/) should compile cleanly with g++

- Start removing deprecated features and generally moving towards Py3k

- Replace all old style tests (operate on import) with unittest or doctest

- Add tests for all untested modules

- Document undocumented modules/features

- bdist_deb in distutils package
  http://mail.python.org/pipermail/python-dev/2006-February/060926.html

- bdist_egg in distutils package

- pure python pgen module
  (Owner: Guido)
  Deferral to 2.6:
  http://mail.python.org/pipermail/python-dev/2006-April/064528.html

- Remove the fpectl module?

Deferred until 2.7
==================

None


Open issues
===========

How should import warnings be handled?

- http://mail.python.org/pipermail/python-dev/2006-June/066345.html
- http://python.org/sf/1515609
- http://python.org/sf/1515361

References
==========

.. [1] Adding a __dir__() magic method
   http://mail.python.org/pipermail/python-dev/2006-July/067139.html

.. [pep352] PEP 352 (Required Superclass for Exceptions)
   http://www.python.org/dev/peps/pep-0352

.. [pep358] PEP 358 (The "bytes" Object)
   http://www.python.org/dev/peps/pep-0358

.. [pep366] PEP 366 (Main module explicit relative imports)
   http://www.python.org/dev/peps/pep-0366

.. [pep367] PEP 367 (New Super)
   http://www.python.org/dev/peps/pep-0367

.. [pep370] PEP 370 (Per user site-packages directory)
   http://www.python.org/dev/peps/pep-0370

.. [pep371] PEP 371 (Addition of the multiprocessing package)
   http://www.python.org/dev/peps/pep-0371

.. [pep3000] PEP 3000 (Python 3000)
   http://www.python.org/dev/peps/pep-3000

.. [pep3100] PEP 3100 (Miscellaneous Python 3.0 Plans)
   http://www.python.org/dev/peps/pep-3100

.. [pep3112] PEP 3112 (Bytes literals in Python 3000)
   http://www.python.org/dev/peps/pep-3112

.. [pep3127] PEP 3127 (Integer Literal Support and Syntax)
   http://www.python.org/dev/peps/pep-3127

.. _Google calendar: http://www.google.com/calendar/ical/b6v58qvojllt0i6ql654r1vh00%40group.calendar.google.com/public/basic.ics

Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: