Convert PEPs 213, 241, 291, 298, 311 (#199)

Further progress on issue #4 

PEP 213:
- Fix typo in the word miniscule -> minuscule

PEP 241

PEP 291

PEP 298

PEP 311
Mariatta 2017-02-06 13:14:46 -08:00 committed by Nick Coghlan
parent 2f36c2d928
commit f9a66fe511
5 changed files with 845 additions and 752 deletions


@@ -5,231 +5,248 @@ Last-Modified: $Date$
Author: paul@prescod.net (Paul Prescod)
Status: Deferred
Type: Standards Track
Content-Type: text/x-rst
Created: 21-Jul-2000
Python-Version: 2.1
Post-History:
Introduction
============
It is possible (and even relatively common) in Python code and
in extension modules to "trap" when an instance's client code
attempts to set an attribute and execute code instead. In other
words, it is possible to allow users to use attribute assignment/
retrieval/deletion syntax even though the underlying implementation
is doing some computation rather than directly modifying a
binding.
This PEP describes a feature that makes it easier, more efficient
and safer to implement these handlers for Python instances.
Justification
=============
Scenario 1
----------
You have a deployed class that works on an attribute named
"stdout". After a while, you think it would be better to
check that stdout is really an object with a "write" method
at the moment of assignment. Rather than change to a
setstdout method (which would be incompatible with deployed
code) you would rather trap the assignment and check the
object's type.
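Today this is typically done by trapping the assignment in ``__setattr__``; a
minimal sketch (the class and attribute names are only illustrative)::

    class Stream:
        def __setattr__(self, name, val):
            # check the type at the moment of assignment
            if name == "stdout" and not hasattr(val, "write"):
                raise TypeError("stdout must have a write() method")
            self.__dict__[name] = val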
Scenario 2
----------
You want to be as compatible as possible with an object
model that has a concept of attribute assignment. It could
be the W3C Document Object Model or a particular COM
interface (e.g. the PowerPoint interface). In that case
you may well want attributes in the model to show up as
attributes in the Python interface, even though the
underlying implementation may not use attributes at all.
Scenario 3
----------
A user wants to make an attribute read-only.
In short, this feature allows programmers to separate the
interface of their module from the underlying implementation
for whatever purpose. Again, this is not a new feature but
merely a new syntax for an existing convention.
Current Solution
================
To make some attributes read-only::
class foo:
def __setattr__( self, name, val ):
if name=="readonlyattr":
raise TypeError
elif name=="readonlyattr2":
raise TypeError
...
else:
self.__dict__["name"]=val
if name=="readonlyattr":
raise TypeError
elif name=="readonlyattr2":
raise TypeError
...
else:
self.__dict__["name"]=val
This has the following problems:
1. The creator of the method must be intimately aware of whether
somewhere else in the class hierarchy ``__setattr__`` has also been
trapped for any particular purpose. If so, she must specifically
call that method rather than assigning to the dictionary. There
are many different reasons to overload ``__setattr__`` so there is a
decent potential for clashes. For instance object database
implementations often overload setattr for an entirely unrelated
purpose. The sketch after this list illustrates the explicit
chaining this forces on subclasses.
2. The string-based switch statement forces all attribute handlers
to be specified in one place in the code. They may then dispatch
to task-specific methods (for modularity) but this could cause
performance problems.
3. Logic for the setting, getting and deleting must live in
``__getattr__``, ``__setattr__`` and ``__delattr__``. Once again, this can
be mitigated through an extra level of method call but this is
inefficient.
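To make item 1 concrete, here is a sketch (the class names are invented for
illustration) of the explicit chaining a subclass is forced into today::

    class Persistent:
        def __setattr__(self, name, val):
            # e.g. an object database marking the instance dirty
            self.__dict__["_dirty"] = 1
            self.__dict__[name] = val

    class Record(Persistent):
        def __setattr__(self, name, val):
            if name == "readonlyattr":
                raise TypeError
            # must chain to the base class, not assign to __dict__ directly
            Persistent.__setattr__(self, name, val)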
Proposed Syntax
===============
Special methods should declare themselves with declarations of the
following form::
class x:
def __attr_XXX__(self, op, val ):
if op=="get":
return someComputedValue(self.internal)
elif op=="set":
self.internal=someComputedValue(val)
elif op=="del":
del self.internal
Client code looks like this::
fooval=x.foo
x.foo=fooval+5
del x.foo
Semantics
=========
Attribute references of all three kinds should call the method.
The op parameter can be "get"/"set"/"del". Of course this string
will be interned so the actual checks for the string will be
very fast.
It is disallowed to actually have an attribute named XXX in the
same instance as a method named __attr_XXX__.
An implementation of __attr_XXX__ takes precedence over an
implementation of ``__getattr__`` based on the principle that
``__getattr__`` is supposed to be invoked only after finding an
appropriate attribute has failed.
An implementation of __attr_XXX__ takes precedence over an
implementation of ``__setattr__`` in order to be consistent. The
opposite choice seems fairly feasible also, however. The same
goes for __del_y__.
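For illustration only (the PEP is deferred and this protocol was never
implemented), Scenario 3's read-only attribute might be written as::

    class Settings:
        def __init__(self):
            self.__version = "1.0"      # private backing variable for the computed attribute
        def __attr_version__(self, op, val):
            # 'val' is only meaningful for the "set" operation
            if op == "get":
                return self.__version
            elif op == "set":
                raise TypeError("version is read-only")
            elif op == "del":
                raise TypeError("version cannot be deleted")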
Proposed Implementation
=======================
There is a new object type called an attribute access handler.
Objects of this type have the following attributes::
name (e.g. XXX, not __attr_XXX__)
method (pointer to a method object)
In PyClass_New, methods of the appropriate form will be detected and
converted into objects (just like unbound method objects). These are
stored in the class ``__dict__`` under the name XXX. The original method
is stored as an unbound method under its original name.
If there are any attribute access handlers in an instance at all,
a flag is set. Let's call it "I_have_computed_attributes" for
now. Derived classes inherit the flag from base classes. Instances
inherit the flag from classes.
A get proceeds as usual until just before the object is returned.
In addition to the current check whether the returned object is a
method it would also check whether a returned object is an access
handler. If so, it would invoke the getter method and return
the value. To remove an attribute access handler you could directly
fiddle with the dictionary.
A set proceeds by checking the "I_have_computed_attributes" flag. If
it is not set, everything proceeds as it does today. If it is set
then we must do a dictionary get on the requested object name. If it
returns an attribute access handler then we call the setter function
with the value. If it returns any other object then we discard the
result and continue as we do today. Note that having an attribute
access handler will mildly affect attribute "setting" performance for
all sets on a particular instance, but no more so than today, using
``__setattr__``. Gets are more efficient than they are today with
``__getattr__``.
The I_have_computed_attributes flag is intended to eliminate the
performance degradation of an extra "get" per "set" for objects not
using this feature. Checking this flag should have minuscule
performance implications for all objects.
The implementation of delete is analogous to the implementation
of set.
Caveats
=======
1. You might note that I have not proposed any logic to keep
the I_have_computed_attributes flag up to date as attributes
are added and removed from the instance's dictionary. This is
consistent with current Python. If you add a ``__setattr__`` method
to an object after it is in use, that method will not behave as
it would if it were available at "compile" time. The dynamism is
arguably not worth the extra implementation effort. This snippet
demonstrates the current behavior::
>>> def prn(*args):print args
>>> class a:
... __setattr__=prn
>>> a().foo=5
(<__main__.a instance at 882890>, 'foo', 5)
>>> class b: pass
>>> bi=b()
>>> bi.__setattr__=prn
>>> b.foo=5
2. Assignment to __dict__["XXX"] can overwrite the attribute
access handler for __attr_XXX__. Typically the access handlers will
store information away in private __XXX variables.
3. An attribute access handler that attempts to call setattr or getattr
on the object itself can cause an infinite loop (as with ``__getattr__``).
Once again, the solution is to use a special (typically private)
variable such as __XXX.
Note
====
The descriptor mechanism described in PEP 252 is powerful enough
to support this more directly. A 'getset' constructor may be
added to the language making this possible::
class C:
def get_x(self):
return self.__x
def set_x(self, v):
self.__x = v
x = getset(get_x, set_x)
Additional syntactic sugar might be added, or a naming convention
could be recognized.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:


@@ -5,232 +5,261 @@ Last-Modified: $Date$
Author: A.M. Kuchling <amk@amk.ca>
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 12-Mar-2001
Post-History: 19-Mar-2001
Introduction
============
This PEP describes a mechanism for adding metadata to Python
packages. It includes specifics of the field names, and their
semantics and usage.
Including Metadata in Packages
==============================
The Distutils 'sdist' command will be modified to extract the
metadata fields from the arguments and write them to a file in the
generated zipfile or tarball. This file will be named PKG-INFO
and will be placed in the top directory of the source
distribution (where the README, INSTALL, and other files usually
go).
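The fields are drawn from keyword arguments that the Distutils ``setup()``
function already accepts; a minimal ``setup.py`` sketch using the example
values from this PEP might look like::

    from distutils.core import setup

    setup(name="BeagleVote",
          version="1.0a2",
          description="A module for collecting votes from beagles.",
          author="C. Schultz",
          author_email="cschultz@example.com",
          url="http://www.example.com/~cschultz/bvote/")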
Developers may not provide their own PKG-INFO file. The "sdist"
command will, if it detects an existing PKG-INFO file, terminate
with an appropriate error message. This should prevent confusion
caused by the PKG-INFO and setup.py files being out of sync.
The PKG-INFO file format is a single set of RFC-822 headers
parseable by the rfc822.py module. The field names listed in the
following section are used as the header names. There's no
extension mechanism in this simple format; the Catalog and Distutils
SIGs will aim at getting a more flexible format ready for Python 2.2.
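Because the format is plain RFC-822 headers, consumers can read it back with
the standard library; a minimal Python 2 sketch (the header values are taken
from the examples below)::

    import rfc822
    from StringIO import StringIO

    sample = StringIO("Metadata-Version: 1.0\n"
                      "Name: BeagleVote\n"
                      "Version: 1.0a2\n")
    headers = rfc822.Message(sample)
    print headers.getheader("Name"), headers.getheader("Version")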
Fields
======
This section specifies the names and semantics of each of the
supported metadata fields.
Fields marked with "(Multiple use)" may be specified multiple
times in a single PKG-INFO file. Other fields may only occur
once in a PKG-INFO file. Fields marked with "(optional)" are
not required to appear in a valid PKG-INFO file, all other
fields must be present.
Metadata-Version
----------------
Version of the file format; currently "1.0" is the only
legal value here.
Example::
Metadata-Version: 1.0
Name
----
The name of the package.
Example::
Name: BeagleVote
Version
-------
A string containing the package's version number. This
field should be parseable by one of the Version classes
(StrictVersion or LooseVersion) in the distutils.version
module.
Example::
Version: 1.0a2
Platform (multiple use)
-----------------------
A comma-separated list of platform specifications, summarizing
the operating systems supported by the package. The major
supported platforms are listed below, but this list is
necessarily incomplete.
::
POSIX, MacOS, Windows, BeOS, PalmOS.
Binary distributions will use the Supported-Platform field in
their metadata to specify the OS and CPU for which the binary
package was compiled. The semantics of the Supported-Platform
are not specified in this PEP.
Example::
Platform: POSIX, Windows
Summary
-------
A one-line summary of what the package does.
Example::
Summary: A module for collecting votes from beagles.
Description (optional)
----------------------
A longer description of the package that can run to several
paragraphs. (Software that deals with metadata should not
assume any maximum size for this field, though one hopes that
people won't include their instruction manual as the
long-description.)
Example::
Description: This module collects votes from beagles
in order to determine their electoral wishes.
Do NOT try to use this module with basset hounds;
it makes them grumpy.
Keywords (optional)
-------------------
A list of additional keywords to be used to assist searching
for the package in a larger catalog.
Example::
Keywords: dog puppy voting election
Author-email: "C. Schultz" <cschultz@example.com>
License
A string selected from a short list of choices, specifying the
license covering the package. Some licenses result in the
software being freely redistributable, so packagers and
resellers can automatically know that they're free to
redistribute the software. Other licenses will require
a careful reading by a human to determine how the software can be
repackaged and resold.
Home-page (optional)
--------------------
A string containing the URL for the package's home page.
Example::
Home-page: http://www.example.com/~cschultz/bvote/
Author (optional)
-----------------
A string containing at a minimum the author's name. Contact
information can also be added, separating each line with
newlines.
Example::
Author: C. Schultz
Universal Features Syndicate
Los Angeles, CA
Author-email
------------
A string containing the author's e-mail address. It can contain
a name and e-mail address in the legal forms for a RFC-822
'From:' header. It's not optional because cataloging systems
can use the e-mail portion of this field as a unique key
representing the author. A catalog might provide authors the
ability to store their GPG key, personal home page, and other
additional metadata *about the author*, and optionally the
ability to associate several e-mail addresses with the same
person. Author-related metadata fields are not covered by this
PEP.
Example::
Author-email: "C. Schultz" <cschultz@example.com>
License
-------
A string selected from a short list of choices, specifying the
license covering the package. Some licenses result in the
software being freely redistributable, so packagers and
resellers can automatically know that they're free to
redistribute the software. Other licenses will require
a careful reading by a human to determine how the software can be
repackaged and resold.
The choices are::
Artistic, BSD, DFSG, GNU GPL, GNU LGPL, "MIT",
Mozilla PL, "public domain", Python, Qt PL, Zope PL, unknown,
nocommercial, nosell, nosource, shareware, other
Definitions of some of the licenses are:
============= ===================================================
DFSG          The license conforms to the Debian Free Software
              Guidelines, but does not use one of the other
              DFSG conforming licenses listed here.
              More information is available at:
              http://www.debian.org/social_contract#guidelines
Python        Python 1.6 or higher license. Version 1.5.2 and
              earlier are under the MIT license.
public domain Software is public domain, not copyrighted.
unknown       Status is not known
nocommercial  Free private use but commercial use not permitted
nosell        Free use but distribution for profit by arrangement
nosource      Freely distributable but no source code
shareware     Payment is requested if software is used
other         General category for other non-DFSG licenses
============= ===================================================
Some of these licenses can be interpreted to mean the software is
freely redistributable. The list of redistributable licenses is::
Artistic, BSD, DFSG, GNU GPL, GNU LGPL, "MIT",
Mozilla PL, "public domain", Python, Qt PL, Zope PL,
nosource, shareware
Note that being redistributable does not mean a package
qualifies as free software, 'nosource' and 'shareware' being
examples.
Example::
License: MIT
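Assembling the example values from the field descriptions above, a complete
PKG-INFO file produced by "sdist" might look like this (illustrative only)::

    Metadata-Version: 1.0
    Name: BeagleVote
    Version: 1.0a2
    Platform: POSIX, Windows
    Summary: A module for collecting votes from beagles.
    Keywords: dog puppy voting election
    Home-page: http://www.example.com/~cschultz/bvote/
    Author: C. Schultz
    Author-email: "C. Schultz" <cschultz@example.com>
    License: MIT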
Acknowledgements
================
Many changes and rewrites to this document were suggested by the
readers of the Distutils SIG. In particular, Sean Reifschneider
often contributed actual text for inclusion in this PEP.
The list of licenses was compiled using the SourceForge license
list and the CTAN license list compiled by Graham Williams; Carey
Evans also offered several useful suggestions on this list.
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:


@@ -5,147 +5,161 @@ Last-Modified: $Date$
Author: nnorwitz@gmail.com (Neal Norwitz)
Status: Final
Type: Informational
Content-Type: text/x-rst
Created: 06-Jun-2002
Python-Version: 2.3
Post-History:
Abstract
========
This PEP describes the packages and modules in the Python 2
standard library which should remain backward compatible with
previous versions of Python. If a package is not listed here,
then it need only remain compatible with the version of Python it
is distributed with.
This PEP has no bearing on the Python 3 standard library.
Rationale
=========
Authors have various reasons why packages and modules should
continue to work with previous versions of Python. In order to
maintain backward compatibility for these modules while moving the
rest of the standard library forward, it is necessary to know
which modules can be modified and which should use old and
possibly deprecated features.
Generally, authors should attempt to keep changes backward
compatible with the previous released version of Python in order
to make bug fixes easier to backport.
In addition to a package or module being listed in this PEP,
authors must add a comment at the top of each file documenting
the compatibility requirement.
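For example, a module kept compatible with Python 2.3 might begin with a
comment along these lines (the exact wording is illustrative, not mandated by
this PEP)::

    # This module should be kept compatible with Python 2.3.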
When a major version of Python is released, a Subversion branch is
created for continued maintenance and bug fix releases. A package
version on a branch may have a different compatibility requirement
than the same package on the trunk (i.e. current bleeding-edge
development). Where appropriate, these branch compatibilities are
listed below.
Features to Avoid
=================
The following list contains common features to avoid in order
to maintain backward compatibility with each version of Python.
This list is not complete! It is only meant as a general guide.
Note that the features below were implemented in the version
following the one listed. For example, features listed next to
1.5.2 were implemented in 2.0.
======= ======================================================
Version Features to Avoid
======= ======================================================
1.5.2   string methods, Unicode, list comprehensions,
        augmented assignment (eg, +=), zip(), import x as y,
        dict.setdefault(), print >> f,
        calling f(\*args, \**kw), plus all features below

2.0     nested scopes, rich comparisons,
        function attributes, plus all features below

2.1     use of object or new-style classes, iterators,
        using generators, nested scopes, or //
        without from __future__ import ... statement,
        isinstance(X, TYP) where TYP is a tuple of types,
        plus all features below

2.2     bool, True, False, basestring, enumerate(),
        {}.pop(), PendingDeprecationWarning,
        Universal Newlines, plus all features below

2.3     generator expressions, multi-line imports,
        decorators, int/long unification, set/frozenset,
        reversed(), sorted(), "".rsplit(),
        plus all features below

2.4     with statement, conditional expressions,
        combined try/except/finally, relative imports,
        yield expressions or generator.throw/send/close(),
        plus all features below

2.5     with statement without from __future__ import,
        io module, str.format(), except as,
        bytes, b'' literals, property.setter/deleter
======= ======================================================
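As a concrete illustration of the first row above, code that must stay
1.5.2-compatible avoids string methods and falls back on the ``string``
module (a minimal Python 2 sketch)::

    import string
    # str.split() is a 2.0-era string method; 1.5.2-compatible code
    # uses the string module function instead.
    words = string.split("dog puppy voting election")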
Backward Compatible Packages, Modules, and Tools
================================================
============== ================== ============== =====
Package/Module Maintainer(s)      Python Version Notes
============== ================== ============== =====
2to3           Benjamin Peterson  2.5
bsddb          - Greg Smith       2.1
               - Barry Warsaw
compiler       Jeremy Hylton      2.1
decimal        Raymond Hettinger  2.3            [2]
distutils      Tarek Ziade        2.3
email          Barry Warsaw       2.1 / 2.3      [1]
pkgutil        Phillip Eby        2.3
platform       Marc-Andre Lemburg 1.5.2
pybench        Marc-Andre Lemburg 1.5.2          [3]
sre            Fredrik Lundh      2.1
subprocess     Peter Astrand      2.2
wsgiref        Phillip J. Eby     2.1
xml (PyXML)    Martin v. Loewis   2.0
xmlrpclib      Fredrik Lundh      2.1
============== ================== ============== =====
==== ============= ==============
Tool Maintainer(s) Python Version
==== ============= ==============
None
==== ============= ==============
Notes
-----
1. The email package version 2 was distributed with Python up to
Python 2.3, and this must remain Python 2.1 compatible. email
package version 3 will be distributed with Python 2.4 and will
need to remain compatible only with Python 2.3.
2. Specification updates will be treated as bugfixes and backported.
Python 2.3 compatibility will be kept for at least Python 2.4.
The decision will be revisited for Python 2.5 and not changed
unless compelling advantages arise.
3. pybench lives under the Tools/ directory. Compatibility with
older Python versions is needed in order to be able to compare
performance between Python versions. New features may still
be used in new tests, which may then be configured to fail
gracefully on import by the tool in older Python versions.
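As an illustration of the graceful-failure approach described in note 3, a
new test can guard its import of a newer feature (the module chosen here is
only an example)::

    try:
        from subprocess import Popen     # subprocess is new in Python 2.4
    except ImportError:
        Popen = None                     # older Pythons: the test degrades gracefully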
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:


@@ -5,229 +5,240 @@ Last-Modified: $Date$
Author: Thomas Heller <theller@python.net>
Status: Withdrawn
Type: Standards Track
Content-Type: text/x-rst
Created: 26-Jul-2002
Python-Version: 2.3
Post-History: 30-Jul-2002, 1-Aug-2002
Abstract
========
This PEP proposes an extension to the buffer interface called the
'locked buffer interface'.
The locked buffer interface avoids the flaws of the 'old' buffer
interface [1]_ as defined in Python versions up to and including
2.2, and has the following semantics:
- The lifetime of the retrieved pointer is clearly defined and
controlled by the client.
- The buffer size is returned as a 'size_t' data type, which
allows access to large buffers on platforms where ``sizeof(int)
!= sizeof(void *)``.
(Guido comments: This second sounds like a change we could also
make to the "old" buffer interface, if we introduce another flag
bit that's *not* part of the default flags.)
Specification
=============
The locked buffer interface exposes new functions which return the
size and the pointer to the internal memory block of any python
object which chooses to implement this interface.
Retrieving a buffer from an object puts this object in a locked
state during which the buffer may not be freed, resized, or
reallocated.
The object must be unlocked again by releasing the buffer if it's
no longer used by calling another function in the locked buffer
interface. If the object never resizes or reallocates the buffer
during its lifetime, this function may be NULL. Failure to call
this function (if it is != NULL) is a programming error and may
have unexpected results.
The locked buffer interface omits the memory segment model which
is present in the old buffer interface - only a single memory
block can be exposed.
The memory blocks can be accessed without holding the global
interpreter lock.
Implementation
==============
Define a new flag in Include/object.h::
/* PyBufferProcs contains bf_acquirelockedreadbuffer,
bf_acquirelockedwritebuffer, and bf_releaselockedbuffer */
#define Py_TPFLAGS_HAVE_LOCKEDBUFFER (1L<<15)
This flag would be included in ``Py_TPFLAGS_DEFAULT``::
#define Py_TPFLAGS_DEFAULT ( \
....
Py_TPFLAGS_HAVE_LOCKEDBUFFER | \
....
0)
Extend the ``PyBufferProcs`` structure by new fields in
Include/object.h::
typedef size_t (*acquirelockedreadbufferproc)(PyObject *,
const void **);
typedef size_t (*acquirelockedwritebufferproc)(PyObject *,
void **);
typedef void (*releaselockedbufferproc)(PyObject *);
typedef struct {
getreadbufferproc bf_getreadbuffer;
getwritebufferproc bf_getwritebuffer;
getsegcountproc bf_getsegcount;
getcharbufferproc bf_getcharbuffer;
/* locked buffer interface functions */
acquirelockedreadbufferproc bf_acquirelockedreadbuffer;
acquirelockedwritebufferproc bf_acquirelockedwritebuffer;
releaselockedbufferproc bf_releaselockedbuffer;
} PyBufferProcs;
The new fields are present if the ``Py_TPFLAGS_HAVE_LOCKEDBUFFER``
flag is set in the object's type.
The ``Py_TPFLAGS_HAVE_LOCKEDBUFFER`` flag implies the
``Py_TPFLAGS_HAVE_GETCHARBUFFER`` flag.
The ``acquirelockedreadbufferproc`` and ``acquirelockedwritebufferproc``
functions return the size in bytes of the memory block on success,
and fill in the passed void \* pointer on success. If these
functions fail - either because an error occurs or no memory block
is exposed - they must set the void \* pointer to NULL and raise an
exception. The return value is undefined in these cases and
should not be used.
If calls to these functions succeed, eventually the buffer must be
released by a call to the ``releaselockedbufferproc``, supplying the
original object as argument. The ``releaselockedbufferproc`` cannot
fail. For objects that actually maintain an internal lock count
it would be a fatal error if the ``releaselockedbufferproc`` function
would be called too often, leading to a negative lock count.
Similar to the 'old' buffer interface, any of these functions may
be set to NULL, but it is strongly recommended to implement the
``releaselockedbufferproc`` function (even if it does nothing) if any
of the ``acquireread``/``writelockedbufferproc`` functions are
implemented, to discourage extension writers from checking for a
NULL value and not calling it.
These functions aren't supposed to be called directly, they are
called through convenience functions declared in
Include/abstract.h::
int PyObject_AcquireLockedReadBuffer(PyObject *obj,
const void **buffer,
size_t *buffer_len);
int PyObject_AcquireLockedWriteBuffer(PyObject *obj,
void **buffer,
size_t *buffer_len);
void PyObject_ReleaseLockedBuffer(PyObject *obj);
The former two functions return 0 on success, set buffer to the
memory location and buffer_len to the length of the memory block
in bytes. On failure, or if the locked buffer interface is not
implemented by obj, they return -1 and set an exception.
The latter function doesn't return anything, and cannot fail.
Backward Compatibility
======================
The size of the ``PyBufferProcs`` structure changes if this proposal
is implemented, but the type's ``tp_flags`` slot can be used to
determine if the additional fields are present.
Reference Implementation
========================
An implementation has been uploaded to the SourceForge patch
manager as http://www.python.org/sf/652857.
Additional Notes/Comments
=========================
Python strings, unicode strings, mmap objects, and array objects
would expose the locked buffer interface.
mmap and array objects would actually enter a locked state while
the buffer is active, this is not needed for strings and unicode
objects. Resizing locked array objects is not allowed and will
raise an exception. Whether closing a locked mmap object is an
error or will only be deferred until the lock count reaches zero
is an implementation detail.
Guido recommends
But I'm still very concerned that if most built-in types
(e.g. strings, bytes) don't implement the release
functionality, it's too easy for an extension to seem to work
while forgetting to release the buffer.
I recommend that at least some built-in types implement the
acquire/release functionality with a counter, and assert that
the counter is zero when the object is deleted -- if the
assert fails, someone DECREF'ed their reference to the object
without releasing it. (The rule should be that you must own a
reference to the object while you've acquired the object.)
For strings that might be impractical because the string
object would have to grow 4 bytes to hold the counter; but the
new bytes object (PEP 296) could easily implement the counter,
and the array object too -- that way there will be plenty of
opportunity to test proper use of the protocol.
Community Feedback
==================
Greg Ewing doubts the locked buffer interface is needed at all; he
thinks the normal buffer interface could be used if the pointer is
(re)fetched each time it's used. This seems to be dangerous,
because even innocent looking calls to the Python API like
``Py_DECREF()`` may trigger execution of arbitrary Python code.
The first version of this proposal didn't have the release
function, but it turned out that this would have been too
restrictive: mmap and array objects wouldn't have been able to
implement it, because mmap objects can be closed anytime if not
locked, and array objects could resize or reallocate the buffer.
This PEP will probably be rejected because nobody except the
author needs it.
References
==========
.. [1] The buffer interface
http://mail.python.org/pipermail/python-dev/2000-October/009974.html
.. [2] The Buffer Problem
http://www.python.org/dev/peps/pep-0296/
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:


@@ -5,230 +5,252 @@ Last-Modified: $Date$
Author: Mark Hammond <mhammond@skippinet.com.au>
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 05-Feb-2003
Post-History: 05-Feb-2003 14-Feb-2003 19-Apr-2003
Abstract
========
This PEP proposes a simplified API for access to the Global
Interpreter Lock (GIL) for Python extension modules.
Specifically, it provides a solution for authors of complex
multi-threaded extensions, where the current state of Python
(i.e., the state of the GIL) is unknown.
This PEP proposes a new API, for platforms built with threading
support, to manage the Python thread state. An implementation
strategy is proposed, along with an initial, platform independent
implementation.
Rationale
=========
The current Python interpreter state API is suitable for simple,
single-threaded extensions, but quickly becomes incredibly complex
for non-trivial, multi-threaded extensions.
Currently Python provides two mechanisms for dealing with the GIL:
- ``Py_BEGIN_ALLOW_THREADS`` and ``Py_END_ALLOW_THREADS`` macros.
These macros are provided primarily to allow a simple Python
extension that already owns the GIL to temporarily release it
while making an "external" (ie, non-Python), generally
expensive, call. Any existing Python threads that are blocked
waiting for the GIL are then free to run. While this is fine
for extensions making calls from Python into the outside world,
it is no help for extensions that need to make calls into Python
when the thread state is unknown.
- ``PyThreadState`` and ``PyInterpreterState`` APIs.
These API functions allow an extension/embedded application to
acquire the GIL, but suffer from a serious boot-strapping
problem - they require you to know the state of the Python
interpreter and of the GIL before they can be used. One
particular problem is for extension authors that need to deal
with threads never before seen by Python, but need to call
Python from this thread. It is very difficult, delicate and
error prone to author an extension where these "new" threads
always know the exact state of the GIL, and therefore can
reliably interact with this API.
For these reasons, the question of how such extensions should
interact with Python is quickly becoming a FAQ. The main impetus
for this PEP, a thread on python-dev [1]_, immediately identified
the following projects with this exact issue:
- The win32all extensions
- Boost
- ctypes
- Python-GTK bindings
- Uno
- PyObjC
- Mac toolbox
- PyXPCOM
- The win32all extensions
- Boost
- ctypes
- Python-GTK bindings
- Uno
- PyObjC
- Mac toolbox
- PyXPCOM
Currently, there is no reasonable, portable solution to this
problem, forcing each extension author to implement their own
hand-rolled version. Further, the problem is complex, meaning
many implementations are likely to be incorrect, leading to a
variety of problems that will often manifest simply as "Python has
hung".
Currently, there is no reasonable, portable solution to this
problem, forcing each extension author to implement their own
hand-rolled version. Further, the problem is complex, meaning
many implementations are likely to be incorrect, leading to a
variety of problems that will often manifest simply as "Python has
hung".
While the biggest problem in the existing thread-state API is the
lack of the ability to query the current state of the lock, it is
felt that a more complete, simplified solution should be offered
to extension authors. Such a solution should encourage authors to
provide error-free, complex extension modules that take full
advantage of Python's threading mechanisms.
While the biggest problem in the existing thread-state API is the
lack of the ability to query the current state of the lock, it is
felt that a more complete, simplified solution should be offered
to extension authors. Such a solution should encourage authors to
provide error-free, complex extension modules that take full
advantage of Python's threading mechanisms.
Limitations and Exclusions
==========================
This proposal identifies a solution for extension authors with
complex multi-threaded requirements, but that only require a
single "PyInterpreterState". There is no attempt to cater for
extensions that require multiple interpreter states. At the time
of writing, no extension has been identified that requires
multiple PyInterpreterStates, and indeed it is not clear if that
facility works correctly in Python itself.
This proposal identifies a solution for extension authors with
complex multi-threaded requirements, but that only require a
single "PyInterpreterState". There is no attempt to cater for
extensions that require multiple interpreter states. At the time
of writing, no extension has been identified that requires
multiple PyInterpreterStates, and indeed it is not clear if that
facility works correctly in Python itself.
This API will not perform automatic initialization of Python, or
initialize Python for multi-threaded operation. Extension authors
must continue to call Py_Initialize(), and for multi-threaded
applications, PyEval_InitThreads(). The reason for this is that
the first thread to call PyEval_InitThreads() is nominated as the
"main thread" by Python, and so forcing the extension author to
specify the main thread (by forcing her to make this first call)
removes ambiguity. As Py_Initialize() must be called before
PyEval_InitThreads(), and as both of these functions currently
support being called multiple times, the burden this places on
extension authors is considered reasonable.
This API will not perform automatic initialization of Python, or
initialize Python for multi-threaded operation. Extension authors
must continue to call ``Py_Initialize()``, and for multi-threaded
applications, ``PyEval_InitThreads()``. The reason for this is that
the first thread to call ``PyEval_InitThreads()`` is nominated as the
"main thread" by Python, and so forcing the extension author to
specify the main thread (by forcing her to make this first call)
removes ambiguity. As ``Py_Initialize()`` must be called before
``PyEval_InitThreads()``, and as both of these functions currently
support being called multiple times, the burden this places on
extension authors is considered reasonable.
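For illustration, an embedding application (or the first call into an
extension) might perform the required initialization like this; the
function name ``init_python_for_threads`` is hypothetical and error
handling is omitted::

    #include <Python.h>

    /* Sketch: one-time setup, performed on the thread that is to be
       nominated as Python's "main thread".  Both calls may safely be
       repeated. */
    void init_python_for_threads(void)
    {
        Py_Initialize();       /* initialize the interpreter              */
        PyEval_InitThreads();  /* create the GIL; this thread now owns it */
    }

Note that after ``PyEval_InitThreads()`` the calling thread holds the
GIL, so an embedding application that goes on to do non-Python work
would typically release it again (for example via
``PyEval_SaveThread()``) before other threads use the API proposed
below.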
It is intended that this API be all that is necessary to acquire
the Python GIL. Apart from the existing, standard
Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS macros, it is
assumed that no additional thread state API functions will be used
by the extension. Extensions with such complicated requirements
are free to continue to use the existing thread state API.
It is intended that this API be all that is necessary to acquire
the Python GIL. Apart from the existing, standard
``Py_BEGIN_ALLOW_THREADS`` and ``Py_END_ALLOW_THREADS`` macros, it is
assumed that no additional thread state API functions will be used
by the extension. Extensions with such complicated requirements
are free to continue to use the existing thread state API.
Proposal
========
This proposal recommends a new API be added to Python to simplify
the management of the GIL. This API will be available on all
platforms built with WITH_THREAD defined.
This proposal recommends a new API be added to Python to simplify
the management of the GIL. This API will be available on all
platforms built with ``WITH_THREAD`` defined.
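Extension code that must also build against a Python compiled without
thread support could guard its use of the new API; the wrapper macros
below are purely hypothetical::

    /* Sketch: collapse the new calls to no-ops on a non-threaded build. */
    #ifdef WITH_THREAD
    #define MY_GIL_ENSURE(st)   PyGILState_STATE st = PyGILState_Ensure()
    #define MY_GIL_RELEASE(st)  PyGILState_Release(st)
    #else
    #define MY_GIL_ENSURE(st)   /* no GIL to manage */
    #define MY_GIL_RELEASE(st)  /* no GIL to manage */
    #endif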
The intent is that assuming Python has correctly been initialized,
an extension author be able to use a small, well-defined "prologue
dance", at any time and on any thread, which will ensure Python
is ready to be used on that thread. After the extension has
finished with Python, it must also perform an "epilogue dance" to
release any resources previously acquired. Ideally, these dances
can be expressed in a single line.
The intent is that assuming Python has correctly been initialized,
an extension author be able to use a small, well-defined "prologue
dance", at any time and on any thread, which will ensure Python
is ready to be used on that thread. After the extension has
finished with Python, it must also perform an "epilogue dance" to
release any resources previously acquired. Ideally, these dances
can be expressed in a single line.
Specifically, the following new APIs are proposed:
Specifically, the following new APIs are proposed::
/* Ensure that the current thread is ready to call the Python
C API, regardless of the current state of Python, or of its
thread lock. This may be called as many times as desired
by a thread so long as each call is matched with a call to
PyGILState_Release(). In general, other thread-state APIs may
be used between _Ensure() and _Release() calls, so long as the
thread-state is restored to its previous state before the Release().
For example, normal use of the Py_BEGIN_ALLOW_THREADS/
Py_END_ALLOW_THREADS macros is acceptable.
The return value is an opaque "handle" to the thread state when
PyGILState_Ensure() was called, and must be passed to
PyGILState_Release() to ensure Python is left in the same state. Even
though recursive calls are allowed, these handles can *not* be
shared - each unique call to PyGILState_Ensure must save the handle
for its call to PyGILState_Release.
When the function returns, the current thread will hold the GIL.
Failure is a fatal error.
*/
PyAPI_FUNC(PyGILState_STATE) PyGILState_Ensure(void);
/* Ensure that the current thread is ready to call the Python
C API, regardless of the current state of Python, or of its
thread lock. This may be called as many times as desired
by a thread so long as each call is matched with a call to
PyGILState_Release(). In general, other thread-state APIs may
be used between _Ensure() and _Release() calls, so long as the
thread-state is restored to its previous state before the Release().
For example, normal use of the Py_BEGIN_ALLOW_THREADS/
Py_END_ALLOW_THREADS macros is acceptable.
/* Release any resources previously acquired. After this call, Python's
state will be the same as it was prior to the corresponding
PyGILState_Ensure call (but generally this state will be unknown to
the caller, hence the use of the GILState API.)
Every call to PyGILState_Ensure must be matched by a call to
PyGILState_Release on the same thread.
*/
PyAPI_FUNC(void) PyGILState_Release(PyGILState_STATE);
The return value is an opaque "handle" to the thread state when
PyGILState_Ensure() was called, and must be passed to
PyGILState_Release() to ensure Python is left in the same state. Even
though recursive calls are allowed, these handles can *not* be
shared - each unique call to PyGILState_Ensure must save the handle
for its call to PyGILState_Release.
Common usage will be:
When the function returns, the current thread will hold the GIL.
void SomeCFunction(void)
{
/* ensure we hold the lock */
PyGILState_STATE state = PyGILState_Ensure();
/* Use the Python API */
...
/* Restore the state of Python */
PyGILState_Release(state);
}
Failure is a fatal error.
*/
PyAPI_FUNC(PyGILState_STATE) PyGILState_Ensure(void);
/* Release any resources previously acquired. After this call, Python's
state will be the same as it was prior to the corresponding
PyGILState_Ensure call (but generally this state will be unknown to
the caller, hence the use of the GILState API.)
Every call to PyGILState_Ensure must be matched by a call to
PyGILState_Release on the same thread.
*/
PyAPI_FUNC(void) PyGILState_Release(PyGILState_STATE);
Common usage will be::
void SomeCFunction(void)
{
/* ensure we hold the lock */
PyGILState_STATE state = PyGILState_Ensure();
/* Use the Python API */
...
/* Restore the state of Python */
PyGILState_Release(state);
}
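As a slightly fuller (and again hypothetical) illustration, a callback
fired by a foreign library on a thread Python has never seen could be
written as follows; ``g_callable`` is assumed to be a callable saved
earlier while the GIL was held::

    #include <Python.h>

    static PyObject *g_callable;    /* set elsewhere by the extension */

    /* Sketch: called on an arbitrary, possibly non-Python thread. */
    static void on_external_event(int code)
    {
        PyGILState_STATE state = PyGILState_Ensure();   /* prologue dance */

        PyObject *result = PyObject_CallFunction(g_callable, "i", code);
        if (result == NULL)
            PyErr_Print();       /* report, but do not propagate, errors */
        else
            Py_DECREF(result);

        PyGILState_Release(state);                      /* epilogue dance */
    }

Between the ``Ensure`` and ``Release`` calls the usual
``Py_BEGIN_ALLOW_THREADS``/``Py_END_ALLOW_THREADS`` pairing remains
legal, as the comments above note, provided the thread state is
restored before the ``Release``.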
Design and Implementation
=========================
The general operation of PyGILState_Ensure() will be:
- assert Python is initialized.
- Get a PyThreadState for the current thread, creating and saving
if necessary.
- remember the current state of the lock (owned/not owned)
- If the current state does not own the GIL, acquire it.
- Increment a counter for how many calls to
PyGILState_Ensure have been made on the current thread.
- return
The general operation of ``PyGILState_Ensure()`` will be:
The general operation of PyGILState_Release() will be:
- assert Python is initialized.
- assert our thread currently holds the lock.
- If old state indicates lock was previously unlocked, release GIL.
- Decrement the PyGILState_Ensure counter for the thread.
- If counter == 0:
- release and delete the PyThreadState.
- forget the ThreadState as being owned by the thread.
- return
- Get a ``PyThreadState`` for the current thread, creating and saving
if necessary.
It is assumed that it is an error if two discrete PyThreadStates
are used for a single thread. Comments in pystate.h ("State
unique per thread") support this view, although it is never
directly stated. Thus, this will require some implementation of
Thread Local Storage. Fortunately, a platform independent
implementation of Thread Local Storage already exists in the
Python source tree, in the SGI threading port. This code will be
integrated into the platform independent Python core, but in such
a way that platforms can provide a more optimal implementation if
desired.
- remember the current state of the lock (owned/not owned)
- If the current state does not own the GIL, acquire it.
- Increment a counter for how many calls to ``PyGILState_Ensure`` have been
made on the current thread.
- return
The general operation of ``PyGILState_Release()`` will be:
- assert our thread currently holds the lock.
- If old state indicates lock was previously unlocked, release GIL.
- Decrement the ``PyGILState_Ensure`` counter for the thread.
- If counter == 0:
- release and delete the ``PyThreadState``.
- forget the ``ThreadState`` as being owned by the thread.
- return
It is assumed that it is an error if two discrete ``PyThreadStates``
are used for a single thread. Comments in ``pystate.h`` ("State
unique per thread") support this view, although it is never
directly stated. Thus, this will require some implementation of
Thread Local Storage. Fortunately, a platform independent
implementation of Thread Local Storage already exists in the
Python source tree, in the SGI threading port. This code will be
integrated into the platform independent Python core, but in such
a way that platforms can provide a more optimal implementation if
desired.
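The following is a minimal, platform independent sketch of the
bookkeeping described above, not the actual implementation; the TLS
helper ``tls_get_record``, the lock query ``thread_holds_gil``, the
global ``main_interp``, the record layout and the
``PyGILState_LOCKED``/``PyGILState_UNLOCKED`` handle values are all
assumptions made purely for the sketch::

    #include <assert.h>
    #include <Python.h>

    typedef struct {
        PyThreadState *tstate;   /* this thread's state, if any      */
        int            count;    /* nesting level of _Ensure() calls */
    } GILStateRec;

    static PyInterpreterState *main_interp;   /* the single interpreter */

    /* Hypothetical helpers, standing in for real Thread Local Storage
       and for a reliable "does this thread own the GIL?" query. */
    static GILStateRec *tls_get_record(void);
    static int thread_holds_gil(PyThreadState *tstate);

    PyGILState_STATE
    PyGILState_Ensure(void)
    {
        GILStateRec *rec = tls_get_record();
        int was_locked;

        assert(Py_IsInitialized());          /* Python must be initialized */

        if (rec->tstate == NULL) {           /* a thread Python has not seen */
            rec->tstate = PyThreadState_New(main_interp);
            was_locked = 0;                  /* it cannot own the GIL yet */
        }
        else
            was_locked = thread_holds_gil(rec->tstate);

        if (!was_locked)
            PyEval_RestoreThread(rec->tstate);  /* acquire GIL, set state */

        rec->count++;
        return was_locked ? PyGILState_LOCKED : PyGILState_UNLOCKED;
    }

    void
    PyGILState_Release(PyGILState_STATE oldstate)
    {
        GILStateRec *rec = tls_get_record();

        assert(thread_holds_gil(rec->tstate));  /* we must hold the lock */

        if (--rec->count == 0) {
            PyThreadState_Clear(rec->tstate);
            PyThreadState_DeleteCurrent();      /* drops the GIL as well */
            rec->tstate = NULL;                 /* forget this thread's state */
        }
        else if (oldstate == PyGILState_UNLOCKED)
            PyEval_SaveThread();                /* restore "unlocked" state */
    }

The real implementation remains free to replace the hypothetical TLS
helper with the more optimal, platform specific mechanisms mentioned
above.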
Implementation
==============
An implementation of this proposal can be found at
http://www.python.org/sf/684256
An implementation of this proposal can be found at
http://www.python.org/sf/684256
References
==========
[1] http://mail.python.org/pipermail/python-dev/2002-December/031424.html
.. [1] David Abrahams, Extension modules, Threading, and the GIL
http://mail.python.org/pipermail/python-dev/2002-December/031424.html
Copyright
=========
This document has been placed in the public domain.
This document has been placed in the public domain.
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End: