merge
commit
1ed9a8a185
18
pep-0008.txt
|
@ -158,9 +158,21 @@ The preferred way of wrapping long lines is by using Python's implied
|
||||||
line continuation inside parentheses, brackets and braces. Long lines
|
line continuation inside parentheses, brackets and braces. Long lines
|
||||||
can be broken over multiple lines by wrapping expressions in
|
can be broken over multiple lines by wrapping expressions in
|
||||||
parentheses. These should be used in preference to using a backslash
|
parentheses. These should be used in preference to using a backslash
|
||||||
for line continuation. Make sure to indent the continued line
|
for line continuation.
|
||||||
appropriately. The preferred place to break around a binary operator
|
|
||||||
is *after* the operator, not before it. Some examples::
|
Backslashes may still be appropriate at times. For example, long,
|
||||||
|
multiple ``with``-statements cannot use implicit continuation, so
|
||||||
|
backslashes are acceptable::
|
||||||
|
|
||||||
|
with open('/path/to/some/file/you/want/to/read') as file_1, \
|
||||||
|
open('/path/to/some/file/being/written', 'w') as file_2:
|
||||||
|
file_2.write(file_1.read())
|
||||||
|
|
||||||
|
Another such case is with ``assert`` statements.
|
||||||
|
|
||||||
|
Make sure to indent the continued line appropriately. The preferred
|
||||||
|
place to break around a binary operator is *after* the operator, not
|
||||||
|
before it. Some examples::
|
||||||
|
|
||||||
class Rectangle(Blob):
|
class Rectangle(Blob):
|
||||||
|
|
||||||
|
|
23
pep-0315.txt
|
@ -4,7 +4,7 @@ Version: $Revision$
|
||||||
Last-Modified: $Date$
|
Last-Modified: $Date$
|
||||||
Author: Raymond Hettinger <python@rcn.com>
|
Author: Raymond Hettinger <python@rcn.com>
|
||||||
W Isaac Carroll <icarroll@pobox.com>
|
W Isaac Carroll <icarroll@pobox.com>
|
||||||
Status: Deferred
|
Status: Rejected
|
||||||
Type: Standards Track
|
Type: Standards Track
|
||||||
Content-Type: text/plain
|
Content-Type: text/plain
|
||||||
Created: 25-Apr-2003
|
Created: 25-Apr-2003
|
||||||
|
@ -21,19 +21,32 @@ Abstract
|
||||||
|
|
||||||
Notice
|
Notice
|
||||||
|
|
||||||
Deferred; see
|
Rejected; see
|
||||||
|
http://mail.python.org/pipermail/python-ideas/2013-June/021610.html
|
||||||
|
|
||||||
|
This PEP has been deferred since 2006; see
|
||||||
http://mail.python.org/pipermail/python-dev/2006-February/060718.html
|
http://mail.python.org/pipermail/python-dev/2006-February/060718.html
|
||||||
|
|
||||||
Subsequent efforts to revive the PEP in April 2009 did not
|
Subsequent efforts to revive the PEP in April 2009 did not
|
||||||
meet with success because no syntax emerged that could
|
meet with success because no syntax emerged that could
|
||||||
compete with a while-True and an inner if-break.
|
compete with the following form:
|
||||||
|
|
||||||
A syntax was found for a basic do-while loop but it found
|
while True:
|
||||||
had little support because the condition was at the top:
|
<setup code>
|
||||||
|
if not <condition>:
|
||||||
|
break
|
||||||
|
<loop body>
|
||||||
|
|
||||||
|
A syntax alternative to the one proposed in the PEP was found for
|
||||||
|
a basic do-while loop but it gained little support because the
|
||||||
|
condition was at the top:
|
||||||
|
|
||||||
do ... while <cond>:
|
do ... while <cond>:
|
||||||
<loop body>
|
<loop body>
|
||||||
|
|
||||||
|
Users of the language are advised to use the while-True form with
|
||||||
|
an inner if-break when a do-while loop would have been appropriate.
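As an illustration of the recommended form, the following is a minimal sketch of a loop written with ``while True`` and an inner ``if``/``break``. The function name ``collect_until_sentinel`` and its arguments are hypothetical and were chosen only for this example; the comments map each step to the placeholders used above.

```python
def collect_until_sentinel(items, sentinel):
    """Collect values from an iterable until the sentinel value appears."""
    iterator = iter(items)
    collected = []
    while True:
        # <setup code>: fetch the next value before testing the condition.
        value = next(iterator, sentinel)
        # if not <condition>: break
        if value == sentinel:
            break
        # <loop body>
        collected.append(value)
    return collected
```

The setup code always runs at least once, which is the behaviour a do-while loop would have provided.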
|
||||||
|
|
||||||
|
|
||||||
Motivation
|
Motivation
|
||||||
|
|
||||||
|
|
1847
pep-0426.txt
File diff suppressed because it is too large
|
@ -0,0 +1,249 @@
|
||||||
|
{
|
||||||
|
"id": "http://www.python.org/dev/peps/pep-0426/",
|
||||||
|
"$schema": "http://json-schema.org/draft-04/schema#",
|
||||||
|
"title": "Metadata for Python Software Packages 2.0",
|
||||||
|
"type": "object",
|
||||||
|
"properties": {
|
||||||
|
"metadata_version": {
|
||||||
|
"description": "Version of the file format",
|
||||||
|
"type": "string",
|
||||||
|
"pattern": "^(\\d+(\\.\\d+)*)$"
|
||||||
|
},
|
||||||
|
"generator": {
|
||||||
|
"description": "Name and version of the program that produced this file.",
|
||||||
|
"type": "string",
|
||||||
|
"pattern": "^[0-9A-Za-z]([0-9A-Za-z_.-]*[0-9A-Za-z])( \\((\\d+(\\.\\d+)*)((a|b|c|rc)(\\d+))?(\\.(post)(\\d+))?(\\.(dev)(\\d+))\\))?$"
|
||||||
|
},
|
||||||
|
"name": {
|
||||||
|
"description": "The name of the distribution.",
|
||||||
|
"type": "string",
|
||||||
|
"pattern": "^[0-9A-Za-z]([0-9A-Za-z_.-]*[0-9A-Za-z])?$"
|
||||||
|
},
|
||||||
|
"version": {
|
||||||
|
"description": "The distribution's public version identifier",
|
||||||
|
"type": "string",
|
||||||
|
"pattern": "^(\\d+(\\.\\d+)*)((a|b|c|rc)(\\d+))?(\\.(post)(\\d+))?(\\.(dev)(\\d+))?$"
|
||||||
|
},
|
||||||
|
"source_label": {
|
||||||
|
"description": "A constrained identifying text string",
|
||||||
|
"type": "string",
|
||||||
|
"pattern": "^[0-9a-z_.+-]+$"
|
||||||
|
},
|
||||||
|
"source_url": {
|
||||||
|
"description": "A string containing a full URL where the source for this specific version of the distribution can be downloaded.",
|
||||||
|
"type": "string",
|
||||||
|
"format": "uri"
|
||||||
|
},
|
||||||
|
"summary": {
|
||||||
|
"description": "A one-line summary of what the distribution does.",
|
||||||
|
"type": "string"
|
||||||
|
},
|
||||||
|
"document_names": {
|
||||||
|
"description": "Names of supporting metadata documents",
|
||||||
|
"type": "object",
|
||||||
|
"properties": {
|
||||||
|
"description": {
|
||||||
|
"type": "string",
|
||||||
|
"$ref": "#/definitions/document_name"
|
||||||
|
},
|
||||||
|
"changelog": {
|
||||||
|
"type": "string",
|
||||||
|
"$ref": "#/definitions/document_name"
|
||||||
|
},
|
||||||
|
"license": {
|
||||||
|
"type": "string",
|
||||||
|
"$ref": "#/definitions/document_name"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"additionalProperties": false
|
||||||
|
},
|
||||||
|
"keywords": {
|
||||||
|
"description": "A list of additional keywords to be used to assist searching for the distribution in a larger catalog.",
|
||||||
|
"type": "array",
|
||||||
|
"items": {
|
||||||
|
"type": "string"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"license": {
|
||||||
|
"description": "A string indicating the license covering the distribution.",
|
||||||
|
"type": "string"
|
||||||
|
},
|
||||||
|
"classifiers": {
|
||||||
|
"description": "A list of strings, with each giving a single classification value for the distribution.",
|
||||||
|
"type": "array",
|
||||||
|
"items": {
|
||||||
|
"type": "string"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"contacts": {
|
||||||
|
"description": "A list of contributor entries giving the recommended contact points for getting more information about the project.",
|
||||||
|
"type": "array",
|
||||||
|
"items": {
|
||||||
|
"type": "object",
|
||||||
|
"$ref": "#/definitions/contact"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"contributors": {
|
||||||
|
"description": "A list of contributor entries for other contributors not already listed as current project points of contact.",
|
||||||
|
"type": "array",
|
||||||
|
"items": {
|
||||||
|
"type": "object",
|
||||||
|
"$ref": "#/definitions/contact"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"project_urls": {
|
||||||
|
"description": "A mapping of arbitrary text labels to additional URLs relevant to the project.",
|
||||||
|
"type": "object"
|
||||||
|
},
|
||||||
|
"extras": {
|
||||||
|
"description": "A list of optional sets of dependencies that may be used to define conditional dependencies in \"may_require\" and similar fields.",
|
||||||
|
"type": "array",
|
||||||
|
"items": {
|
||||||
|
"type": "string",
|
||||||
|
"$ref": "#/definitions/extra_name"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"distributes": {
|
||||||
|
"description": "A list of subdistributions made available through this metadistribution.",
|
||||||
|
"type": "array",
|
||||||
|
"$ref": "#/definitions/dependencies"
|
||||||
|
},
|
||||||
|
"may_distribute": {
|
||||||
|
"description": "A list of subdistributions that may be made available through this metadistribution, based on the extras requested and the target deployment environment.",
|
||||||
|
"$ref": "#/definitions/conditional_dependencies"
|
||||||
|
},
|
||||||
|
"run_requires": {
|
||||||
|
"description": "A list of other distributions needed to run this distribution.",
|
||||||
|
"type": "array",
|
||||||
|
"$ref": "#/definitions/dependencies"
|
||||||
|
},
|
||||||
|
"run_may_require": {
|
||||||
|
"description": "A list of other distributions that may be needed when this distribution is deployed, based on the extras requested and the target deployment environment.",
|
||||||
|
"$ref": "#/definitions/conditional_dependencies"
|
||||||
|
},
|
||||||
|
"test_requires": {
|
||||||
|
"description": "A list of other distributions needed when this distribution is tested.",
|
||||||
|
"type": "array",
|
||||||
|
"$ref": "#/definitions/dependencies"
|
||||||
|
},
|
||||||
|
"test_may_require": {
|
||||||
|
"description": "A list of other distributions that may be needed when this distribution is tested, based on the extras requested and the target deployment environment.",
|
||||||
|
"type": "array",
|
||||||
|
"$ref": "#/definitions/conditional_dependencies"
|
||||||
|
},
|
||||||
|
"build_requires": {
|
||||||
|
"description": "A list of other distributions needed when this distribution is built.",
|
||||||
|
"type": "array",
|
||||||
|
"$ref": "#/definitions/dependencies"
|
||||||
|
},
|
||||||
|
"build_may_require": {
|
||||||
|
"description": "A list of other distributions that may be needed when this distribution is built, based on the extras requested and the target deployment environment.",
|
||||||
|
"type": "array",
|
||||||
|
"$ref": "#/definitions/conditional_dependencies"
|
||||||
|
},
|
||||||
|
"dev_requires": {
|
||||||
|
"description": "A list of other distributions needed when this distribution is developed.",
|
||||||
|
"type": "array",
|
||||||
|
"$ref": "#/definitions/dependencies"
|
||||||
|
},
|
||||||
|
"dev_may_require": {
|
||||||
|
"description": "A list of other distributions that may be needed when this distribution is developed, based on the extras requested and the target deployment environment.",
|
||||||
|
"type": "array",
|
||||||
|
"$ref": "#/definitions/conditional_dependencies"
|
||||||
|
},
|
||||||
|
"provides": {
|
||||||
|
"description": "A list of strings naming additional dependency requirements that are satisfied by installing this distribution. These strings must be of the form Name or Name (Version), as for the requires field.",
|
||||||
|
"type": "array",
|
||||||
|
"items": {
|
||||||
|
"type": "string"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"obsoleted_by": {
|
||||||
|
"description": "A string that indicates that this project is no longer being developed. The named project provides a substitute or replacement.",
|
||||||
|
"type": "string",
|
||||||
|
"$ref": "#/definitions/version_specifier"
|
||||||
|
},
|
||||||
|
"supports_environments": {
|
||||||
|
"description": "A list of strings specifying the environments that the distribution explicitly supports.",
|
||||||
|
"type": "array",
|
||||||
|
"items": {
|
||||||
|
"type": "string",
|
||||||
|
"$ref": "#/definitions/environment_marker"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"metabuild_hooks": {
|
||||||
|
"description": "The metabuild_hooks field is used to define various operations that may be invoked on a distribution in a platform independent manner.",
|
||||||
|
"type": "object"
|
||||||
|
},
|
||||||
|
"extensions": {
|
||||||
|
"description": "Extensions to the metadata may be present in a mapping under the 'extensions' key.",
|
||||||
|
"type": "object"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
|
||||||
|
"required": ["metadata_version", "name", "version"],
|
||||||
|
"additionalProperties": false,
|
||||||
|
|
||||||
|
"definitions": {
|
||||||
|
"contact": {
|
||||||
|
"type": "object",
|
||||||
|
"properties": {
|
||||||
|
"name": {
|
||||||
|
"type": "string"
|
||||||
|
},
|
||||||
|
"email": {
|
||||||
|
"type": "string"
|
||||||
|
},
|
||||||
|
"url": {
|
||||||
|
"type": "string"
|
||||||
|
},
|
||||||
|
"role": {
|
||||||
|
"type": "string"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"required": ["name"],
|
||||||
|
"additionalProperties": false
|
||||||
|
},
|
||||||
|
"dependencies": {
|
||||||
|
"type": "array",
|
||||||
|
"items": {
|
||||||
|
"type": "string",
|
||||||
|
"$ref": "#/definitions/version_specifier"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"conditional_dependencies": {
|
||||||
|
"type": "array",
|
||||||
|
"items": {
|
||||||
|
"type": "object",
|
||||||
|
"properties": {
|
||||||
|
"extra": {
|
||||||
|
"type": "string",
|
||||||
|
"$ref": "#/definitions/extra_name"
|
||||||
|
},
|
||||||
|
"environment": {
|
||||||
|
"type": "string",
|
||||||
|
"$ref": "#/definitions/environment_marker"
|
||||||
|
},
|
||||||
|
"dependencies": {
|
||||||
|
"type": "array",
|
||||||
|
"$ref": "#/definitions/dependencies"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"required": ["dependencies"],
|
||||||
|
"additionalProperties": false
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"version_specifier": {
|
||||||
|
"type": "string"
|
||||||
|
},
|
||||||
|
"extra_name": {
|
||||||
|
"type": "string"
|
||||||
|
},
|
||||||
|
"environment_marker": {
|
||||||
|
"type": "string"
|
||||||
|
},
|
||||||
|
"document_name": {
|
||||||
|
"type": "string"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
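To show how one of the ``pattern`` constraints in this schema might be exercised, here is a small sketch that checks candidate strings against the ``version`` pattern using only the standard ``re`` module. The pattern is copied from the schema with the JSON string escaping removed; the function name ``is_valid_public_version`` is an assumption made for the example, not part of the schema or of any tool.

```python
import re

# Pattern taken from the "version" property of the schema, with the
# JSON-level double backslashes reduced to single backslashes.
VERSION_PATTERN = re.compile(
    r"^(\d+(\.\d+)*)((a|b|c|rc)(\d+))?(\.(post)(\d+))?(\.(dev)(\d+))?$"
)


def is_valid_public_version(text):
    """Return True if `text` matches the schema's version pattern."""
    return VERSION_PATTERN.match(text) is not None
```

A full validator would apply the whole schema with a JSON Schema library; this sketch covers the single field only.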
|
|
@ -5,7 +5,7 @@ Last-Modified: $Date$
|
||||||
Author: Barry Warsaw <barry@python.org>,
|
Author: Barry Warsaw <barry@python.org>,
|
||||||
Eli Bendersky <eliben@gmail.com>,
|
Eli Bendersky <eliben@gmail.com>,
|
||||||
Ethan Furman <ethan@stoneleaf.us>
|
Ethan Furman <ethan@stoneleaf.us>
|
||||||
Status: Accepted
|
Status: Final
|
||||||
Type: Standards Track
|
Type: Standards Track
|
||||||
Content-Type: text/x-rst
|
Content-Type: text/x-rst
|
||||||
Created: 2013-02-23
|
Created: 2013-02-23
|
||||||
|
@ -467,6 +467,10 @@ assignment to ``Animal`` is equivalent to::
|
||||||
... cat = 3
|
... cat = 3
|
||||||
... dog = 4
|
... dog = 4
|
||||||
|
|
||||||
|
The reason for defaulting to ``1`` as the starting number and not ``0`` is
|
||||||
|
that ``0`` is ``False`` in a boolean sense, but enum members all evaluate
|
||||||
|
to ``True``.
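The behaviour described above can be checked with a short sketch, assuming the ``enum`` module as shipped in the standard library (Python 3.4 and later). The ``Switch`` class is a hypothetical example added here to show that a plain ``Enum`` member stays truthy even when its value is ``0``.

```python
from enum import Enum

# Functional API: values are assigned automatically, starting at 1.
Animal = Enum('Animal', 'ant bee cat dog')

first_value = Animal.ant.value
all_truthy = all(bool(member) for member in Animal)


# Hypothetical enum with an explicit 0 value: the member is still truthy,
# because plain Enum members do not take their truth value from `value`.
class Switch(Enum):
    off = 0
    on = 1
```

Enums that mix in another type, such as ``IntEnum``, follow the truth rules of the mixed-in type instead.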
|
||||||
|
|
||||||
|
|
||||||
Proposed variations
|
Proposed variations
|
||||||
===================
|
===================
|
||||||
|
|
323
pep-0440.txt
|
@ -9,7 +9,7 @@ Status: Draft
|
||||||
Type: Standards Track
|
Type: Standards Track
|
||||||
Content-Type: text/x-rst
|
Content-Type: text/x-rst
|
||||||
Created: 18 Mar 2013
|
Created: 18 Mar 2013
|
||||||
Post-History: 30 Mar 2013, 27-May-2013
|
Post-History: 30 Mar 2013, 27 May 2013, 20 Jun 2013
|
||||||
Replaces: 386
|
Replaces: 386
|
||||||
|
|
||||||
|
|
||||||
|
@ -27,7 +27,7 @@ standardised approach to versioning, as described in PEP 345 and PEP 386.
|
||||||
This PEP was broken out of the metadata 2.0 specification in PEP 426.
|
This PEP was broken out of the metadata 2.0 specification in PEP 426.
|
||||||
|
|
||||||
Unlike PEP 426, the notes that remain in this document are intended as
|
Unlike PEP 426, the notes that remain in this document are intended as
|
||||||
part of the final specification.
|
part of the final specification (except for this one).
|
||||||
|
|
||||||
|
|
||||||
Definitions
|
Definitions
|
||||||
|
@ -40,7 +40,7 @@ document are to be interpreted as described in RFC 2119.
|
||||||
The following terms are to be interpreted as described in PEP 426:
|
The following terms are to be interpreted as described in PEP 426:
|
||||||
|
|
||||||
* "Distributions"
|
* "Distributions"
|
||||||
* "Versions"
|
* "Releases"
|
||||||
* "Build tools"
|
* "Build tools"
|
||||||
* "Index servers"
|
* "Index servers"
|
||||||
* "Publication tools"
|
* "Publication tools"
|
||||||
|
@ -52,9 +52,13 @@ The following terms are to be interpreted as described in PEP 426:
|
||||||
Version scheme
|
Version scheme
|
||||||
==============
|
==============
|
||||||
|
|
||||||
Distribution versions are identified by both a public version identifier,
|
Distributions are identified by a public version identifier which
|
||||||
which supports all defined version comparison operations, and a build
|
supports all defined version comparison operations.
|
||||||
label, which supports only strict equality comparisons.
|
|
||||||
|
Distributions may also define a source label, which is not used by
|
||||||
|
automated tools. Source labels are useful when a project internal
|
||||||
|
versioning scheme requires translation to create a compliant public
|
||||||
|
version identifier.
|
||||||
|
|
||||||
The version scheme is used both to describe the distribution version
|
The version scheme is used both to describe the distribution version
|
||||||
provided by a particular distribution archive, as well as to place
|
provided by a particular distribution archive, as well as to place
|
||||||
|
@ -84,7 +88,7 @@ Public version identifiers are separated into up to four segments:
|
||||||
* Post-release segment: ``.postN``
|
* Post-release segment: ``.postN``
|
||||||
* Development release segment: ``.devN``
|
* Development release segment: ``.devN``
|
||||||
|
|
||||||
Any given version will be a "release", "pre-release", "post-release" or
|
Any given release will be a "final release", "pre-release", "post-release" or
|
||||||
"developmental release" as defined in the following sections.
|
"developmental release" as defined in the following sections.
|
||||||
|
|
||||||
.. note::
|
.. note::
|
||||||
|
@ -105,28 +109,37 @@ Source labels
|
||||||
Source labels are text strings with minimal defined semantics.
|
Source labels are text strings with minimal defined semantics.
|
||||||
|
|
||||||
To ensure source labels can be readily incorporated as part of file names
|
To ensure source labels can be readily incorporated as part of file names
|
||||||
and URLs, they MUST be comprised of only ASCII alphanumerics, plus signs,
|
and URLs, and to avoid formatting inconsistencies in hexadecimal hash
|
||||||
periods and hyphens.
|
representations they MUST be limited to the following set of permitted
|
||||||
|
characters:
|
||||||
|
|
||||||
In addition, source labels MUST be unique within a given distribution.
|
* Lowercase ASCII letters (``[a-z]``)
|
||||||
|
* ASCII digits (``[0-9]``)
|
||||||
|
* underscores (``_``)
|
||||||
|
* hyphens (``-``)
|
||||||
|
* periods (``.``)
|
||||||
|
* plus signs (``+``)
|
||||||
|
|
||||||
As with distribution names, all comparisons of source labels MUST be case
|
Source labels MUST start and end with an ASCII letter or digit.
|
||||||
insensitive.
|
|
||||||
|
Source labels MUST be unique within each project and MUST NOT match any
|
||||||
|
defined version for the project.
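The character rules for source labels can be sketched as a single regular expression. This is an illustrative check only: the name ``is_valid_source_label`` is hypothetical, and the uniqueness requirements (unique within the project, not matching a defined version) need project data and are not covered here.

```python
import re

# A letter or digit at both ends; the permitted punctuation may only
# appear in between. Uppercase letters are not in the permitted set.
SOURCE_LABEL = re.compile(r"[a-z0-9](?:[a-z0-9_.+-]*[a-z0-9])?")


def is_valid_source_label(label):
    """Return True if `label` uses only permitted characters and
    starts and ends with an ASCII letter or digit."""
    return SOURCE_LABEL.fullmatch(label) is not None
```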
|
||||||
|
|
||||||
|
|
||||||
Releases
|
Final releases
|
||||||
--------
|
--------------
|
||||||
|
|
||||||
A version identifier that consists solely of a release segment is termed
|
A version identifier that consists solely of a release segment is
|
||||||
a "release".
|
termed a "final release".
|
||||||
|
|
||||||
The release segment consists of one or more non-negative integer values,
|
The release segment consists of one or more non-negative integer
|
||||||
separated by dots::
|
values, separated by dots::
|
||||||
|
|
||||||
N[.N]+
|
N[.N]+
|
||||||
|
|
||||||
Releases within a project will typically be numbered in a consistently
|
Final releases within a project MUST be numbered in a consistently
|
||||||
increasing fashion.
|
increasing fashion, otherwise automated tools will not be able to upgrade
|
||||||
|
them correctly.
|
||||||
|
|
||||||
Comparison and ordering of release segments considers the numeric value
|
Comparison and ordering of release segments considers the numeric value
|
||||||
of each component of the release segment in turn. When comparing release
|
of each component of the release segment in turn. When comparing release
|
||||||
|
@ -157,8 +170,8 @@ For example::
|
||||||
2.0
|
2.0
|
||||||
2.0.1
|
2.0.1
|
||||||
|
|
||||||
A release series is any set of release numbers that start with a common
|
A release series is any set of final release numbers that start with a
|
||||||
prefix. For example, ``3.3.1``, ``3.3.5`` and ``3.3.9.45`` are all
|
common prefix. For example, ``3.3.1``, ``3.3.5`` and ``3.3.9.45`` are all
|
||||||
part of the ``3.3`` release series.
|
part of the ``3.3`` release series.
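A release series test can be sketched by comparing parsed release components rather than raw strings, so that ``3.30.1`` is not mistaken for part of the ``3.3`` series. The helper names below are assumptions for the example, and the sketch handles final release numbers only.

```python
def release_tuple(version):
    """Parse a final release number such as '3.3.9.45' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def in_release_series(version, series):
    """Return True if `version` starts with the release prefix `series`."""
    prefix = release_tuple(series)
    return release_tuple(version)[:len(prefix)] == prefix
```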
|
||||||
|
|
||||||
.. note::
|
.. note::
|
||||||
|
@ -206,8 +219,8 @@ of both ``c`` and ``rc`` releases for a common release segment.
|
||||||
Post-releases
|
Post-releases
|
||||||
-------------
|
-------------
|
||||||
|
|
||||||
Some projects use post-releases to address minor errors in a release that
|
Some projects use post-releases to address minor errors in a final release
|
||||||
do not affect the distributed software (for example, correcting an error
|
that do not affect the distributed software (for example, correcting an error
|
||||||
in the release notes).
|
in the release notes).
|
||||||
|
|
||||||
If used as part of a project's development cycle, these post-releases are
|
If used as part of a project's development cycle, these post-releases are
|
||||||
|
@ -371,7 +384,7 @@ are permitted and MUST be ordered as shown::
|
||||||
.devN, aN, bN, cN, rcN, <no suffix>, .postN
|
.devN, aN, bN, cN, rcN, <no suffix>, .postN
|
||||||
|
|
||||||
Note that `rc` will always sort after `c` (regardless of the numeric
|
Note that `rc` will always sort after `c` (regardless of the numeric
|
||||||
component) although they are semantically equivalent. Tools are free to
|
component) although they are semantically equivalent. Tools MAY
|
||||||
reject this case as ambiguous and remain in compliance with the PEP.
|
reject this case as ambiguous and remain in compliance with the PEP.
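The suffix ordering shown above can be expressed as a sort key. The sketch below is deliberately limited: it assumes at most one suffix per version and does not zero-pad release segments, so it is only meaningful for versions that share the same release segment. The names ``SUFFIX_RANK`` and ``sort_key`` are hypothetical.

```python
import re

# Rank of each suffix kind, in the order listed above.
SUFFIX_RANK = {".dev": 0, "a": 1, "b": 2, "c": 3, "rc": 4, "": 5, ".post": 6}

# Release segment, optionally followed by exactly one suffix and its number.
VERSION = re.compile(r"(\d+(?:\.\d+)*)(?:(\.dev|a|b|c|rc|\.post)(\d+))?")


def sort_key(version):
    """Build a key that orders versions sharing one release segment."""
    match = VERSION.fullmatch(version)
    if match is None:
        raise ValueError("unsupported version: %r" % version)
    release, suffix, number = match.groups()
    release_key = tuple(int(part) for part in release.split("."))
    return (release_key, SUFFIX_RANK[suffix or ""], int(number or 0))
```

Because the suffix rank is compared before the numeric component, ``1.0rc1`` sorts after ``1.0c2``, matching the note about ``rc`` and ``c``.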
|
||||||
|
|
||||||
Within an alpha (``1.0a1``), beta (``1.0b1``), or release candidate
|
Within an alpha (``1.0a1``), beta (``1.0b1``), or release candidate
|
||||||
|
@ -506,6 +519,22 @@ numbering based on API compatibility, as well as triggering more appropriate
|
||||||
version comparison semantics.
|
version comparison semantics.
|
||||||
|
|
||||||
|
|
||||||
|
Olson database versioning
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
The ``pytz`` project inherits its versioning scheme from the corresponding
|
||||||
|
Olson timezone database versioning scheme: the year followed by a lowercase
|
||||||
|
character indicating the version of the database within that year.
|
||||||
|
|
||||||
|
This can be translated to a compliant 3-part version identifier as
|
||||||
|
``0.<year>.<serial>``, where the serial starts at zero (for the '<year>a'
|
||||||
|
release) and is incremented with each subsequent database update within the
|
||||||
|
year.
|
||||||
|
|
||||||
|
As with other translated version identifiers, the corresponding Olson
|
||||||
|
database version would be recorded in the source label field.
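The translation described above might look like the following sketch. It assumes a four-digit year followed by a single lowercase letter, and that every database update within a year receives the next letter, so the serial equals the letter's position in the alphabet. The function name is hypothetical.

```python
import re


def olson_to_public_version(olson_version):
    """Translate an Olson-style version such as '2013c' into '0.2013.2'."""
    match = re.fullmatch(r"(\d{4})([a-z])", olson_version)
    if match is None:
        raise ValueError("not an Olson-style version: %r" % olson_version)
    year, letter = match.groups()
    serial = ord(letter) - ord("a")  # 'a' -> 0, 'b' -> 1, ...
    return "0.%s.%d" % (year, serial)
```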
|
||||||
|
|
||||||
|
|
||||||
Version specifiers
|
Version specifiers
|
||||||
==================
|
==================
|
||||||
|
|
||||||
|
@ -521,7 +550,6 @@ clause:
|
||||||
* ``~=``: `Compatible release`_ clause
|
* ``~=``: `Compatible release`_ clause
|
||||||
* ``==``: `Version matching`_ clause
|
* ``==``: `Version matching`_ clause
|
||||||
* ``!=``: `Version exclusion`_ clause
|
* ``!=``: `Version exclusion`_ clause
|
||||||
* ``is``: `Source reference`_ clause
|
|
||||||
* ``<=``, ``>=``: `Inclusive ordered comparison`_ clause
|
* ``<=``, ``>=``: `Inclusive ordered comparison`_ clause
|
||||||
* ``<``, ``>``: `Exclusive ordered comparison`_ clause
|
* ``<``, ``>``: `Exclusive ordered comparison`_ clause
|
||||||
|
|
||||||
|
@ -605,6 +633,11 @@ version. The *only* substitution performed is the zero padding of the
|
||||||
release segment to ensure the release segments are compared with the same
|
release segment to ensure the release segments are compared with the same
|
||||||
length.
|
length.
|
||||||
|
|
||||||
|
Whether or not strict version matching is appropriate depends on the specific
|
||||||
|
use case for the version specifier. Automated tools SHOULD at least issue
|
||||||
|
warnings and MAY reject them entirely when strict version matches are used
|
||||||
|
inappropriately.
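The zero padding used by strict version matching can be sketched as follows. This example handles purely numeric release segments only (no pre-release, post-release or development suffixes), and the helper names are assumptions made for illustration.

```python
def padded_release(version, length):
    """Parse a numeric release segment and pad it with zeros to `length`."""
    parts = [int(part) for part in version.split(".")]
    return tuple(parts + [0] * (length - len(parts)))


def strictly_equal(candidate, specified):
    """Strict comparison after padding both release segments to equal length."""
    length = max(candidate.count("."), specified.count(".")) + 1
    return padded_release(candidate, length) == padded_release(specified, length)
```

With this padding, ``1.1`` and ``1.1.0`` compare equal, while ``1.1.1`` does not match ``1.1``.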
|
||||||
|
|
||||||
Prefix matching may be requested instead of strict comparison, by appending
|
Prefix matching may be requested instead of strict comparison, by appending
|
||||||
a trailing ``.*`` to the version identifier in the version matching clause.
|
a trailing ``.*`` to the version identifier in the version matching clause.
|
||||||
This means that additional trailing segments will be ignored when
|
This means that additional trailing segments will be ignored when
|
||||||
|
@ -645,75 +678,6 @@ match or not as shown::
|
||||||
!= 1.1.* # Same prefix, so 1.1.post1 does not match clause
|
!= 1.1.* # Same prefix, so 1.1.post1 does not match clause
|
||||||
|
|
||||||
|
|
||||||
Source reference
|
|
||||||
----------------
|
|
||||||
|
|
||||||
A source reference includes the source reference operator ``is`` and
|
|
||||||
a source label or a source URL.
|
|
||||||
|
|
||||||
Installation tools MAY also permit direct references to a platform
|
|
||||||
appropriate binary archive in a source reference clause.
|
|
||||||
|
|
||||||
Publication tools and public index servers SHOULD NOT permit direct
|
|
||||||
references to a platform appropriate binary archive in a source
|
|
||||||
reference clause.
|
|
||||||
|
|
||||||
Source label matching works solely on strict equality comparisons: the
|
|
||||||
candidate source label must be exactly the same as the source label in the
|
|
||||||
version clause for the clause to match the candidate distribution.
|
|
||||||
|
|
||||||
For example, a source reference could be used to depend directly on a
|
|
||||||
version control hash based identifier rather than the translated public
|
|
||||||
version::
|
|
||||||
|
|
||||||
exact-dependency (is 1.3.7+build.11.e0f985a)
|
|
||||||
|
|
||||||
A source URL is distinguished from a source label by the presence of
|
|
||||||
``:`` and ``/`` characters in the source reference. As these characters
|
|
||||||
are not permitted in source labels, they indicate that the reference uses
|
|
||||||
a source URL.
|
|
||||||
|
|
||||||
Some appropriate targets for a source URL are a source tarball, an sdist
|
|
||||||
archive or a direct reference to a tag or specific commit in an online
|
|
||||||
version control system. The exact URLs and
|
|
||||||
targets supported will be installation tool specific.
|
|
||||||
|
|
||||||
For example, a local source archive may be referenced directly::
|
|
||||||
|
|
||||||
pip (is file:///localbuilds/pip-1.3.1.zip)
|
|
||||||
|
|
||||||
All source URL references SHOULD either specify a local file URL, a secure
|
|
||||||
transport mechanism (such as ``https``) or else include an expected hash
|
|
||||||
value in the URL for verification purposes. If an insecure network
|
|
||||||
transport is specified without any hash information (or with hash
|
|
||||||
information that the tool doesn't understand), automated tools SHOULD
|
|
||||||
at least emit a warning and MAY refuse to rely on the URL.
|
|
||||||
|
|
||||||
It is RECOMMENDED that only hashes which are unconditionally provided by
|
|
||||||
the latest version of the standard library's ``hashlib`` module be used
|
|
||||||
for source archive hashes. At time of writing, that list consists of
|
|
||||||
``'md5'``, ``'sha1'``, ``'sha224'``, ``'sha256'``, ``'sha384'``, and
|
|
||||||
``'sha512'``.
|
|
||||||
|
|
||||||
For source archive references, an expected hash value may be
|
|
||||||
specified by including a ``<hash-algorithm>=<expected-hash>`` as part of
|
|
||||||
the URL fragment.
|
|
||||||
|
|
||||||
For version control references, the ``VCS+protocol`` scheme SHOULD be
|
|
||||||
used to identify both the version control system and the secure transport.
|
|
||||||
|
|
||||||
To support version control systems that do not support including commit or
|
|
||||||
tag references directly in the URL, that information may be appended to the
|
|
||||||
end of the URL using the ``@<tag>`` notation.
|
|
||||||
|
|
||||||
The use of ``is`` when defining dependencies for published distributions
|
|
||||||
is strongly discouraged as it greatly complicates the deployment of
|
|
||||||
security fixes. The source label matching operator is intended primarily
|
|
||||||
for use when defining dependencies for repeatable *deployments of
|
|
||||||
applications* while using a shared distribution index, as well as to
|
|
||||||
reference dependencies which are not published through an index server.
|
|
||||||
|
|
||||||
|
|
||||||
Inclusive ordered comparison
|
Inclusive ordered comparison
|
||||||
----------------------------
|
----------------------------
|
||||||
|
|
||||||
|
@ -752,62 +716,108 @@ Handling of pre-releases
|
||||||
------------------------
|
------------------------
|
||||||
|
|
||||||
Pre-releases of any kind, including developmental releases, are implicitly
|
Pre-releases of any kind, including developmental releases, are implicitly
|
||||||
excluded from all version specifiers, *unless* a pre-release or developmental
|
excluded from all version specifiers, *unless* they are already present
|
||||||
release is explicitly mentioned in one of the clauses. For example, these
|
on the system, explicitly requested by the user, or if the only available
|
||||||
specifiers implicitly exclude all pre-releases and development
|
version that satisfies the version specifier is a pre-release.
|
||||||
releases of later versions::
|
|
||||||
|
|
||||||
2.2
|
|
||||||
>= 1.0
|
|
||||||
|
|
||||||
While these specifiers would include at least some of them::
|
|
||||||
|
|
||||||
2.2.dev0
|
|
||||||
2.2, != 2.3b2
|
|
||||||
>= 1.0a1
|
|
||||||
>= 1.0c1
|
|
||||||
>= 1.0, != 1.0b2
|
|
||||||
>= 1.0, < 2.0.dev123
|
|
||||||
|
|
||||||
By default, dependency resolution tools SHOULD:
|
By default, dependency resolution tools SHOULD:
|
||||||
|
|
||||||
* accept already installed pre-releases for all version specifiers
|
* accept already installed pre-releases for all version specifiers
|
||||||
* accept remotely available pre-releases for version specifiers which
|
* accept remotely available pre-releases for version specifiers where
|
||||||
include at least one version clauses that references a pre-release
|
there is no final or post release that satisfies the version specifier
|
||||||
* exclude all other pre-releases from consideration
|
* exclude all other pre-releases from consideration
|
||||||
|
|
||||||
|
Dependency resolution tools MAY issue a warning if a pre-release is needed
|
||||||
|
to satisfy a version specifier.
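The default policy listed above can be sketched as a filter over versions that already satisfy a specifier. This is one possible reading, not a definitive implementation: the function names are hypothetical, pre-releases are detected with a simple regular expression, and a final or post release counts as available whether it is installed or remote.

```python
import re

# Alpha, beta, candidate and development markers followed by a number.
PRE_RELEASE = re.compile(r"(a|b|c|rc|\.dev)\d+")


def is_pre_release(version):
    """Return True for alpha, beta, candidate and developmental releases."""
    return PRE_RELEASE.search(version) is not None


def acceptable_candidates(remote, installed=()):
    """Filter versions that already satisfy a specifier.

    Installed versions are always accepted. Remote pre-releases are
    accepted only when no final or post release is available.
    """
    installed = list(installed)
    remote = list(remote)
    stable_available = any(not is_pre_release(v) for v in installed + remote)
    accepted = list(installed)
    for version in remote:
        if version in accepted:
            continue
        if not is_pre_release(version) or not stable_available:
            accepted.append(version)
    return accepted
```

A real tool would also expose the alternative behaviours as user options and could emit a warning whenever a pre-release is selected.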
|
||||||
|
|
||||||
Dependency resolution tools SHOULD also allow users to request the
|
Dependency resolution tools SHOULD also allow users to request the
|
||||||
following alternative behaviours:
|
following alternative behaviours:
|
||||||
|
|
||||||
* accepting pre-releases for all version specifiers
|
* accepting pre-releases for all version specifiers
|
||||||
* excluding pre-releases for all version specifiers (reporting an error or
|
* excluding pre-releases for all version specifiers (reporting an error or
|
||||||
warning if a pre-release is already installed locally)
|
warning if a pre-release is already installed locally, or if a
|
||||||
|
pre-release is the only way to satisfy a particular specifier)
|
||||||
|
|
||||||
Dependency resolution tools MAY also allow the above behaviour to be
|
Dependency resolution tools MAY also allow the above behaviour to be
|
||||||
controlled on a per-distribution basis.
|
controlled on a per-distribution basis.
|
||||||
|
|
||||||
Post-releases and purely numeric releases receive no special treatment in
|
Post-releases and final releases receive no special treatment in version
|
||||||
version specifiers - they are always included unless explicitly excluded.
|
specifiers - they are always included unless explicitly excluded.
|
||||||
|
|
||||||
|
|
||||||
Examples
|
Examples
|
||||||
--------
|
--------
|
||||||
|
|
||||||
* ``3.1``: version 3.1 or later, but not
|
* ``3.1``: version 3.1 or later, but not version 4.0 or later.
|
||||||
version 4.0 or later. Excludes pre-releases and developmental releases.
|
* ``3.1.2``: version 3.1.2 or later, but not version 3.2.0 or later.
|
||||||
* ``3.1.2``: version 3.1.2 or later, but not
|
* ``3.1a1``: version 3.1a1 or later, but not version 4.0 or later.
|
||||||
version 3.2.0 or later. Excludes pre-releases and developmental releases.
|
|
||||||
* ``3.1a1``: version 3.1a1 or later, but not
|
|
||||||
version 4.0 or later. Allows pre-releases like 3.2a4 and developmental
|
|
||||||
releases like 3.2.dev1.
|
|
||||||
* ``== 3.1``: specifically version 3.1 (or 3.1.0), excludes all pre-releases,
|
* ``== 3.1``: specifically version 3.1 (or 3.1.0), excludes all pre-releases,
|
||||||
post releases, developmental releases and any 3.1.x maintenance releases.
|
post releases, developmental releases and any 3.1.x maintenance releases.
|
||||||
* ``== 3.1.*``: any version that starts with 3.1, excluding pre-releases and
|
* ``== 3.1.*``: any version that starts with 3.1. Equivalent to the
|
||||||
developmental releases. Equivalent to the ``3.1.0`` compatible release
|
``3.1.0`` compatible release clause.
|
||||||
clause.
|
|
||||||
* ``3.1.0, != 3.1.3``: version 3.1.0 or later, but not version 3.1.3 and
|
* ``3.1.0, != 3.1.3``: version 3.1.0 or later, but not version 3.1.3 and
|
||||||
not version 3.2.0 or later. Excludes pre-releases and developmental
|
not version 3.2.0 or later.
|
||||||
releases.
|
|
||||||
|
|
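The compatible release examples above (``3.1``, ``3.1.2``, ``3.1a1``) share one rule: the lower bound is the version as written, and the exclusive upper bound is obtained by dropping the last release component and incrementing the one before it. The sketch below illustrates that rule only; ``compatible_release_bounds`` is a hypothetical helper, and it does not attempt to compare or order versions.

```python
import re


def compatible_release_bounds(version):
    """Return (inclusive lower bound, exclusive upper bound) as strings."""
    match = re.match(r"\d+(\.\d+)*", version)
    if match is None:
        raise ValueError("no numeric release segment in %r" % (version,))
    release = [int(part) for part in match.group(0).split(".")]
    if len(release) < 2:
        raise ValueError("need at least two release components")
    # Drop the last component and bump the one before it.
    upper = release[:-2] + [release[-2] + 1]
    return version, ".".join(str(part) for part in upper)
```

For ``3.1`` this yields an upper bound of ``4``, which denotes the same release as the ``4.0`` used in the examples.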
||||||
|
Direct references
|
||||||
|
=================
|
||||||
|
|
||||||
|
Some automated tools may permit the use of a direct reference as an
|
||||||
|
alternative to a normal version specifier. A direct reference consists of
|
||||||
|
the word ``from`` and an explicit URL.
|
||||||
|
|
||||||
|
Whether or not direct references are appropriate depends on the specific
|
||||||
|
use case for the version specifier. Automated tools SHOULD at least issue
|
||||||
|
warnings and MAY reject them entirely when direct references are used
|
||||||
|
inappropriately.
|
||||||
|
|
||||||
|
Public index servers SHOULD NOT allow the use of direct references in
|
||||||
|
uploaded distributions. Direct references are intended as a tool for
|
||||||
|
software integrators rather than publishers.
|
||||||
|
|
||||||
|
Depending on the use case, some appropriate targets for a direct URL
|
||||||
|
reference may be a valid ``source_url`` entry (see PEP 426), an sdist, or
|
||||||
|
a wheel binary archive. The exact URLs and targets supported will be tool
|
||||||
|
dependent.
|
||||||
|
|
||||||
|
For example, a local source archive may be referenced directly::
|
||||||
|
|
||||||
|
pip (from file:///localbuilds/pip-1.3.1.zip)
|
||||||
|
|
||||||
|
Alternatively, a prebuilt archive may also be referenced::
|
||||||
|
|
||||||
|
pip (from file:///localbuilds/pip-1.3.1-py33-none-any.whl)
|
||||||
|
|
||||||
|
All direct references that do not refer to a local file URL SHOULD
|
||||||
|
specify a secure transport mechanism (such as ``https``), include an
|
||||||
|
expected hash value in the URL for verification purposes, or both. If an
|
||||||
|
insecure transport is specified without any hash information, with hash
|
||||||
|
information that the tool doesn't understand, or with a selected hash
|
||||||
|
algorithm that the tool considers too weak to trust, automated tools
|
||||||
|
SHOULD at least emit a warning and MAY refuse to rely on the URL.
|
||||||
|
|
||||||
|
It is RECOMMENDED that only hashes which are unconditionally provided by
|
||||||
|
the latest version of the standard library's ``hashlib`` module be used
|
||||||
|
for source archive hashes. At time of writing, that list consists of
|
||||||
|
``'md5'``, ``'sha1'``, ``'sha224'``, ``'sha256'``, ``'sha384'``, and
|
||||||
|
``'sha512'``.
|
||||||
|
|
||||||
|
For source archive and wheel references, an expected hash value may be
|
||||||
|
specified by including a ``<hash-algorithm>=<expected-hash>`` entry as
|
||||||
|
part of the URL fragment.
|
||||||
|
|
||||||
|
For version control references, the ``VCS+protocol`` scheme SHOULD be
|
||||||
|
used to identify both the version control system and the secure transport.
|
||||||
|
|
||||||
|
To support version control systems that do not support including commit or
|
||||||
|
tag references directly in the URL, that information may be appended to the
|
||||||
|
end of the URL using the ``@<tag>`` notation.
|
||||||
|
|
||||||
|
Remote URL examples::
|
||||||
|
|
||||||
|
pip (from https://github.com/pypa/pip/archive/1.3.1.zip)
|
||||||
|
pip (from http://github.com/pypa/pip/archive/1.3.1.zip#sha1=da9234ee9982d4bbb3c72346a6de940a148ea686)
|
||||||
|
pip (from git+https://github.com/pypa/pip.git@1.3.1)
|
||||||
|
|
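As a rough illustration of the transport and hash checks described above, the following sketch inspects a direct reference string. It assumes the ``name (from URL)`` layout used in the examples; ``inspect_direct_reference`` and the keys of the returned dictionary are hypothetical, and treating only ``https`` as a secure transport is a simplification.

```python
import hashlib
import re
from urllib.parse import urlsplit

_DIRECT_REF = re.compile(r"^(?P<name>\S+)\s+\(from\s+(?P<url>\S+)\)$")


def inspect_direct_reference(text):
    """Report the transport, hash algorithm and warning status of a reference."""
    match = _DIRECT_REF.match(text.strip())
    if match is None:
        raise ValueError("not a direct reference: %r" % (text,))
    parts = urlsplit(match.group("url"))
    # For "git+https" the transport is the part after the "+".
    transport = parts.scheme.rsplit("+", 1)[-1]
    algorithm, sep, expected = parts.fragment.partition("=")
    known_hash = (bool(sep and expected)
                  and algorithm in hashlib.algorithms_guaranteed)
    trusted = transport in ("file", "https") or known_hash
    return {
        "name": match.group("name"),
        "transport": transport,
        "hash_algorithm": algorithm if known_hash else None,
        "should_warn": not trusted,
    }
```

A real tool would probably also apply its own policy on which algorithms are too weak to trust; that decision is left out here.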
||||||
|
|
||||||
Updating the versioning specification
|
Updating the versioning specification
|
||||||
|
@ -825,28 +835,30 @@ Summary of differences from \PEP 386
|
||||||
|
|
||||||
* Moved the description of version specifiers into the versioning PEP
|
* Moved the description of version specifiers into the versioning PEP
|
||||||
|
|
||||||
* added the "source label" concept to better handle projects that wish to
|
* Added the "source label" concept to better handle projects that wish to
|
||||||
use a non-compliant versioning scheme internally, especially those based
|
use a non-compliant versioning scheme internally, especially those based
|
||||||
on DVCS hashes
|
on DVCS hashes
|
||||||
|
|
||||||
* added the "compatible release" clause
|
* Added the "direct reference" concept as a standard notation for direct
|
||||||
|
references to resources (rather than each tool needing to invent its own)
|
||||||
|
|
||||||
* added the "source reference" clause
|
* Added the "compatible release" clause
|
||||||
|
|
||||||
* added the trailing wildcard syntax for prefix based version matching
|
* Added the trailing wildcard syntax for prefix based version matching
|
||||||
and exclusion
|
and exclusion
|
||||||
|
|
||||||
* changed the top level sort position of the ``.devN`` suffix
|
* Changed the top level sort position of the ``.devN`` suffix
|
||||||
|
|
||||||
* allowed single value version numbers
|
* Allowed single value version numbers
|
||||||
|
|
||||||
* explicit exclusion of leading or trailing whitespace
|
* Explicit exclusion of leading or trailing whitespace
|
||||||
|
|
||||||
* explicit criterion for the exclusion of date based versions
|
* Explicit criterion for the exclusion of date based versions
|
||||||
|
|
||||||
* implicitly exclude pre-releases unless explicitly requested
|
* Implicitly exclude pre-releases unless they're already present or
|
||||||
|
needed to satisfy a dependency
|
||||||
|
|
||||||
* treat post releases the same way as unqualified releases
|
* Treat post releases the same way as unqualified releases
|
||||||
|
|
||||||
* Discuss ordering and dependencies across metadata versions
|
* Discuss ordering and dependencies across metadata versions
|
||||||
|
|
||||||
|
@ -995,11 +1007,12 @@ The previous interpretation also excluded post-releases from some version
|
||||||
specifiers for no adequately justified reason.
|
specifiers for no adequately justified reason.
|
||||||
|
|
||||||
The updated interpretation is intended to make it difficult to accidentally
|
The updated interpretation is intended to make it difficult to accidentally
|
||||||
accept a pre-release version as satisfying a dependency, while allowing
|
accept a pre-release version as satisfying a dependency, while still
|
||||||
pre-release versions to be explicitly requested when needed.
|
allowing pre-release versions to be retrieved automatically when that's the
|
||||||
|
only way to satisfy a dependency.
|
||||||
|
|
||||||
The "some forward compatibility assumed" default version constraint is
|
The "some forward compatibility assumed" default version constraint is
|
||||||
taken directly from the Ruby community's "pessimistic version constraint"
|
derived from the Ruby community's "pessimistic version constraint"
|
||||||
operator [2]_ to allow projects to take a cautious approach to forward
|
operator [2]_ to allow projects to take a cautious approach to forward
|
||||||
compatibility promises, while still easily setting a minimum required
|
compatibility promises, while still easily setting a minimum required
|
||||||
version for their dependencies. It is made the default behaviour rather
|
version for their dependencies. It is made the default behaviour rather
|
||||||
|
@ -1022,16 +1035,26 @@ improved tools for dynamic path manipulation.
|
||||||
|
|
||||||
The trailing wildcard syntax to request prefix based version matching was
|
The trailing wildcard syntax to request prefix based version matching was
|
||||||
added to make it possible to sensibly define both compatible release clauses
|
added to make it possible to sensibly define both compatible release clauses
|
||||||
and the desired pre-release handling semantics for ``<`` and ``>`` ordered
|
and the desired pre- and post-release handling semantics for ``<`` and ``>``
|
||||||
comparison clauses.
|
ordered comparison clauses.
|
||||||
|
|
||||||
Source references are added for two purposes. In conjunction with source
|
|
||||||
labels, they allow hash based references to exact versions that aren't
|
Adding direct references
|
||||||
compliant with the fully ordered public version scheme, such as those
|
------------------------
|
||||||
generated from version control. In combination with source URLs, they
|
|
||||||
also allow the new metadata standard to natively support an existing
|
Direct references are added as an "escape clause" to handle messy real
|
||||||
feature of ``pip``, which allows arbitrary URLs like
|
world situations that don't map neatly to the standard distribution model.
|
||||||
``file:///localbuilds/exampledist-1.0-py33-none-any.whl``.
|
This includes dependencies on unpublished software for internal use, as well
|
||||||
|
as handling the more complex compatibility issues that may arise when
|
||||||
|
wrapping third party libraries as C extensions (this is of especial concern
|
||||||
|
to the scientific community).
|
||||||
|
|
||||||
|
Index servers are deliberately given a lot of freedom to disallow direct
|
||||||
|
references, since they're intended primarily as a tool for integrators
|
||||||
|
rather than publishers. PyPI in particular is currently going through the
|
||||||
|
process of *eliminating* dependencies on external references, as unreliable
|
||||||
|
external services have the effect of slowing down installation operations,
|
||||||
|
as well as reducing PyPI's own apparent reliability.
|
||||||
|
|
||||||
|
|
||||||
References
|
References
|
||||||
|
|
|
@ -4,13 +4,13 @@ Version: $Revision$
|
||||||
Last-Modified: $Date$
|
Last-Modified: $Date$
|
||||||
Author: Antoine Pitrou <solipsis@pitrou.net>
|
Author: Antoine Pitrou <solipsis@pitrou.net>
|
||||||
BDFL-Delegate: Benjamin Peterson <benjamin@python.org>
|
BDFL-Delegate: Benjamin Peterson <benjamin@python.org>
|
||||||
Status: Draft
|
Status: Accepted
|
||||||
Type: Standards Track
|
Type: Standards Track
|
||||||
Content-Type: text/x-rst
|
Content-Type: text/x-rst
|
||||||
Created: 2013-05-18
|
Created: 2013-05-18
|
||||||
Python-Version: 3.4
|
Python-Version: 3.4
|
||||||
Post-History: 2013-05-18
|
Post-History: 2013-05-18
|
||||||
Resolution: TBD
|
Resolution: http://mail.python.org/pipermail/python-dev/2013-June/126746.html
|
||||||
|
|
||||||
|
|
||||||
Abstract
|
Abstract
|
||||||
|
@ -201,8 +201,7 @@ Predictability
|
||||||
--------------
|
--------------
|
||||||
|
|
||||||
Following this scheme, an object's finalizer is always called exactly
|
Following this scheme, an object's finalizer is always called exactly
|
||||||
once. The only exception is if an object is resurrected: the finalizer
|
once, even if it was resurrected afterwards.
|
||||||
will be called again when the object becomes unreachable again.
|
|
||||||
|
|
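The "exactly once" guarantee can be observed from Python code. The sketch below assumes an interpreter that implements this scheme (CPython 3.4 or later); the ``Phoenix`` class and the two module-level lists are hypothetical names used only for the demonstration.

```python
import gc

calls = []       # records every finalizer invocation
survivors = []   # keeps the resurrected object alive


class Phoenix:
    def __del__(self):
        calls.append("finalized")
        survivors.append(self)  # resurrect the object


obj = Phoenix()
del obj            # last reference dropped: the finalizer runs and resurrects
survivors.clear()  # the object becomes unreachable a second time
gc.collect()       # no second finalizer call is expected
```

After this runs, ``calls`` holds a single entry even though the object died twice.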
||||||
For CI objects, the order in which finalizers are called (step 2 above)
|
For CI objects, the order in which finalizers are called (step 2 above)
|
||||||
is undefined.
|
is undefined.
|
||||||
|
|
75
pep-0443.txt
|
@ -4,7 +4,7 @@ Version: $Revision$
|
||||||
Last-Modified: $Date$
|
Last-Modified: $Date$
|
||||||
Author: Łukasz Langa <lukasz@langa.pl>
|
Author: Łukasz Langa <lukasz@langa.pl>
|
||||||
Discussions-To: Python-Dev <python-dev@python.org>
|
Discussions-To: Python-Dev <python-dev@python.org>
|
||||||
Status: Accepted
|
Status: Final
|
||||||
Type: Standards Track
|
Type: Standards Track
|
||||||
Content-Type: text/x-rst
|
Content-Type: text/x-rst
|
||||||
Created: 22-May-2013
|
Created: 22-May-2013
|
||||||
|
@ -193,48 +193,37 @@ handling of old-style classes and Zope's ExtensionClasses. More
|
||||||
importantly, it introduces support for Abstract Base Classes (ABC).
|
importantly, it introduces support for Abstract Base Classes (ABC).
|
||||||
|
|
||||||
When a generic function implementation is registered for an ABC, the
|
When a generic function implementation is registered for an ABC, the
|
||||||
dispatch algorithm switches to a mode of MRO calculation for the
|
dispatch algorithm switches to an extended form of C3 linearization,
|
||||||
provided argument which includes the relevant ABCs. The algorithm is as
|
which includes the relevant ABCs in the MRO of the provided argument.
|
||||||
follows::
|
The algorithm inserts ABCs where their functionality is introduced, i.e.
|
||||||
|
``issubclass(cls, abc)`` returns ``True`` for the class itself but
|
||||||
|
returns ``False`` for all its direct base classes. Implicit ABCs for
|
||||||
|
a given class (either registered or inferred from the presence of
|
||||||
|
a special method like ``__len__()``) are inserted directly after the
|
||||||
|
last ABC explicitly listed in the MRO of said class.
|
||||||
|
|
||||||
def _compose_mro(cls, haystack):
|
In its most basic form, this linearization returns the MRO for the given
|
||||||
"""Calculates the MRO for a given class `cls`, including relevant
|
type::
|
||||||
abstract base classes from `haystack`."""
|
|
||||||
bases = set(cls.__mro__)
|
|
||||||
mro = list(cls.__mro__)
|
|
||||||
for regcls in haystack:
|
|
||||||
if regcls in bases or not issubclass(cls, regcls):
|
|
||||||
continue # either present in the __mro__ or unrelated
|
|
||||||
for index, base in enumerate(mro):
|
|
||||||
if not issubclass(base, regcls):
|
|
||||||
break
|
|
||||||
if base in bases and not issubclass(regcls, base):
|
|
||||||
# Conflict resolution: put classes present in __mro__
|
|
||||||
# and their subclasses first.
|
|
||||||
index += 1
|
|
||||||
mro.insert(index, regcls)
|
|
||||||
return mro
|
|
||||||
|
|
||||||
In its most basic form, it returns the MRO for the given type::
|
|
||||||
|
|
||||||
>>> _compose_mro(dict, [])
|
>>> _compose_mro(dict, [])
|
||||||
[<class 'dict'>, <class 'object'>]
|
[<class 'dict'>, <class 'object'>]
|
||||||
|
|
||||||
When the haystack consists of ABCs that the specified type is a subclass
|
When the second argument contains ABCs that the specified type is
|
||||||
of, they are inserted in a predictable order::
|
a subclass of, they are inserted in a predictable order::
|
||||||
|
|
||||||
>>> _compose_mro(dict, [Sized, MutableMapping, str,
|
>>> _compose_mro(dict, [Sized, MutableMapping, str,
|
||||||
... Sequence, Iterable])
|
... Sequence, Iterable])
|
||||||
[<class 'dict'>, <class 'collections.abc.MutableMapping'>,
|
[<class 'dict'>, <class 'collections.abc.MutableMapping'>,
|
||||||
<class 'collections.abc.Iterable'>, <class 'collections.abc.Sized'>,
|
<class 'collections.abc.Mapping'>, <class 'collections.abc.Sized'>,
|
||||||
|
<class 'collections.abc.Iterable'>, <class 'collections.abc.Container'>,
|
||||||
<class 'object'>]
|
<class 'object'>]
|
||||||
|
|
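From the caller's side, the effect of this linearization is that the most specific registered ABC wins. The following sketch uses the public ``functools.singledispatch`` API; the ``describe`` function and its return strings are made up for illustration.

```python
from collections.abc import Mapping, Sized
from functools import singledispatch


@singledispatch
def describe(obj):
    return "object"


@describe.register(Sized)
def _(obj):
    return "sized"


@describe.register(Mapping)
def _(obj):
    return "mapping"


# Mapping is more specific than Sized, so it precedes Sized in the
# composed MRO of dict; list is only Sized; int matches neither ABC.
results = [describe({}), describe([]), describe(42)]
```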
||||||
While this mode of operation is significantly slower, all dispatch
|
While this mode of operation is significantly slower, all dispatch
|
||||||
decisions are cached. The cache is invalidated on registering new
|
decisions are cached. The cache is invalidated on registering new
|
||||||
implementations on the generic function or when user code calls
|
implementations on the generic function or when user code calls
|
||||||
``register()`` on an ABC to register a new virtual subclass. In the
|
``register()`` on an ABC to implicitly subclass it. In the latter case,
|
||||||
latter case, it is possible to create a situation with ambiguous
|
it is possible to create a situation with ambiguous dispatch, for
|
||||||
dispatch, for instance::
|
instance::
|
||||||
|
|
||||||
>>> from collections import Iterable, Container
|
>>> from collections import Iterable, Container
|
||||||
>>> class P:
|
>>> class P:
|
||||||
|
@ -261,20 +250,38 @@ guess::
|
||||||
RuntimeError: Ambiguous dispatch: <class 'collections.abc.Container'>
|
RuntimeError: Ambiguous dispatch: <class 'collections.abc.Container'>
|
||||||
or <class 'collections.abc.Iterable'>
|
or <class 'collections.abc.Iterable'>
|
||||||
|
|
||||||
Note that this exception would not be raised if ``Iterable`` and
|
Note that this exception would not be raised if one or more ABCs had
|
||||||
``Container`` had been provided as base classes during class definition.
|
been provided explicitly as base classes during class definition. In
|
||||||
In this case dispatch happens in the MRO order::
|
this case dispatch happens in the MRO order::
|
||||||
|
|
||||||
>>> class Ten(Iterable, Container):
|
>>> class Ten(Iterable, Container):
|
||||||
... def __iter__(self):
|
... def __iter__(self):
|
||||||
... for i in range(10):
|
... for i in range(10):
|
||||||
... yield i
|
... yield i
|
||||||
... def __contains__(self, value):
|
... def __contains__(self, value):
|
||||||
... return value in range(10)
|
... return value in range(10)
|
||||||
...
|
...
|
||||||
>>> g(Ten())
|
>>> g(Ten())
|
||||||
'iterable'
|
'iterable'
|
||||||
|
|
||||||
|
A similar conflict arises when subclassing an ABC is inferred from the
|
||||||
|
presence of a special method like ``__len__()`` or ``__contains__()``::
|
||||||
|
|
||||||
|
>>> class Q:
|
||||||
|
... def __contains__(self, value):
|
||||||
|
... return False
|
||||||
|
...
|
||||||
|
>>> issubclass(Q, Container)
|
||||||
|
True
|
||||||
|
>>> Iterable.register(Q)
|
||||||
|
>>> g(Q())
|
||||||
|
Traceback (most recent call last):
|
||||||
|
...
|
||||||
|
RuntimeError: Ambiguous dispatch: <class 'collections.abc.Container'>
|
||||||
|
or <class 'collections.abc.Iterable'>
|
||||||
|
|
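One way out of such a conflict is to register an implementation for the concrete class itself, which takes precedence over any ABC. The sketch below replays the ``Q`` scenario with ``functools.singledispatch``; the lambda implementations and the ``outcome`` and ``resolved`` variables are illustrative assumptions.

```python
from collections.abc import Container, Iterable
from functools import singledispatch


@singledispatch
def g(obj):
    return "base"


g.register(Iterable, lambda obj: "iterable")
g.register(Container, lambda obj: "container")


class Q:
    def __contains__(self, value):
        return False


Iterable.register(Q)  # Q is now both an inferred Container and an Iterable

try:
    g(Q())
    outcome = "dispatched"
except RuntimeError:
    outcome = "ambiguous"

# Registering for Q itself removes the ambiguity.
g.register(Q, lambda obj: "q")
resolved = g(Q())
```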
||||||
|
An early version of the PEP contained a custom approach that was simpler
|
||||||
|
but created a number of edge cases with surprising results [#why-c3]_.
|
||||||
|
|
||||||
Usage Patterns
|
Usage Patterns
|
||||||
==============
|
==============
|
||||||
|
@ -378,6 +385,8 @@ References
|
||||||
a particular annotation style".
|
a particular annotation style".
|
||||||
(http://www.python.org/dev/peps/pep-0008)
|
(http://www.python.org/dev/peps/pep-0008)
|
||||||
|
|
||||||
|
.. [#why-c3] http://bugs.python.org/issue18244
|
||||||
|
|
||||||
.. [#pep-3124] http://www.python.org/dev/peps/pep-3124/
|
.. [#pep-3124] http://www.python.org/dev/peps/pep-3124/
|
||||||
|
|
||||||
.. [#peak-rules] http://peak.telecommunity.com/DevCenter/PEAK_2dRules
|
.. [#peak-rules] http://peak.telecommunity.com/DevCenter/PEAK_2dRules
|
||||||
|
|
|
@ -0,0 +1,773 @@
|
||||||
|
PEP: 445
|
||||||
|
Title: Add new APIs to customize Python memory allocators
|
||||||
|
Version: $Revision$
|
||||||
|
Last-Modified: $Date$
|
||||||
|
Author: Victor Stinner <victor.stinner@gmail.com>
|
||||||
|
BDFL-Delegate: Antoine Pitrou <solipsis@pitrou.net>
|
||||||
|
Status: Accepted
|
||||||
|
Type: Standards Track
|
||||||
|
Content-Type: text/x-rst
|
||||||
|
Created: 15-Jun-2013
|
||||||
|
Python-Version: 3.4
|
||||||
|
Resolution: http://mail.python.org/pipermail/python-dev/2013-July/127222.html
|
||||||
|
|
||||||
|
Abstract
|
||||||
|
========
|
||||||
|
|
||||||
|
This PEP proposes new Application Programming Interfaces (API) to customize
|
||||||
|
Python memory allocators. The only implementation required to conform to
|
||||||
|
this PEP is CPython, but other implementations may choose to be compatible,
|
||||||
|
or to re-use a similar scheme.
|
||||||
|
|
||||||
|
|
||||||
|
Rationale
|
||||||
|
=========
|
||||||
|
|
||||||
|
Use cases:
|
||||||
|
|
||||||
|
* Applications embedding Python which want to isolate Python memory from
|
||||||
|
the memory of the application, or want to use a different memory
|
||||||
|
allocator optimized for its Python usage
|
||||||
|
* Python running on embedded devices with low memory and slow CPU.
|
||||||
|
A custom memory allocator can be used for efficiency and/or to get
|
||||||
|
access to all the memory of the device.
|
||||||
|
* Debug tools for memory allocators:
|
||||||
|
|
||||||
|
- track the memory usage (find memory leaks)
|
||||||
|
- get the location of a memory allocation: Python filename and line
|
||||||
|
number, and the size of a memory block
|
||||||
|
- detect buffer underflow, buffer overflow and misuse of Python
|
||||||
|
allocator APIs (see `Redesign Debug Checks on Memory Block
|
||||||
|
Allocators as Hooks`_)
|
||||||
|
- force memory allocations to fail to test handling of the
|
||||||
|
``MemoryError`` exception
|
||||||
|
|
||||||
|
|
||||||
|
Proposal
|
||||||
|
========
|
||||||
|
|
||||||
|
New Functions and Structures
|
||||||
|
----------------------------
|
||||||
|
|
||||||
|
* Add a new GIL-free (no need to hold the GIL) memory allocator:
|
||||||
|
|
||||||
|
- ``void* PyMem_RawMalloc(size_t size)``
|
||||||
|
- ``void* PyMem_RawRealloc(void *ptr, size_t new_size)``
|
||||||
|
- ``void PyMem_RawFree(void *ptr)``
|
||||||
|
- The newly allocated memory will not have been initialized in any
|
||||||
|
way.
|
||||||
|
- Requesting zero bytes returns a distinct non-*NULL* pointer if
|
||||||
|
possible, as if ``PyMem_Malloc(1)`` had been called instead.
|
||||||
|
|
||||||
|
* Add a new ``PyMemAllocator`` structure::
|
||||||
|
|
||||||
|
typedef struct {
|
||||||
|
/* user context passed as the first argument to the 3 functions */
|
||||||
|
void *ctx;
|
||||||
|
|
||||||
|
/* allocate a memory block */
|
||||||
|
void* (*malloc) (void *ctx, size_t size);
|
||||||
|
|
||||||
|
/* allocate or resize a memory block */
|
||||||
|
void* (*realloc) (void *ctx, void *ptr, size_t new_size);
|
||||||
|
|
||||||
|
/* release a memory block */
|
||||||
|
void (*free) (void *ctx, void *ptr);
|
||||||
|
} PyMemAllocator;
|
||||||
|
|
||||||
|
* Add a new ``PyMemAllocatorDomain`` enum to choose the Python
|
||||||
|
allocator domain. Domains:
|
||||||
|
|
||||||
|
- ``PYMEM_DOMAIN_RAW``: ``PyMem_RawMalloc()``, ``PyMem_RawRealloc()``
|
||||||
|
and ``PyMem_RawFree()``
|
||||||
|
|
||||||
|
- ``PYMEM_DOMAIN_MEM``: ``PyMem_Malloc()``, ``PyMem_Realloc()`` and
|
||||||
|
``PyMem_Free()``
|
||||||
|
|
||||||
|
- ``PYMEM_DOMAIN_OBJ``: ``PyObject_Malloc()``, ``PyObject_Realloc()``
|
||||||
|
and ``PyObject_Free()``
|
||||||
|
|
||||||
|
* Add new functions to get and set memory block allocators:
|
||||||
|
|
||||||
|
- ``void PyMem_GetAllocator(PyMemAllocatorDomain domain, PyMemAllocator *allocator)``
|
||||||
|
- ``void PyMem_SetAllocator(PyMemAllocatorDomain domain, PyMemAllocator *allocator)``
|
||||||
|
- The new allocator must return a distinct non-*NULL* pointer when
|
||||||
|
requesting zero bytes
|
||||||
|
- For the ``PYMEM_DOMAIN_RAW`` domain, the allocator must be
|
||||||
|
thread-safe: the GIL is not held when the allocator is called.
|
||||||
|
|
||||||
|
* Add a new ``PyObjectArenaAllocator`` structure::
|
||||||
|
|
||||||
|
typedef struct {
|
||||||
|
/* user context passed as the first argument to the 2 functions */
|
||||||
|
void *ctx;
|
||||||
|
|
||||||
|
/* allocate an arena */
|
||||||
|
void* (*alloc) (void *ctx, size_t size);
|
||||||
|
|
||||||
|
/* release an arena */
|
||||||
|
void (*free) (void *ctx, void *ptr, size_t size);
|
||||||
|
} PyObjectArenaAllocator;
|
||||||
|
|
||||||
|
* Add new functions to get and set the arena allocator used by
|
||||||
|
*pymalloc*:
|
||||||
|
|
||||||
|
- ``void PyObject_GetArenaAllocator(PyObjectArenaAllocator *allocator)``
|
||||||
|
- ``void PyObject_SetArenaAllocator(PyObjectArenaAllocator *allocator)``
|
||||||
|
|
||||||
|
* Add a new function to reinstall the debug checks on memory allocators when
|
||||||
|
a memory allocator is replaced with ``PyMem_SetAllocator()``:
|
||||||
|
|
||||||
|
- ``void PyMem_SetupDebugHooks(void)``
|
||||||
|
- Install the debug hooks on all memory block allocators. The function can be
|
||||||
|
called more than once; hooks are only installed once.
|
||||||
|
- The function does nothing if Python is not compiled in debug mode.
|
||||||
|
|
||||||
|
* Memory block allocators always return *NULL* if *size* is greater than
|
||||||
|
``PY_SSIZE_T_MAX``. The check is done before calling the inner
|
||||||
|
function.
|
||||||
|
|
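Two of the guarantees listed above (a distinct non-*NULL* pointer for zero-byte requests, and *NULL* for sizes above ``PY_SSIZE_T_MAX``) can be probed from Python through ``ctypes``. This sketch assumes CPython 3.4 or later, where ``PyMem_RawMalloc()`` and ``PyMem_RawFree()`` are exported by the interpreter, and it relies on ``sys.maxsize`` being equal to ``PY_SSIZE_T_MAX``. The variable names are illustrative.

```python
import ctypes
import sys

api = ctypes.pythonapi
api.PyMem_RawMalloc.restype = ctypes.c_void_p
api.PyMem_RawMalloc.argtypes = [ctypes.c_size_t]
api.PyMem_RawFree.restype = None
api.PyMem_RawFree.argtypes = [ctypes.c_void_p]

# ctypes maps a NULL void* result to None.
first = api.PyMem_RawMalloc(0)
second = api.PyMem_RawMalloc(0)
too_big = api.PyMem_RawMalloc(sys.maxsize + 1)

zero_is_non_null = first is not None and second is not None
zero_is_distinct = first != second
oversize_is_null = too_big is None

# Release the two live blocks with the matching raw-domain function.
api.PyMem_RawFree(first)
api.PyMem_RawFree(second)
```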
||||||
|
.. note::
|
||||||
|
The *pymalloc* allocator is optimized for objects smaller than 512 bytes
|
||||||
|
with a short lifetime. It uses memory mappings with a fixed size of 256
|
||||||
|
KB called "arenas".
|
||||||
|
|
||||||
|
Here is how the allocators are set up by default:
|
||||||
|
|
||||||
|
* ``PYMEM_DOMAIN_RAW``, ``PYMEM_DOMAIN_MEM``: ``malloc()``,
|
||||||
|
``realloc()`` and ``free()``; call ``malloc(1)`` when requesting zero
|
||||||
|
bytes
|
||||||
|
* ``PYMEM_DOMAIN_OBJ``: *pymalloc* allocator which falls back on
|
||||||
|
``PyMem_Malloc()`` for allocations larger than 512 bytes
|
||||||
|
* *pymalloc* arena allocator: ``VirtualAlloc()`` and ``VirtualFree()`` on
|
||||||
|
Windows, ``mmap()`` and ``munmap()`` when available, or ``malloc()``
|
||||||
|
and ``free()``
|
||||||
|
|
||||||
|
|
||||||
|
Redesign Debug Checks on Memory Block Allocators as Hooks
|
||||||
|
---------------------------------------------------------
|
||||||
|
|
||||||
|
Since Python 2.3, Python implements different checks on memory
|
||||||
|
allocators in debug mode:
|
||||||
|
|
||||||
|
* Newly allocated memory is filled with the byte ``0xCB``, freed memory
|
||||||
|
is filled with the byte ``0xDB``.
|
||||||
|
* Detect API violations, e.g. ``PyObject_Free()`` called on a memory
|
||||||
|
block allocated by ``PyMem_Malloc()``
|
||||||
|
* Detect write before the start of the buffer (buffer underflow)
|
||||||
|
* Detect write after the end of the buffer (buffer overflow)
|
||||||
|
|
||||||
|
In Python 3.3, the checks are installed by replacing ``PyMem_Malloc()``,
|
||||||
|
``PyMem_Realloc()``, ``PyMem_Free()``, ``PyObject_Malloc()``,
|
||||||
|
``PyObject_Realloc()`` and ``PyObject_Free()`` using macros. The new
|
||||||
|
allocator allocates a larger buffer and writes a pattern to detect buffer
|
||||||
|
underflow, buffer overflow and use after free (by filling the buffer with
|
||||||
|
the byte ``0xDB``). It uses the original ``PyObject_Malloc()``
|
||||||
|
function to allocate memory. So ``PyMem_Malloc()`` and
|
||||||
|
``PyMem_Realloc()`` indirectly call ``PyObject_Malloc()`` and
|
||||||
|
``PyObject_Realloc()``.
|
||||||
|
|
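The guard-pattern technique described above can be simulated in pure Python. The sketch below is only a model of the idea, not CPython's actual debug allocator: ``DebugBlock`` is a hypothetical class, the four-byte ``GUARD`` pattern is an assumed value, and a ``bytearray`` stands in for raw memory. Only the ``0xCB`` and ``0xDB`` fill bytes come from the text.

```python
GUARD = b"\xfb" * 4          # assumed guard pattern around the user area
FRESH, FREED = 0xCB, 0xDB    # fill bytes for new and freed memory


class DebugBlock:
    """Model one debug-mode allocation as guard + user area + guard."""

    def __init__(self, size):
        self.size = size
        self.raw = bytearray(GUARD + bytes([FRESH]) * size + GUARD)

    def poke(self, offset, value):
        """Unchecked one-byte write relative to the start of the user area."""
        self.raw[len(GUARD) + offset] = value

    def check(self):
        """Report whether either guard region has been overwritten."""
        if bytes(self.raw[:len(GUARD)]) != GUARD:
            return "underflow"
        if bytes(self.raw[-len(GUARD):]) != GUARD:
            return "overflow"
        return "ok"

    def free(self):
        """Verify the guards, then mark the user area as freed."""
        status = self.check()
        start = len(GUARD)
        self.raw[start:start + self.size] = bytes([FREED]) * self.size
        return status
```

Writing one byte past the end, for example ``block.poke(block.size, 0)``, corrupts the trailing guard, so the next ``check()`` reports ``"overflow"``.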
||||||
|
This PEP redesigns the debug checks as hooks on the existing allocators
|
||||||
|
in debug mode. Examples of call traces without the hooks:
|
||||||
|
|
||||||
|
* ``PyMem_RawMalloc()`` => ``_PyMem_RawMalloc()`` => ``malloc()``
|
||||||
|
* ``PyMem_Realloc()`` => ``_PyMem_RawRealloc()`` => ``realloc()``
|
||||||
|
* ``PyObject_Free()`` => ``_PyObject_Free()``
|
||||||
|
|
||||||
|
Call traces when the hooks are installed (debug mode):
|
||||||
|
|
||||||
|
* ``PyMem_RawMalloc()`` => ``_PyMem_DebugMalloc()``
|
||||||
|
=> ``_PyMem_RawMalloc()`` => ``malloc()``
|
||||||
|
* ``PyMem_Realloc()`` => ``_PyMem_DebugRealloc()``
|
||||||
|
=> ``_PyMem_RawRealloc()`` => ``realloc()``
|
||||||
|
* ``PyObject_Free()`` => ``_PyMem_DebugFree()``
|
||||||
|
=> ``_PyObject_Free()``
|
||||||
|
|
||||||
|
As a result, ``PyMem_Malloc()`` and ``PyMem_Realloc()`` now call
|
||||||
|
``malloc()`` and ``realloc()`` in both release mode and debug mode,
|
||||||
|
instead of calling ``PyObject_Malloc()`` and ``PyObject_Realloc()`` in
|
||||||
|
debug mode.
|
||||||
|
|
||||||
|
When at least one memory allocator is replaced with
|
||||||
|
``PyMem_SetAllocator()``, the ``PyMem_SetupDebugHooks()`` function must
|
||||||
|
be called to reinstall the debug hooks on top of the new allocator.
|
||||||
|
|
||||||
|
|
||||||
|
Don't call malloc() directly anymore
|
||||||
|
------------------------------------
|
||||||
|
|
||||||
|
``PyObject_Malloc()`` falls back on ``PyMem_Malloc()`` instead of
|
||||||
|
``malloc()`` if size is greater than 512 bytes, and
|
||||||
|
``PyObject_Realloc()`` falls back on ``PyMem_Realloc()`` instead of
|
||||||
|
``realloc()``.
|
||||||
|
|
||||||
|
Direct calls to ``malloc()`` are replaced with ``PyMem_Malloc()``, or
|
||||||
|
``PyMem_RawMalloc()`` if the GIL is not held.
|
||||||
|
|
||||||
|
External libraries like zlib or OpenSSL can be configured to allocate memory
|
||||||
|
using ``PyMem_Malloc()`` or ``PyMem_RawMalloc()``. If the allocator of a
|
||||||
|
library can only be replaced globally (rather than on an object-by-object
|
||||||
|
basis), it shouldn't be replaced when Python is embedded in an application.
|
||||||
|
|
||||||
|
For the "track memory usage" use case, it is important to track memory
|
||||||
|
allocated in external libraries to have accurate reports, because these
|
||||||
|
allocations can be large (e.g. they can raise a ``MemoryError`` exception)
|
||||||
|
and would otherwise be missed in memory usage reports.
|
||||||
|
|
||||||
|
|
||||||
|
Examples
|
||||||
|
========
|
||||||
|
|
||||||
|
Use case 1: Replace Memory Allocators, keep pymalloc
|
||||||
|
----------------------------------------------------
|
||||||
|
|
||||||
|
Dummy example wasting 2 bytes per memory block,
|
||||||
|
and 10 bytes per *pymalloc* arena::
|
||||||
|
|
||||||
|
#include <stdlib.h>
|
||||||
|
|
||||||
|
size_t alloc_padding = 2;
|
||||||
|
size_t arena_padding = 10;
|
||||||
|
|
||||||
|
void* my_malloc(void *ctx, size_t size)
|
||||||
|
{
|
||||||
|
size_t padding = *(size_t *)ctx;  /* ctx points to a size_t */
|
||||||
|
return malloc(size + padding);
|
||||||
|
}
|
||||||
|
|
||||||
|
void* my_realloc(void *ctx, void *ptr, size_t new_size)
|
||||||
|
{
|
||||||
|
size_t padding = *(size_t *)ctx;  /* ctx points to a size_t */
|
||||||
|
return realloc(ptr, new_size + padding);
|
||||||
|
}
|
||||||
|
|
||||||
|
void my_free(void *ctx, void *ptr)
|
||||||
|
{
|
||||||
|
free(ptr);
|
||||||
|
}
|
||||||
|
|
||||||
|
void* my_alloc_arena(void *ctx, size_t size)
|
||||||
|
{
|
||||||
|
size_t padding = *(size_t *)ctx;  /* ctx points to a size_t */
|
||||||
|
return malloc(size + padding);
|
||||||
|
}
|
||||||
|
|
||||||
|
void my_free_arena(void *ctx, void *ptr, size_t size)
|
||||||
|
{
|
||||||
|
free(ptr);
|
||||||
|
}
|
||||||
|
|
||||||
|
void setup_custom_allocator(void)
|
||||||
|
{
|
||||||
|
PyMemAllocator alloc;
|
||||||
|
PyObjectArenaAllocator arena;
|
||||||
|
|
||||||
|
alloc.ctx = &alloc_padding;
|
||||||
|
alloc.malloc = my_malloc;
|
||||||
|
alloc.realloc = my_realloc;
|
||||||
|
alloc.free = my_free;
|
||||||
|
|
||||||
|
PyMem_SetAllocator(PYMEM_DOMAIN_RAW, &alloc);
|
||||||
|
PyMem_SetAllocator(PYMEM_DOMAIN_MEM, &alloc);
|
||||||
|
/* leave PYMEM_DOMAIN_OBJ unchanged, use pymalloc */
|
||||||
|
|
||||||
|
arena.ctx = &arena_padding;
|
||||||
|
arena.alloc = my_alloc_arena;
|
||||||
|
arena.free = my_free_arena;
|
||||||
|
PyObject_SetArenaAllocator(&arena);
|
||||||
|
|
||||||
|
PyMem_SetupDebugHooks();
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
Use case 2: Replace Memory Allocators, override pymalloc
|
||||||
|
--------------------------------------------------------
|
||||||
|
|
||||||
|
If you have a dedicated allocator optimized for allocations of objects
|
||||||
|
smaller than 512 bytes with a short lifetime, pymalloc can be overridden
|
||||||
|
(replace ``PyObject_Malloc()``).
|
||||||
|
|
||||||
|
Dummy example wasting 2 bytes per memory block::
|
||||||
|
|
||||||
|
#include <Python.h>
#include <stdlib.h>
|
||||||
|
|
||||||
|
size_t padding = 2;
|
||||||
|
|
||||||
|
void* my_malloc(void *ctx, size_t size)
|
||||||
|
{
|
||||||
|
size_t padding = *(size_t *)ctx;
|
||||||
|
return malloc(size + padding);
|
||||||
|
}
|
||||||
|
|
||||||
|
void* my_realloc(void *ctx, void *ptr, size_t new_size)
|
||||||
|
{
|
||||||
|
size_t padding = *(size_t *)ctx;
|
||||||
|
return realloc(ptr, new_size + padding);
|
||||||
|
}
|
||||||
|
|
||||||
|
void my_free(void *ctx, void *ptr)
|
||||||
|
{
|
||||||
|
free(ptr);
|
||||||
|
}
|
||||||
|
|
||||||
|
void setup_custom_allocator(void)
|
||||||
|
{
|
||||||
|
PyMemAllocator alloc;
|
||||||
|
alloc.ctx = &padding;
|
||||||
|
alloc.malloc = my_malloc;
|
||||||
|
alloc.realloc = my_realloc;
|
||||||
|
alloc.free = my_free;
|
||||||
|
|
||||||
|
PyMem_SetAllocator(PYMEM_DOMAIN_RAW, &alloc);
|
||||||
|
PyMem_SetAllocator(PYMEM_DOMAIN_MEM, &alloc);
|
||||||
|
PyMem_SetAllocator(PYMEM_DOMAIN_OBJ, &alloc);
|
||||||
|
|
||||||
|
PyMem_SetupDebugHooks();
|
||||||
|
}
|
||||||
|
|
||||||
|
The *pymalloc* arena does not need to be replaced, because it is no longer
|
||||||
|
used by the new allocator.
|
||||||
|
|
||||||
|
|
||||||
|
Use case 3: Setup Hooks On Memory Block Allocators
|
||||||
|
--------------------------------------------------
|
||||||
|
|
||||||
|
Example to set up hooks on all memory block allocators::
|
||||||
|
|
||||||
|
struct {
|
||||||
|
PyMemAllocator raw;
|
||||||
|
PyMemAllocator mem;
|
||||||
|
PyMemAllocator obj;
|
||||||
|
/* ... */
|
||||||
|
} hook;
|
||||||
|
|
||||||
|
static void* hook_malloc(void *ctx, size_t size)
|
||||||
|
{
|
||||||
|
PyMemAllocator *alloc = (PyMemAllocator *)ctx;
|
||||||
|
void *ptr;
|
||||||
|
/* ... */
|
||||||
|
ptr = alloc->malloc(alloc->ctx, size);
|
||||||
|
/* ... */
|
||||||
|
return ptr;
|
||||||
|
}
|
||||||
|
|
||||||
|
static void* hook_realloc(void *ctx, void *ptr, size_t new_size)
|
||||||
|
{
|
||||||
|
PyMemAllocator *alloc = (PyMemAllocator *)ctx;
|
||||||
|
void *ptr2;
|
||||||
|
/* ... */
|
||||||
|
ptr2 = alloc->realloc(alloc->ctx, ptr, new_size);
|
||||||
|
/* ... */
|
||||||
|
return ptr2;
|
||||||
|
}
|
||||||
|
|
||||||
|
static void hook_free(void *ctx, void *ptr)
|
||||||
|
{
|
||||||
|
PyMemAllocator *alloc = (PyMemAllocator *)ctx;
|
||||||
|
/* ... */
|
||||||
|
alloc->free(alloc->ctx, ptr);
|
||||||
|
/* ... */
|
||||||
|
}
|
||||||
|
|
||||||
|
void setup_hooks(void)
|
||||||
|
{
|
||||||
|
PyMemAllocator alloc;
|
||||||
|
static int installed = 0;
|
||||||
|
|
||||||
|
if (installed)
|
||||||
|
return;
|
||||||
|
installed = 1;
|
||||||
|
|
||||||
|
alloc.malloc = hook_malloc;
|
||||||
|
alloc.realloc = hook_realloc;
|
||||||
|
alloc.free = hook_free;
|
||||||
|
PyMem_GetAllocator(PYMEM_DOMAIN_RAW, &hook.raw);
|
||||||
|
PyMem_GetAllocator(PYMEM_DOMAIN_MEM, &hook.mem);
|
||||||
|
PyMem_GetAllocator(PYMEM_DOMAIN_OBJ, &hook.obj);
|
||||||
|
|
||||||
|
alloc.ctx = &hook.raw;
|
||||||
|
PyMem_SetAllocator(PYMEM_DOMAIN_RAW, &alloc);
|
||||||
|
|
||||||
|
alloc.ctx = &hook.mem;
|
||||||
|
PyMem_SetAllocator(PYMEM_DOMAIN_MEM, &alloc);
|
||||||
|
|
||||||
|
alloc.ctx = &hook.obj;
|
||||||
|
PyMem_SetAllocator(PYMEM_DOMAIN_OBJ, &alloc);
|
||||||
|
}
|
||||||
|
|
||||||
|
.. note::
|
||||||
|
``PyMem_SetupDebugHooks()`` does not need to be called because
|
||||||
|
the memory allocators are not replaced: the debug checks on memory
|
||||||
|
block allocators are installed automatically at startup.
|
||||||
|
|
||||||
|
|
||||||
|
Performance
|
||||||
|
============
|
||||||
|
|
||||||
|
The implementation of this PEP (issue #3329) has no visible overhead on
|
||||||
|
the Python benchmark suite.
|
||||||
|
|
||||||
|
Results of the `Python benchmarks suite
|
||||||
|
<http://hg.python.org/benchmarks>`_ (-b 2n3): some tests are 1.04x
|
||||||
|
faster, some tests are 1.04x slower. Results of the pybench microbenchmark:
|
||||||
|
"+0.1%" slower globally (diff between -4.9% and +5.6%).
|
||||||
|
|
||||||
|
The full output of the benchmarks is attached to issue #3329.
|
||||||
|
|
||||||
|
|
||||||
|
Rejected Alternatives
|
||||||
|
=====================
|
||||||
|
|
||||||
|
More specific functions to get/set memory allocators
|
||||||
|
----------------------------------------------------
|
||||||
|
|
||||||
|
A larger set of C API functions was originally proposed, with one pair
|
||||||
|
of functions for each allocator domain:
|
||||||
|
|
||||||
|
* ``void PyMem_GetRawAllocator(PyMemAllocator *allocator)``
|
||||||
|
* ``void PyMem_GetAllocator(PyMemAllocator *allocator)``
|
||||||
|
* ``void PyObject_GetAllocator(PyMemAllocator *allocator)``
|
||||||
|
* ``void PyMem_SetRawAllocator(PyMemAllocator *allocator)``
|
||||||
|
* ``void PyMem_SetAllocator(PyMemAllocator *allocator)``
|
||||||
|
* ``void PyObject_SetAllocator(PyMemAllocator *allocator)``
|
||||||
|
|
||||||
|
This alternative was rejected because it is not possible to write
|
||||||
|
generic code with more specific functions: code must be duplicated for
|
||||||
|
each memory allocator domain.
|
||||||
|
|
||||||
|
|
||||||
|
Make PyMem_Malloc() reuse PyMem_RawMalloc() by default
|
||||||
|
------------------------------------------------------
|
||||||
|
|
||||||
|
If ``PyMem_Malloc()`` called ``PyMem_RawMalloc()`` by default,
|
||||||
|
calling ``PyMem_SetAllocator(PYMEM_DOMAIN_RAW, alloc)`` would also
|
||||||
|
patch ``PyMem_Malloc()`` indirectly.
|
||||||
|
|
||||||
|
This alternative was rejected because ``PyMem_SetAllocator()`` would
|
||||||
|
have a different behaviour depending on the domain. Always having the
|
||||||
|
same behaviour is less error-prone.
|
||||||
|
|
||||||
|
|
||||||
|
Add a new PYDEBUGMALLOC environment variable
|
||||||
|
--------------------------------------------
|
||||||
|
|
||||||
|
It was proposed to add a new ``PYDEBUGMALLOC`` environment variable to
|
||||||
|
enable debug checks on memory block allocators. It would have had the same
|
||||||
|
effect as calling ``PyMem_SetupDebugHooks()``, without the need
|
||||||
|
to write any C code. Another advantage is that debug checks could be enabled
|
||||||
|
even in release mode: debug checks would always be compiled in, but only
|
||||||
|
enabled when the environment variable is present and non-empty.
|
||||||
|
|
||||||
|
This alternative was rejected because a new environment variable would
|
||||||
|
make Python initialization even more complex. `PEP 432
|
||||||
|
<http://www.python.org/dev/peps/pep-0432/>`_ tries to simplify the
|
||||||
|
CPython startup sequence.
|
||||||
|
|
||||||
|
|
||||||
|
Use macros to get customizable allocators
|
||||||
|
-----------------------------------------
|
||||||
|
|
||||||
|
To have no overhead in the default configuration, customizable
|
||||||
|
allocators would be an optional feature enabled by a configuration
|
||||||
|
option or by macros.
|
||||||
|
|
||||||
|
This alternative was rejected because the use of macros implies having
|
||||||
|
to recompile extension modules to use the new allocator and allocator
|
||||||
|
hooks. Not having to recompile Python or extension modules makes debug
|
||||||
|
hooks easier to use in practice.
|
||||||
|
|
||||||
|
|
||||||
|
Pass the C filename and line number
|
||||||
|
-----------------------------------
|
||||||
|
|
||||||
|
Define allocator functions as macros using ``__FILE__`` and ``__LINE__``
|
||||||
|
to get the C filename and line number of a memory allocation.
|
||||||
|
|
||||||
|
Example of ``PyMem_Malloc`` macro with the modified
|
||||||
|
``PyMemAllocator`` structure::
|
||||||
|
|
||||||
|
typedef struct {
|
||||||
|
/* user context passed as the first argument
|
||||||
|
to the 3 functions */
|
||||||
|
void *ctx;
|
||||||
|
|
||||||
|
/* allocate a memory block */
|
||||||
|
void* (*malloc) (void *ctx, const char *filename, int lineno,
|
||||||
|
size_t size);
|
||||||
|
|
||||||
|
/* allocate or resize a memory block */
|
||||||
|
void* (*realloc) (void *ctx, const char *filename, int lineno,
|
||||||
|
void *ptr, size_t new_size);
|
||||||
|
|
||||||
|
/* release a memory block */
|
||||||
|
void (*free) (void *ctx, const char *filename, int lineno,
|
||||||
|
void *ptr);
|
||||||
|
} PyMemAllocator;
|
||||||
|
|
||||||
|
void* _PyMem_MallocTrace(const char *filename, int lineno,
|
||||||
|
size_t size);
|
||||||
|
|
||||||
|
/* the function is still needed for the Python stable ABI */
|
||||||
|
void* PyMem_Malloc(size_t size);
|
||||||
|
|
||||||
|
#define PyMem_Malloc(size) \
|
||||||
|
_PyMem_MallocTrace(__FILE__, __LINE__, size)
|
||||||
|
|
||||||
|
The GC allocator functions would also have to be patched. For example,
|
||||||
|
``_PyObject_GC_Malloc()`` is used in many C functions and so objects of
|
||||||
|
different types would have the same allocation location.
|
||||||
|
|
||||||
|
This alternative was rejected because passing a filename and a line
|
||||||
|
number to each allocator makes the API more complex: it requires passing three
|
||||||
|
arguments (ctx, filename, lineno) to each allocator function, instead of
|
||||||
|
just a context argument (ctx). Having to also modify GC allocator
|
||||||
|
functions adds too much complexity for little gain.
|
||||||
|
|
||||||
|
|
||||||
|
GIL-free PyMem_Malloc()
|
||||||
|
-----------------------
|
||||||
|
|
||||||
|
In Python 3.3, when Python is compiled in debug mode, ``PyMem_Malloc()``
|
||||||
|
indirectly calls ``PyObject_Malloc()`` which requires the GIL to be
|
||||||
|
held (it isn't thread-safe). That's why ``PyMem_Malloc()`` must be called
|
||||||
|
with the GIL held.
|
||||||
|
|
||||||
|
This PEP changes ``PyMem_Malloc()``: it now always calls ``malloc()``
|
||||||
|
rather than ``PyObject_Malloc()``. The "GIL must be held" restriction
|
||||||
|
could therefore be removed from ``PyMem_Malloc()``.
|
||||||
|
|
||||||
|
This alternative was rejected because allowing calls to
|
||||||
|
``PyMem_Malloc()`` without holding the GIL can break applications
|
||||||
|
which set up their own allocators or allocator hooks. Holding the GIL is
|
||||||
|
convenient to develop a custom allocator: no need to care about other
|
||||||
|
threads. It is also convenient for a debug allocator hook: Python
|
||||||
|
objects can be safely inspected, and the C API may be used for reporting.
|
||||||
|
|
||||||
|
Moreover, calling ``PyGILState_Ensure()`` in a memory allocator has
|
||||||
|
unexpected behaviour, especially at Python startup and when creating a
|
||||||
|
new Python thread state. It is better to free custom allocators of
|
||||||
|
the responsibility of acquiring the GIL.
|
||||||
|
|
||||||
|
|
||||||
|
Don't add PyMem_RawMalloc()
|
||||||
|
---------------------------
|
||||||
|
|
||||||
|
Replace ``malloc()`` with ``PyMem_Malloc()``, but only if the GIL is
|
||||||
|
held. Otherwise, keep ``malloc()`` unchanged.
|
||||||
|
|
||||||
|
``PyMem_Malloc()`` is used without the GIL held in some Python
|
||||||
|
functions. For example, the ``main()`` and ``Py_Main()`` functions of
|
||||||
|
Python call ``PyMem_Malloc()`` while the GIL does not exist yet. In this
|
||||||
|
case, ``PyMem_Malloc()`` would be replaced with ``malloc()`` (or
|
||||||
|
``PyMem_RawMalloc()``).
|
||||||
|
|
||||||
|
This alternative was rejected because ``PyMem_RawMalloc()`` is required
|
||||||
|
for accurate reports of the memory usage. When a debug hook is used to
|
||||||
|
track the memory usage, the memory allocated by direct calls to
|
||||||
|
``malloc()`` cannot be tracked. ``PyMem_RawMalloc()`` can be hooked and
|
||||||
|
so all the memory allocated by Python can be tracked, including
|
||||||
|
memory allocated without holding the GIL.
|
||||||
|
|
||||||
|
|
||||||
|
Use existing debug tools to analyze memory use
|
||||||
|
----------------------------------------------
|
||||||
|
|
||||||
|
There are many existing debug tools to analyze memory use. Some
|
||||||
|
examples: `Valgrind <http://valgrind.org/>`_, `Purify
|
||||||
|
<http://ibm.com/software/awdtools/purify/>`_, `Clang AddressSanitizer
|
||||||
|
<http://code.google.com/p/address-sanitizer/>`_, `failmalloc
|
||||||
|
<http://www.nongnu.org/failmalloc/>`_, etc.
|
||||||
|
|
||||||
|
The problem is to retrieve the Python object related to a memory pointer
|
||||||
|
to read its type and/or its content. Another issue is to retrieve the
|
||||||
|
source of the memory allocation: the C backtrace is usually useless
|
||||||
|
(same reasoning as for macros using ``__FILE__`` and ``__LINE__``, see
|
||||||
|
`Pass the C filename and line number`_), the Python filename and line
|
||||||
|
number (or even the Python traceback) are more useful.
|
||||||
|
|
||||||
|
This alternative was rejected because classic tools are unable to
|
||||||
|
introspect Python internals to collect such information. Being able to
|
||||||
|
set up a hook on allocators called with the GIL held makes it possible to collect a
|
||||||
|
lot of useful data from Python internals.
|
||||||
|
|
||||||
|
|
||||||
|
Add a msize() function
|
||||||
|
----------------------
|
||||||
|
|
||||||
|
Add another function to ``PyMemAllocator`` and
|
||||||
|
``PyObjectArenaAllocator`` structures::
|
||||||
|
|
||||||
|
size_t msize(void *ptr);
|
||||||
|
|
||||||
|
This function returns the size of a memory block or a memory mapping.
|
||||||
|
Return (size_t)-1 if the function is not implemented or if the pointer
|
||||||
|
is unknown (e.g. a NULL pointer).
|
||||||
|
|
||||||
|
On Windows, this function can be implemented using ``_msize()`` and
|
||||||
|
``VirtualQuery()``.
|
||||||
|
|
||||||
|
The function can be used to implement a hook tracking the memory usage.
|
||||||
|
The ``free()`` method of an allocator only gets the address of a memory
|
||||||
|
block, whereas the size of the memory block is required to update the
|
||||||
|
memory usage.
|
||||||
|
|
||||||
|
The additional ``msize()`` function was rejected because only a few
|
||||||
|
platforms implement it. For example, Linux with the GNU libc does not
|
||||||
|
provide a function to get the size of a memory block. ``msize()`` is not
|
||||||
|
currently used in the Python source code. The function would only be
|
||||||
|
used to track memory use, and would make the API more complex. A debug hook
|
||||||
|
can implement the function internally; there is no need to add it to
|
||||||
|
``PyMemAllocator`` and ``PyObjectArenaAllocator`` structures.
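
As an illustration of how a debug hook could implement the function
internally, here is a minimal sketch that stores the size of each block
in a small header placed before it. It is written against plain
``malloc()`` and ``free()`` so that it compiles without ``Python.h``;
the names ``block_header``, ``traced_malloc``, ``traced_msize``,
``traced_free`` and ``traced_bytes`` are hypothetical and are not part
of this PEP.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical header stored in front of every block. The union keeps
   the pointer returned to the caller aligned for any type. */
typedef union {
    size_t size;
    max_align_t align;
} block_header;

/* Number of bytes currently allocated through traced_malloc(). */
size_t traced_bytes = 0;

void *traced_malloc(size_t size)
{
    block_header *header;

    if (size > SIZE_MAX - sizeof(block_header))
        return NULL;                      /* the addition would overflow */
    header = malloc(sizeof(block_header) + size);
    if (header == NULL)
        return NULL;
    header->size = size;
    traced_bytes += size;
    return header + 1;                    /* user data follows the header */
}

/* Return the size requested for ptr, or (size_t)-1 for a NULL pointer. */
size_t traced_msize(void *ptr)
{
    if (ptr == NULL)
        return (size_t)-1;
    return ((block_header *)ptr - 1)->size;
}

void traced_free(void *ptr)
{
    block_header *header;

    if (ptr == NULL)
        return;
    header = (block_header *)ptr - 1;
    traced_bytes -= header->size;         /* the size comes from the header */
    free(header);
}
```

The cost of this approach is one header per block. In exchange, the
hook does not depend on a platform-specific function to retrieve the
size of a block.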
|
||||||
|
|
||||||
|
|
||||||
|
No context argument
|
||||||
|
-------------------
|
||||||
|
|
||||||
|
Simplify the signature of allocator functions, remove the context
|
||||||
|
argument:
|
||||||
|
|
||||||
|
* ``void* malloc(size_t size)``
|
||||||
|
* ``void* realloc(void *ptr, size_t new_size)``
|
||||||
|
* ``void free(void *ptr)``
|
||||||
|
|
||||||
|
It is likely for an allocator hook to be reused for
|
||||||
|
``PyMem_SetAllocator()`` and ``PyObject_SetAllocator()``, or even
|
||||||
|
``PyMem_SetRawAllocator()``, but the hook must call a different function
|
||||||
|
depending on the allocator. The context is a convenient way to reuse the
|
||||||
|
same custom allocator or hook for different Python allocators.
|
||||||
|
|
||||||
|
In C++, the context can be used to pass *this*.
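
The following sketch shows how the context allows a single hook
function to serve several allocators. The ``allocator`` structure is a
reduced stand-in for ``PyMemAllocator`` so that the code compiles
without ``Python.h``; ``hook_state``, ``base_malloc``, ``base_free``
and the ``raw_*`` / ``obj_*`` variables are hypothetical names used
only for this illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Reduced stand-in for PyMemAllocator (assumption: only two functions). */
typedef struct {
    void *ctx;
    void *(*malloc)(void *ctx, size_t size);
    void (*free)(void *ctx, void *ptr);
} allocator;

/* State reached through ctx: the wrapped allocator and a call counter. */
typedef struct {
    allocator next;
    size_t calls;
} hook_state;

void *base_malloc(void *ctx, size_t size)
{
    (void)ctx;                            /* the base allocator has no state */
    return malloc(size);
}

void base_free(void *ctx, void *ptr)
{
    (void)ctx;
    free(ptr);
}

/* The same hook is installed everywhere; ctx selects the state. */
void *hook_malloc(void *ctx, size_t size)
{
    hook_state *state = ctx;

    state->calls++;
    return state->next.malloc(state->next.ctx, size);
}

void hook_free(void *ctx, void *ptr)
{
    hook_state *state = ctx;

    state->next.free(state->next.ctx, ptr);
}

/* Two "domains" sharing the hook functions but not the state. */
hook_state raw_state = {{NULL, base_malloc, base_free}, 0};
hook_state obj_state = {{NULL, base_malloc, base_free}, 0};

allocator raw_domain = {&raw_state, hook_malloc, hook_free};
allocator obj_domain = {&obj_state, hook_malloc, hook_free};
```

Without a context argument, ``hook_malloc()`` would have to be
duplicated for each allocator, or rely on global variables to find the
function to call.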
|
||||||
|
|
||||||
|
|
||||||
|
External Libraries
|
||||||
|
==================
|
||||||
|
|
||||||
|
Examples of APIs used to customize memory allocators.
|
||||||
|
|
||||||
|
Libraries used by Python:
|
||||||
|
|
||||||
|
* OpenSSL: `CRYPTO_set_mem_functions()
|
||||||
|
<http://git.openssl.org/gitweb/?p=openssl.git;a=blob;f=crypto/mem.c;h=f7984fa958eb1edd6c61f6667f3f2b29753be662;hb=HEAD#l124>`_
|
||||||
|
to set memory management functions globally
|
||||||
|
* expat: `parserCreate()
|
||||||
|
<http://hg.python.org/cpython/file/cc27d50bd91a/Modules/expat/xmlparse.c#l724>`_
|
||||||
|
has a per-instance memory handler
|
||||||
|
* zlib: `zlib 1.2.8 Manual <http://www.zlib.net/manual.html#Usage>`_,
|
||||||
|
pass an opaque pointer
|
||||||
|
* bz2: `bzip2 and libbzip2, version 1.0.5
|
||||||
|
<http://www.bzip.org/1.0.5/bzip2-manual-1.0.5.html>`_,
|
||||||
|
pass an opaque pointer
|
||||||
|
* lzma: `LZMA SDK - How to Use
|
||||||
|
<http://www.asawicki.info/news_1368_lzma_sdk_-_how_to_use.html>`_,
|
||||||
|
pass an opaque pointer
|
||||||
|
* libmpdec: no opaque pointer (classic malloc API)
|
||||||
|
|
||||||
|
Other libraries:
|
||||||
|
|
||||||
|
* glib: `g_mem_set_vtable()
|
||||||
|
<http://developer.gnome.org/glib/unstable/glib-Memory-Allocation.html#g-mem-set-vtable>`_
|
||||||
|
* libxml2:
|
||||||
|
`xmlGcMemSetup() <http://xmlsoft.org/html/libxml-xmlmemory.html>`_,
|
||||||
|
global
|
||||||
|
* Oracle's OCI: `Oracle Call Interface Programmer's Guide,
|
||||||
|
Release 2 (9.2)
|
||||||
|
<http://docs.oracle.com/cd/B10501_01/appdev.920/a96584/oci15re4.htm>`_,
|
||||||
|
pass an opaque pointer
|
||||||
|
|
||||||
|
The new *ctx* parameter of this PEP was inspired by the API of zlib and
|
||||||
|
Oracle's OCI libraries.
|
||||||
|
|
||||||
|
See also the `GNU libc: Memory Allocation Hooks
|
||||||
|
<http://www.gnu.org/software/libc/manual/html_node/Hooks-for-Malloc.html>`_
|
||||||
|
which uses a different approach to hook memory allocators.
|
||||||
|
|
||||||
|
|
||||||
|
Memory Allocators
|
||||||
|
=================
|
||||||
|
|
||||||
|
The C standard library provides the well known ``malloc()`` function.
|
||||||
|
Its implementation depends on the platform and on the C library. The GNU
|
||||||
|
C library uses a modified ptmalloc2, based on "Doug Lea's Malloc"
|
||||||
|
(dlmalloc). FreeBSD uses `jemalloc
|
||||||
|
<http://www.canonware.com/jemalloc/>`_. Google provides *tcmalloc* which
|
||||||
|
is part of `gperftools <http://code.google.com/p/gperftools/>`_.
|
||||||
|
|
||||||
|
``malloc()`` uses two kinds of memory: heap and memory mappings. Memory
|
||||||
|
mappings are usually used for large allocations (e.g. larger than 256
|
||||||
|
KB), whereas the heap is used for small allocations.
|
||||||
|
|
||||||
|
On UNIX, the heap is handled by ``brk()`` and ``sbrk()`` system calls,
|
||||||
|
and it is contiguous. On Windows, the heap is handled by
|
||||||
|
``HeapAlloc()`` and can be discontiguous. Memory mappings are handled by
|
||||||
|
``mmap()`` on UNIX and ``VirtualAlloc()`` on Windows; they can be
|
||||||
|
discontiguous.
|
||||||
|
|
||||||
|
Releasing a memory mapping immediately gives the memory back to the
|
||||||
|
system. On UNIX, the heap memory is only given back to the system if the
|
||||||
|
released block is located at the end of the heap. Otherwise, the memory
|
||||||
|
will only be given back to the system when all the memory located after
|
||||||
|
the released memory is also released.
|
||||||
|
|
||||||
|
To allocate memory on the heap, an allocator tries to reuse free space.
|
||||||
|
If there is no contiguous space big enough, the heap must be enlarged,
|
||||||
|
even if there is more free space than the required size. This issue is
|
||||||
|
called "memory fragmentation": the memory usage seen by the system
|
||||||
|
is higher than real usage. On Windows, ``HeapAlloc()`` creates
|
||||||
|
a new memory mapping with ``VirtualAlloc()`` if there is not enough free
|
||||||
|
contiguous memory.
|
||||||
|
|
||||||
|
CPython has a *pymalloc* allocator for allocations smaller than 512
|
||||||
|
bytes. This allocator is optimized for small objects with a short
|
||||||
|
lifetime. It uses memory mappings called "arenas" with a fixed size of
|
||||||
|
256 KB.
|
||||||
|
|
||||||
|
Other allocators:
|
||||||
|
|
||||||
|
* Windows provides a `Low-fragmentation Heap
|
||||||
|
<http://msdn.microsoft.com/en-us/library/windows/desktop/aa366750%28v=vs.85%29.aspx>`_.
|
||||||
|
|
||||||
|
* The Linux kernel uses `slab allocation
|
||||||
|
<http://en.wikipedia.org/wiki/Slab_allocation>`_.
|
||||||
|
|
||||||
|
* The glib library has a `Memory Slice API
|
||||||
|
<https://developer.gnome.org/glib/unstable/glib-Memory-Slices.html>`_:
|
||||||
|
efficient way to allocate groups of equal-sized chunks of memory
|
||||||
|
|
||||||
|
This PEP allows you to choose exactly which memory allocator is used for your
|
||||||
|
application depending on its usage of the memory (number of allocations,
|
||||||
|
size of allocations, lifetime of objects, etc.).
|
||||||
|
|
||||||
|
|
||||||
|
Links
|
||||||
|
=====
|
||||||
|
|
||||||
|
CPython issues related to memory allocation:
|
||||||
|
|
||||||
|
* `Issue #3329: Add new APIs to customize memory allocators
|
||||||
|
<http://bugs.python.org/issue3329>`_
|
||||||
|
* `Issue #13483: Use VirtualAlloc to allocate memory arenas
|
||||||
|
<http://bugs.python.org/issue13483>`_
|
||||||
|
* `Issue #16742: PyOS_Readline drops GIL and calls PyOS_StdioReadline,
|
||||||
|
which isn't thread safe <http://bugs.python.org/issue16742>`_
|
||||||
|
* `Issue #18203: Replace calls to malloc() with PyMem_Malloc() or
|
||||||
|
PyMem_RawMalloc() <http://bugs.python.org/issue18203>`_
|
||||||
|
* `Issue #18227: Use Python memory allocators in external libraries like
|
||||||
|
zlib or OpenSSL <http://bugs.python.org/issue18227>`_
|
||||||
|
|
||||||
|
Projects analyzing the memory usage of Python applications:
|
||||||
|
|
||||||
|
* `pytracemalloc
|
||||||
|
<https://pypi.python.org/pypi/pytracemalloc>`_
|
||||||
|
* `Meliae: Python Memory Usage Analyzer
|
||||||
|
<https://pypi.python.org/pypi/meliae>`_
|
||||||
|
* `Guppy-PE: umbrella package combining Heapy and GSL
|
||||||
|
<http://guppy-pe.sourceforge.net/>`_
|
||||||
|
* `PySizer (developed for Python 2.4)
|
||||||
|
<http://pysizer.8325.org/>`_
|
||||||
|
|
||||||
|
|
||||||
|
Copyright
|
||||||
|
=========
|
||||||
|
|
||||||
|
This document has been placed into the public domain.
|
||||||
|
|
|
@ -0,0 +1,242 @@
|
||||||
|
PEP: 446
|
||||||
|
Title: Add new parameters to configure the inheritance of files and for non-blocking sockets
|
||||||
|
Version: $Revision$
|
||||||
|
Last-Modified: $Date$
|
||||||
|
Author: Victor Stinner <victor.stinner@gmail.com>
|
||||||
|
Status: Draft
|
||||||
|
Type: Standards Track
|
||||||
|
Content-Type: text/x-rst
|
||||||
|
Created: 3-July-2013
|
||||||
|
Python-Version: 3.4
|
||||||
|
|
||||||
|
|
||||||
|
Abstract
|
||||||
|
========
|
||||||
|
|
||||||
|
This PEP proposes new portable parameters and functions to configure the
|
||||||
|
inheritance of file descriptors and the non-blocking flag of sockets.
|
||||||
|
|
||||||
|
|
||||||
|
Rationale
|
||||||
|
=========
|
||||||
|
|
||||||
|
Inheritance of file descriptors
|
||||||
|
-------------------------------
|
||||||
|
|
||||||
|
The inheritance of file descriptors in child processes can be configured
|
||||||
|
on each file descriptor using a *close-on-exec* flag. By default, the
|
||||||
|
close-on-exec flag is not set.
|
||||||
|
|
||||||
|
On Windows, the close-on-exec flag is the inverse of ``HANDLE_FLAG_INHERIT``. File
|
||||||
|
descriptors are not inherited if the ``bInheritHandles`` parameter of
|
||||||
|
the ``CreateProcess()`` function is ``FALSE``, even if the
|
||||||
|
``HANDLE_FLAG_INHERIT`` flag is set. If ``bInheritHandles`` is ``TRUE``,
|
||||||
|
only file descriptors with ``HANDLE_FLAG_INHERIT`` flag set are
|
||||||
|
inherited, others are not.
|
||||||
|
|
||||||
|
On UNIX, the close-on-exec flag is ``O_CLOEXEC``. File descriptors with
|
||||||
|
the ``O_CLOEXEC`` flag set are closed at the execution of a new program
|
||||||
|
(e.g. when calling ``execv()``).
|
||||||
|
|
||||||
|
The ``O_CLOEXEC`` flag has no effect on ``fork()``, all file descriptors
|
||||||
|
are inherited by the child process. Furthermore, most properties of file
|
||||||
|
descriptors are shared between the parent and the child processes,
|
||||||
|
except file attributes which are duplicated (``O_CLOEXEC`` is the only
|
||||||
|
file attribute). Setting ``O_CLOEXEC`` flag of a file descriptor in the
|
||||||
|
child process does not change the ``O_CLOEXEC`` flag of the file
|
||||||
|
descriptor in the parent process.
|
||||||
|
|
||||||
|
|
||||||
|
Issues of the inheritance of file descriptors
|
||||||
|
---------------------------------------------
|
||||||
|
|
||||||
|
Inheritance of file descriptors causes issues. For example, closing a
|
||||||
|
file descriptor in the parent process does not release the resource
|
||||||
|
(file, socket, ...), because the file descriptor is still open in the
|
||||||
|
child process.
|
||||||
|
|
||||||
|
Leaking file descriptors is also a major security vulnerability. An
|
||||||
|
untrusted child process can read sensitive data like passwords and take
|
||||||
|
control of the parent process through leaked file descriptors. It is for
|
||||||
|
example a known vulnerability to escape from a chroot.
|
||||||
|
|
||||||
|
|
||||||
|
Non-blocking sockets
|
||||||
|
--------------------
|
||||||
|
|
||||||
|
To handle multiple network clients in a single thread, a multiplexing
|
||||||
|
function like ``select()`` can be used. For best performance, sockets
|
||||||
|
must be configured as non-blocking. Operations like ``send()`` and
|
||||||
|
``recv()`` return an ``EAGAIN`` or ``EWOULDBLOCK`` error if the
|
||||||
|
operation would block.
|
||||||
|
|
||||||
|
By default, newly created sockets are blocking. Setting the non-blocking
|
||||||
|
mode requires additional system calls.
|
||||||
|
|
||||||
|
On UNIX, the blocking flag is ``O_NONBLOCK``: a pipe and a socket are
|
||||||
|
non-blocking if the ``O_NONBLOCK`` flag is set.
|
||||||
|
|
||||||
|
|
||||||
|
Setting flags at the creation of the file descriptor
|
||||||
|
----------------------------------------------------
|
||||||
|
|
||||||
|
Windows and recent versions of other operating systems like Linux
|
||||||
|
support setting the close-on-exec flag directly at the creation of file
|
||||||
|
descriptors, and close-on-exec and blocking flags at the creation of
|
||||||
|
sockets.
|
||||||
|
|
||||||
|
Setting these flags at the creation is atomic and avoids additional
|
||||||
|
system calls.
|
||||||
|
|
||||||
|
|
||||||
|
Proposal
|
||||||
|
========
|
||||||
|
|
||||||
|
New cloexec And blocking Parameters
|
||||||
|
-----------------------------------
|
||||||
|
|
||||||
|
Add a new optional *cloexec* parameter to functions creating file descriptors:
|
||||||
|
|
||||||
|
* ``io.FileIO``
|
||||||
|
* ``io.open()``
|
||||||
|
* ``open()``
|
||||||
|
* ``os.dup()``
|
||||||
|
* ``os.dup2()``
|
||||||
|
* ``os.fdopen()``
|
||||||
|
* ``os.open()``
|
||||||
|
* ``os.openpty()``
|
||||||
|
* ``os.pipe()``
|
||||||
|
* ``select.devpoll()``
|
||||||
|
* ``select.epoll()``
|
||||||
|
* ``select.kqueue()``
|
||||||
|
|
||||||
|
Add new optional *cloexec* and *blocking* parameters to functions
|
||||||
|
creating sockets:
|
||||||
|
|
||||||
|
* ``asyncore.dispatcher.create_socket()``
|
||||||
|
* ``socket.socket()``
|
||||||
|
* ``socket.socket.accept()``
|
||||||
|
* ``socket.socket.dup()``
|
||||||
|
* ``socket.socket.fromfd``
|
||||||
|
* ``socket.socketpair()``
|
||||||
|
|
||||||
|
The default value of *cloexec* is ``False`` and the default value of
|
||||||
|
*blocking* is ``True``.
|
||||||
|
|
||||||
|
The atomicity is not guaranteed. If the platform does not support
|
||||||
|
setting close-on-exec and blocking flags at the creation of the file
|
||||||
|
descriptor or socket, the flags are set using additional system calls.
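
The difference between the atomic and the non-atomic path can be
sketched in C as follows. This is an illustration under stated
assumptions, not the implementation proposed by this PEP: the helper
name ``open_cloexec`` is hypothetical, and the sketch only covers
``open()`` without ``O_CREAT`` (no *mode* argument is passed).

```c
#include <fcntl.h>
#include <unistd.h>

/* Open path with the close-on-exec flag set.
   Return the file descriptor, or -1 on error.
   Assumption: flags does not contain O_CREAT. */
int open_cloexec(const char *path, int flags)
{
#ifdef O_CLOEXEC
    /* Atomic: the flag is set by the system call that creates the
       file descriptor. */
    return open(path, flags | O_CLOEXEC);
#else
    /* Not atomic: another thread may call fork() and exec() between
       open() and fcntl(), and leak the file descriptor. */
    int fd = open(path, flags);

    if (fd == -1)
        return -1;
    if (fcntl(fd, F_SETFD, FD_CLOEXEC) == -1) {
        close(fd);
        return -1;
    }
    return fd;
#endif
}
```

The fallback branch costs one extra system call per file descriptor,
which is the overhead mentioned above.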
|
||||||
|
|
||||||
|
|
||||||
|
New Functions
|
||||||
|
-------------
|
||||||
|
|
||||||
|
Add new functions to get and set the close-on-exec flag of a file
|
||||||
|
descriptor, available on all platforms:
|
||||||
|
|
||||||
|
* ``os.get_cloexec(fd:int) -> bool``
|
||||||
|
* ``os.set_cloexec(fd:int, cloexec: bool)``
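
On UNIX, such functions could be built on ``fcntl()`` with the
``F_GETFD`` and ``F_SETFD`` commands. The sketch below is only an
illustration of that approach; the C helper names ``get_cloexec`` and
``set_cloexec`` mirror the proposed Python names but are hypothetical,
and the Windows side is not covered.

```c
#include <fcntl.h>

/* Return 1 if the close-on-exec flag of fd is set, 0 if it is not set,
   -1 on error. */
int get_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags == -1)
        return -1;
    return (flags & FD_CLOEXEC) != 0;
}

/* Set (cloexec != 0) or clear (cloexec == 0) the close-on-exec flag of
   fd. Return 0 on success, -1 on error. */
int set_cloexec(int fd, int cloexec)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags == -1)
        return -1;
    if (cloexec)
        flags |= FD_CLOEXEC;
    else
        flags &= ~FD_CLOEXEC;
    return fcntl(fd, F_SETFD, flags);
}
```

Reading the current flags before writing them keeps any other file
descriptor flag untouched.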
|
||||||
|
|
||||||
|
Add new functions to get and set the blocking flag of a file
|
||||||
|
descriptor, only available on UNIX:
|
||||||
|
|
||||||
|
* ``os.get_blocking(fd:int) -> bool``
|
||||||
|
* ``os.set_blocking(fd:int, blocking: bool)``
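
A possible UNIX implementation relies on ``fcntl()`` with the
``F_GETFL`` and ``F_SETFL`` commands and the ``O_NONBLOCK`` flag. The
following sketch is an illustration only; the C helper names
``get_blocking``, ``set_blocking`` and ``empty_pipe_would_block`` are
hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Return 1 if fd is blocking, 0 if it is non-blocking, -1 on error. */
int get_blocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    return (flags & O_NONBLOCK) == 0;
}

/* Make fd blocking (blocking != 0) or non-blocking (blocking == 0).
   Return 0 on success, -1 on error. */
int set_blocking(int fd, int blocking)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    if (blocking)
        flags &= ~O_NONBLOCK;
    else
        flags |= O_NONBLOCK;
    return fcntl(fd, F_SETFL, flags);
}

/* Usage: reading from an empty non-blocking pipe fails immediately
   instead of waiting. Return 1 if read() failed with EAGAIN or
   EWOULDBLOCK, 0 otherwise, -1 if the pipe cannot be created. */
int empty_pipe_would_block(void)
{
    int fds[2];
    char byte;
    int would_block = 0;

    if (pipe(fds) == -1)
        return -1;
    if (set_blocking(fds[0], 0) == 0
        && read(fds[0], &byte, 1) == -1
        && (errno == EAGAIN || errno == EWOULDBLOCK))
        would_block = 1;
    close(fds[0]);
    close(fds[1]);
    return would_block;
}
```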
|
||||||
|
|
||||||
|
|
||||||
|
Other Changes
|
||||||
|
-------------
|
||||||
|
|
||||||
|
The ``subprocess.Popen`` class must clear the close-on-exec flag of file
|
||||||
|
descriptors of the ``pass_fds`` parameter. The flag is cleared in the
|
||||||
|
child process before executing the program; the change does not affect
|
||||||
|
the flag in the parent process.
|
||||||
|
|
||||||
|
The close-on-exec flag must also be set on private file descriptors and
|
||||||
|
sockets in the Python standard library. For example, on UNIX,
|
||||||
|
``os.urandom()`` opens ``/dev/urandom`` to read some random bytes and the
|
||||||
|
file descriptor is closed at function exit. The file descriptor is not
|
||||||
|
expected to be inherited by child processes.
|
||||||
|
|
||||||
|
|
||||||
|
Rejected Alternatives
|
||||||
|
=====================
|
||||||
|
|
||||||
|
PEP 433
|
||||||
|
-------
|
||||||
|
|
||||||
|
PEP 433, entitled "Easier suppression of file descriptor inheritance",
|
||||||
|
is a previous attempt proposing various other alternatives, but no
|
||||||
|
consensus could be reached.
|
||||||
|
|
||||||
|
This PEP has a well defined behaviour (the default value of the new
|
||||||
|
*cloexec* parameter is not configurable), is more conservative (no
|
||||||
|
backward compatibility issue), and is much simpler.
|
||||||
|
|
||||||
|
|
||||||
|
Add blocking parameter for file descriptors and use Windows overlapped I/O
|
||||||
|
--------------------------------------------------------------------------
|
||||||
|
|
||||||
|
Windows supports non-blocking operations on files using an extension of
|
||||||
|
the Windows API called "Overlapped I/O". Using this extension requires
|
||||||
|
modifying the Python standard library and applications to pass an
|
||||||
|
``OVERLAPPED`` structure and an event loop to wait for the completion of
|
||||||
|
operations.
|
||||||
|
|
||||||
|
This PEP only tries to expose portable flags on file descriptors and
|
||||||
|
sockets. Supporting overlapped I/O requires an abstraction providing a
|
||||||
|
high-level and portable API for asynchronous operations on files and
|
||||||
|
sockets. Overlapped I/O is out of the scope of this PEP.
|
||||||
|
|
||||||
|
UNIX supports non-blocking files; moreover, recent versions of operating
|
||||||
|
systems support setting the non-blocking flag at the creation of a file
|
||||||
|
descriptor. It would be possible to add a new optional *blocking*
|
||||||
|
parameter to Python functions creating file descriptors. On Windows,
|
||||||
|
creating a file descriptor with ``blocking=False`` would raise a
|
||||||
|
``NotImplementedError``. This behaviour is not acceptable for the ``os``
|
||||||
|
module which is designed as a thin wrapper on the C functions of the
|
||||||
|
operating system. If a platform does not support a function, the
|
||||||
|
function should not be available on the platform. For example,
|
||||||
|
the ``os.fork()`` function is not available on Windows.
|
||||||
|
|
||||||
|
For all these reasons, this alternative was rejected. PEP 3156
|
||||||
|
proposes an abstraction for asynchronous I/O supporting non-blocking
|
||||||
|
files on Windows.
|
||||||
|
|
||||||
|
|
||||||
|
Links
|
||||||
|
=====
|
||||||
|
|
||||||
|
Python issues:
|
||||||
|
|
||||||
|
* `#10115: Support accept4() for atomic setting of flags at socket
|
||||||
|
creation <http://bugs.python.org/issue10115>`_
|
||||||
|
* `#12105: open() does not able to set flags, such as O_CLOEXEC
|
||||||
|
<http://bugs.python.org/issue12105>`_
|
||||||
|
* `#12107: TCP listening sockets created without FD_CLOEXEC flag
|
||||||
|
<http://bugs.python.org/issue12107>`_
|
||||||
|
* `#16850: Add "e" mode to open(): close-and-exec
|
||||||
|
(O_CLOEXEC) / O_NOINHERIT <http://bugs.python.org/issue16850>`_
|
||||||
|
* `#16860: Use O_CLOEXEC in the tempfile module
|
||||||
|
<http://bugs.python.org/issue16860>`_
|
||||||
|
* `#16946: subprocess: _close_open_fd_range_safe() does not set
|
||||||
|
close-on-exec flag on Linux < 2.6.23 if O_CLOEXEC is defined
|
||||||
|
<http://bugs.python.org/issue16946>`_
|
||||||
|
* `#17070: Use the new cloexec to improve security and avoid bugs
|
||||||
|
<http://bugs.python.org/issue17070>`_
|
||||||
|
|
||||||
|
Other links:
|
||||||
|
|
||||||
|
* `Secure File Descriptor Handling
|
||||||
|
<http://udrepper.livejournal.com/20407.html>`_ (Ulrich Drepper,
|
||||||
|
2008)
|
||||||
|
* `Ghosts of Unix past, part 2: Conflated designs
|
||||||
|
<http://lwn.net/Articles/412131/>`_ (Neil Brown, 2010) explains the
|
||||||
|
history of ``O_CLOEXEC`` and ``O_NONBLOCK`` flags
|
||||||
|
|
||||||
|
|
||||||
|
Copyright
|
||||||
|
=========
|
||||||
|
|
||||||
|
This document has been placed into the public domain.
|
||||||
|
|