merge
commit
1ed9a8a185
18
pep-0008.txt
|
@@ -158,9 +158,21 @@ The preferred way of wrapping long lines is by using Python's implied
|
|||
line continuation inside parentheses, brackets and braces. Long lines
|
||||
can be broken over multiple lines by wrapping expressions in
|
||||
parentheses. These should be used in preference to using a backslash
|
||||
for line continuation. Make sure to indent the continued line
|
||||
appropriately. The preferred place to break around a binary operator
|
||||
is *after* the operator, not before it. Some examples::
|
||||
for line continuation.
|
||||
|
||||
Backslashes may still be appropriate at times. For example, long,
|
||||
multiple ``with``-statements cannot use implicit continuation, so
|
||||
backslashes are acceptable::
|
||||
|
||||
with open('/path/to/some/file/you/want/to/read') as file_1, \
|
||||
open('/path/to/some/file/being/written', 'w') as file_2:
|
||||
file_2.write(file_1.read())
|
||||
|
||||
Another such case is with ``assert`` statements.
|
||||
|
||||
Make sure to indent the continued line appropriately. The preferred
|
||||
place to break around a binary operator is *after* the operator, not
|
||||
before it. Some examples::
|
||||
|
||||
class Rectangle(Blob):
|
||||
|
||||
|
|
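As a rough illustration of the two continuation styles the revised PEP 8 text describes, the sketch below wraps an arithmetic expression in parentheses (breaking after each operator) and uses a backslash only for the multi-item ``with`` statement. The function names and the temporary file names are invented for this example and do not come from the PEP.

```python
import os
import tempfile


def total_cost(unit_price, quantity, tax_rate, shipping):
    # Implied continuation inside parentheses; each break comes
    # *after* the binary operator, as the text above prefers.
    return (unit_price * quantity +
            unit_price * quantity * tax_rate +
            shipping)


def copy_text(source_path, target_path):
    # A long, multi-item ``with`` statement: the case where the text
    # above accepts a backslash continuation.
    with open(source_path) as file_1, \
         open(target_path, 'w') as file_2:
        file_2.write(file_1.read())


# Small demonstration using throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, 'source.txt')
    target = os.path.join(workdir, 'target.txt')
    with open(source, 'w') as handle:
        handle.write('hello')
    copy_text(source, target)
    with open(target) as handle:
        copied = handle.read()
```

The parenthesised form needs no special handling if a line is later reordered, whereas a backslash breaks silently when trailing whitespace follows it, which is presumably why the text keeps it as the exception.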
23
pep-0315.txt
|
@@ -4,7 +4,7 @@ Version: $Revision$
|
|||
Last-Modified: $Date$
|
||||
Author: Raymond Hettinger <python@rcn.com>
|
||||
W Isaac Carroll <icarroll@pobox.com>
|
||||
Status: Deferred
|
||||
Status: Rejected
|
||||
Type: Standards Track
|
||||
Content-Type: text/plain
|
||||
Created: 25-Apr-2003
|
||||
|
@@ -21,19 +21,32 @@ Abstract
|
|||
|
||||
Notice
|
||||
|
||||
Deferred; see
|
||||
Rejected; see
|
||||
http://mail.python.org/pipermail/python-ideas/2013-June/021610.html
|
||||
|
||||
This PEP has been deferred since 2006; see
|
||||
http://mail.python.org/pipermail/python-dev/2006-February/060718.html
|
||||
|
||||
Subsequent efforts to revive the PEP in April 2009 did not
|
||||
meet with success because no syntax emerged that could
|
||||
compete with a while-True and an inner if-break.
|
||||
compete with the following form:
|
||||
|
||||
A syntax was found for a basic do-while loop but it
|
||||
had little support because the condition was at the top:
|
||||
while True:
|
||||
<setup code>
|
||||
if not <condition>:
|
||||
break
|
||||
<loop body>
|
||||
|
||||
A syntax alternative to the one proposed in the PEP was found for
|
||||
a basic do-while loop but it gained little support because the
|
||||
condition was at the top:
|
||||
|
||||
do ... while <cond>:
|
||||
<loop body>
|
||||
|
||||
Users of the language are advised to use the while-True form with
|
||||
an inner if-break when a do-while loop would have been appropriate.
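A minimal sketch of the recommended while-True form with an inner if-break, assuming a hypothetical task of collecting lines until the first blank one. The function name and the input data are illustrative only.

```python
def take_until_blank(lines):
    """Collect lines up to (not including) the first blank one."""
    source = iter(lines)
    collected = []
    while True:
        line = next(source, '')      # <setup code>
        if not line.strip():         # if not <condition>: break
            break
        collected.append(line)       # <loop body>
    return collected


result = take_until_blank(['alpha', 'beta', '', 'gamma'])
```

The setup code runs at least once before the condition is tested, which is the behaviour a do-while loop would have provided.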
|
||||
|
||||
|
||||
Motivation
|
||||
|
||||
|
|
1747
pep-0426.txt
File diff suppressed because it is too large
|
@@ -0,0 +1,249 @@
|
|||
{
|
||||
"id": "http://www.python.org/dev/peps/pep-0426/",
|
||||
"$schema": "http://json-schema.org/draft-04/schema#",
|
||||
"title": "Metadata for Python Software Packages 2.0",
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"metadata_version": {
|
||||
"description": "Version of the file format",
|
||||
"type": "string",
|
||||
"pattern": "^(\\d+(\\.\\d+)*)$"
|
||||
},
|
||||
"generator": {
|
||||
"description": "Name and version of the program that produced this file.",
|
||||
"type": "string",
|
||||
"pattern": "^[0-9A-Za-z]([0-9A-Za-z_.-]*[0-9A-Za-z])( \\((\\d+(\\.\\d+)*)((a|b|c|rc)(\\d+))?(\\.(post)(\\d+))?(\\.(dev)(\\d+))?\\))?$"
|
||||
},
|
||||
"name": {
|
||||
"description": "The name of the distribution.",
|
||||
"type": "string",
|
||||
"pattern": "^[0-9A-Za-z]([0-9A-Za-z_.-]*[0-9A-Za-z])?$"
|
||||
},
|
||||
"version": {
|
||||
"description": "The distribution's public version identifier",
|
||||
"type": "string",
|
||||
"pattern": "^(\\d+(\\.\\d+)*)((a|b|c|rc)(\\d+))?(\\.(post)(\\d+))?(\\.(dev)(\\d+))?$"
|
||||
},
|
||||
"source_label": {
|
||||
"description": "A constrained identifying text string",
|
||||
"type": "string",
|
||||
"pattern": "^[0-9a-z_.+-]+$"
|
||||
},
|
||||
"source_url": {
|
||||
"description": "A string containing a full URL where the source for this specific version of the distribution can be downloaded.",
|
||||
"type": "string",
|
||||
"format": "uri"
|
||||
},
|
||||
"summary": {
|
||||
"description": "A one-line summary of what the distribution does.",
|
||||
"type": "string"
|
||||
},
|
||||
"document_names": {
|
||||
"description": "Names of supporting metadata documents",
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"description": {
|
||||
"type": "string",
|
||||
"$ref": "#/definitions/document_name"
|
||||
},
|
||||
"changelog": {
|
||||
"type": "string",
|
||||
"$ref": "#/definitions/document_name"
|
||||
},
|
||||
"license": {
|
||||
"type": "string",
|
||||
"$ref": "#/definitions/document_name"
|
||||
}
|
||||
},
|
||||
"additionalProperties": false
|
||||
},
|
||||
"keywords": {
|
||||
"description": "A list of additional keywords to be used to assist searching for the distribution in a larger catalog.",
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
}
|
||||
},
|
||||
"license": {
|
||||
"description": "A string indicating the license covering the distribution.",
|
||||
"type": "string"
|
||||
},
|
||||
"classifiers": {
|
||||
"description": "A list of strings, with each giving a single classification value for the distribution.",
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
}
|
||||
},
|
||||
"contacts": {
|
||||
"description": "A list of contributor entries giving the recommended contact points for getting more information about the project.",
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "object",
|
||||
"$ref": "#/definitions/contact"
|
||||
}
|
||||
},
|
||||
"contributors": {
|
||||
"description": "A list of contributor entries for other contributors not already listed as current project points of contact.",
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "object",
|
||||
"$ref": "#/definitions/contact"
|
||||
}
|
||||
},
|
||||
"project_urls": {
|
||||
"description": "A mapping of arbitrary text labels to additional URLs relevant to the project.",
|
||||
"type": "object"
|
||||
},
|
||||
"extras": {
|
||||
"description": "A list of optional sets of dependencies that may be used to define conditional dependencies in \"may_require\" and similar fields.",
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string",
|
||||
"$ref": "#/definitions/extra_name"
|
||||
}
|
||||
},
|
||||
"distributes": {
|
||||
"description": "A list of subdistributions made available through this metadistribution.",
|
||||
"type": "array",
|
||||
"$ref": "#/definitions/dependencies"
|
||||
},
|
||||
"may_distribute": {
|
||||
"description": "A list of subdistributions that may be made available through this metadistribution, based on the extras requested and the target deployment environment.",
|
||||
"$ref": "#/definitions/conditional_dependencies"
|
||||
},
|
||||
"run_requires": {
|
||||
"description": "A list of other distributions needed to run this distribution.",
|
||||
"type": "array",
|
||||
"$ref": "#/definitions/dependencies"
|
||||
},
|
||||
"run_may_require": {
|
||||
"description": "A list of other distributions that may be needed when this distribution is deployed, based on the extras requested and the target deployment environment.",
|
||||
"$ref": "#/definitions/conditional_dependencies"
|
||||
},
|
||||
"test_requires": {
|
||||
"description": "A list of other distributions needed when this distribution is tested.",
|
||||
"type": "array",
|
||||
"$ref": "#/definitions/dependencies"
|
||||
},
|
||||
"test_may_require": {
|
||||
"description": "A list of other distributions that may be needed when this distribution is tested, based on the extras requested and the target deployment environment.",
|
||||
"type": "array",
|
||||
"$ref": "#/definitions/conditional_dependencies"
|
||||
},
|
||||
"build_requires": {
|
||||
"description": "A list of other distributions needed when this distribution is built.",
|
||||
"type": "array",
|
||||
"$ref": "#/definitions/dependencies"
|
||||
},
|
||||
"build_may_require": {
|
||||
"description": "A list of other distributions that may be needed when this distribution is built, based on the extras requested and the target deployment environment.",
|
||||
"type": "array",
|
||||
"$ref": "#/definitions/conditional_dependencies"
|
||||
},
|
||||
"dev_requires": {
|
||||
"description": "A list of other distributions needed when this distribution is developed.",
|
||||
"type": "array",
|
||||
"$ref": "#/definitions/dependencies"
|
||||
},
|
||||
"dev_may_require": {
|
||||
"description": "A list of other distributions that may be needed when this distribution is developed, based on the extras requested and the target deployment environment.",
|
||||
"type": "array",
|
||||
"$ref": "#/definitions/conditional_dependencies"
|
||||
},
|
||||
"provides": {
|
||||
"description": "A list of strings naming additional dependency requirements that are satisfied by installing this distribution. These strings must be of the form Name or Name (Version), as for the requires field.",
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string"
|
||||
}
|
||||
},
|
||||
"obsoleted_by": {
|
||||
"description": "A string that indicates that this project is no longer being developed. The named project provides a substitute or replacement.",
|
||||
"type": "string",
|
||||
"$ref": "#/definitions/version_specifier"
|
||||
},
|
||||
"supports_environments": {
|
||||
"description": "A list of strings specifying the environments that the distribution explicitly supports.",
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string",
|
||||
"$ref": "#/definitions/environment_marker"
|
||||
}
|
||||
},
|
||||
"metabuild_hooks": {
|
||||
"description": "The metabuild_hooks field is used to define various operations that may be invoked on a distribution in a platform independent manner.",
|
||||
"type": "object"
|
||||
},
|
||||
"extensions": {
|
||||
"description": "Extensions to the metadata may be present in a mapping under the 'extensions' key.",
|
||||
"type": "object"
|
||||
}
|
||||
},
|
||||
|
||||
"required": ["metadata_version", "name", "version"],
|
||||
"additionalProperties": false,
|
||||
|
||||
"definitions": {
|
||||
"contact": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"name": {
|
||||
"type": "string"
|
||||
},
|
||||
"email": {
|
||||
"type": "string"
|
||||
},
|
||||
"url": {
|
||||
"type": "string"
|
||||
},
|
||||
"role": {
|
||||
"type": "string"
|
||||
}
|
||||
},
|
||||
"required": ["name"],
|
||||
"additionalProperties": false
|
||||
},
|
||||
"dependencies": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "string",
|
||||
"$ref": "#/definitions/version_specifier"
|
||||
}
|
||||
},
|
||||
"conditional_dependencies": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"extra": {
|
||||
"type": "string",
|
||||
"$ref": "#/definitions/extra_name"
|
||||
},
|
||||
"environment": {
|
||||
"type": "string",
|
||||
"$ref": "#/definitions/environment_marker"
|
||||
},
|
||||
"dependencies": {
|
||||
"type": "array",
|
||||
"$ref": "#/definitions/dependencies"
|
||||
}
|
||||
},
|
||||
"required": ["dependencies"],
|
||||
"additionalProperties": false
|
||||
}
|
||||
},
|
||||
"version_specifier": {
|
||||
"type": "string"
|
||||
},
|
||||
"extra_name": {
|
||||
"type": "string"
|
||||
},
|
||||
"environment_marker": {
|
||||
"type": "string"
|
||||
},
|
||||
"document_name": {
|
||||
"type": "string"
|
||||
}
|
||||
}
|
||||
}
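To give a feel for what the schema enforces, here is a small standard-library sketch that checks only the three required fields and the ``name`` and ``version`` patterns. It is not a JSON Schema validator; the function name ``check_metadata`` and the sample metadata are assumptions made for illustration.

```python
import re

# Patterns copied from the "name" and "version" properties of the schema.
NAME_RE = re.compile(r"^[0-9A-Za-z]([0-9A-Za-z_.-]*[0-9A-Za-z])?$")
VERSION_RE = re.compile(
    r"^(\d+(\.\d+)*)((a|b|c|rc)(\d+))?(\.(post)(\d+))?(\.(dev)(\d+))?$")
# Taken from the schema's top-level "required" list.
REQUIRED_FIELDS = ("metadata_version", "name", "version")


def check_metadata(metadata):
    """Return a list of problems found; an empty list means none."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in metadata:
            problems.append("missing required field: " + field)
    name = metadata.get("name")
    if name is not None and not NAME_RE.match(name):
        problems.append("invalid name: " + repr(name))
    version = metadata.get("version")
    if version is not None and not VERSION_RE.match(version):
        problems.append("invalid version: " + repr(version))
    return problems


good = check_metadata({
    "metadata_version": "2.0",
    "name": "example-dist",          # hypothetical distribution name
    "version": "1.0a1.post2.dev3",
})
bad = check_metadata({"name": "-bad", "version": "1.0-beta"})
```

A complete check would also need to honour ``additionalProperties``, the ``$ref`` definitions and the nested dependency structures, which this sketch deliberately leaves out.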
|
|
@@ -5,7 +5,7 @@ Last-Modified: $Date$
|
|||
Author: Barry Warsaw <barry@python.org>,
|
||||
Eli Bendersky <eliben@gmail.com>,
|
||||
Ethan Furman <ethan@stoneleaf.us>
|
||||
Status: Accepted
|
||||
Status: Final
|
||||
Type: Standards Track
|
||||
Content-Type: text/x-rst
|
||||
Created: 2013-02-23
|
||||
|
@@ -467,6 +467,10 @@ assignment to ``Animal`` is equivalent to::
|
|||
... cat = 3
|
||||
... dog = 4
|
||||
|
||||
The reason for defaulting to ``1`` as the starting number and not ``0`` is
|
||||
that ``0`` is ``False`` in a boolean sense, but enum members all evaluate
|
||||
to ``True``.
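The behaviour described above can be observed with the ``enum`` module as it shipped in the standard library. This short sketch assumes the functional API with a space-separated string of member names.

```python
from enum import Enum

# Functional API: members are numbered automatically, starting at 1.
Animal = Enum('Animal', 'ant bee cat dog')

values = [member.value for member in Animal]
# Members of a plain Enum are truthy regardless of their value.
all_truthy = all(bool(member) for member in Animal)
```

Starting the numbering at ``1`` keeps the automatically assigned values consistent with the members' truthiness.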
|
||||
|
||||
|
||||
Proposed variations
|
||||
===================
|
||||
|
|
323
pep-0440.txt
|
@@ -9,7 +9,7 @@ Status: Draft
|
|||
Type: Standards Track
|
||||
Content-Type: text/x-rst
|
||||
Created: 18 Mar 2013
|
||||
Post-History: 30 Mar 2013, 27-May-2013
|
||||
Post-History: 30 Mar 2013, 27 May 2013, 20 Jun 2013
|
||||
Replaces: 386
|
||||
|
||||
|
||||
|
@@ -27,7 +27,7 @@ standardised approach to versioning, as described in PEP 345 and PEP 386.
|
|||
This PEP was broken out of the metadata 2.0 specification in PEP 426.
|
||||
|
||||
Unlike PEP 426, the notes that remain in this document are intended as
|
||||
part of the final specification.
|
||||
part of the final specification (except for this one).
|
||||
|
||||
|
||||
Definitions
|
||||
|
@@ -40,7 +40,7 @@ document are to be interpreted as described in RFC 2119.
|
|||
The following terms are to be interpreted as described in PEP 426:
|
||||
|
||||
* "Distributions"
|
||||
* "Versions"
|
||||
* "Releases"
|
||||
* "Build tools"
|
||||
* "Index servers"
|
||||
* "Publication tools"
|
||||
|
@@ -52,9 +52,13 @@ The following terms are to be interpreted as described in PEP 426:
|
|||
Version scheme
|
||||
==============
|
||||
|
||||
Distribution versions are identified by both a public version identifier,
|
||||
which supports all defined version comparison operations, and a build
|
||||
label, which supports only strict equality comparisons.
|
||||
Distributions are identified by a public version identifier which
|
||||
supports all defined version comparison operations.
|
||||
|
||||
Distributions may also define a source label, which is not used by
|
||||
automated tools. Source labels are useful when a project's internal
|
||||
versioning scheme requires translation to create a compliant public
|
||||
version identifier.
|
||||
|
||||
The version scheme is used both to describe the distribution version
|
||||
provided by a particular distribution archive, as well as to place
|
||||
|
@@ -84,7 +88,7 @@ Public version identifiers are separated into up to four segments:
|
|||
* Post-release segment: ``.postN``
|
||||
* Development release segment: ``.devN``
|
||||
|
||||
Any given version will be a "release", "pre-release", "post-release" or
|
||||
Any given release will be a "final release", "pre-release", "post-release" or
|
||||
"developmental release" as defined in the following sections.
|
||||
|
||||
.. note::
|
||||
|
@@ -105,28 +109,37 @@ Source labels
|
|||
Source labels are text strings with minimal defined semantics.
|
||||
|
||||
To ensure source labels can be readily incorporated as part of file names
|
||||
and URLs, they MUST be comprised of only ASCII alphanumerics, plus signs,
|
||||
periods and hyphens.
|
||||
and URLs, and to avoid formatting inconsistencies in hexadecimal hash
|
||||
representations they MUST be limited to the following set of permitted
|
||||
characters:
|
||||
|
||||
In addition, source labels MUST be unique within a given distribution.
|
||||
* Lowercase ASCII letters (``[a-z]``)
|
||||
* ASCII digits (``[0-9]``)
|
||||
* underscores (``_``)
|
||||
* hyphens (``-``)
|
||||
* periods (``.``)
|
||||
* plus signs (``+``)
|
||||
|
||||
As with distribution names, all comparisons of source labels MUST be case
|
||||
insensitive.
|
||||
Source labels MUST start and end with an ASCII letter or digit.
|
||||
|
||||
Source labels MUST be unique within each project and MUST NOT match any
|
||||
defined version for the project.
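The character rules for source labels listed above can be expressed as a single regular expression. The following is only a sketch: the helper name is made up, and the uniqueness requirements (which need knowledge of the whole project) are not checked.

```python
import re

# Permitted characters: lowercase ASCII letters, digits, "_", "-", "."
# and "+"; the first and last character must be a letter or digit.
_SOURCE_LABEL = re.compile(r"[0-9a-z]([0-9a-z_.+-]*[0-9a-z])?")


def is_valid_source_label(label):
    """Check only the character rules, not project-wide uniqueness."""
    return _SOURCE_LABEL.fullmatch(label) is not None


checks = {
    label: is_valid_source_label(label)
    for label in ("1.3.7+build.11.e0f985a", "a", "Build-1", "-rc1", "rc1-")
}
```

Placing the hyphen last inside the character class keeps it literal, which avoids it being read as a range operator.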
|
||||
|
||||
|
||||
Releases
|
||||
--------
|
||||
Final releases
|
||||
--------------
|
||||
|
||||
A version identifier that consists solely of a release segment is termed
|
||||
a "release".
|
||||
A version identifier that consists solely of a release segment is
|
||||
termed a "final release".
|
||||
|
||||
The release segment consists of one or more non-negative integer values,
|
||||
separated by dots::
|
||||
The release segment consists of one or more non-negative integer
|
||||
values, separated by dots::
|
||||
|
||||
N[.N]+
|
||||
|
||||
Releases within a project will typically be numbered in a consistently
|
||||
increasing fashion.
|
||||
Final releases within a project MUST be numbered in a consistently
|
||||
increasing fashion, otherwise automated tools will not be able to upgrade
|
||||
them correctly.
|
||||
|
||||
Comparison and ordering of release segments considers the numeric value
|
||||
of each component of the release segment in turn. When comparing release
|
||||
|
@@ -157,8 +170,8 @@ For example::
|
|||
2.0
|
||||
2.0.1
|
||||
|
||||
A release series is any set of release numbers that start with a common
|
||||
prefix. For example, ``3.3.1``, ``3.3.5`` and ``3.3.9.45`` are all
|
||||
A release series is any set of final release numbers that start with a
|
||||
common prefix. For example, ``3.3.1``, ``3.3.5`` and ``3.3.9.45`` are all
|
||||
part of the ``3.3`` release series.
|
||||
|
||||
.. note::
|
||||
|
@@ -206,8 +219,8 @@ of both ``c`` and ``rc`` releases for a common release segment.
|
|||
Post-releases
|
||||
-------------
|
||||
|
||||
Some projects use post-releases to address minor errors in a release that
|
||||
do not affect the distributed software (for example, correcting an error
|
||||
Some projects use post-releases to address minor errors in a final release
|
||||
that do not affect the distributed software (for example, correcting an error
|
||||
in the release notes).
|
||||
|
||||
If used as part of a project's development cycle, these post-releases are
|
||||
|
@@ -371,7 +384,7 @@ are permitted and MUST be ordered as shown::
|
|||
.devN, aN, bN, cN, rcN, <no suffix>, .postN
|
||||
|
||||
Note that `rc` will always sort after `c` (regardless of the numeric
|
||||
component) although they are semantically equivalent. Tools are free to
|
||||
component) although they are semantically equivalent. Tools MAY
|
||||
reject this case as ambiguous and remain in compliance with the PEP.
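One way to picture the suffix ordering shown above is as a sort key. The sketch below is a simplification under stated assumptions: it handles a release segment followed by at most one suffix, and it compares release segments as plain integer tuples without zero padding. The names ``SUFFIX_RANK`` and ``sort_key`` are hypothetical.

```python
import re

# Rank of each suffix within a shared release segment, in the order
# .devN, aN, bN, cN, rcN, <no suffix>, .postN
SUFFIX_RANK = {".dev": 0, "a": 1, "b": 2, "c": 3, "rc": 4, "": 5, ".post": 6}

_VERSION = re.compile(r"(\d+(?:\.\d+)*)(\.dev|a|b|c|rc|\.post)?(\d+)?")


def sort_key(version):
    """Build a tuple that sorts versions sharing a release segment."""
    match = _VERSION.fullmatch(version)
    if match is None:
        raise ValueError("unsupported version: " + repr(version))
    release, suffix, number = match.groups()
    release_parts = tuple(int(part) for part in release.split("."))
    return (release_parts, SUFFIX_RANK[suffix or ""], int(number or 0))


ordered = sorted(
    ["1.0.post1", "1.0", "1.0rc1", "1.0c2", "1.0b1", "1.0a1", "1.0.dev1"],
    key=sort_key,
)
```

Because the suffix rank is compared before the numeric component, ``1.0rc1`` sorts after ``1.0c2``, matching the note that ``rc`` always sorts after ``c``.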
|
||||
|
||||
Within an alpha (``1.0a1``), beta (``1.0b1``), or release candidate
|
||||
|
@@ -506,6 +519,22 @@ numbering based on API compatibility, as well as triggering more appropriate
|
|||
version comparison semantics.
|
||||
|
||||
|
||||
Olson database versioning
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The ``pytz`` project inherits its versioning scheme from the corresponding
|
||||
Olson timezone database versioning scheme: the year followed by a lowercase
|
||||
character indicating the version of the database within that year.
|
||||
|
||||
This can be translated to a compliant 3-part version identifier as
|
||||
``0.<year>.<serial>``, where the serial starts at zero (for the '<year>a'
|
||||
release) and is incremented with each subsequent database update within the
|
||||
year.
|
||||
|
||||
As with other translated version identifiers, the corresponding Olson
|
||||
database version would be recorded in the source label field.
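The translation described above is mechanical enough to sketch in a few lines. The function name is an assumption, and the sketch only accepts labels made of a numeric year followed by a single lowercase ASCII letter.

```python
import re

_OLSON = re.compile(r"([0-9]+)([a-z])")


def translate_olson_version(olson_version):
    """Translate e.g. '2013b' into the 3-part form '0.<year>.<serial>'."""
    match = _OLSON.fullmatch(olson_version)
    if match is None:
        raise ValueError("not an Olson version: " + repr(olson_version))
    year, letter = match.groups()
    serial = ord(letter) - ord("a")   # 'a' -> 0, 'b' -> 1, ...
    return "0.{}.{}".format(year, serial)


first_of_year = translate_olson_version("2013a")
third_of_year = translate_olson_version("2013c")
```

The original label (for example ``2013c``) would still be what goes into the source label field; only the public version identifier is translated.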
|
||||
|
||||
|
||||
Version specifiers
|
||||
==================
|
||||
|
||||
|
@@ -521,7 +550,6 @@ clause:
|
|||
* ``~=``: `Compatible release`_ clause
|
||||
* ``==``: `Version matching`_ clause
|
||||
* ``!=``: `Version exclusion`_ clause
|
||||
* ``is``: `Source reference`_ clause
|
||||
* ``<=``, ``>=``: `Inclusive ordered comparison`_ clause
|
||||
* ``<``, ``>``: `Exclusive ordered comparison`_ clause
|
||||
|
||||
|
@@ -605,6 +633,11 @@ version. The *only* substitution performed is the zero padding of the
|
|||
release segment to ensure the release segments are compared with the same
|
||||
length.
|
||||
|
||||
Whether or not strict version matching is appropriate depends on the specific
|
||||
use case for the version specifier. Automated tools SHOULD at least issue
|
||||
warnings and MAY reject them entirely when strict version matches are used
|
||||
inappropriately.
|
||||
|
||||
Prefix matching may be requested instead of strict comparison, by appending
|
||||
a trailing ``.*`` to the version identifier in the version matching clause.
|
||||
This means that additional trailing segments will be ignored when
|
||||
|
@@ -645,75 +678,6 @@ match or not as shown::
|
|||
!= 1.1.* # Same prefix, so 1.1.post1 does not match clause
|
||||
|
||||
|
||||
Source reference
|
||||
----------------
|
||||
|
||||
A source reference includes the source reference operator ``is`` and
|
||||
a source label or a source URL.
|
||||
|
||||
Installation tools MAY also permit direct references to a platform
|
||||
appropriate binary archive in a source reference clause.
|
||||
|
||||
Publication tools and public index servers SHOULD NOT permit direct
|
||||
references to a platform appropriate binary archive in a source
|
||||
reference clause.
|
||||
|
||||
Source label matching works solely on strict equality comparisons: the
|
||||
candidate source label must be exactly the same as the source label in the
|
||||
version clause for the clause to match the candidate distribution.
|
||||
|
||||
For example, a source reference could be used to depend directly on a
|
||||
version control hash based identifier rather than the translated public
|
||||
version::
|
||||
|
||||
exact-dependency (is 1.3.7+build.11.e0f985a)
|
||||
|
||||
A source URL is distinguished from a source label by the presence of
|
||||
``:`` and ``/`` characters in the source reference. As these characters
|
||||
are not permitted in source labels, they indicate that the reference uses
|
||||
a source URL.
|
||||
|
||||
Some appropriate targets for a source URL are a source tarball, an sdist
|
||||
archive or a direct reference to a tag or specific commit in an online
|
||||
version control system. The exact URLs and
|
||||
targets supported will be installation tool specific.
|
||||
|
||||
For example, a local source archive may be referenced directly::
|
||||
|
||||
pip (is file:///localbuilds/pip-1.3.1.zip)
|
||||
|
||||
All source URL references SHOULD either specify a local file URL, a secure
|
||||
transport mechanism (such as ``https``) or else include an expected hash
|
||||
value in the URL for verification purposes. If an insecure network
|
||||
transport is specified without any hash information (or with hash
|
||||
information that the tool doesn't understand), automated tools SHOULD
|
||||
at least emit a warning and MAY refuse to rely on the URL.
|
||||
|
||||
It is RECOMMENDED that only hashes which are unconditionally provided by
|
||||
the latest version of the standard library's ``hashlib`` module be used
|
||||
for source archive hashes. At time of writing, that list consists of
|
||||
``'md5'``, ``'sha1'``, ``'sha224'``, ``'sha256'``, ``'sha384'``, and
|
||||
``'sha512'``.
|
||||
|
||||
For source archive references, an expected hash value may be
|
||||
specified by including a ``<hash-algorithm>=<expected-hash>`` as part of
|
||||
the URL fragment.
|
||||
|
||||
For version control references, the ``VCS+protocol`` scheme SHOULD be
|
||||
used to identify both the version control system and the secure transport.
|
||||
|
||||
To support version control systems that do not support including commit or
|
||||
tag references directly in the URL, that information may be appended to the
|
||||
end of the URL using the ``@<tag>`` notation.
|
||||
|
||||
The use of ``is`` when defining dependencies for published distributions
|
||||
is strongly discouraged as it greatly complicates the deployment of
|
||||
security fixes. The source label matching operator is intended primarily
|
||||
for use when defining dependencies for repeatable *deployments of
|
||||
applications* while using a shared distribution index, as well as to
|
||||
reference dependencies which are not published through an index server.
|
||||
|
||||
|
||||
Inclusive ordered comparison
|
||||
----------------------------
|
||||
|
||||
|
@@ -752,62 +716,108 @@ Handling of pre-releases
|
|||
------------------------
|
||||
|
||||
Pre-releases of any kind, including developmental releases, are implicitly
|
||||
excluded from all version specifiers, *unless* a pre-release or developmental
|
||||
release is explicitly mentioned in one of the clauses. For example, these
|
||||
specifiers implicitly exclude all pre-releases and development
|
||||
releases of later versions::
|
||||
|
||||
2.2
|
||||
>= 1.0
|
||||
|
||||
While these specifiers would include at least some of them::
|
||||
|
||||
2.2.dev0
|
||||
2.2, != 2.3b2
|
||||
>= 1.0a1
|
||||
>= 1.0c1
|
||||
>= 1.0, != 1.0b2
|
||||
>= 1.0, < 2.0.dev123
|
||||
excluded from all version specifiers, *unless* they are already present
|
||||
on the system, explicitly requested by the user, or if the only available
|
||||
version that satisfies the version specifier is a pre-release.
|
||||
|
||||
By default, dependency resolution tools SHOULD:
|
||||
|
||||
* accept already installed pre-releases for all version specifiers
|
||||
* accept remotely available pre-releases for version specifiers which
|
||||
include at least one version clause that references a pre-release
|
||||
* accept remotely available pre-releases for version specifiers where
|
||||
there is no final or post release that satisfies the version specifier
|
||||
* exclude all other pre-releases from consideration
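The default rules listed above can be sketched as a filter over candidates that already satisfy a version specifier. Everything here is illustrative: the function names, the ``installed`` argument and the ``clause_mentions_pre_release`` flag are assumptions, and pre-releases are detected with a deliberately simple pattern.

```python
import re

# Simplified detection: any aN, bN, cN, rcN or .devN marker.
_PRE_RELEASE = re.compile(r"(a|b|c|rc|\.dev)\d+")


def is_pre_release(version):
    return _PRE_RELEASE.search(version) is not None


def default_candidates(matching, installed=(),
                       clause_mentions_pre_release=False):
    """Filter versions that already satisfy the specifier."""
    finals = [v for v in matching if not is_pre_release(v)]
    # Pre-releases are acceptable if a clause references one, or if no
    # final or post release satisfies the specifier.
    if clause_mentions_pre_release or not finals:
        return list(matching)
    # Otherwise keep pre-releases only when they are already installed.
    return [v for v in matching
            if not is_pre_release(v) or v in installed]


only_finals = default_candidates(["1.0", "1.0.post1", "1.1a1"])
fallback = default_candidates(["1.1a1", "1.1b2"])
keeps_installed = default_candidates(["1.0", "1.1a1"], installed=["1.1a1"])
```

A real resolver would parse versions properly and would also expose the user-selectable alternative behaviours; this sketch covers only the default.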
|
||||
|
||||
Dependency resolution tools MAY issue a warning if a pre-release is needed
|
||||
to satisfy a version specifier.
|
||||
|
||||
Dependency resolution tools SHOULD also allow users to request the
|
||||
following alternative behaviours:
|
||||
|
||||
* accepting pre-releases for all version specifiers
|
||||
* excluding pre-releases for all version specifiers (reporting an error or
|
||||
warning if a pre-release is already installed locally)
|
||||
warning if a pre-release is already installed locally, or if a
|
||||
pre-release is the only way to satisfy a particular specifier)
|
||||
|
||||
Dependency resolution tools MAY also allow the above behaviour to be
|
||||
controlled on a per-distribution basis.
|
||||
|
||||
Post-releases and purely numeric releases receive no special treatment in
|
||||
version specifiers - they are always included unless explicitly excluded.
|
||||
Post-releases and final releases receive no special treatment in version
|
||||
specifiers - they are always included unless explicitly excluded.
|
||||
|
||||
|
||||
Examples
|
||||
--------
|
||||
|
||||
* ``3.1``: version 3.1 or later, but not
|
||||
version 4.0 or later. Excludes pre-releases and developmental releases.
|
||||
* ``3.1.2``: version 3.1.2 or later, but not
|
||||
version 3.2.0 or later. Excludes pre-releases and developmental releases.
|
||||
* ``3.1a1``: version 3.1a1 or later, but not
|
||||
version 4.0 or later. Allows pre-releases like 3.2a4 and developmental
|
||||
releases like 3.2.dev1.
|
||||
* ``3.1``: version 3.1 or later, but not version 4.0 or later.
|
||||
* ``3.1.2``: version 3.1.2 or later, but not version 3.2.0 or later.
|
||||
* ``3.1a1``: version 3.1a1 or later, but not version 4.0 or later.
|
||||
* ``== 3.1``: specifically version 3.1 (or 3.1.0), excludes all pre-releases,
|
||||
post releases, developmental releases and any 3.1.x maintenance releases.
|
||||
* ``== 3.1.*``: any version that starts with 3.1, excluding pre-releases and
|
||||
developmental releases. Equivalent to the ``3.1.0`` compatible release
|
||||
clause.
|
||||
* ``== 3.1.*``: any version that starts with 3.1. Equivalent to the
|
||||
``3.1.0`` compatible release clause.
|
||||
* ``3.1.0, != 3.1.3``: version 3.1.0 or later, but not version 3.1.3 and
|
||||
not version 3.2.0 or later. Excludes pre-releases and developmental
|
||||
releases.
|
||||
not version 3.2.0 or later.
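The compatible release examples above all follow one rule: the candidate must be at least the stated version and below the next value of the second-to-last component. A minimal sketch for purely numeric versions follows; the helper names are hypothetical, and rejecting single-component clauses is an assumption of this sketch rather than something stated in the text.

```python
def _release(version):
    return tuple(int(part) for part in version.split("."))


def _pad(parts, length):
    # Zero-pad so that, for example, 3.1 and 3.1.0 compare as equal.
    return parts + (0,) * (length - len(parts))


def is_compatible(candidate, clause):
    """True if candidate >= clause and below the next release series."""
    lower = _release(clause)
    if len(lower) < 2:
        # Assumption: no upper bound can be derived from one component.
        raise ValueError("clause needs at least two components")
    upper = lower[:-2] + (lower[-2] + 1,)
    found = _release(candidate)
    width = max(len(lower), len(upper), len(found))
    return _pad(lower, width) <= _pad(found, width) < _pad(upper, width)


results = [
    is_compatible("3.9.2", "3.1"),    # 3.1 or later, below 4.0
    is_compatible("4.0", "3.1"),
    is_compatible("3.1.9", "3.1.2"),  # 3.1.2 or later, below 3.2.0
    is_compatible("3.2.0", "3.1.2"),
]
```

Pre-release, post-release and development suffixes are outside the scope of this sketch and would raise ``ValueError`` when parsed.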
|
||||
|
||||
|
||||
Direct references
|
||||
=================
|
||||
|
||||
Some automated tools may permit the use of a direct reference as an
|
||||
alternative to a normal version specifier. A direct reference consists of
|
||||
the word ``from`` and an explicit URL.
|
||||
|
||||
Whether or not direct references are appropriate depends on the specific
|
||||
use case for the version specifier. Automated tools SHOULD at least issue
|
||||
warnings and MAY reject them entirely when direct references are used
|
||||
inappropriately.
|
||||
|
||||
Public index servers SHOULD NOT allow the use of direct references in
|
||||
uploaded distributions. Direct references are intended as a tool for
|
||||
software integrators rather than publishers.
|
||||
|
||||
Depending on the use case, some appropriate targets for a direct URL
|
||||
reference may be a valid ``source_url`` entry (see PEP 426), an sdist, or
|
||||
a wheel binary archive. The exact URLs and targets supported will be tool
|
||||
dependent.
|
||||
|
||||
For example, a local source archive may be referenced directly::
|
||||
|
||||
pip (from file:///localbuilds/pip-1.3.1.zip)
|
||||
|
||||
Alternatively, a prebuilt archive may also be referenced::
|
||||
|
||||
pip (from file:///localbuilds/pip-1.3.1-py33-none-any.whl)
|
||||
|
||||
All direct references that do not refer to a local file URL SHOULD
|
||||
specify a secure transport mechanism (such as ``https``), include an
|
||||
expected hash value in the URL for verification purposes, or both. If an
|
||||
insecure transport is specified without any hash information, with hash
|
||||
information that the tool doesn't understand, or with a selected hash
|
||||
algorithm that the tool considers too weak to trust, automated tools
|
||||
SHOULD at least emit a warning and MAY refuse to rely on the URL.
|
||||
|
||||
It is RECOMMENDED that only hashes which are unconditionally provided by
|
||||
the latest version of the standard library's ``hashlib`` module be used
|
||||
for source archive hashes. At time of writing, that list consists of
|
||||
``'md5'``, ``'sha1'``, ``'sha224'``, ``'sha256'``, ``'sha384'``, and
|
||||
``'sha512'``.
|
||||
|
||||
For source archive and wheel references, an expected hash value may be
|
||||
specified by including a ``<hash-algorithm>=<expected-hash>`` entry as
|
||||
part of the URL fragment.
|
||||
|
||||
For version control references, the ``VCS+protocol`` scheme SHOULD be
|
||||
used to identify both the version control system and the secure transport.
|
||||
|
||||
To support version control systems that do not support including commit or
|
||||
tag references directly in the URL, that information may be appended to the
|
||||
end of the URL using the ``@<tag>`` notation.
|
||||
|
||||
Remote URL examples::
|
||||
|
||||
pip (from https://github.com/pypa/pip/archive/1.3.1.zip)
|
||||
pip (from http://github.com/pypa/pip/archive/1.3.1.zip#sha1=da9234ee9982d4bbb3c72346a6de940a148ea686)
|
||||
pip (from git+https://github.com/pypa/pip.git@1.3.1)
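As an illustration of the ``<hash-algorithm>=<expected-hash>`` fragment convention, the sketch below extracts the expected hash from a URL and compares it against some downloaded bytes. The function names are invented, no network access is performed, and the accepted algorithms are limited to the ones named in the text.

```python
import hashlib
from urllib.parse import urlsplit

# Algorithms named in the text as unconditionally provided by hashlib.
ACCEPTED = ("md5", "sha1", "sha224", "sha256", "sha384", "sha512")


def expected_hash(url):
    """Return (algorithm, hex digest) from the URL fragment, or None."""
    algorithm, separator, digest = urlsplit(url).fragment.partition("=")
    if not separator or algorithm not in ACCEPTED:
        return None
    return algorithm, digest.lower()


def matches_expected_hash(data, url):
    """True only if the URL carries a hash and the data matches it."""
    expected = expected_hash(url)
    if expected is None:
        return False
    algorithm, digest = expected
    return hashlib.new(algorithm, data).hexdigest() == digest


parsed = expected_hash(
    "http://github.com/pypa/pip/archive/1.3.1.zip"
    "#sha1=da9234ee9982d4bbb3c72346a6de940a148ea686")
# Hypothetical URL whose fragment is computed from known bytes.
demo_url = ("https://example.invalid/archive.zip#sha256="
            + hashlib.sha256(b"payload").hexdigest())
verified = matches_expected_hash(b"payload", demo_url)
```

A tool following the text would additionally warn about, or refuse, an insecure transport without a usable hash, and might treat some of the accepted algorithms as too weak to trust.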
|
||||
|
||||
|
||||
Updating the versioning specification
|
||||
|
@@ -825,28 +835,30 @@ Summary of differences from \PEP 386
|
|||
|
||||
* Moved the description of version specifiers into the versioning PEP
|
||||
|
||||
* added the "source label" concept to better handle projects that wish to
|
||||
* Added the "source label" concept to better handle projects that wish to
|
||||
use a non-compliant versioning scheme internally, especially those based
|
||||
on DVCS hashes
|
||||
|
||||
* added the "compatible release" clause
|
||||
* Added the "direct reference" concept as a standard notation for direct
|
||||
references to resources (rather than each tool needing to invent its own)
|
||||
|
||||
* added the "source reference" clause
|
||||
* Added the "compatible release" clause
|
||||
|
||||
* added the trailing wildcard syntax for prefix based version matching
|
||||
* Added the trailing wildcard syntax for prefix based version matching
|
||||
and exclusion
|
||||
|
||||
* changed the top level sort position of the ``.devN`` suffix
|
||||
* Changed the top level sort position of the ``.devN`` suffix
|
||||
|
||||
* allowed single value version numbers
|
||||
* Allowed single value version numbers
|
||||
|
||||
* explicit exclusion of leading or trailing whitespace
|
||||
* Explicit exclusion of leading or trailing whitespace
|
||||
|
||||
* explicit criterion for the exclusion of date based versions
|
||||
* Explicit criterion for the exclusion of date based versions
|
||||
|
||||
* implicitly exclude pre-releases unless explicitly requested
|
||||
* Implicitly exclude pre-releases unless they're already present or
|
||||
needed to satisfy a dependency
|
||||
|
||||
* treat post releases the same way as unqualified releases
|
||||
* Treat post releases the same way as unqualified releases
|
||||
|
||||
* Discuss ordering and dependencies across metadata versions
|
||||
|
||||
|
@ -995,11 +1007,12 @@ The previous interpretation also excluded post-releases from some version
|
|||
specifiers for no adequately justified reason.
|
||||
|
||||
The updated interpretation is intended to make it difficult to accidentally
|
||||
accept a pre-release version as satisfying a dependency, while allowing
|
||||
pre-release versions to be explicitly requested when needed.
|
||||
accept a pre-release version as satisfying a dependency, while still
|
||||
allowing pre-release versions to be retrieved automatically when that's the
|
||||
only way to satisfy a dependency.
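The fallback behaviour described here can be sketched in a few lines. This is only an illustration under simplifying assumptions: ``select_candidates`` is a hypothetical helper, the candidates are assumed to have already passed the version specifier, and the "already present" (installed) case is not modelled.

```python
def select_candidates(candidates):
    """Prefer final releases; fall back to pre-releases only if needed.

    `candidates` is an iterable of (version, is_prerelease) pairs that
    already satisfy the version specifier (an assumption of this sketch).
    """
    candidates = list(candidates)
    finals = [version for version, is_pre in candidates if not is_pre]
    if finals:
        # At least one final release matches, so pre-releases are ignored.
        return finals
    # Only pre-releases can satisfy the dependency, so they are accepted.
    return [version for version, _ in candidates]
```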
|
||||
|
||||
The "some forward compatibility assumed" default version constraint is
|
||||
taken directly from the Ruby community's "pessimistic version constraint"
|
||||
derived from the Ruby community's "pessimistic version constraint"
|
||||
operator [2]_ to allow projects to take a cautious approach to forward
|
||||
compatibility promises, while still easily setting a minimum required
|
||||
version for their dependencies. It is made the default behaviour rather
|
||||
|
@ -1022,16 +1035,26 @@ improved tools for dynamic path manipulation.
|
|||
|
||||
The trailing wildcard syntax to request prefix based version matching was
|
||||
added to make it possible to sensibly define both compatible release clauses
|
||||
and the desired pre-release handling semantics for ``<`` and ``>`` ordered
|
||||
comparison clauses.
|
||||
and the desired pre- and post-release handling semantics for ``<`` and ``>``
|
||||
ordered comparison clauses.
|
||||
|
||||
Source references are added for two purposes. In conjunction with source
|
||||
labels, they allow hash based references to exact versions that aren't
|
||||
compliant with the fully ordered public version scheme, such as those
|
||||
generated from version control. In combination with source URLs, they
|
||||
also allow the new metadata standard to natively support an existing
|
||||
feature of ``pip``, which allows arbitrary URLs like
|
||||
``file:///localbuilds/exampledist-1.0-py33-none-any.whl``.
|
||||
|
||||
Adding direct references
|
||||
------------------------
|
||||
|
||||
Direct references are added as an "escape clause" to handle messy real
|
||||
world situations that don't map neatly to the standard distribution model.
|
||||
This includes dependencies on unpublished software for internal use, as well
|
||||
as handling the more complex compatibility issues that may arise when
|
||||
wrapping third party libraries as C extensions (this is of especial concern
|
||||
to the scientific community).
|
||||
|
||||
Index servers are deliberately given a lot of freedom to disallow direct
|
||||
references, since they're intended primarily as a tool for integrators
|
||||
rather than publishers. PyPI in particular is currently going through the
|
||||
process of *eliminating* dependencies on external references, as unreliable
|
||||
external services have the effect of slowing down installation operations,
|
||||
as well as reducing PyPI's own apparent reliability.
|
||||
|
||||
|
||||
References
|
||||
|
|
|
@ -4,13 +4,13 @@ Version: $Revision$
|
|||
Last-Modified: $Date$
|
||||
Author: Antoine Pitrou <solipsis@pitrou.net>
|
||||
BDFL-Delegate: Benjamin Peterson <benjamin@python.org>
|
||||
Status: Draft
|
||||
Status: Accepted
|
||||
Type: Standards Track
|
||||
Content-Type: text/x-rst
|
||||
Created: 2013-05-18
|
||||
Python-Version: 3.4
|
||||
Post-History: 2013-05-18
|
||||
Resolution: TBD
|
||||
Resolution: http://mail.python.org/pipermail/python-dev/2013-June/126746.html
|
||||
|
||||
|
||||
Abstract
|
||||
|
@ -201,8 +201,7 @@ Predictability
|
|||
--------------
|
||||
|
||||
Following this scheme, an object's finalizer is always called exactly
|
||||
once. The only exception is if an object is resurrected: the finalizer
|
||||
will be called again when the object becomes unreachable again.
|
||||
once, even if it was resurrected afterwards.
|
||||
|
||||
For CI objects, the order in which finalizers are called (step 2 above)
|
||||
is undefined.
|
||||
|
|
73
pep-0443.txt
|
@ -4,7 +4,7 @@ Version: $Revision$
|
|||
Last-Modified: $Date$
|
||||
Author: Łukasz Langa <lukasz@langa.pl>
|
||||
Discussions-To: Python-Dev <python-dev@python.org>
|
||||
Status: Accepted
|
||||
Status: Final
|
||||
Type: Standards Track
|
||||
Content-Type: text/x-rst
|
||||
Created: 22-May-2013
|
||||
|
@ -193,48 +193,37 @@ handling of old-style classes and Zope's ExtensionClasses. More
|
|||
importantly, it introduces support for Abstract Base Classes (ABC).
|
||||
|
||||
When a generic function implementation is registered for an ABC, the
|
||||
dispatch algorithm switches to a mode of MRO calculation for the
|
||||
provided argument which includes the relevant ABCs. The algorithm is as
|
||||
follows::
|
||||
dispatch algorithm switches to an extended form of C3 linearization,
|
||||
which includes the relevant ABCs in the MRO of the provided argument.
|
||||
The algorithm inserts ABCs where their functionality is introduced, i.e.
|
||||
``issubclass(cls, abc)`` returns ``True`` for the class itself but
|
||||
returns ``False`` for all its direct base classes. Implicit ABCs for
|
||||
a given class (either registered or inferred from the presence of
|
||||
a special method like ``__len__()``) are inserted directly after the
|
||||
last ABC explicitly listed in the MRO of said class.
|
||||
|
||||
def _compose_mro(cls, haystack):
|
||||
"""Calculates the MRO for a given class `cls`, including relevant
|
||||
abstract base classes from `haystack`."""
|
||||
bases = set(cls.__mro__)
|
||||
mro = list(cls.__mro__)
|
||||
for regcls in haystack:
|
||||
if regcls in bases or not issubclass(cls, regcls):
|
||||
continue # either present in the __mro__ or unrelated
|
||||
for index, base in enumerate(mro):
|
||||
if not issubclass(base, regcls):
|
||||
break
|
||||
if base in bases and not issubclass(regcls, base):
|
||||
# Conflict resolution: put classes present in __mro__
|
||||
# and their subclasses first.
|
||||
index += 1
|
||||
mro.insert(index, regcls)
|
||||
return mro
|
||||
|
||||
In its most basic form, it returns the MRO for the given type::
|
||||
In its most basic form, this linearization returns the MRO for the given
|
||||
type::
|
||||
|
||||
>>> _compose_mro(dict, [])
|
||||
[<class 'dict'>, <class 'object'>]
|
||||
|
||||
When the haystack consists of ABCs that the specified type is a subclass
|
||||
of, they are inserted in a predictable order::
|
||||
When the second argument contains ABCs that the specified type is
|
||||
a subclass of, they are inserted in a predictable order::
|
||||
|
||||
>>> _compose_mro(dict, [Sized, MutableMapping, str,
|
||||
... Sequence, Iterable])
|
||||
[<class 'dict'>, <class 'collections.abc.MutableMapping'>,
|
||||
<class 'collections.abc.Iterable'>, <class 'collections.abc.Sized'>,
|
||||
<class 'collections.abc.Mapping'>, <class 'collections.abc.Sized'>,
|
||||
<class 'collections.abc.Iterable'>, <class 'collections.abc.Container'>,
|
||||
<class 'object'>]
|
||||
|
||||
While this mode of operation is significantly slower, all dispatch
|
||||
decisions are cached. The cache is invalidated on registering new
|
||||
implementations on the generic function or when user code calls
|
||||
``register()`` on an ABC to register a new virtual subclass. In the
|
||||
latter case, it is possible to create a situation with ambiguous
|
||||
dispatch, for instance::
|
||||
``register()`` on an ABC to implicitly subclass it. In the latter case,
|
||||
it is possible to create a situation with ambiguous dispatch, for
|
||||
instance::
|
||||
|
||||
>>> from collections.abc import Iterable, Container
|
||||
>>> class P:
|
||||
|
@ -261,9 +250,9 @@ guess::
|
|||
RuntimeError: Ambiguous dispatch: <class 'collections.abc.Container'>
|
||||
or <class 'collections.abc.Iterable'>
|
||||
|
||||
Note that this exception would not be raised if ``Iterable`` and
|
||||
``Container`` had been provided as base classes during class definition.
|
||||
In this case dispatch happens in the MRO order::
|
||||
Note that this exception would not be raised if one or more ABCs had
|
||||
been provided explicitly as base classes during class definition. In
|
||||
this case dispatch happens in the MRO order::
|
||||
|
||||
>>> class Ten(Iterable, Container):
|
||||
... def __iter__(self):
|
||||
|
@ -275,6 +264,24 @@ In this case dispatch happens in the MRO order::
|
|||
>>> g(Ten())
|
||||
'iterable'
|
||||
|
||||
A similar conflict arises when subclassing an ABC is inferred from the
|
||||
presence of a special method like ``__len__()`` or ``__contains__()``::
|
||||
|
||||
>>> class Q:
|
||||
... def __contains__(self, value):
|
||||
... return False
|
||||
...
|
||||
>>> issubclass(Q, Container)
|
||||
True
|
||||
>>> Iterable.register(Q)
|
||||
>>> g(Q())
|
||||
Traceback (most recent call last):
|
||||
...
|
||||
RuntimeError: Ambiguous dispatch: <class 'collections.abc.Container'>
|
||||
or <class 'collections.abc.Iterable'>
|
||||
|
||||
An early version of the PEP contained a custom approach that was simpler
|
||||
but created a number of edge cases with surprising results [#why-c3]_.
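The two situations described above can be reproduced in one self-contained script using ``functools.singledispatch``, the implementation of this PEP in the standard library. The generic function ``g`` and the classes ``P`` and ``Ten`` follow the names used in the examples; the ``describe`` wrapper is a hypothetical helper added only so that the ambiguous case can be observed without a traceback.

```python
from collections.abc import Container, Iterable
from functools import singledispatch


@singledispatch
def g(arg):
    return "base"


g.register(Iterable, lambda arg: "iterable")
g.register(Container, lambda arg: "container")


class Ten(Iterable, Container):
    """Both ABCs are explicit bases, so the MRO decides the dispatch."""

    def __iter__(self):
        return iter(range(10))

    def __contains__(self, value):
        return value in range(10)


class P:
    """Both ABCs are only registered, so neither one has priority."""


Iterable.register(P)
Container.register(P)


def describe(obj):
    # Hypothetical wrapper: turns the ambiguity error into a plain string.
    try:
        return g(obj)
    except RuntimeError:
        return "ambiguous"
```

With these definitions, ``describe(Ten())`` follows the MRO and selects the ``Iterable`` implementation, while ``describe(P())`` reports the ambiguity instead of guessing.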
|
||||
|
||||
Usage Patterns
|
||||
==============
|
||||
|
@ -378,6 +385,8 @@ References
|
|||
a particular annotation style".
|
||||
(http://www.python.org/dev/peps/pep-0008)
|
||||
|
||||
.. [#why-c3] http://bugs.python.org/issue18244
|
||||
|
||||
.. [#pep-3124] http://www.python.org/dev/peps/pep-3124/
|
||||
|
||||
.. [#peak-rules] http://peak.telecommunity.com/DevCenter/PEAK_2dRules
|
||||
|
|
|
@ -0,0 +1,773 @@
|
|||
PEP: 445
|
||||
Title: Add new APIs to customize Python memory allocators
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Victor Stinner <victor.stinner@gmail.com>
|
||||
BDFL-Delegate: Antoine Pitrou <solipsis@pitrou.net>
|
||||
Status: Accepted
|
||||
Type: Standards Track
|
||||
Content-Type: text/x-rst
|
||||
Created: 15-june-2013
|
||||
Python-Version: 3.4
|
||||
Resolution: http://mail.python.org/pipermail/python-dev/2013-July/127222.html
|
||||
|
||||
Abstract
|
||||
========
|
||||
|
||||
This PEP proposes new Application Programming Interfaces (API) to customize
|
||||
Python memory allocators. The only implementation required to conform to
|
||||
this PEP is CPython, but other implementations may choose to be compatible,
|
||||
or to re-use a similar scheme.
|
||||
|
||||
|
||||
Rationale
|
||||
=========
|
||||
|
||||
Use cases:
|
||||
|
||||
* Applications embedding Python which want to isolate Python memory from
|
||||
the memory of the application, or want to use a different memory
|
||||
allocator optimized for its Python usage
|
||||
* Python running on embedded devices with low memory and slow CPU.
|
||||
A custom memory allocator can be used for efficiency and/or to get
|
||||
access to all the memory of the device.
|
||||
* Debug tools for memory allocators:
|
||||
|
||||
- track the memory usage (find memory leaks)
|
||||
- get the location of a memory allocation: Python filename and line
|
||||
number, and the size of a memory block
|
||||
- detect buffer underflow, buffer overflow and misuse of Python
|
||||
allocator APIs (see `Redesign Debug Checks on Memory Block
|
||||
Allocators as Hooks`_)
|
||||
- force memory allocations to fail to test handling of the
|
||||
``MemoryError`` exception
|
||||
|
||||
|
||||
Proposal
|
||||
========
|
||||
|
||||
New Functions and Structures
|
||||
----------------------------
|
||||
|
||||
* Add a new GIL-free (no need to hold the GIL) memory allocator:
|
||||
|
||||
- ``void* PyMem_RawMalloc(size_t size)``
|
||||
- ``void* PyMem_RawRealloc(void *ptr, size_t new_size)``
|
||||
- ``void PyMem_RawFree(void *ptr)``
|
||||
- The newly allocated memory will not have been initialized in any
|
||||
way.
|
||||
- Requesting zero bytes returns a distinct non-*NULL* pointer if
|
||||
possible, as if ``PyMem_Malloc(1)`` had been called instead.
|
||||
|
||||
* Add a new ``PyMemAllocator`` structure::
|
||||
|
||||
typedef struct {
|
||||
/* user context passed as the first argument to the 3 functions */
|
||||
void *ctx;
|
||||
|
||||
/* allocate a memory block */
|
||||
void* (*malloc) (void *ctx, size_t size);
|
||||
|
||||
/* allocate or resize a memory block */
|
||||
void* (*realloc) (void *ctx, void *ptr, size_t new_size);
|
||||
|
||||
/* release a memory block */
|
||||
void (*free) (void *ctx, void *ptr);
|
||||
} PyMemAllocator;
|
||||
|
||||
* Add a new ``PyMemAllocatorDomain`` enum to choose the Python
|
||||
allocator domain. Domains:
|
||||
|
||||
- ``PYMEM_DOMAIN_RAW``: ``PyMem_RawMalloc()``, ``PyMem_RawRealloc()``
|
||||
and ``PyMem_RawFree()``
|
||||
|
||||
- ``PYMEM_DOMAIN_MEM``: ``PyMem_Malloc()``, ``PyMem_Realloc()`` and
|
||||
``PyMem_Free()``
|
||||
|
||||
- ``PYMEM_DOMAIN_OBJ``: ``PyObject_Malloc()``, ``PyObject_Realloc()``
|
||||
and ``PyObject_Free()``
|
||||
|
||||
* Add new functions to get and set memory block allocators:
|
||||
|
||||
- ``void PyMem_GetAllocator(PyMemAllocatorDomain domain, PyMemAllocator *allocator)``
|
||||
- ``void PyMem_SetAllocator(PyMemAllocatorDomain domain, PyMemAllocator *allocator)``
|
||||
- The new allocator must return a distinct non-*NULL* pointer when
|
||||
requesting zero bytes
|
||||
- For the ``PYMEM_DOMAIN_RAW`` domain, the allocator must be
|
||||
thread-safe: the GIL is not held when the allocator is called.
|
||||
|
||||
* Add a new ``PyObjectArenaAllocator`` structure::
|
||||
|
||||
typedef struct {
|
||||
/* user context passed as the first argument to the 2 functions */
|
||||
void *ctx;
|
||||
|
||||
/* allocate an arena */
|
||||
void* (*alloc) (void *ctx, size_t size);
|
||||
|
||||
/* release an arena */
|
||||
void (*free) (void *ctx, void *ptr, size_t size);
|
||||
} PyObjectArenaAllocator;
|
||||
|
||||
* Add new functions to get and set the arena allocator used by
|
||||
*pymalloc*:
|
||||
|
||||
- ``void PyObject_GetArenaAllocator(PyObjectArenaAllocator *allocator)``
|
||||
- ``void PyObject_SetArenaAllocator(PyObjectArenaAllocator *allocator)``
|
||||
|
||||
* Add a new function to reinstall the debug checks on memory allocators when
|
||||
a memory allocator is replaced with ``PyMem_SetAllocator()``:
|
||||
|
||||
- ``void PyMem_SetupDebugHooks(void)``
|
||||
- Install the debug hooks on all memory block allocators. The function can be
|
||||
called more than once; hooks are only installed once.
|
||||
- The function does nothing if Python is not compiled in debug mode.
|
||||
|
||||
* Memory block allocators always return *NULL* if *size* is greater than
|
||||
``PY_SSIZE_T_MAX``. The check is done before calling the inner
|
||||
function.
|
||||
|
||||
.. note::
|
||||
The *pymalloc* allocator is optimized for objects smaller than 512 bytes
|
||||
with a short lifetime. It uses memory mappings with a fixed size of 256
|
||||
KB called "arenas".
|
||||
|
||||
Here is how the allocators are set up by default:
|
||||
|
||||
* ``PYMEM_DOMAIN_RAW``, ``PYMEM_DOMAIN_MEM``: ``malloc()``,
|
||||
``realloc()`` and ``free()``; call ``malloc(1)`` when requesting zero
|
||||
bytes
|
||||
* ``PYMEM_DOMAIN_OBJ``: *pymalloc* allocator which falls back on
|
||||
``PyMem_Malloc()`` for allocations larger than 512 bytes
|
||||
* *pymalloc* arena allocator: ``VirtualAlloc()`` and ``VirtualFree()`` on
|
||||
Windows, ``mmap()`` and ``munmap()`` when available, or ``malloc()``
|
||||
and ``free()``
|
||||
|
||||
|
||||
Redesign Debug Checks on Memory Block Allocators as Hooks
|
||||
---------------------------------------------------------
|
||||
|
||||
Since Python 2.3, Python implements different checks on memory
|
||||
allocators in debug mode:
|
||||
|
||||
* Newly allocated memory is filled with the byte ``0xCB``, freed memory
|
||||
is filled with the byte ``0xDB``.
|
||||
* Detect API violations, e.g. ``PyObject_Free()`` called on a memory
|
||||
block allocated by ``PyMem_Malloc()``
|
||||
* Detect write before the start of the buffer (buffer underflow)
|
||||
* Detect write after the end of the buffer (buffer overflow)
|
||||
|
||||
In Python 3.3, the checks are installed by replacing ``PyMem_Malloc()``,
|
||||
``PyMem_Realloc()``, ``PyMem_Free()``, ``PyObject_Malloc()``,
|
||||
``PyObject_Realloc()`` and ``PyObject_Free()`` using macros. The new
|
||||
allocator allocates a larger buffer and writes a pattern to detect buffer
|
||||
underflow, buffer overflow and use after free (by filling the buffer with
|
||||
the byte ``0xDB``). It uses the original ``PyObject_Malloc()``
|
||||
function to allocate memory. So ``PyMem_Malloc()`` and
|
||||
``PyMem_Realloc()`` indirectly call ``PyObject_Malloc()`` and
|
||||
``PyObject_Realloc()``.
|
||||
|
||||
This PEP redesigns the debug checks as hooks on the existing allocators
|
||||
in debug mode. Examples of call traces without the hooks:
|
||||
|
||||
* ``PyMem_RawMalloc()`` => ``_PyMem_RawMalloc()`` => ``malloc()``
|
||||
* ``PyMem_Realloc()`` => ``_PyMem_RawRealloc()`` => ``realloc()``
|
||||
* ``PyObject_Free()`` => ``_PyObject_Free()``
|
||||
|
||||
Call traces when the hooks are installed (debug mode):
|
||||
|
||||
* ``PyMem_RawMalloc()`` => ``_PyMem_DebugMalloc()``
|
||||
=> ``_PyMem_RawMalloc()`` => ``malloc()``
|
||||
* ``PyMem_Realloc()`` => ``_PyMem_DebugRealloc()``
|
||||
=> ``_PyMem_RawRealloc()`` => ``realloc()``
|
||||
* ``PyObject_Free()`` => ``_PyMem_DebugFree()``
|
||||
=> ``_PyObject_Free()``
|
||||
|
||||
As a result, ``PyMem_Malloc()`` and ``PyMem_Realloc()`` now call
|
||||
``malloc()`` and ``realloc()`` in both release mode and debug mode,
|
||||
instead of calling ``PyObject_Malloc()`` and ``PyObject_Realloc()`` in
|
||||
debug mode.
|
||||
|
||||
When at least one memory allocator is replaced with
|
||||
``PyMem_SetAllocator()``, the ``PyMem_SetupDebugHooks()`` function must
|
||||
be called to reinstall the debug hooks on top of the new allocator.
|
||||
|
||||
|
||||
Don't call malloc() directly anymore
|
||||
------------------------------------
|
||||
|
||||
``PyObject_Malloc()`` falls back on ``PyMem_Malloc()`` instead of
|
||||
``malloc()`` if size is greater than or equal to 512 bytes, and
|
||||
``PyObject_Realloc()`` falls back on ``PyMem_Realloc()`` instead of
|
||||
``realloc()``.
|
||||
|
||||
Direct calls to ``malloc()`` are replaced with ``PyMem_Malloc()``, or
|
||||
``PyMem_RawMalloc()`` if the GIL is not held.
|
||||
|
||||
External libraries like zlib or OpenSSL can be configured to allocate memory
|
||||
using ``PyMem_Malloc()`` or ``PyMem_RawMalloc()``. If the allocator of a
|
||||
library can only be replaced globally (rather than on an object-by-object
|
||||
basis), it shouldn't be replaced when Python is embedded in an application.
|
||||
|
||||
For the "track memory usage" use case, it is important to track memory
|
||||
allocated in external libraries to have accurate reports, because these
|
||||
allocations can be large (e.g. they can raise a ``MemoryError`` exception)
|
||||
and would otherwise be missed in memory usage reports.
|
||||
|
||||
|
||||
Examples
|
||||
========
|
||||
|
||||
Use case 1: Replace Memory Allocators, keep pymalloc
|
||||
----------------------------------------------------
|
||||
|
||||
Dummy example wasting 2 bytes per memory block,
|
||||
and 10 bytes per *pymalloc* arena::
|
||||
|
||||
#include <stdlib.h>
|
||||
|
||||
int alloc_padding = 2;  /* int, because the callbacks read ctx through (int *) */
|
||||
int arena_padding = 10;  /* int, because the callbacks read ctx through (int *) */
|
||||
|
||||
void* my_malloc(void *ctx, size_t size)
|
||||
{
|
||||
int padding = *(int *)ctx;
|
||||
return malloc(size + padding);
|
||||
}
|
||||
|
||||
void* my_realloc(void *ctx, void *ptr, size_t new_size)
|
||||
{
|
||||
int padding = *(int *)ctx;
|
||||
return realloc(ptr, new_size + padding);
|
||||
}
|
||||
|
||||
void my_free(void *ctx, void *ptr)
|
||||
{
|
||||
free(ptr);
|
||||
}
|
||||
|
||||
void* my_alloc_arena(void *ctx, size_t size)
|
||||
{
|
||||
int padding = *(int *)ctx;
|
||||
return malloc(size + padding);
|
||||
}
|
||||
|
||||
void my_free_arena(void *ctx, void *ptr, size_t size)
|
||||
{
|
||||
free(ptr);
|
||||
}
|
||||
|
||||
void setup_custom_allocator(void)
|
||||
{
|
||||
PyMemAllocator alloc;
|
||||
PyObjectArenaAllocator arena;
|
||||
|
||||
alloc.ctx = &alloc_padding;
|
||||
alloc.malloc = my_malloc;
|
||||
alloc.realloc = my_realloc;
|
||||
alloc.free = my_free;
|
||||
|
||||
PyMem_SetAllocator(PYMEM_DOMAIN_RAW, &alloc);
|
||||
PyMem_SetAllocator(PYMEM_DOMAIN_MEM, &alloc);
|
||||
/* leave PYMEM_DOMAIN_OBJ unchanged, use pymalloc */
|
||||
|
||||
arena.ctx = &arena_padding;
|
||||
arena.alloc = my_alloc_arena;
|
||||
arena.free = my_free_arena;
|
||||
PyObject_SetArenaAllocator(&arena);
|
||||
|
||||
PyMem_SetupDebugHooks();
|
||||
}
|
||||
|
||||
|
||||
Use case 2: Replace Memory Allocators, override pymalloc
|
||||
--------------------------------------------------------
|
||||
|
||||
If you have a dedicated allocator optimized for allocations of objects
|
||||
smaller than 512 bytes with a short lifetime, pymalloc can be overridden
|
||||
(replace ``PyObject_Malloc()``).
|
||||
|
||||
Dummy example wasting 2 bytes per memory block::
|
||||
|
||||
#include <stdlib.h>
|
||||
|
||||
int padding = 2;  /* int, because the callbacks read ctx through (int *) */
|
||||
|
||||
void* my_malloc(void *ctx, size_t size)
|
||||
{
|
||||
int padding = *(int *)ctx;
|
||||
return malloc(size + padding);
|
||||
}
|
||||
|
||||
void* my_realloc(void *ctx, void *ptr, size_t new_size)
|
||||
{
|
||||
int padding = *(int *)ctx;
|
||||
return realloc(ptr, new_size + padding);
|
||||
}
|
||||
|
||||
void my_free(void *ctx, void *ptr)
|
||||
{
|
||||
free(ptr);
|
||||
}
|
||||
|
||||
void setup_custom_allocator(void)
|
||||
{
|
||||
PyMemAllocator alloc;
|
||||
alloc.ctx = &padding;
|
||||
alloc.malloc = my_malloc;
|
||||
alloc.realloc = my_realloc;
|
||||
alloc.free = my_free;
|
||||
|
||||
PyMem_SetAllocator(PYMEM_DOMAIN_RAW, &alloc);
|
||||
PyMem_SetAllocator(PYMEM_DOMAIN_MEM, &alloc);
|
||||
PyMem_SetAllocator(PYMEM_DOMAIN_OBJ, &alloc);
|
||||
|
||||
PyMem_SetupDebugHooks();
|
||||
}
|
||||
|
||||
The *pymalloc* arena does not need to be replaced, because it is no longer
|
||||
used by the new allocator.
|
||||
|
||||
|
||||
Use case 3: Setup Hooks On Memory Block Allocators
|
||||
--------------------------------------------------
|
||||
|
||||
Example of setting up hooks on all memory block allocators::
|
||||
|
||||
struct {
|
||||
PyMemAllocator raw;
|
||||
PyMemAllocator mem;
|
||||
PyMemAllocator obj;
|
||||
/* ... */
|
||||
} hook;
|
||||
|
||||
static void* hook_malloc(void *ctx, size_t size)
|
||||
{
|
||||
PyMemAllocator *alloc = (PyMemAllocator *)ctx;
|
||||
void *ptr;
|
||||
/* ... */
|
||||
ptr = alloc->malloc(alloc->ctx, size);
|
||||
/* ... */
|
||||
return ptr;
|
||||
}
|
||||
|
||||
static void* hook_realloc(void *ctx, void *ptr, size_t new_size)
|
||||
{
|
||||
PyMemAllocator *alloc = (PyMemAllocator *)ctx;
|
||||
void *ptr2;
|
||||
/* ... */
|
||||
ptr2 = alloc->realloc(alloc->ctx, ptr, new_size);
|
||||
/* ... */
|
||||
return ptr2;
|
||||
}
|
||||
|
||||
static void hook_free(void *ctx, void *ptr)
|
||||
{
|
||||
PyMemAllocator *alloc = (PyMemAllocator *)ctx;
|
||||
/* ... */
|
||||
alloc->free(alloc->ctx, ptr);
|
||||
/* ... */
|
||||
}
|
||||
|
||||
void setup_hooks(void)
|
||||
{
|
||||
PyMemAllocator alloc;
|
||||
static int installed = 0;
|
||||
|
||||
if (installed)
|
||||
return;
|
||||
installed = 1;
|
||||
|
||||
alloc.malloc = hook_malloc;
|
||||
alloc.realloc = hook_realloc;
|
||||
alloc.free = hook_free;
|
||||
PyMem_GetAllocator(PYMEM_DOMAIN_RAW, &hook.raw);
|
||||
PyMem_GetAllocator(PYMEM_DOMAIN_MEM, &hook.mem);
|
||||
PyMem_GetAllocator(PYMEM_DOMAIN_OBJ, &hook.obj);
|
||||
|
||||
alloc.ctx = &hook.raw;
|
||||
PyMem_SetAllocator(PYMEM_DOMAIN_RAW, &alloc);
|
||||
|
||||
alloc.ctx = &hook.mem;
|
||||
PyMem_SetAllocator(PYMEM_DOMAIN_MEM, &alloc);
|
||||
|
||||
alloc.ctx = &hook.obj;
|
||||
PyMem_SetAllocator(PYMEM_DOMAIN_OBJ, &alloc);
|
||||
}
|
||||
|
||||
.. note::
|
||||
``PyMem_SetupDebugHooks()`` does not need to be called because
|
||||
memory allocators are not replaced: the debug checks on memory
|
||||
block allocators are installed automatically at startup.
|
||||
|
||||
|
||||
Performances
|
||||
============
|
||||
|
||||
The implementation of this PEP (issue #3329) has no visible overhead on
|
||||
the Python benchmark suite.
|
||||
|
||||
Results of the `Python benchmarks suite
|
||||
<http://hg.python.org/benchmarks>`_ (-b 2n3): some tests are 1.04x
|
||||
faster, some tests are 1.04x slower. Results of pybench microbenchmark:
|
||||
"+0.1%" slower globally (diff between -4.9% and +5.6%).
|
||||
|
||||
The full output of the benchmarks is attached to issue #3329.
|
||||
|
||||
|
||||
Rejected Alternatives
|
||||
=====================
|
||||
|
||||
More specific functions to get/set memory allocators
|
||||
----------------------------------------------------
|
||||
|
||||
A larger set of C API functions was originally proposed, with one pair
|
||||
of functions for each allocator domain:
|
||||
|
||||
* ``void PyMem_GetRawAllocator(PyMemAllocator *allocator)``
|
||||
* ``void PyMem_GetAllocator(PyMemAllocator *allocator)``
|
||||
* ``void PyObject_GetAllocator(PyMemAllocator *allocator)``
|
||||
* ``void PyMem_SetRawAllocator(PyMemAllocator *allocator)``
|
||||
* ``void PyMem_SetAllocator(PyMemAllocator *allocator)``
|
||||
* ``void PyObject_SetAllocator(PyMemAllocator *allocator)``
|
||||
|
||||
This alternative was rejected because it is not possible to write
|
||||
generic code with more specific functions: code must be duplicated for
|
||||
each memory allocator domain.
|
||||
|
||||
|
||||
Make PyMem_Malloc() reuse PyMem_RawMalloc() by default
|
||||
------------------------------------------------------
|
||||
|
||||
If ``PyMem_Malloc()`` called ``PyMem_RawMalloc()`` by default,
|
||||
calling ``PyMem_SetAllocator(PYMEM_DOMAIN_RAW, alloc)`` would also
|
||||
patch ``PyMem_Malloc()`` indirectly.
|
||||
|
||||
This alternative was rejected because ``PyMem_SetAllocator()`` would
|
||||
have a different behaviour depending on the domain. Always having the
|
||||
same behaviour is less error-prone.
|
||||
|
||||
|
||||
Add a new PYDEBUGMALLOC environment variable
|
||||
--------------------------------------------
|
||||
|
||||
It was proposed to add a new ``PYDEBUGMALLOC`` environment variable to
|
||||
enable debug checks on memory block allocators. It would have had the same
|
||||
effect as calling the ``PyMem_SetupDebugHooks()``, without the need
|
||||
to write any C code. Another advantage is that debug checks could be enabled
|
||||
even in release mode: debug checks would always be compiled in, but only
|
||||
enabled when the environment variable is present and non-empty.
|
||||
|
||||
This alternative was rejected because a new environment variable would
|
||||
make Python initialization even more complex. `PEP 432
|
||||
<http://www.python.org/dev/peps/pep-0432/>`_ tries to simplify the
|
||||
CPython startup sequence.
|
||||
|
||||
|
||||
Use macros to get customizable allocators
|
||||
-----------------------------------------
|
||||
|
||||
To have no overhead in the default configuration, customizable
|
||||
allocators would be an optional feature enabled by a configuration
|
||||
option or by macros.
|
||||
|
||||
This alternative was rejected because the use of macros implies having
|
||||
to recompile extension modules to use the new allocator and allocator
|
||||
hooks. Not having to recompile Python nor extension modules makes debug
|
||||
hooks easier to use in practice.
|
||||
|
||||
|
||||
Pass the C filename and line number
|
||||
-----------------------------------
|
||||
|
||||
Define allocator functions as macros using ``__FILE__`` and ``__LINE__``
|
||||
to get the C filename and line number of a memory allocation.
|
||||
|
||||
Example of ``PyMem_Malloc`` macro with the modified
|
||||
``PyMemAllocator`` structure::
|
||||
|
||||
typedef struct {
|
||||
/* user context passed as the first argument
|
||||
to the 3 functions */
|
||||
void *ctx;
|
||||
|
||||
/* allocate a memory block */
|
||||
void* (*malloc) (void *ctx, const char *filename, int lineno,
|
||||
size_t size);
|
||||
|
||||
/* allocate or resize a memory block */
|
||||
void* (*realloc) (void *ctx, const char *filename, int lineno,
|
||||
void *ptr, size_t new_size);
|
||||
|
||||
/* release a memory block */
|
||||
void (*free) (void *ctx, const char *filename, int lineno,
|
||||
void *ptr);
|
||||
} PyMemAllocator;
|
||||
|
||||
void* _PyMem_MallocTrace(const char *filename, int lineno,
|
||||
size_t size);
|
||||
|
||||
/* the function is still needed for the Python stable ABI */
|
||||
void* PyMem_Malloc(size_t size);
|
||||
|
||||
#define PyMem_Malloc(size) \
|
||||
_PyMem_MallocTrace(__FILE__, __LINE__, size)
|
||||
|
||||
The GC allocator functions would also have to be patched. For example,
|
||||
``_PyObject_GC_Malloc()`` is used in many C functions and so objects of
|
||||
different types would have the same allocation location.
|
||||
|
||||
This alternative was rejected because passing a filename and a line
|
||||
number to each allocator makes the API more complex: pass 3 new
|
||||
arguments (ctx, filename, lineno) to each allocator function, instead of
|
||||
just a context argument (ctx). Having to also modify GC allocator
|
||||
functions adds too much complexity for little gain.
|
||||
|
||||
|
||||
GIL-free PyMem_Malloc()
|
||||
-----------------------
|
||||
|
||||
In Python 3.3, when Python is compiled in debug mode, ``PyMem_Malloc()``
|
||||
indirectly calls ``PyObject_Malloc()`` which requires the GIL to be
|
||||
held (it isn't thread-safe). That's why ``PyMem_Malloc()`` must be called
|
||||
with the GIL held.
|
||||
|
||||
This PEP changes ``PyMem_Malloc()``: it now always calls ``malloc()``
|
||||
rather than ``PyObject_Malloc()``. The "GIL must be held" restriction
|
||||
could therefore be removed from ``PyMem_Malloc()``.
|
||||
|
||||
This alternative was rejected because allowing calls to
|
||||
``PyMem_Malloc()`` without holding the GIL can break applications
|
||||
which set up their own allocators or allocator hooks. Holding the GIL is
|
||||
convenient to develop a custom allocator: no need to care about other
|
||||
threads. It is also convenient for a debug allocator hook: Python
|
||||
objects can be safely inspected, and the C API may be used for reporting.
|
||||
|
||||
Moreover, calling ``PyGILState_Ensure()`` in a memory allocator has
|
||||
unexpected behaviour, especially at Python startup and when creating a
|
||||
new Python thread state. It is better to free custom allocators of
|
||||
the responsibility of acquiring the GIL.
|
||||
|
||||
|
||||
Don't add PyMem_RawMalloc()
|
||||
---------------------------
|
||||
|
||||
Replace ``malloc()`` with ``PyMem_Malloc()``, but only if the GIL is
|
||||
held. Otherwise, keep ``malloc()`` unchanged.
|
||||
|
||||
``PyMem_Malloc()`` is used without the GIL held in some Python
|
||||
functions. For example, the ``main()`` and ``Py_Main()`` functions of
|
||||
Python call ``PyMem_Malloc()`` although the GIL does not exist yet. In this
|
||||
case, ``PyMem_Malloc()`` would be replaced with ``malloc()`` (or
|
||||
``PyMem_RawMalloc()``).
|
||||
|
||||
This alternative was rejected because ``PyMem_RawMalloc()`` is required
|
||||
for accurate reports of the memory usage. When a debug hook is used to
|
||||
track the memory usage, the memory allocated by direct calls to
|
||||
``malloc()`` cannot be tracked. ``PyMem_RawMalloc()`` can be hooked and
|
||||
so all the memory allocated by Python can be tracked, including
|
||||
memory allocated without holding the GIL.
|
||||
|
||||
|
||||
Use existing debug tools to analyze memory use
|
||||
----------------------------------------------
|
||||
|
||||
There are many existing debug tools to analyze memory use. Some
|
||||
examples: `Valgrind <http://valgrind.org/>`_, `Purify
|
||||
<http://ibm.com/software/awdtools/purify/>`_, `Clang AddressSanitizer
|
||||
<http://code.google.com/p/address-sanitizer/>`_, `failmalloc
|
||||
<http://www.nongnu.org/failmalloc/>`_, etc.
|
||||
|
||||
The problem is to retrieve the Python object related to a memory pointer
|
||||
to read its type and/or its content. Another issue is to retrieve the
|
||||
source of the memory allocation: the C backtrace is usually useless
|
||||
(same reasoning as for macros using ``__FILE__`` and ``__LINE__``, see
|
||||
`Pass the C filename and line number`_), the Python filename and line
|
||||
number (or even the Python traceback) is more useful.
|
||||
|
||||
This alternative was rejected because classic tools are unable to
|
||||
introspect Python internals to collect such information. Being able to
|
||||
set up a hook on allocators called with the GIL held makes it possible to collect a
|
||||
lot of useful data from Python internals.
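As an illustration of the kind of data such a hook can collect, here is a
minimal sketch using the ``tracemalloc`` module from the standard library
(Python 3.4 and later). The module is not named in the text above; it is used
here as an assumption, as one example of a tool built on allocator hooks. The
helper names ``allocate_blocks`` and ``top_allocation_sites`` are hypothetical.

```python
import tracemalloc


def allocate_blocks(count, size):
    """Allocate `count` bytes objects of `size` bytes each."""
    return [bytes(size) for _ in range(count)]


def top_allocation_sites(limit=3):
    """Return the largest allocation sites, grouped by Python file and line."""
    snapshot = tracemalloc.take_snapshot()
    return snapshot.statistics("lineno")[:limit]


tracemalloc.start()
blocks = allocate_blocks(100, 10_000)  # keep a reference so the memory stays alive
stats = top_allocation_sites()
tracemalloc.stop()

for stat in stats:
    frame = stat.traceback[0]
    # Report the Python source location instead of a C backtrace.
    print(f"{frame.filename}:{frame.lineno}: {stat.size} bytes in {stat.count} blocks")
```

Each entry points at a Python filename and line number, which is the
information that classic C-level tools cannot provide.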
|
||||
|
||||
|
||||
Add a msize() function
|
||||
----------------------
|
||||
|
||||
Add another function to ``PyMemAllocator`` and
|
||||
``PyObjectArenaAllocator`` structures::
|
||||
|
||||
size_t msize(void *ptr);
|
||||
|
||||
This function returns the size of a memory block or a memory mapping.
|
||||
Return (size_t)-1 if the function is not implemented or if the pointer
|
||||
is unknown (ex: NULL pointer).
|
||||
|
||||
On Windows, this function can be implemented using ``_msize()`` and
|
||||
``VirtualQuery()``.
|
||||
|
||||
The function can be used to implement a hook tracking the memory usage.
|
||||
The ``free()`` method of an allocator only gets the address of a memory
|
||||
block, whereas the size of the memory block is required to update the
|
||||
memory usage.
|
||||
|
||||
The additional ``msize()`` function was rejected because only a few
|
||||
platforms implement it. For example, Linux with the GNU libc does not
|
||||
provide a function to get the size of a memory block. ``msize()`` is not
|
||||
currently used in the Python source code. The function would only be
|
||||
used to track memory use, and would make the API more complex. A debug hook
|
||||
can implement the function internally; there is no need to add it to
|
||||
``PyMemAllocator`` and ``PyObjectArenaAllocator`` structures.
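The idea that a debug hook can keep block sizes itself can be modelled as
follows. This is a pure Python model, not the C API: the class
``TrackingHook``, its fake addresses and its method names are all
hypothetical and only illustrate the bookkeeping.

```python
import itertools


class TrackingHook:
    """Model of a debug hook wrapping an allocator that has no msize()."""

    def __init__(self):
        self._sizes = {}  # address -> size, maintained by the hook itself
        self._addresses = itertools.count(0x1000, 0x10)  # fake addresses
        self.current_usage = 0

    def malloc(self, size):
        address = next(self._addresses)  # stands in for the wrapped malloc()
        self._sizes[address] = size
        self.current_usage += size
        return address

    def realloc(self, address, new_size):
        # The model keeps the same address; only the recorded size changes.
        old_size = self._sizes[address]
        self._sizes[address] = new_size
        self.current_usage += new_size - old_size
        return address

    def free(self, address):
        # free() only receives the address: the size comes from our own table.
        self.current_usage -= self._sizes.pop(address)
```

Because the hook records the size at allocation time, ``free()`` can update
the usage counter without asking the underlying allocator for the size.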
|
||||
|
||||
|
||||
No context argument
|
||||
-------------------
|
||||
|
||||
Simplify the signature of allocator functions, remove the context
|
||||
argument:
|
||||
|
||||
* ``void* malloc(size_t size)``
|
||||
* ``void* realloc(void *ptr, size_t new_size)``
|
||||
* ``void free(void *ptr)``
|
||||
|
||||
It is likely for an allocator hook to be reused for
|
||||
``PyMem_SetAllocator()`` and ``PyObject_SetAllocator()``, or even
|
||||
``PyMem_SetRawAllocator()``, but the hook must call a different function
|
||||
depending on the allocator. The context is a convenient way to reuse the
|
||||
same custom allocator or hook for different Python allocators.
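The reuse described above can be sketched with a small Python model. It is
not the C API: ``Allocator``, ``HookContext``, ``hook_malloc`` and
``plain_malloc`` are hypothetical names, and ``bytearray`` stands in for a
memory block.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Allocator:
    """Model of an allocator structure: a context plus a function using it."""
    ctx: Any
    malloc: Callable[[Any, int], bytearray]

    def allocate(self, size):
        # The context is passed back to the function on every call.
        return self.malloc(self.ctx, size)


@dataclass
class HookContext:
    name: str
    wrapped: Allocator  # the allocator this hook must forward to
    calls: int = 0


def plain_malloc(ctx, size):
    return bytearray(size)


def hook_malloc(ctx, size):
    # A single hook body: the context says which allocator to call next.
    ctx.calls += 1
    return ctx.wrapped.allocate(size)


raw_ctx = HookContext("raw", Allocator(None, plain_malloc))
obj_ctx = HookContext("obj", Allocator(None, plain_malloc))
hooked_raw = Allocator(raw_ctx, hook_malloc)
hooked_obj = Allocator(obj_ctx, hook_malloc)
```

The same ``hook_malloc`` function serves both allocators; without a context
argument, one function per hooked allocator would be needed.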
|
||||
|
||||
In C++, the context can be used to pass *this*.
|
||||
|
||||
|
||||
External Libraries
|
||||
==================
|
||||
|
||||
Examples of APIs used to customize memory allocators.
|
||||
|
||||
Libraries used by Python:
|
||||
|
||||
* OpenSSL: `CRYPTO_set_mem_functions()
|
||||
<http://git.openssl.org/gitweb/?p=openssl.git;a=blob;f=crypto/mem.c;h=f7984fa958eb1edd6c61f6667f3f2b29753be662;hb=HEAD#l124>`_
|
||||
to set memory management functions globally
|
||||
* expat: `parserCreate()
|
||||
<http://hg.python.org/cpython/file/cc27d50bd91a/Modules/expat/xmlparse.c#l724>`_
|
||||
has a per-instance memory handler
|
||||
* zlib: `zlib 1.2.8 Manual <http://www.zlib.net/manual.html#Usage>`_,
|
||||
pass an opaque pointer
|
||||
* bz2: `bzip2 and libbzip2, version 1.0.5
|
||||
<http://www.bzip.org/1.0.5/bzip2-manual-1.0.5.html>`_,
|
||||
pass an opaque pointer
|
||||
* lzma: `LZMA SDK - How to Use
|
||||
<http://www.asawicki.info/news_1368_lzma_sdk_-_how_to_use.html>`_,
|
||||
pass an opaque pointer
|
||||
* libmpdec: no opaque pointer (classic malloc API)
|
||||
|
||||
Other libraries:
|
||||
|
||||
* glib: `g_mem_set_vtable()
|
||||
<http://developer.gnome.org/glib/unstable/glib-Memory-Allocation.html#g-mem-set-vtable>`_
|
||||
* libxml2:
|
||||
`xmlGcMemSetup() <http://xmlsoft.org/html/libxml-xmlmemory.html>`_,
|
||||
global
|
||||
* Oracle's OCI: `Oracle Call Interface Programmer's Guide,
|
||||
Release 2 (9.2)
|
||||
<http://docs.oracle.com/cd/B10501_01/appdev.920/a96584/oci15re4.htm>`_,
|
||||
pass an opaque pointer
|
||||
|
||||
The new *ctx* parameter of this PEP was inspired by the API of zlib and
|
||||
Oracle's OCI libraries.
|
||||
|
||||
See also the `GNU libc: Memory Allocation Hooks
|
||||
<http://www.gnu.org/software/libc/manual/html_node/Hooks-for-Malloc.html>`_
|
||||
which uses a different approach to hook memory allocators.
|
||||
|
||||
|
||||
Memory Allocators
|
||||
=================
|
||||
|
||||
The C standard library provides the well known ``malloc()`` function.
|
||||
Its implementation depends on the platform and on the C library. The GNU
|
||||
C library uses a modified ptmalloc2, based on "Doug Lea's Malloc"
|
||||
(dlmalloc). FreeBSD uses `jemalloc
|
||||
<http://www.canonware.com/jemalloc/>`_. Google provides *tcmalloc* which
|
||||
is part of `gperftools <http://code.google.com/p/gperftools/>`_.
|
||||
|
||||
``malloc()`` uses two kinds of memory: heap and memory mappings. Memory
|
||||
mappings are usually used for large allocations (ex: larger than 256
|
||||
KB), whereas the heap is used for small allocations.
|
||||
|
||||
On UNIX, the heap is handled by ``brk()`` and ``sbrk()`` system calls,
|
||||
and it is contiguous. On Windows, the heap is handled by
|
||||
``HeapAlloc()`` and can be discontiguous. Memory mappings are handled by
|
||||
``mmap()`` on UNIX and ``VirtualAlloc()`` on Windows; they can be
|
||||
discontiguous.
|
||||
|
||||
Releasing a memory mapping immediately gives the memory back to the
|
||||
system. On UNIX, the heap memory is only given back to the system if the
|
||||
released block is located at the end of the heap. Otherwise, the memory
|
||||
will only be given back to the system when all the memory located after
|
||||
the released memory is also released.
|
||||
|
||||
To allocate memory on the heap, an allocator tries to reuse free space.
|
||||
If there is no contiguous space big enough, the heap must be enlarged,
|
||||
even if there is more free space than the required size. This issue is
|
||||
called "memory fragmentation": the memory usage seen by the system
|
||||
is higher than real usage. On Windows, ``HeapAlloc()`` creates
|
||||
a new memory mapping with ``VirtualAlloc()`` if there is not enough free
|
||||
contiguous memory.
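The effect can be reproduced with a toy model of a contiguous heap. This is
an illustration only, assuming a first-fit strategy without coalescing of
free blocks; real allocators are more sophisticated. The ``Heap`` class and
its methods are hypothetical.

```python
class Heap:
    """Toy contiguous heap using first-fit allocation (illustration only)."""

    def __init__(self):
        self.blocks = []  # [size, in_use] pairs, in address order

    @property
    def size(self):
        return sum(size for size, _ in self.blocks)

    @property
    def free_space(self):
        return sum(size for size, used in self.blocks if not used)

    def malloc(self, size):
        for index, block in enumerate(self.blocks):
            block_size, used = block
            if not used and block_size >= size:
                # Reuse a hole, splitting off the unused remainder.
                block[0], block[1] = size, True
                if block_size > size:
                    self.blocks.insert(index + 1, [block_size - size, False])
                return block
        # No hole is large enough: the heap must be enlarged.
        block = [size, True]
        self.blocks.append(block)
        return block

    def free(self, block):
        block[1] = False
        # Memory is only given back when the free blocks are at the end.
        while self.blocks and not self.blocks[-1][1]:
            self.blocks.pop()


heap = Heap()
a, b, c, d = (heap.malloc(100) for _ in range(4))
heap.free(a)
heap.free(c)           # 200 bytes free, but in two separate 100-byte holes
e = heap.malloc(150)   # does not fit in any hole: the heap grows
print(heap.size, heap.free_space)
```

After the last allocation the model heap is 550 bytes long although only 350
bytes are in use: the two 100-byte holes cannot serve a 150-byte request.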
|
||||
|
||||
CPython has a *pymalloc* allocator for allocations smaller than 512
|
||||
bytes. This allocator is optimized for small objects with a short
|
||||
lifetime. It uses memory mappings called "arenas" with a fixed size of
|
||||
256 KB.
|
||||
|
||||
Other allocators:
|
||||
|
||||
* Windows provides a `Low-fragmentation Heap
|
||||
<http://msdn.microsoft.com/en-us/library/windows/desktop/aa366750%28v=vs.85%29.aspx>`_.
|
||||
|
||||
* The Linux kernel uses `slab allocation
|
||||
<http://en.wikipedia.org/wiki/Slab_allocation>`_.
|
||||
|
||||
* The glib library has a `Memory Slice API
|
||||
<https://developer.gnome.org/glib/unstable/glib-Memory-Slices.html>`_:
|
||||
efficient way to allocate groups of equal-sized chunks of memory
|
||||
|
||||
This PEP makes it possible to choose exactly which memory allocator is used for your
|
||||
application depending on its usage of the memory (number of allocations,
|
||||
size of allocations, lifetime of objects, etc.).
|
||||
|
||||
|
||||
Links
|
||||
=====
|
||||
|
||||
CPython issues related to memory allocation:
|
||||
|
||||
* `Issue #3329: Add new APIs to customize memory allocators
|
||||
<http://bugs.python.org/issue3329>`_
|
||||
* `Issue #13483: Use VirtualAlloc to allocate memory arenas
|
||||
<http://bugs.python.org/issue13483>`_
|
||||
* `Issue #16742: PyOS_Readline drops GIL and calls PyOS_StdioReadline,
|
||||
which isn't thread safe <http://bugs.python.org/issue16742>`_
|
||||
* `Issue #18203: Replace calls to malloc() with PyMem_Malloc() or
|
||||
PyMem_RawMalloc() <http://bugs.python.org/issue18203>`_
|
||||
* `Issue #18227: Use Python memory allocators in external libraries like
|
||||
zlib or OpenSSL <http://bugs.python.org/issue18227>`_
|
||||
|
||||
Projects analyzing the memory usage of Python applications:
|
||||
|
||||
* `pytracemalloc
|
||||
<https://pypi.python.org/pypi/pytracemalloc>`_
|
||||
* `Meliae: Python Memory Usage Analyzer
|
||||
<https://pypi.python.org/pypi/meliae>`_
|
||||
* `Guppy-PE: umbrella package combining Heapy and GSL
|
||||
<http://guppy-pe.sourceforge.net/>`_
|
||||
* `PySizer (developed for Python 2.4)
|
||||
<http://pysizer.8325.org/>`_
|
||||
|
||||
|
||||
Copyright
|
||||
=========
|
||||
|
||||
This document has been placed into the public domain.
|
||||
|
|
@ -0,0 +1,242 @@
|
|||
PEP: 446
|
||||
Title: Add new parameters to configure the inheritance of files and for non-blocking sockets
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Victor Stinner <victor.stinner@gmail.com>
|
||||
Status: Draft
|
||||
Type: Standards Track
|
||||
Content-Type: text/x-rst
|
||||
Created: 3-July-2013
|
||||
Python-Version: 3.4
|
||||
|
||||
|
||||
Abstract
|
||||
========
|
||||
|
||||
This PEP proposes new portable parameters and functions to configure the
|
||||
inheritance of file descriptors and the non-blocking flag of sockets.
|
||||
|
||||
|
||||
Rationale
|
||||
=========
|
||||
|
||||
Inheritance of file descriptors
|
||||
-------------------------------
|
||||
|
||||
The inheritance of file descriptors in child processes can be configured
|
||||
on each file descriptor using a *close-on-exec* flag. By default, the
|
||||
close-on-exec flag is not set.
|
||||
|
||||
On Windows, the close-on-exec flag is ``HANDLE_FLAG_INHERIT``. File
|
||||
descriptors are not inherited if the ``bInheritHandles`` parameter of
|
||||
the ``CreateProcess()`` function is ``FALSE``, even if the
|
||||
``HANDLE_FLAG_INHERIT`` flag is set. If ``bInheritHandles`` is ``TRUE``,
|
||||
only file descriptors with ``HANDLE_FLAG_INHERIT`` flag set are
|
||||
inherited, others are not.
|
||||
|
||||
On UNIX, the close-on-exec flag is ``O_CLOEXEC``. File descriptors with
|
||||
the ``O_CLOEXEC`` flag set are closed at the execution of a new program
|
||||
(ex: when calling ``execv()``).
|
||||
|
||||
The ``O_CLOEXEC`` flag has no effect on ``fork()``, all file descriptors
|
||||
are inherited by the child process. Furthermore, most properties of file
|
||||
descriptors are shared between the parent and the child processes,
|
||||
except file attributes which are duplicated (``O_CLOEXEC`` is the only
|
||||
file attribute). Setting ``O_CLOEXEC`` flag of a file descriptor in the
|
||||
child process does not change the ``O_CLOEXEC`` flag of the file
|
||||
descriptor in the parent process.
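The behaviour described above can be checked on UNIX with the existing
``fcntl`` module. This is a sketch under assumptions: ``get_cloexec()`` and
``set_cloexec()`` are local helpers written for this example (they are not
functions of the ``os`` module), and ``child_change_is_private()`` is a
hypothetical name. At the ``fcntl()`` level the descriptor flag is spelled
``FD_CLOEXEC``.

```python
import fcntl
import os


def get_cloexec(fd):
    """Return True if the close-on-exec flag of `fd` is set."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


def set_cloexec(fd, cloexec):
    """Set or clear the close-on-exec flag of `fd`."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    if cloexec:
        flags |= fcntl.FD_CLOEXEC
    else:
        flags &= ~fcntl.FD_CLOEXEC
    fcntl.fcntl(fd, fcntl.F_SETFD, flags)


def child_change_is_private():
    """Set the flag in a forked child and check the parent is unaffected."""
    read_fd, write_fd = os.pipe()
    try:
        set_cloexec(read_fd, False)
        pid = os.fork()
        if pid == 0:
            # Child: fork() inherited the descriptor; change our own copy.
            exit_code = 1
            try:
                set_cloexec(read_fd, True)
                exit_code = 0 if get_cloexec(read_fd) else 1
            finally:
                os._exit(exit_code)
        _, status = os.waitpid(pid, 0)
        child_ok = os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
        # Parent: the flag must still be cleared here.
        return child_ok and not get_cloexec(read_fd)
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

The helper returns ``True`` when the child saw its flag set while the flag
of the parent stayed cleared.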
|
||||
|
||||
|
||||
Issues of the inheritance of file descriptors
|
||||
---------------------------------------------
|
||||
|
||||
Inheritance of file descriptors causes issues. For example, closing a
|
||||
file descriptor in the parent process does not release the resource
|
||||
(file, socket, ...), because the file descriptor is still open in the
|
||||
child process.
|
||||
|
||||
Leaking file descriptors is also a major security vulnerability. An
|
||||
untrusted child process can read sensitive data like passwords and take
|
||||
control of the parent process through leaked file descriptors. It is for
|
||||
example a known vulnerability to escape from a chroot.
|
||||
|
||||
|
||||
Non-blocking sockets
|
||||
--------------------
|
||||
|
||||
To handle multiple network clients in a single thread, a multiplexing
|
||||
function like ``select()`` can be used. For best performance, sockets
|
||||
must be configured as non-blocking. Operations like ``send()`` and
|
||||
``recv()`` return an ``EAGAIN`` or ``EWOULDBLOCK`` error if the
|
||||
operation would block.
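A minimal sketch of this behaviour with the existing ``socket`` and
``select`` modules follows. In Python, ``EAGAIN`` and ``EWOULDBLOCK`` are
reported as ``BlockingIOError``. The helper name ``recv_nowait`` is
hypothetical.

```python
import errno
import select
import socket


def recv_nowait(sock, size=4096):
    """Return received bytes, or None if the operation would block."""
    try:
        return sock.recv(size)
    except BlockingIOError as exc:
        # The underlying error code is EAGAIN or EWOULDBLOCK.
        if exc.errno not in (errno.EAGAIN, errno.EWOULDBLOCK):
            raise
        return None


left, right = socket.socketpair()
left.setblocking(False)

first = recv_nowait(left)              # nothing has been sent yet
right.sendall(b"ping")
select.select([left], [], [], 1.0)     # wait until the data is readable
second = recv_nowait(left)

left.close()
right.close()
```

The first call returns ``None`` instead of blocking; after ``select()``
reports the socket as readable, the second call returns the data.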
|
||||
|
||||
By default, newly created sockets are blocking. Setting the non-blocking
|
||||
mode requires additional system calls.
|
||||
|
||||
On UNIX, the blocking flag is ``O_NONBLOCK``: a pipe and a socket are
|
||||
non-blocking if the ``O_NONBLOCK`` flag is set.
|
||||
|
||||
|
||||
Setting flags at the creation of the file descriptor
|
||||
----------------------------------------------------
|
||||
|
||||
Windows and recent versions of other operating systems like Linux
|
||||
support setting the close-on-exec flag directly at the creation of file
|
||||
descriptors, and close-on-exec and blocking flags at the creation of
|
||||
sockets.
|
||||
|
||||
Setting these flags at the creation is atomic and avoids additional
|
||||
system calls.
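On platforms providing ``pipe2()``, the existing ``os.pipe2()`` function
shows what setting flags at creation looks like. This is a sketch assuming a
UNIX platform with the ``fcntl`` module; ``describe_fd()`` and
``atomic_pipe()`` are hypothetical helper names.

```python
import fcntl
import os


def describe_fd(fd):
    """Return (close_on_exec, non_blocking) for a file descriptor."""
    cloexec = bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)
    nonblock = bool(fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_NONBLOCK)
    return cloexec, nonblock


def atomic_pipe():
    """Create a pipe with both flags set by the creating system call itself."""
    if not hasattr(os, "pipe2"):
        raise NotImplementedError("os.pipe2() is not available on this platform")
    # A single system call: there is no window where the flags are unset.
    return os.pipe2(os.O_CLOEXEC | os.O_NONBLOCK)
```

With ``os.pipe()`` followed by ``fcntl()`` calls, another thread could
``fork()`` and ``exec()`` between the creation and the flag change; the
single call avoids that window.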
|
||||
|
||||
|
||||
Proposal
|
||||
========
|
||||
|
||||
New cloexec And blocking Parameters
|
||||
-----------------------------------
|
||||
|
||||
Add a new optional *cloexec* parameter to functions creating file descriptors:
|
||||
|
||||
* ``io.FileIO``
|
||||
* ``io.open()``
|
||||
* ``open()``
|
||||
* ``os.dup()``
|
||||
* ``os.dup2()``
|
||||
* ``os.fdopen()``
|
||||
* ``os.open()``
|
||||
* ``os.openpty()``
|
||||
* ``os.pipe()``
|
||||
* ``select.devpoll()``
|
||||
* ``select.epoll()``
|
||||
* ``select.kqueue()``
|
||||
|
||||
Add new optional *cloexec* and *blocking* parameters to functions
|
||||
creating sockets:
|
||||
|
||||
* ``asyncore.dispatcher.create_socket()``
|
||||
* ``socket.socket()``
|
||||
* ``socket.socket.accept()``
|
||||
* ``socket.socket.dup()``
|
||||
* ``socket.socket.fromfd``
|
||||
* ``socket.socketpair()``
|
||||
|
||||
The default value of *cloexec* is ``False`` and the default value of
|
||||
*blocking* is ``True``.
|
||||
|
||||
The atomicity is not guaranteed. If the platform does not support
|
||||
setting close-on-exec and blocking flags at the creation of the file
|
||||
descriptor or socket, the flags are set using additional system calls.
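The fallback described above can be sketched for the blocking flag only.
``create_socket()`` is a hypothetical helper, not the proposed API: it passes
``SOCK_NONBLOCK`` at creation when the ``socket`` module exposes it, and
otherwise uses an additional call. The close-on-exec flag is left out of
this sketch.

```python
import socket


def create_socket(blocking=True):
    """Create a TCP socket, setting the non-blocking flag at creation if possible."""
    # SOCK_NONBLOCK is not defined on every platform; 0 means "unsupported".
    creation_flag = 0 if blocking else getattr(socket, "SOCK_NONBLOCK", 0)
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM | creation_flag)
    if not blocking and not creation_flag:
        # Fallback: the flag is set with an additional system call,
        # so the creation is not atomic.
        sock.setblocking(False)
    return sock
```

A caller gets the same result on both paths; only the number of system calls
and the atomicity differ.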
|
||||
|
||||
|
||||
New Functions
|
||||
-------------
|
||||
|
||||
Add new functions to get and set the close-on-exec flag of a file
|
||||
descriptor, available on all platforms:
|
||||
|
||||
* ``os.get_cloexec(fd:int) -> bool``
|
||||
* ``os.set_cloexec(fd:int, cloexec: bool)``
|
||||
|
||||
Add new functions to get and set the blocking flag of a file
|
||||
descriptor, only available on UNIX:
|
||||
|
||||
* ``os.get_blocking(fd:int) -> bool``
|
||||
* ``os.set_blocking(fd:int, blocking: bool)``
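For illustration, the blocking pair can be approximated on UNIX with the
existing ``fcntl`` module. The functions below are local helpers written for
this example with the same signatures as the proposal; they are an
assumption about how such functions could behave, not their implementation.

```python
import fcntl
import os


def get_blocking(fd):
    """Return False if O_NONBLOCK is set on `fd`, True otherwise."""
    return not (fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_NONBLOCK)


def set_blocking(fd, blocking):
    """Clear O_NONBLOCK if `blocking` is true, set it otherwise."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    if blocking:
        flags &= ~os.O_NONBLOCK
    else:
        flags |= os.O_NONBLOCK
    fcntl.fcntl(fd, fcntl.F_SETFL, flags)


read_fd, write_fd = os.pipe()
set_blocking(read_fd, False)
try:
    os.read(read_fd, 1)   # the pipe is empty
    would_block = False
except BlockingIOError:
    would_block = True    # a non-blocking read fails instead of waiting
finally:
    os.close(read_fd)
    os.close(write_fd)
```

Reading from the empty pipe raises ``BlockingIOError`` once the descriptor
is non-blocking, instead of waiting for data.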
|
||||
|
||||
|
||||
Other Changes
|
||||
-------------
|
||||
|
||||
The ``subprocess.Popen`` class must clear the close-on-exec flag of file
|
||||
descriptors of the ``pass_fds`` parameter. The flag is cleared in the
|
||||
child process before executing the program; the change does not affect
|
||||
the flag in the parent process.
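The role of ``pass_fds`` can be seen with the existing ``subprocess``
module on UNIX. This is a sketch: ``CHILD_CODE`` and ``run_child_with_fd()``
are hypothetical names, and the child is simply another Python interpreter
writing to the descriptor number it receives on its command line.

```python
import os
import subprocess
import sys

# Program run by the child: write a message to the inherited descriptor.
CHILD_CODE = "import os, sys; os.write(int(sys.argv[1]), b'hello from child')"


def run_child_with_fd():
    """Run a child that writes to a pipe passed explicitly with pass_fds."""
    read_fd, write_fd = os.pipe()
    try:
        proc = subprocess.Popen(
            [sys.executable, "-c", CHILD_CODE, str(write_fd)],
            pass_fds=(write_fd,),  # only this descriptor is kept open in the child
        )
        proc.wait()
    finally:
        # Close our copy of the write end so that reading reaches end of file.
        os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        return proc.returncode, reader.read()
```

The descriptor keeps the same number in the child, which is why it can be
passed as a plain command line argument.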
|
||||
|
||||
The close-on-exec flag must also be set on private file descriptors and
|
||||
sockets in the Python standard library. For example, on UNIX,
|
||||
``os.urandom()`` opens ``/dev/urandom`` to read some random bytes and the
|
||||
file descriptor is closed at function exit. The file descriptor is not
|
||||
expected to be inherited by child processes.
|
||||
|
||||
|
||||
Rejected Alternatives
|
||||
=====================
|
||||
|
||||
PEP 433
|
||||
-------
|
||||
|
||||
PEP 433, entitled "Easier suppression of file descriptor inheritance",
|
||||
is a previous attempt proposing various other alternatives, but no
|
||||
consensus could be reached.
|
||||
|
||||
This PEP has a well defined behaviour (the default value of the new
|
||||
*cloexec* parameter is not configurable), is more conservative (no
|
||||
backward compatibility issue), and is much simpler.
|
||||
|
||||
|
||||
Add blocking parameter for file descriptors and use Windows overlapped I/O
|
||||
--------------------------------------------------------------------------
|
||||
|
||||
Windows supports non-blocking operations on files using an extension of
|
||||
the Windows API called "Overlapped I/O". Using this extension requires
|
||||
modifying the Python standard library and applications to pass a
|
||||
``OVERLAPPED`` structure and an event loop to wait for the completion of
|
||||
operations.
|
||||
|
||||
This PEP only tries to expose portable flags on file descriptors and
|
||||
sockets. Supporting overlapped I/O requires an abstraction providing a
|
||||
high-level and portable API for asynchronous operations on files and
|
||||
sockets. Overlapped I/O is out of the scope of this PEP.
|
||||
|
||||
UNIX supports non-blocking files; moreover, recent versions of operating
|
||||
systems support setting the non-blocking flag at the creation of a file
|
||||
descriptor. It would be possible to add a new optional *blocking*
|
||||
parameter to Python functions creating file descriptors. On Windows,
|
||||
creating a file descriptor with ``blocking=False`` would raise a
|
||||
``NotImplementedError``. This behaviour is not acceptable for the ``os``
|
||||
module which is designed as a thin wrapper on the C functions of the
|
||||
operating system. If a platform does not support a function, the
|
||||
function should not be available on the platform. For example,
|
||||
the ``os.fork()`` function is not available on Windows.
|
||||
|
||||
For all these reasons, this alternative was rejected. PEP 3156
|
||||
proposes an abstraction for asynchronous I/O supporting non-blocking
|
||||
files on Windows.
|
||||
|
||||
|
||||
Links
|
||||
=====
|
||||
|
||||
Python issues:
|
||||
|
||||
* `#10115: Support accept4() for atomic setting of flags at socket
|
||||
creation <http://bugs.python.org/issue10115>`_
|
||||
* `#12105: open() does not able to set flags, such as O_CLOEXEC
|
||||
<http://bugs.python.org/issue12105>`_
|
||||
* `#12107: TCP listening sockets created without FD_CLOEXEC flag
|
||||
<http://bugs.python.org/issue12107>`_
|
||||
* `#16850: Add "e" mode to open(): close-and-exec
|
||||
(O_CLOEXEC) / O_NOINHERIT <http://bugs.python.org/issue16850>`_
|
||||
* `#16860: Use O_CLOEXEC in the tempfile module
|
||||
<http://bugs.python.org/issue16860>`_
|
||||
* `#16946: subprocess: _close_open_fd_range_safe() does not set
|
||||
close-on-exec flag on Linux < 2.6.23 if O_CLOEXEC is defined
|
||||
<http://bugs.python.org/issue16946>`_
|
||||
* `#17070: Use the new cloexec to improve security and avoid bugs
|
||||
<http://bugs.python.org/issue17070>`_
|
||||
|
||||
Other links:
|
||||
|
||||
* `Secure File Descriptor Handling
|
||||
<http://udrepper.livejournal.com/20407.html>`_ (Ulrich Drepper,
|
||||
2008)
|
||||
* `Ghosts of Unix past, part 2: Conflated designs
|
||||
<http://lwn.net/Articles/412131/>`_ (Neil Brown, 2010) explains the
|
||||
history of ``O_CLOEXEC`` and ``O_NONBLOCK`` flags
|
||||
|
||||
|
||||
Copyright
|
||||
=========
|
||||
|
||||
This document has been placed into the public domain.
|
||||
|