Merge branch 'master' of github.com:python/peps

Commit 791464129f by Larry Hastings, 2018-07-19 19:36:00 -07:00
84 changed files with 9249 additions and 2243 deletions

.github/PULL_REQUEST_TEMPLATE.md (new file)

@@ -0,0 +1,11 @@
<!--
Please include the PEP number in the pull request title, for example:
PEP NNN: Summary of the changes made
In addition, please sign the CLA.
For more information, please read our Contributing Guidelines (CONTRIBUTING.rst)
-->

.gitignore

@@ -1,4 +1,5 @@
 pep-0000.txt
+pep-0000.rst
 pep-????.html
 __pycache__
 *.pyc

.travis.yml

@@ -1,6 +1,6 @@
 language: python
 python:
-  - 3.5
+  - "3.7-dev"
 sudo: false
 cache: pip

CODE_OF_CONDUCT.rst (new file)

@@ -0,0 +1,13 @@
Code of Conduct
===============
Please note that all interactions on
`Python Software Foundation <https://www.python.org/psf-landing/>`__-supported
infrastructure are `covered
<https://www.python.org/psf/records/board/minutes/2014-01-06/#management-of-the-psfs-web-properties>`__
by the `PSF Code of Conduct <https://www.python.org/psf/codeofconduct/>`__.
This covers all infrastructure used in the development of Python itself
(e.g. mailing lists, issue trackers, GitHub, etc.).
In general this means everyone is expected to be open, considerate, and
respectful of others no matter what their position is within the project.

CONTRIBUTING.rst (new file)

@@ -0,0 +1,53 @@
Contributing Guidelines
=======================
To learn more about the purpose of PEPs and how to go about writing a PEP, please
start reading at PEP 1 (`pep-0001.txt <./pep-0001.txt>`_ in this repo). Note that
PEP 0, the index PEP, is now automatically generated, and not committed to the repo.

Before writing a new PEP
------------------------

Has this idea been proposed on `python-ideas <https://mail.python.org/mailman/listinfo/python-ideas>`_
and received general acceptance as being an idea worth pursuing? If not,
please start a discussion there before submitting a pull request. More details
can be found in `PEP 1 <https://www.python.org/dev/peps/pep-0001/#start-with-an-idea-for-python>`_.

Do you have an implementation of your idea? This is important because code
maintenance is a critical aspect of all PEP proposals prior to a final
decision, particularly once the PEP is proposed to `python-dev
<https://mail.python.org/mailman/listinfo/python-dev>`_; in special
circumstances an implementation can be deferred.

Commit messages
---------------

When committing to a PEP, please always include the PEP number in the commit
message's subject line, for example: ``PEP NNN: <summary of changes>``.

Sign the CLA
------------

Before you hit "Create pull request", please take a moment to ensure that this
project can legally accept your contribution by verifying you have signed the
PSF Contributor Agreement:

    https://www.python.org/psf/contrib/contrib-form/

If you haven't signed the CLA before, please follow the steps outlined in the
CPython devguide to do so:

    https://devguide.python.org/pullrequest/#licensing

Thank you again for your contribution; we look forward to reviewing it!
Code of Conduct
---------------
All interactions for this project are covered by the
`PSF Code of Conduct <https://www.python.org/psf/codeofconduct/>`_. Everyone is
expected to be open, considerate, and respectful of others no matter their
position within the project.

Makefile

@@ -15,13 +15,13 @@ PYTHON=python3
 .rst.html:
 	@$(PYTHON) $(PEP2HTML) $<

 TARGETS= $(patsubst %.rst,%.html,$(wildcard pep-????.rst)) $(patsubst %.txt,%.html,$(wildcard pep-????.txt)) pep-0000.html

-all: pep-0000.txt $(TARGETS)
+all: pep-0000.rst $(TARGETS)

 $(TARGETS): pep2html.py

-pep-0000.txt: $(wildcard pep-????.txt) $(wildcard pep-????.rst) $(wildcard pep0/*.py)
+pep-0000.rst: $(wildcard pep-????.txt) $(wildcard pep-????.rst) $(wildcard pep0/*.py) genpepindex.py
 	$(PYTHON) genpepindex.py .

 rss:

@@ -31,6 +31,7 @@ install:
 	echo "Installing is not necessary anymore. It will be done in post-commit."

 clean:
+	-rm pep-0000.rst
 	-rm pep-0000.txt
 	-rm *.html

README.rst

@@ -11,6 +11,12 @@ PEPs and how to go about writing a PEP, please start reading at PEP 1
 now automatically generated, and not committed to the repo.

+Contributing to PEPs
+====================
+
+See the `Contributing Guidelines <./CONTRIBUTING.rst>`_.
+
 reStructuredText for PEPs
 =========================

@@ -26,12 +32,41 @@ package, which is available from `PyPI <http://pypi.python.org>`_.
 If you have pip, ``pip install docutils`` should install it.

-Generating HTML
-===============
+Generating the PEP Index
+========================
+
+PEP 0 is automatically generated based on the metadata headers in other
+PEPs.  The script handling this is ``genpepindex.py``, with supporting
+libraries in the ``pep0`` directory.
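As a rough sketch of what this index generation involves (an illustrative
simplification, not the actual ``genpepindex.py``/``pep0`` logic), the
builder reads the "Header: value" preamble of every PEP source file and
collects its metadata::

    import email.parser
    import glob
    import os

    def collect_pep_metadata(directory='.'):
        """Collect (number, title, status) tuples from PEP sources."""
        peps = []
        for path in sorted(glob.glob(os.path.join(directory, 'pep-????.*'))):
            name = os.path.basename(path)
            # Skip the generated index itself and non-source files.
            if name.startswith('pep-0000.') or not name.endswith(('.txt', '.rst')):
                continue
            with open(path, encoding='utf-8') as f:
                # PEP preambles use RFC 2822 style header lines, so the
                # stdlib email parser can read them.
                headers = email.parser.Parser().parse(f, headersonly=True)
            peps.append((headers['PEP'], headers['Title'], headers['Status']))
        return peps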

+Checking PEP formatting and rendering
+=====================================

 Do not commit changes with bad formatting.  To check the formatting of
 a PEP, use the Makefile.  In particular, to generate HTML for PEP 999,
-your source code should be in ``pep-0999.txt`` and the HTML will be
+your source code should be in ``pep-0999.rst`` and the HTML will be
 generated to ``pep-0999.html`` by the command ``make pep-0999.html``.
-The default Make target generates HTML for all PEPs.  If you don't have
-Make, use the ``pep2html.py`` script.
+The default Make target generates HTML for all PEPs.
+
+If you don't have Make, use the ``pep2html.py`` script directly.
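For reStructuredText sources, the rendering ultimately relies on Docutils.
A rough approximation of what ``make pep-0999.html`` does (the real
``pep2html.py`` adds PEP-specific header handling, link resolution, and
styling on top of this) is::

    from docutils.core import publish_file

    # Render a single reST PEP source to HTML.
    publish_file(source_path='pep-0999.rst',
                 destination_path='pep-0999.html',
                 writer_name='html')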

+Generating HTML for python.org
+==============================
+
+python.org includes its own helper modules to render PEPs as HTML, with
+suitable links back to the source pages in the version control repository.
+These can be found at https://github.com/python/pythondotorg/tree/master/peps
+
+When making changes to the PEP management process that may impact python.org's
+rendering pipeline:
+
+* Clone the python.org repository from https://github.com/python/pythondotorg/
+* Get set up for local python.org development as per
+  https://pythondotorg.readthedocs.io/install.html#manual-setup
+* Adjust ``PEP_REPO_PATH`` in ``pydotorg/settings/local.py`` to refer to your
+  local clone of the PEP repository
+* Run ``./manage.py generate_pep_pages`` as described in
+  https://pythondotorg.readthedocs.io/pep_generation.html

genpepindex.py

@@ -36,7 +36,7 @@ def main(argv):
     peps = []
     if os.path.isdir(path):
         for file_path in os.listdir(path):
-            if file_path == 'pep-0000.txt':
+            if file_path.startswith('pep-0000.'):
                 continue
             abs_file_path = os.path.join(path, file_path)
             if not os.path.isfile(abs_file_path):

@@ -61,7 +61,7 @@ def main(argv):
     else:
         raise ValueError("argument must be a directory or file path")

-    with codecs.open('pep-0000.txt', 'w', encoding='UTF-8') as pep0_file:
+    with codecs.open('pep-0000.rst', 'w', encoding='UTF-8') as pep0_file:
         write_pep0(peps, pep0_file)

 if __name__ == "__main__":
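The switch from an exact filename comparison to ``str.startswith`` means the
skip rule now covers the generated index in either source format, not just
the old plain-text name::

    >>> for name in ('pep-0000.txt', 'pep-0000.rst', 'pep-0008.txt'):
    ...     print(name, name == 'pep-0000.txt', name.startswith('pep-0000.'))
    pep-0000.txt True True
    pep-0000.rst False True
    pep-0008.txt False False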

pep-0001-1.png (binary file removed; was 20 KiB, not shown)

pep-0001-process_flow.png (new binary file, 13 KiB; not shown)

pep-0001.txt

@@ -245,7 +245,22 @@ Once a PEP has been accepted, the reference implementation must be
 completed.  When the reference implementation is complete and incorporated
 into the main source code repository, the status will be changed to "Final".

-A PEP can also be assigned status "Deferred".  The PEP author or an
+To allow gathering of additional design and interface feedback before committing
+to long term stability for a language feature or standard library API, a PEP
+may also be marked as "Provisional".  This is short for "Provisionally Accepted",
+and indicates that the proposal has been accepted for inclusion in the reference
+implementation, but additional user feedback is needed before the full design
+can be considered "Final".  Unlike regular accepted PEPs, provisionally accepted
+PEPs may still be Rejected or Withdrawn *even after the related changes have
+been included in a Python release*.
+
+Wherever possible, it is considered preferable to reduce the scope of a proposal
+to avoid the need to rely on the "Provisional" status (e.g. by deferring some
+features to later PEPs), as this status can lead to version compatibility
+challenges in the wider Python ecosystem.  PEP 411 provides additional details
+on potential use cases for the Provisional status.
+
+A PEP can also be assigned the status "Deferred".  The PEP author or an
 editor can assign the PEP this status when no progress is being made
 on the PEP.  Once a PEP is deferred, a PEP editor can re-assign it
 to draft status.

@@ -267,7 +282,17 @@ an API can replace version 1.
 The possible paths of the status of PEPs are as follows:

-.. image:: pep-0001-1.png
+.. image:: pep-0001-process_flow.png
+   :alt: PEP process flow diagram
+
+While not shown in the diagram, "Accepted" PEPs may technically move to
+"Rejected" or "Withdrawn" even after acceptance.  This will only occur if
+the implementation process reveals fundamental flaws in the design that were
+not noticed prior to acceptance of the PEP.  Unlike Provisional PEPs, these
+transitions are only permitted if the accepted proposal has *not* been included
+in a Python release - released changes must instead go through the regular
+deprecation process (which may require a new PEP providing the rationale for
+the deprecation).

 Some Informational and Process PEPs may also have a status of "Active"
 if they are never meant to be completed.  E.g. PEP 1 (this PEP).

@@ -281,6 +306,11 @@ reached the Final state.  Once a PEP has been completed, the Language and
 Standard Library References become the formal documentation of the expected
 behavior.

+If changes based on implementation experience and user feedback are made to
+Standards track PEPs while in the Accepted or Provisional State, those changes
+should be noted in the PEP, such that the PEP accurately describes the state of
+the implementation at the point where it is marked Final.
+
 Informational and Process PEPs may be updated over time to reflect changes
 to development practices and other details.  The precise process followed in
 these cases will depend on the nature and purpose of the PEP being updated.

@@ -345,6 +375,15 @@ Each PEP should have the following parts:
    appropriate for either the Python language reference or the
    standard library reference.

+9. How to Teach This -- For a PEP that adds new functionality or changes
+   language behavior, it is helpful to include a section on how to
+   teach users, new and experienced, how to apply the PEP to their
+   work.
+
+   This section may include key points and recommended documentation
+   changes that would help users adopt a new feature or migrate their
+   code to use a language change.
+
 PEP Formats and Templates
 =========================

@@ -354,10 +393,8 @@ ReStructuredText_ allows for rich markup that is still quite easy to
 read, but also results in good-looking and functional HTML.  PEP 12
 contains instructions and a template [4]_ for reStructuredText PEPs.

-A Python script automatically converts PEPs to HTML for viewing on
-the web [5]_.  The conversion of reStructuredText PEPs is handled by
-the Docutils_ module; the same script also renders a legacy plain-text
-format of PEP internally, to support pre-reST documents.
+The PEP text files are automatically converted to HTML [5]_ for easier
+`online reading <https://www.python.org/dev/peps/>`__.

 PEP Header Preamble

@@ -372,7 +409,7 @@ optional and are described below.  All other headers are required. ::
     Author: <list of authors' real names and optionally, email addrs>
   * BDFL-Delegate: <PEP czar's real name>
   * Discussions-To: <email address>
-    Status: <Draft | Active | Accepted | Deferred | Rejected |
+    Status: <Draft | Active | Accepted | Provisional | Deferred | Rejected |
             Withdrawn | Final | Superseded>
     Type: <Standards Track | Informational | Process>
   * Content-Type: <text/x-rst | text/plain>
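For concreteness, a preamble using the new "Provisional" status might begin
like this (the PEP number, author, and dates are invented for illustration)::

    PEP: 9999
    Title: An Example Provisional Feature
    Author: Jane Doe <jane.doe@example.com>
    Status: Provisional
    Type: Standards Track
    Content-Type: text/x-rst
    Created: 19-Jul-2018
    Python-Version: 3.8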

@@ -441,8 +478,8 @@ Standards Track PEPs will typically have a Python-Version header which
 indicates the version of Python that the feature will be released with.
 Standards Track PEPs without a Python-Version header indicate
 interoperability standards that will initially be supported through
-external libraries and tools, and then supplemented by a later PEP to
-add support to the standard library.  Informational and Process PEPs do
+external libraries and tools, and then potentially supplemented by a later PEP
+to add support to the standard library.  Informational and Process PEPs do
 not need a Python-Version header.

 PEPs may have a Requires header, indicating the PEP numbers that this

@@ -458,11 +495,15 @@ obsolete.
 Auxiliary Files
 ===============

-PEPs may include auxiliary files such as diagrams.  Such files must be
+PEPs may include auxiliary files such as diagrams.  Such files should be
 named ``pep-XXXX-Y.ext``, where "XXXX" is the PEP number, "Y" is a
 serial number (starting at 1), and "ext" is replaced by the actual
 file extension (e.g. "png").

+Alternatively, all support files may be placed in a subdirectory called
+``pep-XXXX``, where "XXXX" is the PEP number.  When using a subdirectory, there
+are no constraints on the names used in files.
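Expressed as a pattern (this regular expression is inferred from the prose
above for illustration; the PEP itself defines no such regex), the
single-file convention is::

    import re

    # pep-XXXX-Y.ext: four-digit PEP number, serial number, extension.
    AUX_FILE = re.compile(r'^pep-(\d{4})-(\d+)\.(\w+)$')

    for name in ('pep-0001-1.png', 'pep-9999-2.svg', 'diagram.png'):
        print(name, bool(AUX_FILE.match(name)))
    # -> pep-0001-1.png True, pep-9999-2.svg True, diagram.png False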

 Reporting PEP Bugs, or Submitting PEP Updates
 =============================================

@@ -472,15 +513,15 @@ factors, such as the maturity of the PEP, the preferences of the PEP
 author, and the nature of your comments.  For the early draft stages
 of the PEP, it's probably best to send your comments and changes
 directly to the PEP author.  For more mature, or finished PEPs you may
-want to submit corrections to the Python `issue tracker`_ so that your
-changes don't get lost.  If the PEP author is a Python developer, assign the
-bug/patch to them, otherwise assign it to a PEP editor.
+want to submit corrections as a `GitHub issue`_ or `GitHub pull request`_ so that
+your changes don't get lost.

 When in doubt about where to send your changes, please check first
 with the PEP author and/or a PEP editor.

 PEP authors with git push privileges for the PEP repository can update the
-PEPs themselves by using "git push" to submit their changes.
+PEPs themselves by using "git push" or the GitHub PR interface to submit their
+changes.

 Transferring PEP Ownership

@@ -600,11 +641,9 @@ References and Footnotes
 .. [4] PEP 12, Sample reStructuredText PEP Template, Goodger, Warsaw
        (http://www.python.org/dev/peps/pep-0012)

-.. [5] The script referred to here is pep2pyramid.py, the successor to
-       pep2html.py, both of which live in the same directory in the hg
-       repo as the PEPs themselves.  Try ``pep2html.py --help`` for
-       details.  The URL for viewing PEPs on the web is
-       http://www.python.org/dev/peps/.
+.. [5] More details on the PEP rendering and publication process can be found
+       in the PEPs repo README at
+       https://github.com/python/peps/blob/master/README.rst

 .. _issue tracker:
    http://bugs.python.org/

@@ -619,6 +658,8 @@ References and Footnotes
 .. _`GitHub pull request`: https://github.com/python/peps/pulls

+.. _`GitHub issue`: https://github.com/python/peps/issues
+
 Copyright
 =========

pep-0001/process_flow.svg (new file, 580 lines; renders at 27 KiB)

@@ -0,0 +1,580 @@
[SVG source not reproduced here: this is the Inkscape-generated source of the
PEP process flow diagram (518 x 230 pt), drawing labeled boxes for the
statuses Draft, Provisional, Accepted, Rejected, Withdrawn, Deferred, Final,
Active, and Replaced, connected by arrows for the permitted transitions.]


pep-0008.txt

@@ -68,7 +68,7 @@ Some other good reasons to ignore a particular guideline:
    Python that don't support the feature recommended by the style guide.

-Code lay-out
+Code Lay-out
 ============

 Indentation

@@ -179,7 +179,6 @@ starts the multiline construct, as in::
     'd', 'e', 'f',
     )
-
 Tabs or Spaces?
 ---------------

@@ -198,7 +197,6 @@ the ``-t`` option, it issues warnings about code that illegally mixes
 tabs and spaces.  When using ``-tt`` these warnings become errors.
 These options are highly recommended!
-
 Maximum Line Length
 -------------------

@@ -249,8 +247,7 @@ Another such case is with ``assert`` statements.
 Make sure to indent the continued line appropriately.
-
-Should a line break before or after a binary operator?
+Should a Line Break Before or After a Binary Operator?
 ------------------------------------------------------

 For decades the recommended style was to break after binary operators.

@@ -287,7 +284,6 @@ In Python code, it is permissible to break before or after a binary
 operator, as long as the convention is consistent locally.  For new
 code Knuth's style is suggested.
-
 Blank Lines
 -----------

@@ -309,7 +305,6 @@ you may use them to separate pages of related sections of your file.
 Note, some editors and web-based code viewers may not recognize
 control-L as a form feed and will show another glyph in its place.
-
 Source File Encoding
 --------------------

@@ -333,17 +328,16 @@ technical terms are used which aren't English). In addition, string
 literals and comments must also be in ASCII.  The only exceptions are
 (a) test cases testing the non-ASCII features, and
 (b) names of authors.  Authors whose names are not based on the
     Latin alphabet (latin-1, ISO/IEC 8859-1 character set) MUST provide
     a transliteration of their names in this character set.

 Open source projects with a global audience are encouraged to adopt a
 similar policy.
-
 Imports
 -------

-- Imports should usually be on separate lines, e.g.::
+- Imports should usually be on separate lines::

       Yes: import os
            import sys

@@ -359,9 +353,9 @@ Imports
   Imports should be grouped in the following order:

-  1. standard library imports
-  2. related third party imports
-  3. local application/library specific imports
+  1. Standard library imports.
+  2. Related third party imports.
+  3. Local application/library specific imports.

   You should put a blank line between each group of imports.
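Taken together, a conforming import section might look like the following
sketch (the third-party and local module names are placeholders, not part of
PEP 8 itself)::

    # Standard library imports.
    import os
    import sys

    # Related third party imports.
    import requests

    # Local application/library specific imports.
    from myapp import utils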

@@ -370,16 +364,16 @@ Imports
   messages) if the import system is incorrectly configured (such as
   when a directory inside a package ends up on ``sys.path``)::

       import mypkg.sibling
       from mypkg import sibling
       from mypkg.sibling import example

   However, explicit relative imports are an acceptable alternative to
   absolute imports, especially when dealing with complex package layouts
   where using absolute imports would be unnecessarily verbose::

       from . import sibling
       from .sibling import example

   Standard library code should avoid complex package layouts and always
   use absolute imports.

@@ -393,7 +387,7 @@ Imports
       from myclass import MyClass
       from foo.bar.yourclass import YourClass

-  If this spelling causes local name clashes, then spell them ::
+  If this spelling causes local name clashes, then spell them explicitly::

       import myclass
       import foo.bar.yourclass

@@ -412,8 +406,7 @@ Imports
   When republishing names this way, the guidelines below regarding
   public and internal interfaces still apply.
-
-Module level dunder names
+Module Level Dunder Names
 -------------------------

 Module level "dunders" (i.e. names with two leading and two trailing

@@ -421,9 +414,7 @@ underscores) such as ``__all__``, ``__author__``, ``__version__``,
 etc. should be placed after the module docstring but before any import
 statements *except* ``from __future__`` imports.  Python mandates that
 future-imports must appear in the module before any other code except
-docstrings.
-
-For example::
+docstrings::

     """This is the example module.

@@ -524,7 +515,6 @@ Avoid extraneous whitespace in the following situations:
      y = 2
      long_variable = 3
-
 Other Recommendations
 ---------------------

@@ -642,7 +632,8 @@ Other Recommendations
   if foo == 'blah': one(); two(); three()

-When to use trailing commas
+
+When to Use Trailing Commas
 ===========================

 Trailing commas are usually optional, except they are mandatory when

@@ -690,7 +681,7 @@ Comments that contradict the code are worse than no comments.  Always
 make a priority of keeping the comments up-to-date when the code
 changes!

 Comments should be complete sentences.  The first word should be
 capitalized, unless it is an identifier that begins with a lower case
 letter (never alter the case of identifiers!).

@@ -748,7 +739,7 @@ Conventions for writing good documentation strings
 - PEP 257 describes good docstring conventions.  Note that most
   importantly, the ``"""`` that ends a multiline docstring should be
-  on a line by itself, e.g.::
+  on a line by itself::

       """Return a foobang

@@ -882,18 +873,18 @@ Note that there is a separate convention for builtin names: most builtin
 names are single words (or two words run together), with the CapWords
 convention used only for exception names and builtin constants.

-Type variable names
+Type Variable Names
 ~~~~~~~~~~~~~~~~~~~

 Names of type variables introduced in PEP 484 should normally use CapWords
 preferring short names: ``T``, ``AnyStr``, ``Num``.  It is recommended to add
 suffixes ``_co`` or ``_contra`` to the variables used to declare covariant
-or contravariant behavior correspondingly.  Examples::
+or contravariant behavior correspondingly::

     from typing import TypeVar

     VT_co = TypeVar('VT_co', covariant=True)
     KT_contra = TypeVar('KT_contra', contravariant=True)

 Exception Names
 ~~~~~~~~~~~~~~~

@@ -914,7 +905,7 @@ older convention of prefixing such globals with an underscore (which
 you might want to do to indicate these globals are "module
 non-public").

-Function and variable names
+Function and Variable Names
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~

 Function names should be lowercase, with words separated by

@@ -926,7 +917,7 @@ mixedCase is allowed only in contexts where that's already the
 prevailing style (e.g. threading.py), to retain backwards
 compatibility.

-Function and method arguments
+Function and Method Arguments
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 Always use ``self`` for the first argument to instance methods.

@@ -966,7 +957,7 @@ Constants are usually defined on a module level and written in all
 capital letters with underscores separating words.  Examples include
 ``MAX_OVERFLOW`` and ``TOTAL``.

-Designing for inheritance
+Designing for Inheritance
 ~~~~~~~~~~~~~~~~~~~~~~~~~

 Always decide whether a class's methods and instance variables

@@ -975,7 +966,7 @@ doubt, choose non-public; it's easier to make it public later than to
 make a public attribute non-public.

 Public attributes are those that you expect unrelated clients of your
-class to use, with your commitment to avoid backward incompatible
+class to use, with your commitment to avoid backwards incompatible
 changes.  Non-public attributes are those that are not intended to be
 used by third parties; you make no guarantees that non-public
 attributes won't change or even be removed.

@@ -1041,8 +1032,7 @@ With this in mind, here are the Pythonic guidelines:
   need to avoid accidental name clashes with potential use by
   advanced callers.
-
-Public and internal interfaces
+Public and Internal Interfaces
 ------------------------------

 Any backwards compatibility guarantees apply only to public interfaces.

@@ -1180,9 +1170,7 @@ Programming Recommendations
   continuation characters thanks to the containing parentheses.

 - When catching exceptions, mention specific exceptions whenever
-  possible instead of using a bare ``except:`` clause.
-
-  For example, use::
+  possible instead of using a bare ``except:`` clause::

       try:
           import platform_specific_module

@@ -1250,17 +1238,16 @@ Programming Recommendations
 - Context managers should be invoked through separate functions or methods
   whenever they do something other than acquire and release resources.
-  For example:

   Yes::

       with conn.begin_transaction():
           do_stuff_in_transaction(conn)

   No::

       with conn:
           do_stuff_in_transaction(conn)

   The latter example doesn't provide any information to indicate that
   the ``__enter__`` and ``__exit__`` methods are doing something other

@@ -1301,14 +1288,13 @@ Programming Recommendations
 - Use string methods instead of the string module.

   String methods are always much faster and share the same API with
-  unicode strings.  Override this rule if backward compatibility with
+  unicode strings.  Override this rule if backwards compatibility with
   Pythons older than 2.0 is required.

 - Use ``''.startswith()`` and ``''.endswith()`` instead of string
   slicing to check for prefixes or suffixes.

-  startswith() and endswith() are cleaner and less error prone.  For
-  example::
+  startswith() and endswith() are cleaner and less error prone::

       Yes: if foo.startswith('bar'):
       No:  if foo[:3] == 'bar':
@ -1328,7 +1314,7 @@ Programming Recommendations
Note that in Python 3, ``unicode`` and ``basestring`` no longer exist Note that in Python 3, ``unicode`` and ``basestring`` no longer exist
(there is only ``str``) and a bytes object is no longer a kind of (there is only ``str``) and a bytes object is no longer a kind of
string (it is a sequence of integers instead) string (it is a sequence of integers instead).
- For sequences (strings, lists, tuples), use the fact that empty - For sequences (strings, lists, tuples), use the fact that empty
sequences are false. :: sequences are false. ::
@ -1336,8 +1322,8 @@ Programming Recommendations
Yes: if not seq: Yes: if not seq:
if seq: if seq:
No: if len(seq): No: if len(seq):
if not len(seq): if not len(seq):
- Don't write string literals that rely on significant trailing - Don't write string literals that rely on significant trailing
whitespace. Such trailing whitespace is visually indistinguishable whitespace. Such trailing whitespace is visually indistinguishable
@ -1375,7 +1361,7 @@ annotations are changing.
- For code that wants to make a different use of function annotations - For code that wants to make a different use of function annotations
it is recommended to put a comment of the form:: it is recommended to put a comment of the form::
# type: ignore # type: ignore
near the top of the file; this tells the type checker to ignore all near the top of the file; this tells the type checker to ignore all
annotations. (More fine-grained ways of disabling complaints from annotations. (More fine-grained ways of disabling complaints from
@ -1397,7 +1383,7 @@ annotations are changing.
can be added in the form of comments. See the relevant section of can be added in the form of comments. See the relevant section of
PEP 484 [6]_. PEP 484 [6]_.
Variable annotations Variable Annotations
-------------------- --------------------
PEP 526 introduced variable annotations. The style recommendations for them are PEP 526 introduced variable annotations. The style recommendations for them are
@ -1413,19 +1399,19 @@ similar to those on function annotations described above:
- Yes:: - Yes::
code: int code: int
class Point: class Point:
coords: Tuple[int, int] coords: Tuple[int, int]
label: str = '<unknown>' label: str = '<unknown>'
- No:: - No::
code:int # No space after colon code:int # No space after colon
code : int # Space before colon code : int # Space before colon
class Test: class Test:
result: int=0 # No spaces around equality sign result: int=0 # No spaces around equality sign
- Although PEP 526 is accepted for Python 3.6, the variable annotation - Although PEP 526 is accepted for Python 3.6, the variable annotation
syntax is the preferred syntax for stub files on all versions of Python syntax is the preferred syntax for stub files on all versions of Python
@ -1460,7 +1446,6 @@ References
https://www.python.org/dev/peps/pep-0484/#suggested-syntax-for-python-2-7-and-straddling-code https://www.python.org/dev/peps/pep-0484/#suggested-syntax-for-python-2-7-and-straddling-code
Copyright Copyright
========= =========
View File
@ -8,6 +8,7 @@ Type: Informational
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 22-Aug-2001 Created: 22-Aug-2001
Post-History: Post-History:
Replaces: 102
Abstract Abstract
@ -158,7 +159,7 @@ to perform some manual editing steps.
- Check the stable buildbots. - Check the stable buildbots.
Go to http://buildbot.python.org/all/waterfall Go to http://buildbot.python.org/all/#/grid
Look at the buildbots for the release Look at the buildbots for the release
you're making. Ignore any that are offline (or inform the community so you're making. Ignore any that are offline (or inform the community so
View File
@ -72,7 +72,7 @@ or::
More precisely, the first or second line must match the following More precisely, the first or second line must match the following
regular expression:: regular expression::
^[ \t\v]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+) ^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)
The first group of this The first group of this
expression is then interpreted as encoding name. If the encoding expression is then interpreted as encoding name. If the encoding
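A quick way to see the corrected character class (form feed ``\f`` rather than vertical tab ``\v``) in action, using the regular expression exactly as given above::

    import re

    CODING_RE = re.compile(r'^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)')

    match = CODING_RE.match('# -*- coding: utf-8 -*-')
    assert match is not None and match.group(1) == 'utf-8'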
View File
@ -9,6 +9,7 @@ Content-Type: text/x-rst
Created: 18-Jun-2002 Created: 18-Jun-2002
Python-Version: 2.4 Python-Version: 2.4
Post-History: 18-Jun-2002, 23-Mar-2004, 22-Aug-2004 Post-History: 18-Jun-2002, 23-Mar-2004, 22-Aug-2004
Replaces: 215
Abstract Abstract
View File
@ -9,6 +9,21 @@ Content-Type: text/x-rst
Created: 22-Jan-2003 Created: 22-Jan-2003
Post-History: 27-Jan-2003, 31-Jan-2003, 17-Jun-2005 Post-History: 27-Jan-2003, 31-Jan-2003, 17-Jun-2005
Historical Note
===============
While this original PEP was withdrawn, a variant of this feature
was eventually implemented for Python 3.8 in https://bugs.python.org/issue33499
Several of the issues and concerns originally raised in this PEP were resolved
by other changes in the intervening years:
- the introduction of isolated mode to handle potential security concerns
- the switch to ``importlib``, a fully import-hook based import system implementation
- PEP 3147's change in the bytecode cache layout to use ``__pycache__``
subdirectories, including the ``source_to_cache(path)`` and
``cache_to_source(path)`` APIs that allow the interpreter to automatically
handle the redirection to a separate cache directory
Abstract Abstract
======== ========
View File
@ -4,7 +4,7 @@ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Richard Jones <richard@python.org> Author: Richard Jones <richard@python.org>
Discussions-To: Distutils SIG Discussions-To: Distutils SIG
Status: Accepted Status: Final
Type: Standards Track Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 28-Apr-2005 Created: 28-Apr-2005
View File
@ -50,13 +50,12 @@ Maintenance releases
==================== ====================
Being the last of the 2.x series, 2.7 will have an extended period of Being the last of the 2.x series, 2.7 will have an extended period of
maintenance. The current plan is to support it for at least 10 years maintenance. Specifically, 2.7 will receive bugfix support until
from the initial 2.7 release. This means there will be bugfix releases January 1, 2020. All 2.7 development work will cease in 2020.
until 2020.
Planned future release dates: Planned future release dates:
- 2.7.15 2018 - 2.7.16 late 2018 - early 2019
Dates of previous maintenance releases: Dates of previous maintenance releases:
@ -84,6 +83,8 @@ Dates of previous maintenance releases:
- 2.7.13 2016-12-17 - 2.7.13 2016-12-17
- 2.7.14rc1 2017-08-26 - 2.7.14rc1 2017-08-26
- 2.7.14 2017-09-16 - 2.7.14 2017-09-16
- 2.7.15rc1 2018-04-14
- 2.7.15 2018-05-01
2.7.0 Release Schedule 2.7.0 Release Schedule
====================== ======================
View File
@ -3,7 +3,7 @@ Title: Database of Installed Python Distributions
Version: $Revision$ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Tarek Ziadé <tarek@ziade.org> Author: Tarek Ziadé <tarek@ziade.org>
Status: Accepted Status: Final
Type: Standards Track Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 22-Feb-2009 Created: 22-Feb-2009
@ -152,7 +152,7 @@ This `.dist-info` directory can contain these files:
- `RECORD`: records the list of installed files - `RECORD`: records the list of installed files
- `INSTALLER`: records the name of the tool used to install the project - `INSTALLER`: records the name of the tool used to install the project
- `REQUESTED`: the presence of this file indicates that the project - `REQUESTED`: the presence of this file indicates that the project
installation was explicitly requested (i.e., not installed as a dependency). installation was explicitly requested (i.e., not installed as a dependency).
The METADATA, RECORD and INSTALLER files are mandatory, while REQUESTED may The METADATA, RECORD and INSTALLER files are mandatory, while REQUESTED may
be missing. be missing.
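A sketch of how an installing or introspecting tool might consult the ``REQUESTED`` marker (the helper name is illustrative; only the existence check is specified)::

    from pathlib import Path

    def was_explicitly_requested(dist_info: Path) -> bool:
        # REQUESTED is a pure marker: its presence, not its content,
        # records that the user asked for this project directly.
        return (dist_info / "REQUESTED").exists()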
View File
@ -144,7 +144,7 @@ specifiers like::
Java offers a `Decimal.Format Class`_ that uses picture patterns (one Java offers a `Decimal.Format Class`_ that uses picture patterns (one
for positive numbers and an optional one for negatives) such as: for positive numbers and an optional one for negatives) such as:
``"#,##0.00;(#,##0.00)"``. It allows arbitrary groupings including ``"#,##0.00;(#,##0.00)"``. It allows arbitrary groupings including
hundreds and ten-thousands and uneven groupings. The special patten hundreds and ten-thousands and uneven groupings. The special pattern
characters are non-localized (using a DOT for a decimal separator and characters are non-localized (using a DOT for a decimal separator and
a COMMA for a grouping separator). The user can supply an alternate a COMMA for a grouping separator). The user can supply an alternate
set of symbols using the formatter's *DecimalFormatSymbols* object. set of symbols using the formatter's *DecimalFormatSymbols* object.
View File
@ -52,7 +52,7 @@ be removed at any time in any way. These include:
- Function, class, module, attribute, method, and C-API names and types that - Function, class, module, attribute, method, and C-API names and types that
are prefixed by "_" (except special names). The contents of these are prefixed by "_" (except special names). The contents of these
can also are not subject to the policy. are also not subject to the policy.
- Inheritance patterns of internal classes. - Inheritance patterns of internal classes.
View File
@ -4,12 +4,13 @@ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Kerrick Staley <mail@kerrickstaley.com>, Author: Kerrick Staley <mail@kerrickstaley.com>,
Nick Coghlan <ncoghlan@gmail.com>, Nick Coghlan <ncoghlan@gmail.com>,
Barry Warsaw <barry@python.org> Barry Warsaw <barry@python.org>,
Petr Viktorin <encukou@gmail.com>
Status: Active Status: Active
Type: Informational Type: Informational
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 02-Mar-2011 Created: 02-Mar-2011
Post-History: 04-Mar-2011, 20-Jul-2011, 16-Feb-2012, 30-Sep-2014 Post-History: 04-Mar-2011, 20-Jul-2011, 16-Feb-2012, 30-Sep-2014, 28-Apr-2018
Resolution: https://mail.python.org/pipermail/python-dev/2012-February/116594.html Resolution: https://mail.python.org/pipermail/python-dev/2012-February/116594.html
@ -22,8 +23,9 @@ Python interpreter (i.e. the version invoked by the ``python`` command).
* ``python2`` will refer to some version of Python 2.x. * ``python2`` will refer to some version of Python 2.x.
* ``python3`` will refer to some version of Python 3.x. * ``python3`` will refer to some version of Python 3.x.
* for the time being, all distributions *should* ensure that ``python`` * for the time being, all distributions *should* ensure that ``python``,
refers to the same target as ``python2``. if installed, refers to the same target as ``python2``, unless the user
deliberately overrides this or a virtual environment is active.
* however, end users should be aware that ``python`` refers to ``python3`` * however, end users should be aware that ``python`` refers to ``python3``
on at least Arch Linux (that change is what prompted the creation of this on at least Arch Linux (that change is what prompted the creation of this
PEP), so ``python`` should be used in the shebang line only for scripts PEP), so ``python`` should be used in the shebang line only for scripts
@ -43,8 +45,7 @@ Recommendation
* When invoked, ``python2`` should run some version of the Python 2 * When invoked, ``python2`` should run some version of the Python 2
interpreter, and ``python3`` should run some version of the Python 3 interpreter, and ``python3`` should run some version of the Python 3
interpreter. interpreter.
* The more general ``python`` command should be installed whenever * If the ``python`` command is installed, it should invoke the same version of
any version of Python 2 is installed and should invoke the same version of
Python as the ``python2`` command (however, note that some distributions Python as the ``python2`` command (however, note that some distributions
have already chosen to have ``python`` implement the ``python3`` have already chosen to have ``python`` implement the ``python3``
command; see the `Rationale`_ and `Migration Notes`_ below). command; see the `Rationale`_ and `Migration Notes`_ below).
@ -62,14 +63,30 @@ Recommendation
context. context.
* One exception to this is scripts that are deliberately written to be source * One exception to this is scripts that are deliberately written to be source
compatible with both Python 2.x and 3.x. Such scripts may continue to use compatible with both Python 2.x and 3.x. Such scripts may continue to use
``python`` on their shebang line without affecting their portability. ``python`` on their shebang line.
* When packaging software that is source compatible with both versions,
distributions may change such ``python`` shebangs to ``python3``.
This ensures software is used with the latest version of
Python available, and it can remove a dependency on Python 2.
* When reinvoking the interpreter from a Python script, querying * When reinvoking the interpreter from a Python script, querying
``sys.executable`` to avoid hardcoded assumptions regarding the ``sys.executable`` to avoid hardcoded assumptions regarding the
interpreter location remains the preferred approach (see the sketch after this list). interpreter location remains the preferred approach (see the sketch after this list).
* In controlled environments aimed at expert users, where being explicit
is valued over user experience (for example, in test environments and
package build systems), distributions may choose to not provide the
``python`` command even if ``python2`` is available.
(All software in such a controlled environment must use ``python3`` or
``python2`` rather than ``python``, which means scripts that deliberately
use ``python`` need to be modified for such environments.)
* When a virtual environment (created by the PEP 405 ``venv`` package or a
similar tool) is active, the ``python`` command should refer to the
virtual environment's interpreter. In other words, activating a virtual
environment counts as deliberate user action to change the default
``python`` interpreter.
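A minimal sketch of the ``sys.executable`` recommendation from the list above (the module name passed to ``-m`` is illustrative)::

    import subprocess
    import sys

    # Re-invoke the exact interpreter that is running this script,
    # rather than whatever "python" happens to resolve to on PATH.
    subprocess.run([sys.executable, "-m", "mytool.worker"], check=True)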
These recommendations are the outcome of the relevant python-dev discussions These recommendations are the outcome of the relevant python-dev discussions
in March and July 2011 ([1]_, [2]_), February 2012 ([4]_) and in March and July 2011 ([1]_, [2]_), February 2012 ([4]_),
September 2014 ([6]_). September 2014 ([6]_), and discussion on GitHub in April 2018 ([7]_).
Rationale Rationale
@ -91,11 +108,6 @@ on the part of distribution maintainers.
Future Changes to this Recommendation Future Changes to this Recommendation
===================================== =====================================
It is anticipated that there will eventually come a time where the third
party ecosystem surrounding Python 3 is sufficiently mature for this
recommendation to be updated to suggest that the ``python`` symlink
refer to ``python3`` rather than ``python2``.
This recommendation will be periodically reviewed over the next few years, This recommendation will be periodically reviewed over the next few years,
and updated when the core development team judges it appropriate. As a and updated when the core development team judges it appropriate. As a
point of reference, regular maintenance releases for the Python 2.7 series point of reference, regular maintenance releases for the Python 2.7 series
@ -150,15 +162,13 @@ making such a change.
* When the ``pythonX.X`` binaries are provided by a distribution, the * When the ``pythonX.X`` binaries are provided by a distribution, the
``python2`` and ``python3`` commands should refer to one of those files ``python2`` and ``python3`` commands should refer to one of those files
rather than being provided as a separate binary file. rather than being provided as a separate binary file.
* It is suggested that even distribution-specific packages follow the * It is strongly encouraged that distribution-specific packages use ``python2``
``python2``/``python3`` convention, even in code that is not intended to or ``python3`` rather than ``python``, even in code that is not intended to
operate on other distributions. This will reduce problems if the operate on other distributions. This will reduce problems if the
distribution later decides to change the version of the Python interpreter distribution later decides to change the version of the Python interpreter
that the ``python`` command invokes, or if a sysadmin installs a custom that the ``python`` command invokes, or if a sysadmin installs a custom
``python`` command with a different major version than the distribution ``python`` command with a different major version than the distribution
default. Distributions can test whether they are fully following this default.
convention by changing the ``python`` interpreter on a test box and checking
to see if anything breaks.
* If the above point is adhered to and sysadmins are permitted to change the * If the above point is adhered to and sysadmins are permitted to change the
``python`` command, then the ``python`` command should always be implemented ``python`` command, then the ``python`` command should always be implemented
as a link to the interpreter binary (or a link to a link) and not vice as a link to the interpreter binary (or a link to a link) and not vice
@ -267,6 +277,10 @@ References
.. [6] PEP 394 - Clarification of what "python" command should invoke .. [6] PEP 394 - Clarification of what "python" command should invoke
(https://mail.python.org/pipermail/python-dev/2014-September/136374.html) (https://mail.python.org/pipermail/python-dev/2014-September/136374.html)
.. [7] PEP 394: Allow the `python` command to not be installed, and other
minor edits
(https://github.com/python/peps/pull/630)
Copyright Copyright
=========== ===========
This document has been placed in the public domain. This document has been placed in the public domain.
View File
@ -4,7 +4,7 @@ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan@gmail.com>, Author: Nick Coghlan <ncoghlan@gmail.com>,
Eli Bendersky <eliben@gmail.com> Eli Bendersky <eliben@gmail.com>
Status: Accepted Status: Active
Type: Informational Type: Informational
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 2012-02-10 Created: 2012-02-10
View File
@ -4,7 +4,7 @@ Version: $Revision$
Last-Modified: 07-Aug-2012 Last-Modified: 07-Aug-2012
Author: Daniel Holth <dholth@gmail.com> Author: Daniel Holth <dholth@gmail.com>
BDFL-Delegate: Nick Coghlan <ncoghlan@gmail.com> BDFL-Delegate: Nick Coghlan <ncoghlan@gmail.com>
Status: Accepted Status: Final
Type: Standards Track Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 27-Jul-2012 Created: 27-Jul-2012
View File
@ -7,7 +7,7 @@ Author: Nick Coghlan <ncoghlan@gmail.com>,
Donald Stufft <donald@stufft.io> Donald Stufft <donald@stufft.io>
BDFL-Delegate: Donald Stufft <donald@stufft.io> BDFL-Delegate: Donald Stufft <donald@stufft.io>
Discussions-To: Distutils SIG <distutils-sig@python.org> Discussions-To: Distutils SIG <distutils-sig@python.org>
Status: Deferred Status: Withdrawn
Type: Informational Type: Informational
Content-Type: text/x-rst Content-Type: text/x-rst
Requires: 440, 508, 518 Requires: 440, 508, 518
@ -18,6 +18,20 @@ Post-History: 14 Nov 2012, 5 Feb 2013, 7 Feb 2013, 9 Feb 2013,
Replaces: 345 Replaces: 345
PEP Withdrawal
==============
The ground-up metadata redesign proposed in this PEP has been withdrawn in
favour of the more modest proposal in PEP 566, which retains the basic
Key:Value format of previous metadata versions, but also defines a standardised
mechanism for translating that format to nested JSON-compatible data structures.
Some of the ideas in this PEP (or the related PEP 459) may still be considered
as part of later proposals, but they will be handled in a more incremental
fashion, rather than as a single large proposed change with no feasible
migration plan.
Abstract Abstract
======== ========
@ -25,16 +39,13 @@ This PEP describes a mechanism for publishing and exchanging metadata
related to Python distributions. It includes specifics of the field names, related to Python distributions. It includes specifics of the field names,
and their semantics and usage. and their semantics and usage.
This document specifies version 3.0 of the metadata format. This document specifies the never-released version 2.0 of the metadata format.
Version 1.0 is specified in PEP 241. Version 1.0 is specified in PEP 241.
Version 1.1 is specified in PEP 314. Version 1.1 is specified in PEP 314.
Version 1.2 is specified in PEP 345. Version 1.2 is specified in PEP 345.
Version 2.0 is specified in earlier drafts of this PEP and was never formally Version 2.0 of the metadata format proposed migrating from directly defining a
approved for use.
Version 3.0 of the metadata format migrates from directly defining a
custom key-value file format to instead defining a JSON-compatible in-memory custom key-value file format to instead defining a JSON-compatible in-memory
representation that may be used to define metadata representation in other representation that may be used to define metadata representation in other
contexts (such as API and archive format definitions). contexts (such as API and archive format definitions).
@ -44,8 +55,8 @@ fields to be added for particular purposes without requiring updates to
the core metadata format. the core metadata format.
Note on PEP Deferral Note on PEP History
==================== ===================
This PEP was initially deferred for an extended period, from December 2013 This PEP was initially deferred for an extended period, from December 2013
through to March 2017, as distutils-sig worked through a number of other through to March 2017, as distutils-sig worked through a number of other
@ -74,7 +85,7 @@ of publishing and distributing software to be moved out to PEP 459, a separate
proposal for a number of standard metadata extensions that provide additional proposal for a number of standard metadata extensions that provide additional
optional information about a release. optional information about a release.
As of September 2017, it has been deferred again, on the grounds that As of September 2017, it was deferred again, on the grounds that
it doesn't actually help solve any particularly pressing problems: it doesn't actually help solve any particularly pressing problems:
- JSON representation would be better handled through defining a - JSON representation would be better handled through defining a
@ -87,6 +98,9 @@ it doesn't actually help solve any particularly pressing problems:
.. _specifications: https://packaging.python.org/specifications/ .. _specifications: https://packaging.python.org/specifications/
.. _minor spec version update: https://mail.python.org/pipermail/distutils-sig/2017-September/031465.html .. _minor spec version update: https://mail.python.org/pipermail/distutils-sig/2017-September/031465.html
Finally, the PEP was withdrawn in February 2018 in favour of PEP 566 (which
pursues that more incremental strategy).
Purpose Purpose
======= =======
@ -391,7 +405,7 @@ binary archive from a source archive.
These locations are to be confirmed, since they depend on the definition These locations are to be confirmed, since they depend on the definition
of sdist 2.0 and the revised installation database standard. There will of sdist 2.0 and the revised installation database standard. There will
also be a wheel 1.1 format update after this PEP is approved that also be a wheel 1.1 format update after this PEP is approved that
mandates provision of 3.0+ metadata. mandates provision of 2.0+ metadata.
Note that these metadata files MAY be processed even if the version of the Note that these metadata files MAY be processed even if the version of the
containing location is too low to indicate that they are valid. Specifically, containing location is too low to indicate that they are valid. Specifically,
@ -414,7 +428,7 @@ used directly as a data input format. Generating the metadata as part of the
publication process also helps to deal with version specific fields (including publication process also helps to deal with version specific fields (including
the source URL and the version field itself). the source URL and the version field itself).
For backwards compatibility with older installation tools, metadata 3.0 For backwards compatibility with older installation tools, metadata 2.0
files MAY be distributed alongside legacy metadata. files MAY be distributed alongside legacy metadata.
Index servers MAY allow distributions to be uploaded and installation tools Index servers MAY allow distributions to be uploaded and installation tools
@ -443,8 +457,8 @@ with RFC 3986.
The current version of the schema file covers the previous draft of the The current version of the schema file covers the previous draft of the
PEP, and has not yet been updated for the split into the essential PEP, and has not yet been updated for the split into the essential
dependency resolution metadata and multiple standard extensions, nor dependency resolution metadata and multiple standard extensions, nor
has it been updated for the various other differences between the 3.0 has it been updated for the various other differences between the current
draft and the earlier 2.0 drafts. draft and the earlier drafts.
Core metadata Core metadata
@ -467,7 +481,7 @@ installation to occur.
Metadata version Metadata version
---------------- ----------------
Version of the file format; ``"3.0"`` is the only legal value. Version of the file format; ``"2.0"`` is the only legal value.
Automated tools consuming metadata SHOULD warn if ``metadata_version`` is Automated tools consuming metadata SHOULD warn if ``metadata_version`` is
greater than the highest version they support, and MUST fail if greater than the highest version they support, and MUST fail if
@ -481,7 +495,7 @@ all of the needed fields.
Example:: Example::
"metadata_version": "3.0" "metadata_version": "2.0"
Generator Generator
@ -1046,7 +1060,7 @@ Appendix A: Conversion notes for legacy metadata
================================================ ================================================
The reference implementations for converting from legacy metadata to The reference implementations for converting from legacy metadata to
metadata 3.0 are: metadata 2.0 are:
* the `wheel project <https://bitbucket.org/dholth/wheel/overview>`__, which * the `wheel project <https://bitbucket.org/dholth/wheel/overview>`__, which
adds the ``bdist_wheel`` command to ``setuptools`` adds the ``bdist_wheel`` command to ``setuptools``
@ -1114,7 +1128,7 @@ format.
Appendix C: Summary of differences from \PEP 345 Appendix C: Summary of differences from \PEP 345
================================================= =================================================
* Metadata-Version is now 3.0, with semantics specified for handling * Metadata-Version is now 2.0, with semantics specified for handling
version changes version changes
* The increasingly complex ad hoc "Key: Value" format has been replaced by * The increasingly complex ad hoc "Key: Value" format has been replaced by
@ -1175,7 +1189,7 @@ provision of multiple versions of the metadata in parallel.
Existing tools won't abide by this guideline until they're updated to Existing tools won't abide by this guideline until they're updated to
support the new metadata standard, so the new semantics will first take support the new metadata standard, so the new semantics will first take
effect for a hypothetical 2.x -> 3.0 transition. For the 1.x -> 3.0 effect for a hypothetical 2.x -> 3.0 transition. For the 1.x -> 2.x
transition, we will use the approach where tools continue to produce the transition, we will use the approach where tools continue to produce the
existing supplementary files (such as ``entry_points.txt``) in addition existing supplementary files (such as ``entry_points.txt``) in addition
to any equivalents specified using the new features of the standard to any equivalents specified using the new features of the standard
@ -1283,7 +1297,7 @@ packages.
The ability to declare an extension as required is included primarily to The ability to declare an extension as required is included primarily to
allow the definition of the metadata hooks extension to be deferred until allow the definition of the metadata hooks extension to be deferred until
some time after the initial adoption of the metadata 3.0 specification. If some time after the initial adoption of the metadata 2.0 specification. If
a release needs a ``postinstall`` hook to run in order to complete a release needs a ``postinstall`` hook to run in order to complete
the installation successfully, then earlier versions of tools should fall the installation successfully, then earlier versions of tools should fall
back to installing from source rather than installing from a wheel file and back to installing from source rather than installing from a wheel file and
@ -1299,10 +1313,10 @@ order to better prioritise our efforts in migrating to the new metadata
standard. These all reflect information that may be nice to have in the standard. These all reflect information that may be nice to have in the
new metadata, but which can be readily added through metadata extensions or new metadata, but which can be readily added through metadata extensions or
in metadata 2.1 without breaking any use cases already supported by metadata in metadata 2.1 without breaking any use cases already supported by metadata
3.0. 2.0.
Once the ``pypi``, ``setuptools``, ``pip``, ``wheel`` and ``distlib`` Once the ``pypi``, ``setuptools``, ``pip``, ``wheel`` and ``distlib``
projects support creation and consumption of metadata 3.0, then we may projects support creation and consumption of metadata 2.0, then we may
revisit the creation of metadata 2.1 with some or all of these additional revisit the creation of metadata 2.1 with some or all of these additional
features. features.
@ -1484,7 +1498,7 @@ the idea won't be reconsidered until metadata 2.1 at the earliest).
References References
========== ==========
This document specifies version 3.0 of the metadata format. This document specifies version 2.0 of the metadata format.
Version 1.0 is specified in PEP 241. Version 1.0 is specified in PEP 241.
Version 1.1 is specified in PEP 314. Version 1.1 is specified in PEP 314.
Version 1.2 is specified in PEP 345. Version 1.2 is specified in PEP 345.
View File
@ -5,7 +5,7 @@ Last-Modified: $Date$
Author: Daniel Holth <dholth@gmail.com> Author: Daniel Holth <dholth@gmail.com>
BDFL-Delegate: Nick Coghlan <ncoghlan@gmail.com> BDFL-Delegate: Nick Coghlan <ncoghlan@gmail.com>
Discussions-To: <distutils-sig@python.org> Discussions-To: <distutils-sig@python.org>
Status: Accepted Status: Final
Type: Standards Track Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 20-Sep-2012 Created: 20-Sep-2012
View File
@ -61,13 +61,16 @@ The releases so far:
- 3.4.6 final: January 17, 2017 - 3.4.6 final: January 17, 2017
- 3.4.7 candidate 1: July 25, 2017 - 3.4.7 candidate 1: July 25, 2017
- 3.4.7 final: August 9, 2017 - 3.4.7 final: August 9, 2017
- 3.4.8 candidate 1: January 23, 2018
- 3.4.8 final: February 4, 2018
.. There are no currently planned releases of Python 3.4. .. There are no specific plans for the next release of Python 3.4.
Planned future releases: Planned future releases:
- 3.4.8 candidate 1: January 21, 2018 - 3.4.9 candidate 1: July 18, 2018
- 3.4.8 final: February 4, 2018 - 3.4.9 final: August 1, 2018
Features for 3.4 Features for 3.4
View File
@ -11,6 +11,7 @@ Content-Type: text/x-rst
Created: 2013-02-23 Created: 2013-02-23
Python-Version: 3.4 Python-Version: 3.4
Post-History: 2013-02-23, 2013-05-02 Post-History: 2013-02-23, 2013-05-02
Replaces: 354
Resolution: https://mail.python.org/pipermail/python-dev/2013-May/126112.html Resolution: https://mail.python.org/pipermail/python-dev/2013-May/126112.html

View File
Donald Stufft <donald@stufft.io> Donald Stufft <donald@stufft.io>
BDFL-Delegate: Nick Coghlan <ncoghlan@gmail.com> BDFL-Delegate: Nick Coghlan <ncoghlan@gmail.com>
Discussions-To: Distutils SIG <distutils-sig@python.org> Discussions-To: Distutils SIG <distutils-sig@python.org>
Status: Accepted Status: Active
Type: Informational Type: Informational
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 18 Mar 2013 Created: 18 Mar 2013
View File
@ -8,6 +8,7 @@ Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 5-August-2013 Created: 5-August-2013
Python-Version: 3.4 Python-Version: 3.4
Replaces: 433
Abstract Abstract
View File
@ -5,7 +5,7 @@ Last-Modified: $Date$
Author: Donald Stufft <donald@stufft.io> Author: Donald Stufft <donald@stufft.io>
BDFL-Delegate: Richard Jones <richard@python.org> BDFL-Delegate: Richard Jones <richard@python.org>
Discussions-To: distutils-sig@python.org Discussions-To: distutils-sig@python.org
Status: Accepted Status: Final
Type: Process Type: Process
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 04-Aug-2013 Created: 04-Aug-2013
View File
@ -5,7 +5,7 @@ Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan@gmail.com> Author: Nick Coghlan <ncoghlan@gmail.com>
BDFL-Delegate: Nick Coghlan <ncoghlan@gmail.com> BDFL-Delegate: Nick Coghlan <ncoghlan@gmail.com>
Discussions-To: Distutils SIG <distutils-sig@python.org> Discussions-To: Distutils SIG <distutils-sig@python.org>
Status: Deferred Status: Withdrawn
Type: Standards Track Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Requires: 426 Requires: 426
@ -13,6 +13,17 @@ Created: 11 Nov 2013
Post-History: 21 Dec 2013 Post-History: 21 Dec 2013
PEP Withdrawal
==============
This PEP depends on PEP 426, which has itself been withdrawn. See the
PEP Withdrawal section in that PEP for details.
In the meantime, metadata extensions will continue to be handled as they
have been for past examples like ``entry_points.txt``: as additional files
installed into metadata directories alongside the main `METADATA` file.
Abstract Abstract
======== ========
@ -22,22 +33,6 @@ Like all metadata extensions, each standard extension format is
independently versioned. Changing any of the formats requires an update independently versioned. Changing any of the formats requires an update
to this PEP, but does not require an update to the core packaging metadata. to this PEP, but does not require an update to the core packaging metadata.
PEP Deferral
============
This PEP depends on PEP 426, which has itself been deferred. See the
PEP Deferral section in that PEP for details.
.. note::
These extensions may eventually be separated out into their own PEPs,
but we're already suffering from PEP overload in the packaging
metadata space.
This PEP was initially created by slicing out large sections of earlier
drafts of PEP 426 and making them extensions, so some of the specifics
may still be rough in the new context.
Standard Extension Namespace Standard Extension Namespace
============================ ============================

View File

@ -5,7 +5,7 @@ Last-Modified: $Date$
Author: Donald Stufft <donald@stufft.io> Author: Donald Stufft <donald@stufft.io>
BDFL-Delegate: Richard Jones <richard@python.org> BDFL-Delegate: Richard Jones <richard@python.org>
Discussions-To: distutils-sig@python.org Discussions-To: distutils-sig@python.org
Status: Accepted Status: Final
Type: Process Type: Process
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 02-Mar-2014 Created: 02-Mar-2014
View File
@ -2,13 +2,13 @@ PEP: 467
Title: Minor API improvements for binary sequences Title: Minor API improvements for binary sequences
Version: $Revision$ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan@gmail.com> Author: Nick Coghlan <ncoghlan@gmail.com>, Ethan Furman <ethan@stoneleaf.us>
Status: Draft Status: Draft
Type: Standards Track Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 2014-03-30 Created: 2014-03-30
Python-Version: 3.5 Python-Version: 3.8
Post-History: 2014-03-30 2014-08-15 2014-08-16 Post-History: 2014-03-30 2014-08-15 2014-08-16 2016-06-07 2016-09-01
Abstract Abstract
@ -20,22 +20,25 @@ that is now referred to as ``bytearray``. Other aspects of operating in
the binary domain in Python have also evolved over the course of the Python the binary domain in Python have also evolved over the course of the Python
3 series. 3 series.
This PEP proposes four small adjustments to the APIs of the ``bytes``, This PEP proposes five small adjustments to the APIs of the ``bytes`` and
``bytearray`` and ``memoryview`` types to make it easier to operate entirely ``bytearray`` types to make it easier to operate entirely in the binary domain:
in the binary domain:
* Deprecate passing single integer values to ``bytes`` and ``bytearray`` * Deprecate passing single integer values to ``bytes`` and ``bytearray``
* Add ``bytes.zeros`` and ``bytearray.zeros`` alternative constructors * Add ``bytes.fromsize`` and ``bytearray.fromsize`` alternative constructors
* Add ``bytes.byte`` and ``bytearray.byte`` alternative constructors * Add ``bytes.fromord`` and ``bytearray.fromord`` alternative constructors
* Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and * Add ``bytes.getbyte`` and ``bytearray.getbyte`` byte retrieval methods
``memoryview.iterbytes`` alternative iterators * Add ``bytes.iterbytes`` and ``bytearray.iterbytes`` alternative iterators
And one built-in::
* bchr
Proposals Proposals
========= =========
Deprecation of current "zero-initialised sequence" behaviour Deprecation of current "zero-initialised sequence" behaviour without removal
------------------------------------------------------------ ----------------------------------------------------------------------------
Currently, the ``bytes`` and ``bytearray`` constructors accept an integer Currently, the ``bytes`` and ``bytearray`` constructors accept an integer
argument and interpret it as meaning to create a zero-initialised sequence argument and interpret it as meaning to create a zero-initialised sequence
@ -46,62 +49,75 @@ of the given size::
>>> bytearray(3) >>> bytearray(3)
bytearray(b'\x00\x00\x00') bytearray(b'\x00\x00\x00')
This PEP proposes to deprecate that behaviour in Python 3.5, and remove it This PEP proposes to deprecate that behaviour in Python 3.6, but to leave
entirely in Python 3.6. it in place for at least as long as Python 2.7 is supported, possibly
indefinitely.
No other changes are proposed to the existing constructors. No other changes are proposed to the existing constructors.
Addition of explicit "zero-initialised sequence" constructors Addition of explicit "count and byte initialised sequence" constructors
------------------------------------------------------------- -----------------------------------------------------------------------
To replace the deprecated behaviour, this PEP proposes the addition of an To replace the deprecated behaviour, this PEP proposes the addition of an
explicit ``zeros`` alternative constructor as a class method on both explicit ``fromsize`` alternative constructor as a class method on both
``bytes`` and ``bytearray``:: ``bytes`` and ``bytearray`` whose first argument is the count, and whose
second argument is the fill byte to use (defaults to ``\x00``)::
>>> bytes.zeros(3) >>> bytes.fromsize(3)
b'\x00\x00\x00' b'\x00\x00\x00'
>>> bytearray.zeros(3) >>> bytearray.fromsize(3)
bytearray(b'\x00\x00\x00') bytearray(b'\x00\x00\x00')
>>> bytes.fromsize(5, b'\x0a')
b'\x0a\x0a\x0a\x0a\x0a'
>>> bytearray.fromsize(5, b'\x0a')
bytearray(b'\x0a\x0a\x0a\x0a\x0a')
It will behave just as the current constructors behave when passed a single ``fromsize`` will behave just as the current constructors behave when passed a single
integer. integer, while allowing for non-zero fill values when needed.
The specific choice of ``zeros`` as the alternative constructor name is taken
from the corresponding initialisation function in NumPy (although, as these
are 1-dimensional sequence types rather than N-dimensional matrices, the
constructors take a length as input rather than a shape tuple)
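The proposed behaviour is easy to sketch in pure Python (a module-level function for illustration; the PEP proposes classmethods)::

    def fromsize(count, fill=b'\x00'):
        # A sequence of ``count`` copies of the single fill byte.
        if len(fill) != 1:
            raise ValueError("fill must be a single byte")
        return fill * count

    assert fromsize(3) == b'\x00\x00\x00'
    assert fromsize(5, b'\x0a') == b'\x0a' * 5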
Addition of explicit "single byte" constructors Addition of "bchr" function and explicit "single byte" constructors
----------------------------------------------- -------------------------------------------------------------------
As binary counterparts to the text ``chr`` function, this PEP proposes the As binary counterparts to the text ``chr`` function, this PEP proposes
addition of an explicit ``byte`` alternative constructor as a class method the addition of a ``bchr`` function and an explicit ``fromord`` alternative
on both ``bytes`` and ``bytearray``:: constructor as a class method on both ``bytes`` and ``bytearray``::
>>> bytes.byte(3) >>> bchr(ord("A"))
b'\x03' b'A'
>>> bytearray.byte(3) >>> bchr(ord(b"A"))
bytearray(b'\x03') b'A'
>>> bytes.fromord(65)
b'A'
>>> bytearray.fromord(65)
bytearray(b'A')
These methods will only accept integers in the range 0 to 255 (inclusive):: These methods will only accept integers in the range 0 to 255 (inclusive)::
>>> bytes.byte(512) >>> bytes.fromord(512)
Traceback (most recent call last): Traceback (most recent call last):
File "<stdin>", line 1, in <module> File "<stdin>", line 1, in <module>
ValueError: bytes must be in range(0, 256) ValueError: integer must be in range(0, 256)
>>> bytes.byte(1.0) >>> bytes.fromord(1.0)
Traceback (most recent call last): Traceback (most recent call last):
File "<stdin>", line 1, in <module> File "<stdin>", line 1, in <module>
TypeError: 'float' object cannot be interpreted as an integer TypeError: 'float' object cannot be interpreted as an integer
The documentation of the ``ord`` builtin will be updated to explicitly note While this does create some duplication, there are valid reasons for it:
that ``bytes.byte`` is the inverse operation for binary data, while ``chr``
is the inverse operation for text data.
Behaviourally, ``bytes.byte(x)`` will be equivalent to the current * the ``bchr`` builtin is to recreate the ord/chr/unichr trio from Python
2 under a different naming scheme
* the class method is mainly for the ``bytearray.fromord`` case, with
``bytes.fromord`` added for consistency
The documentation of the ``ord`` builtin will be updated to explicitly note
that ``bchr`` is the primary inverse operation for binary data, while ``chr``
is the inverse operation for text data, and that ``bytes.fromord`` and
``bytearray.fromord`` also exist.
Behaviourally, ``bytes.fromord(x)`` will be equivalent to the current
``bytes([x])`` (and similarly for ``bytearray``). The new spelling is ``bytes([x])`` (and similarly for ``bytearray``). The new spelling is
expected to be easier to discover and easier to read (especially when used expected to be easier to discover and easier to read (especially when used
in conjunction with indexing operations on binary sequence types). in conjunction with indexing operations on binary sequence types).
@ -110,35 +126,37 @@ As a separate method, the new spelling will also work better with higher
order functions like ``map``. order functions like ``map``.
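Behaviourally, ``bchr`` and ``fromord`` can be sketched with today's constructor, which already range-checks its input; the last line shows the higher-order use with ``map`` mentioned above::

    def bchr(i):
        # Binary counterpart of chr(): one integer in range(0, 256)
        # becomes a length-1 bytes object.
        return bytes([i])

    assert bchr(65) == b'A'
    assert list(map(bchr, [65, 66, 67])) == [b'A', b'B', b'C']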
Addition of "getbyte" method to retrieve a single byte
------------------------------------------------------
This PEP proposes that ``bytes`` and ``bytearray`` gain the method ``getbyte``
which will always return ``bytes``::
>>> b'abc'.getbyte(0)
b'a'
If an index is asked for that doesn't exist, ``IndexError`` is raised::
>>> b'abc'.getbyte(9)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: index out of range
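A pure-Python sketch of the proposed method: plain integer indexing already raises ``IndexError`` for out-of-range indices, so only the re-wrapping as ``bytes`` is new::

    def getbyte(data, index):
        return bytes([data[index]])   # data[index] raises IndexError as needed

    assert getbyte(b'abc', 0) == b'a'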
Addition of optimised iterator methods that produce ``bytes`` objects Addition of optimised iterator methods that produce ``bytes`` objects
--------------------------------------------------------------------- ---------------------------------------------------------------------
This PEP proposes that ``bytes``, ``bytearray`` and ``memoryview`` gain an This PEP proposes that ``bytes`` and ``bytearray`` gain an optimised
optimised ``iterbytes`` method that produces length 1 ``bytes`` objects ``iterbytes`` method that produces length 1 ``bytes`` objects rather than
rather than integers:: integers::
for x in data.iterbytes(): for x in data.iterbytes():
# x is a length 1 ``bytes`` object, rather than an integer # x is a length 1 ``bytes`` object, rather than an integer
The method can be used with arbitrary buffer exporting objects by wrapping For example::
them in a ``memoryview`` instance first::
for x in memoryview(data).iterbytes(): >>> tuple(b"ABC".iterbytes())
# x is a length 1 ``bytes`` object, rather than an integer (b'A', b'B', b'C')
For ``memoryview``, the semantics of ``iterbytes()`` are defined such that::
memview.tobytes() == b''.join(memview.iterbytes())
This allows the raw bytes of the memory view to be iterated over without
needing to make a copy, regardless of the defined shape and format.
The main advantage this method offers over the ``map(bytes.byte, data)``
approach is that it is guaranteed *not* to fail midstream with a
``ValueError`` or ``TypeError``. By contrast, when using the ``map`` based
approach, the type and value of the individual items in the iterable are
only checked as they are retrieved and passed through the ``bytes.byte``
constructor.
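For ``bytes`` input, the proposed iterator is equivalent to taking successive length-1 slices (a sketch; the real method would presumably be optimised in C)::

    def iterbytes(data):
        # Length-1 slices of a bytes object are length-1 bytes objects.
        return (data[i:i + 1] for i in range(len(data)))

    assert tuple(iterbytes(b'ABC')) == (b'A', b'B', b'C')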
Design discussion Design discussion
@ -163,10 +181,18 @@ This PEP isn't revisiting that original design decision, just changing the
spelling as users sometimes find the current behaviour of the binary sequence spelling as users sometimes find the current behaviour of the binary sequence
constructors surprising. In particular, there's a reasonable case to be made constructors surprising. In particular, there's a reasonable case to be made
that ``bytes(x)`` (where ``x`` is an integer) should behave like the that ``bytes(x)`` (where ``x`` is an integer) should behave like the
``bytes.byte(x)`` proposal in this PEP. Providing both behaviours as separate ``bytes.fromord(x)`` proposal in this PEP. Providing both behaviours as separate
class methods avoids that ambiguity. class methods avoids that ambiguity.
Open Questions
==============
Do we add ``iterbytes`` to ``memoryview``, or modify
``memoryview.cast()`` to accept ``'s'`` as a single-byte interpretation? Or
do we ignore memory for now and add it later?
References References
========== ==========
@ -180,19 +206,11 @@ References
(http://bugs.python.org/issue21644) (http://bugs.python.org/issue21644)
.. [5] August 2014 discussion thread on python-dev .. [5] August 2014 discussion thread on python-dev
(https://mail.python.org/pipermail/python-ideas/2014-March/027295.html) (https://mail.python.org/pipermail/python-ideas/2014-March/027295.html)
.. [6] June 2016 discussion thread on python-dev
(https://mail.python.org/pipermail/python-dev/2016-June/144875.html)
Copyright Copyright
========= =========
This document has been placed in the public domain. This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:
View File
@ -5,7 +5,7 @@ Last-Modified: $Date$
Author: Donald Stufft <donald@stufft.io> Author: Donald Stufft <donald@stufft.io>
BDFL-Delegate: Paul Moore <p.f.moore@gmail.com> BDFL-Delegate: Paul Moore <p.f.moore@gmail.com>
Discussions-To: distutils-sig@python.org Discussions-To: distutils-sig@python.org
Status: Accepted Status: Final
Type: Process Type: Process
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 12-May-2014 Created: 12-May-2014
View File
@ -57,13 +57,15 @@ The releases so far:
- 3.5.3 final: January 17, 2017 - 3.5.3 final: January 17, 2017
- 3.5.4 candidate 1: July 25, 2017 - 3.5.4 candidate 1: July 25, 2017
- 3.5.4 final: August 8, 2017 - 3.5.4 final: August 8, 2017
- 3.5.5 candidate 1: January 23, 2018
- 3.5.5 final: February 4, 2018
.. There are no currently planned releases for Python 3.5. .. There are no specific plans for the next release of Python 3.5.
Planned future releases: Planned future releases:
- 3.5.5 candidate 1: January 21, 2018 - 3.5.6 candidate 1: July 18, 2018
- 3.5.5 final: February 4, 2018 - 3.5.6 final: August 1, 2018
View File
@ -5,7 +5,7 @@ Last-Modified: $Date$
Author: Guido van Rossum <guido@python.org>, Jukka Lehtosalo <jukka.lehtosalo@iki.fi>, Łukasz Langa <lukasz@python.org> Author: Guido van Rossum <guido@python.org>, Jukka Lehtosalo <jukka.lehtosalo@iki.fi>, Łukasz Langa <lukasz@python.org>
BDFL-Delegate: Mark Shannon BDFL-Delegate: Mark Shannon
Discussions-To: Python-Dev <python-dev@python.org> Discussions-To: Python-Dev <python-dev@python.org>
Status: Accepted Status: Provisional
Type: Standards Track Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 29-Sep-2014 Created: 29-Sep-2014
@ -342,7 +342,7 @@ Additionally, ``Any`` is a valid value for every type variable.
Consider the following:: Consider the following::
def count_truthy(elements: List[Any]) -> int: def count_truthy(elements: List[Any]) -> int:
return sum(1 for elem in elements if element) return sum(1 for elem in elements if elem)
This is equivalent to omitting the generic notation and just saying This is equivalent to omitting the generic notation and just saying
``elements: List``. ``elements: List``.
@ -355,6 +355,7 @@ You can include a ``Generic`` base class to define a user-defined class
as generic. Example:: as generic. Example::
from typing import TypeVar, Generic from typing import TypeVar, Generic
from logging import Logger
T = TypeVar('T') T = TypeVar('T')
@ -373,7 +374,7 @@ as generic. Example::
return self.value return self.value
def log(self, message: str) -> None: def log(self, message: str) -> None:
self.logger.info('{}: {}'.format(self.name message)) self.logger.info('{}: {}'.format(self.name, message))
``Generic[T]`` as a base class defines that the class ``LoggedVar`` ``Generic[T]`` as a base class defines that the class ``LoggedVar``
takes a single type parameter ``T``. This also makes ``T`` valid as takes a single type parameter ``T``. This also makes ``T`` valid as
@ -582,9 +583,9 @@ argument(s) is substituted. Otherwise, ``Any`` is assumed. Example::
T = TypeVar('T') T = TypeVar('T')
class Node(Generic[T]): class Node(Generic[T]):
x = None # type: T # Instance attribute (see below)
def __init__(self, label: T = None) -> None: def __init__(self, label: T = None) -> None:
... ...
x = None # type: T
x = Node('') # Inferred type is Node[str] x = Node('') # Inferred type is Node[str]
y = Node(0) # Inferred type is Node[int] y = Node(0) # Inferred type is Node[int]
@ -983,15 +984,17 @@ for example, the above is equivalent to::
def handle_employee(e: Optional[Employee]) -> None: ... def handle_employee(e: Optional[Employee]) -> None: ...
An optional type is also automatically assumed when the default value is A past version of this PEP allowed type checkers to assume an optional
``None``, for example:: type when the default value is ``None``, as in this code::
def handle_employee(e: Employee = None): ... def handle_employee(e: Employee = None): ...
This is equivalent to:: This would have been treated as equivalent to::
def handle_employee(e: Optional[Employee] = None) -> None: ... def handle_employee(e: Optional[Employee] = None) -> None: ...
This is no longer the recommended behavior. Type checkers should move
towards requiring the optional type to be made explicit.
Support for singleton types in unions Support for singleton types in unions
------------------------------------- -------------------------------------
@ -1367,11 +1370,12 @@ Positional-only arguments
Some functions are designed to take their arguments only positionally, Some functions are designed to take their arguments only positionally,
and expect their callers never to use the argument's name to provide and expect their callers never to use the argument's name to provide
that argument by keyword. All arguments with names beginning with that argument by keyword. All arguments with names beginning with
``__`` are assumed to be positional-only:: ``__`` are assumed to be positional-only, except if their names also
end with ``__``::
def quux(__x: int) -> None: ... def quux(__x: int, __y__: int = 0) -> None: ...
quux(3) # This call is fine. quux(3, __y__=1) # This call is fine.
quux(__x=3) # This call is an error. quux(__x=3) # This call is an error.
@ -1409,7 +1413,7 @@ for example::
c = None # type: Coroutine[List[str], str, int] c = None # type: Coroutine[List[str], str, int]
... ...
x = c.send('hi') # type: List[str] x = c.send('hi') # type: List[str]
async def bar(): -> None: async def bar() -> None:
x = await c # type: int x = await c # type: int
The module also provides generic ABCs ``Awaitable``, The module also provides generic ABCs ``Awaitable``,
@ -1460,14 +1464,11 @@ No first-class syntax support for explicitly marking variables as being
of a specific type is added by this PEP. To help with type inference in of a specific type is added by this PEP. To help with type inference in
complex cases, a comment of the following format may be used:: complex cases, a comment of the following format may be used::
x = [] # type: List[Employee] x = [] # type: List[Employee]
x, y, z = [], [], [] # type: List[int], List[int], List[str] x, y, z = [], [], [] # type: List[int], List[int], List[str]
x, y, z = [], [], [] # type: (List[int], List[int], List[str]) x, y, z = [], [], [] # type: (List[int], List[int], List[str])
a, b, *c = range(5) # type: float, float, List[float] a, b, *c = range(5) # type: float, float, List[float]
x = [ x = [1, 2] # type: List[int]
1,
2,
] # type: List[int]
Type comments should be put on the last line of the statement that Type comments should be put on the last line of the statement that
contains the variable definition. They can also be placed on contains the variable definition. They can also be placed on
@ -1857,6 +1858,14 @@ Stub file package authors might use the following snippet in ``setup.py``::
], ],
... ...
(*UPDATE:* As of June 2018 the recommended way to distribute type
hints for third-party packages has changed -- in addition to typeshed
(see the next section) there is now a standard for distributing type
hints, PEP 561. It supports separately installable packages containing
stubs, stub files included in the same distribution as the executable
code of a package, and inline type hints, the latter two options
enabled by including a file named ``py.typed`` in the package.)
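A sketch of the inline-hints option the update describes, for a hypothetical package named ``mypackage``: ship an empty ``py.typed`` marker file inside the package and include it in the distribution::

    from setuptools import setup

    setup(
        name="mypackage",
        packages=["mypackage"],
        # The empty py.typed marker tells type checkers to trust the
        # inline hints shipped with the package (PEP 561).
        package_data={"mypackage": ["py.typed"]},
        zip_safe=False,
    )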
The Typeshed Repo The Typeshed Repo
----------------- -----------------
View File
@ -463,7 +463,7 @@ Expected Uses
The primary expected use case is various forms of testing -- "are the The primary expected use case is various forms of testing -- "are the
results computed near what I expect as a result?" This sort of test results computed near what I expect as a result?" This sort of test
may or may not be part of a formal unit testing suite. Such testing may or may not be part of a formal unit testing suite. Such testing
could be used one-off at the command line, in an iPython notebook, could be used one-off at the command line, in an IPython notebook,
part of doctests, or simple asserts in an ``if __name__ == "__main__"`` part of doctests, or simple asserts in an ``if __name__ == "__main__"``
block. block.
View File
@ -268,7 +268,7 @@ categories based on GNU autotools. This expanded scheme should help installers
to implement system policy, but installers may root each category at any to implement system policy, but installers may root each category at any
location. location.
A UNIX install scheme might map the categories to their installation patnhs A UNIX install scheme might map the categories to their installation paths
like this:: like this::
{ {
View File
@ -34,9 +34,9 @@ Release Manager and Crew
3.6 Lifespan 3.6 Lifespan
============ ============
3.6 will receive bugfix updates approximately every 3-6 months for 3.6 will receive bugfix updates approximately every 3 months for
approximately 18 months. After the release of 3.7.0 final, a final approximately 24 months. After the release of 3.7.0 final, two more
3.6 bugfix update will be released. After that, it is expected that 3.6 bugfix updates will be released. After that, it is expected that
security updates (source only) will be released until 5 years after security updates (source only) will be released until 5 years after
the release of 3.6 final, so until approximately December 2021. the release of 3.6 final, so until approximately December 2021.
@ -93,32 +93,39 @@ Actual:
- 3.6.4 final: 2017-12-19 - 3.6.4 final: 2017-12-19
Expected:
3.6.5 schedule 3.6.5 schedule
-------------- --------------
- 3.6.5 candidate: 2018-03-12 (tenative) - 3.6.5 candidate: 2018-03-13
- 3.6.5 final: 2018-03-26 (tentative) - 3.6.5 final: 2018-03-28
3.6.6 schedule 3.6.6 schedule
-------------- --------------
- 3.6.6 candidate: 2018-06-04 (tenative) - 3.6.6 candidate: 2018-06-12
- 3.6.6 final: 2018-06-15 (tentative) - 3.6.6 final: 2018-06-27
Expected:
3.6.7 schedule 3.6.7 schedule
-------------- --------------
- 3.6.7 candidate: 2018-09-10 (tentative)
- 3.6.7 final: 2018-09-24 (tentative)
3.6.8 schedule
--------------
Final maintenance mode release, final binary releases. Final maintenance mode release, final binary releases.
- 3.6.6 candidate: 2018-09-10 (tenative) - 3.6.8 candidate: 2018-12-03 (tentative)
- 3.6.6 final: 2018-09-24 (tentative) - 3.6.8 final: 2018-12-16 (tentative)
3.6.8 and beyond schedule 3.6.9 and beyond schedule
------------------------- -------------------------
Security fixes only, as needed, until 2021-12 Security fixes only, as needed, until 2021-12

748
pep-0505.rst Normal file
View File

@ -0,0 +1,748 @@
PEP: 505
Title: None-aware operators
Version: $Revision$
Last-Modified: $Date$
Author: Mark E. Haase <mehaase@gmail.com>, Steve Dower <steve.dower@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 18-Sep-2015
Python-Version: 3.8
Abstract
========
Several modern programming languages have so-called "``null``-coalescing" or
"``null``- aware" operators, including C# [1]_, Dart [2]_, Perl, Swift, and PHP
(starting in version 7). These operators provide syntactic sugar for common
patterns involving null references.
* The "``null``-coalescing" operator is a binary operator that returns its left
operand if it is not ``null``. Otherwise it returns its right operand.
* The "``null``-aware member access" operator accesses an instance member only
if that instance is non-``null``. Otherwise it returns ``null``. (This is also
called a "safe navigation" operator.)
* The "``null``-aware index access" operator accesses an element of a collection
only if that collection is non-``null``. Otherwise it returns ``null``. (This
is another type of "safe navigation" operator.)
This PEP proposes three ``None``-aware operators for Python, based on the
definitions and other languages' implementations of those above. Specifically:
* The "``None`` coalescing`` binary operator ``??`` returns the left hand side
if it evaluates to a value that is not ``None``, or else it evaluates and
returns the right hand side. A coalescing ``??=`` augmented assignment
operator is included.
* The "``None``-aware attribute access" operator ``?.`` evaluates the complete
expression if the left hand side evaluates to a value that is not ``None``.
* The "``None``-aware indexing" operator ``?[]`` evaluates the complete
expression if the left hand side evaluates to a value that is not ``None``.
Syntax and Semantics
====================
Specialness of ``None``
-----------------------
The ``None`` object denotes the lack of a value. For the purposes of these
operators, the lack of a value indicates that the remainder of the expression
also lacks a value and should not be evaluated.
A rejected proposal was to treat any value that evaluates to false in a
Boolean context as not having a value. However, the purpose of these operators
is to propagate the "lack of value" state, rather than the "false" state.
Some argue that this makes ``None`` special. We contend that ``None`` is
already special, and that using it as both the test and the result of these
operators does not change the existing semantics in any way.
See the `Rejected Ideas`_ section for discussion on the rejected approaches.
Grammar changes
---------------
The following rules of the Python grammar are updated to read::
augassign: ('+=' | '-=' | '*=' | '@=' | '/=' | '%=' | '&=' | '|=' | '^=' |
'<<=' | '>>=' | '**=' | '//=' | '??=')
power: coalesce ['**' factor]
coalesce: atom_expr ['??' factor]
atom_expr: ['await'] atom trailer*
trailer: ('(' [arglist] ')' |
'[' subscriptlist ']' |
'?[' subscriptlist ']' |
'.' NAME |
'?.' NAME)
Inserting the ``coalesce`` rule in this location ensures that expressions
resulting in ``None`` are naturally coalesced before they are used in
operations that would typically raise ``TypeError``. Like ``and`` and ``or``,
the right-hand expression is not evaluated until the left-hand side is
determined to be ``None``. For example::
a, b = None, None
def c(): return None
def ex(): raise Exception()
(a ?? 2 ** b ?? 3) == a ?? (2 ** (b ?? 3))
(a * b ?? c // d) == a * (b ?? c) // d
(a ?? True and b ?? False) == (a ?? True) and (b ?? False)
(c() ?? c() ?? True) == True
(True ?? ex()) == True
(c ?? ex)() == c()
Augmented coalescing assignment only rebinds the name if its current value is
``None``. If the target name already has a value, the right-hand side is not
evaluated. For example::
a = None
b = ''
c = 0
a ??= 'value'
b ??= undefined_name
c ??= shutil.rmtree('/') # don't try this at home, kids
assert a == 'value'
assert b == ''
assert c == 0 and any(os.scandir('/'))
Adding new trailers for the other ``None``-aware operators ensures that they
may be used in all valid locations for the existing equivalent operators,
including as part of an assignment target (more details below). As the existing
evaluation rules are not directly embedded in the grammar, we specify the
required changes here.
Assume that the ``atom`` is always successfully evaluated. Each ``trailer`` is
then evaluated from left to right, applying its own parameter (either its
arguments, subscripts or attribute name) to produce the value for the next
``trailer``. Finally, if present, ``await`` is applied.
For example, ``await a.b(c).d[e]`` is currently parsed as
``['await', 'a', '.b', '(c)', '.d', '[e]']`` and evaluated::
_v = a
_v = _v.b
_v = _v(c)
_v = _v.d
_v = _v[e]
await _v
When a ``None``-aware operator is present, the left-to-right evaluation may be
short-circuited. For example, ``await a?.b(c).d?[e]`` is evaluated::
_v = a
if _v is not None:
_v = _v.b
_v = _v(c)
_v = _v.d
if _v is not None:
_v = _v[e]
await _v
.. note::
``await`` will almost certainly fail in this context, as it would in
the case where code attempts ``await None``. We are not proposing to add a
``None``-aware ``await`` keyword here, and merely include it in this
example for completeness of the specification, since the ``atom_expr``
grammar rule includes the keyword. If it were in its own rule, we would have
never mentioned it.
Parenthesised expressions are handled by the ``atom`` rule (not shown above),
which will implicitly terminate the short-circuiting behaviour of the above
transformation. For example, ``(a?.b ?? c).d?.e`` is evaluated as::
# a?.b
_v = a
if _v is not None:
_v = _v.b
# ... ?? c
if _v is None:
_v = c
# (...).d?.e
_v = _v.d
if _v is not None:
_v = _v.e
When used as an assignment target, the ``None``-aware operations may only be
used in a "load" context. That is, ``a?.b = 1`` and ``a?[b] = 1`` will raise
``SyntaxError``. Use earlier in the expression (``a?.b.c = 1``) is permitted,
though unlikely to be useful unless combined with a coalescing operation::
(a?.b ?? d).c = 1
Examples
========
This section presents some examples of common ``None`` patterns and shows what
conversion to use ``None``-aware operators may look like.
Standard Library
----------------
Using the ``find-pep505.py`` script [3]_, an analysis of the Python 3.7 standard
library discovered up to 678 code snippets that could be replaced with use of
one of the ``None``-aware operators::
$ find /usr/lib/python3.7 -name '*.py' | xargs python3.7 find-pep505.py
<snip>
Total None-coalescing `if` blocks: 449
Total [possible] None-coalescing `or`: 120
Total None-coalescing ternaries: 27
Total Safe navigation `and`: 13
Total Safe navigation `if` blocks: 61
Total Safe navigation ternaries: 8
Some of these are shown below as examples before and after converting to use the
new operators.
From ``bisect.py``::
def insort_right(a, x, lo=0, hi=None):
# ...
if hi is None:
hi = len(a)
# ...
After updating to use the ``??=`` augmented assignment statement::
def insort_right(a, x, lo=0, hi=None):
# ...
hi ??= len(a)
# ...
From ``calendar.py``::
encoding = options.encoding
if encoding is None:
encoding = sys.getdefaultencoding()
optdict = dict(encoding=encoding, css=options.css)
After updating to use the ``??`` operator::
optdict = dict(encoding=encoding ?? sys.getdefaultencoding(),
css=options.css)
From ``dis.py``::
def _get_const_info(const_index, const_list):
argval = const_index
if const_list is not None:
argval = const_list[const_index]
return argval, repr(argval)
After updating to use the ``?[]`` and ``??`` operators::
def _get_const_info(const_index, const_list):
argval = const_list?[const_index] ?? const_index
return argval, repr(argval)
From ``inspect.py``::
for base in object.__bases__:
for name in getattr(base, "__abstractmethods__", ()):
value = getattr(object, name, None)
if getattr(value, "__isabstractmethod__", False):
return True
After updating to use the ``?.`` operator (and deliberately not converting to
use ``any()``)::
for base in object.__bases__:
for name in base?.__abstractmethods__ ?? ():
if object?.name?.__isabstractmethod__:
return True
From ``os.py``::
if entry.is_dir():
dirs.append(name)
if entries is not None:
entries.append(entry)
else:
nondirs.append(name)
After updating to use the ``?.`` operator::
if entry.is_dir():
dirs.append(name)
entries?.append(entry)
else:
nondirs.append(name)
jsonify
-------
This example is from a Python web crawler that uses the Flask framework as its
front-end. This function retrieves information about a web site from a SQL
database and formats it as JSON to send to an HTTP client::
class SiteView(FlaskView):
@route('/site/<id_>', methods=['GET'])
def get_site(self, id_):
site = db.query('site_table').find(id_)
return jsonify(
first_seen=site.first_seen.isoformat() if site.first_seen is not None else None,
id=site.id,
is_active=site.is_active,
last_seen=site.last_seen.isoformat() if site.last_seen is not None else None,
url=site.url.rstrip('/')
)
Both ``first_seen`` and ``last_seen`` are allowed to be ``null`` in the
database, and they are also allowed to be ``null`` in the JSON response. JSON
does not have a native way to represent a ``datetime``, so the server's contract
states that any non-``null`` date is represented as an ISO-8601 string.
Without knowing the exact semantics of the ``first_seen`` and ``last_seen``
attributes, it is impossible to know whether the attribute can be safely or
performantly accessed multiple times.
One way to fix this code is to replace each conditional expression with an
explicit value assignment and a full ``if``/``else`` block::
class SiteView(FlaskView):
@route('/site/<id_>', methods=['GET'])
def get_site(self, id_):
site = db.query('site_table').find(id_)
first_seen_dt = site.first_seen
if first_seen_dt is None:
first_seen = None
else:
first_seen = first_seen_dt.isoformat()
last_seen_dt = site.last_seen
if last_seen_dt is None:
last_seen = None
else:
last_seen = last_seen_dt.isoformat()
return jsonify(
first_seen=first_seen,
id=site.id,
is_active=site.is_active,
last_seen=last_seen,
url=site.url.rstrip('/')
)
This adds ten lines of code and four new code paths to the function,
dramatically increasing the apparent complexity. Rewriting using the
``None``-aware attribute operator results in shorter code with clearer
intent::
class SiteView(FlaskView):
@route('/site/<id_>', methods=['GET'])
def get_site(self, id_):
site = db.query('site_table').find(id_)
return jsonify(
first_seen=site.first_seen?.isoformat(),
id=site.id,
is_active=site.is_active,
last_seen=site.last_seen?.isoformat(),
url=site.url.rstrip('/')
)
Grab
----
The next example is from a Python scraping library called `Grab
<https://github.com/lorien/grab/blob/4c95b18dcb0fa88eeca81f5643c0ebfb114bf728/grab/upload.py>`_::
class BaseUploadObject(object):
def find_content_type(self, filename):
ctype, encoding = mimetypes.guess_type(filename)
if ctype is None:
return 'application/octet-stream'
else:
return ctype
class UploadContent(BaseUploadObject):
def __init__(self, content, filename=None, content_type=None):
self.content = content
if filename is None:
self.filename = self.get_random_filename()
else:
self.filename = filename
if content_type is None:
self.content_type = self.find_content_type(self.filename)
else:
self.content_type = content_type
class UploadFile(BaseUploadObject):
def __init__(self, path, filename=None, content_type=None):
self.path = path
if filename is None:
self.filename = os.path.split(path)[1]
else:
self.filename = filename
if content_type is None:
self.content_type = self.find_content_type(self.filename)
else:
self.content_type = content_type
This code contains several good examples of needing to provide default
values. Rewriting to use conditional expressions reduces the overall lines of
code, but does not necessarily improve readability::
class BaseUploadObject(object):
def find_content_type(self, filename):
ctype, encoding = mimetypes.guess_type(filename)
return 'application/octet-stream' if ctype is None else ctype
class UploadContent(BaseUploadObject):
def __init__(self, content, filename=None, content_type=None):
self.content = content
self.filename = (self.get_random_filename() if filename
is None else filename)
self.content_type = (self.find_content_type(self.filename)
if content_type is None else content_type)
class UploadFile(BaseUploadObject):
def __init__(self, path, filename=None, content_type=None):
self.path = path
self.filename = (os.path.split(path)[1] if filename is
None else filename)
self.content_type = (self.find_content_type(self.filename)
if content_type is None else content_type)
The first ternary expression is tidy, but it reverses the intuitive order of
the operands: it should return ``ctype`` if it has a value and use the string
literal as fallback. The other ternary expressions are unintuitive and so
long that they must be wrapped. The overall readability is worsened, not
improved.
Rewriting using the ``None`` coalescing operator::
class BaseUploadObject(object):
def find_content_type(self, filename):
ctype, encoding = mimetypes.guess_type(filename)
return ctype ?? 'application/octet-stream'
class UploadContent(BaseUploadObject):
def __init__(self, content, filename=None, content_type=None):
self.content = content
self.filename = filename ?? self.get_random_filename()
self.content_type = content_type ?? self.find_content_type(self.filename)
class UploadFile(BaseUploadObject):
def __init__(self, path, filename=None, content_type=None):
self.path = path
self.filename = filename ?? os.path.split(path)[1]
self.content_type = content_type ?? self.find_content_type(self.filename)
This syntax has an intuitive ordering of the operands. In ``find_content_type``,
for example, the preferred value ``ctype`` appears before the fallback value.
The terseness of the syntax also makes for fewer lines of code and less code to
visually parse, and reading from left-to-right and top-to-bottom more accurately
follows the execution flow.
Rejected Ideas
==============
The first three ideas in this section are oft-proposed alternatives to treating
``None`` as special. For further background on why these are rejected, see their
treatment in `PEP 531 <https://www.python.org/dev/peps/pep-0531/>`_ and
`PEP 532 <https://www.python.org/dev/peps/pep-0532/>`_ and the associated
discussions.
No-Value Protocol
-----------------
The operators could be generalised to user-defined types by defining a protocol
to indicate when a value represents "no value". Such a protocol may be a dunder
method ``__has_value__(self)`` that returns ``True`` if the value should be
treated as having a value, and ``False`` if the value should be treated as no
value.
With this generalization, ``object`` would implement a dunder method equivalent
to this::
def __has_value__(self):
return True
``NoneType`` would implement a dunder method equivalent to this::
def __has_value__(self):
return False
In the specification section, all uses of ``x is None`` would be replaced with
``not x.__has_value__()``.
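To make the rejected semantics concrete, ``lhs ?? rhs`` would then behave
roughly like the following sketch (``coalesce`` and its lazily evaluated
``rhs_thunk`` parameter are illustrative only, not proposed API)::

    def coalesce(lhs, rhs_thunk):
        # Under the protocol, the test consults __has_value__()
        # instead of checking "lhs is None".
        if lhs.__has_value__():
            return lhs
        # The right-hand side stays unevaluated until needed,
        # mirroring the short-circuiting of "??".
        return rhs_thunk()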
This generalization would allow for domain-specific "no-value" objects to be
coalesced just like ``None``. For example the ``pyasn1`` package has a type
called ``Null`` that represents an ASN.1 ``null``::
>>> from pyasn1.type import univ
>>> univ.Null() ?? univ.Integer(123)
Integer(123)
Similarly, values such as ``math.nan`` and ``NotImplemented`` could be treated
as representing no value.
However, the "no-value" nature of these values is domain-specific, which means
they *should* be treated as a value by the language. For example,
``math.nan.imag`` is well defined (it's ``0.0``), and so short-circuiting
``math.nan?.imag`` to return ``math.nan`` would be incorrect.
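(The ``imag`` behaviour is easy to confirm in a current interpreter)::

    >>> import math
    >>> math.nan.imag  # NaN is a value; attribute access is well defined
    0.0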
As ``None`` is already defined by the language as being the value that
represents "no value", and the current specification would not preclude
switching to a protocol in the future (though changes to built-in objects would
not be compatible), this idea is rejected for now.
Boolean-aware operators
-----------------------
This suggestion is fundamentally the same as adding a no-value protocol, and so
the discussion above also applies.
Similar behavior to the ``??`` operator can be achieved with an ``or``
expression; however, ``or`` checks whether its left operand is false-y and not
specifically ``None``. This approach is attractive, as it requires fewer changes
to the language, but ultimately does not solve the underlying problem correctly.
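The difference is easy to demonstrate in current Python (a minimal
illustration; ``coalesce_none`` is a hypothetical helper standing in for
``??``, except that it evaluates ``default`` eagerly)::

    def coalesce_none(value, default):
        # "value ?? default" tests for None specifically, not falsiness.
        return value if value is not None else default

    assert coalesce_none(0, 42) == 0       # but "0 or 42" yields 42
    assert coalesce_none('', 'n/a') == ''  # but "'' or 'n/a'" yields 'n/a'
    assert coalesce_none(None, 42) == 42   # "None or 42" agrees here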
Assuming the check is for truthiness rather than ``None``, there is no longer a
need for the ``??`` operator. However, applying this check to the ``?.`` and
``?[]`` operators prevents perfectly valid operations from being performed.
Consider the following example, where ``get_log_list()`` may return either a
list containing current log messages (potentially empty), or ``None`` if logging
is not enabled::
lst = get_log_list()
lst?.append('A log message')
If ``?.`` is checking for true values rather than specifically ``None`` and the
log has not been initialized with any items, no item will ever be appended. This
violates the obvious intent of the code, which is to append an item. The
``append`` method is available on an empty list, as are all other list methods,
and there is no reason to assume that these members should not be used because
the list is presently empty.
Further, there is no sensible result to use in place of the expression. A
normal ``lst.append`` returns ``None``, but under this idea ``lst?.append`` may
result in either ``[]`` or ``None``, depending on the value of ``lst``. As with
the examples in the previous section, this makes no sense.
As checking for truthiness rather than ``None`` results in apparently valid
expressions no longer executing as intended, this idea is rejected.
Exception-aware operators
-------------------------
Arguably, the reason to short-circuit an expression when ``None`` is encountered
is to avoid the ``AttributeError`` or ``TypeError`` that would be raised under
normal circumstances. As an alternative to testing for ``None``, the ``?.`` and
``?[]`` operators could instead handle ``AttributeError`` and ``TypeError``
raised by the operation and skip the remainder of the expression.
This produces a transformation for ``a?.b.c?.d.e`` similar to this::
_v = a
try:
_v = _v.b
except AttributeError:
pass
else:
_v = _v.c
try:
_v = _v.d
except AttributeError:
pass
else:
_v = _v.e
One open question is which value should be returned as the expression when an
exception is handled. The above example simply leaves the partial result, but
this is not helpful for replacing with a default value. An alternative would be
to force the result to ``None``, which then raises the question as to why
``None`` is special enough to be the result but not special enough to be the
test.
Secondly, this approach masks errors within code executed implicitly as part of
the expression. For ``?.``, any ``AttributeError`` within a property or
``__getattr__`` implementation would be hidden, and similarly for ``?[]`` and
``__getitem__`` implementations.
Similarly, simple typing errors such as ``{}?.ietms()`` could go unnoticed.
Existing conventions for handling these kinds of errors in the form of the
``getattr`` builtin and the ``.get(key, default)`` method pattern established by
``dict`` show that it is already possible to explicitly use this behaviour.
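For example, both spellings are available today (illustrative only)::

    settings = {'theme': 'dark'}
    theme = settings.get('colour', 'light')    # missing key -> default
    debug = getattr(settings, 'debug', False)  # missing attribute -> default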
As this approach would hide errors in code, it is rejected.
``None``-aware Function Call
----------------------------
The ``None``-aware syntax applies to attribute and index access, so it seems
natural to ask if it should also apply to function invocation syntax. It might
be written as ``foo?()``, where ``foo`` is only called if it is not ``None``.
This has been deferred on the basis of the proposed operators being intended
to aid traversal of partially populated hierarchical data structures, *not*
for traversal of arbitrary class hierarchies. This is reflected in the fact
that none of the other mainstream languages that already offer this syntax
have found it worthwhile to support a similar syntax for optional function
invocations.
A workaround similar to that used by C# would be to write
``maybe_none?.__call__(arguments)``. If the callable is ``None``, the
expression will not be evaluated. (The C# equivalent uses ``?.Invoke()`` on its
callable type.)
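In current Python the same intent must be spelled out explicitly (a sketch;
``callback`` and ``arg`` are hypothetical names)::

    result = callback(arg) if callback is not None else None

    # With this PEP, the workaround form would read:
    #     result = callback?.__call__(arg)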
``?`` Unary Postfix Operator
----------------------------
To generalize the ``None``-aware behavior and limit the number of new operators
introduced, a unary, postfix operator spelled ``?`` was suggested. The idea is
that ``?`` might return a special object that would override dunder
methods that return ``self``. For example, ``foo?`` would evaluate to ``foo`` if
it is not ``None``, otherwise it would evaluate to an instance of
``NoneQuestion``::
class NoneQuestion():
def __call__(self, *args, **kwargs):
return self
def __getattr__(self, name):
return self
def __getitem__(self, key):
return self
With this new operator and new type, an expression like ``foo?.bar[baz]``
evaluates to ``NoneQuestion`` if ``foo`` is ``None``. This is a nifty
generalization, but it's difficult to use in practice since most existing code
won't know what ``NoneQuestion`` is.
Going back to one of the motivating examples above, consider the following::
>>> import json
>>> created = None
>>> json.dumps({'created': created?.isoformat()})
The JSON serializer does not know how to serialize ``NoneQuestion``, nor will
any other API. This proposal actually requires *lots of specialized logic*
throughout the standard library and any third party library.
At the same time, the ``?`` operator may also be **too general**, in the sense
that it can be combined with any other operator. What should the following
expressions mean?::
>>> x? + 1
>>> x? -= 1
>>> x? == 1
>>> ~x?
This degree of generalization is not useful. The operators actually proposed
herein are intentionally limited to a few operators that are expected to make it
easier to write common code patterns.
Built-in ``maybe``
------------------
Haskell has a concept called `Maybe <https://wiki.haskell.org/Maybe>`_ that
encapsulates the idea of an optional value without relying on any special
keyword (e.g. ``null``) or any special instance (e.g. ``None``). In Haskell, the
purpose of ``Maybe`` is to avoid separate handling of "something" and "nothing".
A Python package called `pymaybe <https://pypi.org/p/pymaybe/>`_ provides a
rough approximation. The documentation shows the following example::
>>> maybe('VALUE').lower()
'value'
>>> maybe(None).invalid().method().or_else('unknown')
'unknown'
The function ``maybe()`` returns either a ``Something`` instance or a
``Nothing`` instance. Similar to the unary postfix operator described in the
previous section, ``Nothing`` overrides dunder methods in order to allow
chaining on a missing value.
Note that ``or_else()`` is eventually required to retrieve the underlying value
from ``pymaybe``'s wrappers. Furthermore, ``pymaybe`` does not short-circuit any
evaluation. Although ``pymaybe`` has some strengths and may be useful in its own
right, it also demonstrates why a pure Python implementation of coalescing is
not nearly as powerful as support built into the language.
The idea of adding a builtin ``maybe`` type to enable this scenario is rejected.
Just use a conditional expression
---------------------------------
Another common way to initialize default values is to use the ternary operator.
Here is an excerpt from the popular `Requests package
<https://github.com/kennethreitz/requests/blob/14a555ac716866678bf17e43e23230d81
a8149f5/requests/models.py#L212>`_::
data = [] if data is None else data
files = [] if files is None else files
headers = {} if headers is None else headers
params = {} if params is None else params
hooks = {} if hooks is None else hooks
This particular formulation has the undesirable effect of putting the operands
in an unintuitive order: the brain thinks, "use ``data`` if possible and use
``[]`` as a fallback," but the code puts the fallback *before* the preferred
value.
The author of this package could have written it like this instead::
data = data if data is not None else []
files = files if files is not None else []
headers = headers if headers is not None else {}
params = params if params is not None else {}
hooks = hooks if hooks is not None else {}
This ordering of the operands is more intuitive, but it requires 4 extra
characters (for "not "). It also highlights the repetition of identifiers:
``data if data``, ``files if files``, etc.
When written using the ``None`` coalescing operator, the sample reads::
data = data ?? []
files = files ?? []
headers = headers ?? {}
params = params ?? {}
hooks = hooks ?? {}
References
==========
.. [1] C# Reference: Operators
(https://msdn.microsoft.com/en-us/library/6a71f45d.aspx)
.. [2] A Tour of the Dart Language: Operators
(https://www.dartlang.org/docs/dart-up-and-running/ch02.html#operators)
.. [3] Associated scripts
(https://github.com/python/peps/tree/master/pep-0505/)
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

File diff suppressed because it is too large

View File

@ -3,7 +3,7 @@ Title: Adding A Secrets Module To The Standard Library
Version: $Revision$ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Steven D'Aprano <steve@pearwood.info> Author: Steven D'Aprano <steve@pearwood.info>
Status: Accepted Status: Final
Type: Standards Track Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 19-Sep-2015 Created: 19-Sep-2015

View File

@ -359,7 +359,7 @@ of ``PythonCore`` is omitted but shown in a later example::
(winreg.HKEY_LOCAL_MACHINE, r'Software\Python', winreg.KEY_WOW64_32KEY), (winreg.HKEY_LOCAL_MACHINE, r'Software\Python', winreg.KEY_WOW64_32KEY),
]: ]:
with winreg.OpenKeyEx(hive, key, access=winreg.KEY_READ | flags) as root_key: with winreg.OpenKeyEx(hive, key, access=winreg.KEY_READ | flags) as root_key:
for comany in enum_keys(root_key): for company in enum_keys(root_key):
if company == 'PyLauncher': if company == 'PyLauncher':
continue continue
@ -478,4 +478,4 @@ References
Copyright Copyright
========= =========
This document has been placed in the public domain. This document has been placed in the public domain.

View File

@ -6,7 +6,7 @@ Author: Nathaniel J. Smith <njs@pobox.com>,
Thomas Kluyver <thomas@kluyver.me.uk> Thomas Kluyver <thomas@kluyver.me.uk>
BDFL-Delegate: Nick Coghlan <ncoghlan@gmail.com> BDFL-Delegate: Nick Coghlan <ncoghlan@gmail.com>
Discussions-To: <distutils-sig@python.org> Discussions-To: <distutils-sig@python.org>
Status: Accepted Status: Provisional
Type: Standards Track Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 30-Sep-2015 Created: 30-Sep-2015

View File

@ -7,8 +7,8 @@ Author: Brett Cannon <brett@python.org>,
Donald Stufft <donald@stufft.io> Donald Stufft <donald@stufft.io>
BDFL-Delegate: Nick Coghlan BDFL-Delegate: Nick Coghlan
Discussions-To: distutils-sig <distutils-sig at python.org> Discussions-To: distutils-sig <distutils-sig at python.org>
Status: Accepted Status: Provisional
Type: Informational Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 10-May-2016 Created: 10-May-2016
Post-History: 10-May-2016, Post-History: 10-May-2016,
@ -129,24 +129,63 @@ of requirements for the build system to simply begin execution.
Specification Specification
============= =============
File Format
-----------
The build system dependencies will be stored in a file named The build system dependencies will be stored in a file named
``pyproject.toml`` that is written in the TOML format [#toml]_. This ``pyproject.toml`` that is written in the TOML format [#toml]_.
format was chosen as it is human-usable (unlike JSON [#json]_), it is
flexible enough (unlike configparser [#configparser]_), stems from a This format was chosen as it is human-usable (unlike JSON [#json]_),
standard (also unlike configparser [#configparser]_), and it is not it is flexible enough (unlike configparser [#configparser]_), stems
overly complex (unlike YAML [#yaml]_). The TOML format is already in from a standard (also unlike configparser [#configparser]_), and it
use by the Rust community as part of their is not overly complex (unlike YAML [#yaml]_). The TOML format is
already in use by the Rust community as part of their
Cargo package manager [#cargo]_ and in private email stated they have Cargo package manager [#cargo]_ and in private email stated they have
been quite happy with their choice of TOML. A more thorough been quite happy with their choice of TOML. A more thorough
discussion as to why various alternatives were not chosen can be read discussion as to why various alternatives were not chosen can be read
in the `Other file formats`_ section. in the `Other file formats`_ section.
There will be a ``[build-system]`` table in the Tables not specified in this PEP are reserved for future use by other
configuration file to store build-related data. Initially only one key PEPs.
of the table will be valid and mandatory: ``requires``. That key will
have a value of a list of strings representing the PEP 508 build-system table
dependencies required to execute the build system (currently that ------------------
means what dependencies are required to execute a ``setup.py`` file).
The ``[build-system]`` table is used to store build-related data.
Initially only one key of the table will be valid and mandatory:
``requires``. This key must have a value of a list of strings
representing PEP 508 dependencies required to execute the build
system (currently that means what dependencies are required to
execute a ``setup.py`` file).
For the vast majority of Python projects that rely upon setuptools,
the ``pyproject.toml`` file will be::
[build-system]
# Minimum requirements for the build system to execute.
requires = ["setuptools", "wheel"] # PEP 508 specifications.
Because the use of setuptools and wheel are so expansive in the
community at the moment, build tools are expected to use the example
configuration file above as their default semantics when a
``pyproject.toml`` file is not present.
tool table
----------
The ``[tool]`` table is where tools can have users specify
configuration data as long as they use a sub-table within ``[tool]``,
e.g. the `flit <https://pypi.python.org/pypi/flit>`_ tool would store
its configuration in ``[tool.flit]``.
We need some mechanism to allocate names within the ``tool.*``
namespace, to make sure that different projects don't attempt to use
the same sub-table and collide. Our rule is that a project can use
the subtable ``tool.$NAME`` if, and only if, they own the entry for
``$NAME`` in the Cheeseshop/PyPI.
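For example (an illustrative sketch; the key shown is made up, since each
tool defines its own settings)::

    [tool.flit]
    # flit may claim this sub-table because the flit project owns the
    # "flit" name on PyPI.
    some-setting = "value"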
JSON Schema
-----------
To provide a type-specific representation of the resulting data from To provide a type-specific representation of the resulting data from
the TOML file for illustrative purposes only, the following JSON the TOML file for illustrative purposes only, the following JSON
@ -180,31 +219,6 @@ Schema [#jsonschema]_ would match the data format::
} }
} }
For the vast majority of Python projects that rely upon setuptools,
the ``pyproject.toml`` file will be::
[build-system]
# Minimum requirements for the build system to execute.
requires = ["setuptools", "wheel"] # PEP 508 specifications.
Because the use of setuptools and wheel are so expansive in the
community at the moment, build tools are expected to use the example
configuration file above as their default semantics when a
``pyproject.toml`` file is not present.
All other top-level keys and tables are reserved for future use by
other PEPs except for the ``[tool]`` table. Within that table, tools
can have users specify configuration data as long as they use a
sub-table within ``[tool]``, e.g. the
`flit <https://pypi.python.org/pypi/flit>`_ tool would store its
configuration in ``[tool.flit]``.
We need some mechanism to allocate names within the ``tool.*``
namespace, to make sure that different projects don't attempt to use
the same sub-table and collide. Our rule is that a project can use
the subtable ``tool.$NAME`` if, and only if, they own the entry for
``$NAME`` in the Cheeseshop/PyPI.
Rejected Ideas Rejected Ideas
============== ==============
@ -255,6 +269,44 @@ vendored easily by projects. This outright excluded certain formats
like XML which are not friendly towards human beings and were never like XML which are not friendly towards human beings and were never
seriously discussed. seriously discussed.
Overview of file formats considered
'''''''''''''''''''''''''''''''''''
The key reasons for rejecting the other alternatives considered are
summarised in the following sections, while the full review (including
positive arguments in favour of TOML) can be found at [#file_formats]_.
TOML was ultimately selected as it provided all the features we
were interested in, while avoiding the downsides introduced by
the alternatives.
======================= ==== ==== ==== =======
Feature                 TOML YAML JSON CFG/INI
======================= ==== ==== ==== =======
Well-defined            yes  yes  yes
Real data types         yes  yes  yes
Reliable Unicode        yes  yes  yes
Reliable comments       yes  yes
Easy for humans to edit yes  ??        ??
Easy for tools to edit  yes  ??   yes  ??
In standard library               yes  yes
Easy for pip to vendor  yes  n/a  n/a
======================= ==== ==== ==== =======
("??" in the table indicates items where most folks would be
inclined to answer "yes", but there turn out to be a lot of
quirks and edge cases that arise in practice due to either
the lack of a clear specification, or else the underlying
file format specification being surprisingly complicated)
The ``pytoml`` TOML parser is ~300 lines of pure Python code,
so being outside the standard library didn't count heavily
against it.
Python literals were also discussed as a potential format, but
weren't considered in the file format review (since they're not
a common pre-existing file format).
JSON JSON
'''' ''''
@ -375,6 +427,17 @@ An example Python literal file for the proposed data would be::
} }
Sticking with ``setup.cfg``
---------------------------
There are two issues with ``setup.cfg`` used by setuptools as a general
format. One is that it is an ``.ini`` file, which has the issues mentioned
in the configparser_ discussion above. The other is that the schema for
that file has never been rigorously defined and thus it's unknown which
format would be safe to use going forward without potentially confusing
setuptools installations.
Other file names Other file names
---------------- ----------------
@ -473,6 +536,9 @@ References
.. [#jsonschema] JSON Schema .. [#jsonschema] JSON Schema
(http://json-schema.org/) (http://json-schema.org/)
.. [#file_formats] Nathaniel J. Smith's file format review
(https://gist.github.com/njsmith/78f68204c5d969f8c8bc645ef77d4a8f)
Copyright Copyright

View File

@ -365,7 +365,8 @@ and local variables should have a single space after corresponding colon.
There should be no space before the colon. If an assignment has right hand There should be no space before the colon. If an assignment has right hand
side, then the equality sign should have exactly one space on both sides. side, then the equality sign should have exactly one space on both sides.
Examples: Examples:
* Yes::
- Yes::
code: int code: int
@ -373,7 +374,7 @@ Examples:
coords: Tuple[int, int] coords: Tuple[int, int]
label: str = '<unknown>' label: str = '<unknown>'
* No:: - No::
code:int # No space after colon code:int # No space after colon
code : int # Space before colon code : int # Space before colon

View File

@ -34,7 +34,7 @@ Release Manager and Crew
3.7 Lifespan 3.7 Lifespan
============ ============
3.7 will receive bugfix updates approximately every 3-6 months for 3.7 will receive bugfix updates approximately every 1-3 months for
approximately 18 months. After the release of 3.8.0 final, a final approximately 18 months. After the release of 3.8.0 final, a final
3.7 bugfix update will be released. After that, it is expected that 3.7 bugfix update will be released. After that, it is expected that
security updates (source only) will be released until 5 years after security updates (source only) will be released until 5 years after
@ -56,15 +56,21 @@ Actual:
- 3.7.0 alpha 4: 2018-01-09 - 3.7.0 alpha 4: 2018-01-09
- 3.7.0 beta 1: 2018-01-31 - 3.7.0 beta 1: 2018-01-31
(No new features beyond this point.) (No new features beyond this point.)
- 3.7.0 beta 2: 2018-02-27
- 3.7.0 beta 3: 2018-03-29
- 3.7.0 beta 4: 2018-05-02
- 3.7.0 beta 5: 2018-05-30
- 3.7.0 candidate 1: 2018-06-12
- 3.7.0 final: 2018-06-27
Expected: Expected:
- 3.7.0 beta 2: 2018-02-26 Maintenance releases
- 3.7.0 beta 3: 2018-03-26 --------------------
- 3.7.0 beta 4: 2018-04-30
- 3.7.0 candidate 1: 2018-05-21 Expected:
- 3.7.0 candidate 2: 2018-06-04 (if necessary)
- 3.7.0 final: 2018-06-15 - 3.7.1: 2018-07-xx
Features for 3.7 Features for 3.7

View File

@ -97,6 +97,9 @@ with a runtime ``PYTHONCOERCECLOCALE=warn`` environment variable setting
that allows developers and system integrators to opt-in to receiving locale that allows developers and system integrators to opt-in to receiving locale
coercion and compatibility warnings, without emitting them by default. coercion and compatibility warnings, without emitting them by default.
The output examples in the PEP itself have also been updated to remove
the warnings and make them easier to read.
Background Background
========== ==========
@ -352,10 +355,12 @@ proposed solution:
PEP process or Python release announcements. However, to minimize the chance PEP process or Python release announcements. However, to minimize the chance
of introducing new problems for end users, we'll do this *without* using the of introducing new problems for end users, we'll do this *without* using the
warnings system, so even running with ``-Werror`` won't turn it into a runtime warnings system, so even running with ``-Werror`` won't turn it into a runtime
exception. exception. (Note: these warnings ended up being silenced by default. See the
Implementation Note above for more details)
* for Python 3.7, any changed defaults will offer some form of explicit "off" * for Python 3.7, any changed defaults will offer some form of explicit "off"
switch at build time, runtime, or both switch at build time, runtime, or both
Minimizing the negative impact on systems currently correctly configured to Minimizing the negative impact on systems currently correctly configured to
use GB-18030 or another partially ASCII compatible universal encoding leads to use GB-18030 or another partially ASCII compatible universal encoding leads to
the following design principle: the following design principle:
@ -459,6 +464,9 @@ successfully configured::
Python detected LC_CTYPE=C: LC_CTYPE coerced to C.UTF-8 (set another Python detected LC_CTYPE=C: LC_CTYPE coerced to C.UTF-8 (set another
locale or PYTHONCOERCECLOCALE=0 to disable this locale coercion behaviour). locale or PYTHONCOERCECLOCALE=0 to disable this locale coercion behaviour).
(Note: this warning ended up being silenced by default. See the
Implementation Note above for more details)
As long as the current platform provides at least one of the candidate UTF-8 As long as the current platform provides at least one of the candidate UTF-8
based environments, this locale coercion will mean that the standard based environments, this locale coercion will mean that the standard
Python binary *and* locale-aware extensions should once again "just work" Python binary *and* locale-aware extensions should once again "just work"
@ -508,6 +516,9 @@ configured locale is still the default ``C`` locale and
C.utf8, or UTF-8 (if available) as alternative Unicode-compatible C.utf8, or UTF-8 (if available) as alternative Unicode-compatible
locales is recommended. locales is recommended.
(Note: this warning ended up being silenced by default. See the
Implementation Note above for more details)
In this case, no actual change will be made to the locale settings. In this case, no actual change will be made to the locale settings.
Instead, the warning informs both system and application integrators that Instead, the warning informs both system and application integrators that
@ -535,6 +546,10 @@ The locale warning behaviour would be controlled by the flag
``--with[out]-c-locale-warning``, which would set the ``PY_WARN_ON_C_LOCALE`` ``--with[out]-c-locale-warning``, which would set the ``PY_WARN_ON_C_LOCALE``
preprocessor definition. preprocessor definition.
(Note: this compile time warning option ended up being replaced by a runtime
``PYTHONCOERCECLOCALE=warn`` option. See the Implementation Note above for
more details)
On platforms which don't use the ``autotools`` based build system (i.e. On platforms which don't use the ``autotools`` based build system (i.e.
Windows) these preprocessor variables would always be undefined. Windows) these preprocessor variables would always be undefined.
@ -925,8 +940,6 @@ cover, as it avoids causing any problems in cases like the following::
$ LANG=C LC_MONETARY=ja_JP.utf8 ./python -c \ $ LANG=C LC_MONETARY=ja_JP.utf8 ./python -c \
"from locale import setlocale, LC_ALL, currency; setlocale(LC_ALL, ''); print(currency(1e6))" "from locale import setlocale, LC_ALL, currency; setlocale(LC_ALL, ''); print(currency(1e6))"
Python detected LC_CTYPE=C: LC_CTYPE & LANG coerced to C.UTF-8 (set another
locale or PYTHONCOERCECLOCALE=0 to disable this locale coercion behavior).
¥1000000 ¥1000000
@ -966,9 +979,6 @@ from a PEP 538 enabled CPython build, where each line after the first is
executed by doing "up-arrow, left-arrow x4, delete, enter":: executed by doing "up-arrow, left-arrow x4, delete, enter"::
$ LANG=C ./python $ LANG=C ./python
Python detected LC_CTYPE=C: LC_CTYPE & LANG coerced to C.UTF-8 (set
another locale or PYTHONCOERCECLOCALE=0 to disable this locale
coercion behavior).
Python 3.7.0a0 (heads/pep538-coerce-c-locale:188e780, May 7 2017, 00:21:13) Python 3.7.0a0 (heads/pep538-coerce-c-locale:188e780, May 7 2017, 00:21:13)
[GCC 6.3.1 20161221 (Red Hat 6.3.1-1)] on linux [GCC 6.3.1 20161221 (Red Hat 6.3.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information. Type "help", "copyright", "credits" or "license" for more information.
@ -1064,7 +1074,7 @@ Accordingly, this PEP originally proposed to disable locale coercion and
warnings at build time for these platforms, on the assumption that it would warnings at build time for these platforms, on the assumption that it would
be entirely redundant. be entirely redundant.
However, that assumpion turned out to be incorrect assumption, as subsequent However, that assumption turned out to be incorrect, as subsequent
investigations showed that if you explicitly configure ``LANG=C`` on investigations showed that if you explicitly configure ``LANG=C`` on
these platforms, extension modules like GNU readline will misbehave in much the these platforms, extension modules like GNU readline will misbehave in much the
same way as they do on other \*nix systems. [21_] same way as they do on other \*nix systems. [21_]

View File

@ -4,10 +4,11 @@ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Erik M. Bray, Masayuki Yamamoto Author: Erik M. Bray, Masayuki Yamamoto
BDFL-Delegate: Nick Coghlan BDFL-Delegate: Nick Coghlan
Status: Accepted Status: Final
Type: Standards Track Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 20-Dec-2016 Created: 20-Dec-2016
Python-Version: 3.7
Post-History: 16-Dec-2016, 31-Aug-2017, 08-Sep-2017 Post-History: 16-Dec-2016, 31-Aug-2017, 08-Sep-2017
Resolution: https://mail.python.org/pipermail/python-dev/2017-September/149358.html Resolution: https://mail.python.org/pipermail/python-dev/2017-September/149358.html

View File

@ -18,7 +18,7 @@ Abstract
Add a new "UTF-8 Mode" to enhance Python's use of UTF-8. When UTF-8 Mode Add a new "UTF-8 Mode" to enhance Python's use of UTF-8. When UTF-8 Mode
is active, Python will: is active, Python will:
* use the ``utf-8`` encoding, irregardless of the locale currently set by * use the ``utf-8`` encoding, regardless of the locale currently set by
the current platform, and the current platform, and
* change the ``stdin`` and ``stdout`` error handlers to * change the ``stdin`` and ``stdout`` error handlers to
``surrogateescape``. ``surrogateescape``.
@ -163,7 +163,7 @@ The UTF-8 Mode has the same effect as locale coercion:
``surrogateescape``. ``surrogateescape``.
These changes only affect Python code. But the locale coercion has These changes only affect Python code. But the locale coercion has
addiditonal effects: the ``LC_CTYPE`` environment variable and the additional effects: the ``LC_CTYPE`` environment variable and the
``LC_CTYPE`` locale are set to a UTF-8 locale like ``C.UTF-8``. One side ``LC_CTYPE`` locale are set to a UTF-8 locale like ``C.UTF-8``. One side
effect is that non-Python code is also impacted by the locale coercion. effect is that non-Python code is also impacted by the locale coercion.
The two PEPs are complementary. The two PEPs are complementary.

View File

@ -3,12 +3,14 @@ Title: Package Index Name Retention
Version: $Revision$ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Łukasz Langa <lukasz@python.org> Author: Łukasz Langa <lukasz@python.org>
BDFL-Delegate: Donald Stufft <donald@stufft.io> BDFL-Delegate: Mark Mangoba <mmangoba@python.org>
Discussions-To: distutils-sig <distutils-sig@python.org> Discussions-To: distutils-sig <distutils-sig@python.org>
Status: Draft Status: Final
Type: Process Type: Process
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 12-January-2017 Created: 12-January-2017
Post-History:
Resolution: https://mail.python.org/pipermail/distutils-sig/2018-March/032089.html
Abstract Abstract
@ -36,6 +38,22 @@ This document aims to provide general guidelines for solving the
most typical cases of such conflicts. most typical cases of such conflicts.
Approval Process
================
As the application of this policy has potential legal ramifications for the
Python Software Foundation, the approval process used is more formal than that
used for most PEPs.
Rather than accepting the PEP directly, the assigned BDFL-Delegate will instead
recommend its acceptance to the PSF's Packaging Working Group. After
consultation with the PSF's General Counsel, adoption of the policy will then
be subject to a formal vote within the working group.
This formal approval process will be used for both initial adoption of the
policy, and for adoption of any future amendments.
Specification Specification
============= =============
@ -61,7 +79,9 @@ The use cases covered by this document are:
* resolving disputes over a name. * resolving disputes over a name.
* Invalid projects. * Invalid projects:
* projects subject to a claim of intellectual property infringement.
The proposed extension to the Terms of Use, as expressed in the The proposed extension to the Terms of Use, as expressed in the
Implementation section, will be published as a separate document on the Implementation section, will be published as a separate document on the
@ -112,7 +132,7 @@ are met:
* the project has been determined *abandoned* by the rules described * the project has been determined *abandoned* by the rules described
above; above;
* the candidate is able to demonstrate own failed attempts to contact * the candidate is able to demonstrate their own failed attempts to contact
the existing owner; the existing owner;
* the candidate is able to demonstrate improvements made on the * the candidate is able to demonstrate improvements made on the
candidate's own fork of the project; candidate's own fork of the project;
@ -137,7 +157,7 @@ of reusing the name when ALL of the following are met:
* the project has been determined *abandoned* by the rules described * the project has been determined *abandoned* by the rules described
above; above;
* the candidate is able to demonstrate own failed attempts to contact * the candidate is able to demonstrate their own failed attempts to contact
the existing owner; the existing owner;
* the candidate is able to demonstrate that the project suggested to * the candidate is able to demonstrate that the project suggested to
reuse the name already exists and meets notability requirements; reuse the name already exists and meets notability requirements;
@ -196,24 +216,53 @@ is considered invalid and will be removed from the Index:
The Package Index maintainers pre-emptively declare certain package The Package Index maintainers pre-emptively declare certain package
names as unavailable for security reasons. names as unavailable for security reasons.
If you find a project that you think might be considered invalid, create Intellectual property policy
a support request [7]_. Maintainers of the Package Index will review ----------------------------
the case.
It is the policy of Python Software Foundation and the Package Index
maintainers to be appropriately responsive to claims of intellectual
property infringement by third parties. It is not the policy of
the Python Software Foundation nor the Package Index maintainers
to pre-screen uploaded packages for any type of intellectual property
infringement.
Possibly-infringing packages should be reported to legal@python.org
and counsel to the Python Software Foundation will determine an
appropriate response. A package can be removed or transferred to a
new owner at the sole discretion of the Python Software Foundation to
address a claim of infringement.
A project published on the Package Index meeting ANY of the following
may be considered infringing and subject to removal from the Index
or transferral to a new owner:
* project contains unlicensed copyrighted material from a third party,
and is subject to a properly made claim under the DMCA;
* project uses a third party's trademark in a way not covered by
nominal or fair use guidelines;
* project clearly implicates a patented system or process, and is
the subject of a complaint; or
* project is subject to an active lawsuit.
In the event of a complaint for intellectual property infringement,
a copy of the complaint will be sent to the package owner. In some
cases, action may be taken by the Package Index maintainers before
the owner responds.
The role of the Python Software Foundation The role of the Python Software Foundation
------------------------------------------ ------------------------------------------
The Python Software Foundation [8]_ is the non-profit legal entity that The Python Software Foundation [7]_ is the non-profit legal entity that
provides the Package Index as a community service. provides the Package Index as a community service.
The Package Index maintainers can escalate issues covered by this The Package Index maintainers can escalate issues covered by this
document for resolution by the PSF Board if the matter is not clear document for resolution by the Packaging Workgroup if the matter is not clear
enough. Some decisions *require* additional judgement by the Board, enough. Some decisions *require* additional judgement by the Board,
especially in cases of Code of Conduct violations or legal claims. especially in cases of Code of Conduct violations or legal claims.
Decisions made by the Board are published as Resolutions [9]_. Recommendations made by the Board are sent to the Packaging Workgroup [8]_ for review.
The Board has the final say in any disputes covered by this document and The Packaging Workgroup has the final say in any disputes covered by this document and
can decide to reassign or remove a project from the Package Index after can decide to reassign or remove a project from the Package Index after
careful consideration even when not all requirements listed careful consideration even when not all requirements listed
here are met. here are met.
@ -266,7 +315,7 @@ References
(https://pypi.org/policy/terms-of-use/) (https://pypi.org/policy/terms-of-use/)
.. [2] The Python Package Index .. [2] The Python Package Index
(https://pypi.python.org/) (https://pypi.org/)
.. [3] The Comprehensive Perl Archive Network .. [3] The Comprehensive Perl Archive Network
(http://www.cpan.org/) (http://www.cpan.org/)
@ -280,14 +329,11 @@ References
.. [6] Python Community Code of Conduct .. [6] Python Community Code of Conduct
(https://www.python.org/psf/codeofconduct/) (https://www.python.org/psf/codeofconduct/)
.. [7] PyPI Support Requests .. [7] Python Software Foundation
(https://sourceforge.net/p/pypi/support-requests/)
.. [8] Python Software Foundation
(https://www.python.org/psf/) (https://www.python.org/psf/)
.. [9] PSF Board Resolutions .. [8] Python Packaging Working Group
(https://www.python.org/psf/records/board/resolutions/) (https://wiki.python.org/psf/PackagingWG/)
Copyright Copyright

View File

@ -199,8 +199,8 @@ approaches related to structural subtyping in Python and other languages:
Such behavior seems to be a perfect fit for both runtime and static behavior Such behavior seems to be a perfect fit for both runtime and static behavior
of protocols. As discussed in `rationale`_, we propose to add static support of protocols. As discussed in `rationale`_, we propose to add static support
for such behavior. In addition, to allow users to achieve such runtime for such behavior. In addition, to allow users to achieve such runtime
behavior for *user-defined* protocols a special ``@runtime`` decorator will behavior for *user-defined* protocols a special ``@runtime_checkable`` decorator
be provided, see detailed `discussion`_ below. will be provided, see detailed `discussion`_ below.
* TypeScript [typescript]_ provides support for user-defined classes and * TypeScript [typescript]_ provides support for user-defined classes and
interfaces. Explicit implementation declaration is not required and interfaces. Explicit implementation declaration is not required and
@ -381,8 +381,7 @@ Explicitly declaring implementation
To explicitly declare that a certain class implements a given protocol, To explicitly declare that a certain class implements a given protocol,
it can be used as a regular base class. In this case a class could use it can be used as a regular base class. In this case a class could use
default implementations of protocol members. ``typing.Sequence`` is a good default implementations of protocol members. Static analysis tools are
example of a protocol with useful default methods. Static analysis tools are
expected to automatically detect that a class implements a given protocol. expected to automatically detect that a class implements a given protocol.
So while it's possible to subclass a protocol explicitly, it's *not necessary* So while it's possible to subclass a protocol explicitly, it's *not necessary*
to do so for the sake of type-checking. to do so for the sake of type-checking.
@ -587,6 +586,30 @@ Continuing the previous example::
walk(tree) # OK, 'Tree[float]' is a subtype of 'Traversable' walk(tree) # OK, 'Tree[float]' is a subtype of 'Traversable'
Self-types in protocols
-----------------------
The self-types in protocols follow the corresponding specification
[self-types]_ of PEP 484. For example::
C = TypeVar('C', bound='Copyable')
class Copyable(Protocol):
def copy(self: C) -> C:
...
class One:
def copy(self) -> 'One':
...
T = TypeVar('T', bound='Other')
class Other:
def copy(self: T) -> T:
...
c: Copyable
c = One() # OK
c = Other() # Also OK
Using Protocols Using Protocols
=============== ===============
@ -665,14 +688,14 @@ classes. For example::
One can use multiple inheritance to define an intersection of protocols. One can use multiple inheritance to define an intersection of protocols.
Example:: Example::
from typing import Sequence, Hashable from typing import Iterable, Hashable
class HashableFloats(Sequence[float], Hashable, Protocol): class HashableFloats(Iterable[float], Hashable, Protocol):
pass pass
def cached_func(args: HashableFloats) -> float: def cached_func(args: HashableFloats) -> float:
... ...
cached_func((1, 2, 3)) # OK, tuple is both hashable and sequence cached_func((1, 2, 3)) # OK, tuple is both hashable and iterable
If this will prove to be a widely used scenario, then a special If this will prove to be a widely used scenario, then a special
intersection type construct could be added in future as specified by PEP 483, intersection type construct could be added in future as specified by PEP 483,
@ -740,8 +763,8 @@ aliases::
.. _discussion: .. _discussion:
``@runtime`` decorator and narrowing types by ``isinstance()`` ``@runtime_checkable`` decorator and narrowing types by ``isinstance()``
-------------------------------------------------------------- ------------------------------------------------------------------------
The default semantics is that ``isinstance()`` and ``issubclass()`` fail The default semantics is that ``isinstance()`` and ``issubclass()`` fail
for protocol types. This is in the spirit of duck typing -- protocols for protocol types. This is in the spirit of duck typing -- protocols
@ -752,38 +775,58 @@ However, it should be possible for protocol types to implement custom
instance and class checks when this makes sense, similar to how ``Iterable`` instance and class checks when this makes sense, similar to how ``Iterable``
and other ABCs in ``collections.abc`` and ``typing`` already do it, and other ABCs in ``collections.abc`` and ``typing`` already do it,
but this is limited to non-generic and unsubscripted generic protocols but this is limited to non-generic and unsubscripted generic protocols
(``Iterable`` is statically equivalent to ``Iterable[Any]`). (``Iterable`` is statically equivalent to ``Iterable[Any]``).
The ``typing`` module will define a special ``@runtime`` class decorator The ``typing`` module will define a special ``@runtime_checkable`` class decorator
that provides the same semantics for class and instance checks as for that provides the same semantics for class and instance checks as for
``collections.abc`` classes, essentially making them "runtime protocols":: ``collections.abc`` classes, essentially making them "runtime protocols"::
from typing import runtime, Protocol from typing import runtime_checkable, Protocol
@runtime @runtime_checkable
class Closable(Protocol): class SupportsClose(Protocol):
def close(self): def close(self):
... ...
assert isinstance(open('some/file'), Closable) assert isinstance(open('some/file'), SupportsClose)
Static type checkers will understand ``isinstance(x, Proto)`` and
``issubclass(C, Proto)`` for protocols defined with this decorator (as they
already do for ``Iterable`` etc.). Static type checkers will narrow types
after such checks to the type-erased ``Proto`` (i.e. with all variables
having type ``Any`` and all methods having type ``Callable[..., Any]``).
Note that ``isinstance(x, Proto[int])`` etc. will always fail in agreement
with PEP 484. Examples::
from typing import Iterable, Iterator, Sequence

def process(items: Iterable[int]) -> None:
    if isinstance(items, Iterator):
        ...  # 'items' has type 'Iterator[int]' here
    elif isinstance(items, Sequence[int]):
        ...  # Error! Can't use 'isinstance()' with subscripted protocols
Note that instance checks are not 100% reliable statically, this is why Note that instance checks are not 100% reliable statically, this is why
this behavior is opt-in, see section on `rejected`_ ideas for examples. this behavior is opt-in, see section on `rejected`_ ideas for examples.
The most that type checkers can do is to treat ``isinstance(obj, Iterator)``
roughly as a simpler way to write
``hasattr(obj, '__iter__') and hasattr(obj, '__next__')``. To minimize
the risks of this feature, the following rules are applied.
**Definitions**:
* *Data and non-data protocols*: A protocol is called a non-data protocol
  if it only contains methods as members (for example ``Sized``,
  ``Iterator``, etc). A protocol that contains at least one non-method member
  (like ``x: int``) is called a data protocol.
* *Unsafe overlap*: A type ``X`` is called unsafely overlapping with
a protocol ``P``, if ``X`` is not a subtype of ``P``, but it is a subtype
of the type erased version of ``P`` where all members have type ``Any``.
In addition, if at least one element of a union unsafely overlaps with
a protocol ``P``, then the whole union is unsafely overlapping with ``P``.
**Specification**:
* A protocol can be used as a second argument in ``isinstance()`` and
  ``issubclass()`` only if it is explicitly opted in via the
  ``@runtime_checkable`` decorator. This requirement exists because protocol
  checks are not type safe in the case of dynamically set attributes, and
  because type checkers can prove that an ``isinstance()`` check is safe only
  for a given class, not for all its subclasses.
* ``isinstance()`` can be used with both data and non-data protocols, while
  ``issubclass()`` can be used only with non-data protocols (see the sketch
  after this list). This restriction exists because some data attributes can
  be set on an instance in the constructor and this information is not always
  available on the class object.
* Type checkers should reject an ``isinstance()`` or ``issubclass()`` call, if
there is an unsafe overlap between the type of the first argument and
the protocol.
* Type checkers should be able to select a correct element from a union after
a safe ``isinstance()`` or ``issubclass()`` call. For narrowing from non-union
types, type checkers can use their best judgement (this is intentionally
unspecified, since a precise specification would require intersection types).
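As an illustration of these rules, consider the following sketch (it follows
the semantics specified above, with ``Protocol`` and ``runtime_checkable`` as
proposed in this PEP)::

    from typing import Protocol, runtime_checkable

    @runtime_checkable
    class Named(Protocol):           # data protocol: has a non-method member
        name: str

    @runtime_checkable
    class SupportsClose(Protocol):   # non-data protocol: methods only
        def close(self) -> None:
            ...

    class File:
        def __init__(self) -> None:
            self.name = 'data.bin'
        def close(self) -> None:
            ...

    isinstance(File(), Named)        # OK: data protocols support isinstance()
    issubclass(File, SupportsClose)  # OK: non-data protocols support issubclass()
    issubclass(File, Named)          # TypeError: 'name' is set only on
                                     # instances, so issubclass() is rejected
                                     # for data protocols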
Using Protocols in Python 2.7 - 3.5 Using Protocols in Python 2.7 - 3.5
@ -825,14 +868,12 @@ effects on the core interpreter and standard library except in the
a protocol or not. Add a class attribute ``_is_protocol = True`` a protocol or not. Add a class attribute ``_is_protocol = True``
if that is the case. Verify that a protocol class only has protocol if that is the case. Verify that a protocol class only has protocol
base classes in the MRO (except for object). base classes in the MRO (except for object).
* Implement ``@runtime`` that allows ``__subclasshook__()`` performing * Implement ``@runtime_checkable`` that allows ``__subclasshook__()``
structural instance and subclass checks as in ``collections.abc`` classes. performing structural instance and subclass checks as in ``collections.abc``
classes.
* All structural subtyping checks will be performed by static type checkers, * All structural subtyping checks will be performed by static type checkers,
such as ``mypy`` [mypy]_. No additional support for protocol validation will such as ``mypy`` [mypy]_. No additional support for protocol validation will
be provided at runtime. be provided at runtime.
* Classes ``Mapping``, ``MutableMapping``, ``Sequence``, and
``MutableSequence`` in ``collections.abc`` module will support structural
instance and subclass checks (like e.g. ``collections.abc.Iterable``).
Changes in the typing module Changes in the typing module
@ -849,8 +890,6 @@ The following classes in ``typing`` module will be protocols:
* ``Container`` * ``Container``
* ``Collection`` * ``Collection``
* ``Reversible`` * ``Reversible``
* ``Sequence``, ``MutableSequence``
* ``Mapping``, ``MutableMapping``
* ``ContextManager``, ``AsyncContextManager`` * ``ContextManager``, ``AsyncContextManager``
* ``SupportsAbs`` (and other ``Supports*`` classes) * ``SupportsAbs`` (and other ``Supports*`` classes)
@ -1026,11 +1065,10 @@ be considered "non-protocol". Therefore, it was decided to not introduce
"non-protocol" methods. "non-protocol" methods.
There is only one downside to this: it will require some boilerplate for There is only one downside to this: it will require some boilerplate for
implicit subtypes of ``Mapping`` and few other "large" protocols. But, this implicit subtypes of "large" protocols. But this doesn't apply to "built-in"
applies to few "built-in" protocols (like ``Mapping`` and ``Sequence``) and protocols, which are all "small" (i.e. have only a few abstract methods).
people are already subclassing them. Also, such style is discouraged for Also, such style is discouraged for user-defined protocols. It is recommended
user-defined protocols. It is recommended to create compact protocols and to create compact protocols and combine them.
combine them.
Make protocols interoperable with other approaches Make protocols interoperable with other approaches
@ -1103,7 +1141,7 @@ Another potentially problematic case is assignment of attributes
self.x = 0 self.x = 0
c = C() c = C()
isinstance(c1, P) # False isinstance(c, P) # False
c.initialize() c.initialize()
isinstance(c, P) # True isinstance(c, P) # True
@ -1149,7 +1187,7 @@ This was rejected for the following reasons:
ABCs from ``typing`` module. If we prohibit explicit subclassing of these ABCs from ``typing`` module. If we prohibit explicit subclassing of these
ABCs, then quite a lot of code will break. ABCs, then quite a lot of code will break.
* Convenience: There are existing protocol-like ABCs (that will be turned * Convenience: There are existing protocol-like ABCs (that may be turned
into protocols) that have many useful "mix-in" (non-abstract) methods. into protocols) that have many useful "mix-in" (non-abstract) methods.
For example in the case of ``Sequence`` one only needs to implement For example in the case of ``Sequence`` one only needs to implement
``__getitem__`` and ``__len__`` in an explicit subclass, and one gets ``__getitem__`` and ``__len__`` in an explicit subclass, and one gets
@ -1301,33 +1339,16 @@ confusions.
Backwards Compatibility Backwards Compatibility
======================= =======================
This PEP is almost fully backwards compatible. Few collection classes such as This PEP is fully backwards compatible.
``Sequence`` and ``Mapping`` will be turned into runtime protocols, therefore
results of ``isinstance()`` checks are going to change in some edge cases.
For example, a class that implements the ``Sequence`` protocol but does not
explicitly inherit from ``Sequence`` currently returns ``False`` in
corresponding instance and class checks. With this PEP implemented, such
checks will return ``True``.
Implementation Implementation
============== ==============
A working implementation of this PEP for ``mypy`` type checker is found on The ``mypy`` type checker fully supports protocols (modulo a few
GitHub repo at https://github.com/ilevkivskyi/mypy/tree/protocols, known bugs). This includes treating all the builtin protocols, such as
corresponding ``typeshed`` stubs for more flavor are found at ``Iterable`` structurally. The runtime implementation of protocols is
https://github.com/ilevkivskyi/typeshed/tree/protocols. Installation steps:: available in ``typing_extensions`` module on PyPI.
git clone --recurse-submodules https://github.com/ilevkivskyi/mypy/
cd mypy && git checkout protocols && cd typeshed
git remote add proto https://github.com/ilevkivskyi/typeshed
git fetch proto && git checkout proto/protocols
cd .. && git add typeshed && sudo python3 -m pip install -U .
The runtime implementation of protocols in ``typing`` module is
found at https://github.com/ilevkivskyi/typehinting/tree/protocols.
The version of ``collections.abc`` with structural behavior for mappings and
sequences is found at https://github.com/ilevkivskyi/cpython/tree/protocols.
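For example, one can experiment today along these lines (a sketch; note that
early releases of ``typing_extensions`` spelled the decorator ``@runtime``, so
exact names may vary by version)::

    # pip install typing_extensions
    from typing_extensions import Protocol, runtime_checkable

    @runtime_checkable
    class SupportsClose(Protocol):
        def close(self) -> None:
            ...

    class Resource:
        def close(self) -> None:
            print('closed')

    assert isinstance(Resource(), SupportsClose)  # structural runtime check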
References References
@ -1372,6 +1393,9 @@ References
.. [elsewhere] .. [elsewhere]
https://github.com/python/peps/pull/224 https://github.com/python/peps/pull/224
.. [self-types]
https://www.python.org/dev/peps/pep-0484/#annotating-instance-and-class-methods
Copyright Copyright
========= =========

View File

@ -609,7 +609,7 @@ References
(https://github.com/AFPy/python_doc_fr/graphs/contributors?from=2016-01-01&to=2016-12-31&type=c) (https://github.com/AFPy/python_doc_fr/graphs/contributors?from=2016-01-01&to=2016-12-31&type=c)
.. [15] Python-doc on Transifex .. [15] Python-doc on Transifex
(https://www.transifex.com/python-doc/) (https://www.transifex.com/python-doc/public/)
.. [16] French translation .. [16] French translation
(https://www.afpy.org/doc/python/) (https://www.afpy.org/doc/python/)

View File

@ -5,7 +5,7 @@ Last-Modified: $Date$
Author: Victor Stinner <victor.stinner@gmail.com>, Author: Victor Stinner <victor.stinner@gmail.com>,
Cory Benfield <cory@lukasa.co.uk>, Cory Benfield <cory@lukasa.co.uk>,
BDFL-Delegate: Benjamin Peterson <benjamin@python.org> BDFL-Delegate: Benjamin Peterson <benjamin@python.org>
Status: Accepted Status: Rejected
Type: Standards Track Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 30-May-2017 Created: 30-May-2017
@ -21,6 +21,15 @@ Backport the ssl.MemoryBIO and ssl.SSLObject classes from Python 3 to Python
2.7 to enhance the overall security of Python 2.7. 2.7 to enhance the overall security of Python 2.7.
Rejection Notice
================
This PEP is rejected, see `Withdraw PEP 546? Backport ssl.MemoryBIO and
ssl.SSLObject to Python 2.7
<https://mail.python.org/pipermail/python-dev/2018-May/153760.html>`_
discussion for the rationale.
Rationale Rationale
========= =========

View File

@ -4,7 +4,7 @@ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Marcel Plch <gmarcel.plch@gmail.com>, Author: Marcel Plch <gmarcel.plch@gmail.com>,
Petr Viktorin <encukou@gmail.com> Petr Viktorin <encukou@gmail.com>
Status: Draft Status: Deferred
Type: Standards Track Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 25-May-2017 Created: 25-May-2017
@ -12,6 +12,17 @@ Python-Version: 3.7
Post-History: Post-History:
Deferral Notice
===============
Cython -- the most important use case for this PEP and the only explicit
one -- is not ready for multi-phase initialization yet.
It keeps global state in C-level static variables.
See discussion at `Cython issue 1923`_.
The PEP is deferred until the situation changes.
Abstract Abstract
======== ========
@ -186,6 +197,7 @@ References
.. _GitHub: https://github.com/python/cpython/pull/1761 .. _GitHub: https://github.com/python/cpython/pull/1761
.. _Cython issue 1715: https://github.com/cython/cython/issues/1715 .. _Cython issue 1715: https://github.com/cython/cython/issues/1715
.. _Possible Future Extensions section: https://www.python.org/dev/peps/pep-0489/#possible-future-extensions .. _Possible Future Extensions section: https://www.python.org/dev/peps/pep-0489/#possible-future-extensions
.. _Cython issue 1923: https://github.com/cython/cython/pull/1923
Copyright Copyright

View File

@ -4,28 +4,37 @@ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Steve Dower <steve.dower@python.org> Author: Steve Dower <steve.dower@python.org>
Status: Draft Status: Draft
Type: Standards Track Type: Informational
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 23-Aug-2017 Created: 23-Aug-2017
Python-Version: 3.7 Python-Version: 3.7
Post-History: 24-Aug-2017 (security-sig), 28-Aug-2017 (python-dev) Post-History: 24-Aug-2017 (security-sig), 28-Aug-2017 (python-dev)
Relationship to PEP 578
=======================
This PEP has been split into two since its original posting.
See `PEP 578 <https://www.python.org/dev/peps/pep-0578/>`_ for the
auditing APIs proposed for addition to the next version of Python.
This is now an informational PEP, providing guidance to those planning
to integrate Python into their secure or audited environments.
Abstract Abstract
======== ========
This PEP describes additions to the Python API and specific behaviors This PEP describes the concept of security transparency and how it
for the CPython implementation that make actions taken by the Python applies to the Python runtime. Visibility into actions taken by the
runtime visible to security and auditing tools. The goals in order of runtime is invaluable in integrating Python into an otherwise secure
increasing importance are to prevent malicious use of Python, to detect and/or monitored environment.
and report on malicious use, and most importantly to detect attempts to
bypass detection. Most of the responsibility for implementation is
required from users, who must customize and build Python for their own
environment.
We propose two small sets of public APIs to enable users to reliably The audit hooks described in PEP-578 are an essential component in
build their copy of Python without having to modify the core runtime, detecting, identifying and analyzing misuse of Python. While the hooks
protecting future maintainability. We also discuss recommendations for themselves are neutral (in that not every reported event is inherently
users to help them develop and configure their copy of Python. misuse), they provide essential context to those who are responsible
for monitoring an overall system or network. With enough transparency,
attackers are no longer able to hide.
Background Background
========== ==========
@ -126,14 +135,14 @@ tools, most network access and DNS resolution, and attempts to create
and hide files or configuration settings on the local machine. and hide files or configuration settings on the local machine.
To summarize, defenders have a need to audit specific uses of Python in To summarize, defenders have a need to audit specific uses of Python in
order to detect abnormal or malicious usage. Currently, the Python order to detect abnormal or malicious usage. With PEP 578, the Python
runtime does not provide any ability to do this, which (anecdotally) has runtime gains the ability to provide this. The aim of this PEP is to
led to organizations switching to other languages. The aim of this PEP assist system administrators with deploying a security transparent
is to enable system administrators to deploy a security transparent copy version of Python that can integrate with their existing auditing and
of Python that can integrate with their existing auditing and protection protection systems.
systems.
On Windows, some specific features that may be enabled by this include: On Windows, some specific features that may be integrated through the
hooks added by PEP 578 include:
* Script Block Logging [3]_ * Script Block Logging [3]_
* DeviceGuard [4]_ * DeviceGuard [4]_
@ -151,7 +160,7 @@ On Linux, some specific features that may be integrated are:
* SELinux labels [13]_ * SELinux labels [13]_
* check execute bit on imported modules * check execute bit on imported modules
On macOS, some features that may be used with the expanded APIs are: On macOS, some features that may be integrated are:
* OpenBSM [10]_ * OpenBSM [10]_
* syslog [11]_ * syslog [11]_
@ -161,9 +170,6 @@ production machines is highly appealing to system administrators and
will make Python a more trustworthy dependency for application will make Python a more trustworthy dependency for application
developers. developers.
Overview of Changes
===================
True security transparency is not fully achievable by Python in True security transparency is not fully achievable by Python in
isolation. The runtime can audit as many events as it likes, but unless isolation. The runtime can audit as many events as it likes, but unless
the logs are reviewed and analyzed there is no value. Python may impose the logs are reviewed and analyzed there is no value. Python may impose
@ -173,340 +179,64 @@ implementations of certain security features, and organizations with the
resources to fully customize their runtime should be encouraged to do resources to fully customize their runtime should be encouraged to do
so. so.
The aim of these changes is to enable system administrators to integrate Summary Recommendations
Python into their existing security systems, without dictating what =======================
those systems look like or how they should behave. We propose two API
changes to enable this: an Audit Hook and Verified Open Hook. Both are
not set by default, and both require modifications to the entry point
binary to enable any functionality. For the purposes of validation and
example, we propose a new ``spython``/``spython.exe`` entry point
program that enables some basic functionality using these hooks.
**However, security-conscious organizations are expected to create their
own entry points to meet their own needs.**
Audit Hook These are discussed in greater detail in later sections, but are
---------- presented here to frame the overall discussion.
In order to achieve security transparency, an API is required to raise Sysadmins should provide and use an alternate entry point (besides
messages from within certain operations. These operations are typically ``python.exe`` or ``pythonX.Y``) in order to reduce surface area and
deep within the Python runtime or standard library, such as dynamic code securely enable audit hooks. A discussion of what could be restricted
compilation, module imports, DNS resolution, or use of certain modules is below in `Restricting the Entry Point`_.
such as ``ctypes``.
The new C APIs required for audit hooks are:: Sysadmins should use all available measures provided by their operating
system to prevent modifications to their Python installation, such as
file permissions, access control lists and signature validation.
# Add an auditing hook Sysadmins should log everything and collect logs to a central location
typedef int (*hook_func)(const char *event, PyObject *args, as quickly as possible - avoid keeping logs on outer-ring machines.
void *userData);
int PySys_AddAuditHook(hook_func hook, void *userData);
# Raise an event with all auditing hooks Sysadmins should prioritize *detection* of misuse over *prevention* of
int PySys_Audit(const char *event, PyObject *args); misuse.
# Internal API used during Py_Finalize() - not publicly accessible
void _Py_ClearAuditHooks(void);
The new Python APIs for audit hooks are::
# Add an auditing hook
sys.addaudithook(hook: Callable[str, tuple]) -> None
# Raise an event with all auditing hooks
sys.audit(str, *args) -> None
Hooks are added by calling ``PySys_AddAuditHook()`` from C at any time, Restricting the Entry Point
including before ``Py_Initialize()``, or by calling ===========================
``sys.addaudithook()`` from Python code. Hooks are never removed or
replaced, and existing hooks have an opportunity to refuse to allow new
hooks to be added (adding an audit hook is audited, and so preexisting
hooks can raise an exception to block the new addition).
When events of interest are occurring, code can either call One of the primary vulnerabilities exposed by the presence of Python
``PySys_Audit()`` from C (while the GIL is held) or ``sys.audit()``. The on a machine is the ability to execute arbitrary code without
string argument is the name of the event, and the tuple contains detection or verification by the system. This is made significantly
arguments. A given event name should have a fixed schema for arguments, easier because the default entry point (``python.exe`` on Windows and
and both arguments are considered a public API (for a given x.y version ``pythonX.Y`` on other platforms) allows execution from the command
of Python), and thus should only change between feature releases with line, from standard input, and does not have any hooks enabled by
updated documentation. default.
When an event is audited, each hook is called in the order it was added Our recommendation is that production machines should use a modified
with the event name and tuple. If any hook returns with an exception entry point instead of the default. Once outside of the development
set, later hooks are ignored and *in general* the Python runtime should environment, there is rarely a need for the flexibility offered by the
terminate. This is intentional to allow hook implementations to decide default entry point.
how to respond to any particular event. The typical responses will be to
log the event, abort the operation with an exception, or to immediately
terminate the process with an operating system exit call.
When an event is audited but no hooks have been set, the ``audit()`` In this section, we describe a hypothetical ``spython`` entry point
function should include minimal overhead. Ideally, each argument is a (``spython.exe`` on Windows; ``spythonX.Y`` on other platforms) that
reference to existing data rather than a value calculated just for the provides a level of security transparency recommended for production
auditing call. machines. An associated example implementation shows many of the
features described here, though with a number of concessions for the
sake of avoiding platform-specific code. A sufficient implementation
will inherently require some integration with platform-specific
security features.
As hooks may be Python objects, they need to be freed during Official distributions will not include any ``spython`` by default, but
``Py_Finalize()``. To do this, we add an internal API third party distributions may include appropriately modified entry
``_Py_ClearAuditHooks()`` that releases any ``PyObject*`` hooks that are points that use the same name.
held, as well as any heap memory used. This is an internal function with
no public export, but it triggers an event for all audit hooks to ensure
that unexpected calls are logged.
See `Audit Hook Locations`_ for proposed audit hook points and schemas, **Remove most command-line arguments**
and the `Recommendations`_ section for discussion on
appropriate responses.
Verified Open Hook
------------------
Most operating systems have a mechanism to distinguish between files
that can be executed and those that can not. For example, this may be an
execute bit in the permissions field, or a verified hash of the file
contents to detect potential code tampering. These are an important
security mechanism for preventing execution of data or code that is not
approved for a given environment. Currently, Python has no way to
integrate with these when launching scripts or importing modules.
The new public C API for the verified open hook is::
# Set the handler
typedef PyObject *(*hook_func)(PyObject *path)
int PyImport_SetOpenForImportHook(void *handler)
# Open a file using the handler
PyObject *PyImport_OpenForImport(const char *path)
The new public Python API for the verified open hook is::
# Open a file using the handler
_imp.open_for_import(path)
The ``_imp.open_for_import()`` function is a drop-in replacement for
``open(str(pathlike), 'rb')``. Its default behaviour is to open a file
for raw, binary access - any more restrictive behaviour requires the
use of a custom handler. Only ``str`` arguments are accepted.
A custom handler may be set by calling ``PyImport_SetOpenForImportHook()``
from C at any time, including before ``Py_Initialize()``. However, if a
hook has already been set then the call will fail. When
``open_for_import()`` is called with a hook set, the hook will be passed
the path and its return value will be returned directly. The returned
object should be an open file-like object that supports reading raw
bytes. This is explicitly intended to allow a ``BytesIO`` instance if
the open handler has already had to read the file into memory in order
to perform whatever verification is necessary to determine whether the
content is permitted to be executed.
Note that these hooks can import and call the ``_io.open()`` function on
CPython without triggering themselves.
If the hook determines that the file is not suitable for execution, it
should raise an exception of its choice, as well as raising any other
auditing events or notifications.
All import and execution functionality involving code from a file will
be changed to use ``open_for_import()`` unconditionally. It is important
to note that calls to ``compile()``, ``exec()`` and ``eval()`` do not go
through this function - an audit hook that includes the code from these
calls will be added and is the best opportunity to validate code that is
read from the file. Given the current decoupling between import and
execution in Python, most imported code will go through both
``open_for_import()`` and the log hook for ``compile``, and so care
should be taken to avoid repeating verification steps.
.. note::
The use of ``open_for_import()`` by ``importlib`` is a valuable
first defence, but should not be relied upon to prevent misuse. In
particular, it is easy to monkeypatch ``importlib`` in order to
bypass the call. Auditing hooks are the primary way to achieve
security transparency, and are essential for detecting attempts to
bypass other functionality.
API Availability
----------------
While all the functions added here are considered public and stable API,
the behavior of the functions is implementation specific. The
descriptions here refer to the CPython implementation, and while other
implementations should provide the functions, there is no requirement
that they behave the same.
For example, ``sys.addaudithook()`` and ``sys.audit()`` should exist but
may do nothing. This allows code to make calls to ``sys.audit()``
without having to test for existence, but it should not assume that its
call will have any effect. (Including existence tests in
security-critical code allows another vector to bypass auditing, so it
is preferable that the function always exist.)
``_imp.open_for_import(path)`` should at a minimum always return
``_io.open(path, 'rb')``. Code using the function should make no further
assumptions about what may occur, and implementations other than CPython
are not required to let developers override the behavior of this
function with a hook.
Audit Hook Locations
====================
Calls to ``sys.audit()`` or ``PySys_Audit()`` will be added to the
following operations with the schema in Table 1. Unless otherwise
specified, the ability for audit hooks to abort any listed operation
should be considered part of the rationale for including the hook.
.. csv-table:: Table 1: Audit Hooks
:header: "API Function", "Event Name", "Arguments", "Rationale"
:widths: 2, 2, 3, 6
``PySys_AddAuditHook``, ``sys.addaudithook``, "", "Detect when new
audit hooks are being added.
"
``_PySys_ClearAuditHooks``, ``sys._clearaudithooks``, "", "Notifies
hooks they are being cleaned up, mainly in case the event is
triggered unexpectedly. This event cannot be aborted.
"
``PyImport_SetOpenForImportHook``, ``setopenforimporthook``, "", "
Detects any attempt to set the ``open_for_import`` hook.
"
"``compile``, ``exec``, ``eval``, ``PyAst_CompileString``,
``PyAST_obj2mod``", ``compile``, "``(code, filename_or_none)``", "
Detect dynamic code compilation, where ``code`` could be a string or
AST. Note that this will be called for regular imports of source
code, including those that were opened with ``open_for_import``.
"
"``exec``, ``eval``, ``run_mod``", ``exec``, "``(code_object,)``", "
Detect dynamic execution of code objects. This only occurs for
explicit calls, and is not raised for normal function invocation.
"
``import``, ``import``, "``(module, filename, sys.path,
sys.meta_path, sys.path_hooks)``", "Detect when modules are
imported. This is raised before the module name is resolved to a
file. All arguments other than the module name may be ``None`` if
they are not used or available.
"
``code_new``, ``code.__new__``, "``(bytecode, filename, name)``", "
Detect dynamic creation of code objects. This only occurs for
direct instantiation, and is not raised for normal compilation.
"
``func_new_impl``, ``function.__new__``, "``(code,)``", "Detect
dynamic creation of function objects. This only occurs for direct
instantiation, and is not raised for normal compilation.
"
"``_ctypes.dlopen``, ``_ctypes.LoadLibrary``", ``ctypes.dlopen``, "
``(module_or_path,)``", "Detect when native modules are used.
"
``_ctypes._FuncPtr``, ``ctypes.dlsym``, "``(lib_object, name)``", "
Collect information about specific symbols retrieved from native
modules.
"
``_ctypes._CData``, ``ctypes.cdata``, "``(ptr_as_int,)``", "Detect
when code is accessing arbitrary memory using ``ctypes``.
"
``id``, ``id``, "``(id_as_int,)``", "Detect when code is accessing
the id of objects, which in CPython reveals information about
memory layout.
"
``sys._getframe``, ``sys._getframe``, "``(frame_object,)``", "Detect
when code is accessing frames directly.
"
``sys._current_frames``, ``sys._current_frames``, "", "Detect when
code is accessing frames directly.
"
``PyEval_SetProfile``, ``sys.setprofile``, "", "Detect when code is
injecting trace functions. Because of the implementation, exceptions
raised from the hook will abort the operation, but will not be
raised in Python code. Note that ``threading.setprofile`` eventually
calls this function, so the event will be audited for each thread.
"
``PyEval_SetTrace``, ``sys.settrace``, "", "Detect when code is
injecting trace functions. Because of the implementation, exceptions
raised from the hook will abort the operation, but will not be
raised in Python code. Note that ``threading.settrace`` eventually
calls this function, so the event will be audited for each thread.
"
``_PyEval_SetAsyncGenFirstiter``, ``sys.set_async_gen_firstiter``, "
", "Detect changes to async generator hooks.
"
``_PyEval_SetAsyncGenFinalizer``, ``sys.set_async_gen_finalizer``, "
", "Detect changes to async generator hooks.
"
``_PyEval_SetCoroutineWrapper``, ``sys.set_coroutine_wrapper``, "
", "Detect changes to the coroutine wrapper.
"
"``socket.bind``, ``socket.connect``, ``socket.connect_ex``,
``socket.getaddrinfo``, ``socket.getnameinfo``, ``socket.sendmsg``,
``socket.sendto``", ``socket.address``, "``(address,)``", "Detect
access to network resources. The address is unmodified from the
original call.
"
``socket.__init__``, "socket()", "``(family, type, proto)``", "
Detect creation of sockets. The arguments will be int values.
"
``socket.gethostname``, ``socket.gethostname``, "", "Detect attempts
to retrieve the current host name.
"
``socket.sethostname``, ``socket.sethostname``, "``(name,)``", "
Detect attempts to change the current host name. The name argument
is passed as a bytes object.
"
"``socket.gethostbyname``, ``socket.gethostbyname_ex``",
"``socket.gethostbyname``", "``(name,)``", "Detect host name
resolution. The name argument is a str or bytes object.
"
``socket.gethostbyaddr``, ``socket.gethostbyaddr``, "
``(address,)``", "Detect host resolution. The address argument is a
str or bytes object.
"
``socket.getservbyname``, ``socket.getservbyname``, "``(name,
protocol)``", "Detect service resolution. The arguments are str
objects.
"
"``socket.getservbyport``", ``socket.getservbyport``, "``(port,
protocol)``", "Detect service resolution. The port argument is an
int and protocol is a str.
"
"``member_get``, ``func_get_code``, ``func_get_[kw]defaults``
",``object.__getattr__``,"``(object, attr)``","Detect access to
restricted attributes. This event is raised for any built-in
members that are marked as restricted, and members that may allow
bypassing imports.
"
"``_PyObject_GenericSetAttr``, ``check_set_special_type_attr``,
``object_set_class``, ``func_set_code``, ``func_set_[kw]defaults``","
``object.__setattr__``","``(object, attr, value)``","Detect monkey
patching of types and objects. This event
is raised for the ``__class__`` attribute and any attribute on
``type`` objects.
"
"``_PyObject_GenericSetAttr``",``object.__delattr__``,"``(object,
attr)``","Detect deletion of object attributes. This event is raised
for any attribute on ``type`` objects.
"
"``Unpickler.find_class``",``pickle.find_class``,"``(module_name,
global_name)``","Detect imports and global name lookup when
unpickling.
"
"``array_new``",``array.__new__``,"``(typecode, initial_value)``", "
Detects creation of array objects.
"
TODO - more hooks in ``_socket``, ``_ssl``, others?
SPython Entry Point
===================
A new entry point binary will be added, called ``spython.exe`` on
Windows and ``spythonX.Y`` on other platforms. This entry point is
intended primarily as an example, as we expect most users of this
functionality to implement their own entry point and hooks (see
`Recommendations`_). It will also be used for tests.
Source builds will build ``spython`` by default, but distributions
should not include it except as a test binary. The python.org managed
binary distributions will not include ``spython``.
**Do not accept most command-line arguments**
The ``spython`` entry point requires a script file be passed as the The ``spython`` entry point requires a script file be passed as the
first argument, and does not allow any options. This prevents arbitrary first argument, and does not allow any options to precede it. This
code execution from in-memory data or non-script files (such as pickles, prevents arbitrary code execution from in-memory data or non-script
which can be executed using ``-m pickle <path>``). files (such as pickles, which could be executed using
``-m pickle <path>``).
Options ``-B`` (do not write bytecode), ``-E`` (ignore environment Options ``-B`` (do not write bytecode), ``-E`` (ignore environment
variables) and ``-s`` (no user site) are assumed. variables) and ``-s`` (no user site) are assumed.
@ -517,38 +247,57 @@ will be used to initialize ``sys.path`` following the rules currently
described `for Windows described `for Windows
<https://docs.python.org/3/using/windows.html#finding-modules>`_. <https://docs.python.org/3/using/windows.html#finding-modules>`_.
When built with ``Py_DEBUG``, the ``spython`` entry point will allow a For the sake of demonstration, the example implementation of
``-i`` option with no other arguments to enter into interactive mode, ``spython`` also allows the ``-i`` option to start in interactive mode.
with audit messages being written to standard error rather than a file. This is not recommended for restricted entry points.
This is intended for testing and debugging only.
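To make the argument policy concrete, here is a deliberately simplified,
illustrative stand-in written in Python; a real ``spython`` would be a native
launcher that installs its hooks before interpreter initialization, which pure
Python cannot replicate::

    import runpy
    import sys

    def main(argv):
        # Require a script path as the first argument; refuse options
        # and implicit execution from standard input.
        if not argv or argv[0].startswith('-') or not argv[0].endswith('.py'):
            sys.exit('usage: spython <script.py> [args...]')
        sys.argv[:] = argv
        runpy.run_path(argv[0], run_name='__main__')

    if __name__ == '__main__':
        main(sys.argv[1:])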
**Log security events to a file** **Log audited events**
Before initialization, ``spython`` will set an audit hook that writes Before initialization, ``spython`` sets an audit hook that writes all
events to a local file. By default, this file is the full path of the audited events to an OS-managed log file. On Windows, this is the Event
process with a ``.log`` suffix, but may be overridden with the Tracing functionality [7]_, and on other platforms they go to
``SPYTHONLOG`` environment variable (despite such overrides being syslog [11]_. Logs are copied from the machine as frequently as possible
explicitly discouraged in `Recommendations`_). to prevent loss of information should an attacker attempt to clear
local logs or prevent legitimate access to the machine.
The audit hook will also abort all ``sys.addaudithook`` events, The audit hook will also abort all ``sys.addaudithook`` events,
preventing any other hooks from being added. preventing any other hooks from being added.
The logging hook is written in native code and configured before the
interpreter is initialized. This is the only opportunity to ensure that
no Python code executes without auditing, and that Python code cannot
prevent registration of the hook.
Our primary aim is to record all actions taken by all Python processes,
so that detection may be performed offline against logged events.
Having all events recorded also allows for deeper analysis and the use
of machine learning algorithms. These are useful for detecting
persistent attacks, where the attacker is intending to remain within
the protected machines for some period of time, as well as for later
analysis to determine the impact and exposure caused by a successful
attack.
The example implementation of ``spython`` writes to a log file on the
local machine, for the sake of demonstration. When started with ``-i``,
the example implementation writes all audit events to standard error
instead of the log file. The ``SPYTHONLOG`` environment variable can be
used to specify the log file location.
**Restrict importable modules** **Restrict importable modules**
Also before initialization, ``spython`` will set an open-for-import Also before initialization, ``spython`` sets an open-for-import hook
hook that validates all files opened with ``os.open_for_import``. This that validates all files opened with ``os.open_for_import``. This
implementation will require all files to have a ``.py`` suffix (thereby implementation requires all files to have a ``.py`` suffix (preventing
blocking the use of cached bytecode), and will raise a custom audit the use of cached bytecode), and will raise a custom audit event
event ``spython.open_for_import`` containing ``(filename, ``spython.open_for_import`` containing ``(filename, True_if_allowed)``.
True_if_allowed)``.
On Windows, the hook will also open the file with flags that prevent any After opening the file, the entire contents is read into memory in a
other process from opening it with write access, which allows the hook single buffer and the file is closed.
to perform additional validation on the contents with confidence that it
will not be modified between the check and use. Compilation will later Compilation will later trigger a ``compile`` event, so there is no need
trigger a ``compile`` event, so there is no need to read the contents to validate the contents now using mechanisms that also apply to
now for AMSI, but other validation mechanisms such as DeviceGuard [4]_ dynamically generated code. However, if a whitelist of source files or
should be performed here. file hashes is available, then this is the point at which that check
should be applied.
**Restrict globals in pickles** **Restrict globals in pickles**
@ -556,35 +305,37 @@ The ``spython`` entry point will abort all ``pickle.find_class`` events
that use the default implementation. Overrides will not raise audit that use the default implementation. Overrides will not raise audit
events unless explicitly added, and so they will continue to be allowed. events unless explicitly added, and so they will continue to be allowed.
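Applications that need to keep unpickling under such a policy can route their
loads through an explicit whitelist, along the lines of the pattern documented
for the ``pickle`` module (the allowed names here are purely illustrative)::

    import io
    import pickle

    class RestrictedUnpickler(pickle.Unpickler):
        # Illustrative whitelist; extend as the application requires.
        ALLOWED = {('builtins', 'range'), ('collections', 'OrderedDict')}

        def find_class(self, module, name):
            if (module, name) in self.ALLOWED:
                return super().find_class(module, name)
            raise pickle.UnpicklingError(
                'global %s.%s is forbidden' % (module, name))

    def restricted_loads(data):
        return RestrictedUnpickler(io.BytesIO(data)).load()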
Performance Impact **Prevent os.system**
==================
The important performance impact is the case where events are being The ``spython`` entry point aborts all ``os.system`` calls.
raised but there are no hooks attached. This is the unavoidable case -
once a distributor or sysadmin begins adding audit hooks they have
explicitly chosen to trade performance for functionality. Performance
impact using ``spython`` or with hooks added are not of interest here,
since this is considered opt-in functionality.
Analysis using the ``performance`` tool shows no significant impact, It should be noted here that ``subprocess.Popen(shell=True)`` is
with the vast majority of benchmarks showing between 1.05x faster to allowed (though logged via the platform-specific process creation
1.05x slower. events). This tradeoff is made because it is much simpler to induce a
running application to call ``os.system`` with a single string argument
than a function with multiple arguments, and so it is more likely to be
used as part of an exploit. There is also little justification for
using ``os.system`` in production code, while ``subprocess.Popen`` has
a large number of legitimate uses; still, logs indicating the use of
the ``shell=True`` argument should be more carefully scrutinised.
In our opinion, the performance impact of the set of auditing points Sysadmins are encouraged to make these kinds of tradeoffs between
described in this PEP is negligible. restriction and detection, and generally should prefer detection.
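As a rough illustration, the ``sys.addaudithook`` API from PEP 578 (added to
Python in 3.8) permits a sketch like the following; a hardened entry point
would install an equivalent hook from native code before initialization::

    import sys

    def hook(event, args):
        # Log first, so the event is recorded even when it is aborted.
        print('audit: %s %r' % (event, args), file=sys.stderr)
        if event in ('os.system', 'sys.addaudithook'):
            # Raising from a hook aborts the audited operation; blocking
            # 'sys.addaudithook' prevents later hooks from being added.
            raise RuntimeError('%s is not permitted' % event)

    sys.addaudithook(hook)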
Recommendations General Recommendations
=============== =======================
Specific recommendations are difficult to make, as the ideal Recommendations beyond those suggested in the previous section are
configuration for any environment will depend on the user's ability to difficult, as the ideal configuration for any environment depends on
manage, monitor, and respond to activity on their own network. However, the sysadmin's ability to manage, monitor, and respond to activity on
many of the proposals here do not appear to be of value without deeper their own network. Nonetheless, here we attempt to provide some context
illustration. This section provides recommendations using the terms and guidance for integrating Python into a complete system.
**should** (or **should not**), indicating that we consider it dangerous
to ignore the advice, and **may**, indicating that the advice ought This section provides recommendations using the terms **should** (or
to be considered for high value systems. The term **sysadmins** refers **should not**), indicating that we consider it risky to ignore the
to whoever is responsible for deploying Python throughout your network; advice, and **may**, indicating that the advice ought to be
considered for high value systems. The term **sysadmin** refers to
whoever is responsible for deploying Python throughout the network;
different organizations may have an alternative title for the different organizations may have an alternative title for the
responsible people. responsible people.
@ -666,73 +417,30 @@ Since ``importlib``'s use of ``open_for_import`` may be easily bypassed
with monkeypatching, an audit hook **should** be used to detect with monkeypatching, an audit hook **should** be used to detect
attribute changes on type objects. attribute changes on type objects.
[TODO: more good advice; less bad advice] Rejected Advice
===============
Rejected Ideas This section discusses common or "obviously good" recommendations that
============== we are specifically *not* making. These range from useless or incorrect
through to ideas that are simply not feasible in any real world
environment.
Separate module for audit hooks **Do not** attempt to implement a sandbox within the Python runtime.
------------------------------- There is a long history of attempts to allow arbitrary code limited use
of Python features (such as [14]_), but no general success. The best
options are to run unrestricted Python within a sandboxed environment
with at least hypervisor-level isolation, or to prevent unauthorised
code from starting at all.
The proposal is to add a new module for audit hooks, hypothetically **Do not** rely on static analysis to verify untrusted code before use.
``audit``. This would separate the API and implementation from the The best options are to pre-authorise trusted code, such as with code
``sys`` module, and allow naming the C functions ``PyAudit_AddHook`` and signing, and, if that is not possible, to identify known-bad code,
``PyAudit_Audit`` rather than the current variations. such as with an anti-malware scanner.
Any such module would need to be a built-in module that is guaranteed to **Do not** use audit hooks to abort operations without logging the
always be present. The nature of these hooks is that they must be event first. You will regret not knowing why your process disappeared.
callable without condition, as any conditional imports or calls provide
more opportunities to intercept and suppress or modify events.
Given its nature as one of the most core modules, the ``sys`` module is [TODO - more bad advice]
somewhat protected against module shadowing attacks. Replacing ``sys``
with a sufficiently functional module that the application can still run
is a much more complicated task than replacing a module with only one
function of interest. An attacker that has the ability to shadow the
``sys`` module is already capable of running arbitrary code from files,
whereas an ``audit`` module can be replaced with a single statement::
import sys; sys.modules['audit'] = type('audit', (object,),
{'audit': lambda *a: None, 'addhook': lambda *a: None})
Multiple layers of protection already exist for monkey patching attacks
against either ``sys`` or ``audit``, but assignments or insertions to
``sys.modules`` are not audited.
This idea is rejected because it makes substituting ``audit`` calls
throughout all callers near trivial.
Flag in sys.flags to indicate "secure" mode
-------------------------------------------
The proposal is to add a value in ``sys.flags`` to indicate when Python
is running in a "secure" mode. This would allow applications to detect
when some features are enabled and modify their behaviour appropriately.
Currently there are no guarantees made about security by this PEP - this
section is the first time the word "secure" has been used. Security
**transparency** does not result in any changed behaviour, so there is
no appropriate reason for applications to modify their behaviour.
Both application-level APIs ``sys.audit`` and ``_imp.open_for_import``
are always present and functional, regardless of whether the regular
``python`` entry point or some alternative entry point is used. Callers
cannot determine whether any hooks have been added (except by performing
side-channel analysis), nor do they need to. The calls should be fast
enough that callers do not need to avoid them, and the sysadmin is
responsible for ensuring their added hooks are fast enough to not affect
application performance.
The argument that this is "security by obscurity" is valid, but
irrelevant. Security by obscurity is only an issue when there are no
other protective mechanisms; obscurity as the first step in avoiding
attack is strongly recommended (see `this article
<https://danielmiessler.com/study/security-by-obscurity/>`_ for
discussion).
This idea is rejected because there are no appropriate reasons for an
application to change its behaviour based on whether these APIs are in
use.
Further Reading Further Reading
=============== ===============
@ -789,7 +497,7 @@ References
.. [4] `<https://aka.ms/deviceguard>`_ .. [4] `<https://aka.ms/deviceguard>`_
.. [5] AMSI, `<https://msdn.microsoft.com/en-us/library/windows/desktop/dn889587(v=vs.85).aspx>`_ .. [5] Antimalware Scan Interface, `<https://msdn.microsoft.com/en-us/library/windows/desktop/dn889587(v=vs.85).aspx>`_
.. [6] Persistent Zone Identifiers, `<https://msdn.microsoft.com/en-us/library/ms537021(v=vs.85).aspx>`_ .. [6] Persistent Zone Identifiers, `<https://msdn.microsoft.com/en-us/library/ms537021(v=vs.85).aspx>`_
@ -807,6 +515,8 @@ References
.. [13] SELinux access decisions `<http://man7.org/linux/man-pages/man3/avc_entry_ref_init.3.html>`_ .. [13] SELinux access decisions `<http://man7.org/linux/man-pages/man3/avc_entry_ref_init.3.html>`_
.. [14] The failure of pysandbox `<https://lwn.net/Articles/574215/>`_
Acknowledgments Acknowledgments
=============== ===============
@ -820,7 +530,7 @@ discussions.
Copyright Copyright
========= =========
Copyright (c) 2017 by Microsoft Corporation. This material may be Copyright (c) 2017-2018 by Microsoft Corporation. This material may be
distributed only subject to the terms and conditions set forth in the distributed only subject to the terms and conditions set forth in the
Open Publication License, v1.0 or later (the latest version is presently Open Publication License, v1.0 or later (the latest version is presently
available at http://www.opencontent.org/openpub/). available at http://www.opencontent.org/openpub/).

View File

@ -6,15 +6,16 @@ Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 2017-09-05 Created: 2017-09-05
Python-Version: 3.8 Python-Version: 3.8
Post-History: 07-Sep-2017, 08-Sep-2017, 13-Sep-2017, 05-Dec-2017 Post-History: 07-Sep-2017, 08-Sep-2017, 13-Sep-2017, 05-Dec-2017,
09-May-2018
Abstract Abstract
======== ========
CPython has supported multiple interpreters in the same process (AKA CPython has supported multiple interpreters in the same process (AKA
"subinterpreters") since version 1.5. The feature has been available "subinterpreters") since version 1.5 (1997). The feature has been
via the C-API. [c-api]_ Subinterpreters operate in available via the C-API. [c-api]_ Subinterpreters operate in
`relative isolation from one another <Interpreter Isolation_>`_, which `relative isolation from one another <Interpreter Isolation_>`_, which
provides the basis for an provides the basis for an
`alternative concurrency model <Concurrency_>`_. `alternative concurrency model <Concurrency_>`_.
@ -30,7 +31,7 @@ Proposal
The ``interpreters`` module will be added to the stdlib. It will The ``interpreters`` module will be added to the stdlib. It will
provide a high-level interface to subinterpreters and wrap a new provide a high-level interface to subinterpreters and wrap a new
low-level ``_interpreters`` (in the same was as the ``threading`` low-level ``_interpreters`` (in the same way as the ``threading``
module). See the `Examples`_ section for concrete usage and use cases. module). See the `Examples`_ section for concrete usage and use cases.
Along with exposing the existing (in CPython) subinterpreter support, Along with exposing the existing (in CPython) subinterpreter support,
@ -47,6 +48,8 @@ At first only the following types will be supported for sharing:
* None * None
* bytes * bytes
* str
* int
* PEP 3118 buffer objects (via ``send_buffer()``) * PEP 3118 buffer objects (via ``send_buffer()``)
Support for other basic types (e.g. int, Ellipsis) will be added later. Support for other basic types (e.g. Ellipsis) will be added later.
@ -87,6 +90,14 @@ For creating and using interpreters:
| channels=None) | | (This blocks the current thread until done.) | | channels=None) | | (This blocks the current thread until done.) |
+-----------------------+-----------------------------------------------------+ +-----------------------+-----------------------------------------------------+
|
+----------------+--------------+------------------------------------------------------+
| exception | base | description |
+================+==============+======================================================+
| RunFailedError | RuntimeError | Interpreter.run() resulted in an uncaught exception. |
+----------------+--------------+------------------------------------------------------+
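For example, error propagation might look like this (hypothetical usage; the
``interpreters`` module does not yet exist, and the names follow the tables
above)::

    import interpreters

    interp = interpreters.create()
    try:
        interp.run("raise ValueError('boom')")
    except interpreters.RunFailedError as exc:
        # The uncaught ValueError in the subinterpreter surfaces here.
        print('run failed:', exc)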
For sharing data between interpreters: For sharing data between interpreters:
+--------------------------------+--------------------------------------------+ +--------------------------------+--------------------------------------------+
@ -120,9 +131,11 @@ For sharing data between interpreters:
| .recv_nowait(default=None) -> | | Like recv(), but return the default | | .recv_nowait(default=None) -> | | Like recv(), but return the default |
| object | | instead of waiting. | | object | | instead of waiting. |
+-------------------------------+-----------------------------------------------+ +-------------------------------+-----------------------------------------------+
| .close() | | No longer associate the current interpreter | | .release() | | No longer associate the current interpreter |
| | | with the channel (on the receiving end). | | | | with the channel (on the receiving end). |
+-------------------------------+-----------------------------------------------+ +-------------------------------+-----------------------------------------------+
| .close(force=False) | | Close the channel in all interpreters. |
+-------------------------------+-----------------------------------------------+
| |
@ -147,9 +160,31 @@ For sharing data between interpreters:
+---------------------------+-------------------------------------------------+ +---------------------------+-------------------------------------------------+
| .send_buffer_nowait(obj) | | Like send_buffer(), but fail if not received. | | .send_buffer_nowait(obj) | | Like send_buffer(), but fail if not received. |
+---------------------------+-------------------------------------------------+ +---------------------------+-------------------------------------------------+
| .close() | | No longer associate the current interpreter | | .release() | | No longer associate the current interpreter |
| | | with the channel (on the sending end). | | | | with the channel (on the sending end). |
+---------------------------+-------------------------------------------------+ +---------------------------+-------------------------------------------------+
| .close(force=False) | | Close the channel in all interpreters. |
+---------------------------+-------------------------------------------------+
|
+----------------------+--------------------+------------------------------------------------+
| exception | base | description |
+======================+====================+================================================+
| ChannelError | Exception | The base class for channel-related exceptions. |
+----------------------+--------------------+------------------------------------------------+
| ChannelNotFoundError | ChannelError | The identified channel was not found. |
+----------------------+--------------------+------------------------------------------------+
| ChannelEmptyError | ChannelError | The channel was unexpectedly empty. |
+----------------------+--------------------+------------------------------------------------+
| ChannelNotEmptyError | ChannelError | The channel was unexpectedly not empty. |
+----------------------+--------------------+------------------------------------------------+
| NotReceivedError | ChannelError | Nothing was waiting to receive a sent object. |
+----------------------+--------------------+------------------------------------------------+
| ChannelClosedError | ChannelError | The channel is closed. |
+----------------------+--------------------+------------------------------------------------+
| ChannelReleasedError | ChannelClosedError | The channel is released (but not yet closed). |
+----------------------+--------------------+------------------------------------------------+
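As a hedged sketch (the module/attribute layout here is an assumption;
only the exception names come from the tables above), the release/close
distinction plays out in error handling like this::

    import interpreters

    r, s = interpreters.create_channel()
    s.release()
    try:
        s.send(b'spam')
    except interpreters.ChannelReleasedError:
        # This interpreter released its end; other interpreters
        # may still be using the channel.
        ...
    except interpreters.ChannelClosedError:
        # Raised instead once the channel is closed in all
        # interpreters.
        ...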
Examples Examples
@ -218,7 +253,7 @@ Synchronize using a channel
interp.run(tw.dedent(""" interp.run(tw.dedent("""
reader.recv() reader.recv()
print("during") print("during")
reader.close() reader.release()
"""), """),
shared=dict( shared=dict(
reader=r, reader=r,
@ -229,7 +264,7 @@ Synchronize using a channel
t.start() t.start()
print('after') print('after')
s.send(b'') s.send(b'')
s.close() s.release()
Sharing a file descriptor Sharing a file descriptor
------------------------- -------------------------
@ -280,7 +315,7 @@ Passing objects via marshal
obj = marshal.loads(data) obj = marshal.loads(data)
do_something(obj) do_something(obj)
data = reader.recv() data = reader.recv()
reader.close() reader.release()
""")) """))
t = threading.Thread(target=run) t = threading.Thread(target=run)
t.start() t.start()
@ -310,7 +345,7 @@ Passing objects via pickle
obj = pickle.loads(data) obj = pickle.loads(data)
do_something(obj) do_something(obj)
data = reader.recv() data = reader.recv()
reader.close() reader.release()
""")) """))
t = threading.Thread(target=run) t = threading.Thread(target=run)
t.start() t.start()
@ -514,6 +549,8 @@ channels to the following:
* None * None
* bytes * bytes
* str
* int
* PEP 3118 buffer objects (via ``send_buffer()``) * PEP 3118 buffer objects (via ``send_buffer()``)
Limiting the initial shareable types is a practical matter, reducing Limiting the initial shareable types is a practical matter, reducing
@ -563,7 +600,7 @@ Existing Usage
Subinterpreters are not a widely used feature. In fact, the only Subinterpreters are not a widely used feature. In fact, the only
documented cases of widespread usage are documented cases of widespread usage are
`mod_wsgi <https://github.com/GrahamDumpleton/mod_wsgi>`_and `mod_wsgi <https://github.com/GrahamDumpleton/mod_wsgi>`_ and
`JEP <https://github.com/ninia/jep>`_. On the one hand, this case `JEP <https://github.com/ninia/jep>`_. On the one hand, this case
provides confidence that existing subinterpreter support is relatively provides confidence that existing subinterpreter support is relatively
stable. On the other hand, there isn't much of a sample size from which stable. On the other hand, there isn't much of a sample size from which
@ -686,16 +723,24 @@ The module also provides the following class:
"run()" call into one long script. This is the same as how the "run()" call into one long script. This is the same as how the
REPL operates. REPL operates.
Regarding uncaught exceptions, we noted that they are
"effectively" propagated into the code where ``run()`` was called.
To prevent leaking exceptions (and tracebacks) between
interpreters, we create a surrogate of the exception and its
traceback (see ``traceback.TracebackException``), wrap it in a
RuntimeError, and raise that.
Supported code: source text. Supported code: source text.
Uncaught Exceptions
-------------------
Regarding uncaught exceptions in ``Interpreter.run()``, we noted that
they are "effectively" propagated into the code where ``run()`` was
called. To prevent leaking exceptions (and tracebacks) between
interpreters, we create a surrogate of the exception and its traceback
(see ``traceback.TracebackException``), set it to ``__cause__`` on a
new ``RunFailedError``, and raise that.
Raising (a proxy of) the exception is problematic since it's harder to
distinguish between an error in the ``run()`` call and an uncaught
exception from the subinterpreter.
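To illustrate (a sketch, assuming the module is imported as
``interpreters``)::

    import interpreters

    interp = interpreters.create()
    try:
        interp.run("1/0")
    except interpreters.RunFailedError as exc:
        # exc.__cause__ is the surrogate (in the spirit of
        # traceback.TracebackException), not the live exception
        # object from the subinterpreter.
        print('run() failed:', exc.__cause__)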
API for sharing data API for sharing data
-------------------- --------------------
@ -703,8 +748,8 @@ Subinterpreters are less useful without a mechanism for sharing data
between them. Sharing actual Python objects between interpreters, between them. Sharing actual Python objects between interpreters,
however, has enough potential problems that we are avoiding support however, has enough potential problems that we are avoiding support
for that here. Instead, only a minimum set of types will be supported. for that here. Instead, only a minimum set of types will be supported.
Initially this will include ``bytes`` and channels. Further types may Initially this will include ``None``, ``bytes``, ``str``, ``int``,
be supported later. and channels. Further types may be supported later.
The ``interpreters`` module provides a way for users to determine The ``interpreters`` module provides a way for users to determine
whether an object is shareable or not: whether an object is shareable or not:
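(A hedged usage sketch; ``is_shareable()`` is an assumed spelling of
the helper described here.)::

    import pickle
    import interpreters

    r, s = interpreters.create_channel()
    obj = {'spam': 42}
    if interpreters.is_shareable(obj):
        s.send(obj)
    else:
        # Fall back to serializing unshareable objects as bytes.
        s.send(pickle.dumps(obj))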
@ -737,11 +782,12 @@ many-to-many, channels have no buffer.
Create a new channel and return (recv, send), the RecvChannel and Create a new channel and return (recv, send), the RecvChannel and
SendChannel corresponding to the ends of the channel. The channel SendChannel corresponding to the ends of the channel. The channel
is not closed and destroyed (i.e. garbage-collected) until the number is not closed and destroyed (i.e. garbage-collected) until the number
of associated interpreters returns to 0. of associated interpreters returns to 0 (including when the channel
is explicitly closed).
An interpreter gets associated with a channel by calling its "send()" An interpreter gets associated with a channel by calling its "send()"
or "recv()" method. That association gets dropped by calling or "recv()" method. That association gets dropped by calling
"close()" on the channel. "release()" on the channel.
Both ends of the channel are supported "shared" objects (i.e. may be Both ends of the channel are supported "shared" objects (i.e. may be
safely shared by different interpreters). Thus they may be passed as safely shared by different interpreters). Thus they may be passed as
@ -765,7 +811,8 @@ many-to-many, channels have no buffer.
interpreters: interpreters:
The list of associated interpreters: those that have called The list of associated interpreters: those that have called
the "recv()" or "__next__()" methods and haven't called "close()". the "recv()" or "__next__()" methods and haven't called
"release()" (and the channel hasn't been explicitly closed).
recv(): recv():
@ -773,10 +820,11 @@ many-to-many, channels have no buffer.
the channel. If none have been sent then wait until the next the channel. If none have been sent then wait until the next
send. This associates the current interpreter with the channel. send. This associates the current interpreter with the channel.
If the channel is already closed (see the close() method) If the channel is already closed then raise ChannelClosedError.
then raise EOFError. If the channel isn't closed, but the current If the channel isn't closed but the current interpreter already
interpreter already called the "close()" method (which drops its called the "release()" method (which drops its association with
association with the channel) then raise ValueError. the channel) then raise ChannelReleasedError (which is a subclass
of ChannelClosedError).
recv_nowait(default=None): recv_nowait(default=None):
@ -784,26 +832,35 @@ many-to-many, channels have no buffer.
then return the default. Otherwise, this is the same as the then return the default. Otherwise, this is the same as the
"recv()" method. "recv()" method.
close(): release():
No longer associate the current interpreter with the channel (on No longer associate the current interpreter with the channel (on
the receiving end) and block future association (via the "recv()" the receiving end) and block future association (via the "recv()"
method. If the interpreter was never associated with the channel method). If the interpreter was never associated with the channel
then still block future association. Once an interpreter is no then still block future association. Once an interpreter is no
longer associated with the channel, subsequent (or current) send() longer associated with the channel, subsequent (or current) send()
and recv() calls from that interpreter will raise ValueError and recv() calls from that interpreter will raise
(or EOFError if the channel is actually marked as closed). ChannelReleasedError (or ChannelClosedError if the channel
is actually marked as closed).
Once the number of associated interpreters on both ends drops Once the number of associated interpreters on both ends drops
to 0, the channel is actually marked as closed. The Python to 0, the channel is actually marked as closed. The Python
runtime will garbage collect all closed channels, though it may runtime will garbage collect all closed channels, though it may
not happen immediately. Note that "close()" is automatically called not happen immediately. Note that "release()" is automatically called
on behalf of the current interpreter when the channel is no longer on behalf of the current interpreter when the channel is no longer
used (i.e. has no references) in that interpreter. used (i.e. has no references) in that interpreter.
This operation is idempotent. Return True if "close()" has not This operation is idempotent. Return True if "release()" has not
been called before by the current interpreter. been called before by the current interpreter.
close(force=False):
Close both ends of the channel (in all interpreters). This means
that any further use of the channel raises ChannelClosedError. If
the channel is not empty then raise ChannelNotEmptyError (if
"force" is False) or discard the remaining objects (if "force"
is True) and close it.
``SendChannel(id)``:: ``SendChannel(id)``::
@ -827,16 +884,16 @@ many-to-many, channels have no buffer.
object is not shareable then ValueError is raised. Currently object is not shareable then ValueError is raised. Currently
only bytes are supported. only bytes are supported.
If the channel is already closed (see the close() method) If the channel is already closed then raise ChannelClosedError.
then raise EOFError. If the channel isn't closed, but the current If the channel isn't closed but the current interpreter already
interpreter already called the "close()" method (which drops its called the "release()" method (which drops its association with
association with the channel) then raise ValueError. the channel) then raise ChannelReleasedError.
send_nowait(obj): send_nowait(obj):
Send the object to the receiving end of the channel. If the other Send the object to the receiving end of the channel. If the other
end is not currently receiving then raise RuntimeError. Otherwise end is not currently receiving then raise NotReceivedError.
this is the same as "send()". Otherwise this is the same as "send()".
send_buffer(obj): send_buffer(obj):
@ -847,14 +904,23 @@ many-to-many, channels have no buffer.
send_buffer_nowait(obj): send_buffer_nowait(obj):
Send a MemoryView of the object rather than the object. If the Send a MemoryView of the object rather than the object. If the
other end is not currently receiving then raise RuntimeError. other end is not currently receiving then raise NotReceivedError.
Otherwise this is the same as "send_buffer()". Otherwise this is the same as "send_buffer()".
close(): release():
This is the same as "RecvChannel.close(), but applied to the This is the same as "RecvChannel.release(), but applied to the
sending end of the channel. sending end of the channel.
close(force=False):
Close both ends of the channel (in all interpreters). No matter
what the "send" end of the channel is immediately closed. If the
channel is empty then close the "recv" end immediately too.
Otherwise wait until the channel is empty before closing it (if
"force" is False) or discard the remaining items and close
immediately (if "force" is True).
Note that ``send_buffer()`` is similar to how Note that ``send_buffer()`` is similar to how
``multiprocessing.Connection`` works. [mp-conn]_ ``multiprocessing.Connection`` works. [mp-conn]_
@ -862,7 +928,9 @@ Note that ``send_buffer()`` is similar to how
Open Questions Open Questions
============== ==============
None * "force" argument to ``ch.release()``?
* add a "tp_share" type slot instead of using a global registry
for shareable types?
Open Implementation Questions Open Implementation Questions
@ -1020,9 +1088,8 @@ exception, effectively ending execution in the interpreter that tried
to use the poisoned channel. to use the poisoned channel.
This could be accomplished by adding a ``poison()`` method to both ends This could be accomplished by adding a ``poison()`` method to both ends
of the channel. The ``close()`` method could work if it had a ``force`` of the channel. The ``close()`` method can be used in this way
option to force the channel closed. Regardless, these semantics are (mostly), but these semantics are relatively specialized and can wait.
relatively specialized and can wait.
Sending channels over channels Sending channels over channels
------------------------------ ------------------------------
@ -1070,14 +1137,6 @@ generic module reset mechanism may prove unnecessary.
This isn't a critical feature initially. It can wait until later This isn't a critical feature initially. It can wait until later
if desirable. if desirable.
Support passing ints in channels
--------------------------------
Passing ints around should be fine and ultimately is probably
desirable. However, we can get by with serializing them as bytes
for now. The goal is a minimal API for the sake of basic
functionality at first.
File descriptors and sockets in channels File descriptors and sockets in channels
---------------------------------------- ----------------------------------------
@ -1119,7 +1178,8 @@ Channel context managers
Context manager support on ``RecvChannel`` and ``SendChannel`` may be Context manager support on ``RecvChannel`` and ``SendChannel`` may be
helpful. The implementation would be simple, wrapping a call to helpful. The implementation would be simple, wrapping a call to
``close()`` like files do. As with iteration, this can wait. ``close()`` (or maybe ``release()``) like files do. As with iteration,
this can wait.
Pipes and Queues Pipes and Queues
---------------- ----------------
reasonable. They could be trivially implemented as wrappers around reasonable. They could be trivially implemented as wrappers around
channels. Alternatively they could be implemented for efficiency at the channels. Alternatively they could be implemented for efficiency at the
same low level as channels. same low level as channels.
interpreters.RunFailedError Buffering
--------------------------- ---------
As currently proposed, ``Interpreter.run()`` offers you no way to The proposed channels are unbuffered. This simplifies the API and
distinguish an error coming from the subinterpreter from any other implementation. If buffering is desirable we can add it later.
error in the current interpreter. Your only option would be to
explicitly wrap your ``run()`` call in a
``try: ... except RuntimeError:`` (since we wrap a proxy of the original
exception in a RuntimeError and raise that).
If this is a problem in practice then would could add something like
``interpreters.RunFailedError`` (subclassing RuntimeError) and raise that
in ``run()``.
Return a lock from send() Return a lock from send()
------------------------- -------------------------
@ -1162,6 +1214,26 @@ This matters for buffered channels (i.e. queues). For unbuffered
channels it is a non-issue. So this can be dealt with once channels channels it is a non-issue. So this can be dealt with once channels
support buffering. support buffering.
Add a "reraise" method to RunFailedError
----------------------------------------
While having ``__cause__`` set on ``RunFailedError`` helps produce a
more useful traceback, it's less helpful when handling the original
error. To facilitate this, we could add
``RunFailedError.reraise()``. This method would enable the following
pattern::
try:
interp.run(script)
except RunFailedError as exc:
try:
exc.reraise()
except MyException:
...
This would be made even simpler if there existed a ``__reraise__``
protocol.
Rejected Ideas Rejected Ideas
============== ==============
@ -1170,7 +1242,7 @@ Explicit channel association
---------------------------- ----------------------------
Interpreters are implicitly associated with channels upon ``recv()`` and Interpreters are implicitly associated with channels upon ``recv()`` and
``send()`` calls. They are de-associated with ``close()`` calls. The ``send()`` calls. They are de-associated with ``release()`` calls. The
alternative would be explicit methods. It would be either alternative would be explicit methods. It would be either
``add_channel()`` and ``remove_channel()`` methods on ``Interpreter`` ``add_channel()`` and ``remove_channel()`` methods on ``Interpreter``
objects or something similar on channel objects. objects or something similar on channel objects.
@ -1216,15 +1288,16 @@ While that might not be a problem currently, it would be a problem once
interpreters get better isolation relative to memory management (which interpreters get better isolation relative to memory management (which
is necessary to stop sharing the GIL between interpreters). We've is necessary to stop sharing the GIL between interpreters). We've
resolved the semantics of how the exceptions propagate by raising a resolved the semantics of how the exceptions propagate by raising a
RuntimeError instead, which wraps a safe proxy for the original ``RunFailedError`` instead, for which ``__cause__`` wraps a safe proxy
exception and traceback. for the original exception and traceback.
Rejected possible solutions: Rejected possible solutions:
* set the RuntimeError's __cause__ to the proxy of the original
exception
* reproduce the exception and traceback in the original interpreter * reproduce the exception and traceback in the original interpreter
and raise that. and raise that.
* raise a subclass of RunFailedError that proxies the original
exception and traceback.
* raise RuntimeError instead of RunFailedError
* convert at the boundary (a la ``subprocess.CalledProcessError``) * convert at the boundary (a la ``subprocess.CalledProcessError``)
(requires a cross-interpreter representation) (requires a cross-interpreter representation)
* support customization via ``Interpreter.excepthook`` * support customization via ``Interpreter.excepthook``
@ -1282,7 +1355,7 @@ References
https://bugs.python.org/issue6531 https://bugs.python.org/issue6531
.. [mp-conn] .. [mp-conn]
https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Connection https://docs.python.org/3/library/multiprocessing.html#connection-objects
.. [bug-rate] .. [bug-rate]
https://mail.python.org/pipermail/python-ideas/2017-September/047094.html https://mail.python.org/pipermail/python-ideas/2017-September/047094.html

View File

@ -6,7 +6,7 @@ Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 02-Jun-2017 Created: 02-Jun-2017
Python-Version: 3.7 Python-Version: 3.7
Post-History: 08-Sep-2017, 25-Nov-2017, 30-Nov-2017, 01-Dec-2017, 02-Dec-2017, 06-Jan-2018 Post-History: 08-Sep-2017, 25-Nov-2017, 30-Nov-2017, 01-Dec-2017, 02-Dec-2017, 06-Jan-2018, 04-Mar-2018
Resolution: https://mail.python.org/pipermail/python-dev/2017-December/151034.html Resolution: https://mail.python.org/pipermail/python-dev/2017-December/151034.html
Notice for Reviewers Notice for Reviewers
@ -93,7 +93,7 @@ There have been numerous attempts to define classes which exist
primarily to store values which are accessible by attribute lookup. primarily to store values which are accessible by attribute lookup.
Some examples include: Some examples include:
- collection.namedtuple in the standard library. - collections.namedtuple in the standard library.
- typing.NamedTuple in the standard library. - typing.NamedTuple in the standard library.
@ -170,7 +170,7 @@ The ``dataclass`` decorator is typically used with no parameters and
no parentheses. However, it also supports the following logical no parentheses. However, it also supports the following logical
signature:: signature::
def dataclass(*, init=True, repr=True, eq=True, order=False, hash=None, frozen=False) def dataclass(*, init=True, repr=True, eq=True, order=False, unsafe_hash=False, frozen=False)
If ``dataclass`` is used just as a simple decorator with no If ``dataclass`` is used just as a simple decorator with no
parameters, it acts as if it has the default values documented in this parameters, it acts as if it has the default values documented in this
@ -184,7 +184,7 @@ signature. That is, these three uses of ``@dataclass`` are equivalent::
class C: class C:
... ...
@dataclass(init=True, repr=True, eq=True, order=False, hash=None, frozen=False) @dataclass(init=True, repr=True, eq=True, order=False, unsafe_hash=False, frozen=False)
class C: class C:
... ...
@ -200,10 +200,15 @@ The parameters to ``dataclass`` are:
are not included. For example: are not included. For example:
``InventoryItem(name='widget', unit_price=3.0, quantity_on_hand=10)``. ``InventoryItem(name='widget', unit_price=3.0, quantity_on_hand=10)``.
- ``eq``: If true (the default), ``__eq__`` and ``__ne__`` methods If the class already defines ``__repr__``, this parameter is
will be generated. These compare the class as if it were a tuple of ignored.
its fields, in order. Both instances in the comparison must be of
the identical type. - ``eq``: If true (the default), an ``__eq__`` method will be
generated. This method compares the class as if it were a tuple of its
fields, in order. Both instances in the comparison must be of the
identical type.
If the class already defines ``__eq__``, this parameter is ignored.
- ``order``: If true (the default is False), ``__lt__``, ``__le__``, - ``order``: If true (the default is False), ``__lt__``, ``__le__``,
``__gt__``, and ``__ge__`` methods will be generated. These compare ``__gt__``, and ``__ge__`` methods will be generated. These compare
@ -211,9 +216,11 @@ The parameters to ``dataclass`` are:
instances in the comparison must be of the identical type. If instances in the comparison must be of the identical type. If
``order`` is true and ``eq`` is false, a ``ValueError`` is raised. ``order`` is true and ``eq`` is false, a ``ValueError`` is raised.
- ``hash``: Either a bool or ``None``. If ``None`` (the default), the If the class already defines any of ``__lt__``, ``__le__``,
``__hash__`` method is generated according to how ``eq`` and ``__gt__``, or ``__ge__``, then ``ValueError`` is raised.
``frozen`` are set.
- ``unsafe_hash``: If ``False`` (the default), the ``__hash__`` method
is generated according to how ``eq`` and ``frozen`` are set.
If ``eq`` and ``frozen`` are both true, Data Classes will generate a If ``eq`` and ``frozen`` are both true, Data Classes will generate a
``__hash__`` method for you. If ``eq`` is true and ``frozen`` is ``__hash__`` method for you. If ``eq`` is true and ``frozen`` is
@ -224,15 +231,36 @@ The parameters to ``dataclass`` are:
to id-based hashing). to id-based hashing).
Although not recommended, you can force Data Classes to create a Although not recommended, you can force Data Classes to create a
``__hash__`` method with ``hash=True``. This might be the case if your ``__hash__`` method with ``unsafe_hash=True``. This might be the
class is logically immutable but can nonetheless be mutated. This case if your class is logically immutable but can nonetheless be
is a specialized use case and should be considered carefully. mutated. This is a specialized use case and should be considered
carefully.
If a class already has an explicitly defined ``__hash__`` the
behavior when adding ``__hash__`` is modified. An explicitly
defined ``__hash__`` is present when:
- ``__eq__`` is defined in the class and any non-``None``
``__hash__`` is defined.
- ``__eq__`` is not defined on the class, and any ``__hash__`` is
defined.
If ``unsafe_hash`` is true and an explicitly defined ``__hash__``
is present, then ``ValueError`` is raised.
If ``unsafe_hash`` is false and an explicitly defined ``__hash__``
is present, then no ``__hash__`` is added.
See the Python documentation [#]_ for more information. See the Python documentation [#]_ for more information.
- ``frozen``: If true (the default is False), assigning to fields will - ``frozen``: If true (the default is False), assigning to fields will
generate an exception. This emulates read-only frozen instances. generate an exception. This emulates read-only frozen instances.
See the discussion below. If either ``__getattr__`` or ``__setattr__`` is defined in the
class, then ``ValueError`` is raised. See the discussion below.
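A minimal sketch tying the ``eq``/``frozen``/``unsafe_hash`` rules
together::

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Point:
        x: int
        y: int

    p = Point(1, 2)
    hash(p)    # works: eq and frozen are both true, so __hash__ is generated
    p.x = 3    # raises an exception: frozen instances reject assignment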
``field``\s may optionally specify a default value, using normal ``field``\s may optionally specify a default value, using normal
Python syntax:: Python syntax::
@ -533,7 +561,7 @@ Module level helper functions
- ``fields(class_or_instance)``: Returns a tuple of ``Field`` objects - ``fields(class_or_instance)``: Returns a tuple of ``Field`` objects
that define the fields for this Data Class. Accepts either a Data that define the fields for this Data Class. Accepts either a Data
Class, or an instance of a Data Class. Raises `ValueError` if not Class, or an instance of a Data Class. Raises ``ValueError`` if not
passed a Data Class or instance of one. Does not return passed a Data Class or instance of one. Does not return
pseudo-fields which are ``ClassVar`` or ``InitVar``. pseudo-fields which are ``ClassVar`` or ``InitVar``.
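A brief usage sketch of ``fields()``::

    from dataclasses import dataclass, fields

    @dataclass
    class C:
        x: int
        y: str = 'spam'

    [f.name for f in fields(C)]    # ['x', 'y']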

View File

@ -206,6 +206,59 @@ the backwards compatibility::
return meta(name, resolved_bases, ns, **kwds) return meta(name, resolved_bases, ns, **kwds)
Using ``__class_getitem__`` in C extensions
-------------------------------------------
As mentioned above, ``__class_getitem__`` is automatically a class method
if defined in Python code. To define this method in a C extension, one
should use flags ``METH_O|METH_CLASS``. For example, a simple way to make
an extension class generic is to use a method that simply returns the
original class object, thus fully erasing the type information at runtime,
and deferring all checks to static type checkers only::
typedef struct {
PyObject_HEAD
/* ... your code ... */
} SimpleGeneric;
static PyObject *
simple_class_getitem(PyObject *type, PyObject *item)
{
Py_INCREF(type);
return type;
}
static PyMethodDef simple_generic_methods[] = {
{"__class_getitem__", simple_class_getitem, METH_O|METH_CLASS, NULL},
/* ... other methods ... */
};
PyTypeObject SimpleGeneric_Type = {
PyVarObject_HEAD_INIT(NULL, 0)
"SimpleGeneric",
sizeof(SimpleGeneric),
0,
.tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,
.tp_methods = simple_generic_methods,
};
Such a class can be used as a normal generic in Python type annotations
(a corresponding stub file should be provided for static type checkers,
see PEP 484 for details)::
from simple_extension import SimpleGeneric
from typing import TypeVar
T = TypeVar('T')
Alias = SimpleGeneric[str, T]
class SubClass(SimpleGeneric[T, int]):
...
data: Alias[int] # Works at runtime
more_data: SubClass[str] # Also works at runtime
Backwards compatibility and impact on users who don't use ``typing`` Backwards compatibility and impact on users who don't use ``typing``
==================================================================== ====================================================================

View File

@ -1,12 +1,12 @@
PEP: 561 PEP: 561
Title: Distributing and Packaging Type Information Title: Distributing and Packaging Type Information
Author: Ethan Smith <ethan@ethanhs.me> Author: Ethan Smith <ethan@ethanhs.me>
Status: Draft Status: Accepted
Type: Standards Track Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 09-Sep-2017 Created: 09-Sep-2017
Python-Version: 3.7 Python-Version: 3.7
Post-History: 10-Sep-2017, 12-Sep-2017, 06-Oct-2017, 26-Oct-2017 Post-History: 10-Sep-2017, 12-Sep-2017, 06-Oct-2017, 26-Oct-2017, 12-Apr-2018
Abstract Abstract
@ -49,10 +49,11 @@ Definition of Terms
The definition of "MAY", "MUST", and "SHOULD", and "SHOULD NOT" are The definition of "MAY", "MUST", and "SHOULD", and "SHOULD NOT" are
to be interpreted as described in RFC 2119. to be interpreted as described in RFC 2119.
"inline" - the types are part of the runtime code using PEP 526 and 3107 "inline" - the types are part of the runtime code using PEP 526 and
syntax. PEP 3107 syntax (the filename ends in ``.py``).
"stubs" - files containing only type information, empty of runtime code. "stubs" - files containing only type information, empty of runtime code
(the filename ends in ``.pyi``).
"Distributions" are the packaged files which are used to publish and distribute "Distributions" are the packaged files which are used to publish and distribute
a release. [3]_ a release. [3]_
@ -60,13 +61,16 @@ a release. [3]_
"Module" a file containing Python runtime code or stubbed type information. "Module" a file containing Python runtime code or stubbed type information.
"Package" a directory or directories that namespace Python modules. "Package" a directory or directories that namespace Python modules.
(Note the distinction between packages and distributions. While most
distributions are named after the one package they install, some
distributions install multiple packages.)
Specification Specification
============= =============
There are several motivations and methods of supporting typing in a package. There are several motivations and methods of supporting typing in a package.
This PEP recognizes three (3) types of packages that users of typing wish to This PEP recognizes three types of packages that users of typing wish to
create: create:
1. The package maintainer would like to add type information inline. 1. The package maintainer would like to add type information inline.
@ -77,7 +81,7 @@ create:
a package, but the maintainer does not want to include them in the source a package, but the maintainer does not want to include them in the source
of the package. of the package.
This PEP aims to support these scenarios and make them simple to add to This PEP aims to support all three scenarios and make them simple to add to
packaging and deployment. packaging and deployment.
The two major parts of this specification are the packaging specifications The two major parts of this specification are the packaging specifications
@ -115,15 +119,15 @@ Distutils option example::
..., ...,
) )
For namespace packages, the ``py.typed`` file should be in the submodules of For namespace packages (see PEP 420), the ``py.typed`` file should be in the
the namespace, to avoid conflicts and for clarity. submodules of the namespace, to avoid conflicts and for clarity.
This PEP does not support distributing typing information as part of This PEP does not support distributing typing information as part of
module-only distributions. The code should be refactored into a package-based module-only distributions. The code should be refactored into a package-based
distribution and indicate that the package supports typing as described distribution and indicate that the package supports typing as described
above. above.
Stub Only Packages Stub-only Packages
'''''''''''''''''' ''''''''''''''''''
For package maintainers wishing to ship stub files containing all of their For package maintainers wishing to ship stub files containing all of their
@ -131,21 +135,26 @@ type information, it is preferred that the ``*.pyi`` stubs are alongside the
corresponding ``*.py`` files. However, the stubs can also be put in a separate corresponding ``*.py`` files. However, the stubs can also be put in a separate
package and distributed separately. Third parties can also find this method package and distributed separately. Third parties can also find this method
useful if they wish to distribute stub files. The name of the stub package useful if they wish to distribute stub files. The name of the stub package
MUST follow the scheme ``foopkg_stubs`` for type stubs for the package named MUST follow the scheme ``foopkg-stubs`` for type stubs for the package named
``foopkg``. The normal resolution order of checking ``*.pyi`` before ``*.py`` ``foopkg``. Note that for stub-only packages adding a ``py.typed`` marker is not
will be maintained. needed since the name ``*-stubs`` is enough to indicate it is a source of typing
information.
Third parties seeking to distribute stub files are encouraged to contact the Third parties seeking to distribute stub files are encouraged to contact the
maintainer of the package about distribution alongside the package. If the maintainer of the package about distribution alongside the package. If the
maintainer does not wish to maintain or package stub files or type information maintainer does not wish to maintain or package stub files or type information
inline, then a third party stub only package can be created. inline, then a third party stub-only package can be created.
In addition, stub-only distributions SHOULD indicate which version(s) In addition, stub-only distributions SHOULD indicate which version(s)
of the runtime package are supported by indicating the runtime distribution's of the runtime package are supported by indicating the runtime distribution's
version(s) through normal dependency data. For example, the version(s) through normal dependency data. For example, the
stub package ``flyingcircus_stubs`` can indicate the versions of the stub package ``flyingcircus-stubs`` can indicate the versions of the
runtime ``flyingcircus`` distribution it supports through ``install_requires`` runtime ``flyingcircus`` distribution it supports through ``install_requires``
in distutils-based tools, or the equivalent in other packaging tools. in distutils-based tools, or the equivalent in other packaging tools. Note that
in pip 9.0, if you update ``flyingcircus-stubs``, it will also upgrade
``flyingcircus``. To avoid that in pip 9.0, use the
``--upgrade-strategy=only-if-needed`` flag; in pip 10.0 this is the default
behavior.
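As a hedged sketch (the ``setuptools`` specifics are illustrative, not
part of this PEP), a stub-only distribution for ``flyingcircus`` might
be packaged as::

    from setuptools import setup

    setup(
        name='flyingcircus-stubs',
        version='1.0',
        packages=['flyingcircus-stubs'],
        package_data={'flyingcircus-stubs': ['*.pyi']},
        # Declare which runtime versions these stubs describe:
        install_requires=['flyingcircus==1.0.*'],
    )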
Type Checker Module Resolution Order Type Checker Module Resolution Order
@ -158,13 +167,14 @@ resolve modules containing type information:
2. Stubs or Python source manually put in the beginning of the path. Type 2. Stubs or Python source manually put in the beginning of the path. Type
checkers SHOULD provide this to allow the user complete control of which checkers SHOULD provide this to allow the user complete control of which
stubs to use, and patch broken stubs/inline types from packages. stubs to use, and to patch broken stubs/inline types from packages.
In mypy the ``$MYPYPATH`` environment variable can be used for this.
3. Stub packages - these packages can supersede the installed packages. 3. Stub packages - these packages SHOULD supersede any installed inline
They can be found at ``foopkg_stubs`` for package ``foopkg``. package. They can be found at ``foopkg-stubs`` for package ``foopkg``.
4. Inline packages - if there is nothing overriding the installed 4. Inline packages - if there is nothing overriding the installed
package, and it opts into type checking, inline types SHOULD be used. package, *and* it opts into type checking, inline types SHOULD be used.
5. Typeshed (if used) - Provides the stdlib types and several third party 5. Typeshed (if used) - Provides the stdlib types and several third party
libraries. libraries.
@ -177,27 +187,77 @@ that the type checker allow for the user to point to a particular Python
binary, in case it is not in the path. binary, in case it is not in the path.
Partial Stub Packages
---------------------
Many stub packages will only have part of the type interface for libraries
completed, especially initially. For the benefit of type checking and code
editors, packages can be "partial". This means modules not found in the stub
package SHOULD be searched for in parts four and five of the module resolution
order above, namely inline packages and typeshed.
Type checkers should merge the stub package and runtime package or typeshed
directories. This can be thought of as the functional equivalent of copying the
stub package into the same directory as the corresponding runtime package or
typeshed folder and type checking the combined directory structure. Thus type
checkers MUST maintain the normal resolution order of checking ``*.pyi`` before
``*.py`` files.
Stub packages can opt into declaring themselves as partial by including
``partial\n`` in the package's ``py.typed`` file.
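For example (a sketch; module names hypothetical), a partial stub
package's installed layout::

    flyingcircus-stubs/
        py.typed     # contains the single line: partial
        foo.pyi      # stubs only for the modules that are done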
Implementation Implementation
============== ==============
The proposed scheme of indicating support for typing is completely backwards The proposed scheme of indicating support for typing is completely backwards
compatible, and requires no modification to tooling. A sample package with compatible, and requires no modification to package tooling. A sample package
inline types is available [typed_pkg]_, as well as a sample package checker with inline types is available [typed_package]_, as well as a [stub_package]_. A
[pkg_checker]_ which reads the metadata of installed packages and reports on sample package checker [pkg_checker]_ reads the metadata of installed
their status as either not typed, inline typed, or a stub package. packages and reports on their status as either not typed, inline typed, or a
stub package.
The mypy type checker has an implementation of PEP 561 searching which can be
read about in the mypy docs [4]_.
[numpy-stubs]_ is an example of a real stub-only package for the numpy
distribution.
Acknowledgements Acknowledgements
================ ================
This PEP would not have been possible without the ideas, feedback, and support This PEP would not have been possible without the ideas, feedback, and support
of Ivan Levkivskyi, Jelle Zijlstra, Nick Coghlan, Daniel F Moisset, Nathaniel of Ivan Levkivskyi, Jelle Zijlstra, Nick Coghlan, Daniel F Moisset, Andrey
Smith, and Guido van Rossum. Vlasovskikh, Nathaniel Smith, and Guido van Rossum.
Version History Version History
=============== ===============
* 2018-07-09
* Add links to sample stub-only packages
* 2018-06-19
* Partial stub packages can look at typeshed as well as runtime packages
* 2018-05-15
* Add partial stub package spec.
* 2018-04-09
* Add reference to mypy implementation
* Clarify stub package priority.
* 2018-02-02
* Change stub-only package suffix to be -stubs not _stubs.
* Note that py.typed is not needed for stub-only packages.
* Add note about pip and upgrading stub packages.
* 2017-11-12 * 2017-11-12
* Rewritten to use existing tooling only * Rewritten to use existing tooling only
@ -208,7 +268,7 @@ Version History
* Specification re-written to use package metadata instead of distribution * Specification re-written to use package metadata instead of distribution
metadata. metadata.
* Removed stub only packages and merged into third party packages spec. * Removed stub-only packages and merged into third party packages spec.
* Removed suggestion for typecheckers to consider checking runtime versions * Removed suggestion for typecheckers to consider checking runtime versions
* Implementations updated to reflect PEP changes. * Implementations updated to reflect PEP changes.
@ -238,9 +298,18 @@ References
.. [3] PEP 426 definitions .. [3] PEP 426 definitions
(https://www.python.org/dev/peps/pep-0426/) (https://www.python.org/dev/peps/pep-0426/)
.. [typed_pkg] Sample typed package .. [4] Example implementation in a type checker
(https://mypy.readthedocs.io/en/latest/installed_packages.html)
.. [stub_package] A stub-only package
(https://github.com/ethanhs/stub-package)
.. [typed_package] Sample typed package
(https://github.com/ethanhs/sample-typed-package) (https://github.com/ethanhs/sample-typed-package)
.. [numpy-stubs] Stubs for numpy
(https://github.com/numpy/numpy-stubs)
.. [pkg_checker] Sample package checker .. [pkg_checker] Sample package checker
(https://github.com/ethanhs/check_typedpkg) (https://github.com/ethanhs/check_typedpkg)

View File

@ -72,7 +72,7 @@ imports. Consider a simple example::
# main.py # main.py
import lib import lib
lib.submodule.HeavyClass # prints "Submodule loaded" lib.submod.HeavyClass # prints "Submodule loaded"
There is a related proposal PEP 549 that proposes to support instance There is a related proposal PEP 549 that proposes to support instance
properties for a similar functionality. The difference is this PEP proposes properties for a similar functionality. The difference is this PEP proposes

View File

@ -5,14 +5,14 @@ Last-Modified: $Date$
Author: Dustin Ingram <di@di.codes> Author: Dustin Ingram <di@di.codes>
BDFL-Delegate: Daniel Holth BDFL-Delegate: Daniel Holth
Discussions-To: distutils-sig <distutils-sig at python.org> Discussions-To: distutils-sig <distutils-sig at python.org>
Status: Draft Status: Final
Type: Standards Track Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 1-Dec-2017 Created: 1-Dec-2017
Python-Version: 3.x Python-Version: 3.x
Post-History: Post-History:
Replaces: 345 Replaces: 345
Resolution: https://mail.python.org/pipermail/distutils-sig/2018-February/032014.html
Abstract Abstract
======== ========
@ -81,6 +81,14 @@ Name
The specification for the format of this field is now identical to the The specification for the format of this field is now identical to the
distribution name specification defined in PEP 508. distribution name specification defined in PEP 508.
Description
:::::::::::
In addition to the ``Description`` header field, the distribution's
description may instead be provided in the message body (i.e., after a
completely blank line following the headers, with no indentation or other
special formatting necessary).
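For example (a sketch of the resulting metadata file)::

    Metadata-Version: 2.1
    Name: sampleproject
    Version: 1.0

    This text, placed in the message body after the blank line, is
    the distribution's description.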
Version Specifiers Version Specifiers
================== ==================
@ -124,6 +132,8 @@ as follows:
single list containing all the original values for the given key; single list containing all the original values for the given key;
#. The ``Keywords`` field should be converted to a list by splitting the #. The ``Keywords`` field should be converted to a list by splitting the
original value on whitespace characters; original value on whitespace characters;
#. The message body, if present, should be set to the value of the
``description`` key.
#. The result should be stored as a string-keyed dictionary. #. The result should be stored as a string-keyed dictionary.
Summary of Differences From PEP 345 Summary of Differences From PEP 345

View File

@ -712,6 +712,15 @@ This proposal was deferred to Python 3.8+ because of the following:
ctx.run(func) ctx.run(func)
3. If ``Context`` was mutable it would mean that context variables
could be mutated separately (or concurrently) from the code that
runs within the context. That would be similar to obtaining a
reference to a running Python frame object and modifying its
``f_locals`` from another OS thread. Having one single way to
assign values to context variables makes contexts conceptually
simpler and more predictable, while keeping the door open for
future performance optimizations.
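A minimal sketch of that single assignment path::

    from contextvars import ContextVar, copy_context

    var = ContextVar('var')
    ctx = copy_context()

    # The only way to (re)assign var within ctx is to run code in it:
    ctx.run(var.set, 42)
    assert ctx[var] == 42    # read-only observation from the outside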
Having initial values for ContextVars Having initial values for ContextVars
------------------------------------- -------------------------------------

379
pep-0571.rst Normal file
View File

@ -0,0 +1,379 @@
PEP: 571
Title: The manylinux2010 Platform Tag
Version: $Revision$
Last-Modified: $Date$
Author: Mark Williams <mrw@enotuniq.org>,
Geoffrey Thomas <geofft@ldpreload.com>,
Thomas Kluyver <thomas@kluyver.me.uk>
BDFL-Delegate: Nick Coghlan <ncoghlan@gmail.com>
Discussions-To: Distutils SIG <distutils-sig@python.org>
Status: Active
Type: Informational
Content-Type: text/x-rst
Created:
Post-History:
Resolution: https://mail.python.org/pipermail/distutils-sig/2018-April/032156.html
Abstract
========
This PEP proposes the creation of a ``manylinux2010`` platform tag to
succeed the ``manylinux1`` tag introduced by PEP 513 [1]_. It also
proposes that PyPI and ``pip`` both be updated to support uploading,
downloading, and installing ``manylinux2010`` distributions on compatible
platforms.
Rationale
=========
True to its name, the ``manylinux1`` platform tag has made the
installation of binary extension modules a reality on many Linux
systems. Libraries like ``cryptography`` [2]_ and ``numpy`` [3]_ are
more accessible to Python developers now that their installation on
common architectures does not depend on fragile development
environments and build toolchains.
``manylinux1`` wheels achieve their portability by allowing the
extension modules they contain to link against only a small set of
system-level shared libraries that export versioned symbols old enough
to benefit from backwards-compatibility policies. Extension modules
in a ``manylinux1`` wheel that rely on ``glibc``, for example, must be
built against version 2.5 or earlier; they may then be run on systems
that provide a more recent ``glibc`` version that still exports the
required symbols at version 2.5.
PEP 513 drew its whitelisted shared libraries and their symbol
versions from CentOS 5.11, which was the oldest supported CentOS
release at the time of its writing. Unfortunately, CentOS 5.11
reached its end-of-life on March 31st, 2017 with a clear warning
against its continued use. [4]_ No further updates, such as security
patches, will be made available. This means that its packages will
remain at obsolete versions that hamper the efforts of Python software
packagers who use the ``manylinux1`` Docker image.
CentOS 6 is now the oldest supported CentOS release, and will receive
maintenance updates through November 30th, 2020. [5]_ We propose that
a new PEP 425-style [6]_ platform tag called ``manylinux2010`` be derived
from CentOS 6 and that the ``manylinux`` toolchain, PyPI, and ``pip``
be updated to support it.
This was originally proposed as ``manylinux2``, but the versioning has
been changed to use calendar years (also known as CalVer [23]_). This
makes it easier to define future *manylinux* tags out of order: for
example, a hypothetical ``manylinux2017`` standard may be defined via
a new PEP before ``manylinux2014``, or a ``manylinux2007`` standard
might be defined that targets systems older than this PEP but newer
than ``manylinux1``.
Calendar versioning also gives a rough idea of which Linux
distribution versions support which tag: ``manylinux2010`` will work
on most distribution versions released since 2010. This is only an
approximation, however: the actual compatibility rules are defined
below, and some newer distributions may not meet them.
The ``manylinux2010`` policy
============================
The following criteria determine a ``linux`` wheel's eligibility for
the ``manylinux2010`` tag:
1. The wheel may only contain binary executables and shared objects
compiled for one of the two architectures supported by CentOS 6:
x86_64 or i686. [5]_
2. The wheel's binary executables or shared objects may not link
against externally-provided libraries except those in the following
whitelist: ::
libgcc_s.so.1
libstdc++.so.6
libm.so.6
libdl.so.2
librt.so.1
libcrypt.so.1
libc.so.6
libnsl.so.1
libutil.so.1
libpthread.so.0
libresolv.so.2
libX11.so.6
libXext.so.6
libXrender.so.1
libICE.so.6
libSM.so.6
libGL.so.1
libgobject-2.0.so.0
libgthread-2.0.so.0
libglib-2.0.so.0
This list is identical to the externally-provided libraries
whitelisted for ``manylinux1``, minus ``libncursesw.so.5`` and
``libpanelw.so.5``. [7]_ ``libpythonX.Y`` remains ineligible for
inclusion for the same reasons outlined in PEP 513.
On Debian-based systems, these libraries are provided by the packages:
============ =======================================================
Package Libraries
============ =======================================================
libc6 libdl.so.2, libresolv.so.2, librt.so.1, libc.so.6,
libpthread.so.0, libm.so.6, libutil.so.1, libcrypt.so.1,
libnsl.so.1
libgcc1 libgcc_s.so.1
libgl1 libGL.so.1
libglib2.0-0 libgobject-2.0.so.0, libgthread-2.0.so.0, libglib-2.0.so.0
libice6 libICE.so.6
libsm6 libSM.so.6
libstdc++6 libstdc++.so.6
libx11-6 libX11.so.6
libxext6 libXext.so.6
libxrender1 libXrender.so.1
============ =======================================================
On RPM-based systems, they are provided by these packages:
============ =======================================================
Package Libraries
============ =======================================================
glib2 libglib-2.0.so.0, libgthread-2.0.so.0, libgobject-2.0.so.0
glibc libresolv.so.2, libutil.so.1, libnsl.so.1, librt.so.1,
libcrypt.so.1, libpthread.so.0, libdl.so.2, libm.so.6,
libc.so.6
libICE libICE.so.6
libX11 libX11.so.6
libXext libXext.so.6
libXrender libXrender.so.1
libgcc libgcc_s.so.1
libstdc++ libstdc++.so.6
mesa libGL.so.1
============ =======================================================
3. If the wheel contains binary executables or shared objects linked
against any whitelisted libraries that also export versioned
symbols, they may only depend on the following maximum versions::
GLIBC_2.12
CXXABI_1.3.3
GLIBCXX_3.4.13
GCC_4.3.0
As an example, ``manylinux2010`` wheels may include binary artifacts
that require ``glibc`` symbols at version ``GLIBC_2.4``, because
this is an earlier version than the maximum of ``GLIBC_2.12``.
4. If a wheel is built for any version of CPython 2 or CPython
versions 3.0 up to and including 3.2, it *must* include a CPython
ABI tag indicating its Unicode ABI. A ``manylinux2010`` wheel built
against Python 2, then, must include either the ``cpy27mu`` tag
indicating it was built against an interpreter with the UCS-4 ABI
or the ``cpy27m`` tag indicating an interpreter with the UCS-2
ABI. [8]_ [9]_
5. A wheel *must not* require the ``PyFPE_jbuf`` symbol. This is
achieved by building it against a Python compiled *without* the
``--with-fpectl`` ``configure`` flag.
Compilation of Compliant Wheels
===============================
Like ``manylinux1``, the ``auditwheel`` tool adds ``manylinux2010``
platform tags to ``linux`` wheels built by ``pip wheel`` or
``bdist_wheel`` in a ``manylinux2010`` Docker container.
Docker Images
-------------
``manylinux2010`` Docker images based on CentOS 6 x86_64 and i686 are
provided for building binary ``linux`` wheels that can reliably be
converted to ``manylinux2010`` wheels. [10]_ These images come with a
full compiler suite installed (``gcc``, ``g++``, and ``gfortran``
4.8.2) as well as the latest releases of Python and ``pip``.
Compatibility with kernels that lack ``vsyscall``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A Docker container assumes that its userland is compatible with its
host's kernel. Unfortunately, an increasingly common kernel
configuration breaks this assumption for x86_64 CentOS 6 Docker
images.
Versions 2.14 and earlier of ``glibc`` require that the kernel provide an
archaic system call optimization known as ``vsyscall`` on x86_64. [11]_
To effect the optimization, the kernel maps a read-only page of
frequently-called system calls -- most notably ``time(2)`` -- into
each process at a fixed memory location. ``glibc`` then invokes these
system calls by dereferencing a function pointer to the appropriate
offset into the ``vsyscall`` page and calling it. This avoids the
overhead associated with invoking the kernel that affects normal
system call invocation. ``vsyscall`` has long been deprecated in
favor of an equivalent mechanism known as vDSO, or "virtual dynamic
shared object", in which the kernel instead maps a relocatable virtual
shared object containing the optimized system calls into each
process. [12]_
The ``vsyscall`` page has serious security implications because it
does not participate in address space layout randomization (ASLR).
Its predictable location and contents make it a useful source of
gadgets used in return-oriented programming attacks. [13]_ At the same
time, its elimination breaks the x86_64 ABI, because ``glibc``
versions that depend on ``vsyscall`` suffer from segmentation faults
when attempting to dereference a system call pointer into a
non-existent page. As a compromise, Linux 3.1 implemented an
"emulated" ``vsyscall`` that reduced the executable code, and thus the
material for ROP gadgets, mapped into the process. [14]_
``vsyscall=emulated`` has been the default configuration in most
distributions' kernels for many years.
Unfortunately, ``vsyscall`` emulation still exposes predictable code
at a reliable memory location, and continues to be useful for
return-oriented programming. [15]_ Because most distributions have now
upgraded to ``glibc`` versions that do not depend on ``vsyscall``,
they are beginning to ship kernels that do not support ``vsyscall`` at
all. [16]_
CentOS 5.11 and 6 both include versions of ``glibc`` that depend on
the ``vsyscall`` page (2.5 and 2.12.2 respectively), so containers
based on either cannot run under kernels provided with many
distribution's upcoming releases. [17]_ If Travis CI, for example,
begins running jobs under
a kernel that does not provide the ``vsyscall`` interface, Python
packagers will not be able to use our Docker images there to build
``manylinux`` wheels. [19]_
We have derived a patch from the ``glibc`` git repository that
backports the removal of all dependencies on ``vsyscall`` to the
version of ``glibc`` included with our ``manylinux2010`` image. [20]_
Rebuilding ``glibc``, and thus building the ``manylinux2010`` image itself,
still requires a host kernel that provides the ``vsyscall`` mechanism,
but the resulting image can be both run on hosts that provide it and
those that do not. Because the ``vsyscall`` interface is an
optimization that is only applied to running processes, the
``manylinux2010`` wheels built with this modified image should be
identical to those built on an unmodified CentOS 6 system. Also, the
``vsyscall`` problem applies only to x86_64; it is not part of the
i686 ABI.
Auditwheel
----------
The ``auditwheel`` tool has also been updated to produce
``manylinux2010`` wheels. [21]_ Its behavior and purpose are otherwise
unchanged from PEP 513.
Platform Detection for Installers
=================================
Platforms may define a ``manylinux2010_compatible`` boolean attribute on
the ``_manylinux`` module described in PEP 513. A platform is
considered incompatible with ``manylinux2010`` if the attribute is
``False``.
If the ``_manylinux`` module is not found, or it does not have the attribute
``manylinux2010_compatible``, tools may fall back to checking for glibc. If the
platform has glibc 2.12 or newer, it is assumed to be compatible unless the
``_manylinux`` module says otherwise.
Specifically, the algorithm we propose is::
def is_manylinux2010_compatible():
# Only Linux, and only x86-64 / i686
from distutils.util import get_platform
if get_platform() not in ["linux-x86_64", "linux-i686"]:
return False
# Check for presence of _manylinux module
try:
import _manylinux
return bool(_manylinux.manylinux2010_compatible)
except (ImportError, AttributeError):
# Fall through to heuristic check below
pass
# Check glibc version. CentOS 6 uses glibc 2.12.
# PEP 513 contains an implementation of this function.
return have_compatible_glibc(2, 12)
Backwards compatibility with ``manylinux1`` wheels
==================================================
As explained in PEP 513, the specified symbol versions for
``manylinux1`` whitelisted libraries constitute an *upper bound*. The
same is true for the symbol versions defined for ``manylinux2010`` in
this PEP. As a result, ``manylinux1`` wheels are considered
``manylinux2010`` wheels. A ``pip`` that recognizes the ``manylinux2010``
platform tag will thus install ``manylinux1`` wheels for
``manylinux2010`` platforms -- even when explicitly set -- when no
``manylinux2010`` wheels are available. [22]_
PyPI Support
============
PyPI should permit wheels containing the ``manylinux2010`` platform tag
to be uploaded in the same way that it permits ``manylinux1``. It
should not attempt to verify the compatibility of ``manylinux2010``
wheels.
References
==========
.. [1] PEP 513 -- A Platform Tag for Portable Linux Built Distributions
(https://www.python.org/dev/peps/pep-0513/)
.. [2] pyca/cryptography
(https://cryptography.io/)
.. [3] numpy
(https://numpy.org)
.. [4] CentOS 5.11 EOL announcement
(https://lists.centos.org/pipermail/centos-announce/2017-April/022350.html)
.. [5] CentOS Product Specifications
(https://web.archive.org/web/20180108090257/https://wiki.centos.org/About/Product)
.. [6] PEP 425 -- Compatibility Tags for Built Distributions
(https://www.python.org/dev/peps/pep-0425/)
.. [7] ncurses 5 -> 6 transition means we probably need to drop some
libraries from the manylinux whitelist
(https://github.com/pypa/manylinux/issues/94)
.. [8] PEP 3149
https://www.python.org/dev/peps/pep-3149/
.. [9] SOABI support for Python 2.X and PyPy
https://github.com/pypa/pip/pull/3075
.. [10] manylinux2 Docker images
(https://hub.docker.com/r/markrwilliams/manylinux2/)
.. [11] On vsyscalls and the vDSO
(https://lwn.net/Articles/446528/)
.. [12] vdso(7)
(http://man7.org/linux/man-pages/man7/vdso.7.html)
.. [13] Framing Signals -- A Return to Portable Shellcode
(http://www.cs.vu.nl/~herbertb/papers/srop_sp14.pdf)
.. [14] ChangeLog-3.1
(https://www.kernel.org/pub/linux/kernel/v3.x/ChangeLog-3.1)
.. [15] Project Zero: Three bypasses and a fix for one of Flash's Vector.<*> mitigations
(https://googleprojectzero.blogspot.com/2015/08/three-bypasses-and-fix-for-one-of.html)
.. [16] linux: activate CONFIG_LEGACY_VSYSCALL_NONE ?
(https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=852620)
.. [17] [Wheel-builders] Heads-up re: new kernel configurations breaking the manylinux docker image
(https://mail.python.org/pipermail/wheel-builders/2016-December/000239.html)
.. [18] No longer used
.. [19] Travis CI
(https://travis-ci.org/)
.. [20] remove-vsyscall.patch
https://github.com/markrwilliams/manylinux/commit/e9493d55471d153089df3aafca8cfbcb50fa8093#diff-3eda4130bdba562657f3ec7c1b3f5720
.. [21] auditwheel manylinux2 branch
(https://github.com/markrwilliams/auditwheel/tree/manylinux2)
.. [22] pip manylinux2 branch
https://github.com/markrwilliams/pip/commits/manylinux2
.. [23] Calendar Versioning
http://calver.org/
Copyright
=========
This document has been placed into the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

1262
pep-0572.rst Normal file

File diff suppressed because it is too large Load Diff

568
pep-0573.rst Normal file
View File

@ -0,0 +1,568 @@
PEP: 573
Title: Module State Access from C Extension Methods
Version: $Revision$
Last-Modified: $Date$
Author: Petr Viktorin <encukou@gmail.com>,
Nick Coghlan <ncoghlan@gmail.com>,
Eric Snow <ericsnowcurrently@gmail.com>
Marcel Plch <gmarcel.plch@gmail.com>
Discussions-To: import-sig@python.org
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 02-Jun-2016
Python-Version: 3.8
Post-History:
Abstract
========
This PEP proposes to add a way for CPython extension methods to access context such as
the state of the modules they are defined in.
This will allow extension methods to use direct pointer dereferences
rather than ``PyState_FindModule`` for looking up module state, reducing or eliminating the
performance cost of using module-scoped state over process global state.
This fixes one of the remaining roadblocks for adoption of PEP 3121 (Extension
module initialization and finalization) and PEP 489
(Multi-phase extension module initialization).
Additionally, support for easier creation of immutable exception classes is added.
This removes the need for keeping per-module state if it would only be used
for exception classes.
While this PEP takes an additional step towards fully solving the problems that PEP 3121 and PEP 489 started
tackling, it does not attempt to resolve *all* remaining concerns. In particular, accessing the module state from slot methods (``nb_add``, etc.) remains slower than accessing that state from other extension methods.
Terminology
===========
Process-Global State
--------------------
C-level static variables. Since this is very low-level
memory storage, it must be managed carefully.
Per-module State
----------------
State local to a module object, allocated dynamically as part of a
module object's initialization. This isolates the state from other
instances of the module (including those in other subinterpreters).
Accessed by ``PyModule_GetState()``.
Static Type
-----------
A type object defined as a C-level static variable, i.e. a compiled-in type object.
A static type needs to be shared between module instances and has no
information about which module it belongs to.
Static types do not have ``__dict__`` (although their instances might).
Heap Type
---------
A type object created at run time.
Rationale
=========
PEP 489 introduced a new way to initialize extension modules, which brings
several advantages to extensions that implement it:
* The extension modules behave more like their Python counterparts.
* The extension modules can easily support loading into pre-existing
module objects, which paves the way for extension module support for
``runpy`` or for systems that enable extension module reloading.
* Loading multiple modules from the same extension is possible, which
makes testing module isolation (a key feature for proper sub-interpreter
support) possible from a single interpreter.
The biggest hurdle for adoption of PEP 489 is allowing access to module state
from methods of extension types.
Currently, the way to access this state from extension methods is by looking up the module via
``PyState_FindModule`` (in contrast to module level functions in extension modules, which
receive a module reference as an argument).
However, ``PyState_FindModule`` queries the thread-local state, making it relatively
costly compared to C level process global access and consequently deterring module authors from using it.
Also, ``PyState_FindModule`` relies on the assumption that in each
subinterpreter, there is at most one module corresponding to
a given ``PyModuleDef``. This does not align well with Python's import
machinery. Since PEP 489 aimed to fix that, the assumption does
not hold for modules that use multi-phase initialization, so
``PyState_FindModule`` is unavailable for these modules.
A faster, safer way of accessing module-level state from extension methods
is needed.
Immutable Exception Types
-------------------------
For isolated modules to work, any class whose methods touch module state
must be a heap type, so that each instance of a module can have its own
type object. With the changes proposed in this PEP, heap type instances will
have access to module state without global registration. But, to create
instances of heap types, one will need the module state in order to
get the type object corresponding to the appropriate module.
In short, heap types are "viral": anything that "touches" them must itself be
a heap type.
Currently, most exception types, apart from the ones in ``builtins``, are
heap types. This is likely simply because there is a convenient way
to create them: ``PyErr_NewException``.
Heap types generally have a mutable ``__dict__``.
In most cases, this mutability is harmful. For example, exception types
from the ``sqlite`` module are mutable and shared across subinterpreters.
This allows "smuggling" values to other subinterpreters via attributes of
``sqlite3.Error``.
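For example, nothing currently prevents code in one subinterpreter from
doing the following, after which the attribute is visible wherever the
shared type object is (a sketch of the status quo, not of the proposed
behaviour)::

    import sqlite3

    # The exception type is a mutable heap type with a writable __dict__,
    # shared across subinterpreters:
    sqlite3.Error.smuggled = "visible to every subinterpreter"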
Moreover, since raising exceptions is a common operation, and heap types
will be "viral", ``PyErr_NewException`` will tend to "infect" the module
with "heap type-ness" at least if the module decides play well with
subinterpreters/isolation.
Many modules could go without module state
entirely if the exception classes were immutable.
To solve this problem, a new function for creating immutable exception types
is proposed.
Background
===========
The implementation of a Python method may need access to one or more of
the following pieces of information:
* The instance it is called on (``self``)
* The underlying function
* The class the method was defined in
* The corresponding module
* The module state
In Python code, the Python-level equivalents may be retrieved as::
import sys
def meth(self):
instance = self
module_globals = globals()
module_object = sys.modules[__name__] # (1)
underlying_function = Foo.meth # (1)
defining_class = Foo # (1)
defining_class = __class__ # (2)
.. note::
The defining class is not ``type(self)``, since ``type(self)`` might
be a subclass of ``Foo``.
The statements marked (1) implicitly rely on name-based lookup via the function's ``__globals__``:
either the ``Foo`` attribute to access the defining class and Python function object, or ``__name__`` to find the module object in ``sys.modules``.
In Python code, this is feasible, as ``__globals__`` is set appropriately when the function definition is executed, and
even if the namespace has been manipulated to return a different object, at worst an exception will be raised.
The ``__class__`` closure, (2), is a safer way to get the defining class, but it still relies on ``__closure__`` being set appropriately.
By contrast, extension methods are typically implemented as normal C functions.
This means that they only have access to their arguments and C level thread-local
and process-global states. Traditionally, many extension modules have stored
their shared state in C-level process globals, causing problems when:
* running multiple initialize/finalize cycles in the same process
* reloading modules (e.g. to test conditional imports)
* loading extension modules in subinterpreters
PEP 3121 attempted to resolve this by offering the ``PyState_FindModule`` API, but this still has significant problems when it comes to extension methods (rather than module level functions):
* it is markedly slower than directly accessing C-level process-global state
* there is still some inherent reliance on process global state that means it still doesn't reliably handle module reloading
It's also the case that when looking up a C-level struct such as module state, supplying
an unexpected object layout can crash the interpreter, so it's significantly more important to ensure that extension
methods receive the kind of object they expect.
Proposal
========
Currently, a bound extension method (``PyCFunction`` or ``PyCFunctionWithKeywords``) receives only
``self``, and (if applicable) the supplied positional and keyword arguments.
While module-level extension functions already receive access to the defining module object via their
``self`` argument, methods of extension types don't have that luxury: they receive the bound instance
via ``self``, and hence have no direct access to the defining class or the module level state.
The additional module level context described above can be made available with two changes.
Both additions are optional; extension authors need to opt in to start
using them:
* Add a pointer to the module to heap type objects.
* Pass the defining class to the underlying C function.
The defining class is readily available at the time the built-in
method object (``PyCFunctionObject``) is created, so it can be stored
in a new struct that extends ``PyCFunctionObject``.
The module state can then be retrieved from the module object via
``PyModule_GetState``.
Note that this proposal implies that any type whose method needs to access
per-module state must be a heap type, rather than a static type.
This is necessary to support loading multiple module objects from a single
extension: a static type, as a C-level global, has no information about
which module it belongs to.
Slot methods
------------
The above changes don't cover slot methods, such as ``tp_iter`` or ``nb_add``.
The problem with slot methods is that their C API is fixed, so we can't
simply add a new argument to pass in the defining class.
Two possible solutions have been proposed to this problem:
* Look up the class through walking the MRO.
This is potentially expensive, but will be useful if performance is not
a problem (such as when raising a module-level exception).
* Storing a pointer to the defining class of each slot in a separate table,
``__typeslots__`` [#typeslots-mail]_. This is technically feasible and fast,
but quite invasive.
Due to the invasiveness of the latter approach, this PEP proposes adding an MRO walking
helper for use in slot method implementations, deferring the more complex alternative
as a potential future optimisation. Modules affected by this concern also have the
option of using thread-local state or PEP 567 context variables, or else defining their
own reload-friendly lookup caching scheme.
Immutable Exception Types
-------------------------
To facilitate creating static exception classes, a new function is proposed:
``PyErr_PrepareImmutableException``. It will work similarly to ``PyErr_NewExceptionWithDoc``
but will take a ``PyTypeObject **`` pointer, which points to a ``PyTypeObject *`` that is
either ``NULL`` or an initialized ``PyTypeObject``.
This pointer may be declared in process-global state. The function will then
allocate the exception type only if one does not already exist, taking care
never to overwrite an already initialized exception type.
The extra indirection makes it possible to make ``PyErr_PrepareImmutableException``
part of the stable ABI by having the Python interpreter, rather than extension code,
allocate the ``PyTypeObject``.
Specification
=============
Adding module references to heap types
--------------------------------------
The ``PyHeapTypeObject`` struct will get a new member, ``PyObject *ht_module``,
that can store a pointer to the module object for which the type was defined.
It will be ``NULL`` by default, and should not be modified after the type
object is created.
A new factory method will be added for creating modules::
PyObject* PyType_FromModuleAndSpec(PyObject *module,
PyType_Spec *spec,
PyObject *bases)
This acts the same as ``PyType_FromSpecWithBases``, and additionally sets
``ht_module`` to the provided module object.
Additionally, an accessor, ``PyObject * PyType_GetModule(PyTypeObject *)``
will be provided.
It will return the ``ht_module`` if a heap type with the module pointer set
is passed in; otherwise it will set ``SystemError`` and return ``NULL``.
Usually, creating a class with ``ht_module`` set will create a reference
cycle involving the class and the module.
This is not a problem, as tearing down modules is not a performance-sensitive
operation (and module-level functions typically also create reference cycles).
The existing "set all module globals to None" code that breaks function cycles
through ``f_globals`` will also break the new cycles through ``ht_module``.
Passing the defining class to extension methods
-----------------------------------------------
A new style of C-level functions will be added to the current selection of
``PyCFunction`` and ``PyCFunctionWithKeywords``::
PyObject *PyCMethod(PyObject *self,
PyTypeObject *defining_class,
PyObject *args, PyObject *kwargs)
A new method object flag, ``METH_METHOD``, will be added to signal that
the underlying C function is ``PyCMethod``.
To hold the extra information, a new structure extending ``PyCFunctionObject``
will be added::
typedef struct {
PyCFunctionObject func;
PyTypeObject *mm_class; /* Passed as 'defining_class' arg to the C func */
} PyCMethodObject;
To allow passing the defining class to the underlying C function, a change
to private API is required: ``_PyMethodDef_RawFastCallDict`` and
``_PyMethodDef_RawFastCallKeywords`` will now receive ``PyTypeObject *cls``
as one of their arguments.
A new macro ``PyCFunction_GET_CLASS(cls)`` will be added for easier access to ``mm_class``.
Method construction and calling code will be updated to honor
``METH_METHOD``.
Argument Clinic
---------------
To support passing the defining class to methods using Argument Clinic,
a new converter will be added to clinic.py: ``defining_class``.
Each method may only have one argument using this converter, and it must
appear after ``self``, or, if ``self`` is not used, as the first argument.
The argument will be of type ``PyTypeObject *``.
When used, Argument Clinic will select ``METH_METHOD`` as the calling
convention.
The argument will not appear in ``__text_signature__``.
This will be compatible with ``__init__`` and ``__new__`` methods, where an
MRO walker will be used to pass the defining class from clinic generated
code to the user's function.
Slot methods
------------
To allow access to per-module state from slot methods, an MRO walker
will be implemented::
PyTypeObject *PyType_DefiningTypeFromSlotFunc(PyTypeObject *type,
int slot, void *func)
The walker will go through bases of heap-allocated ``type``
and search for the class that defines ``func`` at its ``slot``.
The ``func`` need not be inherited by ``type``; the only requirement
for the walker to find the defining class is that the defining class
must be heap-allocated.
On failure, an exception is set and ``NULL`` is returned.
Static exceptions
-----------------
A new function will be added::
int PyErr_PrepareImmutableException(PyTypeObject **exc,
const char *name,
const char *doc,
PyObject *base)
Creates an immutable exception type which can be shared
across multiple module objects.
If the type already exists (determined by a process-global pointer,
``*exc``), skip the initialization and only ``INCREF`` it.
If ``*exc`` is NULL, the function will
allocate a new exception type and initialize it using given parameters
the same way ``PyType_FromSpecAndBases`` would.
The ``doc`` and ``base`` arguments may be ``NULL``, defaulting to a
missing docstring and ``PyExc_Exception`` base class, respectively.
The exception type's ``tp_flags`` will be set to values common to
built-in exceptions and the ``Py_TPFLAGS_HEAP_IMMUTABLE`` flag (see below)
will be set.
On failure, ``PyErr_PrepareImmutableException`` will set an exception
and return -1.
If called with an initialized exception type (``*exc``
is non-NULL), the function will do nothing but incref ``*exc``.
A new flag, ``Py_TPFLAGS_HEAP_IMMUTABLE``, will be added to prevent
mutation of the type object. This makes it possible to
share the object safely between multiple interpreters.
This flag is checked in ``type_setattro`` and blocks
setting of attributes when set, similar to built-in types.
A new pointer, ``ht_moduleptr``, will be added to heap types to store ``exc``.
On deinitialization of the exception type, ``*exc`` will be set to ``NULL``.
This makes it safe for ``PyErr_PrepareImmutableException`` to check if
the exception was already initialized.
PyType_offsets
--------------
Some extension types allocate their instances with ``__dict__`` or ``__weakref__``
slots. Currently, there is no way of passing the offsets of these through
``PyType_Spec``. To allow this, a new structure and a spec slot are proposed.
A new structure, ``PyType_offsets``, will have two members containing the
offsets of ``__dict__`` and ``__weakref__``::
typedef struct {
Py_ssize_t dict;
Py_ssize_t weaklist;
} PyType_offsets;
The new slot, ``Py_offsets``, will be used to pass a ``PyType_offsets *``
structure containing the mentioned data.
Helpers
-------
Getting to per-module state from a heap type is a very common task. To make this
easier, a helper will be added::
void *PyType_GetModuleState(PyObject *type)
This function takes a heap type and, on success, returns a pointer to the state
of the module that the heap type belongs to.
On failure, two scenarios may occur. When a type without a module is passed in,
``SystemError`` is set and ``NULL`` is returned. If the module is found, a pointer
to its state, which may be ``NULL``, is returned without setting any exception.
Modules Converted in the Initial Implementation
-----------------------------------------------
To validate the approach, several modules will be modified during
the initial implementation:
The ``zipimport``, ``_io``, ``_elementtree``, and ``_csv`` modules
will be ported to PEP 489 multiphase initialization.
Summary of API Changes and Additions
====================================
New functions:
* PyType_GetModule
* PyType_DefiningTypeFromSlotFunc
* PyType_GetModuleState
* PyErr_PrepareImmutableException
New macros:
* PyCFunction_GET_CLASS
New types:
* PyCMethodObject
New structures:
* PyType_offsets
Modified functions:
* _PyMethodDef_RawFastCallDict now receives ``PyTypeObject *cls``.
* _PyMethodDef_RawFastCallKeywords now receives ``PyTypeObject *cls``.
Modified structures:
* _heaptypeobject - added ht_module and ht_moduleptr
Other changes:
* METH_METHOD call flag
* defining_class converter in clinic
* Py_TPFLAGS_HEAP_IMMUTABLE flag
* Py_offsets type spec slot
Backwards Compatibility
=======================
Two new pointers are added to all heap types.
All other changes are adding new functions, structures and a type flag.
The new ``PyErr_PrepareImmutableException`` function encourages
modules to switch from heap type exception classes to immutable ones,
and a number of modules will be switched in the initial implementation.
This change will prevent adding class attributes to such types.
For example, the following will raise ``AttributeError``::
sqlite.OperationalError.foo = None
Instances and subclasses of such exceptions will not be affected.
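A sketch of what remains allowed::

    import sqlite3

    # Subclasses are ordinary mutable classes, and instance attributes
    # live in the instance dict, so both continue to work:
    class MyOperationalError(sqlite3.OperationalError):
        pass

    MyOperationalError.foo = None            # still allowed
    err = sqlite3.OperationalError("boom")
    err.foo = None                           # still allowed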
Implementation
==============
An initial implementation is available in a Github repository [#gh-repo]_;
a patchset is at [#gh-patch]_.
Possible Future Extensions
==========================
Easy creation of types with module references
---------------------------------------------
It would be possible to add a PEP 489 execution slot type to make
creating heap types significantly easier than calling
``PyType_FromModuleAndSpec``.
This is left to a future PEP.
Optimization
------------
CPython optimizes calls to methods that have restricted signatures,
such as not allowing keyword arguments.
As proposed here, methods defined with the ``METH_METHOD`` flag do not support
these optimizations.
Optimized calls still have the option of accessing per-module state
the same way slot methods do.
References
==========
.. [#typeslots-mail] [Import-SIG] On singleton modules, heap types, and subinterpreters
(https://mail.python.org/pipermail/import-sig/2015-July/001035.html)
.. [#gh-repo]
https://github.com/Traceur759/cpython/commits/pep-c
.. [#gh-patch]
https://github.com/Traceur759/cpython/compare/master...Traceur759:pep-c.patch
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

512
pep-0574.rst Normal file
View File

@ -0,0 +1,512 @@
PEP: 574
Title: Pickle protocol 5 with out-of-band data
Version: $Revision$
Last-Modified: $Date$
Author: Antoine Pitrou <solipsis@pitrou.net>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 23-Mar-2018
Post-History: 28-Mar-2018
Resolution:
Abstract
========
This PEP proposes to standardize a new pickle protocol version, and
accompanying APIs to take full advantage of it:
1. A new pickle protocol version (5) to cover the extra metadata needed
for out-of-band data buffers.
2. A new ``PickleBuffer`` type for ``__reduce_ex__`` implementations
to return out-of-band data buffers.
3. A new ``buffer_callback`` parameter when pickling, to handle out-of-band
data buffers.
4. A new ``buffers`` parameter when unpickling to provide out-of-band data
buffers.
The PEP guarantees unchanged behaviour for anyone not using the new APIs.
Rationale
=========
The pickle protocol was originally designed in 1995 for on-disk persistency
of arbitrary Python objects. The performance of a 1995-era storage medium
probably made it irrelevant to focus on performance metrics such as
use of RAM bandwidth when copying temporary data before writing it to disk.
Nowadays the pickle protocol sees a growing use in applications where most
of the data isn't ever persisted to disk (or, when it is, it uses a portable
format instead of a Python-specific one). Instead, pickle is being used to transmit
data and commands from one process to another, either on the same machine
or on multiple machines. Those applications will sometimes deal with very
large data (such as Numpy arrays or Pandas dataframes) that need to be
transferred around. For those applications, pickle is currently
wasteful as it imposes spurious memory copies of the data being serialized.
As a matter of fact, the standard ``multiprocessing`` module uses pickle
for serialization, and therefore also suffers from this problem when
sending large data to another process.
Third-party Python libraries, such as Dask [#dask]_, PyArrow [#pyarrow]_
and IPyParallel [#ipyparallel]_, have started implementing alternative
serialization schemes with the explicit goal of avoiding copies on large
data. Implementing a new serialization scheme is difficult and often
leads to reduced generality (since many Python objects support pickle
but not the new serialization scheme). Falling back on pickle for
unsupported types is an option, but then you get back the spurious
memory copies you wanted to avoid in the first place. For example,
``dask`` is able to avoid memory copies for Numpy arrays and
built-in containers thereof (such as lists or dicts containing Numpy
arrays), but if a large Numpy array is an attribute of a user-defined
object, ``dask`` will serialize the user-defined object as a pickle
stream, leading to memory copies.
The common theme of these third-party serialization efforts is to generate
a stream of object metadata (which contains pickle-like information about
the objects being serialized) and a separate stream of zero-copy buffer
objects for the payloads of large objects. Note that, in this scheme,
small objects such as ints, etc. can be dumped together with the metadata
stream. Refinements can include opportunistic compression of large data
depending on its type and layout, like ``dask`` does.
This PEP aims to make ``pickle`` usable in a way where large data is handled
as a separate stream of zero-copy buffers, letting the application handle
those buffers optimally.
Example
=======
To keep the example simple and avoid requiring knowledge of third-party
libraries, we will focus here on a bytearray object (but the issue is
conceptually the same with more sophisticated objects such as Numpy arrays).
Like most objects, the bytearray object isn't immediately understood by
the pickle module and must therefore specify its decomposition scheme.
Here is how a bytearray object currently decomposes for pickling::
>>> b.__reduce_ex__(4)
(<class 'bytearray'>, (b'abc',), None)
This is because the ``bytearray.__reduce_ex__`` implementation reads
morally as follows::
class bytearray:
def __reduce_ex__(self, protocol):
if protocol == 4:
return type(self), bytes(self), None
# Legacy code for earlier protocols omitted
In turn it produces the following pickle code::
>>> pickletools.dis(pickletools.optimize(pickle.dumps(b, protocol=4)))
0: \x80 PROTO 4
2: \x95 FRAME 30
11: \x8c SHORT_BINUNICODE 'builtins'
21: \x8c SHORT_BINUNICODE 'bytearray'
32: \x93 STACK_GLOBAL
33: C SHORT_BINBYTES b'abc'
38: \x85 TUPLE1
39: R REDUCE
40: . STOP
(the call to ``pickletools.optimize`` above is only meant to make the
pickle stream more readable by removing the MEMOIZE opcodes)
We can notice several things about the bytearray's payload (the sequence
of bytes ``b'abc'``):
* ``bytearray.__reduce_ex__`` produces a first copy by instantiating a
new bytes object from the bytearray's data.
* ``pickle.dumps`` produces a second copy when inserting the contents of
that bytes object into the pickle stream, after the SHORT_BINBYTES opcode.
* Furthermore, when deserializing the pickle stream, a temporary bytes
object is created when the SHORT_BINBYTES opcode is encountered (inducing
a data copy).
What we really want is something like the following:
* ``bytearray.__reduce_ex__`` produces a *view* of the bytearray's data.
* ``pickle.dumps`` doesn't try to copy that data into the pickle stream
but instead passes the buffer view to its caller (which can decide on the
most efficient handling of that buffer).
* When deserializing, ``pickle.loads`` takes the pickle stream and the
buffer view separately, and passes the buffer view directly to the
bytearray constructor.
We see that several conditions are required for the above to work:
* ``__reduce__`` or ``__reduce_ex__`` must be able to return *something*
that indicates a serializable no-copy buffer view.
* The pickle protocol must be able to represent references to such buffer
views, instructing the unpickler that it may have to get the actual buffer
out of band.
* The ``pickle.Pickler`` API must provide its caller with a way
to receive such buffer views while serializing.
* The ``pickle.Unpickler`` API must similarly allow its caller to provide
the buffer views required for deserialization.
* For compatibility, the pickle protocol must also be able to contain direct
serializations of such buffer views, such that current uses of the ``pickle``
API don't have to be modified if they are not concerned with memory copies.
Producer API
============
We are introducing a new type ``pickle.PickleBuffer`` which can be
instantiated from any buffer-supporting object, and is specifically meant
to be returned from ``__reduce__`` implementations::
class bytearray:
def __reduce_ex__(self, protocol):
if protocol >= 5:
return type(self), (PickleBuffer(self),), None
# Legacy code for earlier protocols omitted
``PickleBuffer`` is a simple wrapper that doesn't have all the memoryview
semantics and functionality, but is specifically recognized by the ``pickle``
module if protocol 5 or higher is enabled. It is an error to try to
serialize a ``PickleBuffer`` with pickle protocol version 4 or earlier.
Only the raw *data* of the ``PickleBuffer`` will be considered by the
``pickle`` module. Any type-specific *metadata* (such as shapes or
datatype) must be returned separately by the type's ``__reduce__``
implementation, as is already the case.
PickleBuffer objects
--------------------
The ``PickleBuffer`` class supports a very simple Python API. Its constructor
takes a single PEP 3118-compatible object [#pep-3118]_. ``PickleBuffer``
objects themselves support the buffer protocol, so consumers can
call ``memoryview(...)`` on them to get additional information
about the underlying buffer (such as the original type, shape, etc.).
In addition, ``PickleBuffer`` objects can be explicitly released using
their ``release()`` method.
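A minimal sketch of that Python-level API, using the names proposed
above (today this requires the experimental ``pickle5`` backport or a
patched interpreter)::

    import pickle

    buf = pickle.PickleBuffer(bytearray(b"abc"))
    view = memoryview(buf)       # PickleBuffer supports the buffer protocol
    print(view.nbytes, view.readonly)   # 3 False
    view.release()
    buf.release()                # explicitly release the underlying buffer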
On the C side, a simple API will be provided to create and inspect
PickleBuffer objects:
``PyObject *PyPickleBuffer_FromObject(PyObject *obj)``
Create a ``PickleBuffer`` object holding a view over the PEP 3118-compatible
*obj*.
``PyPickleBuffer_Check(PyObject *obj)``
Return whether *obj* is a ``PickleBuffer`` instance.
``const Py_buffer *PyPickleBuffer_GetBuffer(PyObject *picklebuf)``
Return a pointer to the internal ``Py_buffer`` owned by the ``PickleBuffer``
instance. An exception is raised if the buffer is released.
``int PyPickleBuffer_Release(PyObject *picklebuf)``
Release the ``PickleBuffer`` instance's underlying buffer.
``PickleBuffer`` can wrap any kind of buffer, including non-contiguous
buffers. It's up to consumers to decide how best to handle different kinds
of buffers (for example, some consumers may find it acceptable to make a
contiguous copy of non-contiguous buffers).
Consumer API
============
``pickle.Pickler.__init__`` and ``pickle.dumps`` are augmented with an additional
``buffer_callback`` parameter::
class Pickler:
def __init__(self, file, protocol=None, ..., buffer_callback=None):
"""
If *buffer_callback* is not None, then it is called with a list
of out-of-band buffer views when deemed necessary (this could be
once every buffer, or only after a certain size is reached,
or once at the end, depending on implementation details). The
callback should arrange to store or transmit those buffers without
changing their order.
If *buffer_callback* is None (the default), buffer views are
serialized into *file* as part of the pickle stream.
It is an error if *buffer_callback* is not None and *protocol* is
None or smaller than 5.
"""
def pickle.dumps(obj, protocol=None, *, ..., buffer_callback=None):
"""
See above for *buffer_callback*.
"""
``pickle.Unpickler.__init__`` and ``pickle.loads`` are augmented with an
additional ``buffers`` parameter::
class Unpickler:
def __init__(self, file, *, ..., buffers=None):
"""
If *buffers* is not None, it should be an iterable of buffer-enabled
objects that is consumed each time the pickle stream references
an out-of-band buffer view. Such buffers have been given in order
to the *buffer_callback* of a Pickler object.
If *buffers* is None (the default), then the buffers are taken
from the pickle stream, assuming they are serialized there.
It is an error for *buffers* to be None if the pickle stream
was produced with a non-None *buffer_callback*.
"""
def pickle.loads(data, *, ..., buffers=None):
"""
See above for *buffers*.
"""
Protocol changes
================
Three new opcodes are introduced:
* ``BYTEARRAY8`` creates a bytearray from the data following it in the pickle
stream and pushes it on the stack (just like ``BINBYTES8`` does for bytes
objects);
* ``NEXT_BUFFER`` fetches a buffer from the ``buffers`` iterable and pushes
it on the stack.
* ``READONLY_BUFFER`` makes a readonly view of the top of the stack.
When pickling encounters a ``PickleBuffer``, there can be four cases:
* If a ``buffer_callback`` is given and the ``PickleBuffer`` is writable,
the ``PickleBuffer`` is given to the callback and a ``NEXT_BUFFER`` opcode
is appended to the pickle stream.
* If a ``buffer_callback`` is given and the ``PickleBuffer`` is readonly,
the ``PickleBuffer`` is given to the callback and a ``NEXT_BUFFER`` opcode
is appended to the pickle stream, followed by a ``READONLY_BUFFER`` opcode.
* If no ``buffer_callback`` is given and the ``PickleBuffer`` is writable,
it is serialized into the pickle stream as if it were a ``bytearray`` object.
* If no ``buffer_callback`` is given and the ``PickleBuffer`` is readonly,
it is serialized into the pickle stream as if it were a ``bytes`` object.
The distinction between readonly and writable buffers is explained below
(see "Mutability").
Side effects
============
Improved in-band performance
----------------------------
Even in-band pickling can be improved by returning a ``PickleBuffer``
instance from ``__reduce_ex__``, as one copy is avoided on the serialization
path [#ogrisel-numpy]_.
Caveats
=======
Mutability
----------
PEP 3118 buffers [#pep-3118]_ can be readonly or writable. Some objects,
such as Numpy arrays, need to be backed by a mutable buffer for full
operation. Pickle consumers that use the ``buffer_callback`` and ``buffers``
arguments will have to be careful to recreate mutable buffers. When doing
I/O, this implies using buffer-passing API variants such as ``readinto``
(which are also often preferable for performance).
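For example, a receiver that must hand back *writable* buffers can read
each one into a preallocated ``bytearray``. A sketch, assuming the buffer
sizes are communicated separately::

    def recv_writable_buffers(sock, sizes):
        buffers = []
        for size in sizes:
            buf = bytearray(size)           # mutable backing storage
            view = memoryview(buf)
            while view:
                n = sock.recv_into(view)    # fill the buffer in place
                if n == 0:
                    raise EOFError("connection closed mid-buffer")
                view = view[n:]
            buffers.append(buf)
        return buffers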
Data sharing
------------
If you pickle and then unpickle an object in the same process, passing
out-of-band buffer views, then the unpickled object may be backed by the
same buffer as the original pickled object.
For example, it might be reasonable to implement reduction of a Numpy array
as follows (crucial metadata such as shapes is omitted for simplicity)::
class ndarray:
def __reduce_ex__(self, protocol):
if protocol >= 5:
return numpy.frombuffer, (PickleBuffer(self), self.dtype)
# Legacy code for earlier protocols omitted
Then simply passing the PickleBuffer around from ``dumps`` to ``loads``
will produce a new Numpy array sharing the same underlying memory as the
original Numpy object (and, incidentally, keeping it alive)::
>>> import numpy as np
>>> a = np.zeros(10)
>>> a[0]
0.0
>>> buffers = []
>>> data = pickle.dumps(a, protocol=5, buffer_callback=buffers.extend)
>>> b = pickle.loads(data, buffers=buffers)
>>> b[0] = 42
>>> a[0]
42.0
This won't happen with the traditional ``pickle`` API (i.e. without passing
``buffers`` and ``buffer_callback`` parameters), because then the buffer view
is serialized inside the pickle stream with a copy.
Rejected alternatives
=====================
Using the existing persistent load interface
--------------------------------------------
The ``pickle`` persistence interface is a way of storing references to
designated objects in the pickle stream while handling their actual
serialization out of band. For example, one might consider the following
for zero-copy serialization of bytearrays::
class MyPickle(pickle.Pickler):
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self.buffers = []
def persistent_id(self, obj):
if type(obj) is not bytearray:
return None
else:
index = len(self.buffers)
self.buffers.append(obj)
return ('bytearray', index)
class MyUnpickle(pickle.Unpickler):
def __init__(self, *args, buffers, **kwargs):
super().__init__(*args, **kwargs)
self.buffers = buffers
def persistent_load(self, pid):
type_tag, index = pid
if type_tag == 'bytearray':
return self.buffers[index]
else:
assert 0 # unexpected type
This mechanism has two drawbacks:
* Each ``pickle`` consumer must reimplement ``Pickler`` and ``Unpickler``
subclasses, with custom code for each type of interest. Essentially,
N pickle consumers end up each implementing custom code for M producers.
This is difficult (especially for sophisticated types such as Numpy
arrays) and poorly scalable.
* Each object encountered by the pickle module (even simple built-in objects
such as ints and strings) triggers a call to the user's ``persistent_id()``
method, leading to a possible performance drop compared to nominal.
Open questions
==============
Should ``buffer_callback`` take a single buffer or a sequence of buffers?
* Taking a single buffer would allow returning a boolean indicating whether
the given buffer should be serialized in-band or out-of-band.
* Taking a sequence of buffers is potentially more efficient by reducing
function call overhead.
Should it be allowed to serialize a ``PickleBuffer`` in protocol 4 and earlier?
It would simply be serialized as a ``bytes`` object (if read-only) or
``bytearray`` (if writable).
* It can make implementing ``__reduce__`` simpler.
* Serializing a ``bytearray`` in protocol 4 makes a supplementary memory
copy when ``bytearray.__reduce_ex__`` returns a ``bytes`` object. This
is a performance regression that may be overlooked by ``__reduce__``
implementors.
Implementation
==============
A first implementation is available in the author's GitHub fork [#pickle5-git]_.
An experimental backport for Python 3.6 and 3.7 is downloadable from PyPI
[#pickle5-pypi]_.
Related work
============
Dask.distributed implements a custom zero-copy serialization with fallback
to pickle [#dask-serialization]_.
PyArrow implements zero-copy component-based serialization for a few
selected types [#pyarrow-serialization]_.
PEP 554 proposes hosting multiple interpreters in a single process, with
provisions for transferring buffers between interpreters as a communication
scheme [#pep-554]_.
Acknowledgements
================
Thanks to the following people for early feedback: Nick Coghlan, Olivier
Grisel, Stefan Krah, MinRK, Matt Rocklin, Eric Snow.
References
==========
.. [#dask] Dask.distributed -- A lightweight library for distributed computing
in Python
https://distributed.readthedocs.io/
.. [#dask-serialization] Dask.distributed custom serialization
https://distributed.readthedocs.io/en/latest/serialization.html
.. [#ipyparallel] IPyParallel -- Using IPython for parallel computing
https://ipyparallel.readthedocs.io/
.. [#pyarrow] PyArrow -- A cross-language development platform for in-memory data
https://arrow.apache.org/docs/python/
.. [#pyarrow-serialization] PyArrow IPC and component-based serialization
https://arrow.apache.org/docs/python/ipc.html#component-based-serialization
.. [#pep-3118] PEP 3118 -- Revising the buffer protocol
https://www.python.org/dev/peps/pep-3118/
.. [#pep-554] PEP 554 -- Multiple Interpreters in the Stdlib
https://www.python.org/dev/peps/pep-0554/
.. [#pickle5-git] ``pickle5`` branch on GitHub
https://github.com/pitrou/cpython/tree/pickle5
.. [#pickle5-pypi] ``pickle5`` project on PyPI
https://pypi.org/project/pickle5/
.. [#ogrisel-numpy] Draft use of pickle protocol 5 for Numpy array pickling
https://gist.github.com/ogrisel/a2b0e5ae4987a398caa7f9277cb3b90a
Copyright
=========
This document has been placed into the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

1122
pep-0575.rst Normal file

File diff suppressed because it is too large Load Diff

161
pep-0576.rst Normal file
View File

@ -0,0 +1,161 @@
PEP: 576
Title: Rationalize Built-in function classes
Author: Mark Shannon <mark@hotpy.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 10-May-2018
Python-Version: 3.8
Post-History: 17-May-2018
23-June-2018
08-July-2018
Abstract
========
Expose the "FastcallKeywords" convention used internally by CPython to third-party code, and make the ``inspect`` module use duck-typing.
In combination this will allow third-party C extensions and tools like Cython to create objects that use the same calling conventions as built-in and Python functions, thus gaining performance parity with built-in functions like ``len`` or ``print``.
A small improvement in the performance of existing code is expected.
Motivation
==========
Currently third-party module authors face a dilemma when implementing
functions in C. Either they can use one of the pre-existing built-in function
or method classes or implement their own custom class in C.
The first choice causes them to lose the ability to access the internals of the callable object.
The second choice is an additional maintenance burden and, more importantly,
has a significant negative impact on performance.
This PEP aims to allow authors of third-party C modules, and tools like Cython, to utilize the faster calling convention used internally by CPython for built-in functions and methods, and to do so without a loss of capabilities relative to a function implemented in Python.
Introspection
-------------
The inspect module will fully support duck-typing when introspecting callables.
The ``inspect.Signature.from_callable()`` function computes the signature of a callable. If an object has a ``__signature__``
property, then ``inspect.Signature.from_callable()`` simply returns that. To further support duck-typing, if a callable has a ``__text_signature__``
then the ``__signature__`` will be created from that.
This means that third-party built-in functions can implement ``__text_signature__`` if sufficient,
and the more expensive ``__signature__`` if necessary.
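A hypothetical sketch of that duck-typed scheme (this illustrates the
proposed behaviour, not current ``inspect`` semantics for arbitrary
classes)::

    import inspect

    class fast_callable:
        # The cheap option: under this PEP, inspect would build the
        # Signature from this string when no __signature__ is present.
        __text_signature__ = "($self, x, y=None)"

        def __call__(self, x, y=None):
            return x

    # Proposed: inspect.signature(fast_callable()) -> (x, y=None)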
Efficient calls to third-party callables
----------------------------------------
Currently the majority of calls are dispatched to ``function``\s and ``method_descriptor``\s in custom code, using the "FastcallKeywords" internal calling convention. This PEP proposes that this calling convention is implemented via a C function pointer. Third-party callables which implement this binary interface will have the potential to be called as fast as a built-in function.
Continued prohibition of callable classes as base classes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Currently any attempt to use ``function``, ``method`` or ``method_descriptor`` as a base class for a new class will fail with a ``TypeError``. This behaviour is desirable as it prevents errors when a subclass overrides the ``__call__`` method. If callables could be sub-classed then any call to a ``function`` or a ``method_descriptor`` would need an additional check that the ``__call__`` method had not been overridden. By exposing an additional call mechanism, the potential for errors becomes greater. As a consequence, any third-party class implementing the additional call interface will not be usable as a base class.
New classes and changes to existing classes
===========================================
Python visible changes
----------------------
#. A new built-in class, ``builtin_function``, will be added.
#. ``types.BuiltinFunctionType`` will refer to ``builtin_function`` not ``builtin_function_or_method``.
#. Instances of the ``builtin_function`` class will retain the ``__module__`` property of ``builtin_function_or_method`` and gain the ``func_module`` and ``func_globals`` properties. The ``func_module`` property allows access to the module to which the function belongs. Note that this is different from the ``__module__`` property, which merely returns the name of the module. The ``func_globals`` property is equivalent to ``func_module.__dict__`` and is provided to mimic the Python function property of the same name.
#. When binding a ``method_descriptor`` instance to an instance of its owning class, a ``bound_method`` will be created instead of a ``builtin_function_or_method``. This means that the ``method_descriptors`` now mimic the behaviour of Python functions more closely. In other words, ``[].append`` becomes a ``bound_method`` instead of a ``builtin_function_or_method``, as sketched below.
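A sketch of the proposed Python-visible behaviour (the "Under this PEP"
portion is hypothetical)::

    import builtins

    # Today:
    type(len).__name__        # 'builtin_function_or_method'
    type([].append).__name__  # 'builtin_function_or_method'

    # Under this PEP:
    #   type(len) would be builtin_function, with
    #       len.func_module is builtins          -> True
    #       len.func_globals is vars(builtins)   -> True
    #   and [].append would be a bound_method wrapping list.append.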
C API changes
-------------
#. A new function ``PyBuiltinFunction_New(PyMethodDef *ml, PyObject *module)`` is added to create built-in functions.
#. ``PyCFunction_NewEx()`` and ``PyCFunction_New()`` are deprecated and will return a ``PyBuiltinFunction`` if able, otherwise a ``builtin_function_or_method``.
Retaining backwards compatibility in the C API and ABI
======================================================
The proposed changes are fully backwards and forwards compatible at both the API and ABI level.
Internal C changes
------------------
Two new flags will be allowed for the ``typeobject.tp_flags`` field.
These are ``Py_TPFLAGS_EXTENDED_CALL`` and ``Py_TPFLAGS_FUNCTION_DESCRIPTOR``.
Py_TPFLAGS_EXTENDED_CALL
~~~~~~~~~~~~~~~~~~~~~~~~
For any built-in class that sets ``Py_TPFLAGS_EXTENDED_CALL``, the C struct corresponding to it must begin with the struct ``PyExtendedCallable``, which is defined as follows::
typedef PyObject *(*extended_call_ptr)(PyObject *callable, PyObject** args,
int positional_argcount, PyTupleObject* kwnames);
typedef struct {
PyObject_HEAD
extended_call_ptr ext_call;
} PyExtendedCallable;
Any class that sets the ``Py_TPFLAGS_EXTENDED_CALL`` flag cannot be used as a base class, and a ``TypeError`` will be raised if any Python code tries to use it as a base class.
Py_TPFLAGS_FUNCTION_DESCRIPTOR
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If this flag is set for a built-in class ``F``, then instances of that class are expected to behave the same as a Python function when used as a class attribute.
Specifically, this means that the value of ``c.m``, where ``C.m`` is an instance of the built-in class ``F`` (and ``c`` is an instance of ``C``), must be a bound method binding ``C.m`` and ``c``.
Without this flag, it would be impossible for custom callables to behave like Python functions *and* be as efficient as Python or built-in functions.
Changes to existing C structs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The ``function``, ``method_descriptor`` and ``method`` classes will have their corresponding structs changed to
start with the ``PyExtendedCallable`` struct.
Third-party built-in classes using the new extended call interface
------------------------------------------------------------------
To enable call performance on a par with Python functions and built-in functions, third-party callables should set the ``Py_TPFLAGS_EXTENDED_CALL`` bit of ``tp_flags`` and ensure that the corresponding C struct starts with the ``PyExtendedCallable``.
Any built-in class that has the ``Py_TPFLAGS_EXTENDED_CALL`` bit set must also implement the ``tp_call`` function and make sure its behaviour is consistent with the ``ext_call`` function.
Performance implications of these changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Adding a function pointer to each callable, rather than each class of callable, enables the choice of dispatching function (the code to shuffle arguments about and do error checking) to be made when the callable object is created rather than when it is called. This should reduce the number of instructions executed between the call-site in the interpreter and the execution of the callee.
Alternative Suggestions
=======================
`PEP 580 <https://www.python.org/dev/peps/pep-0580/>`_ is an alternative approach to solving the same problem as this PEP.
Reference implementation
========================
A draft implementation can be found at https://github.com/markshannon/cpython/tree/pep-576-minimal
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

826
pep-0577.rst Normal file
View File

@ -0,0 +1,826 @@
PEP: 577
Title: Augmented Assignment Expressions
Author: Nick Coghlan <ncoghlan@gmail.com>
Status: Withdrawn
Type: Standards Track
Content-Type: text/x-rst
Created: 14-May-2018
Python-Version: 3.8
Post-History: 22-May-2018
PEP Withdrawal
==============
While working on this PEP, I realised that it didn't really address what was
actually bothering me about PEP 572's proposed scoping rules for previously
unreferenced assignment targets, and also had some significant undesirable
consequences (most notably, allowing ``>>=`` and ``<<=`` as inline augmented
assignment operators that meant something entirely different from the ``>=``
and ``<=`` comparison operators).
I also realised that even without dedicated syntax of their own, PEP 572 allows
inline augmented assignments to be written using the ``operator`` module::
from operator import iadd
if (target := iadd(target, value)) < limit:
...
(The restriction to simple names as inline assignment targets means that the
target expression can always be repeated without side effects.)
Accordingly, I'm withdrawing this PEP without submitting it for pronouncement,
and will instead be writing a replacement PEP that focuses specifically on the
handling of assignment targets which haven't already been declared as local
variables in the current scope (for both regular block scopes, and for scoped
expressions).
Abstract
========
This is a proposal to allow augmented assignments such as ``x += 1`` to be
used as expressions, not just statements.
As part of this, ``NAME := EXPR`` is proposed as an inline assignment expression
that uses the new augmented assignment scoping rules, rather than implicitly
defining a new local variable name the way that existing name binding
statements do. The question of allowing expression level local variable
declarations at function scope is deliberately separated from the question of
allowing expression level name bindings, and deferred to a later PEP.
This PEP is a direct competitor to PEP 572 (although it borrows heavily from that
PEP's motivation, and even shares the proposed syntax for inline assignments).
See `Relationship with PEP 572`_ for more details on the connections between
the two PEPs.
To improve the usability of the new expressions, a semantic split is proposed
between the handling of augmented assignments in regular block scopes (modules,
classes, and functions), and the handling of augmented assignments in scoped
expressions (lambda expressions, generator expressions, and comprehensions),
such that all inline assignments default to targeting the nearest containing
block scope.
A new compile time ``TargetNameError`` is added as a subclass of ``SyntaxError``
to handle cases where it is deemed to be currently unclear which target is
expected to be rebound by an inline assignment, or else the target scope
for the inline assignment is considered invalid for another reason.
Syntax and semantics
====================
Augmented assignment expressions
--------------------------------
The language grammar would be adjusted to allow augmented assignments to
appear as expressions, where the result of the augmented assignment
expression is the same post-calculation reference as is being bound to the
given target.
For example::
>>> n = 0
>>> n += 5
5
>>> n -= 2
3
>>> n *= 3
9
>>> n
9
For mutable targets, this means the result is always just the original object::
>>> seq = []
>>> seq_id = id(seq)
>>> seq += range(3)
[0, 1, 2]
>>> seq_id == id(seq)
True
Augmented assignments to attributes and container subscripts will be permitted,
with the result being the post-calculation reference being bound to the target,
just as it is for simple name targets::
def increment(self, step=1):
return self._value += step
In these cases, ``__getitem__`` and ``__getattribute__`` will *not* be called
after the assignment has already taken place (they will only be called as
needed to evaluate the in-place operation).
Adding an inline assignment operator
------------------------------------
Given only the addition of augmented assignment expressions, it would be
possible to abuse a symbol like ``|=`` as a general purpose assignment
operator by defining a ``Target`` wrapper type that worked as follows::
>>> class Target:
... def __init__(self, value):
... self.value = value
... def __or__(self, other):
... return Target(other)
...
>>> x = Target(10)
>>> x.value
10
>>> x |= 42
<__main__.Target object at 0x7f608caa8048>
>>> x.value
42
This is similar to the way that storing a single reference in a list was long
used as a workaround for the lack of a ``nonlocal`` keyword, and can still be
used today (in combination with ``operator.setitem``) to work around the
lack of expression level assignments.
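That list-based workaround is still expressible today; a runnable sketch
using ``operator.setitem`` as the expression-level assignment::

    from operator import setitem

    # A one-element list acts as a mutable cell; setitem returns None,
    # so `or cell[0]` yields the freshly stored value.
    cell = [0]
    totals = [setitem(cell, 0, cell[0] + n) or cell[0] for n in (1, 2, 3)]
    print(cell[0], totals)   # 6 [1, 3, 6]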
Rather than requiring such workarounds, this PEP instead proposes that
PEP 572's "NAME := EXPR" syntax be adopted as a new inline assignment
expression that uses the augmented assignment scoping rules described below.
This cleanly handles cases where only the new value is of interest, and the
previously bound value (if any) can just be discarded completely.
Note that for both simple names and complex assignment targets, the inline
assignment operator does *not* read the previous reference before assigning
the new one. However, when used at function scope (either directly or inside
a scoped expression), it does *not* implicitly define a new local variable,
and will instead raise ``TargetNameError`` (as described for augmented
assignments below).
Assignment operator precedence
------------------------------
To preserve the existing semantics of augmented assignment statements,
inline assignment operators will be defined as being of lower precedence
than all other operators, including the comma pseudo-operator. This ensures
that when used as a top level expression the entire right hand side of the
expression is still interpreted as the value to be processed (even when that
value is a tuple without parentheses).
The difference this introduces relative to PEP 572 is that where
``(n := first, second)`` sets ``n = first`` in PEP 572, in this PEP it would set
``n = (first, second)``, and getting the first meaning would require an extra
set of parentheses (``((n := first), second)``).
PEP 572 quite reasonably notes that this results in ambiguity when assignment
expressions are used as function call arguments. This PEP resolves that concern
a different way by requiring that assignment expressions be parenthesised
when used as arguments to a function call (unless they're the sole argument).
This is a more relaxed version of the restriction placed on generator
expressions (which always require parentheses, except when they're the sole
argument to a function call).
Augmented assignment to names in block scopes
---------------------------------------------
No target name binding changes are proposed for augmented assignments at module
or class scope (this also includes code executed using "exec" or "eval"). These
will continue to implicitly declare a new local variable as the binding target
as they do today, and (if necessary) will be able to resolve the name from an
outer scope before binding it locally.
At function scope, augmented assignments will be changed to require that there
be either a preceding name binding or variable declaration to explicitly
establish the target name as being local to the function, or else an explicit
``global`` or ``nonlocal`` declaration. ``TargetNameError``, a new
``SyntaxError`` subclass, will be raised at compile time if no such binding or
declaration is present.
For example, the following code would compile and run as it does today::
x = 0
x += 1 # Sets global "x" to 1
class C:
x += 1 # Sets local "x" to 2, leaves global "x" alone
def local_target():
x = 0
x += 1 # Sets local "x" to 1, leaves global "x" alone
def global_target():
global x
x += 1 # Increments global "x" each time this runs
def nonlocal_target():
x = 0
def g():
nonlocal x
x += 1 # Increments "x" in outer scope each time this runs
return x
return g
The following examples would all still compile and then raise an error at runtime
as they do today::
n += 1 # Raises NameError at runtime

class C:
    n += 1 # Raises NameError at runtime

def missing_global():
    global n
    n += 1 # Raises NameError at runtime

def delayed_nonlocal_initialisation():
    def f():
        nonlocal n
        n += 1
    f() # Raises NameError at runtime
    n = 0

def skipped_conditional_initialisation():
    if False:
        n = 0
    n += 1 # Raises UnboundLocalError at runtime

def local_declaration_without_initial_assignment():
    n: typing.Any
    n += 1 # Raises UnboundLocalError at runtime
Whereas the following would raise a compile time ``DeprecationWarning``
initially, and eventually change to report a compile time ``TargetNameError``::
def missing_target():
    x += 1 # Compile time TargetNameError due to ambiguous target scope
           # Is there a missing initialisation of "x" here? Or a missing
           # global or nonlocal declaration?
As a conservative implementation approach, the compile time function name
resolution change would be introduced as a ``DeprecationWarning`` in Python
3.8, and then converted to ``TargetNameError`` in Python 3.9. This avoids
potential problems in cases where an unused function would currently raise
``UnboundLocalError`` if it was ever actually called, but the code is actually
unused - converting that latent runtime defect to a compile time error qualifies
as a backwards incompatible change that requires a deprecation period.
When augmented assignments are used as expressions in function scope (rather
than as standalone statements), there aren't any backwards compatibility
concerns, so the compile time name binding checks would be enforced immediately
in Python 3.8.
Similarly, the new inline assignment expressions would always require explicit
predeclaration of their target scope when used as part of a function, at least
for Python 3.8. (See the design discussion section for notes on potentially
revisiting that restriction in the future).
Augmented assignment to names in scoped expressions
---------------------------------------------------
Scoped expressions is a new collective term being proposed for expressions that
introduce a new nested scope of execution, either as an intrinsic part of their
operation (lambda expressions, generator expressions), or else as a way of
hiding name binding operations from the containing scope (container
comprehensions).
Unlike regular functions, these scoped expressions can't include explicit
``global`` or ``nonlocal`` declarations to rebind names directly in an outer
scope.
Instead, their name binding semantics for augmented assignment expressions would
be defined as follows:
* augmented assignment targets used in scoped expressions are expected to either
be already bound in the containing block scope, or else have their scope
explicitly declared in the containing block scope. If no suitable name
binding or declaration can be found in that scope, then ``TargetNameError``
will be raised at compile time (rather than creating a new binding within
the scoped expression).
* if the containing block scope is a function scope, and the target name is
explicitly declared as ``global`` or ``nonlocal``, then it will use the
same scope declaration in the body of the scoped expression
* if the containing block scope is a function scope, and the target name is
a local variable in that function, then it will be implicitly declared as
``nonlocal`` in the body of the scoped expression
* if the containing block scope is a class scope, then ``TargetNameError`` will
always be raised, with a dedicated message indicating that combining class
scopes with augmented assignments in scoped expressions is not currently
permitted.
* if a name is declared as a formal parameter (lambda expressions), or as an
iteration variable (generator expressions, comprehensions), then that name
is considered local to that scoped expression, and attempting to use it as
the target of an augmented assignment operation in that scope, or any nested
scoped expression, will raise ``TargetNameError`` (this is a restriction that
could potentially be lifted later, but is being proposed for now to simplify
the initial set of compile time and runtime semantics that needs to be
covered in the language reference and handled by the compiler and interpreter)
For example, the following code would work as shown::
>>> global_target = 0
>>> incr_global_target = lambda: global_target += 1
>>> incr_global_target()
1
>>> incr_global_target()
2
>>> global_target
2
>>> def cumulative_sums(data, start=0):
...     total = start
...     yield from (total += value for value in data)
...     return total
...
>>> print(list(cumulative_sums(range(5))))
[0, 1, 3, 6, 10]
While the following examples would all raise ``TargetNameError``::
class C:
    cls_target = 0
    incr_cls_target = lambda: cls_target += 1 # Error due to class scope

def missing_target():
    incr_x = lambda: x += 1 # Error due to missing target "x"

def late_target():
    incr_x = lambda: x += 1 # Error due to "x" being declared after use
    x = 1

lambda arg: arg += 1 # Error due to attempt to target formal parameter

[x += 1 for x in data] # Error due to attempt to target iteration variable
As augmented assignments currently can't appear inside scoped expressions, the
above compile time name resolution exceptions would be included as part of the
initial implementation rather than needing to be phased in as a potentially
backwards incompatible change.
Design discussion
=================
Allowing complex assignment targets
-----------------------------------
The initial drafts of this PEP kept PEP 572's restriction to single name targets
when augmented assignments were used as expressions, allowing attribute and
subscript targets solely for the statement form.
However, enforcing that required varying the permitted targets based on whether
or not the augmented assignment was a top level expression or not, as well as
explaining why ``n += 1``, ``(n += 1)``, and ``self.n += 1`` were all legal,
but ``(self.n += 1)`` was prohibited, so the proposal was simplified to allow
all existing augmented assignment targets for the expression form as well.
Since this PEP defines ``TARGET := EXPR`` as a variant on augmented assignment,
that also gained support for attribute and subscript targets.
Augmented assignment or name binding only?
------------------------------------------
PEP 572 makes a reasonable case that the potential use cases for inline
augmented assignment are notably weaker than those for inline assignment in
general, so it's acceptable to require that they be spelled as ``x := x + 1``,
bypassing any in-place augmented assignment methods.
While this is at least arguably true for the builtin types (where potential
counterexamples would probably need to focus on set manipulation use cases
that the PEP author doesn't personally have), it would also rule out more
memory intensive use cases like manipulation of NumPy arrays, where the data
copying involved in out-of-place operations can make them impractical as
alternatives to their in-place counterparts.
That said, this PEP mainly exists because the PEP author found the inline
assignment proposal much easier to grasp as "It's like ``+=``, only skipping
the addition step", and also liked the way that that framing provides an
actual semantic difference between ``NAME = EXPR`` and ``NAME := EXPR`` at
function scope.
That difference in target scoping behaviour means that the ``NAME := EXPR``
syntax would be expected to have two primary use cases:
* as a way of allowing assignments to be embedded as an expression in an ``if``
or ``while`` statement, or as part of a scoped expression
* as a way of requesting a compile time check that the target name be previously
declared or bound in the current function scope
At module or class scope, ``NAME = EXPR`` and ``NAME := EXPR`` would be
semantically equivalent due to the compiler's lack of visibility into the set
of names that will be resolvable at runtime, but code linters and static
type checkers would be encouraged to enforce the same "declaration or assignment
required before use" behaviour for ``NAME := EXPR`` as the compiler would
enforce at function scope.
Postponing a decision on expression level target declarations
-------------------------------------------------------------
At least for Python 3.8, usage of inline assignments (whether augmented or not)
at function scope would always require a preceding name binding or scope
declaration to avoid getting ``TargetNameError``, even when used outside a
scoped expression.
The intent behind this requirement is to clearly separate the following two
language design questions:
1. Can an expression rebind a name in the current scope?
2. Can an expression declare a new name in the current scope?
For module global scopes, the answer to both of those questions is unequivocally
"Yes", because it's a language level guarantee that mutating the ``globals()``
dict will immediately impact the runtime module scope, and ``global NAME``
declarations inside a function can have the same effect (as can importing the
currently executing module and modifying its attributes).
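For example, the following already works at module scope today::

def define_constant():
    globals()["ANSWER"] = 42 # Mutating globals() affects the module scope

define_constant()
print(ANSWER) # Prints "42": the new module global is immediately resolvable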
For class scopes, the answer to both questions is also "Yes" in practice,
although less unequivocally so, since the semantics of ``locals()`` are
currently formally unspecified. However, if the current behaviour of ``locals()``
at class scope is taken as normative (as PEP 558 proposes), then this is
essentially the same scenario as manipulating the module globals, just using
``locals()`` instead.
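For example, the following works in current CPython (relying on the currently
unspecified behaviour of ``locals()`` at class scope)::

class Example:
    locals()["attr"] = 10 # Mutates the class namespace under construction

print(Example.attr) # Prints "10"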
For function scopes, however, the current answers to these two questions are
respectively "Yes" and "No". Expression level rebinding of function locals is
already possible thanks to lexically nested scopes and explicit ``nonlocal NAME``
expressions. While this PEP will likely make expression level rebinding more
common than it is today, it isn't a fundamentally new concept for the language.
By contrast, declaring a *new* function local variable is currently a statement
level action, involving one of:
* an assignment statement (``NAME = EXPR``, ``OTHER_TARGET = NAME = EXPR``, etc)
* a variable declaration (``NAME : EXPR``)
* a nested function definition
* a nested class definition
* a ``for`` loop
* a ``with`` statement
* an ``except`` clause (with limited scope of access)
The historical trend for the language has actually been to *remove* support for
expression level declarations of function local names, first with the
introduction of "fast locals" semantics (which made the introduction of names
via ``locals()`` unsupported for function scopes), and again with the hiding
of comprehension iteration variables in Python 3.0.
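For example, under the "fast locals" semantics (and assuming no global ``x``
is defined elsewhere)::

def fast_locals_demo():
    locals()["x"] = 1 # Mutates a snapshot, not the real function locals
    return x # Raises NameError: "x" was never bound in this scope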
Now, it may be that in Python 3.9, we decide to revisit this question based on
our experience with expression level name binding in Python 3.8, and decide that
we really do want expression level function local variable declarations as well,
and that we want ``NAME := EXPR`` to be the way we spell that (rather than,
for example, spelling inline declarations more explicitly as
``NAME := EXPR given NAME``, which would permit them to carry type annotations,
and also permit them to declare new local variables in scoped expressions,
rather than having to pollute the namespace in their containing scope).
But the proposal in this PEP is that we explicitly give ourselves a full
release to decide how much we want that feature, and exactly where we find
its absence irritating. Python has survived happily without expression level
name bindings *or* declarations for decades, so we can afford to give ourselves
a couple of years to decide if we really want *both* of those, or if expression
level bindings are sufficient.
Ignoring scoped expressions when determining augmented assignment targets
-------------------------------------------------------------------------
When discussing possible binding semantics for PEP 572's assignment expressions,
Tim Peters made a plausible case [1]_ [2]_ [3]_ for assignment expressions targeting
the containing block scope, essentially ignoring any intervening scoped
expressions.
This approach allows use cases like cumulative sums, or extracting the final
value from a generator expression to be written in a relatively straightforward
way::
total = 0
partial_sums = [total := total + value for value in data]

factor = 1
while any(n % (factor := p) == 0 for p in small_primes):
    n //= factor
Guido also expressed his approval for this general approach [4]_.
The proposal in this PEP differs from Tim's original proposal in three main
areas:
- it applies the proposal to all augmented assignment operators, not just a
single new name binding operator
- as far as is practical, it extends the augmented assignment requirement that
the name already be defined to the new name binding operator (raising
``TargetNameError`` rather than implicitly declaring new local variables at
function scope)
- it includes lambda expressions in the set of scopes that get ignored for
target name binding purposes, making this transparency to assignments common
to all of the scoped expressions rather than being specific to comprehensions
and generator expressions
With scoped expressions being ignored when calculating binding targets, it's
once again difficult to detect the scoping difference between the outermost
iterable expressions in generator expressions and comprehensions (you have to
mess about with either class scopes or attempting to rebind iteration variables
to detect it), so there's also no need to tinker with that.
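For reference, the existing difference is only visible in corner cases such as
class scopes, where the outermost iterable is evaluated in the containing scope
but the rest of the expression is not::

class Data:
    values = [1, 2, 3]
    doubled = [2 * v for v in values] # Works: "values" is the outermost iterable
    indexed = [values[i] for i in range(3)] # Raises NameError: the comprehension
                                            # body can't see the class scope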
Treating inline assignment as an augmented assignment variant
-------------------------------------------------------------
One of the challenges with PEP 572 is the fact that ``NAME = EXPR`` and
``NAME := EXPR`` are entirely semantically equivalent at every scope. This
makes the two forms hard to teach, since there's no inherent nudge towards
choosing one over the other at the statement level, so you end up having to
resort to "``NAME = EXPR`` is preferred because it's been around longer"
(and PEP 572 proposes to enforce that historical idiosyncrasy at the compiler
level).
That semantic equivalence is difficult to avoid at module and class scope while
still having ``if NAME := EXPR:`` and ``while NAME := EXPR:`` work sensibly, but
at function scope the compiler's comprehensive view of all local names makes
it possible to require that the name be assigned or declared before use,
providing a reasonable incentive to continue to default to using the
``NAME = EXPR`` form when possible, while also enabling the use of the
``NAME := EXPR`` as a kind of simple compile time assertion (i.e. explicitly
indicating that the targeted name has already been bound or declared and hence
should already be known to the compiler).
If Guido were to declare that support for inline declarations was a hard
design requirement, then this PEP would be updated to propose that
``EXPR given NAME`` also be introduced as a way to support inline name declarations
after arbitrary expressions (this would allow the inline name declarations to be
deferred until the end of a complex expression rather than needing to be
embedded in the middle of it, and PEP 8 would gain a recommendation encouraging
that style).
Disallowing augmented assignments in class level scoped expressions
-------------------------------------------------------------------
While modern classes do define an implicit closure that's visible to method
implementations (in order to make ``__class__`` available for use in zero-arg
``super()`` calls), there's no way for user level code to explicitly add
additional names to that scope.
Meanwhile, attributes defined in a class body are ignored for the purpose of
defining a method's lexical closure, which means adding them there wouldn't
work at an implementation level.
Rather than trying to resolve that inherent ambiguity, this PEP simply
prohibits such usage, and requires that any affected logic be written somewhere
other than directly inline in the class body (e.g. in a separate helper
function).
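For example, a sketch of that workaround using this PEP's proposed syntax (the
helper name is purely illustrative)::

def _tally(data):
    total = 0
    return [total += value for value in data], total

class C:
    partial_sums, total = _tally(range(5))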
Comparison operators vs assignment operators
--------------------------------------------
The ``OP=`` construct as an expression currently indicates a comparison
operation::
x == y # Equals
x >= y # Greater-than-or-equal-to
x <= y # Less-than-or-equal-to
Both this PEP and PEP 572 propose adding at least one operator that's somewhat
similar in appearance, but defines an assignment instead::
x := y # Becomes
This PEP then goes much further and allows all *13* augmented assignment symbols
to be used as binary operators::
x += y # In-place add
x -= y # In-place minus
x *= y # In-place multiply
x @= y # In-place matrix multiply
x /= y # In-place division
x //= y # In-place int division
x %= y # In-place mod
x &= y # In-place bitwise and
x |= y # In-place bitwise or
x ^= y # In-place bitwise xor
x <<= y # In-place left shift
x >>= y # In-place right shift
x **= y # In-place power
Of those additional binary operators, the most questionable would be the
bitshift assignment operators, since they're each only one doubled character
away from one of the inclusive ordered comparison operators.
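For example, each of the following pairs would differ by only a single
character, with the first expression of each pair being a comparison and the
second an assignment::

if x >= y: ...  # Greater-than-or-equal-to comparison
if x >>= y: ... # Right shift "x" by "y" in place, then test the result

if x <= y: ...  # Less-than-or-equal-to comparison
if x <<= y: ... # Left shift "x" by "y" in place, then test the result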
Examples
========
Simplifying retry loops
-----------------------
There are currently a few different options for writing retry loops, including::
# Post-decrementing a counter
remaining_attempts = MAX_ATTEMPTS
while remaining_attempts:
    remaining_attempts -= 1
    try:
        result = attempt_operation()
    except Exception as exc:
        last_exc = exc # "exc" is unbound once the except clause exits
        continue # Failed, so try again
    log.debug(f"Succeeded after {MAX_ATTEMPTS - remaining_attempts} attempts")
    break # Success!
else:
    raise OperationFailed(f"Failed after {MAX_ATTEMPTS} attempts") from last_exc

# Loop-and-a-half with a pre-incremented counter
attempts = 0
while True:
    attempts += 1
    if attempts > MAX_ATTEMPTS:
        raise OperationFailed(f"Failed after {MAX_ATTEMPTS} attempts") from last_exc
    try:
        result = attempt_operation()
    except Exception as exc:
        last_exc = exc # Preserve the exception for the failure path
        continue # Failed, so try again
    log.debug(f"Succeeded after {attempts} attempts")
    break # Success!
Each of the available options hides some aspect of the intended loop structure
inside the loop body, whether that's the state modification, the exit condition,
or both.
The proposal in this PEP allows both the state modification and the exit
condition to be included directly in the loop header::
attempts = 0
while (attempts += 1) <= MAX_ATTEMPTS:
    try:
        result = attempt_operation()
    except Exception as exc:
        last_exc = exc # Preserve the exception for the else clause
        continue # Failed, so try again
    log.debug(f"Succeeded after {attempts} attempts")
    break # Success!
else:
    raise OperationFailed(f"Failed after {MAX_ATTEMPTS} attempts") from last_exc
Simplifying if-elif chains
--------------------------
if-elif chains that need to rebind the checked condition currently need to
be written using nested if-else statements::
m = pattern.match(data)
if m:
    ...
else:
    m = other_pattern.match(data)
    if m:
        ...
    else:
        m = yet_another_pattern.match(data)
        if m:
            ...
        else:
            ...
As with PEP 572, this PEP allows the else/if portions of that chain to be
condensed, making their consistent and mutually exclusive structure more
readily apparent::
m = pattern.match(data)
if m:
    ...
elif m := other_pattern.match(data):
    ...
elif m := yet_another_pattern.match(data):
    ...
else:
    ...
Unlike PEP 572, this PEP requires that the assignment target be explicitly
indicated as local before the first use as a ``:=`` target, either by
binding it to a value (as shown above), or else by including an appropriate
explicit type declaration::
m: typing.re.Match
if m := pattern.match(data):
    ...
elif m := other_pattern.match(data):
    ...
elif m := yet_another_pattern.match(data):
    ...
else:
    ...
Capturing intermediate values from comprehensions
-------------------------------------------------
The proposal in this PEP makes it straightforward to capture and reuse
intermediate values in comprehensions and generator expressions by
exporting them to the containing block scope::
factor: int
while any(n % (factor := p) == 0 for p in small_primes):
    n //= factor

total = 0
partial_sums = [total += value for value in data]
Allowing lambda expressions to act more like re-usable code thunks
------------------------------------------------------------------
This PEP allows the classic closure usage example::
def make_counter(start=0):
    x = start
    def counter(step=1):
        nonlocal x
        x += step
        return x
    return counter
To be abbreviated as::
def make_counter(start=0):
    x = start
    return lambda step=1: x += step
While the latter form is still a conceptually dense piece of code, it can be
reasonably argued that the lack of boilerplate (where the "def", "nonlocal",
and "return" keywords and two additional repetitions of the "x" variable name
have been replaced with the "lambda" keyword) may make it easier to read in
practice.
Relationship with PEP 572
=========================
The case for allowing inline assignments at all is made in PEP 572. This
competing PEP was initially going to propose an alternate surface syntax
(``EXPR given NAME = EXPR``), while retaining the expression semantics from
PEP 572, but that changed when discussing one of the initial motivating use
cases for allowing embedded assignments at all: making it possible to easily
calculate cumulative sums in comprehensions and generator expressions.
As a result of that, and unlike PEP 572, this PEP focuses primarily on use
cases for inline augmented assignment. It also has the effect of converting
cases that currently inevitably raise ``UnboundLocalError`` at function call
time to report a new compile time ``TargetNameError``.
New syntax for a name rebinding expression (``NAME := EXPR``) is then added
not only to handle the same use cases as are identified in PEP 572, but also
as a lower level primitive to help illustrate, implement and explain
the new augmented assignment semantics, rather than being the sole change being
proposed.
The author of this PEP believes that this approach makes the value of the new
flexibility in name rebinding clearer, while also mitigating many of the
potential concerns raised with PEP 572 around explaining when to use
``NAME = EXPR`` over ``NAME := EXPR`` (and vice-versa), without resorting to
prohibiting the bare statement form of ``NAME := EXPR`` outright (such
that ``NAME := EXPR`` is a compile error, but ``(NAME := EXPR)`` is permitted).
Acknowledgements
================
The PEP author wishes to thank Chris Angelico for his work on PEP 572, and his
efforts to create a coherent summary of the great many sprawling discussions
that spawned on both python-ideas and python-dev, as well as Tim Peters for
the in-depth discussion of parent local scoping that prompted the above
scoping proposal for augmented assignments inside scoped expressions.
Eric Snow's feedback on a pre-release version of this PEP helped make it
significantly more readable.
References
==========
.. [1] The beginning of Tim's genexp & comprehension scoping thread
(https://mail.python.org/pipermail/python-ideas/2018-May/050367.html)
.. [2] Reintroducing the original cumulative sums use case
(https://mail.python.org/pipermail/python-ideas/2018-May/050544.html)
.. [3] Tim's language reference level explanation of his proposed scoping semantics
(https://mail.python.org/pipermail/python-ideas/2018-May/050729.html)
.. [4] Guido's endorsement of Tim's proposed genexp & comprehension scoping
(https://mail.python.org/pipermail/python-ideas/2018-May/050411.html)
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

492
pep-0578.rst Normal file
View File

@ -0,0 +1,492 @@
PEP: 578
Title: Python Runtime Audit Hooks
Version: $Revision$
Last-Modified: $Date$
Author: Steve Dower <steve.dower@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 16-Jun-2018
Python-Version: 3.8
Post-History:
Abstract
========
This PEP describes additions to the Python API and specific behaviors
for the CPython implementation that make actions taken by the Python
runtime visible to auditing tools. Visibility into these actions
provides opportunities for test frameworks, logging frameworks, and
security tools to monitor and optionally limit actions taken by the
runtime.
This PEP proposes adding two APIs to provide insights into a running
Python application: one for arbitrary events, and another specific to
the module import system. The APIs are intended to be available in all
Python implementations, though the specific messages and values used
are unspecified here to allow implementations the freedom to determine
how best to provide information to their users. Some examples likely
to be used in CPython are provided for explanatory purposes.
See PEP-551 for discussion and recommendations on enhancing the
security of a Python runtime making use of these auditing APIs.
Background
==========
Python provides access to a wide range of low-level functionality on
many common operating systems in a consistent manner. While this is
incredibly useful for "write-once, run-anywhere" scripting, it also
makes monitoring of software written in Python difficult. Because
Python uses native system APIs directly, existing monitoring
tools either suffer from limited context or auditing bypass.
Limited context occurs when system monitoring can report that an
action occurred, but cannot explain the sequence of events leading to
it. For example, network monitoring at the OS level may be able to
report "listening started on port 5678", but may not be able to
provide the process ID, command line or parent process, or the local
state in the program at the point that triggered the action. Firewall
controls to prevent such an action are similarly limited, typically
to a process name or some global state such as the current user, and
in any case rarely provide a useful log file correlated with other
application messages.
Auditing bypass can occur when the typical system tool used for an
action would ordinarily report its use, but accessing the APIs via
Python does not trigger this. For example, invoking "curl" to make HTTP
requests may be specifically monitored in an audited system, but
Python's "urlretrieve" function is not.
Within a long-running Python application, particularly one that
processes user-provided information such as a web app, there is a risk
of unexpected behavior. This may be due to bugs in the code, or
deliberately induced by a malicious user. In both cases, normal
application logging may be bypassed resulting in no indication that
anything out of the ordinary has occurred.
Additionally, and somewhat uniquely to Python, it is very easy to affect
the code that is run in an application by either manipulating the
import system's search path or placing files earlier on the path than
intended. This is often seen when developers create a script with the
same name as the module they intend to use - for example, a
``random.py`` file that attempts to import the standard library
``random`` module.
Overview of Changes
===================
The aim of these changes is to enable both application developers and
system administrators to integrate Python into their existing
monitoring systems without dictating how those systems look or behave.
We propose two API changes to enable this: an Audit Hook and Verified
Open Hook. Both are available from Python and native code, allowing
applications and frameworks written in pure Python code to take
advantage of the extra messages, while also allowing embedders or
system administrators to deploy "always-on" builds of Python.
Only CPython is bound to provide the native APIs as described here.
Other implementations should provide the pure Python APIs, and
may provide native versions as appropriate for their underlying
runtimes.
Audit Hook
----------
In order to observe actions taken by the runtime (on behalf of the
caller), an API is required to raise messages from within certain
operations. These operations are typically deep within the Python
runtime or standard library, such as dynamic code compilation, module
imports, DNS resolution, or use of certain modules such as ``ctypes``.
The following new C APIs allow embedders and CPython implementors to
send and receive audit hook messages::
# Add an auditing hook
typedef int (*hook_func)(const char *event, PyObject *args,
                         void *userData);
int PySys_AddAuditHook(hook_func hook, void *userData);

# Raise an event with all auditing hooks
int PySys_Audit(const char *event, PyObject *args);

# Internal API used during Py_Finalize() - not publicly accessible
void _Py_ClearAuditHooks(void);
The new Python APIs for receiving and raising audit hooks are::
# Add an auditing hook
sys.addaudithook(hook: Callable[[str, tuple], Any])

# Raise an event with all auditing hooks
sys.audit(str, *args)
Hooks are added by calling ``PySys_AddAuditHook()`` from C at any time,
including before ``Py_Initialize()``, or by calling
``sys.addaudithook()`` from Python code. Hooks cannot be removed or
replaced.
When events of interest are occurring, code can either call
``PySys_Audit()`` from C (while the GIL is held) or ``sys.audit()``. The
string argument is the name of the event, and the tuple contains
arguments. A given event name should have a fixed schema for arguments,
which should be considered a public API (for a given x.y version
release), and thus should only change between feature releases with
updated documentation.
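As a sketch of how the proposed Python level API could be used (the event name
and arguments here are invented for illustration)::

import sys

def my_hook(event, args):
    if event == "myapp.fetch":
        print(f"fetch requested: {args}")

sys.addaudithook(my_hook)
sys.audit("myapp.fetch", "https://example.com/data") # Hooks receive the args as a tuple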
For maximum compatibility, events using the same name as an event in
the reference interpreter CPython should make every attempt to use
compatible arguments. Including the name or an abbreviation of the
implementation in implementation-specific event names will also help
prevent collisions. For example, a ``pypy.jit_invoked`` event is clearly
distinguished from an ``ipy.jit_invoked`` event.
When an event is audited, each hook is called in the order it was added
with the event name and tuple. If any hook returns with an exception
set, later hooks are ignored and *in general* the Python runtime should
terminate. This is intentional to allow hook implementations to decide
how to respond to any particular event. The typical responses will be to
log the event, abort the operation with an exception, or to immediately
terminate the process with an operating system exit call.
When an event is audited but no hooks have been set, the ``audit()``
function should include minimal overhead. Ideally, each argument is a
reference to existing data rather than a value calculated just for the
auditing call.
As hooks may be Python objects, they need to be freed during
``Py_Finalize()``. To do this, we add an internal API
``_Py_ClearAuditHooks()`` that releases any Python hooks and any
memory held. This is an internal function with no public export, and
we recommend it should raise its own audit event for all current hooks
to ensure that unexpected calls are observed.
Below in `Suggested Audit Hook Locations`_, we recommend some important
operations that should raise audit events. In PEP 551, more audited
operations are recommended with a view to security transparency.
Python implementations should document which operations will raise
audit events, along with the event schema. It is intended that
``sys.addaudithook(print)`` be a trivial way to display all messages.
Verified Open Hook
------------------
Most operating systems have a mechanism to distinguish between files
that can be executed and those that can not. For example, this may be an
execute bit in the permissions field, or a verified hash of the file
contents to detect potential code tampering. These are an important
security mechanism for preventing execution of data or code that is not
approved for a given environment. Currently, Python has no way to
integrate with these when launching scripts or importing modules.
The new public C API for the verified open hook is::
# Set the handler
typedef PyObject *(*hook_func)(PyObject *path, void *userData)
int PyImport_SetOpenForImportHook(hook_func handler, void *userData)
# Open a file using the handler
PyObject *PyImport_OpenForImport(const char *path)
The new public Python API for the verified open hook is::
# Open a file using the handler
importlib.util.open_for_import(path : str) -> io.IOBase
The ``importlib.util.open_for_import()`` function is a drop-in
replacement for ``open(str(pathlike), 'rb')``. Its default behaviour is
to open a file for raw, binary access. To change the behaviour a new
handler should be set. Handler functions only accept ``str`` arguments.
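For example, code reading a file destined for execution could use the proposed
Python level API like this (the file name is a placeholder)::

import importlib.util

# Equivalent to open("example_module.py", "rb") unless a handler is installed
with importlib.util.open_for_import("example_module.py") as f:
    source_bytes = f.read()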
A custom handler may be set by calling ``PyImport_SetOpenForImportHook()``
from C at any time, including before ``Py_Initialize()``. However, if a
hook has already been set then the call will fail. When
``open_for_import()`` is called with a hook set, the hook will be passed
the path and its return value will be returned directly. The returned
object should be an open file-like object that supports reading raw
bytes. This is explicitly intended to allow a ``BytesIO`` instance if
the open handler has already had to read the file into memory in order
to perform whatever verification is necessary to determine whether the
content is permitted to be executed.
Note that these hooks can import and call the ``_io.open()`` function on
CPython without triggering themselves. They can also use ``_io.BytesIO``
to return a compatible result using an in-memory buffer.
If the hook determines that the file should not be loaded, it should
raise an exception of its choice, as well as performing any other
logging.
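As a sketch, a handler performing hash based verification could look like the
following if expressed in Python (the allow-list is hypothetical, and an actual
handler would be installed from C via ``PyImport_SetOpenForImportHook()``)::

import _io, hashlib

APPROVED_HASHES = set() # Hypothetical allow-list of permitted content hashes

def verifying_handler(path):
    with _io.open(path, "rb") as f:
        data = f.read() # Read everything up front so it can be verified
    if hashlib.sha256(data).hexdigest() not in APPROVED_HASHES:
        raise ImportError(f"unapproved content in {path!r}")
    return _io.BytesIO(data) # Already in memory, so return a buffer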
All import and execution functionality involving code from a file will
be changed to use ``open_for_import()`` unconditionally. It is important
to note that calls to ``compile()``, ``exec()`` and ``eval()`` do not go
through this function - an audit hook that includes the code from these
calls is the best opportunity to validate code that is read from the
file. Given the current decoupling between import and execution in
Python, most imported code will go through both ``open_for_import()``
and the log hook for ``compile``, and so care should be taken to avoid
repeating verification steps.
There is no Python API provided for changing the open hook. To modify
import behavior from Python code, use the existing functionality
provided by ``importlib``.
API Availability
----------------
While all the functions added here are considered public and stable API,
the behavior of the functions is implementation specific. Most
descriptions here refer to the CPython implementation, and while other
implementations should provide the functions, there is no requirement
that they behave the same.
For example, ``sys.addaudithook()`` and ``sys.audit()`` should exist but
may do nothing. This allows code to make calls to ``sys.audit()``
without having to test for existence, but it should not assume that its
call will have any effect. (Including existence tests in
security-critical code allows another vector to bypass auditing, so it
is preferable that the function always exist.)
``importlib.util.open_for_import(path)`` should at a minimum always
return ``_io.open(path, 'rb')``. Code using the function should make no
further assumptions about what may occur, and implementations other than
CPython are not required to let developers override the behavior of this
function with a hook.
Suggested Audit Hook Locations
==============================
The locations and parameters in calls to ``sys.audit()`` or
``PySys_Audit()`` are to be determined by individual Python
implementations. This is to allow maximum freedom for implementations
to expose the operations that are most relevant to their platform,
and to avoid or ignore potentially expensive or noisy events.
Table 1 acts as both suggestions of operations that should trigger
audit events on all implementations, and examples of event schemas.
Table 2 provides further examples that are not required, but are
likely to be available in CPython.
Refer to the documentation associated with your version of Python to
see which operations provide audit events.
.. csv-table:: Table 1: Suggested Audit Hooks
:header: "API Function", "Event Name", "Arguments", "Rationale"
:widths: 2, 2, 3, 6
``PySys_AddAuditHook``, ``sys.addaudithook``, "", "Detect when new
audit hooks are being added.
"
``PyImport_SetOpenForImportHook``, ``setopenforimporthook``, "", "
Detects any attempt to set the ``open_for_import`` hook.
"
"``compile``, ``exec``, ``eval``, ``PyAst_CompileString``,
``PyAST_obj2mod``", ``compile``, "``(code, filename_or_none)``", "
Detect dynamic code compilation, where ``code`` could be a string or
AST. Note that this will be called for regular imports of source
code, including those that were opened with ``open_for_import``.
"
"``exec``, ``eval``, ``run_mod``", ``exec``, "``(code_object,)``", "
Detect dynamic execution of code objects. This only occurs for
explicit calls, and is not raised for normal function invocation.
"
``import``, ``import``, "``(module, filename, sys.path,
sys.meta_path, sys.path_hooks)``", "Detect when modules are
imported. This is raised before the module name is resolved to a
file. All arguments other than the module name may be ``None`` if
they are not used or available.
"
``PyEval_SetProfile``, ``sys.setprofile``, "", "Detect when code is
injecting trace functions. Because of the implementation, exceptions
raised from the hook will abort the operation, but will not be
raised in Python code. Note that ``threading.setprofile`` eventually
calls this function, so the event will be audited for each thread.
"
``PyEval_SetTrace``, ``sys.settrace``, "", "Detect when code is
injecting trace functions. Because of the implementation, exceptions
raised from the hook will abort the operation, but will not be
raised in Python code. Note that ``threading.settrace`` eventually
calls this function, so the event will be audited for each thread.
"
"``_PyObject_GenericSetAttr``, ``check_set_special_type_attr``,
``object_set_class``, ``func_set_code``, ``func_set_[kw]defaults``","
``object.__setattr__``","``(object, attr, value)``","Detect monkey
patching of types and objects. This event
is raised for the ``__class__`` attribute and any attribute on
``type`` objects.
"
"``_PyObject_GenericSetAttr``",``object.__delattr__``,"``(object,
attr)``","Detect deletion of object attributes. This event is raised
for any attribute on ``type`` objects.
"
"``Unpickler.find_class``",``pickle.find_class``,"``(module_name,
global_name)``","Detect imports and global name lookup when
unpickling.
"
.. csv-table:: Table 2: Potential CPython Audit Hooks
:header: "API Function", "Event Name", "Arguments", "Rationale"
:widths: 2, 2, 3, 6
``_PySys_ClearAuditHooks``, ``sys._clearaudithooks``, "", "Notifies
hooks they are being cleaned up, mainly in case the event is
triggered unexpectedly. This event cannot be aborted.
"
``code_new``, ``code.__new__``, "``(bytecode, filename, name)``", "
Detect dynamic creation of code objects. This only occurs for
direct instantiation, and is not raised for normal compilation.
"
``func_new_impl``, ``function.__new__``, "``(code,)``", "Detect
dynamic creation of function objects. This only occurs for direct
instantiation, and is not raised for normal compilation.
"
"``_ctypes.dlopen``, ``_ctypes.LoadLibrary``", ``ctypes.dlopen``, "
``(module_or_path,)``", "Detect when native modules are used.
"
``_ctypes._FuncPtr``, ``ctypes.dlsym``, "``(lib_object, name)``", "
Collect information about specific symbols retrieved from native
modules.
"
``_ctypes._CData``, ``ctypes.cdata``, "``(ptr_as_int,)``", "Detect
when code is accessing arbitrary memory using ``ctypes``.
"
"``new_mmap_object``",``mmap.__new__``,"``(fileno, map_size, access,
offset)``", "Detects creation of mmap objects. On POSIX, access may
have been calculated from the ``prot`` and ``flags`` arguments.
"
``sys._getframe``, ``sys._getframe``, "``(frame_object,)``", "Detect
when code is accessing frames directly.
"
``sys._current_frames``, ``sys._current_frames``, "", "Detect when
code is accessing frames directly.
"
"``socket.bind``, ``socket.connect``, ``socket.connect_ex``,
``socket.getaddrinfo``, ``socket.getnameinfo``, ``socket.sendmsg``,
``socket.sendto``", ``socket.address``, "``(address,)``", "Detect
access to network resources. The address is unmodified from the
original call.
"
"``member_get``, ``func_get_code``, ``func_get_[kw]defaults``
",``object.__getattr__``,"``(object, attr)``","Detect access to
restricted attributes. This event is raised for any built-in
members that are marked as restricted, and members that may allow
bypassing imports.
"
"``urllib.urlopen``",``urllib.Request``,"``(url, data, headers,
method)``", "Detects URL requests.
"
Performance Impact
==================
The important performance impact is the case where events are being
raised but there are no hooks attached. This is the unavoidable case -
once a distributor begins adding audit hooks they have explicitly
chosen to trade performance for functionality. The performance impact
with hooks added is not of interest here, since this is considered
opt-in functionality.
Analysis using the Python Performance Benchmark Suite [1]_ shows no
significant impact, with the vast majority of benchmarks showing
between 1.05x faster and 1.05x slower.
In our opinion, the performance impact of the set of auditing points
described in this PEP is negligible.
Rejected Ideas
==============
Separate module for audit hooks
-------------------------------
The proposal is to add a new module for audit hooks, hypothetically
``audit``. This would separate the API and implementation from the
``sys`` module, and allow naming the C functions ``PyAudit_AddHook`` and
``PyAudit_Audit`` rather than the current variations.
Any such module would need to be a built-in module that is guaranteed to
always be present. The nature of these hooks is that they must be
callable without condition, as any conditional imports or calls provide
opportunities to intercept and suppress or modify events.
Given its nature as one of the most core modules, the ``sys`` module is
somewhat protected against module shadowing attacks. Replacing ``sys``
with a sufficiently functional module that the application can still run
is a much more complicated task than replacing a module with only one
function of interest. An attacker that has the ability to shadow the
``sys`` module is already capable of running arbitrary code from files,
whereas an ``audit`` module can be replaced with a single line in a
``.pth`` file anywhere on the search path::
import sys; sys.modules['audit'] = type('audit', (object,),
    {'audit': lambda *a: None, 'addhook': lambda *a: None})
Multiple layers of protection already exist for monkey patching attacks
against either ``sys`` or ``audit``, but assignments or insertions to
``sys.modules`` are not audited.
This idea is rejected because it would make it trivial to substitute the
``audit`` module, silencing audit calls throughout all callers.
Flag in sys.flags to indicate "audited" mode
--------------------------------------------
The proposal is to add a value in ``sys.flags`` to indicate when Python
is running in a "secure" or "audited" mode. This would allow
applications to detect when some features are enabled or when hooks
have been added and modify their behaviour appropriately.
Currently, we are not aware of any legitimate reasons for a program to
behave differently in the presence of audit hooks.
Both application-level APIs ``sys.audit`` and
``importlib.util.open_for_import`` are always present and functional,
regardless of whether the regular ``python`` entry point or some
alternative entry point is used. Callers cannot determine whether any
hooks have been added (except by performing side-channel analysis), nor
do they need to. The calls should be fast enough that callers do not
need to avoid them, and the program is responsible for ensuring that
any added hooks are fast enough to not affect application performance.
The argument that this is "security by obscurity" is valid, but
irrelevant. Security by obscurity is only an issue when there are no
other protective mechanisms; obscurity as the first step in avoiding
attack is strongly recommended (see `this article
<https://danielmiessler.com/study/security-by-obscurity/>`_ for
discussion).
This idea is rejected because there are no appropriate reasons for an
application to change its behaviour based on whether these APIs are in
use.
Relationship to PEP 551
=======================
This API was originally presented as part of
`PEP 551 <https://www.python.org/dev/peps/pep-0551/>`_ Security
Transparency in the Python Runtime.
For simpler review purposes, and due to the broader applicability of
these APIs beyond security, the API design is now presented separately.
PEP 551 is an informational PEP discussing how to integrate Python into
a secure or audited environment.
References
==========
.. [1] Python Performance Benchmark Suite `<https://github.com/python/performance>`_
Copyright
=========
Copyright (c) 2018 by Microsoft Corporation. This material may be
distributed only subject to the terms and conditions set forth in the
Open Publication License, v1.0 or later (the latest version is presently
available at http://www.opencontent.org/openpub/).

416
pep-0579.rst Normal file
View File

@ -0,0 +1,416 @@
PEP: 579
Title: Refactoring C functions and methods
Author: Jeroen Demeyer <J.Demeyer@UGent.be>
Status: Draft
Type: Informational
Content-Type: text/x-rst
Created: 04-Jun-2018
Post-History: 20-Jun-2018
Abstract
========
This meta-PEP collects various issues with CPython's existing implementation
of built-in functions (functions implemented in C) and methods.
Fixing all these issues is too much for one PEP,
so that will be delegated to other standards track PEPs.
However, this PEP does give some brief ideas of possible fixes.
This is mainly meant to coordinate an overall strategy.
For example, a proposed solution may sound too complicated
for fixing any one single issue, but it may be the best overall
solution for multiple issues.
This PEP is purely informational:
it does not imply that all issues will eventually
be fixed, nor that they will be fixed using the solution proposed here.
It also serves as a check-list of possible requested features
to verify that a given fix does not make those
other features harder to implement.
The major proposed change is replacing ``PyMethodDef``
by a new structure ``PyCCallDef``
which collects everything needed for calling the function/method.
In the ``PyTypeObject`` structure, a new field ``tp_ccalloffset``
is added giving an offset to a ``PyCCallDef *`` in the object structure.
**NOTE**: This PEP deals only with CPython implementation details,
it does not affect the Python language or standard library.
Issues
======
This lists various issues with built-in functions and methods,
together with a plan for a solution and (if applicable)
pointers to standards track PEPs discussing the details.
1. Naming
---------
The word "built-in" is overused in Python.
From a quick skim of the Python documentation, it mostly refers
to things from the ``builtins`` module.
In other words: things which are available in the global namespace
without a need for importing them.
This conflicts with the use of the word "built-in" to mean "implemented in C".
**Solution**: since the C structure for built-in functions and methods is already
called ``PyCFunctionObject``,
let's use the name "cfunction" and "cmethod" instead of "built-in function"
and "built-in method".
2. Not extendable
-----------------
The various classes involved (such as ``builtin_function_or_method``)
cannot be subclassed::
>>> from types import BuiltinFunctionType
>>> class X(BuiltinFunctionType):
...     pass
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: type 'builtin_function_or_method' is not an acceptable base type
This is a problem because it makes it impossible to add features
such as introspection support to these classes.
If one wants to implement a function in C with additional functionality,
an entirely new class must be implemented from scratch.
The problem with this is that the existing classes like
``builtin_function_or_method`` are special-cased in the Python interpreter
to allow faster calling (for example, by using ``METH_FASTCALL``).
It is currently impossible to have a custom class with the same optimizations.
**Solution**: make the existing optimizations available to arbitrary classes.
This is done by adding a new ``PyTypeObject`` field ``tp_ccalloffset``
(or can we re-use ``tp_print`` for that?)
specifying the offset of a ``PyCCallDef`` pointer.
This is a new structure holding all information needed to call
a cfunction and it would be used instead of ``PyMethodDef``.
This implements the new "C call" protocol.
For constructing cfunctions and cmethods, ``PyMethodDef`` arrays
will still be used (for example, in ``tp_methods``) but that will
be the *only* remaining purpose of the ``PyMethodDef`` structure.
Additionally, we can also make some function classes subclassable.
However, this seems less important once we have ``tp_ccalloffset``.
**Reference**: PEP 580
3. cfunctions do not become methods
-----------------------------------
A cfunction like ``repr`` does not implement ``__get__`` to bind
as a method::
>>> class X:
...     meth = repr
>>> x = X()
>>> x.meth()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: repr() takes exactly one argument (0 given)
In this example, one would have expected that ``x.meth()`` returns
``repr(x)`` by applying the normal rules of methods.
This is surprising and a needless difference
between cfunctions and Python functions.
For the standard built-in functions, this is not really a problem
since those are not meant to be used as methods.
But it does become a problem when one wants to implement a
new cfunction with the goal of being usable as a method.
Again, a solution could be to create a new class behaving just
like cfunctions but which bind as methods.
However, that would lose some existing optimizations for methods,
such as the ``LOAD_METHOD``/``CALL_METHOD`` opcodes.
**Solution**: the same as the previous issue.
It just shows that handling ``self`` and ``__get__``
should be part of the new C call protocol.
For backwards compatibility, we would keep the existing non-binding
behavior of cfunctions. We would just allow it in custom classes.
**Reference**: PEP 580
4. Semantics of inspect.isfunction
----------------------------------
Currently, ``inspect.isfunction`` returns ``True`` only for instances
of ``types.FunctionType``.
That is, true Python functions.
A common use case for ``inspect.isfunction`` is checking for introspection
support: it guarantees, for example, that ``inspect.getfile()`` will work.
Ideally, it should be possible for other classes to be treated as
functions too.
**Solution**: introduce a new ``InspectFunction`` abstract base class
and use that to implement ``inspect.isfunction``.
Alternatively, use duck typing for ``inspect.isfunction``
(as proposed in [#bpo30071]_)::
def isfunction(obj):
    return hasattr(type(obj), "__code__")
5. C functions should have access to the function object
--------------------------------------------------------
The underlying C function of a cfunction currently
takes a ``self`` argument (for bound methods)
and then possibly a number of arguments.
There is no way for the C function to actually access the Python
cfunction object (the ``self`` in ``__call__`` or ``tp_call``).
This would for example allow implementing the
C call protocol for Python functions (``types.FunctionType``):
the C function which implements calling Python functions
needs access to the ``__code__`` attribute of the function.
This is also needed for PEP 573
where all cfunctions require access to their "parent"
(the module for functions of a module or the defining class
for methods).
**Solution**: add a new ``PyMethodDef`` flag to specify
that the C function takes an additional argument (as first argument),
namely the function object.
**References**: PEP 580, PEP 573
6. METH_FASTCALL is private and undocumented
--------------------------------------------
The ``METH_FASTCALL`` mechanism allows calling cfunctions and cmethods
using a C array of Python objects instead of a ``tuple``.
This was introduced in Python 3.6 for positional arguments only
and extended in Python 3.7 with support for keyword arguments.
However, given that it is undocumented,
it is presumably only supposed to be used by CPython itself.
**Solution**: since this is an important optimization,
everybody should be encouraged to use it.
Now that the implementation of ``METH_FASTCALL`` is stable, document it!
As part of the C call protocol, we should also add a C API function ::
PyObject *PyCCall_FastCall(PyObject *func, PyObject *const *args, Py_ssize_t nargs, PyObject *keywords)
**Reference**: PEP 580
7. Allowing native C arguments
------------------------------
A cfunction always takes its arguments as Python objects
(say, an array of ``PyObject`` pointers).
In cases where the cfunction is really wrapping a native C function
(for example, coming from ``ctypes`` or some compiler like Cython),
this is inefficient: calls from C code to C code are forced to use
Python objects to pass arguments.
Analogous to the buffer protocol which allows access to C data,
we should also allow access to the underlying C callable.
**Solution**: when wrapping a C function with native arguments
(for example, a C ``long``) inside a cfunction,
we should also store a function pointer to the underlying C function,
together with its C signature.
Argument Clinic could automatically do this by storing
a pointer to the "impl" function.
8. Complexity
-------------
There are a huge number of classes involved to implement
all variations of methods.
This is not a problem by itself, but a compounding issue.
For ordinary Python classes, the table below gives the classes
for various kinds of methods.
The columns refer to the class in the class ``__dict__``,
the class for unbound methods (bound to the class)
and the class for bound methods (bound to the instance):
============= ================ ============ ============
kind          __dict__         unbound      bound
============= ================ ============ ============
Normal method ``function``     ``function`` ``method``
Static method ``staticmethod`` ``function`` ``function``
Class method  ``classmethod``  ``method``   ``method``
Slot method   ``function``     ``function`` ``method``
============= ================ ============ ============
This is the analogous table for extension types (C classes):
============= ========================== ============================== ==============================
kind          __dict__                   unbound                        bound
============= ========================== ============================== ==============================
Normal method ``method_descriptor``      ``method_descriptor``          ``builtin_function_or_method``
Static method ``staticmethod``           ``builtin_function_or_method`` ``builtin_function_or_method``
Class method  ``classmethod_descriptor`` ``builtin_function_or_method`` ``builtin_function_or_method``
Slot method   ``wrapper_descriptor``     ``wrapper_descriptor``         ``method-wrapper``
============= ========================== ============================== ==============================
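A few entries from both tables can be checked directly (output from current
CPython)::

>>> class A:
...     def m(self): pass
...
>>> type(A.__dict__["m"]), type(A.m), type(A().m)
(<class 'function'>, <class 'function'>, <class 'method'>)
>>> type(list.__dict__["append"]), type([].append)
(<class 'method_descriptor'>, <class 'builtin_function_or_method'>)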
There are a lot of classes involved
and these two tables look very different.
There is no good reason why Python methods should be
treated fundamentally differently from C methods.
Also the features are slightly different:
for example, ``method`` supports ``__func__``
but ``builtin_function_or_method`` does not.
Since CPython has optimizations for calls to most of these objects,
the code for dealing with them can also become complex.
A good example of this is the ``call_function`` function in ``Python/ceval.c``.
**Solution**: all these classes should implement the C call protocol.
Then the complexity in the code can mostly be fixed by
checking for the C call protocol (``tp_ccalloffset != 0``)
instead of doing type checks.
Furthermore, it should be investigated whether some of these classes can be merged
and whether ``method`` can be re-used also for bound methods of extension types
(see PEP 576 for the latter,
keeping in mind that this may have some minor backwards compatibility issues).
This is not a goal by itself but just something to keep in mind
when working on these classes.
9. PyMethodDef is too limited
-----------------------------
The typical way to create a cfunction or cmethod in an extension module
is by using a ``PyMethodDef`` to define it.
These are then stored in an array ``PyModuleDef.m_methods``
(for cfunctions) or ``PyTypeObject.tp_methods`` (for cmethods).
However, because of the stable ABI (PEP 384),
we cannot change the ``PyMethodDef`` structure.
So, this means that we cannot add new fields for creating cfunctions/cmethods
this way.
This is probably the reason for the hack that
``__doc__`` and ``__text_signature__`` are stored in the same C string
(with the ``__doc__`` and ``__text_signature__`` descriptors extracting
the relevant part).
**Solution**: stop assuming that a single ``PyMethodDef`` entry
is sufficient to describe a cfunction/cmethod.
Instead, we could add a flag indicating that one of the ``PyMethodDef``
fields is a pointer to an additional structure.
Or, we could add a flag to use two or more consecutive ``PyMethodDef``
entries in the array to store more data.
Then the ``PyMethodDef`` array would be used only to construct
cfunctions/cmethods but it would no longer be used after that.
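
A purely hypothetical sketch of the first variant
(``METH_EXTRA``, ``PyMethodDefEx`` and the example function
are all invented for illustration)::

    #define METH_EXTRA 0x1000  /* invented flag */

    typedef struct {
        PyCFunction real_meth;       /* the actual C function */
        const char *doc;             /* __doc__, stored separately ... */
        const char *text_signature;  /* ... from __text_signature__ */
    } PyMethodDefEx;

    static PyObject *
    square(PyObject *module, PyObject *args)
    {
        long x;
        if (!PyArg_ParseTuple(args, "l", &x))
            return NULL;
        return PyLong_FromLong(x * x);
    }

    static PyMethodDefEx square_extra = {
        square, "Return x*x.", "($module, x, /)"
    };

    static PyMethodDef square_def = {
        "square",
        (PyCFunction)&square_extra,  /* type-punned pointer to the ... */
        METH_VARARGS | METH_EXTRA,   /* ... extra data, as this flag says */
        NULL
    };

Code constructing the cfunction would check ``METH_EXTRA`` and unpack
the extra structure; the ``PyMethodDef`` itself keeps its fixed,
stable-ABI layout.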
10. Slot wrappers have no custom documentation
----------------------------------------------
Right now, slot wrappers like ``__init__`` or ``__lt__`` only have very
generic documentation, not at all specific to the class::
>>> list.__init__.__doc__
'Initialize self. See help(type(self)) for accurate signature.'
>>> list.__lt__.__doc__
'Return self<value.'
The same happens for the signature::
>>> list.__init__.__text_signature__
'($self, /, *args, **kwargs)'
As you can see, slot wrappers do support ``__doc__``
and ``__text_signature__``.
The problem is that these are stored in ``struct wrapperbase``,
which is common for all wrappers of a specific slot
(for example, the same ``wrapperbase`` is used for ``str.__eq__`` and ``int.__eq__``).
**Solution**: rethink the slot wrapper class to allow docstrings
(and text signatures) for each instance separately.
This still leaves the question of how extension modules
should specify the documentation.
The ``PyTypeObject`` entries like ``tp_init`` are just function pointers;
we cannot attach documentation to those.
One solution would be to add entries to the ``tp_methods`` array
just for adding docstrings.
Such an entry could look like ::
{"__init__", NULL, METH_SLOTDOC, "pointer to __init__ doc goes here"}
11. Static methods and class methods should be callable
-------------------------------------------------------
Instances of ``staticmethod`` and ``classmethod`` should be callable.
Admittedly, there is no strong use case for this,
but it has occasionally been requested (see for example [#bpo20309]_).
Making static/class methods callable would increase consistency.
First of all, function decorators typically add functionality or modify
a function, but the result remains callable. This is not true for
``@staticmethod`` and ``@classmethod``.
Second, class methods of extension types are already callable::
>>> fromhex = float.__dict__["fromhex"]
>>> type(fromhex)
<class 'classmethod_descriptor'>
>>> fromhex(float, "0xff")
255.0
Third, one can see ``function``, ``staticmethod`` and ``classmethod``
as different kinds of unbound methods:
they all become ``method`` when bound, but the implementation of ``__get__``
is slightly different.
From this point of view, it looks strange that ``function`` is callable
but the others are not.
**Solution**:
when changing the implementation of ``staticmethod``, ``classmethod``,
we should consider making instances callable.
Even if this is not a goal by itself, it may happen naturally
because of the implementation.
References
==========
.. [#bpo20309] Not all method descriptors are callable
(https://bugs.python.org/issue20309)
.. [#bpo30071] Duck-typing inspect.isfunction()
(https://bugs.python.org/issue30071)
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:
611
pep-0580.rst Normal file
View File
@ -0,0 +1,611 @@
PEP: 580
Title: The C call protocol
Author: Jeroen Demeyer <J.Demeyer@UGent.be>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 14-Jun-2018
Python-Version: 3.8
Post-History: 20-Jun-2018, 22-Jun-2018, 16-Jul-2018
Abstract
========
A new "C call" protocol is proposed.
It is meant for classes representing functions or methods
which need to implement fast calling.
The goal is to generalize existing optimizations for built-in functions
to arbitrary extension types.
In the reference implementation,
this new protocol is used for the existing classes
``builtin_function_or_method`` and ``method_descriptor``.
However, in the future, more classes may implement it.
**NOTE**: This PEP deals only with CPython implementation details,
it does not affect the Python language or standard library.
Motivation
==========
Currently, the Python bytecode interpreter has various optimizations
for calling instances of ``builtin_function_or_method``,
``method_descriptor``, ``method`` and ``function``.
However, none of these classes is subclassable.
Therefore, these optimizations are not available to
user-defined extension types.
If this PEP is implemented, then the checks
for ``builtin_function_or_method`` and ``method_descriptor``
could be replaced by simply checking for and using the C call protocol.
This simplifies existing code.
We also design the C call protocol such that it can easily
be extended with new features in the future.
For more background and motivation, see PEP 579.
Basic idea
==========
Currently, CPython has multiple optimizations for fast calling
for a few specific function classes.
Calling instances of these classes using a plain ``tp_call`` is slower
than using the optimizations.
The basic idea of this PEP is to allow user-defined extension types
(not Python classes) to use these optimizations as well,
both as caller and as callee.
The existing class ``builtin_function_or_method`` and a few others
use a ``PyMethodDef`` structure for describing the underlying C function and its signature.
The first concrete change is that this is replaced by a new structure ``PyCCallDef``.
This stores some of the same information as a ``PyMethodDef``,
but with one important addition:
the "parent" of the function (the class or module where it is defined).
Note that ``PyMethodDef`` arrays are still used to construct
functions/methods but no longer for calling them.
Second, we want every class to be able to use such a ``PyCCallDef`` for optimizing calls,
so the ``PyTypeObject`` structure gains a ``tp_ccalloffset`` field
giving an offset to a ``PyCCallDef *`` in the object structure
and a flag ``Py_TPFLAGS_HAVE_CCALL`` indicating that ``tp_ccalloffset`` is valid.
Third, since we want to deal efficiently with unbound and bound methods
(as opposed to only plain functions), we also need to handle ``__self__``:
after the ``PyCCallDef *`` in the object structure,
there is a ``PyObject *self`` field.
These two fields together are referred to as a ``PyCCallRoot`` structure.
The new protocol for efficiently calling objects using these new structures
is called the "C call protocol".
New data structures
===================
The ``PyTypeObject`` structure gains a new field ``Py_ssize_t tp_ccalloffset``
and a new flag ``Py_TPFLAGS_HAVE_CCALL``.
If this flag is set, then ``tp_ccalloffset`` is assumed to be a valid
offset inside the object structure (similar to ``tp_weaklistoffset``).
It must be a strictly positive integer.
At that offset, a ``PyCCallRoot`` structure appears::
typedef struct {
PyCCallDef *cr_ccall;
PyObject *cr_self; /* __self__ argument for methods */
} PyCCallRoot;
The ``PyCCallDef`` structure contains everything needed to describe how
the function can be called::
typedef struct {
uint32_t cc_flags;
PyCFunc cc_func; /* C function to call */
PyObject *cc_parent; /* class or module */
} PyCCallDef;
The reason for putting ``__self__`` outside of ``PyCCallDef``
is that ``PyCCallDef`` is not meant to be changed after creating the function.
A single ``PyCCallDef`` can be shared
by an unbound method and multiple bound methods.
This wouldn't work if ``__self__`` were stored inside that structure.
**NOTE**: unlike ``tp_dictoffset``, we do not allow negative numbers
for ``tp_ccalloffset`` to mean counting from the end.
There does not seem to be a use case for it and it would only complicate
the implementation.
Parent
------
The ``cc_parent`` field (accessed for example by a ``__parent__``
or ``__objclass__`` descriptor from Python code) can be any Python object.
For methods of extension types, this is set to the class.
For functions of modules, this is set to the module.
The parent serves multiple purposes: for methods of extension types,
it is used for type checks like the following::
>>> list.append({}, "x")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: descriptor 'append' requires a 'list' object but received a 'dict'
PEP 573 specifies that every function should have access to the
module in which it is defined.
For functions of a module, this is given by the parent.
For methods, this works indirectly through the class,
assuming that the class has a pointer to the module.
The parent would also typically be used to implement ``__qualname__``.
The new C API function ``PyCCall_GenericGetQualname()`` does exactly that.
Custom classes are free to set ``cc_parent`` to whatever they want.
It is only used by the C call protocol if the ``CCALL_OBJCLASS`` flag is set.
Using tp_print
--------------
We propose to replace the existing unused field ``tp_print``
by ``tp_ccalloffset``.
Since ``Py_TPFLAGS_HAVE_CCALL`` would *not* be added to
``Py_TPFLAGS_DEFAULT``, this ensures full backwards compatibility for
existing extension modules setting ``tp_print``.
It also means that we can require that ``tp_ccalloffset`` is a valid
offset when ``Py_TPFLAGS_HAVE_CCALL`` is specified:
we do not need to check ``tp_ccalloffset != 0``.
In future Python versions, we may decide that ``tp_print``
becomes ``tp_ccalloffset`` unconditionally,
drop the ``Py_TPFLAGS_HAVE_CCALL`` flag and instead check for
``tp_ccalloffset != 0``.
The C call protocol
===================
We say that a class implements the C call protocol
if it has the ``Py_TPFLAGS_HAVE_CCALL`` flag set
(as explained above, it must then set ``tp_ccalloffset > 0``).
Such a class must implement ``__call__`` as described in this section
(in practice, this just means setting ``tp_call`` to ``PyCCall_Call``).
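
A minimal sketch of a type adopting the protocol, using only the
structures, flags and functions proposed in this PEP (so it compiles
only against the reference implementation; initialization of
``ccalldef``, error handling and the ``__name__`` requirement below
are all omitted)::

    #include <stddef.h>  /* offsetof */

    typedef struct {
        PyObject_HEAD
        /* The PyCCallRoot structure starts here: */
        PyCCallDef *cr_ccall;
        PyObject *cr_self;
        PyCCallDef ccalldef;  /* storage pointed to by cr_ccall */
    } MyFunctionObject;

    static PyTypeObject MyFunction_Type = {
        PyVarObject_HEAD_INIT(NULL, 0)
        .tp_name = "myfunction",
        .tp_basicsize = sizeof(MyFunctionObject),
        .tp_ccalloffset = offsetof(MyFunctionObject, cr_ccall),
        .tp_call = PyCCall_Call,
        .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_CCALL,
    };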
The ``cc_func`` field is a C function pointer.
Its precise signature depends on flags.
Below are the possible values for ``cc_flags & CCALL_SIGNATURE``
together with the arguments that the C function takes.
The return value is always ``PyObject *``.
The following are completely analogous to the existing ``PyMethodDef``
signature flags:
- ``CCALL_VARARGS``: ``cc_func(PyObject *self, PyObject *args)``
- ``CCALL_VARARGS | CCALL_KEYWORDS``: ``cc_func(PyObject *self, PyObject *args, PyObject *kwds)``
- ``CCALL_FASTCALL``: ``cc_func(PyObject *self, PyObject *const *args, Py_ssize_t nargs)``
- ``CCALL_FASTCALL | CCALL_KEYWORDS``: ``cc_func(PyObject *self, PyObject *const *args, Py_ssize_t nargs, PyObject *kwnames)``
- ``CCALL_NULLARG``: ``cc_func(PyObject *self, PyObject *null)``
(the function takes no arguments but a ``NULL`` is passed to the C function)
- ``CCALL_O``: ``cc_func(PyObject *self, PyObject *arg)``
The flag ``CCALL_FUNCARG`` may be combined with any of these.
If so, the C function takes the function object
(the ``self`` in ``__call__``) as an additional first argument.
For example, we have the following signature:
- ``CCALL_FUNCARG | CCALL_VARARGS``: ``cc_func(PyObject *func, PyObject *self, PyObject *args)``
**NOTE**: in the case of bound methods, it is currently unspecified
whether the "function object" in the paragraph above refers
to the bound method or the original function (which is wrapped by the bound method).
In the reference implementation, the bound method is passed.
In the future, this may change to the wrapped function.
Despite this ambiguity, the implementation of bound methods
guarantees that ``PyCCall_CCALLDEF(func)``
points to the ``PyCCallDef`` of the original function.
**NOTE**: unlike the existing ``METH_...`` flags,
the ``CCALL_...`` constants do not necessarily represent single bits.
So checking ``(cc_flags & CCALL_VARARGS) == 0`` is not a valid way
to check the signature.
There are also no guarantees of binary compatibility
between Python versions for these flags.
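
As an illustration, a method using the plain ``CCALL_O`` convention
could be declared as follows (the ``CCALL_SLICE_SELF`` and
``CCALL_OBJCLASS`` flags are explained below; a real definition would
also fill in ``cc_parent``)::

    static PyObject *
    mything_append(PyObject *self, PyObject *arg)
    {
        /* ... append arg to self ... */
        Py_RETURN_NONE;
    }

    static PyCCallDef append_ccalldef = {
        CCALL_O | CCALL_SLICE_SELF | CCALL_OBJCLASS,  /* cc_flags */
        (PyCFunc)mything_append,                      /* cc_func */
        NULL                                          /* cc_parent */
    };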
Checking __objclass__
---------------------
If the ``CCALL_OBJCLASS`` flag is set and if ``cr_self`` is NULL
(this is the case for unbound methods of extension types),
then a type check is done:
the function must be called with at least one positional argument
and the first (typically called ``self``) must be an instance of
``cc_parent`` (which must be a class).
If not, a ``TypeError`` is raised.
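
In C, the check amounts to something like this simplified sketch
(the real implementation produces the detailed error messages shown
elsewhere in this PEP)::

    static int
    check_objclass(PyCCallDef *cc, PyObject *self)
    {
        PyTypeObject *cls = (PyTypeObject *)cc->cc_parent;
        if (!PyObject_TypeCheck(self, cls)) {
            PyErr_Format(PyExc_TypeError,
                         "descriptor requires a '%s' object",
                         cls->tp_name);
            return -1;
        }
        return 0;
    }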
Self slicing
------------
If ``cr_self`` is not NULL or if the flag ``CCALL_SLICE_SELF``
is not set in ``cc_flags``, then the argument passed as ``self``
is simply ``cr_self``.
If ``cr_self`` is NULL and the flag ``CCALL_SLICE_SELF`` is set,
then the first positional argument is removed from
``args`` and instead passed as first argument to the C function.
Effectively, the first positional argument is treated as ``__self__``.
If there are no positional arguments, ``TypeError`` is raised.
This process is called self slicing and a function is said to have self
slicing if ``cr_self`` is NULL and ``CCALL_SLICE_SELF`` is set.
Note that a ``CCALL_NULLARG`` function with self slicing effectively has
one argument, namely ``self``.
Analogously, a ``CCALL_O`` function with self slicing has two arguments.
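
A reduced sketch of the slicing step, specialized to the ``CCALL_O``
case (the real ``PyCCall_Call`` handles every signature and raises
more precise errors)::

    static PyObject *
    call_O_with_slicing(PyObject *func, PyObject *args)
    {
        PyCCallRoot *cr = PyCCall_CCALLROOT(func);
        PyCCallDef *cc = cr->cr_ccall;
        PyObject *self = cr->cr_self;
        Py_ssize_t i = 0, n = PyTuple_GET_SIZE(args);

        if (self == NULL && (cc->cc_flags & CCALL_SLICE_SELF)) {
            if (n == 0) {
                PyErr_SetString(PyExc_TypeError, "no self argument");
                return NULL;
            }
            self = PyTuple_GET_ITEM(args, i++);  /* slice off __self__ */
        }
        if (n - i != 1) {
            PyErr_SetString(PyExc_TypeError, "expected one argument");
            return NULL;
        }
        return ((PyCFunction)cc->cc_func)(self, PyTuple_GET_ITEM(args, i));
    }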
Descriptor behavior
-------------------
Classes supporting the C call protocol
must implement the descriptor protocol in a specific way.
This is required for an efficient implementation of bound methods:
it allows sharing the ``PyCCallDef`` structure between bound and unbound methods.
It is also needed for a correct implementation of ``_PyObject_GetMethod``
which is used by the ``LOAD_METHOD``/``CALL_METHOD`` optimization.
First of all, if ``func`` supports the C call protocol,
then ``func.__set__`` must not be implemented.
Second, ``func.__get__`` must behave as follows:
- If ``cr_self`` is not NULL, then ``__get__`` must be a no-op
in the sense that ``func.__get__(obj, cls)(*args, **kwds)``
behaves exactly the same as ``func(*args, **kwds)``.
It is also allowed for ``__get__`` not to be implemented at all.
- If ``cr_self`` is NULL, then ``func.__get__(obj, cls)(*args, **kwds)``
(with ``obj`` not None)
must be equivalent to ``func(obj, *args, **kwds)``.
In particular, ``__get__`` must be implemented in this case.
Note that this is unrelated to self slicing: ``obj`` may be passed
as ``self`` argument to the C function or it may be the first positional argument.
- If ``cr_self`` is NULL, then ``func.__get__(None, cls)(*args, **kwds)``
must be equivalent to ``func(*args, **kwds)``.
There are no restrictions on the object returned by ``func.__get__(obj, cls)``;
for example, it is not required to implement the C call protocol.
The above only specifies what ``func.__get__(obj, cls).__call__`` does.
For classes that do not care about ``__self__`` and ``__get__`` at all,
the easiest solution is to assign ``cr_self = Py_None``
(or any other non-NULL value).
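
Continuing the ``MyFunctionObject`` sketch from above, a conforming
``tp_descr_get`` could look as follows (binding simply creates a new
instance sharing the unbound method's ``PyCCallDef``; lifetime
management of the shared structure is glossed over)::

    static PyObject *
    myfunction_descr_get(PyObject *func, PyObject *obj, PyObject *cls)
    {
        MyFunctionObject *f = (MyFunctionObject *)func;
        if (f->cr_self != NULL || obj == NULL) {
            /* already bound, or looked up on the class: no-op */
            Py_INCREF(func);
            return func;
        }
        MyFunctionObject *m = PyObject_New(MyFunctionObject, Py_TYPE(func));
        if (m == NULL)
            return NULL;
        m->cr_ccall = f->cr_ccall;  /* share the PyCCallDef */
        Py_INCREF(obj);
        m->cr_self = obj;           /* bind __self__ */
        return (PyObject *)m;
    }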
__name__ attribute
------------------
The C call protocol requires that the function has a ``__name__``
attribute which is of type ``str`` (not a subclass).
Furthermore, this must be idempotent in the sense
that getting the ``__name__`` attribute twice in a row must return
exactly the same Python object.
This implies that it cannot be a temporary object; it must be stored somewhere.
This is required because ``PyEval_GetFuncName`` and ``PyEval_GetFuncDesc``
use borrowed references to the ``__name__`` attribute.
Generic API functions
---------------------
This section lists the new public API functions dealing with the C call protocol.
- ``int PyCCall_Check(PyObject *op)``:
return true if ``op`` implements the C call protocol.
All the functions and macros below
apply to any instance supporting the C call protocol.
In other words, ``PyCCall_Check(func)`` must be true.
- ``PyObject * PyCCall_Call(PyObject *func, PyObject *args, PyObject *kwds)``:
call ``func`` with positional arguments ``args``
and keyword arguments ``kwds`` (``kwds`` may be NULL).
This function is meant to be put in the ``tp_call`` slot.
- ``PyObject * PyCCall_FASTCALL(PyObject *func, PyObject *const *args, Py_ssize_t nargs, PyObject *kwds)``:
call ``func`` with ``nargs`` positional arguments given by ``args[0]``, …, ``args[nargs-1]``.
The parameter ``kwds`` can be NULL (no keyword arguments),
a dict with ``name:value`` items or a tuple with keyword names.
In the latter case, the keyword values are stored in the ``args``
array, starting at ``args[nargs]``.
Macros to access the ``PyCCallRoot`` and ``PyCCallDef`` structures:
- ``PyCCallRoot * PyCCall_CCALLROOT(PyObject *func)``:
pointer to the ``PyCCallRoot`` structure inside ``func``.
- ``PyCCallDef * PyCCall_CCALLDEF(PyObject *func)``:
shorthand for ``PyCCall_CCALLROOT(func)->cr_ccall``.
- ``uint32_t PyCCall_FLAGS(PyObject *func)``:
  shorthand for ``PyCCall_CCALLROOT(func)->cr_ccall->cc_flags``.
- ``PyObject * PyCCall_SELF(PyObject *func)``:
  shorthand for ``PyCCall_CCALLROOT(func)->cr_self``.
Generic getters, meant to be put into the ``tp_getset`` array:
- ``PyObject * PyCCall_GenericGetParent(PyObject *func, void *closure)``:
return ``cc_parent``.
Raise ``AttributeError`` if ``cc_parent`` is NULL.
- ``PyObject * PyCCall_GenericGetQualname(PyObject *func, void *closure)``:
return a string suitable for use as ``__qualname__``.
This uses the ``__qualname__`` of ``cc_parent`` if possible.
It also uses the ``__name__`` attribute.
- ``PyObject * PyCCall_GenericGetSelf(PyObject *func, void *closure)``:
return ``cr_self``.
Raise ``AttributeError`` if ``cr_self`` is NULL.
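
As a caller-side sketch, invoking an arbitrary C-call object with two
positional arguments and no keywords could look like::

    static PyObject *
    call_with_two_args(PyObject *func, PyObject *a, PyObject *b)
    {
        if (!PyCCall_Check(func)) {
            PyErr_SetString(PyExc_TypeError, "expected a C-call object");
            return NULL;
        }
        PyObject *stack[2] = {a, b};
        return PyCCall_FASTCALL(func, stack, 2, NULL);
    }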
Profiling
---------
The profiling events
``c_call``, ``c_return`` and ``c_exception`` are only generated
when calling actual instances of ``builtin_function_or_method`` or ``method_descriptor``.
This is done for simplicity and also for backwards compatibility
(such that the profile function does not receive objects that it does not recognize).
In a future PEP, we may extend C-level profiling to arbitrary classes
implementing the C call protocol.
Changes to built-in functions and methods
=========================================
The reference implementation of this PEP changes
the existing classes ``builtin_function_or_method`` and ``method_descriptor``
to use the C call protocol.
In fact, those two classes are almost merged:
the implementation becomes very similar, but they remain separate classes
(mostly for backwards compatibility).
The ``PyCCallDef`` structure is simply stored
as part of the object structure.
Both classes use ``PyCFunctionObject`` as object structure.
This is the new layout::
typedef struct {
PyObject_HEAD
PyCCallDef *m_ccall;
PyObject *m_self; /* Passed as 'self' arg to the C function */
PyCCallDef _ccalldef; /* Storage for m_ccall */
PyObject *m_name; /* __name__; str object (not NULL) */
PyObject *m_module; /* __module__; can be anything */
const char *m_doc; /* __text_signature__ and __doc__ */
PyObject *m_weakreflist; /* List of weak references */
} PyCFunctionObject;
For functions of a module and for unbound methods of extension types,
``m_ccall`` points to the ``_ccalldef`` field.
For bound methods, ``m_ccall`` points to the ``PyCCallDef``
of the unbound method.
**NOTE**: the new layout of ``method_descriptor`` changes it
such that it no longer starts with ``PyDescr_COMMON``.
This is purely an implementation detail and it should cause few (if any)
compatibility problems.
C API functions
---------------
The following function is added (also to the stable ABI [#pep384]_):
- ``PyObject * PyCFunction_ClsNew(PyTypeObject *cls, PyMethodDef *ml, PyObject *self, PyObject *module, PyObject *parent)``:
create a new object with object structure ``PyCFunctionObject`` and class ``cls``.
This is called in turn by ``PyCFunction_NewEx`` and ``PyDescr_NewMethod``.
The undocumented functions ``PyCFunction_GetFlags``
and ``PyCFunction_GET_FLAGS``
are removed because it would be non-trivial to support them
in a backwards-compatible way.
Inheritance
===========
Extension types inherit the type flag ``Py_TPFLAGS_HAVE_CCALL``
and the value ``tp_ccalloffset`` from the base class,
provided that they implement ``tp_call`` and ``tp_descr_get``
the same way as the base class.
Heap types never inherit the C call protocol because
that would not be safe (heap types can be changed dynamically).
Performance
===========
This PEP should not impact the performance of existing code
(in the positive or negative sense).
It is meant to allow efficient new code to be written,
not to make existing code faster.
Stable ABI
==========
None of the functions, structures or constants dealing with the C call protocol
are added to the stable ABI [#pep384]_.
There are two reasons for this:
first of all, the most useful feature of the C call protocol is probably the
``METH_FASTCALL`` calling convention.
Given that this is not even part of the public API (see also PEP 579, issue 6),
it would be strange to add anything else from the C call protocol
to the stable ABI.
Second, we want the C call protocol to be extensible in the future.
By not adding anything to the stable ABI,
we are free to do that without restrictions.
Backwards compatibility
=======================
There should be no difference at all for the Python interface,
nor for the documented C API
(in the sense that all functions remain supported with the same functionality).
The removed function ``PyCFunction_GetFlags``
is officially part of the stable ABI [#pep384]_.
However, this is probably an oversight:
first of all, it is not even documented.
Second, the flag ``METH_FASTCALL``
is not part of the stable ABI but it is very common
(because of Argument Clinic).
So, if one cannot support ``METH_FASTCALL``,
it is hard to imagine a use case for ``PyCFunction_GetFlags``.
The fact that ``PyCFunction_GET_FLAGS`` and ``PyCFunction_GetFlags``
are not used at all by CPython outside of ``Objects/call.c``
further shows that these functions are not particularly useful.
Concluding: the only potential breakage is with C code
which accesses the internals of ``PyCFunctionObject`` and ``PyMethodDescrObject``.
We expect very few problems because of this.
Rationale
=========
Why is this better than PEP 575?
--------------------------------
One of the major complaints about PEP 575 was that it coupled
functionality (the calling and introspection protocol)
with the class hierarchy:
a class could only benefit from the new features
if it was a subclass of ``base_function``.
It may be difficult for existing classes to do that
because they may have other constraints on the layout of the C object structure,
coming from an existing base class or implementation details.
For example, ``functools.lru_cache`` cannot implement PEP 575 as-is.
It also complicated the implementation precisely because changes
were needed both in the implementation details and in the class hierarchy.
The current PEP does not have these problems.
Why store the function pointer in the instance?
-----------------------------------------------
The actual information needed for calling an object
is stored in the instance (in the ``PyCCallDef`` structure)
instead of the class.
This is different from the ``tp_call`` slot or earlier attempts
at implementing a ``tp_fastcall`` slot [#bpo29259]_.
The main use case is built-in functions and methods.
For those, the C function to be called does depend on the instance.
Note that the current protocol makes it easy to support the case
where the same C function is called for all instances:
just use a single static ``PyCCallDef`` structure for every instance.
Why CCALL_OBJCLASS?
-------------------
The flag ``CCALL_OBJCLASS`` is meant to support various cases
where the class of a ``self`` argument must be checked, such as::
>>> list.append({}, None)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: append() requires a 'list' object but received a 'dict'
>>> list.__len__({})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: descriptor '__len__' requires a 'list' object but received a 'dict'
>>> float.__dict__["fromhex"](list, "0xff")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: descriptor 'fromhex' for type 'float' doesn't apply to type 'list'
In the reference implementation, only the first of these uses the new code.
The other examples show that these kinds of checks appear
in multiple places, so it makes sense to add generic support for them.
Why CCALL_SLICE_SELF?
---------------------
The flag ``CCALL_SLICE_SELF`` and the concept of self slicing
are needed to support methods:
the C function should not care
whether it is called as an unbound method or as a bound method.
In both cases, there should be a ``self`` argument
and this is simply the first positional argument of an unbound method call.
For example, ``list.append`` is a ``METH_O`` method.
Both the calls ``list.append([], 42)`` and ``[].append(42)`` should
translate to the C call ``list_append([], 42)``.
Thanks to the proposed C call protocol, we can support this in such a way
that both the unbound and the bound method share a ``PyCCallDef``
structure (with the ``CCALL_SLICE_SELF`` flag set).
Concluding, ``CCALL_SLICE_SELF`` has two advantages:
there is no extra layer of indirection for calling
and constructing bound methods does not require setting up a ``PyCCallDef`` structure.
Replacing tp_print
------------------
We repurpose ``tp_print`` as ``tp_ccalloffset`` because this makes
it easier for external projects to backport the C call protocol
to earlier Python versions.
In particular, the Cython project has shown interest in doing that
(see https://mail.python.org/pipermail/python-dev/2018-June/153927.html).
Alternative suggestions
=======================
PEP 576 is an alternative approach to solving the same problem as this PEP.
See https://mail.python.org/pipermail/python-dev/2018-July/154238.html
for comments on the difference between PEP 576 and PEP 580.
Reference implementation
========================
The reference implementation can be found at
https://github.com/jdemeyer/cpython/tree/pep580
References
==========
.. [#pep384] Löwis, PEP 384 Defining a Stable ABI,
https://www.python.org/dev/peps/pep-0384/
.. [#bpo29259] Add tp_fastcall to PyTypeObject: support FASTCALL calling convention for all callable objects,
https://bugs.python.org/issue29259
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:
30
pep-0801.rst Normal file
View File
@ -0,0 +1,30 @@
PEP: 801
Title: Reserved
Author: Barry Warsaw <barry@python.org>
Status: Draft
Type: Informational
Content-Type: text/x-rst
Created: 21-Jun-2018
Abstract
========
This PEP is reserved for future use. Contact the author or
`the PEP editors <peps@python.org>`_ for details.
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:
View File
@ -151,7 +151,7 @@ parentheses around the parameter list. However it was decided
[#lambda]_ not to make this change because: [#lambda]_ not to make this change because:
1. It would be an incompatible change. 1. It would be an incompatible change.
2. Lambda's are neutered anyway. 2. Lambdas are neutered anyway.
3. The lambda can always be changed to a function. 3. The lambda can always be changed to a function.
@ -159,11 +159,11 @@ Accessing Function Annotations
============================== ==============================
Once compiled, a function's annotations are available via the Once compiled, a function's annotations are available via the
function's ``func_annotations`` attribute. This attribute is function's ``__annotations__`` attribute. This attribute is
a mutable dictionary, mapping parameter names to an object a mutable dictionary, mapping parameter names to an object
representing the evaluated annotation expression representing the evaluated annotation expression
There is a special key in the ``func_annotations`` mapping, There is a special key in the ``__annotations__`` mapping,
``"return"``. This key is present only if an annotation was supplied ``"return"``. This key is present only if an annotation was supplied
for the function's return value. for the function's return value.
@ -172,7 +172,7 @@ For example, the following annotation::
def foo(a: 'x', b: 5 + 6, c: list) -> max(2, 9): def foo(a: 'x', b: 5 + 6, c: list) -> max(2, 9):
... ...
would result in a ``func_annotation`` mapping of :: would result in an ``__annotations__`` mapping of ::
{'a': 'x', {'a': 'x',
'b': 11, 'b': 11,
@ -183,7 +183,7 @@ The ``return`` key was chosen because it cannot conflict with the name
of a parameter; any attempt to use ``return`` as a parameter name of a parameter; any attempt to use ``return`` as a parameter name
would result in a ``SyntaxError``. would result in a ``SyntaxError``.
``func_annotations`` is an empty, mutable dictionary if there are no ``__annotations__`` is an empty, mutable dictionary if there are no
annotations on the function or if the functions was created from annotations on the function or if the functions was created from
a ``lambda`` expression. a ``lambda`` expression.
View File
@ -10,6 +10,7 @@ Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Created: 12-Dec-2012 Created: 12-Dec-2012
Post-History: 21-Dec-2012 Post-History: 21-Dec-2012
Replaces: 3153
Resolution: https://mail.python.org/pipermail/python-dev/2013-November/130419.html Resolution: https://mail.python.org/pipermail/python-dev/2013-November/130419.html
Abstract Abstract
View File
@ -1,44 +1,43 @@
# -*- coding: utf-8 -*- # -*- coding: utf-8 -*-
import sys text_type = str
if sys.version_info[0] > 2:
text_type = str
else:
text_type = unicode
title_length = 55 title_length = 55
column_format = (u' %(type)1s%(status)1s %(number)4s %(title)-' + author_length = 40
text_type(title_length) + u's %(authors)-s') table_separator = "== ==== " + "="*title_length + " " + "="*author_length
column_format = (
'%(type)1s%(status)1s %(number)4s %(title)-{title_length}s %(authors)-s'
).format(title_length=title_length)
header = u"""PEP: 0 header = """\
PEP: 0
Title: Index of Python Enhancement Proposals (PEPs) Title: Index of Python Enhancement Proposals (PEPs)
Version: N/A Version: N/A
Last-Modified: %s Last-Modified: %s
Author: David Goodger <goodger@python.org>, Author: python-dev <python-dev@python.org>
Barry Warsaw <barry@python.org>
Status: Active Status: Active
Type: Informational Type: Informational
Content-Type: text/x-rst
Created: 13-Jul-2000 Created: 13-Jul-2000
""" """
intro = u""" intro = """\
This PEP contains the index of all Python Enhancement Proposals, This PEP contains the index of all Python Enhancement Proposals,
known as PEPs. PEP numbers are assigned by the PEP editors, and known as PEPs. PEP numbers are assigned by the PEP editors, and
once assigned are never changed[1]. The Mercurial history[2] of once assigned are never changed [1_]. The version control history [2_] of
the PEP texts represent their historical record. the PEP texts represent their historical record.
""" """
references = u""" references = """\
[1] PEP 1: PEP Purpose and Guidelines .. [1] PEP 1: PEP Purpose and Guidelines
[2] View PEP history online .. [2] View PEP history online: https://github.com/python/peps
https://hg.python.org/peps/
""" """
footer = u""" footer = """ \
Local Variables: ..
mode: indented-text Local Variables:
indent-tabs-mode: nil mode: indented-text
sentence-end-double-space: t indent-tabs-mode: nil
fill-column: 70 sentence-end-double-space: t
coding: utf-8 fill-column: 70
End:""" coding: utf-8
End:\
"""
View File
@ -26,15 +26,13 @@ RESERVED = [
indent = u' ' indent = u' '
def write_column_headers(output): def emit_column_headers(output):
"""Output the column headers for the PEP indices.""" """Output the column headers for the PEP indices."""
column_headers = {'status': u'', 'type': u'', 'number': u'num', column_headers = {'status': '.', 'type': '.', 'number': 'PEP',
'title': u'title', 'authors': u'owner'} 'title': 'PEP Title', 'authors': 'PEP Author(s)'}
print(constants.table_separator, file=output)
print(constants.column_format % column_headers, file=output) print(constants.column_format % column_headers, file=output)
underline_headers = {} print(constants.table_separator, file=output)
for key, value in column_headers.items():
underline_headers[key] = constants.text_type(len(value) * '-')
print(constants.column_format % underline_headers, file=output)
def sort_peps(peps): def sort_peps(peps):
@ -42,6 +40,7 @@ def sort_peps(peps):
and essentially dead.""" and essentially dead."""
meta = [] meta = []
info = [] info = []
provisional = []
accepted = [] accepted = []
open_ = [] open_ = []
finished = [] finished = []
@ -74,6 +73,8 @@ def sort_peps(peps):
info.append(pep) info.append(pep)
else: else:
historical.append(pep) historical.append(pep)
elif pep.status == 'Provisional':
provisional.append(pep)
elif pep.status in ('Accepted', 'Active'): elif pep.status in ('Accepted', 'Active'):
accepted.append(pep) accepted.append(pep)
elif pep.status == 'Final': elif pep.status == 'Final':
@ -82,14 +83,15 @@ def sort_peps(peps):
raise PEPError("unsorted (%s/%s)" % raise PEPError("unsorted (%s/%s)" %
(pep.type_, pep.status), (pep.type_, pep.status),
pep.filename, pep.number) pep.filename, pep.number)
return meta, info, accepted, open_, finished, historical, deferred, dead return (meta, info, provisional, accepted, open_,
finished, historical, deferred, dead)
def verify_email_addresses(peps): def verify_email_addresses(peps):
authors_dict = {} authors_dict = {}
for pep in peps: for pep in peps:
for author in pep.authors: for author in pep.authors:
# If this is the first time we have come across an author, add him. # If this is the first time we have come across an author, add them.
if author not in authors_dict: if author not in authors_dict:
authors_dict[author] = [author.email] authors_dict[author] = [author.email]
else: else:
@ -129,112 +131,160 @@ def sort_authors(authors_dict):
def normalized_last_first(name): def normalized_last_first(name):
return len(unicodedata.normalize('NFC', name.last_first)) return len(unicodedata.normalize('NFC', name.last_first))
def emit_title(text, anchor, output, *, symbol="="):
print(".. _{anchor}:\n".format(anchor=anchor), file=output)
print(text, file=output)
print(symbol*len(text), file=output)
print(file=output)
def emit_subtitle(text, anchor, output):
emit_title(text, anchor, output, symbol="-")
def emit_pep_category(output, category, anchor, peps):
emit_subtitle(category, anchor, output)
emit_column_headers(output)
for pep in peps:
print(pep, file=output)
print(constants.table_separator, file=output)
print(file=output)
def write_pep0(peps, output=sys.stdout): def write_pep0(peps, output=sys.stdout):
# PEP metadata
today = datetime.date.today().strftime("%Y-%m-%d") today = datetime.date.today().strftime("%Y-%m-%d")
print(constants.header % today, file=output) print(constants.header % today, file=output)
print(file=output) print(file=output)
print(u"Introduction", file=output) # Introduction
emit_title("Introduction", "intro", output)
print(constants.intro, file=output) print(constants.intro, file=output)
print(file=output) print(file=output)
print(u"Index by Category", file=output) # PEPs by category
(meta, info, provisional, accepted, open_,
finished, historical, deferred, dead) = sort_peps(peps)
emit_title("Index by Category", "by-category", output)
emit_pep_category(
category="Meta-PEPs (PEPs about PEPs or Processes)",
anchor="by-category-meta",
peps=meta,
output=output,
)
emit_pep_category(
category="Other Informational PEPs",
anchor="by-category-other-info",
peps=info,
output=output,
)
emit_pep_category(
category="Provisional PEPs (provisionally accepted; interface may still change)",
anchor="by-category-provisional",
peps=provisional,
output=output,
)
emit_pep_category(
category="Accepted PEPs (accepted; may not be implemented yet)",
anchor="by-category-accepted",
peps=accepted,
output=output,
)
emit_pep_category(
category="Open PEPs (under consideration)",
anchor="by-category-open",
peps=open_,
output=output,
)
emit_pep_category(
category="Finished PEPs (done, with a stable interface)",
anchor="by-category-finished",
peps=finished,
output=output,
)
emit_pep_category(
category="Historical Meta-PEPs and Informational PEPs",
anchor="by-category-historical",
peps=historical,
output=output,
)
emit_pep_category(
category="Deferred PEPs (postponed pending further research or updates)",
anchor="by-category-deferred",
peps=deferred,
output=output,
)
emit_pep_category(
category="Abandoned, Withdrawn, and Rejected PEPs",
anchor="by-category-abandoned",
peps=dead,
output=output,
)
print(file=output) print(file=output)
write_column_headers(output) # PEPs by number
(meta, info, accepted, open_, finished, emit_title("Numerical Index", "by-pep-number", output)
historical, deferred, dead) = sort_peps(peps) emit_column_headers(output)
print(file=output)
print(u" Meta-PEPs (PEPs about PEPs or Processes)", file=output)
print(file=output)
for pep in meta:
print(constants.text_type(pep), file=output)
print(file=output)
print(u" Other Informational PEPs", file=output)
print(file=output)
for pep in info:
print(constants.text_type(pep), file=output)
print(file=output)
print(u" Accepted PEPs (accepted; may not be implemented yet)", file=output)
print(file=output)
for pep in accepted:
print(constants.text_type(pep), file=output)
print(file=output)
print(u" Open PEPs (under consideration)", file=output)
print(file=output)
for pep in open_:
print(constants.text_type(pep), file=output)
print(file=output)
print(u" Finished PEPs (done, implemented in code repository)", file=output)
print(file=output)
for pep in finished:
print(constants.text_type(pep), file=output)
print(file=output)
print(u" Historical Meta-PEPs and Informational PEPs", file=output)
print(file=output)
for pep in historical:
print(constants.text_type(pep), file=output)
print(file=output)
print(u" Deferred PEPs", file=output)
print(file=output)
for pep in deferred:
print(constants.text_type(pep), file=output)
print(file=output)
print(u" Abandoned, Withdrawn, and Rejected PEPs", file=output)
print(file=output)
for pep in dead:
print(constants.text_type(pep), file=output)
print(file=output)
print(file=output)
print(u"Numerical Index", file=output)
print(file=output)
write_column_headers(output)
prev_pep = 0 prev_pep = 0
for pep in peps: for pep in peps:
if pep.number - prev_pep > 1: if pep.number - prev_pep > 1:
print(file=output) print(file=output)
print(constants.text_type(pep), file=output) print(constants.text_type(pep), file=output)
prev_pep = pep.number prev_pep = pep.number
print(constants.table_separator, file=output)
print(file=output) print(file=output)
print(file=output) # Reserved PEP numbers
print(u'Reserved PEP Numbers', file=output) emit_title('Reserved PEP Numbers', "reserved", output)
print(file=output) emit_column_headers(output)
write_column_headers(output)
for number, claimants in sorted(RESERVED): for number, claimants in sorted(RESERVED):
print(constants.column_format % { print(constants.column_format % {
'type': '', 'type': '.',
'status': '', 'status': '.',
'number': number, 'number': number,
'title': 'RESERVED', 'title': 'RESERVED',
'authors': claimants, 'authors': claimants,
}, file=output) }, file=output)
print(constants.table_separator, file=output)
print(file=output) print(file=output)
print(file=output) # PEP types key
print(u"Key", file=output) emit_title("PEP Types Key", "type-key", output)
print(file=output) for type_ in sorted(PEP.type_values):
for type_ in PEP.type_values:
print(u" %s - %s PEP" % (type_[0], type_), file=output) print(u" %s - %s PEP" % (type_[0], type_), file=output)
print(file=output)
print(file=output) print(file=output)
for status in PEP.status_values: # PEP status key
print(u" %s - %s proposal" % (status[0], status), file=output) emit_title("PEP Status Key", "status-key", output)
for status in sorted(PEP.status_values):
# Draft PEPs have no status displayed, Active shares a key with Accepted
if status in ("Active", "Draft"):
continue
if status == "Accepted":
msg = " A - Accepted (Standards Track only) or Active proposal"
else:
msg = " {status[0]} - {status} proposal".format(status=status)
print(msg, file=output)
print(file=output)
print(file=output) print(file=output)
print(file=output) # PEP owners
print(u"Owners", file=output) emit_title("Authors/Owners", "authors", output)
print(file=output)
authors_dict = verify_email_addresses(peps) authors_dict = verify_email_addresses(peps)
max_name = max(authors_dict.keys(), key=normalized_last_first) max_name = max(authors_dict.keys(), key=normalized_last_first)
max_name_len = len(max_name.last_first) max_name_len = len(max_name.last_first)
print(u" %s %s" % ('name'.ljust(max_name_len), 'email address'), file=output) author_table_separator = "="*max_name_len + " " + "="*len("email address")
print(u" %s %s" % ((len('name')*'-').ljust(max_name_len), print(author_table_separator, file=output)
len('email address')*'-'), file=output) _author_header_fmt = "{name:{max_name_len}} Email Address"
print(_author_header_fmt.format(name="Name", max_name_len=max_name_len), file=output)
print(author_table_separator, file=output)
sorted_authors = sort_authors(authors_dict) sorted_authors = sort_authors(authors_dict)
_author_fmt = "{author.last_first:{max_name_len}} {author_email}"
for author in sorted_authors: for author in sorted_authors:
# Use the email from authors_dict instead of the one from 'author' as # Use the email from authors_dict instead of the one from 'author' as
# the author instance may have an empty email. # the author instance may have an empty email.
print((u" %s %s" % _entry = _author_fmt.format(
(author.last_first.ljust(max_name_len), authors_dict[author])), file=output) author=author,
author_email=authors_dict[author],
max_name_len=max_name_len,
)
print(_entry, file=output)
print(author_table_separator, file=output)
print(file=output) print(file=output)
print(file=output) print(file=output)
print(u"References", file=output) # References for introduction footnotes
print(file=output) emit_title("References", "references", output)
print(constants.references, file=output) print(constants.references, file=output)
print(constants.footer, file=output) print(constants.footer, file=output)
View File
@ -99,11 +99,11 @@ class Author(object):
name_parts = self.last.split() name_parts = self.last.split()
for index, part in enumerate(name_parts): for index, part in enumerate(name_parts):
if part[0].isupper(): if part[0].isupper():
base = u' '.join(name_parts[index:]).lower()
break break
else: else:
raise ValueError("last name missing a capital letter: %r" # If no capitals, use the whole string
% name_parts) base = self.last.lower()
base = u' '.join(name_parts[index:]).lower()
return unicodedata.normalize('NFKD', base).encode('ASCII', 'ignore') return unicodedata.normalize('NFKD', base).encode('ASCII', 'ignore')
def _last_name(self, full_name): def _last_name(self, full_name):
@ -169,7 +169,8 @@ class PEP(object):
type_values = (u"Standards Track", u"Informational", u"Process") type_values = (u"Standards Track", u"Informational", u"Process")
# Valid values for the Status header. # Valid values for the Status header.
# Active PEPs can only be for Informational or Process PEPs. # Active PEPs can only be for Informational or Process PEPs.
status_values = (u"Accepted", u"Rejected", u"Withdrawn", u"Deferred", status_values = (u"Accepted", u"Provisional",
u"Rejected", u"Withdrawn", u"Deferred",
u"Final", u"Active", u"Draft", u"Superseded") u"Final", u"Active", u"Draft", u"Superseded")
def __init__(self, pep_file): def __init__(self, pep_file):
@ -229,6 +230,11 @@ class PEP(object):
raise PEPError("Only Process and Informational PEPs may " raise PEPError("Only Process and Informational PEPs may "
"have an Active status", pep_file.name, "have an Active status", pep_file.name,
self.number) self.number)
# Special case for Provisional PEPs.
if (status == u"Provisional" and self.type_ != "Standards Track"):
raise PEPError("Only Standards Track PEPs may "
"have a Provisional status", pep_file.name,
self.number)
self.status = status self.status = status
# 'Author'. # 'Author'.
authors_and_emails = self._parse_author(metadata['Author']) authors_and_emails = self._parse_author(metadata['Author'])
View File
@ -235,7 +235,7 @@ def fixfile(inpath, input_lines, outfile):
else: else:
mailtos.append(part) mailtos.append(part)
v = COMMASPACE.join(mailtos) v = COMMASPACE.join(mailtos)
elif k.lower() in ('replaces', 'replaced-by', 'requires'): elif k.lower() in ('replaces', 'superseded-by', 'requires'):
otherpeps = '' otherpeps = ''
for otherpep in re.split(',?\s+', v): for otherpep in re.split(',?\s+', v):
otherpep = int(otherpep) otherpep = int(otherpep)
@ -296,7 +296,7 @@ def fixfile(inpath, input_lines, outfile):
print(re.sub( print(re.sub(
parts[1], parts[1],
'<a href="%s">%s</a>' % (url, parts[1]), '<a href="%s">%s</a>' % (url, parts[1]),
line, 1), end=' ', file=outfile) line, 1), end='', file=outfile)
continue continue
elif parts and '@' in parts[-1]: elif parts and '@' in parts[-1]:
# This is a pep email address line, so filter it. # This is a pep email address line, so filter it.
@ -305,7 +305,7 @@ def fixfile(inpath, input_lines, outfile):
print('<pre>', file=outfile) print('<pre>', file=outfile)
need_pre = 0 need_pre = 0
print(re.sub( print(re.sub(
parts[-1], url, line, 1), end=' ', file=outfile) parts[-1], url, line, 1), end='', file=outfile)
continue continue
line = fixpat.sub(lambda x, c=inpath: fixanchor(c, x), line) line = fixpat.sub(lambda x, c=inpath: fixanchor(c, x), line)
if need_pre: if need_pre:
@ -409,7 +409,7 @@ class PEPHeaders(Transform):
for node in para: for node in para:
if isinstance(node, nodes.reference): if isinstance(node, nodes.reference):
node.replace_self(peps.mask_email(node, pep)) node.replace_self(peps.mask_email(node, pep))
elif name in ('replaces', 'replaced-by', 'requires'): elif name in ('replaces', 'superseded-by', 'requires'):
newbody = [] newbody = []
space = nodes.Text(' ') space = nodes.Text(' ')
for refpep in re.split(r',?\s+', body.astext()): for refpep in re.split(r',?\s+', body.astext()):
View File
@ -224,7 +224,7 @@ def fixfile(inpath, input_lines, outfile):
else: else:
mailtos.append(part) mailtos.append(part)
v = COMMASPACE.join(mailtos) v = COMMASPACE.join(mailtos)
elif k.lower() in ('replaces', 'replaced-by', 'requires'): elif k.lower() in ('replaces', 'superseded-by', 'requires'):
otherpeps = '' otherpeps = ''
for otherpep in re.split(',?\s+', v): for otherpep in re.split(',?\s+', v):
otherpep = int(otherpep) otherpep = int(otherpep)
View File
@ -1,22 +1,23 @@
#!/usr/bin/env python #!/usr/bin/env python3
# usage: pep-hook.py $REPOS $REV # usage: pep-hook.py $REPOS $REV
# (standard post-commit args) # (standard post-commit args)
import os, glob, time, datetime, stat, re, sys import os, glob, time, datetime, stat, re, sys
import codecs
import PyRSS2Gen as rssgen import PyRSS2Gen as rssgen
RSS_PATH = os.path.join(sys.argv[1], 'peps.rss') RSS_PATH = os.path.join(sys.argv[1], 'peps.rss')
def firstline_startingwith(full_path, text): def firstline_startingwith(full_path, text):
for line in codecs.open(full_path, encoding="utf-8"): for line in open(full_path, encoding="utf-8"):
if line.startswith(text): if line.startswith(text):
return line[len(text):].strip() return line[len(text):].strip()
return None return None
# get list of peps with creation time (from "Created:" string in pep .txt) # get list of peps with creation time
# (from "Created:" string in pep .rst or .txt)
peps = glob.glob('pep-*.txt') peps = glob.glob('pep-*.txt')
peps.extend(glob.glob('pep-*.rst'))
def pep_creation_dt(full_path): def pep_creation_dt(full_path):
created_str = firstline_startingwith(full_path, 'Created:') created_str = firstline_startingwith(full_path, 'Created:')
# bleh, I was hoping to avoid re but some PEPs editorialize # bleh, I was hoping to avoid re but some PEPs editorialize
@ -69,5 +70,5 @@ rss = rssgen.RSS2(
lastBuildDate = datetime.datetime.now(), lastBuildDate = datetime.datetime.now(),
items = items) items = items)
with open(RSS_PATH, 'w') as fp: with open(RSS_PATH, 'w', encoding="utf-8") as fp:
fp.write(rss.to_xml()) fp.write(rss.to_xml(encoding="utf-8"))