Merge branch 'master' of github.com:python/peps

Larry Hastings 2018-07-19 19:36:00 -07:00
commit 791464129f
84 changed files with 9249 additions and 2243 deletions

.github/PULL_REQUEST_TEMPLATE.md vendored Normal file (+11)

@ -0,0 +1,11 @@
<!--
Please include the PEP number in the pull request title, example:
PEP NNN: Summary of the changes made
In addition, please sign the CLA.
For more information, please read our Contributing Guidelines (CONTRIBUTING.rst)
-->

.gitignore vendored (+1)

@ -1,4 +1,5 @@
pep-0000.txt
pep-0000.rst
pep-????.html
__pycache__
*.pyc

.travis.yml

@ -1,6 +1,6 @@
language: python
python:
- 3.5
- "3.7-dev"
sudo: false
cache: pip

CODE_OF_CONDUCT.rst Normal file (+13)

@ -0,0 +1,13 @@
Code of Conduct
===============
Please note that all interactions on
`Python Software Foundation <https://www.python.org/psf-landing/>`__-supported
infrastructure are `covered
<https://www.python.org/psf/records/board/minutes/2014-01-06/#management-of-the-psfs-web-properties>`__
by the `PSF Code of Conduct <https://www.python.org/psf/codeofconduct/>`__,
which includes all infrastructure used in the development of Python itself
(e.g. mailing lists, issue trackers, GitHub, etc.).
In general this means everyone is expected to be open, considerate, and
respectful of others no matter what their position is within the project.

CONTRIBUTING.rst Normal file (+53)

@ -0,0 +1,53 @@
Contributing Guidelines
=======================
To learn more about the purpose of PEPs and how to go about writing a PEP, please
start reading at PEP 1 (`pep-0001.txt <./pep-0001.txt>`_ in this repo). Note that
PEP 0, the index PEP, is now automatically generated, and not committed to the repo.
Before writing a new PEP
------------------------
Has this idea been proposed on `python-ideas <https://mail.python.org/mailman/listinfo/python-ideas>`_
and received general acceptance as being an idea worth pursuing? (If not,
please start a discussion there before submitting a pull request.)
More details are in `PEP 1 <https://www.python.org/dev/peps/pep-0001/#start-with-an-idea-for-python>`_.
Do you have an implementation of your idea? (This is important for when you
propose the PEP to `python-dev <https://mail.python.org/mailman/listinfo/python-dev>`_,
as code maintenance is a critical aspect of all PEP proposals prior to a
final decision; in special circumstances an implementation can be deferred.)
Commit messages
---------------
When committing to a PEP, please always include the PEP number in the subject
title. For example, ``PEP NNN: <summary of changes>``.
Sign the CLA
------------
Before you hit "Create pull request", please take a moment to ensure that this
project can legally accept your contribution by verifying you have signed the
PSF Contributor Agreement:
https://www.python.org/psf/contrib/contrib-form/
If you haven't signed the CLA before, please follow the steps outlined in the
CPython devguide to do so:
https://devguide.python.org/pullrequest/#licensing
Thanks again for your contribution; we look forward to reviewing it!
Code of Conduct
---------------
All interactions for this project are covered by the
`PSF Code of Conduct <https://www.python.org/psf/codeofconduct/>`_. Everyone is
expected to be open, considerate, and respectful of others no matter their
position within the project.

Makefile

@ -17,11 +17,11 @@ PYTHON=python3
TARGETS= $(patsubst %.rst,%.html,$(wildcard pep-????.rst)) $(patsubst %.txt,%.html,$(wildcard pep-????.txt)) pep-0000.html
all: pep-0000.txt $(TARGETS)
all: pep-0000.rst $(TARGETS)
$(TARGETS): pep2html.py
pep-0000.txt: $(wildcard pep-????.txt) $(wildcard pep-????.rst) $(wildcard pep0/*.py)
pep-0000.rst: $(wildcard pep-????.txt) $(wildcard pep-????.rst) $(wildcard pep0/*.py) genpepindex.py
$(PYTHON) genpepindex.py .
rss:
@ -31,6 +31,7 @@ install:
echo "Installing is not necessary anymore. It will be done in post-commit."
clean:
-rm pep-0000.rst
-rm pep-0000.txt
-rm *.html

README.rst

@ -11,6 +11,12 @@ PEPs and how to go about writing a PEP, please start reading at PEP 1
now automatically generated, and not committed to the repo.
Contributing to PEPs
====================
See the `Contributing Guidelines <./CONTRIBUTING.rst>`_.
reStructuredText for PEPs
=========================
@ -26,12 +32,41 @@ package, which is available from `PyPI <http://pypi.python.org>`_.
If you have pip, ``pip install docutils`` should install it.
Generating HTML
===============
Generating the PEP Index
========================
PEP 0 is automatically generated based on the metadata headers in other
PEPs. The script handling this is ``genpepindex.py``, with supporting
libraries in the ``pep0`` directory.
Checking PEP formatting and rendering
=====================================
Do not commit changes with bad formatting. To check the formatting of
a PEP, use the Makefile. In particular, to generate HTML for PEP 999,
your source code should be in ``pep-0999.txt`` and the HTML will be
your source code should be in ``pep-0999.rst`` and the HTML will be
generated to ``pep-0999.html`` by the command ``make pep-0999.html``.
The default Make target generates HTML for all PEPs. If you don't have
Make, use the ``pep2html.py`` script.
The default Make target generates HTML for all PEPs.
If you don't have Make, use the ``pep2html.py`` script directly.
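If you have neither Make nor ``pep2html.py`` handy, the underlying
conversion is just docutils; a bare-bones sketch (this is *not* the full
``pep2html.py`` logic, which also deals with PEP headers and linking)::

    # Render a single reST PEP to HTML with docutils alone.
    from docutils.core import publish_file

    publish_file(source_path='pep-0999.rst',
                 destination_path='pep-0999.html',
                 writer_name='html')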
Generating HTML for python.org
==============================
python.org includes its own helper modules to render PEPs as HTML, with
suitable links back to the source pages in the version control repository.
These can be found at https://github.com/python/pythondotorg/tree/master/peps
When making changes to the PEP management process that may impact python.org's
rendering pipeline:
* Clone the python.org repository from https://github.com/python/pythondotorg/
* Get set up for local python.org development as per
https://pythondotorg.readthedocs.io/install.html#manual-setup
* Adjust ``PEP_REPO_PATH`` in ``pydotorg/settings/local.py`` to refer to your
local clone of the PEP repository
* Run ``./manage.py generate_pep_pages`` as described in
https://pythondotorg.readthedocs.io/pep_generation.html
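
The ``PEP_REPO_PATH`` adjustment itself is a single setting (the path shown
here is illustrative)::

    # pydotorg/settings/local.py
    PEP_REPO_PATH = '/home/user/src/peps'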

genpepindex.py

@ -36,7 +36,7 @@ def main(argv):
peps = []
if os.path.isdir(path):
for file_path in os.listdir(path):
if file_path == 'pep-0000.txt':
if file_path.startswith('pep-0000.'):
continue
abs_file_path = os.path.join(path, file_path)
if not os.path.isfile(abs_file_path):
@ -61,7 +61,7 @@ def main(argv):
else:
raise ValueError("argument must be a directory or file path")
with codecs.open('pep-0000.txt', 'w', encoding='UTF-8') as pep0_file:
with codecs.open('pep-0000.rst', 'w', encoding='UTF-8') as pep0_file:
write_pep0(peps, pep0_file)
if __name__ == "__main__":
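
Together with the Makefile change above, the generated index is now
``pep-0000.rst``. A sketch of regenerating it by hand, mirroring the
Makefile's ``$(PYTHON) genpepindex.py .`` rule (assumes the repository root
as the working directory)::

    import subprocess
    import sys

    # Rebuild the PEP index much as "make pep-0000.rst" would.
    subprocess.run([sys.executable, 'genpepindex.py', '.'], check=True)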

pep-0001-1.png (binary file removed; was 20 KiB; not shown)

pep-0001-process_flow.png Normal file (binary, 13 KiB; not shown)

pep-0001.txt

@ -245,7 +245,22 @@ Once a PEP has been accepted, the reference implementation must be
completed. When the reference implementation is complete and incorporated
into the main source code repository, the status will be changed to "Final".
A PEP can also be assigned status "Deferred". The PEP author or an
To allow gathering of additional design and interface feedback before committing
to long term stability for a language feature or standard library API, a PEP
may also be marked as "Provisional". This is short for "Provisionally Accepted",
and indicates that the proposal has been accepted for inclusion in the reference
implementation, but additional user feedback is needed before the full design
can be considered "Final". Unlike regular accepted PEPs, provisionally accepted
PEPs may still be Rejected or Withdrawn *even after the related changes have
been included in a Python release*.
Wherever possible, it is considered preferable to reduce the scope of a proposal
to avoid the need to rely on the "Provisional" status (e.g. by deferring some
features to later PEPs), as this status can lead to version compatibility
challenges in the wider Python ecosystem. PEP 411 provides additional details
on potential use cases for the Provisional status.
A PEP can also be assigned the status "Deferred". The PEP author or an
editor can assign the PEP this status when no progress is being made
on the PEP. Once a PEP is deferred, a PEP editor can re-assign it
to draft status.
@ -267,7 +282,17 @@ an API can replace version 1.
The possible paths of the status of PEPs are as follows:
.. image:: pep-0001-1.png
.. image:: pep-0001-process_flow.png
:alt: PEP process flow diagram
While not shown in the diagram, "Accepted" PEPs may technically move to
"Rejected" or "Withdrawn" even after acceptance. This will only occur if
the implementation process reveals fundamental flaws in the design that were
not noticed prior to acceptance of the PEP. Unlike Provisional PEPs, these
transitions are only permitted if the accepted proposal has *not* been included
in a Python release - released changes must instead go through the regular
deprecation process (which may require a new PEP providing the rationale for
the deprecation).
Some Informational and Process PEPs may also have a status of "Active"
if they are never meant to be completed. E.g. PEP 1 (this PEP).
@ -281,6 +306,11 @@ reached the Final state. Once a PEP has been completed, the Language and
Standard Library References become the formal documentation of the expected
behavior.
If changes based on implementation experience and user feedback are made to
Standards track PEPs while in the Accepted or Provisional State, those changes
should be noted in the PEP, such that the PEP accurately describes the state of
the implementation at the point where it is marked Final.
Informational and Process PEPs may be updated over time to reflect changes
to development practices and other details. The precise process followed in
these cases will depend on the nature and purpose of the PEP being updated.
@ -345,6 +375,15 @@ Each PEP should have the following parts:
appropriate for either the Python language reference or the
standard library reference.
9. How to Teach This -- For a PEP that adds new functionality or changes
language behavior, it is helpful to include a section on how to
teach users, new and experienced, how to apply the PEP to their
work.
This section may include key points and recommended documentation
changes that would help users adopt a new feature or migrate their
code to use a language change.
PEP Formats and Templates
=========================
@ -354,10 +393,8 @@ ReStructuredText_ allows for rich markup that is still quite easy to
read, but also results in good-looking and functional HTML. PEP 12
contains instructions and a template [4]_ for reStructuredText PEPs.
A Python script automatically converts PEPs to HTML for viewing on
the web [5]_. The conversion of reStructuredText PEPs is handled by
the Docutils_ module; the same script also renders a legacy plain-text
format of PEP internally, to support pre-reST documents.
The PEP text files are automatically converted to HTML [5]_ for easier
`online reading <https://www.python.org/dev/peps/>`__.
PEP Header Preamble
@ -372,7 +409,7 @@ optional and are described below. All other headers are required. ::
Author: <list of authors' real names and optionally, email addrs>
* BDFL-Delegate: <PEP czar's real name>
* Discussions-To: <email address>
Status: <Draft | Active | Accepted | Deferred | Rejected |
Status: <Draft | Active | Accepted | Provisional | Deferred | Rejected |
Withdrawn | Final | Superseded>
Type: <Standards Track | Informational | Process>
* Content-Type: <text/x-rst | text/plain>
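
Tools that validate the ``Status`` header gain one more permitted value with
this change; an illustrative sketch (not the actual ``pep0`` library code)::

    # Permitted Status values after this change (illustrative constant).
    VALID_STATUSES = {
        'Draft', 'Active', 'Accepted', 'Provisional', 'Deferred',
        'Rejected', 'Withdrawn', 'Final', 'Superseded',
    }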
@ -441,8 +478,8 @@ Standards Track PEPs will typically have a Python-Version header which
indicates the version of Python that the feature will be released with.
Standards Track PEPs without a Python-Version header indicate
interoperability standards that will initially be supported through
external libraries and tools, and then supplemented by a later PEP to
add support to the standard library. Informational and Process PEPs do
external libraries and tools, and then potentially supplemented by a later PEP
to add support to the standard library. Informational and Process PEPs do
not need a Python-Version header.
PEPs may have a Requires header, indicating the PEP numbers that this
@ -458,11 +495,15 @@ obsolete.
Auxiliary Files
===============
PEPs may include auxiliary files such as diagrams. Such files must be
PEPs may include auxiliary files such as diagrams. Such files should be
named ``pep-XXXX-Y.ext``, where "XXXX" is the PEP number, "Y" is a
serial number (starting at 1), and "ext" is replaced by the actual
file extension (e.g. "png").
Alternatively, all support files may be placed in a subdirectory called
``pep-XXXX``, where "XXXX" is the PEP number. When using a subdirectory, there
are no constraints on the names used in files.
Reporting PEP Bugs, or Submitting PEP Updates
=============================================
@ -472,15 +513,15 @@ factors, such as the maturity of the PEP, the preferences of the PEP
author, and the nature of your comments. For the early draft stages
of the PEP, it's probably best to send your comments and changes
directly to the PEP author. For more mature, or finished PEPs you may
want to submit corrections to the Python `issue tracker`_ so that your
changes don't get lost. If the PEP author is a Python developer, assign the
bug/patch to them, otherwise assign it to a PEP editor.
want to submit corrections as a `GitHub issue`_ or `GitHub pull request`_ so that
your changes don't get lost.
When in doubt about where to send your changes, please check first
with the PEP author and/or a PEP editor.
PEP authors with git push privileges for the PEP repository can update the
PEPs themselves by using "git push" to submit their changes.
PEPs themselves by using "git push" or the GitHub PR interface to submit their
changes.
Transferring PEP Ownership
@ -600,11 +641,9 @@ References and Footnotes
.. [4] PEP 12, Sample reStructuredText PEP Template, Goodger, Warsaw
(http://www.python.org/dev/peps/pep-0012)
.. [5] The script referred to here is pep2pyramid.py, the successor to
pep2html.py, both of which live in the same directory in the hg
repo as the PEPs themselves. Try ``pep2html.py --help`` for
details. The URL for viewing PEPs on the web is
http://www.python.org/dev/peps/.
.. [5] More details on the PEP rendering and publication process can be found
in the PEPs repo README at
https://github.com/python/peps/blob/master/README.rst
.. _issue tracker:
http://bugs.python.org/
@ -619,6 +658,8 @@ References and Footnotes
.. _`GitHub pull request`: https://github.com/python/peps/pulls
.. _`GitHub issue`: https://github.com/python/peps/issues
Copyright
=========

pep-0001/process_flow.svg Normal file (+580)

@ -0,0 +1,580 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:cc="http://creativecommons.org/ns#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:svg="http://www.w3.org/2000/svg"
xmlns="http://www.w3.org/2000/svg"
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
version="1.0"
width="518.000000pt"
height="230.000000pt"
viewBox="0 0 518.000000 230.000000"
preserveAspectRatio="xMidYMid meet"
id="svg3789"
sodipodi:docname="pep-0001-1.svg"
inkscape:version="0.92.2 (5c3e80d, 2017-08-06)">
<metadata
id="metadata3795">
<rdf:RDF>
<cc:Work
rdf:about="">
<dc:format>image/svg+xml</dc:format>
<dc:type
rdf:resource="http://purl.org/dc/dcmitype/StillImage" />
</cc:Work>
</rdf:RDF>
</metadata>
<defs
id="defs3793">
<marker
inkscape:stockid="Arrow1Mend"
orient="auto"
refY="0.0"
refX="0.0"
id="marker26466"
style="overflow:visible;"
inkscape:isstock="true">
<path
id="path26464"
d="M 0.0,0.0 L 5.0,-5.0 L -12.5,0.0 L 5.0,5.0 L 0.0,0.0 z "
style="fill-rule:evenodd;stroke:#000000;stroke-width:1pt;stroke-opacity:1;fill:#000000;fill-opacity:1"
transform="scale(0.4) rotate(180) translate(10,0)" />
</marker>
<marker
inkscape:isstock="true"
style="overflow:visible;"
id="marker24282"
refX="0.0"
refY="0.0"
orient="auto"
inkscape:stockid="Arrow1Mend"
inkscape:collect="always">
<path
transform="scale(0.4) rotate(180) translate(10,0)"
style="fill-rule:evenodd;stroke:#000000;stroke-width:1pt;stroke-opacity:1;fill:#000000;fill-opacity:1"
d="M 0.0,0.0 L 5.0,-5.0 L -12.5,0.0 L 5.0,5.0 L 0.0,0.0 z "
id="path24280" />
</marker>
<marker
inkscape:isstock="true"
style="overflow:visible;"
id="marker18063"
refX="0.0"
refY="0.0"
orient="auto"
inkscape:stockid="Arrow1Mend">
<path
transform="scale(0.4) rotate(180) translate(10,0)"
style="fill-rule:evenodd;stroke:#000000;stroke-width:1pt;stroke-opacity:1;fill:#000000;fill-opacity:1"
d="M 0.0,0.0 L 5.0,-5.0 L -12.5,0.0 L 5.0,5.0 L 0.0,0.0 z "
id="path18061" />
</marker>
<marker
inkscape:stockid="Arrow1Mend"
orient="auto"
refY="0.0"
refX="0.0"
id="marker16749"
style="overflow:visible;"
inkscape:isstock="true">
<path
id="path16747"
d="M 0.0,0.0 L 5.0,-5.0 L -12.5,0.0 L 5.0,5.0 L 0.0,0.0 z "
style="fill-rule:evenodd;stroke:#000000;stroke-width:1pt;stroke-opacity:1;fill:#000000;fill-opacity:1"
transform="scale(0.4) rotate(180) translate(10,0)" />
</marker>
<marker
inkscape:isstock="true"
style="overflow:visible;"
id="marker15177"
refX="0.0"
refY="0.0"
orient="auto"
inkscape:stockid="Arrow1Mend"
inkscape:collect="always">
<path
transform="scale(0.4) rotate(180) translate(10,0)"
style="fill-rule:evenodd;stroke:#000000;stroke-width:1pt;stroke-opacity:1;fill:#000000;fill-opacity:1"
d="M 0.0,0.0 L 5.0,-5.0 L -12.5,0.0 L 5.0,5.0 L 0.0,0.0 z "
id="path15175" />
</marker>
<marker
inkscape:stockid="Arrow1Mend"
orient="auto"
refY="0.0"
refX="0.0"
id="marker14679"
style="overflow:visible;"
inkscape:isstock="true"
inkscape:collect="always">
<path
id="path14677"
d="M 0.0,0.0 L 5.0,-5.0 L -12.5,0.0 L 5.0,5.0 L 0.0,0.0 z "
style="fill-rule:evenodd;stroke:#000000;stroke-width:1pt;stroke-opacity:1;fill:#000000;fill-opacity:1"
transform="scale(0.4) rotate(180) translate(10,0)" />
</marker>
<marker
inkscape:isstock="true"
style="overflow:visible;"
id="marker13779"
refX="0.0"
refY="0.0"
orient="auto"
inkscape:stockid="Arrow1Mend"
inkscape:collect="always">
<path
transform="scale(0.4) rotate(180) translate(10,0)"
style="fill-rule:evenodd;stroke:#000000;stroke-width:1pt;stroke-opacity:1;fill:#000000;fill-opacity:1"
d="M 0.0,0.0 L 5.0,-5.0 L -12.5,0.0 L 5.0,5.0 L 0.0,0.0 z "
id="path13777" />
</marker>
<marker
inkscape:stockid="Arrow1Mend"
orient="auto"
refY="0.0"
refX="0.0"
id="marker12309"
style="overflow:visible;"
inkscape:isstock="true"
inkscape:collect="always">
<path
id="path12307"
d="M 0.0,0.0 L 5.0,-5.0 L -12.5,0.0 L 5.0,5.0 L 0.0,0.0 z "
style="fill-rule:evenodd;stroke:#000000;stroke-width:1pt;stroke-opacity:1;fill:#000000;fill-opacity:1"
transform="scale(0.4) rotate(180) translate(10,0)" />
</marker>
<marker
inkscape:isstock="true"
style="overflow:visible;"
id="marker11613"
refX="0.0"
refY="0.0"
orient="auto"
inkscape:stockid="Arrow1Mend">
<path
transform="scale(0.4) rotate(180) translate(10,0)"
style="fill-rule:evenodd;stroke:#000000;stroke-width:1pt;stroke-opacity:1;fill:#000000;fill-opacity:1"
d="M 0.0,0.0 L 5.0,-5.0 L -12.5,0.0 L 5.0,5.0 L 0.0,0.0 z "
id="path11611" />
</marker>
<marker
inkscape:stockid="Arrow1Mend"
orient="auto"
refY="0.0"
refX="0.0"
id="marker9945"
style="overflow:visible;"
inkscape:isstock="true"
inkscape:collect="always">
<path
id="path9943"
d="M 0.0,0.0 L 5.0,-5.0 L -12.5,0.0 L 5.0,5.0 L 0.0,0.0 z "
style="fill-rule:evenodd;stroke:#000000;stroke-width:1pt;stroke-opacity:1;fill:#000000;fill-opacity:1"
transform="scale(0.4) rotate(180) translate(10,0)" />
</marker>
<marker
inkscape:isstock="true"
style="overflow:visible;"
id="marker5313"
refX="0.0"
refY="0.0"
orient="auto"
inkscape:stockid="Arrow1Mend"
inkscape:collect="always">
<path
transform="scale(0.4) rotate(180) translate(10,0)"
style="fill-rule:evenodd;stroke:#000000;stroke-width:1pt;stroke-opacity:1;fill:#000000;fill-opacity:1"
d="M 0.0,0.0 L 5.0,-5.0 L -12.5,0.0 L 5.0,5.0 L 0.0,0.0 z "
id="path5311" />
</marker>
<marker
inkscape:stockid="Arrow1Mend"
orient="auto"
refY="0.0"
refX="0.0"
id="Arrow1Mend"
style="overflow:visible;"
inkscape:isstock="true"
inkscape:collect="always">
<path
id="path4732"
d="M 0.0,0.0 L 5.0,-5.0 L -12.5,0.0 L 5.0,5.0 L 0.0,0.0 z "
style="fill-rule:evenodd;stroke:#000000;stroke-width:1pt;stroke-opacity:1;fill:#000000;fill-opacity:1"
transform="scale(0.4) rotate(180) translate(10,0)" />
</marker>
<marker
inkscape:stockid="Arrow1Lend"
orient="auto"
refY="0.0"
refX="0.0"
id="Arrow1Lend"
style="overflow:visible;"
inkscape:isstock="true">
<path
id="path4726"
d="M 0.0,0.0 L 5.0,-5.0 L -12.5,0.0 L 5.0,5.0 L 0.0,0.0 z "
style="fill-rule:evenodd;stroke:#000000;stroke-width:1pt;stroke-opacity:1;fill:#000000;fill-opacity:1"
transform="scale(0.8) rotate(180) translate(12.5,0)" />
</marker>
<marker
inkscape:stockid="Arrow2Lend"
orient="auto"
refY="0.0"
refX="0.0"
id="marker5033"
style="overflow:visible;"
inkscape:isstock="true">
<path
id="path5031"
style="fill-rule:evenodd;stroke-width:0.625;stroke-linejoin:round;stroke:#000000;stroke-opacity:1;fill:#000000;fill-opacity:1"
d="M 8.7185878,4.0337352 L -2.2072895,0.016013256 L 8.7185884,-4.0017078 C 6.9730900,-1.6296469 6.9831476,1.6157441 8.7185878,4.0337352 z "
transform="scale(1.1) rotate(180) translate(1,0)" />
</marker>
<marker
inkscape:stockid="Arrow1Lstart"
orient="auto"
refY="0.0"
refX="0.0"
id="Arrow1Lstart"
style="overflow:visible"
inkscape:isstock="true">
<path
id="path4723"
d="M 0.0,0.0 L 5.0,-5.0 L -12.5,0.0 L 5.0,5.0 L 0.0,0.0 z "
style="fill-rule:evenodd;stroke:#000000;stroke-width:1pt;stroke-opacity:1;fill:#000000;fill-opacity:1"
transform="scale(0.8) translate(12.5,0)" />
</marker>
<marker
inkscape:stockid="Arrow2Lend"
orient="auto"
refY="0.0"
refX="0.0"
id="Arrow2Lend"
style="overflow:visible;"
inkscape:isstock="true">
<path
id="path4744"
style="fill-rule:evenodd;stroke-width:0.625;stroke-linejoin:round;stroke:#000000;stroke-opacity:1;fill:#000000;fill-opacity:1"
d="M 8.7185878,4.0337352 L -2.2072895,0.016013256 L 8.7185884,-4.0017078 C 6.9730900,-1.6296469 6.9831476,1.6157441 8.7185878,4.0337352 z "
transform="scale(1.1) rotate(180) translate(1,0)" />
</marker>
</defs>
<sodipodi:namedview
pagecolor="#ffffff"
bordercolor="#666666"
borderopacity="1"
objecttolerance="10"
gridtolerance="10"
guidetolerance="10"
inkscape:pageopacity="0"
inkscape:pageshadow="2"
inkscape:window-width="2560"
inkscape:window-height="1376"
id="namedview3791"
showgrid="false"
inkscape:zoom="1.7126796"
inkscape:cx="356.07349"
inkscape:cy="132.46132"
inkscape:window-x="3200"
inkscape:window-y="0"
inkscape:window-maximized="1"
inkscape:current-layer="svg3789"
showguides="true"
inkscape:guide-bbox="true">
<sodipodi:guide
position="10.509846,117.79787"
orientation="1,0"
id="guide4717"
inkscape:locked="false" />
<sodipodi:guide
position="88.019964,219.39306"
orientation="0,1"
id="guide4719"
inkscape:locked="false" />
<sodipodi:guide
position="416.89059,151.51696"
orientation="0,1"
id="guide21702"
inkscape:locked="false" />
<sodipodi:guide
position="219.83096,105.97429"
orientation="0,1"
id="guide21704"
inkscape:locked="false" />
<sodipodi:guide
position="254.42587,56.052518"
orientation="0,1"
id="guide21706"
inkscape:locked="false" />
</sodipodi:namedview>
<g
id="g4690"
transform="translate(-95.026522,-3.0384519)">
<rect
id="rect4612"
width="127"
height="37"
x="194.48441"
y="61.797592"
style="fill:none;stroke:#000000;stroke-width:1.125;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1" />
<text
id="text4622"
y="88.499252"
x="198.61234"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:22px;line-height:125%;font-family:sans-serif;-inkscape-font-specification:'sans-serif, Normal';font-variant-ligatures:normal;font-variant-caps:normal;font-variant-numeric:normal;font-feature-settings:normal;text-align:start;letter-spacing:0px;word-spacing:0px;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.75px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1"
xml:space="preserve"><tspan
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:22px;font-family:sans-serif;-inkscape-font-specification:'sans-serif, Normal';font-variant-ligatures:normal;font-variant-caps:normal;font-variant-numeric:normal;font-feature-settings:normal;text-align:start;writing-mode:lr-tb;text-anchor:start;stroke-width:0.75px"
y="88.499252"
x="198.61234"
id="tspan4620"
sodipodi:role="line">Provisional</tspan></text>
</g>
<g
id="g4675"
transform="translate(-0.490154,-0.39305957)">
<rect
id="rect4606"
width="127"
height="37"
x="11"
y="11"
style="fill:none;stroke:#000000;stroke-width:1.125;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1;image-rendering:auto" />
<text
xml:space="preserve"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:22px;line-height:125%;font-family:sans-serif;-inkscape-font-specification:'sans-serif, Normal';font-variant-ligatures:normal;font-variant-caps:normal;font-variant-numeric:normal;font-feature-settings:normal;text-align:start;letter-spacing:0px;word-spacing:0px;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.75px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1"
x="45.952637"
y="37.70166"
id="text4630"><tspan
id="tspan4632"
sodipodi:role="line"
x="45.952637"
y="37.70166"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:22px;font-family:sans-serif;-inkscape-font-specification:'sans-serif, Normal';font-variant-ligatures:normal;font-variant-caps:normal;font-variant-numeric:normal;font-feature-settings:normal;text-align:start;writing-mode:lr-tb;text-anchor:start;stroke-width:0.75px">Draft</tspan></text>
</g>
<g
id="g4700"
transform="translate(26.712527,-17.52529)">
<rect
id="rect4616"
width="127"
height="37"
x="194.92232"
y="172.58888"
style="fill:none;stroke:#000000;stroke-width:1.125;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1" />
<text
xml:space="preserve"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:22px;line-height:125%;font-family:sans-serif;-inkscape-font-specification:'sans-serif, Normal';font-variant-ligatures:normal;font-variant-caps:normal;font-variant-numeric:normal;font-feature-settings:normal;text-align:start;letter-spacing:0px;word-spacing:0px;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.75px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1"
x="199.76997"
y="199.29054"
id="text4638"><tspan
sodipodi:role="line"
id="tspan4636"
x="199.76997"
y="199.29054"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:22px;font-family:sans-serif;-inkscape-font-specification:'sans-serif, Normal';font-variant-ligatures:normal;font-variant-caps:normal;font-variant-numeric:normal;font-feature-settings:normal;text-align:start;writing-mode:lr-tb;text-anchor:start;stroke-width:0.75px">Withdrawn</tspan></text>
</g>
<g
id="g4695"
transform="translate(-20.143873,-6.5596308)">
<rect
style="fill:none;stroke:#000000;stroke-width:1.125;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
y="113.471"
x="180.90918"
height="37"
width="127"
id="rect4614" />
<text
id="text4642"
y="138.04034"
x="196.46143"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:22px;line-height:125%;font-family:sans-serif;-inkscape-font-specification:'sans-serif, Normal';font-variant-ligatures:normal;font-variant-caps:normal;font-variant-numeric:normal;font-feature-settings:normal;text-align:start;letter-spacing:0px;word-spacing:0px;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.75px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1"
xml:space="preserve"><tspan
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:22px;font-family:sans-serif;-inkscape-font-specification:'sans-serif, Normal';font-variant-ligatures:normal;font-variant-caps:normal;font-variant-numeric:normal;font-feature-settings:normal;text-align:start;writing-mode:lr-tb;text-anchor:start;stroke-width:0.75px"
y="138.04034"
x="196.46143"
id="tspan4640"
sodipodi:role="line">Rejected</tspan></text>
</g>
<g
id="g4710"
transform="translate(2.9753917,-0.39303668)">
<rect
style="fill:none;stroke:#000000;stroke-width:1.125;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
y="11"
x="371"
height="37"
width="127"
id="rect3797" />
<text
xml:space="preserve"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:22px;line-height:125%;font-family:sans-serif;-inkscape-font-specification:'sans-serif, Normal';font-variant-ligatures:normal;font-variant-caps:normal;font-variant-numeric:normal;font-feature-settings:normal;text-align:start;letter-spacing:0px;word-spacing:0px;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.75px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1"
x="409.07324"
y="37.70166"
id="text4646"><tspan
sodipodi:role="line"
id="tspan4644"
x="409.07324"
y="37.70166"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:22px;font-family:sans-serif;-inkscape-font-specification:'sans-serif, Normal';font-variant-ligatures:normal;font-variant-caps:normal;font-variant-numeric:normal;font-feature-settings:normal;text-align:start;writing-mode:lr-tb;text-anchor:start;stroke-width:0.75px">Final</tspan></text>
</g>
<g
id="g4685"
transform="translate(-1.7850301,-1.7067669)">
<rect
style="fill:none;stroke:#000000;stroke-width:1.125;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
y="12.31373"
x="192.29486"
height="37"
width="127"
id="rect4610" />
<text
id="text4650"
y="36.883068"
x="205.44623"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:22px;line-height:125%;font-family:sans-serif;-inkscape-font-specification:'sans-serif, Normal';font-variant-ligatures:normal;font-variant-caps:normal;font-variant-numeric:normal;font-feature-settings:normal;text-align:start;letter-spacing:0px;word-spacing:0px;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.75px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1"
xml:space="preserve"><tspan
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:22px;font-family:sans-serif;-inkscape-font-specification:'sans-serif, Normal';font-variant-ligatures:normal;font-variant-caps:normal;font-variant-numeric:normal;font-feature-settings:normal;text-align:start;writing-mode:lr-tb;text-anchor:start;stroke-width:0.75px"
y="36.883068"
x="205.44623"
id="tspan4648"
sodipodi:role="line">Accepted</tspan></text>
</g>
<g
id="g4680"
transform="translate(-0.490154,12.699399)">
<rect
style="fill:none;stroke:#000000;stroke-width:1.125;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
y="173"
x="11"
height="37"
width="127"
id="rect4608" />
<text
xml:space="preserve"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:22px;line-height:125%;font-family:sans-serif;-inkscape-font-specification:'sans-serif, Normal';font-variant-ligatures:normal;font-variant-caps:normal;font-variant-numeric:normal;font-feature-settings:normal;text-align:start;letter-spacing:0px;word-spacing:0px;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.75px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1"
x="26.165527"
y="199.70166"
id="text4654"><tspan
sodipodi:role="line"
id="tspan4652"
x="26.165527"
y="199.70166"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:22px;font-family:sans-serif;-inkscape-font-specification:'sans-serif, Normal';font-variant-ligatures:normal;font-variant-caps:normal;font-variant-numeric:normal;font-feature-settings:normal;text-align:start;writing-mode:lr-tb;text-anchor:start;stroke-width:0.75px">Deferred</tspan></text>
</g>
<flowRoot
xml:space="preserve"
id="flowRoot4656"
style="fill:black;stroke:none;stroke-opacity:1;stroke-width:1px;stroke-linejoin:miter;stroke-linecap:butt;fill-opacity:1;font-family:sans-serif;font-style:normal;font-weight:normal;font-size:40px;line-height:125%;letter-spacing:0px;word-spacing:0px"><flowRegion
id="flowRegion4658"><rect
id="rect4660"
width="8.7582054"
height="81.743256"
x="99.843544"
y="115.73779" /></flowRegion><flowPara
id="flowPara4662" /></flowRoot> <g
id="g4715"
transform="translate(2.9753917,12.699399)">
<rect
id="rect4604"
width="127"
height="37"
x="371"
y="173"
style="fill:none;stroke:#000000;stroke-width:1.125;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1" />
<text
xml:space="preserve"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:22px;line-height:125%;font-family:sans-serif;-inkscape-font-specification:'sans-serif, Normal';font-variant-ligatures:normal;font-variant-caps:normal;font-variant-numeric:normal;font-feature-settings:normal;text-align:start;letter-spacing:0px;word-spacing:0px;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.75px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1"
x="400.96826"
y="199.70166"
id="text4666"><tspan
sodipodi:role="line"
id="tspan4664"
x="400.96826"
y="199.70166"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:22px;font-family:sans-serif;-inkscape-font-specification:'sans-serif, Normal';font-variant-ligatures:normal;font-variant-caps:normal;font-variant-numeric:normal;font-feature-settings:normal;text-align:start;writing-mode:lr-tb;text-anchor:start;stroke-width:0.75px">Active</tspan></text>
</g>
<g
id="g4705"
transform="translate(0,7.4535255)">
<rect
style="fill:none;stroke:#000000;stroke-width:1.125;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1"
y="90.699661"
x="374.46555"
height="37"
width="127"
id="rect4618" />
<text
id="text4670"
y="115.269"
x="387.37521"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:22px;line-height:125%;font-family:sans-serif;-inkscape-font-specification:'sans-serif, Normal';font-variant-ligatures:normal;font-variant-caps:normal;font-variant-numeric:normal;font-feature-settings:normal;text-align:start;letter-spacing:0px;word-spacing:0px;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.75px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1"
xml:space="preserve"><tspan
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:22px;font-family:sans-serif;-inkscape-font-specification:'sans-serif, Normal';font-variant-ligatures:normal;font-variant-caps:normal;font-variant-numeric:normal;font-feature-settings:normal;text-align:start;writing-mode:lr-tb;text-anchor:start;stroke-width:0.75px"
y="115.269"
x="387.37521"
id="tspan4668"
sodipodi:role="line">Replaced</tspan></text>
</g>
<path
style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1.5;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1;marker-end:url(#Arrow1Mend)"
d="M 17.515858,48.274259 V 183.04703"
id="path4721"
inkscape:connector-curvature="0"
sodipodi:nodetypes="cc" />
<path
sodipodi:nodetypes="cc"
inkscape:connector-curvature="0"
id="path5309"
d="M 26.545549,185.99111 V 49.504289"
style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1.5;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1;marker-end:url(#marker5313)" />
<path
style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1.5;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:3.00000001, 1.5;stroke-dashoffset:0;stroke-opacity:1;marker-end:url(#marker9945)"
d="M 439.22346,47.398438 V 96.65244"
id="path9941"
inkscape:connector-curvature="0"
sodipodi:nodetypes="cc" />
<path
style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1.5;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1;marker-end:url(#marker12309)"
d="m 137.13596,29.727733 h 52.04728"
id="path12305"
inkscape:connector-curvature="0"
sodipodi:nodetypes="cc" />
<path
sodipodi:nodetypes="cc"
inkscape:connector-curvature="0"
id="path13775"
d="M 318.43082,27.538182 H 370.4781"
style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1.5;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1;marker-end:url(#marker13779)" />
<path
style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1.5;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1;marker-end:url(#marker14679)"
d="M 69.843993,48.540525 V 77.774512 H 97.059211"
id="path14669"
inkscape:connector-curvature="0" />
<path
inkscape:connector-curvature="0"
id="path15173"
d="M 55.952177,47.433136 V 123.87614 H 158.23713"
style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1.5;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1;marker-end:url(#marker15177)" />
<path
style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1.5;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1;marker-end:url(#marker16749)"
d="M 42.113037,47.312202 V 173.81084 H 220.13598"
id="path16745"
inkscape:connector-curvature="0" />
<path
inkscape:connector-curvature="0"
id="path18059"
d="M 226.35002,69.110911 H 403.09174 V 50.174148"
style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1.5;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:none;stroke-opacity:1;marker-end:url(#marker18063)" />
<path
inkscape:connector-curvature="0"
id="path24278"
d="m 226.85286,85.449036 h 29.78208 v 20.112814"
style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1.5;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:3.00000001,1.5;stroke-dashoffset:0;stroke-opacity:1;marker-end:url(#marker24282)" />
<path
style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1.5;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-dasharray:3.00000001,1.5;stroke-dashoffset:0;stroke-opacity:1;marker-end:url(#marker26466)"
d="M 226.44001,77.758634 H 316.6304 V 153.39305"
id="path26462"
inkscape:connector-curvature="0" />
</svg>


pep-0008.txt

@ -68,7 +68,7 @@ Some other good reasons to ignore a particular guideline:
Python that don't support the feature recommended by the style guide.
Code lay-out
Code Lay-out
============
Indentation
@ -179,7 +179,6 @@ starts the multiline construct, as in::
'd', 'e', 'f',
)
Tabs or Spaces?
---------------
@ -198,7 +197,6 @@ the ``-t`` option, it issues warnings about code that illegally mixes
tabs and spaces. When using ``-tt`` these warnings become errors.
These options are highly recommended!
Maximum Line Length
-------------------
@ -249,8 +247,7 @@ Another such case is with ``assert`` statements.
Make sure to indent the continued line appropriately.
Should a line break before or after a binary operator?
Should a Line Break Before or After a Binary Operator?
------------------------------------------------------
For decades the recommended style was to break after binary operators.
@ -287,7 +284,6 @@ In Python code, it is permissible to break before or after a binary
operator, as long as the convention is consistent locally. For new
code Knuth's style is suggested.
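
A short sketch of Knuth's style, breaking before each operator (the
variable names are illustrative)::

    gross_wages, taxable_interest, ira_deduction = 50000, 120, 5500

    # Knuth's style: each operator starts its continuation line,
    # sitting next to the operand it applies to.
    income = (gross_wages
              + taxable_interest
              - ira_deduction)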
Blank Lines
-----------
@ -309,7 +305,6 @@ you may use them to separate pages of related sections of your file.
Note, some editors and web-based code viewers may not recognize
control-L as a form feed and will show another glyph in its place.
Source File Encoding
--------------------
@ -339,11 +334,10 @@ a transliteration of their names in this character set.
Open source projects with a global audience are encouraged to adopt a
similar policy.
Imports
-------
- Imports should usually be on separate lines, e.g.::
- Imports should usually be on separate lines::
Yes: import os
import sys
@ -359,9 +353,9 @@ Imports
Imports should be grouped in the following order:
1. standard library imports
2. related third party imports
3. local application/library specific imports
1. Standard library imports.
2. Related third party imports.
3. Local application/library specific imports.
You should put a blank line between each group of imports.
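
  For instance, a module header following this grouping might look like
  the following (the third-party and local names are illustrative)::

    # 1. Standard library imports.
    import os
    import sys

    # 2. Related third party imports.
    import requests

    # 3. Local application/library specific imports.
    from myapp import utils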
@ -393,7 +387,7 @@ Imports
from myclass import MyClass
from foo.bar.yourclass import YourClass
If this spelling causes local name clashes, then spell them ::
If this spelling causes local name clashes, then spell them explicitly::
import myclass
import foo.bar.yourclass
@ -412,8 +406,7 @@ Imports
When republishing names this way, the guidelines below regarding
public and internal interfaces still apply.
Module level dunder names
Module Level Dunder Names
-------------------------
Module level "dunders" (i.e. names with two leading and two trailing
@ -421,9 +414,7 @@ underscores) such as ``__all__``, ``__author__``, ``__version__``,
etc. should be placed after the module docstring but before any import
statements *except* ``from __future__`` imports. Python mandates that
future-imports must appear in the module before any other code except
docstrings.
For example::
docstrings::
"""This is the example module.
@ -524,7 +515,6 @@ Avoid extraneous whitespace in the following situations:
y = 2
long_variable = 3
Other Recommendations
---------------------
@ -642,7 +632,8 @@ Other Recommendations
if foo == 'blah': one(); two(); three()
When to use trailing commas
When to Use Trailing Commas
===========================
Trailing commas are usually optional, except they are mandatory when
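making a tuple of one element, for example::

    FILES = ('setup.cfg',)       # the trailing comma makes a one-element tuple
    NOT_A_TUPLE = ('setup.cfg')  # without it, just a parenthesized string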
@ -748,7 +739,7 @@ Conventions for writing good documentation strings
- PEP 257 describes good docstring conventions. Note that most
importantly, the ``"""`` that ends a multiline docstring should be
on a line by itself, e.g.::
on a line by itself::
"""Return a foobang
@ -882,13 +873,13 @@ Note that there is a separate convention for builtin names: most builtin
names are single words (or two words run together), with the CapWords
convention used only for exception names and builtin constants.
Type variable names
Type Variable Names
~~~~~~~~~~~~~~~~~~~
Names of type variables introduced in PEP 484 should normally use CapWords
preferring short names: ``T``, ``AnyStr``, ``Num``. It is recommended to add
suffixes ``_co`` or ``_contra`` to the variables used to declare covariant
or contravariant behavior correspondingly. Examples::
or contravariant behavior correspondingly::
from typing import TypeVar
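
  VT_co = TypeVar('VT_co', covariant=True)
  KT_contra = TypeVar('KT_contra', contravariant=True)
  # (The _co/_contra declarations as shown in the full PEP text.)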
@ -914,7 +905,7 @@ older convention of prefixing such globals with an underscore (which
you might want to do to indicate these globals are "module
non-public").
Function and variable names
Function and Variable Names
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Function names should be lowercase, with words separated by
@ -926,7 +917,7 @@ mixedCase is allowed only in contexts where that's already the
prevailing style (e.g. threading.py), to retain backwards
compatibility.
Function and method arguments
Function and Method Arguments
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Always use ``self`` for the first argument to instance methods.
@ -966,7 +957,7 @@ Constants are usually defined on a module level and written in all
capital letters with underscores separating words. Examples include
``MAX_OVERFLOW`` and ``TOTAL``.
Designing for inheritance
Designing for Inheritance
~~~~~~~~~~~~~~~~~~~~~~~~~
Always decide whether a class's methods and instance variables
@ -975,7 +966,7 @@ doubt, choose non-public; it's easier to make it public later than to
make a public attribute non-public.
Public attributes are those that you expect unrelated clients of your
class to use, with your commitment to avoid backward incompatible
class to use, with your commitment to avoid backwards incompatible
changes. Non-public attributes are those that are not intended to be
used by third parties; you make no guarantees that non-public
attributes won't change or even be removed.
@ -1041,8 +1032,7 @@ With this in mind, here are the Pythonic guidelines:
need to avoid accidental name clashes with potential use by
advanced callers.
Public and internal interfaces
Public and Internal Interfaces
------------------------------
Any backwards compatibility guarantees apply only to public interfaces.
@ -1180,9 +1170,7 @@ Programming Recommendations
continuation characters thanks to the containing parentheses.
- When catching exceptions, mention specific exceptions whenever
possible instead of using a bare ``except:`` clause.
For example, use::
possible instead of using a bare ``except:`` clause::
try:
import platform_specific_module
@ -1250,7 +1238,6 @@ Programming Recommendations
- Context managers should be invoked through separate functions or methods
whenever they do something other than acquire and release resources.
For example:
Yes::
@ -1301,14 +1288,13 @@ Programming Recommendations
- Use string methods instead of the string module.
String methods are always much faster and share the same API with
unicode strings. Override this rule if backward compatibility with
unicode strings. Override this rule if backwards compatibility with
Pythons older than 2.0 is required.
- Use ``''.startswith()`` and ``''.endswith()`` instead of string
slicing to check for prefixes or suffixes.
startswith() and endswith() are cleaner and less error prone. For
example::
startswith() and endswith() are cleaner and less error prone::
Yes: if foo.startswith('bar'):
No: if foo[:3] == 'bar':
@ -1328,7 +1314,7 @@ Programming Recommendations
Note that in Python 3, ``unicode`` and ``basestring`` no longer exist
(there is only ``str``) and a bytes object is no longer a kind of
string (it is a sequence of integers instead)
string (it is a sequence of integers instead).
- For sequences, (strings, lists, tuples), use the fact that empty
sequences are false. ::
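
  # (Example as given in the full PEP text.)
  Yes: if not seq:
       if seq:

  No:  if len(seq):
       if not len(seq):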
@ -1397,7 +1383,7 @@ annotations are changing.
can be added in the form of comments. See the relevant section of
PEP 484 [6]_.
Variable annotations
Variable Annotations
--------------------
PEP 526 introduced variable annotations. The style recommendations for them are
@ -1460,7 +1446,6 @@ References
https://www.python.org/dev/peps/pep-0484/#suggested-syntax-for-python-2-7-and-straddling-code
Copyright
=========

pep-0101.txt

@ -8,6 +8,7 @@ Type: Informational
Content-Type: text/x-rst
Created: 22-Aug-2001
Post-History:
Replaces: 102
Abstract
@ -158,7 +159,7 @@ to perform some manual editing steps.
- Check the stable buildbots.
Go to http://buildbot.python.org/all/waterfall
Go to http://buildbot.python.org/all/#/grid
Look at the buildbots for the release
you're making. Ignore any that are offline (or inform the community so

pep-0263.txt

@ -72,7 +72,7 @@ or::
More precisely, the first or second line must match the following
regular expression::
^[ \t\v]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)
^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)
The first group of this
expression is then interpreted as encoding name. If the encoding
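
A quick sketch of the corrected expression in use (note ``\f``, form feed,
replacing the earlier ``\v``)::

    import re

    # PEP 263's encoding-declaration pattern, with the [ \t\f] fix above.
    CODING_RE = re.compile(r'^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)')

    for line in ('# -*- coding: utf-8 -*-', '#!/usr/bin/env python'):
        match = CODING_RE.match(line)
        print(line, '->', match.group(1) if match else None)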

pep-0292.txt

@ -9,6 +9,7 @@ Content-Type: text/x-rst
Created: 18-Jun-2002
Python-Version: 2.4
Post-History: 18-Jun-2002, 23-Mar-2004, 22-Aug-2004
Replaces: 215
Abstract

pep-0304.txt

@ -9,6 +9,21 @@ Content-Type: text/x-rst
Created: 22-Jan-2003
Post-History: 27-Jan-2003, 31-Jan-2003, 17-Jun-2005
Historical Note
===============
While this original PEP was withdrawn, a variant of this feature
was eventually implemented for Python 3.8 in https://bugs.python.org/issue33499
Several of the issues and concerns originally raised in this PEP were resolved
by other changes in the intervening years:
- the introduction of isolated mode to handle potential security concerns
- the switch to ``importlib``, a fully import-hook based import system implementation
- PEP 3147's change in the bytecode cache layout to use ``__pycache__``
subdirectories, including the ``source_to_cache(path)`` and
``cache_to_source(path)`` APIs that allow the interpreter to automatically
handle the redirection to a separate cache directory
Abstract
========

pep-0345.txt

@ -4,7 +4,7 @@ Version: $Revision$
Last-Modified: $Date$
Author: Richard Jones <richard@python.org>
Discussions-To: Distutils SIG
Status: Accepted
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 28-Apr-2005

pep-0373.txt

@ -50,13 +50,12 @@ Maintenance releases
====================
Being the last of the 2.x series, 2.7 will have an extended period of
maintenance. The current plan is to support it for at least 10 years
from the initial 2.7 release. This means there will be bugfix releases
until 2020.
maintenance. Specifically, 2.7 will receive bugfix support until
January 1, 2020. All 2.7 development work will cease in 2020.
Planned future release dates:
- 2.7.15 2018
- 2.7.16 late 2018 - early 2019
Dates of previous maintenance releases:
@ -84,6 +83,8 @@ Dates of previous maintenance releases:
- 2.7.13 2016-12-17
- 2.7.14rc1 2017-08-26
- 2.7.14 2017-09-16
- 2.7.15rc1 2018-04-14
- 2.7.15 2018-05-01
2.7.0 Release Schedule
======================

pep-0376.txt

@ -3,7 +3,7 @@ Title: Database of Installed Python Distributions
Version: $Revision$
Last-Modified: $Date$
Author: Tarek Ziadé <tarek@ziade.org>
Status: Accepted
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 22-Feb-2009

pep-0378.txt

@ -144,7 +144,7 @@ specifiers like::
Java offers a `Decimal.Format Class`_ that uses picture patterns (one
for positive numbers and an optional one for negatives) such as:
``"#,##0.00;(#,##0.00)"``. It allows arbitrary groupings including
hundreds and ten-thousands and uneven groupings. The special patten
hundreds and ten-thousands and uneven groupings. The special pattern
characters are non-localized (using a DOT for a decimal separator and
a COMMA for a grouping separator). The user can supply an alternate
set of symbols using the formatter's *DecimalFormatSymbols* object.
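
For contrast with Java's localized picture patterns, the non-localized
feature this PEP added to Python is a single ``,`` option in format
specifiers::

    # PEP 378's thousands separator (Python 2.7+ / 3.1+).
    print(format(1234567.89, ',.2f'))   # -> 1,234,567.89
    print('{:,}'.format(1234567))       # -> 1,234,567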

pep-0387.txt

@ -52,7 +52,7 @@ be removed at any time in any way. These include:
- Function, class, module, attribute, method, and C-API names and types that
are prefixed by "_" (except special names). The contents of these
can also are not subject to the policy.
are also not subject to the policy.
- Inheritance patterns of internal classes.

pep-0394.txt

@ -4,12 +4,13 @@ Version: $Revision$
Last-Modified: $Date$
Author: Kerrick Staley <mail@kerrickstaley.com>,
Nick Coghlan <ncoghlan@gmail.com>,
Barry Warsaw <barry@python.org>
Barry Warsaw <barry@python.org>,
Petr Viktorin <encukou@gmail.com>
Status: Active
Type: Informational
Content-Type: text/x-rst
Created: 02-Mar-2011
Post-History: 04-Mar-2011, 20-Jul-2011, 16-Feb-2012, 30-Sep-2014
Post-History: 04-Mar-2011, 20-Jul-2011, 16-Feb-2012, 30-Sep-2014, 28-Apr-2018
Resolution: https://mail.python.org/pipermail/python-dev/2012-February/116594.html
@ -22,8 +23,9 @@ Python interpreter (i.e. the version invoked by the ``python`` command).
* ``python2`` will refer to some version of Python 2.x.
* ``python3`` will refer to some version of Python 3.x.
* for the time being, all distributions *should* ensure that ``python``
refers to the same target as ``python2``.
* for the time being, all distributions *should* ensure that ``python``,
if installed, refers to the same target as ``python2``, unless the user
deliberately overrides this or a virtual environment is active.
* however, end users should be aware that ``python`` refers to ``python3``
on at least Arch Linux (that change is what prompted the creation of this
PEP), so ``python`` should be used in the shebang line only for scripts
@ -43,8 +45,7 @@ Recommendation
* When invoked, ``python2`` should run some version of the Python 2
interpreter, and ``python3`` should run some version of the Python 3
interpreter.
* The more general ``python`` command should be installed whenever
any version of Python 2 is installed and should invoke the same version of
* If the ``python`` command is installed, it should invoke the same version of
Python as the ``python2`` command (however, note that some distributions
have already chosen to have ``python`` implement the ``python3``
command; see the `Rationale`_ and `Migration Notes`_ below).
@ -62,14 +63,30 @@ Recommendation
context.
* One exception to this is scripts that are deliberately written to be source
compatible with both Python 2.x and 3.x. Such scripts may continue to use
``python`` on their shebang line without affecting their portability.
``python`` on their shebang line.
* When packaging software that is source compatible with both versions,
distributions may change such ``python`` shebangs to ``python3``.
This ensures software is used with the latest version of
Python available, and it can remove a dependency on Python 2.
* When reinvoking the interpreter from a Python script, querying
``sys.executable`` to avoid hardcoded assumptions regarding the
interpreter location remains the preferred approach (see the sketch after
this list).
* In controlled environments aimed at expert users, where being explicit
is valued over user experience (for example, in test environments and
package build systems), distributions may choose to not provide the
``python`` command even if ``python2`` is available.
(All software in such a controlled environment must use ``python3`` or
``python2`` rather than ``python``, which means scripts that deliberately
use ``python`` need to be modified for such environments.)
* When a virtual environment (created by the PEP 405 ``venv`` package or a
similar tool) is active, the ``python`` command should refer to the
virtual environment's interpreter. In other words, activating a virtual
environment counts as deliberate user action to change the default
``python`` interpreter.
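
As a sketch of the ``sys.executable`` guidance above (the ``-c`` payload is
purely illustrative)::

    import subprocess
    import sys

    # Re-invoke the running interpreter instead of hardcoding "python";
    # this picks up the right version, including inside a virtual environment.
    subprocess.run([sys.executable, '-c', 'import sys; print(sys.version)'],
                   check=True)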
These recommendations are the outcome of the relevant python-dev discussions
in March and July 2011 ([1]_, [2]_), February 2012 ([4]_) and
September 2014 ([6]_).
in March and July 2011 ([1]_, [2]_), February 2012 ([4]_),
September 2014 ([6]_), and discussion on GitHub in April 2018 ([7]_).
Rationale
@ -91,11 +108,6 @@ on the part of distribution maintainers.
Future Changes to this Recommendation
=====================================
It is anticipated that there will eventually come a time where the third
party ecosystem surrounding Python 3 is sufficiently mature for this
recommendation to be updated to suggest that the ``python`` symlink
refer to ``python3`` rather than ``python2``.
This recommendation will be periodically reviewed over the next few years,
and updated when the core development team judges it appropriate. As a
point of reference, regular maintenance releases for the Python 2.7 series
@ -150,15 +162,13 @@ making such a change.
* When the ``pythonX.X`` binaries are provided by a distribution, the
``python2`` and ``python3`` commands should refer to one of those files
rather than being provided as a separate binary file.
* It is suggested that even distribution-specific packages follow the
``python2``/``python3`` convention, even in code that is not intended to
* It is strongly encouraged that distribution-specific packages use ``python2``
or ``python3`` rather than ``python``, even in code that is not intended to
operate on other distributions. This will reduce problems if the
distribution later decides to change the version of the Python interpreter
that the ``python`` command invokes, or if a sysadmin installs a custom
``python`` command with a different major version than the distribution
default. Distributions can test whether they are fully following this
convention by changing the ``python`` interpreter on a test box and checking
to see if anything breaks.
default.
* If the above point is adhered to and sysadmins are permitted to change the
``python`` command, then the ``python`` command should always be implemented
as a link to the interpreter binary (or a link to a link) and not vice
@ -267,6 +277,10 @@ References
.. [6] PEP 394 - Clarification of what "python" command should invoke
(https://mail.python.org/pipermail/python-dev/2014-September/136374.html)
.. [7] PEP 394: Allow the `python` command to not be installed, and other
minor edits
(https://github.com/python/peps/pull/630)
Copyright
===========
This document has been placed in the public domain.

pep-0411.txt

@ -4,7 +4,7 @@ Version: $Revision$
Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan@gmail.com>,
Eli Bendersky <eliben@gmail.com>
Status: Accepted
Status: Active
Type: Informational
Content-Type: text/x-rst
Created: 2012-02-10

View File

@ -4,7 +4,7 @@ Version: $Revision$
Last-Modified: 07-Aug-2012
Author: Daniel Holth <dholth@gmail.com>
BDFL-Delegate: Nick Coghlan <ncoghlan@gmail.com>
Status: Accepted
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 27-Jul-2012

View File

@ -7,7 +7,7 @@ Author: Nick Coghlan <ncoghlan@gmail.com>,
Donald Stufft <donald@stufft.io>
BDFL-Delegate: Donald Stufft <donald@stufft.io>
Discussions-To: Distutils SIG <distutils-sig@python.org>
Status: Deferred
Status: Withdrawn
Type: Informational
Content-Type: text/x-rst
Requires: 440, 508, 518
@ -18,6 +18,20 @@ Post-History: 14 Nov 2012, 5 Feb 2013, 7 Feb 2013, 9 Feb 2013,
Replaces: 345
PEP Withdrawal
==============
The ground-up metadata redesign proposed in this PEP has been withdrawn in
favour of the more modest proposal in PEP 566, which retains the basic
Key:Value format of previous metadata versions, but also defines a standardised
mechanism for translating that format to nested JSON-compatible data structures.
Some of the ideas in this PEP (or the related PEP 459) may still be considered
as part of later proposals, but they will be handled in a more incremental
fashion, rather than as a single large proposed change with no feasible
migration plan.
Abstract
========
@ -25,16 +39,13 @@ This PEP describes a mechanism for publishing and exchanging metadata
related to Python distributions. It includes specifics of the field names,
and their semantics and usage.
This document specifies version 3.0 of the metadata format.
This document specifies the never released version 2.0 of the metadata format.
Version 1.0 is specified in PEP 241.
Version 1.1 is specified in PEP 314.
Version 1.2 is specified in PEP 345.
Version 2.0 is specified in earlier drafts of this PEP and was never formally
approved for use.
Version 3.0 of the metadata format migrates from directly defining a
Version 2.0 of the metadata format proposed migrating from directly defining a
custom key-value file format to instead defining a JSON-compatible in-memory
representation that may be used to define metadata representation in other
contexts (such as API and archive format definitions).
@ -44,8 +55,8 @@ fields to be added for particular purposes without requiring updates to
the core metadata format.
Note on PEP Deferral
====================
Note on PEP History
===================
This PEP was initially deferred for an extended period, from December 2013
through to March 2017, as distutils-sig worked through a number of other
@ -74,7 +85,7 @@ of publishing and distributing software to be moved out to PEP 459, a separate
proposal for a number of standard metadata extensions that provide additional
optional information about a release.
As of September 2017, it has been deferred again, on the grounds that
As of September 2017, it was deferred again, on the grounds that
it doesn't actually help solve any particularly pressing problems:
- JSON representation would be better handled through defining a
@ -87,6 +98,9 @@ it doesn't actually help solve any particularly pressing problems:
.. _specifications: https://packaging.python.org/specifications/
.. _minor spec version update: https://mail.python.org/pipermail/distutils-sig/2017-September/031465.html
Finally, the PEP was withdrawn in February 2018 in favour of PEP 566 (which
pursues that more incremental strategy).
Purpose
=======
@ -391,7 +405,7 @@ binary archive from a source archive.
These locations are to be confirmed, since they depend on the definition
of sdist 2.0 and the revised installation database standard. There will
also be a wheel 1.1 format update after this PEP is approved that
mandates provision of 3.0+ metadata.
mandates provision of 2.0+ metadata.
Note that these metadata files MAY be processed even if the version of the
containing location is too low to indicate that they are valid. Specifically,
@ -414,7 +428,7 @@ used directly as a data input format. Generating the metadata as part of the
publication process also helps to deal with version specific fields (including
the source URL and the version field itself).
For backwards compatibility with older installation tools, metadata 3.0
For backwards compatibility with older installation tools, metadata 2.0
files MAY be distributed alongside legacy metadata.
Index servers MAY allow distributions to be uploaded and installation tools
@ -443,8 +457,8 @@ with RFC 3986.
The current version of the schema file covers the previous draft of the
PEP, and has not yet been updated for the split into the essential
dependency resolution metadata and multiple standard extensions, and nor
has it been updated for the various other differences between the 3.0
draft and the earlier 2.0 drafts.
has it been updated for the various other differences between the current
draft and the earlier drafts.
Core metadata
@ -467,7 +481,7 @@ installation to occur.
Metadata version
----------------
Version of the file format; ``"3.0"`` is the only legal value.
Version of the file format; ``"2.0"`` is the only legal value.
Automated tools consuming metadata SHOULD warn if ``metadata_version`` is
greater than the highest version they support, and MUST fail if
@ -481,7 +495,7 @@ all of the needed fields.
Example::
"metadata_version": "3.0"
"metadata_version": "2.0"
Generator
@ -1046,7 +1060,7 @@ Appendix A: Conversion notes for legacy metadata
================================================
The reference implementations for converting from legacy metadata to
metadata 3.0 are:
metadata 2.0 are:
* the `wheel project <https://bitbucket.org/dholth/wheel/overview>`__, which
adds the ``bdist_wheel`` command to ``setuptools``
@ -1114,7 +1128,7 @@ format.
Appendix C: Summary of differences from \PEP 345
=================================================
* Metadata-Version is now 3.0, with semantics specified for handling
* Metadata-Version is now 2.0, with semantics specified for handling
version changes
* The increasingly complex ad hoc "Key: Value" format has been replaced by
@ -1175,7 +1189,7 @@ provision of multiple versions of the metadata in parallel.
Existing tools won't abide by this guideline until they're updated to
support the new metadata standard, so the new semantics will first take
effect for a hypothetical 2.x -> 3.0 transition. For the 1.x -> 3.0
effect for a hypothetical 2.x -> 3.0 transition. For the 1.x -> 2.x
transition, we will use the approach where tools continue to produce the
existing supplementary files (such as ``entry_points.txt``) in addition
to any equivalents specified using the new features of the standard
@ -1283,7 +1297,7 @@ packages.
The ability to declare an extension as required is included primarily to
allow the definition of the metadata hooks extension to be deferred until
some time after the initial adoption of the metadata 3.0 specification. If
some time after the initial adoption of the metadata 2.0 specification. If
a release needs a ``postinstall`` hook to run in order to complete
the installation successfully, then earlier versions of tools should fall
back to installing from source rather than installing from a wheel file and
@ -1299,10 +1313,10 @@ order to better prioritise our efforts in migrating to the new metadata
standard. These all reflect information that may be nice to have in the
new metadata, but which can be readily added through metadata extensions or
in metadata 2.1 without breaking any use cases already supported by metadata
3.0.
2.0.
Once the ``pypi``, ``setuptools``, ``pip``, ``wheel`` and ``distlib``
projects support creation and consumption of metadata 3.0, then we may
projects support creation and consumption of metadata 2.0, then we may
revisit the creation of metadata 2.1 with some or all of these additional
features.
@ -1484,7 +1498,7 @@ the idea won't be reconsidered until metadata 2.1 at the earliest).
References
==========
This document specifies version 3.0 of the metadata format.
This document specifies version 2.0 of the metadata format.
Version 1.0 is specified in PEP 241.
Version 1.1 is specified in PEP 314.
Version 1.2 is specified in PEP 345.

View File

@ -5,7 +5,7 @@ Last-Modified: $Date$
Author: Daniel Holth <dholth@gmail.com>
BDFL-Delegate: Nick Coghlan <ncoghlan@gmail.com>
Discussions-To: <distutils-sig@python.org>
Status: Accepted
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 20-Sep-2012

View File

@ -61,13 +61,16 @@ The releases so far:
- 3.4.6 final: January 17, 2017
- 3.4.7 candidate 1: July 25, 2017
- 3.4.7 final: August 9, 2017
- 3.4.8 candidate 1: January 23, 2018
- 3.4.8 final: February 4, 2018
.. There are no currently planned releases of Python 3.4.
.. There are no specific plans for the next release of Python 3.4.
Planned future releases:
- 3.4.8 candidate 1: January 21, 2018
- 3.4.8 final: February 4, 2018
- 3.4.9 candidate 1: July 18, 2018
- 3.4.9 final: August 1, 2018
Features for 3.4

View File

@ -11,6 +11,7 @@ Content-Type: text/x-rst
Created: 2013-02-23
Python-Version: 3.4
Post-History: 2013-02-23, 2013-05-02
Replaces: 354
Resolution: https://mail.python.org/pipermail/python-dev/2013-May/126112.html

View File

@ -6,7 +6,7 @@ Author: Nick Coghlan <ncoghlan@gmail.com>,
Donald Stufft <donald@stufft.io>
BDFL-Delegate: Nick Coghlan <ncoghlan@gmail.com>
Discussions-To: Distutils SIG <distutils-sig@python.org>
Status: Accepted
Status: Active
Type: Informational
Content-Type: text/x-rst
Created: 18 Mar 2013

View File

@ -8,6 +8,7 @@ Type: Standards Track
Content-Type: text/x-rst
Created: 5-August-2013
Python-Version: 3.4
Replaces: 433
Abstract

View File

@ -5,7 +5,7 @@ Last-Modified: $Date$
Author: Donald Stufft <donald@stufft.io>
BDFL-Delegate: Richard Jones <richard@python.org>
Discussions-To: distutils-sig@python.org
Status: Accepted
Status: Final
Type: Process
Content-Type: text/x-rst
Created: 04-Aug-2013

View File

@ -5,7 +5,7 @@ Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan@gmail.com>
BDFL-Delegate: Nick Coghlan <ncoghlan@gmail.com>
Discussions-To: Distutils SIG <distutils-sig@python.org>
Status: Deferred
Status: Withdrawn
Type: Standards Track
Content-Type: text/x-rst
Requires: 426
@ -13,6 +13,17 @@ Created: 11 Nov 2013
Post-History: 21 Dec 2013
PEP Withdrawal
==============
This PEP depends on PEP 426, which has itself been withdrawn. See the
PEP Withdrawal section in that PEP for details.
In the meantime, metadata extensions will continue to be handled as they
have been for past examples like ``entry_points.txt``: as additional files
installed into metadata directories alongside the main `METADATA` file.
Abstract
========
@ -22,22 +33,6 @@ Like all metadata extensions, each standard extension format is
independently versioned. Changing any of the formats requires an update
to this PEP, but does not require an update to the core packaging metadata.
PEP Deferral
============
This PEP depends on PEP 426, which has itself been deferred. See the
PEP Deferral section in that PEP for details.
.. note::
These extensions may eventually be separated out into their own PEPs,
but we're already suffering from PEP overload in the packaging
metadata space.
This PEP was initially created by slicing out large sections of earlier
drafts of PEP 426 and making them extensions, so some of the specifics
may still be rough in the new context.
Standard Extension Namespace
============================

View File

@ -5,7 +5,7 @@ Last-Modified: $Date$
Author: Donald Stufft <donald@stufft.io>
BDFL-Delegate: Richard Jones <richard@python.org>
Discussions-To: distutils-sig@python.org
Status: Accepted
Status: Final
Type: Process
Content-Type: text/x-rst
Created: 02-Mar-2014

View File

@ -2,13 +2,13 @@ PEP: 467
Title: Minor API improvements for binary sequences
Version: $Revision$
Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan@gmail.com>
Author: Nick Coghlan <ncoghlan@gmail.com>, Ethan Furman <ethan@stoneleaf.us>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 2014-03-30
Python-Version: 3.5
Post-History: 2014-03-30 2014-08-15 2014-08-16
Python-Version: 3.8
Post-History: 2014-03-30 2014-08-15 2014-08-16 2016-06-07 2016-09-01
Abstract
@ -20,22 +20,25 @@ that is now referred to as ``bytearray``. Other aspects of operating in
the binary domain in Python have also evolved over the course of the Python
3 series.
This PEP proposes four small adjustments to the APIs of the ``bytes``,
``bytearray`` and ``memoryview`` types to make it easier to operate entirely
in the binary domain:
This PEP proposes five small adjustments to the APIs of the ``bytes`` and
``bytearray`` types to make it easier to operate entirely in the binary domain:
* Deprecate passing single integer values to ``bytes`` and ``bytearray``
* Add ``bytes.zeros`` and ``bytearray.zeros`` alternative constructors
* Add ``bytes.byte`` and ``bytearray.byte`` alternative constructors
* Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and
``memoryview.iterbytes`` alternative iterators
* Add ``bytes.fromsize`` and ``bytearray.fromsize`` alternative constructors
* Add ``bytes.fromord`` and ``bytearray.fromord`` alternative constructors
* Add ``bytes.getbyte`` and ``bytearray.getbyte`` byte retrieval methods
* Add ``bytes.iterbytes`` and ``bytearray.iterbytes`` alternative iterators
And one built-in::
* bchr
Proposals
=========
Deprecation of current "zero-initialised sequence" behaviour
------------------------------------------------------------
Deprecation of current "zero-initialised sequence" behaviour without removal
----------------------------------------------------------------------------
Currently, the ``bytes`` and ``bytearray`` constructors accept an integer
argument and interpret it as meaning to create a zero-initialised sequence
@ -46,62 +49,75 @@ of the given size::
>>> bytearray(3)
bytearray(b'\x00\x00\x00')
This PEP proposes to deprecate that behaviour in Python 3.5, and remove it
entirely in Python 3.6.
This PEP proposes to deprecate that behaviour in Python 3.6, but to leave
it in place for at least as long as Python 2.7 is supported, possibly
indefinitely.
No other changes are proposed to the existing constructors.
Addition of explicit "zero-initialised sequence" constructors
-------------------------------------------------------------
Addition of explicit "count and byte initialised sequence" constructors
-----------------------------------------------------------------------
To replace the deprecated behaviour, this PEP proposes the addition of an
explicit ``zeros`` alternative constructor as a class method on both
``bytes`` and ``bytearray``::
explicit ``fromsize`` alternative constructor as a class method on both
``bytes`` and ``bytearray`` whose first argument is the count, and whose
second argument is the fill byte to use (defaults to ``\x00``)::
>>> bytes.zeros(3)
>>> bytes.fromsize(3)
b'\x00\x00\x00'
>>> bytearray.zeros(3)
>>> bytearray.fromsize(3)
bytearray(b'\x00\x00\x00')
>>> bytes.fromsize(5, b'\x0a')
b'\x0a\x0a\x0a\x0a\x0a'
>>> bytearray.fromsize(5, b'\x0a')
bytearray(b'\x0a\x0a\x0a\x0a\x0a')
It will behave just as the current constructors behave when passed a single
integer.
The specific choice of ``zeros`` as the alternative constructor name is taken
from the corresponding initialisation function in NumPy (although, as these
are 1-dimensional sequence types rather than N-dimensional matrices, the
constructors take a length as input rather than a shape tuple)
``fromsize`` will behave just as the current constructors behave when passed a single
integer, while allowing for non-zero fill values when needed.
Addition of explicit "single byte" constructors
-----------------------------------------------
Addition of "bchr" function and explicit "single byte" constructors
-------------------------------------------------------------------
As binary counterparts to the text ``chr`` function, this PEP proposes the
addition of an explicit ``byte`` alternative constructor as a class method
on both ``bytes`` and ``bytearray``::
As binary counterparts to the text ``chr`` function, this PEP proposes
the addition of a ``bchr`` function and an explicit ``fromord`` alternative
constructor as a class method on both ``bytes`` and ``bytearray``::
>>> bytes.byte(3)
b'\x03'
>>> bytearray.byte(3)
bytearray(b'\x03')
>>> bchr(ord("A"))
b'A'
>>> bchr(ord(b"A"))
b'A'
>>> bytes.fromord(65)
b'A'
>>> bytearray.fromord(65)
bytearray(b'A')
These methods will only accept integers in the range 0 to 255 (inclusive)::
>>> bytes.byte(512)
>>> bytes.fromord(512)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: bytes must be in range(0, 256)
ValueError: integer must be in range(0, 256)
>>> bytes.byte(1.0)
>>> bytes.fromord(1.0)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'float' object cannot be interpreted as an integer
The documentation of the ``ord`` builtin will be updated to explicitly note
that ``bytes.byte`` is the inverse operation for binary data, while ``chr``
is the inverse operation for text data.
While this does create some duplication, there are valid reasons for it:
Behaviourally, ``bytes.byte(x)`` will be equivalent to the current
* the ``bchr`` builtin is to recreate the ord/chr/unichr trio from Python
2 under a different naming scheme
* the class method is mainly for the ``bytearray.fromord`` case, with
``bytes.fromord`` added for consistency
The documentation of the ``ord`` builtin will be updated to explicitly note
that ``bchr`` is the primary inverse operation for binary data, while ``chr``
is the inverse operation for text data, and that ``bytes.fromord`` and
``bytearray.fromord`` also exist.
Behaviourally, ``bytes.fromord(x)`` will be equivalent to the current
``bytes([x])`` (and similarly for ``bytearray``). The new spelling is
expected to be easier to discover and easier to read (especially when used
in conjunction with indexing operations on binary sequence types).
@ -110,35 +126,37 @@ As a separate method, the new spelling will also work better with higher
order functions like ``map``.
Addition of "getbyte" method to retrieve a single byte
------------------------------------------------------
This PEP proposes that ``bytes`` and ``bytearray`` gain the method ``getbyte``
which will always return ``bytes``::
>>> b'abc'.getbyte(0)
b'a'
If an index is asked for that doesn't exist, ``IndexError`` is raised::
>>> b'abc'.getbyte(9)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: index out of range
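For comparison, plain indexing returns an integer, while slicing is the
current spelling of the behaviour that ``getbyte`` would provide directly::

    >>> b'abc'[0]
    97
    >>> b'abc'[0:1]
    b'a'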
Addition of optimised iterator methods that produce ``bytes`` objects
---------------------------------------------------------------------
This PEP proposes that ``bytes``, ``bytearray`` and ``memoryview`` gain an
optimised ``iterbytes`` method that produces length 1 ``bytes`` objects
rather than integers::
This PEP proposes that ``bytes`` and ``bytearray`` gain an optimised
``iterbytes`` method that produces length 1 ``bytes`` objects rather than
integers::
for x in data.iterbytes():
# x is a length 1 ``bytes`` object, rather than an integer
The method can be used with arbitrary buffer exporting objects by wrapping
them in a ``memoryview`` instance first::
For example::
for x in memoryview(data).iterbytes():
# x is a length 1 ``bytes`` object, rather than an integer
For ``memoryview``, the semantics of ``iterbytes()`` are defined such that::
memview.tobytes() == b''.join(memview.iterbytes())
This allows the raw bytes of the memory view to be iterated over without
needing to make a copy, regardless of the defined shape and format.
The main advantage this method offers over the ``map(bytes.byte, data)``
approach is that it is guaranteed *not* to fail midstream with a
``ValueError`` or ``TypeError``. By contrast, when using the ``map`` based
approach, the type and value of the individual items in the iterable are
only checked as they are retrieved and passed through the ``bytes.byte``
constructor.
>>> tuple(b"ABC".iterbytes())
(b'A', b'B', b'C')
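Pending availability of the proposed method, roughly equivalent behaviour
can be obtained today by iterating over length 1 slices (an illustrative
workaround, not the proposed implementation)::

    >>> data = b'ABC'
    >>> tuple(data[i:i+1] for i in range(len(data)))
    (b'A', b'B', b'C')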
Design discussion
@ -163,10 +181,18 @@ This PEP isn't revisiting that original design decision, just changing the
spelling as users sometimes find the current behaviour of the binary sequence
constructors surprising. In particular, there's a reasonable case to be made
that ``bytes(x)`` (where ``x`` is an integer) should behave like the
``bytes.byte(x)`` proposal in this PEP. Providing both behaviours as separate
``bytes.fromord(x)`` proposal in this PEP. Providing both behaviours as separate
class methods avoids that ambiguity.
Open Questions
==============
Do we add ``iterbytes`` to ``memoryview``, or modify
``memoryview.cast()`` to accept ``'s'`` as a single-byte interpretation? Or
do we ignore memory for now and add it later?
References
==========
@ -180,19 +206,11 @@ References
(http://bugs.python.org/issue21644)
.. [5] August 2014 discussion thread on python-dev
(https://mail.python.org/pipermail/python-ideas/2014-March/027295.html)
.. [6] June 2016 discussion thread on python-dev
(https://mail.python.org/pipermail/python-dev/2016-June/144875.html)
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

View File

@ -5,7 +5,7 @@ Last-Modified: $Date$
Author: Donald Stufft <donald@stufft.io>
BDFL-Delegate: Paul Moore <p.f.moore@gmail.com>
Discussions-To: distutils-sig@python.org
Status: Accepted
Status: Final
Type: Process
Content-Type: text/x-rst
Created: 12-May-2014

View File

@ -57,13 +57,15 @@ The releases so far:
- 3.5.3 final: January 17, 2017
- 3.5.4 candidate 1: July 25, 2017
- 3.5.4 final: August 8, 2017
- 3.5.5 candidate 1: January 23, 2018
- 3.5.5 final: February 4, 2018
.. There are no currently planned releases for Python 3.5.
.. There are no specific plans for the next release of Python 3.5.
Planned future releases:
- 3.5.5 candidate 1: January 21, 2018
- 3.5.5 final: February 4, 2018
- 3.5.6 candidate 1: July 18, 2018
- 3.5.6 final: August 1, 2018

View File

@ -5,7 +5,7 @@ Last-Modified: $Date$
Author: Guido van Rossum <guido@python.org>, Jukka Lehtosalo <jukka.lehtosalo@iki.fi>, Łukasz Langa <lukasz@python.org>
BDFL-Delegate: Mark Shannon
Discussions-To: Python-Dev <python-dev@python.org>
Status: Accepted
Status: Provisional
Type: Standards Track
Content-Type: text/x-rst
Created: 29-Sep-2014
@ -342,7 +342,7 @@ Additionally, ``Any`` is a valid value for every type variable.
Consider the following::
def count_truthy(elements: List[Any]) -> int:
return sum(1 for elem in elements if element)
return sum(1 for elem in elements if elem)
This is equivalent to omitting the generic notation and just saying
``elements: List``.
@ -355,6 +355,7 @@ You can include a ``Generic`` base class to define a user-defined class
as generic. Example::
from typing import TypeVar, Generic
from logging import Logger
T = TypeVar('T')
@ -373,7 +374,7 @@ as generic. Example::
return self.value
def log(self, message: str) -> None:
self.logger.info('{}: {}'.format(self.name message))
self.logger.info('{}: {}'.format(self.name, message))
``Generic[T]`` as a base class defines that the class ``LoggedVar``
takes a single type parameter ``T``. This also makes ``T`` valid as
@ -582,9 +583,9 @@ argument(s) is substituted. Otherwise, ``Any`` is assumed. Example::
T = TypeVar('T')
class Node(Generic[T]):
x = None # type: T # Instance attribute (see below)
def __init__(self, label: T = None) -> None:
...
x = None # Type: T
x = Node('') # Inferred type is Node[str]
y = Node(0) # Inferred type is Node[int]
@ -983,15 +984,17 @@ for example, the above is equivalent to::
def handle_employee(e: Optional[Employee]) -> None: ...
An optional type is also automatically assumed when the default value is
``None``, for example::
A past version of this PEP allowed type checkers to assume an optional
type when the default value is ``None``, as in this code::
def handle_employee(e: Employee = None): ...
This is equivalent to::
This would have been treated as equivalent to::
def handle_employee(e: Optional[Employee] = None) -> None: ...
This is no longer the recommended behavior. Type checkers should move
towards requiring the optional type to be made explicit.
Support for singleton types in unions
-------------------------------------
@ -1367,11 +1370,12 @@ Positional-only arguments
Some functions are designed to take their arguments only positionally,
and expect their callers never to use the argument's name to provide
that argument by keyword. All arguments with names beginning with
``__`` are assumed to be positional-only::
``__`` are assumed to be positional-only, except if their names also
end with ``__``::
def quux(__x: int) -> None: ...
def quux(__x: int, __y__: int = 0) -> None: ...
quux(3) # This call is fine.
quux(3, __y__=1) # This call is fine.
quux(__x=3) # This call is an error.
@ -1409,7 +1413,7 @@ for example::
c = None # type: Coroutine[List[str], str, int]
...
x = c.send('hi') # type: List[str]
async def bar(): -> None:
async def bar() -> None:
x = await c # type: int
The module also provides generic ABCs ``Awaitable``,
@ -1464,10 +1468,7 @@ complex cases, a comment of the following format may be used::
x, y, z = [], [], [] # type: List[int], List[int], List[str]
x, y, z = [], [], [] # type: (List[int], List[int], List[str])
a, b, *c = range(5) # type: float, float, List[float]
x = [
1,
2,
] # type: List[int]
x = [1, 2] # type: List[int]
Type comments should be put on the last line of the statement that
contains the variable definition. They can also be placed on
@ -1857,6 +1858,14 @@ Stub file package authors might use the following snippet in ``setup.py``::
],
...
(*UPDATE:* As of June 2018 the recommended way to distribute type
hints for third-party packages has changed -- in addition to typeshed
(see the next section) there is now a standard for distributing type
hints, PEP 561. It supports separately installable packages containing
stubs, stub files included in the same distribution as the executable
code of a package, and inline type hints, the latter two options
enabled by including a file named ``py.typed`` in the package.)
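As an illustrative sketch, a package shipping inline hints might include
the PEP 561 marker file via ``setup.py`` (``mypkg`` is a hypothetical
project name)::

    from setuptools import setup

    setup(
        name='mypkg',
        packages=['mypkg'],
        # Ship the py.typed marker so type checkers trust the inline hints.
        package_data={'mypkg': ['py.typed']},
    )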
The Typeshed Repo
-----------------

View File

@ -463,7 +463,7 @@ Expected Uses
The primary expected use case is various forms of testing -- "are the
results computed near what I expect as a result?" This sort of test
may or may not be part of a formal unit testing suite. Such testing
could be used one-off at the command line, in an iPython notebook,
could be used one-off at the command line, in an IPython notebook,
part of doctests, or simple asserts in an ``if __name__ == "__main__"``
block.
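Such a one-off assert might look like this (a small sketch using the
``math.isclose`` API described in this PEP)::

    import math

    result = sum([0.1] * 10)
    # Passes: the accumulated rounding error is well within the tolerance.
    assert math.isclose(result, 1.0, rel_tol=1e-9)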

View File

@ -268,7 +268,7 @@ categories based on GNU autotools. This expanded scheme should help installers
to implement system policy, but installers may root each category at any
location.
A UNIX install scheme might map the categories to their installation patnhs
A UNIX install scheme might map the categories to their installation paths
like this::
{

View File

@ -34,9 +34,9 @@ Release Manager and Crew
3.6 Lifespan
============
3.6 will receive bugfix updates approximately every 3-6 months for
approximately 18 months. After the release of 3.7.0 final, a final
3.6 bugfix update will be released. After that, it is expected that
3.6 will receive bugfix updates approximately every 3 months for
approximately 24 months. After the release of 3.7.0 final, two more
3.6 bugfix updates will be released. After that, it is expected that
security updates (source only) will be released until 5 years after
the release of 3.6 final, so until approximately December 2021.
@ -93,32 +93,39 @@ Actual:
- 3.6.4 final: 2017-12-19
Expected:
3.6.5 schedule
--------------
- 3.6.5 candidate: 2018-03-12 (tenative)
- 3.6.5 candidate: 2018-03-13
- 3.6.5 final: 2018-03-26 (tentative)
- 3.6.5 final: 2018-03-28
3.6.6 schedule
--------------
- 3.6.6 candidate: 2018-06-04 (tenative)
- 3.6.6 candidate: 2018-06-12
- 3.6.6 final: 2018-06-15 (tentative)
- 3.6.6 final: 2018-06-27
Expected:
3.6.7 schedule
--------------
- 3.6.7 candidate: 2018-09-10 (tentative)
- 3.6.7 final: 2018-09-24 (tentative)
3.6.8 schedule
--------------
Final maintenance mode release, final binary releases.
- 3.6.6 candidate: 2018-09-10 (tenative)
- 3.6.8 candidate: 2018-12-03 (tentative)
- 3.6.6 final: 2018-09-24 (tentative)
- 3.6.8 final: 2018-12-16 (tentative)
3.6.8 and beyond schedule
3.6.9 and beyond schedule
-------------------------
Security fixes only, as needed, until 2021-12

748
pep-0505.rst Normal file
View File

@ -0,0 +1,748 @@
PEP: 505
Title: None-aware operators
Version: $Revision$
Last-Modified: $Date$
Author: Mark E. Haase <mehaase@gmail.com>, Steve Dower <steve.dower@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 18-Sep-2015
Python-Version: 3.8
Abstract
========
Several modern programming languages have so-called "``null``-coalescing" or
"``null``- aware" operators, including C# [1]_, Dart [2]_, Perl, Swift, and PHP
(starting in version 7). These operators provide syntactic sugar for common
patterns involving null references.
* The "``null``-coalescing" operator is a binary operator that returns its left
operand if it is not ``null``. Otherwise it returns its right operand.
* The "``null``-aware member access" operator accesses an instance member only
if that instance is non-``null``. Otherwise it returns ``null``. (This is also
called a "safe navigation" operator.)
* The "``null``-aware index access" operator accesses an element of a collection
only if that collection is non-``null``. Otherwise it returns ``null``. (This
is another type of "safe navigation" operator.)
This PEP proposes three ``None``-aware operators for Python, based on the
definitions and other languages' implementations of those above. Specifically:

* The "``None`` coalescing" binary operator ``??`` returns the left hand side
if it evaluates to a value that is not ``None``, or else it evaluates and
returns the right hand side. A coalescing ``??=`` augmented assignment
operator is included.
* The "``None``-aware attribute access" operator ``?.`` evaluates the complete
expression if the left hand side evaluates to a value that is not ``None``.
* The "``None``-aware indexing" operator ``?[]`` evaluates the complete
expression if the left hand side evaluates to a value that is not ``None``.
Syntax and Semantics
====================
Specialness of ``None``
-----------------------
The ``None`` object denotes the lack of a value. For the purposes of these
operators, the lack of a value indicates that the remainder of the expression
also lacks a value and should not be evaluated.
A rejected proposal was to treat any value that evaluates to false in a
Boolean context as not having a value. However, the purpose of these operators
is to propagate the "lack of value" state, rather than the "false" state.
Some argue that this makes ``None`` special. We contend that ``None`` is
already special, and that using it as both the test and the result of these
operators does not change the existing semantics in any way.
See the `Rejected Ideas`_ section for discussion on the rejected approaches.
Grammar changes
---------------
The following rules of the Python grammar are updated to read::
augassign: ('+=' | '-=' | '*=' | '@=' | '/=' | '%=' | '&=' | '|=' | '^=' |
            '<<=' | '>>=' | '**=' | '//=' | '??=')

power: coalesce ['**' factor]
coalesce: atom_expr ['??' factor]
atom_expr: ['await'] atom trailer*
trailer: ('(' [arglist] ')' |
          '[' subscriptlist ']' |
          '?[' subscriptlist ']' |
          '.' NAME |
          '?.' NAME)
Inserting the ``coalesce`` rule in this location ensures that expressions
resulting in ``None`` are naturally coalesced before they are used in
operations that would typically raise ``TypeError``. Like ``and`` and ``or``,
the right-hand expression is not evaluated until the left-hand side is
determined to be ``None``. For example::
a, b = None, None
def c(): return None
def ex(): raise Exception()
(a ?? 2 ** b ?? 3) == a ?? (2 ** (b ?? 3))
(a * b ?? c // d) == a * (b ?? c) // d
(a ?? True and b ?? False) == (a ?? True) and (b ?? False)
(c() ?? c() ?? True) == True
(True ?? ex()) == True
(c ?? ex)() == c()
Augmented coalescing assignment only rebinds the name if its current value is
``None``. If the target name already has a value, the right-hand side is not
evaluated. For example::
a = None
b = ''
c = 0
a ??= 'value'
b ??= undefined_name
c ??= shutil.rmtree('/') # don't try this at home, kids
assert a == 'value'
assert b == ''
assert c == 0 and any(os.scandir('/'))
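Each of the augmented assignments above behaves like the equivalent ``if``
statement; for example, ``a ??= 'value'`` corresponds roughly to::

    if a is None:
        a = 'value'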
Adding new trailers for the other ``None``-aware operators ensures that they
may be used in all valid locations for the existing equivalent operators,
including as part of an assignment target (more details below). As the existing
evaluation rules are not directly embedded in the grammar, we specify the
required changes here.
Assume that the ``atom`` is always successfully evaluated. Each ``trailer`` is
then evaluated from left to right, applying its own parameter (either its
arguments, subscripts or attribute name) to produce the value for the next
``trailer``. Finally, if present, ``await`` is applied.
For example, ``await a.b(c).d[e]`` is currently parsed as
``['await', 'a', '.b', '(c)', '.d', '[e]']`` and evaluated::
_v = a
_v = _v.b
_v = _v(c)
_v = _v.d
_v = _v[e]
await _v
When a ``None``-aware operator is present, the left-to-right evaluation may be
short-circuited. For example, ``await a?.b(c).d?[e]`` is evaluated::
_v = a
if _v is not None:
    _v = _v.b
    _v = _v(c)
    _v = _v.d
    if _v is not None:
        _v = _v[e]
await _v
.. note::
``await`` will almost certainly fail in this context, as it would in
the case where code attempts ``await None``. We are not proposing to add a
``None``-aware ``await`` keyword here, and merely include it in this
example for completeness of the specification, since the ``atom_expr``
grammar rule includes the keyword. If it were in its own rule, we would have
never mentioned it.
Parenthesised expressions are handled by the ``atom`` rule (not shown above),
which will implicitly terminate the short-circuiting behaviour of the above
transformation. For example, ``(a?.b ?? c).d?.e`` is evaluated as::
# a?.b
_v = a
if _v is not None:
    _v = _v.b

# ... ?? c
if _v is None:
    _v = c

# (...).d?.e
_v = _v.d
if _v is not None:
    _v = _v.e
When used as an assignment target, the ``None``-aware operations may only be
used in a "load" context. That is, ``a?.b = 1`` and ``a?[b] = 1`` will raise
``SyntaxError``. Use earlier in the expression (``a?.b.c = 1``) is permitted,
though unlikely to be useful unless combined with a coalescing operation::
(a?.b ?? d).c = 1
Examples
========
This section presents some examples of common ``None`` patterns and shows what
conversion to use ``None``-aware operators may look like.
Standard Library
----------------
Using the ``find-pep505.py`` script [3]_, an analysis of the Python 3.7 standard
library discovered up to 678 code snippets that could be replaced with use of
one of the ``None``-aware operators::
$ find /usr/lib/python3.7 -name '*.py' | xargs python3.7 find-pep505.py
<snip>
Total None-coalescing `if` blocks: 449
Total [possible] None-coalescing `or`: 120
Total None-coalescing ternaries: 27
Total Safe navigation `and`: 13
Total Safe navigation `if` blocks: 61
Total Safe navigation ternaries: 8
Some of these are shown below as examples before and after converting to use the
new operators.
From ``bisect.py``::
def insort_right(a, x, lo=0, hi=None):
    # ...
    if hi is None:
        hi = len(a)
    # ...
After updating to use the ``??=`` augmented assignment statement::
def insort_right(a, x, lo=0, hi=None):
    # ...
    hi ??= len(a)
    # ...
From ``calendar.py``::
encoding = options.encoding
if encoding is None:
    encoding = sys.getdefaultencoding()

optdict = dict(encoding=encoding, css=options.css)
After updating to use the ``??`` operator::
optdict = dict(encoding=encoding ?? sys.getdefaultencoding(),
               css=options.css)
From ``dis.py``::
def _get_const_info(const_index, const_list):
    argval = const_index
    if const_list is not None:
        argval = const_list[const_index]
    return argval, repr(argval)
After updating to use the ``?[]`` and ``??`` operators::
def _get_const_info(const_index, const_list):
    argval = const_list?[const_index] ?? const_index
    return argval, repr(argval)
From ``inspect.py``::
for base in object.__bases__:
    for name in getattr(base, "__abstractmethods__", ()):
        value = getattr(object, name, None)
        if getattr(value, "__isabstractmethod__", False):
            return True
After updating to use the ``?.`` operator (and deliberately not converting to
use ``any()``)::
for base in object.__bases__:
    for name in base?.__abstractmethods__ ?? ():
        if object?.name?.__isabstractmethod__:
            return True
From ``os.py``::
if entry.is_dir():
    dirs.append(name)
    if entries is not None:
        entries.append(entry)
else:
    nondirs.append(name)
After updating to use the ``?.`` operator::
if entry.is_dir():
    dirs.append(name)
    entries?.append(entry)
else:
    nondirs.append(name)
jsonify
-------
This example is from a Python web crawler that uses the Flask framework as its
front-end. This function retrieves information about a web site from a SQL
database and formats it as JSON to send to an HTTP client::
class SiteView(FlaskView):
    @route('/site/<id_>', methods=['GET'])
    def get_site(self, id_):
        site = db.query('site_table').find(id_)
        return jsonify(
            first_seen=site.first_seen.isoformat() if site.first_seen is not None else None,
            id=site.id,
            is_active=site.is_active,
            last_seen=site.last_seen.isoformat() if site.last_seen is not None else None,
            url=site.url.rstrip('/')
        )
Both ``first_seen`` and ``last_seen`` are allowed to be ``null`` in the
database, and they are also allowed to be ``null`` in the JSON response. JSON
does not have a native way to represent a ``datetime``, so the server's contract
states that any non-``null`` date is represented as an ISO-8601 string.
Without knowing the exact semantics of the ``first_seen`` and ``last_seen``
attributes, it is impossible to know whether the attribute can be safely or
performantly accessed multiple times.
One way to fix this code is to replace each conditional expression with an
explicit value assignment and a full ``if``/``else`` block::
class SiteView(FlaskView):
    @route('/site/<id_>', methods=['GET'])
    def get_site(self, id_):
        site = db.query('site_table').find(id_)

        first_seen_dt = site.first_seen
        if first_seen_dt is None:
            first_seen = None
        else:
            first_seen = first_seen_dt.isoformat()

        last_seen_dt = site.last_seen
        if last_seen_dt is None:
            last_seen = None
        else:
            last_seen = last_seen_dt.isoformat()

        return jsonify(
            first_seen=first_seen,
            id=site.id,
            is_active=site.is_active,
            last_seen=last_seen,
            url=site.url.rstrip('/')
        )
This adds ten lines of code and four new code paths to the function,
dramatically increasing the apparent complexity. Rewriting using the
``None``-aware attribute operator results in shorter code with more clear
intent::
class SiteView(FlaskView):
    @route('/site/<id_>', methods=['GET'])
    def get_site(self, id_):
        site = db.query('site_table').find(id_)
        return jsonify(
            first_seen=site.first_seen?.isoformat(),
            id=site.id,
            is_active=site.is_active,
            last_seen=site.last_seen?.isoformat(),
            url=site.url.rstrip('/')
        )
Grab
----
The next example is from a Python scraping library called `Grab
<https://github.com/lorien/grab/blob/4c95b18dcb0fa88eeca81f5643c0ebfb114bf728/gr
ab/upload.py>`_::
class BaseUploadObject(object):
    def find_content_type(self, filename):
        ctype, encoding = mimetypes.guess_type(filename)
        if ctype is None:
            return 'application/octet-stream'
        else:
            return ctype

class UploadContent(BaseUploadObject):
    def __init__(self, content, filename=None, content_type=None):
        self.content = content
        if filename is None:
            self.filename = self.get_random_filename()
        else:
            self.filename = filename
        if content_type is None:
            self.content_type = self.find_content_type(self.filename)
        else:
            self.content_type = content_type

class UploadFile(BaseUploadObject):
    def __init__(self, path, filename=None, content_type=None):
        self.path = path
        if filename is None:
            self.filename = os.path.split(path)[1]
        else:
            self.filename = filename
        if content_type is None:
            self.content_type = self.find_content_type(self.filename)
        else:
            self.content_type = content_type
This example contains several good examples of needing to provide default
values. Rewriting to use conditional expressions reduces the overall lines of
code, but does not necessarily improve readability::
class BaseUploadObject(object):
    def find_content_type(self, filename):
        ctype, encoding = mimetypes.guess_type(filename)
        return 'application/octet-stream' if ctype is None else ctype

class UploadContent(BaseUploadObject):
    def __init__(self, content, filename=None, content_type=None):
        self.content = content
        self.filename = (self.get_random_filename() if filename
                         is None else filename)
        self.content_type = (self.find_content_type(self.filename)
                             if content_type is None else content_type)

class UploadFile(BaseUploadObject):
    def __init__(self, path, filename=None, content_type=None):
        self.path = path
        self.filename = (os.path.split(path)[1] if filename is
                         None else filename)
        self.content_type = (self.find_content_type(self.filename)
                             if content_type is None else content_type)
The first ternary expression is tidy, but it reverses the intuitive order of
the operands: it should return ``ctype`` if it has a value and use the string
literal as fallback. The other ternary expressions are unintuitive and so
long that they must be wrapped. The overall readability is worsened, not
improved.
Rewriting using the ``None`` coalescing operator::
class BaseUploadObject(object):
    def find_content_type(self, filename):
        ctype, encoding = mimetypes.guess_type(filename)
        return ctype ?? 'application/octet-stream'

class UploadContent(BaseUploadObject):
    def __init__(self, content, filename=None, content_type=None):
        self.content = content
        self.filename = filename ?? self.get_random_filename()
        self.content_type = content_type ?? self.find_content_type(self.filename)

class UploadFile(BaseUploadObject):
    def __init__(self, path, filename=None, content_type=None):
        self.path = path
        self.filename = filename ?? os.path.split(path)[1]
        self.content_type = content_type ?? self.find_content_type(self.filename)
This syntax has an intuitive ordering of the operands. In ``find_content_type``,
for example, the preferred value ``ctype`` appears before the fallback value.
The terseness of the syntax also makes for fewer lines of code and less code to
visually parse, and reading from left-to-right and top-to-bottom more accurately
follows the execution flow.
Rejected Ideas
==============
The first three ideas in this section are oft-proposed alternatives to treating
``None`` as special. For further background on why these are rejected, see their
treatment in `PEP 531 <https://www.python.org/dev/peps/pep-0531/>`_ and
`PEP 532 <https://www.python.org/dev/peps/pep-0532/>`_ and the associated
discussions.
No-Value Protocol
-----------------
The operators could be generalised to user-defined types by defining a protocol
to indicate when a value represents "no value". Such a protocol may be a dunder
method ``__has_value__(self)`` that returns ``True`` if the value should be
treated as having a value, and ``False`` if the value should be treated as no
value.
With this generalization, ``object`` would implement a dunder method equivalent
to this::
def __has_value__(self):
    return True
``NoneType`` would implement a dunder method equivalent to this::
def __has_value__(self):
    return False
In the specification section, all uses of ``x is None`` would be replaced with
``not x.__has_value__()``.
This generalization would allow for domain-specific "no-value" objects to be
coalesced just like ``None``. For example the ``pyasn1`` package has a type
called ``Null`` that represents an ASN.1 ``null``::
>>> from pyasn1.type import univ
>>> univ.Null() ?? univ.Integer(123)
Integer(123)
Similarly, values such as ``math.nan`` and ``NotImplemented`` could be treated
as representing no value.
However, the "no-value" nature of these values is domain-specific, which means
they *should* be treated as a value by the language. For example,
``math.nan.imag`` is well defined (it's ``0.0``), and so short-circuiting
``math.nan?.imag`` to return ``math.nan`` would be incorrect.
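For instance::

    >>> import math
    >>> math.nan.imag
    0.0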
As ``None`` is already defined by the language as being the value that
represents "no value", and the current specification would not preclude
switching to a protocol in the future (though changes to built-in objects would
not be compatible), this idea is rejected for now.
Boolean-aware operators
-----------------------
This suggestion is fundamentally the same as adding a no-value protocol, and so
the discussion above also applies.
Similar behavior to the ``??`` operator can be achieved with an ``or``
expression, however ``or`` checks whether its left operand is false-y and not
specifically ``None``. This approach is attractive, as it requires fewer changes
to the language, but ultimately does not solve the underlying problem correctly.
Assuming the check is for truthiness rather than ``None``, there is no longer a
need for the ``??`` operator. However, applying this check to the ``?.`` and
``?[]`` operators would prevent perfectly valid operations from applying.
Consider the following example, where ``get_log_list()`` may return either a
list containing current log messages (potentially empty), or ``None`` if logging
is not enabled::
lst = get_log_list()
lst?.append('A log message')
If ``?.`` is checking for true values rather than specifically ``None`` and the
log has not been initialized with any items, no item will ever be appended. This
violates the obvious intent of the code, which is to append an item. The
``append`` method is available on an empty list, as are all other list methods,
and there is no reason to assume that these members should not be used because
the list is presently empty.
Further, there is no sensible result to use in place of the expression. A
normal ``lst.append`` returns ``None``, but under this idea ``lst?.append`` may
result in either ``[]`` or ``None``, depending on the value of ``lst``. As with
the examples in the previous section, this makes no sense.
As checking for truthiness rather than ``None`` results in apparently valid
expressions no longer executing as intended, this idea is rejected.
Exception-aware operators
-------------------------
Arguably, the reason to short-circuit an expression when ``None`` is encountered
is to avoid the ``AttributeError`` or ``TypeError`` that would be raised under
normal circumstances. As an alternative to testing for ``None``, the ``?.`` and
``?[]`` operators could instead handle ``AttributeError`` and ``TypeError``
raised by the operation and skip the remainder of the expression.
This produces a transformation for ``a?.b.c?.d.e`` similar to this::
_v = a
try:
    _v = _v.b
except AttributeError:
    pass
else:
    _v = _v.c
    try:
        _v = _v.d
    except AttributeError:
        pass
    else:
        _v = _v.e
One open question is which value should be returned as the expression when an
exception is handled. The above example simply leaves the partial result, but
this is not helpful for replacing with a default value. An alternative would be
to force the result to ``None``, which then raises the question as to why
``None`` is special enough to be the result but not special enough to be the
test.
Secondly, this approach masks errors within code executed implicitly as part of
the expression. For ``?.``, any ``AttributeError`` within a property or
``__getattr__`` implementation would be hidden, and similarly for ``?[]`` and
``__getitem__`` implementations.
Similarly, simple typing errors such as ``{}?.ietms()`` could go unnoticed.
Existing conventions for handling these kinds of errors in the form of the
``getattr`` builtin and the ``.get(key, default)`` method pattern established by
``dict`` show that it is already possible to explicitly use this behaviour.
As this approach would hide errors in code, it is rejected.
``None``-aware Function Call
----------------------------
The ``None``-aware syntax applies to attribute and index access, so it seems
natural to ask if it should also apply to function invocation syntax. It might
be written as ``foo?()``, where ``foo`` is only called if it is not None.
This has been deferred on the basis of the proposed operators being intended
to aid traversal of partially populated hierarchical data structures, *not*
for traversal of arbitrary class hierarchies. This is reflected in the fact
that none of the other mainstream languages that already offer this syntax
have found it worthwhile to support a similar syntax for optional function
invocations.
A workaround similar to that used by C# would be to write
``maybe_none?.__call__(arguments)``. If the callable is ``None``, the
expression will not be evaluated. (The C# equivalent uses ``?.Invoke()`` on its
callable type.)
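Under this proposal the workaround would read as follows (``callback`` and
``arg`` are illustrative names)::

    # Evaluates to callback(arg) if callback is not None,
    # and to None otherwise.
    result = callback?.__call__(arg)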
``?`` Unary Postfix Operator
----------------------------
To generalize the ``None``-aware behavior and limit the number of new operators
introduced, a unary, postfix operator spelled ``?`` was suggested. The idea is
that ``?`` might return a special object that would override dunder
methods to return ``self``. For example, ``foo?`` would evaluate to ``foo`` if
it is not ``None``, otherwise it would evaluate to an instance of
``NoneQuestion``::
class NoneQuestion():
    def __call__(self, *args, **kwargs):
        return self

    def __getattr__(self, name):
        return self

    def __getitem__(self, key):
        return self
With this new operator and new type, an expression like ``foo?.bar[baz]``
evaluates to ``NoneQuestion`` if ``foo`` is ``None``. This is a nifty
generalization, but it's difficult to use in practice since most existing code
won't know what ``NoneQuestion`` is.
Going back to one of the motivating examples above, consider the following::
>>> import json
>>> created = None
>>> json.dumps({'created': created?.isoformat()})
The JSON serializer does not know how to serialize ``NoneQuestion``, nor will
any other API. This proposal actually requires *lots of specialized logic*
throughout the standard library and any third party library.
At the same time, the ``?`` operator may also be **too general**, in the sense
that it can be combined with any other operator. What should the following
expressions mean?::
>>> x? + 1
>>> x? -= 1
>>> x? == 1
>>> ~x?
This degree of generalization is not useful. The operators actually proposed
herein are intentionally limited to a few operators that are expected to make it
easier to write common code patterns.
Built-in ``maybe``
------------------
Haskell has a concept called `Maybe <https://wiki.haskell.org/Maybe>`_ that
encapsulates the idea of an optional value without relying on any special
keyword (e.g. ``null``) or any special instance (e.g. ``None``). In Haskell, the
purpose of ``Maybe`` is to avoid separate handling of "something" and "nothing".
A Python package called `pymaybe <https://pypi.org/p/pymaybe/>`_ provides a
rough approximation. The documentation shows the following example::
>>> maybe('VALUE').lower()
'value'
>>> maybe(None).invalid().method().or_else('unknown')
'unknown'
The function ``maybe()`` returns either a ``Something`` instance or a
``Nothing`` instance. Similar to the unary postfix operator described in the
previous section, ``Nothing`` overrides dunder methods in order to allow
chaining on a missing value.
Note that ``or_else()`` is eventually required to retrieve the underlying value
from ``pymaybe``'s wrappers. Furthermore, ``pymaybe`` does not short-circuit any
evaluation. Although ``pymaybe`` has some strengths and may be useful in its own
right, it also demonstrates why a pure Python implementation of coalescing is
not nearly as powerful as support built into the language.
The idea of adding a builtin ``maybe`` type to enable this scenario is rejected.
Just use a conditional expression
---------------------------------
Another common way to initialize default values is to use the ternary operator.
Here is an excerpt from the popular `Requests package
<https://github.com/kennethreitz/requests/blob/14a555ac716866678bf17e43e23230d81
a8149f5/requests/models.py#L212>`_::
data = [] if data is None else data
files = [] if files is None else files
headers = {} if headers is None else headers
params = {} if params is None else params
hooks = {} if hooks is None else hooks
This particular formulation has the undesirable effect of putting the operands
in an unintuitive order: the brain thinks, "use ``data`` if possible and use
``[]`` as a fallback," but the code puts the fallback *before* the preferred
value.
The author of this package could have written it like this instead::
data = data if data is not None else []
files = files if files is not None else []
headers = headers if headers is not None else {}
params = params if params is not None else {}
hooks = hooks if hooks is not None else {}
This ordering of the operands is more intuitive, but it requires 4 extra
characters (for "not "). It also highlights the repetition of identifiers:
``data if data``, ``files if files``, etc.
When written using the ``None`` coalescing operator, the sample reads::
data = data ?? []
files = files ?? []
headers = headers ?? {}
params = params ?? {}
hooks = hooks ?? {}
References
==========
.. [1] C# Reference: Operators
(https://msdn.microsoft.com/en-us/library/6a71f45d.aspx)
.. [2] A Tour of the Dart Language: Operators
(https://www.dartlang.org/docs/dart-up-and-running/ch02.html#operators)
.. [3] Associated scripts
(https://github.com/python/peps/tree/master/pep-0505/)
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

File diff suppressed because it is too large

View File

@ -3,7 +3,7 @@ Title: Adding A Secrets Module To The Standard Library
Version: $Revision$
Last-Modified: $Date$
Author: Steven D'Aprano <steve@pearwood.info>
Status: Accepted
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 19-Sep-2015

View File

@ -359,7 +359,7 @@ of ``PythonCore`` is omitted but shown in a later example::
(winreg.HKEY_LOCAL_MACHINE, r'Software\Python', winreg.KEY_WOW64_32KEY),
]:
with winreg.OpenKeyEx(hive, key, access=winreg.KEY_READ | flags) as root_key:
for comany in enum_keys(root_key):
for company in enum_keys(root_key):
if company == 'PyLauncher':
continue

View File

@ -6,7 +6,7 @@ Author: Nathaniel J. Smith <njs@pobox.com>,
Thomas Kluyver <thomas@kluyver.me.uk>
BDFL-Delegate: Nick Coghlan <ncoghlan@gmail.com>
Discussions-To: <distutils-sig@python.org>
Status: Accepted
Status: Provisional
Type: Standards Track
Content-Type: text/x-rst
Created: 30-Sep-2015

View File

@ -7,8 +7,8 @@ Author: Brett Cannon <brett@python.org>,
Donald Stufft <donald@stufft.io>
BDFL-Delegate: Nick Coghlan
Discussions-To: distutils-sig <distutils-sig at python.org>
Status: Accepted
Type: Informational
Status: Provisional
Type: Standards Track
Content-Type: text/x-rst
Created: 10-May-2016
Post-History: 10-May-2016,
@ -129,24 +129,63 @@ of requirements for the build system to simply begin execution.
Specification
=============
File Format
-----------
The build system dependencies will be stored in a file named
``pyproject.toml`` that is written in the TOML format [#toml]_. This
format was chosen as it is human-usable (unlike JSON [#json]_), it is
flexible enough (unlike configparser [#configparser]_), stems from a
standard (also unlike configparser [#configparser]_), and it is not
overly complex (unlike YAML [#yaml]_). The TOML format is already in
use by the Rust community as part of their
``pyproject.toml`` that is written in the TOML format [#toml]_.
This format was chosen as it is human-usable (unlike JSON [#json]_),
it is flexible enough (unlike configparser [#configparser]_), stems
from a standard (also unlike configparser [#configparser]_), and it
is not overly complex (unlike YAML [#yaml]_). The TOML format is
already in use by the Rust community as part of their
Cargo package manager [#cargo]_, and members of that community have stated
in private email that they have been quite happy with their choice of
TOML. A more thorough
discussion as to why various alternatives were not chosen can be read
in the `Other file formats`_ section.
There will be a ``[build-system]`` table in the
configuration file to store build-related data. Initially only one key
of the table will be valid and mandatory: ``requires``. That key will
have a value of a list of strings representing the PEP 508
dependencies required to execute the build system (currently that
means what dependencies are required to execute a ``setup.py`` file).
Tables not specified in this PEP are reserved for future use by other
PEPs.
build-system table
------------------
The ``[build-system]`` table is used to store build-related data.
Initially only one key of the table will be valid and mandatory:
``requires``. This key must have a value of a list of strings
representing PEP 508 dependencies required to execute the build
system (currently that means what dependencies are required to
execute a ``setup.py`` file).
For the vast majority of Python projects that rely upon setuptools,
the ``pyproject.toml`` file will be::
[build-system]
# Minimum requirements for the build system to execute.
requires = ["setuptools", "wheel"] # PEP 508 specifications.
Because the use of setuptools and wheel are so expansive in the
community at the moment, build tools are expected to use the example
configuration file above as their default semantics when a
``pyproject.toml`` file is not present.
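As a rough sketch of that default behaviour (using the third-party
``pytoml`` parser mentioned later in this PEP; the helper name is ours)::

    import pytoml

    def build_requires(path='pyproject.toml'):
        # Fall back to the setuptools/wheel defaults when the file (or
        # the table) is absent, as build tools are expected to do.
        try:
            with open(path, encoding='utf-8') as f:
                data = pytoml.load(f)
        except FileNotFoundError:
            data = {}
        return data.get('build-system', {}).get('requires',
                                                ['setuptools', 'wheel'])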
tool table
----------
The ``[tool]`` table is where tools can have users specify
configuration data as long as they use a sub-table within ``[tool]``,
e.g. the `flit <https://pypi.python.org/pypi/flit>`_ tool would store
its configuration in ``[tool.flit]``.
We need some mechanism to allocate names within the ``tool.*``
namespace, to make sure that different projects don't attempt to use
the same sub-table and collide. Our rule is that a project can use
the subtable ``tool.$NAME`` if, and only if, they own the entry for
``$NAME`` in the Cheeseshop/PyPI.
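A hypothetical lookup helper makes the intended layout concrete (again a
sketch using ``pytoml``; nothing here is mandated by this PEP)::

    import pytoml

    def tool_config(name, path='pyproject.toml'):
        # tool_config('flit') would return the [tool.flit] sub-table.
        with open(path, encoding='utf-8') as f:
            return pytoml.load(f).get('tool', {}).get(name, {})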
JSON Schema
-----------
To provide a type-specific representation of the resulting data from
the TOML file for illustrative purposes only, the following JSON
@ -180,31 +219,6 @@ Schema [#jsonschema]_ would match the data format::
}
}
For the vast majority of Python projects that rely upon setuptools,
the ``pyproject.toml`` file will be::
[build-system]
# Minimum requirements for the build system to execute.
requires = ["setuptools", "wheel"] # PEP 508 specifications.
Because the use of setuptools and wheel are so expansive in the
community at the moment, build tools are expected to use the example
configuration file above as their default semantics when a
``pyproject.toml`` file is not present.
All other top-level keys and tables are reserved for future use by
other PEPs except for the ``[tool]`` table. Within that table, tools
can have users specify configuration data as long as they use a
sub-table within ``[tool]``, e.g. the
`flit <https://pypi.python.org/pypi/flit>`_ tool would store its
configuration in ``[tool.flit]``.
We need some mechanism to allocate names within the ``tool.*``
namespace, to make sure that different projects don't attempt to use
the same sub-table and collide. Our rule is that a project can use
the subtable ``tool.$NAME`` if, and only if, they own the entry for
``$NAME`` in the Cheeseshop/PyPI.
Rejected Ideas
==============
@ -255,6 +269,44 @@ vendored easily by projects. This outright excluded certain formats
like XML which are not friendly towards human beings and were never
seriously discussed.
Overview of file formats considered
'''''''''''''''''''''''''''''''''''
The key reasons for rejecting the other alternatives considered are
summarised in the following sections, while the full review (including
positive arguments in favour of TOML) can be found at [#file_formats]_.
TOML was ultimately selected as it provided all the features we
were interested in, while avoiding the downsides introduced by
the alternatives.
======================= ==== ==== ==== =======
Feature                 TOML YAML JSON CFG/INI
======================= ==== ==== ==== =======
Well-defined            yes  yes  yes
Real data types         yes  yes  yes
Reliable Unicode        yes  yes  yes
Reliable comments       yes  yes
Easy for humans to edit yes  ??        ??
Easy for tools to edit  yes  ??   yes  ??
In standard library               yes  yes
Easy for pip to vendor  yes  n/a       n/a
======================= ==== ==== ==== =======
("??" in the table indicates items where most folks would be
inclined to answer "yes", but there turn out to be a lot of
quirks and edge cases that arise in practice due to either
the lack of a clear specification, or else the underlying
file format specification being surprisingly complicated)
The ``pytoml`` TOML parser is ~300 lines of pure Python code,
so being outside the standard library didn't count heavily
against it.
Python literals were also discussed as a potential format, but
weren't considered in the file format review (since they're not
a common pre-existing file format).
JSON
''''
@ -375,6 +427,17 @@ An example Python literal file for the proposed data would be::
}
Sticking with ``setup.cfg``
---------------------------
There are two issues with ``setup.cfg`` used by setuptools as a general
format. One is that it is an ``.ini`` file, which has the issues mentioned
in the configparser_ discussion above.
that file has never been rigorously defined and thus it's unknown which
format would be safe to use going forward without potentially confusing
setuptools installations.
Other file names
----------------
@ -474,6 +537,9 @@ References
.. [#jsonschema] JSON Schema
(http://json-schema.org/)
.. [#file_formats] Nathaniel J. Smith's file format review
(https://gist.github.com/njsmith/78f68204c5d969f8c8bc645ef77d4a8f)
Copyright
=========
View File
@ -365,7 +365,8 @@ and local variables should have a single space after corresponding colon.
There should be no space before the colon. If an assignment has right hand
side, then the equality sign should have exactly one space on both sides.
Examples:
* Yes::
- Yes::
code: int
@ -373,7 +374,7 @@ Examples:
coords: Tuple[int, int]
label: str = '<unknown>'
* No::
- No::
code:int # No space after colon
code : int # Space before colon
View File
@ -34,7 +34,7 @@ Release Manager and Crew
3.7 Lifespan
============
3.7 will receive bugfix updates approximately every 3-6 months for
3.7 will receive bugfix updates approximately every 1-3 months for
approximately 18 months. After the release of 3.8.0 final, a final
3.7 bugfix update will be released. After that, it is expected that
security updates (source only) will be released until 5 years after
@ -56,15 +56,21 @@ Actual:
- 3.7.0 alpha 4: 2018-01-09
- 3.7.0 beta 1: 2018-01-31
(No new features beyond this point.)
- 3.7.0 beta 2: 2018-02-27
- 3.7.0 beta 3: 2018-03-29
- 3.7.0 beta 4: 2018-05-02
- 3.7.0 beta 5: 2018-05-30
- 3.7.0 candidate 1: 2018-06-12
- 3.7.0 final: 2018-06-27
Expected:
- 3.7.0 beta 2: 2018-02-26
- 3.7.0 beta 3: 2018-03-26
- 3.7.0 beta 4: 2018-04-30
- 3.7.0 candidate 1: 2018-05-21
- 3.7.0 candidate 2: 2018-06-04 (if necessary)
- 3.7.0 final: 2018-06-15
Maintenance releases
--------------------
Expected:
- 3.7.1: 2018-07-xx
Features for 3.7
View File
@ -97,6 +97,9 @@ with a runtime ``PYTHONCOERCECLOCALE=warn`` environment variable setting
that allows developers and system integrators to opt-in to receiving locale
coercion and compatibility warnings, without emitting them by default.
The output examples in the PEP itself have also been updated to remove
the warnings and make them easier to read.
Background
==========
@ -352,10 +355,12 @@ proposed solution:
PEP process or Python release announcements. However, to minimize the chance
of introducing new problems for end users, we'll do this *without* using the
warnings system, so even running with ``-Werror`` won't turn it into a runtime
exception.
exception. (Note: these warnings ended up being silenced by default. See the
Implementation Note above for more details)
* for Python 3.7, any changed defaults will offer some form of explicit "off"
switch at build time, runtime, or both
Minimizing the negative impact on systems currently correctly configured to
use GB-18030 or another partially ASCII compatible universal encoding leads to
the following design principle:
@ -459,6 +464,9 @@ successfully configured::
Python detected LC_CTYPE=C: LC_CTYPE coerced to C.UTF-8 (set another
locale or PYTHONCOERCECLOCALE=0 to disable this locale coercion behaviour).
(Note: this warning ended up being silenced by default. See the
Implementation Note above for more details)
As long as the current platform provides at least one of the candidate UTF-8
based environments, this locale coercion will mean that the standard
Python binary *and* locale-aware extensions should once again "just work"
@ -508,6 +516,9 @@ configured locale is still the default ``C`` locale and
C.utf8, or UTF-8 (if available) as alternative Unicode-compatible
locales is recommended.
(Note: this warning ended up being silenced by default. See the
Implementation Note above for more details)
In this case, no actual change will be made to the locale settings.
Instead, the warning informs both system and application integrators that
@ -535,6 +546,10 @@ The locale warning behaviour would be controlled by the flag
``--with[out]-c-locale-warning``, which would set the ``PY_WARN_ON_C_LOCALE``
preprocessor definition.
(Note: this compile time warning option ended up being replaced by a runtime
``PYTHONCOERCECLOCALE=warn`` option. See the Implementation Note above for
more details)
On platforms which don't use the ``autotools`` based build system (i.e.
Windows) these preprocessor variables would always be undefined.
@ -925,8 +940,6 @@ cover, as it avoids causing any problems in cases like the following::
$ LANG=C LC_MONETARY=ja_JP.utf8 ./python -c \
"from locale import setlocale, LC_ALL, currency; setlocale(LC_ALL, ''); print(currency(1e6))"
Python detected LC_CTYPE=C: LC_CTYPE & LANG coerced to C.UTF-8 (set another
locale or PYTHONCOERCECLOCALE=0 to disable this locale coercion behavior).
¥1000000
@ -966,9 +979,6 @@ from a PEP 538 enabled CPython build, where each line after the first is
executed by doing "up-arrow, left-arrow x4, delete, enter"::
$ LANG=C ./python
Python detected LC_CTYPE=C: LC_CTYPE & LANG coerced to C.UTF-8 (set
another locale or PYTHONCOERCECLOCALE=0 to disable this locale
coercion behavior).
Python 3.7.0a0 (heads/pep538-coerce-c-locale:188e780, May 7 2017, 00:21:13)
[GCC 6.3.1 20161221 (Red Hat 6.3.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
@ -1064,7 +1074,7 @@ Accordingly, this PEP originally proposed to disable locale coercion and
warnings at build time for these platforms, on the assumption that it would
be entirely redundant.
However, that assumpion turned out to be incorrect assumption, as subsequent
However, that assumption turned out to be incorrect, as subsequent
investigations showed that if you explicitly configure ``LANG=C`` on
these platforms, extension modules like GNU readline will misbehave in much the
same way as they do on other \*nix systems. [21_]
View File
@ -4,10 +4,11 @@ Version: $Revision$
Last-Modified: $Date$
Author: Erik M. Bray, Masayuki Yamamoto
BDFL-Delegate: Nick Coghlan
Status: Accepted
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 20-Dec-2016
Python-Version: 3.7
Post-History: 16-Dec-2016, 31-Aug-2017, 08-Sep-2017
Resolution: https://mail.python.org/pipermail/python-dev/2017-September/149358.html
View File
@ -18,7 +18,7 @@ Abstract
Add a new "UTF-8 Mode" to enhance Python's use of UTF-8. When UTF-8 Mode
is active, Python will:
* use the ``utf-8`` encoding, irregardless of the locale currently set by
* use the ``utf-8`` encoding, regardless of the locale currently set by
the current platform, and
* change the ``stdin`` and ``stdout`` error handlers to
``surrogateescape``.
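One quick way to observe the mode (assuming a build where this PEP's
``-X utf8`` option is available) is to inspect the standard streams::

    # Run as: python3 -X utf8 show_mode.py
    import sys
    print(sys.stdin.encoding, sys.stdin.errors)    # utf-8 surrogateescape
    print(sys.stdout.encoding, sys.stdout.errors)  # utf-8 surrogateescape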
@ -163,7 +163,7 @@ The UTF-8 Mode has the same effect as locale coercion:
``surrogateescape``.
These changes only affect Python code. But the locale coercion has
addiditonal effects: the ``LC_CTYPE`` environment variable and the
additional effects: the ``LC_CTYPE`` environment variable and the
``LC_CTYPE`` locale are set to a UTF-8 locale like ``C.UTF-8``. One side
effect is that non-Python code is also impacted by the locale coercion.
The two PEPs are complementary.
View File
@ -3,12 +3,14 @@ Title: Package Index Name Retention
Version: $Revision$
Last-Modified: $Date$
Author: Łukasz Langa <lukasz@python.org>
BDFL-Delegate: Donald Stufft <donald@stufft.io>
BDFL-Delegate: Mark Mangoba <mmangoba@python.org>
Discussions-To: distutils-sig <distutils-sig@python.org>
Status: Draft
Status: Final
Type: Process
Content-Type: text/x-rst
Created: 12-January-2017
Post-History:
Resolution: https://mail.python.org/pipermail/distutils-sig/2018-March/032089.html
Abstract
@ -36,6 +38,22 @@ This document aims to provide general guidelines for solving the
most typical cases of such conflicts.
Approval Process
================
As the application of this policy has potential legal ramifications for the
Python Software Foundation, the approval process used is more formal than that
used for most PEPs.
Rather than accepting the PEP directly, the assigned BDFL-Delegate will instead
recommend its acceptance to the PSF's Packaging Working Group. After
consultation with the PSF's General Counsel, adoption of the policy will then
be subject to a formal vote within the working group.
This formal approval process will be used for both initial adoption of the
policy, and for adoption of any future amendments.
Specification
=============
@ -61,7 +79,9 @@ The use cases covered by this document are:
* resolving disputes over a name.
* Invalid projects.
* Invalid projects:
* projects subject to a claim of intellectual property infringement.
The proposed extension to the Terms of Use, as expressed in the
Implementation section, will be published as a separate document on the
@ -112,7 +132,7 @@ are met:
* the project has been determined *abandoned* by the rules described
above;
* the candidate is able to demonstrate own failed attempts to contact
* the candidate is able to demonstrate their own failed attempts to contact
the existing owner;
* the candidate is able to demonstrate improvements made on the
candidate's own fork of the project;
@ -137,7 +157,7 @@ of reusing the name when ALL of the following are met:
* the project has been determined *abandoned* by the rules described
above;
* the candidate is able to demonstrate own failed attempts to contact
* the candidate is able to demonstrate their own failed attempts to contact
the existing owner;
* the candidate is able to demonstrate that the project suggested to
reuse the name already exists and meets notability requirements;
@ -196,24 +216,53 @@ is considered invalid and will be removed from the Index:
The Package Index maintainers pre-emptively declare certain package
names as unavailable for security reasons.
If you find a project that you think might be considered invalid, create
a support request [7]_. Maintainers of the Package Index will review
the case.
Intellectual property policy
----------------------------
It is the policy of Python Software Foundation and the Package Index
maintainers to be appropriately responsive to claims of intellectual
property infringement by third parties. It is not the policy of
the Python Software Foundation nor the Package Index maintainers
to pre-screen uploaded packages for any type of intellectual property
infringement.
Possibly-infringing packages should be reported to legal@python.org
and counsel to the Python Software Foundation will determine an
appropriate response. A package can be removed or transferred to a
new owner at the sole discretion of the Python Software Foundation to
address a claim of infringement.
A project published on the Package Index meeting ANY of the following
may be considered infringing and subject to removal from the Index
or transferral to a new owner:
* project contains unlicensed copyrighted material from a third party,
and is subject to a properly made claim under the DMCA;
* project uses a third party's trademark in a way not covered by
nominal or fair use guidelines;
* project clearly implicates a patented system or process, and is
the subject of a complaint; or
* project is subject to an active lawsuit.
In the event of a complaint for intellectual property infringement,
a copy of the complaint will be sent to the package owner. In some
cases, action may be taken by the Package Index maintainers before
the owner responds.
The role of the Python Software Foundation
------------------------------------------
The Python Software Foundation [8]_ is the non-profit legal entity that
The Python Software Foundation [7]_ is the non-profit legal entity that
provides the Package Index as a community service.
The Package Index maintainers can escalate issues covered by this
document for resolution by the PSF Board if the matter is not clear
document for resolution by the Packaging Workgroup if the matter is not clear
enough. Some decisions *require* additional judgement by the Board,
especially in cases of Code of Conduct violations or legal claims.
Decisions made by the Board are published as Resolutions [9]_.
Recommendations made by the Board are sent to the Packaging Workgroup [8]_ for review.
The Board has the final say in any disputes covered by this document and
The Packaging Workgroup has the final say in any disputes covered by this document and
can decide to reassign or remove a project from the Package Index after
careful consideration even when not all requirements listed
here are met.
@ -266,7 +315,7 @@ References
(https://pypi.org/policy/terms-of-use/)
.. [2] The Python Package Index
(https://pypi.python.org/)
(https://pypi.org/)
.. [3] The Comprehensive Perl Archive Network
(http://www.cpan.org/)
@ -280,14 +329,11 @@ References
.. [6] Python Community Code of Conduct
(https://www.python.org/psf/codeofconduct/)
.. [7] PyPI Support Requests
(https://sourceforge.net/p/pypi/support-requests/)
.. [8] Python Software Foundation
.. [7] Python Software Foundation
(https://www.python.org/psf/)
.. [9] PSF Board Resolutions
(https://www.python.org/psf/records/board/resolutions/)
.. [8] Python Packaging Working Group
(https://wiki.python.org/psf/PackagingWG/)
Copyright
View File
@ -199,8 +199,8 @@ approaches related to structural subtyping in Python and other languages:
Such behavior seems to be a perfect fit for both runtime and static behavior
of protocols. As discussed in `rationale`_, we propose to add static support
for such behavior. In addition, to allow users to achieve such runtime
behavior for *user-defined* protocols a special ``@runtime`` decorator will
be provided, see detailed `discussion`_ below.
behavior for *user-defined* protocols a special ``@runtime_checkable`` decorator
will be provided, see detailed `discussion`_ below.
* TypeScript [typescript]_ provides support for user-defined classes and
interfaces. Explicit implementation declaration is not required and
@ -381,8 +381,7 @@ Explicitly declaring implementation
To explicitly declare that a certain class implements a given protocol,
it can be used as a regular base class. In this case a class could use
default implementations of protocol members. ``typing.Sequence`` is a good
example of a protocol with useful default methods. Static analysis tools are
default implementations of protocol members. Static analysis tools are
expected to automatically detect that a class implements a given protocol.
So while it's possible to subclass a protocol explicitly, it's *not necessary*
to do so for the sake of type-checking.
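As a minimal sketch (assuming ``Protocol`` is importable from ``typing``,
as this PEP proposes; today it also lives in ``typing_extensions``)::

    from typing import Protocol

    class SupportsClose(Protocol):
        def close(self) -> None:
            ...

    class Explicit(SupportsClose):   # explicitly declares support
        def close(self) -> None:
            print('closed')

    class Implicit:                  # structural subtype, no base needed
        def close(self) -> None:
            print('closed')

    x: SupportsClose = Explicit()    # OK
    y: SupportsClose = Implicit()    # also OK, detected structurally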
@ -587,6 +586,30 @@ Continuing the previous example::
walk(tree) # OK, 'Tree[float]' is a subtype of 'Traversable'
Self-types in protocols
-----------------------
The self-types in protocols follow the corresponding specification
[self-types]_ of PEP 484. For example::
    from typing import Protocol, TypeVar  # Protocol as proposed by this PEP

    C = TypeVar('C', bound='Copyable')
    class Copyable(Protocol):
        def copy(self: C) -> C:
            ...

    class One:
        def copy(self) -> 'One':
            ...

    T = TypeVar('T', bound='Other')
    class Other:
        def copy(self: T) -> T:
            ...

    c: Copyable
    c = One()  # OK
    c = Other()  # Also OK
Using Protocols
===============
@ -665,14 +688,14 @@ classes. For example::
One can use multiple inheritance to define an intersection of protocols.
Example::
from typing import Sequence, Hashable
from typing import Iterable, Hashable
class HashableFloats(Sequence[float], Hashable, Protocol):
class HashableFloats(Iterable[float], Hashable, Protocol):
pass
def cached_func(args: HashableFloats) -> float:
...
cached_func((1, 2, 3)) # OK, tuple is both hashable and sequence
cached_func((1, 2, 3)) # OK, tuple is both hashable and iterable
If this will prove to be a widely used scenario, then a special
intersection type construct could be added in future as specified by PEP 483,
@ -740,8 +763,8 @@ aliases::
.. _discussion:
``@runtime`` decorator and narrowing types by ``isinstance()``
--------------------------------------------------------------
``@runtime_checkable`` decorator and narrowing types by ``isinstance()``
------------------------------------------------------------------------
The default semantics is that ``isinstance()`` and ``issubclass()`` fail
for protocol types. This is in the spirit of duck typing -- protocols
@ -752,38 +775,58 @@ However, it should be possible for protocol types to implement custom
instance and class checks when this makes sense, similar to how ``Iterable``
and other ABCs in ``collections.abc`` and ``typing`` already do it,
but this is limited to non-generic and unsubscripted generic protocols
(``Iterable`` is statically equivalent to ``Iterable[Any]`).
The ``typing`` module will define a special ``@runtime`` class decorator
(``Iterable`` is statically equivalent to ``Iterable[Any]``).
The ``typing`` module will define a special ``@runtime_checkable`` class decorator
that provides the same semantics for class and instance checks as for
``collections.abc`` classes, essentially making them "runtime protocols"::
from typing import runtime, Protocol
@runtime
class Closable(Protocol):
@runtime_checkable
class SupportsClose(Protocol):
def close(self):
...
assert isinstance(open('some/file'), Closable)
Static type checkers will understand ``isinstance(x, Proto)`` and
``issubclass(C, Proto)`` for protocols defined with this decorator (as they
already do for ``Iterable`` etc.). Static type checkers will narrow types
after such checks by the type erased ``Proto`` (i.e. with all variables
having type ``Any`` and all methods having type ``Callable[..., Any]``).
Note that ``isinstance(x, Proto[int])`` etc. will always fail in agreement
with PEP 484. Examples::
from typing import Iterable, Iterator, Sequence
def process(items: Iterable[int]) -> None:
if isinstance(items, Iterator):
# 'items' has type 'Iterator[int]' here
elif isinstance(items, Sequence[int]):
# Error! Can't use 'isinstance()' with subscripted protocols
assert isinstance(open('some/file'), SupportsClose)
Note that instance checks are not 100% reliable statically; this is why
this behavior is opt-in (see the section on `rejected`_ ideas for examples).
The most type checkers can do is to treat ``isinstance(obj, Iterator)``
roughly as a simpler way to write
``hasattr(obj, '__iter__') and hasattr(obj, '__next__')``. To minimize
the risks for this feature, the following rules are applied.
**Definitions**:
* *Data, and non-data protocols*: A protocol is called non-data protocol
if it only contains methods as members (for example ``Sized``,
``Iterator``, etc). A protocol that contains at least one non-method member
(like ``x: int``) is called a data protocol.
* *Unsafe overlap*: A type ``X`` is called unsafely overlapping with
a protocol ``P``, if ``X`` is not a subtype of ``P``, but it is a subtype
of the type erased version of ``P`` where all members have type ``Any``.
In addition, if at least one element of a union unsafely overlaps with
a protocol ``P``, then the whole union is unsafely overlapping with ``P``.
**Specification**:
* A protocol can be used as a second argument in ``isinstance()`` and
``issubclass()`` only if it is explicitly opt-in by ``@runtime_checkable``
decorator. This requirement exists because protocol checks are not type safe
in the case of dynamically set attributes, and because type checkers can
only prove that an ``isinstance()`` check is safe for a given class, not
for all its subclasses.
* ``isinstance()`` can be used with both data and non-data protocols, while
``issubclass()`` can be used only with non-data protocols. This restriction
exists because some data attributes can be set on an instance in the constructor
and this information is not always available on the class object.
* Type checkers should reject an ``isinstance()`` or ``issubclass()`` call, if
there is an unsafe overlap between the type of the first argument and
the protocol.
* Type checkers should be able to select a correct element from a union after
a safe ``isinstance()`` or ``issubclass()`` call. For narrowing from non-union
types, type checkers can use their best judgement (this is intentionally
unspecified, since a precise specification would require intersection types).
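As a rough illustration of these rules (assuming Python 3.8+, where
``Protocol`` and ``runtime_checkable`` are available in ``typing``)::

    from typing import Protocol, runtime_checkable

    @runtime_checkable
    class Named(Protocol):           # data protocol: non-method member
        name: str

    @runtime_checkable
    class SupportsClose(Protocol):   # non-data protocol: methods only
        def close(self) -> None: ...

    class File:
        name = 'example.txt'
        def close(self) -> None: ...

    f = File()
    assert isinstance(f, Named)              # data protocols allowed here
    assert isinstance(f, SupportsClose)
    assert issubclass(File, SupportsClose)   # non-data protocols only
    # issubclass(File, Named) raises TypeError at runtime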
Using Protocols in Python 2.7 - 3.5
@ -825,14 +868,12 @@ effects on the core interpreter and standard library except in the
a protocol or not. Add a class attribute ``_is_protocol = True``
if that is the case. Verify that a protocol class only has protocol
base classes in the MRO (except for object).
* Implement ``@runtime`` that allows ``__subclasshook__()`` performing
structural instance and subclass checks as in ``collections.abc`` classes.
* Implement ``@runtime_checkable`` that allows ``__subclasshook__()``
performing structural instance and subclass checks as in ``collections.abc``
classes.
* All structural subtyping checks will be performed by static type checkers,
such as ``mypy`` [mypy]_. No additional support for protocol validation will
be provided at runtime.
* Classes ``Mapping``, ``MutableMapping``, ``Sequence``, and
``MutableSequence`` in ``collections.abc`` module will support structural
instance and subclass checks (like e.g. ``collections.abc.Iterable``).
Changes in the typing module
@ -849,8 +890,6 @@ The following classes in ``typing`` module will be protocols:
* ``Container``
* ``Collection``
* ``Reversible``
* ``Sequence``, ``MutableSequence``
* ``Mapping``, ``MutableMapping``
* ``ContextManager``, ``AsyncContextManager``
* ``SupportsAbs`` (and other ``Supports*`` classes)
@ -1026,11 +1065,10 @@ be considered "non-protocol". Therefore, it was decided to not introduce
"non-protocol" methods.
There is only one downside to this: it will require some boilerplate for
implicit subtypes of ``Mapping`` and few other "large" protocols. But, this
applies to few "built-in" protocols (like ``Mapping`` and ``Sequence``) and
people are already subclassing them. Also, such style is discouraged for
user-defined protocols. It is recommended to create compact protocols and
combine them.
implicit subtypes of "large" protocols. But, this doesn't apply to "built-in"
protocols that are all "small" (i.e. have only few abstract methods).
Also, such style is discouraged for user-defined protocols. It is recommended
to create compact protocols and combine them.
Make protocols interoperable with other approaches
@ -1103,7 +1141,7 @@ Another potentially problematic case is assignment of attributes
self.x = 0
c = C()
isinstance(c1, P) # False
isinstance(c, P) # False
c.initialize()
isinstance(c, P) # True
@ -1149,7 +1187,7 @@ This was rejected for the following reasons:
ABCs from ``typing`` module. If we prohibit explicit subclassing of these
ABCs, then quite a lot of code will break.
* Convenience: There are existing protocol-like ABCs (that will be turned
* Convenience: There are existing protocol-like ABCs (that may be turned
into protocols) that have many useful "mix-in" (non-abstract) methods.
For example in the case of ``Sequence`` one only needs to implement
``__getitem__`` and ``__len__`` in an explicit subclass, and one gets
@ -1301,33 +1339,16 @@ confusions.
Backwards Compatibility
=======================
This PEP is almost fully backwards compatible. Few collection classes such as
``Sequence`` and ``Mapping`` will be turned into runtime protocols, therefore
results of ``isinstance()`` checks are going to change in some edge cases.
For example, a class that implements the ``Sequence`` protocol but does not
explicitly inherit from ``Sequence`` currently returns ``False`` in
corresponding instance and class checks. With this PEP implemented, such
checks will return ``True``.
This PEP is fully backwards compatible.
Implementation
==============
A working implementation of this PEP for ``mypy`` type checker is found on
GitHub repo at https://github.com/ilevkivskyi/mypy/tree/protocols,
corresponding ``typeshed`` stubs for more flavor are found at
https://github.com/ilevkivskyi/typeshed/tree/protocols. Installation steps::
git clone --recurse-submodules https://github.com/ilevkivskyi/mypy/
cd mypy && git checkout protocols && cd typeshed
git remote add proto https://github.com/ilevkivskyi/typeshed
git fetch proto && git checkout proto/protocols
cd .. && git add typeshed && sudo python3 -m pip install -U .
The runtime implementation of protocols in ``typing`` module is
found at https://github.com/ilevkivskyi/typehinting/tree/protocols.
The version of ``collections.abc`` with structural behavior for mappings and
sequences is found at https://github.com/ilevkivskyi/cpython/tree/protocols.
The ``mypy`` type checker fully supports protocols (modulo a few
known bugs). This includes treating all the builtin protocols, such as
``Iterable`` structurally. The runtime implementation of protocols is
available in ``typing_extensions`` module on PyPI.
References
@ -1372,6 +1393,9 @@ References
.. [elsewhere]
https://github.com/python/peps/pull/224
.. [self-types]
https://www.python.org/dev/peps/pep-0484/#annotating-instance-and-class-methods
Copyright
=========
View File
@ -609,7 +609,7 @@ References
(https://github.com/AFPy/python_doc_fr/graphs/contributors?from=2016-01-01&to=2016-12-31&type=c)
.. [15] Python-doc on Transifex
(https://www.transifex.com/python-doc/)
(https://www.transifex.com/python-doc/public/)
.. [16] French translation
(https://www.afpy.org/doc/python/)
View File
@ -5,7 +5,7 @@ Last-Modified: $Date$
Author: Victor Stinner <victor.stinner@gmail.com>,
Cory Benfield <cory@lukasa.co.uk>,
BDFL-Delegate: Benjamin Peterson <benjamin@python.org>
Status: Accepted
Status: Rejected
Type: Standards Track
Content-Type: text/x-rst
Created: 30-May-2017
@ -21,6 +21,15 @@ Backport the ssl.MemoryBIO and ssl.SSLObject classes from Python 3 to Python
2.7 to enhance the overall security of Python 2.7.
Rejection Notice
================
This PEP is rejected, see `Withdraw PEP 546? Backport ssl.MemoryBIO and
ssl.SSLObject to Python 2.7
<https://mail.python.org/pipermail/python-dev/2018-May/153760.html>`_
discussion for the rationale.
Rationale
=========
View File
@ -4,7 +4,7 @@ Version: $Revision$
Last-Modified: $Date$
Author: Marcel Plch <gmarcel.plch@gmail.com>,
Petr Viktorin <encukou@gmail.com>
Status: Draft
Status: Deferred
Type: Standards Track
Content-Type: text/x-rst
Created: 25-May-2017
@ -12,6 +12,17 @@ Python-Version: 3.7
Post-History:
Deferral Notice
===============
Cython -- the most important use case for this PEP and the only explicit
one -- is not ready for multi-phase initialization yet.
It keeps global state in C-level static variables.
See discussion at `Cython issue 1923`_.
The PEP is deferred until the situation changes.
Abstract
========
@ -186,6 +197,7 @@ References
.. _GitHub: https://github.com/python/cpython/pull/1761
.. _Cython issue 1715: https://github.com/cython/cython/issues/1715
.. _Possible Future Extensions section: https://www.python.org/dev/peps/pep-0489/#possible-future-extensions
.. _Cython issue 1923: https://github.com/cython/cython/pull/1923
Copyright
View File
@ -4,28 +4,37 @@ Version: $Revision$
Last-Modified: $Date$
Author: Steve Dower <steve.dower@python.org>
Status: Draft
Type: Standards Track
Type: Informational
Content-Type: text/x-rst
Created: 23-Aug-2017
Python-Version: 3.7
Post-History: 24-Aug-2017 (security-sig), 28-Aug-2017 (python-dev)
Relationship to PEP 578
=======================
This PEP has been split into two since its original posting.
See `PEP 578 <https://www.python.org/dev/peps/pep-0578/>`_ for the
auditing APIs proposed for addition to the next version of Python.
This is now an informational PEP, providing guidance to those planning
to integrate Python into their secure or audited environments.
Abstract
========
This PEP describes additions to the Python API and specific behaviors
for the CPython implementation that make actions taken by the Python
runtime visible to security and auditing tools. The goals in order of
increasing importance are to prevent malicious use of Python, to detect
and report on malicious use, and most importantly to detect attempts to
bypass detection. Most of the responsibility for implementation is
required from users, who must customize and build Python for their own
environment.
This PEP describes the concept of security transparency and how it
applies to the Python runtime. Visibility into actions taken by the
runtime is invaluable in integrating Python into an otherwise secure
and/or monitored environment.
We propose two small sets of public APIs to enable users to reliably
build their copy of Python without having to modify the core runtime,
protecting future maintainability. We also discuss recommendations for
users to help them develop and configure their copy of Python.
The audit hooks described in PEP-578 are an essential component in
detecting, identifying and analyzing misuse of Python. While the hooks
themselves are neutral (in that not every reported event is inherently
misuse), they provide essential context to those who are responsible
for monitoring an overall system or network. With enough transparency,
attackers are no longer able to hide.
Background
==========
@ -126,14 +135,14 @@ tools, most network access and DNS resolution, and attempts to create
and hide files or configuration settings on the local machine.
To summarize, defenders have a need to audit specific uses of Python in
order to detect abnormal or malicious usage. Currently, the Python
runtime does not provide any ability to do this, which (anecdotally) has
led to organizations switching to other languages. The aim of this PEP
is to enable system administrators to deploy a security transparent copy
of Python that can integrate with their existing auditing and protection
systems.
order to detect abnormal or malicious usage. With PEP 578, the Python
runtime gains the ability to provide this. The aim of this PEP is to
assist system administrators with deploying a security transparent
version of Python that can integrate with their existing auditing and
protection systems.
On Windows, some specific features that may be enabled by this include:
On Windows, some specific features that may be integrated through the
hooks added by PEP 578 include:
* Script Block Logging [3]_
* DeviceGuard [4]_
@ -151,7 +160,7 @@ On Linux, some specific features that may be integrated are:
* SELinux labels [13]_
* check execute bit on imported modules
On macOS, some features that may be used with the expanded APIs are:
On macOS, some features that may be integrated are:
* OpenBSM [10]_
* syslog [11]_
@ -161,9 +170,6 @@ production machines is highly appealing to system administrators and
will make Python a more trustworthy dependency for application
developers.
Overview of Changes
===================
True security transparency is not fully achievable by Python in
isolation. The runtime can audit as many events as it likes, but unless
the logs are reviewed and analyzed there is no value. Python may impose
@ -173,340 +179,64 @@ implementations of certain security features, and organizations with the
resources to fully customize their runtime should be encouraged to do
so.
The aim of these changes is to enable system administrators to integrate
Python into their existing security systems, without dictating what
those systems look like or how they should behave. We propose two API
changes to enable this: an Audit Hook and Verified Open Hook. Both are
not set by default, and both require modifications to the entry point
binary to enable any functionality. For the purposes of validation and
example, we propose a new ``spython``/``spython.exe`` entry point
program that enables some basic functionality using these hooks.
**However, security-conscious organizations are expected to create their
own entry points to meet their own needs.**
Summary Recommendations
=======================
Audit Hook
----------
These are discussed in greater detail in later sections, but are
presented here to frame the overall discussion.
In order to achieve security transparency, an API is required to raise
messages from within certain operations. These operations are typically
deep within the Python runtime or standard library, such as dynamic code
compilation, module imports, DNS resolution, or use of certain modules
such as ``ctypes``.
Sysadmins should provide and use an alternate entry point (besides
``python.exe`` or ``pythonX.Y``) in order to reduce surface area and
securely enable audit hooks. A discussion of what could be restricted
is below in `Restricting the Entry Point`_.
The new C APIs required for audit hooks are::
Sysadmins should use all available measures provided by their operating
system to prevent modifications to their Python installation, such as
file permissions, access control lists and signature validation.
# Add an auditing hook
typedef int (*hook_func)(const char *event, PyObject *args,
void *userData);
int PySys_AddAuditHook(hook_func hook, void *userData);
Sysadmins should log everything and collect logs to a central location
as quickly as possible - avoid keeping logs on outer-ring machines.
# Raise an event with all auditing hooks
int PySys_Audit(const char *event, PyObject *args);
# Internal API used during Py_Finalize() - not publicly accessible
void _Py_ClearAuditHooks(void);
The new Python APIs for audit hooks are::
# Add an auditing hook
sys.addaudithook(hook: Callable[str, tuple]) -> None
# Raise an event with all auditing hooks
sys.audit(str, *args) -> None
Sysadmins should prioritize _detection_ of misuse over _prevention_ of
misuse.
Hooks are added by calling ``PySys_AddAuditHook()`` from C at any time,
including before ``Py_Initialize()``, or by calling
``sys.addaudithook()`` from Python code. Hooks are never removed or
replaced, and existing hooks have an opportunity to refuse to allow new
hooks to be added (adding an audit hook is audited, and so preexisting
hooks can raise an exception to block the new addition).
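As a small sketch of the Python-level API (as eventually shipped by
PEP 578 in CPython 3.8; shown here purely for illustration)::

    import sys

    def hook(event, args):
        # Log every audited event; raising an exception here would
        # abort the audited operation instead.
        print('AUDIT', event, args, file=sys.stderr)

    sys.addaudithook(hook)
    sys.audit('spython.example', 'some', 'arguments')  # hypothetical event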
Restricting the Entry Point
===========================
When events of interest are occurring, code can either call
``PySys_Audit()`` from C (while the GIL is held) or ``sys.audit()``. The
string argument is the name of the event, and the tuple contains
arguments. A given event name should have a fixed schema for arguments,
and both arguments are considered a public API (for a given x.y version
of Python), and thus should only change between feature releases with
updated documentation.
One of the primary vulnerabilities exposed by the presence of Python
on a machine is the ability to execute arbitrary code without
detection or verification by the system. This is made significantly
easier because the default entry point (``python.exe`` on Windows and
``pythonX.Y`` on other platforms) allows execution from the command
line, from standard input, and does not have any hooks enabled by
default.
When an event is audited, each hook is called in the order it was added
with the event name and tuple. If any hook returns with an exception
set, later hooks are ignored and *in general* the Python runtime should
terminate. This is intentional to allow hook implementations to decide
how to respond to any particular event. The typical responses will be to
log the event, abort the operation with an exception, or to immediately
terminate the process with an operating system exit call.
Our recommendation is that production machines should use a modified
entry point instead of the default. Once outside of the development
environment, there is rarely a need for the flexibility offered by the
default entry point.
When an event is audited but no hooks have been set, the ``audit()``
function should include minimal overhead. Ideally, each argument is a
reference to existing data rather than a value calculated just for the
auditing call.
In this section, we describe a hypothetical ``spython`` entry point
(``spython.exe`` on Windows; ``spythonX.Y`` on other platforms) that
provides a level of security transparency recommended for production
machines. An associated example implementation shows many of the
features described here, though with a number of concessions for the
sake of avoiding platform-specific code. A sufficient implementation
will inherently require some integration with platform-specific
security features.
As hooks may be Python objects, they need to be freed during
``Py_Finalize()``. To do this, we add an internal API
``_Py_ClearAuditHooks()`` that releases any ``PyObject*`` hooks that are
held, as well as any heap memory used. This is an internal function with
no public export, but it triggers an event for all audit hooks to ensure
that unexpected calls are logged.
Official distributions will not include any ``spython`` by default, but
third party distributions may include appropriately modified entry
points that use the same name.
See `Audit Hook Locations`_ for proposed audit hook points and schemas,
and the `Recommendations`_ section for discussion on
appropriate responses.
Verified Open Hook
------------------
Most operating systems have a mechanism to distinguish between files
that can be executed and those that can not. For example, this may be an
execute bit in the permissions field, or a verified hash of the file
contents to detect potential code tampering. These are an important
security mechanism for preventing execution of data or code that is not
approved for a given environment. Currently, Python has no way to
integrate with these when launching scripts or importing modules.
The new public C API for the verified open hook is::
# Set the handler
typedef PyObject *(*hook_func)(PyObject *path)
int PyImport_SetOpenForImportHook(void *handler)
# Open a file using the handler
PyObject *PyImport_OpenForImport(const char *path)
The new public Python API for the verified open hook is::
# Open a file using the handler
_imp.open_for_import(path)
The ``_imp.open_for_import()`` function is a drop-in replacement for
``open(str(pathlike), 'rb')``. Its default behaviour is to open a file
for raw, binary access - any more restrictive behaviour requires the
use of a custom handler. Only ``str`` arguments are accepted.
A custom handler may be set by calling ``PyImport_SetOpenForImportHook()``
from C at any time, including before ``Py_Initialize()``. However, if a
hook has already been set then the call will fail. When
``open_for_import()`` is called with a hook set, the hook will be passed
the path and its return value will be returned directly. The returned
object should be an open file-like object that supports reading raw
bytes. This is explicitly intended to allow a ``BytesIO`` instance if
the open handler has already had to read the file into memory in order
to perform whatever verification is necessary to determine whether the
content is permitted to be executed.
Note that these hooks can import and call the ``_io.open()`` function on
CPython without triggering themselves.
If the hook determines that the file is not suitable for execution, it
should raise an exception of its choice, as well as raising any other
auditing events or notifications.
All import and execution functionality involving code from a file will
be changed to use ``open_for_import()`` unconditionally. It is important
to note that calls to ``compile()``, ``exec()`` and ``eval()`` do not go
through this function - an audit hook that includes the code from these
calls will be added and is the best opportunity to validate code that is
read from the file. Given the current decoupling between import and
execution in Python, most imported code will go through both
``open_for_import()`` and the log hook for ``compile``, and so care
should be taken to avoid repeating verification steps.
.. note::
The use of ``open_for_import()`` by ``importlib`` is a valuable
first defence, but should not be relied upon to prevent misuse. In
particular, it is easy to monkeypatch ``importlib`` in order to
bypass the call. Auditing hooks are the primary way to achieve
security transparency, and are essential for detecting attempts to
bypass other functionality.
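A sketch of such a handler, following the description above (the
validation policy shown is hypothetical)::

    import io

    def restricted_open_for_import(path):
        # Hypothetical policy: allow only plain .py source files.
        if not path.endswith('.py'):
            raise OSError('import from %r is not permitted' % path)
        with io.open(path, 'rb') as f:
            data = f.read()
        # Content verification (hashes, signatures, ...) would go here.
        return io.BytesIO(data)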
API Availability
----------------
While all the functions added here are considered public and stable API,
the behavior of the functions is implementation specific. The
descriptions here refer to the CPython implementation, and while other
implementations should provide the functions, there is no requirement
that they behave the same.
For example, ``sys.addaudithook()`` and ``sys.audit()`` should exist but
may do nothing. This allows code to make calls to ``sys.audit()``
without having to test for existence, but it should not assume that its
call will have any effect. (Including existence tests in
security-critical code allows another vector to bypass auditing, so it
is preferable that the function always exist.)
``_imp.open_for_import(path)`` should at a minimum always return
``_io.open(path, 'rb')``. Code using the function should make no further
assumptions about what may occur, and implementations other than CPython
are not required to let developers override the behavior of this
function with a hook.
Audit Hook Locations
====================
Calls to ``sys.audit()`` or ``PySys_Audit()`` will be added to the
following operations with the schema in Table 1. Unless otherwise
specified, the ability for audit hooks to abort any listed operation
should be considered part of the rationale for including the hook.
.. csv-table:: Table 1: Audit Hooks
:header: "API Function", "Event Name", "Arguments", "Rationale"
:widths: 2, 2, 3, 6
``PySys_AddAuditHook``, ``sys.addaudithook``, "", "Detect when new
audit hooks are being added.
"
``_PySys_ClearAuditHooks``, ``sys._clearaudithooks``, "", "Notifies
hooks they are being cleaned up, mainly in case the event is
triggered unexpectedly. This event cannot be aborted.
"
``PyImport_SetOpenForImportHook``, ``setopenforimporthook``, "", "
Detects any attempt to set the ``open_for_import`` hook.
"
"``compile``, ``exec``, ``eval``, ``PyAst_CompileString``,
``PyAST_obj2mod``", ``compile``, "``(code, filename_or_none)``", "
Detect dynamic code compilation, where ``code`` could be a string or
AST. Note that this will be called for regular imports of source
code, including those that were opened with ``open_for_import``.
"
"``exec``, ``eval``, ``run_mod``", ``exec``, "``(code_object,)``", "
Detect dynamic execution of code objects. This only occurs for
explicit calls, and is not raised for normal function invocation.
"
``import``, ``import``, "``(module, filename, sys.path,
sys.meta_path, sys.path_hooks)``", "Detect when modules are
imported. This is raised before the module name is resolved to a
file. All arguments other than the module name may be ``None`` if
they are not used or available.
"
``code_new``, ``code.__new__``, "``(bytecode, filename, name)``", "
Detect dynamic creation of code objects. This only occurs for
direct instantiation, and is not raised for normal compilation.
"
``func_new_impl``, ``function.__new__``, "``(code,)``", "Detect
dynamic creation of function objects. This only occurs for direct
instantiation, and is not raised for normal compilation.
"
"``_ctypes.dlopen``, ``_ctypes.LoadLibrary``", ``ctypes.dlopen``, "
``(module_or_path,)``", "Detect when native modules are used.
"
``_ctypes._FuncPtr``, ``ctypes.dlsym``, "``(lib_object, name)``", "
Collect information about specific symbols retrieved from native
modules.
"
``_ctypes._CData``, ``ctypes.cdata``, "``(ptr_as_int,)``", "Detect
when code is accessing arbitrary memory using ``ctypes``.
"
``id``, ``id``, "``(id_as_int,)``", "Detect when code is accessing
the id of objects, which in CPython reveals information about
memory layout.
"
``sys._getframe``, ``sys._getframe``, "``(frame_object,)``", "Detect
when code is accessing frames directly.
"
``sys._current_frames``, ``sys._current_frames``, "", "Detect when
code is accessing frames directly.
"
``PyEval_SetProfile``, ``sys.setprofile``, "", "Detect when code is
injecting trace functions. Because of the implementation, exceptions
raised from the hook will abort the operation, but will not be
raised in Python code. Note that ``threading.setprofile`` eventually
calls this function, so the event will be audited for each thread.
"
``PyEval_SetTrace``, ``sys.settrace``, "", "Detect when code is
injecting trace functions. Because of the implementation, exceptions
raised from the hook will abort the operation, but will not be
raised in Python code. Note that ``threading.settrace`` eventually
calls this function, so the event will be audited for each thread.
"
``_PyEval_SetAsyncGenFirstiter``, ``sys.set_async_gen_firstiter``, "
", "Detect changes to async generator hooks.
"
``_PyEval_SetAsyncGenFinalizer``, ``sys.set_async_gen_finalizer``, "
", "Detect changes to async generator hooks.
"
``_PyEval_SetCoroutineWrapper``, ``sys.set_coroutine_wrapper``, "
", "Detect changes to the coroutine wrapper.
"
"``socket.bind``, ``socket.connect``, ``socket.connect_ex``,
``socket.getaddrinfo``, ``socket.getnameinfo``, ``socket.sendmsg``,
``socket.sendto``", ``socket.address``, "``(address,)``", "Detect
access to network resources. The address is unmodified from the
original call.
"
``socket.__init__``, "socket()", "``(family, type, proto)``", "
Detect creation of sockets. The arguments will be int values.
"
``socket.gethostname``, ``socket.gethostname``, "", "Detect attempts
to retrieve the current host name.
"
``socket.sethostname``, ``socket.sethostname``, "``(name,)``", "
Detect attempts to change the current host name. The name argument
is passed as a bytes object.
"
"``socket.gethostbyname``, ``socket.gethostbyname_ex``",
"``socket.gethostbyname``", "``(name,)``", "Detect host name
resolution. The name argument is a str or bytes object.
"
``socket.gethostbyaddr``, ``socket.gethostbyaddr``, "
``(address,)``", "Detect host resolution. The address argument is a
str or bytes object.
"
``socket.getservbyname``, ``socket.getservbyname``, "``(name,
protocol)``", "Detect service resolution. The arguments are str
objects.
"
"``socket.getservbyport``", ``socket.getservbyport``, "``(port,
protocol)``", "Detect service resolution. The port argument is an
int and protocol is a str.
"
"``member_get``, ``func_get_code``, ``func_get_[kw]defaults``
",``object.__getattr__``,"``(object, attr)``","Detect access to
restricted attributes. This event is raised for any built-in
members that are marked as restricted, and members that may allow
bypassing imports.
"
"``_PyObject_GenericSetAttr``, ``check_set_special_type_attr``,
``object_set_class``, ``func_set_code``, ``func_set_[kw]defaults``","
``object.__setattr__``","``(object, attr, value)``","Detect monkey
patching of types and objects. This event
is raised for the ``__class__`` attribute and any attribute on
``type`` objects.
"
"``_PyObject_GenericSetAttr``",``object.__delattr__``,"``(object,
attr)``","Detect deletion of object attributes. This event is raised
for any attribute on ``type`` objects.
"
"``Unpickler.find_class``",``pickle.find_class``,"``(module_name,
global_name)``","Detect imports and global name lookup when
unpickling.
"
"``array_new``",``array.__new__``,"``(typecode, initial_value)``", "
Detects creation of array objects.
"
TODO - more hooks in ``_socket``, ``_ssl``, others?
SPython Entry Point
===================
A new entry point binary will be added, called ``spython.exe`` on
Windows and ``spythonX.Y`` on other platforms. This entry point is
intended primarily as an example, as we expect most users of this
functionality to implement their own entry point and hooks (see
`Recommendations`_). It will also be used for tests.
Source builds will build ``spython`` by default, but distributions
should not include it except as a test binary. The python.org managed
binary distributions will not include ``spython``.
**Do not accept most command-line arguments**
**Remove most command-line arguments**
The ``spython`` entry point requires a script file be passed as the
first argument, and does not allow any options. This prevents arbitrary
code execution from in-memory data or non-script files (such as pickles,
which can be executed using ``-m pickle <path>``).
first argument, and does not allow any options to precede it. This
prevents arbitrary code execution from in-memory data or non-script
files (such as pickles, which could be executed using
``-m pickle <path>``).
Options ``-B`` (do not write bytecode), ``-E`` (ignore environment
variables) and ``-s`` (no user site) are assumed.
@ -517,38 +247,57 @@ will be used to initialize ``sys.path`` following the rules currently
described `for Windows
<https://docs.python.org/3/using/windows.html#finding-modules>`_.
When built with ``Py_DEBUG``, the ``spython`` entry point will allow a
``-i`` option with no other arguments to enter into interactive mode,
with audit messages being written to standard error rather than a file.
This is intended for testing and debugging only.
For the sake of demonstration, the example implementation of
``spython`` also allows the ``-i`` option to start in interactive mode.
This is not recommended for restricted entry points.
**Log security events to a file**
**Log audited events**
Before initialization, ``spython`` will set an audit hook that writes
events to a local file. By default, this file is the full path of the
process with a ``.log`` suffix, but may be overridden with the
``SPYTHONLOG`` environment variable (despite such overrides being
explicitly discouraged in `Recommendations`_).
Before initialization, ``spython`` sets an audit hook that writes all
audited events to an OS-managed log file. On Windows, this is the Event
Tracing functionality,[7]_ and on other platforms they go to
syslog.[11]_ Logs are copied from the machine as frequently as possible
to prevent loss of information should an attacker attempt to clear
local logs or prevent legitimate access to the machine.
The audit hook will also abort all ``sys.addaudithook`` events,
preventing any other hooks from being added.
The logging hook is written in native code and configured before the
interpreter is initialized. This is the only opportunity to ensure that
no Python code executes without auditing, and that Python code cannot
prevent registration of the hook.
Our primary aim is to record all actions taken by all Python processes,
so that detection may be performed offline against logged events.
Having all events recorded also allows for deeper analysis and the use
of machine learning algorithms. These are useful for detecting
persistent attacks, where the attacker is intending to remain within
the protected machines for some period of time, as well as for later
analysis to determine the impact and exposure caused by a successful
attack.
The example implementation of ``spython`` writes to a log file on the
local machine, for the sake of demonstration. When started with ``-i``,
the example implementation writes all audit events to standard error
instead of the log file. The ``SPYTHONLOG`` environment variable can be
used to specify the log file location.
**Restrict importable modules**
Also before initialization, ``spython`` will set an open-for-import
hook that validates all files opened with ``os.open_for_import``. This
implementation will require all files to have a ``.py`` suffix (thereby
blocking the use of cached bytecode), and will raise a custom audit
event ``spython.open_for_import`` containing ``(filename,
True_if_allowed)``.
Also before initialization, ``spython`` sets an open-for-import hook
that validates all files opened with ``os.open_for_import``. This
implementation requires all files to have a ``.py`` suffix (preventing
the use of cached bytecode), and will raise a custom audit event
``spython.open_for_import`` containing ``(filename, True_if_allowed)``.
On Windows, the hook will also open the file with flags that prevent any
other process from opening it with write access, which allows the hook
to perform additional validation on the contents with confidence that it
will not be modified between the check and use. Compilation will later
trigger a ``compile`` event, so there is no need to read the contents
now for AMSI, but other validation mechanisms such as DeviceGuard [4]_
should be performed here.
After opening the file, the entire contents is read into memory in a
single buffer and the file is closed.
Compilation will later trigger a ``compile`` event, so there is no need
to validate the contents now using mechanisms that also apply to
dynamically generated code. However, if a whitelist of source files or
file hashes is available, then this is the point at which to enforce it.
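As an illustration only, the validation performed by such a hook might
look like the following sketch. The real hook must be installed from
native code before initialization, and the Windows share-mode flags
described above are omitted::

    import sys

    def open_for_import_hook(path):
        # Require plain source files, which also blocks cached bytecode.
        allowed = isinstance(path, str) and path.endswith(".py")
        sys.audit("spython.open_for_import", path, allowed)
        if not allowed:
            raise OSError("spython cannot import from {!r}".format(path))
        return open(path, "rb")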
**Restrict globals in pickles**
@ -556,35 +305,37 @@ The ``spython`` entry point will abort all ``pickle.find_class`` events
that use the default implementation. Overrides will not raise audit
events unless explicitly added, and so they will continue to be allowed.
Performance Impact
==================
**Prevent os.system**
The important performance impact is the case where events are being
raised but there are no hooks attached. This is the unavoidable case -
once a distributor or sysadmin begins adding audit hooks they have
explicitly chosen to trade performance for functionality. Performance
impact using ``spython`` or with hooks added are not of interest here,
since this is considered opt-in functionality.
The ``spython`` entry point aborts all ``os.system`` calls.
Analysis using the ``performance`` tool shows no significant impact,
with the vast majority of benchmarks ranging from 1.05x faster to
1.05x slower.
It should be noted here that ``subprocess.Popen(shell=True)`` is
allowed (though logged via the platform-specific process creation
events). This tradeoff is made because it is much simpler to induce a
running application to call ``os.system`` with a single string argument
than a function with multiple arguments, and so it is more likely to be
used as part of an exploit. There is also little justification for
using ``os.system`` in production code, while ``subprocess.Popen`` has
a large number of legitimate uses. Logs indicating use of the
``shell=True`` argument should nonetheless be scrutinised more carefully.
In our opinion, the performance impact of the set of auditing points
described in this PEP is negligible.
Sysadmins are encouraged to make these kinds of tradeoffs between
restriction and detection, and generally should prefer detection.
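A minimal sketch of the aborting hook for the two restrictions above,
assuming the ``pickle.find_class`` and ``os.system`` event names used in
this PEP and following the advice below to log events before aborting
them::

    import sys

    def abort_hook(event, args):
        if event in ("os.system", "pickle.find_class"):
            # Log first, so that blocked attempts still appear in the
            # audit trail, then abort by raising an exception.
            sys.stderr.write("BLOCKED {} {!r}\n".format(event, args))
            raise RuntimeError("{} blocked by spython policy".format(event))

    sys.addaudithook(abort_hook)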
Recommendations
===============
General Recommendations
=======================
Specific recommendations are difficult to make, as the ideal
configuration for any environment will depend on the user's ability to
manage, monitor, and respond to activity on their own network. However,
many of the proposals here do not appear to be of value without deeper
illustration. This section provides recommendations using the terms
**should** (or **should not**), indicating that we consider it dangerous
to ignore the advice, and **may**, indicating that the advice ought
to be considered for high value systems. The term **sysadmins** refers
to whoever is responsible for deploying Python throughout your network;
Recommendations beyond those suggested in the previous section are
difficult, as the ideal configuration for any environment depends on
the sysadmin's ability to manage, monitor, and respond to activity on
their own network. Nonetheless, here we attempt to provide some context
and guidance for integrating Python into a complete system.
This section provides recommendations using the terms **should** (or
**should not**), indicating that we consider it risky to ignore the
advice, and **may**, indicating that the advice ought to be
considered for high value systems. The term **sysadmin** refers to
whoever is responsible for deploying Python throughout the network;
different organizations may have an alternative title for the
responsible people.
@ -666,73 +417,30 @@ Since ``importlib``'s use of ``open_for_import`` may be easily bypassed
with monkeypatching, an audit hook **should** be used to detect
attribute changes on type objects.
[TODO: more good advice; less bad advice]
Rejected Advice
===============
Rejected Ideas
==============
This section discusses common or "obviously good" recommendations that
we are specifically *not* making. These range from useless or incorrect
through to ideas that are simply not feasible in any real world
environment.
Separate module for audit hooks
-------------------------------
**Do not** attempt to implement a sandbox within the Python runtime.
There is a long history of attempts to allow arbitrary code limited use
of Python features (such as [14]_), but no general success. The best
options are to run unrestricted Python within a sandboxed environment
with at least hypervisor-level isolation, or to prevent unauthorised
code from starting at all.
The proposal is to add a new module for audit hooks, hypothetically
``audit``. This would separate the API and implementation from the
``sys`` module, and allow naming the C functions ``PyAudit_AddHook`` and
``PyAudit_Audit`` rather than the current variations.
**Do not** rely on static analysis to verify untrusted code before use.
The best options are to pre-authorise trusted code, such as with code
signing, and, where that is not possible, to identify known-bad code,
such as with an anti-malware scanner.
Any such module would need to be a built-in module that is guaranteed to
always be present. The nature of these hooks is that they must be
callable without condition, as any conditional imports or calls provide
more opportunities to intercept and suppress or modify events.
**Do not** use audit hooks to abort operations without logging the
event first. You will regret not knowing why your process disappeared.
Given its nature as one of the most core modules, the ``sys`` module is
somewhat protected against module shadowing attacks. Replacing ``sys``
with a sufficiently functional module that the application can still run
is a much more complicated task than replacing a module with only one
function of interest. An attacker that has the ability to shadow the
``sys`` module is already capable of running arbitrary code from files,
whereas an ``audit`` module can be replaced with a single statement::
import sys; sys.modules['audit'] = type('audit', (object,),
{'audit': lambda *a: None, 'addhook': lambda *a: None})
Multiple layers of protection already exist for monkey patching attacks
against either ``sys`` or ``audit``, but assignments or insertions to
``sys.modules`` are not audited.
This idea is rejected because it makes substituting ``audit`` calls
throughout all callers nearly trivial.
Flag in sys.flags to indicate "secure" mode
-------------------------------------------
The proposal is to add a value in ``sys.flags`` to indicate when Python
is running in a "secure" mode. This would allow applications to detect
when some features are enabled and modify their behaviour appropriately.
Currently there are no guarantees made about security by this PEP - this
section is the first time the word "secure" has been used. Security
**transparency** does not result in any changed behaviour, so there is
no appropriate reason for applications to modify their behaviour.
Both application-level APIs ``sys.audit`` and ``_imp.open_for_import``
are always present and functional, regardless of whether the regular
``python`` entry point or some alternative entry point is used. Callers
cannot determine whether any hooks have been added (except by performing
side-channel analysis), nor do they need to. The calls should be fast
enough that callers do not need to avoid them, and the sysadmin is
responsible for ensuring their added hooks are fast enough to not affect
application performance.
The argument that this is "security by obscurity" is valid, but
irrelevant. Security by obscurity is only an issue when there are no
other protective mechanisms; obscurity as the first step in avoiding
attack is strongly recommended (see `this article
<https://danielmiessler.com/study/security-by-obscurity/>`_ for
discussion).
This idea is rejected because there are no appropriate reasons for an
application to change its behaviour based on whether these APIs are in
use.
[TODO - more bad advice]
Further Reading
===============
@ -789,7 +497,7 @@ References
.. [4] `<https://aka.ms/deviceguard>`_
.. [5] AMSI, `<https://msdn.microsoft.com/en-us/library/windows/desktop/dn889587(v=vs.85).aspx>`_
.. [5] Antimalware Scan Interface, `<https://msdn.microsoft.com/en-us/library/windows/desktop/dn889587(v=vs.85).aspx>`_
.. [6] Persistent Zone Identifiers, `<https://msdn.microsoft.com/en-us/library/ms537021(v=vs.85).aspx>`_
@ -807,6 +515,8 @@ References
.. [13] SELinux access decisions `<http://man7.org/linux/man-pages/man3/avc_entry_ref_init.3.html>`_
.. [14] The failure of pysandbox `<https://lwn.net/Articles/574215/>`_
Acknowledgments
===============
@ -820,7 +530,7 @@ discussions.
Copyright
=========
Copyright (c) 2017 by Microsoft Corporation. This material may be
Copyright (c) 2017-2018 by Microsoft Corporation. This material may be
distributed only subject to the terms and conditions set forth in the
Open Publication License, v1.0 or later (the latest version is presently
available at http://www.opencontent.org/openpub/).

View File

@ -6,15 +6,16 @@ Type: Standards Track
Content-Type: text/x-rst
Created: 2017-09-05
Python-Version: 3.8
Post-History: 07-Sep-2017, 08-Sep-2017, 13-Sep-2017, 05-Dec-2017
Post-History: 07-Sep-2017, 08-Sep-2017, 13-Sep-2017, 05-Dec-2017,
09-May-2018
Abstract
========
CPython has supported multiple interpreters in the same process (AKA
"subinterpreters") since version 1.5. The feature has been available
via the C-API. [c-api]_ Subinterpreters operate in
"subinterpreters") since version 1.5 (1997). The feature has been
available via the C-API. [c-api]_ Subinterpreters operate in
`relative isolation from one another <Interpreter Isolation_>`_, which
provides the basis for an
`alternative concurrency model <Concurrency_>`_.
@ -30,7 +31,7 @@ Proposal
The ``interpreters`` module will be added to the stdlib. It will
provide a high-level interface to subinterpreters and wrap a new
low-level ``_interpreters`` (in the same was as the ``threading``
low-level ``_interpreters`` (in the same way as the ``threading``
module). See the `Examples`_ section for concrete usage and use cases.
Along with exposing the existing (in CPython) subinterpreter support,
@ -47,6 +48,8 @@ At first only the following types will be supported for sharing:
* None
* bytes
* str
* int
* PEP 3118 buffer objects (via ``send_buffer()``)
Support for other basic types (e.g. int, Ellipsis) will be added later.
@ -87,6 +90,14 @@ For creating and using interpreters:
| channels=None) | | (This blocks the current thread until done.) |
+-----------------------+-----------------------------------------------------+
|
+----------------+--------------+------------------------------------------------------+
| exception | base | description |
+================+==============+======================================================+
| RunFailedError | RuntimeError | Interpreter.run() resulted in an uncaught exception. |
+----------------+--------------+------------------------------------------------------+
For sharing data between interpreters:
+--------------------------------+--------------------------------------------+
@ -120,9 +131,11 @@ For sharing data between interpreters:
| .recv_nowait(default=None) -> | | Like recv(), but return the default |
| object | | instead of waiting. |
+-------------------------------+-----------------------------------------------+
| .close() | | No longer associate the current interpreter |
| .release() | | No longer associate the current interpreter |
| | | with the channel (on the receiving end). |
+-------------------------------+-----------------------------------------------+
| .close(force=False) | | Close the channel in all interpreters. |
+-------------------------------+-----------------------------------------------+
|
@ -147,9 +160,31 @@ For sharing data between interpreters:
+---------------------------+-------------------------------------------------+
| .send_buffer_nowait(obj) | | Like send_buffer(), but fail if not received. |
+---------------------------+-------------------------------------------------+
| .close() | | No longer associate the current interpreter |
| .release() | | No longer associate the current interpreter |
| | | with the channel (on the sending end). |
+---------------------------+-------------------------------------------------+
| .close(force=False) | | Close the channel in all interpreters. |
+---------------------------+-------------------------------------------------+
|
+----------------------+--------------------+------------------------------------------------+
| exception | base | description |
+======================+====================+================================================+
| ChannelError | Exception | The base class for channel-related exceptions. |
+----------------------+--------------------+------------------------------------------------+
| ChannelNotFoundError | ChannelError | The identified channel was not found. |
+----------------------+--------------------+------------------------------------------------+
| ChannelEmptyError | ChannelError | The channel was unexpectedly empty. |
+----------------------+--------------------+------------------------------------------------+
| ChannelNotEmptyError | ChannelError | The channel was unexpectedly not empty. |
+----------------------+--------------------+------------------------------------------------+
| NotReceivedError | ChannelError | Nothing was waiting to receive a sent object. |
+----------------------+--------------------+------------------------------------------------+
| ChannelClosedError | ChannelError | The channel is closed. |
+----------------------+--------------------+------------------------------------------------+
| ChannelReleasedError | ChannelClosedError | The channel is released (but not yet closed). |
+----------------------+--------------------+------------------------------------------------+
Examples
@ -218,7 +253,7 @@ Synchronize using a channel
interp.run(tw.dedent("""
reader.recv()
print("during")
reader.close()
reader.release()
"""),
shared=dict(
reader=r,
@ -229,7 +264,7 @@ Synchronize using a channel
t.start()
print('after')
s.send(b'')
s.close()
s.release()
Sharing a file descriptor
-------------------------
@ -280,7 +315,7 @@ Passing objects via marshal
obj = marshal.loads(data)
do_something(obj)
data = reader.recv()
reader.close()
reader.release()
"""))
t = threading.Thread(target=run)
t.start()
@ -310,7 +345,7 @@ Passing objects via pickle
obj = pickle.loads(data)
do_something(obj)
data = reader.recv()
reader.close()
reader.release()
"""))
t = threading.Thread(target=run)
t.start()
@ -514,6 +549,8 @@ channels to the following:
* None
* bytes
* str
* int
* PEP 3118 buffer objects (via ``send_buffer()``)
Limiting the initial shareable types is a practical matter, reducing
@ -686,16 +723,24 @@ The module also provides the following class:
"run()" call into one long script. This is the same as how the
REPL operates.
Regarding uncaught exceptions, we noted that they are
"effectively" propagated into the code where ``run()`` was called.
To prevent leaking exceptions (and tracebacks) between
interpreters, we create a surrogate of the exception and its
traceback (see ``traceback.TracebackException``), wrap it in a
RuntimeError, and raise that.
Supported code: source text.
Uncaught Exceptions
-------------------
Regarding uncaught exceptions in ``Interpreter.run()``, we noted that
they are "effectively" propagated into the code where ``run()`` was
called. To prevent leaking exceptions (and tracebacks) between
interpreters, we create a surrogate of the exception and its traceback
(see ``traceback.TracebackException``), set it to ``__cause__`` on a
new ``RunFailedError``, and raise that.
Raising (a proxy of) the exception is problematic since it's harder to
distinguish between an error in the ``run()`` call and an uncaught
exception from the subinterpreter.
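A sketch of handling such a failure under the proposed API (the
``interpreters`` module is not yet available)::

    import interpreters  # proposed module; not yet in the stdlib

    interp = interpreters.create()
    try:
        interp.run("1 / 0")
    except interpreters.RunFailedError as exc:
        # exc.__cause__ is a surrogate of the subinterpreter's uncaught
        # ZeroDivisionError, built via traceback.TracebackException.
        print(type(exc.__cause__).__name__, exc.__cause__)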
API for sharing data
--------------------
@ -703,8 +748,8 @@ Subinterpreters are less useful without a mechanism for sharing data
between them. Sharing actual Python objects between interpreters,
however, has enough potential problems that we are avoiding support
for that here. Instead, only a minimum set of types will be supported.
Initially this will include ``bytes`` and channels. Further types may
be supported later.
Initially this will include ``None``, ``bytes``, ``str``, ``int``,
and channels. Further types may be supported later.
The ``interpreters`` module provides a way for users to determine
whether an object is shareable or not:
@ -737,11 +782,12 @@ many-to-many, channels have no buffer.
Create a new channel and return (recv, send), the RecvChannel and
SendChannel corresponding to the ends of the channel. The channel
is not closed and destroyed (i.e. garbage-collected) until the number
of associated interpreters returns to 0.
of associated interpreters returns to 0 (including when the channel
is explicitly closed).
An interpreter gets associated with a channel by calling its "send()"
or "recv()" method. That association gets dropped by calling
"close()" on the channel.
"release()" on the channel.
Both ends of the channel are supported "shared" objects (i.e. may be
safely shared by different interpreters). Thus they may be passed as
@ -765,7 +811,8 @@ many-to-many, channels have no buffer.
interpreters:
The list of associated interpreters: those that have called
the "recv()" or "__next__()" methods and haven't called "close()".
the "recv()" or "__next__()" methods and haven't called
"release()" (and the channel hasn't been explicitly closed).
recv():
@ -773,10 +820,11 @@ many-to-many, channels have no buffer.
the channel. If none have been sent then wait until the next
send. This associates the current interpreter with the channel.
If the channel is already closed (see the close() method)
then raise EOFError. If the channel isn't closed, but the current
interpreter already called the "close()" method (which drops its
association with the channel) then raise ValueError.
If the channel is already closed then raise ChannelClosedError.
If the channel isn't closed but the current interpreter already
called the "release()" method (which drops its association with
the channel) then raise ChannelReleasedError (which is a subclass
of ChannelClosedError).
recv_nowait(default=None):
@ -784,26 +832,35 @@ many-to-many, channels have no buffer.
then return the default. Otherwise, this is the same as the
"recv()" method.
close():
release():
No longer associate the current interpreter with the channel (on
the receiving end) and block future association (via the "recv()"
method. If the interpreter was never associated with the channel
method). If the interpreter was never associated with the channel
then still block future association. Once an interpreter is no
longer associated with the channel, subsequent (or current) send()
and recv() calls from that interpreter will raise ValueError
(or EOFError if the channel is actually marked as closed).
and recv() calls from that interpreter will raise
ChannelReleasedError (or ChannelClosedError if the channel
is actually marked as closed).
Once the number of associated interpreters on both ends drops
to 0, the channel is actually marked as closed. The Python
runtime will garbage collect all closed channels, though it may
not be immediately. Note that "close()" is automatically called
not be immediately. Note that "release()" is automatically called
on behalf of the current interpreter when the channel is no longer
used (i.e. has no references) in that interpreter.
This operation is idempotent. Return True if "close()" has not
This operation is idempotent. Return True if "release()" has not
been called before by the current interpreter.
close(force=False):
Close both ends of the channel (in all interpreters). This means
that any further use of the channel raises ChannelClosedError. If
the channel is not empty then raise ChannelNotEmptyError (if
"force" is False) or discard the remaining objects (if "force"
is True) and close it.
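A sketch of the proposed ``release()`` and ``close()`` semantics::

    import interpreters  # proposed module; not yet in the stdlib

    r, s = interpreters.create_channel()
    r.release()            # drop this interpreter's "recv" association
    try:
        r.recv()
    except interpreters.ChannelReleasedError:
        # A subclass of ChannelClosedError, so handlers for closed
        # channels cover released ones too.
        pass
    s.close(force=True)    # close both ends in all interpreters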
``SendChannel(id)``::
@ -827,16 +884,16 @@ many-to-many, channels have no buffer.
object is not shareable then ValueError is raised. Currently
only bytes are supported.
If the channel is already closed (see the close() method)
then raise EOFError. If the channel isn't closed, but the current
interpreter already called the "close()" method (which drops its
association with the channel) then raise ValueError.
If the channel is already closed then raise ChannelClosedError.
If the channel isn't closed but the current interpreter already
called the "release()" method (which drops its association with
the channel) then raise ChannelReleasedError.
send_nowait(obj):
Send the object to the receiving end of the channel. If the other
end is not currently receiving then raise RuntimeError. Otherwise
this is the same as "send()".
end is not currently receiving then raise NotReceivedError.
Otherwise this is the same as "send()".
send_buffer(obj):
@ -847,14 +904,23 @@ many-to-many, channels have no buffer.
send_buffer_nowait(obj):
Send a MemoryView of the object rather than the object. If the
other end is not currently receiving then raise RuntimeError.
other end is not currently receiving then raise NotReceivedError.
Otherwise this is the same as "send_buffer()".
close():
release():
This is the same as "RecvChannel.close(), but applied to the
This is the same as "RecvChannel.release(), but applied to the
sending end of the channel.
close(force=False):
Close both ends of the channel (in all interpreters). No matter
what the "send" end of the channel is immediately closed. If the
channel is empty then close the "recv" end immediately too.
Otherwise wait until the channel is empty before closing it (if
"force" is False) or discard the remaining items and close
immediately (if "force" is True).
Note that ``send_buffer()`` is similar to how
``multiprocessing.Connection`` works. [mp-conn]_
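For comparison, a minimal example of the byte-oriented
``multiprocessing.Connection`` API cited here::

    from multiprocessing import Pipe

    recv_conn, send_conn = Pipe(duplex=False)
    send_conn.send_bytes(b"spam")     # cf. SendChannel.send_buffer()
    print(recv_conn.recv_bytes())     # b'spam'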
@ -862,7 +928,9 @@ Note that ``send_buffer()`` is similar to how
Open Questions
==============
None
* "force" argument to ``ch.release()``?
* add a "tp_share" type slot instead of using a global registry
for shareable types?
Open Implementation Questions
@ -1020,9 +1088,8 @@ exception, effectively ending execution in the interpreter that tried
to use the poisoned channel.
This could be accomplished by adding a ``poison()`` method to both ends
of the channel. The ``close()`` method could work if it had a ``force``
option to force the channel closed. Regardless, these semantics are
relatively specialized and can wait.
of the channel. The ``close()`` method can be used in this way
(mostly), but these semantics are relatively specialized and can wait.
Sending channels over channels
------------------------------
@ -1070,14 +1137,6 @@ generic module reset mechanism may prove unnecessary.
This isn't a critical feature initially. It can wait until later
if desirable.
Support passing ints in channels
--------------------------------
Passing ints around should be fine and ultimately is probably
desirable. However, we can get by with serializing them as bytes
for now. The goal is a minimal API for the sake of basic
functionality at first.
File descriptors and sockets in channels
----------------------------------------
@ -1119,7 +1178,8 @@ Channel context managers
Context manager support on ``RecvChannel`` and ``SendChannel`` may be
helpful. The implementation would be simple, wrapping a call to
``close()`` like files do. As with iteration, this can wait.
``close()`` (or maybe ``release()``) like files do. As with iteration,
this can wait.
Pipes and Queues
----------------
reasonable. They could be trivially implemented as wrappers around
channels. Alternatively they could be implemented for efficiency at the
same low level as channels.
interpreters.RunFailedError
---------------------------
Buffering
---------
As currently proposed, ``Interpreter.run()`` offers you no way to
distinguish an error coming from the subinterpreter from any other
error in the current interpreter. Your only option would be to
explicitly wrap your ``run()`` call in a
``try: ... except RuntimeError:`` (since we wrap a proxy of the original
exception in a RuntimeError and raise that).
If this is a problem in practice then we could add something like
``interpreters.RunFailedError`` (subclassing RuntimeError) and raise that
in ``run()``.
The proposed channels are unbuffered. This simplifies the API and
implementation. If buffering is desirable we can add it later.
Return a lock from send()
-------------------------
@ -1162,6 +1214,26 @@ This matters for buffered channels (i.e. queues). For unbuffered
channels it is a non-issue. So this can be dealt with once channels
support buffering.
Add a "reraise" method to RunFailedError
----------------------------------------
While having ``__cause__`` set on ``RunFailedError`` helps produce a
more useful traceback, it's less helpful when handling the original
error. To help facilitate this, we could add
``RunFailedError.reraise()``. This method would enable the following
pattern::
try:
interp.run(script)
except RunFailedError as exc:
try:
exc.reraise()
except MyException:
...
This would be made even simpler if there existed a ``__reraise__``
protocol.
Rejected Ideas
==============
@ -1170,7 +1242,7 @@ Explicit channel association
----------------------------
Interpreters are implicitly associated with channels upon ``recv()`` and
``send()`` calls. They are de-associated with ``close()`` calls. The
``send()`` calls. They are de-associated with ``release()`` calls. The
alternative would be explicit methods. It would be either
``add_channel()`` and ``remove_channel()`` methods on ``Interpreter``
objects or something similar on channel objects.
@ -1216,15 +1288,16 @@ While that might not be a problem currently, it would be a problem once
interpreters get better isolation relative to memory management (which
is necessary to stop sharing the GIL between interpreters). We've
resolved the semantics of how the exceptions propagate by raising a
RuntimeError instead, which wraps a safe proxy for the original
exception and traceback.
``RunFailedError`` instead, for which ``__cause__`` wraps a safe proxy
for the original exception and traceback.
Rejected possible solutions:
* set the RuntimeError's __cause__ to the proxy of the original
exception
* reproduce the exception and traceback in the original interpreter
and raise that.
* raise a subclass of RunFailedError that proxies the original
exception and traceback.
* raise RuntimeError instead of RunFailedError
* convert at the boundary (a la ``subprocess.CalledProcessError``)
(requires a cross-interpreter representation)
* support customization via ``Interpreter.excepthook``
@ -1282,7 +1355,7 @@ References
https://bugs.python.org/issue6531
.. [mp-conn]
https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Connection
https://docs.python.org/3/library/multiprocessing.html#connection-objects
.. [bug-rate]
https://mail.python.org/pipermail/python-ideas/2017-September/047094.html

View File

@ -6,7 +6,7 @@ Type: Standards Track
Content-Type: text/x-rst
Created: 02-Jun-2017
Python-Version: 3.7
Post-History: 08-Sep-2017, 25-Nov-2017, 30-Nov-2017, 01-Dec-2017, 02-Dec-2017, 06-Jan-2018
Post-History: 08-Sep-2017, 25-Nov-2017, 30-Nov-2017, 01-Dec-2017, 02-Dec-2017, 06-Jan-2018, 04-Mar-2018
Resolution: https://mail.python.org/pipermail/python-dev/2017-December/151034.html
Notice for Reviewers
@ -93,7 +93,7 @@ There have been numerous attempts to define classes which exist
primarily to store values which are accessible by attribute lookup.
Some examples include:
- collection.namedtuple in the standard library.
- collections.namedtuple in the standard library.
- typing.NamedTuple in the standard library.
@ -170,7 +170,7 @@ The ``dataclass`` decorator is typically used with no parameters and
no parentheses. However, it also supports the following logical
signature::
def dataclass(*, init=True, repr=True, eq=True, order=False, hash=None, frozen=False)
def dataclass(*, init=True, repr=True, eq=True, order=False, unsafe_hash=False, frozen=False)
If ``dataclass`` is used just as a simple decorator with no
parameters, it acts as if it has the default values documented in this
@ -184,7 +184,7 @@ signature. That is, these three uses of ``@dataclass`` are equivalent::
class C:
...
@dataclass(init=True, repr=True, eq=True, order=False, hash=None, frozen=False)
@dataclass(init=True, repr=True, eq=True, order=False, unsafe_hash=False, frozen=False)
class C:
...
@ -200,10 +200,15 @@ The parameters to ``dataclass`` are:
are not included. For example:
``InventoryItem(name='widget', unit_price=3.0, quantity_on_hand=10)``.
- ``eq``: If true (the default), ``__eq__`` and ``__ne__`` methods
will be generated. These compare the class as if it were a tuple of
its fields, in order. Both instances in the comparison must be of
the identical type.
If the class already defines ``__repr__``, this parameter is
ignored.
- ``eq``: If true (the default), an ``__eq__`` method will be
generated. This method compares the class as if it were a tuple of its
fields, in order. Both instances in the comparison must be of the
identical type.
If the class already defines ``__eq__``, this parameter is ignored.
- ``order``: If true (the default is False), ``__lt__``, ``__le__``,
``__gt__``, and ``__ge__`` methods will be generated. These compare
@ -211,9 +216,11 @@ The parameters to ``dataclass`` are:
instances in the comparison must be of the identical type. If
``order`` is true and ``eq`` is false, a ``ValueError`` is raised.
- ``hash``: Either a bool or ``None``. If ``None`` (the default), the
``__hash__`` method is generated according to how ``eq`` and
``frozen`` are set.
If the class already defines any of ``__lt__``, ``__le__``,
``__gt__``, or ``__ge__``, then ``ValueError`` is raised.
- ``unsafe_hash``: If ``False`` (the default), the ``__hash__`` method
is generated according to how ``eq`` and ``frozen`` are set.
If ``eq`` and ``frozen`` are both true, Data Classes will generate a
``__hash__`` method for you. If ``eq`` is true and ``frozen`` is
@ -224,15 +231,36 @@ The parameters to ``dataclass`` are:
to id-based hashing).
Although not recommended, you can force Data Classes to create a
``__hash__`` method with ``hash=True``. This might be the case if your
class is logically immutable but can nonetheless be mutated. This
is a specialized use case and should be considered carefully.
``__hash__`` method with ``unsafe_hash=True``. This might be the
case if your class is logically immutable but can nonetheless be
mutated. This is a specialized use case and should be considered
carefully.
If a class already has an explicitly defined ``__hash__``, the
behavior when adding ``__hash__`` is modified. An explicitly
defined ``__hash__`` is considered to be present when:
- ``__eq__`` is defined in the class and ``__hash__`` is defined
with any value other than ``None``.
- ``__eq__`` is defined in the class and any non-``None``
``__hash__`` is defined.
- ``__eq__`` is not defined on the class, and any ``__hash__`` is
defined.
If ``unsafe_hash`` is true and an explicitly defined ``__hash__``
is present, then ``ValueError`` is raised.
If ``unsafe_hash`` is false and an explicitly defined ``__hash__``
is present, then no ``__hash__`` is added.
See the Python documentation [#]_ for more information.
- ``frozen``: If true (the default is False), assigning to fields will
generate an exception. This emulates read-only frozen instances.
See the discussion below.
If either ``__getattr__`` or ``__setattr__`` is defined in the
class, then ``ValueError`` is raised. See the discussion below.
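For example, the interaction of ``eq``, ``frozen``, and hashing
described above behaves as follows::

    from dataclasses import dataclass, FrozenInstanceError

    @dataclass(frozen=True)
    class Point:
        x: int
        y: int

    p = Point(1, 2)
    hash(p)       # eq=True and frozen=True, so __hash__ is generated
    try:
        p.x = 3   # assigning to fields of a frozen instance fails
    except FrozenInstanceError:
        print("frozen instances are read-only")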
``field``\s may optionally specify a default value, using normal
Python syntax::
@ -533,7 +561,7 @@ Module level helper functions
- ``fields(class_or_instance)``: Returns a tuple of ``Field`` objects
that define the fields for this Data Class. Accepts either a Data
Class, or an instance of a Data Class. Raises `ValueError` if not
Class, or an instance of a Data Class. Raises ``ValueError`` if not
passed a Data Class or instance of one. Does not return
pseudo-fields which are ``ClassVar`` or ``InitVar``.

View File

@ -206,6 +206,59 @@ the backwards compatibility::
return meta(name, resolved_bases, ns, **kwds)
Using ``__class_getitem__`` in C extensions
-------------------------------------------
As mentioned above, ``__class_getitem__`` is automatically a class method
if defined in Python code. To define this method in a C extension, one
should use flags ``METH_O|METH_CLASS``. For example, a simple way to make
an extension class generic is to use a method that simply returns the
original class object, thus fully erasing the type information at runtime,
and deferring all checks to static type checkers only::
typedef struct {
PyObject_HEAD
/* ... your code ... */
} SimpleGeneric;
static PyObject *
simple_class_getitem(PyObject *type, PyObject *item)
{
Py_INCREF(type);
return type;
}
static PyMethodDef simple_generic_methods[] = {
{"__class_getitem__", simple_class_getitem, METH_O|METH_CLASS, NULL},
/* ... other methods ... */
};
PyTypeObject SimpleGeneric_Type = {
PyVarObject_HEAD_INIT(NULL, 0)
"SimpleGeneric",
sizeof(SimpleGeneric),
0,
.tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,
.tp_methods = simple_generic_methods,
};
Such a class can be used as a normal generic in Python type annotations
(a corresponding stub file should be provided for static type checkers,
see PEP 484 for details)::
from simple_extension import SimpleGeneric
from typing import TypeVar
T = TypeVar('T')
Alias = SimpleGeneric[str, T]
class SubClass(SimpleGeneric[T, int]):
...
data: Alias[int] # Works at runtime
more_data: SubClass[str] # Also works at runtime
Backwards compatibility and impact on users who don't use ``typing``
====================================================================

View File

@ -1,12 +1,12 @@
PEP: 561
Title: Distributing and Packaging Type Information
Author: Ethan Smith <ethan@ethanhs.me>
Status: Draft
Status: Accepted
Type: Standards Track
Content-Type: text/x-rst
Created: 09-Sep-2017
Python-Version: 3.7
Post-History: 10-Sep-2017, 12-Sep-2017, 06-Oct-2017, 26-Oct-2017
Post-History: 10-Sep-2017, 12-Sep-2017, 06-Oct-2017, 26-Oct-2017, 12-Apr-2018
Abstract
@ -49,10 +49,11 @@ Definition of Terms
The definition of "MAY", "MUST", and "SHOULD", and "SHOULD NOT" are
to be interpreted as described in RFC 2119.
"inline" - the types are part of the runtime code using PEP 526 and 3107
syntax.
"inline" - the types are part of the runtime code using PEP 526 and
PEP 3107 syntax (the filename ends in ``.py``).
"stubs" - files containing only type information, empty of runtime code.
"stubs" - files containing only type information, empty of runtime code
(the filename ends in ``.pyi``).
"Distributions" are the packaged files which are used to publish and distribute
a release. [3]_
@ -60,13 +61,16 @@ a release. [3]_
"Module" a file containing Python runtime code or stubbed type information.
"Package" a directory or directories that namespace Python modules.
(Note the distinction between packages and distributions. While most
distributions are named after the one package they install, some
distributions install multiple packages.)
Specification
=============
There are several motivations and methods of supporting typing in a package.
This PEP recognizes three (3) types of packages that users of typing wish to
This PEP recognizes three types of packages that users of typing wish to
create:
1. The package maintainer would like to add type information inline.
@ -77,7 +81,7 @@ create:
a package, but the maintainer does not want to include them in the source
of the package.
This PEP aims to support these scenarios and make them simple to add to
This PEP aims to support all three scenarios and make them simple to add to
packaging and deployment.
The two major parts of this specification are the packaging specifications
@ -115,15 +119,15 @@ Distutils option example::
...,
)
For namespace packages, the ``py.typed`` file should be in the submodules of
the namespace, to avoid conflicts and for clarity.
For namespace packages (see PEP 420), the ``py.typed`` file should be in the
submodules of the namespace, to avoid conflicts and for clarity.
This PEP does not support distributing typing information as part of
module-only distributions. The code should be refactored into a package-based
distribution and indicate that the package supports typing as described
above.
Stub Only Packages
Stub-only Packages
''''''''''''''''''
For package maintainers wishing to ship stub files containing all of their
@ -131,21 +135,26 @@ type information, it is preferred that the ``*.pyi`` stubs are alongside the
corresponding ``*.py`` files. However, the stubs can also be put in a separate
package and distributed separately. Third parties can also find this method
useful if they wish to distribute stub files. The name of the stub package
MUST follow the scheme ``foopkg_stubs`` for type stubs for the package named
``foopkg``. The normal resolution order of checking ``*.pyi`` before ``*.py``
will be maintained.
MUST follow the scheme ``foopkg-stubs`` for type stubs for the package named
``foopkg``. Note that for stub-only packages adding a ``py.typed`` marker is not
needed since the name ``*-stubs`` is enough to indicate it is a source of typing
information.
Third parties seeking to distribute stub files are encouraged to contact the
maintainer of the package about distribution alongside the package. If the
maintainer does not wish to maintain or package stub files or type information
inline, then a third party stub only package can be created.
inline, then a third party stub-only package can be created.
In addition, stub-only distributions SHOULD indicate which version(s)
of the runtime package are supported by indicating the runtime distribution's
version(s) through normal dependency data. For example, the
stub package ``flyingcircus_stubs`` can indicate the versions of the
stub package ``flyingcircus-stubs`` can indicate the versions of the
runtime ``flyingcircus`` distribution it supports through ``install_requires``
in distutils-based tools, or the equivalent in other packaging tools.
in distutils-based tools, or the equivalent in other packaging tools. Note
that in pip 9.0, updating ``flyingcircus-stubs`` will also upgrade
``flyingcircus`` unless the ``--upgrade-strategy=only-if-needed`` flag is
used; in pip 10.0 that strategy is the default behavior.
Type Checker Module Resolution Order
@ -158,13 +167,14 @@ resolve modules containing type information:
2. Stubs or Python source manually put in the beginning of the path. Type
checkers SHOULD provide this to allow the user complete control of which
stubs to use, and patch broken stubs/inline types from packages.
stubs to use, and to patch broken stubs/inline types from packages.
In mypy the ``$MYPYPATH`` environment variable can be used for this.
3. Stub packages - these packages can supersede the installed packages.
They can be found at ``foopkg_stubs`` for package ``foopkg``.
3. Stub packages - these packages SHOULD supersede any installed inline
package. They can be found at ``foopkg-stubs`` for package ``foopkg``.
4. Inline packages - if there is nothing overriding the installed
package, and it opts into type checking, inline types SHOULD be used.
package, *and* it opts into type checking, inline types SHOULD be used.
5. Typeshed (if used) - Provides the stdlib types and several third party
libraries.
@ -177,27 +187,77 @@ that the type checker allow for the user to point to a particular Python
binary, in case it is not in the path.
Partial Stub Packages
---------------------
Many stub packages will only have part of the type interface for libraries
completed, especially initially. For the benefit of type checking and code
editors, packages can be "partial". This means modules not found in the stub
package SHOULD be searched for in parts four and five of the module resolution
order above, namely inline packages and typeshed.
Type checkers should merge the stub package and runtime package or typeshed
directories. This can be thought of as the functional equivalent of copying the
stub package into the same directory as the corresponding runtime package or
typeshed folder and type checking the combined directory structure. Thus type
checkers MUST maintain the normal resolution order of checking ``*.pyi`` before
``*.py`` files.
Stub packages can opt into declaring themselves as partial by including
``partial\n`` in the package's ``py.typed`` file.
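As a sketch (names hypothetical, using setuptools), a partial stub-only
distribution for a ``flyingcircus`` package might be packaged as follows::

    # Layout:
    #   setup.py
    #   flyingcircus-stubs/
    #       __init__.pyi
    #       py.typed          # contains the single line "partial\n"

    from setuptools import setup

    setup(
        name="flyingcircus-stubs",
        version="1.0",
        install_requires=["flyingcircus"],  # supported runtime version(s)
        packages=["flyingcircus-stubs"],
        package_data={"flyingcircus-stubs": ["*.pyi", "py.typed"]},
    )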
Implementation
==============
The proposed scheme of indicating support for typing is completely backwards
compatible, and requires no modification to tooling. A sample package with
inline types is available [typed_pkg]_, as well as a sample package checker
[pkg_checker]_ which reads the metadata of installed packages and reports on
their status as either not typed, inline typed, or a stub package.
compatible, and requires no modification to package tooling. A sample package
with inline types is available [typed_package]_, as well as a [stub_package]_.
A sample package checker [pkg_checker]_, which reads the metadata of installed
packages and reports on their status as either not typed, inline typed, or a
stub package, is also available.
The mypy type checker has an implementation of PEP 561 searching which can be
read about in the mypy docs [4]_.
[numpy-stubs]_ is an example of a real stub-only package for the numpy
distribution.
Acknowledgements
================
This PEP would not have been possible without the ideas, feedback, and support
of Ivan Levkivskyi, Jelle Zijlstra, Nick Coghlan, Daniel F Moisset, Nathaniel
Smith, and Guido van Rossum.
of Ivan Levkivskyi, Jelle Zijlstra, Nick Coghlan, Daniel F Moisset, Andrey
Vlasovskikh, Nathaniel Smith, and Guido van Rossum.
Version History
===============
* 2018-07-09
* Add links to sample stub-only packages
* 2018-06-19
* Partial stub packages can look at typeshed as well as runtime packages
* 2018-05-15
* Add partial stub package spec.
* 2018-04-09
* Add reference to mypy implementation
* Clarify stub package priority.
* 2018-02-02
* Change stub-only package suffix to be -stubs not _stubs.
* Note that py.typed is not needed for stub-only packages.
* Add note about pip and upgrading stub packages.
* 2017-11-12
* Rewritten to use existing tooling only
@ -208,7 +268,7 @@ Version History
* Specification re-written to use package metadata instead of distribution
metadata.
* Removed stub only packages and merged into third party packages spec.
* Removed stub-only packages and merged into third party packages spec.
* Removed suggestion for typecheckers to consider checking runtime versions
* Implementations updated to reflect PEP changes.
@ -238,9 +298,18 @@ References
.. [3] PEP 426 definitions
(https://www.python.org/dev/peps/pep-0426/)
.. [typed_pkg] Sample typed package
.. [4] Example implementation in a type checker
(https://mypy.readthedocs.io/en/latest/installed_packages.html)
.. [stub_package] A stub-only package
(https://github.com/ethanhs/stub-package)
.. [typed_package] Sample typed package
(https://github.com/ethanhs/sample-typed-package)
.. [numpy-stubs] Stubs for numpy
(https://github.com/numpy/numpy-stubs)
.. [pkg_checker] Sample package checker
(https://github.com/ethanhs/check_typedpkg)

View File

@ -72,7 +72,7 @@ imports. Consider a simple example::
# main.py
import lib
lib.submodule.HeavyClass # prints "Submodule loaded"
lib.submod.HeavyClass # prints "Submodule loaded"
There is a related proposal PEP 549 that proposes to support instance
properties for a similar functionality. The difference is this PEP proposes

View File

@ -5,14 +5,14 @@ Last-Modified: $Date$
Author: Dustin Ingram <di@di.codes>
BDFL-Delegate: Daniel Holth
Discussions-To: distutils-sig <distutils-sig at python.org>
Status: Draft
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 1-Dec-2017
Python-Version: 3.x
Post-History:
Replaces: 345
Resolution: https://mail.python.org/pipermail/distutils-sig/2018-February/032014.html
Abstract
========
@ -81,6 +81,14 @@ Name
The specification for the format of this field is now identical to the
distribution name specification defined in PEP 508.
Description
:::::::::::
In addition to the ``Description`` header field, the distribution's
description may instead be provided in the message body (i.e., after a
completely blank line following the headers, with no indentation or other
special formatting necessary).
Version Specifiers
==================
@ -124,6 +132,8 @@ as follows:
single list containing all the original values for the given key;
#. The ``Keywords`` field should be converted to a list by splitting the
original value on whitespace characters;
#. The message body, if present, should be stored as the value of the
   ``description`` key;
#. The result should be stored as a string-keyed dictionary.
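A sketch of this conversion using the stdlib email parser (the
multiple-use field names shown are an illustrative subset)::

    import email.parser

    MULTIPLE_USE = {"classifier", "requires_dist"}  # illustrative subset

    def metadata_to_json(text):
        msg = email.parser.Parser().parsestr(text)
        result = {}
        for key, value in msg.items():
            key = key.lower().replace("-", "_")
            if key in MULTIPLE_USE:
                result.setdefault(key, []).append(value)
            elif key == "keywords":
                result[key] = value.split()
            else:
                result[key] = value
        body = msg.get_payload()
        if body:
            result["description"] = body
        return result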
Summary of Differences From PEP 345

View File

@ -712,6 +712,15 @@ This proposal was deferred to Python 3.8+ because of the following:
ctx.run(func)
3. If ``Context`` was mutable it would mean that context variables
could be mutated separately (or concurrently) from the code that
runs within the context. That would be similar to obtaining a
reference to a running Python frame object and modifying its
``f_locals`` from another OS thread. Having one single way to
assign values to context variables makes contexts conceptually
simpler and more predictable, while keeping the door open for
future performance optimizations.
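For example, with this design code outside a context cannot change what
code running within that context observes::

    import contextvars

    var = contextvars.ContextVar("var", default="spam")
    ctx = contextvars.copy_context()

    var.set("ham")             # affects only the current context
    print(ctx.run(var.get))    # prints 'spam'; the snapshot is unchanged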
Having initial values for ContextVars
-------------------------------------

379
pep-0571.rst Normal file
View File

@ -0,0 +1,379 @@
PEP: 571
Title: The manylinux2010 Platform Tag
Version: $Revision$
Last-Modified: $Date$
Author: Mark Williams <mrw@enotuniq.org>,
Geoffrey Thomas <geofft@ldpreload.com>,
Thomas Kluyver <thomas@kluyver.me.uk>
BDFL-Delegate: Nick Coghlan <ncoghlan@gmail.com>
Discussions-To: Distutils SIG <distutils-sig@python.org>
Status: Active
Type: Informational
Content-Type: text/x-rst
Created:
Post-History:
Resolution: https://mail.python.org/pipermail/distutils-sig/2018-April/032156.html
Abstract
========
This PEP proposes the creation of a ``manylinux2010`` platform tag to
succeed the ``manylinux1`` tag introduced by PEP 513 [1]_. It also
proposes that PyPI and ``pip`` both be updated to support uploading,
downloading, and installing ``manylinux2010`` distributions on compatible
platforms.
Rationale
=========
True to its name, the ``manylinux1`` platform tag has made the
installation of binary extension modules a reality on many Linux
systems. Libraries like ``cryptography`` [2]_ and ``numpy`` [3]_ are
more accessible to Python developers now that their installation on
common architectures does not depend on fragile development
environments and build toolchains.
``manylinux1`` wheels achieve their portability by allowing the
extension modules they contain to link against only a small set of
system-level shared libraries that export versioned symbols old enough
to benefit from backwards-compatibility policies. Extension modules
in a ``manylinux1`` wheel that rely on ``glibc``, for example, must be
built against version 2.5 or earlier; they may then be run on systems
that provide a more recent ``glibc`` version that still exports the
required symbols at version 2.5.
PEP 513 drew its whitelisted shared libraries and their symbol
versions from CentOS 5.11, which was the oldest supported CentOS
release at the time of its writing. Unfortunately, CentOS 5.11
reached its end-of-life on March 31st, 2017 with a clear warning
against its continued use. [4]_ No further updates, such as security
patches, will be made available. This means that its packages will
remain at obsolete versions that hamper the efforts of Python software
packagers who use the ``manylinux1`` Docker image.
CentOS 6 is now the oldest supported CentOS release, and will receive
maintenance updates through November 30th, 2020. [5]_ We propose that
a new PEP 425-style [6]_ platform tag called ``manylinux2010`` be derived
from CentOS 6 and that the ``manylinux`` toolchain, PyPI, and ``pip``
be updated to support it.
This was originally proposed as ``manylinux2``, but the versioning has
been changed to use calendar years (also known as CalVer [23]_). This
makes it easier to define future *manylinux* tags out of order: for
example, a hypothetical ``manylinux2017`` standard may be defined via
a new PEP before ``manylinux2014``, or a ``manylinux2007`` standard
might be defined that targets systems older than this PEP but newer
than ``manylinux1``.
Calendar versioning also gives a rough idea of which Linux
distribution versions support which tag: ``manylinux2010`` will work
on most distribution versions released since 2010. This is only an
approximation, however: the actual compatibility rules are defined
below, and some newer distributions may not meet them.
The ``manylinux2010`` policy
============================
The following criteria determine a ``linux`` wheel's eligibility for
the ``manylinux2010`` tag:
1. The wheel may only contain binary executables and shared objects
compiled for one of the two architectures supported by CentOS 6:
x86_64 or i686. [5]_
2. The wheel's binary executables or shared objects may not link
against externally-provided libraries except those in the following
whitelist: ::
libgcc_s.so.1
libstdc++.so.6
libm.so.6
libdl.so.2
librt.so.1
libcrypt.so.1
libc.so.6
libnsl.so.1
libutil.so.1
libpthread.so.0
libresolv.so.2
libX11.so.6
libXext.so.6
libXrender.so.1
libICE.so.6
libSM.so.6
libGL.so.1
libgobject-2.0.so.0
libgthread-2.0.so.0
libglib-2.0.so.0
This list is identical to the externally-provided libraries
whitelisted for ``manylinux1``, minus ``libncursesw.so.5`` and
``libpanelw.so.5``. [7]_ ``libpythonX.Y`` remains ineligible for
inclusion for the same reasons outlined in PEP 513.
On Debian-based systems, these libraries are provided by the packages:
============ =======================================================
Package Libraries
============ =======================================================
libc6 libdl.so.2, libresolv.so.2, librt.so.1, libc.so.6,
libpthread.so.0, libm.so.6, libutil.so.1, libcrypt.so.1,
libnsl.so.1
libgcc1 libgcc_s.so.1
libgl1 libGL.so.1
libglib2.0-0 libgobject-2.0.so.0, libgthread-2.0.so.0, libglib-2.0.so.0
libice6 libICE.so.6
libsm6 libSM.so.6
libstdc++6 libstdc++.so.6
libx11-6 libX11.so.6
libxext6 libXext.so.6
libxrender1 libXrender.so.1
============ =======================================================
On RPM-based systems, they are provided by these packages:
============ =======================================================
Package Libraries
============ =======================================================
glib2 libglib-2.0.so.0, libgthread-2.0.so.0, libgobject-2.0.so.0
glibc libresolv.so.2, libutil.so.1, libnsl.so.1, librt.so.1,
libcrypt.so.1, libpthread.so.0, libdl.so.2, libm.so.6,
libc.so.6
libICE libICE.so.6
libX11 libX11.so.6
libXext libXext.so.6
libXrender libXrender.so.1
libgcc libgcc_s.so.1
libstdc++ libstdc++.so.6
mesa libGL.so.1
============ =======================================================
3. If the wheel contains binary executables or shared objects linked
against any whitelisted libraries that also export versioned
symbols, they may only depend on the following maximum versions::
GLIBC_2.12
CXXABI_1.3.3
GLIBCXX_3.4.13
GCC_4.3.0
As an example, ``manylinux2010`` wheels may include binary artifacts
that require ``glibc`` symbols at version ``GLIBC_2.4``, because
this is an earlier version than the maximum of ``GLIBC_2.12``.
4. If a wheel is built for any version of CPython 2 or CPython
versions 3.0 up to and including 3.2, it *must* include a CPython
ABI tag indicating its Unicode ABI. A ``manylinux2010`` wheel built
against Python 2, then, must include either the ``cpy27mu`` tag
indicating it was built against an interpreter with the UCS-4 ABI
or the ``cpy27m`` tag indicating an interpreter with the UCS-2
ABI. [8]_ [9]_
5. A wheel *must not* require the ``PyFPE_jbuf`` symbol. This is
achieved by building it against a Python compiled *without* the
``--with-fpectl`` ``configure`` flag.
Compilation of Compliant Wheels
===============================
Like ``manylinux1``, the ``auditwheel`` tool adds ``manylinux2010``
platform tags to ``linux`` wheels built by ``pip wheel`` or
``bdist_wheel`` in a ``manylinux2010`` Docker container.
Docker Images
-------------
``manylinux2010`` Docker images based on CentOS 6 x86_64 and i686 are
provided for building binary ``linux`` wheels that can reliably be
converted to ``manylinux2010`` wheels. [10]_ These images come with a
full compiler suite installed (``gcc``, ``g++``, and ``gfortran``
4.8.2) as well as the latest releases of Python and ``pip``.
Compatibility with kernels that lack ``vsyscall``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A Docker container assumes that its userland is compatible with its
host's kernel. Unfortunately, an increasingly common kernel
configuration breaks this assumption for x86_64 CentOS 6 Docker
images.
Versions 2.14 and earlier of ``glibc`` require the kernel provide an
archaic system call optimization known as ``vsyscall`` on x86_64. [11]_
To effect the optimization, the kernel maps a read-only page of
frequently-called system calls -- most notably ``time(2)`` -- into
each process at a fixed memory location. ``glibc`` then invokes these
system calls by dereferencing a function pointer to the appropriate
offset into the ``vsyscall`` page and calling it. This avoids the
overhead associated with invoking the kernel that affects normal
system call invocation. ``vsyscall`` has long been deprecated in
favor of an equivalent mechanism known as vDSO, or "virtual dynamic
shared object", in which the kernel instead maps a relocatable virtual
shared object containing the optimized system calls into each
process. [12]_
The ``vsyscall`` page has serious security implications because it
does not participate in address space layout randomization (ASLR).
Its predictable location and contents make it a useful source of
gadgets used in return-oriented programming attacks. [13]_ At the same
time, its elimination breaks the x86_64 ABI, because ``glibc``
versions that depend on ``vsyscall`` suffer from segmentation faults
when attempting to dereference a system call pointer into a
non-existent page. As a compromise, Linux 3.1 implemented an
"emulated" ``vsyscall`` that reduced the executable code, and thus the
material for ROP gadgets, mapped into the process. [14]_
``vsyscall=emulated`` has been the default configuration in most
distributions' kernels for many years.
Unfortunately, ``vsyscall`` emulation still exposes predictable code
at a reliable memory location, and continues to be useful for
return-oriented programming. [15]_ Because most distributions have now
upgraded to ``glibc`` versions that do not depend on ``vsyscall``,
they are beginning to ship kernels that do not support ``vsyscall`` at
all. [16]_
CentOS 5.11 and 6 both include versions of ``glibc`` that depend on
the ``vsyscall`` page (2.5 and 2.12.2 respectively), so containers
based on either cannot run under kernels provided with many
distributions' upcoming releases. [17]_ If Travis CI, for example,
begins running jobs under
a kernel that does not provide the ``vsyscall`` interface, Python
packagers will not be able to use our Docker images there to build
``manylinux`` wheels. [19]_
We have derived a patch from the ``glibc`` git repository that
backports the removal of all dependencies on ``vsyscall`` to the
version of ``glibc`` included with our ``manylinux2010`` image. [20]_
Rebuilding ``glibc``, and thus building the ``manylinux2010`` image itself,
still requires a host kernel that provides the ``vsyscall`` mechanism,
but the resulting image can be both run on hosts that provide it and
those that do not. Because the ``vsyscall`` interface is an
optimization that is only applied to running processes, the
``manylinux2010`` wheels built with this modified image should be
identical to those built on an unmodified CentOS 6 system. Also, the
``vsyscall`` problem applies only to x86_64; it is not part of the
i686 ABI.
Auditwheel
----------
The ``auditwheel`` tool has also been updated to produce
``manylinux2010`` wheels. [21]_ Its behavior and purpose are otherwise
unchanged from PEP 513.
Platform Detection for Installers
=================================
Platforms may define a ``manylinux2010_compatible`` boolean attribute on
the ``_manylinux`` module described in PEP 513. A platform is
considered incompatible with ``manylinux2010`` if the attribute is
``False``.
If the ``_manylinux`` module is not found, or it does not have the attribute
``manylinux2010_compatible``, tools may fall back to checking for glibc. If the
platform has glibc 2.12 or newer, it is assumed to be compatible unless the
``_manylinux`` module says otherwise.
Specifically, the algorithm we propose is::
def is_manylinux2010_compatible():
# Only Linux, and only x86-64 / i686
from distutils.util import get_platform
if get_platform() not in ["linux-x86_64", "linux-i686"]:
return False
# Check for presence of _manylinux module
try:
import _manylinux
return bool(_manylinux.manylinux2010_compatible)
except (ImportError, AttributeError):
# Fall through to heuristic check below
pass
# Check glibc version. CentOS 6 uses glibc 2.12.
# PEP 513 contains an implementation of this function.
return have_compatible_glibc(2, 12)
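
For example, a distribution that knows its platform is incompatible
despite having a new enough glibc can opt out by shipping a
``_manylinux`` module (the override mechanism itself is specified in
PEP 513); a minimal sketch::

    # Hypothetical contents of a distributor-provided _manylinux.py:
    manylinux2010_compatible = False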
Backwards compatibility with ``manylinux1`` wheels
==================================================
As explained in PEP 513, the specified symbol versions for
``manylinux1`` whitelisted libraries constitute an *upper bound*. The
same is true for the symbol versions defined for ``manylinux2010`` in
this PEP. As a result, ``manylinux1`` wheels are considered
``manylinux2010`` wheels. A ``pip`` that recognizes the ``manylinux2010``
platform tag will thus install ``manylinux1`` wheels for
``manylinux2010`` platforms -- even when that tag is explicitly
requested -- when no ``manylinux2010`` wheels are available. [22]_
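
As an illustration (not part of this specification), an installer
could implement this preference by ranking the candidate platform tags
from most to least specific, reusing ``is_manylinux2010_compatible()``
from above and ``is_manylinux1_compatible()`` from PEP 513::

    def preferred_linux_tags(arch="x86_64"):
        # Most-preferred tag first; manylinux1 wheels remain acceptable
        # on manylinux2010 platforms because their symbol versions are
        # a subset of those permitted here.
        tags = []
        if is_manylinux2010_compatible():
            tags.append("manylinux2010_" + arch)
        if is_manylinux1_compatible():
            tags.append("manylinux1_" + arch)
        return tags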
PyPI Support
============
PyPI should permit wheels containing the ``manylinux2010`` platform tag
to be uploaded in the same way that it permits ``manylinux1``. It
should not attempt to verify the compatibility of ``manylinux2010``
wheels.
References
==========
.. [1] PEP 513 -- A Platform Tag for Portable Linux Built Distributions
(https://www.python.org/dev/peps/pep-0513/)
.. [2] pyca/cryptography
(https://cryptography.io/)
.. [3] numpy
(https://numpy.org)
.. [4] CentOS 5.11 EOL announcement
(https://lists.centos.org/pipermail/centos-announce/2017-April/022350.html)
.. [5] CentOS Product Specifications
(https://web.archive.org/web/20180108090257/https://wiki.centos.org/About/Product)
.. [6] PEP 425 -- Compatibility Tags for Built Distributions
(https://www.python.org/dev/peps/pep-0425/)
.. [7] ncurses 5 -> 6 transition means we probably need to drop some
libraries from the manylinux whitelist
(https://github.com/pypa/manylinux/issues/94)
.. [8] PEP 3149
(https://www.python.org/dev/peps/pep-3149/)
.. [9] SOABI support for Python 2.X and PyPy
(https://github.com/pypa/pip/pull/3075)
.. [10] manylinux2 Docker images
(https://hub.docker.com/r/markrwilliams/manylinux2/)
.. [11] On vsyscalls and the vDSO
(https://lwn.net/Articles/446528/)
.. [12] vdso(7)
(http://man7.org/linux/man-pages/man7/vdso.7.html)
.. [13] Framing Signals -- A Return to Portable Shellcode
(http://www.cs.vu.nl/~herbertb/papers/srop_sp14.pdf)
.. [14] ChangeLog-3.1
(https://www.kernel.org/pub/linux/kernel/v3.x/ChangeLog-3.1)
.. [15] Project Zero: Three bypasses and a fix for one of Flash's Vector.<*> mitigations
(https://googleprojectzero.blogspot.com/2015/08/three-bypasses-and-fix-for-one-of.html)
.. [16] linux: activate CONFIG_LEGACY_VSYSCALL_NONE ?
(https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=852620)
.. [17] [Wheel-builders] Heads-up re: new kernel configurations breaking the manylinux docker image
(https://mail.python.org/pipermail/wheel-builders/2016-December/000239.html)
.. [18] No longer used
.. [19] Travis CI
(https://travis-ci.org/)
.. [20] remove-vsyscall.patch
(https://github.com/markrwilliams/manylinux/commit/e9493d55471d153089df3aafca8cfbcb50fa8093#diff-3eda4130bdba562657f3ec7c1b3f5720)
.. [21] auditwheel manylinux2 branch
(https://github.com/markrwilliams/auditwheel/tree/manylinux2)
.. [22] pip manylinux2 branch
(https://github.com/markrwilliams/pip/commits/manylinux2)
.. [23] Calendar Versioning
(http://calver.org/)
Copyright
=========
This document has been placed into the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

1262
pep-0572.rst Normal file

File diff suppressed because it is too large Load Diff

568
pep-0573.rst Normal file
View File

@ -0,0 +1,568 @@
PEP: 573
Title: Module State Access from C Extension Methods
Version: $Revision$
Last-Modified: $Date$
Author: Petr Viktorin <encukou@gmail.com>,
Nick Coghlan <ncoghlan@gmail.com>,
Eric Snow <ericsnowcurrently@gmail.com>,
Marcel Plch <gmarcel.plch@gmail.com>
Discussions-To: import-sig@python.org
Status: Active
Type: Process
Content-Type: text/x-rst
Created: 02-Jun-2016
Python-Version: 3.8
Post-History:
Abstract
========
This PEP proposes to add a way for CPython extension methods to access context such as
the state of the modules they are defined in.
This will allow extension methods to use direct pointer dereferences
rather than ``PyState_FindModule`` for looking up module state, reducing or eliminating the
performance cost of using module-scoped state over process-global state.
This fixes one of the remaining roadblocks for adoption of PEP 3121 (Extension
module initialization and finalization) and PEP 489
(Multi-phase extension module initialization).
Additionally, support for easier creation of immutable exception classes is added.
This removes the need for keeping per-module state if it would only be used
for exception classes.
While this PEP takes an additional step towards fully solving the problems that PEP 3121 and PEP 489 started
tackling, it does not attempt to resolve *all* remaining concerns. In particular, accessing the module state from slot methods (``nb_add``, etc) remains slower than accessing that state from other extension methods.
Terminology
===========
Process-Global State
--------------------
C-level static variables. Since this is very low-level
memory storage, it must be managed carefully.
Per-module State
----------------
State local to a module object, allocated dynamically as part of a
module object's initialization. This isolates the state from other
instances of the module (including those in other subinterpreters).
Accessed by ``PyModule_GetState()``.
Static Type
-----------
A type object defined as a C-level static variable, i.e. a compiled-in type object.
A static type needs to be shared between module instances and has no
information about which module it belongs to.
Static types do not have ``__dict__`` (although their instances might).
Heap Type
---------
A type object created at run time.
Rationale
=========
PEP 489 introduced a new way to initialize extension modules, which brings
several advantages to extensions that implement it:
* The extension modules behave more like their Python counterparts.
* The extension modules can easily support loading into pre-existing
module objects, which paves the way for extension module support for
``runpy`` or for systems that enable extension module reloading.
* Loading multiple modules from the same extension is possible, which
makes testing module isolation (a key feature for proper sub-interpreter
support) possible from a single interpreter.
The biggest hurdle for adoption of PEP 489 is allowing access to module state
from methods of extension types.
Currently, the way to access this state from extension methods is by looking up the module via
``PyState_FindModule`` (in contrast to module level functions in extension modules, which
receive a module reference as an argument).
However, ``PyState_FindModule`` queries the thread-local state, making it relatively
costly compared to C-level process-global access, and consequently deterring module authors from using it.
Also, ``PyState_FindModule`` relies on the assumption that in each
subinterpreter, there is at most one module corresponding to
a given ``PyModuleDef``. This does not align well with Python's import
machinery. Since PEP 489 aimed to fix that, the assumption does
not hold for modules that use multi-phase initialization, so
``PyState_FindModule`` is unavailable for these modules.
A faster, safer way of accessing module-level state from extension methods
is needed.
Immutable Exception Types
-------------------------
For isolated modules to work, any class whose methods touch module state
must be a heap type, so that each instance of a module can have its own
type object. With the changes proposed in this PEP, heap type instances will
have access to module state without global registration. But, to create
instances of heap types, one will need the module state in order to
get the type object corresponding to the appropriate module.
In short, heap types are "viral": anything that "touches" them must itself be
a heap type.
Currently, most exception types, apart from the ones in ``builtins``, are
heap types. This is likely simply because there is a convenient way
to create them: ``PyErr_NewException``.
Heap types generally have a mutable ``__dict__``.
In most cases, this mutability is harmful. For example, exception types
from the ``sqlite3`` module are mutable and shared across subinterpreters.
This allows "smuggling" values to other subinterpreters via attributes of
``sqlite3.Error``.
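
As a concrete illustration (a hypothetical snippet, which succeeds on
current versions of Python)::

    import sqlite3

    # The attribute set here becomes visible wherever the shared
    # sqlite3.Error type object is visible, including in other
    # subinterpreters within the same process.
    sqlite3.Error.smuggled_value = "hello"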
Moreover, since raising exceptions is a common operation, and heap types
will be "viral", ``PyErr_NewException`` will tend to "infect" the module
with "heap type-ness", at least if the module decides to play well with
subinterpreters/isolation.
Many modules could go without module state
entirely if the exception classes were immutable.
To solve this problem, a new function for creating immutable exception types
is proposed.
Background
===========
The implementation of a Python method may need access to one or more of
the following pieces of information:
* The instance it is called on (``self``)
* The underlying function
* The class the method was defined in
* The corresponding module
* The module state
In Python code, the Python-level equivalents may be retrieved as::
import sys
def meth(self):
instance = self
module_globals = globals()
module_object = sys.modules[__name__] # (1)
underlying_function = Foo.meth # (1)
defining_class = Foo # (1)
defining_class = __class__ # (2)
.. note::
The defining class is not ``type(self)``, since ``type(self)`` might
be a subclass of ``Foo``.
The statements marked (1) implicitly rely on name-based lookup via the function's ``__globals__``:
either the ``Foo`` attribute to access the defining class and Python function object, or ``__name__`` to find the module object in ``sys.modules``.
In Python code, this is feasible, as ``__globals__`` is set appropriately when the function definition is executed, and
even if the namespace has been manipulated to return a different object, at worst an exception will be raised.
The ``__class__`` closure, (2), is a safer way to get the defining class, but it still relies on ``__closure__`` being set appropriately.
By contrast, extension methods are typically implemented as normal C functions.
This means that they only have access to their arguments and C-level thread-local
and process-global states. Traditionally, many extension modules have stored
their shared state in C-level process globals, causing problems when:
* running multiple initialize/finalize cycles in the same process
* reloading modules (e.g. to test conditional imports)
* loading extension modules in subinterpreters
PEP 3121 attempted to resolve this by offering the ``PyState_FindModule`` API, but this still has significant problems when it comes to extension methods (rather than module level functions):
* it is markedly slower than directly accessing C-level process-global state
* there is still some inherent reliance on process global state that means it still doesn't reliably handle module reloading
It's also the case that when looking up a C-level struct such as module state, supplying
an unexpected object layout can crash the interpreter, so it's significantly more important to ensure that extension
methods receive the kind of object they expect.
Proposal
========
Currently, a bound extension method (``PyCFunction`` or ``PyCFunctionWithKeywords``) receives only
``self``, and (if applicable) the supplied positional and keyword arguments.
While module-level extension functions already receive access to the defining module object via their
``self`` argument, methods of extension types don't have that luxury: they receive the bound instance
via ``self``, and hence have no direct access to the defining class or the module level state.
The additional module level context described above can be made available with two changes.
Both additions are optional; extension authors need to opt in to start
using them:
* Add a pointer to the module to heap type objects.
* Pass the defining class to the underlying C function.
The defining class is readily available at the time the built-in
method object (``PyCFunctionObject``) is created, so it can be stored
in a new struct that extends ``PyCFunctionObject``.
The module state can then be retrieved from the module object via
``PyModule_GetState``.
Note that this proposal implies that any type whose method needs to access
per-module state must be a heap type, rather than a static type.
This is necessary to support loading multiple module objects from a single
extension: a static type, as a C-level global, has no information about
which module it belongs to.
Slot methods
------------
The above changes don't cover slot methods, such as ``tp_iter`` or ``nb_add``.
The problem with slot methods is that their C API is fixed, so we can't
simply add a new argument to pass in the defining class.
Two possible solutions have been proposed to this problem:
* Look up the class by walking the MRO.
This is potentially expensive, but will be useful if performance is not
a problem (such as when raising a module-level exception).
* Store a pointer to the defining class of each slot in a separate table,
  ``__typeslots__`` [#typeslots-mail]_. This is technically feasible and fast,
but quite invasive.
Due to the invasiveness of the latter approach, this PEP proposes adding an MRO walking
helper for use in slot method implementations, deferring the more complex alternative
as a potential future optimisation. Modules affected by this concern also have the
option of using thread-local state or PEP 567 context variables, or else defining their
own reload-friendly lookup caching scheme.
Immutable Exception Types
-------------------------
To facilitate creating static exception classes, a new function is proposed:
``PyErr_PrepareImmutableException``. It will work similarly to ``PyErr_NewExceptionWithDoc``
but will take a ``PyTypeObject **`` pointer, which points to a ``PyTypeObject *`` that is
either ``NULL`` or an initialized ``PyTypeObject``.
This pointer may be declared in process-global state. The function will then
allocate the object, taking care not to overwrite an already
initialized exception type.
The extra indirection makes it possible to make ``PyErr_PrepareImmutableException``
part of the stable ABI by having the Python interpreter, rather than extension code,
allocate the ``PyTypeObject``.
Specification
=============
Adding module references to heap types
--------------------------------------
The ``PyHeapTypeObject`` struct will get a new member, ``PyObject *ht_module``,
that can store a pointer to the module object for which the type was defined.
It will be ``NULL`` by default, and should not be modified after the type
object is created.
A new factory method will be added for creating heap types::
PyObject* PyType_FromModuleAndSpec(PyObject *module,
PyType_Spec *spec,
PyObject *bases)
This acts the same as ``PyType_FromSpecWithBases``, and additionally sets
``ht_module`` to the provided module object.
Additionally, an accessor, ``PyObject * PyType_GetModule(PyTypeObject *)``
will be provided.
It will return the ``ht_module`` if a heap type with the module pointer set
is passed in; otherwise, it will set ``SystemError`` and return ``NULL``.
Usually, creating a class with ``ht_module`` set will create a reference
cycle involving the class and the module.
This is not a problem, as tearing down modules is not a performance-sensitive
operation (and module-level functions typically also create reference cycles).
The existing "set all module globals to None" code that breaks function cycles
through ``f_globals`` will also break the new cycles through ``ht_module``.
Passing the defining class to extension methods
-----------------------------------------------
A new style of C-level functions will be added to the current selection of
``PyCFunction`` and ``PyCFunctionWithKeywords``::
PyObject *PyCMethod(PyObject *self,
PyTypeObject *defining_class,
PyObject *args, PyObject *kwargs)
A new method object flag, ``METH_METHOD``, will be added to signal that
the underlying C function is ``PyCMethod``.
To hold the extra information, a new structure extending ``PyCFunctionObject``
will be added::
typedef struct {
PyCFunctionObject func;
PyTypeObject *mm_class; /* Passed as 'defining_class' arg to the C func */
} PyCMethodObject;
To allow passing the defining class to the underlying C function, a change
to the private API is required: ``_PyMethodDef_RawFastCallDict`` and
``_PyMethodDef_RawFastCallKeywords`` will now receive ``PyTypeObject *cls``
as one of their arguments.
A new macro, ``PyCFunction_GET_CLASS(cls)``, will be added for easier access to ``mm_class``.
Method construction and calling code will be updated to honor
``METH_METHOD``.
Argument Clinic
---------------
To support passing the defining class to methods using Argument Clinic,
a new converter will be added to clinic.py: ``defining_class``.
Each method may only have one argument using this converter, and it must
appear after ``self``, or, if ``self`` is not used, as the first argument.
The argument will be of type ``PyTypeObject *``.
When used, Argument Clinic will select ``METH_METHOD`` as the calling
convention.
The argument will not appear in ``__text_signature__``.
This will be compatible with ``__init__`` and ``__new__`` methods, where an
MRO walker will be used to pass the defining class from clinic-generated
code to the user's function.
Slot methods
------------
To allow access to per-module state from slot methods, an MRO walker
will be implemented::
PyTypeObject *PyType_DefiningTypeFromSlotFunc(PyTypeObject *type,
int slot, void *func)
The walker will go through the bases of the heap-allocated ``type``
and search for the class that defines ``func`` at its ``slot``.
``func`` need not be inherited by ``type``; the only requirement
for the walker to find the defining class is that the defining class
must be heap-allocated.
On failure, an exception is set and ``NULL`` is returned.
Static exceptions
-----------------
A new function will be added::
int PyErr_PrepareImmutableException(PyTypeObject **exc,
const char *name,
const char *doc,
PyObject *base)
Creates an immutable exception type which can be shared
across multiple module objects.
If the type already exists (as determined by the process-global pointer
``*exc``), the initialization is skipped and the type is only increfed.
If ``*exc`` is ``NULL``, the function will
allocate a new exception type and initialize it using the given parameters,
the same way ``PyType_FromSpecAndBases`` would.
The ``doc`` and ``base`` arguments may be ``NULL``, defaulting to a
missing docstring and ``PyExc_Exception`` base class, respectively.
The exception type's ``tp_flags`` will be set to values common to
built-in exceptions and the ``Py_TPFLAGS_HEAP_IMMUTABLE`` flag (see below)
will be set.
On failure, ``PyErr_PrepareImmutableException`` will set an exception
and return -1.
If called with an initialized exception type (``*exc``
is non-NULL), the function will do nothing but incref ``*exc``.
A new flag, ``Py_TPFLAGS_HEAP_IMMUTABLE``, will be added to prevent
mutation of the type object. This makes it possible to
share the object safely between multiple interpreters.
This flag is checked in ``type_setattro`` and, when set, blocks the
setting of attributes, similar to built-in types.
A new pointer, ``ht_moduleptr``, will be added to heap types to store ``exc``.
On deinitialization of the exception type, ``*exc`` will be set to ``NULL``.
This makes it safe for ``PyErr_PrepareImmutableException`` to check if
the exception was already initialized.
PyType_offsets
--------------
Some extension types have instances with ``__dict__`` or ``__weakref__``
allocated. Currently, there is no way of passing the offsets of these through
``PyType_Spec``. To allow this, a new structure and a spec slot are proposed.
A new structure, ``PyType_offsets``, will have two members containing the
offsets of ``__dict__`` and ``__weakref__``::
typedef struct {
Py_ssize_t dict;
Py_ssize_t weaklist;
} PyType_offsets;
The new slot, ``Py_offsets``, will be used to pass a ``PyType_offsets *``
structure containing the mentioned data.
Helpers
-------
Getting to per-module state from a heap type is a very common task. To make this
easier, a helper will be added::
void *PyType_GetModuleState(PyObject *type)
This function takes a heap type and, on success, returns a pointer to the state of the
module that the heap type belongs to.
On failure, two scenarios may occur. When a type without a module is passed in,
``SystemError`` is set and ``NULL`` is returned. If the module is found, a pointer
to its state, which may be ``NULL``, is returned without setting any exception.
Modules Converted in the Initial Implementation
-----------------------------------------------
To validate the approach, several modules will be modified during
the initial implementation:
The ``zipimport``, ``_io``, ``_elementtree``, and ``_csv`` modules
will be ported to PEP 489 multiphase initialization.
Summary of API Changes and Additions
====================================
New functions:

* ``PyType_GetModule``
* ``PyType_DefiningTypeFromSlotFunc``
* ``PyType_GetModuleState``
* ``PyErr_PrepareImmutableException``

New macros:

* ``PyCFunction_GET_CLASS``

New types:

* ``PyCMethodObject``

New structures:

* ``PyType_offsets``

Modified functions:

* ``_PyMethodDef_RawFastCallDict`` now receives ``PyTypeObject *cls``.
* ``_PyMethodDef_RawFastCallKeywords`` now receives ``PyTypeObject *cls``.

Modified structures:

* ``_heaptypeobject``: added ``ht_module`` and ``ht_moduleptr``

Other changes:

* ``METH_METHOD`` call flag
* ``defining_class`` converter in Argument Clinic
* ``Py_TPFLAGS_HEAP_IMMUTABLE`` flag
* ``Py_offsets`` type spec slot
Backwards Compatibility
=======================
Two new pointers are added to all heap types.
All other changes add new functions, structures, and a type flag.

The new ``PyErr_PrepareImmutableException`` function encourages
modules to switch from using mutable heap type exception classes to immutable
ones, and a number of modules will be switched in the initial implementation.
This change will prevent adding class attributes to such types.
For example, the following will raise ``AttributeError``::

    sqlite3.OperationalError.foo = None
Instances and subclasses of such exceptions will not be affected.
Implementation
==============
An initial implementation is available in a GitHub repository [#gh-repo]_;
a patchset is at [#gh-patch]_.
Possible Future Extensions
==========================
Easy creation of types with module references
---------------------------------------------
It would be possible to add a PEP 489 execution slot type to make
creating heap types significantly easier than calling
``PyType_FromModuleAndSpec``.
This is left to a future PEP.
Optimization
------------
CPython optimizes calls to methods that have restricted signatures,
such as not allowing keyword arguments.
As proposed here, methods defined with the ``METH_METHOD`` flag do not support
these optimizations.
Optimized calls still have the option of accessing per-module state
the same way slot methods do.
References
==========
.. [#typeslots-mail] [Import-SIG] On singleton modules, heap types, and subinterpreters
(https://mail.python.org/pipermail/import-sig/2015-July/001035.html)
.. [#gh-repo]
https://github.com/Traceur759/cpython/commits/pep-c
.. [#gh-patch]
https://github.com/Traceur759/cpython/compare/master...Traceur759:pep-c.patch
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

512
pep-0574.rst Normal file
View File

@ -0,0 +1,512 @@
PEP: 574
Title: Pickle protocol 5 with out-of-band data
Version: $Revision$
Last-Modified: $Date$
Author: Antoine Pitrou <solipsis@pitrou.net>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 23-Mar-2018
Post-History: 28-Mar-2018
Resolution:
Abstract
========
This PEP proposes to standardize a new pickle protocol version, and
accompanying APIs to take full advantage of it:
1. A new pickle protocol version (5) to cover the extra metadata needed
for out-of-band data buffers.
2. A new ``PickleBuffer`` type for ``__reduce_ex__`` implementations
to return out-of-band data buffers.
3. A new ``buffer_callback`` parameter when pickling, to handle out-of-band
data buffers.
4. A new ``buffers`` parameter when unpickling to provide out-of-band data
buffers.
The PEP guarantees unchanged behaviour for anyone not using the new APIs.
Rationale
=========
The pickle protocol was originally designed in 1995 for on-disk persistence
of arbitrary Python objects. The performance of a 1995-era storage medium
probably made it irrelevant to focus on performance metrics such as
use of RAM bandwidth when copying temporary data before writing it to disk.
Nowadays the pickle protocol sees a growing use in applications where most
of the data isn't ever persisted to disk (or, when it is, it uses a portable
format instead of a Python-specific one). Instead, pickle is being used to transmit
data and commands from one process to another, either on the same machine
or on multiple machines. Those applications will sometimes deal with very
large data (such as Numpy arrays or Pandas dataframes) that need to be
transferred around. For those applications, pickle is currently
wasteful as it imposes spurious memory copies of the data being serialized.
As a matter of fact, the standard ``multiprocessing`` module uses pickle
for serialization, and therefore also suffers from this problem when
sending large data to another process.
Third-party Python libraries, such as Dask [#dask]_, PyArrow [#pyarrow]_
and IPyParallel [#ipyparallel]_, have started implementing alternative
serialization schemes with the explicit goal of avoiding copies on large
data. Implementing a new serialization scheme is difficult and often
leads to reduced generality (since many Python objects support pickle
but not the new serialization scheme). Falling back on pickle for
unsupported types is an option, but then you get back the spurious
memory copies you wanted to avoid in the first place. For example,
``dask`` is able to avoid memory copies for Numpy arrays and
built-in containers thereof (such as lists or dicts containing Numpy
arrays), but if a large Numpy array is an attribute of a user-defined
object, ``dask`` will serialize the user-defined object as a pickle
stream, leading to memory copies.
The common theme of these third-party serialization efforts is to generate
a stream of object metadata (which contains pickle-like information about
the objects being serialized) and a separate stream of zero-copy buffer
objects for the payloads of large objects. Note that, in this scheme,
small objects such as ints, etc. can be dumped together with the metadata
stream. Refinements can include opportunistic compression of large data
depending on its type and layout, like ``dask`` does.
This PEP aims to make ``pickle`` usable in a way where large data is handled
as a separate stream of zero-copy buffers, letting the application handle
those buffers optimally.
Example
=======
To keep the example simple and avoid requiring knowledge of third-party
libraries, we will focus here on a bytearray object (but the issue is
conceptually the same with more sophisticated objects such as Numpy arrays).
Like most objects, the bytearray object isn't immediately understood by
the pickle module and must therefore specify its decomposition scheme.
Here is how a bytearray object currently decomposes for pickling::
>>> b.__reduce_ex__(4)
(<class 'bytearray'>, (b'abc',), None)
This is because the ``bytearray.__reduce_ex__`` implementation reads
morally as follows::
class bytearray:
def __reduce_ex__(self, protocol):
if protocol == 4:
return type(self), bytes(self), None
# Legacy code for earlier protocols omitted
In turn it produces the following pickle code::
>>> pickletools.dis(pickletools.optimize(pickle.dumps(b, protocol=4)))
0: \x80 PROTO 4
2: \x95 FRAME 30
11: \x8c SHORT_BINUNICODE 'builtins'
21: \x8c SHORT_BINUNICODE 'bytearray'
32: \x93 STACK_GLOBAL
33: C SHORT_BINBYTES b'abc'
38: \x85 TUPLE1
39: R REDUCE
40: . STOP
(the call to ``pickletools.optimize`` above is only meant to make the
pickle stream more readable by removing the MEMOIZE opcodes)
We can notice several things about the bytearray's payload (the sequence
of bytes ``b'abc'``):
* ``bytearray.__reduce_ex__`` produces a first copy by instantiating a
new bytes object from the bytearray's data.
* ``pickle.dumps`` produces a second copy when inserting the contents of
that bytes object into the pickle stream, after the SHORT_BINBYTES opcode.
* Furthermore, when deserializing the pickle stream, a temporary bytes
object is created when the SHORT_BINBYTES opcode is encountered (inducing
a data copy).
What we really want is something like the following:
* ``bytearray.__reduce_ex__`` produces a *view* of the bytearray's data.
* ``pickle.dumps`` doesn't try to copy that data into the pickle stream
but instead passes the buffer view to its caller (which can decide on the
most efficient handling of that buffer).
* When deserializing, ``pickle.loads`` takes the pickle stream and the
buffer view separately, and passes the buffer view directly to the
bytearray constructor.
We see that several conditions are required for the above to work:
* ``__reduce__`` or ``__reduce_ex__`` must be able to return *something*
that indicates a serializable no-copy buffer view.
* The pickle protocol must be able to represent references to such buffer
views, instructing the unpickler that it may have to get the actual buffer
out of band.
* The ``pickle.Pickler`` API must provide its caller with a way
to receive such buffer views while serializing.
* The ``pickle.Unpickler`` API must similarly allow its caller to provide
the buffer views required for deserialization.
* For compatibility, the pickle protocol must also be able to contain direct
serializations of such buffer views, such that current uses of the ``pickle``
API don't have to be modified if they are not concerned with memory copies.
Producer API
============
We are introducing a new type ``pickle.PickleBuffer`` which can be
instantiated from any buffer-supporting object, and is specifically meant
to be returned from ``__reduce__`` implementations::
class bytearray:
def __reduce_ex__(self, protocol):
if protocol >= 5:
return type(self), (PickleBuffer(self),), None
# Legacy code for earlier protocols omitted
``PickleBuffer`` is a simple wrapper that doesn't have all the memoryview
semantics and functionality, but is specifically recognized by the ``pickle``
module if protocol 5 or higher is enabled. It is an error to try to
serialize a ``PickleBuffer`` with pickle protocol version 4 or earlier.
Only the raw *data* of the ``PickleBuffer`` will be considered by the
``pickle`` module. Any type-specific *metadata* (such as shapes or
datatype) must be returned separately by the type's ``__reduce__``
implementation, as is already the case.
PickleBuffer objects
--------------------
The ``PickleBuffer`` class supports a very simple Python API. Its constructor
takes a single PEP 3118-compatible object [#pep-3118]_. ``PickleBuffer``
objects themselves support the buffer protocol, so consumers can
call ``memoryview(...)`` on them to get additional information
about the underlying buffer (such as the original type, shape, etc.).
In addition, ``PickleBuffer`` objects can be explicitly released using
their ``release()`` method.
On the C side, a simple API will be provided to create and inspect
PickleBuffer objects:
``PyObject *PyPickleBuffer_FromObject(PyObject *obj)``
Create a ``PickleBuffer`` object holding a view over the PEP 3118-compatible
*obj*.
``PyPickleBuffer_Check(PyObject *obj)``
Return whether *obj* is a ``PickleBuffer`` instance.
``const Py_buffer *PyPickleBuffer_GetBuffer(PyObject *picklebuf)``
Return a pointer to the internal ``Py_buffer`` owned by the ``PickleBuffer``
instance. An exception is raised if the buffer is released.
``int PyPickleBuffer_Release(PyObject *picklebuf)``
Release the ``PickleBuffer`` instance's underlying buffer.
``PickleBuffer`` can wrap any kind of buffer, including non-contiguous
buffers. It's up to consumers to decide how best to handle different kinds
of buffers (for example, some consumers may find it acceptable to make a
contiguous copy of non-contiguous buffers).
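
A short interactive sketch of the intended Python-level behaviour,
assuming the implementation described in this PEP::

    >>> from pickle import PickleBuffer
    >>> pb = PickleBuffer(bytearray(b"abc"))
    >>> m = memoryview(pb)   # consumers can inspect the underlying buffer
    >>> m.readonly           # bytearray provides a writable buffer
    False
    >>> m.release()
    >>> pb.release()         # explicitly release the underlying buffer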
Consumer API
============
``pickle.Pickler.__init__`` and ``pickle.dumps`` are augmented with an additional
``buffer_callback`` parameter::
class Pickler:
def __init__(self, file, protocol=None, ..., buffer_callback=None):
"""
If *buffer_callback* is not None, then it is called with a list
of out-of-band buffer views when deemed necessary (this could be
once every buffer, or only after a certain size is reached,
or once at the end, depending on implementation details). The
callback should arrange to store or transmit those buffers without
changing their order.
If *buffer_callback* is None (the default), buffer views are
serialized into *file* as part of the pickle stream.
It is an error if *buffer_callback* is not None and *protocol* is
None or smaller than 5.
"""
def pickle.dumps(obj, protocol=None, *, ..., buffer_callback=None):
"""
See above for *buffer_callback*.
"""
``pickle.Unpickler.__init__`` and ``pickle.loads`` are augmented with an
additional ``buffers`` parameter::
class Unpickler:
def __init__(self, file, *, ..., buffers=None):
"""
If *buffers* is not None, it should be an iterable of buffer-enabled
objects that is consumed each time the pickle stream references
an out-of-band buffer view. Such buffers have been given in order
to the *buffer_callback* of a Pickler object.
If *buffers* is None (the default), then the buffers are taken
from the pickle stream, assuming they are serialized there.
It is an error for *buffers* to be None if the pickle stream
was produced with a non-None *buffer_callback*.
"""
def pickle.loads(data, *, ..., buffers=None):
"""
See above for *buffers*.
"""
Protocol changes
================
Three new opcodes are introduced:
* ``BYTEARRAY8`` creates a bytearray from the data following it in the pickle
stream and pushes it on the stack (just like ``BINBYTES8`` does for bytes
objects);
* ``NEXT_BUFFER`` fetches a buffer from the ``buffers`` iterable and pushes
it on the stack.
* ``READONLY_BUFFER`` makes a readonly view of the top of the stack.
When pickling encounters a ``PickleBuffer``, there can be four cases:
* If a ``buffer_callback`` is given and the ``PickleBuffer`` is writable,
the ``PickleBuffer`` is given to the callback and a ``NEXT_BUFFER`` opcode
is appended to the pickle stream.
* If a ``buffer_callback`` is given and the ``PickleBuffer`` is readonly,
the ``PickleBuffer`` is given to the callback and a ``NEXT_BUFFER`` opcode
is appended to the pickle stream, followed by a ``READONLY_BUFFER`` opcode.
* If no ``buffer_callback`` is given and the ``PickleBuffer`` is writable,
it is serialized into the pickle stream as if it were a ``bytearray`` object.
* If no ``buffer_callback`` is given and the ``PickleBuffer`` is readonly,
it is serialized into the pickle stream as if it were a ``bytes`` object.
The distinction between readonly and writable buffers is explained below
(see "Mutability").
Side effects
============
Improved in-band performance
----------------------------
Even in-band pickling can be improved by returning a ``PickleBuffer``
instance from ``__reduce_ex__``, as one copy is avoided on the serialization
path [#ogrisel-numpy]_.
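
For example, merely selecting protocol 5, without any
``buffer_callback``, keeps the data in-band while skipping the
intermediate ``bytes`` copy that ``bytearray.__reduce_ex__`` makes
under protocol 4 (a sketch)::

    >>> b = bytearray(b"abc")
    >>> data = pickle.dumps(b, protocol=5)   # buffers stay in-band
    >>> pickle.loads(data)
    bytearray(b'abc')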
Caveats
=======
Mutability
----------
PEP 3118 buffers [#pep-3118]_ can be readonly or writable. Some objects,
such as Numpy arrays, need to be backed by a mutable buffer for full
operation. Pickle consumers that use the ``buffer_callback`` and ``buffers``
arguments will have to be careful to recreate mutable buffers. When doing
I/O, this implies using buffer-passing API variants such as ``readinto``
(which are also often preferable for performance).
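
As a sketch of such a consumer, assuming the number and sizes of the
incoming buffers are known from framing metadata transmitted
separately::

    def receive_buffers(stream, sizes):
        buffers = []
        for size in sizes:
            buf = bytearray(size)    # mutable backing storage
            # Fill the buffer in place; a robust version would loop
            # until exactly ``size`` bytes have been read.
            stream.readinto(buf)
            buffers.append(buf)
        return buffers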
Data sharing
------------
If you pickle and then unpickle an object in the same process, passing
out-of-band buffer views, then the unpickled object may be backed by the
same buffer as the original pickled object.
For example, it might be reasonable to implement reduction of a Numpy array
as follows (crucial metadata such as shapes is omitted for simplicity)::
class ndarray:
def __reduce_ex__(self, protocol):
if protocol >= 5:
return numpy.frombuffer, (PickleBuffer(self), self.dtype)
# Legacy code for earlier protocols omitted
Then simply passing the PickleBuffer around from ``dumps`` to ``loads``
will produce a new Numpy array sharing the same underlying memory as the
original Numpy object (and, incidentally, keeping it alive)::
>>> import numpy as np
>>> a = np.zeros(10)
>>> a[0]
0.0
>>> buffers = []
>>> data = pickle.dumps(a, protocol=5, buffer_callback=buffers.extend)
>>> b = pickle.loads(data, buffers=buffers)
>>> b[0] = 42
>>> a[0]
42.0
This won't happen with the traditional ``pickle`` API (i.e. without passing
``buffers`` and ``buffer_callback`` parameters), because then the buffer view
is serialized inside the pickle stream with a copy.
Rejected alternatives
=====================
Using the existing persistent load interface
--------------------------------------------
The ``pickle`` persistence interface is a way of storing references to
designated objects in the pickle stream while handling their actual
serialization out of band. For example, one might consider the following
for zero-copy serialization of bytearrays::
class MyPickle(pickle.Pickler):
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self.buffers = []
def persistent_id(self, obj):
if type(obj) is not bytearray:
return None
else:
index = len(self.buffers)
self.buffers.append(obj)
return ('bytearray', index)
class MyUnpickle(pickle.Unpickler):
def __init__(self, *args, buffers, **kwargs):
super().__init__(*args, **kwargs)
self.buffers = buffers
def persistent_load(self, pid):
type_tag, index = pid
if type_tag == 'bytearray':
return self.buffers[index]
else:
assert 0 # unexpected type
This mechanism has two drawbacks:
* Each ``pickle`` consumer must reimplement ``Pickler`` and ``Unpickler``
subclasses, with custom code for each type of interest. Essentially,
N pickle consumers end up each implementing custom code for M producers.
This is difficult (especially for sophisticated types such as Numpy
arrays) and poorly scalable.
* Each object encountered by the pickle module (even simple built-in objects
such as ints and strings) triggers a call to the user's ``persistent_id()``
method, leading to a possible performance drop compared to nominal.
Open questions
==============
Should ``buffer_callback`` take a single buffer or a sequence of buffers?
* Taking a single buffer would allow returning a boolean indicating whether
the given buffer should be serialized in-band or out-of-band.
* Taking a sequence of buffers is potentially more efficient by reducing
function call overhead.
Should it be allowed to serialize a ``PickleBuffer`` in protocol 4 and earlier?
It would simply be serialized as a ``bytes`` object (if read-only) or
``bytearray`` (if writable).
* It can make implementing ``__reduce__`` simpler.
* Serializing a ``bytearray`` in protocol 4 makes a supplementary memory
copy when ``bytearray.__reduce_ex__`` returns a ``bytes`` object. This
is a performance regression that may be overlooked by ``__reduce__``
implementors.
Implementation
==============
A first implementation is available in the author's GitHub fork [#pickle5-git]_.
An experimental backport for Python 3.6 and 3.7 is downloadable from PyPI
[#pickle5-pypi]_.
Related work
============
Dask.distributed implements a custom zero-copy serialization with fallback
to pickle [#dask-serialization]_.
PyArrow implements zero-copy component-based serialization for a few
selected types [#pyarrow-serialization]_.
PEP 554 proposes hosting multiple interpreters in a single process, with
provisions for transferring buffers between interpreters as a communication
scheme [#pep-554]_.
Acknowledgements
================
Thanks to the following people for early feedback: Nick Coghlan, Olivier
Grisel, Stefan Krah, MinRK, Matt Rocklin, Eric Snow.
References
==========
.. [#dask] Dask.distributed -- A lightweight library for distributed computing
in Python
https://distributed.readthedocs.io/
.. [#dask-serialization] Dask.distributed custom serialization
https://distributed.readthedocs.io/en/latest/serialization.html
.. [#ipyparallel] IPyParallel -- Using IPython for parallel computing
https://ipyparallel.readthedocs.io/
.. [#pyarrow] PyArrow -- A cross-language development platform for in-memory data
https://arrow.apache.org/docs/python/
.. [#pyarrow-serialization] PyArrow IPC and component-based serialization
https://arrow.apache.org/docs/python/ipc.html#component-based-serialization
.. [#pep-3118] PEP 3118 -- Revising the buffer protocol
https://www.python.org/dev/peps/pep-3118/
.. [#pep-554] PEP 554 -- Multiple Interpreters in the Stdlib
https://www.python.org/dev/peps/pep-0554/
.. [#pickle5-git] ``pickle5`` branch on GitHub
https://github.com/pitrou/cpython/tree/pickle5
.. [#pickle5-pypi] ``pickle5`` project on PyPI
https://pypi.org/project/pickle5/
.. [#ogrisel-numpy] Draft use of pickle protocol 5 for Numpy array pickling
https://gist.github.com/ogrisel/a2b0e5ae4987a398caa7f9277cb3b90a
Copyright
=========
This document has been placed into the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

1122
pep-0575.rst Normal file

File diff suppressed because it is too large Load Diff

161
pep-0576.rst Normal file
View File

@ -0,0 +1,161 @@
PEP: 576
Title: Rationalize Built-in function classes
Author: Mark Shannon <mark@hotpy.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 10-May-2018
Python-Version: 3.8
Post-History: 17-May-2018
23-June-2018
08-July-2018
Abstract
========
Expose the "FastcallKeywords" convention used internally by CPython to third-party code, and make the ``inspect`` module use duck-typing.
In combination this will allow third-party C extensions and tools like Cython to create objects that use the same calling conventions as built-in and Python functions, thus gaining performance parity with built-in functions like ``len`` or ``print``.
A small improvement in the performance of existing code is expected.
Motivation
==========
Currently third-party module authors face a dilemma when implementing
functions in C. Either they can use one of the pre-existing built-in function
or method classes or implement their own custom class in C.
The first choice causes them to lose the ability to access the internals of the callable object.
The second choice is an additional maintenance burden and, more importantly,
has a significant negative impact on performance.
This PEP aims to allow authors of third-party C modules, and tools like Cython, to utilize the faster calling convention used internally by CPython for built-in functions and methods, and to do so without a loss of capabilities relative to a function implemented in Python.
Introspection
-------------
The inspect module will fully support duck-typing when introspecting callables.
The ``inspect.Signature.from_callable()`` function computes the signature of a callable. If an object has a ``__signature__``
property, then ``inspect.Signature.from_callable()`` simply returns that. To further support duck-typing, if a callable has a ``__text_signature__``
then the ``__signature__`` will be created from that.
This means that third-party built-in functions can implement ``__text_signature__`` if sufficient,
and the more expensive ``__signature__`` if necessary.
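
An illustrative sketch of the ``__signature__`` half of this
duck-typing, which ``inspect`` already honours for arbitrary callables
(the ``__text_signature__`` half is the extension proposed here)::

    import inspect

    class MyCallable:
        # A hypothetical third-party callable with a precomputed
        # signature; inspect.Signature.from_callable() returns it as-is.
        __signature__ = inspect.Signature([
            inspect.Parameter("x", inspect.Parameter.POSITIONAL_OR_KEYWORD),
        ])

        def __call__(self, *args, **kwargs):
            return args, kwargs

    print(inspect.signature(MyCallable()))   # prints: (x)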
Efficient calls to third-party callables
----------------------------------------
Currently the majority of calls are dispatched to ``function``\s and ``method_descriptor``\s in custom code, using the "FastcallKeywords" internal calling convention. This PEP proposes that this calling convention be implemented via a C function pointer. Third-party callables that implement this binary interface will have the potential to be called as fast as a built-in function.
Continued prohibition of callable classes as base classes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Currently any attempt to use ``function``, ``method`` or ``method_descriptor`` as a base class for a new class will fail with a ``TypeError``. This behaviour is desirable as it prevents errors when a subclass overrides the ``__call__`` method. If callables could be sub-classed then any call to a ``function`` or a ``method_descriptor`` would need an additional check that the ``__call__`` method had not been overridden. By exposing an additional call mechanism, the potential for errors becomes greater. As a consequence, any third-party class implementing the additional call interface will not be usable as a base class.
New classes and changes to existing classes
===========================================
Python visible changes
----------------------
#. A new built-in class, ``builtin_function``, will be added.
#. ``types.BuiltinFunctionType`` will refer to ``builtin_function`` not ``builtin_function_or_method``.
#. Instances of the ``builtin_function`` class will retain the ``__module__`` property of ``builtin_function_or_method`` and gain the ``func_module`` and ``func_globals`` properties. The ``func_module`` allows access to the module to which the function belongs. Note that this is different from the ``__module__`` property which merely returns the name of the module. The ``func_globals`` property is equivalent to ``func_module.__dict__`` and is provided to mimic the Python function property of the same name.
#. When binding a ``method_descriptor`` instance to an instance of its owning class, a ``bound_method`` will be created instead of a ``builtin_function_or_method``. This means that the ``method_descriptors`` now mimic the behaviour of Python functions more closely. In other words, ``[].append`` becomes a ``bound_method`` instead of a ``builtin_function_or_method`` (see the sketch below).
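
The semantics proposed in points 3 and 4 can be sketched as follows
(hypothetical code, which does not run on current Python versions)::

    import types

    # Proposed: types.BuiltinFunctionType refers to builtin_function.
    assert isinstance(len, types.BuiltinFunctionType)

    # Proposed: func_globals is equivalent to func_module.__dict__.
    assert len.func_globals is len.func_module.__dict__

    # Proposed: binding a method_descriptor yields a bound_method,
    # just as binding a Python function does.
    assert isinstance([].append, types.MethodType)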
C API changes
-------------
#. A new function ``PyBuiltinFunction_New(PyMethodDef *ml, PyObject *module)`` is added to create built-in functions.
#. ``PyCFunction_NewEx()`` and ``PyCFunction_New()`` are deprecated and will return a ``PyBuiltinFunction`` if able, otherwise a ``builtin_function_or_method``.
Retaining backwards compatibility in the C API and ABI
======================================================
The proposed changes are fully backwards and forwards compatible at both the API and ABI level.
Internal C changes
------------------
Two new flags will be allowed for the ``typeobject.tp_flags`` field.
These are ``Py_TPFLAGS_EXTENDED_CALL`` and ``Py_TPFLAGS_FUNCTION_DESCRIPTOR``.
Py_TPFLAGS_EXTENDED_CALL
~~~~~~~~~~~~~~~~~~~~~~~~
For any built-in class that sets ``Py_TPFLAGS_EXTENDED_CALL``, the C
struct corresponding to it must begin with the struct ``PyExtendedCallable``, which is defined as follows::
typedef PyObject *(*extended_call_ptr)(PyObject *callable, PyObject** args,
int positional_argcount, PyTupleObject* kwnames);
typedef struct {
PyObject_HEAD
extended_call_ptr ext_call;
} PyExtendedCallable;
Any class that sets the ``Py_TPFLAGS_EXTENDED_CALL`` flag cannot be used as a base class, and a ``TypeError`` will be raised if any Python code tries to use it as a base class.
Py_TPFLAGS_FUNCTION_DESCRIPTOR
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If this flag is set for a built-in class ``F``, then instances of that class are expected to behave the same as a Python function when used as a class attribute.
Specifically, this means that the value of ``c.m``, where ``C.m`` is an instance of the built-in class ``F`` (and ``c`` is an instance of ``C``), must be a bound method binding ``C.m`` and ``c``.
Without this flag, it would be impossible for custom callables to behave like Python functions *and* be efficient as Python or built-in functions.
Changes to existing C structs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The ``function``, ``method_descriptor`` and ``method`` classes will have their corresponding structs changed to
start with the ``PyExtendedCallable`` struct.
Third-party built-in classes using the new extended call interface
------------------------------------------------------------------
To enable call performance on a par with Python functions and built-in functions, third-party callables should set the ``Py_TPFLAGS_EXTENDED_CALL`` bit of ``tp_flags`` and ensure that the corresponding C struct starts with the ``PyExtendedCallable`` struct.
Any built-in class that has the ``Py_TPFLAGS_EXTENDED_CALL`` bit set must also implement the ``tp_call`` function and make sure its behaviour is consistent with the ``ext_call`` function.
Performance implications of these changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Adding a function pointer to each callable, rather than to each class of callable, enables the choice of dispatching function (the code to shuffle arguments about and do error checking) to be made when the callable object is created, rather than when it is called. This should reduce the number of instructions executed between the call-site in the interpreter and the execution of the callee.
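
The effect can be sketched with a Python analogy (purely illustrative;
the actual mechanism is the C-level ``ext_call`` pointer described
above)::

    class FastCallable:
        def __init__(self, func, accepts_keywords):
            self._func = func
            # Choose the dispatch function once, at creation time...
            if accepts_keywords:
                self._dispatch = self._call_with_kwargs
            else:
                self._dispatch = self._call_positional_only

        def _call_positional_only(self, args, kwargs):
            if kwargs:
                raise TypeError("no keyword arguments allowed")
            return self._func(*args)

        def _call_with_kwargs(self, args, kwargs):
            return self._func(*args, **kwargs)

        def __call__(self, *args, **kwargs):
            # ...so the per-call path does no convention-related branching.
            return self._dispatch(args, kwargs)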
Alternative Suggestions
=======================
`PEP 580 <https://www.python.org/dev/peps/pep-0580/>`_ is an alternative approach to solving the same problem as this PEP.
Reference implementation
========================
A draft implementation can be found at https://github.com/markshannon/cpython/tree/pep-576-minimal
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

826
pep-0577.rst Normal file
View File

@ -0,0 +1,826 @@
PEP: 577
Title: Augmented Assignment Expressions
Author: Nick Coghlan <ncoghlan@gmail.com>
Status: Withdrawn
Type: Standards Track
Content-Type: text/x-rst
Created: 14-May-2018
Python-Version: 3.8
Post-History: 22-May-2018
PEP Withdrawal
==============
While working on this PEP, I realised that it didn't really address what was
actually bothering me about PEP 572's proposed scoping rules for previously
unreferenced assignment targets, and also had some significant undesirable
consequences (most notably, allowing ``>>=`` and ``<<=`` as inline augmented
assignment operators that meant something entirely different from the ``>=``
and ``<=`` comparison operators).
I also realised that even without dedicated syntax of their own, PEP 572 allows
inline augmented assignments to be written using the ``operator`` module::
from operator import iadd
if (target := iadd(target, value)) < limit:
...
(The restriction to simple names as inline assignment targets means that the
target expression can always be repeated without side effects)
Accordingly, I'm withdrawing this PEP without submitting it for pronouncement,
and will instead be writing a replacement PEP that focuses specifically on the
handling of assignment targets which haven't already been declared as local
variables in the current scope (for both regular block scopes, and for scoped
expressions).
Abstract
========
This is a proposal to allow augmented assignments such as ``x += 1`` to be
used as expressions, not just statements.
As part of this, ``NAME := EXPR`` is proposed as an inline assignment expression
that uses the new augmented assignment scoping rules, rather than implicitly
defining a new local variable name the way that existing name binding
statements do. The question of allowing expression level local variable
declarations at function scope is deliberately separated from the question of
allowing expression level name bindings, and deferred to a later PEP.
This PEP is a direct competitor to PEP 572 (although it borrows heavily from that
PEP's motivation, and even shares the proposed syntax for inline assignments).
See `Relationship with PEP 572`_ for more details on the connections between
the two PEPs.
To improve the usability of the new expressions, a semantic split is proposed
between the handling of augmented assignments in regular block scopes (modules,
classes, and functions), and the handling of augmented assignments in scoped
expressions (lambda expressions, generator expressions, and comprehensions),
such that all inline assignments default to targeting the nearest containing
block scope.
A new compile time ``TargetNameError`` is added as a subclass of ``SyntaxError``
to handle cases where it is deemed to be currently unclear which target is
expected to be rebound by an inline assignment, or else the target scope
for the inline assignment is considered invalid for another reason.
Syntax and semantics
====================
Augmented assignment expressions
--------------------------------
The language grammar would be adjusted to allow augmented assignments to
appear as expressions, where the result of the augmented assignment
expression is the same post-calculation reference as is being bound to the
given target.
For example::
>>> n = 0
>>> n += 5
5
>>> n -= 2
3
>>> n *= 3
9
>>> n
9
For mutable targets, this means the result is always just the original object::
>>> seq = []
>>> seq_id = id(seq)
>>> seq += range(3)
[0, 1, 2]
>>> seq_id == id(seq)
True
Augmented assignments to attributes and container subscripts will be permitted,
with the result being the post-calculation reference that is bound to the target,
just as it is for simple name targets::
def increment(self, step=1):
return self._value += step
In these cases, ``__getitem__`` and ``__getattribute__`` will *not* be called
after the assignment has already taken place (they will only be called as
needed to evaluate the in-place operation).
Adding an inline assignment operator
------------------------------------
Given only the addition of augmented assignment expressions, it would be
possible to abuse a symbol like ``|=`` as a general purpose assignment
operator by defining a ``Target`` wrapper type that worked as follows::
>>> class Target:
... def __init__(self, value):
... self.value = value
... def __or__(self, other):
... return Target(other)
...
>>> x = Target(10)
>>> x.value
10
>>> x |= 42
<__main__.Target object at 0x7f608caa8048>
>>> x.value
42
This is similar to the way that storing a single reference in a list was long
used as a workaround for the lack of a ``nonlocal`` keyword, and can still be
used today (in combination with ``operator.setitem``) to work around the
lack of expression level assignments.
Rather than requiring such workarounds, this PEP instead proposes that
PEP 572's "NAME := EXPR" syntax be adopted as a new inline assignment
expression that uses the augmented assignment scoping rules described below.
This cleanly handles cases where only the new value is of interest, and the
previously bound value (if any) can just be discarded completely.
Note that for both simple names and complex assignment targets, the inline
assignment operator does *not* read the previous reference before assigning
the new one. However, when used at function scope (either directly or inside
a scoped expression), it does *not* implicitly define a new local variable,
and will instead raise ``TargetNameError`` (as described for augmented
assignments below).
Assignment operator precedence
------------------------------
To preserve the existing semantics of augmented assignment statements,
inline assignment operators will be defined as being of lower precedence
than all other operators, including the comma pseudo-operator. This ensures
that when used as a top level expression the entire right hand side of the
expression is still interpreted as the value to be processed (even when that
value is a tuple without parentheses).
The difference this introduces relative to PEP 572 is that where
``(n := first, second)`` sets ``n = first`` in PEP 572, in this PEP it would set
``n = (first, second)``, and getting the first meaning would require an extra
set of parentheses (``((n := first), second)``).
PEP 572 quite reasonably notes that this results in ambiguity when assignment
expressions are used as function call arguments. This PEP resolves that concern
a different way by requiring that assignment expressions be parenthesised
when used as arguments to a function call (unless they're the sole argument).
This is a more relaxed version of the restriction placed on generator
expressions (which always require parentheses, except when they're the sole
argument to a function call).
Augmented assignment to names in block scopes
---------------------------------------------
No target name binding changes are proposed for augmented assignments at module
or class scope (this also includes code executed using "exec" or "eval"). These
will continue to implicitly declare a new local variable as the binding target
as they do today, and (if necessary) will be able to resolve the name from an
outer scope before binding it locally.
At function scope, augmented assignments will be changed to require that there
be either a preceding name binding or variable declaration to explicitly
establish the target name as being local to the function, or else an explicit
``global`` or ``nonlocal`` declaration. ``TargetNameError``, a new
``SyntaxError`` subclass, will be raised at compile time if no such binding or
declaration is present.
For example, the following code would compile and run as it does today::
x = 0
x += 1 # Sets global "x" to 1
class C:
x += 1 # Sets local "x" to 2, leaves global "x" alone
def local_target():
x = 0
x += 1 # Sets local "x" to 1, leaves global "x" alone
def global_target():
global x
x += 1 # Increments global "x" each time this runs
def nonlocal_target():
x = 0
def g():
nonlocal x
x += 1 # Increments "x" in outer scope each time this runs
return x
return g
The following examples would all still compile and then raise an error at runtime
as they do today::
n += 1 # Raises NameError at runtime
class C:
n += 1 # Raises NameError at runtime
def missing_global():
global n
n += 1 # Raises NameError at runtime
def delayed_nonlocal_initialisation():
def f():
nonlocal n
n += 1
f() # Raises NameError at runtime
n = 0
def skipped_conditional_initialisation():
if False:
n = 0
n += 1 # Raises UnboundLocalError at runtime
def local_declaration_without_initial_assignment():
n: typing.Any
n += 1 # Raises UnboundLocalError at runtime
Whereas the following would raise a compile time ``DeprecationWarning``
initially, and eventually change to report a compile time ``TargetNameError``::
def missing_target():
x += 1 # Compile time TargetNameError due to ambiguous target scope
# Is there a missing initialisation of "x" here? Or a missing
# global or nonlocal declaration?
As a conservative implementation approach, the compile time function name
resolution change would be introduced as a ``DeprecationWarning`` in Python
3.8, and then converted to ``TargetNameError`` in Python 3.9. This avoids
potential problems with code that would currently raise ``UnboundLocalError``
if it were ever actually called, but never is - converting that latent runtime
defect to a compile time error qualifies as a backwards incompatible change
that requires a deprecation period.
When augmented assignments are used as expressions in function scope (rather
than as standalone statements), there aren't any backwards compatibility
concerns, so the compile time name binding checks would be enforced immediately
in Python 3.8.
Similarly, the new inline assignment expressions would always require explicit
predeclaration of their target scope when used as part of a function, at least
for Python 3.8. (See the design discussion section for notes on potentially
revisiting that restriction in the future).
Augmented assignment to names in scoped expressions
---------------------------------------------------
Scoped expressions is a new collective term being proposed for expressions that
introduce a new nested scope of execution, either as an intrinsic part of their
operation (lambda expressions, generator expressions), or else as a way of
hiding name binding operations from the containing scope (container
comprehensions).
Unlike regular functions, these scoped expressions can't include explicit
``global`` or ``nonlocal`` declarations to rebind names directly in an outer
scope.
Instead, their name binding semantics for augmented assignment expressions would
be defined as follows:
* augmented assignment targets used in scoped expressions are expected to either
be already bound in the containing block scope, or else have their scope
explicitly declared in the containing block scope. If no suitable name
binding or declaration can be found in that scope, then ``TargetNameError``
will be raised at compile time (rather than creating a new binding within
the scoped expression).
* if the containing block scope is a function scope, and the target name is
explicitly declared as ``global`` or ``nonlocal``, then it will use the
same scope declaration in the body of the scoped expression
* if the containing block scope is a function scope, and the target name is
a local variable in that function, then it will be implicitly declared as
``nonlocal`` in the body of the scoped expression
* if the containing block scope is a class scope, then ``TargetNameError`` will
always be raised, with a dedicated message indicating that combining class
scopes with augmented assignments in scoped expressions is not currently
permitted.
* if a name is declared as a formal parameter (lambda expressions), or as an
iteration variable (generator expressions, comprehensions), then that name
is considered local to that scoped expression, and attempting to use it as
the target of an augmented assignment operation in that scope, or any nested
scoped expression, will raise ``TargetNameError`` (this is a restriction that
could potentially be lifted later, but is being proposed for now to simplify
the initial set of compile time and runtime semantics that needs to be
covered in the language reference and handled by the compiler and interpreter)
For example, the following code would work as shown::
>>> global_target = 0
>>> incr_global_target = lambda: global_target += 1
>>> incr_global_target()
1
>>> incr_global_target()
2
>>> global_target
2
    >>> def cumulative_sums(data, start=0):
... total = start
... yield from (total += value for value in data)
... return total
...
>>> print(list(cumulative_sums(range(5))))
[0, 1, 3, 6, 10]
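The running summation being emulated there can be cross-checked today with
``itertools.accumulate``::

    >>> from itertools import accumulate
    >>> list(accumulate(range(5)))
    [0, 1, 3, 6, 10]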
While the following examples would all raise ``TargetNameError``::
class C:
cls_target = 0
incr_cls_target = lambda: cls_target += 1 # Error due to class scope
def missing_target():
incr_x = lambda: x += 1 # Error due to missing target "x"
def late_target():
incr_x = lambda: x += 1 # Error due to "x" being declared after use
x = 1
lambda arg: arg += 1 # Error due to attempt to target formal parameter
[x += 1 for x in data] # Error due to attempt to target iteration variable
As augmented assignments currently can't appear inside scoped expressions, the
above compile time name resolution exceptions would be included as part of the
initial implementation rather than needing to be phased in as a potentially
backwards incompatible change.
Design discussion
=================
Allowing complex assignment targets
-----------------------------------
The initial drafts of this PEP kept PEP 572's restriction to single name targets
when augmented assignments were used as expressions, allowing attribute and
subscript targets solely for the statement form.
However, enforcing that required varying the permitted targets based on whether
or not the augmented assignment was a top level expression or not, as well as
explaining why ``n += 1``, ``(n += 1)``, and ``self.n += 1`` were all legal,
but ``(self.n += 1)`` was prohibited, so the proposal was simplified to allow
all existing augmented assignment targets for the expression form as well.
Since this PEP defines ``TARGET := EXPR`` as a variant on augmented assignment,
that also gained support for attribute and subscript targets.
Augmented assignment or name binding only?
------------------------------------------
PEP 572 makes a reasonable case that the potential use cases for inline
augmented assignment are notably weaker than those for inline assignment in
general, so it's acceptable to require that they be spelled as ``x := x + 1``,
bypassing any in-place augmented assignment methods.
While this is at least arguably true for the builtin types (where potential
counterexamples would probably need to focus on set manipulation use cases
that the PEP author doesn't personally have), it would also rule out more
memory intensive use cases like manipulation of NumPy arrays, where the data
copying involved in out-of-place operations can make them impractical as
alternatives to their in-place counterparts.
That said, this PEP mainly exists because the PEP author found the inline
assignment proposal much easier to grasp as "It's like ``+=``, only skipping
the addition step", and also liked the way that that framing provides an
actual semantic difference between ``NAME = EXPR`` and ``NAME := EXPR`` at
function scope.
That difference in target scoping behaviour means that the ``NAME := EXPR``
syntax would be expected to have two primary use cases:
* as a way of allowing assignments to be embedded as an expression in an ``if``
or ``while`` statement, or as part of a scoped expression
* as a way of requesting a compile time check that the target name be previously
declared or bound in the current function scope
At module or class scope, ``NAME = EXPR`` and ``NAME := EXPR`` would be
semantically equivalent due to the compiler's lack of visibility into the set
of names that will be resolvable at runtime, but code linters and static
type checkers would be encouraged to enforce the same "declaration or assignment
required before use" behaviour for ``NAME := EXPR`` as the compiler would
enforce at function scope.
Postponing a decision on expression level target declarations
-------------------------------------------------------------
At least for Python 3.8, usage of inline assignments (whether augmented or not)
at function scope would always require a preceding name binding or scope
declaration to avoid getting ``TargetNameError``, even when used outside a
scoped expression.
The intent behind this requirement is to clearly separate the following two
language design questions:
1. Can an expression rebind a name in the current scope?
2. Can an expression declare a new name in the current scope?
For module global scopes, the answer to both of those questions is unequivocally
"Yes", because it's a language level guarantee that mutating the ``globals()``
dict will immediately impact the runtime module scope, and ``global NAME``
declarations inside a function can have the same effect (as can importing the
currently executing module and modifying its attributes).
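That guarantee is easy to demonstrate interactively::

    >>> globals()["answer"] = 42
    >>> answer
    42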
For class scopes, the answer to both questions is also "Yes" in practice,
although less unequivocally so, since the semantics of ``locals()`` are
currently formally unspecified. However, if the current behaviour of ``locals()``
at class scope is taken as normative (as PEP 558 proposes), then this is
essentially the same scenario as manipulating the module globals, just using
``locals()`` instead.
For function scopes, however, the current answers to these two questions are
respectively "Yes" and "No". Expression level rebinding of function locals is
already possible thanks to lexically nested scopes and explicit ``nonlocal NAME``
declarations. While this PEP will likely make expression level rebinding more
common than it is today, it isn't a fundamentally new concept for the language.
By contrast, declaring a *new* function local variable is currently a statement
level action, involving one of:
* an assignment statement (``NAME = EXPR``, ``OTHER_TARGET = NAME = EXPR``, etc)
* a variable declaration (``NAME : EXPR``)
* a nested function definition
* a nested class definition
* a ``for`` loop
* a ``with`` statement
* an ``except`` clause (with limited scope of access)
The historical trend for the language has actually been to *remove* support for
expression level declarations of function local names, first with the
introduction of "fast locals" semantics (which made the introduction of names
via ``locals()`` unsupported for function scopes), and again with the hiding
of comprehension iteration variables in Python 3.0.
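The "fast locals" behaviour referred to above is straightforward to observe:
writing to ``locals()`` inside a function does not create a function local
name::

    >>> x = "module level"
    >>> def f():
    ...     locals()["x"] = "function level"  # unsupported for function scopes
    ...     return x                          # still resolves the module level name
    ...
    >>> f()
    'module level'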
Now, it may be that in Python 3.9, we decide to revisit this question based on
our experience with expression level name binding in Python 3.8, and decide that
we really do want expression level function local variable declarations as well,
and that we want ``NAME := EXPR`` to be the way we spell that (rather than,
for example, spelling inline declarations more explicitly as
``NAME := EXPR given NAME``, which would permit them to carry type annotations,
and also permit them to declare new local variables in scoped expressions,
rather than having to pollute the namespace in their containing scope).
But the proposal in this PEP is that we explicitly give ourselves a full
release to decide how much we want that feature, and exactly where we find
its absence irritating. Python has survived happily without expression level
name bindings *or* declarations for decades, so we can afford to give ourselves
a couple of years to decide if we really want *both* of those, or if expression
level bindings are sufficient.
Ignoring scoped expressions when determining augmented assignment targets
-------------------------------------------------------------------------
When discussing possible binding semantics for PEP 572's assignment expressions,
Tim Peters made a plausible case [1]_, [2]_, [3]_ for assignment expressions targeting
the containing block scope, essentially ignoring any intervening scoped
expressions.
This approach allows use cases like cumulative sums, or extracting the final
value from a generator expression to be written in a relatively straightforward
way::
total = 0
partial_sums = [total := total + value for value in data]
factor = 1
while any(n % (factor := p) == 0 for p in small_primes):
n //= factor
Guido also expressed his approval for this general approach [4]_.
The proposal in this PEP differs from Tim's original proposal in three main
areas:
- it applies the proposal to all augmented assignment operators, not just a
single new name binding operator
- as far as is practical, it extends the augmented assignment requirement that
the name already be defined to the new name binding operator (raising
``TargetNameError`` rather than implicitly declaring new local variables at
function scope)
- it includes lambda expressions in the set of scopes that get ignored for
target name binding purposes, making this transparency to assignments common
to all of the scoped expressions rather than being specific to comprehensions
and generator expressions
With scoped expressions being ignored when calculating binding targets, it's
once again difficult to detect the scoping difference between the outermost
iterable expressions in generator expressions and comprehensions (you have to
mess about with class scopes or attempt to rebind iteration variables
to detect it), so there's also no need to tinker with that.
Treating inline assignment as an augmented assignment variant
-------------------------------------------------------------
One of the challenges with PEP 572 is the fact that ``NAME = EXPR`` and
``NAME := EXPR`` are entirely semantically equivalent at every scope. This
makes the two forms hard to teach, since there's no inherent nudge towards
choosing one over the other at the statement level, so you end up having to
resort to "``NAME = EXPR`` is preferred because it's been around longer"
(and PEP 572 proposes to enforce that historical idiosyncrasy at the compiler
level).
That semantic equivalence is difficult to avoid at module and class scope while
still having ``if NAME := EXPR:`` and ``while NAME := EXPR:`` work sensibly, but
at function scope the compiler's comprehensive view of all local names makes
it possible to require that the name be assigned or declared before use,
providing a reasonable incentive to continue to default to using the
``NAME = EXPR`` form when possible, while also enabling the use of the
``NAME := EXPR`` as a kind of simple compile time assertion (i.e. explicitly
indicating that the targeted name has already been bound or declared and hence
should already be known to the compiler).
If Guido were to declare that support for inline declarations was a hard
design requirement, then this PEP would be updated to propose that
``EXPR given NAME`` also be introduced as a way to support inline name declarations
after arbitrary expressions (this would allow the inline name declarations to be
deferred until the end of a complex expression rather than needing to be
embedded in the middle of it, and PEP 8 would gain a recommendation encouraging
that style).
Disallowing augmented assignments in class level scoped expressions
-------------------------------------------------------------------
While modern classes do define an implicit closure that's visible to method
implementations (in order to make ``__class__`` available for use in zero-arg
``super()`` calls), there's no way for user level code to explicitly add
additional names to that scope.
Meanwhile, attributes defined in a class body are ignored for the purpose of
defining a method's lexical closure, which means adding them there wouldn't
work at an implementation level.
Rather than trying to resolve that inherent ambiguity, this PEP simply
prohibits such usage, and requires that any affected logic be written somewhere
other than directly inline in the class body (e.g. in a separate helper
function).
Comparison operators vs assignment operators
--------------------------------------------
The ``OP=`` construct as an expression currently indicates a comparison
operation::
x == y # Equals
x >= y # Greater-than-or-equal-to
x <= y # Less-than-or-equal-to
Both this PEP and PEP 572 propose adding at least one operator that's somewhat
similar in appearance, but defines an assignment instead::
x := y # Becomes
This PEP then goes much further and allows all *13* augmented assignment symbols
to be used as binary operators::
x += y # In-place add
x -= y # In-place minus
x *= y # In-place multiply
x @= y # In-place matrix multiply
x /= y # In-place division
x //= y # In-place int division
x %= y # In-place mod
x &= y # In-place bitwise and
x |= y # In-place bitwise or
x ^= y # In-place bitwise xor
x <<= y # In-place left shift
x >>= y # In-place right shift
x **= y # In-place power
Of those additional binary operators, the most questionable would be the
bitshift assignment operators, since they're each only one doubled character
away from one of the inclusive ordered comparison operators.
Examples
========
Simplifying retry loops
-----------------------
There are currently a few different options for writing retry loops, including::
# Post-decrementing a counter
remaining_attempts = MAX_ATTEMPTS
while remaining_attempts:
remaining_attempts -= 1
try:
result = attempt_operation()
        except Exception as exc:
            last_exc = exc  # capture it, since "exc" is unbound once the clause ends
            continue # Failed, so try again
        log.debug(f"Succeeded after {MAX_ATTEMPTS - remaining_attempts} attempts")
break # Success!
else:
        raise OperationFailed(f"Failed after {MAX_ATTEMPTS} attempts") from last_exc
# Loop-and-a-half with a pre-incremented counter
    attempts = 0
while True:
attempts += 1
if attempts > MAX_ATTEMPTS:
            raise OperationFailed(f"Failed after {MAX_ATTEMPTS} attempts") from last_exc
try:
result = attempt_operation()
        except Exception as exc:
            last_exc = exc  # capture it, since "exc" is unbound once the clause ends
            continue # Failed, so try again
log.debug(f"Succeeded after {attempts} attempts")
break # Success!
Each of the available options hides some aspect of the intended loop structure
inside the loop body, whether that's the state modification, the exit condition,
or both.
The proposal in this PEP allows both the state modification and the exit
condition to be included directly in the loop header::
attempt = 0
while (attempt += 1) <= MAX_ATTEMPTS:
try:
result = attempt_operation()
        except Exception as exc:
            last_exc = exc  # capture it, since "exc" is unbound once the clause ends
            continue # Failed, so try again
        log.debug(f"Succeeded after {attempt} attempts")
break # Success!
else:
        raise OperationFailed(f"Failed after {MAX_ATTEMPTS} attempts") from last_exc
Simplifying if-elif chains
--------------------------
if-elif chains that need to rebind the checked condition currently need to
be written using nested if-else statements::
m = pattern.match(data)
if m:
...
else:
m = other_pattern.match(data)
if m:
...
else:
m = yet_another_pattern.match(data)
if m:
...
else:
...
As with PEP 572, this PEP allows the else/if portions of that chain to be
condensed, making their consistent and mutually exclusive structure more
readily apparent::
m = pattern.match(data)
if m:
...
elif m := other_pattern.match(data):
...
elif m := yet_another_pattern.match(data):
...
else:
...
Unlike PEP 572, this PEP requires that the assignment target be explicitly
indicated as local before the first use as a ``:=`` target, either by
binding it to a value (as shown above), or else by including an appropriate
explicit type declaration::
m: typing.re.Match
if m := pattern.match(data):
...
elif m := other_pattern.match(data):
...
elif m := yet_another_pattern.match(data):
...
else:
...
Capturing intermediate values from comprehensions
-------------------------------------------------
The proposal in this PEP makes it straightforward to capture and reuse
intermediate values in comprehensions and generator expressions by
exporting them to the containing block scope::
factor: int
while any(n % (factor := p) == 0 for p in small_primes):
n //= factor
total = 0
partial_sums = [total += value for value in data]
Allowing lambda expressions to act more like re-usable code thunks
------------------------------------------------------------------
This PEP allows the classic closure usage example::
def make_counter(start=0):
x = start
def counter(step=1):
nonlocal x
x += step
return x
return counter
To be abbreviated as::
def make_counter(start=0):
x = start
return lambda step=1: x += step
While the latter form is still a conceptually dense piece of code, it can be
reasonably argued that the lack of boilerplate (where the "def", "nonlocal",
and "return" keywords and two additional repetitions of the "x" variable name
have been replaced with the "lambda" keyword) may make it easier to read in
practice.
Relationship with PEP 572
=========================
The case for allowing inline assignments at all is made in PEP 572. This
competing PEP was initially going to propose an alternate surface syntax
(``EXPR given NAME = EXPR``), while retaining the expression semantics from
PEP 572, but that changed when discussing one of the initial motivating use
cases for allowing embedded assignments at all: making it possible to easily
calculate cumulative sums in comprehensions and generator expressions.
As a result of that, and unlike PEP 572, this PEP focuses primarily on use
cases for inline augmented assignment. It also has the effect of converting
cases that currently inevitably raise ``UnboundLocalError`` at function call
time to report a new compile time ``TargetNameError``.
New syntax for a name rebinding expression (``NAME := EXPR``) is then added
not only to handle the same use cases as are identified in PEP 572, but also
as a lower level primitive to help illustrate, implement and explain
the new augmented assignment semantics, rather than being the sole change being
proposed.
The author of this PEP believes that this approach makes the value of the new
flexibility in name rebinding clearer, while also mitigating many of the
potential concerns raised with PEP 572 around explaining when to use
``NAME = EXPR`` over ``NAME := EXPR`` (and vice-versa), without resorting to
prohibiting the bare statement form of ``NAME := EXPR`` outright (such
that ``NAME := EXPR`` is a compile error, but ``(NAME := EXPR)`` is permitted).
Acknowledgements
================
The PEP author wishes to thank Chris Angelico for his work on PEP 572, and his
efforts to create a coherent summary of the great many sprawling discussions
that spawned on both python-ideas and python-dev, as well as Tim Peters for
the in-depth discussion of parent local scoping that prompted the above
scoping proposal for augmented assignments inside scoped expressions.
Eric Snow's feedback on a pre-release version of this PEP helped make it
significantly more readable.
References
==========
.. [1] The beginning of Tim's genexp & comprehension scoping thread
(https://mail.python.org/pipermail/python-ideas/2018-May/050367.html)
.. [2] Reintroducing the original cumulative sums use case
(https://mail.python.org/pipermail/python-ideas/2018-May/050544.html)
.. [3] Tim's language reference level explanation of his proposed scoping semantics
(https://mail.python.org/pipermail/python-ideas/2018-May/050729.html)
.. [4] Guido's endorsement of Tim's proposed genexp & comprehension scoping
(https://mail.python.org/pipermail/python-ideas/2018-May/050411.html)
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

492
pep-0578.rst Normal file
View File

@ -0,0 +1,492 @@
PEP: 578
Title: Python Runtime Audit Hooks
Version: $Revision$
Last-Modified: $Date$
Author: Steve Dower <steve.dower@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 16-Jun-2018
Python-Version: 3.8
Post-History:
Abstract
========
This PEP describes additions to the Python API and specific behaviors
for the CPython implementation that make actions taken by the Python
runtime visible to auditing tools. Visibility into these actions
provides opportunities for test frameworks, logging frameworks, and
security tools to monitor and optionally limit actions taken by the
runtime.
This PEP proposes adding two APIs to provide insights into a running
Python application: one for arbitrary events, and another specific to
the module import system. The APIs are intended to be available in all
Python implementations, though the specific messages and values used
are unspecified here to allow implementations the freedom to determine
how best to provide information to their users. Some examples likely
to be used in CPython are provided for explanatory purposes.
See PEP-551 for discussion and recommendations on enhancing the
security of a Python runtime making use of these auditing APIs.
Background
==========
Python provides access to a wide range of low-level functionality on
many common operating systems in a consistent manner. While this is
incredibly useful for "write-once, run-anywhere" scripting, it also
makes monitoring of software written in Python difficult. Because
Python uses native system APIs directly, existing monitoring
tools either suffer from limited context or auditing bypass.
Limited context occurs when system monitoring can report that an
action occurred, but cannot explain the sequence of events leading to
it. For example, network monitoring at the OS level may be able to
report "listening started on port 5678", but may not be able to
provide the process ID, command line or parent process, or the local
state in the program at the point that triggered the action. Firewall
controls to prevent such an action are similarly limited, typically
to a process name or some global state such as the current user, and
in any case rarely provide a useful log file correlated with other
application messages.
Auditing bypass can occur when the typical system tool used for an
action would ordinarily report its use, but accessing the APIs via
Python does not trigger this. For example, invoking "curl" to make HTTP
requests may be specifically monitored in an audited system, but
Python's "urlretrieve" function is not.
Within a long-running Python application, particularly one that
processes user-provided information such as a web app, there is a risk
of unexpected behavior. This may be due to bugs in the code, or
deliberately induced by a malicious user. In both cases, normal
application logging may be bypassed resulting in no indication that
anything out of the ordinary has occurred.
Additionally, and somewhat unique to Python, it is very easy to affect
the code that is run in an application by manipulating either the
import system's search path or placing files earlier on the path than
intended. This is often seen when developers create a script with the
same name as the module they intend to use - for example, a
``random.py`` file that attempts to import the standard library
``random`` module.
Overview of Changes
===================
The aim of these changes is to enable both application developers and
system administrators to integrate Python into their existing
monitoring systems without dictating how those systems look or behave.
We propose two API changes to enable this: an Audit Hook and Verified
Open Hook. Both are available from Python and native code, allowing
applications and frameworks written in pure Python code to take
advantage of the extra messages, while also allowing embedders or
system administrators to deploy "always-on" builds of Python.
Only CPython is bound to provide the native APIs as described here.
Other implementations should provide the pure Python APIs, and
may provide native versions as appropriate for their underlying
runtimes.
Audit Hook
----------
In order to observe actions taken by the runtime (on behalf of the
caller), an API is required to raise messages from within certain
operations. These operations are typically deep within the Python
runtime or standard library, such as dynamic code compilation, module
imports, DNS resolution, or use of certain modules such as ``ctypes``.
The following new C APIs allow embedders and CPython implementors to
send and receive audit hook messages::
# Add an auditing hook
typedef int (*hook_func)(const char *event, PyObject *args,
void *userData);
int PySys_AddAuditHook(hook_func hook, void *userData);
# Raise an event with all auditing hooks
int PySys_Audit(const char *event, PyObject *args);
# Internal API used during Py_Finalize() - not publicly accessible
    void _PySys_ClearAuditHooks(void);
The new Python APIs for receiving and raising audit hooks are::
# Add an auditing hook
    sys.addaudithook(hook: Callable[[str, tuple], Any])
# Raise an event with all auditing hooks
sys.audit(str, *args)
Hooks are added by calling ``PySys_AddAuditHook()`` from C at any time,
including before ``Py_Initialize()``, or by calling
``sys.addaudithook()`` from Python code. Hooks cannot be removed or
replaced.
When events of interest are occurring, code can either call
``PySys_Audit()`` from C (while the GIL is held) or ``sys.audit()``. The
string argument is the name of the event, and the tuple contains
arguments. A given event name should have a fixed schema for arguments,
which should be considered a public API (for a given x.y version
release), and thus should only change between feature releases with
updated documentation.
For maximum compatibility, events using the same name as an event in
the reference interpreter CPython should make every attempt to use
compatible arguments. Including the name or an abbreviation of the
implementation in implementation-specific event names will also help
prevent collisions. For example, a ``pypy.jit_invoked`` event is clearly
distinguished from an ``ipy.jit_invoked`` event.
When an event is audited, each hook is called in the order it was added
with the event name and tuple. If any hook returns with an exception
set, later hooks are ignored and *in general* the Python runtime should
terminate. This is intentional to allow hook implementations to decide
how to respond to any particular event. The typical responses will be to
log the event, abort the operation with an exception, or to immediately
terminate the process with an operating system exit call.
When an event is audited but no hooks have been set, the ``audit()``
function should include minimal overhead. Ideally, each argument is a
reference to existing data rather than a value calculated just for the
auditing call.
As hooks may be Python objects, they need to be freed during
``Py_Finalize()``. To do this, we add an internal API
``_PySys_ClearAuditHooks()`` that releases any Python hooks and any
memory held. This is an internal function with no public export, and
we recommend it should raise its own audit event for all current hooks
to ensure that unexpected calls are observed.
Below in `Suggested Audit Hook Locations`_, we recommend some important
operations that should raise audit events. In PEP 551, more audited
operations are recommended with a view to security transparency.
Python implementations should document which operations will raise
audit events, along with the event schema. It is intended that
``sys.addaudithook(print)`` be a trivial way to display all messages.
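For illustration, here is a minimal sketch of the proposed Python level API in
use (the ``myapp.download`` event name and the URL policy are invented for this
example)::

    import sys

    def audit_hook(event, args):
        print(f"audit: {event!r} {args!r}")        # log every event
        if event == "myapp.download" and not args[0].startswith("https://"):
            # Raising from a hook aborts the audited operation
            raise RuntimeError(f"insecure URL blocked: {args[0]}")

    sys.addaudithook(audit_hook)
    sys.audit("myapp.download", "https://example.com/data.json")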
Verified Open Hook
------------------
Most operating systems have a mechanism to distinguish between files
that can be executed and those that can not. For example, this may be an
execute bit in the permissions field, or a verified hash of the file
contents to detect potential code tampering. These are an important
security mechanism for preventing execution of data or code that is not
approved for a given environment. Currently, Python has no way to
integrate with these when launching scripts or importing modules.
The new public C API for the verified open hook is::
# Set the handler
typedef PyObject *(*hook_func)(PyObject *path, void *userData)
int PyImport_SetOpenForImportHook(hook_func handler, void *userData)
# Open a file using the handler
PyObject *PyImport_OpenForImport(const char *path)
The new public Python API for the verified open hook is::
# Open a file using the handler
importlib.util.open_for_import(path : str) -> io.IOBase
The ``importlib.util.open_for_import()`` function is a drop-in
replacement for ``open(str(pathlike), 'rb')``. Its default behaviour is
to open a file for raw, binary access. To change the behaviour a new
handler should be set. Handler functions only accept ``str`` arguments.
A custom handler may be set by calling ``PyImport_SetOpenForImportHook()``
from C at any time, including before ``Py_Initialize()``. However, if a
hook has already been set then the call will fail. When
``open_for_import()`` is called with a hook set, the hook will be passed
the path and its return value will be returned directly. The returned
object should be an open file-like object that supports reading raw
bytes. This is explicitly intended to allow a ``BytesIO`` instance if
the open handler has already had to read the file into memory in order
to perform whatever verification is necessary to determine whether the
content is permitted to be executed.
Note that these hooks can import and call the ``_io.open()`` function on
CPython without triggering themselves. They can also use ``_io.BytesIO``
to return a compatible result using an in-memory buffer.
If the hook determines that the file should not be loaded, it should
raise an exception of its choice, as well as performing any other
logging.
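As a sketch, a handler conforming to that description might look like the
following (the SHA-256 allowlist is a hypothetical verification policy invented
for this example; installing the handler still requires the C level
``PyImport_SetOpenForImportHook()`` call)::

    import _io
    import hashlib

    APPROVED_HASHES = set()   # hypothetical allowlist of permitted digests

    def verified_open(path):
        # Read the whole file up front so it can be verified before use
        with _io.open(path, "rb") as f:
            data = f.read()
        if hashlib.sha256(data).hexdigest() not in APPROVED_HASHES:
            raise ImportError(f"untrusted code rejected: {path}")
        # Return an in-memory stream, as explicitly permitted above
        return _io.BytesIO(data)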
All import and execution functionality involving code from a file will
be changed to use ``open_for_import()`` unconditionally. It is important
to note that calls to ``compile()``, ``exec()`` and ``eval()`` do not go
through this function - an audit hook that includes the code from these
calls is the best opportunity to validate code that is read from the
file. Given the current decoupling between import and execution in
Python, most imported code will go through both ``open_for_import()``
and the audit hook for ``compile``, and so care should be taken to avoid
repeating verification steps.
There is no Python API provided for changing the open hook. To modify
import behavior from Python code, use the existing functionality
provided by ``importlib``.
API Availability
----------------
While all the functions added here are considered public and stable API,
the behavior of the functions is implementation specific. Most
descriptions here refer to the CPython implementation, and while other
implementations should provide the functions, there is no requirement
that they behave the same.
For example, ``sys.addaudithook()`` and ``sys.audit()`` should exist but
may do nothing. This allows code to make calls to ``sys.audit()``
without having to test for existence, but it should not assume that its
call will have any effect. (Including existence tests in
security-critical code allows another vector to bypass auditing, so it
is preferable that the function always exist.)
``importlib.util.open_for_import(path)`` should at a minimum always
return ``_io.open(path, 'rb')``. Code using the function should make no
further assumptions about what may occur, and implementations other than
CPython are not required to let developers override the behavior of this
function with a hook.
Suggested Audit Hook Locations
==============================
The locations and parameters in calls to ``sys.audit()`` or
``PySys_Audit()`` are to be determined by individual Python
implementations. This is to allow maximum freedom for implementations
to expose the operations that are most relevant to their platform,
and to avoid or ignore potentially expensive or noisy events.
Table 1 offers both suggestions for operations that should trigger
audit events on all implementations, and examples of event schemas.
Table 2 provides further examples that are not required, but are
likely to be available in CPython.
Refer to the documentation associated with your version of Python to
see which operations provide audit events.
.. csv-table:: Table 1: Suggested Audit Hooks
:header: "API Function", "Event Name", "Arguments", "Rationale"
:widths: 2, 2, 3, 6
``PySys_AddAuditHook``, ``sys.addaudithook``, "", "Detect when new
audit hooks are being added.
"
``PyImport_SetOpenForImportHook``, ``setopenforimporthook``, "", "
Detects any attempt to set the ``open_for_import`` hook.
"
"``compile``, ``exec``, ``eval``, ``PyAst_CompileString``,
``PyAST_obj2mod``", ``compile``, "``(code, filename_or_none)``", "
Detect dynamic code compilation, where ``code`` could be a string or
AST. Note that this will be called for regular imports of source
code, including those that were opened with ``open_for_import``.
"
"``exec``, ``eval``, ``run_mod``", ``exec``, "``(code_object,)``", "
Detect dynamic execution of code objects. This only occurs for
explicit calls, and is not raised for normal function invocation.
"
``import``, ``import``, "``(module, filename, sys.path,
sys.meta_path, sys.path_hooks)``", "Detect when modules are
imported. This is raised before the module name is resolved to a
file. All arguments other than the module name may be ``None`` if
they are not used or available.
"
``PyEval_SetProfile``, ``sys.setprofile``, "", "Detect when code is
injecting trace functions. Because of the implementation, exceptions
raised from the hook will abort the operation, but will not be
raised in Python code. Note that ``threading.setprofile`` eventually
calls this function, so the event will be audited for each thread.
"
``PyEval_SetTrace``, ``sys.settrace``, "", "Detect when code is
injecting trace functions. Because of the implementation, exceptions
raised from the hook will abort the operation, but will not be
raised in Python code. Note that ``threading.settrace`` eventually
calls this function, so the event will be audited for each thread.
"
"``_PyObject_GenericSetAttr``, ``check_set_special_type_attr``,
``object_set_class``, ``func_set_code``, ``func_set_[kw]defaults``","
``object.__setattr__``","``(object, attr, value)``","Detect monkey
patching of types and objects. This event
is raised for the ``__class__`` attribute and any attribute on
``type`` objects.
"
"``_PyObject_GenericSetAttr``",``object.__delattr__``,"``(object,
attr)``","Detect deletion of object attributes. This event is raised
for any attribute on ``type`` objects.
"
"``Unpickler.find_class``",``pickle.find_class``,"``(module_name,
global_name)``","Detect imports and global name lookup when
unpickling.
"
.. csv-table:: Table 2: Potential CPython Audit Hooks
:header: "API Function", "Event Name", "Arguments", "Rationale"
:widths: 2, 2, 3, 6
``_PySys_ClearAuditHooks``, ``sys._clearaudithooks``, "", "Notifies
hooks they are being cleaned up, mainly in case the event is
triggered unexpectedly. This event cannot be aborted.
"
``code_new``, ``code.__new__``, "``(bytecode, filename, name)``", "
Detect dynamic creation of code objects. This only occurs for
direct instantiation, and is not raised for normal compilation.
"
``func_new_impl``, ``function.__new__``, "``(code,)``", "Detect
dynamic creation of function objects. This only occurs for direct
instantiation, and is not raised for normal compilation.
"
"``_ctypes.dlopen``, ``_ctypes.LoadLibrary``", ``ctypes.dlopen``, "
``(module_or_path,)``", "Detect when native modules are used.
"
``_ctypes._FuncPtr``, ``ctypes.dlsym``, "``(lib_object, name)``", "
Collect information about specific symbols retrieved from native
modules.
"
``_ctypes._CData``, ``ctypes.cdata``, "``(ptr_as_int,)``", "Detect
when code is accessing arbitrary memory using ``ctypes``.
"
"``new_mmap_object``",``mmap.__new__``,"``(fileno, map_size, access,
offset)``", "Detects creation of mmap objects. On POSIX, access may
have been calculated from the ``prot`` and ``flags`` arguments.
"
``sys._getframe``, ``sys._getframe``, "``(frame_object,)``", "Detect
when code is accessing frames directly.
"
``sys._current_frames``, ``sys._current_frames``, "", "Detect when
code is accessing frames directly.
"
"``socket.bind``, ``socket.connect``, ``socket.connect_ex``,
``socket.getaddrinfo``, ``socket.getnameinfo``, ``socket.sendmsg``,
``socket.sendto``", ``socket.address``, "``(address,)``", "Detect
access to network resources. The address is unmodified from the
original call.
"
"``member_get``, ``func_get_code``, ``func_get_[kw]defaults``
",``object.__getattr__``,"``(object, attr)``","Detect access to
restricted attributes. This event is raised for any built-in
members that are marked as restricted, and members that may allow
bypassing imports.
"
"``urllib.urlopen``",``urllib.Request``,"``(url, data, headers,
method)``", "Detects URL requests.
"
Performance Impact
==================
The important performance impact is the case where events are being
raised but there are no hooks attached. This is the unavoidable case -
once a distributor begins adding audit hooks they have explicitly
chosen to trade performance for functionality. The performance impact
with hooks added is not of interest here, since this is considered
opt-in functionality.
Analysis using the Python Performance Benchmark Suite [1]_ shows no
significant impact, with the vast majority of benchmarks showing
between 1.05x faster and 1.05x slower.
In our opinion, the performance impact of the set of auditing points
described in this PEP is negligible.
Rejected Ideas
==============
Separate module for audit hooks
-------------------------------
The proposal is to add a new module for audit hooks, hypothetically
``audit``. This would separate the API and implementation from the
``sys`` module, and allow naming the C functions ``PyAudit_AddHook`` and
``PyAudit_Audit`` rather than the current variations.
Any such module would need to be a built-in module that is guaranteed to
always be present. The nature of these hooks is that they must be
callable without condition, as any conditional imports or calls provide
opportunities to intercept and suppress or modify events.
Given its nature as one of the most core modules, the ``sys`` module is
somewhat protected against module shadowing attacks. Replacing ``sys``
with a sufficiently functional module that the application can still run
is a much more complicated task than replacing a module with only one
function of interest. An attacker that has the ability to shadow the
``sys`` module is already capable of running arbitrary code from files,
whereas an ``audit`` module can be replaced with a single line in a
``.pth`` file anywhere on the search path::
import sys; sys.modules['audit'] = type('audit', (object,),
{'audit': lambda *a: None, 'addhook': lambda *a: None})
Multiple layers of protection already exist for monkey patching attacks
against either ``sys`` or ``audit``, but assignments or insertions to
``sys.modules`` are not audited.
This idea is rejected because it would make it trivial to substitute the
``audit`` module, silently disabling auditing for all callers at once.
Flag in sys.flags to indicate "audited" mode
--------------------------------------------
The proposal is to add a value in ``sys.flags`` to indicate when Python
is running in a "secure" or "audited" mode. This would allow
applications to detect when some features are enabled or when hooks
have been added and modify their behaviour appropriately.
Currently, we are not aware of any legitimate reasons for a program to
behave differently in the presence of audit hooks.
Both application-level APIs ``sys.audit`` and
``importlib.util.open_for_import`` are always present and functional,
regardless of whether the regular ``python`` entry point or some
alternative entry point is used. Callers cannot determine whether any
hooks have been added (except by performing side-channel analysis), nor
do they need to. The calls should be fast enough that callers do not
need to avoid them, and the program is responsible for ensuring that
any added hooks are fast enough to not affect application performance.
The argument that this is "security by obscurity" is valid, but
irrelevant. Security by obscurity is only an issue when there are no
other protective mechanisms; obscurity as the first step in avoiding
attack is strongly recommended (see `this article
<https://danielmiessler.com/study/security-by-obscurity/>`_ for
discussion).
This idea is rejected because there are no appropriate reasons for an
application to change its behaviour based on whether these APIs are in
use.
Relationship to PEP 551
=======================
This API was originally presented as part of
`PEP 551 <https://www.python.org/dev/peps/pep-0551/>`_ Security
Transparency in the Python Runtime.
For simpler review purposes, and due to the broader applicability of
these APIs beyond security, the API design is now presented separately.
PEP 551 is an informational PEP discussing how to integrate Python into
a secure or audited environment.
References
==========
.. [1] Python Performance Benchmark Suite `<https://github.com/python/performance>`_
Copyright
=========
Copyright (c) 2018 by Microsoft Corporation. This material may be
distributed only subject to the terms and conditions set forth in the
Open Publication License, v1.0 or later (the latest version is presently
available at http://www.opencontent.org/openpub/).

416
pep-0579.rst Normal file
View File

@ -0,0 +1,416 @@
PEP: 579
Title: Refactoring C functions and methods
Author: Jeroen Demeyer <J.Demeyer@UGent.be>
Status: Draft
Type: Informational
Content-Type: text/x-rst
Created: 04-Jun-2018
Post-History: 20-Jun-2018
Abstract
========
This meta-PEP collects various issues with CPython's existing implementation
of built-in functions (functions implemented in C) and methods.
Fixing all these issues is too much for one PEP,
so that will be delegated to other standards track PEPs.
However, this PEP does give some brief ideas of possible fixes.
This is mainly meant to coordinate an overall strategy.
For example, a proposed solution may sound too complicated
for fixing any one single issue, but it may be the best overall
solution for multiple issues.
This PEP is purely informational:
it does not imply that all issues will eventually
be fixed, nor that they will be fixed using the solution proposed here.
It also serves as a check-list of possible requested features
to verify that a given fix does not make those
other features harder to implement.
The major proposed change is replacing ``PyMethodDef``
by a new structure ``PyCCallDef``
which collects everything needed for calling the function/method.
In the ``PyTypeObject`` structure, a new field ``tp_ccalloffset``
is added giving an offset to a ``PyCCallDef *`` in the object structure.
**NOTE**: This PEP deals only with CPython implementation details,
it does not affect the Python language or standard library.
Issues
======
This lists various issues with built-in functions and methods,
together with a plan for a solution and (if applicable)
pointers to standards track PEPs discussing the details.
1. Naming
---------
The word "built-in" is overused in Python.
From a quick skim of the Python documentation, it mostly refers
to things from the ``builtins`` module.
In other words: things which are available in the global namespace
without a need for importing them.
This conflicts with the use of the word "built-in" to mean "implemented in C".
**Solution**: since the C structure for built-in functions and methods is already
called ``PyCFunctionObject``,
let's use the name "cfunction" and "cmethod" instead of "built-in function"
and "built-in method".
2. Not extendable
-----------------
The various classes involved (such as ``builtin_function_or_method``)
cannot be subclassed::
>>> from types import BuiltinFunctionType
>>> class X(BuiltinFunctionType):
... pass
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: type 'builtin_function_or_method' is not an acceptable base type
This is a problem because it makes it impossible to add features
such as introspection support to these classes.
If one wants to implement a function in C with additional functionality,
an entirely new class must be implemented from scratch.
The problem with this is that the existing classes like
``builtin_function_or_method`` are special-cased in the Python interpreter
to allow faster calling (for example, by using ``METH_FASTCALL``).
It is currently impossible to have a custom class with the same optimizations.
**Solution**: make the existing optimizations available to arbitrary classes.
This is done by adding a new ``PyTypeObject`` field ``tp_ccalloffset``
(or can we re-use ``tp_print`` for that?)
specifying the offset of a ``PyCCallDef`` pointer.
This is a new structure holding all information needed to call
a cfunction and it would be used instead of ``PyMethodDef``.
This implements the new "C call" protocol.
For constructing cfunctions and cmethods, ``PyMethodDef`` arrays
will still be used (for example, in ``tp_methods``) but that will
be the *only* remaining purpose of the ``PyMethodDef`` structure.
Additionally, we can also make some function classes subclassable.
However, this seems less important once we have ``tp_ccalloffset``.
**Reference**: PEP 580
3. cfunctions do not become methods
-----------------------------------
A cfunction like ``repr`` does not implement ``__get__`` to bind
as a method::
>>> class X:
... meth = repr
>>> x = X()
>>> x.meth()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: repr() takes exactly one argument (0 given)
In this example, one would have expected that ``x.meth()`` returns
``repr(x)`` by applying the normal rules of methods.
This is surprising and a needless difference
between cfunctions and Python functions.
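For comparison, wrapping the same cfunction in a true Python function restores
the expected binding behaviour::

    >>> class X:
    ...     meth = lambda self: repr(self)  # Python functions bind as methods
    ...
    >>> x = X()
    >>> x.meth() == repr(x)
    True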
For the standard built-in functions, this is not really a problem
since those are not meant to be used as methods.
But it does become a problem when one wants to implement a
new cfunction with the goal of being usable as a method.
Again, a solution could be to create a new class behaving just
like cfunctions but which bind as methods.
However, that would lose some existing optimizations for methods,
such as the ``LOAD_METHOD``/``CALL_METHOD`` opcodes.
**Solution**: the same as the previous issue.
It just shows that handling ``self`` and ``__get__``
should be part of the new C call protocol.
For backwards compatibility, we would keep the existing non-binding
behavior of cfunctions. We would just allow it in custom classes.
**Reference**: PEP 580
4. Semantics of inspect.isfunction
----------------------------------
Currently, ``inspect.isfunction`` returns ``True`` only for instances
of ``types.FunctionType``.
That is, true Python functions.
A common use case for ``inspect.isfunction`` is checking for introspection support:
it guarantees for example that ``inspect.getfile()`` will work.
Ideally, it should be possible for other classes to be treated as
functions too.
**Solution**: introduce a new ``InspectFunction`` abstract base class
and use that to implement ``inspect.isfunction``.
Alternatively, use duck typing for ``inspect.isfunction``
(as proposed in [#bpo30071]_)::
def isfunction(obj):
return hasattr(type(obj), "__code__")
5. C functions should have access to the function object
--------------------------------------------------------
The underlying C function of a cfunction currently
takes a ``self`` argument (for bound methods)
and then possibly a number of arguments.
There is no way for the C function to actually access the Python
cfunction object (the ``self`` in ``__call__`` or ``tp_call``).
This would for example allow implementing the
C call protocol for Python functions (``types.FunctionType``):
the C function which implements calling Python functions
needs access to the ``__code__`` attribute of the function.
This is also needed for PEP 573
where all cfunctions require access to their "parent"
(the module for functions of a module or the defining class
for methods).
**Solution**: add a new ``PyMethodDef`` flag to specify
that the C function takes an additional argument (as first argument),
namely the function object.
**References**: PEP 580, PEP 573
6. METH_FASTCALL is private and undocumented
--------------------------------------------
The ``METH_FASTCALL`` mechanism allows calling cfunctions and cmethods
using a C array of Python objects instead of a ``tuple``.
This was introduced in Python 3.6 for positional arguments only
and extended in Python 3.7 with support for keyword arguments.
However, given that it is undocumented,
it is presumably only supposed to be used by CPython itself.
**Solution**: since this is an important optimization,
everybody should be encouraged to use it.
Now that the implementation of ``METH_FASTCALL`` is stable, document it!
As part of the C call protocol, we should also add a C API function ::
PyObject *PyCCall_FastCall(PyObject *func, PyObject *const *args, Py_ssize_t nargs, PyObject *keywords)
**Reference**: PEP 580
7. Allowing native C arguments
------------------------------
A cfunction always takes its arguments as Python objects
(say, an array of ``PyObject`` pointers).
In cases where the cfunction is really wrapping a native C function
(for example, coming from ``ctypes`` or some compiler like Cython),
this is inefficient: calls from C code to C code are forced to use
Python objects to pass arguments.
Analogous to the buffer protocol which allows access to C data,
we should also allow access to the underlying C callable.
**Solution**: when wrapping a C function with native arguments
(for example, a C ``long``) inside a cfunction,
we should also store a function pointer to the underlying C function,
together with its C signature.
Argument Clinic could automatically do this by storing
a pointer to the "impl" function.
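The kind of information that would need to be stored can already be seen in
how ``ctypes`` describes a native callable today (this assumes a POSIX style C
math library that ``find_library("m")`` can locate)::

    import ctypes
    import ctypes.util

    libm = ctypes.CDLL(ctypes.util.find_library("m"))
    libm.pow.restype = ctypes.c_double                       # native return type
    libm.pow.argtypes = (ctypes.c_double, ctypes.c_double)   # native argument types

    # Each call still round-trips through Python float objects; exposing the
    # underlying pointer and signature would let C callers skip that step.
    print(libm.pow(2.0, 10.0))   # -> 1024.0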
8. Complexity
-------------
There are a huge number of classes involved to implement
all variations of methods.
This is not a problem by itself, but it compounds the other issues here.
For ordinary Python classes, the table below gives the classes
for various kinds of methods.
The columns refer to the class in the class ``__dict__``,
the class for unbound methods (bound to the class)
and the class for bound methods (bound to the instance):
============= ================ ============ ============
kind __dict__ unbound bound
============= ================ ============ ============
Normal method ``function`` ``function`` ``method``
Static method ``staticmethod`` ``function`` ``function``
Class method ``classmethod`` ``method`` ``method``
Slot method ``function`` ``function`` ``method``
============= ================ ============ ============
This is the analogous table for extension types (C classes):
============= ========================== ============================== ==============================
kind __dict__ unbound bound
============= ========================== ============================== ==============================
Normal method ``method_descriptor`` ``method_descriptor`` ``builtin_function_or_method``
Static method ``staticmethod`` ``builtin_function_or_method`` ``builtin_function_or_method``
Class method ``classmethod_descriptor`` ``builtin_function_or_method`` ``builtin_function_or_method``
Slot method ``wrapper_descriptor`` ``wrapper_descriptor`` ``method-wrapper``
============= ========================== ============================== ==============================
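These differences are easy to observe from the interpreter::

    >>> class A:
    ...     def meth(self): pass
    ...
    >>> type(A.__dict__['meth']).__name__, type(A().meth).__name__
    ('function', 'method')
    >>> type(dict.__dict__['get']).__name__, type({}.get).__name__
    ('method_descriptor', 'builtin_function_or_method')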
There are a lot of classes involved
and these two tables look very different.
There is no good reason why Python methods should be
treated fundamentally differently from C methods.
Also the features are slightly different:
for example, ``method`` supports ``__func__``
but ``builtin_function_or_method`` does not.
Since CPython has optimizations for calls to most of these objects,
the code for dealing with them can also become complex.
A good example of this is the ``call_function`` function in ``Python/ceval.c``.
**Solution**: all these classes should implement the C call protocol.
Then the complexity in the code can mostly be fixed by
checking for the C call protocol (``tp_ccalloffset != 0``)
instead of doing type checks.
Furthermore, it should be investigated whether some of these classes can be merged
and whether ``method`` can also be reused for bound methods of extension types
(see PEP 576 for the latter,
keeping in mind that this may have some minor backwards compatibility issues).
This is not a goal by itself but just something to keep in mind
when working on these classes.
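For illustration, a sketch of the simplified dispatch, assuming the
``PyCCall_FastCall`` function proposed above and a protocol check
``PyCCall_Check`` as proposed in PEP 580 (``_PyObject_FastCall`` is
CPython's existing private fallback)::

    static PyObject *
    call_object(PyObject *callable, PyObject *const *args,
                Py_ssize_t nargs)
    {
        if (PyCCall_Check(callable)) {
            /* One fast path for every class implementing
               the C call protocol */
            return PyCCall_FastCall(callable, args, nargs, NULL);
        }
        /* Generic fallback through tp_call */
        return _PyObject_FastCall(callable, args, nargs);
    }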
9. PyMethodDef is too limited
-----------------------------
The typical way to create a cfunction or cmethod in an extension module
is by using a ``PyMethodDef`` to define it.
These are then stored in an array ``PyModuleDef.m_methods``
(for cfunctions) or ``PyTypeObject.tp_methods`` (for cmethods).
However, because of the stable ABI (PEP 384),
we cannot change the ``PyMethodDef`` structure.
So we cannot add new fields for creating cfunctions/cmethods
this way.
This is probably the reason for the hack that
``__doc__`` and ``__text_signature__`` are stored in the same C string
(with the ``__doc__`` and ``__text_signature__`` descriptors extracting
the relevant part).
**Solution**: stop assuming that a single ``PyMethodDef`` entry
is sufficient to describe a cfunction/cmethod.
Instead, we could add some flag which means that one of the ``PyMethodDef``
fields is instead a pointer to an additional structure.
Or, we could add a flag to use two or more consecutive ``PyMethodDef``
entries in the array to store more data.
Then the ``PyMethodDef`` array would be used only to construct
cfunctions/cmethods but it would no longer be used after that.
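A sketch of the first suggestion, with hypothetical names (nothing
like ``METH_EXTENDED`` exists today): a flag signalling that
``ml_meth`` points to an extension structure instead of the C
function itself::

    typedef struct {
        PyCFunction actual_func;     /* the real C function */
        const char *doc;             /* __doc__, stored separately */
        const char *text_signature;  /* __text_signature__ */
    } PyMethodDefEx;

    static PyObject *
    my_func(PyObject *self, PyObject *arg)
    {
        Py_RETURN_NONE;  /* placeholder behaviour */
    }

    static PyMethodDefEx my_func_ex = {
        my_func, "Do something.", "($self, arg, /)"
    };

    static PyMethodDef methods[] = {
        /* ml_meth reused as a pointer to the extension structure */
        {"my_func", (PyCFunction)&my_func_ex, METH_O | METH_EXTENDED, NULL},
        {NULL, NULL, 0, NULL}
    };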
10. Slot wrappers have no custom documentation
----------------------------------------------
Right now, slot wrappers like ``__init__`` or ``__lt__`` only have very
generic documentation, not at all specific to the class::
>>> list.__init__.__doc__
'Initialize self. See help(type(self)) for accurate signature.'
>>> list.__lt__.__doc__
'Return self<value.'
The same happens for the signature::
>>> list.__init__.__text_signature__
'($self, /, *args, **kwargs)'
As you can see, slot wrappers do support ``__doc__``
and ``__text_signature__``.
The problem is that these are stored in ``struct wrapperbase``,
which is common for all wrappers of a specific slot
(for example, the same ``wrapperbase`` is used for ``str.__eq__`` and ``int.__eq__``).
**Solution**: rethink the slot wrapper class to allow docstrings
(and text signatures) for each instance separately.
This still leaves the question of how extension modules
should specify the documentation.
The ``PyTypeObject`` entries like ``tp_init`` are just function pointers,
so we cannot attach documentation to them.
One solution would be to add entries to the ``tp_methods`` array
just for adding docstrings.
Such an entry could look like ::
{"__init__", NULL, METH_SLOTDOC, "pointer to __init__ doc goes here"}
11. Static methods and class methods should be callable
-------------------------------------------------------
Instances of ``staticmethod`` and ``classmethod`` should be callable.
Admittedly, there is no strong use case for this,
but it has occasionally been requested (see for example [#bpo20309]_).
Making static/class methods callable would increase consistency.
First of all, function decorators typically add functionality or modify
a function, but the result remains callable. This is not true for
``@staticmethod`` and ``@classmethod``.
Second, class methods of extension types are already callable::
>>> fromhex = float.__dict__["fromhex"]
>>> type(fromhex)
<class 'classmethod_descriptor'>
>>> fromhex(float, "0xff")
255.0
Third, one can see ``function``, ``staticmethod`` and ``classmethod``
as different kinds of unbound methods:
they all become ``method`` when bound, but the implementation of ``__get__``
is slightly different.
From this point of view, it looks strange that ``function`` is callable
but the others are not.
**Solution**:
when changing the implementation of ``staticmethod``, ``classmethod``,
we should consider making instances callable.
Even if this is not a goal by itself, it may happen naturally
because of the implementation.
References
==========
.. [#bpo20309] Not all method descriptors are callable
(https://bugs.python.org/issue20309)
.. [#bpo30071] Duck-typing inspect.isfunction()
(https://bugs.python.org/issue30071)
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:
611
pep-0580.rst Normal file
View File
@ -0,0 +1,611 @@
PEP: 580
Title: The C call protocol
Author: Jeroen Demeyer <J.Demeyer@UGent.be>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 14-Jun-2018
Python-Version: 3.8
Post-History: 20-Jun-2018, 22-Jun-2018, 16-Jul-2018
Abstract
========
A new "C call" protocol is proposed.
It is meant for classes representing functions or methods
which need to implement fast calling.
The goal is to generalize existing optimizations for built-in functions
to arbitrary extension types.
In the reference implementation,
this new protocol is used for the existing classes
``builtin_function_or_method`` and ``method_descriptor``.
However, in the future, more classes may implement it.
**NOTE**: This PEP deals only with CPython implementation details;
it does not affect the Python language or standard library.
Motivation
==========
Currently, the Python bytecode interpreter has various optimizations
for calling instances of ``builtin_function_or_method``,
``method_descriptor``, ``method`` and ``function``.
However, none of these classes is subclassable.
Therefore, these optimizations are not available to
user-defined extension types.
If this PEP is implemented, then the checks
for ``builtin_function_or_method`` and ``method_descriptor``
could be replaced by simply checking for and using the C call protocol.
This simplifies existing code.
We also design the C call protocol such that it can easily
be extended with new features in the future.
For more background and motivation, see PEP 579.
Basic idea
==========
Currently, CPython has multiple optimizations for fast calling
for a few specific function classes.
Calling instances of these classes using a plain ``tp_call`` is slower
than using the optimizations.
The basic idea of this PEP is to allow user-defined extension types
(not Python classes) to use these optimizations as well,
both as caller and as callee.
The existing class ``builtin_function_or_method`` and a few others
use a ``PyMethodDef`` structure for describing the underlying C function and its signature.
The first concrete change is that this is replaced by a new structure ``PyCCallDef``.
This stores some of the same information as a ``PyMethodDef``,
but with one important addition:
the "parent" of the function (the class or module where it is defined).
Note that ``PyMethodDef`` arrays are still used to construct
functions/methods but no longer for calling them.
Second, we want every class to be able to use such a ``PyCCallDef`` for optimizing calls,
so the ``PyTypeObject`` structure gains a ``tp_ccalloffset`` field
giving an offset to a ``PyCCallDef *`` in the object structure
and a flag ``Py_TPFLAGS_HAVE_CCALL`` indicating that ``tp_ccalloffset`` is valid.
Third, since we want to deal efficiently with unbound and bound methods too
(as opposed to only plain functions), we need to handle ``__self__`` too:
after the ``PyCCallDef *`` in the object structure,
there is a ``PyObject *self`` field.
These two fields together are referred to as a ``PyCCallRoot`` structure.
The new protocol for efficiently calling objects using these new structures
is called the "C call protocol".
New data structures
===================
The ``PyTypeObject`` structure gains a new field ``Py_ssize_t tp_ccalloffset``
and a new flag ``Py_TPFLAGS_HAVE_CCALL``.
If this flag is set, then ``tp_ccalloffset`` is assumed to be a valid
offset inside the object structure (similar to ``tp_weaklistoffset``).
It must be a strictly positive integer.
At that offset, a ``PyCCallRoot`` structure appears::
typedef struct {
PyCCallDef *cr_ccall;
PyObject *cr_self; /* __self__ argument for methods */
} PyCCallRoot;
The ``PyCCallDef`` structure contains everything needed to describe how
the function can be called::
typedef struct {
uint32_t cc_flags;
PyCFunc cc_func; /* C function to call */
PyObject *cc_parent; /* class or module */
} PyCCallDef;
The reason for putting ``__self__`` outside of ``PyCCallDef``
is that ``PyCCallDef`` is not meant to be changed after creating the function.
A single ``PyCCallDef`` can be shared
by an unbound method and multiple bound methods.
This wouldn't work if ``__self__`` were stored inside that structure.
**NOTE**: unlike ``tp_dictoffset`` we do not allow negative numbers
for ``tp_ccalloffset`` to mean counting from the end.
There does not seem to be a use case for it and it would only complicate
the implementation.
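To make the layout concrete, here is a sketch of an extension type
opting into the protocol using the structures above (the type and
field names are illustrative)::

    typedef struct {
        PyObject_HEAD
        PyCCallRoot ccall_root;   /* cr_ccall + cr_self */
        /* ... other per-instance fields ... */
    } MyCallableObject;

    static PyTypeObject MyCallable_Type = {
        PyVarObject_HEAD_INIT(NULL, 0)
        .tp_name = "MyCallable",
        .tp_basicsize = sizeof(MyCallableObject),
        .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_CCALL,
        .tp_ccalloffset = offsetof(MyCallableObject, ccall_root),
        .tp_call = PyCCall_Call,  /* required by the protocol */
    };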
Parent
------
The ``cc_parent`` field (accessed for example by a ``__parent__``
or ``__objclass__`` descriptor from Python code) can be any Python object.
For methods of extension types, this is set to the class.
For functions of modules, this is set to the module.
The parent serves multiple purposes: for methods of extension types,
it is used for type checks like the following::
>>> list.append({}, "x")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: descriptor 'append' requires a 'list' object but received a 'dict'
PEP 573 specifies that every function should have access to the
module in which it is defined.
For functions of a module, this is given by the parent.
For methods, this works indirectly through the class,
assuming that the class has a pointer to the module.
The parent would also typically be used to implement ``__qualname__``.
The new C API function ``PyCCall_GenericGetQualname()`` does exactly that.
Custom classes are free to set ``cc_parent`` to whatever they want.
It is only used by the C call protocol if the ``CCALL_OBJCLASS`` flag is set.
Using tp_print
--------------
We propose to replace the existing unused field ``tp_print``
by ``tp_ccalloffset``.
Since ``Py_TPFLAGS_HAVE_CCALL`` would *not* be added to
``Py_TPFLAGS_DEFAULT``, this ensures full backwards compatibility for
existing extension modules setting ``tp_print``.
It also means that we can require that ``tp_ccalloffset`` is a valid
offset when ``Py_TPFLAGS_HAVE_CCALL`` is specified:
we do not need to check ``tp_ccalloffset != 0``.
In future Python versions, we may decide that ``tp_print``
becomes ``tp_ccalloffset`` unconditionally,
drop the ``Py_TPFLAGS_HAVE_CCALL`` flag and instead check for
``tp_ccalloffset != 0``.
The C call protocol
===================
We say that a class implements the C call protocol
if it has the ``Py_TPFLAGS_HAVE_CCALL`` flag set
(as explained above, it must then set ``tp_ccalloffset > 0``).
Such a class must implement ``__call__`` as described in this section
(in practice, this just means setting ``tp_call`` to ``PyCCall_Call``).
The ``cc_func`` field is a C function pointer.
Its precise signature depends on flags.
Below are the possible values for ``cc_flags & CCALL_SIGNATURE``
together with the arguments that the C function takes.
The return value is always ``PyObject *``.
The following are completely analogous to the existing ``PyMethodDef``
signature flags:
- ``CCALL_VARARGS``: ``cc_func(PyObject *self, PyObject *args)``
- ``CCALL_VARARGS | CCALL_KEYWORDS``: ``cc_func(PyObject *self, PyObject *args, PyObject *kwds)``
- ``CCALL_FASTCALL``: ``cc_func(PyObject *self, PyObject *const *args, Py_ssize_t nargs)``
- ``CCALL_FASTCALL | CCALL_KEYWORDS``: ``cc_func(PyObject *self, PyObject *const *args, Py_ssize_t nargs, PyObject *kwnames)``
- ``CCALL_NULLARG``: ``cc_func(PyObject *self, PyObject *null)``
(the function takes no arguments but a ``NULL`` is passed to the C function)
- ``CCALL_O``: ``cc_func(PyObject *self, PyObject *arg)``
The flag ``CCALL_FUNCARG`` may be combined with any of these.
If so, the C function takes an additional argument as first argument
which is the function object (the ``self`` in ``__call__``).
For example, we have the following signature:
- ``CCALL_FUNCARG | CCALL_VARARGS``: ``cc_func(PyObject *func, PyObject *self, PyObject *args)``
**NOTE**: in the case of bound methods, it is currently unspecified
whether the "function object" in the paragraph above refers
to the bound method or the original function (which is wrapped by the bound method).
In the reference implementation, the bound method is passed.
In the future, this may change to the wrapped function.
Despite this ambiguity, the implementation of bound methods
guarantees that ``PyCCall_CCALLDEF(func)``
points to the ``PyCCallDef`` of the original function.
**NOTE**: unlike the existing ``METH_...`` flags,
the ``CCALL_...`` constants do not necessarily represent single bits.
So checking ``(cc_flags & CCALL_VARARGS) == 0`` is not a valid way
for checking the signature.
There are also no guarantees of binary compatibility
between Python versions for these flags.
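Consequently, code inspecting the flags should extract the signature
by masking with ``CCALL_SIGNATURE`` and compare for equality, never
test individual bits. A sketch, assuming the API proposed in this
PEP::

    static int
    is_fastcall_with_keywords(PyObject *func)
    {
        uint32_t sig = PyCCall_FLAGS(func) & CCALL_SIGNATURE;
        return sig == (CCALL_FASTCALL | CCALL_KEYWORDS);
    }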
Checking __objclass__
---------------------
If the ``CCALL_OBJCLASS`` flag is set and if ``cr_self`` is NULL
(this is the case for unbound methods of extension types),
then a type check is done:
the function must be called with at least one positional argument
and the first (typically called ``self``) must be an instance of
``cc_parent`` (which must be a class).
If not, a ``TypeError`` is raised.
Self slicing
------------
If ``cr_self`` is not NULL or if the flag ``CCALL_SLICE_SELF``
is not set in ``cc_flags``, then the argument passed as ``self``
is simply ``cr_self``.
If ``cr_self`` is NULL and the flag ``CCALL_SLICE_SELF`` is set,
then the first positional argument is removed from
``args`` and instead passed as first argument to the C function.
Effectively, the first positional argument is treated as ``__self__``.
If there are no positional arguments, ``TypeError`` is raised.
This process is called self slicing and a function is said to have self
slicing if ``cr_self`` is NULL and ``CCALL_SLICE_SELF`` is set.
Note that a ``CCALL_NULLARG`` function with self slicing effectively has
one argument, namely ``self``.
Analogously, a ``CCALL_O`` function with self slicing has two arguments.
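For illustration, a sketch of self slicing as a caller could
implement it, assuming the structures proposed in this PEP (the final
dispatch through ``cc_func`` is elided)::

    /* Compute the effective __self__ and slice the argument array if
       required. Returns -1 with an exception set on error. */
    static int
    apply_self_slicing(PyObject *func, PyObject **p_self,
                       PyObject *const **p_args, Py_ssize_t *p_nargs)
    {
        PyCCallRoot *cr = PyCCall_CCALLROOT(func);
        *p_self = cr->cr_self;
        if (cr->cr_self == NULL &&
            (cr->cr_ccall->cc_flags & CCALL_SLICE_SELF)) {
            if (*p_nargs == 0) {
                PyErr_SetString(PyExc_TypeError,
                                "method requires at least one argument");
                return -1;
            }
            *p_self = (*p_args)[0];  /* first positional arg is __self__ */
            (*p_args)++;             /* ...and is sliced off */
            (*p_nargs)--;
        }
        return 0;
    }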
Descriptor behavior
-------------------
Classes supporting the C call protocol
must implement the descriptor protocol in a specific way.
This is required for an efficient implementation of bound methods:
it allows sharing the ``PyCCallDef`` structure between bound and unbound methods.
It is also needed for a correct implementation of ``_PyObject_GetMethod``
which is used by the ``LOAD_METHOD``/``CALL_METHOD`` optimization.
First of all, if ``func`` supports the C call protocol,
then ``func.__set__`` must not be implemented.
Second, ``func.__get__`` must behave as follows:
- If ``cr_self`` is not NULL, then ``__get__`` must be a no-op
in the sense that ``func.__get__(obj, cls)(*args, **kwds)``
behaves exactly the same as ``func(*args, **kwds)``.
It is also allowed for ``__get__`` to be not implemented at all.
- If ``cr_self`` is NULL, then ``func.__get__(obj, cls)(*args, **kwds)``
(with ``obj`` not None)
must be equivalent to ``func(obj, *args, **kwds)``.
In particular, ``__get__`` must be implemented in this case.
Note that this is unrelated to self slicing: ``obj`` may be passed
as ``self`` argument to the C function or it may be the first positional argument.
- If ``cr_self`` is NULL, then ``func.__get__(None, cls)(*args, **kwds)``
must be equivalent to ``func(*args, **kwds)``.
There are no restrictions on the object returned by ``func.__get__(obj, cls)``;
for example, it is not required to implement the C call protocol.
The rules above only specify what ``func.__get__(obj, cls).__call__`` does.
For classes that do not care about ``__self__`` and ``__get__`` at all,
the easiest solution is to assign ``cr_self = Py_None``
(or any other non-NULL value).
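A sketch of a conforming ``tp_descr_get``, assuming a hypothetical
bound-method constructor ``MyBoundMethod_New`` that shares the
``PyCCallDef`` of the unbound function::

    static PyObject *
    myfunc_descr_get(PyObject *func, PyObject *obj, PyObject *cls)
    {
        PyCCallRoot *cr = PyCCall_CCALLROOT(func);
        if (cr->cr_self != NULL || obj == NULL) {
            /* Already bound, or looked up on the class: no-op */
            Py_INCREF(func);
            return func;
        }
        /* Unbound with an instance: bind by sharing the PyCCallDef */
        return MyBoundMethod_New(cr->cr_ccall, obj);
    }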
__name__ attribute
------------------
The C call protocol requires that the function has a ``__name__``
attribute which is of type ``str`` (not a subclass).
Furthermore, this must be idempotent in the sense
that getting the ``__name__`` attribute twice in a row must return
exactly the same Python object.
This implies that it cannot be a temporary object; it must be stored somewhere.
This is required because ``PyEval_GetFuncName`` and ``PyEval_GetFuncDesc``
use borrowed references to the ``__name__`` attribute.
Generic API functions
---------------------
This section lists the new public API functions dealing with the C call protocol.
- ``int PyCCall_Check(PyObject *op)``:
return true if ``op`` implements the C call protocol.
All the functions and macros below
apply to any instance supporting the C call protocol.
In other words, ``PyCCall_Check(func)`` must be true.
- ``PyObject * PyCCall_Call(PyObject *func, PyObject *args, PyObject *kwds)``:
call ``func`` with positional arguments ``args``
and keyword arguments ``kwds`` (``kwds`` may be NULL).
This function is meant to be put in the ``tp_call`` slot.
- ``PyObject * PyCCall_FASTCALL(PyObject *func, PyObject *const *args, Py_ssize_t nargs, PyObject *kwds)``:
call ``func`` with ``nargs`` positional arguments given by ``args[0]``, …, ``args[nargs-1]``.
The parameter ``kwds`` can be NULL (no keyword arguments),
a dict with ``name:value`` items or a tuple with keyword names.
In the latter case, the keyword values are stored in the ``args``
array, starting at ``args[nargs]``.
Macros to access the ``PyCCallRoot`` and ``PyCCallDef`` structures:
- ``PyCCallRoot * PyCCall_CCALLROOT(PyObject *func)``:
pointer to the ``PyCCallRoot`` structure inside ``func``.
- ``PyCCallDef * PyCCall_CCALLDEF(PyObject *func)``:
shorthand for ``PyCCall_CCALLROOT(func)->cr_ccall``.
- ``uint32_t PyCCall_FLAGS(PyObject *func)``:
shorthand for ``PyCCall_CCALLROOT(func)->cr_ccall->cc_flags``.
- ``PyObject * PyCCall_SELF(PyObject *func)``:
shorthand for ``PyCCall_CCALLROOT(func)->cr_self``.
Generic getters, meant to be put into the ``tp_getset`` array:
- ``PyObject * PyCCall_GenericGetParent(PyObject *func, void *closure)``:
return ``cc_parent``.
Raise ``AttributeError`` if ``cc_parent`` is NULL.
- ``PyObject * PyCCall_GenericGetQualname(PyObject *func, void *closure)``:
return a string suitable for using as ``__qualname__``.
This uses the ``__qualname__`` of ``cc_parent`` if possible.
It also uses the ``__name__`` attribute.
- ``PyObject * PyCCall_GenericGetSelf(PyObject *func, void *closure)``:
return ``cr_self``.
Raise ``AttributeError`` if ``cr_self`` is NULL.
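A short usage sketch of the functions above: calling an arbitrary
object through the protocol after checking for support::

    static PyObject *
    call_one_arg(PyObject *func, PyObject *arg)
    {
        if (!PyCCall_Check(func)) {
            PyErr_SetString(PyExc_TypeError,
                            "object does not implement the C call protocol");
            return NULL;
        }
        PyObject *args[1] = {arg};
        return PyCCall_FASTCALL(func, args, 1, NULL);
    }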
Profiling
---------
The profiling events
``c_call``, ``c_return`` and ``c_exception`` are only generated
when calling actual instances of ``builtin_function_or_method`` or ``method_descriptor``.
This is done for simplicity and also for backwards compatibility
(such that the profile function does not receive objects that it does not recognize).
In a future PEP, we may extend C-level profiling to arbitrary classes
implementing the C call protocol.
Changes to built-in functions and methods
=========================================
The reference implementation of this PEP changes
the existing classes ``builtin_function_or_method`` and ``method_descriptor``
to use the C call protocol.
In fact, those two classes are almost merged:
the implementation becomes very similar, but they remain separate classes
(mostly for backwards compatibility).
The ``PyCCallDef`` structure is simply stored
as part of the object structure.
Both classes use ``PyCFunctionObject`` as object structure.
This is the new layout::
typedef struct {
PyObject_HEAD
PyCCallDef *m_ccall;
PyObject *m_self; /* Passed as 'self' arg to the C function */
PyCCallDef _ccalldef; /* Storage for m_ccall */
PyObject *m_name; /* __name__; str object (not NULL) */
PyObject *m_module; /* __module__; can be anything */
const char *m_doc; /* __text_signature__ and __doc__ */
PyObject *m_weakreflist; /* List of weak references */
} PyCFunctionObject;
For functions of a module and for unbound methods of extension types,
``m_ccall`` points to the ``_ccalldef`` field.
For bound methods, ``m_ccall`` points to the ``PyCCallDef``
of the unbound method.
**NOTE**: the new layout of ``method_descriptor`` changes it
such that it no longer starts with ``PyDescr_COMMON``.
This is purely an implementation detail and it should cause few (if any)
compatibility problems.
C API functions
---------------
The following function is added (also to the stable ABI [#pep384]_):
- ``PyObject * PyCFunction_ClsNew(PyTypeObject *cls, PyMethodDef *ml, PyObject *self, PyObject *module, PyObject *parent)``:
create a new object with object structure ``PyCFunctionObject`` and class ``cls``.
This is called in turn by ``PyCFunction_NewEx`` and ``PyDescr_NewMethod``.
The undocumented functions ``PyCFunction_GetFlags``
and ``PyCFunction_GET_FLAGS``
are removed because it would be non-trivial to support them
in a backwards-compatible way.
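For illustration, a sketch of how an extension module could use the
proposed constructor; passing ``&PyCFunction_Type`` as the class is
an assumption of this sketch::

    static PyObject *
    hello_impl(PyObject *self, PyObject *unused)
    {
        return PyUnicode_FromString("hello");
    }

    static PyMethodDef hello_def = {
        "hello", hello_impl, METH_NOARGS, "Say hello."
    };

    static PyObject *
    make_hello(PyObject *module)
    {
        /* self is NULL for a plain function; the module serves
           as the parent */
        return PyCFunction_ClsNew(&PyCFunction_Type, &hello_def,
                                  NULL, module, module);
    }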
Inheritance
===========
Extension types inherit the type flag ``Py_TPFLAGS_HAVE_CCALL``
and the value ``tp_ccalloffset`` from the base class,
provided that they implement ``tp_call`` and ``tp_descr_get``
the same way as the base class.
Heap types never inherit the C call protocol because
that would not be safe (heap types can be changed dynamically).
Performance
===========
This PEP should not impact the performance of existing code
(in the positive or negative sense).
It is meant to allow efficient new code to be written,
not to make existing code faster.
Stable ABI
==========
None of the functions, structures or constants dealing with the C call protocol
are added to the stable ABI [#pep384]_.
There are two reasons for this:
first of all, the most useful feature of the C call protocol is probably the
``METH_FASTCALL`` calling convention.
Given that this is not even part of the public API (see also PEP 579, issue 6),
it would be strange to add anything else from the C call protocol
to the stable ABI.
Second, we want the C call protocol to be extensible in the future.
By not adding anything to the stable ABI,
we are free to do that without restrictions.
Backwards compatibility
=======================
There should be no difference at all for the Python interface,
and neither for the documented C API
(in the sense that all functions remain supported with the same functionality).
The removed function ``PyCFunction_GetFlags``
is officially part of the stable ABI [#pep384]_.
However, this is probably an oversight:
first of all, it is not even documented.
Second, the flag ``METH_FASTCALL``
is not part of the stable ABI but it is very common
(because of Argument Clinic).
So, if one cannot support ``METH_FASTCALL``,
it is hard to imagine a use case for ``PyCFunction_GetFlags``.
The fact that ``PyCFunction_GET_FLAGS`` and ``PyCFunction_GetFlags``
are not used at all by CPython outside of ``Objects/call.c``
further shows that these functions are not particularly useful.
In conclusion: the only potential breakage is with C code
which accesses the internals of ``PyCFunctionObject`` and ``PyMethodDescrObject``.
We expect very few problems because of this.
Rationale
=========
Why is this better than PEP 575?
--------------------------------
One of the major complaints about PEP 575 was that it was coupling
functionality (the calling and introspection protocol)
with the class hierarchy:
a class could only benefit from the new features
if it was a subclass of ``base_function``.
It may be difficult for existing classes to do that
because they may have other constraints on the layout of the C object structure,
coming from an existing base class or implementation details.
For example, ``functools.lru_cache`` cannot implement PEP 575 as-is.
It also complicated the implementation precisely because changes
were needed both in the implementation details and in the class hierarchy.
The current PEP does not have these problems.
Why store the function pointer in the instance?
-----------------------------------------------
The actual information needed for calling an object
is stored in the instance (in the ``PyCCallDef`` structure)
instead of the class.
This is different from the ``tp_call`` slot or earlier attempts
at implementing a ``tp_fastcall`` slot [#bpo29259]_.
The main use case is built-in functions and methods.
For those, the C function to be called does depend on the instance.
Note that the current protocol makes it easy to support the case
where the same C function is called for all instances:
just use a single static ``PyCCallDef`` structure for every instance.
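A sketch of that shared-structure pattern, reusing the illustrative
``MyCallableObject`` layout sketched earlier (the implementation
function is a placeholder)::

    static PyObject *
    my_impl(PyObject *self, PyObject *const *args, Py_ssize_t nargs)
    {
        Py_RETURN_NONE;  /* placeholder behaviour */
    }

    static PyCCallDef shared_def = {
        CCALL_FASTCALL,    /* cc_flags */
        (PyCFunc)my_impl,  /* cc_func: the same C function for all */
        NULL               /* cc_parent */
    };

    static void
    init_instance(MyCallableObject *obj)
    {
        obj->ccall_root.cr_ccall = &shared_def;  /* shared, not copied */
        obj->ccall_root.cr_self = (PyObject *)obj;
    }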
Why CCALL_OBJCLASS?
-------------------
The flag ``CCALL_OBJCLASS`` is meant to support various cases
where the class of a ``self`` argument must be checked, such as::
>>> list.append({}, None)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: append() requires a 'list' object but received a 'dict'
>>> list.__len__({})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: descriptor '__len__' requires a 'list' object but received a 'dict'
>>> float.__dict__["fromhex"](list, "0xff")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: descriptor 'fromhex' for type 'float' doesn't apply to type 'list'
In the reference implementation, only the first of these uses the new code.
The other examples show that these kinds of checks appear
in multiple places, so it makes sense to add generic support for them.
Why CCALL_SLICE_SELF?
---------------------
The flag ``CCALL_SLICE_SELF`` and the concept of self slicing
are needed to support methods:
the C function should not care
whether it is called as unbound method or as bound method.
In both cases, there should be a ``self`` argument
and this is simply the first positional argument of an unbound method call.
For example, ``list.append`` is a ``METH_O`` method.
Both the calls ``list.append([], 42)`` and ``[].append(42)`` should
translate to the C call ``list_append([], 42)``.
Thanks to the proposed C call protocol, we can support this in such a way
that both the unbound and the bound method share a ``PyCCallDef``
structure (with the ``CCALL_SLICE_SELF`` flag set).
In conclusion, ``CCALL_SLICE_SELF`` has two advantages:
there is no extra layer of indirection for calling
and constructing bound methods does not require setting up a ``PyCCallDef`` structure.
Replacing tp_print
------------------
We repurpose ``tp_print`` as ``tp_ccalloffset`` because this makes
it easier for external projects to backport the C call protocol
to earlier Python versions.
In particular, the Cython project has shown interest in doing that
(see https://mail.python.org/pipermail/python-dev/2018-June/153927.html).
Alternative suggestions
=======================
PEP 576 is an alternative approach to solving the same problem as this PEP.
See https://mail.python.org/pipermail/python-dev/2018-July/154238.html
for comments on the difference between PEP 576 and PEP 580.
Reference implementation
========================
The reference implementation can be found at
https://github.com/jdemeyer/cpython/tree/pep580
References
==========
.. [#pep384] Löwis, PEP 384 Defining a Stable ABI,
https://www.python.org/dev/peps/pep-0384/
.. [#bpo29259] Add tp_fastcall to PyTypeObject: support FASTCALL calling convention for all callable objects,
https://bugs.python.org/issue29259
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:
30
pep-0801.rst Normal file
View File
@ -0,0 +1,30 @@
PEP: 801
Title: Reserved
Author: Barry Warsaw <barry@python.org>
Status: Draft
Type: Informational
Content-Type: text/x-rst
Created: 21-Jun-2018
Abstract
========
This PEP is reserved for future use. Contact the author or
`the PEP editors <peps@python.org>`_ for details.
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:
View File
@ -151,7 +151,7 @@ parentheses around the parameter list. However it was decided
[#lambda]_ not to make this change because:
1. It would be an incompatible change.
2. Lambda's are neutered anyway.
2. Lambdas are neutered anyway.
3. The lambda can always be changed to a function.
@ -159,11 +159,11 @@ Accessing Function Annotations
==============================
Once compiled, a function's annotations are available via the
function's ``func_annotations`` attribute. This attribute is
function's ``__annotations__`` attribute. This attribute is
a mutable dictionary, mapping parameter names to an object
representing the evaluated annotation expression.
There is a special key in the ``func_annotations`` mapping,
There is a special key in the ``__annotations__`` mapping,
``"return"``. This key is present only if an annotation was supplied
for the function's return value.
@ -172,7 +172,7 @@ For example, the following annotation::
def foo(a: 'x', b: 5 + 6, c: list) -> max(2, 9):
...
would result in a ``func_annotation`` mapping of ::
would result in an ``__annotations__`` mapping of ::
{'a': 'x',
'b': 11,
@ -183,7 +183,7 @@ The ``return`` key was chosen because it cannot conflict with the name
of a parameter; any attempt to use ``return`` as a parameter name
would result in a ``SyntaxError``.
``func_annotations`` is an empty, mutable dictionary if there are no
``__annotations__`` is an empty, mutable dictionary if there are no
annotations on the function or if the function was created from
a ``lambda`` expression.
View File
@ -10,6 +10,7 @@ Type: Standards Track
Content-Type: text/x-rst
Created: 12-Dec-2012
Post-History: 21-Dec-2012
Replaces: 3153
Resolution: https://mail.python.org/pipermail/python-dev/2013-November/130419.html
Abstract
View File
@ -1,44 +1,43 @@
# -*- coding: utf-8 -*-
import sys
if sys.version_info[0] > 2:
text_type = str
else:
text_type = unicode
title_length = 55
column_format = (u' %(type)1s%(status)1s %(number)4s %(title)-' +
text_type(title_length) + u's %(authors)-s')
author_length = 40
table_separator = "== ==== " + "="*title_length + " " + "="*author_length
column_format = (
'%(type)1s%(status)1s %(number)4s %(title)-{title_length}s %(authors)-s'
).format(title_length=title_length)
header = u"""PEP: 0
header = """\
PEP: 0
Title: Index of Python Enhancement Proposals (PEPs)
Version: N/A
Last-Modified: %s
Author: David Goodger <goodger@python.org>,
Barry Warsaw <barry@python.org>
Author: python-dev <python-dev@python.org>
Status: Active
Type: Informational
Content-Type: text/x-rst
Created: 13-Jul-2000
"""
intro = u"""
intro = """\
This PEP contains the index of all Python Enhancement Proposals,
known as PEPs. PEP numbers are assigned by the PEP editors, and
once assigned are never changed[1]. The Mercurial history[2] of
once assigned are never changed [1_]. The version control history [2_] of
the PEP texts represent their historical record.
"""
references = u"""
[1] PEP 1: PEP Purpose and Guidelines
[2] View PEP history online
https://hg.python.org/peps/
references = """\
.. [1] PEP 1: PEP Purpose and Guidelines
.. [2] View PEP history online: https://github.com/python/peps
"""
footer = u"""
footer = """ \
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:"""
End:\
"""
View File
@ -26,15 +26,13 @@ RESERVED = [
indent = u' '
def write_column_headers(output):
def emit_column_headers(output):
"""Output the column headers for the PEP indices."""
column_headers = {'status': u'', 'type': u'', 'number': u'num',
'title': u'title', 'authors': u'owner'}
column_headers = {'status': '.', 'type': '.', 'number': 'PEP',
'title': 'PEP Title', 'authors': 'PEP Author(s)'}
print(constants.table_separator, file=output)
print(constants.column_format % column_headers, file=output)
underline_headers = {}
for key, value in column_headers.items():
underline_headers[key] = constants.text_type(len(value) * '-')
print(constants.column_format % underline_headers, file=output)
print(constants.table_separator, file=output)
def sort_peps(peps):
@ -42,6 +40,7 @@ def sort_peps(peps):
and essentially dead."""
meta = []
info = []
provisional = []
accepted = []
open_ = []
finished = []
@ -74,6 +73,8 @@ def sort_peps(peps):
info.append(pep)
else:
historical.append(pep)
elif pep.status == 'Provisional':
provisional.append(pep)
elif pep.status in ('Accepted', 'Active'):
accepted.append(pep)
elif pep.status == 'Final':
@ -82,14 +83,15 @@ def sort_peps(peps):
raise PEPError("unsorted (%s/%s)" %
(pep.type_, pep.status),
pep.filename, pep.number)
return meta, info, accepted, open_, finished, historical, deferred, dead
return (meta, info, provisional, accepted, open_,
finished, historical, deferred, dead)
def verify_email_addresses(peps):
authors_dict = {}
for pep in peps:
for author in pep.authors:
# If this is the first time we have come across an author, add him.
# If this is the first time we have come across an author, add them.
if author not in authors_dict:
authors_dict[author] = [author.email]
else:
@ -129,112 +131,160 @@ def sort_authors(authors_dict):
def normalized_last_first(name):
return len(unicodedata.normalize('NFC', name.last_first))
def emit_title(text, anchor, output, *, symbol="="):
print(".. _{anchor}:\n".format(anchor=anchor), file=output)
print(text, file=output)
print(symbol*len(text), file=output)
print(file=output)
def emit_subtitle(text, anchor, output):
emit_title(text, anchor, output, symbol="-")
def emit_pep_category(output, category, anchor, peps):
emit_subtitle(category, anchor, output)
emit_column_headers(output)
for pep in peps:
print(pep, file=output)
print(constants.table_separator, file=output)
print(file=output)
def write_pep0(peps, output=sys.stdout):
# PEP metadata
today = datetime.date.today().strftime("%Y-%m-%d")
print(constants.header % today, file=output)
print(file=output)
print(u"Introduction", file=output)
# Introduction
emit_title("Introduction", "intro", output)
print(constants.intro, file=output)
print(file=output)
print(u"Index by Category", file=output)
# PEPs by category
(meta, info, provisional, accepted, open_,
finished, historical, deferred, dead) = sort_peps(peps)
emit_title("Index by Category", "by-category", output)
emit_pep_category(
category="Meta-PEPs (PEPs about PEPs or Processes)",
anchor="by-category-meta",
peps=meta,
output=output,
)
emit_pep_category(
category="Other Informational PEPs",
anchor="by-category-other-info",
peps=info,
output=output,
)
emit_pep_category(
category="Provisional PEPs (provisionally accepted; interface may still change)",
anchor="by-category-provisional",
peps=provisional,
output=output,
)
emit_pep_category(
category="Accepted PEPs (accepted; may not be implemented yet)",
anchor="by-category-accepted",
peps=accepted,
output=output,
)
emit_pep_category(
category="Open PEPs (under consideration)",
anchor="by-category-open",
peps=open_,
output=output,
)
emit_pep_category(
category="Finished PEPs (done, with a stable interface)",
anchor="by-category-finished",
peps=finished,
output=output,
)
emit_pep_category(
category="Historical Meta-PEPs and Informational PEPs",
anchor="by-category-historical",
peps=historical,
output=output,
)
emit_pep_category(
category="Deferred PEPs (postponed pending further research or updates)",
anchor="by-category-deferred",
peps=deferred,
output=output,
)
emit_pep_category(
category="Abandoned, Withdrawn, and Rejected PEPs",
anchor="by-category-abandoned",
peps=dead,
output=output,
)
print(file=output)
write_column_headers(output)
(meta, info, accepted, open_, finished,
historical, deferred, dead) = sort_peps(peps)
print(file=output)
print(u" Meta-PEPs (PEPs about PEPs or Processes)", file=output)
print(file=output)
for pep in meta:
print(constants.text_type(pep), file=output)
print(file=output)
print(u" Other Informational PEPs", file=output)
print(file=output)
for pep in info:
print(constants.text_type(pep), file=output)
print(file=output)
print(u" Accepted PEPs (accepted; may not be implemented yet)", file=output)
print(file=output)
for pep in accepted:
print(constants.text_type(pep), file=output)
print(file=output)
print(u" Open PEPs (under consideration)", file=output)
print(file=output)
for pep in open_:
print(constants.text_type(pep), file=output)
print(file=output)
print(u" Finished PEPs (done, implemented in code repository)", file=output)
print(file=output)
for pep in finished:
print(constants.text_type(pep), file=output)
print(file=output)
print(u" Historical Meta-PEPs and Informational PEPs", file=output)
print(file=output)
for pep in historical:
print(constants.text_type(pep), file=output)
print(file=output)
print(u" Deferred PEPs", file=output)
print(file=output)
for pep in deferred:
print(constants.text_type(pep), file=output)
print(file=output)
print(u" Abandoned, Withdrawn, and Rejected PEPs", file=output)
print(file=output)
for pep in dead:
print(constants.text_type(pep), file=output)
print(file=output)
print(file=output)
print(u"Numerical Index", file=output)
print(file=output)
write_column_headers(output)
# PEPs by number
emit_title("Numerical Index", "by-pep-number", output)
emit_column_headers(output)
prev_pep = 0
for pep in peps:
if pep.number - prev_pep > 1:
print(file=output)
print(constants.text_type(pep), file=output)
prev_pep = pep.number
print(constants.table_separator, file=output)
print(file=output)
print(file=output)
print(u'Reserved PEP Numbers', file=output)
print(file=output)
write_column_headers(output)
# Reserved PEP numbers
emit_title('Reserved PEP Numbers', "reserved", output)
emit_column_headers(output)
for number, claimants in sorted(RESERVED):
print(constants.column_format % {
'type': '',
'status': '',
'type': '.',
'status': '.',
'number': number,
'title': 'RESERVED',
'authors': claimants,
}, file=output)
print(constants.table_separator, file=output)
print(file=output)
print(file=output)
print(u"Key", file=output)
print(file=output)
for type_ in PEP.type_values:
# PEP types key
emit_title("PEP Types Key", "type-key", output)
for type_ in sorted(PEP.type_values):
print(u" %s - %s PEP" % (type_[0], type_), file=output)
print(file=output)
for status in PEP.status_values:
print(u" %s - %s proposal" % (status[0], status), file=output)
print(file=output)
# PEP status key
emit_title("PEP Status Key", "status-key", output)
for status in sorted(PEP.status_values):
# Draft PEPs have no status displayed, Active shares a key with Accepted
if status in ("Active", "Draft"):
continue
if status == "Accepted":
msg = " A - Accepted (Standards Track only) or Active proposal"
else:
msg = " {status[0]} - {status} proposal".format(status=status)
print(msg, file=output)
print(file=output)
print(file=output)
print(file=output)
print(u"Owners", file=output)
print(file=output)
# PEP owners
emit_title("Authors/Owners", "authors", output)
authors_dict = verify_email_addresses(peps)
max_name = max(authors_dict.keys(), key=normalized_last_first)
max_name_len = len(max_name.last_first)
print(u" %s %s" % ('name'.ljust(max_name_len), 'email address'), file=output)
print(u" %s %s" % ((len('name')*'-').ljust(max_name_len),
len('email address')*'-'), file=output)
author_table_separator = "="*max_name_len + " " + "="*len("email address")
print(author_table_separator, file=output)
_author_header_fmt = "{name:{max_name_len}} Email Address"
print(_author_header_fmt.format(name="Name", max_name_len=max_name_len), file=output)
print(author_table_separator, file=output)
sorted_authors = sort_authors(authors_dict)
_author_fmt = "{author.last_first:{max_name_len}} {author_email}"
for author in sorted_authors:
# Use the email from authors_dict instead of the one from 'author' as
# the author instance may have an empty email.
print((u" %s %s" %
(author.last_first.ljust(max_name_len), authors_dict[author])), file=output)
_entry = _author_fmt.format(
author=author,
author_email=authors_dict[author],
max_name_len=max_name_len,
)
print(_entry, file=output)
print(author_table_separator, file=output)
print(file=output)
print(file=output)
print(u"References", file=output)
print(file=output)
# References for introduction footnotes
emit_title("References", "references", output)
print(constants.references, file=output)
print(constants.footer, file=output)
View File
@ -99,11 +99,11 @@ class Author(object):
name_parts = self.last.split()
for index, part in enumerate(name_parts):
if part[0].isupper():
base = u' '.join(name_parts[index:]).lower()
break
else:
raise ValueError("last name missing a capital letter: %r"
% name_parts)
base = u' '.join(name_parts[index:]).lower()
# If no capitals, use the whole string
base = self.last.lower()
return unicodedata.normalize('NFKD', base).encode('ASCII', 'ignore')
def _last_name(self, full_name):
@ -169,7 +169,8 @@ class PEP(object):
type_values = (u"Standards Track", u"Informational", u"Process")
# Valid values for the Status header.
# Active PEPs can only be for Informational or Process PEPs.
status_values = (u"Accepted", u"Rejected", u"Withdrawn", u"Deferred",
status_values = (u"Accepted", u"Provisional",
u"Rejected", u"Withdrawn", u"Deferred",
u"Final", u"Active", u"Draft", u"Superseded")
def __init__(self, pep_file):
@ -229,6 +230,11 @@ class PEP(object):
raise PEPError("Only Process and Informational PEPs may "
"have an Active status", pep_file.name,
self.number)
# Special case for Provisional PEPs.
if (status == u"Provisional" and self.type_ != "Standards Track"):
raise PEPError("Only Standards Track PEPs may "
"have a Provisional status", pep_file.name,
self.number)
self.status = status
# 'Author'.
authors_and_emails = self._parse_author(metadata['Author'])
View File
@ -235,7 +235,7 @@ def fixfile(inpath, input_lines, outfile):
else:
mailtos.append(part)
v = COMMASPACE.join(mailtos)
elif k.lower() in ('replaces', 'replaced-by', 'requires'):
elif k.lower() in ('replaces', 'superseded-by', 'requires'):
otherpeps = ''
for otherpep in re.split(',?\s+', v):
otherpep = int(otherpep)
@ -409,7 +409,7 @@ class PEPHeaders(Transform):
for node in para:
if isinstance(node, nodes.reference):
node.replace_self(peps.mask_email(node, pep))
elif name in ('replaces', 'replaced-by', 'requires'):
elif name in ('replaces', 'superseded-by', 'requires'):
newbody = []
space = nodes.Text(' ')
for refpep in re.split(r',?\s+', body.astext()):
View File
@ -224,7 +224,7 @@ def fixfile(inpath, input_lines, outfile):
else:
mailtos.append(part)
v = COMMASPACE.join(mailtos)
elif k.lower() in ('replaces', 'replaced-by', 'requires'):
elif k.lower() in ('replaces', 'superseded-by', 'requires'):
otherpeps = ''
for otherpep in re.split(',?\s+', v):
otherpep = int(otherpep)
View File
@ -1,22 +1,23 @@
#!/usr/bin/env python
#!/usr/bin/env python3
# usage: pep-hook.py $REPOS $REV
# (standard post-commit args)
import os, glob, time, datetime, stat, re, sys
import codecs
import PyRSS2Gen as rssgen
RSS_PATH = os.path.join(sys.argv[1], 'peps.rss')
def firstline_startingwith(full_path, text):
for line in codecs.open(full_path, encoding="utf-8"):
for line in open(full_path, encoding="utf-8"):
if line.startswith(text):
return line[len(text):].strip()
return None
# get list of peps with creation time (from "Created:" string in pep .txt)
# get list of peps with creation time
# (from "Created:" string in pep .rst or .txt)
peps = glob.glob('pep-*.txt')
peps.extend(glob.glob('pep-*.rst'))
def pep_creation_dt(full_path):
created_str = firstline_startingwith(full_path, 'Created:')
# bleh, I was hoping to avoid re but some PEPs editorialize
@ -69,5 +70,5 @@ rss = rssgen.RSS2(
lastBuildDate = datetime.datetime.now(),
items = items)
with open(RSS_PATH, 'w') as fp:
fp.write(rss.to_xml())
with open(RSS_PATH, 'w', encoding="utf-8") as fp:
fp.write(rss.to_xml(encoding="utf-8"))