PEP 618: Second Draft (#1399)
* Update history.
* Credit Ram.
* Acknowledge "equal" alternative.
* Reword link.
* Add note about strict=__debug__.
* Add another argument against "equal".
* Don't lean so heavily on infinite iterators.
* Clarify example.
* Clean up example.
* Clean up response.
* Outline precedent for map.
* Clean up wording.
* Reword map section.
* Move method argument.
* Flesh out argument against methods.
* Further refinements to method argument.
* Add callback argument.
* Simplify AST example.
* Clean up method/constructor outcomes.
* Clean up callback bit.
* Add StackOverflow link.
* Address "mode" parameter.
* "argument" -> "parameter"
* Address "constant" arguments in Rationale.
* Clean up Rationale phrasing.
* Flesh out itertools bit.
* Clean up itertools wording.
parent 4732446cae
commit 17a0d39810
152
pep-0618.rst
@@ -7,9 +7,9 @@ Sponsor: Antoine Pitrou <antoine@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 30-Apr-2020
Created: 01-May-2020
Python-Version: 3.9
Post-History:
Post-History: 01-May-2020
Resolution:


@@ -17,7 +17,7 @@ Abstract
========

This PEP proposes adding an optional ``strict`` boolean keyword
argument to the built-in ``zip``. When enabled, a ``ValueError`` is
parameter to the built-in ``zip``. When enabled, a ``ValueError`` is
raised if one of the arguments is exhausted before the others.


@@ -59,9 +59,9 @@ Another is "chunking" data into equal-sized groups::
    >>> xn = list(zip(*[iter(x)] * n))

In the first case, non-rectangular data is usually a logic error. In
the second case, data that is not a multiple of ``n`` is often an
error as well. However, both of these idioms will silently omit the
tail-end items of malformed input.
the second case, data with a length that is not a multiple of ``n`` is
often an error as well. However, both of these idioms will silently
omit the tail-end items of malformed input.
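
As a minimal illustration of the truncation described above, the following sketch applies the chunking idiom to a list whose length is not a multiple of ``n``. The ``chunk`` helper and the sample data are hypothetical and exist only for this example.

```python
def chunk(items, n):
    """Group items into n-tuples using the zip(*[iter(x)] * n) idiom."""
    # The same iterator object is repeated n times, so each output tuple
    # consumes n consecutive items from it.
    return list(zip(*[iter(items)] * n))


x = [1, 2, 3, 4, 5, 6, 7]  # seven items: not a multiple of three
chunks = chunk(x, 3)
print(chunks)  # the trailing 7 is dropped without any error
```

Nothing in the result signals that an item was lost, which is the kind of silent failure the proposed check is meant to surface.
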

Perhaps most convincingly, the current use of ``zip`` in the
standard-library ``ast`` module has created multiple bugs that

@@ -69,8 +69,8 @@ standard-library ``ast`` module has created multiple bugs that
<https://bugs.python.org/issue40355>`_::

    >>> from ast import Constant, Dict, literal_eval
    >>> nasty_dict = Dict(keys=[Constant("XXX")], values=[])
    >>> literal_eval(nasty_dict) # Like eval('{"XXX": }')
    >>> nasty_dict = Dict(keys=[Constant(None)], values=[])
    >>> literal_eval(nasty_dict) # Like eval("{None: }")
    {}

In fact, the author has counted dozens of other call sites in Python's

@@ -81,10 +81,10 @@ this new feature immediately.
Rationale
=========

Some critics assert that boolean switches are a "code-smell", or go
against Python's design philosophy. However, Python currently
contains several examples of built-in functions with boolean keyword
arguments:
Some critics assert that constant boolean switches are a "code-smell",
or go against Python's design philosophy. However, Python currently
contains several examples of boolean keyword parameters on built-in
functions which are typically called with compile-time constants:

- ``compile(..., dont_inherit=True)``
- ``open(..., closefd=False)``

@@ -95,13 +95,18 @@ Many more exist in the standard library.

A good rule of thumb is that "mode-switches" which change return types
or significantly alter functionality are indeed an anti-pattern, while
ones which enable or disable complementary checks or functionality are
not.
ones which enable or disable complementary checks or behavior are not.

The name for this new parameter was taken from the `original message
The idea and name for this new parameter were `originally proposed
<https://mail.python.org/archives/list/python-ideas@python.org/message/6GFUADSQ5JTF7W7OGWF7XF2NH2XUTUQM>`_
suggesting the feature. It received over 100 replies, with nobody
challenging the use of the word "strict".
by Ram Rachum. The thread received over 100 replies, with the
alternative "equal" receiving a similar amount of support.

The author does not have a strong preference between the two choices,
though "equal equals" *is* a bit awkward in prose. It may also
(wrongly) imply some notion of "equality" between the zipped items::

    >>> z = zip([2.0, 4.0, 6.0], [2, 4, 8], equal=True)


Specification

@@ -125,7 +130,7 @@ This change is fully backward-compatible.
Reference Implementation
========================

The author has `drafted a C implementation
The author has drafted a `C implementation
<https://github.com/python/cpython/compare/master...brandtbucher:zip-strict>`_.

An approximate pure-Python translation is::

@@ -158,14 +163,26 @@ Rejected Ideas
Add Additional Flavors Of ``zip`` To ``itertools``
''''''''''''''''''''''''''''''''''''''''''''''''''

Importing a drop-in replacement for a built-in feels too heavy,
Adding ``zip_strict`` to ``itertools`` is a larger change with greater
maintenance burden than the simple modification being proposed.

It seems that a great deal of the motivation driving this alternative
is that ``zip_longest`` already exists in ``itertools``. However,
``zip_longest`` is really another beast entirely: it takes on the
responsibility of filling in missing values, a problem neither of
the other variants even has. It also arguably has the most
specialized behavior of the three (to the point of exposing a new
``fillvalue`` parameter), so it makes sense that it would live in
``itertools`` while ``zip`` grows in-place.

Importing a drop-in replacement for a built-in also feels too heavy,
especially just to check a tricky condition that should "always" be
true. The goal here is not just to provide a way to catch bugs, but
to also make it easy (even tempting) for a user to enable the check
whenever using ``zip`` at a call site with this property.

Some have also argued that a new function buried in the standard
library is somehow more "discoverable" than a keyword argument on the
library is somehow more "discoverable" than a keyword parameter on the
built-in itself. The author does not believe this to be true.

Another proposed idiom, per-module shadowing of the built-in ``zip``

@@ -173,23 +190,72 @@ with some subtly different variant from ``itertools``, is an
anti-pattern that shouldn't be encouraged.


Change The Default Behavior Of ``zip``
''''''''''''''''''''''''''''''''''''''
Add Several "Modes" To Switch Between
'''''''''''''''''''''''''''''''''''''

Support for infinite iterators is generally useful, and there is
nothing "wrong" with the default behavior of ``zip``. Likely, this
backward-incompatible change would break more code than it "fixes".
This option only makes more sense than a binary flag if we anticipate
having three or more modes. The "obvious" three choices for these
enumerated or constant modes would be "shortest" (the current ``zip``
behavior), "strict" (the proposed behavior), and "longest"
(the ``itertools.zip_longest`` behavior).

``itertools.zip_longest`` already exists to service those cases where
the "extra" tail-end data is still needed.
However, it doesn't seem like adding behaviors other than the current
default and the proposed "strict" mode is worth the additional
complexity. The clearest candidate, "longest", would require a new
``fillvalue`` parameter (which is meaningless for both other modes).
This mode is also already handled perfectly by
``itertools.zip_longest``, and adding it would create two ways of
doing the same thing. It's not clear which would be the "obvious"
choice: the ``mode`` parameter on the built-in ``zip``, or the
long-lived namesake utility in ``itertools``.
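
To make the ``fillvalue`` point concrete, the sketch below contrasts the default "shortest" behavior with ``itertools.zip_longest``. Only the latter has anything to fill in, so only it has a use for the extra parameter. The sample lists are made up for illustration.

```python
from itertools import zip_longest

names = ["a", "b", "c"]
scores = [1, 2]  # one entry short

# Default zip: stops at the shortest input, nothing to fill in.
shortest = list(zip(names, scores))

# zip_longest: continues to the longest input and pads with fillvalue.
longest = list(zip_longest(names, scores, fillvalue=0))

print(shortest)
print(longest)
```
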


Add A Method Or Alternate Constructor To The ``zip`` Type
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''

The actual ``zip`` type is an undocumented implementation detail.
Adding additional methods or constructors is really a much larger
change that is not necessary to achieve the stated goal.
Consider the following two options, which have both been proposed::

    >>> zm = zip(*iters).strict()
    >>> zd = zip.strict(*iters)

It's not obvious which one will succeed, or how the other will fail.
If ``zip.strict`` is implemented as a method, ``zm`` will succeed, but
``zd`` will fail in one of several confusing ways:

- Yield results that aren't wrapped in a tuple (if ``iters`` contains
  just one item, a ``zip`` iterator).
- Raise a ``TypeError`` for an incorrect argument type (if ``iters``
  contains just one item, not a ``zip`` iterator).
- Raise a ``TypeError`` for an incorrect number of arguments
  (otherwise).

If ``zip.strict`` is implemented as a ``classmethod`` or
``staticmethod``, ``zd`` will succeed, and ``zm`` will silently yield
nothing (which is the problem we are trying to avoid in the first
place).

This proposal is further complicated by the fact that CPython's actual
``zip`` type is an undocumented implementation detail.
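
Two of the outcomes described above can be reproduced with a rough sketch. It assumes two hypothetical subclasses of ``zip``, named ``MethodZip`` and ``ClassmethodZip`` here, whose ``strict`` implementations omit the actual length checking. Being pure Python, the method version does not type-check ``self`` the way a C-level method would, so only the argument-count failure is shown.

```python
class MethodZip(zip):
    """Hypothetical: ``strict`` as an ordinary instance method."""

    def strict(self):
        return self  # length checking omitted from this sketch


class ClassmethodZip(zip):
    """Hypothetical: ``strict`` as an alternate constructor."""

    @classmethod
    def strict(cls, *iterables):
        return cls(*iterables)  # length checking omitted from this sketch


iters = [[1, 2], [3, 4]]

# Instance method: the "zd" spelling fails on the argument count.
zd_error = None
try:
    MethodZip.strict(*iters)
except TypeError as exc:
    zd_error = exc

# Classmethod: the "zm" spelling ignores the already-zipped iterables
# and builds a fresh zip from no arguments, which yields nothing.
zm = ClassmethodZip(*iters).strict()
zm_items = list(zm)
print(type(zd_error).__name__, zm_items)
```
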


Change The Default Behavior Of ``zip``
''''''''''''''''''''''''''''''''''''''

There is nothing "wrong" with the default behavior of ``zip``, since
there are many cases where it is indeed the correct way to handle
unequally-sized inputs. It's extremely useful, for example, when
dealing with infinite iterators.

``itertools.zip_longest`` already exists to service those cases where
the "extra" tail-end data is still needed.
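
A small sketch of the infinite-iterator case, where stopping at the shortest input is exactly what is wanted. The ``labels`` list is illustrative only.

```python
from itertools import count

labels = ["a", "b", "c"]

# count(1) never ends; zip stops as soon as labels is exhausted.
numbered = list(zip(count(1), labels))
print(numbered)
```

A length check could never succeed here, since one input never ends; this is one reason the current behavior remains the default.
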


Accept A Callback To Handle Remaining Items
'''''''''''''''''''''''''''''''''''''''''''

While able to do basically anything a user could need, this solution
makes handling the more common cases (like rejecting mismatched
lengths) unnecessarily complicated and non-obvious.


Raise An ``AssertionError`` Instead Of A ``ValueError``

@@ -205,6 +271,23 @@ simply reads (in its entirety):

Since this feature has nothing to do with Python's ``assert``
statement, raising an ``AssertionError`` here would be inappropriate.
Users desiring a check that is disabled in optimized mode (like an
``assert`` statement) can use ``strict=__debug__`` instead.


Add A Similar Feature To ``map``
''''''''''''''''''''''''''''''''

This PEP does not propose any changes to ``map``, since the use of
``map`` with multiple iterable arguments is quite rare. However, this
PEP's ruling shall serve as precedent for such a future discussion
(should it occur).

If rejected, the feature is realistically not worth pursuing. If
accepted, such a change to ``map`` should not require its own PEP
(though, like all enhancements, its usefulness should be carefully
considered). For consistency, it should follow the same API and
semantics debated here for ``zip``.


Do Nothing

@@ -213,10 +296,11 @@ Do Nothing
This option is perhaps the least attractive.

Silently truncated data is a particularly nasty class of bug, and
hand-writing a robust solution that gets this right isn't trivial. The
real-world motivating examples from Python's own standard library are
evidence that it's *very* easy to fall into the sort of trap that this
feature aims to avoid.
hand-writing a robust solution that gets this right `isn't trivial
<https://stackoverflow.com/questions/32954486/zip-iterators-asserting-for-equal-length-in-python>`_.
The real-world motivating examples from Python's own standard library
are evidence that it's *very* easy to fall into the sort of trap that
this feature aims to avoid.
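
For a sense of what a hand-written check involves, here is one possible sketch built on ``itertools.zip_longest`` and a private sentinel. It is not the reference implementation; the ``zip_equal`` name and the error message are assumptions made for this example. A simpler ``len()`` comparison would not work for generators and other iterators that have no length.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel that cannot appear in user data


def zip_equal(*iterables):
    """Hypothetical helper: like zip(), but reject unequal lengths."""
    for items in zip_longest(*iterables, fillvalue=_MISSING):
        # A sentinel in the tuple means some input ran out early.
        if any(item is _MISSING for item in items):
            raise ValueError("iterables have different lengths")
        yield items


print(list(zip_equal("ab", [1, 2])))

try:
    # The generator is lazy, so the error appears during iteration.
    list(zip_equal("abc", iter([1, 2])))
except ValueError as exc:
    print("mismatch detected:", exc)
```
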


Copyright