PEP: 3132
Title: Extended Iterable Unpacking
Version: $Revision$
Last-Modified: $Date$
Author: Georg Brandl <georg@python.org>
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 30-Apr-2007
Python-Version: 3.0
Post-History:


Abstract
========

This PEP proposes a change to iterable unpacking syntax that allows
specifying a "catch-all" name, which will be assigned a list of all
items not assigned to a "regular" name.

An example says more than a thousand words::

    >>> a, *b, c = range(5)
    >>> a
    0
    >>> c
    4
    >>> b
    [1, 2, 3]


Rationale
=========

Many algorithms require splitting a sequence into a "first, rest"
pair. With the new syntax, ::

    first, rest = seq[0], seq[1:]

is replaced by the cleaner and probably more efficient::

    first, *rest = seq

For more complex unpacking patterns, the new syntax looks even
cleaner, and the clumsy index handling is no longer necessary.

Also, if the right-hand value is not a list but an arbitrary iterable,
it has to be converted to a list before it can be sliced; to avoid
creating this temporary list, one has to resort to ::

    it = iter(seq)
    first = next(it)
    rest = list(it)


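As a rough illustration of the iterator case described above, the
following sketch compares the manual workaround with the starred form.
The generator ``numbers`` is a hypothetical stand-in for any iterable
that cannot be sliced; it is not part of the proposal.

```python
def numbers():
    """Hypothetical generator standing in for a non-sliceable iterable."""
    yield from (10, 20, 30, 40)

# Manual workaround: pull the first item, then drain the rest into a list.
it = iter(numbers())
first_manual = next(it)
rest_manual = list(it)

# Extended unpacking performs the same split in a single statement.
first, *rest = numbers()

print(first, rest)  # 10 [20, 30, 40]
```

Both forms consume the generator exactly once; the starred form simply
hides the explicit iterator handling.
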
Specification
=============

A tuple (or list) on the left side of a simple assignment (unpacking
is not defined for augmented assignment) may contain at most one
expression prepended with a single asterisk (which is henceforth
called a "starred" expression, while the other expressions in the
list are called "mandatory"). This designates a subexpression that
will be assigned a list of all items from the iterable being unpacked
that are not assigned to any of the mandatory expressions, or an
empty list if there are no such items.

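The rules in the preceding paragraph can be checked with a short
sketch. The variable names are illustrative only, and ``compile()`` is
used for the invalid case purely so that the syntax error can be caught
at run time instead of preventing the example from loading.

```python
# The starred target always receives a list, whatever the source type.
head, *middle, tail = (1, 2, 3, 4, 5)
print(head, middle, tail)  # 1 [2, 3, 4] 5

# With nothing left over, the starred target is an empty list.
x, *leftover, y = "ab"
print(x, leftover, y)  # a [] b

# More than one starred expression in the same target list is rejected
# when the code is compiled.
two_stars_error = None
try:
    compile("a, *b, *c = range(5)", "<example>", "exec")
except SyntaxError as exc:
    two_stars_error = exc
    print("SyntaxError:", exc.msg)
```

The exact wording of the error message differs between Python
versions, so the sketch only relies on the exception type.
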
For example, if ``seq`` is a sliceable sequence, all the following
assignments are equivalent if ``seq`` has at least two elements::

    a, b, c = seq[0], list(seq[1:-1]), seq[-1]
    a, *b, c = seq
    [a, *b, c] = seq

It is an error (as it is currently) if the iterable doesn't contain
enough items to assign to all the mandatory expressions.

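A minimal sketch of the "not enough items" case follows. It assumes
the failure surfaces as ``ValueError``, as it does for ordinary
unpacking in CPython; the message text is not relied upon.

```python
error = None
try:
    a, *b, c = [1]          # two mandatory targets, but only one item
except ValueError as exc:
    error = exc
    print("ValueError:", exc)

# Two items are enough: the starred target is simply left empty.
a, *b, c = [1, 2]
print(a, b, c)  # 1 [] 2
```
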
It is also an error to use the starred expression as a lone
assignment target, as in ::

    *a = range(5)

This, however, is valid syntax::

    *a, = range(5)

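The difference between the two forms can be observed with the sketch
below. The flag name ``lone_star_rejected`` is made up for this
example, and the invalid statement is passed to ``compile()`` as a
string so that the rest of the snippet stays runnable.

```python
# A bare "*a = ..." is rejected when the code is compiled.
try:
    compile("*a = range(5)", "<example>", "exec")
    lone_star_rejected = False
except SyntaxError:
    lone_star_rejected = True

# The trailing comma turns the target into a one-element tuple,
# which may contain a starred expression.
*a, = range(5)
print(lone_star_rejected, a)  # True [0, 1, 2, 3, 4]
```
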
Note that this proposal also applies to tuples in implicit assignment
context, such as in a ``for`` statement::

    for a, *b in [(1, 2, 3), (4, 5, 6, 7)]:
        print(b)

would print out ::

    [2, 3]
    [5, 6, 7]

Starred expressions are only allowed as assignment targets; using them
anywhere else (except for star-args in function calls, of course) is an
error.


Implementation
==============

Grammar change
--------------

This feature requires a new grammar rule::

    star_expr: ['*'] expr

In these two rules, ``expr`` is changed to ``star_expr``::

    comparison: star_expr (comp_op star_expr)*
    exprlist: star_expr (',' star_expr)* [',']

Changes to the Compiler
-----------------------

A new ASDL expression type ``Starred`` is added which represents a
starred expression. Note that the starred expression element
introduced here is universal and could later be used for other
purposes in non-assignment context, such as the ``yield *iterable``
proposal.

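The ``Starred`` node can be inspected from Python through the standard
``ast`` module. The sketch below only parses the statement and does
not execute it, so ``seq`` does not need to exist.

```python
import ast

tree = ast.parse("a, *b, c = seq")
target = tree.body[0].targets[0]      # the ast.Tuple on the left-hand side

# Record the node type of each element of the target tuple.
kinds = [type(elt).__name__ for elt in target.elts]
print(kinds)  # ['Name', 'Starred', 'Name']

# The Starred node wraps the expression that follows the asterisk.
starred = target.elts[1]
print(starred.value.id)  # b
```
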
The compiler is changed to recognize all cases where a starred
expression is invalid and flag them with syntax errors.

A new bytecode instruction, ``UNPACK_EX``, is added, whose argument
has the number of mandatory targets before the starred target in the
lower 8 bits and the number of mandatory targets after the starred
target in the upper 8 bits. For unpacking sequences without starred
expressions, the old ``UNPACK_SEQUENCE`` opcode is kept.

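The argument layout described above amounts to simple bit packing. The
helper names in this sketch are hypothetical and exist only to show the
arithmetic; they are not functions of the compiler or interpreter.

```python
def encode_unpack_ex_arg(before, after):
    """Pack the counts: 'before' in the low 8 bits, 'after' in the next 8."""
    if not (0 <= before < 256 and 0 <= after < 256):
        raise ValueError("each count must fit in 8 bits")
    return before | (after << 8)

def decode_unpack_ex_arg(oparg):
    """Recover (before, after) from a packed argument."""
    return oparg & 0xFF, (oparg >> 8) & 0xFF

# For "a, b, *rest, z = seq": two targets before the star, one after.
oparg = encode_unpack_ex_arg(2, 1)
print(oparg, decode_unpack_ex_arg(oparg))  # 258 (2, 1)
```
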
Changes to the Bytecode Interpreter
-----------------------------------

The function ``unpack_iterable()`` in ceval.c is changed to handle
the extended unpacking, via an ``argcntafter`` parameter. In the
``UNPACK_EX`` case, the function will do the following:

* collect all items for mandatory targets before the starred one
* collect all remaining items from the iterable in a list
* pop items for mandatory targets after the starred one from the list
* push the single items and the resized list on the stack

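The four steps listed above can be modelled in pure Python. This is
only a sketch of the described behaviour, not the C code in ceval.c:
the function name ``unpack_ex`` and its return value (three lists
instead of values pushed on the interpreter stack) are assumptions made
for illustration.

```python
def unpack_ex(iterable, before, after):
    """Pure-Python model of the steps; returns (head, starred, tail)."""
    it = iter(iterable)

    # 1. Collect the items for the mandatory targets before the star.
    head = []
    for _ in range(before):
        try:
            head.append(next(it))
        except StopIteration:
            raise ValueError("not enough items to unpack") from None

    # 2. Collect all remaining items in a list.
    rest = list(it)
    if len(rest) < after:
        raise ValueError("not enough items to unpack")

    # 3. Take the items for the mandatory targets after the star off
    #    the end of that list, which shrinks it in place.
    split = len(rest) - after
    tail = rest[split:]
    del rest[split:]

    # 4. Hand back the single items and the resized list (the real
    #    interpreter pushes them on the value stack instead).
    return head, rest, tail

print(unpack_ex(range(5), 1, 1))  # ([0], [1, 2, 3], [4])
```

Because the remaining items are gathered before the trailing targets
are split off, the source iterable is traversed exactly once.
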
Shortcuts for unpacking iterables of known types, such as lists or
tuples, can be added.


The current implementation can be found at the SourceForge Patch
tracker [SFPATCH]_. It now includes a minimal test case.


Acceptance
==========

After a short discussion on the python-3000 list [1]_, the PEP was
accepted by Guido in its current form. Possible changes discussed
were:

* Only allow a starred expression as the last item in the exprlist.
  This would simplify the unpacking code a bit and allow for the
  starred expression to be assigned an iterator. This behavior was
  rejected because it would be too surprising.

* Try to give the starred target the same type as the source
  iterable; for example, ``b`` in ``a, *b = 'hello'`` would be
  assigned the string ``'ello'``. This may seem nice, but is
  impossible to get right consistently with all iterables.

* Make the starred target a tuple instead of a list. This would be
  consistent with a function's ``*args``, but make further processing
  of the result harder.


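The accepted behaviour, in which the starred target is always a list,
can be seen in the following sketch. The variable names are arbitrary,
and the final ``join`` is just one possible way to recover a string
when that is what the caller wants.

```python
# The starred target is a list no matter what kind of iterable is unpacked.
a, *b = "hello"
print(a, b)  # h ['e', 'l', 'l', 'o']

c, *d = (1, 2, 3)
print(type(d).__name__)  # list

# A generator has no sensible "same type" to preserve, which illustrates
# why type preservation cannot be made consistent.
e, *f = (n * n for n in range(4))
print(e, f)  # 0 [1, 4, 9]

# The string form can be rebuilt explicitly when needed.
print("".join(b))  # ello
```
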
References
==========

.. [SFPATCH] https://bugs.python.org/issue1711529
.. [1] https://mail.python.org/pipermail/python-3000/2007-May/007198.html


Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: