Complete rewrite of the PEP, discarding the previous long explanation about
augmented assignment. Kept as short as possible, but might still be considered wordy :-)
parent
93d20f2bf1
commit
18701cf307
299
pep-0203.txt
|
@@ -3,10 +3,9 @@ Title: Augmented Assignments
|
||||||
Version: $Revision$
|
Version: $Revision$
|
||||||
Owner: thomas@xs4all.net (Thomas Wouters)
|
Owner: thomas@xs4all.net (Thomas Wouters)
|
||||||
Python-Version: 2.0
|
Python-Version: 2.0
|
||||||
Status: Incomplete
|
Status: Draft
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
Introduction
|
Introduction
|
||||||
|
|
||||||
This PEP describes the `augmented assignment' proposal for Python
|
This PEP describes the `augmented assignment' proposal for Python
|
||||||
|
@@ -19,133 +18,197 @@ Introduction
|
||||||
definitive historical record.
|
definitive historical record.
|
||||||
|
|
||||||
|
|
||||||
|
Proposed semantics
|
||||||
The Origin of Augmented Assignment
|
|
||||||
|
|
||||||
Augmented assignment refers to binary operators that combine two
|
The proposed patch that adds augmented assignment to Python
|
||||||
existing operators: the assignment operator, and one of the binary
|
introduces the following new operators:
|
||||||
operators. Its origins lie in other programming languages, most
|
|
||||||
notably `C', where it was defined for performance reasons. They
|
|
||||||
are meant to replace the repetitive syntax of, for instance,
|
|
||||||
adding the number '1' to a variable:
|
|
||||||
|
|
||||||
x = x + 1;
|
+= -= *= /= %= **= <<= >>= &= ^= |=
|
||||||
|
|
||||||
with an expression that is shorter, less error-prone and easier to
|
|
||||||
optimize (by the compiler):
|
|
||||||
|
|
||||||
x += 1;
|
They implement the same operator as their normal binary form, with
|
||||||
|
the exception that the operation is done `in-place' whenever
|
||||||
The same goes for all other binary operators, resulting in the
|
possible.
|
||||||
following augmented assignment operator list, based on Python's
|
|
||||||
current binary operator list:
|
|
||||||
|
|
||||||
+=, -=, /=, *=, %=, **=, >>=, <<=, &=, |=, ^=
|
|
||||||
|
|
||||||
See the documentation of each operator for what it does.
|
They truly behave as augmented assignment, in that they perform
|
||||||
|
all of the normal load and store operations, in addition to the
|
||||||
|
binary operation they are intended to do. So, given the expression:
|
||||||
|
|
||||||
Augmented Assignment in Python
|
|
||||||
|
|
||||||
The traditional reasons for augmented assignment, readability and
|
|
||||||
optimization, are not as obvious in Python, for several reasons.
|
|
||||||
|
|
||||||
- Numbers are immutable, they cannot be changed. In other
|
x += y
|
||||||
programming languages, a variable holds a value, and altering
|
|
||||||
the variable changes the value it holds. In Python, variables
|
|
||||||
hold `references' to values, and altering an immutable value
|
|
||||||
means changing the variable, not what it points to.
|
|
||||||
|
|
||||||
- Assignment is a different operation in Python. In most
|
|
||||||
languages, variables are containers, and assignment copies a
|
|
||||||
value into that container. In Python, assignment binds a value
|
|
||||||
to a name, it does not copy the value into a new storage space.
|
|
||||||
|
|
||||||
- The augmented assignment operators map fairly directly into the
|
|
||||||
underlying hardware. Python does not deal directly with the
|
|
||||||
hardware it runs on, so this `natural inclusion' does not make
|
|
||||||
sense.
|
|
||||||
|
|
||||||
- The augmented assignment syntax is subtly different in more
|
|
||||||
complex expressions. What to do, for instance, in a case such
|
|
||||||
as this:
|
|
||||||
|
|
||||||
seq[i:calc(seq, i)] *= r
|
|
||||||
|
|
||||||
It is unclear whether 'seq' gets indexed once or twice, and
|
|
||||||
whether 'calc' gets called once or twice.
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
Normal operators
|
|
||||||
|
|
||||||
There are, however, good reasons to include augmented assignment.
|
|
||||||
One of these has to do with Python's way of handling operators. In
|
|
||||||
Python, a user defined class can implement one or more of the
|
|
||||||
binary operators by supplying a 'magic' method name. For instance,
|
|
||||||
for a class to support '<instance> + <object>', the '__add__'
|
|
||||||
method should be defined. This method should return a new object,
|
|
||||||
which is the result of the expression.
|
|
||||||
|
|
||||||
For the case of '<object> + <instance>', where 'object' does not
|
The object `x' is loaded, then `y' is added to it, and the resulting
|
||||||
have an '__add__' method, the class can define a '__radd__'
|
object is stored back in the original place. The precise action
|
||||||
method, which then should behave exactly as '__add__'. Indeed,
|
performed on the two arguments depends on the type of `x', and
|
||||||
'__radd__' is often a different name for the same method.
|
possibly of `y'.
|
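The load, operate and store sequence described above can be sketched with present-day Python, where lists happen to provide an in-place form of addition and integers do not. This is only an illustration, not part of the proposed patch; the names `alias` and `before` are introduced here purely for the example.

```python
# Mutable left-hand operand: the addition is done in place, and the
# result (the very same list object) is stored back into the name `x`.
x = [1, 2]
alias = x
x += [3]
list_same_object = x is alias

# Immutable left-hand operand: a new object is created by the addition
# and that new object is stored back into the name `n`.
n = 1000
before = n
n += 1
int_same_object = n is before
```

In the first case every other reference to the list sees the change; in the second, `before` still refers to the old integer.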
||||||
|
|
||||||
For C extension types, a similar technique is available, through
|
|
||||||
the PyNumberMethods and PySequenceMethods members of the PyType
|
|
||||||
structure.
|
|
||||||
|
|
||||||
However, the problem with this approach is that the '__add__'
|
The idea behind augmented assignment in Python is that it isn't
|
||||||
method cannot know in what context it is called. It cannot tell
|
just an easier way to write the common practice of storing the
|
||||||
whether it should create a new object, or whether it is allowed to
|
result of a binary operation in its left-hand operand, but also a
|
||||||
modify itself. (As would be the case in 'x = x + 1') As a result,
|
way for the left-hand operand in question to know that it should
|
||||||
the '__add__' method, and all other such 'magic' methods, should
|
operate 'on itself', rather than creating a modified copy of
|
||||||
always return a new object. For large objects, this can be very
|
itself.
|
||||||
inefficient.
|
|
||||||
|
|
||||||
This inefficiency is often solved by adding a method that does the
|
|
||||||
appropriate modification 'in-place'. List objects, for instance,
|
|
||||||
have the 'extend' method that behaves exactly as the '+' operator,
|
|
||||||
except the operation is done on the list itself, instead of on a
|
|
||||||
copy.
|
|
||||||
|
|
||||||
The augmented assignment syntax can support this behaviour
|
To make this possible, a number of new `hooks' are added to Python
|
||||||
explicitly. When the magic methods for 'in-place' operations are
|
classes and C extension types, which are called when the object in
|
||||||
missing, it can fall back to the normal methods for that
|
question is used as the left hand side of an augmented assignment
|
||||||
operation, maintaining full backward compatibility even when
|
operation. If the class or type does not implement the `in-place'
|
||||||
mixing the new syntax with old objects.
|
hooks, the normal hooks for the particular binary operation are
|
||||||
|
used.
|
||||||
|
|
||||||
|
So, given an instance object `x', the expression
|
||||||
|
|
||||||
|
x += y
|
||||||
|
|
||||||
|
tries to call x.__add_ab__(y), which is the 'in-place' variant of
|
||||||
|
__add__. If __add_ab__ is not present, x.__add__(y) is
|
||||||
|
attempted, and finally y.__radd__(x) if __add__ is missing too.
|
||||||
|
There is no `right-hand-side' variant of __add_ab__, because that
|
||||||
|
would require `y' to know how to modify `x' in place, which is
|
||||||
|
an unsafe assumption. The __add_ab__ hook should behave exactly
|
||||||
|
like __add__, returning the result of the operation (which could
|
||||||
|
be `self') which is to be stored in the variable `x'.
|
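The lookup order described above can be emulated in plain Python. The sketch below is hypothetical: `augmented_add`, `Accumulator` and `Plain` are names invented for illustration, `__add_ab__` is the hook name proposed in this draft (not a method any Python release looks up), and the `NotImplemented` check is an assumption borrowed from later Python conventions rather than something this text specifies.

```python
def augmented_add(x, y):
    """Hypothetical helper emulating the proposed lookup for `x += y`."""
    # Try the in-place hook on x, then the normal hook on x, and
    # finally the reflected hook on y, in that order.
    for owner, name, arg in ((x, "__add_ab__", y),
                             (x, "__add__", y),
                             (y, "__radd__", x)):
        hook = getattr(owner, name, None)
        if hook is not None:
            result = hook(arg)
            if result is not NotImplemented:  # assumption, see lead-in
                return result
    raise TypeError("unsupported operand types for +=")


class Accumulator:
    """Illustrative class that implements the proposed in-place hook."""

    def __init__(self, items):
        self.items = list(items)

    def __add__(self, other):
        return Accumulator(self.items + list(other))

    def __add_ab__(self, other):
        self.items.extend(other)
        return self  # the returned object is what gets stored back in `x`


class Plain:
    """Illustrative class that only implements the normal hook."""

    def __init__(self, items):
        self.items = list(items)

    def __add__(self, other):
        return Plain(self.items + list(other))


a = Accumulator([1])
a_result = augmented_add(a, [2])  # in-place hook is used: same object
p = Plain([1])
p_result = augmented_add(p, [2])  # falls back to __add__: new object
```

The caller would then bind the returned object back to the left-hand name, which is why the in-place hook has to return a result even when that result is `self`.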
||||||
|
|
||||||
|
For C extension types, the `hooks' are members of the
|
||||||
|
PyNumberMethods and PySequenceMethods structures, and are called
|
||||||
|
in exactly the same manner as the existing non-inplace operations,
|
||||||
|
including argument coercion. C methods should also take care to
|
||||||
|
return a new reference to the result object, whether it's the same
|
||||||
|
object or a new one. So if the original object is returned, it
|
||||||
|
should be INCREF()'d appropriately.
|
||||||
|
|
||||||
The other benefit of augmented assignment is readability. After
|
|
||||||
the general concept of augmented assignment is grasped, all the
|
|
||||||
augmented assignment operators instantly become obvious. There is
|
|
||||||
no need for non-obvious and non-standard method names to implement
|
|
||||||
efficient, in-place operations, and there is no need to check the
|
|
||||||
type of an object before operating on it: the augmented assignment
|
|
||||||
will work for all types that implement that basic operation, not
|
|
||||||
merely those that implement the augmented variant.
|
|
||||||
|
|
||||||
And the last problem with augmented assignment, what to do with
|
|
||||||
indexes and function calls in the expression, can be solved in a
|
|
||||||
very Pythonic manner: if it looks like it's only called once, it
|
|
||||||
*is* only called once. Taking this expression:
|
|
||||||
|
|
||||||
seq[func(x)] += x
|
|
||||||
|
|
||||||
The function 'func' is called once, and 'seq' is indexed twice:
|
|
||||||
once to retrieve the value (__getitem__), and once to store it
|
|
||||||
(__setitem__). So the expression can be rewritten as:
|
|
||||||
|
|
||||||
tmp = func(x)
|
|
||||||
seq[tmp] = seq[tmp] + x
|
|
||||||
|
|
||||||
The augmented assignment form of this expression is much more
|
|
||||||
readable.
|
|
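The "called once" rule above can be checked in present-day Python with a small logging sequence. The sketch is illustrative only: `LoggingList` and the `calls` log are names made up for the example, and the list item is an integer so the addition itself creates a new object.

```python
calls = []  # records the order in which things happen


def func(x):
    calls.append("func")
    return 0


class LoggingList(list):
    """List subclass that logs item loads and stores."""

    def __getitem__(self, index):
        calls.append("getitem")
        return super().__getitem__(index)

    def __setitem__(self, index, value):
        calls.append("setitem")
        super().__setitem__(index, value)


seq = LoggingList([10, 20])
x = 5
seq[func(x)] += x  # func runs once; seq is indexed once to load, once to store
```

After this runs, the log holds one entry each for the function call, the load and the store, in that order.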
||||||
|
|
||||||
|
|
||||||
|
New methods
|
||||||
|
|
||||||
|
The proposed implementation adds the following 11 possible `hooks'
|
||||||
|
which Python classes can implement to overload the augmented
|
||||||
|
assignment operations:
|
||||||
|
|
||||||
|
__add_ab__
|
||||||
|
__sub_ab__
|
||||||
|
__mul_ab__
|
||||||
|
__div_ab__
|
||||||
|
__mod_ab__
|
||||||
|
__pow_ab__
|
||||||
|
__lshift_ab__
|
||||||
|
__rshift_ab__
|
||||||
|
__and_ab__
|
||||||
|
__xor_ab__
|
||||||
|
__or_ab__
|
||||||
|
|
||||||
|
The `__add_ab__' name is one proposed by Guido[1], and stands for `and
|
||||||
|
becomes'. Other proposed names include '__iadd__', `__add_in__' and
|
||||||
|
`__inplace_add__'.
|
||||||
|
|
||||||
|
For C extension types, the following struct members are added:
|
||||||
|
|
||||||
|
To PyNumberMethods:
|
||||||
|
binaryfunc nb_inplace_add;
|
||||||
|
binaryfunc nb_inplace_subtract;
|
||||||
|
binaryfunc nb_inplace_multiply;
|
||||||
|
binaryfunc nb_inplace_divide;
|
||||||
|
binaryfunc nb_inplace_remainder;
|
||||||
|
binaryfunc nb_inplace_power;
|
||||||
|
binaryfunc nb_inplace_lshift;
|
||||||
|
binaryfunc nb_inplace_rshift;
|
||||||
|
binaryfunc nb_inplace_and;
|
||||||
|
binaryfunc nb_inplace_xor;
|
||||||
|
binaryfunc nb_inplace_or;
|
||||||
|
|
||||||
|
To PySequenceMethods:
|
||||||
|
binaryfunc sq_inplace_concat;
|
||||||
|
intargfunc sq_inplace_repeat;
|
||||||
|
|
||||||
|
In order to keep binary compatibility, the tp_flags TypeObject
|
||||||
|
member is used to determine whether the TypeObject in question has
|
||||||
|
allocated room for these slots. Until a clean break in binary
|
||||||
|
compatibility is made (which may or may not happen before 2.0)
|
||||||
|
code that wants to use one of the new struct members must first
|
||||||
|
check that they are available with the 'PyType_HasFeature()' macro:
|
||||||
|
|
||||||
|
if (PyType_HasFeature(x->ob_type, Py_TPFLAGS_HAVE_INPLACE_OPS) &&
|
||||||
|
x->ob_type->tp_as_number && x->ob_type->tp_as_number->nb_inplace_add) {
|
||||||
|
/* ... */
|
||||||
|
|
||||||
|
This check must be made even before testing the method slots for
|
||||||
|
NULL values! The macro only tests whether the slots are available,
|
||||||
|
not whether they are filled with methods or not.
|
||||||
|
|
||||||
|
|
||||||
|
Implementation
|
||||||
|
|
||||||
|
The current implementation of augmented assignment[2] adds, in
|
||||||
|
addition to the methods and slots already covered, 13 new bytecodes
|
||||||
|
and 13 new API functions.
|
||||||
|
|
||||||
|
The API functions are simply in-place versions of the current
|
||||||
|
binary-operation API functions:
|
||||||
|
|
||||||
|
PyNumber_InPlaceAdd(PyObject *o1, PyObject *o2);
|
||||||
|
PyNumber_InPlaceSubtract(PyObject *o1, PyObject *o2);
|
||||||
|
PyNumber_InPlaceMultiply(PyObject *o1, PyObject *o2);
|
||||||
|
PyNumber_InPlaceDivide(PyObject *o1, PyObject *o2);
|
||||||
|
PyNumber_InPlaceRemainder(PyObject *o1, PyObject *o2);
|
||||||
|
PyNumber_InPlacePower(PyObject *o1, PyObject *o2);
|
||||||
|
PyNumber_InPlaceLshift(PyObject *o1, PyObject *o2);
|
||||||
|
PyNumber_InPlaceRshift(PyObject *o1, PyObject *o2);
|
||||||
|
PyNumber_InPlaceAnd(PyObject *o1, PyObject *o2);
|
||||||
|
PyNumber_InPlaceXor(PyObject *o1, PyObject *o2);
|
||||||
|
PyNumber_InPlaceOr(PyObject *o1, PyObject *o2);
|
||||||
|
PySequence_InPlaceConcat(PyObject *o1, PyObject *o2);
|
||||||
|
PySequence_InPlaceRepeat(PyObject *o, int count);
|
||||||
|
|
||||||
|
They call either the Python class hooks (if either of the objects
|
||||||
|
is a Python class instance) or the C type's number or sequence
|
||||||
|
methods.
|
||||||
|
|
||||||
|
The new bytecodes are:
|
||||||
|
INPLACE_ADD
|
||||||
|
INPLACE_SUBTRACT
|
||||||
|
INPLACE_MULTIPLY
|
||||||
|
INPLACE_DIVIDE
|
||||||
|
INPLACE_REMAINDER
|
||||||
|
INPLACE_POWER
|
||||||
|
INPLACE_LEFTSHIFT
|
||||||
|
INPLACE_RIGHTSHIFT
|
||||||
|
INPLACE_AND
|
||||||
|
INPLACE_XOR
|
||||||
|
INPLACE_OR
|
||||||
|
ROT_FOUR
|
||||||
|
DUP_TOPX
|
||||||
|
|
||||||
|
The INPLACE_* bytecodes mirror the BINARY_* bytecodes, except that
|
||||||
|
they are implemented as calls to the 'InPlace' API functions. The
|
||||||
|
other two bytecodes are 'utility' bytecodes: ROT_FOUR behaves like
|
||||||
|
ROT_THREE except that the four topmost stack items are rotated.
|
||||||
|
|
||||||
|
DUP_TOPX is a bytecode that takes a single argument, which should
|
||||||
|
be an integer between 1 and 5 (inclusive) which is the number of
|
||||||
|
items to duplicate in one block. Given a stack like this (where
|
||||||
|
the right side of the list is the 'top' of the stack):
|
||||||
|
|
||||||
|
[a, b, c, d, e, f, g]
|
||||||
|
|
||||||
|
"DUP_TOPX 3" would duplicate the top 3 items, resulting in this
|
||||||
|
stack:
|
||||||
|
|
||||||
|
[a, b, c, d, e, f, g, e, f, g]
|
||||||
|
|
||||||
|
DUP_TOPX with an argument of 1 is the same as DUP_TOP. The limit
|
||||||
|
of 5 is purely an implementation limit. The implementation of
|
||||||
|
augmented assignment requires only DUP_TOPX with an argument of 2
|
||||||
|
and 3, and could do without this new opcode at the cost of a fair
|
||||||
|
number of DUP_TOP and ROT_*.
|
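The two utility bytecodes can be modelled on a Python list whose last element is the top of the stack, matching the duplicated block in the example above. This is a sketch under stated assumptions, not the interpreter's implementation: `dup_topx` and `rot_four` are hypothetical helper names, and the rotation direction assumes ROT_THREE moves the top item down and lifts the items beneath it.

```python
def dup_topx(stack, count):
    """Duplicate the top `count` items as one block (top = end of list)."""
    if not 1 <= count <= 5:
        raise ValueError("count must be between 1 and 5 (inclusive)")
    if count > len(stack):
        raise IndexError("not enough items on the stack")
    # The slice is copied before extend() runs, so this is safe.
    stack.extend(stack[-count:])


def rot_four(stack):
    """Move the top item down to fourth position, lifting the other three."""
    if len(stack) < 4:
        raise IndexError("not enough items on the stack")
    top = stack.pop()
    stack.insert(len(stack) - 3, top)


stack = ["a", "b", "c", "d", "e", "f", "g"]
dup_topx(stack, 3)   # duplicates e, f, g as a block

rotated = ["w", "x", "y", "z"]
rot_four(rotated)    # z moves below w, x, y
```

With an argument of 1, `dup_topx` duplicates only the top item, which corresponds to DUP_TOP.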
||||||
|
|
||||||
|
|
||||||
|
Copyright
|
||||||
|
|
||||||
|
This document has been placed in the public domain.
|
||||||
|
|
||||||
|
|
||||||
|
References
|
||||||
|
|
||||||
|
[1] http://www.python.org/pipermail/python-list/2000-June/059556.html
|
||||||
|
[2]
|
||||||
|
http://sourceforge.net/patch?func=detailpatch&patch_id=100699&group_id=5470
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|