Complete rewrite of the PEP, discarding the previous long explanation about
augmented assignment. Kept as short as possible, but might still be
considered wordy :-)
Thomas Wouters 2000-08-07 12:40:00 +00:00
parent 93d20f2bf1
commit 18701cf307
1 changed files with 181 additions and 118 deletions

@ -3,10 +3,9 @@ Title: Augmented Assignments
Version: $Revision$ Version: $Revision$
Owner: thomas@xs4all.net (Thomas Wouters) Owner: thomas@xs4all.net (Thomas Wouters)
Python-Version: 2.0 Python-Version: 2.0
Status: Incomplete Status: Draft
Introduction Introduction
This PEP describes the `augmented assignment' proposal for Python This PEP describes the `augmented assignment' proposal for Python
@ -19,133 +18,197 @@ Introduction
definitive historical record. definitive historical record.
Proposed semantics
The Origin of Augmented Assignment
Augmented assignment refers to binary operators that combine two The proposed patch that adds augmented assignment to Python
existing operators: the assignment operator, and one of the binary introduces the following new operators:
operators. Its origins lie in other programming languages, most
notably `C', where it was defined for performance reasons. They
are meant to replace the repetitive syntax of, for instance,
adding the number '1' to a variable:
x = x + 1; += -= *= /= %= **= <<= >>= &= ^= |=
with an expression that is shorter, less error-prone and easier to
optimize (by the compiler):
x += 1; They implement the same operator as their normal binary form, with
the exception that the operation is done `in-place' whenever
The same goes for all other binary operands, resulting in the possible.
following augmented assignment operator list, based on Python's
current binary operator list:
+=, -=, /=, *=, %=, **=, >>=, <<=, &=, |=, ^=
See the documentation of each operator on what they do. They truly behave as augmented assignment, in that they perform
all of the normal load and store operations, in addition to the
binary operation they are intended to do. So, given the expression:
Augmented Assignment in Python
The traditional reasons for augmented assignment, readability and
optimization, are not as obvious in Python, for several reasons.
- Numbers are immutable, they cannot be changed. In other x += y
programming languages, a variable holds a value, and altering
the variable changes the value it holds. In Python, variables
hold `references' to values, and altering an immutable value
means changing the variable, not what it points to.
- Assignment is a different operation in Python. In most
languages, variables are containers, and assignment copies a
value into that container. In Python, assignment binds a value
to a name, it does not copy the value into a new storage space.
- The augmented assignment operators map fairly directly into the
underlying hardware. Python does not deal directly with the
hardware it runs on, so this `natural inclusion' does not make
sense.
- The augmented assignment syntax is subtly different in more
complex expressions. What to do, for instance, in a case such
as this:
seq[i:calc(seq, i)] *= r
It is unclear whether 'seq' gets indexed once or twice, and
whether 'calc' gets called once or twice.
Normal operators
There are, however, good reasons to include augmented assignment.
One of these has to do with Python's way of handling operators. In
Python, a user defined class can implement one or more of the
binary operators by supplying a 'magic' method name. For instance,
for a class to support '<instance> + <object>', the '__add__'
method should be defined. This method should return a new object,
which is the result of the expression.
For the case of '<object> + <instance>', where 'object' does not The object `x' is loaded, then `y' is added to it, and the resulting
have an '__add__' method, the class can define a '__radd__' object is stored back in the original place. The precise action
method, which then should behave exactly as '__add__'. Indeed, performed on the two arguments depends on the type of `x', and
'__radd__' is often a different name for the same method. possibly of `y'.
For C extension types, a similar technique is available, through
the PyNumberMethods and PySequenceMethods members of the PyType
structure.
However, the problem with this approach is that the '__add__' The idea behind augmented assignment in Python is that it isn't
method cannot know in what context it is called. It cannot tell just an easier way to write the common practice of storing the
whether it should create a new object, or whether it is allowed to result of a binary operation in its left-hand operand, but also a
modify itself. (As would be the case in 'x = x + 1') As a result, way for the left-hand operand in question to know that it should
the '__add__' method, and all other such 'magic' methods, should operate 'on itself', rather than creating a modified copy of
always return a new object. For large objects, this can be very itself.
inefficient.
This inefficiency is often solved by adding a method that does the
appropriate modification 'in-place'. List objects, for instance,
have the 'extend' method that behaves exactly as the '+' operator,
except the operation is done on the list itself, instead of on a
copy.
The augmented assignment syntax can support this behaviour To make this possible, a number of new `hooks' are added to Python
explicitly. When the magic method for 'in-place' operation are classes and C extension types, which are called when the object in
missing, it can fall back to the normal methods for that question is used as the left hand side of an augmented assignment
operation, maintaining full backward compatibility even when operation. If the class or type does not implement the `in-place'
mixing the new syntax with old objects. hooks, the normal hooks for the particular binary operation are
used.
So, given an instance object `x', the expression
x += y
tries to call x.__add_ab__(y), which is the 'in-place' variant of
__add__. If __add_ab__ is not present, x.__add__(y) is
attempted, and finally y.__radd__(x) if __add__ is missing too.
There is no `right-hand-side' variant of __add_ab__, because that
would require `y' to know how to in-place modify `x', which is
an unsafe assumption. The __add_ab__ hook should behave exactly
like __add__, returning the result of the operation (which could
be `self') which is to be stored in the variable `x'.
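The lookup order described above can be tried out in a short sketch. It assumes a present-day Python interpreter, where the in-place hook is spelled `__iadd__` (one of the alternative names mentioned in this PEP) rather than the draft name `__add_ab__`. The classes `Accumulator` and `PlainBag` are hypothetical and exist only for illustration.

```python
class Accumulator:
    """Defines an in-place hook, so `+=` modifies the existing object."""

    def __init__(self, items):
        self.items = list(items)

    def __add__(self, other):
        # Normal hook: always builds and returns a new object.
        return Accumulator(self.items + list(other))

    def __iadd__(self, other):
        # In-place hook (called __add_ab__ in this draft): modify self,
        # then return the object that is to be stored back in the name.
        self.items.extend(other)
        return self


class PlainBag:
    """Defines only the normal hook, so `+=` falls back to __add__."""

    def __init__(self, items):
        self.items = list(items)

    def __add__(self, other):
        return PlainBag(self.items + list(other))


acc = Accumulator([1, 2])
acc_before = acc
acc += [3]                          # uses the in-place hook
same_object = acc is acc_before     # the name still refers to the same object

bag = PlainBag([1, 2])
bag_before = bag
bag += [3]                          # no in-place hook: falls back to __add__
rebound = bag is not bag_before     # the name now refers to a new object
```

In both cases the result of the hook is stored back in the name, which is why the in-place hook has to return `self` (or whatever object should end up there).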
For C extension types, the `hooks' are members of the
PyNumberMethods and PySequenceMethods structures, and are called
in exactly the same manner as the existing non-inplace operations,
including argument coercion. C methods should also take care to
return a new reference to the result object, whether it's the same
object or a new one. So if the original object is returned, it
should be INCREF()'d appropriately.
The other benefit of augmented assignment is readability. After
the general concept of augmented assignment is grasped, all the
augmented assignment operators instantly become obvious. There is
no need for non-obvious and non-standard method names to implement
efficient, in-place operations, and there is no need to check the
type of an object before operating on it: the augmented assignment
will work for all types that implement that basic operation, not
merely those that implement the augmented variant.
And the last problem with augmented assignment, what to do with
indexes and function calls in the expression, can be solved in a
very Pythonic manner: if it looks like it's only called once, it
*is* only called once. Taking this expression:
seq[func(x)] += x
The function 'func' is called once, and 'seq' is indexed twice:
once to retrieve the value (__getitem__), and once to store it
(__setitem__). So the expression can be rewritten as:
tmp = func(x)
seq[tmp] = seq[tmp] + x
The augmented assignment form of this expression is much more
readable.
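A small sketch can make the "called once, indexed twice" rule visible. The class `LoggingList` and the body of `func` are hypothetical helpers, written for a present-day Python interpreter; they only record what the augmented assignment does.

```python
class LoggingList:
    """Wraps a list and records every item load and store."""

    def __init__(self, items):
        self._items = list(items)
        self.log = []

    def __getitem__(self, index):
        self.log.append(("get", index))
        return self._items[index]

    def __setitem__(self, index, value):
        self.log.append(("set", index))
        self._items[index] = value


calls = []


def func(x):
    # Record each call so the number of evaluations can be checked.
    calls.append(x)
    return x % 3


seq = LoggingList([10, 20, 30])
x = 4
seq[func(x)] += x

# func ran once; seq was indexed once to load and once to store.
func_call_count = len(calls)
index_operations = seq.log
```

After this runs, `calls` holds a single entry and `seq.log` holds one "get" followed by one "set", both for the same index.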
New methods
The proposed implementation adds the following 11 possible `hooks'
which Python classes can implement to overload the augmented
assignment operations:
__add_ab__
__sub_ab__
__mul_ab__
__div_ab__
__mod_ab__
__pow_ab__
__lshift_ab__
__rshift_ab__
__and_ab__
__xor_ab__
__or_ab__
The `__add_ab__' name is one proposed by Guido[1], and stands for `and
becomes'. Other proposed names include `__iadd__', `__add_in__' and
`__inplace_add__'.
For C extension types, the following struct members are added:
To PyNumberMethods:
binaryfunc nb_inplace_add;
binaryfunc nb_inplace_subtract;
binaryfunc nb_inplace_multiply;
binaryfunc nb_inplace_divide;
binaryfunc nb_inplace_remainder;
binaryfunc nb_inplace_power;
binaryfunc nb_inplace_lshift;
binaryfunc nb_inplace_rshift;
binaryfunc nb_inplace_and;
binaryfunc nb_inplace_xor;
binaryfunc nb_inplace_or;
To PySequenceMethods:
binaryfunc sq_inplace_concat;
intargfunc sq_inplace_repeat;
In order to keep binary compatibility, the tp_flags TypeObject
member is used to determine whether the TypeObject in question has
allocated room for these slots. Until a clean break in binary
compatibility is made (which may or may not happen before 2.0)
code that wants to use one of the new struct members must first
check that they are available with the 'PyType_HasFeature()' macro:
if (PyType_HasFeature(x->ob_type, Py_TPFLAGS_HAVE_INPLACE_OPS) &&
    x->ob_type->tp_as_number && x->ob_type->tp_as_number->nb_inplace_add) {
    /* ... */
}
This check must be made even before testing the method slots for
NULL values! The macro only tests whether the slots are available,
not whether they are filled with methods or not.
Implementation
The current implementation of augmented assignment[2] adds, in
addition to the methods and slots already covered, 13 new bytecodes
and 13 new API functions.
The API functions are simply in-place versions of the current
binary-operation API functions:
PyNumber_InPlaceAdd(PyObject *o1, PyObject *o2);
PyNumber_InPlaceSubtract(PyObject *o1, PyObject *o2);
PyNumber_InPlaceMultiply(PyObject *o1, PyObject *o2);
PyNumber_InPlaceDivide(PyObject *o1, PyObject *o2);
PyNumber_InPlaceRemainder(PyObject *o1, PyObject *o2);
PyNumber_InPlacePower(PyObject *o1, PyObject *o2);
PyNumber_InPlaceLshift(PyObject *o1, PyObject *o2);
PyNumber_InPlaceRshift(PyObject *o1, PyObject *o2);
PyNumber_InPlaceAnd(PyObject *o1, PyObject *o2);
PyNumber_InPlaceXor(PyObject *o1, PyObject *o2);
PyNumber_InPlaceOr(PyObject *o1, PyObject *o2);
PySequence_InPlaceConcat(PyObject *o1, PyObject *o2);
PySequence_InPlaceRepeat(PyObject *o, int count);
They call either the Python class hooks (if either of the objects
is a Python class instance) or the C type's number or sequence
methods.
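The effect of this dispatch can be observed from Python. The sketch below assumes a present-day interpreter, where the standard `operator` module offers `operator.iadd` as the Python-level spelling of `a += b`. That function is not part of this proposal and is used here only for illustration.

```python
import operator

# A list provides an in-place operation: the same object comes back, modified.
items = [1, 2]
list_result = operator.iadd(items, [3])

# A tuple provides no in-place operation: the normal addition is used
# instead, and a new object comes back.
pair = (1, 2)
tuple_result = operator.iadd(pair, (3,))
```

In the first case `list_result` is the very object `items`; in the second, `pair` is left untouched and `tuple_result` is a new tuple.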
The new bytecodes are:
INPLACE_ADD
INPLACE_SUBTRACT
INPLACE_MULTIPLY
INPLACE_DIVIDE
INPLACE_REMAINDER
INPLACE_POWER
INPLACE_LEFTSHIFT
INPLACE_RIGHTSHIFT
INPLACE_AND
INPLACE_XOR
INPLACE_OR
ROT_FOUR
DUP_TOPX
The INPLACE_* bytecodes mirror the BINARY_* bytecodes, except that
they are implemented as calls to the 'InPlace' API functions. The
other two bytecodes are 'utility' bytecodes: ROT_FOUR behaves like
ROT_THREE except that the four topmost stack items are rotated.
DUP_TOPX is a bytecode that takes a single argument, which should
be an integer between 1 and 5 (inclusive) which is the number of
items to duplicate in one block. Given a stack like this (where
the right side of the list is the 'top' of the stack):
[a, b, c, d, e, f, g]
"DUP_TOPX 3" would duplicate the top 3 items, resulting in this
stack:
[a, b, c, d, e, f, g, e, f, g]
DUP_TOPX with an argument of 1 is the same as DUP_TOP. The limit
of 5 is purely an implementation limit. The implementation of
augmented assignment requires only DUP_TOPX with an argument of 2
and 3, and could do without this new opcode at the cost of a fair
number of DUP_TOP and ROT_*.
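The two utility bytecodes can be modelled on a plain Python list. This is a sketch and not the interpreter's implementation: the helper names `dup_topx` and `rot_four` are hypothetical, and the end of the list is treated as the top of the stack, matching the example stacks shown above. The model of ROT_FOUR assumes it moves the top item down to the fourth position and lifts the three items below it by one, by analogy with ROT_THREE.

```python
def dup_topx(stack, count):
    """Model of DUP_TOPX: duplicate the top `count` items as one block."""
    if not 1 <= count <= 5:
        raise ValueError("DUP_TOPX argument must be between 1 and 5")
    if len(stack) < count:
        raise IndexError("not enough items on the stack")
    stack.extend(stack[-count:])


def rot_four(stack):
    """Model of ROT_FOUR: move the top item down to the fourth position."""
    if len(stack) < 4:
        raise IndexError("not enough items on the stack")
    stack[-4:] = [stack[-1]] + stack[-4:-1]


stack = list("abcdefg")
dup_topx(stack, 3)      # duplicates e, f, g on top of the stack

rotated = list("abcd")
rot_four(rotated)       # d moves below a, b, c
```

With an argument of 1, `dup_topx` behaves like DUP_TOP, since it copies only the single topmost item.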
Copyright
This document has been placed in the public domain.
References
[1] http://www.python.org/pipermail/python-list/2000-June/059556.html
[2] http://sourceforge.net/patch?func=detailpatch&patch_id=100699&group_id=5470