python-peps/peps/pep-0203.rst

324 lines
13 KiB
ReStructuredText

PEP: 203
Title: Augmented Assignments
Author: Thomas Wouters <thomas@python.org>
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 13-Jul-2000
Python-Version: 2.0
Post-History: 14-Aug-2000
Introduction
============
This PEP describes the *augmented assignment* proposal for Python 2.0. This
PEP tracks the status and ownership of this feature, slated for introduction
in Python 2.0. It contains a description of the feature and outlines changes
necessary to support the feature. This PEP summarizes discussions held in
mailing list forums [1]_, and provides URLs for further information where
appropriate. The CVS revision history of this file contains the definitive
historical record.
Proposed Semantics
==================
The proposed patch that adds augmented assignment to Python introduces the
following new operators::
+= -= *= /= %= **= <<= >>= &= ^= |=
They implement the same operator as their normal binary form, except that the
operation is done *in-place* when the left-hand side object supports it, and
that the left-hand side is only evaluated once.
They truly behave as augmented assignment, in that they perform all of the
normal load and store operations, in addition to the binary operation they are
intended to do. So, given the expression::
x += y
The object ``x`` is loaded, then ``y`` is added to it, and the resulting
object is stored back in the original place. The precise action performed on
the two arguments depends on the type of ``x``, and possibly of ``y``.
The idea behind augmented assignment in Python is that it isn't just an easier
way to write the common practice of storing the result of a binary operation
in its left-hand operand, but also a way for the left-hand operand in question
to know that it should operate *on itself*, rather than creating a modified
copy of itself.
To make this possible, a number of new *hooks* are added to Python classes and
C extension types, which are called when the object in question is used as the
left hand side of an augmented assignment operation. If the class or type
does not implement the *in-place* hooks, the normal hooks for the particular
binary operation are used.
So, given an instance object ``x``, the expression::
x += y
tries to call ``x.__iadd__(y)``, which is the *in-place* variant of
``__add__`` . If ``__iadd__`` is not present, ``x.__add__(y)`` is attempted,
and finally ``y.__radd__(x)`` if ``__add__`` is missing too. There is no
*right-hand-side* variant of ``__iadd__``, because that would require for
``y`` to know how to in-place modify ``x``, which is unsafe to say the least.
The ``__iadd__`` hook should behave similar to ``__add__``, returning the
result of the operation (which could be ``self``) which is to be assigned to
the variable ``x``.
For C extension types, the *hooks* are members of the ``PyNumberMethods`` and
``PySequenceMethods`` structures. Some special semantics apply to make the
use of these methods, and the mixing of Python instance objects and C types,
as unsurprising as possible.
In the generic case of ``x <augop> y`` (or a similar case using the
``PyNumber_InPlace`` API functions) the principal object being operated on is
``x``. This differs from normal binary operations, where ``x`` and ``y``
could be considered *co-operating*, because unlike in binary operations, the
operands in an in-place operation cannot be swapped. However, in-place
operations do fall back to normal binary operations when in-place modification
is not supported, resulting in the following rules:
- If the left-hand object (``x``) is an instance object, and it has a
``__coerce__`` method, call that function with ``y`` as the argument. If
coercion succeeds, and the resulting left-hand object is a different object
than ``x``, stop processing it as in-place and call the appropriate function
for the normal binary operation, with the coerced ``x`` and ``y`` as
arguments. The result of the operation is whatever that function returns.
If coercion does not yield a different object for ``x``, or ``x`` does not
define a ``__coerce__`` method, and ``x`` has the appropriate ``__ihook__``
for this operation, call that method with ``y`` as the argument, and the
result of the operation is whatever that method returns.
- Otherwise, if the left-hand object is not an instance object, but its type
does define the in-place function for this operation, call that function
with ``x`` and ``y`` as the arguments, and the result of the operation is
whatever that function returns.
Note that no coercion on either ``x`` or ``y`` is done in this case, and
it's perfectly valid for a C type to receive an instance object as the
second argument; that is something that cannot happen with normal binary
operations.
- Otherwise, process it exactly as a normal binary operation (not in-place),
including argument coercion. In short, if either argument is an instance
object, resolve the operation through ``__coerce__``, ``__hook__`` and
``__rhook__``. Otherwise, both objects are C types, and they are coerced
and passed to the appropriate function.
- If no way to process the operation can be found, raise a ``TypeError`` with
an error message specific to the operation.
- Some special casing exists to account for the case of ``+`` and ``*``,
which have a special meaning for sequences: for ``+``, sequence
concatenation, no coercion what so ever is done if a C type defines
``sq_concat`` or ``sq_inplace_concat``. For ``*``, sequence repeating,
``y`` is converted to a C integer before calling either
``sq_inplace_repeat`` and ``sq_repeat``. This is done even if ``y`` is an
instance, though not if ``x`` is an instance.
The in-place function should always return a new reference, either to the
old ``x`` object if the operation was indeed performed in-place, or to a new
object.
Rationale
=========
There are two main reasons for adding this feature to Python: simplicity of
expression, and support for in-place operations. The end result is a tradeoff
between simplicity of syntax and simplicity of expression; like most new
features, augmented assignment doesn't add anything that was previously
impossible. It merely makes these things easier to do.
Adding augmented assignment will make Python's syntax more complex. Instead
of a single assignment operation, there are now twelve assignment operations,
eleven of which also perform a binary operation. However, these eleven new
forms of assignment are easy to understand as the coupling between assignment
and the binary operation, and they require no large conceptual leap to
understand. Furthermore, languages that do have augmented assignment have
shown that they are a popular, much used feature. Expressions of the form::
<x> = <x> <operator> <y>
are common enough in those languages to make the extra syntax worthwhile, and
Python does not have significantly fewer of those expressions. Quite the
opposite, in fact, since in Python you can also concatenate lists with a
binary operator, something that is done quite frequently. Writing the above
expression as::
<x> <operator>= <y>
is both more readable and less error prone, because it is instantly obvious to
the reader that it is ``<x>`` that is being changed, and not ``<x>`` that is
being replaced by something almost, but not quite, entirely unlike ``<x>``.
The new in-place operations are especially useful to matrix calculation and
other applications that require large objects. In order to efficiently deal
with the available program memory, such packages cannot blindly use the
current binary operations. Because these operations always create a new
object, adding a single item to an existing (large) object would result in
copying the entire object (which may cause the application to run out of
memory), add the single item, and then possibly delete the original object,
depending on reference count.
To work around this problem, the packages currently have to use methods or
functions to modify an object in-place, which is definitely less readable than
an augmented assignment expression. Augmented assignment won't solve all the
problems for these packages, since some operations cannot be expressed in the
limited set of binary operators to start with, but it is a start. :pep:`211`
is looking at adding new operators.
New methods
===========
The proposed implementation adds the following 11 possible *hooks* which
Python classes can implement to overload the augmented assignment operations::
__iadd__
__isub__
__imul__
__idiv__
__imod__
__ipow__
__ilshift__
__irshift__
__iand__
__ixor__
__ior__
The *i* in ``__iadd__`` stands for *in-place*.
For C extension types, the following struct members are added.
To ``PyNumberMethods``::
binaryfunc nb_inplace_add;
binaryfunc nb_inplace_subtract;
binaryfunc nb_inplace_multiply;
binaryfunc nb_inplace_divide;
binaryfunc nb_inplace_remainder;
binaryfunc nb_inplace_power;
binaryfunc nb_inplace_lshift;
binaryfunc nb_inplace_rshift;
binaryfunc nb_inplace_and;
binaryfunc nb_inplace_xor;
binaryfunc nb_inplace_or;
To ``PySequenceMethods``::
binaryfunc sq_inplace_concat;
intargfunc sq_inplace_repeat;
In order to keep binary compatibility, the ``tp_flags`` TypeObject member is
used to determine whether the TypeObject in question has allocated room for
these slots. Until a clean break in binary compatibility is made (which may
or may not happen before 2.0) code that wants to use one of the new struct
members must first check that they are available with the
``PyType_HasFeature()`` macro::
if (PyType_HasFeature(x->ob_type, Py_TPFLAGS_HAVE_INPLACE_OPS) &&
x->ob_type->tp_as_number && x->ob_type->tp_as_number->nb_inplace_add) {
/* ... */
This check must be made even before testing the method slots for ``NULL``
values! The macro only tests whether the slots are available, not whether
they are filled with methods or not.
Implementation
==============
The current implementation of augmented assignment [2]_ adds, in addition to
the methods and slots already covered, 13 new bytecodes and 13 new API
functions.
The API functions are simply in-place versions of the current binary-operation
API functions::
PyNumber_InPlaceAdd(PyObject *o1, PyObject *o2);
PyNumber_InPlaceSubtract(PyObject *o1, PyObject *o2);
PyNumber_InPlaceMultiply(PyObject *o1, PyObject *o2);
PyNumber_InPlaceDivide(PyObject *o1, PyObject *o2);
PyNumber_InPlaceRemainder(PyObject *o1, PyObject *o2);
PyNumber_InPlacePower(PyObject *o1, PyObject *o2);
PyNumber_InPlaceLshift(PyObject *o1, PyObject *o2);
PyNumber_InPlaceRshift(PyObject *o1, PyObject *o2);
PyNumber_InPlaceAnd(PyObject *o1, PyObject *o2);
PyNumber_InPlaceXor(PyObject *o1, PyObject *o2);
PyNumber_InPlaceOr(PyObject *o1, PyObject *o2);
PySequence_InPlaceConcat(PyObject *o1, PyObject *o2);
PySequence_InPlaceRepeat(PyObject *o, int count);
They call either the Python class hooks (if either of the objects is a Python
class instance) or the C type's number or sequence methods.
The new bytecodes are::
INPLACE_ADD
INPLACE_SUBTRACT
INPLACE_MULTIPLY
INPLACE_DIVIDE
INPLACE_REMAINDER
INPLACE_POWER
INPLACE_LEFTSHIFT
INPLACE_RIGHTSHIFT
INPLACE_AND
INPLACE_XOR
INPLACE_OR
ROT_FOUR
DUP_TOPX
The ``INPLACE_*`` bytecodes mirror the ``BINARY_*`` bytecodes, except that
they are implemented as calls to the ``InPlace`` API functions. The other two
bytecodes are *utility* bytecodes: ``ROT_FOUR`` behaves like ``ROT_THREE``
except that the four topmost stack items are rotated.
``DUP_TOPX`` is a bytecode that takes a single argument, which should be an
integer between 1 and 5 (inclusive) which is the number of items to duplicate
in one block. Given a stack like this (where the right side of the list is
the *top* of the stack)::
[1, 2, 3, 4, 5]
``DUP_TOPX 3`` would duplicate the top 3 items, resulting in this stack::
[1, 2, 3, 4, 5, 3, 4, 5]
``DUP_TOPX`` with an argument of 1 is the same as ``DUP_TOP``. The limit of 5
is purely an implementation limit . The implementation of augmented
assignment requires only ``DUP_TOPX`` with an argument of 2 and 3, and could
do without this new opcode at the cost of a fair number of ``DUP_TOP`` and
``ROT_*``.
Open Issues
===========
The ``PyNumber_InPlace`` API is only a subset of the normal ``PyNumber`` API:
only those functions that are required to support the augmented assignment
syntax are included. If other in-place API functions are needed, they can be
added later.
The ``DUP_TOPX`` bytecode is a conveniency bytecode, and is not actually
necessary. It should be considered whether this bytecode is worth having.
There seems to be no other possible use for this bytecode at this time.
Copyright
=========
This document has been placed in the public domain.
References
==========
.. [1] http://www.python.org/pipermail/python-list/2000-June/059556.html
.. [2] http://sourceforge.net/patch?func=detailpatch&patch_id=100699&group_id=5470