python-peps/pep-0203.txt

PEP: 203
Title: Augmented Assignments
Version: $Revision$
Author: thomas@xs4all.net (Thomas Wouters)
Status: Accepted
Type: Standards Track
Python-Version: 2.0
Created: 13-Jul-2000
Post-History: 14-Aug-2000


Introduction

    This PEP describes the `augmented assignment' proposal for Python
    2.0.  This PEP tracks the status and ownership of this feature,
    slated for introduction in Python 2.0.  It contains a description
    of the feature and outlines changes necessary to support the
    feature.  This PEP summarizes discussions held in mailing list
    forums, and provides URLs for further information where
    appropriate.  The CVS revision history of this file contains the
    definitive historical record.


Proposed semantics

    The proposed patch that adds augmented assignment to Python
    introduces the following new operators:

       += -= *= /= %= **= <<= >>= &= ^= |=

    They implement the same operator as their normal binary form,
    except that the operation is done `in-place' when the left-hand
    side object supports it, and that the left-hand side is only
    evaluated once.

    They truly behave as augmented assignment, in that they perform
    all of the normal load and store operations, in addition to the
    binary operation they are intended to do. So, given the expression:

       x += y

    The object `x' is loaded, then `y' is added to it, and the
    resulting object is stored back in the original place. The precise
    action performed on the two arguments depends on the type of `x',
    and possibly of `y'.

    The idea behind augmented assignment in Python is that it isn't
    just an easier way to write the common practice of storing the
    result of a binary operation in its left-hand operand, but also a
    way for the left-hand operand in question to know that it should
    operate `on itself', rather than creating a modified copy of
    itself.

    To make this possible, a number of new `hooks' are added to Python
    classes and C extension types, which are called when the object in
    question is used as the left hand side of an augmented assignment
    operation. If the class or type does not implement the `in-place'
    hooks, the normal hooks for the particular binary operation are
    used.

    So, given an instance object `x', the expression

        x += y

    tries to call x.__iadd__(y), which is the `in-place' variant of
    __add__. If __iadd__ is not present, x.__add__(y) is attempted,
    and finally y.__radd__(x) if __add__ is missing too.  There is no
    `right-hand-side' variant of __iadd__, because that would require
    for `y' to know how to in-place modify `x', which is unsafe to say
    the least. The __iadd__ hook should behave similar to __add__,
    returning the result of the operation (which could be `self')
    which is to be assigned to the variable `x'.

    For C extension types, the `hooks' are members of the
    PyNumberMethods and PySequenceMethods structures.  Some special
    semantics apply to make the use of these methods, and the mixing
    of Python instance objects and C types, as unsurprising as
    possible.

    In the generic case of `x <augop> y' (or a similar case using the
    PyNumber_InPlace API functions) the principal object being
    operated on is `x'.  This differs from normal binary operations,
    where `x' and `y' could be considered `co-operating', because
    unlike in binary operations, the operands in an in-place operation
    cannot be swapped.  However, in-place operations do fall back to
    normal binary operations when in-place modification is not
    supported, resuling in the following rules:

    - If the left-hand object (`x') is an instance object, and it
      has a `__coerce__' method, call that function with `y' as the
      argument. If coercion succeeds, and the resulting left-hand
      object is a different object than `x', stop processing it as
      in-place and call the appropriate function for the normal binary
      operation, with the coerced `x' and `y' as arguments. The result
      of the operation is whatever that function returns.

      If coercion does not yield a different object for `x', or `x'
      does not define a `__coerce__' method, and `x' has the
      appropriate `__ihook__' for this operation, call that method
      with `y' as the argument, and the result of the operation is
      whatever that method returns.

    - Otherwise, if the left-hand object is not an instance object,
      but its type does define the in-place function for this
      operation, call that function with `x' and `y' as the arguments,
      and the result of the operation is whatever that function
      returns.

      Note that no coercion on either `x' or `y' is done in this case,
      and it's perfectly valid for a C type to receive an instance
      object as the second argument; that is something that cannot
      happen with normal binary operations.

    - Otherwise, process it exactly as a normal binary operation (not
      in-place), including argument coercion. In short, if either
      argument is an instance object, resolve the operation through
      `__coerce__', `__hook__' and `__rhook__'. Otherwise, both
      objects are C types, and they are coerced and passed to the
      appropriate function.

    - If no way to process the operation can be found, raise a
      TypeError with an error message specific to the operation.

    - Some special casing exists to account for the case of `+' and
      `*', which have a special meaning for sequences: for `+',
      sequence concatenation, no coercion what so ever is done if a C
      type defines sq_concat or sq_inplace_concat. For `*', sequence
      repeating, `y' is converted to a C integer before calling either
      sq_inplace_repeat and sq_repeat. This is done even if `y' is an
      instance, though not if `x' is an instance.

    The in-place function should always return a new reference, either
    to the old `x' object if the operation was indeed performed
    in-place, or to a new object.


Rationale

    There are two main reasons for adding this feature to Python:
    simplicity of expression, and support for in-place operations. The
    end result is a tradeoff between simplicity of syntax and
    simplicity of expression; like most new features, augmented
    assignment doesn't add anything that was previously impossible. It
    merely makes these things easier to do.

    Adding augmented assignment will make Python's syntax more complex.
    Instead of a single assignment operation, there are now twelve
    assignment operations, eleven of which also perform an binary
    operation. However, these eleven new forms of assignment are easy
    to understand as the coupling between assignment and the binary
    operation, and they require no large conceptual leap to
    understand. Furthermore, languages that do have augmented
    assignment have shown that they are a popular, much used feature.
    Expressions of the form

        <x> = <x> <operator> <y>

    are common enough in those languages to make the extra syntax
    worthwhile, and Python does not have significantly fewer of those
    expressions. Quite the opposite, in fact, since in Python you can
    also concatenate lists with a binary operator, something that is
    done quite frequently. Writing the above expression as

        <x> <operator>= <y>

    is both more readable and less error prone, because it is
    instantly obvious to the reader that it is <x> that is being
    changed, and not <x> that is being replaced by something almost,
    but not quite, entirely unlike <x>.

    The new in-place operations are especially useful to matrix
    calculation and other applications that require large objects. In
    order to efficiently deal with the available program memory, such
    packages cannot blindly use the current binary operations. Because
    these operations always create a new object, adding a single item
    to an existing (large) object would result in copying the entire
    object (which may cause the application to run out of memory), add
    the single item, and then possibly delete the original object,
    depending on reference count.

    To work around this problem, the packages currently have to use
    methods or functions to modify an object in-place, which is
    definitely less readable than an augmented assignment expression.
    Augmented assignment won't solve all the problems for these
    packages, since some operations cannot be expressed in the limited
    set of binary operators to start with, but it is a start. A
    different PEP[2] is looking at adding new operators.


New methods

    The proposed implementation adds the following 11 possible `hooks'
    which Python classes can implement to overload the augmented
    assignment operations:

        __iadd__
        __isub__
        __imul__
        __idiv__
        __imod__
        __ipow__
        __ilshift__
        __irshift__
        __iand__
        __ixor__
        __ior__

    The `i' in `__iadd__' stands for `in-place'.

    For C extension types, the following struct members are added:

    To PyNumberMethods:
        binaryfunc nb_inplace_add;
        binaryfunc nb_inplace_subtract;
        binaryfunc nb_inplace_multiply;
        binaryfunc nb_inplace_divide;
        binaryfunc nb_inplace_remainder;
        binaryfunc nb_inplace_power;
        binaryfunc nb_inplace_lshift;
        binaryfunc nb_inplace_rshift;
        binaryfunc nb_inplace_and;
        binaryfunc nb_inplace_xor;
        binaryfunc nb_inplace_or;

    To PySequenceMethods:
        binaryfunc sq_inplace_concat;
        intargfunc sq_inplace_repeat;

    In order to keep binary compatibility, the tp_flags TypeObject
    member is used to determine whether the TypeObject in question has
    allocated room for these slots. Until a clean break in binary
    compatibility is made (which may or may not happen before 2.0)
    code that wants to use one of the new struct members must first
    check that they are available with the `PyType_HasFeature()'
    macro:

    if (PyType_HasFeature(x->ob_type, Py_TPFLAGS_HAVE_INPLACE_OPS) &&
        x->ob_type->tp_as_number && x->ob_type->tp_as_number->nb_inplace_add) {
            /* ... */

    This check must be made even before testing the method slots for
    NULL values! The macro only tests whether the slots are available,
    not whether they are filled with methods or not.


Implementation

    The current implementation of augmented assignment[1] adds, in
    addition to the methods and slots already covered, 13 new bytecodes
    and 13 new API functions.

    The API functions are simply in-place versions of the current
    binary-operation API functions:

        PyNumber_InPlaceAdd(PyObject *o1, PyObject *o2);
        PyNumber_InPlaceSubtract(PyObject *o1, PyObject *o2);
        PyNumber_InPlaceMultiply(PyObject *o1, PyObject *o2);
        PyNumber_InPlaceDivide(PyObject *o1, PyObject *o2);
        PyNumber_InPlaceRemainder(PyObject *o1, PyObject *o2);
        PyNumber_InPlacePower(PyObject *o1, PyObject *o2);
        PyNumber_InPlaceLshift(PyObject *o1, PyObject *o2);
        PyNumber_InPlaceRshift(PyObject *o1, PyObject *o2);
        PyNumber_InPlaceAnd(PyObject *o1, PyObject *o2);
        PyNumber_InPlaceXor(PyObject *o1, PyObject *o2);
        PyNumber_InPlaceOr(PyObject *o1, PyObject *o2);
        PySequence_InPlaceConcat(PyObject *o1, PyObject *o2);
        PySequence_InPlaceRepeat(PyObject *o, int count);

    They call either the Python class hooks (if either of the objects
    is a Python class instance) or the C type's number or sequence
    methods.

    The new bytecodes are:
        INPLACE_ADD
        INPLACE_SUBTRACT
        INPLACE_MULTIPLY
        INPLACE_DIVIDE
        INPLACE_REMAINDER
        INPLACE_POWER
        INPLACE_LEFTSHIFT
        INPLACE_RIGHTSHIFT
        INPLACE_AND
        INPLACE_XOR
        INPLACE_OR
        ROT_FOUR
        DUP_TOPX

    The INPLACE_* bytecodes mirror the BINARY_* bytecodes, except that
    they are implemented as calls to the `InPlace' API functions. The
    other two bytecodes are `utility' bytecodes: ROT_FOUR behaves like
    ROT_THREE except that the four topmost stack items are rotated.

    DUP_TOPX is a bytecode that takes a single argument, which should
    be an integer between 1 and 5 (inclusive) which is the number of
    items to duplicate in one block. Given a stack like this (where
    the right side of the list is the `top' of the stack):

        [1, 2, 3, 4, 5]

    "DUP_TOPX 3" would duplicate the top 3 items, resulting in this
    stack:

        [1, 2, 3, 4, 5, 3, 4, 5]

    DUP_TOPX with an argument of 1 is the same as DUP_TOP. The limit
    of 5 is purely an implementation limit. The implementation of
    augmented assignment requires only DUP_TOPX with an argument of 2
    and 3, and could do without this new opcode at the cost of a fair
    number of DUP_TOP and ROT_*.


Open Issues

    The PyNumber_InPlace API is only a subset of the normal PyNumber
    API: only those functions that are required to support the
    augmented assignment syntax are included. If other in-place API
    functions are needed, they can be added later.


    The DUP_TOPX bytecode is a conveniency bytecode, and is not
    actually necessary. It should be considered whether this bytecode
    is worth having. There seems to be no other possible use for this
    bytecode at this time.


Copyright

    This document has been placed in the public domain.


References

    [1] http://www.python.org/pipermail/python-list/2000-June/059556.html
    [2] http://sourceforge.net/patch?func=detailpatch&patch_id=100699&group_id=5470
    [3] PEP 211, Adding New Linear Algebra Operators,
        http://python.sourceforge.net/peps/pep-0211.html


Local Variables:
mode: indented-text
indent-tabs-mode: nil
End: