python-peps/pep-0207.txt

PEP: 207
Title: Rich Comparisions
Version: $Revision$
Last-Modified: $Date$
Author: guido@python.org (Guido van Rossum), DavidA@ActiveState.com (David Ascher)
Python-Version: 2.1
Type: Standards
Status: Final


Abstract

    This PEP proposes several new features for comparisons:

    - Allow separately overloading of <, >, <=, >=, ==, !=, both in
      classes and in C extensions.

    - Allow any of those overloaded operators to return something else
      besides a Boolean result.


Motivation

    The main motivation comes from NumPy, whose users agree that A<B
    should return an array of elementwise comparison outcomes; they
    currently have to spell this as less(A,B) because A<B can only
    return a Boolean result or raise an exception.

    An additional motivation is that frequently, types don't have a
    natural ordering, but still need to be compared for equality.
    Currently such a type *must* implement comparison and thus define
    an arbitrary ordering, just so that equality can be tested.

    Also, for some object types an equality test can be implemented
    much more efficiently than an ordering test; for example, lists
    and dictionaries that differ in length are unequal, but the
    ordering requires inspecting some (potentially all) items.


Previous Work

    Rich Comparisons have been proposed before; in particular by David
    Ascher, after experience with Numerical Python:

      http://starship.python.net/crew/da/proposals/richcmp.html

    It is also included below as an Appendix.  Most of the material in
    this PEP is derived from David's proposal.


Concerns

    1 Backwards compatibility, both at the Python level (classes using
      __cmp__ need not be changed) and at the C level (extensions
      defining tp_compare need not be changed, code using
      PyObject_Compare() must work even if the compared objects use
      the new rich comparison scheme).

    2 When A<B returns a matrix of elementwise comparisons, an easy
      mistake to make is to use this expression in a Boolean context.
      Without special precautions, it would always be true.  This use
      should raise an exception instead.

    3 If a class overrides x==y but nothing else, should x!=y be
      computed as not(x==y), or fail?  What about the similar
      relationship between < and >=, or between > and <=?

    4 Similarly, should we allow x<y to be calculated from y>x?  And
      x<=y from not(x>y)?  And x==y from y==x, or x!=y from y!=x?

    5 When comparison operators return elementwise comparisons, what
      to do about shortcut operators like A<B<C, ``A<B and C<D'',
      ``A<B or C<D''?

    6 What to do about min() and max(), the 'in' and 'not in'
      operators, list.sort(), dictionary key comparison, and other
      uses of comparisons by built-in operations?


Proposed Resolutions

    1 Full backwards compatibility can be achieved as follows.  When
      an object defines tp_compare() but not tp_richcompare(), and a
      rich comparison is requested, the outcome of tp_compare() is
      used in the ovious way.  E.g. if "<" is requested, an exception if
      tp_compare() raises an exception, the outcome is 1 if
      tp_compare() is negative, and 0 if it is zero or positive.  Etc.

      Full forward compatibility can be achieved as follows.  When a
      classic comparison is requested on an object that implements
      tp_richcompare(), up to three comparisons are used: first == is
      tried, and if it returns true, 0 is returned; next, < is tried
      and if it returns true, -1 is returned; next, > is tried and if
      it returns true, +1 is returned.  If any operator tried returns
      a non-Boolean value (see below), the exception raised by
      conversion to Boolean is passed through.  If none of the
      operators tried returns true, the classic comparison fallbacks
      are tried next.

      (I thought long and hard about the order in which the three
      comparisons should be tried.  At one point I had a convincing
      argument for doing it in this order, based on the behavior of
      comparisons for cyclical data structures.  But since that code
      has changed again, I'm not so sure that it makes a difference
      any more.)

    2 Any type that returns a collection of Booleans instead of a
      single boolean should define nb_nonzero() to raise an exception.
      Such a type is considered a non-Boolean.

    3 The == and != operators are not assumed to be each other's
      complement (e.g. IEEE 754 floating point numbers do not satisfy
      this).  It is up to the type to implement this if desired.
      Similar for < and >=, or > and <=; there are lots of examples
      where these assumptions aren't true (e.g. tabnanny).

    4 The reflexivity rules *are* assumed by Python.  Thus, the
      interpreter may swap y>x with x<y, y>=x with x<=y, and may swap
      the arguments of x==y and x!=y.  (Note: Python currently assumes
      that x==x is always true and x!=x is never true; this should not
      be assumed.)

    5 In the current proposal, when A<B returns an array of
      elementwise comparisons, this outcome is considered non-Boolean,
      and its interpretation as Boolean by the shortcut operators
      raises an exception.  David Ascher's proposal tries to deal
      with this; I don't think this is worth the additional complexity
      in the code generator.  Instead of A<B<C, you can write
      (A<B)&(B<C).

    6 The min() and list.sort() operations will only use the
      < operator; max() will only use the > operator.  The 'in' and
      'not in' operators and dictionary lookup will only use the ==
      operator.


Implementation Proposal

    This closely follows David Ascher's proposal.

    C API

    - New functions:

      PyObject *PyObject_RichCompare(PyObject *, PyObject *, int)

      This performs the requested rich comparison, returning a Python
      object or raising an exception.  The 3rd argument must be one of
      Py_LT, Py_LE, Py_EQ, Py_NE, Py_GT or Py_GE.

      int PyObject_RichCompareBool(PyObject *, PyObject *, int)

      This performs the requested rich comparison, returning a
      Boolean: -1 for exception, 0 for false, 1 for true.  The 3rd
      argument must be one of Py_LT, Py_LE, Py_EQ, Py_NE, Py_GT or
      Py_GE.  Note that when PyObject_RichCompare() returns a
      non-Boolean object, PyObject_RichCompareBool() will raise an
      exception.

    - New typedef:

      typedef PyObject *(*richcmpfunc) (PyObject *, PyObject *, int);

    - New slot in type object, replacing spare tp_xxx7:

      richcmpfunc tp_richcompare;

      This should be a function with the same signature as
      PyObject_RichCompare(), and performing the same comparison.
      At least one of the arguments is of the type whose
      tp_richcompare slot is being used, but the other may have a
      different type.  If the function cannot compare the particular
      combination of objects, it should return a new reference to
      Py_NotImplemented.

    - PyObject_Compare() is changed to try rich comparisons if they
      are defined (but only if classic comparisons aren't defined).

    Changes to the interpreter

    - Whenever PyObject_Compare() is called with the intent of getting
      the outcome of a particular comparison (e.g. in list.sort(), and
      of course for the comparison operators in ceval.c), the code is
      changed to call PyObject_RichCompare() or
      PyObject_RichCompareBool() instead; if the C code needs to know
      the outcome of the comparison, PyObject_IsTrue() is called on
      the result (which may raise an exception).

    - Most built-in types that currently define a comparison will be
      modified to define a rich comparison instead.  (This is
      optional; I've converted lists, tuples, complex numbers, and
      arrays so far, and am not sure whether I will convert others.)

    Classes

    - Classes can define new special methods __lt__, __le__, __eq__,
      __ne__,__gt__, __ge__ to override the corresponding operators.
      (I.e., <, <=, ==, !=, >, >=. You gotta love the Fortran
      heritage.)  If a class defines __cmp__ as well, it is only used
      when __lt__ etc. have been tried and return NotImplemented.


Copyright

    This document has been placed in the public domain.


Appendix

    Here is most of David Ascher's original proposal (version 0.2.1,
    dated Wed Jul 22 16:49:28 1998; I've left the Contents, History
    and Patches sections out).  It addresses almost all concerns
    above.


Abstract

    A new mechanism allowing comparisons of Python objects to return
    values other than -1, 0, or 1 (or raise exceptions) is
    proposed. This mechanism is entirely backwards compatible, and can
    be controlled at the level of the C PyObject type or of the Python
    class definition. There are three cooperating parts to the
    proposed mechanism:

    - the use of the last slot in the type object structure to store a
    pointer to a rich comparison function

    - the addition of special methods for classes

    - the addition of an optional argument to the builtin cmp()
    function.


Motivation

    The current comparison protocol for Python objects assumes that
    any two Python objects can be compared (as of Python 1.5, object
    comparisons can raise exceptions), and that the return value for
    any comparison should be -1, 0 or 1. -1 indicates that the first
    argument to the comparison function is less than the right one, +1
    indicating the contrapositive, and 0 indicating that the two
    objects are equal. While this mechanism allows the establishment
    of a order relationship (e.g. for use by the sort() method of list
    objects), it has proven to be limited in the context of Numeric
    Python (NumPy).

    Specifically, NumPy allows the creation of multidimensional
    arrays, which support most of the numeric operators. Thus:

             x = array((1,2,3,4))        y = array((2,2,4,4))

    are two NumPy arrays. While they can be added elementwise,:

             z = x + y   # z == array((3,4,7,8))

    they cannot be compared in the current framework - the released
    version of NumPy compares the pointers, (thus yielding junk
    information) which was the only solution before the recent
    addition of the ability (in 1.5) to raise exceptions in comparison
    functions.

    Even with the ability to raise exceptions, the current protocol
    makes array comparisons useless. To deal with this fact, NumPy
    includes several functions which perform the comparisons: less(),
    less_equal(), greater(), greater_equal(), equal(),
    not_equal(). These functions return arrays with the same shape as
    their arguments (modulo broadcasting), filled with 0's and 1's
    depending on whether the comparison is true or not for each
    element pair. Thus, for example, using the arrays x and y defined
    above:

             less(x,y) 

    would be an array containing the numbers (1,0,0,0).

    The current proposal is to modify the Python object interface to
    allow the NumPy package to make it so that x < y returns the same
    thing as less(x,y). The exact return value is up to the NumPy
    package -- what this proposal really asks for is changing the
    Python core so that extension objects have the ability to return
    something other than -1, 0, 1, should their authors choose to do
    so.

Current State of Affairs

    The current protocol is, at the C level, that each object type
    defines a tp_compare slot, which is a pointer to a function which
    takes two PyObject* references and returns -1, 0, or 1. This
    function is called by the PyObject_Compare() function defined in
    the C API. PyObject_Compare() is also called by the builtin
    function cmp() which takes two arguments.

Proposed Mechanism

    1. Changes to the C structure for type objects

    The last available slot in the PyTypeObject, reserved up to now
    for future expansion, is used to optionally store a pointer to a
    new comparison function, of type richcmpfunc defined by:

           typedef PyObject *(*richcmpfunc)
                Py_PROTO((PyObject *, PyObject *, int));

    This function takes three arguments. The first two are the objects
    to be compared, and the third is an integer corresponding to an
    opcode (one of LT, LE, EQ, NE, GT, GE). If this slot is left NULL,
    then rich comparison for that object type is not supported (except
    for class instances whose class provide the special methods
    described below).

    The above opcodes need to be added to the published Python/C API
    (probably under the names Py_LT, Py_LE, etc.)

    2. Additions of special methods for classes

    Classes wishing to support the rich comparison mechanisms must add
    one or more of the following new special methods:

             def __lt__(self, other):
                ...
             def __le__(self, other):
                ...
             def __gt__(self, other):
                ...
             def __ge__(self, other):
                ...
             def __eq__(self, other):
                ...
             def __ne__(self, other):
                ...

    Each of these is called when the class instance is the on the
    left-hand-side of the corresponding operators (<, <=, >, >=, ==,
    and != or <>). The argument other is set to the object on the
    right side of the operator. The return value of these methods is
    up to the class implementor (after all, that's the entire point of
    the proposal).

    If the object on the left side of the operator does not define an
    appropriate rich comparison operator (either at the C level or
    with one of the special methods, then the comparison is reversed,
    and the right hand operator is called with the opposite operator,
    and the two objects are swapped. This assumes that a < b and b > a
    are equivalent, as are a <= b and b >= a, and that == and != are
    commutative (e.g. a == b if and only if b == a).

    For example, if obj1 is an object which supports the rich
    comparison protocol and x and y are objects which do not support
    the rich comparison protocol, then obj1 < x will call the __lt__
    method of obj1 with x as the second argument. x < obj1 will call
    obj1's __gt__ method with x as a second argument, and x < y will
    just use the existing (non-rich) comparison mechanism.

    The above mechanism is such that classes can get away with not
    implementing either __lt__ and __le__ or __gt__ and
    __ge__. Further smarts could have been added to the comparison
    mechanism, but this limited set of allowed "swaps" was chosen
    because it doesn't require the infrastructure to do any processing
    (negation) of return values. The choice of six special methods was
    made over a single (e.g. __richcmp__) method to allow the
    dispatching on the opcode to be performed at the level of the C
    implementation rather than the user-defined method.

    3. Addition of an optional argument to the builtin cmp()

    The builtin cmp() is still used for simple comparisons. For rich
    comparisons, it is called with a third argument, one of "<", "<=",
    ">", ">=", "==", "!=", "<>" (the last two have the same
    meaning). When called with one of these strings as the third
    argument, cmp() can return any Python object. Otherwise, it can
    only return -1, 0 or 1 as before.

Chained Comparisons

    Problem

    It would be nice to allow objects for which the comparison returns
    something other than -1, 0, or 1 to be used in chained
    comparisons, such as:

             x < y < z

    Currently, this is interpreted by Python as:

             temp1 = x < y
             if temp1:
               return y < z
             else:
               return temp1     

    Note that this requires testing the truth value of the result of
    comparisons, with potential "shortcutting" of the right-side
    comparison testings. In other words, the truth-value of the result
    of the result of the comparison determines the result of a chained
    operation. This is problematic in the case of arrays, since if x,
    y and z are three arrays, then the user expects:

            x < y < z

    to be an array of 0's and 1's where 1's are in the locations
    corresponding to the elements of y which are between the
    corresponding elements in x and z. In other words, the right-hand
    side must be evaluated regardless of the result of x < y, which is
    incompatible with the mechanism currently in use by the parser.

    Solution

    Guido mentioned that one possible way out would be to change the
    code generated by chained comparisons to allow arrays to be
    chained-compared intelligently. What follows is a mixture of his
    idea and my suggestions. The code generated for x < y < z would be
    equivalent to:

             temp1 = x < y
             if temp1:
               temp2 = y < z
               return boolean_combine(temp1, temp2)
             else:
               return temp1     

    where boolean_combine is a new function which does something like
    the following:

             def boolean_combine(a, b):
                 if hasattr(a, '__boolean_and__') or \
                    hasattr(b, '__boolean_and__'):
                     try:
                         return a.__boolean_and__(b)
                     except:
                         return b.__boolean_and__(a)
                 else: # standard behavior
                     if a:
                         return b
                     else: 
                         return 0

    where the __boolean_and__ special method is implemented for
    C-level types by another value of the third argument to the
    richcmp function. This method would perform a boolean comparison
    of the arrays (currently implemented in the umath module as the
    logical_and ufunc).

    Thus, objects returned by rich comparisons should always test
    true, but should define another special method which creates
    boolean combinations of them and their argument.

    This solution has the advantage of allowing chained comparisons to
    work for arrays, but the disadvantage that it requires comparison
    arrays to always return true (in an ideal world, I'd have them
    always raise an exception on truth testing, since the meaning of
    testing "if a>b:" is massively ambiguous.

    The inlining already present which deals with integer comparisons
    would still apply, resulting in no performance cost for the most
    common cases.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:
-												New PEPs, all initially owned by David Ascher.

											
										
										
											2000-07-15 19:28:27 -04:00
+								PEP: 207
 								Title: Rich Comparisions
 								Version: $Revision$
-												added Last-Modified headers to all PEPs without

											
										
										
											2006-03-23 15:13:19 -05:00
+								Last-Modified: $Date$
-												Fix authorship for the last time\!

											
										
										
											2000-12-06 16:22:52 -05:00
+								Author: guido@python.org (Guido van Rossum), DavidA@ActiveState.com (David Ascher)
-												New PEPs, all initially owned by David Ascher.

											
										
										
											2000-07-15 19:28:27 -04:00
+								Python-Version: 2.1
-												Add missing PEP type

											
										
										
											2006-02-09 03:25:22 -05:00
+								Type: Standards
-												Finalized this PEP.  Closed all unresolved issues.

											
										
										
											2001-01-19 17:29:19 -05:00
+								Status: Final
-												New PEPs, all initially owned by David Ascher.

											
										
										
											2000-07-15 19:28:27 -04:00
-												Checking in some text.  Most of this is simply an Appendix repeating
what David Ascher said in 1998 (before the starship bites the dust
again).

											
										
										
											2000-12-04 15:32:13 -05:00
+								Abstract
 								    This PEP proposes several new features for comparisons:
 								    - Allow separately overloading of <, >, <=, >=, ==, !=, both in
 								      classes and in C extensions.
 								    - Allow any of those overloaded operators to return something else
 								      besides a Boolean result.
 								Motivation
 								    The main motivation comes from NumPy, whose users agree that A<B
 								    should return an array of elementwise comparison outcomes; they
 								    currently have to spell this as less(A,B) because A<B can only
-												This is now ready for discussion.  David, please read -- did I forget
anything?

											
										
										
											2000-12-06 12:41:38 -05:00
+								    return a Boolean result or raise an exception.
-												Checking in some text.  Most of this is simply an Appendix repeating
what David Ascher said in 1998 (before the starship bites the dust
again).

											
										
										
											2000-12-04 15:32:13 -05:00
 								    An additional motivation is that frequently, types don't have a
 								    natural ordering, but still need to be compared for equality.
-												This is now ready for discussion.  David, please read -- did I forget
anything?

											
										
										
											2000-12-06 12:41:38 -05:00
+								    Currently such a type *must* implement comparison and thus define
-												Checking in some text.  Most of this is simply an Appendix repeating
what David Ascher said in 1998 (before the starship bites the dust
again).

											
										
										
											2000-12-04 15:32:13 -05:00
+								    an arbitrary ordering, just so that equality can be tested.
-												Finalized this PEP.  Closed all unresolved issues.

											
										
										
											2001-01-19 17:29:19 -05:00
+								    Also, for some object types an equality test can be implemented
 								    much more efficiently than an ordering test; for example, lists
 								    and dictionaries that differ in length are unequal, but the
 								    ordering requires inspecting some (potentially all) items.
-												Checking in some text.  Most of this is simply an Appendix repeating
what David Ascher said in 1998 (before the starship bites the dust
again).

											
										
										
											2000-12-04 15:32:13 -05:00
 								Previous Work
 								    Rich Comparisons have been proposed before; in particular by David
 								    Ascher, after experience with Numerical Python:
 								      http://starship.python.net/crew/da/proposals/richcmp.html
-												This is now ready for discussion.  David, please read -- did I forget
anything?

											
										
										
											2000-12-06 12:41:38 -05:00
+								    It is also included below as an Appendix.  Most of the material in
 								    this PEP is derived from David's proposal.
-												Checking in some text.  Most of this is simply an Appendix repeating
what David Ascher said in 1998 (before the starship bites the dust
again).

											
										
										
											2000-12-04 15:32:13 -05:00
 								Concerns
-												This is now ready for discussion.  David, please read -- did I forget
anything?

											
										
										
											2000-12-06 12:41:38 -05:00
+Backwards compatibility, both at the Python level (classes using
-												Checking in some text.  Most of this is simply an Appendix repeating
what David Ascher said in 1998 (before the starship bites the dust
again).

											
										
										
											2000-12-04 15:32:13 -05:00
+								      __cmp__ need not be changed) and at the C level (extensions
-												This is now ready for discussion.  David, please read -- did I forget
anything?

											
										
										
											2000-12-06 12:41:38 -05:00
+								      defining tp_compare need not be changed, code using
 								      PyObject_Compare() must work even if the compared objects use
 								      the new rich comparison scheme).
-												Checking in some text.  Most of this is simply an Appendix repeating
what David Ascher said in 1998 (before the starship bites the dust
again).

											
										
										
											2000-12-04 15:32:13 -05:00
-												This is now ready for discussion.  David, please read -- did I forget
anything?

											
										
										
											2000-12-06 12:41:38 -05:00
+When A<B returns a matrix of elementwise comparisons, an easy
-												Checking in some text.  Most of this is simply an Appendix repeating
what David Ascher said in 1998 (before the starship bites the dust
again).

											
										
										
											2000-12-04 15:32:13 -05:00
+								      mistake to make is to use this expression in a Boolean context.
 								      Without special precautions, it would always be true.  This use
 								      should raise an exception instead.
-												This is now ready for discussion.  David, please read -- did I forget
anything?

											
										
										
											2000-12-06 12:41:38 -05:00
+If a class overrides x==y but nothing else, should x!=y be
 								      computed as not(x==y), or fail?  What about the similar
 								      relationship between < and >=, or between > and <=?
-												Checking in some text.  Most of this is simply an Appendix repeating
what David Ascher said in 1998 (before the starship bites the dust
again).

											
										
										
											2000-12-04 15:32:13 -05:00
-												This is now ready for discussion.  David, please read -- did I forget
anything?

											
										
										
											2000-12-06 12:41:38 -05:00
+Similarly, should we allow x<y to be calculated from y>x?  And
 								      x<=y from not(x>y)?  And x==y from y==x, or x!=y from y!=x?
-												Checking in some text.  Most of this is simply an Appendix repeating
what David Ascher said in 1998 (before the starship bites the dust
again).

											
										
										
											2000-12-04 15:32:13 -05:00
-												This is now ready for discussion.  David, please read -- did I forget
anything?

											
										
										
											2000-12-06 12:41:38 -05:00
+When comparison operators return elementwise comparisons, what
 								      to do about shortcut operators like A<B<C, ``A<B and C<D'',
 								      ``A<B or C<D''?
-												Checking in some text.  Most of this is simply an Appendix repeating
what David Ascher said in 1998 (before the starship bites the dust
again).

											
										
										
											2000-12-04 15:32:13 -05:00
-												This is now ready for discussion.  David, please read -- did I forget
anything?

											
										
										
											2000-12-06 12:41:38 -05:00
+What to do about min() and max(), the 'in' and 'not in'
 								      operators, list.sort(), dictionary key comparison, and other
 								      uses of comparisons by built-in operations?
-												Checking in some text.  Most of this is simply an Appendix repeating
what David Ascher said in 1998 (before the starship bites the dust
again).

											
										
										
											2000-12-04 15:32:13 -05:00
-												This is now ready for discussion.  David, please read -- did I forget
anything?

											
										
										
											2000-12-06 12:41:38 -05:00
+								Proposed Resolutions
-												Checking in some text.  Most of this is simply an Appendix repeating
what David Ascher said in 1998 (before the starship bites the dust
again).

											
										
										
											2000-12-04 15:32:13 -05:00
-												This is now ready for discussion.  David, please read -- did I forget
anything?

											
										
										
											2000-12-06 12:41:38 -05:00
+Full backwards compatibility can be achieved as follows.  When
-												Finalized this PEP.  Closed all unresolved issues.

											
										
										
											2001-01-19 17:29:19 -05:00
+								      an object defines tp_compare() but not tp_richcompare(), and a
 								      rich comparison is requested, the outcome of tp_compare() is
 								      used in the ovious way.  E.g. if "<" is requested, an exception if
-												This is now ready for discussion.  David, please read -- did I forget
anything?

											
										
										
											2000-12-06 12:41:38 -05:00
+								      tp_compare() raises an exception, the outcome is 1 if
 								      tp_compare() is negative, and 0 if it is zero or positive.  Etc.
-												Finalized this PEP.  Closed all unresolved issues.

											
										
										
											2001-01-19 17:29:19 -05:00
+								      Full forward compatibility can be achieved as follows.  When a
 								      classic comparison is requested on an object that implements
 								      tp_richcompare(), up to three comparisons are used: first == is
 								      tried, and if it returns true, 0 is returned; next, < is tried
 								      and if it returns true, -1 is returned; next, > is tried and if
 								      it returns true, +1 is returned.  If any operator tried returns
 								      a non-Boolean value (see below), the exception raised by
 								      conversion to Boolean is passed through.  If none of the
 								      operators tried returns true, the classic comparison fallbacks
 								      are tried next.
 								      (I thought long and hard about the order in which the three
 								      comparisons should be tried.  At one point I had a convincing
 								      argument for doing it in this order, based on the behavior of
 								      comparisons for cyclical data structures.  But since that code
 								      has changed again, I'm not so sure that it makes a difference
 								      any more.)
-												This is now ready for discussion.  David, please read -- did I forget
anything?

											
										
										
											2000-12-06 12:41:38 -05:00
 Any type that returns a collection of Booleans instead of a
 								      single boolean should define nb_nonzero() to raise an exception.
 								      Such a type is considered a non-Boolean.
 The == and != operators are not assumed to be each other's
 								      complement (e.g. IEEE 754 floating point numbers do not satisfy
 								      this).  It is up to the type to implement this if desired.
 								      Similar for < and >=, or > and <=; there are lots of examples
 								      where these assumptions aren't true (e.g. tabnanny).
 The reflexivity rules *are* assumed by Python.  Thus, the
 								      interpreter may swap y>x with x<y, y>=x with x<=y, and may swap
 								      the arguments of x==y and x!=y.  (Note: Python currently assumes
 								      that x==x is always true and x!=x is never true; this should not
 								      be assumed.)
 In the current proposal, when A<B returns an array of
 								      elementwise comparisons, this outcome is considered non-Boolean,
 								      and its interpretation as Boolean by the shortcut operators
 								      raises an exception.  David Ascher's proposal tries to deal
 								      with this; I don't think this is worth the additional complexity
 								      in the code generator.  Instead of A<B<C, you can write
-												Fix typo in the alternate way to spell A<B<C, found by
geepokey@yahoo.com (Gumby).

											
										
										
											2001-02-05 10:56:04 -05:00
+								      (A<B)&(B<C).
-												This is now ready for discussion.  David, please read -- did I forget
anything?

											
										
										
											2000-12-06 12:41:38 -05:00
-												Finalized this PEP.  Closed all unresolved issues.

											
										
										
											2001-01-19 17:29:19 -05:00
+The min() and list.sort() operations will only use the
 								      < operator; max() will only use the > operator.  The 'in' and
 								      'not in' operators and dictionary lookup will only use the ==
 								      operator.
-												This is now ready for discussion.  David, please read -- did I forget
anything?

											
										
										
											2000-12-06 12:41:38 -05:00
 								Implementation Proposal
 								    This closely follows David Ascher's proposal.
 								    C API
-												Finalized this PEP.  Closed all unresolved issues.

											
										
										
											2001-01-19 17:29:19 -05:00
+								    - New functions:
-												This is now ready for discussion.  David, please read -- did I forget
anything?

											
										
										
											2000-12-06 12:41:38 -05:00
-												Repaired a few places where the PEP was incorrect or out of touch with
reality.  Specifically, the third argument to PyObject_RichCompare* is an int,
not an enum.  Also, when the tp_richcompare function cannot compare the
combination of objects, it needs to return Py_NotImplemented, not
PyExc_NotImplemented (and a new reference to that object at that).

											
										
										
											2004-08-05 19:16:25 -04:00
+								      PyObject *PyObject_RichCompare(PyObject *, PyObject *, int)
-												This is now ready for discussion.  David, please read -- did I forget
anything?

											
										
										
											2000-12-06 12:41:38 -05:00
 								      This performs the requested rich comparison, returning a Python
 								      object or raising an exception.  The 3rd argument must be one of
-												Finalized this PEP.  Closed all unresolved issues.

											
										
										
											2001-01-19 17:29:19 -05:00
+								      Py_LT, Py_LE, Py_EQ, Py_NE, Py_GT or Py_GE.
-												Repaired a few places where the PEP was incorrect or out of touch with
reality.  Specifically, the third argument to PyObject_RichCompare* is an int,
not an enum.  Also, when the tp_richcompare function cannot compare the
combination of objects, it needs to return Py_NotImplemented, not
PyExc_NotImplemented (and a new reference to that object at that).

											
										
										
											2004-08-05 19:16:25 -04:00
+								      int PyObject_RichCompareBool(PyObject *, PyObject *, int)
-												Finalized this PEP.  Closed all unresolved issues.

											
										
										
											2001-01-19 17:29:19 -05:00
 								      This performs the requested rich comparison, returning a
 								      Boolean: -1 for exception, 0 for false, 1 for true.  The 3rd
 								      argument must be one of Py_LT, Py_LE, Py_EQ, Py_NE, Py_GT or
 								      Py_GE.  Note that when PyObject_RichCompare() returns a
 								      non-Boolean object, PyObject_RichCompareBool() will raise an
 								      exception.
-												This is now ready for discussion.  David, please read -- did I forget
anything?

											
										
										
											2000-12-06 12:41:38 -05:00
 								    - New typedef:
 								      typedef PyObject *(*richcmpfunc) (PyObject *, PyObject *, int);
 								    - New slot in type object, replacing spare tp_xxx7:
 								      richcmpfunc tp_richcompare;
-												Finalized this PEP.  Closed all unresolved issues.

											
										
										
											2001-01-19 17:29:19 -05:00
+								      This should be a function with the same signature as
 								      PyObject_RichCompare(), and performing the same comparison.
 								      At least one of the arguments is of the type whose
 								      tp_richcompare slot is being used, but the other may have a
 								      different type.  If the function cannot compare the particular
-												Repaired a few places where the PEP was incorrect or out of touch with
reality.  Specifically, the third argument to PyObject_RichCompare* is an int,
not an enum.  Also, when the tp_richcompare function cannot compare the
combination of objects, it needs to return Py_NotImplemented, not
PyExc_NotImplemented (and a new reference to that object at that).

											
										
										
											2004-08-05 19:16:25 -04:00
+								      combination of objects, it should return a new reference to
 								      Py_NotImplemented.
-												Finalized this PEP.  Closed all unresolved issues.

											
										
										
											2001-01-19 17:29:19 -05:00
-												This is now ready for discussion.  David, please read -- did I forget
anything?

											
										
										
											2000-12-06 12:41:38 -05:00
+								    - PyObject_Compare() is changed to try rich comparisons if they
 								      are defined (but only if classic comparisons aren't defined).
 								    Changes to the interpreter
 								    - Whenever PyObject_Compare() is called with the intent of getting
 								      the outcome of a particular comparison (e.g. in list.sort(), and
 								      of course for the comparison operators in ceval.c), the code is
-												Finalized this PEP.  Closed all unresolved issues.

											
										
										
											2001-01-19 17:29:19 -05:00
+								      changed to call PyObject_RichCompare() or
 								      PyObject_RichCompareBool() instead; if the C code needs to know
 								      the outcome of the comparison, PyObject_IsTrue() is called on
 								      the result (which may raise an exception).
-												This is now ready for discussion.  David, please read -- did I forget
anything?

											
										
										
											2000-12-06 12:41:38 -05:00
-												Finalized this PEP.  Closed all unresolved issues.

											
										
										
											2001-01-19 17:29:19 -05:00
+								    - Most built-in types that currently define a comparison will be
-												This is now ready for discussion.  David, please read -- did I forget
anything?

											
										
										
											2000-12-06 12:41:38 -05:00
+								      modified to define a rich comparison instead.  (This is
-												Finalized this PEP.  Closed all unresolved issues.

											
										
										
											2001-01-19 17:29:19 -05:00
+								      optional; I've converted lists, tuples, complex numbers, and
 								      arrays so far, and am not sure whether I will convert others.)
-												This is now ready for discussion.  David, please read -- did I forget
anything?

											
										
										
											2000-12-06 12:41:38 -05:00
 								    Classes
-												Finalized this PEP.  Closed all unresolved issues.

											
										
										
											2001-01-19 17:29:19 -05:00
+								    - Classes can define new special methods __lt__, __le__, __eq__,
 								      __ne__,__gt__, __ge__ to override the corresponding operators.
 								      (I.e., <, <=, ==, !=, >, >=. You gotta love the Fortran
 								      heritage.)  If a class defines __cmp__ as well, it is only used
 								      when __lt__ etc. have been tried and return NotImplemented.
-												Checking in some text.  Most of this is simply an Appendix repeating
what David Ascher said in 1998 (before the starship bites the dust
again).

											
										
										
											2000-12-04 15:32:13 -05:00
 								Copyright
 								    This document has been placed in the public domain.
 								Appendix
-												This is now ready for discussion.  David, please read -- did I forget
anything?

											
										
										
											2000-12-06 12:41:38 -05:00
+								    Here is most of David Ascher's original proposal (version 0.2.1,
 								    dated Wed Jul 22 16:49:28 1998; I've left the Contents, History
 								    and Patches sections out).  It addresses almost all concerns
 								    above.
-												Checking in some text.  Most of this is simply an Appendix repeating
what David Ascher said in 1998 (before the starship bites the dust
again).

											
										
										
											2000-12-04 15:32:13 -05:00
 								Abstract
 								    A new mechanism allowing comparisons of Python objects to return
 								    values other than -1, 0, or 1 (or raise exceptions) is
 								    proposed. This mechanism is entirely backwards compatible, and can
 								    be controlled at the level of the C PyObject type or of the Python
 								    class definition. There are three cooperating parts to the
 								    proposed mechanism:
 								    - the use of the last slot in the type object structure to store a
 								    pointer to a rich comparison function
 								    - the addition of special methods for classes
 								    - the addition of an optional argument to the builtin cmp()
 								    function.
 								Motivation
 								    The current comparison protocol for Python objects assumes that
 								    any two Python objects can be compared (as of Python 1.5, object
 								    comparisons can raise exceptions), and that the return value for
 								    any comparison should be -1, 0 or 1. -1 indicates that the first
 								    argument to the comparison function is less than the right one, +1
 								    indicating the contrapositive, and 0 indicating that the two
 								    objects are equal. While this mechanism allows the establishment
 								    of a order relationship (e.g. for use by the sort() method of list
 								    objects), it has proven to be limited in the context of Numeric
 								    Python (NumPy).
 								    Specifically, NumPy allows the creation of multidimensional
 								    arrays, which support most of the numeric operators. Thus:
 								             x = array((1,2,3,4))        y = array((2,2,4,4))
 								    are two NumPy arrays. While they can be added elementwise,:
 								             z = x + y   # z == array((3,4,7,8))
 								    they cannot be compared in the current framework - the released
 								    version of NumPy compares the pointers, (thus yielding junk
 								    information) which was the only solution before the recent
 								    addition of the ability (in 1.5) to raise exceptions in comparison
 								    functions.
 								    Even with the ability to raise exceptions, the current protocol
 								    makes array comparisons useless. To deal with this fact, NumPy
 								    includes several functions which perform the comparisons: less(),
 								    less_equal(), greater(), greater_equal(), equal(),
 								    not_equal(). These functions return arrays with the same shape as
 								    their arguments (modulo broadcasting), filled with 0's and 1's
 								    depending on whether the comparison is true or not for each
 								    element pair. Thus, for example, using the arrays x and y defined
 								    above:
 								             less(x,y)
 								    would be an array containing the numbers (1,0,0,0).
 								    The current proposal is to modify the Python object interface to
 								    allow the NumPy package to make it so that x < y returns the same
 								    thing as less(x,y). The exact return value is up to the NumPy
 								    package -- what this proposal really asks for is changing the
 								    Python core so that extension objects have the ability to return
 								    something other than -1, 0, 1, should their authors choose to do
 								    so.
 								Current State of Affairs
 								    The current protocol is, at the C level, that each object type
 								    defines a tp_compare slot, which is a pointer to a function which
 								    takes two PyObject* references and returns -1, 0, or 1. This
 								    function is called by the PyObject_Compare() function defined in
 								    the C API. PyObject_Compare() is also called by the builtin
 								    function cmp() which takes two arguments.
-												This is now ready for discussion.  David, please read -- did I forget
anything?

											
										
										
											2000-12-06 12:41:38 -05:00
+								Proposed Mechanism
-												Checking in some text.  Most of this is simply an Appendix repeating
what David Ascher said in 1998 (before the starship bites the dust
again).

											
										
										
											2000-12-04 15:32:13 -05:00
+. Changes to the C structure for type objects
-												This is now ready for discussion.  David, please read -- did I forget
anything?

											
										
										
											2000-12-06 12:41:38 -05:00
+								    The last available slot in the PyTypeObject, reserved up to now
-												Checking in some text.  Most of this is simply an Appendix repeating
what David Ascher said in 1998 (before the starship bites the dust
again).

											
										
										
											2000-12-04 15:32:13 -05:00
+								    for future expansion, is used to optionally store a pointer to a
 								    new comparison function, of type richcmpfunc defined by:
 								           typedef PyObject *(*richcmpfunc)
 								                Py_PROTO((PyObject *, PyObject *, int));
 								    This function takes three arguments. The first two are the objects
 								    to be compared, and the third is an integer corresponding to an
 								    opcode (one of LT, LE, EQ, NE, GT, GE). If this slot is left NULL,
 								    then rich comparison for that object type is not supported (except
 								    for class instances whose class provide the special methods
 								    described below).
 								    The above opcodes need to be added to the published Python/C API
 								    (probably under the names Py_LT, Py_LE, etc.)
 . Additions of special methods for classes
 								    Classes wishing to support the rich comparison mechanisms must add
 								    one or more of the following new special methods:
 								             def __lt__(self, other):
 								                ...
 								             def __le__(self, other):
 								                ...
 								             def __gt__(self, other):
 								                ...
 								             def __ge__(self, other):
 								                ...
 								             def __eq__(self, other):
 								                ...
 								             def __ne__(self, other):
 								                ...
 								    Each of these is called when the class instance is the on the
 								    left-hand-side of the corresponding operators (<, <=, >, >=, ==,
 								    and != or <>). The argument other is set to the object on the
 								    right side of the operator. The return value of these methods is
 								    up to the class implementor (after all, that's the entire point of
 								    the proposal).
 								    If the object on the left side of the operator does not define an
 								    appropriate rich comparison operator (either at the C level or
 								    with one of the special methods, then the comparison is reversed,
 								    and the right hand operator is called with the opposite operator,
 								    and the two objects are swapped. This assumes that a < b and b > a
 								    are equivalent, as are a <= b and b >= a, and that == and != are
 								    commutative (e.g. a == b if and only if b == a).
 								    For example, if obj1 is an object which supports the rich
 								    comparison protocol and x and y are objects which do not support
 								    the rich comparison protocol, then obj1 < x will call the __lt__
 								    method of obj1 with x as the second argument. x < obj1 will call
 								    obj1's __gt__ method with x as a second argument, and x < y will
 								    just use the existing (non-rich) comparison mechanism.
 								    The above mechanism is such that classes can get away with not
 								    implementing either __lt__ and __le__ or __gt__ and
 								    __ge__. Further smarts could have been added to the comparison
 								    mechanism, but this limited set of allowed "swaps" was chosen
 								    because it doesn't require the infrastructure to do any processing
 								    (negation) of return values. The choice of six special methods was
 								    made over a single (e.g. __richcmp__) method to allow the
 								    dispatching on the opcode to be performed at the level of the C
 								    implementation rather than the user-defined method.
 . Addition of an optional argument to the builtin cmp()
 								    The builtin cmp() is still used for simple comparisons. For rich
 								    comparisons, it is called with a third argument, one of "<", "<=",
 								    ">", ">=", "==", "!=", "<>" (the last two have the same
 								    meaning). When called with one of these strings as the third
 								    argument, cmp() can return any Python object. Otherwise, it can
 								    only return -1, 0 or 1 as before.
 								Chained Comparisons
 								    Problem
 								    It would be nice to allow objects for which the comparison returns
 								    something other than -1, 0, or 1 to be used in chained
 								    comparisons, such as:
 								             x < y < z
 								    Currently, this is interpreted by Python as:
 								             temp1 = x < y
 								             if temp1:
 								               return y < z
 								             else:
 								               return temp1
 								    Note that this requires testing the truth value of the result of
 								    comparisons, with potential "shortcutting" of the right-side
 								    comparison testings. In other words, the truth-value of the result
 								    of the result of the comparison determines the result of a chained
 								    operation. This is problematic in the case of arrays, since if x,
 								    y and z are three arrays, then the user expects:
 								            x < y < z
 								    to be an array of 0's and 1's where 1's are in the locations
 								    corresponding to the elements of y which are between the
 								    corresponding elements in x and z. In other words, the right-hand
 								    side must be evaluated regardless of the result of x < y, which is
 								    incompatible with the mechanism currently in use by the parser.
 								    Solution
 								    Guido mentioned that one possible way out would be to change the
 								    code generated by chained comparisons to allow arrays to be
 								    chained-compared intelligently. What follows is a mixture of his
 								    idea and my suggestions. The code generated for x < y < z would be
 								    equivalent to:
 								             temp1 = x < y
 								             if temp1:
 								               temp2 = y < z
 								               return boolean_combine(temp1, temp2)
 								             else:
 								               return temp1
 								    where boolean_combine is a new function which does something like
 								    the following:
 								             def boolean_combine(a, b):
 								                 if hasattr(a, '__boolean_and__') or \
 								                    hasattr(b, '__boolean_and__'):
 								                     try:
 								                         return a.__boolean_and__(b)
 								                     except:
 								                         return b.__boolean_and__(a)
 								                 else: # standard behavior
 								                     if a:
 								                         return b
 								                     else:
 								                         return 0
 								    where the __boolean_and__ special method is implemented for
 								    C-level types by another value of the third argument to the
 								    richcmp function. This method would perform a boolean comparison
 								    of the arrays (currently implemented in the umath module as the
 								    logical_and ufunc).
 								    Thus, objects returned by rich comparisons should always test
 								    true, but should define another special method which creates
 								    boolean combinations of them and their argument.
 								    This solution has the advantage of allowing chained comparisons to
 								    work for arrays, but the disadvantage that it requires comparison
 								    arrays to always return true (in an ideal world, I'd have them
 								    always raise an exception on truth testing, since the meaning of
 								    testing "if a>b:" is massively ambiguous.
 								    The inlining already present which deals with integer comparisons
 								    would still apply, resulting in no performance cost for the most
 								    common cases.
-												New PEPs, all initially owned by David Ascher.

											
										
										
											2000-07-15 19:28:27 -04:00
 								Local Variables:
 								mode: indented-text
 								indent-tabs-mode: nil
 								End: