This is now ready for discussion. David, please read -- did I forget
anything?
Guido van Rossum 2000-12-06 17:41:38 +00:00
parent 3a679ad2aa
commit 2d42675b10
1 changed files with 168 additions and 24 deletions


@@ -22,11 +22,11 @@ Motivation
The main motivation comes from NumPy, whose users agree that A<B
should return an array of elementwise comparison outcomes; they
currently have to spell this as less(A,B) because A<B can only
return a Boolean result or an exception.
return a Boolean result or raise an exception.
An additional motivation is that frequently, types don't have a
natural ordering, but still need to be compared for equality.
Currenlty such a type *must* implement comparison and thus assign
Currently such a type *must* implement comparison and thus define
an arbitrary ordering, just so that equality can be tested.
More motivation can be found in the proposals listed under
@@ -40,41 +40,181 @@ Previous Work
http://starship.python.net/crew/da/proposals/richcmp.html
It is also included as an appendix. In this proposal, David also
proposes the addition of an optional 3rd argument to cmp(), as in:
cmp(a, b, "<") or cmp(a, b, "!=").
It is also included below as an Appendix. Most of the material in
this PEP is derived from David's proposal.
Concerns
- Backwards compatibility, both at the Python level (classes using
1 Backwards compatibility, both at the Python level (classes using
__cmp__ need not be changed) and at the C level (extensions
defining tp_compare need not be changed).
defining tp_compare need not be changed, code using
PyObject_Compare() must work even if the compared objects use
the new rich comparison scheme).
- When A<B returns a matrix of elementwise comparisons, an easy
2 When A<B returns a matrix of elementwise comparisons, an easy
mistake to make is to use this expression in a Boolean context.
Without special precautions, it would always be true. This use
should raise an exception instead.
- If a class overrides x==y but nothing else, should x!=y be
computed as not(x==y), or fail? (I think this is OK; David
disagrees.)
3 If a class overrides x==y but nothing else, should x!=y be
computed as not(x==y), or fail? What about the similar
relationship between < and >=, or between > and <=?
- Similarly, should we allow x<y to be calculated from y>x? (I
think this is OK; David agrees.)
4 Similarly, should we allow x<y to be calculated from y>x? And
x<=y from not(x>y)? And x==y from y==x, or x!=y from y!=x?
- Similarly, should we allow x<=y to be calculated from not(x>y)?
(I think this is *not* OK; neither does David.)
5 When comparison operators return elementwise comparisons, what
to do about shortcut operators like A<B<C, ``A<B and C<D'',
``A<B or C<D''?
- When using comparisons to generate elementwise comparisons, what
to do about shortcut operators like A<B<C or ``A<B and C<D''?
(David proposes a solution for A<B<C, but it means that ``if
A<B:...'' will assume ``if true:...''.
6 What to do about min() and max(), the 'in' and 'not in'
operators, list.sort(), dictionary key comparison, and other
uses of comparisons by built-in operations?
Solution
Proposed Resolutions
To be done.
1 Full backwards compatibility can be achieved as follows. When
an object defines tp_compare() but not tp_richcompare(), and a rich
comparison is requested, the outcome of tp_compare() is used in
the obvious way. E.g. if "<" is requested, an exception is raised
if tp_compare() raises an exception; otherwise the outcome is 1 if
tp_compare() is negative, and 0 if it is zero or positive. Etc.
Full forward compatibility can be achieved as follows. (This is
a bit arbitrary.) When a classic comparison is requested on an
object that only implements tp_richcompare(), up to three
comparisons are used: first == is tried, and if it returns true,
0 is returned; next, < is tried and if it returns true, -1 is
returned; next, > is tried and if it returns true, +1 is
returned. Finally, a TypeError("incomparable objects") exception
is raised. If any operator tried returns a non-Boolean value
(see below), the exception is passed through.
2 Any type that returns a collection of Booleans instead of a
single Boolean should define nb_nonzero() to raise an exception.
Such a type is considered a non-Boolean.
3 The == and != operators are not assumed to be each other's
complement (e.g. IEEE 754 floating point numbers do not satisfy
this). It is up to the type to implement this if desired.
Similar for < and >=, or > and <=; there are lots of examples
where these assumptions aren't true (e.g. tabnanny).
4 The reflexivity rules *are* assumed by Python. Thus, the
interpreter may swap y>x with x<y, y>=x with x<=y, and may swap
the arguments of x==y and x!=y. (Note: Python currently assumes
that x==x is always true and x!=x is never true; this should not
be assumed.)
5 In the current proposal, when A<B returns an array of
elementwise comparisons, this outcome is considered non-Boolean,
and its interpretation as Boolean by the shortcut operators
raises an exception. David Ascher's proposal tries to deal
with this; I don't think this is worth the additional complexity
in the code generator. Instead of A<B<C, you can write
(A<B)&(B<C).
6 The min(), max() and list.sort() operations will only use the
< operator. The 'in' and 'not in' operators and dictionary
lookup will only use the == operator.
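The two compatibility rules in resolution 1 can be sketched in
Python as follows. This is only an illustration under stated
assumptions: the helper names rich_from_classic() and
classic_from_rich() are hypothetical and are not part of the
proposal, and the code is written for a current Python 3
interpreter rather than the C level the text describes.

```python
import operator

# Hypothetical helpers for illustration; neither name appears in the proposal.
_OUTCOME_TESTS = {
    "<": lambda c: c < 0,
    "<=": lambda c: c <= 0,
    "==": lambda c: c == 0,
    "!=": lambda c: c != 0,
    ">": lambda c: c > 0,
    ">=": lambda c: c >= 0,
}


def rich_from_classic(c, op):
    """Backwards compatibility: turn a three-way outcome c (negative,
    zero or positive, as returned by a classic comparison) into 1 or 0
    for the requested operator."""
    return 1 if _OUTCOME_TESTS[op](c) else 0


def classic_from_rich(a, b):
    """Forward compatibility: derive a three-way outcome by trying
    ==, then <, then >, in that order."""
    attempts = ((operator.eq, 0), (operator.lt, -1), (operator.gt, 1))
    for compare, outcome in attempts:
        # Taking the truth value of a non-Boolean result may raise;
        # that exception is simply passed through.
        if compare(a, b):
            return outcome
    raise TypeError("incomparable objects")
```

With these definitions, a pair for which none of the three
operators returns true (a NaN compared to any float, for example)
ends in the TypeError described above.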
Implementation Proposal
This closely follows David Ascher's proposal.
C API
- New function:
PyObject *PyObject_RichCompare(PyObject *, PyObject *, enum cmp_op)
This performs the requested rich comparison, returning a Python
object or raising an exception. The 3rd argument must be one of
LT, LE, EQ, NE, GT or GE.
- New typedef:
typedef PyObject *(*richcmpfunc) (PyObject *, PyObject *, int);
- New slot in type object, replacing spare tp_xxx7:
richcmpfunc tp_richcompare;
- PyObject_Compare() is changed to try rich comparisons if they
are defined (but only if classic comparisons aren't defined).
Changes to the interpreter
- Whenever PyObject_Compare() is called with the intent of getting
the outcome of a particular comparison (e.g. in list.sort(), and
of course for the comparison operators in ceval.c), the code is
changed to call PyObject_RichCompare() instead; if the C code
needs to know the outcome of the comparison, PyObject_IsTrue()
is called on the result (which may raise an exception).
- All built-in types that currently define a comparison will be
modified to define a rich comparison instead. (This is
optional!)
Classes
- Classes can define new special methods __lt__, __le__, __gt__,
__ge__, __eq__, __ne__ to override the corresponding operators.
(You gotta love the Fortran heritage.) If a class overrides
__cmp__ as well, it is only used by PyObject_Compare().
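As a rough sketch of how a class might use these methods for
elementwise comparison, consider the toy type below. The class
name Vec and its items attribute are hypothetical and chosen only
for illustration. The code targets a current Python 3 interpreter,
where the truth-value hook is spelled __bool__; this plays the role
of the nb_nonzero() slot mentioned in the resolutions.

```python
class Vec:
    """Hypothetical toy array type, for illustration only."""

    def __init__(self, items):
        self.items = list(items)

    def __lt__(self, other):
        # Elementwise a < b: returns a new Vec of Booleans, not one Boolean.
        return Vec(a < b for a, b in zip(self.items, other.items))

    def __and__(self, other):
        # Elementwise "and", so that (A<B)&(B<C) can stand in for A<B<C.
        return Vec(bool(a and b) for a, b in zip(self.items, other.items))

    def __bool__(self):
        # A collection of Booleans has no single truth value, so using
        # it in a Boolean context raises instead of silently being true.
        raise TypeError("elementwise result used in a Boolean context")
```

With A = Vec([1, 5, 3]), B = Vec([2, 4, 6]) and C = Vec([3, 9, 1]),
the expression (A<B)&(B<C) yields a Vec of elementwise outcomes,
while A<B<C or ``if A<B:'' raises TypeError because the shortcut
operators ask for the truth value of the intermediate result.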
Unresolved Issues
- David Ascher's proposal also introduces cmp(a, b, op) where op
is one of "<", "<=", ">", ">=", "==", "!=", which should return
the same as a <op> b. Is this necessary?
- The rules for mixed comparisons are confusing. I propose to
wait until the new coercion mechanism is in place, and use the
new coercion rules for mixed comparisons. This may occasionally
cause x<y to be replaced by y>x, if x doesn't implement
comparison to y, but y does implement comparison to x.
- With the above design, if a type or class defines both classic
and rich comparisons, classic comparisons override rich
comparisons when PyObject_Compare() is called, but rich
comparisons override when PyObject_RichCompare() is called. Is
this right? Or should rich comparisons always win except when
cmp() is called?
- Should we even bother upgrading the existing types?
- If so, how should comparisons on container types be defined?
Suppose we have a list whose items define rich comparisons. How
should the itemwise comparisons be done? For example:
def __lt__(a, b): # a<b for lists
for i in range(min(len(a), len(b))):
ai, bi = a[i], b[i]
if ai < bi: return 1
if ai == bi: continue
if ai > bi: return 0
raise TypeError, "incomparable item types"
return len(a) < len(b)
This uses the same sequence of comparisons as cmp(), so it may
as well use cmp() instead:
def __lt__(a, b): # a<b for lists
for i in range(min(len(a), len(b))):
c = cmp(a[i], b[i])
if c < 0: return 1
if c == 0: continue
if c > 0: return 0
assert 0 # unreachable
return len(a) < len(b)
And now there's not really a reason to change lists to rich
comparisons.
Copyright
@@ -84,8 +224,10 @@ Copyright
Appendix
Here, for posterity, is most of David Ascher's original proposal.
It addresses almost all concerns.
Here is most of David Ascher's original proposal (version 0.2.1,
dated Wed Jul 22 16:49:28 1998; I've left the Contents, History
and Patches sections out). It addresses almost all concerns
above.
Abstract
@@ -165,9 +307,11 @@ Current State of Affairs
the C API. PyObject_Compare() is also called by the builtin
function cmp() which takes two arguments.
Proposed Mechanism
1. Changes to the C structure for type objects
The last availabel slot in the PyTypeObject, reserved up to now
The last available slot in the PyTypeObject, reserved up to now
for future expansion, is used to optionally store a pointer to a
new comparison function, of type richcmpfunc defined by: