PEP: 208
Title: Reworking the Coercion Model
Version: $Revision$
Author: nas@arctrix.com (Neil Schemenauer), mal@lemburg.com (Marc-André Lemburg)
Status: Final
Type: Standards Track
Created: 04-Dec-2000
Post-History:
Python-Version: 2.1


Abstract

Many Python types implement numeric operations. When the arguments of
a numeric operation are of different types, the interpreter tries to
coerce the arguments into a common type. The numeric operation is
then performed using this common type. This PEP proposes a new type
flag to indicate that arguments to a type's numeric operations should
not be coerced. Operations that do not support the supplied types
indicate it by returning a new singleton object. Types which do not
set the type flag are handled in a backwards compatible manner.
Allowing operations to handle different types is often simpler, more
flexible, and faster than having the interpreter do coercion.


Rationale

When implementing numeric or other related operations, it is often
desirable to support not only operations between operands of a single
type, e.g. integer + integer, but also to generalize the idea behind
the operation to other type combinations, e.g. integer + float.

A common approach to this mixed type situation is to provide a method
of "lifting" the operands to a common type (coercion) and then to use
that type's operation method as the execution mechanism. This strategy
has a few drawbacks:

* the "lifting" process creates at least one new (temporary)
  operand object,

* since the coercion method is not told about the operation that is
  to follow, it is not possible to implement operation specific
  coercion of types,

* there is no elegant way to solve situations where a common type
  is not at hand, and

* the coercion method will always have to be called prior to the
  operation's method itself.

A fix for this situation is needed, since these drawbacks make
implementations of types needing these features very cumbersome,
if not impossible. As an example, consider the DateTime and
DateTimeDelta[1] types, the first being absolute, the second
relative. You can always add a relative value to an absolute one,
giving a new absolute value. Yet there is no common type which the
existing coercion mechanism could use to implement that operation.
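
The absolute/relative case can be illustrated at the Python level,
where operations already receive uncoerced operands. The following is
only a sketch: Moment and Delta are hypothetical stand-ins and do not
reflect the actual mxDateTime API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delta:
    """Relative value: a span of seconds (hypothetical stand-in)."""
    seconds: float

    def __add__(self, other):
        if isinstance(other, Delta):
            return Delta(self.seconds + other.seconds)
        # Unknown combination: decline so the other operand may try.
        return NotImplemented


@dataclass(frozen=True)
class Moment:
    """Absolute value: seconds since an epoch (hypothetical stand-in)."""
    epoch_seconds: float

    def __add__(self, other):
        # absolute + relative -> absolute; no common type is involved.
        if isinstance(other, Delta):
            return Moment(self.epoch_seconds + other.seconds)
        return NotImplemented

    # relative + absolute ends up here after Delta.__add__ declines.
    __radd__ = __add__
```

With these definitions Moment(100.0) + Delta(5.0) and
Delta(5.0) + Moment(100.0) both give Moment(105.0), while adding two
Moment values raises a TypeError because neither side accepts the
combination.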

Currently, PyInstance types are treated specially by the interpreter
in that their numeric methods are passed arguments of different types.
Removing this special case simplifies the interpreter and allows other
types to implement numeric methods that behave like instance types.
This is especially useful for extension types like ExtensionClass.


Specification

Instead of using a central coercion method, the process of handling
different operand types is simply left to the operation. If the
operation finds that it cannot handle the given operand type
combination, it may return a special singleton as an indicator.

Note that "numbers" (anything that implements the number protocol, or
part of it) written in Python already use the first part of this
strategy - it is the C level API that we focus on here.

To maintain nearly 100% backward compatibility we have to be very
careful to make numbers that don't know anything about the new
strategy (old style numbers) work just as well as those that expect
the new scheme (new style numbers). Furthermore, binary compatibility
is a must, meaning that the interpreter may only access and use new
style operations if the number indicates the availability of these.

A new style number is considered by the interpreter as such if and
only if it sets the type flag Py_TPFLAGS_CHECKTYPES. The main
difference between an old style number and a new style one is that the
numeric slot functions can no longer assume that they are passed
arguments of identical type. New style slots must check all arguments
for proper type and implement the necessary conversions themselves.
This may seem to cause more work on the part of the type implementor,
but is in fact no more difficult than writing the same kind of
routines for an old style coercion slot.

If a new style slot finds that it cannot handle the passed argument
type combination, it may return a new reference to the special
singleton Py_NotImplemented to the caller. This will cause the caller
to try the other operands' operation slots until it finds a slot that
does implement the operation for the specific type combination. If
none of the possible slots succeed, it raises a TypeError.

To make the implementation easy to understand (the whole topic is
esoteric enough), a new layer in the handling of numeric operations is
introduced. This layer takes care of all the different cases that need
to be taken into account when dealing with all the possible
combinations of old and new style numbers. It is implemented by the
two static functions binary_op() and ternary_op(), which are both
internal functions that only the functions in Objects/abstract.c
have access to. The numeric API (PyNumber_*) is easy to adapt to
this new layer.

As a side-effect all numeric slots can be NULL-checked (this has to be
done anyway, so the added feature comes at no extra cost).

The scheme used by the layer to execute a binary operation is as
follows:

    v        | w          | Action taken
    ---------+------------+----------------------------------
    new      | new        | v.op(v,w), w.op(v,w)
    new      | old        | v.op(v,w), coerce(v,w), v.op(v,w)
    old      | new        | w.op(v,w), coerce(v,w), v.op(v,w)
    old      | old        | coerce(v,w), v.op(v,w)

The indicated action sequence is executed from left to right until
either the operation succeeds and a valid result (!=
Py_NotImplemented) is returned or an exception is raised. Exceptions
are returned to the calling function as-is. If a slot returns
Py_NotImplemented, the next item in the sequence is executed.

Note that coerce(v,w) will use the old style nb_coerce slot methods
via a call to PyNumber_Coerce().
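
The binary scheme can be modelled in a few lines of Python. This is a
sketch under stated assumptions, not the code in Objects/abstract.c:
NumType, Num, NOT_IMPLEMENTED and the three example types are
hypothetical, and coerce() only imitates the part of PyNumber_Coerce()
needed here (operands of identical type are returned unchanged,
otherwise each operand's nb_coerce is asked in turn).

```python
NOT_IMPLEMENTED = object()  # stands in for the Py_NotImplemented singleton


class NumType:
    """Hypothetical model of a type object with a single binary slot."""

    def __init__(self, name, new_style, op, coerce=None):
        self.name = name
        self.new_style = new_style  # models Py_TPFLAGS_CHECKTYPES
        self.op = op                # models a slot such as nb_add
        self.coerce = coerce        # models nb_coerce (old style only)


class Num:
    def __init__(self, type_, value):
        self.type = type_
        self.value = value


def coerce(v, w):
    """Simplified model of PyNumber_Coerce()."""
    if v.type is w.type:
        return v, w
    for t in (v.type, w.type):
        if t.coerce is not None:
            pair = t.coerce(v, w)
            if pair is not None:
                return pair
    return None


def binary_op(v, w):
    """Execute the action sequence for (v, w) from left to right."""
    # New style slots are tried first and see the operands unconverted.
    for operand in (v, w):
        if operand.type.new_style:
            result = operand.type.op(v, w)
            if result is not NOT_IMPLEMENTED:
                return result
    # Old style fallback: coerce to a common type, then call v's slot.
    if not (v.type.new_style and w.type.new_style):
        pair = coerce(v, w)
        if pair is not None:
            v2, w2 = pair
            result = v2.type.op(v2, w2)
            if result is not NOT_IMPLEMENTED:
                return result
    raise TypeError("unsupported operand types: %s, %s"
                    % (v.type.name, w.type.name))


def old_add(v, w):
    # Old style slot: may assume both operands already share its type.
    return Num(v.type, v.value + w.value)


def old_float_coerce(v, w):
    # Lifts a mix of old_int and old_float operands to old_float.
    if {v.type, w.type} <= {OLD_INT, OLD_FLOAT}:
        return (Num(OLD_FLOAT, float(v.value)),
                Num(OLD_FLOAT, float(w.value)))
    return None


def money_add(v, w):
    # New style slot: checks both operands; either may be foreign.
    for x in (v, w):
        if x.type is not MONEY and x.type is not OLD_INT:
            return NOT_IMPLEMENTED
    return Num(MONEY, v.value + w.value)


OLD_INT = NumType("old_int", False, old_add)
OLD_FLOAT = NumType("old_float", False, old_add, old_float_coerce)
MONEY = NumType("money", True, money_add)
```

In this model binary_op(Num(OLD_INT, 2), Num(MONEY, 3)) is answered by
the new style slot of the right operand, old_int with old_float goes
through coercion, and money with old_float ends in a TypeError because
neither a slot nor a coercion accepts the pair.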

Ternary operations have a few more cases to handle:

    v   | w   | z   | Action taken
    ----+-----+-----+------------------------------------
    new | new | new | v.op(v,w,z), w.op(v,w,z), z.op(v,w,z)
    new | old | new | v.op(v,w,z), z.op(v,w,z), coerce(v,w,z), v.op(v,w,z)
    old | new | new | w.op(v,w,z), z.op(v,w,z), coerce(v,w,z), v.op(v,w,z)
    old | old | new | z.op(v,w,z), coerce(v,w,z), v.op(v,w,z)
    new | new | old | v.op(v,w,z), w.op(v,w,z), coerce(v,w,z), v.op(v,w,z)
    new | old | old | v.op(v,w,z), coerce(v,w,z), v.op(v,w,z)
    old | new | old | w.op(v,w,z), coerce(v,w,z), v.op(v,w,z)
    old | old | old | coerce(v,w,z), v.op(v,w,z)

The same notes as above apply, except that coerce(v,w,z) actually
does:

    if z != Py_None:
        coerce(v,w), coerce(v,z), coerce(w,z)
    else:
        # treat z as absent variable
        coerce(v,w)

The current implementation uses this scheme already (there's only one
ternary slot: nb_pow(a,b,c)).
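
Both tables appear to follow a single rule: the slots of the new style
operands are tried in operand order, and if at least one operand is
old style, coercion followed by v's slot is appended. This reading is
an observation drawn from the tables, not something the text states,
and action_sequence() below is a hypothetical helper that merely
regenerates the table rows.

```python
def action_sequence(styles):
    """Return the ordered actions for the given operands.

    `styles` maps operand names ("v", "w" and optionally "z"), in
    operand order, to True (new style) or False (old style).
    """
    names = list(styles)
    args = ",".join(names)
    # Slots of new style operands come first, in operand order.
    actions = ["%s.op(%s)" % (name, args)
               for name in names if styles[name]]
    # Any old style operand adds the coercion fallback.
    if not all(styles.values()):
        actions += ["coerce(%s)" % args, "v.op(%s)" % args]
    return actions
```

For instance, action_sequence({"v": False, "w": True, "z": True})
yields w.op(v,w,z), z.op(v,w,z), coerce(v,w,z), v.op(v,w,z), which is
the "old | new | new" row of the ternary table.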

Note that the numeric protocol is also used for some other related
tasks, e.g. sequence concatenation. These can also benefit from the
new mechanism by implementing right-hand operations for type
combinations that would otherwise fail to work. As an example, take
string concatenation: currently you can only do string + string. With
the new mechanism, a new string-like type could implement new_type +
string and string + new_type, even though strings don't know anything
about new_type.
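
At the Python level the string case can be sketched as follows. Tagged
is a hypothetical string-like class invented for this illustration;
the example relies only on the ordinary __add__ and __radd__ methods.

```python
class Tagged:
    """Hypothetical string-like type that carries a tag."""

    def __init__(self, text, tag):
        self.text = text
        self.tag = tag

    def __add__(self, other):
        # new_type + new_type and new_type + string
        if isinstance(other, Tagged):
            return Tagged(self.text + other.text, self.tag)
        if isinstance(other, str):
            return Tagged(self.text + other, self.tag)
        return NotImplemented

    def __radd__(self, other):
        # string + new_type: reached because str cannot handle Tagged
        if isinstance(other, str):
            return Tagged(other + self.text, self.tag)
        return NotImplemented
```

Both "<" + Tagged("b", "html") and Tagged("b", "html") + ">" produce a
Tagged result, although str itself knows nothing about Tagged.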

Since comparisons also rely on coercion (every time you compare an
integer to a float, the integer is first converted to float and then
compared...), a new slot to handle numeric comparisons is needed:

    PyObject *nb_cmp(PyObject *v, PyObject *w)

This slot should compare the two objects and return an integer object
stating the result. Currently, this result integer may only be -1, 0,
or 1. If the slot cannot handle the type combination, it may return a
reference to Py_NotImplemented. [XXX Note that this slot is still
in flux since it should take into account rich comparisons
(i.e. PEP 207).]

Numeric comparisons are handled by a new numeric protocol API:

    PyObject *PyNumber_Compare(PyObject *v, PyObject *w)

This function compares the two objects as "numbers" and returns an
integer object stating the result. Currently, this result integer may
only be -1, 0, or 1. In case the operation cannot be handled by the
given objects, a TypeError is raised.
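
A Python model of the comparison protocol might look like the sketch
below. It assumes that the operands' nb_cmp slots are consulted in the
same order as in the binary table, which the text does not spell out;
number_compare() and Meters are hypothetical, and Python's
NotImplemented stands in for Py_NotImplemented.

```python
def number_compare(v, w):
    """Model of the proposed PyNumber_Compare(): returns -1, 0 or 1."""
    for operand in (v, w):
        # Types without the slot are simply skipped.
        slot = getattr(type(operand), "nb_cmp", None)
        if slot is None:
            continue
        # The slot always receives the operands in their original order.
        result = slot(v, w)
        if result is not NotImplemented:
            return result
    raise TypeError("cannot compare %s with %s"
                    % (type(v).__name__, type(w).__name__))


class Meters:
    """Hypothetical numeric type that compares with plain numbers."""

    def __init__(self, value):
        self.value = value

    @staticmethod
    def nb_cmp(v, w):
        # Either argument may be the foreign one, so unwrap both.
        a, b = (x.value if isinstance(x, Meters) else x for x in (v, w))
        if not all(isinstance(x, (int, float)) for x in (a, b)):
            return NotImplemented
        return (a > b) - (a < b)
```

Here number_compare(Meters(2), 3) returns -1 and
number_compare(3, Meters(2)) returns 1, while comparing Meters(1) with
a string raises a TypeError.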

The PyObject_Compare() API needs to be adjusted accordingly to make
use of this new API.

Other changes include adapting some of the built-in functions (e.g.
cmp()) to use this API as well. Also, PyNumber_CoerceEx() will need to
check for new style numbers before calling the nb_coerce slot. New
style numbers don't provide a coercion slot and thus cannot be
explicitly coerced.


Reference Implementation

A preliminary patch for the CVS version of Python is available through
the SourceForge patch manager[2].


Credits

This PEP and the patch are heavily based on work done by Marc-André
Lemburg[3].


Copyright

This document has been placed in the public domain.


References

[1] http://www.lemburg.com/files/python/mxDateTime.html
[2] http://sourceforge.net/patch/?func=detailpatch&patch_id=102652&group_id=5470
[3] http://www.lemburg.com/files/python/CoercionProposal.html



Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: