234 lines
9.9 KiB
Plaintext
234 lines
9.9 KiB
Plaintext
PEP: 208
|
||
Title: Reworking the Coercion Model
|
||
Version: $Revision$
|
||
Last-Modified: $Date$
|
||
Author: nas@arctrix.com (Neil Schemenauer), mal@lemburg.com (Marc-André Lemburg)
|
||
Status: Final
|
||
Type: Standards Track
|
||
Created: 04-Dec-2000
|
||
Python-Version: 2.1
|
||
Post-History:
|
||
|
||
|
||
Abstract
|
||
|
||
Many Python types implement numeric operations. When the arguments of
|
||
a numeric operation are of different types, the interpreter tries to
|
||
coerce the arguments into a common type. The numeric operation is
|
||
then performed using this common type. This PEP proposes a new type
|
||
flag to indicate that arguments to a type's numeric operations should
|
||
not be coerced. Operations that do not support the supplied types
|
||
indicate it by returning a new singleton object. Types which do not
|
||
set the type flag are handled in a backwards compatible manner.
|
||
Allowing operations handle different types is often simpler, more
|
||
flexible, and faster than having the interpreter do coercion.
|
||
|
||
|
||
Rationale
|
||
|
||
When implementing numeric or other related operations, it is often
|
||
desirable to provide not only operations between operands of one type
|
||
only, e.g. integer + integer, but to generalize the idea behind the
|
||
operation to other type combinations as well, e.g. integer + float.
|
||
|
||
A common approach to this mixed type situation is to provide a method
|
||
of "lifting" the operands to a common type (coercion) and then use
|
||
that type's operand method as execution mechanism. Yet, this strategy
|
||
has a few drawbacks:
|
||
|
||
* the "lifting" process creates at least one new (temporary)
|
||
operand object,
|
||
|
||
* since the coercion method is not being told about the operation
|
||
that is to follow, it is not possible to implement operation
|
||
specific coercion of types,
|
||
|
||
* there is no elegant way to solve situations were a common type
|
||
is not at hand, and
|
||
|
||
* the coercion method will always have to be called prior to the
|
||
operation's method itself.
|
||
|
||
A fix for this situation is obviously needed, since these drawbacks
|
||
make implementations of types needing these features very cumbersome,
|
||
if not impossible. As an example, have a look at the DateTime and
|
||
DateTimeDelta[1] types, the first being absolute, the second
|
||
relative. You can always add a relative value to an absolute one,
|
||
giving a new absolute value. Yet, there is no common type which the
|
||
existing coercion mechanism could use to implement that operation.
|
||
|
||
Currently, PyInstance types are treated specially by the interpreter
|
||
in that their numeric methods are passed arguments of different types.
|
||
Removing this special case simplifies the interpreter and allows other
|
||
types to implement numeric methods that behave like instance types.
|
||
This is especially useful for extension types like ExtensionClass.
|
||
|
||
|
||
Specification
|
||
|
||
Instead of using a central coercion method, the process of handling
|
||
different operand types is simply left to the operation. If the
|
||
operation finds that it cannot handle the given operand type
|
||
combination, it may return a special singleton as indicator.
|
||
|
||
Note that "numbers" (anything that implements the number protocol, or
|
||
part of it) written in Python already use the first part of this
|
||
strategy - it is the C level API that we focus on here.
|
||
|
||
To maintain nearly 100% backward compatibility we have to be very
|
||
careful to make numbers that don't know anything about the new
|
||
strategy (old style numbers) work just as well as those that expect
|
||
the new scheme (new style numbers). Furthermore, binary compatibility
|
||
is a must, meaning that the interpreter may only access and use new
|
||
style operations if the number indicates the availability of these.
|
||
|
||
A new style number is considered by the interpreter as such if and
|
||
only if it sets the type flag Py_TPFLAGS_CHECKTYPES. The main
|
||
difference between an old style number and a new style one is that the
|
||
numeric slot functions can no longer assume to be passed arguments of
|
||
identical type. New style slots must check all arguments for proper
|
||
type and implement the necessary conversions themselves. This may seem
|
||
to cause more work on the behalf of the type implementor, but is in
|
||
fact no more difficult than writing the same kind of routines for an
|
||
old style coercion slot.
|
||
|
||
If a new style slot finds that it cannot handle the passed argument
|
||
type combination, it may return a new reference of the special
|
||
singleton Py_NotImplemented to the caller. This will cause the caller
|
||
to try the other operands operation slots until it finds a slot that
|
||
does implement the operation for the specific type combination. If
|
||
none of the possible slots succeed, it raises a TypeError.
|
||
|
||
To make the implementation easy to understand (the whole topic is
|
||
esoteric enough), a new layer in the handling of numeric operations is
|
||
introduced. This layer takes care of all the different cases that need
|
||
to be taken into account when dealing with all the possible
|
||
combinations of old and new style numbers. It is implemented by the
|
||
two static functions binary_op() and ternary_op(), which are both
|
||
internal functions that only the functions in Objects/abstract.c
|
||
have access to. The numeric API (PyNumber_*) is easy to adapt to
|
||
this new layer.
|
||
|
||
As a side-effect all numeric slots can be NULL-checked (this has to be
|
||
done anyway, so the added feature comes at no extra cost).
|
||
|
||
|
||
The scheme used by the layer to execute a binary operation is as
|
||
follows:
|
||
|
||
v | w | Action taken
|
||
---------+------------+----------------------------------
|
||
new | new | v.op(v,w), w.op(v,w)
|
||
new | old | v.op(v,w), coerce(v,w), v.op(v,w)
|
||
old | new | w.op(v,w), coerce(v,w), v.op(v,w)
|
||
old | old | coerce(v,w), v.op(v,w)
|
||
|
||
The indicated action sequence is executed from left to right until
|
||
either the operation succeeds and a valid result (!=
|
||
Py_NotImplemented) is returned or an exception is raised. Exceptions
|
||
are returned to the calling function as-is. If a slot returns
|
||
Py_NotImplemented, the next item in the sequence is executed.
|
||
|
||
Note that coerce(v,w) will use the old style nb_coerce slot methods
|
||
via a call to PyNumber_Coerce().
|
||
|
||
Ternary operations have a few more cases to handle:
|
||
|
||
v | w | z | Action taken
|
||
----+-----+-----+------------------------------------
|
||
new | new | new | v.op(v,w,z), w.op(v,w,z), z.op(v,w,z)
|
||
new | old | new | v.op(v,w,z), z.op(v,w,z), coerce(v,w,z), v.op(v,w,z)
|
||
old | new | new | w.op(v,w,z), z.op(v,w,z), coerce(v,w,z), v.op(v,w,z)
|
||
old | old | new | z.op(v,w,z), coerce(v,w,z), v.op(v,w,z)
|
||
new | new | old | v.op(v,w,z), w.op(v,w,z), coerce(v,w,z), v.op(v,w,z)
|
||
new | old | old | v.op(v,w,z), coerce(v,w,z), v.op(v,w,z)
|
||
old | new | old | w.op(v,w,z), coerce(v,w,z), v.op(v,w,z)
|
||
old | old | old | coerce(v,w,z), v.op(v,w,z)
|
||
|
||
The same notes as above, except that coerce(v,w,z) actually does:
|
||
|
||
if z != Py_None:
|
||
coerce(v,w), coerce(v,z), coerce(w,z)
|
||
else:
|
||
# treat z as absent variable
|
||
coerce(v,w)
|
||
|
||
|
||
The current implementation uses this scheme already (there's only one
|
||
ternary slot: nb_pow(a,b,c)).
|
||
|
||
Note that the numeric protocol is also used for some other related
|
||
tasks, e.g. sequence concatenation. These can also benefit from the
|
||
new mechanism by implementing right-hand operations for type
|
||
combinations that would otherwise fail to work. As an example, take
|
||
string concatenation: currently you can only do string + string. With
|
||
the new mechanism, a new string-like type could implement new_type +
|
||
string and string + new_type, even though strings don't know anything
|
||
about new_type.
|
||
|
||
Since comparisons also rely on coercion (every time you compare an
|
||
integer to a float, the integer is first converted to float and then
|
||
compared...), a new slot to handle numeric comparisons is needed:
|
||
|
||
PyObject *nb_cmp(PyObject *v, PyObject *w)
|
||
|
||
This slot should compare the two objects and return an integer object
|
||
stating the result. Currently, this result integer may only be -1, 0,
|
||
1. If the slot cannot handle the type combination, it may return a
|
||
reference to Py_NotImplemented. [XXX Note that this slot is still
|
||
in flux since it should take into account rich comparisons
|
||
(i.e. PEP 207).]
|
||
|
||
Numeric comparisons are handled by a new numeric protocol API:
|
||
|
||
PyObject *PyNumber_Compare(PyObject *v, PyObject *w)
|
||
|
||
This function compare the two objects as "numbers" and return an
|
||
integer object stating the result. Currently, this result integer may
|
||
only be -1, 0, 1. In case the operation cannot be handled by the given
|
||
objects, a TypeError is raised.
|
||
|
||
The PyObject_Compare() API needs to adjusted accordingly to make use
|
||
of this new API.
|
||
|
||
Other changes include adapting some of the built-in functions (e.g.
|
||
cmp()) to use this API as well. Also, PyNumber_CoerceEx() will need to
|
||
check for new style numbers before calling the nb_coerce slot. New
|
||
style numbers don't provide a coercion slot and thus cannot be
|
||
explicitly coerced.
|
||
|
||
|
||
Reference Implementation
|
||
|
||
A preliminary patch for the CVS version of Python is available through
|
||
the Source Forge patch manager[2].
|
||
|
||
|
||
Credits
|
||
|
||
This PEP and the patch are heavily based on work done by Marc-André
|
||
Lemburg[3].
|
||
|
||
|
||
Copyright
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
References
|
||
|
||
[1] http://www.lemburg.com/files/python/mxDateTime.html
|
||
[2] http://sourceforge.net/patch/?func=detailpatch&patch_id=102652&group_id=5470
|
||
[3] http://www.lemburg.com/files/python/CoercionProposal.html
|
||
|
||
|
||
|
||
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
coding: utf-8
|
||
End:
|