Completely reworked to be more specific about the motivation, the
implementation and the transition.
Guido van Rossum 2001-07-26 19:29:39 +00:00
parent 32a207d6aa
commit 1862e5722d
1 changed files with 273 additions and 77 deletions


@ -11,119 +11,315 @@ Post-History: 16-Mar-2001
Abstract
Dividing integers currently returns the floor of the quantities.
This behavior is known as integer division, and is similar to what
C and FORTRAN do. This has the useful property that all
operations on integers return integers, but it does tend to put a
hump in the learning curve when new programmers are surprised that
The current division (/) operator has an ambiguous meaning for
numerical arguments: it returns the floor of the mathematical
result if the arguments are ints or longs, but it returns a
reasonable approximation of the result if the arguments are floats
or complex. This makes expressions expecting float or complex
results error-prone when integers are not expected but possible as
inputs.
1/2 == 0
We propose to fix this by introducing different operators for
different operations: x/y to return a reasonable approximation of
the mathematical result of the division ("true division"), x//y to
return the floor ("floor division"). We call the current, mixed
meaning of x/y "classic division".
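As a rough illustration of the three meanings named above, the
following sketch emulates classic division with a hypothetical
helper, classic_div(), which is not part of Python. It uses the
operator module so that the result does not depend on how the
interpreter running the sketch spells division.

```python
import operator


def classic_div(x, y):
    """Hypothetical emulation of classic division (illustration only)."""
    # Classic rule: floor the result when both arguments are integers,
    # otherwise return an approximation of the mathematical result.
    if isinstance(x, int) and isinstance(y, int):
        return operator.floordiv(x, y)
    return operator.truediv(x, y)


# Same mathematical inputs, int versus float spelling of the first argument.
classic_results = (classic_div(1, 2), classic_div(1.0, 2))
true_results = (operator.truediv(1, 2), operator.truediv(1.0, 2))
floor_results = (operator.floordiv(1, 2), operator.floordiv(1.0, 2))
```

Only the classic operator gives mathematically different answers
for 1 and 1.0; true division and floor division each agree with
themselves across the two input types.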
This proposal shows a way to change this while keeping backward
compatibility issues in mind.
Because of severe backwards compatibility issues, not to mention a
major flamewar on c.l.py, we propose the following transitional
measures (starting with Python 2.2):
NOTE: in the light of recent discussions in the newsgroup, the
motivation in this PEP (and details) need to be extended. I'll do
that ASAP.
- Classic division will remain the default in the Python 2.x
series; true division will be standard in Python 3.0.
- The // operator will be available to request floor division
unambiguously.
- The future division statement, spelled "from __future__ import
division", will change the / operator to mean true division
throughout the module.
- A command line option will enable run-time warnings for classic
division applied to int or long arguments; another command line
option will make true division the default.
- The standard library will use the future division statement and
the // operator when appropriate, so as to completely avoid
classic division.
Rationale
Motivation
The behavior of integer division is a major stumbling block found
in user testing of Python. This manages to trip up new
programmers regularly and even causes the experienced programmer
to make the occasional mistake. The workarounds, like explicitly
coercing one of the operands to float or using a non-integer
literal, are very non-intuitive and lower the readability of the
program.
The classic division operator makes it hard to write numerical
expressions that are supposed to give correct results from
arbitrary numerical inputs. For all other operators, one can
write down a formula such as x*y**2 + z, and the calculated result
will be close to the mathematical result (within the limits of
numerical accuracy, of course) for any numerical input type (int,
long, float, or complex). But division poses a problem: if the
expressions for both arguments happen to have an integral type, it
implements floor division rather than true division.
The problem is unique to dynamically typed languages: in a
statically typed language like C, the inputs, typically function
arguments, would be declared as double or float, and when a call
passes an integer argument, it is converted to double or float at
the time of the call. Python doesn't have argument type
declarations, so integer arguments can easily find their way into
an expression.
The problem is particularly pernicious since ints are perfect
substitutes for floats in all other circumstances: math.sqrt(2)
returns the same value as math.sqrt(2.0), 3.14*100 and 3.14*100.0
return the same value, and so on. Thus, the author of a numerical
routine may only use floating point numbers to test his code, and
believe that it works correctly, and a user may accidentally pass
in an integer input value and get incorrect results.
Another way to look at this is that classic division makes it
difficult to write polymorphic functions that work well with
either float or int arguments; all other operators already do the
right thing. No algorithm that works for both ints and floats has
a need for truncating division in one case and true division in
the other.
The correct work-around is subtle: casting an argument to float()
is wrong if it could be a complex number; adding 0.0 to an
argument doesn't preserve the sign of the argument if it was minus
zero. The only solution without either downside is multiplying an
argument (typically the first) by 1.0. This leaves the value and
sign unchanged for float and complex, and turns int and long into
a float with the corresponding value.
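The three candidate work-arounds can be compared directly. This is
a minimal sketch; to_inexact() is a hypothetical name for the
"multiply by 1.0" idiom, and math.copysign() is used only to make
the sign of zero visible.

```python
import math


def to_inexact(x):
    """Hypothetical helper wrapping the x*1.0 work-around."""
    return x * 1.0


minus_zero = -0.0

# Adding 0.0 turns minus zero into plus zero; multiplying by 1.0 does not.
sign_after_add = math.copysign(1.0, minus_zero + 0.0)
sign_after_mul = math.copysign(1.0, to_inexact(minus_zero))

# float() rejects complex arguments outright.
try:
    float(1 + 2j)
    float_accepts_complex = True
except TypeError:
    float_accepts_complex = False

# Multiplying by 1.0 converts ints and leaves complex values alone.
converted = (to_inexact(3), to_inexact(1 + 2j))
```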
It is the opinion of the authors that this is a real design bug in
Python, and that it should be fixed sooner rather than later.
Assuming Python usage will continue to grow, the cost of leaving
this bug in the language will eventually outweigh the cost of
fixing old code -- there is an upper bound to the amount of code
to be fixed, but the amount of code that might be affected by the
bug in the future is unbounded.
Another reason for this change is the desire to ultimately unify
Python's numeric model. This is the subject of PEP 228[0] (which
is currently incomplete). A unified numeric model removes most of
the user's need to be aware of different numerical types. This is
good for beginners, but also takes away concerns about different
numeric behavior for advanced programmers. (Of course, it won't
remove concerns about numerical stability and accuracy.)
In a unified numeric model, the different types (int, long, float,
complex, and possibly others, such as a new rational type) serve
mostly as storage optimizations, and to some extent to indicate
orthogonal properties such as inexactness or complexity. In a
unified model, the integer 1 should be indistinguishable from the
floating point number 1.0 (except for its inexactness), and both
should behave the same in all numeric contexts. Clearly, in a
unified numeric model, if a==b and c==d, a/c should equal b/d
(taking some liberties due to rounding for inexact numbers), and
since everybody agrees that 1.0/2.0 equals 0.5, 1/2 should also
equal 0.5. Likewise, since 1//2 equals zero, 1.0//2.0 should also
equal zero.
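The consistency rule stated above (if a==b and c==d, then a/c
should equal b/d) can be checked mechanically. The sketch below
assumes that / already means true division, as in Python 3 or in a
module using the future division statement; the sample values are
arbitrary.

```python
# Each entry pairs an int with an equal float: ((a, b), (c, d)).
equal_pairs = [
    ((1, 1.0), (2, 2.0)),
    ((7, 7.0), (4, 4.0)),
    ((-3, -3.0), (8, 8.0)),
]

# True division: the int and float spellings agree.
true_consistent = all(a / c == b / d for (a, b), (c, d) in equal_pairs)

# Floor division: the int and float spellings agree as well.
floor_consistent = all(a // c == b // d for (a, b), (c, d) in equal_pairs)
```

Under classic division the first check would fail for every pair
in the list, since the int spelling would be floored.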
// Operator
Variations
A `//' operator will be introduced, which will call the
nb_intdivide or __intdiv__ slots. This operator will be
implemented in all the Python numeric types, and will have the
semantics of
Esthetically, x//y doesn't please everyone, and hence several
variations have been proposed: x div y, or div(x, y), sometimes in
combination with x mod y or mod(x, y) as an alternative spelling
for x%y.
We consider these solutions inferior, on the following grounds.
- Using x div y would introduce a new keyword. Since div is a
popular identifier, this would break a fair amount of existing
code, unless the new keyword was only recognized under a future
division statement. Since it is expected that the majority of
code that needs to be converted is dividing integers, this would
greatly increase the need for the future division statement.
Even with a future statement, the general sentiment against
adding new keywords unless absolutely necessary argues against
this.
- Using div(x, y) makes the conversion of old code much harder.
Replacing x/y with x//y or x div y can be done with a simple
query replace; in most cases the programmer can easily verify
that a particular module only works with integers so all
occurrences of x/y can be replaced. (The query replace is still
needed to weed out slashes occurring in comments or string
literals.) Replacing x/y with div(x, y) would require a much
more intelligent tool, since the extent of the expressions to
the left and right of the / must be analyzed before the
placement of the "div(" and ")" part can be decided.
Alternatives
In order to reduce the amount of old code that needs to be
converted, several alternative proposals have been put forth.
Here is a brief discussion of each proposal (or category of
proposals). If you know of an alternative that was discussed on
c.l.py that isn't mentioned here, please mail the second author.
- Let / keep its classic semantics; introduce // for true
division. This doesn't solve the problem that the classic /
operator makes it hard to write polymorphic numeric functions
that accept int and float arguments, and still requires the use
of x*1.0/y whenever true division is required.
- Use a directive to use specific division semantics in a module,
rather than a future statement. This retains classic division
as a permanent wart in the language, requiring future
generations of Python programmers to be aware of the problem and
the remedies.
- Use "from __past__ import division" to use classic division
semantics in a module. This also retains the classic division
as a permanent wart, or at least for a long time (eventually the
past division statement could raise an ImportError).
- Use a directive (or some other way) to specify the Python
version for which a specific piece of code was developed. This
requires future Python interpreters to be able to emulate
*exactly* every previous version of Python, and moreover to do
so for multiple versions in the same interpreter. This is way
too much work. A much simpler solution is to keep multiple
interpreters installed.
Specification
During the transitional phase, we have to support *three* division
operators within the same program: classic division (for / in
modules without a future division statement), true division (for /
in modules with a future division statement), and floor division
(for //). Each operator comes in two flavors: regular, and as an
augmented assignment operator (/= or //=).
The names associated with these variations are:
- Overloaded operator methods:
__div__(), __floordiv__(), __truediv__();
__idiv__(), __ifloordiv__(), __itruediv__().
- Abstract API C functions:
PyNumber_Divide(), PyNumber_FloorDivide(),
PyNumber_TrueDivide();
PyNumber_InPlaceDivide(), PyNumber_InPlaceFloorDivide(),
PyNumber_InPlaceTrueDivide().
- Byte code opcodes:
BINARY_DIVIDE, BINARY_FLOOR_DIVIDE, BINARY_TRUE_DIVIDE;
INPLACE_DIVIDE, INPLACE_FLOOR_DIVIDE, INPLACE_TRUE_DIVIDE.
- PyNumberMethod slots:
nb_divide, nb_floor_divide, nb_true_divide,
nb_inplace_divide, nb_inplace_floor_divide,
nb_inplace_true_divide.
The added PyNumberMethod slots require an additional flag in
tp_flags; this flag will be named Py_TPFLAGS_HAVE_NEWDIVIDE and
will be included in Py_TPFLAGS_DEFAULT.
The true and floor division APIs will look for the corresponding
slot and call it; when that slot is NULL, they will raise an
exception. There is no fallback to the classic divide slot.
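At the Python level, the overloaded operator methods listed above
might be used as in the following sketch. Length and FloorOnly are
hypothetical classes invented for this example, and the sketch
assumes / dispatches to __truediv__() (Python 3, or a module using
the future division statement).

```python
class Length:
    """Hypothetical numeric-like class, for illustration only."""

    def __init__(self, metres):
        self.metres = metres

    def __truediv__(self, other):       # x / y under true division
        return Length(self.metres / other)

    def __floordiv__(self, other):      # x // y
        return Length(self.metres // other)

    def __ifloordiv__(self, other):     # x //= y
        self.metres //= other
        return self


class FloorOnly:
    """Hypothetical class that defines floor division only."""

    def __init__(self, value):
        self.value = value

    def __floordiv__(self, other):
        return FloorOnly(self.value // other)


half = Length(7) / 2
whole = Length(7) // 2

in_place = Length(9)
in_place //= 2

# No fallback: / does not borrow the floor division method.
try:
    FloorOnly(7) / 2
    fell_back = True
except TypeError:
    fell_back = False
```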
Command Line Option
The -D command line option takes a string argument that can take
three values: "old", "warn", or "new". The default is "old" in
Python 2.2 but will change to "warn" in later 2.x versions. The
"old" value means the classic division operator acts as described.
The "warn" value means the classic division operator issues a
warning (a DeprecationWarning using the standard warning framework)
when applied to ints or longs. The "new" value changes the
default globally so that the / operator is always interpreted as
true division. The "new" option is only intended for use in
certain educational environments, where true division is required,
but asking the students to include the future division statement
in all their code would be a problem.
This option will not be supported in Python 3.0; Python 3.0 will
always interpret / as true division.
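The effect of the three option values can be approximated in pure
Python. This is only an emulation under stated assumptions:
classic_div() and its mode argument are hypothetical and are not an
interpreter API; the sketch assumes / means true division in the
interpreter that runs it.

```python
import warnings


def classic_div(x, y, mode="old"):
    """Hypothetical emulation of / under the "old", "warn", "new" modes."""
    if mode not in ("old", "warn", "new"):
        raise ValueError("mode must be 'old', 'warn', or 'new'")
    both_integral = isinstance(x, int) and isinstance(y, int)
    if mode == "new" or not both_integral:
        return x / y  # true division
    if mode == "warn":
        warnings.warn("classic int division", DeprecationWarning, stacklevel=2)
    return x // y  # classic result for integer arguments


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    results = [classic_div(7, 2, mode) for mode in ("old", "warn", "new")]

warning_categories = [w.category for w in caught]
```

Only the "warn" call records a warning; float arguments never
trigger one, because their result is the same under both rules.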
Semantics of Floor Division
Floor division will be implemented in all the Python numeric
types, and will have the semantics of
a // b == floor(a/b)
Except that the type of a//b will be the type a and b will be
except that the type of a//b will be the type a and b will be
coerced into. Specifically, if a and b are of the same type, a//b
will be of that type too. In the current Python 2.2 implementation,
this is implemented via the BINARY_INTDIVIDE opcode, and currently
does the right thing only for ints and longs (and other extension types
which behave like integers).
will be of that type too.
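A few sample values illustrate both the floor rule and the result
type. The sketch assumes a/b inside floor() is computed with true
division (Python 3 semantics); the samples are arbitrary and small
enough that floating point rounding does not interfere.

```python
import math

samples = [(7, 2), (-7, 2), (7, -2), (7.0, 2), (-7.5, 2.0)]

# a // b agrees with floor(a / b) for each sample...
matches_floor = [a // b == math.floor(a / b) for a, b in samples]

# ...but the result keeps the type the operands are coerced into.
quotients = [a // b for a, b in samples]
result_types = [type(q).__name__ for q in quotients]
```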
Changing the Semantics of the / Operator
Semantics of True Division
The nb_divide slot on integers (and long integers, if these are a
separate type, but see PEP 237 [1]) will issue a warning when given
integers a and b such that
a % b != 0
The warning will be off by default in the 2.2 release, and on by
default in the next Python release, and will stay in effect
for 24 months. The first Python release after those 24 months
will implement
(a/b) * b = a (more or less)
The type of a/b will be either a float or a rational, depending on
other PEPs[2, 3]. However, the result will be integral in all cases
where the division has no remainder. This will not be implemented in
Python 2.2.
True division for ints and longs will convert the arguments to
float and then apply a float division. That is, even 2/1 will
return a float (2.0), not an int.
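The rule can be spelled out with explicit conversions. The helper
below, true_div_as_specified(), is a hypothetical name that follows
the wording above literally (convert both arguments, then divide);
it is a sketch of the described rule, not of any interpreter's
actual code.

```python
def true_div_as_specified(a, b):
    """Hypothetical helper: convert both ints to float, then divide."""
    return float(a) / float(b)


quotients = [true_div_as_specified(a, b) for a, b in [(2, 1), (1, 2), (-7, 2)]]
all_floats = all(isinstance(q, float) for q in quotients)

# One consequence of converting first: very large integers may be
# rounded before the division takes place.
big = 2 ** 53 + 1
rounded_on_conversion = float(big) == float(big - 1)
```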
__future__
The Future Division Statement
See PEP 236[4] for the __future__ specific rules.
If "from __future__ import division" is present in a module, or if
-Dnew is used, the / and /= operators are translated to true
division opcodes; otherwise they are translated to classic
division (until Python 3.0 comes along, where they are always
translated to true division).
If "from __future__ import division" is present in the
module, until the IntType nb_divide is changed, the "/" operator
is compiled to add 0.0 to the divisor before doing the division.
This is not exactly the same as what will happen in the future,
since in the future, the result will be integral when it can.
This is done via the BINARY_FLOATDIVIDE opcode.
The future division statement has no effect on the recognition or
translation of // and //=.
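Because a future statement must appear at the top of a module, the
sketch below requests the same behaviour through the compiler flag
recorded in the standard __future__ module instead. The expression
strings and the "<example>" filename are arbitrary; on
interpreters where true division is already the default, the flag
changes nothing.

```python
import __future__

# Compiler flag equivalent to "from __future__ import division".
flag = __future__.division.compiler_flag

# "/" is compiled as true division when the flag is given.
true_result = eval(compile("7 / 2", "<example>", "eval", flag))

# "//" is unaffected by the flag.
floor_result = eval(compile("7 // 2", "<example>", "eval", flag))

# The release in which the feature becomes mandatory, as (major, minor).
mandatory_in = __future__.division.getMandatoryRelease()[:2]
```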
In Python 2.2, unless the __future__ statement is present,
the '/' operator is compiled to DIVISION opcode, which is the
same
C API
There are four new C-level functions. PyNumber_IntDivide and
PyNumber_FloatDivide correspond to the // operator and
the / operator with __future__ statement, respectively.
PyNumber_InPlaceIntDivide and PyNumber_InPlaceFloatDivide correspond
to the //= operator and to the /= operator with __future__ statement
respectively. Please refer to the discussion of the operator
for specification of these functions' behavior.
See PEP 236[4] for the general rules for future statements.
FAQ
Should the // operator be renamed to "div"?
Q. How do I write code that works under the classic rules as well
as under the new rules without using // or a future division
statement?
No. There are problems with new keywords.
A. Use x*1.0/y for true division, divmod(x, y)[0] for int
division. Especially the latter is best hidden inside a
function. You may also write float(x)/y for true division if
you are sure that you don't expect complex numbers. If you
know your integers are never negative, you can use int(x/y) --
while the documentation of int() says that int() can round or
truncate depending on the C implementation, we know of no C
implementation that doesn't truncate, and we're going to change
the spec for int() to promise truncation. Note that for
negative ints, classic division (and floor division) round
towards negative infinity, while int() rounds towards zero.
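The advice in this answer might be packaged as two small functions.
The names true_div() and floor_div() are hypothetical; the bodies
use only the expressions given above and so do not depend on which
division rule is in effect.

```python
def true_div(x, y):
    """Hypothetical helper: true division under either rule."""
    return x * 1.0 / y


def floor_div(x, y):
    """Hypothetical helper: floor division under either rule."""
    return divmod(x, y)[0]


positive = (true_div(7, 2), floor_div(7, 2))

# For negative operands, flooring and truncation differ by one.
floored = floor_div(-7, 2)
truncated = int(true_div(-7, 2))
```

The last two values show why int(x/y) is only a safe substitute
when the integers are known to be non-negative.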
Should the // be made into a function called "div"?
Q. Why is my question not listed here?
No. People expect to be able to write math expressions directly
in Python.
A. Because we weren't aware of it. If you've discussed it on c.l.py,
please mail the second author.
Implementation
A mostly-complete implementation (not exactly following the above
spec, but close enough except for the lack of a warning for
truncated results from old division) is available from the
SourceForge patch manager[5].
A very early implementation (not yet following the above spec)
is available from the SourceForge patch manager[5].
References
[0] PEP 228, Reworking Python's Numeric Model
http://www.python.org/peps/pep-0228.html
[1] PEP 237, Unifying Long Integers and Integers, Zadka,
http://www.python.org/peps/pep-0237.html