Completely reworked to be more specific about the motivation, the
implementation and the transition.
parent
32a207d6aa
commit
1862e5722d
350
pep-0238.txt
|
@@ -11,119 +11,315 @@ Post-History: 16-Mar-2001
|
|||
|
||||
Abstract
|
||||
|
||||
Dividing integers currently returns the floor of the quantities.
|
||||
This behavior is known as integer division, and is similar to what
|
||||
C and FORTRAN do. This has the useful property that all
|
||||
operations on integers return integers, but it does tend to put a
|
||||
hump in the learning curve when new programmers are surprised that
|
||||
The current division (/) operator has an ambiguous meaning for
|
||||
numerical arguments: it returns the floor of the mathematical
|
||||
result if the arguments are ints or longs, but it returns a
|
||||
reasonable approximation of the result if the arguments are floats
|
||||
or complex. This makes expressions expecting float or complex
|
||||
results error-prone when integers are not expected but possible as
|
||||
inputs.
|
||||
|
||||
1/2 == 0
|
||||
We propose to fix this by introducing different operators for
|
||||
different operations: x/y to return a reasonable approximation of
|
||||
the mathematical result of the division ("true division"), x//y to
|
||||
return the floor ("floor division"). We call the current, mixed
|
||||
meaning of x/y "classic division".
|
||||
|
||||
This proposal shows a way to change this while keeping backward
|
||||
compatibility issues in mind.
|
||||
Because of severe backwards compatibility issues, not to mention a
|
||||
major flamewar on c.l.py, we propose the following transitional
|
||||
measures (starting with Python 2.2):
|
||||
|
||||
NOTE: in the light of recent discussions in the newsgroup, the
|
||||
motivation in this PEP (and details) need to be extended. I'll do
|
||||
that ASAP.
|
||||
- Classic division will remain the default in the Python 2.x
|
||||
series; true division will be standard in Python 3.0.
|
||||
|
||||
- The // operator will be available to request floor division
|
||||
unambiguously.
|
||||
|
||||
- The future division statement, spelled "from __future__ import
|
||||
division", will change the / operator to mean true division
|
||||
throughout the module.
|
||||
|
||||
- A command line option will enable run-time warnings for classic
|
||||
division applied to int or long arguments; another command line
|
||||
option will make true division the default.
|
||||
|
||||
- The standard library will use the future division statement and
|
||||
the // operator when appropriate, so as to completely avoid
|
||||
classic division.
|
||||
|
||||
|
||||
Rationale
|
||||
Motivation
|
||||
|
||||
The behavior of integer division is a major stumbling block found
|
||||
in user testing of Python. This manages to trip up new
|
||||
programmers regularly and even causes the experienced programmer
|
||||
to make the occasional mistake. The workarounds, like explicitly
|
||||
coercing one of the operands to float or using a non-integer
|
||||
literal, are very non-intuitive and lower the readability of the
|
||||
program.
|
||||
The classic division operator makes it hard to write numerical
|
||||
expressions that are supposed to give correct results from
|
||||
arbitrary numerical inputs. For all other operators, one can
|
||||
write down a formula such as x*y**2 + z, and the calculated result
|
||||
will be close to the mathematical result (within the limits of
|
||||
numerical accuracy, of course) for any numerical input type (int,
|
||||
long, float, or complex). But division poses a problem: if the
|
||||
expressions for both arguments happen to have an integral type, it
|
||||
implements floor division rather than true division.
|
||||
|
||||
The problem is unique to dynamically typed languages: in a
|
||||
statically typed language like C, the inputs, typically function
|
||||
arguments, would be declared as double or float, and when a call
|
||||
passes an integer argument, it is converted to double or float at
|
||||
the time of the call. Python doesn't have argument type
|
||||
declarations, so integer arguments can easily find their way into
|
||||
an expression.
|
||||
|
||||
The problem is particularly pernicious since ints are perfect
|
||||
substitutes for floats in all other circumstances: math.sqrt(2)
|
||||
returns the same value as math.sqrt(2.0), 3.14*100 and 3.14*100.0
|
||||
return the same value, and so on. Thus, the author of a numerical
|
||||
routine may only use floating point numbers to test his code, and
|
||||
believe that it works correctly, and a user may accidentally pass
|
||||
in an integer input value and get incorrect results.
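The failure mode described above can be sketched as follows. Because interpreters where / already means true division no longer offer classic division, the helper classic_div() below is a hypothetical stand-in written only for this illustration, and midpoint() is likewise an invented example routine.

```python
def classic_div(x, y):
    """Hypothetical emulation of classic '/': floor for ints, true division otherwise."""
    if isinstance(x, int) and isinstance(y, int):
        return x // y
    return x / y


def midpoint(a, b):
    # The author's intent is the arithmetic mean of a and b.
    return classic_div(a + b, 2)


# Tested only with floats, the routine looks correct.
float_result = midpoint(1.0, 2.0)
# A caller passing ints silently gets the floor instead.
int_result = midpoint(1, 2)

print(float_result)  # 1.5
print(int_result)    # 1
```

Nothing in midpoint() hints at the problem; the error only appears when both operands happen to be integral.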
|
||||
|
||||
Another way to look at this is that classic division makes it
|
||||
difficult to write polymorphic functions that work well with
|
||||
either float or int arguments; all other operators already do the
|
||||
right thing. No algorithm that works for both ints and floats has
|
||||
a need for truncating division in one case and true division in
|
||||
the other.
|
||||
|
||||
The correct work-around is subtle: casting an argument to float()
|
||||
is wrong if it could be a complex number; adding 0.0 to an
|
||||
argument doesn't preserve the sign of the argument if it was minus
|
||||
zero. The only solution without either downside is multiplying an
|
||||
argument (typically the first) by 1.0. This leaves the value and
|
||||
sign unchanged for float and complex, and turns int and long into
|
||||
a float with the corresponding value.
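A minimal sketch comparing the three work-arounds named above. The variable names are illustrative only; math.copysign() is used here just to make the sign of zero visible.

```python
import math

neg_zero = -0.0

# Adding 0.0 loses the sign of minus zero ...
added = neg_zero + 0.0
# ... while multiplying by 1.0 preserves it.
scaled = neg_zero * 1.0

print(math.copysign(1.0, added))   # 1.0
print(math.copysign(1.0, scaled))  # -1.0

# float() rejects complex arguments; multiplying by 1.0 leaves them unchanged.
z = 3 + 4j
try:
    float(z)
    float_accepts_complex = True
except TypeError:
    float_accepts_complex = False

print(float_accepts_complex)  # False
print(z * 1.0 == z)           # True

# An int becomes a float with the corresponding value.
print(type(2 * 1.0).__name__)  # float
```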
|
||||
|
||||
It is the opinion of the authors that this is a real design bug in
|
||||
Python, and that it should be fixed sooner rather than later.
|
||||
Assuming Python usage will continue to grow, the cost of leaving
|
||||
this bug in the language will eventually outweigh the cost of
|
||||
fixing old code -- there is an upper bound to the amount of code
|
||||
to be fixed, but the amount of code that might be affected by the
|
||||
bug in the future is unbounded.
|
||||
|
||||
Another reason for this change is the desire to ultimately unify
|
||||
Python's numeric model. This is the subject of PEP 228[0] (which
|
||||
is currently incomplete). A unified numeric model removes most of
|
||||
the user's need to be aware of different numerical types. This is
|
||||
good for beginners, but also takes away concerns about different
|
||||
numeric behavior for advanced programmers. (Of course, it won't
|
||||
remove concerns about numerical stability and accuracy.)
|
||||
|
||||
In a unified numeric model, the different types (int, long, float,
|
||||
complex, and possibly others, such as a new rational type) serve
|
||||
mostly as storage optimizations, and to some extent to indicate
|
||||
orthogonal properties such as inexactness or complexity. In a
|
||||
unified model, the integer 1 should be indistinguishable from the
|
||||
floating point number 1.0 (except for its inexactness), and both
|
||||
should behave the same in all numeric contexts. Clearly, in a
|
||||
unified numeric model, if a==b and c==d, a/c should equal b/d
|
||||
(taking some liberties due to rounding for inexact numbers), and
|
||||
since everybody agrees that 1.0/2.0 equals 0.5, 1/2 should also
|
||||
equal 0.5. Likewise, since 1//2 equals zero, 1.0//2.0 should also
|
||||
equal zero.
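The consistency property can be checked directly. This sketch assumes an interpreter in which / already means true division (Python 3, or a module using the future division statement).

```python
a, b = 1, 1.0
c, d = 2, 2.0

# Equal operands ...
assert a == b and c == d

# ... give equal results under true division and under floor division.
true_results = (a / c, b / d)
floor_results = (a // c, b // d)

print(true_results)   # (0.5, 0.5)
print(floor_results)  # (0, 0.0)
```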
|
||||
|
||||
|
||||
// Operator
|
||||
Variations
|
||||
|
||||
A `//' operator will be introduced, which will call the
|
||||
nb_intdivide or __intdiv__ slots. This operator will be
|
||||
implemented in all the Python numeric types, and will have the
|
||||
semantics of
|
||||
Esthetically, x//y doesn't please everyone, and hence several
|
||||
variations have been proposed: x div y, or div(x, y), sometimes in
|
||||
combination with x mod y or mod(x, y) as an alternative spelling
|
||||
for x%y.
|
||||
|
||||
We consider these solutions inferior, on the following grounds.
|
||||
|
||||
- Using x div y would introduce a new keyword. Since div is a
|
||||
popular identifier, this would break a fair amount of existing
|
||||
code, unless the new keyword was only recognized under a future
|
||||
division statement. Since it is expected that the majority of
|
||||
code that needs to be converted is dividing integers, this would
|
||||
greatly increase the need for the future division statement.
|
||||
Even with a future statement, the general sentiment against
|
||||
adding new keywords unless absolutely necessary argues against
|
||||
this.
|
||||
|
||||
- Using div(x, y) makes the conversion of old code much harder.
|
||||
Replacing x/y with x//y or x div y can be done with a simple
|
||||
query replace; in most cases the programmer can easily verify
|
||||
that a particular module only works with integers so all
|
||||
occurrences of x/y can be replaced. (The query replace is still
|
||||
needed to weed out slashes occurring in comments or string
|
||||
literals.) Replacing x/y with div(x, y) would require a much
|
||||
more intelligent tool, since the extent of the expressions to
|
||||
the left and right of the / must be analyzed before the
|
||||
placement of the "div(" and ")" part can be decided.
|
||||
|
||||
|
||||
Alternatives
|
||||
|
||||
In order to reduce the amount of old code that needs to be
|
||||
converted, several alternative proposals have been put forth.
|
||||
Here is a brief discussion of each proposal (or category of
|
||||
proposals). If you know of an alternative that was discussed on
|
||||
c.l.py that isn't mentioned here, please mail the second author.
|
||||
|
||||
- Let / keep its classic semantics; introduce // for true
|
||||
division. This doesn't solve the problem that the classic /
|
||||
operator makes it hard to write polymorphic numeric functions
|
||||
that accept int and float arguments, and still requires the use of
|
||||
x*1.0/y whenever true division is required.
|
||||
|
||||
- Use a directive to use specific division semantics in a module,
|
||||
rather than a future statement. This retains classic division
|
||||
as a permanent wart in the language, requiring future
|
||||
generations of Python programmers to be aware of the problem and
|
||||
the remedies.
|
||||
|
||||
- Use "from __past__ import division" to use classic division
|
||||
semantics in a module. This also retains the classic division
|
||||
as a permanent wart, or at least for a long time (eventually the
|
||||
past division statement could raise an ImportError).
|
||||
|
||||
- Use a directive (or some other way) to specify the Python
|
||||
version for which a specific piece of code was developed. This
|
||||
requires future Python interpreters to be able to emulate
|
||||
*exactly* every previous version of Python, and moreover to do
|
||||
so for multiple versions in the same interpreter. This is way
|
||||
too much work. A much simpler solution is to keep multiple
|
||||
interpreters installed.
|
||||
|
||||
|
||||
Specification
|
||||
|
||||
During the transitional phase, we have to support *three* division
|
||||
operators within the same program: classic division (for / in
|
||||
modules without a future division statement), true division (for /
|
||||
in modules with a future division statement), and floor division
|
||||
(for //). Each operator comes in two flavors: regular, and as an
|
||||
augmented assignment operator (/= or //=).
|
||||
|
||||
The names associated with these variations are:
|
||||
|
||||
- Overloaded operator methods:
|
||||
|
||||
__div__(), __floordiv__(), __truediv__();
|
||||
|
||||
__idiv__(), __ifloordiv__(), __itruediv__().
|
||||
|
||||
- Abstract API C functions:
|
||||
|
||||
PyNumber_Divide(), PyNumber_FloorDivide(),
|
||||
PyNumber_TrueDivide();
|
||||
|
||||
PyNumber_InPlaceDivide(), PyNumber_InPlaceFloorDivide(),
|
||||
PyNumber_InPlaceTrueDivide().
|
||||
|
||||
- Byte code opcodes:
|
||||
|
||||
BINARY_DIVIDE, BINARY_FLOOR_DIVIDE, BINARY_TRUE_DIVIDE;
|
||||
|
||||
INPLACE_DIVIDE, INPLACE_FLOOR_DIVIDE, INPLACE_TRUE_DIVIDE.
|
||||
|
||||
- PyNumberMethod slots:
|
||||
|
||||
nb_divide, nb_floor_divide, nb_true_divide,
|
||||
|
||||
nb_inplace_divide, nb_inplace_floor_divide,
|
||||
nb_inplace_true_divide.
|
||||
|
||||
The added PyNumberMethod slots require an additional flag in
|
||||
tp_flags; this flag will be named Py_TPFLAGS_HAVE_NEWDIVIDE and
|
||||
will be included in Py_TPFLAGS_DEFAULT.
|
||||
|
||||
The true and floor division APIs will look for the corresponding
|
||||
slot and call it; when that slot is NULL, they will raise an
|
||||
exception. There is no fallback to the classic divide slot.
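At the Python level, the overloaded operator methods listed above behave as in the following sketch. The classes Meters and FloorOnly are hypothetical and exist only for this illustration; the sketch assumes an interpreter where / means true division, so __div__() is not exercised.

```python
class Meters:
    """Hypothetical numeric wrapper supporting both new division operators."""

    def __init__(self, value):
        self.value = value

    def __truediv__(self, other):      # x / y
        return Meters(self.value / other)

    def __floordiv__(self, other):     # x // y
        return Meters(self.value // other)

    def __itruediv__(self, other):     # x /= y
        self.value /= other
        return self


class FloorOnly:
    """Hypothetical type that defines floor division only."""

    def __init__(self, value):
        self.value = value

    def __floordiv__(self, other):
        return FloorOnly(self.value // other)


true_value = (Meters(7) / 2).value     # 3.5
floor_value = (Meters(7) // 2).value   # 3

m = Meters(9)
m /= 2                                 # dispatches to __itruediv__

# With no true-division method there is no fallback: the operation fails.
try:
    FloorOnly(7) / 2
    raised = False
except TypeError:
    raised = True

print(true_value, floor_value, m.value, raised)  # 3.5 3 4.5 True
```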
|
||||
|
||||
|
||||
Command Line Option
|
||||
|
||||
The -D command line option takes a string argument that can take
|
||||
three values: "old", "warn", or "new". The default is "old" in
|
||||
Python 2.2 but will change to "warn" in later 2.x versions. The
|
||||
"old" value means the classic division operator acts as described.
|
||||
The "warn" value means the classic division operator issues a
|
||||
warning (a DeprecationWarning using the standard warning framework)
|
||||
when applied to ints or longs. The "new" value changes the
|
||||
default globally so that the / operator is always interpreted as
|
||||
true division. The "new" option is only intended for use in
|
||||
certain educational environments, where true division is required,
|
||||
but asking the students to include the future division statement
|
||||
in all their code would be a problem.
|
||||
|
||||
This option will not be supported in Python 3.0; Python 3.0 will
|
||||
always interpret / as true division.
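The effect of the "warn" value can be approximated in pure Python. This is only an emulation under stated assumptions, not the interpreter's implementation: classic_div_warn() is a hypothetical helper, and the warning text is invented for the example.

```python
import warnings


def classic_div_warn(x, y):
    """Hypothetical emulation of classic '/' under the "warn" setting."""
    if isinstance(x, int) and isinstance(y, int):
        # Only integral operands trigger the warning.
        warnings.warn("classic int division", DeprecationWarning, stacklevel=2)
        return x // y
    return x / y


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    int_result = classic_div_warn(7, 2)      # warns, returns the floor
    float_result = classic_div_warn(7.0, 2)  # no warning

print(int_result, float_result)  # 3 3.5
print(len(caught))               # 1
```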
|
||||
|
||||
|
||||
Semantics of Floor Division
|
||||
|
||||
Floor division will be implemented in all the Python numeric
|
||||
types, and will have the semantics of
|
||||
|
||||
a // b == floor(a/b)
|
||||
|
||||
Except that the type of a//b will be the type a and b will be
|
||||
except that the type of a//b will be the type a and b will be
|
||||
coerced into. Specifically, if a and b are of the same type, a//b
|
||||
will be of that type too. In the current Python 2.2 implementation,
|
||||
this is implemented via the BINARY_INTDIVIDE opcode, and currently
|
||||
does the right thing only for ints and longs (and other extension types
|
||||
which behave like integers).
|
||||
will be of that type too.
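A short check of these semantics, assuming an interpreter where a/b is true division so that floor(a/b) is meaningful for ints. The sample operands are arbitrary.

```python
import math

pairs = [(7, 2), (-7, 2), (7.0, 2.0), (-7.5, 2)]

results = []
for a, b in pairs:
    q = a // b
    # The value matches floor(a/b) ...
    assert q == math.floor(a / b)
    # ... but the type is the type a and b are coerced into.
    results.append((q, type(q).__name__))

print(results)
# [(3, 'int'), (-4, 'int'), (3.0, 'float'), (-4.0, 'float')]
```

Note that the quotient is rounded towards negative infinity, so -7 // 2 is -4, not -3.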
|
||||
|
||||
|
||||
Changing the Semantics of the / Operator
|
||||
Semantics of True Division
|
||||
|
||||
The nb_divide slot on integers (and long integers, if these are a
|
||||
separate type, but see PEP 237 [1]) will issue a warning when given
|
||||
integers a and b such that
|
||||
|
||||
a % b != 0
|
||||
|
||||
The warning will be off by default in the 2.2 release, and on by
|
||||
default in the next Python release, and will stay in effect
|
||||
for 24 months. The next Python release after 24 months, it will
|
||||
implement
|
||||
|
||||
(a/b) * b = a (more or less)
|
||||
|
||||
The type of a/b will be either a float or a rational, depending on
|
||||
other PEPs[2, 3]. However, the result will be integral in all cases where
|
||||
the division has no remainder. This will not be implemented in Python 2.2.
|
||||
True division for ints and longs will convert the arguments to
|
||||
float and then apply a float division. That is, even 2/1 will
|
||||
return a float (2.0), not an int.
|
||||
|
||||
|
||||
__future__
|
||||
The Future Division Statement
|
||||
|
||||
See PEP 236[4] for the __future__ specific rules.
|
||||
If "from __future__ import division" is present in a module, or if
|
||||
-Dnew is used, the / and /= operators are translated to true
|
||||
division opcodes; otherwise they are translated to classic
|
||||
division (until Python 3.0 comes along, where they are always
|
||||
translated to true division).
|
||||
|
||||
If "from __future__ import division" is present in the
|
||||
module, until the IntType nb_divide is changed, the "/" operator
|
||||
is compiled to add 0.0 to the divisor before doing the division.
|
||||
This is not exactly the same as what will happen in the future,
|
||||
since in the future, the result will be integral when it can.
|
||||
This is done via the BINARY_FLOATDIVIDE opcode.
|
||||
The future division statement has no effect on the recognition or
|
||||
translation of // and //=.
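The future division feature can also be inspected and requested programmatically. The sketch below relies on the __future__ module as it eventually shipped with CPython, which is an assumption relative to this draft; the expression strings are arbitrary examples.

```python
import __future__

feature = __future__.division

# First release where the statement is accepted, and the release where
# true division becomes the default.
optional = feature.getOptionalRelease()[:2]
mandatory = feature.getMandatoryRelease()[:2]
print(optional, mandatory)  # (2, 2) (3, 0)

# Compiling with the feature's flag is equivalent to having
# "from __future__ import division" at the top of the compiled source.
true_code = compile("7 / 2", "<example>", "eval", flags=feature.compiler_flag)
floor_code = compile("7 // 2", "<example>", "eval", flags=feature.compiler_flag)

true_result = eval(true_code)
floor_result = eval(floor_code)   # // is unaffected by the future statement
print(true_result, floor_result)  # 3.5 3
```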
|
||||
|
||||
In Python 2.2, unless the __future__ statement is present,
|
||||
the '/' operator is compiled to DIVISION opcode, which is the
|
||||
same
|
||||
|
||||
|
||||
C API
|
||||
|
||||
There are four new C-level functions. PyNumber_IntDivide and
|
||||
PyNumber_FloatDivide correspond to the // operator and
|
||||
the / operator with __future__ statement, respectively.
|
||||
PyNumber_InPlaceIntDivide and PyNumber_InPlaceFloatDivide correspond
|
||||
to the //= operator and to the /= operator with __future__ statement
|
||||
respectively. Please refer to the discussion of the operator
|
||||
for specification of these functions' behavior.
|
||||
See PEP 236[4] for the general rules for future statements.
|
||||
|
||||
|
||||
FAQ
|
||||
|
||||
Should the // operator be renamed to "div"?
|
||||
Q. How do I write code that works under the classic rules as well
|
||||
as under the new rules without using // or a future division
|
||||
statement?
|
||||
|
||||
No. There are problems with new keywords.
|
||||
A. Use x*1.0/y for true division, divmod(x, y)[0] for int
|
||||
division. Especially the latter is best hidden inside a
|
||||
function. You may also write float(x)/y for true division if
|
||||
you are sure that you don't expect complex numbers. If you
|
||||
know your integers are never negative, you can use int(x/y) --
|
||||
while the documentation of int() says that int() can round or
|
||||
truncate depending on the C implementation, we know of no C
|
||||
implementation that doesn't truncate, and we're going to change
|
||||
the spec for int() to promise truncation. Note that for
|
||||
negative ints, classic division (and floor division) round
|
||||
towards negative infinity, while int() rounds towards zero.
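The portable spellings from this answer can be wrapped in helpers, as suggested. The function names true_div() and floor_div() are hypothetical; the bodies use only constructs that give the same result under classic and true division.

```python
def true_div(x, y):
    """True division without // or a future statement."""
    return x * 1.0 / y


def floor_div(x, y):
    """Floor division without // or a future statement."""
    return divmod(x, y)[0]


print(true_div(7, 2))    # 3.5
print(floor_div(7, 2))   # 3
print(floor_div(-7, 2))  # -4

# int() rounds towards zero, so it agrees with floor division
# only when the quotient is not negative.
truncated = int(true_div(-7, 2))
print(truncated)         # -3
```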
|
||||
|
||||
Should the // be made into a function called "div"?
|
||||
Q. Why is my question not listed here?
|
||||
|
||||
No. People expect to be able to write math expressions directly
|
||||
in Python.
|
||||
A. Because we weren't aware of it. If you've discussed it on c.l.py,
|
||||
please mail the second author.
|
||||
|
||||
|
||||
Implementation
|
||||
|
||||
A mostly-complete implementation (not exactly following the above
|
||||
spec, but close enough except for the lack of a warning for
|
||||
truncated results from old division) is available from the
|
||||
SourceForge patch manager[5].
|
||||
A very early implementation (not yet following the above spec)
|
||||
is available from the SourceForge patch manager[5].
|
||||
|
||||
|
||||
References
|
||||
|
||||
[0] PEP 228, Reworking Python's Numeric Model
|
||||
http://www.python.org/peps/pep-0228.html
|
||||
|
||||
[1] PEP 237, Unifying Long Integers and Integers, Zadka,
|
||||
http://www.python.org/peps/pep-0237.html
|
||||
|
||||
|
|