481 lines
22 KiB
ReStructuredText
481 lines
22 KiB
ReStructuredText
PEP: 238
|
|
Title: Changing the Division Operator
|
|
Author: Moshe Zadka <moshez@zadka.site.co.il>,
|
|
Guido van Rossum <guido@python.org>
|
|
Status: Final
|
|
Type: Standards Track
|
|
Content-Type: text/x-rst
|
|
Created: 11-Mar-2001
|
|
Python-Version: 2.2
|
|
Post-History: 16-Mar-2001, 26-Jul-2001, 27-Jul-2001
|
|
|
|
|
|
Abstract
|
|
========
|
|
|
|
The current division (``/``) operator has an ambiguous meaning for numerical
|
|
arguments: it returns the floor of the mathematical result of division if the
|
|
arguments are ints or longs, but it returns a reasonable approximation of the
|
|
division result if the arguments are floats or complex. This makes
|
|
expressions expecting float or complex results error-prone when integers are
|
|
not expected but possible as inputs.
|
|
|
|
We propose to fix this by introducing different operators for different
|
|
operations: ``x/y`` to return a reasonable approximation of the mathematical
|
|
result of the division ("true division"), ``x//y`` to return the floor
|
|
("floor division"). We call the current, mixed meaning of x/y
|
|
"classic division".
|
|
|
|
Because of severe backwards compatibility issues, not to mention a major
|
|
flamewar on c.l.py, we propose the following transitional measures (starting
|
|
with Python 2.2):
|
|
|
|
- Classic division will remain the default in the Python 2.x series; true
|
|
division will be standard in Python 3.0.
|
|
|
|
- The ``//`` operator will be available to request floor division
|
|
unambiguously.
|
|
|
|
- The future division statement, spelled ``from __future__ import division``,
|
|
will change the ``/`` operator to mean true division throughout the module.
|
|
|
|
- A command line option will enable run-time warnings for classic division
|
|
applied to int or long arguments; another command line option will make true
|
|
division the default.
|
|
|
|
- The standard library will use the future division statement and the ``//``
|
|
operator when appropriate, so as to completely avoid classic division.
|
|
|
|
|
|
Motivation
|
|
==========
|
|
|
|
The classic division operator makes it hard to write numerical expressions
|
|
that are supposed to give correct results from arbitrary numerical inputs.
|
|
For all other operators, one can write down a formula such as ``x*y**2 + z``,
|
|
and the calculated result will be close to the mathematical result (within the
|
|
limits of numerical accuracy, of course) for any numerical input type (int,
|
|
long, float, or complex). But division poses a problem: if the expressions
|
|
for both arguments happen to have an integral type, it implements floor
|
|
division rather than true division.
|
|
|
|
The problem is unique to dynamically typed languages: in a statically typed
|
|
language like C, the inputs, typically function arguments, would be declared
|
|
as double or float, and when a call passes an integer argument, it is
|
|
converted to double or float at the time of the call. Python doesn't have
|
|
argument type declarations, so integer arguments can easily find their way
|
|
into an expression.
|
|
|
|
The problem is particularly pernicious since ints are perfect substitutes for
|
|
floats in all other circumstances: ``math.sqrt(2)`` returns the same value as
|
|
``math.sqrt(2.0)``, ``3.14*100`` and ``3.14*100.0`` return the same value, and
|
|
so on. Thus, the author of a numerical routine may only use floating point
|
|
numbers to test his code, and believe that it works correctly, and a user may
|
|
accidentally pass in an integer input value and get incorrect results.
|
|
|
|
Another way to look at this is that classic division makes it difficult to
|
|
write polymorphic functions that work well with either float or int arguments;
|
|
all other operators already do the right thing. No algorithm that works for
|
|
both ints and floats has a need for truncating division in one case and true
|
|
division in the other.
|
|
|
|
The correct work-around is subtle: casting an argument to float() is wrong if
|
|
it could be a complex number; adding 0.0 to an argument doesn't preserve the
|
|
sign of the argument if it was minus zero. The only solution without either
|
|
downside is multiplying an argument (typically the first) by 1.0. This leaves
|
|
the value and sign unchanged for float and complex, and turns int and long
|
|
into a float with the corresponding value.
|
|
|
|
It is the opinion of the authors that this is a real design bug in Python, and
|
|
that it should be fixed sooner rather than later. Assuming Python usage will
|
|
continue to grow, the cost of leaving this bug in the language will eventually
|
|
outweigh the cost of fixing old code -- there is an upper bound to the amount
|
|
of code to be fixed, but the amount of code that might be affected by the bug
|
|
in the future is unbounded.
|
|
|
|
Another reason for this change is the desire to ultimately unify Python's
|
|
numeric model. This is the subject of :pep:`228` (which is currently
|
|
incomplete). A unified numeric model removes most of the user's need to be
|
|
aware of different numerical types. This is good for beginners, but also
|
|
takes away concerns about different numeric behavior for advanced programmers.
|
|
(Of course, it won't remove concerns about numerical stability and accuracy.)
|
|
|
|
In a unified numeric model, the different types (int, long, float, complex,
|
|
and possibly others, such as a new rational type) serve mostly as storage
|
|
optimizations, and to some extent to indicate orthogonal properties such as
|
|
inexactness or complexity. In a unified model, the integer 1 should be
|
|
indistinguishable from the floating point number 1.0 (except for its
|
|
inexactness), and both should behave the same in all numeric contexts.
|
|
Clearly, in a unified numeric model, if ``a==b`` and ``c==d``, ``a/c`` should
|
|
equal ``b/d`` (taking some liberties due to rounding for inexact numbers), and
|
|
since everybody agrees that ``1.0/2.0`` equals 0.5, ``1/2`` should also equal
|
|
0.5. Likewise, since ``1//2`` equals zero, ``1.0//2.0`` should also equal
|
|
zero.
|
|
|
|
|
|
Variations
|
|
==========
|
|
|
|
Aesthetically, ``x//y`` doesn't please everyone, and hence several variations
|
|
have been proposed. They are addressed here:
|
|
|
|
- ``x div y``. This would introduce a new keyword. Since ``div`` is a
|
|
popular identifier, this would break a fair amount of existing code, unless
|
|
the new keyword was only recognized under a future division statement.
|
|
Since it is expected that the majority of code that needs to be converted is
|
|
dividing integers, this would greatly increase the need for the future
|
|
division statement. Even with a future statement, the general sentiment
|
|
against adding new keywords unless absolutely necessary argues against this.
|
|
|
|
- ``div(x, y)``. This makes the conversion of old code much harder.
|
|
Replacing ``x/y`` with ``x//y`` or ``x div y`` can be done with a simple
|
|
query replace; in most cases the programmer can easily verify that a
|
|
particular module only works with integers so all occurrences of ``x/y`` can
|
|
be replaced. (The query replace is still needed to weed out slashes
|
|
occurring in comments or string literals.) Replacing ``x/y`` with
|
|
``div(x, y)`` would require a much more intelligent tool, since the extent
|
|
of the expressions to the left and right of the ``/`` must be analyzed
|
|
before the placement of the ``div(`` and ``)`` part can be decided.
|
|
|
|
- ``x \ y``. The backslash is already a token, meaning line continuation, and
|
|
in general it suggests an *escape* to Unix eyes. In addition (this due to
|
|
Terry Reedy) this would make things like ``eval("x\y")`` harder to get
|
|
right.
|
|
|
|
|
|
Alternatives
|
|
============
|
|
|
|
In order to reduce the amount of old code that needs to be converted, several
|
|
alternative proposals have been put forth. Here is a brief discussion of each
|
|
proposal (or category of proposals). If you know of an alternative that was
|
|
discussed on c.l.py that isn't mentioned here, please mail the second author.
|
|
|
|
- Let ``/`` keep its classic semantics; introduce ``//`` for true division.
|
|
This still leaves a broken operator in the language, and invites to use the
|
|
broken behavior. It also shuts off the road to a unified numeric model a la
|
|
:pep:`228`.
|
|
|
|
- Let int division return a special "portmanteau" type that behaves as an
|
|
integer in integer context, but like a float in a float context. The
|
|
problem with this is that after a few operations, the int and the float
|
|
value could be miles apart, it's unclear which value should be used in
|
|
comparisons, and of course many contexts (like conversion to string) don't
|
|
have a clear integer or float preference.
|
|
|
|
- Use a directive to use specific division semantics in a module, rather than
|
|
a future statement. This retains classic division as a permanent wart in
|
|
the language, requiring future generations of Python programmers to be
|
|
aware of the problem and the remedies.
|
|
|
|
- Use ``from __past__ import division`` to use classic division semantics in a
|
|
module. This also retains the classic division as a permanent wart, or at
|
|
least for a long time (eventually the past division statement could raise an
|
|
``ImportError``).
|
|
|
|
- Use a directive (or some other way) to specify the Python version for which
|
|
a specific piece of code was developed. This requires future Python
|
|
interpreters to be able to emulate *exactly* several previous versions of
|
|
Python, and moreover to do so for multiple versions within the same
|
|
interpreter. This is way too much work. A much simpler solution is to keep
|
|
multiple interpreters installed. Another argument against this is that the
|
|
version directive is almost always overspecified: most code written for
|
|
Python X.Y, works for Python X.(Y-1) and X.(Y+1) as well, so specifying X.Y
|
|
as a version is more constraining than it needs to be. At the same time,
|
|
there's no way to know at which future or past version the code will break.
|
|
|
|
|
|
API Changes
|
|
===========
|
|
|
|
During the transitional phase, we have to support *three* division operators
|
|
within the same program: classic division (for ``/`` in modules without a
|
|
future division statement), true division (for ``/`` in modules with a future
|
|
division statement), and floor division (for ``//``). Each operator comes in
|
|
two flavors: regular, and as an augmented assignment operator (``/=`` or
|
|
``//=``).
|
|
|
|
The names associated with these variations are:
|
|
|
|
- Overloaded operator methods::
|
|
|
|
__div__(), __floordiv__(), __truediv__();
|
|
__idiv__(), __ifloordiv__(), __itruediv__().
|
|
|
|
- Abstract API C functions::
|
|
|
|
PyNumber_Divide(), PyNumber_FloorDivide(),
|
|
PyNumber_TrueDivide();
|
|
|
|
PyNumber_InPlaceDivide(), PyNumber_InPlaceFloorDivide(),
|
|
PyNumber_InPlaceTrueDivide().
|
|
|
|
- Byte code opcodes::
|
|
|
|
BINARY_DIVIDE, BINARY_FLOOR_DIVIDE, BINARY_TRUE_DIVIDE;
|
|
INPLACE_DIVIDE, INPLACE_FLOOR_DIVIDE, INPLACE_TRUE_DIVIDE.
|
|
|
|
- PyNumberMethod slots::
|
|
|
|
nb_divide, nb_floor_divide, nb_true_divide,
|
|
nb_inplace_divide, nb_inplace_floor_divide,
|
|
nb_inplace_true_divide.
|
|
|
|
The added ``PyNumberMethod`` slots require an additional flag in ``tp_flags``;
|
|
this flag will be named ``Py_TPFLAGS_HAVE_NEWDIVIDE`` and will be included in
|
|
``Py_TPFLAGS_DEFAULT``.
|
|
|
|
The true and floor division APIs will look for the corresponding slots and
|
|
call that; when that slot is ``NULL``, they will raise an exception. There is
|
|
no fallback to the classic divide slot.
|
|
|
|
In Python 3.0, the classic division semantics will be removed; the classic
|
|
division APIs will become synonymous with true division.
|
|
|
|
|
|
Command Line Option
|
|
===================
|
|
|
|
The ``-Q`` command line option takes a string argument that can take four
|
|
values: ``old``, ``warn``, ``warnall``, or ``new``. The default is ``old``
|
|
in Python 2.2 but will change to ``warn`` in later 2.x versions. The ``old``
|
|
value means the classic division operator acts as described. The ``warn``
|
|
value means the classic division operator issues a warning (a
|
|
``DeprecationWarning`` using the standard warning framework) when applied
|
|
to ints or longs. The ``warnall`` value also issues warnings for classic
|
|
division when applied to floats or complex; this is for use by the
|
|
``fixdiv.py`` conversion script mentioned below. The ``new`` value changes
|
|
the default globally so that the ``/`` operator is always interpreted as
|
|
true division. The ``new`` option is only intended for use in certain
|
|
educational environments, where true division is required, but asking the
|
|
students to include the future division statement in all their code would be a
|
|
problem.
|
|
|
|
This option will not be supported in Python 3.0; Python 3.0 will always
|
|
interpret ``/`` as true division.
|
|
|
|
(This option was originally proposed as ``-D``, but that turned out to be an
|
|
existing option for Jython, hence the Q -- mnemonic for Quotient. Other names
|
|
have been proposed, like ``-Qclassic``, ``-Qclassic-warn``, ``-Qtrue``, or
|
|
``-Qold_division`` etc.; these seem more verbose to me without much advantage.
|
|
After all the term classic division is not used in the language at all (only
|
|
in the PEP), and the term true division is rarely used in the language -- only
|
|
in ``__truediv__``.)
|
|
|
|
|
|
Semantics of Floor Division
|
|
===========================
|
|
|
|
Floor division will be implemented in all the Python numeric types, and will
|
|
have the semantics of::
|
|
|
|
a // b == floor(a/b)
|
|
|
|
except that the result type will be the common type into which *a* and *b* are
|
|
coerced before the operation.
|
|
|
|
Specifically, if *a* and *b* are of the same type, ``a//b`` will be of that
|
|
type too. If the inputs are of different types, they are first coerced to a
|
|
common type using the same rules used for all other arithmetic operators.
|
|
|
|
In particular, if *a* and *b* are both ints or longs, the result has the same
|
|
type and value as for classic division on these types (including the case of
|
|
mixed input types; ``int//long`` and ``long//int`` will both return a long).
|
|
|
|
For floating point inputs, the result is a float. For example::
|
|
|
|
3.5//2.0 == 1.0
|
|
|
|
For complex numbers, ``//`` raises an exception, since ``floor()`` of a
|
|
complex number is not allowed.
|
|
|
|
For user-defined classes and extension types, all semantics are up to the
|
|
implementation of the class or type.
|
|
|
|
|
|
Semantics of True Division
|
|
==========================
|
|
|
|
True division for ints and longs will convert the arguments to float and then
|
|
apply a float division. That is, even ``2/1`` will return a ``float (2.0)``,
|
|
not an int. For floats and complex, it will be the same as classic division.
|
|
|
|
The 2.2 implementation of true division acts as if the float type had
|
|
unbounded range, so that overflow doesn't occur unless the magnitude of the
|
|
mathematical *result* is too large to represent as a float. For example,
|
|
after ``x = 1L << 40000``, ``float(x)`` raises ``OverflowError`` (note that
|
|
this is also new in 2.2: previously the outcome was platform-dependent, most
|
|
commonly a float infinity). But ``x/x`` returns 1.0 without exception,
|
|
while ``x/1`` raises ``OverflowError``.
|
|
|
|
Note that for int and long arguments, true division may lose information; this
|
|
is in the nature of true division (as long as rationals are not in the
|
|
language). Algorithms that consciously use longs should consider using
|
|
``//``, as true division of longs retains no more than 53 bits of precision
|
|
(on most platforms).
|
|
|
|
If and when a rational type is added to Python (see :pep:`239`), true
|
|
division for ints and longs should probably return a rational. This avoids
|
|
the problem with true division of ints and longs losing information. But
|
|
until then, for consistency, float is the only choice for true division.
|
|
|
|
|
|
The Future Division Statement
|
|
=============================
|
|
|
|
If ``from __future__ import division`` is present in a module, or if
|
|
``-Qnew`` is used, the ``/`` and ``/=`` operators are translated to true
|
|
division opcodes; otherwise they are translated to classic division (until
|
|
Python 3.0 comes along, where they are always translated to true division).
|
|
|
|
The future division statement has no effect on the recognition or translation
|
|
of ``//`` and ``//=``.
|
|
|
|
See :pep:`236` for the general rules for future statements.
|
|
|
|
(It has been proposed to use a longer phrase, like *true_division* or
|
|
*modern_division*. These don't seem to add much information.)
|
|
|
|
|
|
Open Issues
|
|
===========
|
|
|
|
We expect that these issues will be resolved over time, as more feedback is
|
|
received or we gather more experience with the initial implementation.
|
|
|
|
- It has been proposed to call ``//`` the quotient operator, and the ``/``
|
|
operator the ratio operator. I'm not sure about this -- for some people
|
|
quotient is just a synonym for division, and ratio suggests rational
|
|
numbers, which is wrong. I prefer the terminology to be slightly awkward
|
|
if that avoids unambiguity. Also, for some folks *quotient* suggests
|
|
truncation towards zero, not towards infinity as *floor division*
|
|
says explicitly.
|
|
|
|
- It has been argued that a command line option to change the default is
|
|
evil. It can certainly be dangerous in the wrong hands: for example, it
|
|
would be impossible to combine a 3rd party library package that requires
|
|
``-Qnew`` with another one that requires ``-Qold``. But I believe that the
|
|
VPython folks need a way to enable true division by default, and other
|
|
educators might need the same. These usually have enough control over the
|
|
library packages available in their environment.
|
|
|
|
- For classes to have to support all three of ``__div__()``,
|
|
``__floordiv__()`` and ``__truediv__()`` seems painful; and what to do in
|
|
3.0? Maybe we only need ``__div__()`` and ``__floordiv__()``, or maybe at
|
|
least true division should try ``__truediv__()`` first and ``__div__()``
|
|
second.
|
|
|
|
|
|
Resolved Issues
|
|
===============
|
|
|
|
- Issue: For very large long integers, the definition of true division as
|
|
returning a float causes problems, since the range of Python longs is much
|
|
larger than that of Python floats. This problem will disappear if and when
|
|
rational numbers are supported.
|
|
|
|
Resolution: For long true division, Python uses an internal float type with
|
|
native double precision but unbounded range, so that OverflowError doesn't
|
|
occur unless the quotient is too large to represent as a native double.
|
|
|
|
- Issue: In the interim, maybe the long-to-float conversion could be made to
|
|
raise ``OverflowError`` if the long is out of range.
|
|
|
|
Resolution: This has been implemented, but, as above, the magnitude of the
|
|
inputs to long true division doesn't matter; only the magnitude of the
|
|
quotient matters.
|
|
|
|
- Issue: Tim Peters will make sure that whenever an in-range float is
|
|
returned, decent precision is guaranteed.
|
|
|
|
Resolution: Provided the quotient of long true division is representable as
|
|
a float, it suffers no more than 3 rounding errors: one each for converting
|
|
the inputs to an internal float type with native double precision but
|
|
unbounded range, and one more for the division. However, note that if the
|
|
magnitude of the quotient is too *small* to represent as a native double,
|
|
0.0 is returned without exception ("silent underflow").
|
|
|
|
|
|
FAQ
|
|
===
|
|
|
|
When will Python 3.0 be released?
|
|
---------------------------------
|
|
|
|
We don't plan that long ahead, so we can't say for sure. We want to allow
|
|
at least two years for the transition. If Python 3.0 comes out sooner,
|
|
we'll keep the 2.x line alive for backwards compatibility until at least
|
|
two years from the release of Python 2.2. In practice, you will be able
|
|
to continue to use the Python 2.x line for several years after Python 3.0
|
|
is released, so you can take your time with the transition. Sites are
|
|
expected to have both Python 2.x and Python 3.x installed simultaneously.
|
|
|
|
Why isn't true division called float division?
|
|
----------------------------------------------
|
|
|
|
Because I want to keep the door open to *possibly* introducing rationals
|
|
and making 1/2 return a rational rather than a float. See :pep:`239`.
|
|
|
|
Why is there a need for ``__truediv__`` and ``__itruediv__``?
|
|
-------------------------------------------------------------
|
|
|
|
We don't want to make user-defined classes second-class citizens.
|
|
Certainly not with the type/class unification going on.
|
|
|
|
How do I write code that works under the classic rules as well as under the new rules without using ``//`` or a future division statement?
|
|
------------------------------------------------------------------------------------------------------------------------------------------
|
|
|
|
Use ``x*1.0/y`` for true division, ``divmod(x, y)`` (:pep:`228`) for int
|
|
division. Especially the latter is best hidden inside a function. You
|
|
may also write ``float(x)/y`` for true division if you are sure that you
|
|
don't expect complex numbers. If you know your integers are never
|
|
negative, you can use ``int(x/y)`` -- while the documentation of ``int()``
|
|
says that ``int()`` can round or truncate depending on the C
|
|
implementation, we know of no C implementation that doesn't truncate, and
|
|
we're going to change the spec for ``int()`` to promise truncation. Note
|
|
that classic division (and floor division) round towards negative
|
|
infinity, while ``int()`` rounds towards zero, giving different answers
|
|
for negative numbers.
|
|
|
|
How do I specify the division semantics for ``input()``, ``compile()``, ``execfile()``, ``eval()`` and ``exec``?
|
|
----------------------------------------------------------------------------------------------------------------
|
|
|
|
They inherit the choice from the invoking module. :pep:`236` now lists
|
|
this as a resolved problem, referring to :pep:`264`.
|
|
|
|
What about code compiled by the codeop module?
|
|
----------------------------------------------
|
|
|
|
This is dealt with properly; see :pep:`264`.
|
|
|
|
Will there be conversion tools or aids?
|
|
---------------------------------------
|
|
|
|
Certainly. While these are outside the scope of the PEP, I should point
|
|
out two simple tools that will be released with Python 2.2a3:
|
|
``Tools/scripts/finddiv.py`` finds division operators (slightly smarter
|
|
than ``grep /``) and ``Tools/scripts/fixdiv.py`` can produce patches based
|
|
on run-time analysis.
|
|
|
|
Why is my question not answered here?
|
|
-------------------------------------
|
|
|
|
Because we weren't aware of it. If it's been discussed on c.l.py and you
|
|
believe the answer is of general interest, please notify the second
|
|
author. (We don't have the time or inclination to answer every question
|
|
sent in private email, hence the requirement that it be discussed on
|
|
c.l.py first.)
|
|
|
|
|
|
Implementation
|
|
==============
|
|
|
|
Essentially everything mentioned here is implemented in CVS and will be
|
|
released with Python 2.2a3; most of it was already released with Python 2.2a2.
|
|
|
|
|
|
Copyright
|
|
=========
|
|
|
|
This document has been placed in the public domain.
|