Completely reworked to be more specific about the motivation, the
implementation and the transition.
2001-07-26 19:29:39 +00:00 · 1862e5722d
parent 32a207d6aa
commit 1862e5722d
1 changed files with 273 additions and 77 deletions
--- a/pep-0238.txt
+++ b/pep-0238.txt
@@ -11,119 +11,315 @@ Post-History: 16-Mar-2001

 Abstract

-    Dividing integers currently returns the floor of the quantities.
-    This behavior is known as integer division, and is similar to what
-    C and FORTRAN do.  This has the useful property that all
-    operations on integers return integers, but it does tend to put a
-    hump in the learning curve when new programmers are surprised that
+    The current division (/) operator has an ambiguous meaning for
+    numerical arguments: it returns the floor of the mathematical
+    result if the arguments are ints or longs, but it returns a
+    reasonable approximation of the result if the arguments are floats
+    or complex.  This makes expressions expecting float or complex
+    results error-prone when integers are not expected but possible as
+    inputs.

-        1/2 == 0
+    We propose to fix this by introducing different operators for
+    different operations: x/y to return a reasonable approximation of
+    the mathematical result of the division ("true division"), x//y to
+    return the floor ("floor division").  We call the current, mixed
+    meaning of x/y "classic division".

-    This proposal shows a way to change this while keeping backward
-    compatibility issues in mind.
+    Because of severe backwards compatibility issues, not to mention a
+    major flamewar on c.l.py, we propose the following transitional
+    measures (starting with Python 2.2):

-    NOTE: in the light of recent discussions in the newsgroup, the
-    motivation in this PEP (and details) need to be extended.  I'll do
-    that ASAP.
+    - Classic division will remain the default in the Python 2.x
+      series; true division will be standard in Python 3.0.
+
+    - The // operator will be available to request floor division
+      unambiguously.
+
+    - The future division statement, spelled "from __future__ import
+      division", will change the / operator to mean true division
+      throughout the module.
+
+    - A command line option will enable run-time warnings for classic
+      division applied to int or long arguments; another command line
+      option will make true division the default.
+
+    - The standard library will use the future division statement and
+      the // operator when appropriate, so as to completely avoid
+      classic division.


-Rationale
+Motivation

-    The behavior of integer division is a major stumbling block found
-    in user testing of Python.  This manages to trip up new
-    programmers regularly and even causes the experienced programmer
-    to make the occasional mistake.  The workarounds, like explicitly
-    coercing one of the operands to float or use a non-integer
-    literal, are very non-intuitive and lower the readability of the
-    program.
+    The classic division operator makes it hard to write numerical
+    expressions that are supposed to give correct results from
+    arbitrary numerical inputs.  For all other operators, one can
+    write down a formula such as x*y**2 + z, and the calculated result
+    will be close to the mathematical result (within the limits of
+    numerical accuracy, of course) for any numerical input type (int,
+    long, float, or complex).  But division poses a problem: if the
+    expressions for both arguments happen to have an integral type, it
+    implements floor division rather than true division.
+
+    The problem is unique to dynamically typed languages: in a
+    statically typed language like C, the inputs, typically function
+    arguments, would be declared as double or float, and when a call
+    passes an integer argument, it is converted to double or float at
+    the time of the call.  Python doesn't have argument type
+    declarations, so integer arguments can easily find their way into
+    an expression.
+
+    The problem is particularly pernicious since ints are perfect
+    substitutes for floats in all other circumstances: math.sqrt(2)
+    returns the same value as math.sqrt(2.0), 3.14*100 and 3.14*100.0
+    return the same value, and so on.  Thus, the author of a numerical
+    routine may only use floating point numbers to test his code, and
+    believe that it works correctly, and a user may accidentally pass
+    in an integer input value and get incorrect results.
+
+    Another way to look at this is that classic division makes it
+    difficult to write polymorphic functions that work well with
+    either float or int arguments; all other operators already do the
+    right thing.  No algorithm that works for both ints and floats has
+    a need for truncating division in one case and true division in
+    the other.
+
+    The correct work-around is subtle: casting an argument to float()
+    is wrong if it could be a complex number; adding 0.0 to an
+    argument doesn't preserve the sign of the argument if it was minus
+    zero.  The only solution without either downside is multiplying an
+    argument (typically the first) by 1.0.  This leaves the value and
+    sign unchanged for float and complex, and turns int and long into
+    a float with the corresponding value.
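The three work-arounds compared above can be checked directly. The following is a minimal sketch, not part of the PEP text; the helper name is_negative_zero is hypothetical and exists only for this illustration.

```python
import math


def is_negative_zero(value):
    """Return True only for the float -0.0 (hypothetical helper)."""
    return value == 0.0 and math.copysign(1.0, value) < 0


minus_zero = -0.0

# Adding 0.0 loses the sign: -0.0 + 0.0 is +0.0 under IEEE 754
# round-to-nearest arithmetic.
added = minus_zero + 0.0

# Multiplying by 1.0 keeps the sign of minus zero.
multiplied = minus_zero * 1.0

# float() rejects complex numbers, while multiplying by 1.0 accepts them.
try:
    float(1 + 2j)
    float_accepts_complex = True
except TypeError:
    float_accepts_complex = False

complex_result = (1 + 2j) * 1.0

# An int operand becomes a float with the corresponding value.
int_result = 3 * 1.0
```

Only the multiplication by 1.0 passes all three checks, which matches the argument made in the paragraph.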
+
+    It is the opinion of the authors that this is a real design bug in
+    Python, and that it should be fixed sooner rather than later.
+    Assuming Python usage will continue to grow, the cost of leaving
+    this bug in the language will eventually outweigh the cost of
+    fixing old code -- there is an upper bound to the amount of code
+    to be fixed, but the amount of code that might be affected by the
+    bug in the future is unbounded.
+
+    Another reason for this change is the desire to ultimately unify
+    Python's numeric model.  This is the subject of PEP 228[0] (which
+    is currently incomplete).  A unified numeric model removes most of
+    the user's need to be aware of different numerical types.  This is
+    good for beginners, but also takes away concerns about different
+    numeric behavior for advanced programmers.  (Of course, it won't
+    remove concerns about numerical stability and accuracy.)
+
+    In a unified numeric model, the different types (int, long, float,
+    complex, and possibly others, such as a new rational type) serve
+    mostly as storage optimizations, and to some extent to indicate
+    orthogonal properties such as inexactness or complexity.  In a
+    unified model, the integer 1 should be indistinguishable from the
+    floating point number 1.0 (except for its inexactness), and both
+    should behave the same in all numeric contexts.  Clearly, in a
+    unified numeric model, if a==b and c==d, a/c should equal b/d
+    (taking some liberties due to rounding for inexact numbers), and
+    since everybody agrees that 1.0/2.0 equals 0.5, 1/2 should also
+    equal 0.5.  Likewise, since 1//2 equals zero, 1.0//2.0 should also
+    equal zero.


-// Operator
+Variations

-    A `//' operator which will be introduced, which will call the
-    nb_intdivide or __intdiv__ slots.  This operator will be
-    implemented in all the Python numeric types, and will have the
-    semantics of
+    Esthetically, x//y doesn't please everyone, and hence several
+    variations have been proposed: x div y, or div(x, y), sometimes in
+    combination with x mod y or mod(x, y) as an alternative spelling
+    for x%y.
+
+    We consider these solutions inferior, on the following grounds.
+
+    - Using x div y would introduce a new keyword.  Since div is a
+      popular identifier, this would break a fair amount of existing
+      code, unless the new keyword was only recognized under a future
+      division statement.  Since it is expected that the majority of
+      code that needs to be converted is dividing integers, this would
+      greatly increase the need for the future division statement.
+      Even with a future statement, the general sentiment against
+      adding new keywords unless absolutely necessary argues against
+      this.
+
+    - Using div(x, y) makes the conversion of old code much harder.
+      Replacing x/y with x//y or x div y can be done with a simple
+      query replace; in most cases the programmer can easily verify
+      that a particular module only works with integers so all
+      occurrences of x/y can be replaced.  (The query replace is still
+      needed to weed out slashes occurring in comments or string
+      literals.)  Replacing x/y with div(x, y) would require a much
+      more intelligent tool, since the extent of the expressions to
+      the left and right of the / must be analyzed before the
+      placement of the "div(" and ")" part can be decided.
+
+
+Alternatives
+
+    In order to reduce the amount of old code that needs to be
+    converted, several alternative proposals have been put forth.
+    Here is a brief discussion of each proposal (or category of
+    proposals).  If you know of an alternative that was discussed on
+    c.l.py that isn't mentioned here, please mail the second author.
+
+    - Let / keep its classic semantics; introduce // for true
+      division.  This doesn't solve the problem that the classic /
+      operator makes it hard to write polymorphic numeric functions
+      that accept int and float arguments, and still requires the use
+      of x*1.0/y whenever true division is required.
+
+    - Use a directive to use specific division semantics in a module,
+      rather than a future statement.  This retains classic division
+      as a permanent wart in the language, requiring future
+      generations of Python programmers to be aware of the problem and
+      the remedies.
+
+    - Use "from __past__ import division" to use classic division
+      semantics in a module.  This also retains the classic division
+      as a permanent wart, or at least for a long time (eventually the
+      past division statement could raise an ImportError).
+
+    - Use a directive (or some other way) to specify the Python
+      version for which a specific piece of code was developed.  This
+      requires future Python interpreters to be able to emulate
+      *exactly* every previous version of Python, and moreover to do
+      so for multiple versions in the same interpreter.  This is way
+      too much work.  A much simpler solution is to keep multiple
+      interpreters installed.
+
+
+Specification
+
+    During the transitional phase, we have to support *three* division
+    operators within the same program: classic division (for / in
+    modules without a future division statement), true division (for /
+    in modules with a future division statement), and floor division
+    (for //).  Each operator comes in two flavors: regular, and as an
+    augmented assignment operator (/= or //=).
+
+    The names associated with these variations are:
+
+    - Overloaded operator methods:
+
+      __div__(), __floordiv__(), __truediv__();
+
+      __idiv__(), __ifloordiv__(), __itruediv__().
+
+    - Abstract API C functions:
+
+      PyNumber_Divide(), PyNumber_FloorDivide(),
+      PyNumber_TrueDivide();
+
+      PyNumber_InPlaceDivide(), PyNumber_InPlaceFloorDivide(),
+      PyNumber_InPlaceTrueDivide().
+
+    - Byte code opcodes:
+
+      BINARY_DIVIDE, BINARY_FLOOR_DIVIDE, BINARY_TRUE_DIVIDE;
+
+      INPLACE_DIVIDE, INPLACE_FLOOR_DIVIDE, INPLACE_TRUE_DIVIDE.
+
+    - PyNumberMethod slots:
+
+      nb_divide, nb_floor_divide, nb_true_divide,
+
+      nb_inplace_divide, nb_inplace_floor_divide,
+      nb_inplace_true_divide.
+
+    The added PyNumberMethod slots require an additional flag in
+    tp_flags; this flag will be named Py_TPFLAGS_HAVE_NEWDIVIDE and
+    will be included in Py_TPFLAGS_DEFAULT.
+
+    The true and floor division APIs will look for the corresponding
+    slot and call it; when that slot is NULL, they will raise an
+    exception.  There is no fallback to the classic divide slot.
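As an illustration of the overloaded operator methods named above, here is a minimal sketch. It assumes an interpreter in which / means true division (Python 3, or a module using the future division statement); the classes Meters and FloorOnly are hypothetical and used only for this example.

```python
class Meters:
    """Hypothetical length type used only for illustration."""

    def __init__(self, value):
        self.value = value

    def __truediv__(self, other):
        # Called for x / y when / means true division.
        return Meters(self.value / other)

    def __floordiv__(self, other):
        # Called for x // y.
        return Meters(self.value // other)


class FloorOnly:
    """Hypothetical type that defines floor division only."""

    def __init__(self, value):
        self.value = value

    def __floordiv__(self, other):
        return FloorOnly(self.value // other)


length = Meters(7)
true_quotient = (length / 2).value
floor_quotient = (length // 2).value

# With no __truediv__ defined, true division raises an exception
# instead of falling back to another division method.
try:
    FloorOnly(7) / 2
    true_division_failed = False
except TypeError:
    true_division_failed = True
```

The FloorOnly case mirrors, at the Python level, the rule that a missing slot raises an exception rather than falling back.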
+
+
+Command Line Option
+
+    The -D command line option takes a string argument that can take
+    three values: "old", "warn", or "new".  The default is "old" in
+    Python 2.2 but will change to "warn" in later 2.x versions.  The
+    "old" value means the classic division operator acts as described.
+    The "warn" value means the classic division operator issues a
+    warning (a DeprecationWarning using the standard warning framework)
+    when applied to ints or longs.  The "new" value changes the
+    default globally so that the / operator is always interpreted as
+    true division.  The "new" option is only intended for use in
+    certain educational environments, where true division is required,
+    but asking the students to include the future division statement
+    in all their code would be a problem.
+
+    This option will not be supported in Python 3.0; Python 3.0 will
+    always interpret / as true division.
+
+
+Semantics of Floor Division
+
+    Floor division will be implemented in all the Python numeric
+    types, and will have the semantics of

        a // b == floor(a/b)

-    Except that the type of a//b will be the type a and b will be
+    except that the type of a//b will be the type a and b will be
    coerced into.  Specifically, if a and b are of the same type, a//b
-    will be of that type too. In the current Python 2.2 implementation,
-    this is implemented via the BINARY_INTDIVIDE opcode, and currently
-    does the right thing only for ints and longs (and other extension types 
-    which behave like integers).
+    will be of that type too.
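A short sketch of these floor division semantics follows. It is an illustration rather than part of the specification, and the variable names are chosen only for this example.

```python
import math

# Operands of the same type give a result of that type.
int_case = 7 // 2          # int operands, int result
float_case = 7.0 // 2.0    # float operands, float result

# Mixed operands are coerced to a common type first, so the
# result here is a float.
mixed_case = 7 // 2.0

# Floor division rounds toward negative infinity, not toward zero.
negative_case = -7 // 2

# For these small operands the value agrees with floor(a/b).
matches_floor = all(
    a // b == math.floor(a / b)
    for a in range(-20, 21)
    for b in (-3, -2, -1, 1, 2, 3)
)
```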


-Changing the Semantics of the / Operator
+Semantics of True Division

-    The nb_divide slot on integers (and long integers, if these are a
-    separate type, but see PEP 237 [1]) will issue a warning when given
-    integers a and b such that
-
-        a % b != 0
-
-    The warning will be off by default in the 2.2 release, and on by
-    default for in the next Python release, and will stay in effect
-    for 24 months.  The next Python release after 24 months, it will
-    implement
-
-        (a/b) * b = a (more or less)
-
-    The type of a/b will be either a float or a rational, depending on
-    other PEPs[2, 3]. However, the result will be integral in all case
-    the division has no remainder. This will not implemented in Python 2.2.
+    True division for ints and longs will convert the arguments to
+    float and then apply a float division.  That is, even 2/1 will
+    return a float (2.0), not an int.
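The rule can be seen in a few lines. This sketch assumes that / already means true division, that is, Python 3, or a module on Python 2.2 or later that begins with "from __future__ import division".

```python
# Assumption: "/" means true division in this module.
exact = 2 / 1    # a float, even though the quotient is a whole number
half = 1 / 2

# Int operands give the same value as the equivalent float operands.
same_as_float = (1 / 2) == (1.0 / 2.0)
```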


-__future__
+The Future Division Statement

-    See PEP 236[4] for the __future__ specific rules.
+    If "from __future__ import division" is present in a module, or if
+    -Dnew is used, the / and /= operators are translated to true
+    division opcodes; otherwise they are translated to classic
+    division (until Python 3.0 comes along, where they are always
+    translated to true division).

-    If "from __future__ import division" is present in the
-    module, until the IntType nb_divide is changed, the "/" operator
-    is compiled to add 0.0 to the divisor before doing the division.
-    This is not exactly the same as what will happen in the future,
-    since in the future, the result will be integral when it can.
-    This is done via the BINARY_FLOATDIVIDE opcode.
+    The future division statement has no effect on the recognition or
+    translation of // and //=.

-    In Python 2.2, unless the __future__ statement is present,
-    the '/' operator is compiled to DIVISION opcode, which is the
-    same
-    
-
-C API
-
-    There are four new C-level functions. PyNumber_IntDivide and 
-    PyNumber_FloatDivide correspond to the // operator and
-    the / operator with __future__ statement, respectively. 
-    PyNumber_InPlaceIntDivide and PyNumber_InPlaceIntDivide correspond
-    to the //= operator and to the /= operator with __future__ statement
-    respectively. Please refer to the discussion of the operator
-    for specification of these functions' behavior.
+    See PEP 236[4] for the general rules for future statements.


 FAQ

-    Should the // operator be renamed to "div"?
+    Q. How do I write code that works under the classic rules as well
+       as under the new rules without using // or a future division
+       statement?

-    No. There are problems with new keywords. 
+    A. Use x*1.0/y for true division, divmod(x, y)[0] for int
+       division.  Especially the latter is best hidden inside a
+       function.  You may also write float(x)/y for true division if
+       you are sure that you don't expect complex numbers.  If you
+       know your integers are never negative, you can use int(x/y) --
+       while the documentation of int() says that int() can round or
+       truncate depending on the C implementation, we know of no C
+       implementation that doesn't truncate, and we're going to change
+       the spec for int() to promise truncation.  Note that for
+       negative ints, classic division (and floor division) round
+       towards negative infinity, while int() rounds towards zero.
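The portable spellings from this answer can be wrapped in small functions, as the answer suggests. This is a sketch only; the names true_div and floor_div are hypothetical, and the last two lines assume / means true division, since under classic rules int(-7/2) would give -4 instead.

```python
def true_div(x, y):
    """True division that gives the same result under classic rules."""
    return x * 1.0 / y


def floor_div(x, y):
    """Floor division spelled without the // operator."""
    return divmod(x, y)[0]


half = true_div(1, 2)
complex_half = true_div(1 + 2j, 2)   # multiplying by 1.0 accepts complex

positive_floor = floor_div(7, 2)
negative_floor = floor_div(-7, 2)    # rounds toward negative infinity

# int() truncates toward zero, so it matches floor division only for
# non-negative quotients (assuming / is true division here).
truncated_positive = int(7 / 2)
truncated_negative = int(-7 / 2)
```

The negative case shows why int(x/y) is safe only when the integers are known never to be negative.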

-    Should the // be made into a function called "div"?
+    Q. Why is my question not listed here?

-    No. People expect to be able to write math expressions directly
-    in Python.
+    A. Because we weren't aware of it.  If you've discussed it on c.l.py,
+       please mail the second author.


 Implementation

-    A mostly-complete implementation (not exactly following the above
-    spec, but close enough except for the lack of a warning for
-    truncated results from old division) is available from the
-    SourceForge patch manager[5]
+    A very early implementation (not yet following the above spec)
+    is available from the SourceForge patch manager[5].


 References

+    [0] PEP 228, Reworking Python's Numeric Model
+        http://www.python.org/peps/pep-0228.html
+
    [1] PEP 237, Unifying Long Integers and Integers, Zadka,
        http://www.python.org/peps/pep-0237.html