reSTify PEP 237 (#273)

csabella 2017-06-01 20:51:29 -04:00 committed by Brett Cannon
parent 9cf4bde5e6
commit 5d3ac8912e
1 changed files with 228 additions and 243 deletions


@@ -5,52 +5,53 @@ Last-Modified: $Date$
Author: Moshe Zadka, Guido van Rossum Author: Moshe Zadka, Guido van Rossum
Status: Final Status: Final
Type: Standards Track Type: Standards Track
Content-Type: text/x-rst
Created: 11-Mar-2001 Created: 11-Mar-2001
Python-Version: 2.2 Python-Version: 2.2
Post-History: 16-Mar-2001, 14-Aug-2001, 23-Aug-2001 Post-History: 16-Mar-2001, 14-Aug-2001, 23-Aug-2001
Abstract Abstract
========
Python currently distinguishes between two kinds of integers Python currently distinguishes between two kinds of integers (ints): regular
(ints): regular or short ints, limited by the size of a C long or short ints, limited by the size of a C long (typically 32 or 64 bits), and
(typically 32 or 64 bits), and long ints, which are limited only long ints, which are limited only by available memory. When operations on
by available memory. When operations on short ints yield results short ints yield results that don't fit in a C long, they raise an error.
that don't fit in a C long, they raise an error. There are some There are some other distinctions too. This PEP proposes to do away with most
other distinctions too. This PEP proposes to do away with most of of the differences in semantics, unifying the two types from the perspective
the differences in semantics, unifying the two types from the of the Python user.
perspective of the Python user.
Rationale Rationale
=========
Many programs find a need to deal with larger numbers after the Many programs find a need to deal with larger numbers after the fact, and
fact, and changing the algorithms later is bothersome. It can changing the algorithms later is bothersome. It can hinder performance in the
hinder performance in the normal case, when all arithmetic is normal case, when all arithmetic is performed using long ints whether or not
performed using long ints whether or not they are needed. they are needed.
Having the machine word size exposed to the language hinders Having the machine word size exposed to the language hinders portability. For
portability. For examples Python source files and .pyc's are not examples Python source files and .pyc's are not portable between 32-bit and
portable between 32-bit and 64-bit machines because of this. 64-bit machines because of this.
There is also the general desire to hide unnecessary details from There is also the general desire to hide unnecessary details from the Python
the Python user when they are irrelevant for most applications. user when they are irrelevant for most applications. An example is memory
An example is memory allocation, which is explicit in C but allocation, which is explicit in C but automatic in Python, giving us the
automatic in Python, giving us the convenience of unlimited sizes convenience of unlimited sizes on strings, lists, etc. It makes sense to
on strings, lists, etc. It makes sense to extend this convenience extend this convenience to numbers.
to numbers.
It will give new Python programmers (whether they are new to It will give new Python programmers (whether they are new to programming in
programming in general or not) one less thing to learn before they general or not) one less thing to learn before they can start using the
can start using the language. language.
Implementation Implementation
==============
Initially, two alternative implementations were proposed (one by Initially, two alternative implementations were proposed (one by each author):
each author):
1. The PyInt type's slot for a C long will be turned into a 1. The ``PyInt`` type's slot for a C long will be turned into a::
union { union {
long i; long i;
@@ -60,241 +61,222 @@ Implementation
} bignum; } bignum;
}; };
Only the n-1 lower bits of the long have any meaning; the top Only the ``n-1`` lower bits of the ``long`` have any meaning; the top bit
bit is always set. This distinguishes the union. All PyInt is always set. This distinguishes the ``union``. All ``PyInt`` functions
functions will check this bit before deciding which types of will check this bit before deciding which types of operations to use.
operations to use.
2. The existing short and long int types remain, but operations 2. The existing short and long int types remain, but operations return
return a long int instead of raising OverflowError when a a long int instead of raising ``OverflowError`` when a result cannot be
result cannot be represented as a short int. A new type, represented as a short int. A new type, ``integer``, may be introduced
integer, may be introduced that is an abstract base type of that is an abstract base type of which both the ``int`` and ``long``
which both the int and long implementation types are implementation types are subclassed. This is useful so that programs can
subclassed. This is useful so that programs can check check integer-ness with a single test::
integer-ness with a single test:
if isinstance(i, integer): ... if isinstance(i, integer): ...
After some consideration, the second implementation plan was After some consideration, the second implementation plan was selected, since
selected, since it is far easier to implement, is backwards it is far easier to implement, is backwards compatible at the C API level, and
compatible at the C API level, and in addition can be implemented in addition can be implemented partially as a transitional measure.
partially as a transitional measure.
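The selected plan (alternative 2) can be illustrated with a small simulation. This is only a sketch, not the interpreter's implementation: modern Python has a single integer type, so the short int range is modelled by hand, and the names ``SHORT_BITS``, ``fits_short``, ``add_old`` and ``add_new`` are hypothetical, with a 32-bit C long assumed.

```python
# Hypothetical simulation of implementation alternative 2.
# Assumption: a C long is 32 bits wide (the PEP says "typically 32 or 64").
SHORT_BITS = 32
SHORT_MIN = -(1 << (SHORT_BITS - 1))
SHORT_MAX = (1 << (SHORT_BITS - 1)) - 1


def fits_short(value):
    """Return True if value is representable as a (simulated) short int."""
    return SHORT_MIN <= value <= SHORT_MAX


def add_old(x, y):
    """Behaviour before the PEP: an overflowing short int addition is an error."""
    result = x + y
    if not fits_short(result):
        raise OverflowError("integer addition")
    return ("int", result)


def add_new(x, y):
    """Behaviour under alternative 2: the result is promoted to a long int."""
    result = x + y
    kind = "int" if fits_short(result) else "long"
    return (kind, result)


print(add_new(1, 2))          # ('int', 3)
print(add_new(SHORT_MAX, 1))  # ('long', 2147483648)
```

The tuple tag stands in for the two implementation types, which remain distinct under this plan even though the arithmetic result is the same.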
Incompatibilities Incompatibilities
=================
The following operations have (usually subtly) different semantics The following operations have (usually subtly) different semantics for short
for short and for long integers, and one or the other will have to and for long integers, and one or the other will have to be changed somehow.
be changed somehow. This is intended to be an exhaustive list. This is intended to be an exhaustive list. If you know of any other operation
If you know of any other operation that differ in outcome that differ in outcome depending on whether a short or a long int with the same
depending on whether a short or a long int with the same value is value is passed, please write the second author.
passed, please write the second author.
- Currently, all arithmetic operators on short ints except << - Currently, all arithmetic operators on short ints except ``<<`` raise
raise OverflowError if the result cannot be represented as a ``OverflowError`` if the result cannot be represented as a short int. This
short int. This will be changed to return a long int instead. will be changed to return a long int instead. The following operators can
The following operators can currently raise OverflowError: x+y, currently raise ``OverflowError``: ``x+y``, ``x-y``, ``x*y``, ``x**y``,
x-y, x*y, x**y, divmod(x, y), x/y, x%y, and -x. (The last four ``divmod(x, y)``, ``x/y``, ``x%y``, and ``-x``. (The last four can only
can only overflow when the value -sys.maxint-1 is involved.) overflow when the value ``-sys.maxint-1`` is involved.)
- Currently, x<<n can lose bits for short ints. This will be - Currently, ``x<<n`` can lose bits for short ints. This will be changed to
changed to return a long int containing all the shifted-out return a long int containing all the shifted-out bits, if returning a short
bits, if returning a short int would lose bits (where changing int would lose bits (where changing sign is considered a special case of
sign is considered a special case of losing bits). losing bits).
- Currently, hex and oct literals for short ints may specify - Currently, hex and oct literals for short ints may specify negative values;
negative values; for example 0xffffffff == -1 on a 32-bit for example ``0xffffffff == -1`` on a 32-bit machine. This will be changed
machine. This will be changed to equal 0xffffffffL (2**32-1). to equal ``0xffffffffL`` (``2**32-1``).
- Currently, the '%u', '%x', '%X' and '%o' string formatting - Currently, the ``%u``, ``%x``, ``%X`` and ``%o`` string formatting operators
operators and the hex() and oct() built-in functions behave and the ``hex()`` and ``oct()`` built-in functions behave differently for
differently for negative numbers: negative short ints are negative numbers: negative short ints are formatted as unsigned C long,
formatted as unsigned C long, while negative long ints are while negative long ints are formatted with a minus sign. This will be
formatted with a minus sign. This will be changed to use the changed to use the long int semantics in all cases (but without the trailing
long int semantics in all cases (but without the trailing 'L' *L* that currently distinguishes the output of ``hex()`` and ``oct()`` for
that currently distinguishes the output of hex() and oct() for long ints). Note that this means that ``%u`` becomes an alias for ``%d``.
long ints). Note that this means that '%u' becomes an alias for It will eventually be removed.
'%d'. It will eventually be removed.
- Currently, repr() of a long int returns a string ending in 'L' - Currently, ``repr()`` of a long int returns a string ending in *L* while
while repr() of a short int doesn't. The 'L' will be dropped; ``repr()`` of a short int doesn't. The *L* will be dropped; but not before
but not before Python 3.0. Python 3.0.
- Currently, an operation with long operands will never return a - Currently, an operation with long operands will never return a short int.
short int. This *may* change, since it allows some This *may* change, since it allows some optimization. (No changes have been
optimization. (No changes have been made in this area yet, and made in this area yet, and none are planned.)
none are planned.)
- The expression type(x).__name__ depends on whether x is a short - The expression ``type(x).__name__`` depends on whether *x* is a short or a
or a long int. Since implementation alternative 2 is chosen, long int. Since implementation alternative 2 is chosen, this difference
this difference will remain. (In Python 3.0, we *may* be able will remain. (In Python 3.0, we *may* be able to deploy a trick to hide the
to deploy a trick to hide the difference, because it *is* difference, because it *is* annoying to reveal the difference to user code,
annoying to reveal the difference to user code, and more so as and more so as the difference between the two types is less visible.)
the difference between the two types is less visible.)
- Long and short ints are handled different by the marshal module, - Long and short ints are handled different by the ``marshal`` module, and by
and by the pickle and cPickle modules. This difference will the ``pickle`` and ``cPickle`` modules. This difference will remain (at
remain (at least until Python 3.0). least until Python 3.0).
- Short ints with small values (typically between -1 and 99 - Short ints with small values (typically between -1 and 99 inclusive) are
inclusive) are "interned" -- whenever a result has such a value, *interned* -- whenever a result has such a value, an existing short int with
an existing short int with the same value is returned. This is the same value is returned. This is not done for long ints with the same
not done for long ints with the same values. This difference values. This difference will remain. (Since there is no guarantee of this
will remain. (Since there is no guarantee of this interning, it interning, it is debatable whether this is a semantic difference -- but code
is debatable whether this is a semantic difference -- but code may exist that uses ``is`` for comparisons of short ints and happens to work
may exist that uses 'is' for comparisons of short ints and because of this interning. Such code may fail if used with long ints.)
happens to work because of this interning. Such code may fail
if used with long ints.)
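Two of the items above (hex literals and left shifts) come down to whether a bit pattern is reinterpreted as a signed C long. A rough sketch of the difference, assuming a 32-bit C long; ``as_signed`` is a hypothetical helper that mimics the old short int behaviour and is not part of any Python API:

```python
def as_signed(value, bits=32):
    """Reinterpret the low `bits` bits of value as a two's-complement integer.

    This mimics how the old short int semantics treated a bit pattern
    (assumption: a C long is `bits` wide).
    """
    value &= (1 << bits) - 1
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


# Hex literal: old short int reading versus the unified reading.
print(as_signed(0xffffffff))  # -1          (old, 32-bit machine)
print(0xffffffff)             # 4294967295  (new, 2**32-1)

# Left shift: the old semantics lose the sign, the new ones keep every bit.
print(as_signed(1 << 31))     # -2147483648 (old)
print(1 << 31)                # 2147483648  (new)

# Formatting of negative numbers with the long int semantics.
print(hex(-1))                # -0x1
print("%x" % -1)              # -1
```

The last two lines show the long int formatting rule (a minus sign rather than an unsigned C long), which is how current Python versions behave.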
Literals Literals
========
A trailing 'L' at the end of an integer literal will stop having A trailing *L* at the end of an integer literal will stop having any
any meaning, and will be eventually become illegal. The compiler meaning, and will be eventually become illegal. The compiler will choose the
will choose the appropriate type solely based on the value. appropriate type solely based on the value. (Until Python 3.0, it will force
(Until Python 3.0, it will force the literal to be a long; but the literal to be a long; but literals without a trailing *L* may also be
literals without a trailing 'L' may also be long, if they are not long, if they are not representable as short ints.)
representable as short ints.)
Built-in Functions Built-in Functions
==================
The function int() will return a short or a long int depending on The function ``int()`` will return a short or a long int depending on the
the argument value. In Python 3.0, the function long() will call argument value. In Python 3.0, the function ``long()`` will call the function
the function int(); before then, it will continue to force the ``int()``; before then, it will continue to force the result to be a long int,
result to be a long int, but otherwise work the same way as int(). but otherwise work the same way as ``int()``. The built-in name ``long`` will
The built-in name 'long' will remain in the language to represent remain in the language to represent the long implementation type (unless it is
the long implementation type (unless it is completely eradicated completely eradicated in Python 3.0), but using the ``int()`` function is
in Python 3.0), but using the int() function is still recommended, still recommended, since it will automatically return a long when needed.
since it will automatically return a long when needed.
C API C API
=====
The C API remains unchanged; C code will still need to be aware of The C API remains unchanged; C code will still need to be aware of the
the difference between short and long ints. (The Python 3.0 C API difference between short and long ints. (The Python 3.0 C API will probably
will probably be completely incompatible.) be completely incompatible.)
The PyArg_Parse*() APIs already accept long ints, as long as they The ``PyArg_Parse*()`` APIs already accept long ints, as long as they are
are within the range representable by C ints or longs, so that within the range representable by C ints or longs, so that functions taking C
functions taking C int or long argument won't have to worry about int or long argument won't have to worry about dealing with Python longs.
dealing with Python longs.
Transition Transition
==========
There are three major phases to the transition: There are three major phases to the transition:
A. Short int operations that currently raise OverflowError return 1. Short int operations that currently raise ``OverflowError`` return a long
a long int value instead. This is the only change in this int value instead. This is the only change in this phase. Literals will
phase. Literals will still distinguish between short and long still distinguish between short and long ints. The other semantic
ints. The other semantic differences listed above (including differences listed above (including the behavior of ``<<``) will remain.
the behavior of <<) will remain. Because this phase only Because this phase only changes situations that currently raise
changes situations that currently raise OverflowError, it is ``OverflowError``, it is assumed that this won't break existing code.
assumed that this won't break existing code. (Code that (Code that depends on this exception would have to be too convoluted to be
depends on this exception would have to be too convoluted to be concerned about it.) For those concerned about extreme backwards
concerned about it.) For those concerned about extreme compatibility, a command line option (or a call to the warnings module)
backwards compatibility, a command line option (or a call to will allow a warning or an error to be issued at this point, but this is
the warnings module) will allow a warning or an error to be off by default.
issued at this point, but this is off by default.
B. The remaining semantic differences are addressed. In all cases 2. The remaining semantic differences are addressed. In all cases the long
the long int semantics will prevail. Since this will introduce int semantics will prevail. Since this will introduce backwards
backwards incompatibilities which will break some old code, incompatibilities which will break some old code, this phase may require a
this phase may require a future statement and/or warnings, and future statement and/or warnings, and a prolonged transition phase. The
a prolonged transition phase. The trailing 'L' will continue trailing *L* will continue to be used for longs as input and by
to be used for longs as input and by repr(). ``repr()``.
C. The trailing 'L' is dropped from repr(), and made illegal on A. Warnings are enabled about operations that will change their numeric
input. (If possible, the 'long' type completely disappears.) outcome in stage 2B, in particular ``hex()`` and ``oct()``, ``%u``,
The trailing 'L' is also dropped from hex() and oct(). ``%x``, ``%X`` and ``%o``, ``hex`` and ``oct`` literals in the
(inclusive) range ``[sys.maxint+1, sys.maxint*2+1]``, and left shifts
losing bits.
B. The new semantic for these operations are implemented. Operations that
give different results than before will *not* issue a warning.
Phase A will be implemented in Python 2.2. 3. The trailing *L* is dropped from ``repr()``, and made illegal on input.
(If possible, the ``long`` type completely disappears.) The trailing *L*
is also dropped from ``hex()`` and ``oct()``.
Phase B will be implemented gradually in Python 2.3 and Python Phase 1 will be implemented in Python 2.2.
2.4. Envisioned stages of phase B:
B0. Warnings are enabled about operations that will change their Phase 2 will be implemented gradually, with 2A in Python 2.3 and 2B in
numeric outcome in stage B1, in particular hex() and oct(), Python 2.4.
'%u', '%x', '%X' and '%o', hex and oct literals in the
(inclusive) range [sys.maxint+1, sys.maxint*2+1], and left
shifts losing bits.
B1. The new semantic for these operations are implemented. Phase 3 will be implemented in Python 3.0 (at least two years after Python 2.4
Operations that give different results than before will *not* is released).
issue a warning.
We propose the following timeline:
B0. Python 2.3.
B1. Python 2.4.
Phase C will be implemented in Python 3.0 (at least two years
after Python 2.4 is released).
OverflowWarning OverflowWarning
===============
Here are the rules that guide warnings generated in situations Here are the rules that guide warnings generated in situations that currently
that currently raise OverflowError. This applies to transition raise ``OverflowError``. This applies to transition phase 1. Historical
phase A. Historical note: despite that phase A was completed in note: despite that phase 1 was completed in Python 2.2, and phase 2A in Python
Python 2.2, and phase B0 in Python 2.3, nobody noticed that 2.3, nobody noticed that OverflowWarning was still generated in Python 2.3.
OverflowWarning was still generated in Python 2.3. It was finally It was finally disabled in Python 2.4. The Python builtin
disabled in Python 2.4. The Python builtin OverflowWarning, and ``OverflowWarning``, and the corresponding C API ``PyExc_OverflowWarning``,
the corresponding C API PyExc_OverflowWarning, are no longer are no longer generated or used in Python 2.4, but will remain for the
generated or used in Python 2.4, but will remain for the (unlikely) (unlikely) case of user code until Python 2.5.
case of user code until Python 2.5.
- A new warning category is introduced, OverflowWarning. This is - A new warning category is introduced, ``OverflowWarning``. This is a
a built-in name. built-in name.
- If an int result overflows, an OverflowWarning warning is - If an int result overflows, an ``OverflowWarning`` warning is issued, with a
issued, with a message argument indicating the operation, message argument indicating the operation, e.g. "integer addition". This
e.g. "integer addition". This may or may not cause a warning may or may not cause a warning message to be displayed on ``sys.stderr``, or
message to be displayed on sys.stderr, or may cause an exception may cause an exception to be raised, all under control of the ``-W`` command
to be raised, all under control of the -W command line and the line and the warnings module.
warnings module.
- The OverflowWarning warning is ignored by default. - The ``OverflowWarning`` warning is ignored by default.
- The OverflowWarning warning can be controlled like all warnings, - The ``OverflowWarning`` warning can be controlled like all warnings, via the
via the -W command line option or via the ``-W`` command line option or via the ``warnings.filterwarnings()`` call.
warnings.filterwarnings() call. For example: For example::
python -Wdefault::OverflowWarning python -Wdefault::OverflowWarning
cause the OverflowWarning to be displayed the first time it cause the ``OverflowWarning`` to be displayed the first time it occurs at a
occurs at a particular source line, and particular source line, and::
python -Werror::OverflowWarning python -Werror::OverflowWarning
cause the OverflowWarning to be turned into an exception cause the ``OverflowWarning`` to be turned into an exception whenever it
whenever it happens. The following code enables the warning happens. The following code enables the warning from inside the program::
from inside the program:
import warnings import warnings
warnings.filterwarnings("default", "", OverflowWarning) warnings.filterwarnings("default", "", OverflowWarning)
See the python man page for the -W option and the warnings See the python ``man`` page for the ``-W`` option and the ``warnings``
module documentation for filterwarnings(). module documentation for ``filterwarnings()``.
- If the OverflowWarning warning is turned into an error, - If the ``OverflowWarning`` warning is turned into an error,
OverflowError is substituted. This is needed for backwards ``OverflowError`` is substituted. This is needed for backwards
compatibility. compatibility.
- Unless the warning is turned into an exceptions, the result of - Unless the warning is turned into an exceptions, the result of the operation
the operation (e.g., x+y) is recomputed after converting the (e.g., ``x+y``) is recomputed after converting the arguments to long ints.
arguments to long ints.
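The rules above can be sketched with the ``warnings`` module. ``OverflowWarning`` no longer exists as a built-in, so the class below is a stand-in defined only for this illustration; ``checked_add`` and the 32-bit limits are likewise hypothetical.

```python
import warnings


class OverflowWarning(Warning):
    """Stand-in for the former built-in category (assumption for this sketch)."""


# Assumption: a 32-bit C long.
SHORT_MIN, SHORT_MAX = -2**31, 2**31 - 1


def checked_add(x, y):
    """Add two simulated short ints following the OverflowWarning rules."""
    result = x + y
    if not (SHORT_MIN <= result <= SHORT_MAX):
        try:
            # Ignored, displayed, or raised, depending on the active filters.
            warnings.warn("integer addition", OverflowWarning, stacklevel=2)
        except OverflowWarning as exc:
            # A warning turned into an error is replaced by OverflowError.
            raise OverflowError(str(exc)) from None
    # Otherwise the (long) result is returned.
    return result


with warnings.catch_warnings():
    warnings.simplefilter("ignore", OverflowWarning)
    print(checked_add(SHORT_MAX, 1))  # 2147483648

with warnings.catch_warnings():
    warnings.simplefilter("error", OverflowWarning)
    try:
        checked_add(SHORT_MAX, 1)
    except OverflowError as exc:
        print("OverflowError:", exc)  # OverflowError: integer addition
```

``warnings.simplefilter("error", ...)`` plays the role of ``-Werror::OverflowWarning`` here, scoped to a ``with`` block so the example does not change global filter state.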
Example Example
=======
If you pass a long int to a C function or built-in operation that If you pass a long int to a C function or built-in operation that takes an
takes an integer, it will be treated the same as a short int as integer, it will be treated the same as a short int as long as the value fits
long as the value fits (by virtue of how PyArg_ParseTuple() is (by virtue of how ``PyArg_ParseTuple()`` is implemented). If the long value
implemented). If the long value doesn't fit, it will still raise doesn't fit, it will still raise an ``OverflowError``. For example::
an OverflowError. For example:
def fact(n): def fact(n):
if n <= 1: if n <= 1:
@@ -305,50 +287,53 @@ Example
n = input("Gimme an int: ") n = input("Gimme an int: ")
print A[fact(n)%17] print A[fact(n)%17]
For n >= 13, this currently raises OverflowError (unless the user For ``n >= 13``, this currently raises ``OverflowError`` (unless the user
enters a trailing 'L' as part of their input), even though the enters a trailing *L* as part of their input), even though the calculated
calculated index would always be in range(17). With the new index would always be in ``range(17)``. With the new approach this code will
approach this code will do the right thing: the index will be do the right thing: the index will be calculated as a long int, but its value
calculated as a long int, but its value will be in range. will be in range.
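A version of this example that runs on current Python is sketched below. The hunk above omits part of the original listing, so the recursive step of ``fact`` and the 17-element table ``A`` are assumptions made for illustration, and the interactive ``input()`` call is replaced by a fixed value.

```python
def fact(n):
    """Factorial; the recursive step is assumed, as the diff elides it."""
    if n <= 1:
        return 1
    return n * fact(n - 1)


# Hypothetical 17-element table to index into.
A = "ABCDEFGHIJKLMNOPQ"

n = 13
# fact(13) exceeds a 32-bit C long, which is where the old code overflowed.
print(fact(n) > 2**31 - 1)  # True
# The index is still within range(17), so the lookup succeeds.
print(fact(n) % 17)         # 3
print(A[fact(n) % 17])      # D
```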
Resolved Issues Resolved Issues
===============
These issues, previously open, have been resolved. These issues, previously open, have been resolved.
- hex() and oct() applied to longs will continue to produce a - ``hex()`` and ``oct()`` applied to longs will continue to produce a trailing
trailing 'L' until Python 3000. The original text above wasn't *L* until Python 3000. The original text above wasn't clear about this,
clear about this, but since it didn't happen in Python 2.4 it but since it didn't happen in Python 2.4 it was thought better to leave it
was thought better to leave it alone. BDFL pronouncement here: alone. BDFL pronouncement here:
http://mail.python.org/pipermail/python-dev/2006-June/065918.html http://mail.python.org/pipermail/python-dev/2006-June/065918.html
- What to do about sys.maxint? Leave it in, since it is still - What to do about ``sys.maxint``? Leave it in, since it is still relevant
relevant whenever the distinction between short and long ints is whenever the distinction between short and long ints is still relevant (e.g.
still relevant (e.g. when inspecting the type of a value). when inspecting the type of a value).
- Should we remove '%u' completely? Remove it. - Should we remove ``%u`` completely? Remove it.
- Should we warn about << not truncating integers? Yes. - Should we warn about ``<<`` not truncating integers? Yes.
- Should the overflow warning be on a portable maximum size? No. - Should the overflow warning be on a portable maximum size? No.
Implementation Implementation
==============
The implementation work for the Python 2.x line is completed; The implementation work for the Python 2.x line is completed; phase 1 was
phase A was released with Python 2.2, phase B0 with Python 2.3, released with Python 2.2, phase 2A with Python 2.3, and phase 2B will be
and phase B1 will be released with Python 2.4 (and is already in released with Python 2.4 (and is already in CVS).
CVS).
Copyright Copyright
=========
This document has been placed in the public domain. This document has been placed in the public domain.
Local Variables: ..
mode: indented-text Local Variables:
indent-tabs-mode: nil mode: indented-text
End: indent-tabs-mode: nil
End: