PEP: 3127
Title: Integer Literal Support and Syntax
Version: $Revision$
Last-Modified: $Date$
Author: Patrick Maupin <pmaupin@gmail.com>
Discussions-To: Python-3000@python.org
Status: Accepted
Type: Standards Track
Content-Type: text/x-rst
Created: 14-Mar-2007
Python-Version: 3.0
Post-History: 18-Mar-2007


Abstract
========

This PEP proposes changes to the Python core to rationalize
the treatment of string literal representations of integers
in different radices (bases). These changes are targeted at
Python 3.0, but the backward-compatible parts of the changes
should be added to Python 2.6, so that all valid 3.0 integer
literals will also be valid in 2.6.

The proposal is that:

   a) octal literals must now be specified
      with a leading "0o" or "0O" instead of "0";

   b) binary literals are now supported via a
      leading "0b" or "0B"; and

   c) provision will be made for binary numbers in
      string formatting.

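As a minimal illustration of the three points above, the following
sketch uses the syntax as it exists in Python 3. The variable names
are illustrative only.

```python
# Octal literals need an explicit "0o" (or "0O") prefix.
mode = 0o17           # decimal 15

# Binary literals use a "0b" (or "0B") prefix.
flags = 0b101         # decimal 5

# Binary output via the format() built-in (PEP 3101 style).
flags_text = format(flags, "b")

print(mode, flags, flags_text)   # prints: 15 5 101
```
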
Motivation
==========

This PEP was motivated by two different issues:

- The default octal representation of integers is silently confusing
  to people unfamiliar with C-like languages. It is extremely easy
  to inadvertently create an integer object with the wrong value,
  because '013' means 'decimal 11', not 'decimal 13', to the Python
  language itself, which is not the meaning that most humans would
  assign to this literal.

- Some Python users have a strong desire for binary support in
  the language.

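The first issue can be reproduced without the old literal syntax by
parsing the same digit string in both bases. This is only an
illustration, and the variable names are made up.

```python
digits = "013"

as_octal = int(digits, 8)     # how the old literal syntax read it
as_decimal = int(digits, 10)  # how most people would read it

print(as_octal, as_decimal)   # prints: 11 13
```
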
Specification
=============

Grammar specification
---------------------

The grammar will be changed. For Python 2.6, the changed and
new token definitions will be::

    integer        ::=     decimalinteger | octinteger | hexinteger |
                           bininteger | oldoctinteger

    octinteger     ::=     "0" ("o" | "O") octdigit+

    bininteger     ::=     "0" ("b" | "B") bindigit+

    oldoctinteger  ::=     "0" octdigit+

    bindigit       ::=     "0" | "1"

For Python 3.0, "oldoctinteger" will not be supported, and
an exception will be raised if a literal has a leading "0" and
a second character which is a digit.

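One way to read the Python 3.0 rules is as a set of regular
expressions. The sketch below is an illustration under stated
assumptions, not the tokenizer's actual implementation: the
``decimalinteger`` and ``hexinteger`` patterns are assumed from the
pre-existing grammar rather than taken from this PEP, and the names
``INTEGER_30`` and ``is_valid_30_literal`` are hypothetical.

```python
import re

# Token patterns for Python 3.0 integer literals, following the grammar above.
# "oldoctinteger" is deliberately absent.
OCTINTEGER = r"0[oO][0-7]+"
BININTEGER = r"0[bB][01]+"
HEXINTEGER = r"0[xX][0-9a-fA-F]+"      # assumed: unchanged from the old grammar
DECIMALINTEGER = r"[1-9][0-9]*|0"      # assumed: nonzero first digit, or a lone "0"

INTEGER_30 = re.compile(
    "(?:%s)" % "|".join([OCTINTEGER, BININTEGER, HEXINTEGER, DECIMALINTEGER])
)


def is_valid_30_literal(text):
    """Return True if text is an integer literal under the 3.0 rules sketched here."""
    return INTEGER_30.fullmatch(text) is not None


for candidate in ["0o17", "0B101", "0x1F", "42", "0", "017", "0o8", "0b2"]:
    print(candidate, is_valid_30_literal(candidate))
```

Released Python 3 versions accept a few forms that this sketch
rejects, such as "00" and underscores between digits, so it should be
read as a rendering of the rules stated here rather than of any
particular interpreter.
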
For both versions, this will require changes to PyLong_FromString
as well as the grammar.

The documentation will have to be changed as well: grammar.txt,
as well as the integer literal section of the reference manual.

PEP 306 should be checked for other issues, and that PEP should
be updated if the procedure described therein is insufficient.

int() specification
--------------------

int(s, 0) will also match the new grammar definition.

This should happen automatically with the changes to
PyLong_FromString required for the grammar change.

Also the documentation for int() should be changed to explain
that int(s) operates identically to int(s, 10), and the word
"guess" should be removed from the description of int(s, 0).

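The behavior described above can be checked directly on Python 3.
This is a small sketch, and ``parse_literal`` is a hypothetical helper
name, not part of the standard library.

```python
def parse_literal(text):
    """Parse text with literal rules (base 0); return None if it is rejected."""
    try:
        return int(text, 0)
    except ValueError:
        return None


# Base 0 follows the literal grammar, including the new prefixes.
print(parse_literal("0o17"))   # 15
print(parse_literal("0b101"))  # 5
print(parse_literal("0x1f"))   # 31
print(parse_literal("017"))    # None: old-style octal is rejected

# int(s) behaves like int(s, 10), so a leading zero is harmless there.
print(int("017"), int("017", 10))  # 17 17
```
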
long() specification
--------------------

For Python 2.6, the long() implementation and documentation
should be changed to reflect the new grammar.

Tokenizer exception handling
----------------------------

If an invalid token contains a leading "0", the exception
error message should be more informative than the current
"SyntaxError: invalid token". It should explain that decimal
numbers may not have a leading zero, and that octal numbers
require an "o" after the leading zero.

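To see what the tokenizer reports for an old-style literal on a given
interpreter, one can compile a small snippet and inspect the
exception. The exact wording of the message varies between Python
versions, so the sketch below prints it rather than assuming it.
``literal_error`` is a hypothetical helper.

```python
def literal_error(source):
    """Return the SyntaxError message for source, or None if it compiles."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError as exc:
        return exc.msg
    return None


print(literal_error("0o17"))  # None: valid octal literal
print(literal_error("017"))   # interpreter-specific message about the leading zero
```
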
int() exception handling
------------------------

The ValueError raised for any call to int() with a string
should at least explicitly contain the base in the error
message, e.g.::

    ValueError: invalid literal for base 8 int(): 09

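A quick way to check this on a particular interpreter is to trigger
the error and look at the message. The sketch prints whatever the
interpreter produces. The wording in current CPython releases differs
slightly from the example above, so no exact text is assumed here, and
``int_error_message`` is a hypothetical helper.

```python
def int_error_message(text, base):
    """Return the ValueError message from int(text, base), or None on success."""
    try:
        int(text, base)
    except ValueError as exc:
        return str(exc)
    return None


message = int_error_message("09", 8)
print(message)

# The base should appear somewhere in the message.
print("8" in message)
```
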
oct() function
---------------

oct() should be updated to output '0o' in front of
the octal digits (for 3.0, and 2.6 compatibility mode).

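For example, on Python 3 the result of oct() can be fed back through
int(s, 0). A brief sketch:

```python
text = oct(8)
print(text)            # prints: 0o10

# The output round-trips through the literal-style parser.
print(int(text, 0))    # prints: 8
```
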
Output formatting
-----------------

The string (and unicode in 2.6) % operator will have
a 'b' format specifier added for binary, and the alternate
syntax of the 'o' option will need to be updated to
add '0o' in front, instead of '0'.

PEP 3101 already supports 'b' for binary output.

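The sketch below shows the alternate octal form with the % operator
and binary output through PEP 3101 style formatting, as they behave on
Python 3. It does not exercise a 'b' code on the % operator, since
support for that is not assumed here.

```python
value = 8

# Alternate form of the 'o' conversion prefixes the digits with '0o'.
print("%o" % value)             # prints: 10
print("%#o" % value)            # prints: 0o10

# PEP 3101 formatting supports 'b', with '#' adding the '0b' prefix.
print(format(5, "b"))           # prints: 101
print("{0:#b}".format(5))       # prints: 0b101
```
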
Transition from 2.6 to 3.0
---------------------------

The 2to3 translator will have to insert 'o' into any
octal string literal.

The Py3K compatible option to Python 2.6 should cause
attempts to use oldoctinteger literals to raise an
exception.

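The rewrite itself is simple to picture. The following is a rough
sketch of the idea using a regular expression. It is not how 2to3 is
implemented, ``add_octal_prefix`` is a hypothetical name, and the
pattern ignores complications such as digits inside string literals or
comments and Python 2 long suffixes.

```python
import re

# A "0" followed by octal digits, not embedded in a name, float, or longer number.
OLD_OCTAL = re.compile(r"(?<![\w.])0([0-7]+)(?![\w.])")


def add_octal_prefix(source):
    """Rewrite old-style octal literals such as 0755 to the 0o755 form."""
    return OLD_OCTAL.sub(r"0o\1", source)


print(add_octal_prefix("os.chmod(path, 0755)"))   # os.chmod(path, 0o755)
print(add_octal_prefix("x = 10 + 0x10 + 1.07"))   # unchanged
```

A real translator would work on tokens rather than raw text, which
avoids touching strings and comments.
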
Rationale
=========

Most of the discussion on these issues occurred on the Python-3000
mailing list starting 14-Mar-2007, prompted by an observation that
the average human being would be completely mystified upon finding
that prepending a "0" to a string of digits changes the meaning of
that digit string entirely.

It was pointed out during this discussion that a similar, but shorter,
discussion on the subject occurred in January of 2006, prompted by a
discovery of the same issue.

Background
----------

For historical reasons, Python's string representation of integers
in different bases (radices), for string formatting and token
literals, borrows heavily from C. [1]_ [2]_ Usage has shown that
the historical method of specifying an octal number is confusing,
and also that it would be nice to have additional support for binary
literals.

Throughout this document, unless otherwise noted, discussions about
the string representation of integers relate to these features:

- Literal integer tokens, as used by normal module compilation,
  by eval(), and by int(token, 0). (int(token) and int(token, 2-36)
  are not modified by this proposal.)

  * Under 2.6, long() is treated the same as int()

- Formatting of integers into strings, either via the % string
  operator or the new PEP 3101 advanced string formatting method.

It is presumed that:

- All of these features should have an identical set
  of supported radices, for consistency.

- Python source code syntax and int(mystring, 0) should
  continue to share identical behavior.

Removal of old octal syntax
----------------------------

This PEP proposes that the ability to specify an octal number by
using a leading zero will be removed from the language in Python 3.0
(and the Python 3.0 preview mode of 2.6), and that a SyntaxError will
be raised whenever a leading "0" is immediately followed by another
digit.

During the present discussion, it was almost universally agreed that::

    eval('010') == 8

should no longer be true, because that is confusing to new users.
It was also proposed that::

    eval('0010') == 10

should become true, but that is much more contentious, because it is so
inconsistent with usage in other computer languages that mistakes are
likely to be made.

Almost all currently popular computer languages, including C/C++,
Java, Perl, and JavaScript, treat a sequence of digits with a
leading zero as an octal number. Proponents of treating these
numbers as decimal instead have a very valid point -- as discussed
in `Supported radices`_, below, the entire non-computer world uses
decimal numbers almost exclusively. There is ample anecdotal
evidence that many people are dismayed and confused if they
are confronted with non-decimal radices.

However, in most situations, most people do not write gratuitous
zeros in front of their decimal numbers. The primary exception is
when an attempt is being made to line up columns of numbers. But
since PEP 8 specifically discourages the use of spaces to try to
align Python code, one would suspect the same argument should apply
to the use of leading zeros for the same purpose.

Finally, although the email discussion often focused on whether anybody
actually *uses* octal any more, and whether we should cater to those
old-timers in any case, that is almost entirely beside the point.

Assume the rare complete newcomer to computing who *does*, either
occasionally or as a matter of habit, use leading zeros for decimal
numbers. Python could either:

a) silently do the wrong thing with his numbers, as it does now;

b) immediately disabuse him of the notion that this is viable syntax
   (and yes, the SyntaxWarning should be more gentle than it
   currently is, but that is a subject for a different PEP); or

c) let him continue to think that computers are happy with
   multi-digit decimal integers which start with "0".

Some people passionately believe that (c) is the correct answer,
and they would be absolutely right if we could be sure that new
users will never blossom and grow and start writing AJAX applications.

So while a new Python user may (currently) be mystified at the
delayed discovery that his numbers don't work properly, we can
fix it by explaining to him immediately that Python doesn't like
leading zeros (hopefully with a reasonable message!), or we can
delegate this teaching experience to the JavaScript interpreter
in the Internet Explorer browser, and let him try to debug his
issue there.

Supported radices
-----------------

This PEP proposes that the supported radices for the Python
language will be 2, 8, 10, and 16.

Once it is agreed that the old syntax for octal (radix 8) representation
of integers must be removed from the language, the next obvious
question is "Do we actually need a way to specify (and display)
numbers in octal?"

This question is quickly followed by "What radices does the language
need to support?" Because computers are so adept at doing what you
tell them to, a tempting answer in the discussion was "all of them."
This answer has obviously been given before -- the int() constructor
will accept an explicit radix with a value between 2 and 36, inclusive,
with the latter number bearing a suspicious arithmetic similarity to
the sum of the number of numeric digits and the number of same-case
letters in the ASCII alphabet.

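The arithmetic behind that upper limit is easy to verify. A small
sketch using the standard ``string`` module:

```python
import string

# Ten digits plus twenty-six same-case letters give thirty-six symbols.
symbols = string.digits + string.ascii_lowercase
print(len(symbols))            # prints: 36

# The last symbol is the largest digit in base 36.
print(int("z", 36))            # prints: 35
print(int("10", 36))           # prints: 36
```
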
But the best argument for inclusion will have a use-case to back
it up, so the idea of supporting all radices was quickly rejected,
and the only radices left with any real support were decimal,
hexadecimal, octal, and binary.

Just because a particular radix has a vocal supporter on the
mailing list does not mean that it really should be in the
language, so the rest of this section is a treatise on the
utility of these particular radices, vs. other possible choices.

Humans use other numeric bases constantly. If I tell you that
it is 12:30 PM, I have communicated quantitative information
arguably composed of *three* separate bases (12, 60, and 2),
only one of which is in the "agreed" list above. But the
*communication* of that information used two decimal digits
each for the base 12 and base 60 information, and, perversely,
two letters for information which could have fit in a single
decimal digit.

So, in general, humans communicate "normal" (non-computer)
numerical information either via names (AM, PM, January, ...)
or via use of decimal notation. Obviously, names are
seldom used for large sets of items, so decimal is used for
everything else. There are studies which attempt to explain
why this is so, typically reaching the expected conclusion
that the Arabic numeral system is well-suited to human
cognition. [3]_

There is even support in the history of the design of
computers to indicate that decimal notation is the correct
way for computers to communicate with humans. One of
the first modern computers, ENIAC [4]_, computed in decimal,
even though there were already existing computers which
operated in binary.

Decimal computer operation was important enough
that many computers, including the ubiquitous PC, have
instructions designed to operate on "binary coded decimal"
(BCD) [5]_, a representation which devotes 4 bits to each
decimal digit. These instructions date from a time when the
most strenuous calculations ever performed on many numbers
were the calculations actually required to perform textual
I/O with them. It is possible to display BCD without having
to perform a divide/remainder operation on every displayed
digit, and this was a huge computational win when most
hardware didn't have fast divide capability. Another factor
contributing to the use of BCD is that, with BCD calculations,
rounding will happen exactly the same way that a human would
do it, so BCD is still sometimes used in fields like finance,
despite the computational and storage superiority of binary.

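To make the display argument concrete, the sketch below packs a
number into BCD and then reads the digits back with shifts and masks
only. The helper names ``to_bcd`` and ``bcd_to_text`` are hypothetical,
and the packing step itself still uses division. The point is that the
display step does not.

```python
def to_bcd(number):
    """Pack a non-negative integer into BCD, 4 bits per decimal digit."""
    bcd = 0
    shift = 0
    while True:
        number, digit = divmod(number, 10)
        bcd |= digit << shift
        shift += 4
        if number == 0:
            return bcd


def bcd_to_text(bcd):
    """Render a BCD value as decimal text using only shifts and masks."""
    digits = []
    while True:
        digits.append("0123456789"[bcd & 0xF])
        bcd >>= 4
        if bcd == 0:
            break
    return "".join(reversed(digits))


packed = to_bcd(1234)
print(hex(packed))           # prints: 0x1234
print(bcd_to_text(packed))   # prints: 1234
```
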
So, if it weren't for the fact that computers themselves
normally use binary for efficient computation and data
storage, string representations of integers would probably
always be in decimal.

Unfortunately, computer hardware doesn't think like humans,
so programmers and hardware engineers must often resort to
thinking like the computer, which means that it is important
for Python to have the ability to communicate binary data
in a form that is understandable to humans.

The requirement that the binary data notation must be cognitively
easy for humans to process means that it should contain an integral
number of binary digits (bits) per symbol, while otherwise
conforming quite closely to the standard tried-and-true decimal
notation (position indicates power, larger magnitude on the left,
not too many symbols in the alphabet, etc.).

The obvious "sweet spot" for this binary data notation is
thus octal, which packs the largest integral number of bits
possible into a single symbol chosen from the Arabic numeral
alphabet.

In fact, some computer architectures, such as the PDP8 and the
8080/Z80, were defined in terms of octal, in the sense of arranging
the bitfields of instructions in groups of three, and using
octal representations to describe the instruction set.

Even today, octal is important because of bit-packed structures
which consist of 3 bits per field, such as Unix file permission
masks.

But octal has a drawback when used for larger numbers. The
number of bits per symbol, while integral, is not itself
a power of two. This limitation (given that the word size
of most computers these days is a power of two) has led to the
widespread use of hexadecimal, which is more popular than octal
despite the fact that it requires a 60% larger alphabet than
decimal, because each symbol contains 4 bits.

Some numbers, such as Unix file permission masks, are easily
decoded by humans when represented in octal, but difficult to
decode in hexadecimal, while other numbers are much easier for
humans to handle in hexadecimal.

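The permission-mask case can be shown in a few lines. In the sketch
below, ``describe_mode`` is a hypothetical helper. It decodes one octal
digit at a time, which is why the octal spelling is easy to read and
the hexadecimal one is not.

```python
def describe_mode(mode):
    """Render the low 9 permission bits of mode as an rwx string."""
    text = ""
    for shift in (6, 3, 0):                 # owner, group, other
        triplet = (mode >> shift) & 0o7     # one octal digit per field
        text += "r" if triplet & 0o4 else "-"
        text += "w" if triplet & 0o2 else "-"
        text += "x" if triplet & 0o1 else "-"
    return text


mode = 0o754
print(oct(mode), describe_mode(mode))   # prints: 0o754 rwxr-xr--
print(hex(mode))                        # prints: 0x1ec
```
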
Unfortunately, there are also binary numbers used in computers
which are not very well communicated in either hexadecimal or
octal. Thankfully, fewer people have to deal with these on a
regular basis, but on the other hand, this means that several
people on the discussion list questioned the wisdom of adding
a straight binary representation to Python.

One example of where these numbers are very useful is in
reading and writing hardware registers. Sometimes hardware
designers will eschew human readability and opt for address
space efficiency, by packing multiple bit fields into a single
hardware register at unaligned bit locations, and it is tedious
and error-prone for a human to reconstruct a 5 bit field which
consists of the upper 3 bits of one hex digit, and the lower 2
bits of the next hex digit.

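As an illustration, consider a made-up 8-bit register whose 5-bit
field occupies bits 1 through 5, that is, the upper 3 bits of the low
hex digit and the lower 2 bits of the high hex digit. The register
layout, value, and names below are invented for this example.

```python
FIELD_MASK = 0b00111110    # bits 1..5; the same mask in hex is 0x3e
FIELD_SHIFT = 1

register = 0xA6            # 0b10100110

field = (register & FIELD_MASK) >> FIELD_SHIFT

print(format(register, "#010b"))  # prints: 0b10100110
print(format(field, "05b"))       # prints: 10011
print(field)                      # prints: 19
```

With the mask written in binary, the field boundaries are visible at a
glance, whereas 0x3e hides them.
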
Even if the ability of Python to communicate binary information
to humans is only useful for a small technical subset of the
population, it is exactly that population subset which contains
most, if not all, members of the Python core team, so even straight
binary, the least useful of these notations, has several enthusiastic
supporters and few, if any, staunch opponents, among the Python community.

Syntax for supported radices
-----------------------------

This proposal is to use a "0o" prefix with either uppercase
or lowercase "o" for octal, and a "0b" prefix with either
uppercase or lowercase "b" for binary.

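A short check of the prefix forms on Python 3, where both cases are
accepted:

```python
# Lowercase and uppercase prefix letters denote the same values.
print(0o17 == 0O17)    # prints: True
print(0b101 == 0B101)  # prints: True

# The same holds when parsing strings with literal rules (base 0).
print(int("0O17", 0), int("0B101", 0))   # prints: 15 5
```
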
There was strong support for not supporting uppercase, but
this is a separate subject for a different PEP, as 'j' for
complex numbers, 'e' for exponent, and 'r' for raw string
(to name a few) already support uppercase.

The syntax for delimiting the different radices received a lot of
attention in the discussion on Python-3000. There are several
(sometimes conflicting) requirements and "nice-to-haves" for
this syntax:

- It should be as compatible with other languages and
  previous versions of Python as is reasonable, both
  for the input syntax and for the output (e.g. string
  % operator) syntax.

- It should be as obvious to the casual observer as
  possible.

- It should be easy to visually distinguish integers
  formatted in the different bases.

Proposed syntaxes included things like arbitrary radix prefixes,
such as 16r100 (decimal 256, written in hexadecimal), and radix
suffixes, similar to the 100h assembler-style suffix. The debate on
whether the letter "O" could be used for octal was intense -- an
uppercase "O" looks suspiciously similar to a zero in some fonts.
Suggestions were made to use a "c" (the second letter of "oCtal"),
or even to use a "t" for "ocTal" and an "n" for "biNary" to go along
with the "x" for "heXadecimal".

For the string % operator, "o" was already being used to denote
octal, and "b" was not used for anything, so this works out
much better than, for example, using "c" (which means "character"
for the % operator).

At the end of the day, since uppercase "O" can look like a zero
and uppercase "B" can look like an 8, it was decided that these
prefixes should be lowercase only, but, like 'r' for raw string,
that can be a preference or style-guide issue.

Open Issues
===========

It was suggested in the discussion that lowercase should be used
for all numeric and string special modifiers, such as 'x' for
hexadecimal, 'r' for raw strings, 'e' for exponentiation, and
'j' for complex numbers. This is an issue for a separate PEP.

This PEP takes no position on uppercase or lowercase for input,
just noting that, for consistency, if uppercase is not to be
removed from input parsing for other letters, it should be
added for octal and binary, and documenting the changes under
this assumption, as there is not yet a PEP about the case issue.

Output formatting may be a different story -- there is already
ample precedent for case sensitivity in the output format string,
and there would need to be a consensus that there is a valid
use-case for the "alternate form" of the string % operator
to support uppercase 'B' or 'O' characters for binary or
octal output. Currently, PEP 3101 does not even support this
alternate capability, and the hex() function does not allow
the programmer to specify the case of the 'x' character.

There are still some strong feelings that '0123' should be
allowed as a literal decimal in Python 3.0. If this is the
right thing to do, this can easily be covered in an additional
PEP. This proposal only takes the first step of making '0123'
not be a valid octal number, for reasons covered in the rationale.

Is there (or should there be) an option for the 2to3 translator
which only makes the 2.6 compatible changes? Should this be
run on 2.6 library code before the 2.6 release?

Should a bin() function which matches hex() and oct() be added?

Is hex() really that useful once we have advanced string formatting?

References
==========

.. [1] GNU libc manual printf integer format conversions
   (http://www.gnu.org/software/libc/manual/html_node/Integer-Conversions.html)

.. [2] Python string formatting operations
   (http://docs.python.org/lib/typesseq-strings.html)

.. [3] The Representation of Numbers, Jiajie Zhang and Donald A. Norman
   (http://acad88.sahs.uth.tmc.edu/research/publications/Number-Representation.pdf)

.. [4] ENIAC page at Wikipedia
   (http://en.wikipedia.org/wiki/ENIAC)

.. [5] BCD page at Wikipedia
   (http://en.wikipedia.org/wiki/Binary-coded_decimal)

Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: