PEP 515: major revision. Use rules preferred by Guido.
parent
2002aa056a
commit
3693b34730
187
pep-0515.txt
|
@ -2,7 +2,7 @@ PEP: 515
|
||||||
Title: Underscores in Numeric Literals
|
Title: Underscores in Numeric Literals
|
||||||
Version: $Revision$
|
Version: $Revision$
|
||||||
Last-Modified: $Date$
|
Last-Modified: $Date$
|
||||||
Author: Georg Brandl
|
Author: Georg Brandl, Serhiy Storchaka
|
||||||
Status: Draft
|
Status: Draft
|
||||||
Type: Standards Track
|
Type: Standards Track
|
||||||
Content-Type: text/x-rst
|
Content-Type: text/x-rst
|
||||||
|
@ -13,13 +13,14 @@ Post-History: 10-Feb-2016, 11-Feb-2016
|
||||||
Abstract and Rationale
|
Abstract and Rationale
|
||||||
======================
|
======================
|
||||||
|
|
||||||
This PEP proposes to extend Python's syntax so that underscores can be used as
|
This PEP proposes to extend Python's syntax and number-from-string
|
||||||
visual separators for digit grouping purposes in integral, floating-point and
|
constructors so that underscores can be used as visual separators for
|
||||||
complex number literals.
|
digit grouping purposes in integral, floating-point and complex number
|
||||||
|
literals.
|
||||||
|
|
||||||
This is a common feature of other modern languages, and can aid readability of
|
This is a common feature of other modern languages, and can aid
|
||||||
long literals, or literals whose value should clearly separate into parts, such
|
readability of long literals, or literals whose value should clearly
|
||||||
as bytes or words in hexadecimal notation.
|
separate into parts, such as bytes or words in hexadecimal notation.
|
||||||
|
|
||||||
Examples::
|
Examples::
|
||||||
|
|
||||||
|
@ -32,39 +33,81 @@ Examples::
|
||||||
# grouping bits into nibbles in a binary literal
|
# grouping bits into nibbles in a binary literal
|
||||||
flags = 0b_0011_1111_0100_1110
|
flags = 0b_0011_1111_0100_1110
|
||||||
|
|
||||||
# making the literal suffix stand out more
|
# same, for string conversions
|
||||||
imag = 1.247812376e-15_j
|
flags = int('0b_1111_0000', 2)
|
||||||
|
|
||||||
|
|
||||||
Specification
|
Specification
|
||||||
=============
|
=============
|
||||||
|
|
||||||
The current proposal is to allow one or more consecutive underscores following
|
The current proposal is to allow one underscore between digits, and
|
||||||
digits and base specifiers in numeric literals. The underscores have no
|
after base specifiers in numeric literals. The underscores have no
|
||||||
semantic meaning, and literals are parsed as if the underscores were absent.
|
semantic meaning, and literals are parsed as if the underscores were
|
||||||
|
absent.
|
||||||
|
|
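As an illustration of the revised rule, here is a minimal sketch, assuming an interpreter that implements this proposal (CPython 3.6 or later does). The helper `is_valid_literal` is a hypothetical name introduced only for this example; it is not part of the PEP.

```python
# Assumption: the running interpreter implements the underscore rule
# described above (CPython 3.6 or later).
amount = 10_000_000.0
addr = 0xCAFE_F00D
flags = 0b_0011_1111_0100_1110

# The underscores carry no meaning: each value equals its plain spelling.
assert amount == 10000000.0
assert addr == 0xCAFEF00D
assert flags == 0b0011111101001110


def is_valid_literal(text):
    """Hypothetical helper: True if `text` compiles as an expression."""
    try:
        compile(text, "<literal>", "eval")
    except SyntaxError:
        return False
    return True


# Doubled and trailing underscores fall outside the "one underscore
# between digits, or after a base specifier" rule.
rejected = ["1__000", "1000_", "0x__ff"]
```

A leading underscore is not in the rejected list because `_1000` is already a valid identifier, so the tokenizer never treats it as a number.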
||||||
The production list for integer literals would therefore look like this::
|
Literal Grammar
|
||||||
|
---------------
|
||||||
|
|
||||||
integer: decimalinteger | octinteger | hexinteger | bininteger
|
The production list for integer literals would therefore look like
|
||||||
decimalinteger: nonzerodigit (digit | "_")* | "0" ("0" | "_")*
|
this::
|
||||||
|
|
||||||
|
integer: decinteger | bininteger | octinteger | hexinteger
|
||||||
|
decinteger: nonzerodigit (["_"] digit)* | "0" (["_"] "0")*
|
||||||
|
bininteger: "0" ("b" | "B") (["_"] bindigit)+
|
||||||
|
octinteger: "0" ("o" | "O") (["_"] octdigit)+
|
||||||
|
hexinteger: "0" ("x" | "X") (["_"] hexdigit)+
|
||||||
nonzerodigit: "1"..."9"
|
nonzerodigit: "1"..."9"
|
||||||
digit: "0"..."9"
|
digit: "0"..."9"
|
||||||
octinteger: "0" ("o" | "O") "_"* octdigit (octdigit | "_")*
|
bindigit: "0" | "1"
|
||||||
hexinteger: "0" ("x" | "X") "_"* hexdigit (hexdigit | "_")*
|
|
||||||
bininteger: "0" ("b" | "B") "_"* bindigit (bindigit | "_")*
|
|
||||||
octdigit: "0"..."7"
|
octdigit: "0"..."7"
|
||||||
hexdigit: digit | "a"..."f" | "A"..."F"
|
hexdigit: digit | "a"..."f" | "A"..."F"
|
||||||
bindigit: "0" | "1"
|
|
||||||
|
|
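The new-side integer productions can be transcribed into regular expressions to check which spellings they admit. This is only a sketch for experimentation; the pattern names and `matches_integer` are assumptions made for this example, and the real tokenizer is not implemented this way.

```python
import re

# Each pattern mirrors one new-side production; ["_"] becomes "_?".
DECINTEGER = r"(?:[1-9](?:_?[0-9])*|0(?:_?0)*)"
BININTEGER = r"0[bB](?:_?[01])+"
OCTINTEGER = r"0[oO](?:_?[0-7])+"
HEXINTEGER = r"0[xX](?:_?[0-9a-fA-F])+"

INTEGER = re.compile(
    rf"(?:{BININTEGER}|{OCTINTEGER}|{HEXINTEGER}|{DECINTEGER})"
)


def matches_integer(text):
    """Hypothetical helper: True if `text` fits the integer grammar."""
    return INTEGER.fullmatch(text) is not None
```

With these patterns, `0b_0011_1111` and `1_000_000` match, while `1__000`, `100_` and `0_x1f` do not.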
||||||
For floating-point and complex literals::
|
For floating-point and complex literals::
|
||||||
|
|
||||||
floatnumber: pointfloat | exponentfloat
|
floatnumber: pointfloat | exponentfloat
|
||||||
pointfloat: [intpart] fraction | intpart "."
|
pointfloat: [digitpart] fraction | digitpart "."
|
||||||
exponentfloat: (intpart | pointfloat) exponent
|
exponentfloat: (digitpart | pointfloat) exponent
|
||||||
intpart: digit (digit | "_")*
|
digitpart: digit (["_"] digit)*
|
||||||
fraction: "." intpart
|
fraction: "." digitpart
|
||||||
exponent: ("e" | "E") ["+" | "-"] intpart
|
exponent: ("e" | "E") ["+" | "-"] digitpart
|
||||||
imagnumber: (floatnumber | intpart) ("j" | "J")
|
imagnumber: (floatnumber | digitpart) ("j" | "J")
|
||||||
|
|
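The floating-point and imaginary productions can be sketched the same way. The pattern names below are assumptions for this example only. Note that the old-side example `1.247812376e-15_j` no longer matches, since an underscore must now sit between two digits.

```python
import re

# Transcription of the new-side productions; ["_"] becomes "_?".
DIGITPART = r"[0-9](?:_?[0-9])*"
FRACTION = rf"\.{DIGITPART}"
EXPONENT = rf"[eE][+-]?{DIGITPART}"
POINTFLOAT = rf"(?:(?:{DIGITPART})?{FRACTION}|{DIGITPART}\.)"
EXPONENTFLOAT = rf"(?:{DIGITPART}|{POINTFLOAT}){EXPONENT}"
FLOATNUMBER = rf"(?:{EXPONENTFLOAT}|{POINTFLOAT})"

FLOAT_RE = re.compile(FLOATNUMBER)
IMAG_RE = re.compile(rf"(?:{FLOATNUMBER}|{DIGITPART})[jJ]")


def matches_float(text):
    """Hypothetical helper: True if `text` fits `floatnumber`."""
    return FLOAT_RE.fullmatch(text) is not None


def matches_imag(text):
    """Hypothetical helper: True if `text` fits `imagnumber`."""
    return IMAG_RE.fullmatch(text) is not None
```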
||||||
|
Constructors
|
||||||
|
------------
|
||||||
|
|
||||||
|
Following the same rules for placement, underscores will be allowed in
|
||||||
|
the following constructors:
|
||||||
|
|
||||||
|
- ``int()`` (with any base)
|
||||||
|
- ``float()``
|
||||||
|
- ``complex()``
|
||||||
|
- ``Decimal()``
|
||||||
|
|
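A short sketch of the constructor behaviour, assuming an interpreter that implements this proposal (CPython 3.6 or later). The helper `parses` is a hypothetical name used only here.

```python
from decimal import Decimal

# Assumption: CPython 3.6 or later, where these constructors accept
# underscores in their string arguments.
assert int("1_000_000") == 1000000
assert int("0b_1111_0000", 2) == 240
assert float("1_000.000_1") == 1000.0001
assert complex("1_0+2_0j") == complex(10, 20)
assert Decimal("1_000.50") == Decimal("1000.50")


def parses(constructor, text):
    """Hypothetical helper: True if constructor(text) succeeds."""
    try:
        constructor(text)
    except (ValueError, ArithmeticError):
        return False
    return True


# int() and float() follow the literal placement rules, so doubled
# or trailing underscores are refused.
assert not parses(int, "1__000")
assert not parses(float, "1_000_")
```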
||||||
|
|
||||||
|
Prior Art
|
||||||
|
=========
|
||||||
|
|
||||||
|
Those languages that do allow underscore grouping implement a large
|
||||||
|
variety of rules for allowed placement of underscores. In cases where
|
||||||
|
the language spec contradicts the actual behavior, the actual behavior
|
||||||
|
is listed. ("single" or "multiple" refer to allowing runs of
|
||||||
|
consecutive underscores.)
|
||||||
|
|
||||||
|
* Ada: single, only between digits [8]_
|
||||||
|
* C# (open proposal for 7.0): multiple, only between digits [6]_
|
||||||
|
* C++ (C++14): single, between digits (different separator chosen) [1]_
|
||||||
|
* D: multiple, anywhere, including trailing [2]_
|
||||||
|
* Java: multiple, only between digits [7]_
|
||||||
|
* Julia: single, only between digits (but not in float exponent parts)
|
||||||
|
[9]_
|
||||||
|
* Perl 5: multiple, basically anywhere, although docs say it's
|
||||||
|
restricted to one underscore between digits [3]_
|
||||||
|
* Ruby: single, only between digits (although docs say "anywhere")
|
||||||
|
[10]_
|
||||||
|
* Rust: multiple, anywhere, except for between exponent "e" and digits
|
||||||
|
[4]_
|
||||||
|
* Swift: multiple, between digits and trailing (although textual
|
||||||
|
description says only "between digits") [5]_
|
||||||
|
|
||||||
|
|
||||||
Alternative Syntax
|
Alternative Syntax
|
||||||
|
@ -73,81 +116,53 @@ Alternative Syntax
|
||||||
Underscore Placement Rules
|
Underscore Placement Rules
|
||||||
--------------------------
|
--------------------------
|
||||||
|
|
||||||
Instead of the liberal rule specified above, the use of underscores could be
|
Instead of the relatively strict rule specified above, the use of
|
||||||
limited. Common rules are (see the "other languages" section):
|
underscores could be less limited. As seen in other languages, common
|
||||||
|
rules include:
|
||||||
|
|
||||||
* Only one consecutive underscore allowed, and only between digits.
|
* Only one consecutive underscore allowed, and only between digits.
|
||||||
* Multiple consecutive underscore allowed, but only between digits.
|
* Multiple consecutive underscores allowed, but only between digits.
|
||||||
|
* Multiple consecutive underscores allowed, in most positions except
|
||||||
|
for the start of the literal, or special positions like after a
|
||||||
|
decimal point.
|
||||||
|
|
||||||
A less common rule would be to allow underscores only every N digits (where N
|
The syntax in this PEP has ultimately been selected because it covers
|
||||||
could be 3 for decimal literals, or 4 for hexadecimal ones). This is
|
the common use cases, and does not allow for syntax that would have to
|
||||||
unnecessarily restrictive, especially considering the separator placement is
|
be discouraged in style guides anyway.
|
||||||
different in different cultures.
|
|
||||||
|
A less common rule would be to allow underscores only every N digits
|
||||||
|
(where N could be 3 for decimal literals, or 4 for hexadecimal ones).
|
||||||
|
This is unnecessarily restrictive, especially considering the
|
||||||
|
separator placement is different in different cultures.
|
||||||
|
|
||||||
Different Separators
|
Different Separators
|
||||||
--------------------
|
--------------------
|
||||||
|
|
||||||
A proposed alternate syntax was to use whitespace for grouping. Although
|
A proposed alternate syntax was to use whitespace for grouping.
|
||||||
strings are a precedent for combining adjoining literals, the behavior can lead
|
Although strings are a precedent for combining adjoining literals, the
|
||||||
to unexpected effects which are not possible with underscores. Also, no other
|
behavior can lead to unexpected effects which are not possible with
|
||||||
language is known to use this rule, except for languages that generally
|
underscores. Also, no other language is known to use this rule,
|
||||||
disregard any whitespace.
|
except for languages that generally disregard any whitespace.
|
||||||
|
|
||||||
C++14 introduces apostrophes for grouping, which is not considered due to the
|
C++14 introduces apostrophes for grouping (because underscores introduce
|
||||||
conflict with Python's string literals. [1]_
|
ambiguity with user-defined literals), which is not considered because of the
|
||||||
|
use in Python's string literals. [1]_
|
||||||
|
|
||||||
|
|
||||||
Behavior in Other Languages
|
Open Proposals
|
||||||
===========================
|
==============
|
||||||
|
|
||||||
Those languages that do allow underscore grouping implement a large variety of
|
It has been proposed [11]_ to extend the number-to-string formatting
|
||||||
rules for allowed placement of underscores. This is a listing placing the known
|
language to allow ``_`` as a thousands separator, where currently only
|
||||||
rules into three major groups. In cases where the language spec contradicts the
|
``,`` is supported. This could be used to easily generate code with
|
||||||
actual behavior, the actual behavior is listed.
|
more readable literals.
|
||||||
|
|
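For illustration, the proposed formatting option might be used as sketched below. At the time of this commit it was only a proposal; the `_` spelling shown is the one CPython 3.6 and later accept, so the snippet assumes such an interpreter.

```python
# Assumption: CPython 3.6 or later, where "_" is accepted as a
# grouping option in the format specification mini-language.
value = 1234567

with_commas = format(value, ",d")        # the option that already existed
with_underscores = format(value, "_d")   # the proposed option

# For hexadecimal output the underscore is inserted every four digits.
hex_grouped = format(0xCAFEF00D, "_x")

# The generated text is itself a valid integer string, so it round-trips.
round_trip = int(with_underscores)
```

Here `with_underscores` is `'1_234_567'` and `hex_grouped` is `'cafe_f00d'`, both of which can be pasted back into source code as readable literals.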
||||||
**Group 1: liberal**
|
|
||||||
|
|
||||||
This group is the least homogeneous: the rules vary slightly between languages.
|
|
||||||
All of them allow trailing underscores. Some allow underscores after non-digits
|
|
||||||
like the ``e`` or the sign in exponents.
|
|
||||||
|
|
||||||
* D [2]_
|
|
||||||
* Perl 5 (underscores basically allowed anywhere, although docs say it's more
|
|
||||||
restricted) [3]_
|
|
||||||
* Rust (allows between exponent sign and digits) [4]_
|
|
||||||
* Swift (although textual description says "between digits") [5]_
|
|
||||||
|
|
||||||
**Group 2: only between digits, multiple consecutive underscores**
|
|
||||||
|
|
||||||
* C# (open proposal for 7.0) [6]_
|
|
||||||
* Java [7]_
|
|
||||||
|
|
||||||
**Group 3: only between digits, only one underscore**
|
|
||||||
|
|
||||||
* Ada [8]_
|
|
||||||
* Julia (but not in the exponent part of floats) [9]_
|
|
||||||
* Ruby (docs say "anywhere", in reality only between digits) [10]_
|
|
||||||
|
|
||||||
|
|
||||||
Implementation
|
Implementation
|
||||||
==============
|
==============
|
||||||
|
|
||||||
A preliminary patch that implements the specification given above has been
|
A preliminary patch that implements the specification given above has
|
||||||
posted to the issue tracker. [11]_
|
been posted to the issue tracker. [12]_
|
||||||
|
|
||||||
|
|
||||||
Open Questions
|
|
||||||
==============
|
|
||||||
|
|
||||||
This PEP currently only proposes changing the literal syntax. The following
|
|
||||||
extensions are open for discussion:
|
|
||||||
|
|
||||||
* Allowing underscores in string arguments to the ``Decimal`` constructor. It
|
|
||||||
could be argued that these are akin to literals, since there is no Decimal
|
|
||||||
literal available (yet).
|
|
||||||
|
|
||||||
* Allowing underscores in string arguments to ``int()`` with base argument 0,
|
|
||||||
``float()`` and ``complex()``.
|
|
||||||
|
|
||||||
|
|
||||||
References
|
References
|
||||||
|
@ -173,7 +188,9 @@ References
|
||||||
|
|
||||||
.. [10] http://ruby-doc.org/core-2.3.0/doc/syntax/literals_rdoc.html#label-Numbers
|
.. [10] http://ruby-doc.org/core-2.3.0/doc/syntax/literals_rdoc.html#label-Numbers
|
||||||
|
|
||||||
.. [11] http://bugs.python.org/issue26331
|
.. [11] https://mail.python.org/pipermail/python-dev/2016-February/143283.html
|
||||||
|
|
||||||
|
.. [12] http://bugs.python.org/issue26331
|
||||||
|
|
||||||
|
|
||||||
Copyright
|
Copyright
|
||||||
|
|