PEP: 515
Title: Underscores in Numeric Literals
Version: $Revision$
Last-Modified: $Date$
Author: Georg Brandl
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 10-Feb-2016
Python-Version: 3.6

Abstract and Rationale
======================

This PEP proposes to extend Python's syntax so that underscores can be used in
integral and floating-point number literals.

This is a common feature of other modern languages, and can aid readability of
long literals, or literals whose value should clearly separate into parts, such
as bytes or words in hexadecimal notation.

Examples::

    # grouping decimal numbers by thousands
    amount = 10_000_000.0

    # grouping hexadecimal addresses by words
    addr = 0xDEAD_BEEF

    # grouping bits into nibbles in a binary literal
    flags = 0b_0011_1111_0100_1110

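
The grouping is meant to be purely visual. The short check below is a sketch
that assumes an interpreter which already accepts underscores in numeric
literals (CPython 3.6 and later accept the three forms used here); it compares
each grouped literal with its ungrouped spelling.

```python
# Assumption: the running interpreter accepts underscores in numeric
# literals (CPython 3.6+ accepts these three forms).
amount = 10_000_000.0
addr = 0xDEAD_BEEF
flags = 0b_0011_1111_0100_1110

# Each grouped literal denotes the same value as its ungrouped spelling.
print(amount == 10000000.0)
print(addr == 0xDEADBEEF)
print(flags == 0b0011111101001110)
```
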


Specification
=============

The current proposal is to allow underscores anywhere in numeric literals, with
these exceptions:

* Leading underscores cannot be allowed, since they already introduce
  identifiers.
* Trailing underscores are not allowed, because they look confusing and don't
  contribute much to readability.
* The number base prefixes ``0x``, ``0o``, and ``0b`` cannot be split up,
  because they are fixed strings and not logically part of the number.
* No underscore is allowed after a sign in an exponent (``1e-_5``), because
  underscores also cannot be used after a sign in front of the number
  (``-1e5``).
* No underscore is allowed after a decimal point, because this leads to
  ambiguity with attribute access (the lexer cannot know that there is no
  number literal in ``foo._5``).

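
The first and last exceptions rest on how Python already tokenizes these
spellings. As a small illustration, the sketch below uses the standard
``tokenize`` module to show that ``_1`` is a name and that ``foo._5`` is an
attribute access. The helper ``token_pairs`` is a hypothetical name introduced
only for this example.

```python
import io
import tokenize

# Token types that carry no information for this illustration.
_SKIPPED = (tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER)

def token_pairs(source):
    """Return (token type name, token text) pairs for one line of source."""
    tokens = tokenize.generate_tokens(io.StringIO(source + "\n").readline)
    return [(tokenize.tok_name[tok.type], tok.string)
            for tok in tokens if tok.type not in _SKIPPED]

# A leading underscore already starts an identifier.
print(token_pairs("_1"))
# After a dot, "_5" is an attribute name, not the fraction of a number.
print(token_pairs("foo._5"))
```
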

There appears to be no reason to restrict the use of underscores otherwise.

The production list for integer literals would therefore look like this::

    integer: decimalinteger | octinteger | hexinteger | bininteger
    decimalinteger: nonzerodigit [decimalrest] | "0" [("0" | "_")* "0"]
    nonzerodigit: "1"..."9"
    decimalrest: (digit | "_")* digit
    digit: "0"..."9"
    octinteger: "0" ("o" | "O") (octdigit | "_")* octdigit
    hexinteger: "0" ("x" | "X") (hexdigit | "_")* hexdigit
    bininteger: "0" ("b" | "B") (bindigit | "_")* bindigit
    octdigit: "0"..."7"
    hexdigit: digit | "a"..."f" | "A"..."F"
    bindigit: "0" | "1"

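
To experiment with these productions, they can be transcribed into a regular
expression. The following is only a sketch of the integer grammar as written
above, not the tokenizer change itself; the names ``INTEGER`` and
``is_draft_integer`` are hypothetical.

```python
import re

# One regex fragment per production of the draft grammar (assumed names).
DECIMAL = r"[1-9](?:[0-9_]*[0-9])?|0(?:[0_]*0)?"
OCT = r"0[oO][0-7_]*[0-7]"
HEX = r"0[xX][0-9a-fA-F_]*[0-9a-fA-F]"
BIN = r"0[bB][01_]*[01]"

INTEGER = re.compile("|".join("(?:%s)" % part
                              for part in (DECIMAL, OCT, HEX, BIN)))

def is_draft_integer(text):
    """True if text is an integer literal under the productions above."""
    return INTEGER.fullmatch(text) is not None

for candidate in ("10_000_000", "0xDEAD_BEEF", "1__0", "_1", "1_", "0_x1F"):
    print(candidate, is_draft_integer(candidate))
```

Under this transcription the first three candidates are accepted, while the
leading underscore, the trailing underscore, and the split base prefix are
rejected.
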

For floating-point literals::

    floatnumber: pointfloat | exponentfloat
    pointfloat: [intpart] fraction | intpart "."
    exponentfloat: (intpart | pointfloat) exponent
    intpart: digit (digit | "_")*
    fraction: "." intpart
    exponent: ("e" | "E") "_"* ["+" | "-"] digit [decimalrest]

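
The floating-point productions can be explored the same way. This sketch
transcribes them literally into a regular expression; ``FLOAT`` and
``is_draft_float`` are hypothetical names, and the code makes no claim about
how the actual patch implements the rules.

```python
import re

# Literal transcription of the draft productions (assumed names).
INTPART = r"[0-9][0-9_]*"
FRACTION = r"\." + INTPART
POINTFLOAT = r"(?:(?:%s)?%s|%s\.)" % (INTPART, FRACTION, INTPART)
EXPONENT = r"[eE]_*[+-]?[0-9](?:[0-9_]*[0-9])?"
EXPONENTFLOAT = r"(?:%s|%s)%s" % (INTPART, POINTFLOAT, EXPONENT)

FLOAT = re.compile(r"(?:%s)|(?:%s)" % (POINTFLOAT, EXPONENTFLOAT))

def is_draft_float(text):
    """True if text is a float literal under the productions above."""
    return FLOAT.fullmatch(text) is not None

for candidate in ("10_000_000.0", "1_000e1_0", "1e_5", "1e-_5", "1._5"):
    print(candidate, is_draft_float(candidate))
```

The last two candidates are the forms excluded by the exceptions: an
underscore after the exponent sign and an underscore after the decimal point.
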


Alternative Syntax
==================

Underscore Placement Rules
--------------------------

Instead of the liberal rule specified above, the use of underscores could be
limited. Common rules are (see the "Behavior in Other Languages" section):

* Only single underscores allowed, and only between digits.
* Multiple consecutive underscores allowed, but only between digits.

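
The difference between the two stricter rules is easiest to see on plain
decimal digit strings. The patterns below are a sketch for that case only
(no base prefixes, fractions, or exponents), and the names ``SINGLE`` and
``MULTIPLE`` are hypothetical.

```python
import re

# Rule 1: at most one underscore in a row, and only between digits.
SINGLE = re.compile(r"[0-9](?:_?[0-9])*")
# Rule 2: runs of underscores are allowed, but still only between digits.
MULTIPLE = re.compile(r"[0-9](?:_*[0-9])*")

for candidate in ("1_000", "1__000", "1_", "_1"):
    print(candidate,
          SINGLE.fullmatch(candidate) is not None,
          MULTIPLE.fullmatch(candidate) is not None)
```

Only ``1__000`` separates the two rules; both reject an underscore at either
end of the digit string.
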

Different Separators
--------------------

A proposed alternative syntax was to use whitespace for grouping. Although
strings are a precedent for combining adjoining literals, the behavior can lead
to unexpected effects which are not possible with underscores. Also, no other
language is known to use this rule, except for languages that generally
disregard any whitespace.

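
The text does not spell out which unexpected effects are meant. One plausible
reading, shown here by analogy with the existing string behavior, is that a
forgotten comma silently merges two items instead of raising an error. The
variable names are illustrative only.

```python
# Existing behavior: adjacent string literals are concatenated.
joined = "abc" "def"

# A forgotten comma between "beta" and "gamma" merges them into one item.
names = ["alpha", "beta" "gamma"]

print(joined)
print(len(names))
```

If whitespace grouped digits in the same way, a sequence of numbers with a
missing comma could be read as one larger number rather than being rejected.
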

C++14 introduces apostrophes for grouping; this is not considered here because
it conflicts with Python's string literals. [1]_


Behavior in Other Languages
===========================

Those languages that do allow underscore grouping implement a large variety of
rules for allowed placement of underscores. The following list places the known
rules into three major groups. In cases where the language spec contradicts the
actual behavior, the actual behavior is listed.

**Group 1: liberal (like this PEP)**

* D [2]_
* Perl 5 (although docs say it's more restricted) [3]_
* Rust [4]_
* Swift (although textual description says "between digits") [5]_

**Group 2: only between digits, multiple consecutive underscores**

* C# (open proposal for 7.0) [6]_
* Java [7]_

**Group 3: only between digits, only one underscore**

* Ada [8]_
* Julia (but not in the exponent part of floats) [9]_
* Ruby (docs say "anywhere", in reality only between digits) [10]_


Implementation
==============

A preliminary patch that implements the specification given above has been
posted to the issue tracker. [11]_



References
==========

.. [1] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3499.html

.. [2] http://dlang.org/spec/lex.html#integerliteral

.. [3] http://perldoc.perl.org/perldata.html#Scalar-value-constructors

.. [4] http://doc.rust-lang.org/reference.html#number-literals

.. [5] https://developer.apple.com/library/ios/documentation/Swift/Conceptual/Swift_Programming_Language/LexicalStructure.html

.. [6] https://github.com/dotnet/roslyn/issues/216

.. [7] https://docs.oracle.com/javase/7/docs/technotes/guides/language/underscores-literals.html

.. [8] http://archive.adaic.com/standards/83lrm/html/lrm-02-04.html#2.4

.. [9] http://docs.julialang.org/en/release-0.4/manual/integers-and-floating-point-numbers/

.. [10] http://ruby-doc.org/core-2.3.0/doc/syntax/literals_rdoc.html#label-Numbers

.. [11] http://bugs.python.org/issue26331


Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: