<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="color-scheme" content="light dark">
<title>PEP 378 Format Specifier for Thousands Separator | peps.python.org</title>
<link rel="shortcut icon" href="../_static/py.png">
<link rel="canonical" href="https://peps.python.org/pep-0378/">
<link rel="stylesheet" href="../_static/style.css" type="text/css">
<link rel="stylesheet" href="../_static/mq.css" type="text/css">
<link rel="stylesheet" href="../_static/pygments.css" type="text/css" media="(prefers-color-scheme: light)" id="pyg-light">
<link rel="stylesheet" href="../_static/pygments_dark.css" type="text/css" media="(prefers-color-scheme: dark)" id="pyg-dark">
<link rel="alternate" type="application/rss+xml" title="Latest PEPs" href="https://peps.python.org/peps.rss">
<meta property="og:title" content='PEP 378 Format Specifier for Thousands Separator | peps.python.org'>
<meta property="og:description" content="Python Enhancement Proposals (PEPs)">
<meta property="og:type" content="website">
<meta property="og:url" content="https://peps.python.org/pep-0378/">
<meta property="og:site_name" content="Python Enhancement Proposals (PEPs)">
<meta property="og:image" content="https://peps.python.org/_static/og-image.png">
<meta property="og:image:alt" content="Python PEPs">
<meta property="og:image:width" content="200">
<meta property="og:image:height" content="200">
<meta name="description" content="Python Enhancement Proposals (PEPs)">
<meta name="theme-color" content="#3776ab">
</head>
<body>
<svg xmlns="http://www.w3.org/2000/svg" style="display: none;">
<symbol id="svg-sun-half" viewBox="0 0 24 24" pointer-events="all">
<title>Following system colour scheme</title>
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none"
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
<circle cx="12" cy="12" r="9"></circle>
<path d="M12 3v18m0-12l4.65-4.65M12 14.3l7.37-7.37M12 19.6l8.85-8.85"></path>
</svg>
</symbol>
<symbol id="svg-moon" viewBox="0 0 24 24" pointer-events="all">
<title>Selected dark colour scheme</title>
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none"
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
<path stroke="none" d="M0 0h24v24H0z" fill="none"></path>
<path d="M12 3c.132 0 .263 0 .393 0a7.5 7.5 0 0 0 7.92 12.446a9 9 0 1 1 -8.313 -12.454z"></path>
</svg>
</symbol>
<symbol id="svg-sun" viewBox="0 0 24 24" pointer-events="all">
<title>Selected light colour scheme</title>
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none"
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
<circle cx="12" cy="12" r="5"></circle>
<line x1="12" y1="1" x2="12" y2="3"></line>
<line x1="12" y1="21" x2="12" y2="23"></line>
<line x1="4.22" y1="4.22" x2="5.64" y2="5.64"></line>
<line x1="18.36" y1="18.36" x2="19.78" y2="19.78"></line>
<line x1="1" y1="12" x2="3" y2="12"></line>
<line x1="21" y1="12" x2="23" y2="12"></line>
<line x1="4.22" y1="19.78" x2="5.64" y2="18.36"></line>
<line x1="18.36" y1="5.64" x2="19.78" y2="4.22"></line>
</svg>
</symbol>
</svg>
<script>
document.documentElement.dataset.colour_scheme = localStorage.getItem("colour_scheme") || "auto"
</script>
<section id="pep-page-section">
<header>
<h1>Python Enhancement Proposals</h1>
<ul class="breadcrumbs">
<li><a href="https://www.python.org/" title="The Python Programming Language">Python</a> &raquo; </li>
<li><a href="../pep-0000/">PEP Index</a> &raquo; </li>
<li>PEP 378</li>
</ul>
<button id="colour-scheme-cycler" onClick="setColourScheme(nextColourScheme())">
<svg aria-hidden="true" class="colour-scheme-icon-when-auto"><use href="#svg-sun-half"></use></svg>
<svg aria-hidden="true" class="colour-scheme-icon-when-dark"><use href="#svg-moon"></use></svg>
<svg aria-hidden="true" class="colour-scheme-icon-when-light"><use href="#svg-sun"></use></svg>
<span class="visually-hidden">Toggle light / dark / auto colour theme</span>
</button>
</header>
<article>
<section id="pep-content">
<h1 class="page-title">PEP 378 Format Specifier for Thousands Separator</h1>
<dl class="rfc2822 field-list simple">
<dt class="field-odd">Author<span class="colon">:</span></dt>
<dd class="field-odd">Raymond Hettinger &lt;python&#32;&#97;t&#32;rcn.com&gt;</dd>
<dt class="field-even">Status<span class="colon">:</span></dt>
<dd class="field-even"><abbr title="Accepted and implementation complete, or no longer active">Final</abbr></dd>
<dt class="field-odd">Type<span class="colon">:</span></dt>
<dd class="field-odd"><abbr title="Normative PEP with a new feature for Python, implementation change for CPython or interoperability standard for the ecosystem">Standards Track</abbr></dd>
<dt class="field-even">Created<span class="colon">:</span></dt>
<dd class="field-even">12-Mar-2009</dd>
<dt class="field-odd">Python-Version<span class="colon">:</span></dt>
<dd class="field-odd">2.7, 3.1</dd>
<dt class="field-even">Post-History<span class="colon">:</span></dt>
<dd class="field-even">12-Mar-2009</dd>
</dl>
<hr class="docutils" />
<section id="contents">
<details><summary>Table of Contents</summary><ul class="simple">
<li><a class="reference internal" href="#motivation">Motivation</a></li>
<li><a class="reference internal" href="#main-proposal-from-alyssa-coghlan-originally-called-proposal-i">Main Proposal (from Alyssa Coghlan, originally called Proposal I)</a></li>
<li><a class="reference internal" href="#current-version-of-the-mini-language">Current Version of the Mini-Language</a></li>
<li><a class="reference internal" href="#research-into-what-other-languages-do">Research into what Other Languages Do</a></li>
<li><a class="reference internal" href="#alternative-proposal-from-eric-smith-originally-called-proposal-ii">Alternative Proposal (from Eric Smith, originally called Proposal II)</a></li>
<li><a class="reference internal" href="#commentary">Commentary</a></li>
<li><a class="reference internal" href="#copyright">Copyright</a></li>
</ul>
</details></section>
<section id="motivation">
<h2><a class="toc-backref" href="#motivation" role="doc-backlink">Motivation</a></h2>
<p>Provide a simple, non-locale-aware way to format a number
with a thousands separator.</p>
<p>Adding thousands separators is one of the simplest ways to
humanize a program's output, improving its professional appearance
and readability.</p>
<p>In the finance world, output with thousands separators is the norm.
Finance users and non-professional programmers find the locale
approach to be frustrating, arcane and non-obvious.</p>
<p>The locale module presents two other challenges. First, it is
a global setting and not suitable for multi-threaded apps that
need to serve up requests in multiple locales. Second, the
name of a relevant locale (such as “de_DE”) can vary from
platform to platform or may not be defined at all. The docs
for the locale module describe these and <a class="reference external" href="https://docs.python.org/2.6/library/locale.html#background-details-hints-tips-and-caveats">many other challenges</a>
in detail.</p>
<p>It is not the goal to replace the locale module, to perform
internationalization tasks, or accommodate every possible
convention. Such tasks are better suited to robust tools like
<a class="reference external" href="http://babel.edgewall.org/">Babel</a>. Instead, the goal is to make a common, everyday
task easier for many users.</p>
</section>
<section id="main-proposal-from-alyssa-coghlan-originally-called-proposal-i">
<h2><a class="toc-backref" href="#main-proposal-from-alyssa-coghlan-originally-called-proposal-i" role="doc-backlink">Main Proposal (from Alyssa Coghlan, originally called Proposal I)</a></h2>
<p>A comma will be added to the format() specifier mini-language:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="p">[[</span><span class="n">fill</span><span class="p">]</span><span class="n">align</span><span class="p">][</span><span class="n">sign</span><span class="p">][</span><span class="c1">#][0][width][,][.precision][type]</span>
</pre></div>
</div>
<p>The ',' option indicates that commas should be included in the
output as a thousands separator. As with locales which do not
use a period as the decimal point, locales which use a
different convention for digit separation will need to use the
locale module to obtain appropriate formatting.</p>
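<p>As a rough illustration of the proposed option, assuming a Python version that implements this PEP (2.7, or 3.1 and later), the built-in <code class="docutils literal notranslate"><span class="pre">format()</span></code> can be applied to an int, a float, and a <code class="docutils literal notranslate"><span class="pre">Decimal</span></code>. The sample values are arbitrary and not taken from the proposal:</p>

```python
from decimal import Decimal

# Arbitrary sample values; only the "," option comes from the proposal.
as_int = format(1234567, ",d")
as_float = format(1234567.891, ",.2f")
as_decimal = format(Decimal("1234567.891"), ",.2f")

print(as_int)      # 1,234,567
print(as_float)    # 1,234,567.89
print(as_decimal)  # 1,234,567.89
```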
<p>The proposal works well with floats, ints, and decimals.
It also allows easy substitution for other separators.
For example:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="nb">format</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="s2">&quot;6,d&quot;</span><span class="p">)</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span><span class="s2">&quot;,&quot;</span><span class="p">,</span> <span class="s2">&quot;_&quot;</span><span class="p">)</span>
</pre></div>
</div>
<p>This technique is completely general but it is awkward in the
one case where the commas and periods need to be swapped:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="nb">format</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="s2">&quot;6,f&quot;</span><span class="p">)</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span><span class="s2">&quot;,&quot;</span><span class="p">,</span> <span class="s2">&quot;X&quot;</span><span class="p">)</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span><span class="s2">&quot;.&quot;</span><span class="p">,</span> <span class="s2">&quot;,&quot;</span><span class="p">)</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span><span class="s2">&quot;X&quot;</span><span class="p">,</span> <span class="s2">&quot;.&quot;</span><span class="p">)</span>
</pre></div>
</div>
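<p>One possible way to avoid the three-step replace chain is a single translation pass, which swaps both characters at once and needs no placeholder such as the "X" above. This is only a sketch: the helper name <code class="docutils literal notranslate"><span class="pre">swap_separators</span></code> is hypothetical and not part of the proposal, and it assumes Python 3, where <code class="docutils literal notranslate"><span class="pre">str.maketrans</span></code> is available.</p>

```python
# Hypothetical helper: swap "," and "." in one pass, so no temporary
# placeholder character is needed.
_SWAP = str.maketrans({",": ".", ".": ","})


def swap_separators(text):
    """Return text with commas and periods exchanged."""
    return text.translate(_SWAP)


european = swap_separators(format(1234567.891, ",.2f"))
print(european)  # 1.234.567,89
```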
<p>The <em>width</em> argument means the total length including the commas
and decimal point:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="nb">format</span><span class="p">(</span><span class="mi">1234</span><span class="p">,</span> <span class="s2">&quot;08,d&quot;</span><span class="p">)</span> <span class="o">--&gt;</span> <span class="s1">&#39;0001,234&#39;</span>
<span class="nb">format</span><span class="p">(</span><span class="mf">1234.5</span><span class="p">,</span> <span class="s2">&quot;08,.1f&quot;</span><span class="p">)</span> <span class="o">--&gt;</span> <span class="s1">&#39;01,234.5&#39;</span>
</pre></div>
</div>
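<p>The width rule can also be checked with space padding instead of the zero padding shown above. In this sketch the sample values and widths are arbitrary; each result is exactly as long as the requested width because the commas and the decimal point count toward it:</p>

```python
padded_int = format(1234, "8,d")         # numbers are right-aligned by default
padded_float = format(1234.5, "10,.1f")
left_aligned = format(1234, "*<8,d")     # explicit fill character and alignment

print(repr(padded_int))    # '   1,234'
print(repr(padded_float))  # '   1,234.5'
print(repr(left_aligned))  # '1,234***'
```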
<p>The ',' option is defined as shown above for types 'd', 'e',
'f', 'g', 'E', 'G', '%', 'F' and '' (the empty type). To allow future extensions, it is
undefined for other types: binary, octal, hex, character,
etc.</p>
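<p>The sketch below tries the option with a few of the listed types. The PEP leaves the behavior for other types undefined; the <code class="docutils literal notranslate"><span class="pre">ValueError</span></code> branch reflects what CPython 3 does for the hex type, which is an observation about one implementation rather than something the proposal specifies.</p>

```python
value = 1234567.891

fixed = format(value, ",.2f")     # fixed point
default = format(value, ",")      # empty type
percent = format(12.5, ",.0%")    # percentage

print(fixed)    # 1,234,567.89
print(default)  # 1,234,567.891
print(percent)  # 1,250%

# Assumption: CPython 3 rejects "," combined with the hex type "x".
try:
    format(255, ",x")
    hex_supported = True
except ValueError:
    hex_supported = False
print(hex_supported)
```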
<p>This proposal has the virtue of being simpler than the alternative
proposal but is much less flexible and meets the needs of fewer
users right out of the box. It is expected that some other
solution will arise for specifying alternative separators.</p>
</section>
<section id="current-version-of-the-mini-language">
<h2><a class="toc-backref" href="#current-version-of-the-mini-language" role="doc-backlink">Current Version of the Mini-Language</a></h2>
<ul class="simple">
<li><a class="reference external" href="https://docs.python.org/2.6/library/string.html#formatstrings">Python 2.6 docs</a></li>
<li><a class="pep reference internal" href="../pep-3101/" title="PEP 3101 Advanced String Formatting">PEP 3101</a> Advanced String Formatting</li>
</ul>
</section>
<section id="research-into-what-other-languages-do">
<h2><a class="toc-backref" href="#research-into-what-other-languages-do" role="doc-backlink">Research into what Other Languages Do</a></h2>
<p>Scanning the web, I've found that thousands separators are
usually one of COMMA, DOT, SPACE, APOSTROPHE or UNDERSCORE.</p>
<p><a class="reference external" href="http://blog.stevex.net/index.php/string-formatting-in-csharp/">C-Sharp</a> provides both styles (picture formatting and type specifiers).
The type specifier approach is locale aware. The picture formatting only
offers a COMMA as a thousands separator:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">String</span><span class="o">.</span><span class="n">Format</span><span class="p">(</span><span class="s2">&quot;</span><span class="si">{0:n}</span><span class="s2">&quot;</span><span class="p">,</span> <span class="mi">12400</span><span class="p">)</span> <span class="o">==&gt;</span> <span class="s2">&quot;12,400&quot;</span>
<span class="n">String</span><span class="o">.</span><span class="n">Format</span><span class="p">(</span><span class="s2">&quot;{0:0,0}&quot;</span><span class="p">,</span> <span class="mi">12400</span><span class="p">)</span> <span class="o">==&gt;</span> <span class="s2">&quot;12,400&quot;</span>
</pre></div>
</div>
<p><a class="reference external" href="http://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node200.html">Common Lisp</a> uses a COLON before the <code class="docutils literal notranslate"><span class="pre">~D</span></code> decimal type specifier to
emit a COMMA as a thousands separator. The general form of <code class="docutils literal notranslate"><span class="pre">~D</span></code> is
<code class="docutils literal notranslate"><span class="pre">~mincol,padchar,commachar,commaintervalD</span></code>. The <em>padchar</em> defaults
to SPACE. The <em>commachar</em> defaults to COMMA. The <em>commainterval</em>
defaults to three.</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="p">(</span><span class="nb">format</span> <span class="n">nil</span> <span class="s2">&quot;~:D&quot;</span> <span class="mi">229345007</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="s2">&quot;229,345,007&quot;</span>
</pre></div>
</div>
<ul class="simple">
<li>The <a class="reference external" href="http://archive.adaic.com/standards/83lrm/html/lrm-02-04.html">ADA language</a> allows UNDERSCORES in its numeric literals.</li>
</ul>
<p>Visual Basic and its brethren (like <a class="reference external" href="http://www.brainbell.com/tutorials/ms-office/excel/Create_Custom_Number_Formats.htm">MS Excel</a>) use a completely
different style and have ultra-flexible custom format
specifiers like:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="s2">&quot;_($* #,##0_)&quot;</span><span class="o">.</span>
</pre></div>
</div>
<p><a class="reference external" href="http://en.wikipedia.org/wiki/Cobol#Syntactic_features">COBOL</a> uses picture clauses like:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span>PICTURE $***,**9.99CR
</pre></div>
</div>
<p>Java offers a <a class="reference external" href="http://java.sun.com/javase/6/docs/api/java/text/DecimalFormat.html">DecimalFormat class</a> that uses picture patterns (one
for positive numbers and an optional one for negatives) such as:
<code class="docutils literal notranslate"><span class="pre">&quot;#,##0.00;(#,##0.00)&quot;</span></code>. It allows arbitrary groupings including
hundreds and ten-thousands and uneven groupings. The special pattern
characters are non-localized (using a DOT for a decimal separator and
a COMMA for a grouping separator). The user can supply an alternate
set of symbols using the formatter's <em>DecimalFormatSymbols</em> object.</p>
</section>
<section id="alternative-proposal-from-eric-smith-originally-called-proposal-ii">
<h2><a class="toc-backref" href="#alternative-proposal-from-eric-smith-originally-called-proposal-ii" role="doc-backlink">Alternative Proposal (from Eric Smith, originally called Proposal II)</a></h2>
<p>Make both the thousands separator and decimal separator user
specifiable but not locale aware. For simplicity, limit the
choices to a COMMA, DOT, SPACE, APOSTROPHE or UNDERSCORE.
The SPACE can be either U+0020 or U+00A0.</p>
<p>Whenever a separator is followed by a precision, it is a
decimal separator and an optional separator preceding it is a
thousands separator. When the precision is absent, a lone
separator means a thousands separator:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="p">[[</span><span class="n">fill</span><span class="p">]</span><span class="n">align</span><span class="p">][</span><span class="n">sign</span><span class="p">][</span><span class="c1">#][0][width][tsep][dsep precision][type]</span>
</pre></div>
</div>
<p>Examples:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="nb">format</span><span class="p">(</span><span class="mi">1234</span><span class="p">,</span> <span class="s2">&quot;8.1f&quot;</span><span class="p">)</span> <span class="o">--&gt;</span> <span class="s1">&#39;  1234.0&#39;</span>
<span class="nb">format</span><span class="p">(</span><span class="mi">1234</span><span class="p">,</span> <span class="s2">&quot;8,1f&quot;</span><span class="p">)</span> <span class="o">--&gt;</span> <span class="s1">&#39;  1234,0&#39;</span>
<span class="nb">format</span><span class="p">(</span><span class="mi">1234</span><span class="p">,</span> <span class="s2">&quot;8.,1f&quot;</span><span class="p">)</span> <span class="o">--&gt;</span> <span class="s1">&#39; 1.234,0&#39;</span>
<span class="nb">format</span><span class="p">(</span><span class="mi">1234</span><span class="p">,</span> <span class="s2">&quot;8 ,1f&quot;</span><span class="p">)</span> <span class="o">--&gt;</span> <span class="s1">&#39; 1 234,0&#39;</span>
<span class="nb">format</span><span class="p">(</span><span class="mi">1234</span><span class="p">,</span> <span class="s2">&quot;8d&quot;</span><span class="p">)</span> <span class="o">--&gt;</span> <span class="s1">&#39;    1234&#39;</span>
<span class="nb">format</span><span class="p">(</span><span class="mi">1234</span><span class="p">,</span> <span class="s2">&quot;8,d&quot;</span><span class="p">)</span> <span class="o">--&gt;</span> <span class="s1">&#39;   1,234&#39;</span>
<span class="nb">format</span><span class="p">(</span><span class="mi">1234</span><span class="p">,</span> <span class="s2">&quot;8_d&quot;</span><span class="p">)</span> <span class="o">--&gt;</span> <span class="s1">&#39;   1_234&#39;</span>
</pre></div>
</div>
<p>This proposal meets most needs, but it comes at the expense
of taking a bit more effort to parse. Not every possible
convention is covered, but at least one of the options (spaces
or underscores) should be readable, understandable, and useful
to folks from many diverse backgrounds.</p>
<p>As shown in the examples, the <em>width</em> argument means the total
length including the thousands separators and decimal separators.</p>
<p>No change is proposed for the locale module.</p>
<p>The thousands separator is defined as shown above for types
'd', 'e', 'f', 'g', '%', 'E', 'G' and 'F'. To allow future
extensions, it is undefined for other types: binary, octal,
hex, character, etc.</p>
<p>The drawback to this alternative proposal is the difficulty
of mentally parsing whether a single separator is a thousands
separator or decimal separator. Perhaps it is too arcane
to link the decimal separator with the precision specifier.</p>
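<p>The alternative syntax can be approximated with the main proposal's comma option plus a translation step. In the sketch below, the function <code class="docutils literal notranslate"><span class="pre">format_with_separators</span></code> and its parameters are hypothetical names; it assumes Python 3, single-character separators, and handles only the integer and fixed-point cases.</p>

```python
def format_with_separators(value, width, precision=None, tsep="", dsep="."):
    """Hypothetical emulation of [width][tsep][dsep precision].

    Without a precision the value is formatted as an integer ("d");
    with one it is formatted as fixed point ("f").
    """
    if precision is None:
        body = format(value, ",d")
    else:
        body = format(value, ",.%df" % precision)
    # One pass: "," becomes tsep (or is dropped), "." becomes dsep.
    table = {ord(","): tsep or None, ord("."): dsep}
    return body.translate(table).rjust(width)


print(repr(format_with_separators(1234, 8, 1)))                      # '  1234.0'
print(repr(format_with_separators(1234, 8, 1, tsep=".", dsep=",")))  # ' 1.234,0'
print(repr(format_with_separators(1234, 8, 1, tsep=" ", dsep=",")))  # ' 1 234,0'
print(repr(format_with_separators(1234, 8, tsep="_")))               # '   1_234'
```

<p>Passing the separators as keyword arguments sidesteps the parsing ambiguity described above, at the cost of being a function call rather than a format specifier.</p>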
</section>
<section id="commentary">
<h2><a class="toc-backref" href="#commentary" role="doc-backlink">Commentary</a></h2>
<ul class="simple">
<li>Some commenters do not like the idea of format strings at all
and find them to be unreadable. Suggested alternatives include
the COBOL-style PICTURE approach or a convenience function with
keyword arguments for every possible combination.</li>
<li>Some newsgroup respondents think there is no place for any
scripts that are not internationalized and that it is a step
backwards to provide a simple way to hardwire a particular choice
(thus reducing incentive to use a locale-sensitive approach).</li>
<li>Another thought is that embedding some particular convention in
individual format strings makes it hard to change that convention
later. No workable alternative was suggested but the general idea
is to set the convention once and have it apply everywhere (others
commented that locale already provides a way to do this).</li>
<li>There are some precedents for grouping digits in the fractional
part of a floating point number, but this PEP does not venture into
that territory. Only digits to the left of the decimal point are
grouped. This does not preclude future extensions; it just focuses
on a single, generally useful extension to the formatting language.</li>
<li>James Knight observed that Indian/Pakistani numbering systems
group by hundreds. Ben Finney noted that Chinese group by
ten-thousands. Eric Smith pointed out that these are already
handled by the “n” specifier in the locale module (albeit only
for integers). This PEP does not attempt to support all of those
possibilities. It focuses on a single, relatively common grouping
convention that offers a quick way to improve readability in many
(though not all) contexts.</li>
</ul>
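<p>The point about fractional digits can be checked directly. This is a quick sketch with an arbitrary sample value, assuming a Python version that implements this PEP; only the digits to the left of the decimal point receive separators:</p>

```python
text = format(1234567.1234567, ",.7f")
integer_part, fraction_part = text.split(".")

print(integer_part)   # 1,234,567
print(fraction_part)  # 1234567
```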
</section>
<section id="copyright">
<h2><a class="toc-backref" href="#copyright" role="doc-backlink">Copyright</a></h2>
<p>This document has been placed in the public domain.</p>
</section>
</section>
<hr class="docutils" />
<p>Source: <a class="reference external" href="https://github.com/python/peps/blob/main/peps/pep-0378.rst">https://github.com/python/peps/blob/main/peps/pep-0378.rst</a></p>
<p>Last modified: <a class="reference external" href="https://github.com/python/peps/commits/main/peps/pep-0378.rst">2023-10-11 12:05:51 GMT</a></p>
</article>
<nav id="pep-sidebar">
<h2>Contents</h2>
<ul>
<li><a class="reference internal" href="#motivation">Motivation</a></li>
<li><a class="reference internal" href="#main-proposal-from-alyssa-coghlan-originally-called-proposal-i">Main Proposal (from Alyssa Coghlan, originally called Proposal I)</a></li>
<li><a class="reference internal" href="#current-version-of-the-mini-language">Current Version of the Mini-Language</a></li>
<li><a class="reference internal" href="#research-into-what-other-languages-do">Research into what Other Languages Do</a></li>
<li><a class="reference internal" href="#alternative-proposal-from-eric-smith-originally-called-proposal-ii">Alternative Proposal (from Eric Smith, originally called Proposal II)</a></li>
<li><a class="reference internal" href="#commentary">Commentary</a></li>
<li><a class="reference internal" href="#copyright">Copyright</a></li>
</ul>
<br>
<a id="source" href="https://github.com/python/peps/blob/main/peps/pep-0378.rst">Page Source (GitHub)</a>
</nav>
</section>
<script src="../_static/colour_scheme.js"></script>
<script src="../_static/wrap_tables.js"></script>
<script src="../_static/sticky_banner.js"></script>
</body>
</html>