From 7a38db59803821ec54655157c83cad2e4d9f87bf Mon Sep 17 00:00:00 2001
From: David Goodger
Date: Thu, 12 Mar 2009 12:43:24 +0000
Subject: [PATCH] added PEP 378, "Format Specifier for Thousands Separator", by Raymond Hettinger

---
 pep-0378.txt | 142 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 142 insertions(+)
 create mode 100644 pep-0378.txt

diff --git a/pep-0378.txt b/pep-0378.txt
new file mode 100644
index 000000000..6f0af3880
--- /dev/null
+++ b/pep-0378.txt
@@ -0,0 +1,142 @@
+PEP: 378
+Title: Format Specifier for Thousands Separator
+Version: $Revision$
+Last-Modified: $Date$
+Author: Raymond Hettinger
+Status: Draft
+Type: Standards Track
+Content-Type: text/x-rst
+Created: 12-Mar-2009
+Post-History: 12-Mar-2009
+
+
+Motivation
+==========
+
+Provide a simple, non-locale aware way to format a number
+with a thousands separator.
+
+Adding thousands separators is one of the simplest ways to
+improve the professional appearance and readability of output
+exposed to end users.
+
+In the finance world, output with commas is the norm. Finance
+users and non-professional programmers find the locale
+approach to be frustrating, arcane and non-obvious.
+
+It is not the goal to replace locale or to accommodate every
+possible convention. The goal is to make a common task easier
+for many users.
+
+
+Current Version of the Mini-Language
+====================================
+
+* `Python 2.6 docs`_
+
+  .. _Python 2.6 docs: http://docs.python.org/library/string.html#formatstrings
+
+* PEP 3101 Advanced String Formatting
+
+
+Research so far
+===============
+
+Scanning the web, I've found that thousands separators are
+usually one of COMMA, DOT, SPACE, or UNDERSCORE.
+When a COMMA is the decimal separator, the thousands separator
+is typically a DOT or SPACE (see examples from Denis Spir).
+
+James Knight observed that Indian/Pakistani numbering systems
+group by hundreds. Ben Finney noted that Chinese group by
+ten-thousands.
+Eric Smith pointed out that these are already
+handled by the "n" specifier in the locale module (albeit only
+for integers).
+
+Visual Basic and its brethren (like MS Excel) use a completely
+different style and have ultra-flexible custom format
+specifiers like: "_($* #,##0_)".
+
+
+Proposal I (from Nick Coghlan)
+==============================
+
+A comma will be added to the format() specifier mini-language:
+
+[[fill]align][sign][#][0][width][,][.precision][type]
+
+The ',' option indicates that commas should be included in the
+output as a thousands separator. As with locales which do not
+use a period as the decimal point, locales which use a
+different convention for digit separation will need to use the
+locale module to obtain appropriate formatting.
+
+The proposal works well with floats, ints, and decimals.
+It also allows easy substitution for other separators.
+For example::
+
+    format(n, "6,f").replace(",", "_")
+
+This technique is completely general but it is awkward in the
+one case where the commas and periods need to be swapped::
+
+    format(n, "6,f").replace(",", "X").replace(".", ",").replace("X", ".")
+
+
+Proposal II (to meet Antoine Pitrou's request)
+==============================================
+
+Make both the thousands separator and decimal separator user
+specifiable but not locale aware. For simplicity, limit the
+choices to a comma, period, space, or underscore.
+
+[[fill]align][sign][#][0][width][T[tsep]][dsep precision][type]
+
+Examples::
+
+    format(1234, "8.1f")    -->  '  1234.0'
+    format(1234, "8,1f")    -->  '  1234,0'
+    format(1234, "8T.,1f")  -->  ' 1.234,0'
+    format(1234, "8T ,1f")  -->  ' 1 234,0'
+    format(1234, "8d")      -->  '    1234'
+    format(1234, "8T,d")    -->  '   1,234'
+
+This proposal meets most needs (except for people wanting
+grouping for hundreds or ten-thousands), but it comes at the
+expense of being a little more complicated to learn and
+remember.
+Also, it makes it more challenging to write custom
+__format__ methods that follow the format specification
+mini-language.
+
+No change is proposed for the locale module.
+
+
+Other Ideas
+===========
+
+* Lie Ryan suggested a convenience function of the form::
+
+      create_format(self, type='i', base=16, seppos=4, sep=':', \
+                    charset='0123456789abcdef', maxwidth=32, \
+                    minwidth=32, pad='0')
+
+* Eric Smith would like the C version of the mini-language
+  parser to be exposed. That would make it easier to write
+  custom __format__ methods.
+
+
+Copyright
+=========
+
+This document has been placed in the public domain.
+
+
+
+..
+   Local Variables:
+   mode: indented-text
+   indent-tabs-mode: nil
+   sentence-end-double-space: t
+   fill-column: 70
+   coding: utf-8
+   End: