PEP: 378
Title: Format Specifier for Thousands Separator
Version: $Revision$
Last-Modified: $Date$
Author: Raymond Hettinger
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 12-Mar-2009
Python-Version: 2.7 and 3.1
Post-History: 12-Mar-2009


Motivation
==========

Provide a simple, non-locale aware way to format a number
with a thousands separator.

Adding thousands separators is one of the simplest ways to
improve the professional appearance and readability of
output exposed to end users.

In the finance world, output with commas is the norm.  Finance
users and non-professional programmers find the locale approach
to be frustrating, arcane and non-obvious.

It is not the goal to replace locale or to accommodate every
possible convention.  The goal is to make a common task easier
for many users.


Current Version of the Mini-Language
====================================

* `Python 2.6 docs`_

  .. _Python 2.6 docs: http://docs.python.org/library/string.html#formatstrings

* PEP 3101 Advanced String Formatting


Research so far
===============

Scanning the web, I've found that thousands separators are usually
one of COMMA, DOT, SPACE, or UNDERSCORE.  When a COMMA is the decimal
separator, the thousands separator is typically a DOT or SPACE (see
examples from Denis Spir).

James Knight observed that Indian/Pakistani numbering systems
group by hundreds.  Ben Finney noted that Chinese group by
ten-thousands.  Eric Smith pointed out that these are already
handled by the "n" specifier in the locale module (albeit only
for integers).

Visual Basic and its brethren (like MS Excel) use a completely
different style and have ultra-flexible custom format
specifiers like::

    "_($* #,##0_)".

COBOL uses picture clauses like::

    PIC $***,**9.99CR

`Common Lisp`_ uses a COLON before the ``~D`` decimal type specifier to
emit a COMMA as a thousands separator.  The general form of ``~D`` is
``~mincol,padchar,commachar,commaintervalD``.
The *padchar* defaults to SPACE.
The *commachar* defaults to COMMA.  The *commainterval* defaults
to three.

::

  (format nil "~:D" 229345007)  =>  "229,345,007"

.. _`Common Lisp`: http://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node200.html


Proposal I (from Nick Coghlan)
==============================

A comma will be added to the format() specifier mini-language:

    [[fill]align][sign][#][0][width][,][.precision][type]

The ',' option indicates that commas should be included in the
output as a thousands separator.  As with locales which do not
use a period as the decimal point, locales which use a
different convention for digit separation will need to use the
locale module to obtain appropriate formatting.

The proposal works well with floats, ints, and decimals.
It also allows easy substitution for other separators.
For example::

  format(n, "6,f").replace(",", "_")

This technique is completely general, but it is awkward in the
one case where the commas and periods need to be swapped::

  format(n, "6,f").replace(",", "X").replace(".", ",").replace("X", ".")


Proposal II (to meet Antoine Pitrou's request)
==============================================

Make both the thousands separator and decimal separator user
specifiable but not locale aware.  For simplicity, limit the
choices to a comma, period, space, or underscore.

    [[fill]align][sign][#][0][width][T[tsep]][dsep precision][type]

Examples::

  format(1234, "8.1f")     -->    '  1234.0'
  format(1234, "8,1f")     -->    '  1234,0'
  format(1234, "8T.,1f")   -->    ' 1.234,0'
  format(1234, "8T ,1f")   -->    ' 1 234,0'
  format(1234, "8d")       -->    '    1234'
  format(1234, "8T,d")     -->    '   1,234'

This proposal meets most needs (except for people wanting
grouping for hundreds or ten-thousands), but it comes at the
expense of being a little more complicated to learn and
remember.  Also, it makes it more challenging to write custom
__format__ methods that follow the format specification
mini-language.

No change is proposed for the locale module.

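The behavior that Proposal II describes can be approximated today
with a small pure-Python function.  The sketch below is only an
illustration: the helper name ``format_grouped`` and its keyword
parameters (``width``, ``tsep``, ``dsep``, ``precision``) are
hypothetical and are not part of either proposal.  It assumes
right alignment, groups of three digits, and only the "d" and "f"
presentation types.

```python
def format_grouped(value, width=0, tsep="", dsep=".", precision=None):
    """Hypothetical helper approximating [width][T[tsep]][dsep precision].

    precision=None behaves like the "d" type; an integer precision
    behaves like the "f" type.  An empty tsep means "no grouping".
    """
    allowed = (",", ".", " ", "_")
    if tsep and tsep not in allowed:
        raise ValueError("tsep must be a comma, period, space, or underscore")
    if dsep not in allowed:
        raise ValueError("dsep must be a comma, period, space, or underscore")

    if precision is None:
        # Integer presentation: no fractional part.
        digits, frac = "%d" % abs(value), ""
    else:
        # Fixed-point presentation: split at the "." that %f emits.
        digits, _, frac = ("%.*f" % (precision, abs(value))).partition(".")

    # Group the integer digits in threes, working from the right.
    groups = []
    while len(digits) > 3:
        groups.insert(0, digits[-3:])
        digits = digits[:-3]
    groups.insert(0, digits)

    text = tsep.join(groups)
    if frac:
        text += dsep + frac
    if value < 0:
        text = "-" + text
    return text.rjust(width)


# Counterpart of format(1234, "8T.,1f") in Proposal II.
print(repr(format_grouped(1234, 8, tsep=".", dsep=",", precision=1)))  # ' 1.234,0'
# Counterpart of format(1234, "8T,d") in Proposal II.
print(repr(format_grouped(1234, 8, tsep=",")))  # '   1,234'
```

Because the separators are passed explicitly, the comma/period
swap that is awkward under Proposal I needs no chained
``replace()`` calls here.
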

Other Ideas
===========

* Lie Ryan suggested a convenience function of the form::

      create_format(self, type='i', base=16, seppos=4, sep=':',
                    charset='0123456789abcdef', maxwidth=32,
                    minwidth=32, pad='0')

* Eric Smith would like the C version of the mini-language
  parser to be exposed.  That would make it easier to write
  custom __format__ methods.


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: