From 964ee28e3f66359696bae391df6d3337ea0a5508 Mon Sep 17 00:00:00 2001
From: Raymond Hettinger
Date: Fri, 13 Mar 2009 23:28:44 +0000
Subject: [PATCH] Update PEP:

* Summarize commentary to date.
* Add APOSTROPHE and non-breaking SPACE to the list of separators.
* Add more links to external references.
* Detail issues with the locale module.
* Clarify how proposal II is parsed.
---
 pep-0378.txt | 63 +++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 50 insertions(+), 13 deletions(-)

diff --git a/pep-0378.txt b/pep-0378.txt
index cdb1d4b76..122dcb991 100644
--- a/pep-0378.txt
+++ b/pep-0378.txt
@@ -25,6 +25,16 @@ In the finance world, output with commas is the norm. Finance users
 and non-professional programmers find the locale approach to be
 frustrating, arcane and non-obvious.
 
+The locale module presents two other challenges. First, it is
+a global setting and not suitable for multi-threaded apps that
+need to serve-up requests in multiple locales. Second, the
+name of a relevant locale (perhaps "de_DE") can vary from
+platform to platform or may not be defined at all. The docs
+for the locale module describe these and `other challenges`_
+in detail.
+
+.. _`other challenges`: http://docs.python.org/library/locale.html#background-details-hints-tips-and-caveats
+
 It is not the goal to replace locale or to accommodate every
 possible convention. The goal is to make a common task easier
 for many users.
@@ -54,17 +64,19 @@ ten-thousands.
 Eric Smith pointed-out that these are already handled by the "n"
 specifier in the locale module (albeit only for integers).
 
-Visual Basic and its brethren (like MS Excel) use a completely
+Visual Basic and its brethren (like `MS Excel`_) use a completely
 different style and have ultra-flexible custom format
 specifiers like::
 
     "_($* #,##0_)".
 
+.. _`MS Excel`: http://www.brainbell.com/tutorials/ms-office/excel/Create_Custom_Number_Formats.htm
+
 `COBOL`_ uses picture clauses like::
 
-    PIC $***,**9.99CR
+    PICTURE $***,**9.99CR
 
-.. _`COBOL`: http://en.wikipedia.org/wiki/Cobol
+.. _`COBOL`: http://en.wikipedia.org/wiki/Cobol#Syntactic_features
 
 `Common Lisp`_ uses a COLON before the ``~D`` decimal type specifier to
 emit a COMMA as a thousands separator. The general form of ``~D`` is
@@ -91,7 +103,7 @@ offers a COMMA as a thousands separator::
 Proposal I (from Nick Coghlan)
 ==============================
 
-A comma will be added to the format() specifier mini-language:
+A comma will be added to the format() specifier mini-language::
 
     [[fill]align][sign][#][0][width][,][.precision][type]
 
@@ -124,15 +136,15 @@ Proposal II (from Eric Smith)
 
 Make both the thousands separator and decimal separator user
 specifiable but not locale aware. For simplicity, limit the
-choices to a COMMA, DOT, SPACE, or UNDERSCORE.
+choices to a COMMA, DOT, SPACE, APOSTROPHE or UNDERSCORE.
+The SPACE can be either U+0020 or U+00A0.
 
 Whenever the separator is followed by a precision, it is a
-decimal separator and the optional separator preceding it is a
-thousands separator. When the precision is absent, the
-context is integral and a lone specifier means a thousands
-separator::
+decimal separator and an optional separator preceding it is a
+thousands separator. When the precision is absent, a lone
+specifier means a thousands separator::
 
-[[fill]align][sign][#][0][width][tsep|([tsep] dsep precision)][type]
+[[fill]align][sign][#][0][width][tsep][dsep precision]][type]
 
 Examples::
 
@@ -142,13 +154,12 @@ Examples::
     format(1234, "8 ,f")  -->  ' 1 234,0'
     format(1234, "8d")    -->  '    1234'
     format(1234, "8,d")   -->  '   1,234'
+    format(1234, "8_d")   -->  '   1_234'
 
 This proposal meets mosts needs (except for people wanting grouping
 for hundreds or ten-thousands), but it comes at the expense of
 being a little more complicated to learn and
-remember. Also, it makes it more challenging to write custom
-__format__ methods that follow the format specification
-mini-language.
+remember.
 
 As shown in the examples, the *width* argument means the total
 length including the thousands separators and decimal separators.
@@ -179,6 +190,32 @@ Other Ideas
   to write custom __format__ methods. That way
   Decimal.__format__ would not have to be written from scratch.
 
+* Antoine Pitrou noted that the provision for a SPACE separator
+  should also allow a non-breaking space (U+00A0).
+
+* A poster on the newgroup, Wolfgang Rohdewald, noted that a
+  convention in Switzerland is use an APOSTROPHE as a
+  thousands separator, ``12`000.99``.
+
+
+Commentary
+==========
+
+* Some commenters do not like the idea of format strings at all
+  and find them to be unreadable. Suggested alternatives include
+  the COBOL style PICTURE approach or a convenience function with
+  keyword arguments for every possible combination.
+
+* Some newsgroup respondants think there is no place for any
+  scripts that are not internationalized and that it is a step
+  backwards to provide a simple way to hardwire a given convention.
+
+* Another thought is that embedding some particular convention in
+  individual format strings makes it hard to change that convention
+  later. No workable alternative was suggested but the general idea
+  is to set the convention once and have it apply everywhere (others
+  commented that locale already does this).
+
 
 Copyright
 =========