Updated based on collected feedback.

2006-05-07 01:49:43 +00:00 · 9eac7defd7
parent 36abb0678f
commit 9eac7defd7
1 changed file with 89 additions and 43 deletions
--- a/pep-3101.txt
+++ b/pep-3101.txt
@@ -8,7 +8,7 @@ Type: Standards
 Content-Type: text/plain
 Created: 16-Apr-2006
 Python-Version: 3.0
-Post-History:
+Post-History: 28-Apr-2006


 Abstract
@@ -50,8 +50,8 @@ Specification

    The specification will consist of 4 parts:

-    - Specification of a set of methods to be added to the built-in
-      string class.
+    - Specification of a new formatting method to be added to the
+      built-in string class.

    - Specification of a new syntax for format strings.

@@ -99,13 +99,13 @@ Format Strings
        "My name is Fred :-{}"

    The element within the braces is called a 'field'.  Fields consist
-    of a name, which can either be simple or compound, and an optional
-    'conversion specifier'.
+    of a 'field name', which can either be simple or compound, and an
+    optional 'conversion specifier'.

-    Simple names are either names or numbers.  If numbers, they must
-    be valid decimal numbers; if names, they must be valid Python
-    identifiers.  A number is used to identify a positional argument,
-    while a name is used to identify a keyword argument.
+    Simple field names are either names or numbers. If numbers, they
+    must be valid base-10 integers; if names, they must be valid
+    Python identifiers.  A number is used to identify a positional
+    argument, while a name is used to identify a keyword argument.

    Compound names are a sequence of simple names separated by
    periods:
@@ -118,8 +118,9 @@ Format Strings
    positional argument 0.

    Each field can also specify an optional set of 'conversion
-    specifiers'.  Conversion specifiers follow the field name, with a
-    colon (':') character separating the two:
+    specifiers' which can be used to adjust the format of that field.
+    Conversion specifiers follow the field name, with a colon (':')
+    character separating the two:

        "My name is {0:8}".format('Fred')

@@ -130,11 +131,15 @@ Format Strings

    The conversion specifier consists of a sequence of zero or more
    characters, each of which can consist of any printable character
-    except for a non-escaped '}'.  The format() method does not
-    attempt to interpret the conversion specifiers in any way; it
-    merely passes all of the characters between the first colon ':'
-    and the matching right brace ('}') to the various underlying
-    formatters (described later.)
+    except for a non-escaped '}'.
+    
+    Conversion specifiers can themselves contain replacement fields;
+    this will be described in a later section.  Except for this
+    replacement, the format() method does not attempt to interpret the
+    conversion specifiers in any way; it merely passes all of the
+    characters between the first colon ':' and the matching right
+    brace ('}') to the various underlying formatters (described
+    later.)


 Standard Conversion Specifiers
@@ -142,16 +147,19 @@ Standard Conversion Specifiers
    For most built-in types, the conversion specifiers will be the
    same or similar to the existing conversion specifiers used with
    the '%' operator.  Thus, instead of '%02.2x', you will say
-    '{0:2.2x}'.
+    '{0:02.2x}'.

    There are a few differences however:

    - The trailing letter is optional - you don't need to say '2.2d',
-      you can instead just say '2.2'.  If the letter is omitted, the
-      value will be converted into its 'natural' form (that is, the
-      form that it would take if str() or unicode() were called on it)
-      subject to the field length and precision specifiers (if
-      supplied).
+      you can instead just say '2.2'.  If the letter is omitted, a
+      default will be assumed based on the type of the argument.
+      The defaults will be as follows:
+      
+        string or unicode object: 's'
+        integer: 'd'
+        floating-point number: 'f'
+        all other types: 's'

    - Variable field width specifiers use a nested version of the {}
      syntax, allowing the width specifier to be either a positional
@@ -159,10 +167,6 @@ Standard Conversion Specifiers

        "{0:{1}.{2}d}".format(a, b, c)

-      (Note: It might be easier to parse if these used a different
-      type of delimiter, such as parens - avoiding the need to create
-      a regex that handles the recursive case.)
-
    - The support for length modifiers (which are ignored by Python
      anyway) is dropped.
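
Reviewer note on the nested-width hunk above: a minimal sketch of how
nested replacement fields behave in the str.format() that was later
released (Python 2.6 and newer), which may differ from this draft.  The
variable names are illustrative only, and the 'f' type is an assumption
made here because released Python rejects a precision on integer types
such as 'd'.

```python
# Nested replacement fields, as implemented in released Python.
# The names value, width and precision are illustrative, not from the PEP.
value, width, precision = 3.14159, 10, 3

# {1} and {2} are substituted first, yielding the spec "10.3f" for {0}.
nested = "{0:{1}.{2}f}".format(value, width, precision)

# The same result with the width and precision written out literally.
direct = "{0:10.3f}".format(value)

print(repr(nested))  # '     3.142'
```

The nested form is useful when the column width is only known at run
time, for example when aligning a table to its widest entry.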

@@ -171,7 +175,7 @@ Standard Conversion Specifiers
    conversion specifiers are identical to the arguments to the
    strftime() function:

-        "Today is: {0:%x}".format(datetime.now())
+        "Today is: {0:%a %b %d %H:%M:%S %Y}".format(datetime.now())


 Controlling Formatting
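
Reviewer note on the strftime() hunk above: a small sketch, assuming the
behaviour of released Python, where a datetime object receives the
conversion specifier and passes it to strftime().  The fixed timestamp is
an assumption chosen so the result is reproducible; the draft's own
example uses datetime.now().

```python
from datetime import datetime

# Fixed timestamp instead of datetime.now(), so the result is stable.
moment = datetime(2006, 5, 7, 1, 49, 43)

# Everything after the first ':' in the field is the conversion specifier,
# so the later colons in %H:%M:%S belong to the strftime() pattern.
spec = "%a %b %d %H:%M:%S %Y"
formatted = "Today is: {0:%a %b %d %H:%M:%S %Y}".format(moment)

# Equivalent explicit call, for comparison.
expected = "Today is: " + moment.strftime(spec)
print(formatted == expected)
```

The day and month names produced by %a and %b depend on the active
locale.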
@@ -203,19 +207,37 @@ Controlling Formatting

 User-Defined Formatting Classes

-    The code that interprets format strings can be called explicitly
-    from user code.  This allows the creation of custom formatter
-    classes that can override the normal formatting rules.
-
-    The string and unicode classes will have a class method called
-    'cformat' that does all the actual work of formatting; the
-    format() method is just a wrapper that calls cformat.
+    There will be times when customizing the formatting of fields
+    on a per-type basis is not enough.  An example might be an
+    accounting application, which displays negative numbers in
+    parentheses rather than using a negative sign.
+    
+    The string formatting system facilitates this kind of application-
+    specific formatting by allowing user code to directly invoke
+    the code that interprets format strings and fields.  User-written
+    code can intercept the normal formatting operations on a per-field
+    basis, substituting its own formatting methods.
+    
+    For example, in the aforementioned accounting application, there
+    could be an application-specific number formatter, which reuses
+    the string.format templating code to do most of the work. The
+    API for such an application-specific formatter is up to the
+    application; here are several possible examples:
+    
+        cell_format( "The total is: {0}", total )
+        
+        TemplateString( "The total is: {0}" ).format( total )
+        
+    Creating an application-specific formatter is relatively straight-
+    forward.  The string and unicode classes will have a class method
+    called 'cformat' that does all the actual work of formatting; the
+    built-in format() method is just a wrapper that calls cformat.

    The parameters to the cformat function are:

        -- The format string (or unicode; the same function handles
           both.)
-        -- A field format hook (see below)
+        -- A callable 'format hook', which is called once per field
        -- A tuple containing the positional arguments
        -- A dict containing the keyword arguments

@@ -223,15 +245,16 @@ User-Defined Formatting Classes
    string, and return a new string (or unicode) with all of the
    fields replaced with their formatted values.

-    For each field, the cformat function will attempt to call the
-    field format hook with the following arguments:
+    The format hook is a callable object supplied by the user, which
+    is invoked once per field, and which can override the normal
+    formatting for that field.  For each field, the cformat function
+    will attempt to call the field format hook with the following
+    arguments:

-       field_hook(value, conversion, buffer)
+       format_hook(value, conversion, buffer)

    The 'value' field corresponds to the value being formatted, which
-    was retrieved from the arguments using the field name.  (The
-    field_hook has no control over the selection of values, only
-    how they are formatted.)
+    was retrieved from the arguments using the field name.

    The 'conversion' argument is the conversion spec part of the
    field, which will be either a string or unicode object, depending
@@ -299,11 +322,34 @@ Alternate Syntax

    - Other variations include Ruby's #{}, PHP's {$name}, and so
      on.
-
+      
+    Some specific aspects of the syntax warrant additional comments:
+    
+    1) The use of the backslash character for escapes.  A few people
+    suggested doubling the brace characters to indicate a literal
+    brace rather than using backslash as an escape character.  This is
+    also the convention used in the .Net libraries.  Here's how the
+    previously-given example would look with this convention:
+    
+        "My name is {0} :-{{}}".format('Fred')
+    
+    One problem with this syntax is that it conflicts with the use of
+    nested braces to allow parameterization of the conversion
+    specifiers:
+    
+        "{0:{1}.{2}}".format(a, b, c)
+        
+    (There are alternative solutions, but they are too long to go
+    into here.)
+    
+    2) The use of the colon character (':') as a separator for
+    conversion specifiers.  This was chosen simply because that's
+    what .Net uses.
+    

 Sample Implementation

-    A rought prototype of the underlying 'cformat' function has been
+    A rough prototype of the underlying 'cformat' function has been
    coded in Python; however, it needs much refinement before being
    submitted.
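
Reviewer note: the per-field hook described under "User-Defined
Formatting Classes" can be approximated today with the standard
library's string.Formatter class.  This is only a sketch under that
assumption; it is not the 'cformat' class method proposed in this
revision, and its format_field() method takes no 'buffer' argument.
The AccountingFormatter name is hypothetical.

```python
import string


class AccountingFormatter(string.Formatter):
    """Hypothetical formatter that shows negative numbers in parentheses."""

    def format_field(self, value, format_spec):
        # Called once per field, much like the format hook in the draft.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if is_number and value < 0:
            # Format the magnitude normally, then wrap it in parentheses.
            return "(" + format(-value, format_spec) + ")"
        # Everything else falls back to the normal formatting rules.
        return super().format_field(value, format_spec)


fmt = AccountingFormatter()
loss = fmt.format("The total is: {0:.2f}", -1234.5)
gain = fmt.format("The total is: {0:.2f}", 1234.5)

print(loss)  # The total is: (1234.50)
print(gain)  # The total is: 1234.50
```

Overriding a single per-field method keeps the template parsing in the
library and leaves only the application-specific rule in user code,
which is the division of labour the accounting example above asks for.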