PEP: 3101
Title: Advanced String Formatting
Version: $Revision$
Last-Modified: $Date$
Author: Talin <talin at acm.org>
Status: Draft
Type: Standards
Content-Type: text/plain
Created: 16-Apr-2006
Python-Version: 3.0
Post-History: 28-Apr-2006


Abstract

    This PEP proposes a new system for built-in string formatting
    operations, intended as a replacement for the existing '%' string
    formatting operator.


Rationale

    Python currently provides two methods of string interpolation:

    - The '%' operator for strings. [1]

    - The string.Template module. [2]

    The scope of this PEP will be restricted to proposals for built-in
    string formatting operations (in other words, methods of the
    built-in string type).
    
    The '%' operator is primarily limited by the fact that it is a
    binary operator, and therefore can take at most two arguments.
    One of those arguments is already dedicated to the format string,
    leaving all other variables to be squeezed into the remaining
    argument.  The current practice is to use either a dictionary or a
    tuple as the second argument, but as many people have commented
    [3], this lacks flexibility.  The "all or nothing" approach
    (meaning that one must choose between only positional arguments,
    or only named arguments) is felt to be overly constraining.
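
    As a brief illustration of the two forms described above (the
    names and values are placeholders chosen for this sketch), the
    '%' operator accepts either a tuple or a dictionary as its
    right-hand operand:

```python
# Positional form: the right-hand operand is a tuple.
positional = "%s is %d years old" % ("Fred", 42)

# Named form: the right-hand operand is a dictionary.
named = "%(name)s is %(age)d years old" % {"name": "Fred", "age": 42}

print(positional)
print(named)
```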

    While there is some overlap between this proposal and
    string.Template, it is felt that each serves a distinct need,
    and that one does not obviate the other.  In any case,
    string.Template will not be discussed here.


Specification

    The specification will consist of 4 parts:

    - Specification of a new formatting method to be added to the
      built-in string class.

    - Specification of a new syntax for format strings.

    - Specification of a new set of class methods to control the
      formatting and conversion of objects.

    - Specification of an API for user-defined formatting classes.


String Methods

    The built-in string class will gain a new method, 'format',
    which takes an arbitrary number of positional and keyword
    arguments:

        "The story of {0}, {1}, and {c}".format(a, b, c=d)

    Within a format string, each positional argument is identified
    with a number, starting from zero, so in the above example, 'a' is
    argument 0 and 'b' is argument 1.  Each keyword argument is
    identified by its keyword name, so in the above example, 'c' is
    used to refer to the third argument.

    The result of the format call is an object of the same type
    (string or unicode) as the format string.
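
    A minimal sketch of the call shown above, using placeholder
    values for 'a', 'b' and 'd' (these values are assumptions made
    for the illustration).  It runs against the str.format method of
    released Python, which accepts this particular format string:

```python
a, b, d = "Alice", "Bob", "Carol"

# {0} and {1} select positional arguments; {c} selects the keyword 'c'.
result = "The story of {0}, {1}, and {c}".format(a, b, c=d)
print(result)
```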


Format Strings

    Brace characters ('curly braces') are used to indicate a
    replacement field within the string:

        "My name is {0}".format('Fred')

    The result of this is the string:

        "My name is Fred"

    Braces can be escaped using a backslash:

        "My name is {0} :-\{\}".format('Fred')

    Which would produce:

        "My name is Fred :-{}"

    The element within the braces is called a 'field'.  Fields consist
    of a 'field name', which can either be simple or compound, and an
    optional 'conversion specifier'.

    Simple field names are either names or numbers. If numbers, they
    must be valid base-10 integers; if names, they must be valid
    Python identifiers.  A number is used to identify a positional
    argument, while a name is used to identify a keyword argument.

    Compound names are a sequence of simple names separated by
    periods:

        "My name is {0.name} :-\{\}".format(dict(name='Fred'))

    Compound names can be used to access specific dictionary entries,
    array elements, or object attributes.  In the above example, the
    '{0.name}' field refers to the dictionary entry 'name' within
    positional argument 0.
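
    The lookup rules for simple and compound names can be sketched
    as follows.  The helper 'resolve_field' is hypothetical and is
    not part of this proposal; the order in which it tries dictionary
    entries, sequence elements, and attributes is an assumption made
    for the sketch.

```python
def resolve_field(name, args, kwargs):
    """Hypothetical resolver for simple and compound field names."""
    first, *rest = name.split(".")
    # A base-10 integer selects a positional argument; a name selects
    # a keyword argument.
    value = args[int(first)] if first.isdigit() else kwargs[first]
    for part in rest:
        if isinstance(value, dict):
            value = value[part]           # dictionary entry (string key)
        elif part.isdigit():
            value = value[int(part)]      # sequence element
        else:
            value = getattr(value, part)  # object attribute
    return value


print(resolve_field("0.name", (dict(name="Fred"),), {}))
```

    With this sketch, the field name '0.name' applied to the
    arguments of the example above yields 'Fred'.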

    Each field can also specify an optional set of 'conversion
    specifiers' which can be used to adjust the format of that field.
    Conversion specifiers follow the field name, with a colon (':')
    character separating the two:

        "My name is {0:8}".format('Fred')

    The meaning and syntax of the conversion specifiers depend on the
    type of object that is being formatted; however, many of the
    built-in types will recognize a standard set of conversion
    specifiers.
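
    As an illustration of how the same specifier can mean different
    things for different types, the sketch below applies a width of
    8 to a string and to an integer.  It relies on the behaviour of
    str.format in released Python, which is assumed here to match
    the intent of this draft: strings are padded on the right and
    numbers on the left.

```python
# A string is left-aligned within the 8-character field.
padded = "My name is {0:8}".format("Fred")

# An integer is right-aligned within the 8-character field.
number = "{0:8}".format(42)

print(repr(padded))
print(repr(number))
```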

    The conversion specifier consists of a sequence of zero or more
    characters, each of which can consist of any printable character
    except for a non-escaped '}'.
    
    Conversion specifiers can themselves contain replacement fields;
    this will be described in a later section.  Except for this
    replacement, the format() method does not attempt to interpret the
    conversion specifiers in any way; it merely passes all of the
    characters between the first colon (':') and the matching right
    brace ('}') to the various underlying formatters (described
    later).


Standard Conversion Specifiers

    For most built-in types, the conversion specifiers will be the
    same or similar to the existing conversion specifiers used with
    the '%' operator.  Thus, instead of '%02.2x', you will say
    '{0:02.2x}'.

    There are a few differences, however:

    - The trailing letter is optional - you don't need to say '2.2d',
      you can instead just say '2.2'.  If the letter is omitted, a
      default will be assumed based on the type of the argument.
      The defaults will be as follows:
      
        string or unicode object: 's'
        integer: 'd'
        floating-point number: 'f'
        all other types: 's'

    - Variable field width specifiers use a nested version of the {}
      syntax, allowing the width specifier to be either a positional
      or keyword argument:

        "{0:{1}.{2}d}".format(a, b, c)

    - The support for length modifiers (which are ignored by Python
      anyway) is dropped.
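
    The nested width syntax from the list above can be tried with
    str.format in released Python.  The sketch uses the 'f'
    conversion instead of 'd', because released Python rejects a
    precision on integer conversions; the values are placeholders.

```python
# Width and precision come from positional arguments 1 and 2.
by_position = "{0:{1}.{2}f}".format(3.14159, 10, 2)

# The same idea with keyword arguments.
by_keyword = "{value:{width}.{prec}f}".format(value=2.5, width=8, prec=3)

print(repr(by_position))
print(repr(by_keyword))
```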

    For non-built-in types, the conversion specifiers will be specific
    to that type.  An example is the 'datetime' class, whose
    conversion specifiers are identical to the arguments to the
    strftime() function:

        "Today is: {0:%a %b %d %H:%M:%S %Y}".format(datetime.now())
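
    A runnable version of the datetime example, assuming the
    behaviour of released Python, where datetime objects hand the
    conversion specifier to strftime().  A fixed, arbitrary timestamp
    replaces datetime.now() so that the result is reproducible.

```python
from datetime import datetime

# A fixed timestamp keeps the example reproducible; the date is arbitrary.
moment = datetime(2006, 4, 16, 12, 30, 0)

formatted = "Today is: {0:%a %b %d %H:%M:%S %Y}".format(moment)
print(formatted)
```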


Controlling Formatting

    A class that wishes to implement a custom interpretation of its
    conversion specifiers can implement a __format__ method:

        class AST:
            def __format__(self, specifiers):
                ...

    The 'specifiers' argument will be either a string object or a
    unicode object, depending on the type of the original format
    string.  The __format__ method should test the type of the
    specifiers parameter to determine whether to return a string or
    unicode object.  It is the responsibility of the __format__ method
    to return an object of the proper type.

    string.format() will format each field using the following steps:

     1) See if the value to be formatted has a __format__ method.  If
        it does, then call it.

     2) Otherwise, check the internal formatter within string.format
        that contains knowledge of certain builtin types.

     3) Otherwise, call str() or unicode() as appropriate.
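
    The first and last of these steps can be observed in released
    Python, as sketched below.  The classes 'Amount' and 'Plain' and
    the 'paren' specifier are invented for this illustration.  In
    released Python the fallback to str() is provided by the
    __format__ method inherited from 'object', which is a detail
    this draft does not specify.

```python
class Amount:
    """Hypothetical value type used only for this illustration."""

    def __init__(self, value):
        self.value = value

    def __format__(self, specifiers):
        # 'paren' is an invented specifier: show negatives in parentheses.
        if specifiers == "paren" and self.value < 0:
            return "({0})".format(-self.value)
        return str(self.value)


class Plain:
    """Defines no __format__ of its own, so str() supplies the text."""

    def __str__(self):
        return "plain object"


custom = "{0:paren}".format(Amount(-5))
fallback = "{0}".format(Plain())
print(custom)
print(fallback)
```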


User-Defined Formatting Classes

    There will be times when customizing the formatting of fields
    on a per-type basis is not enough.  An example might be an
    accounting application, which displays negative numbers in
    parentheses rather than using a negative sign.
    
    The string formatting system facilitates this kind of application-
    specific formatting by allowing user code to directly invoke
    the code that interprets format strings and fields.  User-written
    code can intercept the normal formatting operations on a per-field
    basis, substituting their own formatting methods.
    
    For example, in the aforementioned accounting application, there
    could be an application-specific number formatter, which reuses
    the string.format templating code to do most of the work. The
    API for such an application-specific formatter is up to the
    application; here are several possible examples:
    
        cell_format( "The total is: {0}", total )
        
        TemplateString( "The total is: {0}" ).format( total )
        
    Creating an application-specific formatter is relatively
    straightforward.  The string and unicode classes will have a class
    method called 'cformat' that does all the actual work of
    formatting; the built-in format() method is just a wrapper that
    calls cformat.

    The parameters to the cformat function are:

        -- The format string (or unicode; the same function handles
           both.)
        -- A callable 'format hook', which is called once per field
        -- A tuple containing the positional arguments
        -- A dict containing the keyword arguments

    The cformat function will parse all of the fields in the format
    string, and return a new string (or unicode) with all of the
    fields replaced with their formatted values.

    The format hook is a callable object supplied by the user, which
    is invoked once per field, and which can override the normal
    formatting for that field.  For each field, the cformat function
    will attempt to call the format hook with the following
    arguments:

       format_hook(value, conversion, buffer)

    The 'value' argument is the value being formatted, which was
    retrieved from the arguments using the field name.

    The 'conversion' argument is the conversion spec part of the
    field, which will be either a string or unicode object, depending
    on the type of the original format string.

    The 'buffer' argument is a Python array object, either a byte
    array or unicode character array.  The buffer object will contain
    the partially constructed string; the format hook is free to
    modify the contents of this buffer if needed.

    For each field, the format hook may take one of two actions:

        1) Return False, indicating that the format hook will not
           process this field and the default formatting should be
           used.  This decision should be based on the type of the
           value object, and the contents of the conversion string.

        2) Append the formatted field to the buffer, and return True.
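
    The hook protocol described above can be sketched in pure Python.
    The 'cformat' below is a hypothetical stand-in, not the proposed
    implementation: it borrows string.Formatter from the standard
    library of released Python to split the format string (so literal
    braces are written by doubling rather than with a backslash), and
    it uses a list of string pieces in place of the array buffer.
    The names 'accounting_hook' and 'cell_format' are likewise
    invented for the accounting example.

```python
import string


def cformat(template, format_hook, args, kwargs):
    """Hypothetical stand-in for the 'cformat' described above."""
    parser = string.Formatter()
    buffer = []  # a list of pieces stands in for the draft's array buffer
    for literal, field_name, spec, _conversion in parser.parse(template):
        buffer.append(literal)
        if field_name is None:
            continue  # trailing literal text; no field follows it
        value, _ = parser.get_field(field_name, args, kwargs)
        # The hook returns True if it appended the field itself;
        # otherwise the default formatting is applied.
        if not format_hook(value, spec, buffer):
            buffer.append(format(value, spec))
    return "".join(buffer)


def accounting_hook(value, conversion, buffer):
    """Render negative numbers in parentheses; decline everything else."""
    if isinstance(value, (int, float)) and value < 0:
        buffer.append("(" + format(-value, conversion) + ")")
        return True
    return False


def cell_format(template, *args, **kwargs):
    """Application-level wrapper in the style suggested above."""
    return cformat(template, accounting_hook, args, kwargs)


print(cell_format("The total is: {0:.2f}", -12.5))
```

    Under these assumptions the hook handles only negative numbers
    and returns False for every other value, so all remaining fields
    receive the default formatting.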


Alternate Syntax

    Naturally, one of the most contentious issues is the syntax of the
    format strings, and in particular the markup conventions used to
    indicate fields.

    Rather than attempting to exhaustively list all of the various
    proposals, I will cover the ones that are most widely used
    already.

    - Shell variable syntax: $name and $(name) (or in some variants,
      ${name}).  This is probably the oldest convention out there, and
      is used by Perl and many others.  When used without the braces,
      the length of the variable is determined by lexically scanning
      until an invalid character is found.

      This scheme is generally used in cases where interpolation is
      implicit - that is, in environments where any string can contain
      interpolation variables, and no special substitution function
      need be invoked.  In such cases, it is important to prevent the
      interpolation behavior from occurring accidentally, so the '$'
      (which is otherwise a relatively uncommonly-used character) is
      used to signal when the behavior should occur.

      It is the author's opinion, however, that where formatting is
      explicitly invoked, less care needs to be taken to prevent
      accidental interpolation, so a lighter and less unwieldy syntax
      can be used.

    - Printf and its cousins ('%'), including variations that add a
      field index, so that fields can be interpolated out of order.

    - Other bracket-only variations.  Various MUDs (Multi-User
      Dungeons) such as MUSH have used brackets (e.g. [name]) to do
      string interpolation.  The Microsoft .Net libraries use braces
      ({}), and a syntax which is very similar to the one in this
      proposal, although the syntax for conversion specifiers is quite
      different. [4]

    - Backquoting.  This method has the benefit of minimal syntactical
      clutter; however, it lacks many of the benefits of a function
      call syntax (such as complex expression arguments, custom
      formatters, etc.).

    - Other variations include Ruby's #{}, PHP's {$name}, and so
      on.
      
    Some specific aspects of the syntax warrant additional comments:
    
    1) The use of the backslash character for escapes.  A few people
    suggested doubling the brace characters to indicate a literal
    brace rather than using backslash as an escape character.  This is
    also the convention used in the .Net libraries.  Here's how the
    previously-given example would look with this convention:
    
        "My name is {0} :-{{}}".format('Fred')
    
    One problem with this syntax is that it conflicts with the use of
    nested braces to allow parameterization of the conversion
    specifiers:
    
        "{0:{1}.{2}}".format(a, b, c)
        
    (There are alternative solutions, but they are too long to go
    into here.)
    
    2) The use of the colon character (':') as a separator for
    conversion specifiers.  This was chosen simply because that's
    what .Net uses.
    

Sample Implementation

    A rough prototype of the underlying 'cformat' function has been
    coded in Python; however, it needs much refinement before being
    submitted.
    

Backwards Compatibility

    Backwards compatibility can be maintained by leaving the existing
    mechanisms in place.  The new system does not collide with any of
    the method names of the existing string formatting techniques, so
    both systems can co-exist until it comes time to deprecate the
    older system.


References

    [1] Python Library Reference - String formatting operations
        http://docs.python.org/lib/typesseq-strings.html

    [2] Python Library Reference - Template strings
        http://docs.python.org/lib/node109.html

    [3] [Python-3000] String formating operations in python 3k
        http://mail.python.org/pipermail/python-3000/2006-April/000285.html

    [4] Composite Formatting - [.Net Framework Developer's Guide]
        http://msdn.microsoft.com/library/en-us/cpguide/html/cpconcompositeformatting.asp?frame=true


Copyright

    This document has been placed in the public domain.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:
-												added two PEPs by Talin: 3101, Advanced String Formatting; and 3102, Keyword-Only Arguments

											
										
										
											2006-04-26 16:33:25 -04:00
+								PEP: 3101
 								Title: Advanced String Formatting
 								Version: $Revision$
 								Last-Modified: $Date$
 								Author: Talin <talin at acm.org>
 								Status: Draft
 								Type: Standards
 								Content-Type: text/plain
 								Created: 16-Apr-2006
 								Python-Version: 3.0
-												Updated based on collected feedback.

											
										
										
											2006-05-06 21:49:43 -04:00
+								Post-History: 28-Apr-2006
-												added two PEPs by Talin: 3101, Advanced String Formatting; and 3102, Keyword-Only Arguments

											
										
										
											2006-04-26 16:33:25 -04:00
 								Abstract
 								    This PEP proposes a new system for built-in string formatting
 								    operations, intended as a replacement for the existing '%' string
 								    formatting operator.
 								Rationale
 								    Python currently provides two methods of string interpolation:
-												Removed references to previous 'fformat' proposal.
Added clarification about relationship with string.Template

											
										
										
											2006-04-27 12:53:54 -04:00
+								    - The '%' operator for strings. [1]
-												added two PEPs by Talin: 3101, Advanced String Formatting; and 3102, Keyword-Only Arguments

											
										
										
											2006-04-26 16:33:25 -04:00
-												Removed references to previous 'fformat' proposal.
Added clarification about relationship with string.Template

											
										
										
											2006-04-27 12:53:54 -04:00
+								    - The string.Template module. [2]
-												added two PEPs by Talin: 3101, Advanced String Formatting; and 3102, Keyword-Only Arguments

											
										
										
											2006-04-26 16:33:25 -04:00
 								    The scope of this PEP will be restricted to proposals for built-in
 								    string formatting operations (in other words, methods of the
-												Removed references to previous 'fformat' proposal.
Added clarification about relationship with string.Template

											
										
										
											2006-04-27 12:53:54 -04:00
+								    built-in string type).
-												added two PEPs by Talin: 3101, Advanced String Formatting; and 3102, Keyword-Only Arguments

											
										
										
											2006-04-26 16:33:25 -04:00
+								    The '%' operator is primarily limited by the fact that it is a
 								    binary operator, and therefore can take at most two arguments.
 								    One of those arguments is already dedicated to the format string,
 								    leaving all other variables to be squeezed into the remaining
 								    argument.  The current practice is to use either a dictionary or a
 								    tuple as the second argument, but as many people have commented
-												Removed references to previous 'fformat' proposal.
Added clarification about relationship with string.Template

											
										
										
											2006-04-27 12:53:54 -04:00
+								    [3], this lacks flexibility.  The "all or nothing" approach
-												added two PEPs by Talin: 3101, Advanced String Formatting; and 3102, Keyword-Only Arguments

											
										
										
											2006-04-26 16:33:25 -04:00
+								    (meaning that one must choose between only positional arguments,
 								    or only named arguments) is felt to be overly constraining.
-												Removed references to previous 'fformat' proposal.
Added clarification about relationship with string.Template

											
										
										
											2006-04-27 12:53:54 -04:00
+								    While there is some overlap between this proposal and
 								    string.Template, it is felt that each serves a distinct need,
 								    and that one does not obviate the other.  In any case,
 								    string.Template will not be discussed here.
-												added two PEPs by Talin: 3101, Advanced String Formatting; and 3102, Keyword-Only Arguments

											
										
										
											2006-04-26 16:33:25 -04:00
 								Specification
 								    The specification will consist of 4 parts:
-												Updated based on collected feedback.

											
										
										
											2006-05-06 21:49:43 -04:00
+								    - Specification of a new formatting method to be added to the
 								      built-in string class.
-												added two PEPs by Talin: 3101, Advanced String Formatting; and 3102, Keyword-Only Arguments

											
										
										
											2006-04-26 16:33:25 -04:00
 								    - Specification of a new syntax for format strings.
 								    - Specification of a new set of class methods to control the
 								      formatting and conversion of objects.
 								    - Specification of an API for user-defined formatting classes.
 								String Methods
-												Removed references to previous 'fformat' proposal.
Added clarification about relationship with string.Template

											
										
										
											2006-04-27 12:53:54 -04:00
+								    The build-in string class will gain a new method, 'format',
 								    which takes takes an arbitrary number of positional and keyword
 								    arguments:
-												added two PEPs by Talin: 3101, Advanced String Formatting; and 3102, Keyword-Only Arguments

											
										
										
											2006-04-26 16:33:25 -04:00
 								        "The story of {0}, {1}, and {c}".format(a, b, c=d)
 								    Within a format string, each positional argument is identified
 								    with a number, starting from zero, so in the above example, 'a' is
 								    argument 0 and 'b' is argument 1.  Each keyword argument is
 								    identified by its keyword name, so in the above example, 'c' is
 								    used to refer to the third argument.
 								    The result of the format call is an object of the same type
 								    (string or unicode) as the format string.
 								Format Strings
 								    Brace characters ('curly braces') are used to indicate a
 								    replacement field within the string:
 								        "My name is {0}".format('Fred')
 								    The result of this is the string:
 								        "My name is Fred"
 								    Braces can be escaped using a backslash:
 								        "My name is {0} :-\{\}".format('Fred')
 								    Which would produce:
 								        "My name is Fred :-{}"
 								    The element within the braces is called a 'field'.  Fields consist
-												Updated based on collected feedback.

											
										
										
											2006-05-06 21:49:43 -04:00
+								    of a 'field name', which can either be simple or compound, and an
 								    optional 'conversion specifier'.
-												added two PEPs by Talin: 3101, Advanced String Formatting; and 3102, Keyword-Only Arguments

											
										
										
											2006-04-26 16:33:25 -04:00
-												Updated based on collected feedback.

											
										
										
											2006-05-06 21:49:43 -04:00
+								    Simple field names are either names or numbers. If numbers, they
 								    must be valid base-10 integers; if names, they must be valid
 								    Python identifiers.  A number is used to identify a positional
 								    argument, while a name is used to identify a keyword argument.
-												added two PEPs by Talin: 3101, Advanced String Formatting; and 3102, Keyword-Only Arguments

											
										
										
											2006-04-26 16:33:25 -04:00
 								    Compound names are a sequence of simple names seperated by
 								    periods:
 								        "My name is {0.name} :-\{\}".format(dict(name='Fred'))
 								    Compound names can be used to access specific dictionary entries,
 								    array elements, or object attributes.  In the above example, the
 								    '{0.name}' field refers to the dictionary entry 'name' within
 								    positional argument 0.
 								    Each field can also specify an optional set of 'conversion
-												Updated based on collected feedback.

											
										
										
											2006-05-06 21:49:43 -04:00
+								    specifiers' which can be used to adjust the format of that field.
 								    Conversion specifiers follow the field name, with a colon (':')
 								    character separating the two:
-												added two PEPs by Talin: 3101, Advanced String Formatting; and 3102, Keyword-Only Arguments

											
										
										
											2006-04-26 16:33:25 -04:00
 								        "My name is {0:8}".format('Fred')
 								    The meaning and syntax of the conversion specifiers depends on the
 								    type of object that is being formatted, however many of the
 								    built-in types will recognize a standard set of conversion
 								    specifiers.
 								    The conversion specifier consists of a sequence of zero or more
 								    characters, each of which can consist of any printable character
-												Updated based on collected feedback.

											
										
										
											2006-05-06 21:49:43 -04:00
+								    except for a non-escaped '}'.
 								    Conversion specifiers can themselves contain replacement fields;
 								    this will be described in a later section.  Except for this
 								    replacement, the format() method does not attempt to intepret the
 								    conversion specifiers in any way; it merely passes all of the
 								    characters between the first colon ':' and the matching right
 								    brace ('}') to the various underlying formatters (described
 								    later.)
-												added two PEPs by Talin: 3101, Advanced String Formatting; and 3102, Keyword-Only Arguments

											
										
										
											2006-04-26 16:33:25 -04:00
 								Standard Conversion Specifiers
 								    For most built-in types, the conversion specifiers will be the
 								    same or similar to the existing conversion specifiers used with
 								    the '%' operator.  Thus, instead of '%02.2x", you will say
-												Updated based on collected feedback.

											
										
										
											2006-05-06 21:49:43 -04:00
+								    '{0:02.2x}'.
-												added two PEPs by Talin: 3101, Advanced String Formatting; and 3102, Keyword-Only Arguments

											
										
										
											2006-04-26 16:33:25 -04:00
 								    There are a few differences however:
 								    - The trailing letter is optional - you don't need to say '2.2d',
-												Updated based on collected feedback.

											
										
										
											2006-05-06 21:49:43 -04:00
+								      you can instead just say '2.2'.  If the letter is omitted, a
 								      default will be assumed based on the type of the argument.
 								      The defaults will be as follows:
 								        string or unicode object: 's'
 								        integer: 'd'
 								        floating-point number: 'f'
 								        all other types: 's'
-												added two PEPs by Talin: 3101, Advanced String Formatting; and 3102, Keyword-Only Arguments

											
										
										
											2006-04-26 16:33:25 -04:00
 								    - Variable field width specifiers use a nested version of the {}
 								      syntax, allowing the width specifier to be either a positional
 								      or keyword argument:
 								        "{0:{1}.{2}d}".format(a, b, c)
 								    - The support for length modifiers (which are ignored by Python
 								      anyway) is dropped.
 								    For non-built-in types, the conversion specifiers will be specific
 								    to that type.  An example is the 'datetime' class, whose
 								    conversion specifiers are identical to the arguments to the
 								    strftime() function:
-												Updated based on collected feedback.

											
										
										
											2006-05-06 21:49:43 -04:00
+								        "Today is: {0:%a %b %d %H:%M:%S %Y}".format(datetime.now())
-												added two PEPs by Talin: 3101, Advanced String Formatting; and 3102, Keyword-Only Arguments

											
										
										
											2006-04-26 16:33:25 -04:00
 								Controlling Formatting
 								    A class that wishes to implement a custom interpretation of its
 								    conversion specifiers can implement a __format__ method:
 								    class AST:
 								        def __format__(self, specifiers):
 								            ...
 								    The 'specifiers' argument will be either a string object or a
 								    unicode object, depending on the type of the original format
 								    string.  The __format__ method should test the type of the
 								    specifiers parameter to determine whether to return a string or
 								    unicode object.  It is the responsibility of the __format__ method
 								    to return an object of the proper type.
 								    string.format() will format each field using the following steps:
 ) See if the value to be formatted has a __format__ method.  If
 								        it does, then call it.
 ) Otherwise, check the internal formatter within string.format
 								        that contains knowledge of certain builtin types.
 ) Otherwise, call str() or unicode() as appropriate.
 								User-Defined Formatting Classes
-												Updated based on collected feedback.

											
										
										
											2006-05-06 21:49:43 -04:00
+								    There will be times when customizing the formatting of fields
 								    on a per-type basis is not enough.  An example might be an
 								    accounting application, which displays negative numbers in
 								    parentheses rather than using a negative sign.
 								    The string formatting system facilitates this kind of application-
 								    specific formatting by allowing user code to directly invoke
 								    the code that interprets format strings and fields.  User-written
 								    code can intercept the normal formatting operations on a per-field
 								    basis, substituting their own formatting methods.
 								    For example, in the aforementioned accounting application, there
 								    could be an application-specific number formatter, which reuses
 								    the string.format templating code to do most of the work. The
 								    API for such an application-specific formatter is up to the
 								    application; here are several possible examples:
 								        cell_format( "The total is: {0}", total )
 								        TemplateString( "The total is: {0}" ).format( total )
 								    Creating an application-specific formatter is relatively straight-
 								    forward.  The string and unicode classes will have a class method
 								    called 'cformat' that does all the actual work of formatting; The
 								    built-in format() method is just a wrapper that calls cformat.
-												added two PEPs by Talin: 3101, Advanced String Formatting; and 3102, Keyword-Only Arguments

											
										
										
											2006-04-26 16:33:25 -04:00
 								    The parameters to the cformat function are:
 								        -- The format string (or unicode; the same function handles
 								           both.)
-												Updated based on collected feedback.

											
										
										
											2006-05-06 21:49:43 -04:00
+								        -- A callable 'format hook', which is called once per field
-												added two PEPs by Talin: 3101, Advanced String Formatting; and 3102, Keyword-Only Arguments

											
										
										
											2006-04-26 16:33:25 -04:00
+								        -- A tuple containing the positional arguments
 								        -- A dict containing the keyword arguments
 								    The cformat function will parse all of the fields in the format
 								    string, and return a new string (or unicode) with all of the
 								    fields replaced with their formatted values.
-												Updated based on collected feedback.

											
										
										
											2006-05-06 21:49:43 -04:00
+								    The format hook is a callable object supplied by the user, which
 								    is invoked once per field, and which can override the normal
 								    formatting for that field.  For each field, the cformat function
 								    will attempt to call the field format hook with the following
 								    arguments:
-												added two PEPs by Talin: 3101, Advanced String Formatting; and 3102, Keyword-Only Arguments

											
										
										
											2006-04-26 16:33:25 -04:00
-												Updated based on collected feedback.

											
										
										
											2006-05-06 21:49:43 -04:00
+								       format_hook(value, conversion, buffer)
-												added two PEPs by Talin: 3101, Advanced String Formatting; and 3102, Keyword-Only Arguments

											
										
										

    The 'value' argument corresponds to the value being formatted,
    which was retrieved from the arguments using the field name.

    The 'conversion' argument is the conversion spec part of the
    field, which will be either a string or unicode object, depending
    on the type of the original format string.

    The 'buffer' argument is a Python array object, either a byte
    array or unicode character array.  The buffer object will contain
    the partially constructed string; the format hook is free to
    modify the contents of this buffer if needed.

    The format hook will be called once per field.  It may take one
    of two actions:

        1) Return False, indicating that the format hook will not
           process this field and the default formatting should be
           used.  This decision should be based on the type of the
           value object, and the contents of the conversion string.

        2) Append the formatted field to the buffer, and return True.
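
    The hook protocol described above can be sketched in a few lines
    of Python.  This is only an illustration under stated assumptions,
    not the prototype mentioned elsewhere in this PEP: the 'cformat'
    function below is a hypothetical stand-in that handles only simple
    '{name:conversion}' fields, models the buffer as a plain list of
    strings rather than an array object, and falls back to the built-in
    format() function for default formatting.  The 'U' conversion and
    the 'upper_hook' name are made up for the example.

```python
import re

# Hypothetical stand-in for the PEP's 'cformat' function.  Escapes,
# nested fields and attribute lookup are deliberately left out.
_FIELD = re.compile(r"\{([^{}:]+)(?::([^{}]*))?\}")


def cformat(template, format_hook, args, kwargs):
    buffer = []  # assumption: a list of str models the character array
    pos = 0
    for match in _FIELD.finditer(template):
        # Copy the literal text that precedes this field.
        buffer.append(template[pos:match.start()])
        name, conversion = match.group(1), match.group(2) or ""
        # Numeric field names index args; other names look up kwargs.
        value = args[int(name)] if name.isdigit() else kwargs[name]
        # Offer the field to the hook first.  A False result means
        # "not handled", so default formatting is applied instead.
        handled = False
        if format_hook is not None:
            handled = format_hook(value, conversion, buffer)
        if not handled:
            buffer.append(format(value, conversion))
        pos = match.end()
    buffer.append(template[pos:])
    return "".join(buffer)


def upper_hook(value, conversion, buffer):
    """Handle the made-up 'U' conversion for strings; decline the rest."""
    if isinstance(value, str) and conversion == "U":
        buffer.append(value.upper())
        return True
    return False


result = cformat("{0:U} is {age:3d} years old", upper_hook,
                 ("fred",), {"age": 42})
print(result)
```

    Here the hook handles the first field itself and declines the
    second, which is then formatted by the default path.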


Alternate Syntax

    Naturally, one of the most contentious issues is the syntax of the
    format strings, and in particular the markup conventions used to
    indicate fields.

    Rather than attempting to exhaustively list all of the various
    proposals, I will cover the ones that are most widely used
    already.

    - Shell variable syntax: $name and $(name) (or in some variants,
      ${name}).  This is probably the oldest convention out there, and
      is used by Perl and many others.  When used without the braces,
      the length of the variable is determined by lexically scanning
      until an invalid character is found.

      This scheme is generally used in cases where interpolation is
      implicit - that is, in environments where any string can contain
      interpolation variables, and no special substitution function
      need be invoked.  In such cases, it is important to prevent the
      interpolation behavior from occurring accidentally, so the '$'
      (which is otherwise a relatively uncommonly-used character) is
      used to signal when the behavior should occur.

      It is the author's opinion, however, that in cases where the
      formatting is explicitly invoked, less care needs to be taken
      to prevent accidental interpolation, in which case a lighter
      and less unwieldy syntax can be used.

    - Printf and its cousins ('%'), including variations that add a
      field index, so that fields can be interpolated out of order.

    - Other bracket-only variations.  Various MUDs (Multi-User
      Dungeons) such as MUSH have used brackets (e.g. [name]) to do
      string interpolation.  The Microsoft .Net libraries use braces
      ({}), and a syntax which is very similar to the one in this
      proposal, although the syntax for conversion specifiers is quite
      different. [4]

    - Backquoting.  This method has the benefit of minimal syntactical
      clutter; however, it lacks many of the benefits of a function
      call syntax (such as complex expression arguments, custom
      formatters, etc.).

    - Other variations include Ruby's #{}, PHP's {$name}, and so
      on.
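
    Two of the conventions listed above are already available in
    Python, which makes them easy to compare side by side.  The
    following is a minimal sketch using string.Template for the
    shell-style '$' markup and the '%' operator for the printf-style
    markup; the sample values are made up for the illustration.

```python
from string import Template

values = {"name": "Fred", "count": 3}

# Shell-style '$' markup, as implemented by string.Template.  The
# braces in ${count} delimit the name explicitly; $name is delimited
# by scanning up to the first character that cannot be in a name.
shell_style = Template("$name has ${count} messages").substitute(values)

# printf-style '%' markup with named fields taken from a dictionary.
printf_style = "%(name)s has %(count)d messages" % values

print(shell_style)
print(printf_style)
```

    Both calls produce the same text, so the difference between the
    conventions is purely one of markup.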

    Some specific aspects of the syntax warrant additional comments:

    1) The use of the backslash character for escapes.  A few people
    suggested doubling the brace characters to indicate a literal
    brace rather than using backslash as an escape character.  This is
    also the convention used in the .Net libraries.  Here's how the
    previously-given example would look with this convention:

        "My name is {0} :-{{}}".format('Fred')

    One problem with this syntax is that it conflicts with the use of
    nested braces to allow parameterization of the conversion
    specifiers:

        "{0:{1}.{2}}".format(a, b, c)

    (There are alternative solutions, but they are too long to go
    into here.)

    2) The use of the colon character (':') as a separator for
    conversion specifiers.  This was chosen simply because that's
    what .Net uses.
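
    For readers who want to experiment with the two forms shown above,
    the str.format method found in current Python releases accepts
    both the doubled-brace form and a nested width and precision.
    This sketch reflects the behavior of that released method, not
    the backslash-escape design discussed in this draft, and the
    sample values and the 'f' type code are chosen only for the
    illustration.

```python
# Doubled braces produce literal braces in the output.
literal = "My name is {0} :-{{}}".format("Fred")

# Nested fields supply the width (10) and precision (3) of field 0.
nested = "{0:{1}.{2}f}".format(3.14159, 10, 3)

print(literal)
print(repr(nested))
```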


Sample Implementation

    A rough prototype of the underlying 'cformat' function has been
    coded in Python; however, it needs much refinement before being
    submitted.


Backwards Compatibility

    Backwards compatibility can be maintained by leaving the existing
    mechanisms in place.  The new system does not collide with any of
    the method names of the existing string formatting techniques, so
    both systems can co-exist until it comes time to deprecate the
    older system.


References

    [1] Python Library Reference - String formatting operations
        http://docs.python.org/lib/typesseq-strings.html

    [2] Python Library Reference - Template strings
        http://docs.python.org/lib/node109.html

    [3] [Python-3000] String formating operations in python 3k
        http://mail.python.org/pipermail/python-3000/2006-April/000285.html

    [4] Composite Formatting - [.Net Framework Developer's Guide]
        http://msdn.microsoft.com/library/en-us/cpguide/html/cpconcompositeformatting.asp?frame=true


Copyright

    This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: