Updated based on collected feedback.

Talin 2006-05-07 01:49:43 +00:00
parent 36abb0678f
commit 9eac7defd7
1 changed files with 89 additions and 43 deletions

@@ -8,7 +8,7 @@ Type: Standards
Content-Type: text/plain Content-Type: text/plain
Created: 16-Apr-2006 Created: 16-Apr-2006
Python-Version: 3.0 Python-Version: 3.0
Post-History: Post-History: 28-Apr-2006
Abstract Abstract
@@ -50,8 +50,8 @@ Specification
The specification will consist of 4 parts: The specification will consist of 4 parts:
- Specification of a set of methods to be added to the built-in - Specification of a new formatting method to be added to the
string class. built-in string class.
- Specification of a new syntax for format strings. - Specification of a new syntax for format strings.
@@ -99,13 +99,13 @@ Format Strings
"My name is Fred :-{}" "My name is Fred :-{}"
The element within the braces is called a 'field'. Fields consist The element within the braces is called a 'field'. Fields consist
of a name, which can either be simple or compound, and an optional of a 'field name', which can either be simple or compound, and an
'conversion specifier'. optional 'conversion specifier'.
Simple names are either names or numbers. If numbers, they must Simple field names are either names or numbers. If numbers, they
be valid decimal numbers; if names, they must be valid Python must be valid base-10 integers; if names, they must be valid
identifiers. A number is used to identify a positional argument, Python identifiers. A number is used to identify a positional
while a name is used to identify a keyword argument. argument, while a name is used to identify a keyword argument.
Compound names are a sequence of simple names seperated by Compound names are a sequence of simple names seperated by
periods: periods:
@@ -118,8 +118,9 @@ Format Strings
positional argument 0. positional argument 0.
Each field can also specify an optional set of 'conversion Each field can also specify an optional set of 'conversion
specifiers'. Conversion specifiers follow the field name, with a specifiers' which can be used to adjust the format of that field.
colon (':') character separating the two: Conversion specifiers follow the field name, with a colon (':')
character separating the two:
"My name is {0:8}".format('Fred') "My name is {0:8}".format('Fred')
@@ -130,11 +131,15 @@ Format Strings
The conversion specifier consists of a sequence of zero or more The conversion specifier consists of a sequence of zero or more
characters, each of which can consist of any printable character characters, each of which can consist of any printable character
except for a non-escaped '}'. The format() method does not except for a non-escaped '}'.
attempt to intepret the conversion specifiers in any way; it
merely passes all of the characters between the first colon ':' Conversion specifiers can themselves contain replacement fields;
and the matching right brace ('}') to the various underlying this will be described in a later section. Except for this
formatters (described later.) replacement, the format() method does not attempt to intepret the
conversion specifiers in any way; it merely passes all of the
characters between the first colon ':' and the matching right
brace ('}') to the various underlying formatters (described
later.)
Standard Conversion Specifiers Standard Conversion Specifiers
@@ -142,16 +147,19 @@ Standard Conversion Specifiers
For most built-in types, the conversion specifiers will be the For most built-in types, the conversion specifiers will be the
same or similar to the existing conversion specifiers used with same or similar to the existing conversion specifiers used with
the '%' operator. Thus, instead of '%02.2x", you will say the '%' operator. Thus, instead of '%02.2x", you will say
'{0:2.2x}'. '{0:02.2x}'.
There are a few differences however: There are a few differences however:
- The trailing letter is optional - you don't need to say '2.2d', - The trailing letter is optional - you don't need to say '2.2d',
you can instead just say '2.2'. If the letter is omitted, the you can instead just say '2.2'. If the letter is omitted, a
value will be converted into its 'natural' form (that is, the default will be assumed based on the type of the argument.
form that it take if str() or unicode() were called on it) The defaults will be as follows:
subject to the field length and precision specifiers (if
supplied). string or unicode object: 's'
integer: 'd'
floating-point number: 'f'
all other types: 's'
- Variable field width specifiers use a nested version of the {} - Variable field width specifiers use a nested version of the {}
syntax, allowing the width specifier to be either a positional syntax, allowing the width specifier to be either a positional
@@ -159,10 +167,6 @@ Standard Conversion Specifiers
"{0:{1}.{2}d}".format(a, b, c) "{0:{1}.{2}d}".format(a, b, c)
(Note: It might be easier to parse if these used a different
type of delimiter, such as parens - avoiding the need to create
a regex that handles the recursive case.)
- The support for length modifiers (which are ignored by Python - The support for length modifiers (which are ignored by Python
anyway) is dropped. anyway) is dropped.
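
Reviewer note: the nested-field syntax shown in this hunk can be tried in released Python 3, whose str.format may differ in detail from this draft. The sketch below is an illustration only. It uses the 'f' type code instead of the draft's 'd', on the assumption that current Python rejects a precision for integer formatting; the variable names are made up for the example.

```python
# Nested replacement fields inside the conversion specifier:
# arguments 1 and 2 supply the width and precision used for argument 0.
value, width, precision = 3.14159, 10, 3

nested = "{0:{1}.{2}f}".format(value, width, precision)

# The same result with the width and precision written out literally.
fixed = "{0:10.3f}".format(value)

print(repr(nested))
print(nested == fixed)
```

The inner fields are substituted first, so the outer field is formatted as if its specifier had been written as '10.3f'.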
@@ -171,7 +175,7 @@ Standard Conversion Specifiers
conversion specifiers are identical to the arguments to the conversion specifiers are identical to the arguments to the
strftime() function: strftime() function:
"Today is: {0:%x}".format(datetime.now()) "Today is: {0:%a %b %d %H:%M:%S %Y}".format(datetime.now())
Controlling Formatting Controlling Formatting
@@ -203,19 +207,37 @@ Controlling Formatting
User-Defined Formatting Classes User-Defined Formatting Classes
The code that interprets format strings can be called explicitly There will be times when customizing the formatting of fields
from user code. This allows the creation of custom formatter on a per-type basis is not enough. An example might be an
classes that can override the normal formatting rules. accounting application, which displays negative numbers in
parentheses rather than using a negative sign.
The string and unicode classes will have a class method called The string formatting system facilitates this kind of application-
'cformat' that does all the actual work of formatting; The specific formatting by allowing user code to directly invoke
format() method is just a wrapper that calls cformat. the code that interprets format strings and fields. User-written
code can intercept the normal formatting operations on a per-field
basis, substituting their own formatting methods.
For example, in the aforementioned accounting application, there
could be an application-specific number formatter, which reuses
the string.format templating code to do most of the work. The
API for such an application-specific formatter is up to the
application; here are several possible examples:
cell_format( "The total is: {0}", total )
TemplateString( "The total is: {0}" ).format( total )
Creating an application-specific formatter is relatively straight-
forward. The string and unicode classes will have a class method
called 'cformat' that does all the actual work of formatting; The
built-in format() method is just a wrapper that calls cformat.
The parameters to the cformat function are: The parameters to the cformat function are:
-- The format string (or unicode; the same function handles -- The format string (or unicode; the same function handles
both.) both.)
-- A field format hook (see below) -- A callable 'format hook', which is called once per field
-- A tuple containing the positional arguments -- A tuple containing the positional arguments
-- A dict containing the keyword arguments -- A dict containing the keyword arguments
@@ -223,15 +245,16 @@ User-Defined Formatting Classes
string, and return a new string (or unicode) with all of the string, and return a new string (or unicode) with all of the
fields replaced with their formatted values. fields replaced with their formatted values.
For each field, the cformat function will attempt to call the The format hook is a callable object supplied by the user, which
field format hook with the following arguments: is invoked once per field, and which can override the normal
formatting for that field. For each field, the cformat function
will attempt to call the field format hook with the following
arguments:
field_hook(value, conversion, buffer) format_hook(value, conversion, buffer)
The 'value' field corresponds to the value being formatted, which The 'value' field corresponds to the value being formatted, which
was retrieved from the arguments using the field name. (The was retrieved from the arguments using the field name.
field_hook has no control over the selection of values, only
how they are formatted.)
The 'conversion' argument is the conversion spec part of the The 'conversion' argument is the conversion spec part of the
field, which will be either a string or unicode object, depending field, which will be either a string or unicode object, depending
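
Reviewer note: the 'cformat' class method and the format_hook(value, conversion, buffer) signature described in this hunk are draft proposals and are not assumed to exist in any released Python. As a rough stand-in, the accounting example can be sketched with string.Formatter from the standard library, whose format_field() method is likewise called once per field. The class name AccountingFormatter is hypothetical.

```python
import string


class AccountingFormatter(string.Formatter):
    """Hypothetical formatter that shows negative numbers in parentheses."""

    def format_field(self, value, format_spec):
        # Invoked once per replacement field, similar in spirit to the
        # per-field format hook described in the draft.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if is_number and value < 0:
            return "(" + format(-value, format_spec) + ")"
        # Everything else falls back to the normal formatting rules.
        return super().format_field(value, format_spec)


fmt = AccountingFormatter()
loss = fmt.format("The total is: {0:.2f}", -1234.5)
gain = fmt.format("The total is: {0:.2f}", 1234.5)
print(loss)
print(gain)
```

Unlike the draft's hook, this override returns the formatted text rather than writing into a buffer, and it declines a field by delegating to the base class.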
@@ -300,10 +323,33 @@ Alternate Syntax
- Other variations include Ruby's #{}, PHP's {$name}, and so - Other variations include Ruby's #{}, PHP's {$name}, and so
on. on.
Some specific aspects of the syntax warrant additional comments:
1) The use of the backslash character for escapes. A few people
suggested doubling the brace characters to indicate a literal
brace rather than using backslash as an escape character. This is
also the convention used in the .Net libraries. Here's how the
previously-given example would look with this convention:
"My name is {0} :-{{}}".format('Fred')
One problem with this syntax is that it conflicts with the use of
nested braces to allow parameterization of the conversion
specifiers:
"{0:{1}.{2}}".format(a, b, c)
(There are alternative solutions, but they are too long to go
into here.)
2) The use of the colon character (':') as a separator for
conversion specifiers. This was chosen simply because that's
what .Net uses.
Sample Implementation Sample Implementation
A rought prototype of the underlying 'cformat' function has been A rough prototype of the underlying 'cformat' function has been
coded in Python, however it needs much refinement before being coded in Python, however it needs much refinement before being
submitted. submitted.
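
Reviewer note on the Alternate Syntax hunk: for comparison, released Python 3 uses the brace-doubling convention for literal braces, alongside nested braces in the conversion specifier. The sketch below shows that later behaviour only; it does not illustrate the backslash escapes proposed in this draft.

```python
# Doubled braces produce literal braces in the output.
escaped = "My name is {0} :-{{}}".format("Fred")

# Nested braces parameterize the specifier: width 8, precision 3.
# For a string, the precision truncates and the width pads on the right.
nested = "{0:{1}.{2}}".format("abcdef", 8, 3)

print(escaped)
print(repr(nested))
```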