Updated based on collected feedback.

This commit is contained in:
Talin 2006-05-07 01:49:43 +00:00
parent 36abb0678f
commit 9eac7defd7
1 changed files with 89 additions and 43 deletions

View File

@ -8,7 +8,7 @@ Type: Standards
Content-Type: text/plain
Created: 16-Apr-2006
Python-Version: 3.0
Post-History:
Post-History: 28-Apr-2006
Abstract
@ -50,8 +50,8 @@ Specification
The specification will consist of 4 parts:
- Specification of a set of methods to be added to the built-in
string class.
- Specification of a new formatting method to be added to the
built-in string class.
- Specification of a new syntax for format strings.
@ -99,13 +99,13 @@ Format Strings
"My name is Fred :-{}"
The element within the braces is called a 'field'. Fields consist
of a name, which can either be simple or compound, and an optional
'conversion specifier'.
of a 'field name', which can either be simple or compound, and an
optional 'conversion specifier'.
Simple names are either names or numbers. If numbers, they must
be valid decimal numbers; if names, they must be valid Python
identifiers. A number is used to identify a positional argument,
while a name is used to identify a keyword argument.
Simple field names are either names or numbers. If numbers, they
must be valid base-10 integers; if names, they must be valid
Python identifiers. A number is used to identify a positional
argument, while a name is used to identify a keyword argument.
Compound names are a sequence of simple names seperated by
periods:
@ -118,8 +118,9 @@ Format Strings
positional argument 0.
Each field can also specify an optional set of 'conversion
specifiers'. Conversion specifiers follow the field name, with a
colon (':') character separating the two:
specifiers' which can be used to adjust the format of that field.
Conversion specifiers follow the field name, with a colon (':')
character separating the two:
"My name is {0:8}".format('Fred')
@ -130,11 +131,15 @@ Format Strings
The conversion specifier consists of a sequence of zero or more
characters, each of which can consist of any printable character
except for a non-escaped '}'. The format() method does not
attempt to intepret the conversion specifiers in any way; it
merely passes all of the characters between the first colon ':'
and the matching right brace ('}') to the various underlying
formatters (described later.)
except for a non-escaped '}'.
Conversion specifiers can themselves contain replacement fields;
this will be described in a later section. Except for this
replacement, the format() method does not attempt to intepret the
conversion specifiers in any way; it merely passes all of the
characters between the first colon ':' and the matching right
brace ('}') to the various underlying formatters (described
later.)
Standard Conversion Specifiers
@ -142,16 +147,19 @@ Standard Conversion Specifiers
For most built-in types, the conversion specifiers will be the
same or similar to the existing conversion specifiers used with
the '%' operator. Thus, instead of '%02.2x", you will say
'{0:2.2x}'.
'{0:02.2x}'.
There are a few differences however:
- The trailing letter is optional - you don't need to say '2.2d',
you can instead just say '2.2'. If the letter is omitted, the
value will be converted into its 'natural' form (that is, the
form that it take if str() or unicode() were called on it)
subject to the field length and precision specifiers (if
supplied).
you can instead just say '2.2'. If the letter is omitted, a
default will be assumed based on the type of the argument.
The defaults will be as follows:
string or unicode object: 's'
integer: 'd'
floating-point number: 'f'
all other types: 's'
- Variable field width specifiers use a nested version of the {}
syntax, allowing the width specifier to be either a positional
@ -159,10 +167,6 @@ Standard Conversion Specifiers
"{0:{1}.{2}d}".format(a, b, c)
(Note: It might be easier to parse if these used a different
type of delimiter, such as parens - avoiding the need to create
a regex that handles the recursive case.)
- The support for length modifiers (which are ignored by Python
anyway) is dropped.
@ -171,7 +175,7 @@ Standard Conversion Specifiers
conversion specifiers are identical to the arguments to the
strftime() function:
"Today is: {0:%x}".format(datetime.now())
"Today is: {0:%a %b %d %H:%M:%S %Y}".format(datetime.now())
Controlling Formatting
@ -203,19 +207,37 @@ Controlling Formatting
User-Defined Formatting Classes
The code that interprets format strings can be called explicitly
from user code. This allows the creation of custom formatter
classes that can override the normal formatting rules.
The string and unicode classes will have a class method called
'cformat' that does all the actual work of formatting; The
format() method is just a wrapper that calls cformat.
There will be times when customizing the formatting of fields
on a per-type basis is not enough. An example might be an
accounting application, which displays negative numbers in
parentheses rather than using a negative sign.
The string formatting system facilitates this kind of application-
specific formatting by allowing user code to directly invoke
the code that interprets format strings and fields. User-written
code can intercept the normal formatting operations on a per-field
basis, substituting their own formatting methods.
For example, in the aforementioned accounting application, there
could be an application-specific number formatter, which reuses
the string.format templating code to do most of the work. The
API for such an application-specific formatter is up to the
application; here are several possible examples:
cell_format( "The total is: {0}", total )
TemplateString( "The total is: {0}" ).format( total )
Creating an application-specific formatter is relatively straight-
forward. The string and unicode classes will have a class method
called 'cformat' that does all the actual work of formatting; The
built-in format() method is just a wrapper that calls cformat.
The parameters to the cformat function are:
-- The format string (or unicode; the same function handles
both.)
-- A field format hook (see below)
-- A callable 'format hook', which is called once per field
-- A tuple containing the positional arguments
-- A dict containing the keyword arguments
@ -223,15 +245,16 @@ User-Defined Formatting Classes
string, and return a new string (or unicode) with all of the
fields replaced with their formatted values.
For each field, the cformat function will attempt to call the
field format hook with the following arguments:
The format hook is a callable object supplied by the user, which
is invoked once per field, and which can override the normal
formatting for that field. For each field, the cformat function
will attempt to call the field format hook with the following
arguments:
field_hook(value, conversion, buffer)
format_hook(value, conversion, buffer)
The 'value' field corresponds to the value being formatted, which
was retrieved from the arguments using the field name. (The
field_hook has no control over the selection of values, only
how they are formatted.)
was retrieved from the arguments using the field name.
The 'conversion' argument is the conversion spec part of the
field, which will be either a string or unicode object, depending
@ -299,11 +322,34 @@ Alternate Syntax
- Other variations include Ruby's #{}, PHP's {$name}, and so
on.
Some specific aspects of the syntax warrant additional comments:
1) The use of the backslash character for escapes. A few people
suggested doubling the brace characters to indicate a literal
brace rather than using backslash as an escape character. This is
also the convention used in the .Net libraries. Here's how the
previously-given example would look with this convention:
"My name is {0} :-{{}}".format('Fred')
One problem with this syntax is that it conflicts with the use of
nested braces to allow parameterization of the conversion
specifiers:
"{0:{1}.{2}}".format(a, b, c)
(There are alternative solutions, but they are too long to go
into here.)
2) The use of the colon character (':') as a separator for
conversion specifiers. This was chosen simply because that's
what .Net uses.
Sample Implementation
A rought prototype of the underlying 'cformat' function has been
A rough prototype of the underlying 'cformat' function has been
coded in Python, however it needs much refinement before being
submitted.