PEP: 3101
Title: Advanced String Formatting
Version: $Revision$
Last-Modified: $Date$
Author: Talin <talin at acm.org>
Status: Draft
Type: Standards
Content-Type: text/plain
Created: 16-Apr-2006
Python-Version: 3.0
Post-History: 28-Apr-2006, 6-May-2006, 10-Jun-2006

Abstract

This PEP proposes a new system for built-in string formatting
operations, intended as a replacement for the existing '%' string
formatting operator.

Rationale

Python currently provides two methods of string interpolation:

- The '%' operator for strings. [1]

- The string.Template class. [2]

The scope of this PEP will be restricted to proposals for built-in
string formatting operations (in other words, methods of the
built-in string type).

The '%' operator is primarily limited by the fact that it is a
binary operator, and therefore can take at most two arguments.
One of those arguments is already dedicated to the format string,
leaving all other variables to be squeezed into the remaining
argument. The current practice is to use either a dictionary or a
tuple as the second argument, but as many people have commented
[3], this lacks flexibility. The "all or nothing" approach
(meaning that one must choose between only positional arguments,
or only named arguments) is felt to be overly constraining.

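
To make the limitation concrete, here is a small sketch of the two
forms the existing '%' operator accepts. The variable names and
values are made up for illustration.

```python
name, count = "Fred", 3

# Positional form: every value is packed into a single tuple.
positional = "%s has %d items" % (name, count)

# Named form: every value is packed into a single dict.
named = "%(name)s has %(count)d items" % {"name": name, "count": count}

print(positional)  # Fred has 3 items
print(named)       # Fred has 3 items
```

Because there is only one right-hand operand, a format string cannot
take some values by position from a tuple and others by name from a
dict in the same operation.
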

While there is some overlap between this proposal and
string.Template, it is felt that each serves a distinct need,
and that one does not obviate the other. In any case,
string.Template will not be discussed here.

Specification

The specification will consist of the following parts:

- Specification of a new formatting method to be added to the
  built-in string class.

- Specification of a new syntax for format strings.

- Specification of a new set of class methods to control the
  formatting and conversion of objects.

- Specification of an API for user-defined formatting classes.

- Specification of how formatting errors are handled.

Note on string encodings: Since this PEP is being targeted
at Python 3.0, it is assumed that all strings are unicode strings,
and that the use of the word 'string' in the context of this
document will generally refer to a Python 3.0 string, which is
the same as a Python 2.x unicode object.

If it should happen that this functionality is backported to
the 2.x series, then it will be necessary to handle both regular
string objects and unicode objects. All of the function call
interfaces described in this PEP can be used for both strings
and unicode objects, and in all cases there is sufficient
information to be able to properly deduce the output string
type (in other words, there is no need for two separate APIs).
In all cases, the type of the template string dominates - that
is, the conversion will always produce an object that contains
the same representation of characters as the input template
string.

String Methods

The built-in string class will gain a new method, 'format',
which takes an arbitrary number of positional and keyword
arguments:

    "The story of {0}, {1}, and {c}".format(a, b, c=d)

Within a format string, each positional argument is identified
with a number, starting from zero, so in the above example, 'a' is
argument 0 and 'b' is argument 1. Each keyword argument is
identified by its keyword name, so in the above example, 'c' is
used to refer to the third argument.

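
As a runnable illustration of the call above, the following sketch
binds the names 'a', 'b' and 'd' to made-up values. It relies on the
str.format method as it exists in released versions of Python, which
accepts this mix of positional and keyword arguments.

```python
# Illustrative values; the names a, b and d mirror the example above.
a, b, d = "Arthur", "Bedevere", "Galahad"

# {0} and {1} pick positional arguments, {c} picks the keyword argument.
story = "The story of {0}, {1}, and {c}".format(a, b, c=d)

print(story)  # The story of Arthur, Bedevere, and Galahad
```
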

Format Strings

Brace characters ('curly braces') are used to indicate a
replacement field within the string:

    "My name is {0}".format('Fred')

The result of this is the string:

    "My name is Fred"

Braces can be escaped by doubling:

    "My name is {0} :-{{}}".format('Fred')

Which would produce:

    "My name is Fred :-{}"

The element within the braces is called a 'field'. Fields consist
of a 'field name', which can either be simple or compound, and an
optional 'conversion specifier'.

Simple and Compound Field Names

Simple field names are either names or numbers. If numbers, they
must be valid base-10 integers; if names, they must be valid
Python identifiers. A number is used to identify a positional
argument, while a name is used to identify a keyword argument.

A compound field name is a combination of multiple simple field
names in an expression:

    "My name is {0.name}".format(open('out.txt'))

This example shows the use of the 'getattr' or 'dot' operator
in a field expression. The dot operator allows an attribute of
an input value to be specified as the field value.

The types of expressions that can be used in a compound name
have been deliberately limited in order to prevent potential
security exploits resulting from the ability to place arbitrary
Python expressions inside of strings. Only two operators are
supported: the '.' (getattr) operator and the '[]' (getitem)
operator.

An example of the 'getitem' syntax:

    "My name is {0[name]}".format(dict(name='Fred'))

It should be noted that the use of 'getitem' within a string is
much more limited than its normal use. In the above example, the
string 'name' really is the literal string 'name', not a variable
named 'name'. The rules for parsing an item key are the same as
for parsing a simple name - in other words, if it looks like a
number, then it is treated as a number; if it looks like an
identifier, then it is used as a string.

It is not possible to specify arbitrary dictionary keys from
within a format string.

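
The following sketch exercises both operators using str.format as
found in released versions of Python. The 'user' object and the
sample containers are assumptions made up for this illustration.

```python
from types import SimpleNamespace

# A stand-in object with a 'name' attribute (hypothetical data).
user = SimpleNamespace(name="Fred")

# '.' performs attribute lookup on the argument.
by_attr = "My name is {0.name}".format(user)

# '[]' with an identifier-like key uses the literal string 'name'.
by_key = "My name is {0[name]}".format({"name": "Fred"})

# '[]' with a key that looks like a number uses an integer index.
by_index = "Second item: {0[1]}".format(["zero", "one"])

# The same rule decides which dictionary key is consulted.
key_kind = "{0[1]}".format({1: "int key", "1": "str key"})

print(by_attr)   # My name is Fred
print(by_key)    # My name is Fred
print(by_index)  # Second item: one
print(key_kind)  # int key
```

The last line shows the consequence of the parsing rule: a key
written as digits is looked up as an integer, so the string key '1'
cannot be reached from within the format string.
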

Conversion Specifiers

Each field can also specify an optional set of 'conversion
specifiers' which can be used to adjust the format of that field.
Conversion specifiers follow the field name, with a colon (':')
character separating the two:

    "My name is {0:8}".format('Fred')

The meaning and syntax of the conversion specifiers depends on the
type of object that is being formatted; however, many of the
built-in types will recognize a standard set of conversion
specifiers.

Conversion specifiers can themselves contain replacement fields.
For example, a field whose field width is itself a parameter
could be specified via:

    "{0:{1}}".format(a, b, c)

Note that the doubled '}' at the end, which would normally be
escaped, is not escaped in this case. This is because the '{{'
and '}}' syntax for escapes is only applied when used *outside*
of a format field. Within a format field, the brace characters
always have their normal meaning.

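
A short sketch of a nested replacement field, using str.format as
found in released versions of Python. The values are illustrative;
the square brackets are only there to make the padding visible.

```python
name, width = "Fred", 8

# The inner {1} is replaced first, giving the specifier '8'.
padded = "[{0:{1}}]".format(name, width)

# The nested field can be combined with other specifier characters.
right = "[{0:>{1}}]".format(name, width)

print(padded)  # [Fred    ]
print(right)   # [    Fred]
```
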

The syntax for conversion specifiers is open-ended, since, except
for doing field replacements, the format() method does not
attempt to interpret them in any way; it merely passes all of the
characters between the first colon and the matching brace to
the various underlying formatter methods.

Standard Conversion Specifiers

If an object does not define its own conversion specifiers, a
standard set of conversion specifiers is used. These are similar
in concept to the conversion specifiers used by the existing '%'
operator; however, there are also a number of significant
differences. The standard conversion specifiers fall into three
major categories: string conversions, integer conversions and
floating point conversions.

The general form of a standard conversion specifier is:

    [[fill]align][sign][width][.precision][type]

The brackets ([]) indicate an optional field.

The optional align flag can be one of the following:

    '<' - Forces the field to be left-aligned within the available
          space (This is the default.)
    '>' - Forces the field to be right-aligned within the
          available space.
    '=' - Forces the padding to be placed immediately after the
          sign, if any. This is used for printing fields in the
          form '+000000120'.

Note that unless a minimum field width is defined, the field
width will always be the same size as the data to fill it, so
that the alignment option has no meaning in this case.

The optional 'fill' character defines the character to be used to
pad the field to the minimum width. The alignment flag must be
supplied if the character is a number other than 0 (otherwise the
character would be interpreted as part of the field width
specifier). A zero fill character without an alignment flag
implies an alignment type of '='.

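
The alignment and fill options can be tried out with str.format as
found in released versions of Python. This is only a sketch: every
example spells out the alignment flag explicitly, because the
defaults in released Python are not identical to this draft (numbers
are right-aligned there unless told otherwise).

```python
left = "[{0:<8}]".format("Fred")      # pad on the right
right = "[{0:>8}]".format("Fred")     # pad on the left
starred = "[{0:*>8}]".format("Fred")  # '*' as the fill character
signed = "{0:0=+10}".format(120)      # zero padding after the sign

print(left)     # [Fred    ]
print(right)    # [    Fred]
print(starred)  # [****Fred]
print(signed)   # +000000120
```
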

The 'sign' field can be one of the following:

    '+'  - indicates that a sign should be used for both
           positive as well as negative numbers
    '-'  - indicates that a sign should be used only for negative
           numbers (this is the default behaviour)
    ' '  - indicates that a leading space should be used on
           positive numbers
    '()' - indicates that negative numbers should be surrounded
           by parentheses

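
The first three sign options can be demonstrated with str.format as
found in released versions of Python. The '()' option is not accepted
there, so the sketch approximates it with a hypothetical helper,
'parenthesize', which is not part of any library.

```python
plus = "{0:+d} {1:+d}".format(5, -5)   # sign on both numbers
minus = "{0:-d} {1:-d}".format(5, -5)  # sign on negatives only
space = "{0: d} {1: d}".format(5, -5)  # leading space on positives


def parenthesize(number):
    """Hypothetical stand-in for the draft's '()' sign option."""
    if number < 0:
        return "({0})".format(-number)
    return "{0}".format(number)


print(plus)                # +5 -5
print(minus)               # 5 -5
print(repr(space))         # ' 5 -5'
print(parenthesize(-120))  # (120)
```
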

'width' is a decimal integer defining the minimum field width. If
not specified, then the field width will be determined by the
content.

The 'precision' field is a decimal number indicating how many
digits should be displayed after the decimal point.

Finally, the 'type' determines how the data should be presented.
If the type field is absent, an appropriate type will be assigned
based on the value to be formatted ('d' for integers and longs,
'g' for floats, and 's' for everything else.)

The available string conversion types are:

    's' - String format. Invokes str() on the object.
          This is the default conversion specifier type.
    'r' - Repr format. Invokes repr() on the object.

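
A brief sketch of the string conversions, using str.format as found
in released versions of Python. One difference from this draft is
worth flagging: released Python does not accept 'r' as a type
character and instead spells the repr conversion as '!r' after the
field name.

```python
as_str = "{0:s}".format("Fred")   # str() conversion
as_repr = "{0!r}".format("Fred")  # repr() conversion, released spelling

print(as_str)   # Fred
print(as_repr)  # 'Fred'
```
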

There are several integer conversion types. All invoke int() on
the object before attempting to format it.

The available integer conversion types are:

    'b' - Binary. Outputs the number in base 2.
    'c' - Character. Converts the integer to the corresponding
          unicode character before printing.
    'd' - Decimal Integer. Outputs the number in base 10.
    'o' - Octal format. Outputs the number in base 8.
    'x' - Hex format. Outputs the number in base 16, using
          lower-case letters for the digits above 9.
    'X' - Hex format. Outputs the number in base 16, using
          upper-case letters for the digits above 9.

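
Each integer conversion type can be checked with str.format as found
in released versions of Python. The sample values are arbitrary.

```python
n = 255

binary = "{0:b}".format(n)   # base 2
char = "{0:c}".format(65)    # code point 65 as a character
decimal = "{0:d}".format(n)  # base 10
octal = "{0:o}".format(n)    # base 8
lower = "{0:x}".format(n)    # base 16, lower-case digits
upper = "{0:X}".format(n)    # base 16, upper-case digits

print(binary, char, decimal, octal, lower, upper)
# 11111111 A 255 377 ff FF
```
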

There are several floating point conversion types. All invoke
float() on the object before attempting to format it.

The available floating point conversion types are:

    'e' - Exponent notation. Prints the number in scientific
          notation using the letter 'e' to indicate the exponent.
    'E' - Exponent notation. Same as 'e' except it uses an upper
          case 'E' as the separator character.
    'f' - Fixed point. Displays the number as a fixed-point
          number.
    'F' - Fixed point. Same as 'f'.
    'g' - General format. This prints the number as a fixed-point
          number, unless the number is too large, in which case
          it switches to 'e' exponent notation.
    'G' - General format. Same as 'g' except switches to 'E'
          if the number gets too large.
    'n' - Number. This is the same as 'g', except that it uses the
          current locale setting to insert the appropriate
          number separator characters.
    '%' - Percentage. Multiplies the number by 100 and displays
          in fixed ('f') format, followed by a percent sign.

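
The floating point types can be tried with str.format as found in
released versions of Python. The sample values are arbitrary, and
'n' is left out of the sketch because its output depends on the
current locale.

```python
value = 1234.5678

exp_lower = "{0:.2e}".format(value)  # scientific notation, 'e'
exp_upper = "{0:.2E}".format(value)  # scientific notation, 'E'
fixed = "{0:.2f}".format(value)      # fixed point, two decimals
general = "{0:g}".format(value)      # stays in fixed-point form
general_big = "{0:g}".format(1e20)   # switches to exponent form
percent = "{0:.1%}".format(0.256)    # multiplied by 100, with '%'

print(exp_lower)    # 1.23e+03
print(exp_upper)    # 1.23E+03
print(fixed)        # 1234.57
print(general)      # 1234.57
print(general_big)  # 1e+20
print(percent)      # 25.6%
```
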

Objects are able to define their own conversion specifiers to
replace the standard ones. An example is the 'datetime' class,
whose conversion specifiers might look something like the
arguments to the strftime() function:

    "Today is: {0:a b d H:M:S Y}".format(datetime.now())

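
For comparison, released versions of Python did give datetime its own
conversion specifiers, and they are passed to strftime() with the
usual '%' directives rather than the bare letters imagined above. The
sketch uses a fixed, made-up timestamp so that the output is
reproducible.

```python
from datetime import datetime

# A fixed moment chosen for illustration only.
moment = datetime(2006, 4, 16, 9, 30, 0)

# Everything after the first colon is handed to datetime.__format__.
stamp = "Created: {0:%Y-%m-%d %H:%M:%S}".format(moment)

print(stamp)  # Created: 2006-04-16 09:30:00
```
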

Controlling Formatting

A class that wishes to implement a custom interpretation of its
conversion specifiers can implement a __format__ method:

    class AST:
        def __format__(self, specifiers):
            ...

The 'specifiers' argument will be either a string object or a
unicode object, depending on the type of the original format
string. The __format__ method should test the type of the
specifiers parameter to determine whether to return a string or
unicode object. It is the responsibility of the __format__ method
to return an object of the proper type.

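
As a minimal sketch of such a method, the hypothetical 'Money' class
below interprets two specifiers of its own invention ('cents' and
'dollars'). It runs against the __format__ protocol as found in
released versions of Python 3, where the specifiers always arrive as
a str.

```python
class Money:
    """Hypothetical value type that defines its own specifiers."""

    def __init__(self, cents):
        self.cents = cents

    def __format__(self, specifiers):
        # The text after the colon arrives here uninterpreted.
        if specifiers == "cents":
            return "{0}c".format(self.cents)
        if specifiers in ("", "dollars"):
            return "${0:.2f}".format(self.cents / 100)
        raise ValueError("unknown specifier: " + repr(specifiers))


price = Money(1250)
label = "{0} / {0:cents}".format(price)

print(label)  # $12.50 / 1250c
```
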

string.format() will format each field using the following steps:

1) See if the value to be formatted has a __format__ method. If
   it does, then call it.

2) Otherwise, check the internal formatter within string.format
   that contains knowledge of certain builtin types.

3) Otherwise, call str() or unicode() as appropriate.

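
The lookup order above can be sketched as a plain function. This is
an illustration under stated assumptions, not the actual
implementation: 'format_value' and 'Plain' are hypothetical names,
and because built-in types in released Python 3 define __format__
themselves, step 2 is folded into step 1 here.

```python
def format_value(value, specifiers):
    """Illustrative sketch of the three-step lookup described above."""
    # Step 1: a __format__ defined by the value's own class wins.
    hook = getattr(type(value), "__format__", None)
    if hook is not None and hook is not object.__format__:
        return hook(value, specifiers)
    # Step 2 is covered by step 1 in released Python, since int,
    # float, str and similar types carry their own __format__.
    # Step 3: fall back to plain string conversion.
    return str(value)


class Plain:
    """Hypothetical class with no __format__ of its own."""

    def __str__(self):
        return "plain object"


print(format_value(42, "05d"))   # 00042
print(format_value(Plain(), ""))  # plain object
```
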

User-Defined Formatting Classes

There will be times when customizing the formatting of fields
on a per-type basis is not enough. An example might be an
accounting application, which displays negative numbers in
parentheses rather than using a negative sign.

The string formatting system facilitates this kind of
application-specific formatting by allowing user code to directly
invoke the code that interprets format strings and fields.
User-written code can intercept the normal formatting operations
on a per-field basis, substituting its own formatting methods.

For example, in the aforementioned accounting application, there
could be an application-specific number formatter, which reuses
the string.format templating code to do most of the work. The
API for such an application-specific formatter is up to the
application; here are several possible examples:

    cell_format("The total is: {0}", total)

    TemplateString("The total is: {0}").format(total)

Creating an application-specific formatter is relatively
straightforward. The string and unicode classes will have a class
method called 'cformat' that does all the actual work of
formatting; the built-in format() method is just a wrapper that
calls cformat.


The type signature for the cformat function is as follows:

    cformat(template, format_hook, args, kwargs)

The parameters to the cformat function are:

    -- The format template string.
    -- A callable 'format hook', which is called once per field
    -- A tuple containing the positional arguments
    -- A dict containing the keyword arguments

The cformat function will parse all of the fields in the format
string, and return a new string (or unicode) with all of the
fields replaced with their formatted values.

The format hook is a callable object supplied by the user, which
is invoked once per field, and which can override the normal
formatting for that field. For each field, the cformat function
will attempt to call the field format hook with the following
arguments:

    format_hook(value, conversion)

The 'value' field corresponds to the value being formatted, which
was retrieved from the arguments using the field name.

The 'conversion' argument is the conversion spec part of the
field, which will be either a string or unicode object, depending
on the type of the original format string.

The format_hook will be called once per field. The format_hook
may take one of two actions:

1) Return a string or unicode object that is the result
   of the formatting operation.

2) Return None, indicating that the format_hook will not
   process this field and the default formatting should be
   used. This decision should be based on the type of the
   value object, and the contents of the conversion string.

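
The hook protocol can be approximated today with the standard
library's string.Formatter class, which is a different mechanism from
the 'cformat' class method proposed here. In the sketch below,
'HookFormatter', the module-level 'cformat' function and
'accounting_hook' are all hypothetical names written for this
illustration; only string.Formatter, its format_field() method and
its vformat() method are real library calls.

```python
import string


class HookFormatter(string.Formatter):
    """Consults a user-supplied hook once per field (sketch only)."""

    def __init__(self, format_hook):
        super().__init__()
        self.format_hook = format_hook

    def format_field(self, value, format_spec):
        result = self.format_hook(value, format_spec)
        if result is None:
            # The hook declined, so use the default formatting.
            return super().format_field(value, format_spec)
        return result


def cformat(template, format_hook, args, kwargs):
    """Stand-in with the parameter order given in the text above."""
    return HookFormatter(format_hook).vformat(template, args, kwargs)


def accounting_hook(value, conversion):
    """Show negative numbers in parentheses; decline everything else."""
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if is_number and value < 0:
        return "(" + format(-value, conversion) + ")"
    return None


loss = cformat("The total is: {0:.2f}", accounting_hook, (-12.5,), {})
gain = cformat("The total is: {0:.2f}", accounting_hook, (3,), {})

print(loss)  # The total is: (12.50)
print(gain)  # The total is: 3.00
```

Returning None from the hook is what lets a single hook handle only
the cases it cares about while every other field keeps its normal
formatting.
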

Error handling

The string formatting system has two error handling modes, which
are controlled by the value of a class variable:

    string.strict_format_errors = True

The 'strict_format_errors' flag defaults to False, or 'lenient'
mode. Setting it to True enables 'strict' mode. The current mode
determines how errors are handled, depending on the type of the
error.

The types of errors that can occur are:

1) Reference to a missing or invalid argument from within a
   field specifier. In strict mode, this will raise an exception.
   In lenient mode, this will cause the value of the field to be
   replaced with the string '?name?', where 'name' will be the
   type of error (KeyError, IndexError, or AttributeError).

   So for example:

       >>> string.strict_format_errors = False
       >>> print('Item 2 of argument 0 is: {0[2]}'.format([0, 1]))
       Item 2 of argument 0 is: ?IndexError?

2) Unused argument. In strict mode, this will raise an exception.
   In lenient mode, this will be ignored.

3) Exception raised by underlying formatter. These exceptions
   are always passed through, regardless of the current mode.

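
Released versions of Python have no 'strict_format_errors' flag, and
str.format always raises on a bad field reference. The lenient
behaviour described in case 1 can still be sketched on top of the
standard library's string.Formatter; 'LenientFormatter' is a
hypothetical name, and overriding get_field() is one possible way to
do it, not the mechanism proposed here.

```python
import string


class LenientFormatter(string.Formatter):
    """Replaces unresolved fields with '?ErrorName?' (sketch only)."""

    def get_field(self, field_name, args, kwargs):
        try:
            return super().get_field(field_name, args, kwargs)
        except (KeyError, IndexError, AttributeError) as exc:
            # get_field returns a (value, used_key) pair.
            return "?{0}?".format(type(exc).__name__), field_name


template = "Item 2 of argument 0 is: {0[2]}"

# Lenient behaviour: the failed lookup becomes a marker string.
lenient = LenientFormatter().format(template, [0, 1])
print(lenient)  # Item 2 of argument 0 is: ?IndexError?

# Strict behaviour: the built-in method lets the exception escape.
try:
    template.format([0, 1])
    raised = False
except IndexError:
    raised = True
print(raised)  # True
```
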

Alternate Syntax

Naturally, one of the most contentious issues is the syntax of the
format strings, and in particular the markup conventions used to
indicate fields.

Rather than attempting to exhaustively list all of the various
proposals, I will cover the ones that are most widely used
already.

- Shell variable syntax: $name and $(name) (or in some variants,
  ${name}). This is probably the oldest convention out there, and
  is used by Perl and many others. When used without the braces,
  the length of the variable is determined by lexically scanning
  until an invalid character is found.

  This scheme is generally used in cases where interpolation is
  implicit - that is, in environments where any string can contain
  interpolation variables, and no special substitution function
  need be invoked. In such cases, it is important to prevent the
  interpolation behavior from occurring accidentally, so the '$'
  (which is otherwise a relatively uncommonly-used character) is
  used to signal when the behavior should occur.

  It is the author's opinion, however, that in cases where the
  formatting is explicitly invoked, less care needs to be
  taken to prevent accidental interpolation, in which case a
  lighter and less unwieldy syntax can be used.

- Printf and its cousins ('%'), including variations that add a
  field index, so that fields can be interpolated out of order.

- Other bracket-only variations. Various MUDs (Multi-User
  Dungeons) such as MUSH have used brackets (e.g. [name]) to do
  string interpolation. The Microsoft .Net libraries use braces
  ({}), and a syntax which is very similar to the one in this
  proposal, although the syntax for conversion specifiers is quite
  different. [4]

- Backquoting. This method has the benefit of minimal syntactical
  clutter; however, it lacks many of the benefits of a function
  call syntax (such as complex expression arguments, custom
  formatters, etc.).

- Other variations include Ruby's #{}, PHP's {$name}, and so
  on.

Some specific aspects of the syntax warrant additional comments:

1) Backslash character for escapes. The original version of
   this PEP used backslash rather than doubling to escape a brace.
   This worked because backslashes in Python string literals that
   don't conform to a standard backslash sequence such as '\n'
   are left unmodified. However, this caused a certain amount
   of confusion, and led to potential situations of multiple
   recursive escapes, e.g. '\\\\{' to place a literal backslash
   in front of a brace.

2) The use of the colon character (':') as a separator for
   conversion specifiers. This was chosen simply because that's
   what .Net uses.

Sample Implementation

A rough prototype of the underlying 'cformat' function has been
coded in Python; however, it needs much refinement before being
submitted.

Backwards Compatibility

Backwards compatibility can be maintained by leaving the existing
mechanisms in place. The new system does not collide with any of
the method names of the existing string formatting techniques, so
both systems can co-exist until it comes time to deprecate the
older system.

References

[1] Python Library Reference - String formatting operations
    http://docs.python.org/lib/typesseq-strings.html

[2] Python Library Reference - Template strings
    http://docs.python.org/lib/node109.html

[3] [Python-3000] String formating operations in python 3k
    http://mail.python.org/pipermail/python-3000/2006-April/000285.html

[4] Composite Formatting - [.Net Framework Developer's Guide]
    http://msdn.microsoft.com/library/en-us/cpguide/html/cpconcompositeformatting.asp?frame=true

Copyright

This document has been placed in the public domain.

Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: