210 lines
8.0 KiB
Plaintext
210 lines
8.0 KiB
Plaintext
PEP: 292
|
||
Title: Simpler String Substitutions
|
||
Version: $Revision$
|
||
Last-Modified: $Date$
|
||
Author: barry@python.org (Barry A. Warsaw)
|
||
Status: Final
|
||
Type: Standards Track
|
||
Created: 18-Jun-2002
|
||
Python-Version: 2.4
|
||
Post-History: 18-Jun-2002, 23-Mar-2004, 22-Aug-2004
|
||
|
||
|
||
Abstract
|
||
|
||
This PEP describes a simpler string substitution feature, also
|
||
known as string interpolation. This PEP is "simpler" in two
|
||
respects:
|
||
|
||
1. Python's current string substitution feature
|
||
(i.e. %-substitution) is complicated and error prone. This PEP
|
||
is simpler at the cost of some expressiveness.
|
||
|
||
2. PEP 215 proposed an alternative string interpolation feature,
|
||
introducing a new `$' string prefix. PEP 292 is simpler than
|
||
this because it involves no syntax changes and has much simpler
|
||
rules for what substitutions can occur in the string.
|
||
|
||
|
||
Rationale
|
||
|
||
Python currently supports a string substitution syntax based on
|
||
C's printf() '%' formatting character[1]. While quite rich,
|
||
%-formatting codes are also error prone, even for
|
||
experienced Python programmers. A common mistake is to leave off
|
||
the trailing format character, e.g. the `s' in "%(name)s".
|
||
|
||
In addition, the rules for what can follow a % sign are fairly
|
||
complex, while the usual application rarely needs such complexity.
|
||
Most scripts need to do some string interpolation, but most of
|
||
those use simple `stringification' formats, i.e. %s or %(name)s
|
||
This form should be made simpler and less error prone.
|
||
|
||
|
||
A Simpler Proposal
|
||
|
||
We propose the addition of a new class, called 'Template', which
|
||
will live in the string module. The Template class supports new
|
||
rules for string substitution; its value contains placeholders,
|
||
introduced with the $ character. The following rules for
|
||
$-placeholders apply:
|
||
|
||
1. $$ is an escape; it is replaced with a single $
|
||
|
||
2. $identifier names a substitution placeholder matching a mapping
|
||
key of "identifier". By default, "identifier" must spell a
|
||
Python identifier as defined in [2]. The first non-identifier
|
||
character after the $ character terminates this placeholder
|
||
specification.
|
||
|
||
3. ${identifier} is equivalent to $identifier. It is required
|
||
when valid identifier characters follow the placeholder but are
|
||
not part of the placeholder, e.g. "${noun}ification".
|
||
|
||
If the $ character appears at the end of the line, or is followed
|
||
by any other character than those described above, a ValueError
|
||
will be raised at interpolation time. Values in mapping are
|
||
converted automatically to strings.
|
||
|
||
No other characters have special meaning, however it is possible
|
||
to derive from the Template class to define different substitution
|
||
rules. For example, a derived class could allow for periods in
|
||
the placeholder (e.g. to support a kind of dynamic namespace and
|
||
attribute path lookup), or could define a delimiter character
|
||
other than '$'.
|
||
|
||
Once the Template has been created, substitutions can be performed
|
||
by calling one of two methods:
|
||
|
||
- substitute(). This method returns a new string which results
|
||
when the values of a mapping are substituted for the
|
||
placeholders in the Template. If there are placeholders which
|
||
are not present in the mapping, a KeyError will be raised.
|
||
|
||
- safe_substitute(). This is similar to the substitute() method,
|
||
except that KeyErrors are never raised (due to placeholders
|
||
missing from the mapping). When a placeholder is missing, the
|
||
original placeholder will appear in the resulting string.
|
||
|
||
Here are some examples:
|
||
|
||
>>> from string import Template
|
||
>>> s = Template('${name} was born in ${country}')
|
||
>>> print s.substitute(name='Guido', country='the Netherlands')
|
||
Guido was born in the Netherlands
|
||
>>> print s.substitute(name='Guido')
|
||
Traceback (most recent call last):
|
||
[...]
|
||
KeyError: 'country'
|
||
>>> print s.safe_substitute(name='Guido')
|
||
Guido was born in ${country}
|
||
|
||
The signature of substitute() and safe_substitute() allows for
|
||
passing the mapping of placeholders to values, either as a single
|
||
dictionary-like object in the first positional argument, or as
|
||
keyword arguments as shown above. The exact details and
|
||
signatures of these two methods is reserved for the standard
|
||
library documentation.
|
||
|
||
|
||
Why `$' and Braces?
|
||
|
||
The BDFL said it best[4]: "The $ means "substitution" in so many
|
||
languages besides Perl that I wonder where you've been. [...]
|
||
We're copying this from the shell."
|
||
|
||
Thus the substitution rules are chosen because of the similarity
|
||
with so many other languages. This makes the substitution rules
|
||
easier to teach, learn, and remember.
|
||
|
||
|
||
Comparison to PEP 215
|
||
|
||
PEP 215 describes an alternate proposal for string interpolation.
|
||
Unlike that PEP, this one does not propose any new syntax for
|
||
Python. All the proposed new features are embodied in a new
|
||
library module. PEP 215 proposes a new string prefix
|
||
representation such as $"" which signal to Python that a new type
|
||
of string is present. $-strings would have to interact with the
|
||
existing r-prefixes and u-prefixes, essentially doubling the
|
||
number of string prefix combinations.
|
||
|
||
PEP 215 also allows for arbitrary Python expressions inside the
|
||
$-strings, so that you could do things like:
|
||
|
||
import sys
|
||
print $"sys = $sys, sys = $sys.modules['sys']"
|
||
|
||
which would return
|
||
|
||
sys = <module 'sys' (built-in)>, sys = <module 'sys' (built-in)>
|
||
|
||
It's generally accepted that the rules in PEP 215 are safe in the
|
||
sense that they introduce no new security issues (see PEP 215,
|
||
"Security Issues" for details). However, the rules are still
|
||
quite complex, and make it more difficult to see the substitution
|
||
placeholder in the original $-string.
|
||
|
||
The interesting thing is that the Template class defined in this
|
||
PEP is designed for inheritance and, with a little extra work,
|
||
it's possible to support PEP 215's functionality using existing
|
||
Python syntax.
|
||
|
||
For example, one could define subclasses of Template and dict that
|
||
allowed for a more complex placeholder syntax and a mapping that
|
||
evaluated those placeholders.
|
||
|
||
|
||
Internationalization
|
||
|
||
The implementation supports internationalization by recording the
|
||
original template string in the Template instance's 'template'
|
||
attribute. This attribute would serve as the lookup key in an
|
||
gettext-based catalog. It is up to the application to turn the
|
||
resulting string back into a Template for substitution.
|
||
|
||
However, the Template class was designed to work more intuitively
|
||
in an internationalized application, by supporting the mixing-in
|
||
of Template and unicode subclasses. Thus an internationalized
|
||
application could create an application-specific subclass,
|
||
multiply inheriting from Template and unicode, and using instances
|
||
of that subclass as the gettext catalog key. Further, the
|
||
subclass could alias the special __mod__() method to either
|
||
.substitute() or .safe_substitute() to provide a more traditional
|
||
string/unicode like %-operator substitution syntax.
|
||
|
||
|
||
Reference Implementation
|
||
|
||
The implementation has been committed to the Python 2.4 source tree.
|
||
|
||
|
||
References
|
||
|
||
[1] String Formatting Operations
|
||
http://www.python.org/doc/current/lib/typesseq-strings.html
|
||
|
||
[2] Identifiers and Keywords
|
||
http://www.python.org/doc/current/ref/identifiers.html
|
||
|
||
[3] Guido's python-dev posting from 21-Jul-2002
|
||
http://mail.python.org/pipermail/python-dev/2002-July/026397.html
|
||
|
||
[4] http://mail.python.org/pipermail/python-dev/2002-June/025652.html
|
||
|
||
[5] Reference Implementation
|
||
http://sourceforge.net/tracker/index.php?func=detail&aid=1014055&group_id=5470&atid=305470
|
||
|
||
Copyright
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
End:
|