207 lines
7.7 KiB
Plaintext
207 lines
7.7 KiB
Plaintext
PEP: 292
|
|
Title: Simpler String Substitutions
|
|
Author: Barry Warsaw <barry@python.org>
|
|
Status: Final
|
|
Type: Standards Track
|
|
Content-Type: text/x-rst
|
|
Created: 18-Jun-2002
|
|
Python-Version: 2.4
|
|
Post-History: 18-Jun-2002, 23-Mar-2004, 22-Aug-2004
|
|
Replaces: 215
|
|
|
|
|
|
Abstract
|
|
========
|
|
|
|
This PEP describes a simpler string substitution feature, also
|
|
known as string interpolation. This PEP is "simpler" in two
|
|
respects:
|
|
|
|
1. Python's current string substitution feature
|
|
(i.e. ``%``-substitution) is complicated and error prone. This PEP
|
|
is simpler at the cost of some expressiveness.
|
|
|
|
2. :pep:`215` proposed an alternative string interpolation feature,
|
|
introducing a new ``$`` string prefix. :pep:`292` is simpler than
|
|
this because it involves no syntax changes and has much simpler
|
|
rules for what substitutions can occur in the string.
|
|
|
|
|
|
Rationale
|
|
=========
|
|
|
|
Python currently supports a string substitution syntax based on
|
|
C's ``printf()`` '``%``' formatting character [1]_. While quite rich,
|
|
``%``-formatting codes are also error prone, even for
|
|
experienced Python programmers. A common mistake is to leave off
|
|
the trailing format character, e.g. the '``s``' in ``"%(name)s"``.
|
|
|
|
In addition, the rules for what can follow a ``%`` sign are fairly
|
|
complex, while the usual application rarely needs such complexity.
|
|
Most scripts need to do some string interpolation, but most of
|
|
those use simple 'stringification' formats, i.e. ``%s`` or ``%(name)s``
|
|
This form should be made simpler and less error prone.
|
|
|
|
|
|
A Simpler Proposal
|
|
==================
|
|
|
|
We propose the addition of a new class, called ``Template``, which
|
|
will live in the string module. The ``Template`` class supports new
|
|
rules for string substitution; its value contains placeholders,
|
|
introduced with the ``$`` character. The following rules for
|
|
``$``-placeholders apply:
|
|
|
|
1. ``$$`` is an escape; it is replaced with a single ``$``
|
|
|
|
2. ``$identifier`` names a substitution placeholder matching a mapping
|
|
key of "identifier". By default, "identifier" must spell a
|
|
Python identifier as defined in [2]_. The first non-identifier
|
|
character after the ``$`` character terminates this placeholder
|
|
specification.
|
|
|
|
3. ``${identifier}`` is equivalent to ``$identifier``. It is required
|
|
when valid identifier characters follow the placeholder but are
|
|
not part of the placeholder, e.g. ``"${noun}ification"``.
|
|
|
|
If the ``$`` character appears at the end of the line, or is followed
|
|
by any other character than those described above, a ``ValueError``
|
|
will be raised at interpolation time. Values in mapping are
|
|
converted automatically to strings.
|
|
|
|
No other characters have special meaning, however it is possible
|
|
to derive from the ``Template`` class to define different substitution
|
|
rules. For example, a derived class could allow for periods in
|
|
the placeholder (e.g. to support a kind of dynamic namespace and
|
|
attribute path lookup), or could define a delimiter character
|
|
other than ``$``.
|
|
|
|
Once the ``Template`` has been created, substitutions can be performed
|
|
by calling one of two methods:
|
|
|
|
- ``substitute()``. This method returns a new string which results
|
|
when the values of a mapping are substituted for the
|
|
placeholders in the ``Template``. If there are placeholders which
|
|
are not present in the mapping, a ``KeyError`` will be raised.
|
|
|
|
- ``safe_substitute()``. This is similar to the ``substitute()`` method,
|
|
except that ``KeyErrors`` are never raised (due to placeholders
|
|
missing from the mapping). When a placeholder is missing, the
|
|
original placeholder will appear in the resulting string.
|
|
|
|
Here are some examples::
|
|
|
|
|
|
>>> from string import Template
|
|
>>> s = Template('${name} was born in ${country}')
|
|
>>> print s.substitute(name='Guido', country='the Netherlands')
|
|
Guido was born in the Netherlands
|
|
>>> print s.substitute(name='Guido')
|
|
Traceback (most recent call last):
|
|
[...]
|
|
KeyError: 'country'
|
|
>>> print s.safe_substitute(name='Guido')
|
|
Guido was born in ${country}
|
|
|
|
The signature of ``substitute()`` and ``safe_substitute()`` allows for
|
|
passing the mapping of placeholders to values, either as a single
|
|
dictionary-like object in the first positional argument, or as
|
|
keyword arguments as shown above. The exact details and
|
|
signatures of these two methods is reserved for the standard
|
|
library documentation.
|
|
|
|
|
|
Why ``$`` and Braces?
|
|
=====================
|
|
|
|
The BDFL said it best [3]_: "The ``$`` means "substitution" in so many
|
|
languages besides Perl that I wonder where you've been. [...]
|
|
We're copying this from the shell."
|
|
|
|
Thus the substitution rules are chosen because of the similarity
|
|
with so many other languages. This makes the substitution rules
|
|
easier to teach, learn, and remember.
|
|
|
|
|
|
Comparison to PEP 215
|
|
=====================
|
|
|
|
:pep:`215` describes an alternate proposal for string interpolation.
|
|
Unlike that PEP, this one does not propose any new syntax for
|
|
Python. All the proposed new features are embodied in a new
|
|
library module. :pep:`215` proposes a new string prefix
|
|
representation such as ``$""`` which signal to Python that a new type
|
|
of string is present. ``$``-strings would have to interact with the
|
|
existing r-prefixes and u-prefixes, essentially doubling the
|
|
number of string prefix combinations.
|
|
|
|
:pep:`215` also allows for arbitrary Python expressions inside the
|
|
``$``-strings, so that you could do things like::
|
|
|
|
import sys
|
|
print $"sys = $sys, sys = $sys.modules['sys']"
|
|
|
|
which would return::
|
|
|
|
sys = <module 'sys' (built-in)>, sys = <module 'sys' (built-in)>
|
|
|
|
It's generally accepted that the rules in :pep:`215` are safe in the
|
|
sense that they introduce no new security issues (see :pep:`215`,
|
|
"Security Issues" for details). However, the rules are still
|
|
quite complex, and make it more difficult to see the substitution
|
|
placeholder in the original ``$``-string.
|
|
|
|
The interesting thing is that the ``Template`` class defined in this
|
|
PEP is designed for inheritance and, with a little extra work,
|
|
it's possible to support :pep:`215`'s functionality using existing
|
|
Python syntax.
|
|
|
|
For example, one could define subclasses of ``Template`` and dict that
|
|
allowed for a more complex placeholder syntax and a mapping that
|
|
evaluated those placeholders.
|
|
|
|
|
|
Internationalization
|
|
====================
|
|
|
|
The implementation supports internationalization by recording the
|
|
original template string in the ``Template`` instance's ``template``
|
|
attribute. This attribute would serve as the lookup key in an
|
|
gettext-based catalog. It is up to the application to turn the
|
|
resulting string back into a ``Template`` for substitution.
|
|
|
|
However, the ``Template`` class was designed to work more intuitively
|
|
in an internationalized application, by supporting the mixing-in
|
|
of ``Template`` and unicode subclasses. Thus an internationalized
|
|
application could create an application-specific subclass,
|
|
multiply inheriting from ``Template`` and unicode, and using instances
|
|
of that subclass as the gettext catalog key. Further, the
|
|
subclass could alias the special ``__mod__()`` method to either
|
|
``.substitute()`` or ``.safe_substitute()`` to provide a more traditional
|
|
string/unicode like ``%``-operator substitution syntax.
|
|
|
|
|
|
Reference Implementation
|
|
========================
|
|
|
|
The implementation [4]_ has been committed to the Python 2.4 source tree.
|
|
|
|
References
|
|
==========
|
|
|
|
.. [1] String Formatting Operations
|
|
https://docs.python.org/release/2.6/library/stdtypes.html#string-formatting-operations
|
|
|
|
.. [2] Identifiers and Keywords
|
|
https://docs.python.org/release/2.6/reference/lexical_analysis.html#identifiers-and-keywords
|
|
|
|
.. [3] https://mail.python.org/pipermail/python-dev/2002-June/025652.html
|
|
|
|
.. [4] Reference Implementation
|
|
http://sourceforge.net/tracker/index.php?func=detail&aid=1014055&group_id=5470&atid=305470
|
|
|
|
Copyright
|
|
=========
|
|
|
|
This document has been placed in the public domain.
|