PEP: 215
Title: String Interpolation
Author: Ka-Ping Yee <ping@zesty.ca>
Status: Superseded
Type: Standards Track
Content-Type: text/x-rst
Created: 24-Jul-2000
Python-Version: 2.1
Post-History:
Superseded-By: 292


Abstract
========

This document proposes a string interpolation feature for Python
to allow easier string formatting. The suggested syntax change
is the introduction of a '$' prefix that triggers the special
interpretation of the '$' character within a string, in a manner
reminiscent of the variable interpolation found in Unix shells,
awk, Perl, or Tcl.


Copyright
=========

This document is in the public domain.


Specification
=============

Strings may be preceded with a '$' prefix that comes before the
leading single or double quotation mark (or triplet) and before
any of the other string prefixes ('r' or 'u'). Such a string is
processed for interpolation after the normal interpretation of
backslash-escapes in its contents. The processing occurs just
before the string is pushed onto the value stack, each time the
string is pushed. In short, Python behaves exactly as if '$'
were a unary operator applied to the string. The operation
performed is as follows:

The string is scanned from start to end for the '$' character
(``\x24`` in 8-bit strings or ``\u0024`` in Unicode strings). If there
are no '$' characters present, the string is returned unchanged.

Any '$' found in the string, followed by one of the two kinds of
expressions described below, is replaced with the value of the
expression as evaluated in the current namespaces. The value is
converted with ``str()`` if the containing string is an 8-bit string,
or with ``unicode()`` if it is a Unicode string.

1. A Python identifier optionally followed by any number of
   trailers, where a trailer consists of:

   - a dot and an identifier,
   - an expression enclosed in square brackets, or
   - an argument list enclosed in parentheses

   (This is exactly the pattern expressed in the Python grammar
   by "``NAME trailer*``", using the definitions in ``Grammar/Grammar``.)

2. Any complete Python expression enclosed in curly braces.

Two dollar-signs ("$$") are replaced with a single "$".


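The scanning rules above can be approximated at run time. The following
is a minimal sketch only, written for Python 3 rather than the Python 2
syntax this proposal targets. The names ``interpolate`` and
``_match_bracket`` are hypothetical and are not part of the proposal. The
sketch makes several simplifying assumptions: values are always converted
with ``str()``, the namespace is passed explicitly instead of being the
current namespaces, brackets inside string literals are not recognised
when matching, and a '$' that is not followed by either kind of
expression is kept as a literal '$'.

```python
import re

_IDENT = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")
_CLOSERS = {"(": ")", "[": "]", "{": "}"}


def _match_bracket(text, start):
    """Return the index just past the bracket that closes text[start].

    Simplification: nesting is counted character by character, so
    brackets that appear inside string literals are not handled.
    """
    stack = []
    for i in range(start, len(text)):
        ch = text[i]
        if ch in _CLOSERS:
            stack.append(_CLOSERS[ch])
        elif stack and ch == stack[-1]:
            stack.pop()
            if not stack:
                return i + 1
    raise ValueError("unbalanced bracket in interpolated expression")


def interpolate(template, namespace):
    """Hypothetical run-time stand-in for the proposed $'...' prefix."""
    out = []
    pos = 0
    while True:
        dollar = template.find("$", pos)
        if dollar == -1:                    # no more '$': copy the rest
            out.append(template[pos:])
            return "".join(out)
        out.append(template[pos:dollar])
        nxt = template[dollar + 1:dollar + 2]
        if nxt == "$":                      # "$$" becomes a single "$"
            out.append("$")
            pos = dollar + 2
        elif nxt == "{":                    # form 2: ${expression}
            end = _match_bracket(template, dollar + 1)
            expr = template[dollar + 2:end - 1]
            out.append(str(eval(expr, namespace)))
            pos = end
        else:
            m = _IDENT.match(template, dollar + 1)
            if m is None:                   # assumption: keep a lone '$'
                out.append("$")
                pos = dollar + 1
                continue
            end = m.end()
            while end < len(template):      # form 1: NAME trailer*
                ch = template[end]
                if ch == ".":
                    attr = _IDENT.match(template, end + 1)
                    if attr is None:
                        break               # a '.' with no identifier
                    end = attr.end()
                elif ch in "([":
                    end = _match_bracket(template, end)
                else:
                    break
            out.append(str(eval(template[dollar + 1:end], namespace)))
            pos = end
```

Under these assumptions, ``interpolate("a = $a, b = $b", {"a": 5, "b": 6})``
plays the role of ``$'a = $a, b = $b'``. Because it receives an arbitrary
string at run time, it behaves like the operator form rather than the
literal-only prefix that this document specifies.

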
Examples
========

Here is an example of an interactive session exhibiting the
expected behaviour of this feature. ::

    >>> a, b = 5, 6
    >>> print $'a = $a, b = $b'
    a = 5, b = 6
    >>> $u'uni${a}ode'
    u'uni5ode'
    >>> print $'\$a'
    5
    >>> print $r'\$a'
    \5
    >>> print $'$$$a.$b'
    $5.6
    >>> print $'a + b = ${a + b}'
    a + b = 11
    >>> import sys
    >>> print $'References to $a: $sys.getrefcount(a)'
    References to 5: 15
    >>> print $"sys = $sys, sys = $sys.modules['sys']"
    sys = <module 'sys' (built-in)>, sys = <module 'sys' (built-in)>
    >>> print $'BDFL = $sys.copyright.split()[4].upper()'
    BDFL = GUIDO


Discussion
==========

'$' is chosen as the interpolation character within the
string for the sake of familiarity, since it is already used
for this purpose in many other languages and contexts.

It is then natural to choose '$' as a prefix, since it is a
mnemonic for the interpolation character.

Trailers are permitted to give this interpolation mechanism
even more power than the interpolation available in most other
languages, while the expression to be interpolated remains
clearly visible and free of curly braces.

'$' works like an operator and could be implemented as an
operator, but that would prevent the compile-time optimization
and present security issues, so it is allowed only as a
string prefix.


Security Issues
===============

"$" has the power to eval, but only to eval a literal. As
described here (a string prefix rather than an operator), it
introduces no new security issues since the expressions to be
evaluated must be literally present in the code.


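The distinction between a prefix and an operator can be illustrated with
a small sketch in Python 3. The function ``interpolate_at_runtime`` is a
hypothetical stand-in for an operator-style '$' and handles only the
``${expression}`` form without nested braces. It is not part of the
proposal, and the variable names are invented for illustration.

```python
import re

_BRACED = re.compile(r"\$\{([^{}]*)\}")


def interpolate_at_runtime(template, namespace):
    """Hypothetical operator-style '$': evaluates whatever text it is given."""
    return _BRACED.sub(lambda m: str(eval(m.group(1), namespace)), template)


secret = "hunter2"

# A literal written in the source: the author sees exactly what is evaluated.
greeting = interpolate_at_runtime("2 + 2 = ${2 + 2}", {})

# Text that arrives at run time: whoever supplies it chooses the expression.
untrusted = "leak: ${secret}"
leaked = interpolate_at_runtime(untrusted, {"secret": secret})
```

With the prefix form, only the first kind of use can be written, because
the template must appear as a literal in the program text.

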
Implementation
==============

The ``Itpl`` module at [1]_ provides a prototype of this feature.
It uses the ``tokenize`` module to find the end of an expression
to be interpolated, then calls ``eval()`` on the expression each
time a value is needed. In the prototype, the expression is parsed
and compiled again each time it is evaluated.

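The idea of using ``tokenize`` to find where an expression ends can be
sketched as follows. This is not the code of the ``Itpl`` module; it is an
independent illustration in Python 3, and the names ``expression_end``
and ``evaluate_after_dollar`` are hypothetical. It assumes single-line
input, so that token columns can be used as string offsets, and it
requires each trailer to follow the previous token with no whitespace in
between.

```python
import io
import tokenize

_OPENERS = ("(", "[", "{")
_CLOSERS = (")", "]", "}")
_STOP = (tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER)


def expression_end(text, start):
    """Return the index in `text` where a NAME trailer* expression ends.

    `start` points at the first character after a '$'. If no identifier
    begins there, `start` itself is returned.
    """
    source = text[start:]
    end = 0             # column just past the last complete piece
    cursor = 0          # column where the next piece must begin
    depth = 0           # bracket nesting inside a trailer
    expect_name = True  # at the very start and right after a '.'
    try:
        for tok in tokenize.generate_tokens(io.StringIO(source).readline):
            if tok.type in _STOP:
                break
            is_op = tok.type == tokenize.OP
            if depth > 0:
                # Inside brackets any token is allowed; only track nesting.
                if is_op and tok.string in _OPENERS:
                    depth += 1
                elif is_op and tok.string in _CLOSERS:
                    depth -= 1
                    if depth == 0:
                        end = cursor = tok.end[1]
                continue
            if tok.start[1] != cursor:
                break               # whitespace ends the expression
            if expect_name:
                if tok.type != tokenize.NAME:
                    break
                end = cursor = tok.end[1]
                expect_name = False
            elif is_op and tok.string == ".":
                cursor = tok.end[1]  # tentative: needs a name right after
                expect_name = True
            elif is_op and tok.string in ("(", "["):
                depth = 1
            else:
                break
    except (tokenize.TokenError, SyntaxError):
        pass                        # keep what was accepted before the error
    return start + end


def evaluate_after_dollar(text, start, namespace):
    """Evaluate the expression starting at `start`; return (value, end)."""
    end = expression_end(text, start)
    if end == start:
        raise ValueError("no identifier after '$'")
    # As in the prototype described above, the expression text is parsed
    # and compiled again on every call.
    return eval(text[start:end], namespace), end
```

Requiring adjacency means that in ``"$a. Then"`` the sentence-ending
period is not taken as the start of an attribute trailer.
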
As an optimization, interpolated strings could be compiled
directly into the corresponding bytecode; that is, ::

    $'a = $a, b = $b'

could be compiled as though it were the expression ::

    ('a = ' + str(a) + ', b = ' + str(b))

so that it only needs to be compiled once.


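The benefit of compiling once can be approximated in ordinary Python 3 by
separating parsing from evaluation. The sketch below is an assumption-laden
illustration, not the proposed compiler change: ``compile_template`` and
``render`` are hypothetical names, and the template grammar is reduced to
``$$``, ``${expression}`` without nested braces, and ``$name`` without
trailers.

```python
import re

# Simplified grammar for this sketch: "$$", "${expr}" without nested
# braces, or "$name" without trailers.
_TOKEN = re.compile(r"\$(?:(\$)|\{([^{}]*)\}|([A-Za-z_][A-Za-z_0-9]*))")


def compile_template(template):
    """Parse and compile `template` once; return a reusable list of parts.

    Each part is either a literal str or a code object for one expression.
    """
    parts = []
    pos = 0
    for m in _TOKEN.finditer(template):
        parts.append(template[pos:m.start()])
        if m.group(1):
            parts.append("$")
        else:
            expr = m.group(2) if m.group(2) is not None else m.group(3)
            parts.append(compile(expr.strip(), "<interpolation>", "eval"))
        pos = m.end()
    parts.append(template[pos:])
    return parts


def render(parts, namespace):
    """Evaluate precompiled parts, like 'a = ' + str(a) + ', b = ' + str(b)."""
    return "".join(
        p if isinstance(p, str) else str(eval(p, namespace)) for p in parts
    )
```

Here ``compile_template("a = $a, b = $b")`` is called once, and ``render``
can then be called repeatedly with different namespaces without parsing
the expressions again.

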
References
==========

.. [1] http://www.lfw.org/python/Itpl.py