142 lines
4.6 KiB
Plaintext
142 lines
4.6 KiB
Plaintext
PEP: 215
|
||
Title: String Interpolation
|
||
Version: $Revision$
|
||
Author: ping@lfw.org (Ka-Ping Yee)
|
||
Status: Draft
|
||
Type: Standards Track
|
||
Python-Version: 2.1
|
||
Created: 24-Jul-2000
|
||
Post-History:
|
||
|
||
Abstract
|
||
|
||
This document proposes a string interpolation feature for Python
|
||
to allow easier string formatting. The suggested syntax change
|
||
is the introduction of a '$' prefix that triggers the special
|
||
interpretation of the '$' character within a string, in a manner
|
||
reminiscent to the variable interpolation found in Unix shells,
|
||
awk, Perl, or Tcl.
|
||
|
||
|
||
Copyright
|
||
|
||
This document is in the public domain.
|
||
|
||
|
||
Specification
|
||
|
||
Strings may be preceded with a '$' prefix that comes before the
|
||
leading single or double quotation mark (or triplet) and before
|
||
any of the other string prefixes ('r' or 'u'). Such a string is
|
||
processed for interpolation after the normal interpretation of
|
||
backslash-escapes in its contents. The processing occurs just
|
||
before the string is pushed onto the value stack, each time the
|
||
string is pushed. In short, Python behaves exactly as if '$'
|
||
were a unary operator applied to the string. The operation
|
||
performed is as follows:
|
||
|
||
The string is scanned from start to end for the '$' character
|
||
(\x24 in 8-bit strings or \u0024 in Unicode strings). If there
|
||
are no '$' characters present, the string is returned unchanged.
|
||
|
||
Any '$' found in the string, followed by one of the two kinds of
|
||
expressions described below, is replaced with the value of the
|
||
expression as evaluated in the current namespaces. The value is
|
||
converted with str() if the containing string is an 8-bit string,
|
||
or with unicode() if it is a Unicode string.
|
||
|
||
1. A Python identifier optionally followed by any number of
|
||
trailers, where a trailer consists of:
|
||
- a dot and an identifier,
|
||
- an expression enclosed in square brackets, or
|
||
- an argument list enclosed in parentheses
|
||
(This is exactly the pattern expressed in the Python grammar
|
||
by "NAME trailer*", using the definitions in Grammar/Grammar.)
|
||
|
||
2. Any complete Python expression enclosed in curly braces.
|
||
|
||
Two dollar-signs ("$$") are replaced with a single "$".
|
||
|
||
|
||
Examples
|
||
|
||
Here is an example of an interactive session exhibiting the
|
||
expected behaviour of this feature.
|
||
|
||
>>> a, b = 5, 6
|
||
>>> print $'a = $a, b = $b'
|
||
a = 5, b = 6
|
||
>>> $u'uni${a}ode'
|
||
u'uni5ode'
|
||
>>> print $'\$a'
|
||
5
|
||
>>> print $r'\$a'
|
||
\5
|
||
>>> print $'$$$a.$b'
|
||
$5.6
|
||
>>> print $'a + b = ${a + b}'
|
||
a + b = 11
|
||
>>> import sys
|
||
>>> print $'References to $a: $sys.getrefcount(a)'
|
||
References to 5: 15
|
||
>>> print $"sys = $sys, sys = $sys.modules['sys']"
|
||
sys = <module 'sys' (built-in)>, sys = <module 'sys' (built-in)>
|
||
>>> print $'BDFL = $sys.copyright.split()[4].upper()'
|
||
BDFL = GUIDO
|
||
|
||
|
||
Discussion
|
||
|
||
'$' is chosen as the interpolation character within the
|
||
string for the sake of familiarity, since it is already used
|
||
for this purpose in many other languages and contexts.
|
||
|
||
It is then natural to choose '$' as a prefix, since it is a
|
||
mnemonic for the interpolation character.
|
||
|
||
Trailers are permitted to give this interpolation mechanism
|
||
even more power than the interpolation available in most other
|
||
languages, while the expression to be interpolated remains
|
||
clearly visible and free of curly braces.
|
||
|
||
'$' works like an operator and could be implemented as an
|
||
operator, but that prevents the compile-time optimization
|
||
and presents security issues. So, it is only allowed as a
|
||
string prefix.
|
||
|
||
|
||
Security Issues
|
||
|
||
"$" has the power to eval, but only to eval a literal. As
|
||
described here (a string prefix rather than an operator), it
|
||
introduces no new security issues since the expressions to be
|
||
evaluated must be literally present in the code.
|
||
|
||
|
||
Implementation
|
||
|
||
The Itpl module at http://www.lfw.org/python/Itpl.py provides a
|
||
prototype of this feature. It uses the tokenize module to find
|
||
the end of an expression to be interpolated, then calls eval()
|
||
on the expression each time a value is needed. In the prototype,
|
||
the expression is parsed and compiled again each time it is
|
||
evaluated.
|
||
|
||
As an optimization, interpolated strings could be compiled
|
||
directly into the corresponding bytecode; that is,
|
||
|
||
$'a = $a, b = $b'
|
||
|
||
could be compiled as though it were the expression
|
||
|
||
('a = ' + str(a) + ', b = ' + str(b))
|
||
|
||
so that it only needs to be compiled once.
|
||
|
||
|
||
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
End:
|