PEP: 501
Title: General purpose string interpolation
Version: $Revision$
Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 08-Aug-2015
Python-Version: 3.6
Post-History: 08-Aug-2015, 23-Aug-2015

Abstract
========

PEP 498 proposes new syntactic support for string interpolation that is
transparent to the compiler, allowing name references from the interpolation
operation full access to containing namespaces (as with any other expression),
rather than being limited to explicit name references.

However, it only offers this capability for string formatting, making it likely
we will see code like the following::

    os.system(f"echo {user_message}")

This kind of code is superficially elegant, but poses a significant problem
if the interpolated value ``user_message`` is in fact provided by a user: it's
an opening for a form of code injection attack, where the supplied user data
has not been properly escaped before being passed to the ``os.system`` call.

To address that problem (and a number of other concerns), this PEP proposes an
alternative approach to compiler supported interpolation, using ``i`` (for
"interpolation") as the new string prefix and a substitution syntax
inspired by that used in ``string.Template`` and ES6 JavaScript, rather than
adding a 4th substitution variable syntax to Python.

Some possible examples of the proposed syntax::

    msg = str(i'My age next year is ${age+1}, my anniversary is ${anniversary:%A, %B %d, %Y}.')
    print(_(i"This is a $translated $message"))
    translated = l20n(i"{{ $user }} is running {{ appname }}")
    myquery = sql(i"SELECT $column FROM $table;")
    mycommand = sh(i"cat $filename")
    mypage = html(i"<html><body>${response.body}</body></html>")
    callable = defer(i"$x + $y")

Summary of differences from PEP 498
===================================

The key differences of this proposal relative to PEP 498:

* "i" (interpolation template) prefix rather than "f" (formatted string)
* string.Template/JavaScript inspired substitution syntax, rather than str.format/C# inspired
* interpolation templates are created at runtime as a new kind of object
* the default rendering is invoked by calling ``str()`` on a template object
  rather than automatically

Proposal
========

This PEP proposes the introduction of a new string prefix that declares the
string to be an interpolation template rather than an ordinary string::

    template = i"Substitute $names and ${expressions} at runtime"

This would be effectively interpreted as::

    _raw_template = "Substitute $names and ${expressions} at runtime"
    _parsed_fields = (
        ("Substitute ", 0, "names", "", ""),
        (" and ", 1, "expressions", "", ""),
        (" at runtime", None, None, None, None),
    )
    _field_values = (names, expressions)
    template = types.InterpolationTemplate(_raw_template,
                                           _parsed_fields,
                                           _field_values)

The ``__str__`` method on ``types.InterpolationTemplate`` would then implement
the following ``str.format`` inspired semantics::

    >>> import datetime
    >>> name = 'Jane'
    >>> age = 50
    >>> anniversary = datetime.date(1991, 10, 12)
    >>> str(i'My name is $name, my age next year is ${age+1}, my anniversary is ${anniversary:%A, %B %d, %Y}.')
    'My name is Jane, my age next year is 51, my anniversary is Saturday, October 12, 1991.'
    >>> str(i'She said her name is ${name!r}.')
    "She said her name is 'Jane'."

The interpolation template prefix can be combined with single-quoted,
double-quoted and triple quoted strings, including raw strings. It does not
support combination with bytes literals.

This PEP does not propose to remove or deprecate any of the existing
string formatting mechanisms, as those will remain valuable when formatting
strings that are not present directly in the source code of the application.


Rationale
=========

PEP 498 makes interpolating values into strings with full access to Python's
lexical namespace semantics simpler, but it does so at the cost of creating a
situation where interpolating values into sensitive targets like SQL queries,
shell commands and HTML templates will enjoy a much cleaner syntax when handled
without regard for code injection attacks than when they are handled correctly.
It also has the effect of introducing yet another syntax for substitution
expressions into Python, when we already have 3 (``str.format``,
``bytes.__mod__`` and ``string.Template``).

This PEP proposes to handle the former issue by deferring the actual rendering
of the interpolation template to its ``__str__`` method (allowing the use of
other template renderers by passing the template around as an object), and the
latter by adopting the ``string.Template`` substitution syntax defined in PEP
292.

The substitution syntax devised for PEP 292 is deliberately simple so that the
template strings can be extracted into an i18n message catalog, and passed to
translators who may not themselves be developers. For these use cases, it is
important that the interpolation syntax be as simple as possible, as the
translators are responsible for preserving the substitution markers, even as
they translate the surrounding text. The PEP 292 syntax is also a common message
catalog syntax already supported by many commercial software translation
support tools.

PEP 498 correctly points out that the PEP 292 syntax isn't as flexible as that
introduced for general purpose string formatting in PEP 3101, so this PEP adds
that flexibility to the ``${ref}`` construct in PEP 292, and allows translation
tools the option of rejecting usage of that more advanced syntax at runtime,
rather than categorically rejecting it at compile time. The proposed permitted
expressions, conversion specifiers, and format specifiers inside ``${ref}`` are
exactly as defined for ``{ref}`` substitution in PEP 498.

The specific proposal in this PEP is also deliberately close in both syntax
and semantics to the general purpose interpolation syntax introduced to
JavaScript in ES6, as we can reasonably expect a great many Python developers
to be regularly switching back and forth between user interface code written in
JavaScript and core application code written in Python.


Specification
=============

This PEP proposes the introduction of ``i`` as a new string prefix that
results in the creation of an instance of a new type,
``types.InterpolationTemplate``.

Interpolation template literals are Unicode strings (bytes literals are not
permitted), and string literal concatenation operates as normal, with the
entire combined literal forming the interpolation template.

The template string is parsed into literals and expressions. Expressions
appear as either identifiers prefixed with a single ``$`` character, or
surrounded by a leading ``${`` and a trailing ``}``. The parts of the format
string that are not expressions are separated out as string literals.

While parsing the string, any doubled ``$$`` is replaced with a single ``$``
and is considered part of the literal text, rather than as introducing an
expression.

These components are then organised into an instance of a new type with the
following semantics::

    class InterpolationTemplate:
        __slots__ = ("raw_template", "parsed_fields", "field_values")

        def __new__(cls, raw_template, parsed_fields, field_values):
            self = super().__new__(cls)
            self.raw_template = raw_template
            self.parsed_fields = parsed_fields
            self.field_values = field_values
            return self

        def __iter__(self):
            # Support iterable unpacking
            yield self.raw_template
            yield self.parsed_fields
            yield self.field_values

        def __repr__(self):
            return str(i"<${type(self).__qualname__} ${self.raw_template!r} "
                        "at ${id(self):#x}>")

        def __str__(self):
            # See definition of the default template rendering below
            ...

The result of the interpolation template expression is an instance of this
type, rather than an already rendered string - default rendering only takes
place when the instance's ``__str__`` method is called.

The format of the parsed fields tuple is inspired by the interface of
``string.Formatter.parse``, and consists of a series of 5-tuples each
containing:

* a leading string literal (may be the empty string)
* the substitution field position (zero-based enumeration)
* the substitution expression text
* the substitution conversion specifier (as defined by str.format)
* the substitution format specifier (as defined by str.format)

This field ordering is defined such that reading the parsed field tuples from
left to right will have all the subcomponents displayed in the same order as
they appear in the original template string.

For ease of access the sequence elements will be available as attributes in
addition to being available by position:

* ``leading_text``
* ``field_position``
* ``expression``
* ``conversion``
* ``format``

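A named tuple is one way to offer both positional and attribute access to a
parsed field. The sketch below is illustrative only: the type name
``ParsedField`` is an assumption made for this example and is not specified by
this PEP.

```python
from collections import namedtuple

# Hypothetical type name; the PEP only specifies the attribute names
ParsedField = namedtuple(
    "ParsedField",
    ["leading_text", "field_position", "expression", "conversion", "format"],
)

field = ParsedField("abc", 0, "expr1", "", "spec1")

# Access by attribute name...
expression = field.expression
# ...or by position, in the same order as the text appears in the template
leading_text, field_position, expr, conversion, format_spec = field
```
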
The expression text is simply the text of the substitution expression, as it
appeared in the original string, but without the leading and/or surrounding
expression markers. The conversion specifier and format specifier are separated
from the substitution expression by ``!`` and ``:`` as defined for ``str.format``.

If a given substitution field has no leading literal section, conversion specifier
or format specifier, then the corresponding elements in the tuple are the
empty string. If the final part of the string has no trailing substitution
field, then the field position, field expression, conversion specifier and
format specifier will all be ``None``.

The substitution field values tuple is created by evaluating the interpolated
expressions in the exact runtime context where the interpolation expression
appears in the source code.

For the following example interpolation template::

    i'abc${expr1:spec1}${expr2!r:spec2}def${expr3!s}ghi $ident $$jkl'

the parsed fields tuple would be::

    (
      ('abc', 0, 'expr1', '', 'spec1'),
      ('', 1, 'expr2', 'r', 'spec2'),
      ('def', 2, 'expr3', 's', ''),
      ('ghi ', 3, 'ident', '', ''),
      (' $jkl', None, None, None, None)
    )

While the field values tuple would be::

    (expr1, expr2, expr3, ident)

The parsed fields tuple can be constant folded at compile time, while the
expression values tuple will always need to be constructed at runtime.

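The parsing rules in this section can be approximated in ordinary Python. The
following is a simplified sketch rather than the proposed compiler behaviour:
``parse_template`` is a hypothetical helper, it assumes that expressions
contain no ``!``, ``:`` or brace characters, and malformed input that the
compiler would reject with ``SyntaxError`` is simply left as literal text.

```python
import re

# "$$" escape, "$identifier", or "${expression}" (no nested braces here)
_FIELD = re.compile(
    r"\$(?:(?P<escaped>\$)"
    r"|(?P<named>[^\W\d]\w*)"
    r"|\{(?P<braced>[^{}]*)\})"
)


def parse_template(raw_template):
    """Split a template string into 5-tuples (simplified illustration)."""
    fields = []
    literal = []     # pieces of the pending leading text
    position = 0     # zero-based substitution field counter
    last_end = 0
    for match in _FIELD.finditer(raw_template):
        literal.append(raw_template[last_end:match.start()])
        last_end = match.end()
        if match.group("escaped"):
            literal.append("$")   # "$$" collapses to a literal "$"
            continue
        if match.group("named") is not None:
            expr, conversion, format_spec = match.group("named"), "", ""
        else:
            # Assumption: the expression itself contains no "!" or ":"
            expr, _, format_spec = match.group("braced").partition(":")
            expr, _, conversion = expr.partition("!")
        fields.append(("".join(literal), position, expr, conversion, format_spec))
        literal = []
        position += 1
    literal.append(raw_template[last_end:])
    trailing = "".join(literal)
    if trailing:
        fields.append((trailing, None, None, None, None))
    return tuple(fields)


parsed = parse_template("abc${expr1:spec1}${expr2!r:spec2} $ident $$jkl")
```

A real implementation would need a proper expression parser to locate the
conversion and format specifiers, since valid Python expressions can
themselves contain ``!`` (as in ``!=``) and ``:`` (as in slices or lambdas).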
The ``InterpolationTemplate.__str__`` implementation would have the following
semantics, with field processing being defined in terms of the ``format``
builtin and ``str.format`` conversion specifiers::

    _converter = string.Formatter().convert_field

    def __str__(self):
        raw_template, fields, values = self
        template_parts = []
        for leading_text, field_num, expr, conversion, format_spec in fields:
            template_parts.append(leading_text)
            if field_num is not None:
                value = values[field_num]
                if conversion:
                    value = _converter(value, conversion)
                field_text = format(value, format_spec)
                template_parts.append(field_text)
        return "".join(template_parts)

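Since the ``i`` prefix does not exist in any released Python version, the
default rendering can only be tried out today by building the template object
by hand. The following sketch does that with a stand-in class; the class body
and the example values are assumptions made for illustration, and the parsed
fields are written out manually in place of what the compiler would generate.

```python
import datetime
import string

_converter = string.Formatter().convert_field


class InterpolationTemplate:
    """Hand-built stand-in for the proposed type (illustration only)."""

    def __init__(self, raw_template, parsed_fields, field_values):
        self.raw_template = raw_template
        self.parsed_fields = parsed_fields
        self.field_values = field_values

    def __str__(self):
        parts = []
        for leading_text, field_num, expr, conversion, format_spec in self.parsed_fields:
            parts.append(leading_text)
            if field_num is not None:
                value = self.field_values[field_num]
                if conversion:
                    # Apply !r, !s or !a before formatting
                    value = _converter(value, conversion)
                parts.append(format(value, format_spec))
        return "".join(parts)


name = "Jane"
anniversary = datetime.date(1991, 10, 12)

# Roughly what i'Name: ${name!r}, anniversary: ${anniversary:%Y-%m-%d}.'
# would expand to under this proposal
template = InterpolationTemplate(
    "Name: ${name!r}, anniversary: ${anniversary:%Y-%m-%d}.",
    (
        ("Name: ", 0, "name", "r", ""),
        (", anniversary: ", 1, "anniversary", "", "%Y-%m-%d"),
        (".", None, None, None, None),
    ),
    (name, anniversary),
)
rendered = str(template)
```
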
Writing custom interpolators
----------------------------

Writing a custom interpolator doesn't require any special syntax. Instead,
custom interpolators are ordinary callables that process an interpolation
template directly based on the ``raw_template``, ``parsed_fields`` and
``field_values`` attributes, rather than relying on the default renderer.


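As a rough illustration of such a callable, the sketch below escapes every
substituted value for inclusion in HTML. The ``html`` function, the named
tuple standing in for the template type, and the decision to reject
conversion and format specifiers are all assumptions made for this example,
not part of the proposal.

```python
import html as html_lib
from collections import namedtuple

# Stand-in for the proposed type; supports the same three-way unpacking
InterpolationTemplate = namedtuple(
    "InterpolationTemplate", ["raw_template", "parsed_fields", "field_values"]
)


def html(template):
    """Hypothetical interpolator: HTML-escape every substituted value."""
    raw_template, fields, values = template
    parts = []
    for leading_text, field_num, expr, conversion, format_spec in fields:
        parts.append(leading_text)   # literal text is trusted as written
        if field_num is not None:
            if conversion or format_spec:
                raise ValueError("conversion and format specifiers are not supported")
            parts.append(html_lib.escape(str(values[field_num])))
    return "".join(parts)


body = "<script>alert('hi')</script>"
# Hand-built equivalent of html(i"<p>${body}</p>")
page = html(InterpolationTemplate(
    "<p>${body}</p>",
    (("<p>", 0, "body", "", ""), ("</p>", None, None, None, None)),
    (body,),
))
```

The literal parts of the template pass through untouched, while only the
interpolated values are escaped.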
Expression evaluation
---------------------

The subexpressions that are extracted from the interpolation expression are
evaluated in the context where the interpolation expression appears. This means
the expression has full access to local, nonlocal and global variables. Any
valid Python expression can be used inside ``${}``, including function and
method calls. References without the surrounding braces are limited to looking
up single identifiers.

Because the substitution expressions are evaluated where the string appears in
the source code, there are no additional security concerns related to the
contents of the expression itself, as you could have also just written the
same expression and used runtime field parsing::

    >>> bar=10
    >>> def foo(data):
    ...   return data + 20
    ...
    >>> str(i'input=$bar, output=${foo(bar)}')
    'input=10, output=30'

Is essentially equivalent to::

    >>> 'input={}, output={}'.format(bar, foo(bar))
    'input=10, output=30'

Handling code injection attacks
-------------------------------

The proposed interpolation syntax makes it potentially attractive to write
code like the following::

    myquery = str(i"SELECT $column FROM $table;")
    mycommand = str(i"cat $filename")
    mypage = str(i"<html><body>${response.body}</body></html>")

These all represent potential vectors for code injection attacks, if any of the
variables being interpolated happen to come from an untrusted source. The
specific proposal in this PEP is designed to make it straightforward to write
use case specific interpolators that take care of quoting interpolated values
appropriately for the relevant security context::

    myquery = sql(i"SELECT $column FROM $table;")
    mycommand = sh(i"cat $filename")
    mypage = html(i"<html><body>${response.body}</body></html>")

This PEP does not cover adding such interpolators to the standard library,
but instead ensures they can be readily provided by third party libraries.

(Although it's tempting to propose adding InterpolationTemplate support at
least to ``subprocess.call``, ``subprocess.check_call`` and
``subprocess.check_output``.)

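To make the idea concrete for the shell case, here is one possible shape for
an ``sh`` interpolator built on the standard library's ``shlex.quote``. This
is a sketch under stated assumptions: the function body is not defined by
this PEP, the named tuple merely stands in for the proposed template type,
and the template is constructed by hand.

```python
import shlex
from collections import namedtuple

# Stand-in for the proposed type; supports the same three-way unpacking
InterpolationTemplate = namedtuple(
    "InterpolationTemplate", ["raw_template", "parsed_fields", "field_values"]
)


def sh(template):
    """Hypothetical interpolator: shell-quote every substituted value."""
    raw_template, fields, values = template
    parts = []
    for leading_text, field_num, expr, conversion, format_spec in fields:
        parts.append(leading_text)
        if field_num is not None:
            # Each value becomes exactly one shell word
            parts.append(shlex.quote(str(values[field_num])))
    return "".join(parts)


filename = "notes.txt; rm -rf ~"   # hostile input
# Hand-built equivalent of sh(i"cat $filename")
command = sh(InterpolationTemplate(
    "cat $filename",
    (("cat ", 0, "filename", "", ""),),
    (filename,),
))
```

With the default ``str`` rendering the same input would yield two shell
commands; with the quoting interpolator the hostile text stays a single
argument to ``cat``.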
Format and conversion specifiers
--------------------------------

Aside from separating them out from the substitution expression, format and
conversion specifiers are otherwise treated as opaque strings by the
interpolation template parser - assigning semantics to those (or, alternatively,
prohibiting their use) is handled at runtime by the specified interpolator.

Error handling
--------------

Either compile time or run time errors can occur when processing interpolation
expressions. Compile time errors are limited to those errors that can be
detected when parsing a template string into its component tuples. These
errors all raise SyntaxError.

Unmatched braces::

    >>> i'x=${x'
      File "<stdin>", line 1
    SyntaxError: missing '}' in interpolation expression

Invalid expressions::

    >>> i'x=${!x}'
      File "<fstring>", line 1
        !x
        ^
    SyntaxError: invalid syntax

Run time errors occur when evaluating the expressions inside a
template string before creating the interpolation template object. See PEP 498
for some examples.

Different interpolators may also impose additional runtime
constraints on acceptable interpolated expressions and other formatting
details, which will be reported as runtime exceptions.


Internationalising interpolated strings
=======================================

Since this PEP derives its interpolation syntax from the internationalisation
focused PEP 292, it's worth considering the potential implications this PEP
may have for the internationalisation use case.

Internationalisation enters the picture by writing a custom interpolator that
performs internationalisation. For example, the following implementation
would delegate interpolation calls to ``string.Template``::

    def i18n(template):
        # A real implementation would also handle normal strings
        raw_template, fields, values = template
        translated = gettext.gettext(raw_template)
        value_map = _build_interpolation_map(fields, values)
        return string.Template(translated).safe_substitute(value_map)

    def _build_interpolation_map(fields, values):
        field_values = {}
        for literal_text, field_num, expr, conversion, format_spec in fields:
            if field_num is not None:
                # The trailing literal entry has no expression to check
                assert expr.isidentifier() and not conversion and not format_spec
                field_values[expr] = values[field_num]
        return field_values

And could then be invoked as::

    # _ = i18n at top of module or injected into the builtins module
    print(_(i"This is a $translated $message"))

Any actual i18n implementation would need to address other issues (most notably
message catalog extraction), but this gives the general idea of what might be
possible.

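The benefit of looking up the raw template is easiest to see when a
translation reorders the fields. The sketch below is a hedged, self-contained
variation on the example above: the in-memory ``_CATALOG`` dictionary is a
hypothetical stand-in for a real gettext catalog, the named tuple stands in
for the proposed template type, and the template is built by hand.

```python
import string
from collections import namedtuple

# Stand-in for the proposed type; supports the same three-way unpacking
InterpolationTemplate = namedtuple(
    "InterpolationTemplate", ["raw_template", "parsed_fields", "field_values"]
)

# Hypothetical catalog keyed by raw template text (stand-in for gettext)
_CATALOG = {
    "Hello $name, you have $count new messages":
        "$count new messages are waiting for $name",
}


def i18n(template):
    raw_template, fields, values = template
    translated = _CATALOG.get(raw_template, raw_template)
    value_map = {
        expr: values[field_num]
        for _, field_num, expr, _, _ in fields
        if field_num is not None
    }
    return string.Template(translated).safe_substitute(value_map)


# Hand-built equivalent of i18n(i"Hello $name, you have $count new messages")
message = i18n(InterpolationTemplate(
    "Hello $name, you have $count new messages",
    (
        ("Hello ", 0, "name", "", ""),
        (", you have ", 1, "count", "", ""),
        (" new messages", None, None, None, None),
    ),
    ("Ann", 3),
))
```

Because values are looked up by expression text rather than by position, the
translated string is free to mention ``$count`` before ``$name``.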
It's also worth noting that one of the benefits of the ``$`` based substitution
syntax in this PEP is its compatibility with Mozilla's
`l20n syntax <http://l20n.org/>`__, which uses ``{{ name }}`` for global
substitution, and ``{{ $user }}`` for local context substitution.

With the syntax in this PEP, an l20n interpolator could be written as::

    translated = l20n(i"{{ $user }} is running {{ appname }}")

With the syntax proposed in PEP 498 (and neglecting the difficulty of doing
catalog lookups using PEP 498's semantics), the necessary brace escaping would
make the string look like this in order to interpolate the user variable
while preserving all of the expected braces::

    locally_interpolated = f"{{{{ ${user} }}}} is running {{{{ appname }}}}"


Possible integration with the logging module
============================================

One of the challenges with the logging module has been that we have previously
been unable to devise a reasonable migration strategy away from the use of
printf-style formatting. The runtime parsing and interpolation overhead for
logging messages also poses a problem for extensive logging of runtime events
for monitoring purposes.

While beyond the scope of this initial PEP, interpolation template support
could potentially be added to the logging module's event reporting APIs,
permitting relevant details to be captured using forms like::

    logging.debug(i"Event: $event; Details: $data")
    logging.critical(i"Error: $error; Details: $data")

As the interpolation template is passed in as an ordinary argument, other
keyword arguments also remain available::

    logging.critical(i"Error: $error; Details: $data", exc_info=True)


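The deferred rendering that makes this attractive can already be observed
with any object whose ``__str__`` is expensive, because the logging module
only converts a non-string message to text when a record is actually
formatted. The sketch below uses a hypothetical ``CountingTemplate`` class as
a stand-in for an interpolation template; the logger name and messages are
likewise made up for illustration.

```python
import io
import logging


class CountingTemplate:
    """Stand-in for a template that records how often it is rendered."""

    def __init__(self, text):
        self.text = text
        self.render_count = 0

    def __str__(self):
        self.render_count += 1
        return self.text


stream = io.StringIO()
logger = logging.getLogger("pep501.demo")
logger.setLevel(logging.WARNING)
logger.addHandler(logging.StreamHandler(stream))
logger.propagate = False   # keep the demo output out of the root logger

debug_msg = CountingTemplate("Event: startup; Details: none")
error_msg = CountingTemplate("Error: disk full; Details: /var")

logger.debug(debug_msg)       # below the threshold, so never rendered
logger.critical(error_msg)    # rendered when the handler formats the record
```

Messages below the active level therefore cost only the creation of the
template object, not the formatting of its fields.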
Discussion
==========

Refer to PEP 498 for additional discussion, as several of the points there
also apply to this PEP.

Deferring support for binary interpolation
------------------------------------------

Supporting binary interpolation with this syntax would be relatively
straightforward (the elements in the parsed fields tuple would just be
byte strings rather than text strings, and the default renderer would be
markedly less useful), but poses a significant likelihood of producing
confusing type errors when a text interpolator was presented with
binary input.

Since the proposed operator is useful without binary interpolation support, and
such support can be readily added later, further consideration of binary
interpolation is considered out of scope for the current PEP.

Interoperability with str-only interfaces
-----------------------------------------

For interoperability with interfaces that only accept strings, interpolation
templates can be prerendered with ``str``, rather than delegating the rendering
to the called function.

This reflects the key difference from PEP 498, which *always* eagerly applies
the default rendering, without any convenient way to decide to do something
different.

Preserving the raw template string
----------------------------------

Earlier versions of this PEP failed to make the raw template string available
to interpolators. This greatly complicated the i18n example, as it needed to
reconstruct the original template to pass to the message catalog lookup.

Creating a rich object rather than a global name lookup
-------------------------------------------------------

Earlier versions of this PEP used an ``__interpolate__`` builtin, rather than
creating a new kind of object for later consumption by interpolation
functions. Creating a rich descriptive object with a useful default renderer
made it much easier to support customisation of the semantics of interpolation.

Relative order of conversion and format specifier in parsed fields
------------------------------------------------------------------

The relative order of the conversion specifier and the format specifier in the
substitution field 5-tuple is defined to match the order they appear in the
format string, which is unfortunately the inverse of the way they appear in the
``string.Formatter.parse`` 4-tuple.

I consider this a design defect in ``string.Formatter.parse``, so I think it's
worth fixing it for the custom interpolator API, since the tuple already
has other differences (like including both the field position number *and* the
text of the expression).

This PEP also makes the parsed field attributes available by name, so it's
possible to write interpolators without caring about the precise field order
at all.


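The ordering difference is easy to check against the existing standard
library API. The snippet below calls ``string.Formatter.parse`` on a
``str.format`` style field and then rearranges the result into the order
proposed here; the field position ``0`` and the variable names are filled in
by hand for illustration.

```python
import string

parsed = list(string.Formatter().parse("abc{expr!r:spec}"))

# string.Formatter.parse yields the format spec *before* the conversion
literal_text, field_name, format_spec, conversion = parsed[0]

# The proposed 5-tuple follows the order of the template text instead:
# leading text, field position, expression, conversion, format spec
proposed = (literal_text, 0, field_name, conversion, format_spec)
```
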
Acknowledgements
================

* Eric V. Smith for creating PEP 498 and demonstrating the feasibility of
  arbitrary expression substitution in string interpolation
* Barry Warsaw for the string.Template syntax defined in PEP 292
* Armin Ronacher for pointing me towards Mozilla's l20n project
* Mike Miller for his survey of programming language interpolation syntaxes in
  PEP (TBD)

References
==========

.. [#] %-formatting
       (https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting)

.. [#] str.format
       (https://docs.python.org/3/library/string.html#formatstrings)

.. [#] string.Template documentation
       (https://docs.python.org/3/library/string.html#template-strings)

.. [#] PEP 215: String Interpolation
       (https://www.python.org/dev/peps/pep-0215/)

.. [#] PEP 292: Simpler String Substitutions
       (https://www.python.org/dev/peps/pep-0292/)

.. [#] PEP 3101: Advanced String Formatting
       (https://www.python.org/dev/peps/pep-3101/)

.. [#] PEP 498: Literal string formatting
       (https://www.python.org/dev/peps/pep-0498/)

.. [#] string.Formatter.parse
       (https://docs.python.org/3/library/string.html#string.Formatter.parse)

Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: