PEP 501: Build on 498 instead of competing

Nick Coghlan 2015-08-30 13:44:31 +10:00
parent fbe3070944
commit 651a74028d
1 changed files with 286 additions and 279 deletions

@ -6,9 +6,10 @@ Author: Nick Coghlan <ncoghlan@gmail.com>
Status: Draft Status: Draft
Type: Standards Track Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Requires: 498
Created: 08-Aug-2015 Created: 08-Aug-2015
Python-Version: 3.6 Python-Version: 3.6
Post-History: 08-Aug-2015, 23-Aug-2015 Post-History: 08-Aug-2015, 23-Aug-2015, 30-Aug-2015
Abstract Abstract
======== ========
@ -16,44 +17,53 @@ Abstract
PEP 498 proposes new syntactic support for string interpolation that is PEP 498 proposes new syntactic support for string interpolation that is
transparent to the compiler, allowing name references from the interpolation transparent to the compiler, allowing name references from the interpolation
operation full access to containing namespaces (as with any other expression), operation full access to containing namespaces (as with any other expression),
rather than being limited to explicitly name references. rather than being limited to explicit name references. These are referred
to in the PEP as "f-strings" (a mnemonic for "formatted strings").
However, it only offers this capability for string formatting, making it likely However, it only offers this capability for string formatting, making it likely
we will see code like the following:: we will see code like the following::
os.system(f"echo {user_message}") os.system(f"echo {message_from_user}")
This kind of code is superficially elegant, but poses a significant problem This kind of code is superficially elegant, but poses a significant problem
if the interpolated value ``user_message`` is in fact provided by a user: it's if the interpolated value ``message_from_user`` is in fact provided by an
an opening for a form of code injection attack, where the supplied user data untrusted user: it's an opening for a form of code injection attack, where
has not been properly escaped before being passed to the ``os.system`` call. the supplied user data has not been properly escaped before being passed to
the ``os.system`` call.
To address that problem (and a number of other concerns), this PEP proposes an To address that problem (and a number of other concerns), this PEP proposes
alternative approach to compiler supported interpolation, using ``i`` (for the complementary introduction of "i-strings" (a mnemonic for "interpolation
"interpolation") as the new string prefix and a substitution syntax template strings"), where ``f"Message with {data}"`` would produce the same
inspired by that used in ``string.Template`` and ES6 JavaScript, rather than result as ``format(i"Message with {data}")``.
adding a 4th substitution variable syntax to Python.
Some possible examples of the proposed syntax:: Some possible examples of the proposed syntax::
msg = str(i'My age next year is ${age+1}, my anniversary is ${anniversary:%A, %B %d, %Y}.') mycommand = sh(i"cat {filename}")
print(_(i"This is a $translated $message")) myquery = sql(i"SELECT {column} FROM {table};")
translated = l20n(i"{{ $user }} is running {{ appname }}") myresponse = html(i"<html><body>{response.body}</body></html>")
myquery = sql(i"SELECT $column FROM $table;") logging.debug(i"Message with {detailed} {debugging} {info}")
mycommand = sh(i"cat $filename")
mypage = html(i"<html><body>${response.body}</body></html>")
callable = defer(i"$x + $y")
Summary of differences from PEP 498 Summary of differences from PEP 498
=================================== ===================================
The key differences of this proposal relative to PEP 498: The key additions this proposal makes relative to PEP 498:
* "i" (interpolation template) prefix rather than "f" (formatted string) * the "i" (interpolation template) prefix indicates delayed rendering, but
* string.Template/JavaScript inspired substitution syntax, rather than str.format/C# inspired otherwise uses the same syntax and semantics as formatted strings
* interpolation templates are created at runtime as a new kind of object * interpolation templates are available at runtime as a new kind of object
* the default rendering is invoked by calling ``str()`` on a template object (``types.InterpolationTemplate``)
rather than automatically * the default rendering used by formatted strings is invoked on an
interpolation template object by calling ``format(template)`` rather than
implicitly
* while f-string ``f"Message {here}"`` would be *semantically* equivalent to
``format(i"Message {here}")``, it is expected that the explicit syntax would
avoid the runtime overhead of using the delayed rendering machinery
NOTE: This proposal spells out a draft API for ``types.InterpolationTemplate``.
The precise details of the structures and methods exposed by this type would
be informed by the reference implementation of PEP 498, so it makes sense to
gain experience with that as an internal API before locking down a public API
(if this extension proposal is accepted).
Proposal Proposal
======== ========
@ -61,38 +71,39 @@ Proposal
This PEP proposes the introduction of a new string prefix that declares the This PEP proposes the introduction of a new string prefix that declares the
string to be an interpolation template rather than an ordinary string:: string to be an interpolation template rather than an ordinary string::
template = i"Substitute $names and ${expressions} at runtime" template = i"Substitute {names} and {expressions()} at runtime"
This would be effectively interpreted as:: This would be effectively interpreted as::
_raw_template = "Substitute $names and ${expressions} at runtime" _raw_template = "Substitute {names} and {expressions()} at runtime"
_parsed_fields = ( _parsed_template = (
("Substitute ", 0, "names", "", ""), ("Substitute ", "names"),
(" and ", 1, "expressions", "", ""), (" and ", "expressions()"),
(" at runtime", None, None, None, None), (" at runtime", None),
) )
_field_values = (names, expressions) _field_values = (names, expressions())
_format_specifiers = (f"", f"")
template = types.InterpolationTemplate(_raw_template, template = types.InterpolationTemplate(_raw_template,
_parsed_fields, _parsed_template,
_field_values) _field_values,
_format_specifiers)
The ``__str__`` method on ``types.InterpolationTemplate`` would then implementat The ``__format__`` method on ``types.InterpolationTemplate`` would then
the following ``str.format`` inspired semantics:: implement the following ``str.format`` inspired semantics::
>>> import datetime >>> import datetime
>>> name = 'Jane' >>> name = 'Jane'
>>> age = 50 >>> age = 50
>>> anniversary = datetime.date(1991, 10, 12) >>> anniversary = datetime.date(1991, 10, 12)
>>> str(i'My name is $name, my age next year is ${age+1}, my anniversary is ${anniversary:%A, %B %d, %Y}.') >>> format(i'My name is {name}, my age next year is {age+1}, my anniversary is {anniversary:%A, %B %d, %Y}.')
'My name is Jane, my age next year is 51, my anniversary is Saturday, October 12, 1991.' 'My name is Jane, my age next year is 51, my anniversary is Saturday, October 12, 1991.'
>>> str(i'She said her name is ${name!r}.') >>> format(i'She said her name is {repr(name)}.')
"She said her name is 'Jane'." "She said her name is 'Jane'."
The interpolation template prefix can be combined with single-quoted, As with formatted strings, the interpolation template prefix can be combined with single-quoted, double-quoted and triple quoted strings, including raw strings.
double-quoted and triple quoted strings, including raw strings. It does not It does not support combination with bytes literals.
support combination with bytes literals.
This PEP does not propose to remove or deprecate any of the existing Similarly, this PEP does not propose to remove or deprecate any of the existing
string formatting mechanisms, as those will remain valuable when formatting string formatting mechanisms, as those will remain valuable when formatting
strings that are not present directly in the source code of the application. strings that are not present directly in the source code of the application.
@ -105,38 +116,15 @@ lexical namespace semantics simpler, but it does so at the cost of creating a
situation where interpolating values into sensitive targets like SQL queries, situation where interpolating values into sensitive targets like SQL queries,
shell commands and HTML templates will enjoy a much cleaner syntax when handled shell commands and HTML templates will enjoy a much cleaner syntax when handled
without regard for code injection attacks than when they are handled correctly. without regard for code injection attacks than when they are handled correctly.
It also has the effect of introducing yet another syntax for substitution
expressions into Python, when we already have 3 (``str.format``,
``bytes.__mod__`` and ``string.Template``)
This PEP proposes to handle the former issue by deferring the actual rendering This PEP proposes to provide the option of delaying the actual rendering
of the interpolation template to its ``__str__`` method (allow the use of of an interpolation template to its ``__format__`` method, allowing the use of
other template renderers by passing the template around as an object), and the other template renderers by passing the template around as a first class object.
latter by adopting the ``string.Template`` substitution syntax defined in PEP
292.
The substitution syntax devised for PEP 292 is deliberately simple so that the While very different in the technical details, the
template strings can be extracted into an i18n message catalog, and passed to ``types.InterpolationTemplate`` interface proposed in this PEP is
translators who may not themselves be developers. For these use cases, it is conceptually quite similar to the ``FormattableString`` type underlying the
important that the interpolation syntax be as simple as possible, as the `native interpolation <https://msdn.microsoft.com/en-us/library/dn961160.aspx>`__ support introduced in C# 6.0.
translators are responsible for preserving the substitution markers, even as
they translate the surrounding text. The PEP 292 syntax is also a common message
catalog syntax already supported by many commercial software translation
support tools.
PEP 498 correctly points out that the PEP 292 syntax isn't as flexible as that
introduced for general purpose string formatting in PEP 3101, so this PEP adds
that flexibility to the ``${ref}`` construct in PEP 292, and allows translation
tools the option of rejecting usage of that more advanced syntax at runtime,
rather than categorically rejecting it at compile time. The proposed permitted
expressions, conversion specifiers, and format specifiers inside ``${ref}`` are
exactly as defined for ``{ref}`` substitution in PEP 498.
The specific proposal in this PEP is also deliberately close in both syntax
and semantics to the general purpose interpolation syntax introduced to
JavaScript in ES6, as we can reasonably expect a great many Python developers
to be regularly switching back and forth between user interface code written in
JavaScript and core application code written in Python.
Specification Specification
@ -150,141 +138,153 @@ Interpolation template literals are Unicode strings (bytes literals are not
permitted), and string literal concatenation operates as normal, with the permitted), and string literal concatenation operates as normal, with the
entire combined literal forming the interpolation template. entire combined literal forming the interpolation template.
The template string is parsed into literals and expressions. Expressions The template string is parsed into literals, expressions and format specifiers
appear as either identifiers prefixed with a single "$" character, or as described for f-strings in PEP 498. Conversion specifiers are handled
surrounded by a leading '${' and a trailing '}'. The parts of the format string by the compiler, and appear as part of the field text in interpolation
that are not expressions are separated out as string literals. templates.
While parsing the string, any doubled ``$$`` is replaced with a single ``$`` However, rather than being rendered directly into a formatted string, these
and is considered part of the literal text, rather than as introducing an components are instead organised into an instance of a new type with the
expression.
These components are then organised into an instance of a new type with the
following semantics:: following semantics::
class InterpolationTemplate: class InterpolationTemplate:
__slots__ = ("raw_template", "parsed_fields", "field_values") __slots__ = ("raw_template", "parsed_template",
"field_values", "format_specifiers")
def __new__(cls, raw_template, parsed_fields, field_values): def __new__(cls, raw_template, parsed_template,
field_values, format_specifiers):
self = super().__new__(cls) self = super().__new__(cls)
self.raw_template = raw_template self.raw_template = raw_template
self.parsed_fields = parsed_fields self.parsed_template = parsed_template
self.field_values = field_values self.field_values = field_values
self.format_specifiers = format_specifiers
return self return self
def __iter__(self):
# Support iterable unpacking
yield self.raw_template
yield self.parsed_fields
yield self.field_values
def __repr__(self): def __repr__(self):
return str(i"<${type(self).__qualname__} ${self.raw_template!r} " return (f"<{type(self).__qualname__} {repr(self.raw_template)} "
"at ${id(self):#x}>") f"at {id(self):#x}>")
def __str__(self): def __format__(self, format_specifier):
# See definition of the default template rendering below # When formatted, render to a string, and use string formatting
return format(self.render(), format_specifier)
The result of the interpolation template expression is an instance of this def render(self, *, render_template=''.join,
type, rather than an already rendered string - default rendering only takes render_field=format):
place when the instance's ``__str__`` method is called. # See definition of the template rendering semantics below
The format of the parsed fields tuple is inspired by the interface of The result of an interpolation template expression is an instance of this
``string.Formatter.parse``, and consists of a series of 5-tuples each type, rather than an already rendered string - rendering only takes
containing: place when the instance's ``render`` method is called (either directly, or
indirectly via ``__format__``).
* a leading string literal (may be the empty string) The compiler will pass the following details to the interpolation template for
* the substitution field position (zero-based enumeration) later use:
* the substitution expression text
* the substitution conversion specifier (as defined by str.format)
* the substitution format specifier (as defined by str.format)
This field ordering is defined such that reading the parsed field tuples from * a string containing the raw template as written in the source code
left to right will have all the subcomponents displayed in the same order as * a parsed template tuple that allows the renderer to render the
they appear in the original template string. template without needing to reparse the raw string template for substitution
fields
* a tuple containing the evaluated field values, in field substitution order
* a tuple containing the field format specifiers, in field substitution order
For ease of access the sequence elements will be available as attributes in This structure is designed to take full advantage of compile time constant
addition to being available by position: folding by ensuring the parsed template is always constant, even when the
field values and format specifiers include variable substitution expressions.
* ``leading_text`` The raw template is just the interpolation template as a string. By default,
* ``field_position`` it is used to provide a human readable representation for the interpolation
* ``expression`` template.
* ``conversion``
* ``format``
The expression text is simply the text of the substitution expression, as it The parsed template consists of a tuple of 2-tuples, with each 2-tuple
appeared in the original string, but without the leading and/or surrounding containing the following fields:
expression markers. The conversion specifier and format specifier are separated
from the substitution expression by ``!`` and ``:`` as defined for ``str.format``.
If a given substitution field has no leading literal section, conversion specifier * ``leading_text``: a leading string literal. This will be the empty string if
or format specifier, then the corresponding elements in the tuple are the the current field is at the start of the string, or immediately follows the
empty string. If the final part of the string has no trailing substitution preceding field.
field, then the field position, field expression, conversion specifier and * ``field_expr``: the text of the expression element in the substitution field.
format specifier will all be ``None``. This will be None for a final trailing text segment.
The substitution field values tuple is created by evaluating the interpolated The tuple of evaluated field values holds the *results* of evaluating the
expressions in the exact runtime context where the interpolation expression substitution expressions in the scope where the interpolation template appears.
appears in the source code.
For the following example interpolation template:: The tuple of field specifiers holds the *results* of evaluating the field
specifiers as f-strings in the scope where the interpolation template appears.
i'abc${expr1:spec1}${expr2!r:spec2}def${expr3!s}ghi $ident $$jkl' The ``InterpolationTemplate.render`` implementation then defines the rendering
process in terms of the following renderers:
the parsed fields tuple would be:: * an overall ``render_template`` operation that defines how the sequence of
literal template sections and rendered fields are composed into a fully
rendered result. The default template renderer is string concatenation
using ``''.join``.
* a per field ``render_field`` operation that receives the field value and
format specifier for substitution fields within the template. The default
field renderer is the ``format`` builtin.
( Given an appropriate parsed template representation and internal methods of
('abc', 0, 'expr1', '', 'spec1'), iterating over it, the semantics of template rendering would then be equivalent
('', 1, 'expr2', 'r', 'spec2'), to the following::
('def', 2, 'expr3', 's', ''), to the following::
('ghi ', 3, 'ident', '', ''),
(' $jkl', None, None, None, None)
)
While the field values tuple would be:: def render(self, *, render_template=''.join,
render_field=format):
(expr1, expr2, expr3, ident) iter_fields = enumerate(self.parsed_template)
values = self.field_values
The parsed fields tuple can be constant folded at compile time, while the specifiers = self.format_specifiers
expression values tuple will always need to be constructed at runtime.
The ``InterpolationTemplate.__str__`` implementation would have the following
semantics, with field processing being defined in terms of the ``format``
builtin and ``str.format`` conversion specifiers::
_converter = string.Formatter().convert_field
def __str__(self):
raw_template, fields, values = self
template_parts = [] template_parts = []
for leading_text, field_num, expr, conversion, format_spec in fields: for field_pos, (leading_text, field_expr) in iter_fields:
template_parts.append(leading_text) template_parts.append(leading_text)
if field_num is not None: if field_expr is not None:
value = values[field_num] value = values[field_pos]
if conversion: specifier = specifiers[field_pos]
value = _converter(value, conversion) rendered_field = render_field(value, specifier)
field_text = format(value, format_spec) template_parts.append(rendered_field)
template_parts.append(field_text) return render_template(template_parts)
return "".join(template_parts)
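The rendering semantics above can be tried out today with a plain Python class. The following is only a sketch: the ``i"..."`` prefix does not exist, so the template is constructed by hand from the values the compiler is described as passing in, and the class is a stand-in for the proposed ``types.InterpolationTemplate`` rather than a reference implementation.

```python
import datetime


class InterpolationTemplate:
    """Hand-written stand-in for the proposed types.InterpolationTemplate."""

    __slots__ = ("raw_template", "parsed_template",
                 "field_values", "format_specifiers")

    def __init__(self, raw_template, parsed_template,
                 field_values, format_specifiers):
        self.raw_template = raw_template
        self.parsed_template = parsed_template
        self.field_values = field_values
        self.format_specifiers = format_specifiers

    def __repr__(self):
        return (f"<{type(self).__qualname__} {self.raw_template!r} "
                f"at {id(self):#x}>")

    def __format__(self, format_specifier):
        # Render to a string first, then apply ordinary string formatting
        return format(self.render(), format_specifier)

    def render(self, *, render_template="".join, render_field=format):
        parts = []
        for pos, (leading_text, field_expr) in enumerate(self.parsed_template):
            parts.append(leading_text)
            if field_expr is not None:
                parts.append(render_field(self.field_values[pos],
                                          self.format_specifiers[pos]))
        return render_template(parts)


name = "Jane"
age = 50
anniversary = datetime.date(1991, 10, 12)

# Assumed compiler output for the i-string shown in the Proposal section
template = InterpolationTemplate(
    "My name is {name}, my age next year is {age+1}, "
    "my anniversary is {anniversary:%A, %B %d, %Y}.",
    (("My name is ", "name"),
     (", my age next year is ", "age+1"),
     (", my anniversary is ", "anniversary"),
     (".", None)),
    (name, age + 1, anniversary),
    ("", "", "%A, %B %d, %Y"),
)
rendered = format(template)
```

With the default (C) locale, ``rendered`` matches the output shown in the Proposal section, which is also what the equivalent f-string would produce.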
Writing custom interpolators Conversion specifiers
---------------------------- ---------------------
Writing a custom interpolator doesn't require any special syntax. Instead, The ``!a``, ``!r`` and ``!s`` conversion specifiers supported by ``str.format``
custom interpolators are ordinary callables that process an interpolation and hence PEP 498 are handled in interpolation templates as follows:
template directly based on the ``raw_template``, ``parsed_fields`` and
``field_values`` attributes, rather than relying on the default renderer. * they're included unmodified in the raw template to ensure no information is
lost
* they're *replaced* in the parsed template with the corresponding builtin
calls, in order to ensure that ``field_expr`` always contains a valid
Python expression
* the corresponding field value placed in the field values tuple is
converted appropriately *before* being passed to the interpolation
template
This means that, for most purposes, the difference between the use of
conversion specifiers and calling the corresponding builtins in the
original interpolation template will be transparent to custom renderers. The
difference will only be apparent if reparsing the raw template, or attempting
to reconstruct the original template from the parsed template.
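As an illustration of the three rules above, the following sketch spells out what the compiler might hand over for ``i'She said her name is {name!r}.'``. The tuples are written by hand and the ``render`` helper is a hypothetical free function that mirrors the default rendering; none of this is generated by a real compiler.

```python
name = "Jane"

# Raw template: the conversion specifier is kept exactly as written
raw_template = "She said her name is {name!r}."

# Parsed template: "!r" is replaced by the equivalent builtin call, so the
# field expression text is always a valid Python expression
parsed_template = (
    ("She said her name is ", "repr(name)"),
    (".", None),
)

# Field values: the conversion is applied before the value is stored
field_values = (repr(name),)
format_specifiers = ("",)


def render(parsed_template, field_values, format_specifiers):
    # Default rendering: format() per field, then plain concatenation
    parts = []
    for pos, (leading_text, field_expr) in enumerate(parsed_template):
        parts.append(leading_text)
        if field_expr is not None:
            parts.append(format(field_values[pos], format_specifiers[pos]))
    return "".join(parts)


result = render(parsed_template, field_values, format_specifiers)
```

A renderer that only looks at ``parsed_template`` and ``field_values`` cannot tell this apart from a template written with ``{repr(name)}``; only ``raw_template`` preserves the difference.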
Writing custom renderers
------------------------
Writing a custom renderer doesn't require any special syntax. Instead,
custom renderers are ordinary callables that process an interpolation
template directly either by calling the ``render()`` method with alternate ``render_template`` or ``render_field`` implementations, or by accessing the
template's data attributes directly.
For example, the following function would render a template using objects'
``repr`` implementations rather than their native formatting support::
def reprformat(template):
def render_field(value, specifier):
return format(repr(value), specifier)
return template.render(render_field=render_field)
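To see ``reprformat`` in action without i-string syntax, the sketch below pairs it with a minimal hand-built template. ``TemplateStandIn`` is a hypothetical class that only provides the ``render()`` signature described in this PEP; the template contents are an assumed equivalent of ``i"name={name:>8}, age={age}"``.

```python
class TemplateStandIn:
    """Hypothetical stand-in exposing only the proposed render() API."""

    def __init__(self, parsed_template, field_values, format_specifiers):
        self.parsed_template = parsed_template
        self.field_values = field_values
        self.format_specifiers = format_specifiers

    def render(self, *, render_template="".join, render_field=format):
        parts = []
        for pos, (leading_text, field_expr) in enumerate(self.parsed_template):
            parts.append(leading_text)
            if field_expr is not None:
                parts.append(render_field(self.field_values[pos],
                                          self.format_specifiers[pos]))
        return render_template(parts)


def reprformat(template):
    # Same custom renderer as in the text: format the repr of each value
    def render_field(value, specifier):
        return format(repr(value), specifier)
    return template.render(render_field=render_field)


name, age = "Jane", 51
template = TemplateStandIn(
    (("name=", "name"), (", age=", "age")),
    (name, age),
    (">8", ""),
)

default_text = template.render()   # native formatting of each value
repr_text = reprformat(template)   # repr() of each value, then formatting
```

The format specifier is still honoured in both cases; only the value it is applied to changes.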
Expression evaluation Expression evaluation
--------------------- ---------------------
The subexpressions that are extracted from the interpolation expression are As with f-strings, the subexpressions that are extracted from the interpolation
evaluated in the context where the interpolation expression appears. This means template are evaluated in the context where the interpolation template
the expression has full access to local, nonlocal and global variables. Any appears. This means the expression has full access to local, nonlocal and global variables. Any valid Python expression can be used inside ``{}``, including
valid Python expression can be used inside ``${}``, including function and function and method calls.
method calls. References without the surrounding braces are limited to looking
up single identifiers.
Because the substitution expressions are evaluated where the string appears in Because the substitution expressions are evaluated where the string appears in
the source code, there are no additional security concerns related to the the source code, there are no additional security concerns related to the
@ -295,7 +295,7 @@ same expression and used runtime field parsing::
>>> def foo(data): >>> def foo(data):
... return data + 20 ... return data + 20
... ...
>>> str(i'input=$bar, output=${foo(bar)}') >>> format(i'input={bar}, output={foo(bar)}')
'input=10, output=30' 'input=10, output=30'
Is essentially equivalent to:: Is essentially equivalent to::
@ -306,37 +306,44 @@ Is essentially equivalent to::
Handling code injection attacks Handling code injection attacks
------------------------------- -------------------------------
The proposed interpolation syntax makes it potentially attractive to write The PEP 498 formatted string syntax makes it potentially attractive to write
code like the following:: code like the following::
myquery = str(i"SELECT $column FROM $table;") runquery(f"SELECT {column} FROM {table};")
mycommand = str(i"cat $filename") runcommand(f"cat {filename}")
mypage = str(i"<html><body>${response.body}</body></html>") return_response(f"<html><body>{response.body}</body></html>")
These all represent potential vectors for code injection attacks, if any of the These all represent potential vectors for code injection attacks, if any of the
variables being interpolated happen to come from an untrusted source. The variables being interpolated happen to come from an untrusted source. The
specific proposal in this PEP is designed to make it straightforward to write specific proposal in this PEP is designed to make it straightforward to write
use case specific interpolators that take care of quoting interpolated values use case specific renderers that take care of quoting interpolated values
appropriately for the relevant security context:: appropriately for the relevant security context::
myquery = sql(i"SELECT $column FROM $table;") runquery(sql(i"SELECT {column} FROM {table};"))
mycommand = sh(i"cat $filename") runcommand(sh(i"cat {filename}"))
mypage = html(i"<html><body>${response.body}</body></html>") return_response(html(i"<html><body>{response.body}</body></html>"))
This PEP does not cover adding such interpolators to the standard library, This PEP does not cover adding such renderers to the standard library
but instead ensures they can be readily provided by third party libraries. immediately, but rather proposes to ensure that they can be readily provided by
third party libraries, and potentially incorporated into the standard library
at a later date.
(Although it's tempting to propose adding InterpolationTemplate support at For example, a renderer that aimed to offer a POSIX shell style experience for
least to ``subprocess.call``, ``subprocess.check_call`` and accessing external programs, without the significant risks posed by running
``subprocess.check_output``) ``os.system`` or enabling the system shell when using the ``subprocess`` module
APIs, might provide an interface for running external programs similar to that
offered by the
`Julia programming language <http://julia.readthedocs.org/en/latest/manual/running-external-programs/>`__,
only with the backtick based ``\`cat $filename\``` syntax replaced by
``i"cat {filename}"`` style interpolation templates.
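A very small version of such a renderer might look like the following sketch. It assumes the ``render()`` API proposed in this PEP, uses a hypothetical ``TemplateStandIn`` class in place of a real interpolation template, and quotes every field with ``shlex.quote`` from the standard library. It only builds a quoted command string and does not attempt to reproduce Julia's command objects.

```python
import shlex


class TemplateStandIn:
    """Hypothetical stand-in exposing only the proposed render() API."""

    def __init__(self, parsed_template, field_values, format_specifiers):
        self.parsed_template = parsed_template
        self.field_values = field_values
        self.format_specifiers = format_specifiers

    def render(self, *, render_template="".join, render_field=format):
        parts = []
        for pos, (leading_text, field_expr) in enumerate(self.parsed_template):
            parts.append(leading_text)
            if field_expr is not None:
                parts.append(render_field(self.field_values[pos],
                                          self.format_specifiers[pos]))
        return render_template(parts)


def sh(template):
    # Literal template text is trusted; every interpolated value is quoted
    def render_field(value, specifier):
        return shlex.quote(format(value, specifier))
    return template.render(render_field=render_field)


# Assumed equivalent of sh(i"cat {filename}") with a hostile filename
filename = "notes.txt; rm -rf ~"
template = TemplateStandIn((("cat ", "filename"),), (filename,), ("",))
command = sh(template)
```

Because the quoting happens per field, the hostile value stays a single argument to ``cat`` instead of being interpreted as a second shell command.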
Format and conversion specifiers Format specifiers
-------------------------------- -----------------
Aside from separating them out from the substitution expression, format and Aside from separating them out from the substitution expression during parsing,
conversion specifiers are otherwise treated as opaque strings by the format specifiers are otherwise treated as opaque strings by the interpolation
interpolation template parser - assigning semantics to those (or, alternatively, template parser - assigning semantics to those (or, alternatively,
prohibiting their use) is handled at runtime by the specified interpolator. prohibiting their use) is handled at runtime by the field renderer.
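For instance, a field renderer that prohibits format specifiers altogether could be sketched as below. Both ``plain_fields_only`` and the free-standing ``render`` helper are hypothetical names used for illustration; the helper simply mirrors the default rendering loop described earlier.

```python
def plain_fields_only(value, specifier):
    # Reject any non-empty format specifier at render time
    if specifier:
        raise ValueError(
            f"format specifiers are not supported here: {specifier!r}")
    return str(value)


def render(parsed_template, field_values, format_specifiers,
           render_field=format):
    parts = []
    for pos, (leading_text, field_expr) in enumerate(parsed_template):
        parts.append(leading_text)
        if field_expr is not None:
            parts.append(render_field(field_values[pos],
                                      format_specifiers[pos]))
    return "".join(parts)


parsed = (("Total: ", "total"),)

# No specifier: renders normally
accepted = render(parsed, (42,), ("",), render_field=plain_fields_only)

# With a specifier: the renderer refuses at runtime
try:
    render(parsed, (42,), (">10",), render_field=plain_fields_only)
except ValueError as exc:
    rejected = str(exc)
else:
    rejected = None
```

The template itself is created successfully in both cases; the restriction only takes effect when this particular renderer is applied.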
Error handling Error handling
-------------- --------------
@ -348,13 +355,13 @@ errors all raise SyntaxError.
Unmatched braces:: Unmatched braces::
>>> i'x=${x' >>> i'x={x'
File "<stdin>", line 1 File "<stdin>", line 1
SyntaxError: missing '}' in interpolation expression SyntaxError: missing '}' in interpolation expression
Invalid expressions:: Invalid expressions::
>>> i'x=${!x}' >>> i'x={!x}'
File "<fstring>", line 1 File "<fstring>", line 1
!x !x
^ ^
@ -364,68 +371,16 @@ Run time errors occur when evaluating the expressions inside a
template string before creating the interpolation template object. See PEP 498 template string before creating the interpolation template object. See PEP 498
for some examples. for some examples.
Different interpolators may also impose additional runtime Different renderers may also impose additional runtime
constraints on acceptable interpolated expressions and other formatting constraints on acceptable interpolated expressions and other formatting
details, which will be reported as runtime exceptions. details, which will be reported as runtime exceptions.
Internationalising interpolated strings
=======================================
Since this PEP derives its interpolation syntax from the internationalisation
focused PEP 292, it's worth considering the potential implications this PEP
may have for the internationalisation use case.
Internationalisation enters the picture by writing a custom interpolator that
performs internationalisation. For example, the following implementation
would delegate interpolation calls to ``string.Template``::
def i18n(template):
# A real implementation would also handle normal strings
raw_template, fields, values = template
translated = gettext.gettext(raw_template)
value_map = _build_interpolation_map(fields, values)
return string.Template(translated).safe_substitute(value_map)
def _build_interpolation_map(fields, values):
field_values = {}
for literal_text, field_num, expr, conversion, format_spec in fields:
assert expr.isidentifier() and not conversion and not format_spec
if field_num is not None:
field_values[expr] = values[field_num]
return field_values
And could then be invoked as::
# _ = i18n at top of module or injected into the builtins module
print(_(i"This is a $translated $message"))
Any actual i18n implementation would need to address other issues (most notably
message catalog extraction), but this gives the general idea of what might be
possible.
It's also worth noting that one of the benefits of the ``$`` based substitution
syntax in this PEP is its compatibility with Mozilla's
`l20n syntax <http://l20n.org/>`__, which uses ``{{ name }}`` for global
substitution, and ``{{ $user }}`` for local context substitution.
With the syntax in this PEP, an l20n interpolator could be written as::
translated = l20n(i"{{ $user }} is running {{ appname }}")
With the syntax proposed in PEP 498 (and neglecting the difficulty of doing
catalog lookups using PEP 498's semantics), the necessary brace escaping would
make the string look like this in order to interpolate the user variable
while preserving all of the expected braces::
locally_interpolated = f"{{{{ ${user} }}}} is running {{{{ appname }}}}"
Possible integration with the logging module Possible integration with the logging module
============================================ ============================================
One of the challenges with the logging module has been that previously been One of the challenges with the logging module has been that we have previously
unable to devise a reasonable migration strategy away from the use of been unable to devise a reasonable migration strategy away from the use of
printf-style formatting. The runtime parsing and interpolation overhead for printf-style formatting. The runtime parsing and interpolation overhead for
logging messages also poses a problem for extensive logging of runtime events logging messages also poses a problem for extensive logging of runtime events
for monitoring purposes. for monitoring purposes.
@ -434,13 +389,41 @@ While beyond the scope of this initial PEP, interpolation template support
could potentially be added to the logging module's event reporting APIs, could potentially be added to the logging module's event reporting APIs,
permitting relevant details to be captured using forms like:: permitting relevant details to be captured using forms like::
logging.debug(i"Event: $event; Details: $data") logging.debug(i"Event: {event}; Details: {data}")
logging.critical(i"Error: $error; Details: $data") logging.critical(i"Error: {error}; Details: {data}")
Rather than the current mod-formatting style::
logging.debug("Event: %s; Details: %s", event, data)
logging.critical("Error: %s; Details: %s", error, data)
As the interpolation template is passed in as an ordinary argument, other As the interpolation template is passed in as an ordinary argument, other
keyword arguments also remain available:: keyword arguments would also remain available::
logging.critical(i"Error: {error}; Details: {data}", exc_info=True)
As part of any such integration, a recommended approach would need to be
defined for "lazy evaluation" of interpolated fields, as the ``logging``
module's existing delayed interpolation support provides access to
`various attributes <https://docs.python.org/3/library/logging.html#logrecord-attributes>`__ of the event ``LogRecord`` instance.
For example, since interpolation expressions are arbitrary Python expressions,
string literals could be used to indicate cases where evaluation itself is
being deferred, not just rendering::
logging.debug(i"Logger: {'record.name'}; Event: {event}; Details: {data}")
This could be further extended with idioms like using inline tuples to indicate
deferred function calls to be made only if the log message is actually
going to be rendered at current logging levels::
logging.debug(i"Event: {event}; Details: {expensive_call, raw_data}")
This kind of approach would be possible because access to the actual *text*
of the field expression would allow the logging renderer to distinguish
between inline tuples that appear in the field expression itself and tuples
that happen to be passed in as data values in a normal field.
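
The deferred function call idea can be approximated in current Python without any new syntax. The sketch below is not the mechanism proposed here; it assumes a hypothetical ``Deferred`` wrapper whose ``__str__`` performs the call, so the function only runs if the message is actually rendered. The names ``expensive_call`` and ``raw_data`` mirror the example above, while the logger name is an assumption.

```python
import io
import logging


class Deferred:
    """Hypothetical helper: calls func(*args) only when rendered to text."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args

    def __str__(self):
        return str(self.func(*self.args))


calls = []


def expensive_call(raw_data):
    # Record each invocation so the deferral is observable.
    calls.append(raw_data)
    return sorted(raw_data)


stream = io.StringIO()
logger = logging.getLogger("pep501.deferred")  # hypothetical logger name
logger.setLevel(logging.WARNING)
logger.propagate = False
logger.addHandler(logging.StreamHandler(stream))

raw_data = [3, 1, 2]

# Suppressed at the current level: expensive_call is never invoked.
logger.debug("Details: %s", Deferred(expensive_call, raw_data))
calls_after_debug = len(calls)

# Emitted: expensive_call runs once, when the handler renders the message.
logger.warning("Details: %s", Deferred(expensive_call, raw_data))
rendered = stream.getvalue()
```

The wrapper has to be written out at every call site, which is the verbosity that an inline tuple convention in the field expression would aim to remove.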
logging.critical(i"Error: $error; Details: $data", exc_info=True)
Discussion Discussion
========== ==========
@ -455,10 +438,10 @@ Supporting binary interpolation with this syntax would be relatively
straightforward (the elements in the parsed fields tuple would just be straightforward (the elements in the parsed fields tuple would just be
byte strings rather than text strings, and the default renderer would be byte strings rather than text strings, and the default renderer would be
markedly less useful), but poses a significant likelihood of producing markedly less useful), but poses a significant likelihood of producing
confusing type errors when a text interpolator was presented with confusing type errors when a text renderer was presented with
binary input. binary input.
Since the proposed operator is useful without binary interpolation support, and Since the proposed syntax is useful without binary interpolation support, and
such support can be readily added later, further consideration of binary such support can be readily added later, further consideration of binary
interpolation is considered out of scope for the current PEP. interpolation is considered out of scope for the current PEP.
@ -466,19 +449,21 @@ Interoperability with str-only interfaces
----------------------------------------- -----------------------------------------
For interoperability with interfaces that only accept strings, interpolation For interoperability with interfaces that only accept strings, interpolation
templates can be prerendered with ``str``, rather than delegating the rendering templates can still be prerendered with ``format``, rather than delegating the
to the called function. rendering to the called function.
This reflects the key difference from PEP 498, which *always* eagerly applies This reflects the key difference from PEP 498, which *always* eagerly applies
the default rendering, without any convenient way to decide to do something the default rendering, without any convenient way to delegate that choice to
different. another section of the code.
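
The difference between prerendering and delegated rendering can be sketched with a stand-in object. Everything below is an assumption made for illustration: ``SketchTemplate``, its ``render`` method, and ``html_renderer`` are hypothetical names, and the class is built by hand rather than by any new literal syntax.

```python
import html


class SketchTemplate:
    """Hypothetical stand-in for an interpolation template object."""

    def __init__(self, raw_template, **values):
        self.raw_template = raw_template
        self.values = values

    def render(self, convert=str):
        # Apply the caller's conversion to each value before substitution.
        converted = {key: convert(value) for key, value in self.values.items()}
        return self.raw_template.format(**converted)

    def __format__(self, spec):
        # format(template) applies the default rendering.
        return format(self.render(), spec)


def html_renderer(template):
    # A called function that chooses its own rendering (HTML escaping).
    return template.render(lambda value: html.escape(str(value)))


template = SketchTemplate("Hello {name}", name="<b>World</b>")

# Prerendered by the caller for a str-only interface.
prerendered = format(template)

# Rendering delegated to the function that receives the template.
delegated = html_renderer(template)
```

Here ``prerendered`` keeps the markup as-is, while ``delegated`` is escaped by the receiving function. An eagerly rendered string gives the receiving function no opportunity to make that choice.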
Preserving the raw template string Preserving the raw template string
---------------------------------- ----------------------------------
Earlier versions of this PEP failed to make the raw template string available Earlier versions of this PEP failed to make the raw template string available
to interpolators. This greatly complicated the i18n example, as it needed to on the interpolation template. Retaining it makes it possible to provide a more
reconstruct the original template to pass to the message catalog lookup. attractive template representation, as well as providing the ability to
precisely reconstruct the original string, including both the expression text
and the details of any eagerly rendered substitution fields in format specifiers.
Creating a rich object rather than a global name lookup Creating a rich object rather than a global name lookup
------------------------------------------------------- -------------------------------------------------------
@ -488,33 +473,52 @@ a creating a new kind of object for later consumption by interpolation
functions. Creating a rich descriptive object with a useful default renderer functions. Creating a rich descriptive object with a useful default renderer
made it much easier to support customisation of the semantics of interpolation. made it much easier to support customisation of the semantics of interpolation.
Relative order of conversion and format specifier in parsed fields Building atop PEP 498, rather than competing with it
------------------------------------------------------------------ ----------------------------------------------------
The relative order of the conversion specifier and the format specifier in the Earlier versions of this PEP attempted to serve as a complete substitute for
substitution field 5-tuple is defined to match the order they appear in the PEP 498, rather than building a more flexible delayed rendering capability on
format string, which is unfortunately the inverse of the way they appear in the top of PEP 498's eager rendering.
``string.Formatter.parse`` 4-tuple.
I consider this a design defect in ``string.Formatter.parse``, so I think it's Assuming the presence of f-strings as a supporting capability simplified a
worth fixing it in for the customer interpolator API, since the tuple already number of aspects of the proposal in this PEP (such as how to handle substitution
has other differences (like including both the field position number *and* the fields in format specifiers).
text of the expression).
This PEP also makes the parsed field attributes available by name, so it's Deferring consideration of possible use in i18n use cases
possible to write interpolators without caring about the precise field order ---------------------------------------------------------
at all.
The initial motivating use case for this PEP was providing a cleaner syntax
for i18n translation, as that requires access to the original unmodified
template. As such, it focused on compatibility with the substitution syntax used
in Python's ``string.Template`` formatting and Mozilla's l20n project.
However, subsequent discussion revealed there are significant additional
considerations to be taken into account in the i18n use case, which don't
impact the simpler cases of handling interpolation into security sensitive
contexts (like HTML, system shells, and database queries), or producing
application debugging messages in the preferred language of the development
team (rather than the native language of end users).
Due to the original design of the ``str.format`` substitution syntax in PEP
3101 being inspired by C#'s string formatting syntax, the specific field
substitution syntax used in PEP 498 is consistent not only with Python's own ``str.format`` syntax, but also with string formatting in C#, including the
native "$-string" interpolation syntax introduced in C# 6.0 (released in July
2015). This means that while this particular substitution syntax may not
currently be widely used for translation of *Python* applications (losing out
to traditional %-formatting and the designed-specifically-for-i18n
``string.Template`` formatting), it *is* a popular translation format in the
wider software development ecosystem (since it is already the preferred
format for translating C# applications).
Acknowledgements Acknowledgements
================ ================
* Eric V. Smith for creating PEP 498 and demonstrating the feasibility of * Eric V. Smith for creating PEP 498 and demonstrating the feasibility of
arbitrary expression substitution in string interpolation arbitrary expression substitution in string interpolation
* Barry Warsaw for the string.Template syntax defined in PEP 292 * Barry Warsaw, Armin Ronacher, and Mike Miller for their contributions to
* Armin Ronacher for pointing me towards Mozilla's l20n project exploring the feasibility of using this model of delayed rendering in i18n
* Mike Miller for his survey of programming language interpolation syntaxes in use cases (even though the ultimate conclusion was that it was a poor fit,
PEP (TBD) at least for current approaches to i18n in Python)
References References
========== ==========
@ -540,8 +544,11 @@ References
.. [#] PEP 498: Literal string formatting .. [#] PEP 498: Literal string formatting
(https://www.python.org/dev/peps/pep-0498/) (https://www.python.org/dev/peps/pep-0498/)
.. [#] string.Formatter.parse .. [#] FormattableString and C# native string interpolation
(https://docs.python.org/3/library/string.html#string.Formatter.parse) (https://msdn.microsoft.com/en-us/library/dn961160.aspx)
.. [#] Running external commands in Julia
(http://julia.readthedocs.org/en/latest/manual/running-external-programs/)
Copyright Copyright
========= =========