807 lines
34 KiB
ReStructuredText
807 lines
34 KiB
ReStructuredText
PEP: 501
|
|
Title: General purpose template literal strings
|
|
Author: Alyssa Coghlan <ncoghlan@gmail.com>, Nick Humrich <nick@humrich.us>
|
|
Discussions-To: https://discuss.python.org/t/pep-501-reopen-general-purpose-string-template-literals/24625
|
|
Status: Draft
|
|
Type: Standards Track
|
|
Content-Type: text/x-rst
|
|
Requires: 701
|
|
Created: 08-Aug-2015
|
|
Python-Version: 3.12
|
|
Post-History: `08-Aug-2015 <https://mail.python.org/archives/list/python-dev@python.org/thread/EAZ3P2M3CDDIQFR764NF6FXQHWXYMKJF/>`__,
|
|
`05-Sep-2015 <https://mail.python.org/archives/list/python-dev@python.org/thread/ILVRPS6DTFZ7IHL5HONDBB6INVXTFOZ2/>`__,
|
|
`09-Mar-2023 <https://discuss.python.org/t/pep-501-reopen-general-purpose-string-template-literals/24625>`__,
|
|
|
|
Abstract
|
|
========
|
|
|
|
Though easy and elegant to use, Python :term:`f-string`\s
|
|
can be vulnerable to injection attacks when used to construct
|
|
shell commands, SQL queries, HTML snippets and similar
|
|
(for example, ``os.system(f"echo {message_from_user}")``).
|
|
This PEP introduces template literal strings (or "t-strings"),
|
|
which have the same syntax and semantics but with rendering deferred
|
|
until :func:`format` or another builder function is called on them.
|
|
This will allow standard library calls, helper functions
|
|
and third party tools to safety and intelligently perform
|
|
appropriate escaping and other string processing on inputs
|
|
while retaining the usability and convenience of f-strings.
|
|
|
|
|
|
Motivation
|
|
==========
|
|
:pep:`498` added new syntactic support for string interpolation that is
|
|
transparent to the compiler, allowing name references from the interpolation
|
|
operation full access to containing namespaces (as with any other expression),
|
|
rather than being limited to explicit name references. These are referred
|
|
to in the PEP as "f-strings" (a mnemonic for "formatted strings").
|
|
|
|
Since acceptance of :pep:`498`, f-strings have become well-established and very popular.
|
|
F-strings are becoming even more useful with the addition of :pep:`701`.
|
|
While f-strings are great, eager rendering has its limitations. For example, the eagerness of f-strings
|
|
has made code like the following likely::
|
|
|
|
os.system(f"echo {message_from_user}")
|
|
|
|
This kind of code is superficially elegant, but poses a significant problem
|
|
if the interpolated value ``message_from_user`` is in fact provided by an
|
|
untrusted user: it's an opening for a form of code injection attack, where
|
|
the supplied user data has not been properly escaped before being passed to
|
|
the ``os.system`` call.
|
|
|
|
To address that problem (and a number of other concerns), this PEP proposes
|
|
the complementary introduction of "t-strings" (a mnemonic for "template literal strings"),
|
|
where ``f"Message with {data}"`` would produce the same
|
|
result as ``format(t"Message with {data}")``.
|
|
|
|
|
|
While this PEP and :pep:`675` are similar in their goals, neither one competes with the other,
|
|
and can instead be used together.
|
|
|
|
This PEP was previously in deferred status, pending further experience with :pep:`498`'s
|
|
simpler approach of only supporting eager rendering without the additional
|
|
complexity of also supporting deferred rendering. Since then, f-strings have become very popular
|
|
and :pep:`701` has been introduced. This PEP has been updated to reflect current knowledge of f-strings,
|
|
and improvements from 701. It is designed to be built on top of the :pep:`701` implementation.
|
|
|
|
|
|
Proposal
|
|
========
|
|
|
|
This PEP proposes a new string prefix that declares the
|
|
string to be a template literal rather than an ordinary string::
|
|
|
|
template = t"Substitute {names:>10} and {expressions()} at runtime"
|
|
|
|
This would be effectively interpreted as::
|
|
|
|
_raw_template = "Substitute {names:>10} and {expressions()} at runtime"
|
|
_parsed_template = (
|
|
("Substitute ", "names"),
|
|
(" and ", "expressions()"),
|
|
(" at runtime", None),
|
|
)
|
|
_field_values = (names, expressions())
|
|
_format_specifiers = (f">10", f"")
|
|
template = types.TemplateLiteral(
|
|
_raw_template, _parsed_template, _field_values, _format_specifiers)
|
|
|
|
The ``__format__`` method on ``types.TemplateLiteral`` would then
|
|
implement the following :meth:`str.format` inspired semantics::
|
|
|
|
>>> import datetime
|
|
>>> name = 'Jane'
|
|
>>> age = 50
|
|
>>> anniversary = datetime.date(1991, 10, 12)
|
|
>>> format(t'My name is {name}, my age next year is {age+1}, my anniversary is {anniversary:%A, %B %d, %Y}.')
|
|
'My name is Jane, my age next year is 51, my anniversary is Saturday, October 12, 1991.'
|
|
>>> format(t'She said her name is {name!r}.')
|
|
"She said her name is 'Jane'."
|
|
|
|
The implementation of template literals would be based on :pep:`701`, and use the same syntax.
|
|
|
|
This PEP does not propose to remove or deprecate any of the existing
|
|
string formatting mechanisms, as those will remain valuable when formatting
|
|
strings that are not present directly in the source code of the application.
|
|
|
|
Summary of differences from f-strings
|
|
-------------------------------------
|
|
|
|
The key differences between f-strings and t-strings are:
|
|
|
|
* the ``t`` (template literal) prefix indicates delayed rendering, but
|
|
otherwise uses the same syntax and semantics as formatted strings
|
|
* template literals are available at runtime as a new kind of object
|
|
(``types.TemplateLiteral``)
|
|
* the default rendering used by formatted strings is invoked on a
|
|
template literal object by calling ``format(template)`` rather than
|
|
implicitly
|
|
* while f-string ``f"Message {here}"`` would be *semantically* equivalent to
|
|
``format(t"Message {here}")``, f-strings will continue to avoid the runtime overhead of using
|
|
the delayed rendering machinery that is needed for t-strings
|
|
|
|
|
|
Rationale
|
|
=========
|
|
|
|
F-strings (:pep:`498`) made interpolating values into strings with full access to Python's
|
|
lexical namespace semantics simpler, but it does so at the cost of creating a
|
|
situation where interpolating values into sensitive targets like SQL queries,
|
|
shell commands and HTML templates will enjoy a much cleaner syntax when handled
|
|
without regard for code injection attacks than when they are handled correctly.
|
|
|
|
This PEP proposes to provide the option of delaying the actual rendering
|
|
of a template literal to its ``__format__`` method, allowing the use of
|
|
other template renderers by passing the template around as a first class object.
|
|
|
|
While very different in the technical details, the
|
|
``types.TemplateLiteral`` interface proposed in this PEP is
|
|
conceptually quite similar to the ``FormattableString`` type underlying the
|
|
`native interpolation <https://msdn.microsoft.com/en-us/library/dn961160.aspx>`__ support introduced in C# 6.0,
|
|
as well as `template literals in Javascript <https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals>`__ introduced in ES6.
|
|
|
|
|
|
Specification
|
|
=============
|
|
|
|
This PEP proposes a new ``t`` string prefix that
|
|
results in the creation of an instance of a new type,
|
|
``types.TemplateLiteral``.
|
|
|
|
Template literals are Unicode strings (bytes literals are not
|
|
permitted), and string literal concatenation operates as normal, with the
|
|
entire combined literal forming the template literal.
|
|
|
|
The template string is parsed into literals, expressions and format specifiers
|
|
as described for f-strings in :pep:`498` and :pep:`701`. Conversion specifiers are handled
|
|
by the compiler, and appear as part of the field text in interpolation
|
|
templates.
|
|
|
|
However, rather than being rendered directly into a formatted string, these
|
|
components are instead organised into an instance of a new type with the
|
|
following semantics::
|
|
|
|
class TemplateLiteral:
|
|
__slots__ = ("raw_template", "parsed_template", "field_values", "format_specifiers")
|
|
|
|
def __new__(cls, raw_template, parsed_template, field_values, format_specifiers):
|
|
self = super().__new__(cls)
|
|
self.raw_template = raw_template
|
|
if len(parsed_template) == 0:
|
|
raise ValueError("'parsed_template' must contain at least one value")
|
|
self.parsed_template = parsed_template
|
|
self.field_values = field_values
|
|
self.format_specifiers = format_specifiers
|
|
return self
|
|
|
|
def __bool__(self):
|
|
return bool(self.raw_template)
|
|
|
|
def __add__(self, other):
|
|
if isinstance(other, TemplateLiteral):
|
|
if (
|
|
self.parsed_template
|
|
and self.parsed_template[-1][1] is None
|
|
and other.parsed_template
|
|
):
|
|
# merge the last string of self with the first string of other
|
|
content = self.parsed_template[-1][0]
|
|
new_parsed_template = (
|
|
self.parsed_template[:-1]
|
|
+ (
|
|
(
|
|
content + other.parsed_template[0][0],
|
|
other.parsed_template[0][1],
|
|
),
|
|
)
|
|
+ other.parsed_template[1:]
|
|
)
|
|
|
|
else:
|
|
new_parsed_template = self.parsed_template + other.parsed_template
|
|
|
|
return TemplateLiteral(
|
|
self.raw_template + other.raw_template,
|
|
new_parsed_template,
|
|
self.field_values + other.field_values,
|
|
self.format_specifiers + other.format_specifiers,
|
|
)
|
|
|
|
if isinstance(other, str):
|
|
if self.parsed_template and self.parsed_template[-1][1] is None:
|
|
# merge string with last value
|
|
new_parsed_template = self.parsed_template[:-1] + (
|
|
(self.parsed_template[-1][0] + other, None),
|
|
)
|
|
else:
|
|
new_parsed_template = self.parsed_template + ((other, None),)
|
|
|
|
return TemplateLiteral(
|
|
self.raw_template + other,
|
|
new_parsed_template,
|
|
self.field_values,
|
|
self.format_specifiers,
|
|
)
|
|
else:
|
|
raise TypeError(
|
|
f"unsupported operand type(s) for +: '{type(self)}' and '{type(other)}'"
|
|
)
|
|
|
|
def __radd__(self, other):
|
|
if isinstance(other, str):
|
|
if self.parsed_template:
|
|
new_parsed_template = (
|
|
(other + self.parsed_template[0][0], self.parsed_template[0][1]),
|
|
) + self.parsed_template[1:]
|
|
else:
|
|
new_parsed_template = ((other, None),)
|
|
|
|
return TemplateLiteral(
|
|
other + self.raw_template,
|
|
new_parsed_template,
|
|
self.field_values,
|
|
self.format_specifiers,
|
|
)
|
|
else:
|
|
raise TypeError(
|
|
f"unsupported operand type(s) for +: '{type(other)}' and '{type(self)}'"
|
|
)
|
|
|
|
def __mul__(self, other):
|
|
if isinstance(other, int):
|
|
if not self.raw_template or other == 1:
|
|
return self
|
|
if other < 1:
|
|
return TemplateLiteral("", ("", None), (), ())
|
|
parsed_template = self.parsed_template
|
|
last_node = parsed_template[-1]
|
|
trailing_field = last_node[1]
|
|
if trailing_field is not None:
|
|
# With a trailing field, everything can just be repeated the requested number of times
|
|
new_parsed_template = parsed_template * other
|
|
else:
|
|
# Without a trailing field, need to amend the parsed template repetitions to merge
|
|
# the trailing text from each repetition with the leading text of the next
|
|
first_node = parsed_template[0]
|
|
merged_node = (last_node[0] + first_node[0], first_node[1])
|
|
repeating_pattern = parsed_template[1:-1] + merged_node
|
|
new_parsed_template = (
|
|
parsed_template[:-1]
|
|
+ (repeating_pattern * (other - 1))[:-1]
|
|
+ last_node
|
|
)
|
|
return TemplateLiteral(
|
|
self.raw_template * other,
|
|
new_parsed_template,
|
|
self.field_values * other,
|
|
self.format_specifiers * other,
|
|
)
|
|
else:
|
|
raise TypeError(
|
|
f"unsupported operand type(s) for *: '{type(self)}' and '{type(other)}'"
|
|
)
|
|
|
|
def __rmul__(self, other):
|
|
if isinstance(other, int):
|
|
return self * other
|
|
else:
|
|
raise TypeError(
|
|
f"unsupported operand type(s) for *: '{type(other)}' and '{type(self)}'"
|
|
)
|
|
|
|
def __eq__(self, other):
|
|
if not isinstance(other, TemplateLiteral):
|
|
return False
|
|
return (
|
|
self.raw_template == other.raw_template
|
|
and self.parsed_template == other.parsed_template
|
|
and self.field_values == other.field_values
|
|
and self.format_specifiers == other.format_specifiers
|
|
)
|
|
|
|
def __repr__(self):
|
|
return (
|
|
f"<{type(self).__qualname__} {repr(self.raw_template)} "
|
|
f"at {id(self):#x}>"
|
|
)
|
|
|
|
def __format__(self, format_specifier):
|
|
# When formatted, render to a string, and use string formatting
|
|
return format(self.render(), format_specifier)
|
|
|
|
def render(self, *, render_template="".join, render_field=format):
|
|
... # See definition of the template rendering semantics below
|
|
|
|
The result of a template literal expression is an instance of this
|
|
type, rather than an already rendered string — rendering only takes
|
|
place when the instance's ``render`` method is called (either directly, or
|
|
indirectly via ``__format__``).
|
|
|
|
The compiler will pass the following details to the template literal for
|
|
later use:
|
|
|
|
* a string containing the raw template as written in the source code
|
|
* a parsed template tuple that allows the renderer to render the
|
|
template without needing to reparse the raw string template for substitution
|
|
fields
|
|
* a tuple containing the evaluated field values, in field substitution order
|
|
* a tuple containing the field format specifiers, in field substitution order
|
|
|
|
This structure is designed to take full advantage of compile time constant
|
|
folding by ensuring the parsed template is always constant, even when the
|
|
field values and format specifiers include variable substitution expressions.
|
|
|
|
The raw template is just the template literal as a string. By default,
|
|
it is used to provide a human-readable representation for the
|
|
template literal.
|
|
|
|
The parsed template consists of a tuple of 2-tuples, with each 2-tuple
|
|
containing the following fields:
|
|
|
|
* ``leading_text``: a leading string literal. This will be the empty string if
|
|
the current field is at the start of the string, or immediately follows the
|
|
preceding field.
|
|
* ``field_expr``: the text of the expression element in the substitution field.
|
|
This will be None for a final trailing text segment.
|
|
|
|
The tuple of evaluated field values holds the *results* of evaluating the
|
|
substitution expressions in the scope where the template literal appears.
|
|
|
|
The tuple of field specifiers holds the *results* of evaluating the field
|
|
specifiers as f-strings in the scope where the template literal appears.
|
|
|
|
The ``TemplateLiteral.render`` implementation then defines the rendering
|
|
process in terms of the following renderers:
|
|
|
|
* an overall ``render_template`` operation that defines how the sequence of
|
|
literal template sections and rendered fields are composed into a fully
|
|
rendered result. The default template renderer is string concatenation
|
|
using ``''.join``.
|
|
* a per field ``render_field`` operation that receives the field value and
|
|
format specifier for substitution fields within the template. The default
|
|
field renderer is the ``format`` builtin.
|
|
|
|
Given an appropriate parsed template representation and internal methods of
|
|
iterating over it, the semantics of template rendering would then be equivalent
|
|
to the following::
|
|
|
|
def render(self, *, render_template=''.join,
|
|
render_field=format):
|
|
iter_fields = enumerate(self.parsed_template)
|
|
values = self.field_values
|
|
specifiers = self.format_specifiers
|
|
template_parts = []
|
|
for field_pos, (leading_text, field_expr) in iter_fields:
|
|
template_parts.append(leading_text)
|
|
if field_expr is not None:
|
|
value = values[field_pos]
|
|
specifier = specifiers[field_pos]
|
|
rendered_field = render_field(value, specifier)
|
|
template_parts.append(rendered_field)
|
|
return render_template(template_parts)
|
|
|
|
Conversion specifiers
|
|
---------------------
|
|
The ``!a``, ``!r`` and ``!s`` conversion specifiers supported by ``str.format``
|
|
and hence :pep:`498` are handled in template literals as follows:
|
|
|
|
* they're included unmodified in the raw template to ensure no information is
|
|
lost
|
|
* they're *replaced* in the parsed template with the corresponding builtin
|
|
calls, in order to ensure that ``field_expr`` always contains a valid
|
|
Python expression
|
|
* the corresponding field value placed in the field values tuple is
|
|
converted appropriately *before* being passed to the template literal
|
|
|
|
This means that, for most purposes, the difference between the use of
|
|
conversion specifiers and calling the corresponding builtins in the
|
|
original template literal will be transparent to custom renderers. The
|
|
difference will only be apparent if reparsing the raw template, or attempting
|
|
to reconstruct the original template from the parsed template.
|
|
|
|
Writing custom renderers
|
|
------------------------
|
|
|
|
Writing a custom renderer doesn't require any special syntax. Instead,
|
|
custom renderers are ordinary callables that process an interpolation
|
|
template directly either by calling the ``render()`` method with alternate ``render_template`` or ``render_field`` implementations, or by accessing the
|
|
template's data attributes directly.
|
|
|
|
For example, the following function would render a template using objects'
|
|
``repr`` implementations rather than their native formatting support::
|
|
|
|
def reprformat(template):
|
|
def render_field(value, specifier):
|
|
return format(repr(value), specifier)
|
|
return template.render(render_field=render_field)
|
|
|
|
When writing custom renderers, note that the return type of the overall
|
|
rendering operation is determined by the return type of the passed in ``render_template`` callable. While this is expected to be a string in most
|
|
cases, producing non-string objects *is* permitted. For example, a custom
|
|
template renderer could involve an ``sqlalchemy.sql.text`` call that produces
|
|
an `SQL Alchemy query object <http://docs.sqlalchemy.org/en/rel_1_0/core/tutorial.html#using-textual-sql>`__.
|
|
|
|
Non-strings may also be returned from ``render_field``, as long as it is paired
|
|
with a ``render_template`` implementation that expects that behaviour.
|
|
|
|
Expression evaluation
|
|
---------------------
|
|
|
|
As with f-strings, the subexpressions that are extracted from the interpolation
|
|
template are evaluated in the context where the template literal
|
|
appears. This means the expression has full access to local, nonlocal and global variables.
|
|
Any valid Python expression can be used inside ``{}``, including
|
|
function and method calls.
|
|
|
|
Because the substitution expressions are evaluated where the string appears in
|
|
the source code, there are no additional security concerns related to the
|
|
contents of the expression itself, as you could have also just written the
|
|
same expression and used runtime field parsing::
|
|
|
|
>>> bar=10
|
|
>>> def foo(data):
|
|
... return data + 20
|
|
...
|
|
>>> str(t'input={bar}, output={foo(bar)}')
|
|
'input=10, output=30'
|
|
|
|
Is essentially equivalent to::
|
|
|
|
>>> 'input={}, output={}'.format(bar, foo(bar))
|
|
'input=10, output=30'
|
|
|
|
Handling code injection attacks
|
|
-------------------------------
|
|
|
|
The :pep:`498` formatted string syntax makes it potentially attractive to write
|
|
code like the following::
|
|
|
|
runquery(f"SELECT {column} FROM {table};")
|
|
runcommand(f"cat {filename}")
|
|
return_response(f"<html><body>{response.body}</body></html>")
|
|
|
|
These all represent potential vectors for code injection attacks, if any of the
|
|
variables being interpolated happen to come from an untrusted source. The
|
|
specific proposal in this PEP is designed to make it straightforward to write
|
|
use case specific renderers that take care of quoting interpolated values
|
|
appropriately for the relevant security context::
|
|
|
|
runquery(sql(t"SELECT {column} FROM {table} WHERE column={value};"))
|
|
runcommand(sh(t"cat {filename}"))
|
|
return_response(html(t"<html><body>{response.body}</body></html>"))
|
|
|
|
This PEP does not cover adding all such renderers to the standard library
|
|
immediately (though one for shell escaping is proposed), but rather proposes to ensure that they can be readily provided by
|
|
third party libraries, and potentially incorporated into the standard library
|
|
at a later date.
|
|
|
|
It is proposed that a renderer is included in the :mod:`shlex` module, aimed to offer a POSIX shell style experience for
|
|
accessing external programs, without the significant risks posed by running
|
|
``os.system`` or enabling the system shell when using the ``subprocess`` module
|
|
APIs, which will provide an interface for running external programs inspired by that
|
|
offered by the
|
|
`Julia programming language <https://docs.julialang.org/en/v1/manual/running-external-programs/>`__,
|
|
only with the backtick based ``\`cat $filename\``` syntax replaced by
|
|
``t"cat {filename}"`` style template literals.
|
|
See more in the :ref:`501-shlex-module` section.
|
|
|
|
Format specifiers
|
|
-----------------
|
|
|
|
Aside from separating them out from the substitution expression during parsing,
|
|
format specifiers are otherwise treated as opaque strings by the interpolation
|
|
template parser - assigning semantics to those (or, alternatively,
|
|
prohibiting their use) is handled at runtime by the field renderer.
|
|
|
|
Error handling
|
|
--------------
|
|
|
|
Either compile time or run time errors can occur when processing interpolation
|
|
expressions. Compile time errors are limited to those errors that can be
|
|
detected when parsing a template string into its component tuples. These
|
|
errors all raise SyntaxError.
|
|
|
|
Unmatched braces::
|
|
|
|
>>> t'x={x'
|
|
File "<stdin>", line 1
|
|
t'x={x'
|
|
^
|
|
SyntaxError: missing '}' in template literal expression
|
|
|
|
Invalid expressions::
|
|
|
|
>>> t'x={!x}'
|
|
File "<fstring>", line 1
|
|
!x
|
|
^
|
|
SyntaxError: invalid syntax
|
|
|
|
Run time errors occur when evaluating the expressions inside a
|
|
template string before creating the template literal object. See :pep:`498`
|
|
for some examples.
|
|
|
|
Different renderers may also impose additional runtime
|
|
constraints on acceptable interpolated expressions and other formatting
|
|
details, which will be reported as runtime exceptions.
|
|
|
|
.. _501-shlex-module:
|
|
|
|
Renderer for shell escaping added to shlex
|
|
==========================================
|
|
|
|
As a reference implementation, a renderer for safe POSIX shell escaping can be added to the :mod:`shlex`
|
|
module. This renderer would be called ``sh`` and would be equivalent to calling ``shlex.quote`` on
|
|
each field value in the template literal.
|
|
|
|
Thus::
|
|
|
|
os.system(shlex.sh(t'cat {myfile}'))
|
|
|
|
would have the same behavior as::
|
|
|
|
os.system('cat ' + shlex.quote(myfile)))
|
|
|
|
The implementation would be::
|
|
|
|
def sh(template: TemplateLiteral):
|
|
return template.render(render_field=quote)
|
|
|
|
|
|
Changes to subprocess module
|
|
============================
|
|
|
|
With the additional renderer in the shlex module, and the addition of template literals,
|
|
the :mod:`subprocess` module can be changed to handle accepting template literals
|
|
as an additional input type to ``Popen``, as it already accepts a sequence, or a string,
|
|
with different behavior for each.
|
|
With the addition of template literals, :class:`subprocess.Popen` (and in return, all its higher level functions such as :func:`~subprocess.run`)
|
|
could accept strings in a safe way.
|
|
For example::
|
|
|
|
subprocess.run(t'cat {myfile}', shell=True)
|
|
|
|
would automatically use the ``shlex.sh`` renderer provided in this PEP. Therefore, using shlex
|
|
inside a ``subprocess.run`` call like so::
|
|
|
|
subprocess.run(shlex.sh(t'cat {myfile}'), shell=True)
|
|
|
|
would be redundant, as ``run`` would automatically render any template literals through ``shlex.sh``
|
|
|
|
|
|
Alternatively, when ``subprocess.Popen`` is run without ``shell=True``, it could still provide
|
|
subprocess with a more ergonomic syntax. For example::
|
|
|
|
subprocess.run(t'cat {myfile} --flag {value}')
|
|
|
|
would be equivalent to::
|
|
|
|
subprocess.run(['cat', myfile, '--flag', value])
|
|
|
|
or, more accurately::
|
|
|
|
subprocess.run(shlex.split(f'cat {shlex.quote(myfile)} --flag {shlex.quote(value)}'))
|
|
|
|
It would do this by first using the ``shlex.sh`` renderer, as above, then using ``shlex.split`` on the result.
|
|
|
|
The implementation inside ``subprocess.Popen._execute_child`` would look like::
|
|
|
|
if isinstance(args, TemplateLiteral):
|
|
import shlex
|
|
if shell:
|
|
args = [shlex.sh(args)]
|
|
else:
|
|
args = shlex.split(shlex.sh(args))
|
|
|
|
|
|
Possible integration with the logging module
|
|
============================================
|
|
|
|
One of the challenges with the logging module has been that we have previously
|
|
been unable to devise a reasonable migration strategy away from the use of
|
|
printf-style formatting. The runtime parsing and interpolation overhead for
|
|
logging messages also poses a problem for extensive logging of runtime events
|
|
for monitoring purposes.
|
|
|
|
While beyond the scope of this initial PEP, template literal support
|
|
could potentially be added to the logging module's event reporting APIs,
|
|
permitting relevant details to be captured using forms like::
|
|
|
|
logging.debug(t"Event: {event}; Details: {data}")
|
|
logging.critical(t"Error: {error}; Details: {data}")
|
|
|
|
Rather than the current mod-formatting style::
|
|
|
|
logging.debug("Event: %s; Details: %s", event, data)
|
|
logging.critical("Error: %s; Details: %s", event, data)
|
|
|
|
As the template literal is passed in as an ordinary argument, other
|
|
keyword arguments would also remain available::
|
|
|
|
logging.critical(t"Error: {error}; Details: {data}", exc_info=True)
|
|
|
|
As part of any such integration, a recommended approach would need to be
|
|
defined for "lazy evaluation" of interpolated fields, as the ``logging``
|
|
module's existing delayed interpolation support provides access to
|
|
:ref:`various attributes <logrecord-attributes>` of the event ``LogRecord`` instance.
|
|
|
|
For example, since template literal expressions are arbitrary Python expressions,
|
|
string literals could be used to indicate cases where evaluation itself is
|
|
being deferred, not just rendering::
|
|
|
|
logging.debug(t"Logger: {'record.name'}; Event: {event}; Details: {data}")
|
|
|
|
This could be further extended with idioms like using inline tuples to indicate
|
|
deferred function calls to be made only if the log message is actually
|
|
going to be rendered at current logging levels::
|
|
|
|
logging.debug(t"Event: {event}; Details: {expensive_call, raw_data}")
|
|
|
|
This kind of approach would be possible as having access to the actual *text*
|
|
of the field expression would allow the logging renderer to distinguish
|
|
between inline tuples that appear in the field expression itself, and tuples
|
|
that happen to be passed in as data values in a normal field.
|
|
|
|
|
|
Comparison to PEP 675
|
|
=====================
|
|
|
|
This PEP has similar goals to :pep:`675`.
|
|
While both are attempting to provide a way to have safer code, they are doing so in different ways.
|
|
:pep:`675` provides a way to find potential security issues via static analysis.
|
|
It does so by providing a way for the type checker to flag sections of code that are using
|
|
dynamic strings incorrectly. This requires a user to actually run a static analysis type checker such as mypy.
|
|
|
|
If :pep:`675` tells you that you are violating a type check, it is up to the programmer to know how to handle the dynamic-ness of the string.
|
|
This PEP provides a safer alternative to f-strings at runtime.
|
|
If a user receives a type-error, changing an existing f-string into a t-string could be an easy way to solve the problem.
|
|
|
|
t-strings enable safer code by correctly escaping the dynamic sections of strings, while maintaining the static portions.
|
|
|
|
This PEP also allows a way for a library/codebase to be safe, but it does so at runtime rather than
|
|
only during static analysis. For example, if a library wanted to ensure "only safe strings", it
|
|
could check that the type of object passed in at runtime is a template literal::
|
|
|
|
def my_safe_function(string_like_object):
|
|
if not isinstance(string_like_object, types.TemplateLiteral):
|
|
raise TypeError("Argument 'string_like_object' must be a t-string")
|
|
|
|
The two PEPs could also be used together by typing your function as accepting either a string literal or a template literal.
|
|
This way, your function can provide the same API for both static and dynamic strings::
|
|
|
|
def my_safe_function(string_like_object: LiteralString | TemplateLiteral):
|
|
...
|
|
|
|
|
|
Discussion
|
|
==========
|
|
|
|
Refer to :pep:`498` for previous discussion, as several of the points there
|
|
also apply to this PEP.
|
|
|
|
Support for binary interpolation
|
|
--------------------------------
|
|
|
|
As f-strings don't handle byte strings, neither will t-strings.
|
|
|
|
Interoperability with str-only interfaces
|
|
-----------------------------------------
|
|
|
|
For interoperability with interfaces that only accept strings, interpolation
|
|
templates can still be prerendered with ``format``, rather than delegating the
|
|
rendering to the called function.
|
|
|
|
This reflects the key difference from :pep:`498`, which *always* eagerly applies
|
|
the default rendering, without any way to delegate the choice of renderer to
|
|
another section of the code.
|
|
|
|
Preserving the raw template string
|
|
----------------------------------
|
|
|
|
Earlier versions of this PEP failed to make the raw template string available
|
|
on the template literal. Retaining it makes it possible to provide a more
|
|
attractive template representation, as well as providing the ability to
|
|
precisely reconstruct the original string, including both the expression text
|
|
and the details of any eagerly rendered substitution fields in format specifiers.
|
|
|
|
Creating a rich object rather than a global name lookup
|
|
-------------------------------------------------------
|
|
|
|
Earlier versions of this PEP used an ``__interpolate__`` builtin, rather than
|
|
a creating a new kind of object for later consumption by interpolation
|
|
functions. Creating a rich descriptive object with a useful default renderer
|
|
made it much easier to support customisation of the semantics of interpolation.
|
|
|
|
Building atop f-strings rather than replacing them
|
|
--------------------------------------------------
|
|
|
|
Earlier versions of this PEP attempted to serve as a complete substitute for
|
|
:pep:`498` (f-strings) . With the acceptance of that PEP and the more recent :pep:`701`,
|
|
this PEP can now build a more flexible delayed rendering capability
|
|
on top of the existing f-string eager rendering.
|
|
|
|
Assuming the presence of f-strings as a supporting capability simplified a
|
|
number of aspects of the proposal in this PEP (such as how to handle substitution
|
|
fields in format specifiers)
|
|
|
|
Deferring consideration of possible use in i18n use cases
|
|
---------------------------------------------------------
|
|
|
|
The initial motivating use case for this PEP was providing a cleaner syntax
|
|
for i18n translation, as that requires access to the original unmodified
|
|
template. As such, it focused on compatibility with the substitution syntax used
|
|
in Python's ``string.Template`` formatting and Mozilla's l20n project.
|
|
|
|
However, subsequent discussion revealed there are significant additional
|
|
considerations to be taken into account in the i18n use case, which don't
|
|
impact the simpler cases of handling interpolation into security sensitive
|
|
contexts (like HTML, system shells, and database queries), or producing
|
|
application debugging messages in the preferred language of the development
|
|
team (rather than the native language of end users).
|
|
|
|
Due to the original design of the ``str.format`` substitution syntax in :pep:`3101` being inspired by C#'s string formatting syntax, the specific field
|
|
substitution syntax used in :pep:`498` is consistent not only with Python's own ``str.format`` syntax, but also with string formatting in C#, including the
|
|
native "$-string" interpolation syntax introduced in C# 6.0 (released in July
|
|
2015). The related ``IFormattable`` interface in C# forms the basis of a
|
|
`number of elements <https://msdn.microsoft.com/en-us/library/system.iformattable.aspx>`__ of C#'s internationalization and localization
|
|
support.
|
|
|
|
This means that while this particular substitution syntax may not
|
|
currently be widely used for translation of *Python* applications (losing out
|
|
to traditional %-formatting and the designed-specifically-for-i18n
|
|
``string.Template`` formatting), it *is* a popular translation format in the
|
|
wider software development ecosystem (since it is already the preferred
|
|
format for translating C# applications).
|
|
|
|
Acknowledgements
|
|
================
|
|
|
|
* Eric V. Smith for creating :pep:`498` and demonstrating the feasibility of
|
|
arbitrary expression substitution in string interpolation
|
|
* Barry Warsaw, Armin Ronacher, and Mike Miller for their contributions to
|
|
exploring the feasibility of using this model of delayed rendering in i18n
|
|
use cases (even though the ultimate conclusion was that it was a poor fit,
|
|
at least for current approaches to i18n in Python)
|
|
|
|
References
|
|
==========
|
|
|
|
* `%-formatting
|
|
<https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting>`_
|
|
|
|
* `str.format
|
|
<https://docs.python.org/3/library/string.html#formatstrings>`_
|
|
|
|
* `string.Template documentation
|
|
<https://docs.python.org/3/library/string.html#template-strings>`_
|
|
|
|
* :pep:`215`: String Interpolation
|
|
|
|
* :pep:`292`: Simpler String Substitutions
|
|
|
|
* :pep:`3101`: Advanced String Formatting
|
|
|
|
* :pep:`498`: Literal string formatting
|
|
|
|
* :pep:`675`: Arbitrary Literal String Type
|
|
|
|
* :pep:`701`: Syntactic formalization of f-strings
|
|
|
|
* `FormattableString and C# native string interpolation
|
|
<https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/tokens/interpolated>`_
|
|
|
|
* `IFormattable interface in C# (see remarks for globalization notes)
|
|
<https://docs.microsoft.com/en-us/dotnet/api/system.iformattable>`_
|
|
|
|
* `TemplateLiterals in Javascript
|
|
<https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals>`_
|
|
|
|
* `Running external commands in Julia
|
|
<https://docs.julialang.org/en/v1/manual/running-external-programs/>`_
|
|
|
|
Copyright
|
|
=========
|
|
|
|
This document is placed in the public domain or under the
|
|
CC0-1.0-Universal license, whichever is more permissive.
|