PEP 501: update to template strings proposal (#3047)

* Re-open PEP 501 in consideration of PEP 701
* Switch naming from "interpolation template strings" to "template literal string"
* Add Nick Humrich as co-author

---------

Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
Co-authored-by: Nick Coghlan <ncoghlan@gmail.com>
Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
Nick Humrich 2024-08-12 10:11:40 -06:00 committed by GitHub
parent 49b5935190
commit 1696887355
1 changed files with 376 additions and 156 deletions


@ -1,27 +1,45 @@
PEP: 501 PEP: 501
Title: General purpose string interpolation Title: General purpose template literal strings
Version: $Revision$ Author: Alyssa Coghlan <ncoghlan@gmail.com>, Nick Humrich <nick@humrich.us>
Last-Modified: $Date$ Discussions-To: https://discuss.python.org/t/pep-501-reopen-general-purpose-string-template-literals/24625
Author: Alyssa Coghlan <ncoghlan@gmail.com> Status: Draft
Status: Deferred
Type: Standards Track Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Requires: 498 Requires: 701
Created: 08-Aug-2015 Created: 08-Aug-2015
Python-Version: 3.6 Python-Version: 3.12
Post-History: 08-Aug-2015, 23-Aug-2015, 30-Aug-2015 Post-History: `08-Aug-2015 <https://mail.python.org/archives/list/python-dev@python.org/thread/EAZ3P2M3CDDIQFR764NF6FXQHWXYMKJF/>`__,
`05-Sep-2015 <https://mail.python.org/archives/list/python-dev@python.org/thread/ILVRPS6DTFZ7IHL5HONDBB6INVXTFOZ2/>`__,
`09-Mar-2023 <https://discuss.python.org/t/pep-501-reopen-general-purpose-string-template-literals/24625>`__,
Abstract Abstract
======== ========
:pep:`498` proposes new syntactic support for string interpolation that is Though easy and elegant to use, Python :term:`f-string`\s
transparent to the compiler, allow name references from the interpolation can be vulnerable to injection attacks when used to construct
shell commands, SQL queries, HTML snippets and similar
(for example, ``os.system(f"echo {message_from_user}")``).
This PEP introduces template literal strings (or "t-strings"),
which have the same syntax and semantics but with rendering deferred
until :func:`format` or another builder function is called on them.
This will allow standard library calls, helper functions
and third party tools to safely and intelligently perform
appropriate escaping and other string processing on inputs
while retaining the usability and convenience of f-strings.
Motivation
==========
:pep:`498` added new syntactic support for string interpolation that is
transparent to the compiler, allowing name references from the interpolation
operation full access to containing namespaces (as with any other expression), operation full access to containing namespaces (as with any other expression),
rather than being limited to explicit name references. These are referred rather than being limited to explicit name references. These are referred
to in the PEP as "f-strings" (a mnemonic for "formatted strings"). to in the PEP as "f-strings" (a mnemonic for "formatted strings").
However, it only offers this capability for string formatting, making it likely Since acceptance of :pep:`498`, f-strings have become well-established and very popular.
we will see code like the following:: F-strings are becoming even more useful with the addition of :pep:`701`.
While f-strings are great, eager rendering has its limitations. For example, the eagerness of f-strings
has made code like the following likely::
os.system(f"echo {message_from_user}") os.system(f"echo {message_from_user}")
@ -32,161 +50,274 @@ the supplied user data has not been properly escaped before being passed to
the ``os.system`` call. the ``os.system`` call.
To address that problem (and a number of other concerns), this PEP proposes To address that problem (and a number of other concerns), this PEP proposes
the complementary introduction of "i-strings" (a mnemonic for "interpolation the complementary introduction of "t-strings" (a mnemonic for "template literal strings"),
template strings"), where ``f"Message with {data}"`` would produce the same where ``f"Message with {data}"`` would produce the same
result as ``format(i"Message with {data}")``. result as ``format(t"Message with {data}")``.
Some possible examples of the proposed syntax::
mycommand = sh(i"cat {filename}")
myquery = sql(i"SELECT {column} FROM {table};")
myresponse = html(i"<html><body>{response.body}</body></html>")
logging.debug(i"Message with {detailed} {debugging} {info}")
PEP Deferral While this PEP and :pep:`675` are similar in their goals, neither one competes with the other,
============ and can instead be used together.
This PEP is currently deferred pending further experience with :pep:`498`'s This PEP was previously in deferred status, pending further experience with :pep:`498`'s
simpler approach of only supporting eager rendering without the additional simpler approach of only supporting eager rendering without the additional
complexity of also supporting deferred rendering. complexity of also supporting deferred rendering. Since then, f-strings have become very popular
and :pep:`701` has been introduced. This PEP has been updated to reflect current knowledge of f-strings,
and improvements from :pep:`701`. It is designed to be built on top of the :pep:`701` implementation.
Summary of differences from PEP 498
===================================
The key additions this proposal makes relative to :pep:`498`:
* the "i" (interpolation template) prefix indicates delayed rendering, but
otherwise uses the same syntax and semantics as formatted strings
* interpolation templates are available at runtime as a new kind of object
(``types.InterpolationTemplate``)
* the default rendering used by formatted strings is invoked on an
interpolation template object by calling ``format(template)`` rather than
implicitly
* while f-string ``f"Message {here}"`` would be *semantically* equivalent to
``format(i"Message {here}")``, it is expected that the explicit syntax would
avoid the runtime overhead of using the delayed rendering machinery
NOTE: This proposal spells out a draft API for ``types.InterpolationTemplate``.
The precise details of the structures and methods exposed by this type would
be informed by the reference implementation of :pep:`498`, so it makes sense to
gain experience with that as an internal API before locking down a public API
(if this extension proposal is accepted).
Proposal Proposal
======== ========
This PEP proposes the introduction of a new string prefix that declares the This PEP proposes a new string prefix that declares the
string to be an interpolation template rather than an ordinary string:: string to be a template literal rather than an ordinary string::
template = i"Substitute {names} and {expressions()} at runtime" template = t"Substitute {names:>10} and {expressions()} at runtime"
This would be effectively interpreted as:: This would be effectively interpreted as::
_raw_template = "Substitute {names} and {expressions()} at runtime" _raw_template = "Substitute {names:>10} and {expressions()} at runtime"
_parsed_template = ( _parsed_template = (
("Substitute ", "names"), ("Substitute ", "names"),
(" and ", "expressions()"), (" and ", "expressions()"),
(" at runtime", None), (" at runtime", None),
) )
_field_values = (names, expressions()) _field_values = (names, expressions())
_format_specifiers = (f"", f"") _format_specifiers = (f">10", f"")
template = types.InterpolationTemplate(_raw_template, template = types.TemplateLiteral(
_parsed_template, _raw_template, _parsed_template, _field_values, _format_specifiers)
_field_values,
_format_specifiers)
The ``__format__`` method on ``types.InterpolationTemplate`` would then The ``__format__`` method on ``types.TemplateLiteral`` would then
implement the following ``str.format`` inspired semantics:: implement the following :meth:`str.format` inspired semantics::
>>> import datetime >>> import datetime
>>> name = 'Jane' >>> name = 'Jane'
>>> age = 50 >>> age = 50
>>> anniversary = datetime.date(1991, 10, 12) >>> anniversary = datetime.date(1991, 10, 12)
>>> format(i'My name is {name}, my age next year is {age+1}, my anniversary is {anniversary:%A, %B %d, %Y}.') >>> format(t'My name is {name}, my age next year is {age+1}, my anniversary is {anniversary:%A, %B %d, %Y}.')
'My name is Jane, my age next year is 51, my anniversary is Saturday, October 12, 1991.' 'My name is Jane, my age next year is 51, my anniversary is Saturday, October 12, 1991.'
>>> format(i'She said her name is {repr(name)}.') >>> format(t'She said her name is {name!r}.')
"She said her name is 'Jane'." "She said her name is 'Jane'."
As with formatted strings, the interpolation template prefix can be combined with single-quoted, double-quoted and triple quoted strings, including raw strings. The implementation of template literals would be based on :pep:`701`, and use the same syntax.
It does not support combination with bytes literals.
Similarly, this PEP does not propose to remove or deprecate any of the existing This PEP does not propose to remove or deprecate any of the existing
string formatting mechanisms, as those will remain valuable when formatting string formatting mechanisms, as those will remain valuable when formatting
strings that are not present directly in the source code of the application. strings that are not present directly in the source code of the application.
Summary of differences from f-strings
-------------------------------------
The key differences between f-strings and t-strings are:
* the ``t`` (template literal) prefix indicates delayed rendering, but
otherwise uses the same syntax and semantics as formatted strings
* template literals are available at runtime as a new kind of object
(``types.TemplateLiteral``)
* the default rendering used by formatted strings is invoked on a
template literal object by calling ``format(template)`` rather than
implicitly
* while f-string ``f"Message {here}"`` would be *semantically* equivalent to
``format(t"Message {here}")``, f-strings will continue to avoid the runtime overhead of using
the delayed rendering machinery that is needed for t-strings
Rationale Rationale
========= =========
:pep:`498` makes interpolating values into strings with full access to Python's F-strings (:pep:`498`) made interpolating values into strings with full access to Python's
lexical namespace semantics simpler, but it does so at the cost of creating a lexical namespace semantics simpler, but it does so at the cost of creating a
situation where interpolating values into sensitive targets like SQL queries, situation where interpolating values into sensitive targets like SQL queries,
shell commands and HTML templates will enjoy a much cleaner syntax when handled shell commands and HTML templates will enjoy a much cleaner syntax when handled
without regard for code injection attacks than when they are handled correctly. without regard for code injection attacks than when they are handled correctly.
This PEP proposes to provide the option of delaying the actual rendering This PEP proposes to provide the option of delaying the actual rendering
of an interpolation template to its ``__format__`` method, allowing the use of of a template literal to its ``__format__`` method, allowing the use of
other template renderers by passing the template around as a first class object. other template renderers by passing the template around as a first class object.
While very different in the technical details, the While very different in the technical details, the
``types.InterpolationTemplate`` interface proposed in this PEP is ``types.TemplateLiteral`` interface proposed in this PEP is
conceptually quite similar to the ``FormattableString`` type underlying the conceptually quite similar to the ``FormattableString`` type underlying the
`native interpolation <https://msdn.microsoft.com/en-us/library/dn961160.aspx>`__ support introduced in C# 6.0. `native interpolation <https://msdn.microsoft.com/en-us/library/dn961160.aspx>`__ support introduced in C# 6.0,
as well as `template literals in Javascript <https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals>`__ introduced in ES6.
Specification Specification
============= =============
This PEP proposes the introduction of ``i`` as a new string prefix that This PEP proposes a new ``t`` string prefix that
results in the creation of an instance of a new type, results in the creation of an instance of a new type,
``types.InterpolationTemplate``. ``types.TemplateLiteral``.
Interpolation template literals are Unicode strings (bytes literals are not Template literals are Unicode strings (bytes literals are not
permitted), and string literal concatenation operates as normal, with the permitted), and string literal concatenation operates as normal, with the
entire combined literal forming the interpolation template. entire combined literal forming the template literal.
The template string is parsed into literals, expressions and format specifiers The template string is parsed into literals, expressions and format specifiers
as described for f-strings in :pep:`498`. Conversion specifiers are handled as described for f-strings in :pep:`498` and :pep:`701`. Conversion specifiers are handled
by the compiler, and appear as part of the field text in interpolation by the compiler, and appear as part of the field text in interpolation
templates. templates.
However, rather than being rendered directly into a formatted strings, these However, rather than being rendered directly into a formatted string, these
components are instead organised into an instance of a new type with the components are instead organised into an instance of a new type with the
following semantics:: following semantics::
class InterpolationTemplate: class TemplateLiteral:
__slots__ = ("raw_template", "parsed_template", __slots__ = ("raw_template", "parsed_template", "field_values", "format_specifiers")
"field_values", "format_specifiers")
def __new__(cls, raw_template, parsed_template, def __new__(cls, raw_template, parsed_template, field_values, format_specifiers):
field_values, format_specifiers):
self = super().__new__(cls) self = super().__new__(cls)
self.raw_template = raw_template self.raw_template = raw_template
if len(parsed_template) == 0:
raise ValueError("'parsed_template' must contain at least one value")
self.parsed_template = parsed_template self.parsed_template = parsed_template
self.field_values = field_values self.field_values = field_values
self.format_specifiers = format_specifiers self.format_specifiers = format_specifiers
return self return self
def __bool__(self):
return bool(self.raw_template)
def __add__(self, other):
if isinstance(other, TemplateLiteral):
if (
self.parsed_template
and self.parsed_template[-1][1] is None
and other.parsed_template
):
# merge the last string of self with the first string of other
content = self.parsed_template[-1][0]
new_parsed_template = (
self.parsed_template[:-1]
+ (
(
content + other.parsed_template[0][0],
other.parsed_template[0][1],
),
)
+ other.parsed_template[1:]
)
else:
new_parsed_template = self.parsed_template + other.parsed_template
return TemplateLiteral(
self.raw_template + other.raw_template,
new_parsed_template,
self.field_values + other.field_values,
self.format_specifiers + other.format_specifiers,
)
if isinstance(other, str):
if self.parsed_template and self.parsed_template[-1][1] is None:
# merge string with last value
new_parsed_template = self.parsed_template[:-1] + (
(self.parsed_template[-1][0] + other, None),
)
else:
new_parsed_template = self.parsed_template + ((other, None),)
return TemplateLiteral(
self.raw_template + other,
new_parsed_template,
self.field_values,
self.format_specifiers,
)
else:
raise TypeError(
f"unsupported operand type(s) for +: '{type(self)}' and '{type(other)}'"
)
def __radd__(self, other):
if isinstance(other, str):
if self.parsed_template:
new_parsed_template = (
(other + self.parsed_template[0][0], self.parsed_template[0][1]),
) + self.parsed_template[1:]
else:
new_parsed_template = ((other, None),)
return TemplateLiteral(
other + self.raw_template,
new_parsed_template,
self.field_values,
self.format_specifiers,
)
else:
raise TypeError(
f"unsupported operand type(s) for +: '{type(other)}' and '{type(self)}'"
)
def __mul__(self, other):
if isinstance(other, int):
if not self.raw_template or other == 1:
return self
if other < 1:
return TemplateLiteral("", (("", None),), (), ())
parsed_template = self.parsed_template
last_node = parsed_template[-1]
trailing_field = last_node[1]
if trailing_field is not None:
# With a trailing field, everything can just be repeated the requested number of times
new_parsed_template = parsed_template * other
else:
# Without a trailing field, need to amend the parsed template repetitions to merge
# the trailing text from each repetition with the leading text of the next
# (this assumes the template contains at least one field)
first_node = parsed_template[0]
merged_node = (last_node[0] + first_node[0], first_node[1])
repeating_pattern = (merged_node,) + parsed_template[1:-1]
new_parsed_template = (
parsed_template[:-1]
+ repeating_pattern * (other - 1)
+ (last_node,)
)
return TemplateLiteral(
self.raw_template * other,
new_parsed_template,
self.field_values * other,
self.format_specifiers * other,
)
else:
raise TypeError(
f"unsupported operand type(s) for *: '{type(self)}' and '{type(other)}'"
)
def __rmul__(self, other):
if isinstance(other, int):
return self * other
else:
raise TypeError(
f"unsupported operand type(s) for *: '{type(other)}' and '{type(self)}'"
)
def __eq__(self, other):
if not isinstance(other, TemplateLiteral):
return False
return (
self.raw_template == other.raw_template
and self.parsed_template == other.parsed_template
and self.field_values == other.field_values
and self.format_specifiers == other.format_specifiers
)
def __repr__(self): def __repr__(self):
return (f"<{type(self).__qualname__} {repr(self._raw_template)} " return (
f"at {id(self):#x}>") f"<{type(self).__qualname__} {repr(self.raw_template)} "
f"at {id(self):#x}>"
)
def __format__(self, format_specifier): def __format__(self, format_specifier):
# When formatted, render to a string, and use string formatting # When formatted, render to a string, and use string formatting
return format(self.render(), format_specifier) return format(self.render(), format_specifier)
def render(self, *, render_template=''.join, def render(self, *, render_template="".join, render_field=format):
render_field=format): ... # See definition of the template rendering semantics below
# See definition of the template rendering semantics below
The result of an interpolation template expression is an instance of this The result of a template literal expression is an instance of this
type, rather than an already rendered string - rendering only takes type, rather than an already rendered string - rendering only takes
place when the instance's ``render`` method is called (either directly, or place when the instance's ``render`` method is called (either directly, or
indirectly via ``__format__``). indirectly via ``__format__``).
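The deferred rendering described above can be sketched in plain Python. This is an illustration only: ``TemplateSketch`` is a hypothetical stand-in for the proposed ``types.TemplateLiteral``, its fields are filled in by hand (current Python has no ``t`` prefix), and the loop in ``render`` is an assumption about how text and fields are interleaved, not the normative definition.

```python
class TemplateSketch:
    """Hypothetical stand-in for the proposed types.TemplateLiteral."""

    def __init__(self, raw_template, parsed_template, field_values, format_specifiers):
        self.raw_template = raw_template
        self.parsed_template = parsed_template
        self.field_values = field_values
        self.format_specifiers = format_specifiers

    def render(self, *, render_template="".join, render_field=format):
        # Assumed interleaving: leading text, then the rendered field (if any)
        parts = []
        for pos, (leading_text, field_expr) in enumerate(self.parsed_template):
            parts.append(leading_text)
            if field_expr is not None:
                value = self.field_values[pos]
                spec = self.format_specifiers[pos]
                parts.append(render_field(value, spec))
        return render_template(parts)

    def __format__(self, format_specifier):
        # Render first, then apply ordinary string formatting to the result
        return format(self.render(), format_specifier)


name = "Jane"
age = 50
# Hand-written equivalent of t"My name is {name}, my age next year is {age+1}."
template = TemplateSketch(
    "My name is {name}, my age next year is {age+1}.",
    (("My name is ", "name"), (", my age next year is ", "age+1"), (".", None)),
    (name, age + 1),
    ("", ""),
)

# Nothing has been rendered yet; rendering happens on these calls
rendered = format(template)
shouted = template.render(
    render_field=lambda value, spec: format(value, spec).upper()
)
```

Passing a different ``render_field`` changes how each field is turned into text without touching the literal segments, which is the hook that escaping renderers rely on.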
The compiler will pass the following details to the interpolation template for The compiler will pass the following details to the template literal for
later use: later use:
* a string containing the raw template as written in the source code * a string containing the raw template as written in the source code
@ -200,9 +331,9 @@ This structure is designed to take full advantage of compile time constant
folding by ensuring the parsed template is always constant, even when the folding by ensuring the parsed template is always constant, even when the
field values and format specifiers include variable substitution expressions. field values and format specifiers include variable substitution expressions.
The raw template is just the interpolation template as a string. By default, The raw template is just the template literal as a string. By default,
it is used to provide a human readable representation for the interpolation it is used to provide a human-readable representation for the
template. template literal.
The parsed template consists of a tuple of 2-tuples, with each 2-tuple The parsed template consists of a tuple of 2-tuples, with each 2-tuple
containing the following fields: containing the following fields:
@ -214,12 +345,12 @@ containing the following fields:
This will be None for a final trailing text segment. This will be None for a final trailing text segment.
The tuple of evaluated field values holds the *results* of evaluating the The tuple of evaluated field values holds the *results* of evaluating the
substitution expressions in the scope where the interpolation template appears. substitution expressions in the scope where the template literal appears.
The tuple of field specifiers holds the *results* of evaluating the field The tuple of field specifiers holds the *results* of evaluating the field
specifiers as f-strings in the scope where the interpolation template appears. specifiers as f-strings in the scope where the template literal appears.
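To make the distinction between eager evaluation and deferred rendering concrete, the following sketch builds the four pieces by hand for a hypothetical ``t"input={bar}, output={foo(bar):>{width}}"``. The variable names and the joining loop are assumptions made for illustration; only the shape of the tuples follows the description above.

```python
bar = 10
width = 4


def foo(data):
    return data + 20


# Hand-written stand-in for what the compiler would pass along
raw_template = "input={bar}, output={foo(bar):>{width}}"
parsed_template = (("input=", "bar"), (", output=", "foo(bar)"))
field_values = (bar, foo(bar))          # results, evaluated immediately
format_specifiers = ("", f">{width}")   # specifiers, evaluated as f-strings

# Rebinding the names afterwards does not affect the captured results
bar = 99
width = 1

# Rendering can happen later and only uses the captured data
rendered = "".join(
    text + format(value, spec)
    for (text, _expr), value, spec in zip(
        parsed_template, field_values, format_specifiers
    )
)
```

Because the template here ends with a field, there is no trailing text node, so the three tuples have the same length and can be zipped directly.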
The ``InterpolationTemplate.render`` implementation then defines the rendering The ``TemplateLiteral.render`` implementation then defines the rendering
process in terms of the following renderers: process in terms of the following renderers:
* an overall ``render_template`` operation that defines how the sequence of * an overall ``render_template`` operation that defines how the sequence of
@ -251,20 +382,8 @@ to the following::
Conversion specifiers Conversion specifiers
--------------------- ---------------------
NOTE:
Appropriate handling of conversion specifiers is currently an open question.
Exposing them more directly to custom renderers would increase the
complexity of the ``InterpolationTemplate`` definition without providing an
increase in expressiveness (since they're redundant with calling the builtins
directly). At the same time, they *are* made available as arbitrary strings
when writing custom ``string.Formatter`` implementations, so it may be
desirable to offer similar levels of flexibility of interpretation in
interpolation templates.
The ``!a``, ``!r`` and ``!s`` conversion specifiers supported by ``str.format`` The ``!a``, ``!r`` and ``!s`` conversion specifiers supported by ``str.format``
and hence :pep:`498` are handled in interpolation templates as follows: and hence :pep:`498` are handled in template literals as follows:
* they're included unmodified in the raw template to ensure no information is * they're included unmodified in the raw template to ensure no information is
lost lost
@ -272,19 +391,18 @@ and hence :pep:`498` are handled in interpolation templates as follows:
calls, in order to ensure that ``field_expr`` always contains a valid calls, in order to ensure that ``field_expr`` always contains a valid
Python expression Python expression
* the corresponding field value placed in the field values tuple is * the corresponding field value placed in the field values tuple is
converted appropriately *before* being passed to the interpolation converted appropriately *before* being passed to the template literal
template
This means that, for most purposes, the difference between the use of This means that, for most purposes, the difference between the use of
conversion specifiers and calling the corresponding builtins in the conversion specifiers and calling the corresponding builtins in the
original interpolation template will be transparent to custom renderers. The original template literal will be transparent to custom renderers. The
difference will only be apparent if reparsing the raw template, or attempting difference will only be apparent if reparsing the raw template, or attempting
to reconstruct the original template from the parsed template. to reconstruct the original template from the parsed template.
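As a small illustration of the handling described above, the data for a hypothetical ``t"She said her name is {name!r}."`` might look as follows. The tuples are written by hand and the rendering loop is an assumption; the point is only that a renderer receives an already converted value.

```python
name = "Jane"

# Conversion specifier kept verbatim in the raw template
raw_template = "She said her name is {name!r}."
# Field expression rewritten as the corresponding builtin call
parsed_template = (("She said her name is ", "repr(name)"), (".", None))
# Value converted before being stored
field_values = (repr(name),)
format_specifiers = ("",)

# A renderer needs no knowledge of "!r"; it just formats what it is given
pieces = []
for pos, (text, expr) in enumerate(parsed_template):
    pieces.append(text)
    if expr is not None:
        pieces.append(format(field_values[pos], format_specifiers[pos]))
rendered = "".join(pieces)
```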
Writing custom renderers Writing custom renderers
------------------------ ------------------------
Writing a custom renderer doesn't requiring any special syntax. Instead, Writing a custom renderer doesn't require any special syntax. Instead,
custom renderers are ordinary callables that process an interpolation custom renderers are ordinary callables that process an interpolation
template directly either by calling the ``render()`` method with alternate ``render_template`` or ``render_field`` implementations, or by accessing the template directly either by calling the ``render()`` method with alternate ``render_template`` or ``render_field`` implementations, or by accessing the
template's data attributes directly. template's data attributes directly.
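A renderer that works from the data attributes directly could look roughly like the sketch below. ``html_render`` is a hypothetical name, and the ``SimpleNamespace`` object is a hand-built stand-in for a template literal, since the ``t`` prefix is not available in current Python.

```python
import html
from types import SimpleNamespace


def html_render(template):
    """Hypothetical renderer: escape every interpolated value for HTML."""
    pieces = []
    for pos, (text, expr) in enumerate(template.parsed_template):
        pieces.append(text)  # literal text comes from source code, left as-is
        if expr is not None:
            value = format(
                template.field_values[pos], template.format_specifiers[pos]
            )
            pieces.append(html.escape(value))  # interpolated data is escaped
    return "".join(pieces)


body = "<script>alert('hi')</script>"
# Hand-written stand-in for t"<p>{body}</p>"
template = SimpleNamespace(
    raw_template="<p>{body}</p>",
    parsed_template=(("<p>", "body"), ("</p>", None)),
    field_values=(body,),
    format_specifiers=("",),
)
safe = html_render(template)
```

The renderer can tell trusted literal text from interpolated data only because the two arrive separately, which an eagerly rendered f-string cannot offer.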
@ -310,8 +428,9 @@ Expression evaluation
--------------------- ---------------------
As with f-strings, the subexpressions that are extracted from the interpolation As with f-strings, the subexpressions that are extracted from the interpolation
template are evaluated in the context where the interpolation template template are evaluated in the context where the template literal
appears. This means the expression has full access to local, nonlocal and global variables. Any valid Python expression can be used inside ``{}``, including appears. This means the expression has full access to local, nonlocal and global variables.
Any valid Python expression can be used inside ``{}``, including
function and method calls. function and method calls.
Because the substitution expressions are evaluated where the string appears in Because the substitution expressions are evaluated where the string appears in
@ -323,7 +442,7 @@ same expression and used runtime field parsing::
>>> def foo(data): >>> def foo(data):
... return data + 20 ... return data + 20
... ...
>>> str(i'input={bar}, output={foo(bar)}') >>> str(t'input={bar}, output={foo(bar)}')
'input=10, output=30' 'input=10, output=30'
Is essentially equivalent to:: Is essentially equivalent to::
@ -347,23 +466,24 @@ specific proposal in this PEP is designed to make it straightforward to write
use case specific renderers that take care of quoting interpolated values use case specific renderers that take care of quoting interpolated values
appropriately for the relevant security context:: appropriately for the relevant security context::
runquery(sql(i"SELECT {column} FROM {table};")) runquery(sql(t"SELECT {column} FROM {table} WHERE column={value};"))
runcommand(sh(i"cat {filename}")) runcommand(sh(t"cat {filename}"))
return_response(html(i"<html><body>{response.body}</body></html>")) return_response(html(t"<html><body>{response.body}</body></html>"))
This PEP does not cover adding such renderers to the standard library This PEP does not cover adding all such renderers to the standard library
immediately, but rather proposes to ensure that they can be readily provided by immediately (though one for shell escaping is proposed), but rather proposes to ensure that they can be readily provided by
third party libraries, and potentially incorporated into the standard library third party libraries, and potentially incorporated into the standard library
at a later date. at a later date.
For example, a renderer that aimed to offer a POSIX shell style experience for It is proposed that a renderer is included in the :mod:`shlex` module, aimed to offer a POSIX shell style experience for
accessing external programs, without the significant risks posed by running accessing external programs, without the significant risks posed by running
``os.system`` or enabling the system shell when using the ``subprocess`` module ``os.system`` or enabling the system shell when using the ``subprocess`` module
APIs, might provide an interface for running external programs similar to that APIs, which will provide an interface for running external programs inspired by that
offered by the offered by the
`Julia programming language <http://julia.readthedocs.org/en/latest/manual/running-external-programs/>`__, `Julia programming language <https://docs.julialang.org/en/v1/manual/running-external-programs/>`__,
only with the backtick based ``\`cat $filename\``` syntax replaced by only with the backtick based ``\`cat $filename\``` syntax replaced by
``i"cat {filename}"`` style interpolation templates. ``t"cat {filename}"`` style template literals.
See more in the :ref:`501-shlex-module` section.
Format specifiers Format specifiers
----------------- -----------------
@ -383,26 +503,96 @@ errors all raise SyntaxError.
Unmatched braces:: Unmatched braces::
>>> i'x={x' >>> t'x={x'
File "<stdin>", line 1 File "<stdin>", line 1
SyntaxError: missing '}' in interpolation expression t'x={x'
^
SyntaxError: missing '}' in template literal expression
Invalid expressions:: Invalid expressions::
>>> i'x={!x}' >>> t'x={!x}'
File "<fstring>", line 1 File "<fstring>", line 1
!x !x
^ ^
SyntaxError: invalid syntax SyntaxError: invalid syntax
Run time errors occur when evaluating the expressions inside a Run time errors occur when evaluating the expressions inside a
template string before creating the interpolation template object. See :pep:`498` template string before creating the template literal object. See :pep:`498`
for some examples. for some examples.
Different renderers may also impose additional runtime Different renderers may also impose additional runtime
constraints on acceptable interpolated expressions and other formatting constraints on acceptable interpolated expressions and other formatting
details, which will be reported as runtime exceptions. details, which will be reported as runtime exceptions.
.. _501-shlex-module:
Renderer for shell escaping added to shlex
==========================================
As a reference implementation, a renderer for safe POSIX shell escaping can be added to the :mod:`shlex`
module. This renderer would be called ``sh`` and would be equivalent to calling ``shlex.quote`` on
each field value in the template literal.
Thus::
os.system(shlex.sh(t'cat {myfile}'))
would have the same behavior as::
os.system('cat ' + shlex.quote(myfile))
The implementation would be::
def sh(template: TemplateLiteral):
return template.render(render_field=quote)
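The intended effect can be tried out today with a hand-built stand-in. In the sketch below, ``sh_sketch`` and the ``SimpleNamespace`` template are hypothetical; the sketch also assumes that each field is first formatted with its specifier and then quoted, which is why it wraps ``shlex.quote`` rather than passing it as a field renderer directly.

```python
import shlex
from types import SimpleNamespace


def sh_sketch(template):
    """Hypothetical equivalent of the proposed shlex.sh renderer."""
    pieces = []
    for pos, (text, expr) in enumerate(template.parsed_template):
        pieces.append(text)
        if expr is not None:
            field_text = format(
                template.field_values[pos], template.format_specifiers[pos]
            )
            pieces.append(shlex.quote(field_text))
    return "".join(pieces)


myfile = "notes.txt; rm -rf ~"
# Hand-written stand-in for t"cat {myfile}"
template = SimpleNamespace(
    parsed_template=(("cat ", "myfile"),),
    field_values=(myfile,),
    format_specifiers=("",),
)
command = sh_sketch(template)
```

The hostile file name ends up as a single quoted shell word, so the ``rm`` fragment is never interpreted as a separate command.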
Changes to subprocess module
============================
With the additional renderer in the shlex module, and the addition of template literals,
the :mod:`subprocess` module can be changed to handle accepting template literals
as an additional input type to ``Popen``, as it already accepts a sequence, or a string,
with different behavior for each.
With the addition of template literals, :class:`subprocess.Popen` (and, in turn, all its higher-level functions such as :func:`~subprocess.run`)
could accept strings in a safe way.
For example::
subprocess.run(t'cat {myfile}', shell=True)
would automatically use the ``shlex.sh`` renderer provided in this PEP. Therefore, using shlex
inside a ``subprocess.run`` call like so::
subprocess.run(shlex.sh(t'cat {myfile}'), shell=True)
would be redundant, as ``run`` would automatically render any template literals through ``shlex.sh``.
Alternatively, when ``subprocess.Popen`` is run without ``shell=True``, template literals could still
give ``subprocess`` a more ergonomic syntax. For example::
subprocess.run(t'cat {myfile} --flag {value}')
would be equivalent to::
subprocess.run(['cat', myfile, '--flag', value])
or, more accurately::
subprocess.run(shlex.split(f'cat {shlex.quote(myfile)} --flag {shlex.quote(value)}'))
It would do this by first using the ``shlex.sh`` renderer, as above, then using ``shlex.split`` on the result.
The implementation inside ``subprocess.Popen._execute_child`` would look like::
if isinstance(args, TemplateLiteral):
import shlex
if shell:
args = [shlex.sh(args)]
else:
args = shlex.split(shlex.sh(args))
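The two branches can be checked with existing :mod:`shlex` functions. In this sketch, ``normalize_args`` is a hypothetical helper, and the rendered command is built with an f-string and ``shlex.quote`` to stand in for what a quoting renderer would be expected to return.

```python
import shlex


def normalize_args(rendered_command, shell):
    """Hypothetical helper mirroring the two branches described above."""
    if shell:
        # A single string, left for the shell to parse
        return [rendered_command]
    # An argument vector; no shell is involved
    return shlex.split(rendered_command)


myfile = "my file.txt"
value = "a b; echo pwned"
# Stand-in for the rendered form of t"cat {myfile} --flag {value}"
rendered = f"cat {shlex.quote(myfile)} --flag {shlex.quote(value)}"

argv = normalize_args(rendered, shell=False)
shell_args = normalize_args(rendered, shell=True)
```

Quoting followed by splitting round-trips each interpolated value into exactly one argument, even when the value contains spaces or shell metacharacters.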
Possible integration with the logging module Possible integration with the logging module
============================================ ============================================
@ -413,39 +603,39 @@ printf-style formatting. The runtime parsing and interpolation overhead for
logging messages also poses a problem for extensive logging of runtime events logging messages also poses a problem for extensive logging of runtime events
for monitoring purposes. for monitoring purposes.
While beyond the scope of this initial PEP, interpolation template support While beyond the scope of this initial PEP, template literal support
could potentially be added to the logging module's event reporting APIs, could potentially be added to the logging module's event reporting APIs,
permitting relevant details to be captured using forms like:: permitting relevant details to be captured using forms like::
logging.debug(i"Event: {event}; Details: {data}") logging.debug(t"Event: {event}; Details: {data}")
logging.critical(i"Error: {error}; Details: {data}") logging.critical(t"Error: {error}; Details: {data}")
Rather than the current mod-formatting style:: Rather than the current mod-formatting style::
logging.debug("Event: %s; Details: %s", event, data) logging.debug("Event: %s; Details: %s", event, data)
logging.critical("Error: %s; Details: %s", error, data) logging.critical("Error: %s; Details: %s", error, data)
As the interpolation template is passed in as an ordinary argument, other As the template literal is passed in as an ordinary argument, other
keyword arguments would also remain available:: keyword arguments would also remain available::
logging.critical(i"Error: {error}; Details: {data}", exc_info=True) logging.critical(t"Error: {error}; Details: {data}", exc_info=True)
As part of any such integration, a recommended approach would need to be As part of any such integration, a recommended approach would need to be
defined for "lazy evaluation" of interpolated fields, as the ``logging`` defined for "lazy evaluation" of interpolated fields, as the ``logging``
module's existing delayed interpolation support provides access to module's existing delayed interpolation support provides access to
`various attributes <https://docs.python.org/3/library/logging.html#logrecord-attributes>`__ of the event ``LogRecord`` instance. :ref:`various attributes <logrecord-attributes>` of the event ``LogRecord`` instance.
For example, since interpolation expressions are arbitrary Python expressions, For example, since template literal expressions are arbitrary Python expressions,
string literals could be used to indicate cases where evaluation itself is string literals could be used to indicate cases where evaluation itself is
being deferred, not just rendering:: being deferred, not just rendering::
logging.debug(i"Logger: {'record.name'}; Event: {event}; Details: {data}") logging.debug(t"Logger: {'record.name'}; Event: {event}; Details: {data}")
This could be further extended with idioms like using inline tuples to indicate This could be further extended with idioms like using inline tuples to indicate
deferred function calls to be made only if the log message is actually deferred function calls to be made only if the log message is actually
going to be rendered at current logging levels:: going to be rendered at current logging levels::
logging.debug(i"Event: {event}; Details: {expensive_call, raw_data}") logging.debug(t"Event: {event}; Details: {expensive_call, raw_data}")
This kind of approach would be possible as having access to the actual *text* This kind of approach would be possible as having access to the actual *text*
of the field expression would allow the logging renderer to distinguish of the field expression would allow the logging renderer to distinguish
@ -453,25 +643,46 @@ between inline tuples that appear in the field expression itself, and tuples
that happen to be passed in as data values in a normal field. that happen to be passed in as data values in a normal field.
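
One way a logging renderer might make that distinction is to parse the field's expression text. The sketch below is an assumption about how such a renderer could work, not part of this PEP: ``render_logging_field`` and its ``(expr_text, value)`` signature are hypothetical, while ``ast.parse`` and ``ast.Tuple`` are existing standard library APIs.

```python
import ast


def render_logging_field(expr_text, value):
    """Hypothetical field renderer supporting deferred calls.

    `expr_text` is the source text of the field expression and `value`
    is its evaluated result. A tuple written inline in the expression
    is treated as (callable, *args) and called only at render time.
    """
    node = ast.parse(expr_text.strip(), mode="eval").body
    is_inline_tuple = isinstance(node, ast.Tuple)
    if is_inline_tuple and value and callable(value[0]):
        func, *args = value
        return str(func(*args))  # deferred call happens here
    return str(value)  # ordinary field: render the value as-is


calls = []


def expensive_call(raw):
    calls.append(raw)  # record that the call really happened
    return raw.upper()


# Inline tuple in the expression text: the call is made while rendering.
deferred = render_logging_field("expensive_call, raw_data", (expensive_call, "abc"))

# The same tuple passed as a plain data value is not called.
plain = render_logging_field("data", (expensive_call, "abc"))
```

In this sketch the expensive function runs only when the field is rendered, so a logging call that is filtered out by level would never invoke it.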
Comparison to PEP 675
=====================
This PEP has similar goals to :pep:`675`.
While both attempt to make code safer, they do so in different ways.
:pep:`675` provides a way to find potential security issues via static analysis.
It does so by letting the type checker flag sections of code that use
dynamic strings incorrectly. This requires a user to actually run a static type checker such as mypy.
If the type checker reports a violation under :pep:`675`, it is up to the programmer to know how to handle the dynamic parts of the string.

This PEP provides a safer alternative to f-strings at runtime.
If a user receives a type error, changing an existing f-string into a t-string could be an easy way to solve the problem.
t-strings enable safer code by allowing the dynamic sections of strings to be escaped correctly while the static portions are left unchanged.
This PEP also allows a library or codebase to enforce safety, but it does so at runtime rather than
only during static analysis. For example, if a library wanted to ensure "only safe strings", it
could check that the type of object passed in at runtime is a template literal::

    def my_safe_function(string_like_object):
        if not isinstance(string_like_object, types.TemplateLiteral):
            raise TypeError("Argument 'string_like_object' must be a t-string")
The two PEPs could also be used together by typing a function as accepting either a string literal or a template literal.
This way, the function can provide the same API for both static and dynamic strings::

    def my_safe_function(string_like_object: LiteralString | TemplateLiteral):
        ...
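
A combined signature like this one splits the work between the type checker and the runtime. The sketch below illustrates that split under stated assumptions: the ``TemplateLiteral`` class is a hypothetical stand-in for the proposed type, and the return values are only there to make the chosen branch visible. ``LiteralString`` is imported for the type checker alone, so the code also runs on Python versions older than 3.11.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from typing import LiteralString  # only needed by the type checker (3.11+)


class TemplateLiteral:
    """Hypothetical stand-in for the proposed template literal type."""

    def __init__(self, raw_template: str) -> None:
        self.raw_template = raw_template


def my_safe_function(string_like_object: LiteralString | TemplateLiteral) -> str:
    # A template literal can be recognised at runtime.
    if isinstance(string_like_object, TemplateLiteral):
        return "template"
    # LiteralString has no runtime representation: a literal and a
    # dynamically built str look identical here, so only the static
    # type checker can tell them apart.
    if isinstance(string_like_object, str):
        return "str"
    raise TypeError("Argument 'string_like_object' must be a str or a t-string")
```

The runtime check can therefore only guarantee "template literal or some ``str``"; rejecting dynamically built strings still depends on running a type checker.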
Discussion Discussion
========== ==========
Refer to :pep:`498` for additional discussion, as several of the points there Refer to :pep:`498` for previous discussion, as several of the points there
also apply to this PEP. also apply to this PEP.
Deferring support for binary interpolation Support for binary interpolation
------------------------------------------ --------------------------------
Supporting binary interpolation with this syntax would be relatively As f-strings don't handle byte strings, neither will t-strings.
straightforward (the elements in the parsed fields tuple would just be
byte strings rather than text strings, and the default renderer would be
markedly less useful), but poses a significant likelihood of producing
confusing type errors when a text renderer was presented with
binary input.
Since the proposed syntax is useful without binary interpolation support, and
such support can be readily added later, further consideration of binary
interpolation is considered out of scope for the current PEP.
Interoperability with str-only interfaces Interoperability with str-only interfaces
----------------------------------------- -----------------------------------------
@ -488,7 +699,7 @@ Preserving the raw template string
---------------------------------- ----------------------------------
Earlier versions of this PEP failed to make the raw template string available Earlier versions of this PEP failed to make the raw template string available
on the interpolation template. Retaining it makes it possible to provide a more on the template literal. Retaining it makes it possible to provide a more
attractive template representation, as well as providing the ability to attractive template representation, as well as providing the ability to
precisely reconstruct the original string, including both the expression text precisely reconstruct the original string, including both the expression text
and the details of any eagerly rendered substitution fields in format specifiers. and the details of any eagerly rendered substitution fields in format specifiers.
@ -501,11 +712,13 @@ a creating a new kind of object for later consumption by interpolation
functions. Creating a rich descriptive object with a useful default renderer functions. Creating a rich descriptive object with a useful default renderer
made it much easier to support customisation of the semantics of interpolation. made it much easier to support customisation of the semantics of interpolation.
Building atop PEP 498, rather than competing with it Building atop f-strings rather than replacing them
---------------------------------------------------- --------------------------------------------------
Earlier versions of this PEP attempted to serve as a complete substitute for Earlier versions of this PEP attempted to serve as a complete substitute for
:pep:`498`, rather than building a more flexible delayed rendering capability on :pep:`498` (f-strings). With the acceptance of that PEP and the more recent :pep:`701`,
top of :pep:`498`'s eager rendering. this PEP can now build a more flexible delayed rendering capability
on top of the existing f-string eager rendering.
Assuming the presence of f-strings as a supporting capability simplified a Assuming the presence of f-strings as a supporting capability simplified a
number of aspects of the proposal in this PEP (such as how to handle substitution number of aspects of the proposal in this PEP (such as how to handle substitution
@ -526,8 +739,7 @@ contexts (like HTML, system shells, and database queries), or producing
application debugging messages in the preferred language of the development application debugging messages in the preferred language of the development
team (rather than the native language of end users). team (rather than the native language of end users).
Due to the original design of the ``str.format`` substitution syntax in :pep:`3101` Due to the original design of the ``str.format`` substitution syntax in :pep:`3101` being inspired by C#'s string formatting syntax, the specific field
being inspired by C#'s string formatting syntax, the specific field
substitution syntax used in :pep:`498` is consistent not only with Python's own ``str.format`` syntax, but also with string formatting in C#, including the substitution syntax used in :pep:`498` is consistent not only with Python's own ``str.format`` syntax, but also with string formatting in C#, including the
native "$-string" interpolation syntax introduced in C# 6.0 (released in July native "$-string" interpolation syntax introduced in C# 6.0 (released in July
2015). The related ``IFormattable`` interface in C# forms the basis of a 2015). The related ``IFormattable`` interface in C# forms the basis of a
@ -571,16 +783,24 @@ References
* :pep:`498`: Literal string formatting * :pep:`498`: Literal string formatting
* :pep:`675`: Arbitrary Literal String Type
* :pep:`701`: Syntactic formalization of f-strings
* `FormattableString and C# native string interpolation * `FormattableString and C# native string interpolation
<https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/tokens/interpolated>`_ <https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/tokens/interpolated>`_
* `IFormattable interface in C# (see remarks for globalization notes) * `IFormattable interface in C# (see remarks for globalization notes)
<https://docs.microsoft.com/en-us/dotnet/api/system.iformattable>`_ <https://docs.microsoft.com/en-us/dotnet/api/system.iformattable>`_
* `Template literals in JavaScript
<https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals>`_
* `Running external commands in Julia * `Running external commands in Julia
<https://docs.julialang.org/en/v1/manual/running-external-programs/>`_ <https://docs.julialang.org/en/v1/manual/running-external-programs/>`_
Copyright Copyright
========= =========
This document has been placed in the public domain. This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.