PEP: 501
Title: General purpose template literal strings
Author: Alyssa Coghlan <ncoghlan@gmail.com>, Nick Humrich <nick@humrich.us>
Discussions-To: https://discuss.python.org/t/pep-501-reopen-general-purpose-string-template-literals/24625
Status: Withdrawn
Type: Standards Track
Requires: 701
Created: 08-Aug-2015
Python-Version: 3.12
Post-History: `08-Aug-2015 <https://mail.python.org/archives/list/python-dev@python.org/thread/EAZ3P2M3CDDIQFR764NF6FXQHWXYMKJF/>`__,
`05-Sep-2015 <https://mail.python.org/archives/list/python-dev@python.org/thread/ILVRPS6DTFZ7IHL5HONDBB6INVXTFOZ2/>`__,
`09-Mar-2023 <https://discuss.python.org/t/pep-501-reopen-general-purpose-string-template-literals/24625>`__,
Superseded-By: 750
.. superseded:: 750
Abstract
========
Though easy and elegant to use, Python :term:`f-string`\s
can be vulnerable to injection attacks when used to construct
shell commands, SQL queries, HTML snippets and similar
(for example, ``os.system(f"echo {message_from_user}")``).
This PEP introduces template literal strings (or "t-strings"),
which have syntax and semantics that are similar to f-strings,
but with rendering deferred until :func:`format` or another
template rendering function is called on them.
This will allow standard library calls, helper functions
and third party tools to safely and intelligently perform
appropriate escaping and other string processing on inputs
while retaining the usability and convenience of f-strings.
PEP Withdrawal
==============
When :pep:`750` was first published as a "tagged strings" proposal
(allowing for arbitrary string prefixes), this PEP was kept open to
continue championing the simpler "template literal" approach that
used a single dedicated string prefix to produce instances of a new
"interpolation template" type.
The `October 2024 updates <https://github.com/python/peps/pull/4062>`__
to :pep:`750` agreed that template strings were a better fit for Python
than the broader tagged strings concept.
All of the other concerns the authors of this PEP had with :pep:`750`
were also either addressed in those updates, or else left in a state
where they could reasonably be addressed in a future change proposal.
Due to the clear improvements in the updated :pep:`750` proposal,
this PEP has been withdrawn in favour of :pep:`750`.

.. important::

   The remainder of this PEP still reflects the state of the tagged strings
   proposal in August 2024. It has *not* been updated to reflect the
   October 2024 changes to :pep:`750`, since the PEP withdrawal makes doing
   so redundant.

Relationship with other PEPs
============================
This PEP is inspired by and builds on top of the f-string syntax first implemented
in :pep:`498` and formalised in :pep:`701`.
This PEP complements the literal string typing support added to Python's formal type
system in :pep:`675` by introducing a *safe* way to do dynamic interpolation of runtime
values into security sensitive strings.
This PEP competes with some aspects of the tagged string proposal in :pep:`750`
(most notably in whether template rendering is expressed as ``render(t"template literal")``
or as ``render"template literal"``), but also shares *many* common features (after
:pep:`750` was published, this PEP was updated with
`several new changes <https://github.com/python/peps/issues/3904>`__
inspired by the tagged strings proposal).
This PEP does NOT propose an alternative to :pep:`292` for user interface
internationalization use cases (but does note the potential for future syntactic
enhancements aimed at that use case that would benefit from the compiler-supported
value interpolation machinery that this PEP and :pep:`750` introduce).
Motivation
==========
:pep:`498` added new syntactic support for string interpolation that is
transparent to the compiler, allowing name references from the interpolation
operation full access to containing namespaces (as with any other expression),
rather than being limited to explicit name references. These are referred
to in the PEP (and elsewhere) as "f-strings" (a mnemonic for "formatted strings").
Since acceptance of :pep:`498`, f-strings have become well-established and very popular.
f-strings became even more useful and flexible with the formalised grammar in :pep:`701`.
While f-strings are great, eager rendering has its limitations. For example, the
eagerness of f-strings has made code like the following unfortunately plausible:

.. code-block:: python

    os.system(f"echo {message_from_user}")

This kind of code is superficially elegant, but poses a significant problem
if the interpolated value ``message_from_user`` is in fact provided by an
untrusted user: it's an opening for a form of code injection attack, where
the supplied user data has not been properly escaped before being passed to
the ``os.system`` call.
While the ``LiteralString`` type annotation introduced in :pep:`675` means that typecheckers
are able to report a type error for this kind of unsafe function usage, those errors don't
help make it easier to write code that uses safer alternatives (such as
:func:`subprocess.run`).
To address that problem (and a number of other concerns), this PEP proposes
the complementary introduction of "t-strings" (a mnemonic for "template literal strings"),
where ``format(t"Message with {data}")`` would produce the same result as
``f"Message with {data}"``, but the template literal instance can instead be passed
to other template rendering functions which process the contents of the template
differently.
Proposal
========
Dedicated template literal syntax
---------------------------------
This PEP proposes a new string prefix that declares the
string to be a template literal rather than an ordinary string:

.. code-block:: python

    template = t"Substitute {names:>{field_width}} and {expressions()!r} at runtime"

This would be effectively interpreted as:

.. code-block:: python

    template = TemplateLiteral(
        r"Substitute {names:>{field_width}} and {expressions()!r} at runtime",
        TemplateLiteralText(r"Substitute "),
        TemplateLiteralField("names", names, f">{field_width}", ""),
        TemplateLiteralText(r" and "),
        TemplateLiteralField("expressions()", expressions(), f"", "r"),
        TemplateLiteralText(r" at runtime"),
    )

(Note: this is an illustrative example implementation. The exact compile time construction
syntax of ``types.TemplateLiteral`` is considered an implementation detail not specified by
the PEP. In particular, the compiler may bypass the default constructor's runtime logic that
detects consecutive text segments and merges them into a single text segment, as well as
checking the runtime types of all supplied arguments).
The ``__format__`` method on ``types.TemplateLiteral`` would then
implement the following :meth:`str.format` inspired semantics:

.. code-block:: python-console

    >>> import datetime
    >>> name = 'Jane'
    >>> age = 50
    >>> anniversary = datetime.date(1991, 10, 12)
    >>> format(t'My name is {name}, my age next year is {age+1}, my anniversary is {anniversary:%A, %B %d, %Y}.')
    'My name is Jane, my age next year is 51, my anniversary is Saturday, October 12, 1991.'
    >>> format(t'She said her name is {name!r}.')
    "She said her name is 'Jane'."

The syntax of template literals would be based on :pep:`701`, and largely use the same
syntax for the string portion of the template. Aside from using a different prefix, the one
other syntactic change is in the definition and handling of conversion specifiers, both to
allow ``!()`` as a standard conversion specifier to request evaluation of a field at
rendering time, and to allow custom renderers to also define custom conversion specifiers.
This PEP does not propose to remove or deprecate any of the existing
string formatting mechanisms, as those will remain valuable when formatting
strings that are not present directly in the source code of the application.
Lazy field evaluation conversion specifier
------------------------------------------
In addition to the existing support for the ``a``, ``r``, and ``s`` conversion specifiers,
:meth:`str.format`, :meth:`str.format_map`, and :class:`string.Formatter` will be updated
to accept ``()`` as a conversion specifier that means "call the interpolated value".
To support application of the standard conversion specifiers in custom template rendering
functions, a new :func:`!operator.convert_field` function will be added.
The signature and behaviour of the :func:`format` builtin will also be updated to accept a
conversion specifier as a third optional parameter. If a non-empty conversion specifier
is given, the value will be converted with :func:`!operator.convert_field` before looking up
the ``__format__`` method.
Custom conversion specifiers
----------------------------
To allow additional field-specific directives to be passed to custom rendering functions in
a way that still allows formatting of the template with the default renderer, the conversion
specifier field will be allowed to contain a second ``!`` character.
:func:`!operator.convert_field` and :func:`format` (and hence the default
``TemplateLiteral.render`` template rendering method), will ignore that character and any
subsequent text in the conversion specifier field.
:meth:`str.format`, :meth:`str.format_map`, and :class:`string.Formatter` will also be
updated to accept (and ignore) custom conversion specifiers.
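
As a minimal sketch of how a custom renderer might read such a specifier, the
helper below splits the conversion specifier field at the first ``!``. The name
``split_conversion`` is hypothetical and is not part of this proposal; the
input is assumed to be the parsed specifier with its leading ``!`` already
removed:

.. code-block:: python

    def split_conversion(conversion_spec: str) -> tuple[str, str]:
        """Split a conversion specifier into (standard, custom) parts."""
        # Text before the first "!" is the standard specifier; anything
        # after it is left for the custom renderer to interpret.
        std_spec, _, custom_spec = conversion_spec.partition("!")
        return std_spec, custom_spec

    print(split_conversion("r!html"))  # ('r', 'html')
    print(split_conversion("!raw"))    # ('', 'raw')
    print(split_conversion("s"))       # ('s', '')
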
Template renderer for POSIX shell commands
------------------------------------------
As both a practical demonstration of the benefits of delayed rendering support, and as
a valuable feature in its own right, a new ``sh`` template renderer will be added to
the :mod:`shlex` module. This renderer will produce strings where all interpolated fields
are escaped with :func:`shlex.quote`.
The :class:`subprocess.Popen` API (and higher level APIs that depend on it, such as
:func:`subprocess.run`) will be updated to accept interpolation templates and handle
them in accordance with the new ``shlex.sh`` renderer.
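
The intended effect of such a renderer can be approximated in current Python.
The sketch below is not the proposed ``shlex.sh`` implementation: it assumes a
template is modelled as a plain list in which strings are literal text and
one-element tuples hold interpolated values, and ``sh_sketch`` is a
hypothetical name:

.. code-block:: python

    import shlex

    def sh_sketch(segments):
        """Render text segments as-is and quote every interpolated value."""
        rendered = []
        for segment in segments:
            if isinstance(segment, str):
                rendered.append(segment)  # literal text from the template
            else:
                (value,) = segment        # interpolated field value
                rendered.append(shlex.quote(str(value)))
        return "".join(rendered)

    message_from_user = "hello; rm -rf ~"
    # Stand-in for rendering t"echo {message_from_user}" with the sh renderer
    command = sh_sketch(["echo ", (message_from_user,)])
    print(command)  # echo 'hello; rm -rf ~'

The untrusted value reaches the shell as a single quoted word, so the embedded
``;`` no longer terminates the ``echo`` command.
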
Background
==========
This PEP was initially proposed as a competitor to :pep:`498`. After it became clear that
the eager rendering proposal had substantially more immediate support, it then spent several
years in a deferred state, pending further experience with :pep:`498`'s simpler approach of
only supporting eager rendering without the additional complexity of also supporting deferred
rendering.
Since then, f-strings have become very popular and :pep:`701` was introduced to tidy up some
rough edges and limitations in their syntax and semantics. The template literal proposal
was updated in 2023 to reflect current knowledge of f-strings, and improvements from
:pep:`701`.
In 2024, :pep:`750` was published, proposing a general purpose mechanism for custom tagged
string prefixes, rather than the narrower template literal proposal in this PEP. This PEP
was again updated, both to incorporate new ideas inspired by the tagged strings proposal,
and to describe the perceived benefits of the narrower template literal syntax proposal
in this PEP over the more general tagged string proposal.
Summary of differences from f-strings
-------------------------------------
The key differences between f-strings and t-strings are:

* the ``t`` (template literal) prefix indicates delayed rendering, but
  otherwise largely uses the same syntax and semantics as formatted strings
* template literals are available at runtime as a new kind of object
  (``types.TemplateLiteral``)
* the default rendering used by formatted strings is invoked on a
  template literal object by calling ``format(template)`` rather than
  being done implicitly in the compiled code
* unlike f-strings (where conversion specifiers are handled directly in the compiler),
  t-string conversion specifiers are handled at rendering time by the rendering function
* the new ``!()`` conversion specifier indicates that the field expression is a callable
  that should be called when using the default :func:`format` rendering function. This
  specifier is specifically *not* being added to f-strings (since it is pointless there).
* a second ``!`` is allowed in t-string conversion specifiers (with any subsequent text
  being ignored) as a way to allow custom template rendering functions to accept custom
  conversion specifiers without breaking the default :func:`!TemplateLiteral.render`
  rendering method. This feature is specifically *not* being added to f-strings (since
  it is pointless there).
* while f-string ``f"Message {here}"`` would be *semantically* equivalent to
  ``format(t"Message {here}")``, f-strings will continue to be supported directly in the
  compiler and hence avoid the runtime overhead of actually using the delayed rendering
  machinery that is needed for t-strings

Summary of differences from tagged strings
------------------------------------------
When tagged strings were
`first proposed <https://discuss.python.org/t/pep-750-tag-strings-for-writing-domain-specific-languages/60408>`__,
there were several notable differences from the proposal in PEP 501 beyond the surface
syntax difference between whether rendering function invocations are written as
``render(t"template literal")`` or as ``render"template literal"``.
Over the course of the initial PEP 750 discussion, many of those differences were eliminated,
either by PEP 501 adopting that aspect of PEP 750's proposal (such as lazily applying
conversion specifiers), or by PEP 750 changing to retain some aspect of PEP 501's proposal
(such as defining a dedicated type to hold template segments rather than representing them
as simple sequences).
The main remaining significant difference is that this PEP argues that adding *only* the
t-string prefix is a sufficient enhancement to give all the desired benefits described in
PEP 750. The expansion to a generalised "tagged string" syntax isn't necessary, and causes
additional problems that can be avoided.
The two PEPs also differ in their proposed approaches to handling lazy evaluation of template
fields.
While there *are* other differences between the two proposals, those differences are more
cosmetic than substantive. In particular:

* this PEP proposes different names for the structural typing protocols
* this PEP proposes specific names for the concrete implementation types
* this PEP proposes exact details for the proposed APIs of the concrete implementation types
  (including concatenation and repetition support, which are not part of the structural
  typing protocols)
* this PEP proposes changes to the existing :func:`format` builtin to make it usable
  directly as a template field renderer

The two PEPs also differ in *how* they make their case for delayed rendering support. This
PEP focuses more on the concrete implementation concept of using template literals to allow
the "interpolation" and "rendering" steps in f-string processing to be separated in time,
and then taking advantage of that to reduce the potential code injection risks associated
with misuse of f-strings. PEP 750 focuses more on the way that native templating support
allows behaviours that are difficult or impossible to achieve via existing string based
templating methods. As with the cosmetic differences noted above, this is more a difference
in style than a difference in substance.
Rationale
=========
f-strings (:pep:`498`) made interpolating values into strings with full access to Python's
lexical namespace semantics simpler, but they do so at the cost of creating a
situation where interpolating values into sensitive targets like SQL queries,
shell commands and HTML templates will enjoy a much cleaner syntax when handled
without regard for code injection attacks than when they are handled correctly.
This PEP proposes to provide the option of delaying the actual rendering
of a template literal into a formatted string until its ``__format__`` method is called,
allowing other template renderers to be used instead by passing the template around as a
first class object.
While very different in the technical details, the
``types.TemplateLiteral`` interface proposed in this PEP is
conceptually quite similar to the ``FormattableString`` type underlying the
`native interpolation <https://msdn.microsoft.com/en-us/library/dn961160.aspx>`__
support introduced in C# 6.0, as well as the
`JavaScript template literals <https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals>`__
introduced in ES6.
While not the original motivation for developing the proposal, many of the benefits for
defining domain specific languages described in :pep:`750` also apply to this PEP
(including the potential for per-DSL semantic highlighting in code editors based on the
type specifications of declared template variables and rendering function parameters).
Specification
=============
This PEP proposes a new ``t`` string prefix that
results in the creation of an instance of a new type,
``types.TemplateLiteral``.
Template literals are Unicode strings (bytes literals are not
permitted), and string literal concatenation operates as normal, with the
entire combined literal forming the template literal.
The template string is parsed into literals, expressions, format specifiers, and conversion
specifiers as described for f-strings in :pep:`498` and :pep:`701`. The syntax for conversion
specifiers is relaxed such that arbitrary strings are accepted (excluding those containing
``{``, ``}`` or ``:``) rather than being restricted to valid Python identifiers.
However, rather than being rendered directly into a formatted string, these
components are instead organised into instances of new types with the
following behaviour:

.. code-block:: python

    from __future__ import annotations  # allow forward references in annotations

    import operator
    from typing import Any, Iterable, NamedTuple, Sequence

    class TemplateLiteralText(str):
        # This is a renamed and extended version of the DecodedConcrete type in PEP 750
        # Real type would be implemented in C, this is an API compatible Python equivalent
        _raw: str

        def __new__(cls, raw: str):
            decoded = raw.encode("utf-8").decode("unicode-escape")
            if decoded == raw:
                decoded = raw
            text = super().__new__(cls, decoded)
            text._raw = raw
            return text

        @staticmethod
        def merge(text_segments:Sequence[TemplateLiteralText]) -> TemplateLiteralText:
            if len(text_segments) == 1:
                return text_segments[0]
            return TemplateLiteralText("".join(t._raw for t in text_segments))

        @property
        def raw(self) -> str:
            return self._raw

        def __repr__(self) -> str:
            return f"{type(self).__name__}(r{self._raw!r})"

        def __add__(self, other:Any) -> TemplateLiteralText|NotImplemented:
            if isinstance(other, TemplateLiteralText):
                return TemplateLiteralText(self._raw + other._raw)
            return NotImplemented

        def __mul__(self, other:Any) -> TemplateLiteralText|NotImplemented:
            try:
                factor = operator.index(other)
            except TypeError:
                return NotImplemented
            return TemplateLiteralText(self._raw * factor)
        __rmul__ = __mul__

    class TemplateLiteralField(NamedTuple):
        # This is mostly a renamed version of the InterpolationConcrete type in PEP 750
        # However:
        #    - value is eagerly evaluated (values were all originally lazy in PEP 750)
        #    - conversion specifiers are allowed to be arbitrary strings
        #    - order of fields is adjusted so the text form is the first field and the
        #      remaining parameters match the updated signature of the `format` builtin
        # Real type would be implemented in C, this is an API compatible Python equivalent
        expr: str
        value: Any
        format_spec: str | None = None
        conversion_spec: str | None = None

        def __repr__(self) -> str:
            return (f"{type(self).__name__}({self.expr!r}, {self.value!r}, "
                    f"{self.format_spec!r}, {self.conversion_spec!r})")

        def __str__(self) -> str:
            return format(self.value, self.format_spec, self.conversion_spec)

        def __format__(self, format_override) -> str:
            if format_override:
                format_spec = format_override
            else:
                format_spec = self.format_spec
            return format(self.value, format_spec, self.conversion_spec)

    class TemplateLiteral:
        # This type corresponds to the TemplateConcrete type in PEP 750
        # Real type would be implemented in C, this is an API compatible Python equivalent
        _raw_template: str
        _segments: tuple[TemplateLiteralText|TemplateLiteralField, ...]

        def __new__(cls, raw_template:str, *segments:TemplateLiteralText|TemplateLiteralField):
            self = super().__new__(cls)
            self._raw_template = raw_template
            # Check if there are any adjacent text segments that need merging
            # or any empty text segments that need discarding
            type_err = "Template literal segments must be template literal text or field instances"
            text_expected = True
            needs_merge = False
            for segment in segments:
                match segment:
                    case TemplateLiteralText():
                        if not text_expected or not segment:
                            needs_merge = True
                            break
                        text_expected = False
                    case TemplateLiteralField():
                        text_expected = True
                    case _:
                        raise TypeError(type_err)
            if not needs_merge:
                # Match loop above will have checked all segments
                self._segments = segments
                return self
            # Merge consecutive runs of text fields and drop any empty text fields
            merged_segments:list[TemplateLiteralText|TemplateLiteralField] = []
            pending_merge:list[TemplateLiteralText] = []
            for segment in segments:
                match segment:
                    case TemplateLiteralText() as text_segment:
                        if text_segment:
                            pending_merge.append(text_segment)
                    case TemplateLiteralField():
                        if pending_merge:
                            merged_segments.append(TemplateLiteralText.merge(pending_merge))
                            pending_merge.clear()
                        merged_segments.append(segment)
                    case _:
                        # First loop above may not check all segments when a merge is needed
                        raise TypeError(type_err)
            if pending_merge:
                merged_segments.append(TemplateLiteralText.merge(pending_merge))
                pending_merge.clear()
            self._segments = tuple(merged_segments)
            return self

        @property
        def raw_template(self) -> str:
            return self._raw_template

        @property
        def segments(self) -> tuple[TemplateLiteralText|TemplateLiteralField, ...]:
            return self._segments

        def __len__(self) -> int:
            return len(self._segments)

        def __iter__(self) -> Iterable[TemplateLiteralText|TemplateLiteralField]:
            return iter(self._segments)

        # Note: template literals do NOT define any relative ordering
        def __eq__(self, other):
            if not isinstance(other, TemplateLiteral):
                return NotImplemented
            # Segment equality already covers field values and format specifiers
            return (
                self._raw_template == other._raw_template
                and self._segments == other._segments
            )

        def __repr__(self) -> str:
            return (f"{type(self).__name__}(r{self._raw_template!r}, "
                    f"{', '.join(map(repr, self._segments))})")

        def __format__(self, format_specifier) -> str:
            # When formatted, render to a string, and then use string formatting
            return format(self.render(), format_specifier)

        def render(self, *, render_template=''.join, render_text=str, render_field=format):
            ... # See definition of the template rendering semantics below

        def __add__(self, other) -> TemplateLiteral|NotImplemented:
            if isinstance(other, TemplateLiteral):
                combined_raw_text = self._raw_template + other._raw_template
                combined_segments = self._segments + other._segments
                return TemplateLiteral(combined_raw_text, *combined_segments)
            if isinstance(other, str):
                # Treat the given string as a new raw text segment
                combined_raw_text = self._raw_template + other
                combined_segments = self._segments + (TemplateLiteralText(other),)
                return TemplateLiteral(combined_raw_text, *combined_segments)
            return NotImplemented

        def __radd__(self, other) -> TemplateLiteral|NotImplemented:
            if isinstance(other, str):
                # Treat the given string as a new raw text segment. This effectively
                # has precedence over string concatenation in CPython due to
                # https://github.com/python/cpython/issues/55686
                combined_raw_text = other + self._raw_template
                combined_segments = (TemplateLiteralText(other),) + self._segments
                return TemplateLiteral(combined_raw_text, *combined_segments)
            return NotImplemented

        def __mul__(self, other) -> TemplateLiteral|NotImplemented:
            try:
                factor = operator.index(other)
            except TypeError:
                return NotImplemented
            if not self or factor == 1:
                return self
            if factor < 1:
                return TemplateLiteral("")
            repeated_text = self._raw_template * factor
            repeated_segments = self._segments * factor
            return TemplateLiteral(repeated_text, *repeated_segments)
        __rmul__ = __mul__

(Note: this is an illustrative example implementation, the exact compile time construction
method and internal data management details of ``types.TemplateLiteral`` are considered an
implementation detail not specified by the PEP. However, the expected post-construction
behaviour of the public APIs on ``types.TemplateLiteral`` instances is specified by the
above code, as is the constructor signature for building template instances at runtime)
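
The segment normalisation performed by the constructor can be illustrated
independently of the proposed types. The sketch below assumes text segments are
plain strings and fields are tuples; ``normalise_segments`` is a hypothetical
helper and not part of the proposed API:

.. code-block:: python

    from itertools import groupby

    def normalise_segments(segments):
        """Merge adjacent text segments and drop empty ones."""
        normalised = []
        for is_text, run in groupby(segments, key=lambda s: isinstance(s, str)):
            if is_text:
                merged = "".join(run)
                if merged:  # discard runs made up only of empty strings
                    normalised.append(merged)
            else:
                normalised.extend(run)  # fields are kept unchanged, in order
        return tuple(normalised)

    field = ("name", "Jane", "", "")
    print(normalise_segments(["Hello ", "", "there ", field, "", field, "!"]))
    # ('Hello there ', ('name', 'Jane', '', ''), ('name', 'Jane', '', ''), '!')
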
The result of a template literal expression is an instance of this
type, rather than an already rendered string. Rendering only takes
place when the instance's ``render`` method is called (either directly, or
indirectly via ``__format__``).
The compiler will pass the following details to the template literal for
later use:

* a string containing the raw template as written in the source code
* a sequence of template segments, with each segment being either:

  * a literal text segment (a regular Python string that also provides access
    to its raw form)
  * a parsed template interpolation field, specifying the text of the interpolated
    expression (as a regular string), its evaluated result, the format specifier text
    (with any substitution fields eagerly evaluated as an f-string), and the conversion
    specifier text (as a regular string)

The raw template is just the template literal as a string. By default,
it is used to provide a human-readable representation for the
template literal, but template renderers may also use it for other purposes (e.g. as a
cache lookup key).
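
One way a renderer might use the raw template as a cache key is sketched below.
``analyse_template`` is a hypothetical name, and the "analysis" it performs is
only a stand-in for whatever expensive per-template work a real renderer does:

.. code-block:: python

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def analyse_template(raw_template: str) -> int:
        """Stand-in for expensive per-template work, keyed on the raw template."""
        return raw_template.count("{")  # placeholder analysis only

    raw = "SELECT * FROM users WHERE name = {name}"
    analyse_template(raw)
    analyse_template(raw)  # same raw template, so this call hits the cache
    print(analyse_template.cache_info().hits)  # 1

This works because the raw template is identical on every evaluation of a given
literal, even though the interpolated values differ.
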
The parsed template structure is taken from :pep:`750` and consists of a sequence of
template segments corresponding to the text segments and interpolation fields in the
template string.
This approach is designed to allow compilers to fully process each segment of the template
in order, before finally emitting code to pass all of the template segments to the template
literal constructor.
For example, assuming the following runtime values:

.. code-block:: python

    names = ["Alice", "Bob", "Carol", "Eve"]
    field_width = 10
    def expressions():
        return 42

The template from the proposal section would be represented at runtime as:

.. code-block:: python

    TemplateLiteral(
        r"Substitute {names:>{field_width}} and {expressions()!r} at runtime",
        TemplateLiteralText(r"Substitute "),
        TemplateLiteralField("names", ["Alice", "Bob", "Carol", "Eve"], ">10", ""),
        TemplateLiteralText(r" and "),
        TemplateLiteralField("expressions()", 42, "", "r"),
        TemplateLiteralText(r" at runtime"),
    )

Rendering templates
-------------------
The ``TemplateLiteral.render`` implementation defines the rendering
process in terms of the following renderers:

* an overall ``render_template`` operation that defines how the sequence of
  rendered text and field segments are composed into a fully rendered result.
  The default template renderer is string concatenation using ``''.join``.
* a per text segment ``render_text`` operation that receives the individual literal
  text segments within the template. The default text renderer is the builtin ``str``
  constructor.
* a per field segment ``render_field`` operation that receives the field value, format
  specifier, and conversion specifier for substitution fields within the template. The
  default field renderer is the :func:`format` builtin.

Given the parsed template representation above, the semantics of template rendering would
then be equivalent to the following:

.. code-block:: python

    def render(self, *, render_template=''.join, render_text=str, render_field=format):
        rendered_segments = []
        for segment in self._segments:
            match segment:
                case TemplateLiteralText() as text_segment:
                    rendered_segments.append(render_text(text_segment))
                case TemplateLiteralField() as field_segment:
                    rendered_segments.append(render_field(*field_segment[1:]))
        return render_template(rendered_segments)

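
To show how the three hooks compose, the following sketch reimplements the
rendering loop over a simplified stand-in for the parsed template and plugs in
an HTML-escaping field renderer. The ``Field`` tuple, ``render_sketch``,
``default_field`` and ``html_field`` are all hypothetical names, and conversion
specifiers are ignored for brevity:

.. code-block:: python

    import html
    from typing import Any, NamedTuple

    class Field(NamedTuple):
        expr: str
        value: Any
        format_spec: str = ""
        conversion_spec: str = ""

    def default_field(value, format_spec="", conversion_spec=""):
        # Stand-in for the proposed three-argument format(); conversion is ignored
        return format(value, format_spec)

    def render_sketch(segments, *, render_template="".join, render_text=str,
                      render_field=default_field):
        rendered = []
        for segment in segments:
            if isinstance(segment, Field):
                # Pass value, format_spec and conversion_spec (not the expression text)
                rendered.append(render_field(*segment[1:]))
            else:
                rendered.append(render_text(segment))
        return render_template(rendered)

    def html_field(value, format_spec="", conversion_spec=""):
        # Format as usual, then escape the result for inclusion in HTML
        return html.escape(format(value, format_spec))

    segments = ["Hello, ", Field("user", "Tom & Jerry <3"), "!"]
    print(render_sketch(segments))                           # Hello, Tom & Jerry <3!
    print(render_sketch(segments, render_field=html_field))  # Hello, Tom &amp; Jerry &lt;3!

Only the field renderer changes between the two calls; the literal text
written by the template author is passed through untouched in both.
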
Format specifiers
-----------------
The syntax and processing of format specifiers in t-strings is defined to be the same as it
is for f-strings.

This includes allowing format specifiers to themselves contain f-string substitution fields.
The raw text of the format specifiers (without processing any substitution fields) is
retained as part of the full raw template string.

The parsed format specifiers receive the format specifier string with those substitutions
already resolved. The ``:`` prefix is also omitted.

Aside from separating them out from the substitution expression during parsing,
format specifiers are otherwise treated as opaque strings by the interpolation
template parser - assigning semantics to those (or, alternatively,
prohibiting their use) is handled at rendering time by the field renderer.
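
The eager resolution of nested substitution fields matches what f-strings
already do, which can be checked in current Python (the variable names here are
purely illustrative):

.. code-block:: python

    field_width = 10
    value = "abc"

    # The nested {field_width} is resolved before the specifier reaches __format__
    resolved_spec = f">{field_width}"
    print(repr(resolved_spec))              # '>10'
    print(repr(f"{value:>{field_width}}"))  # '       abc'
    print(f"{value:>{field_width}}" == format(value, resolved_spec))  # True
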
Conversion specifiers
---------------------
In addition to the existing support for ``a``, ``r``, and ``s`` conversion specifiers,
:meth:`str.format` and :meth:`str.format_map` will be updated to accept ``()`` as a
conversion specifier that means "call the interpolated value".
Where :pep:`701` restricts conversion specifiers to ``NAME`` tokens, this PEP will instead
allow ``FSTRING_MIDDLE`` tokens (such that only ``{``, ``}`` and ``:`` are disallowed). This
change is made primarily to support lazy field rendering with the ``!()`` conversion
specifier, but also allows custom rendering functions more flexibility when defining their
own conversion specifiers in preference to those defined for the default :func:`format` field
renderer.
Conversion specifiers are still handled as plain strings, and do NOT support the use
of substitution fields.
The parsed conversion specifiers receive the conversion specifier string with the
``!`` prefix omitted.
To allow custom template renderers to define their own custom conversion specifiers without
causing the default renderer to fail, conversion specifiers will be permitted to contain a
custom suffix prefixed with a second ``!`` character. That is, ``!!<custom>``,
``!a!<custom>``, ``!r!<custom>``, ``!s!<custom>``, and ``!()!<custom>`` would all be
valid conversion specifiers in a template literal.
As described above, the default rendering supports the original ``!a``, ``!r`` and ``!s``
conversion specifiers defined in :pep:`3101`, together with the new ``!()`` lazy field
evaluation conversion specifier defined in this PEP. The default rendering ignores any
custom conversion specifier suffixes.
The full mapping between the standard conversion specifiers and the special methods called
on the interpolated value when the field is rendered is as follows:
* No conversion (empty string): ``__format__`` (with format specifier as parameter)
* ``a``: ``__repr__`` (as per the :func:`ascii` builtin)
* ``r``: ``__repr__`` (as per the :func:`repr` builtin)
* ``s``: ``__str__`` (as per the ``str`` builtin)
* ``()``: ``__call__`` (with no parameters)
When a conversion occurs, ``__format__`` (with the format specifier) is called on the result
of the conversion rather than being called on the original object.
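
This ordering is the same one f-strings use today, so it can be observed in
current Python. In the sketch below, the ``s`` format code is only accepted
because the conversion has already produced a string:

.. code-block:: python

    value = 42

    # "!r" converts first, then the format specifier is applied to the result
    print(repr(f"{value!r:>6}"))                         # '    42'
    print(f"{value!r:>6}" == format(repr(value), ">6"))  # True

    # The "s" format code is valid for str but not for int
    print(f"{value!s:s}")  # 42
    try:
        format(value, "s")
    except ValueError as exc:
        print("without conversion:", exc)
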
The changes to :func:`format` and the addition of :func:`!operator.convert_field` make it
straightforward for custom renderers to also support the standard conversion specifiers.
f-strings themselves will NOT support the new ``!()`` conversion specifier (as it is
redundant when value interpolation and value rendering always occur at the same time). They
also will NOT support the use of custom conversion specifiers (since the rendering function
is known at compile time and doesn't make use of the custom specifiers).
New field conversion API in the :mod:`operator` module
------------------------------------------------------
To support application of the standard conversion specifiers in custom template rendering
functions, a new :func:`!operator.convert_field` function will be added:
.. code-block:: python

    def convert_field(value, conversion_spec=''):
        """Apply the given string formatting conversion specifier to the given value"""
        # Only the standard part (before any second "!") is interpreted here;
        # a custom suffix is left for custom renderers to process
        std_spec, sep, custom_spec = conversion_spec.partition("!")
        match std_spec:
            case '':
                return value
            case 'a':
                return ascii(value)
            case 'r':
                return repr(value)
            case 's':
                return str(value)
            case '()':
                return value()
        if not sep:
            err = f"Invalid conversion specifier {std_spec!r}"
        else:
            err = f"Invalid conversion specifier {std_spec!r} in {conversion_spec!r}"
        raise ValueError(f"{err}: expected '', 'a', 'r', 's' or '()'")
Conversion specifier parameter added to :func:`format`
------------------------------------------------------
The signature and behaviour of the :func:`format` builtin will be updated:
.. code-block:: python

    def format(value, format_spec='', conversion_spec=''):
        if conversion_spec:
            value_to_format = operator.convert_field(value, conversion_spec)
        else:
            value_to_format = value
        # __format__ is looked up on (and applied to) the converted value
        return type(value_to_format).__format__(value_to_format, format_spec)
If a non-empty conversion specifier is given, the value will be converted with
:func:`!operator.convert_field` before looking up the ``__format__`` method.
The signature of the ``__format__`` special method does NOT change (only format specifiers
are handled by the object being formatted).
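The proposed behaviour can be approximated in current Python with a standalone function. This is only a sketch: ``format_with_conversion`` is a hypothetical stand-in for the extended builtin, and the converter table mirrors the standard specifiers listed earlier:

```python
def format_with_conversion(value, format_spec="", conversion_spec=""):
    """Emulate the proposed three-argument format() (illustrative stand-in)."""
    converters = {
        "": lambda v: v,      # no conversion
        "a": ascii,
        "r": repr,
        "s": str,
        "()": lambda v: v(),  # lazy field evaluation
    }
    # Any custom suffix after a second "!" is ignored, as in the default renderer
    std_spec, _sep, _custom = conversion_spec.partition("!")
    try:
        convert = converters[std_spec]
    except KeyError:
        raise ValueError(f"Invalid conversion specifier {std_spec!r}") from None
    converted = convert(value)
    # The format specifier is applied to the converted value, not the original
    return type(converted).__format__(converted, format_spec)


padded_repr = format_with_conversion(42, ">6", "r")
lazy_float = format_with_conversion(lambda: 3.14159, ".2f", "()")
plain = format_with_conversion(3.14159, ".2f")
```

Because ``__format__`` runs on the result of the conversion, a numeric format specifier such as ``.2f`` only works with ``!()`` here because the callable returns a float.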
Structural typing and duck typing
---------------------------------
To allow custom renderers to accept alternative interpolation template implementations
(rather than being tightly coupled to the native template literal types), the
following structural protocols will be added to the ``typing`` module:
.. code-block:: python

    @runtime_checkable
    class TemplateText(Protocol):
        # Renamed version of PEP 750's Decoded protocol
        def __str__(self) -> str:
            ...

        raw: str

    @runtime_checkable
    class TemplateField(Protocol):
        # Renamed and modified version of PEP 750's Interpolation protocol
        def __len__(self):
            ...

        def __getitem__(self, index: int):
            ...

        def __str__(self) -> str:
            ...

        expr: str
        value: Any
        format_spec: str | None = None
        conversion_spec: str | None = None

    @runtime_checkable
    class InterpolationTemplate(Protocol):
        # Corresponds to PEP 750's Template protocol
        def __iter__(self) -> Iterable[TemplateText|TemplateField]:
            ...

        raw_template: str
Note that the structural protocol APIs are substantially narrower than the full
implementation APIs defined for ``TemplateLiteralText``, ``TemplateLiteralField``,
and ``TemplateLiteral``.
Code that wants to accept interpolation templates and define specific handling for them
without introducing a dependency on the ``typing`` module, or restricting the code to
handling the concrete template literal types, should instead perform an attribute
existence check on ``raw_template``.
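A minimal sketch of that attribute existence check is shown below. ``FakeTemplate`` and ``describe`` are hypothetical names used only to exercise the check. They are not part of this PEP:

```python
def is_interpolation_template(obj) -> bool:
    """Duck-typed check: treat anything with ``raw_template`` as a template."""
    return hasattr(obj, "raw_template")


class FakeTemplate:
    """Hypothetical stand-in for a template object (demonstration only)."""

    def __init__(self, raw_template):
        self.raw_template = raw_template

    def __iter__(self):
        return iter(())


def describe(arg):
    """Dispatch on the duck-typed check without importing ``typing``."""
    if is_interpolation_template(arg):
        return f"template: {arg.raw_template}"
    return f"plain value: {arg}"


from_template = describe(FakeTemplate("cat {filename}"))
from_string = describe("cat file.txt")
```

The check deliberately ignores the concrete type, so alternative template implementations are accepted as long as they expose ``raw_template``.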
Writing custom renderers
------------------------
Writing a custom renderer doesn't require any special syntax. Instead,
custom renderers are ordinary callables that process an interpolation
template directly either by calling the ``render()`` method with alternate
``render_template``, ``render_text``, and/or ``render_field`` implementations, or by
accessing the template's data attributes directly.
For example, the following function would render a template using objects'
``repr`` implementations rather than their native formatting support:
.. code-block:: python

    import operator

    def repr_format(template):
        def render_field(value, format_spec, conversion_spec):
            # Apply the template's conversion specifier first, then take the repr
            converted_value = operator.convert_field(value, conversion_spec)
            return format(repr(converted_value), format_spec)
        return template.render(render_field=render_field)
The custom renderer shown respects the conversion specifiers in the original template, but
it is also possible to ignore them and render the interpolated values directly:
.. code-block:: python

    def input_repr_format(template):
        def render_field(value, format_spec, __):
            # The conversion specifier is deliberately ignored
            return format(repr(value), format_spec)
        return template.render(render_field=render_field)
When writing custom renderers, note that the return type of the overall
rendering operation is determined by the return type of the passed in ``render_template``
callable. While this will still be a string for formatting related use cases, producing
non-string objects *is* permitted. For example, a custom SQL
template renderer could involve an ``sqlalchemy.sql.text`` call that produces an
`SQL Alchemy query object <http://docs.sqlalchemy.org/en/rel_1_0/core/tutorial.html#using-textual-sql>`__.
A subprocess invocation related template renderer could produce a string sequence suitable
for passing to ``subprocess.run``, or it could even call ``subprocess.run`` directly, and
return the result.
Non-strings may also be returned from ``render_text`` and ``render_field``, as long as
they are paired with a ``render_template`` implementation that expects that behaviour.
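The following sketch illustrates a renderer whose ``render_field`` and ``render_template`` callables cooperate to return a non-string result. Everything here is an assumption made for illustration: ``StandInTemplate`` is a hypothetical stand-in for a template literal (its ``render()`` signature is modelled on the keyword arguments named above), and the ``?`` placeholder style is borrowed from DB-API "qmark" drivers such as ``sqlite3``:

```python
class StandInTemplate:
    """Hypothetical stand-in for a template literal.

    ``segments`` holds ``str`` items for literal text and
    ``(value, format_spec, conversion_spec)`` tuples for fields.
    """

    def __init__(self, *segments):
        self.segments = segments

    def render(self, *, render_template="".join, render_text=str,
               render_field=lambda value, spec, conv: format(value, spec)):
        rendered = []
        for segment in self.segments:
            if isinstance(segment, str):
                rendered.append(render_text(segment))
            else:
                rendered.append(render_field(*segment))
        return render_template(rendered)


class Param:
    """Marks a rendered field as a bound query parameter."""

    def __init__(self, value):
        self.value = value


def sql_with_params(template):
    """Return ``(query_text, params)`` instead of a plain string."""

    def render_field(value, format_spec, conversion_spec):
        return Param(value)  # keep the value out of the query text

    def render_template(parts):
        text, params = [], []
        for part in parts:
            if isinstance(part, Param):
                text.append("?")
                params.append(part.value)
            else:
                text.append(part)
        return "".join(text), tuple(params)

    return template.render(render_template=render_template,
                           render_field=render_field)


# Roughly what t"SELECT * FROM users WHERE name = {name};" might carry
name = "Robert'); DROP TABLE students;--"
template = StandInTemplate("SELECT * FROM users WHERE name = ", (name, "", ""), ";")
query, params = sql_with_params(template)
```

Here ``render_field`` returns ``Param`` objects rather than strings, which is only safe because the paired ``render_template`` knows how to handle them.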
Custom renderers using the pattern matching style described in :pep:`750` are also supported:
.. code-block:: python

    # Use the structural typing protocols rather than the concrete implementation types
    from typing import InterpolationTemplate, TemplateText, TemplateField

    def greet(template: InterpolationTemplate) -> str:
        """Render an interpolation template using structural pattern matching."""
        result = []
        for segment in template:
            match segment:
                case TemplateText() as text_segment:
                    # The protocol only guarantees __str__, so convert explicitly
                    result.append(str(text_segment))
                case TemplateField() as field_segment:
                    result.append(str(field_segment).upper())
        return f"{''.join(result)}!"
Expression evaluation
---------------------
As with f-strings, the subexpressions that are extracted from the interpolation
template are evaluated in the context where the template literal
appears. This means the expression has full access to local, nonlocal and global variables.
Any valid Python expression can be used inside ``{}``, including
function and method calls.
Because the substitution expressions are evaluated where the string appears in
the source code, there are no additional security concerns related to the
contents of the expression itself, as you could have also just written the
same expression and used runtime field parsing:
.. code-block:: python-console

    >>> bar = 10
    >>> def foo(data):
    ...     return data + 20
    ...
    >>> str(t'input={bar}, output={foo(bar)}')
    'input=10, output=30'
This is essentially equivalent to:
.. code-block:: python-console
>>> 'input={}, output={}'.format(bar, foo(bar))
'input=10, output=30'
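Since the scoping rules are shared with f-strings, the same behaviour can be checked in current Python without template literal support. The sketch below uses an f-string as a stand-in for ``str(t'...')``. The function name ``render_in_function`` is illustrative only:

```python
bar = 10


def foo(data):
    return data + 20


def render_in_function():
    bar = 1  # a local name shadows the global, as in any other expression
    return f"input={bar}, output={foo(bar)}"


# Expressions are evaluated where the literal appears in the source
eager = f"input={bar}, output={foo(bar)}"
# Runtime field parsing with explicitly passed values gives the same result
runtime = "input={}, output={}".format(bar, foo(bar))
local_result = render_in_function()
```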
Handling code injection attacks
-------------------------------
The :pep:`498` formatted string syntax makes it potentially attractive to write
code like the following:
.. code-block:: python
runquery(f"SELECT {column} FROM {table};")
runcommand(f"cat {filename}")
return_response(f"<html><body>{response.body}</body></html>")
These all represent potential vectors for code injection attacks, if any of the
variables being interpolated happen to come from an untrusted source. The
specific proposal in this PEP is designed to make it straightforward to write
use case specific renderers that take care of quoting interpolated values
appropriately for the relevant security context:
.. code-block:: python
runquery(sql(t"SELECT {column} FROM {table} WHERE column={value};"))
runcommand(sh(t"cat {filename}"))
return_response(html(t"<html><body>{response.body}</body></html>"))
This PEP does not cover adding all such renderers to the standard library
immediately (though one for shell escaping is proposed), but rather proposes to ensure
that they can be readily provided by third party libraries, and potentially incorporated
into the standard library at a later date.
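To make the risk concrete, the sketch below shows the per-field escaping that an HTML renderer would be expected to perform, written out by hand with the existing :func:`html.escape` function. The ``render_page`` helper is hypothetical and stands in for a call such as ``html(t"...")``:

```python
import html


def render_page(body: str) -> str:
    """Escape the interpolated value before placing it in the markup."""
    return f"<html><body>{html.escape(body)}</body></html>"


untrusted = "<script>alert('hi')</script>"
# Plain f-string interpolation passes the markup through unchanged
unsafe_page = f"<html><body>{untrusted}</body></html>"
# Escaping the field neutralises the injected element
safe_page = render_page(untrusted)
```

With a template literal, the renderer rather than the caller would be responsible for applying the escaping to each field.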
Over time, it is expected that APIs processing potentially dangerous string inputs may be
updated to accept interpolation templates natively, allowing problematic code examples to
be fixed simply by replacing the ``f`` string prefix with a ``t``:
.. code-block:: python
runquery(t"SELECT {column} FROM {table};")
runcommand(t"cat {filename}")
return_response(t"<html><body>{response.body}</body></html>")
It is proposed that a renderer is included in the :mod:`shlex` module, aiming to offer a
more POSIX shell style experience for accessing external programs, without the significant
risks posed by running ``os.system`` or enabling the system shell when using the
``subprocess`` module APIs. This renderer will provide an interface for running external
programs inspired by that offered by the
`Julia programming language <https://docs.julialang.org/en/v1/manual/running-external-programs/>`__,
only with the backtick based ``\`cat $filename\``` syntax replaced by ``t"cat {filename}"``
style template literals. See more in the :ref:`pep-501-shlex-module` section.
Error handling
--------------
Either compile time or run time errors can occur when processing interpolation
expressions. Compile time errors are limited to those errors that can be
detected when parsing a template string into its component tuples. These
errors all raise SyntaxError.
Unmatched braces::

    >>> t'x={x'
      File "<stdin>", line 1
        t'x={x'
           ^
    SyntaxError: missing '}' in template literal expression

Invalid expressions::

    >>> t'x={!x}'
      File "<fstring>", line 1
        !x
        ^
    SyntaxError: invalid syntax
Run time errors occur when evaluating the expressions inside a
template string before creating the template literal object. See :pep:`498`
for some examples.
Different renderers may also impose additional runtime
constraints on acceptable interpolated expressions and other formatting
details, which will be reported as runtime exceptions.
.. _pep-501-shlex-module:
Renderer for shell escaping added to :mod:`shlex`
-------------------------------------------------
As a reference implementation, a renderer for safe POSIX shell escaping can be added to
the :mod:`shlex` module. This renderer would be called ``sh`` and would be equivalent to
calling ``shlex.quote`` on each field value in the template literal.
Thus:
.. code-block:: python
os.system(shlex.sh(t'cat {myfile}'))
would have the same behavior as:
.. code-block:: python
os.system('cat ' + shlex.quote(myfile))
The implementation would be:
.. code-block:: python

    def sh(template: TemplateLiteral):
        def render_field(value, format_spec, conversion_spec):
            # Format the field first, then quote the resulting text
            field_text = format(value, format_spec, conversion_spec)
            return quote(field_text)
        return template.render(render_field=render_field)
The addition of ``shlex.sh`` will NOT change the existing admonishments in the
:mod:`subprocess` documentation that passing ``shell=True`` is best avoided, nor the
reference from the :func:`os.system` documentation to the higher level ``subprocess`` APIs.
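The quoting behaviour of the proposed renderer can be approximated today with :func:`shlex.quote`. In this sketch, ``sh_segments`` is a hypothetical function that takes a plain list as a stand-in for a template literal: ``str`` items are literal text and one-element tuples are interpolated field values:

```python
import shlex


def sh_segments(segments):
    """Quote every field value and leave literal text untouched.

    ``segments`` is a hypothetical stand-in for a template literal.
    """
    parts = []
    for segment in segments:
        if isinstance(segment, str):
            parts.append(segment)  # literal text from the template
        else:
            (value,) = segment     # interpolated field value
            parts.append(shlex.quote(format(value)))
    return "".join(parts)


myfile = "notes.txt; rm -rf ~"
# Roughly what shlex.sh(t'cat {myfile}') is described as producing
command = sh_segments(["cat ", (myfile,)])
# Embedded single quotes are handled by shlex.quote as well
tricky = sh_segments(["cat ", ("it's",)])
```

Only the field values are quoted. The literal text written by the developer is trusted and passed through unchanged.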
Changes to subprocess module
----------------------------
With the additional renderer in the shlex module, and the addition of template literals,
the :mod:`subprocess` module can be changed to handle accepting template literals
as an additional input type to ``Popen``, as it already accepts a sequence, or a string,
with different behavior for each.
With the addition of template literals, :class:`subprocess.Popen` (and in turn, all its
higher level functions such as :func:`subprocess.run`) could accept strings in a safe way
(at least on :ref:`POSIX systems <pep-501-defer-non-posix-shells>`).
For example:
.. code-block:: python
subprocess.run(t'cat {myfile}', shell=True)
would automatically use the ``shlex.sh`` renderer provided in this PEP. Therefore, using
``shlex`` inside a ``subprocess.run`` call like so:
.. code-block:: python
subprocess.run(shlex.sh(t'cat {myfile}'), shell=True)
would be redundant, as ``run`` would automatically render any template literals
through ``shlex.sh``.
Alternatively, when ``subprocess.Popen`` is run without ``shell=True``, it could still
provide subprocess with a more ergonomic syntax. For example:
.. code-block:: python
subprocess.run(t'cat {myfile} --flag {value}')
would be equivalent to:
.. code-block:: python
subprocess.run(['cat', myfile, '--flag', value])
or, more accurately:
.. code-block:: python
subprocess.run(shlex.split(f'cat {shlex.quote(myfile)} --flag {shlex.quote(value)}'))
It would do this by first using the ``shlex.sh`` renderer, as above, then using
``shlex.split`` on the result.
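The quote-then-split round trip can be verified with the existing :mod:`shlex` functions. The sample values below are arbitrary and chosen to include characters that a shell would otherwise treat specially:

```python
import shlex

myfile = "my file.txt"
value = "a b; c"

# Quote each field, as the shlex.sh renderer is described as doing
rendered = f"cat {shlex.quote(myfile)} --flag {shlex.quote(value)}"
# Splitting the rendered command recovers the original values intact
argv = shlex.split(rendered)
```

Each interpolated value ends up as exactly one argument, regardless of any whitespace or shell metacharacters it contains.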
The implementation inside ``subprocess.Popen._execute_child`` would look like:
.. code-block:: python

    if hasattr(args, "raw_template"):
        import shlex
        if shell:
            args = [shlex.sh(args)]
        else:
            args = shlex.split(shlex.sh(args))
How to Teach This
=================
This PEP intentionally includes two standard renderers that will always be available in
teaching environments: the :func:`format` builtin and the new ``shlex.sh`` POSIX shell
renderer.
Together, these two renderers can be used to build an initial understanding of delayed
rendering on top of a student's initial introduction to string formatting with f-strings.
This initial understanding would have the goal of allowing students to *use* template
literals effectively, in combination with pre-existing template rendering functions.
For example, ``f"{'some text'}"``, ``f"{value}"``, ``f"{value!r}"``, and ``f"{callable()}"``
could all be introduced.
Those same operations could then be rewritten as ``format(t"{'some text'}")``,
``format(t"{value}")``, ``format(t"{value!r}")``, and ``format(t"{callable()}")`` to
illustrate the relationship between the eager rendering form and the delayed rendering
form.
The difference between "template definition time" (or "interpolation time") and
"template rendering time" can then be investigated further by storing the template literals
as local variables and looking at their representations separately from the results of the
``format`` calls. At this point, the ``t"{callable!()}"`` syntax can be introduced to
distinguish between field expressions that are called at template definition time and those
that are called at template rendering time.
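The distinction can also be demonstrated with ordinary callables before introducing any new syntax. In this sketch, ``render_field_value`` is a hypothetical stand-in for a renderer processing a single field, and the comments describe what the corresponding t-string field would be expected to capture:

```python
calls = []


def current_status():
    calls.append("called")
    return "ready"


# Roughly what t"{current_status()}" captures: the result of the call,
# made at template definition time
eager_field = current_status()

# Roughly what t"{current_status!()}" captures: the callable itself,
# to be invoked at template rendering time
lazy_field = current_status


def render_field_value(field, lazy):
    """Stand-in for rendering one field, calling it only if marked lazy."""
    return str(field() if lazy else field)
```

Rendering ``eager_field`` never triggers another call, while each rendering of ``lazy_field`` calls the function again.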
Finally, the differences between the results of ``f"{'some text'}"``,
``format(t"{'some text'}")``, and ``shlex.sh(t"{'some text'}")`` could be explored to
illustrate the potential for differences between the default rendering function and custom
rendering functions.
Actually defining your own custom template rendering functions would then be a separate more
advanced topic (similar to the way students are routinely taught to use decorators and
context managers well before they learn how to write their own custom ones).
:pep:`750` includes further ideas for teaching aspects of the delayed rendering topic.
Discussion
==========
Refer to :pep:`498` for previous discussion, as several of the points there
also apply to this PEP. :pep:`750`'s design discussions are also highly relevant,
as that PEP inspired several aspects of the current design.
Support for binary interpolation
--------------------------------
As f-strings don't handle byte strings, neither will t-strings.
Interoperability with str-only interfaces
-----------------------------------------
For interoperability with interfaces that only accept strings, interpolation
templates can still be prerendered with :func:`format`, rather than delegating the
rendering to the called function.
This reflects the key difference from :pep:`498`, which *always* eagerly applies
the default rendering, without any way to delegate the choice of renderer to
another section of the code.
Preserving the raw template string
----------------------------------
Earlier versions of this PEP failed to make the raw template string available
on the template literal. Retaining it makes it possible to provide a more
attractive template representation, as well as providing the ability to
precisely reconstruct the original string, including both the expression text
and the details of any eagerly rendered substitution fields in format specifiers.
Creating a rich object rather than a global name lookup
-------------------------------------------------------
Earlier versions of this PEP used an ``__interpolate__`` builtin, rather than
creating a new kind of object for later consumption by interpolation
functions. Creating a rich descriptive object with a useful default renderer
made it much easier to support customisation of the semantics of interpolation.
Building atop f-strings rather than replacing them
--------------------------------------------------
Earlier versions of this PEP attempted to serve as a complete substitute for
:pep:`498` (f-strings). With the acceptance of that PEP and the more recent :pep:`701`,
this PEP can instead build a more flexible delayed rendering capability
on top of the existing f-string eager rendering.
Assuming the presence of f-strings as a supporting capability simplified a
number of aspects of the proposal in this PEP (such as how to handle substitution
fields in format specifiers).
Defining repetition and concatenation semantics
-----------------------------------------------
This PEP explicitly defines repetition and concatenation semantics for ``TemplateLiteral``
and ``TemplateLiteralText``. While not strictly necessary, defining these is expected
to make the types easier to work with in code that historically only supported regular
strings.
New conversion specifier for lazy field evaluation
--------------------------------------------------
The initially published version of :pep:`750` defaulted to lazy evaluation for all
interpolation fields. While it was subsequently updated to default to eager evaluation
(as happens for f-strings and this PEP), the discussions around the topic prompted the idea
of providing a way to indicate to rendering functions that the interpolated field value
should be called at rendering time rather than being used without modification.
Since PEP 750 also deferred the processing of conversion specifiers until evaluation time,
the suggestion was put forward that invoking ``__call__`` without arguments could be seen
as similar to the existing conversion specifiers that invoke ``__repr__`` (``!a``, ``!r``)
or ``__str__`` (``!s``).
Accordingly, this PEP was updated to also make conversion specifier processing the
responsibility of rendering functions, and to introduce ``!()`` as a new conversion
specifier for lazy evaluation.
Adding :func:`!operator.convert_field` and updating the :func:`format` builtin was then
a matter of providing appropriate support to rendering function implementations that
wanted to accept the default conversion specifiers.
Allowing arbitrary conversion specifiers in custom renderers
------------------------------------------------------------
Accepting ``!()`` as a new conversion specifier necessarily requires updating the syntax
that the parser accepts for conversion specifiers (they are currently restricted to
identifiers). This then raised the question of whether t-string compilation should enforce
the additional restriction that f-string compilation imposes: that the conversion specifier
be exactly one of ``!a``, ``!r``, or ``!s``.
With t-strings already being updated to allow ``!()`` when compiled, it made sense to treat
conversion specifiers as relating to rendering functions in the same way that format
specifiers relate to the formatting of individual objects: aside from some characters that
are excluded for parsing reasons, they are otherwise free text fields with the meaning
decided by the consuming function or object. This reduces the temptation to introduce
renderer specific metaformatting into the template's format specifiers (since any
renderer specific information can be placed in the conversion specifier instead).
Only reserving a single new string prefix
-----------------------------------------
The primary difference between this PEP and :pep:`750` is that the latter aims to enable
the use of arbitrary string prefixes, rather than requiring the creation of template
literal instances that are then passed to other APIs. For example, PEP 750 would allow
the ``sh`` renderer described in this PEP to be used as ``sh"cat {somefile}"`` rather than
requiring the template literal to be created explicitly and then passed to a regular
function call (as in ``sh(t"cat {somefile}")``).
The main reason the PEP authors prefer the second spelling is that it makes it clearer
to a reader what is going on: a template literal instance is being created, and then
passed to a callable that knows how to do something useful with interpolation template
instances.
A `draft proposal <https://discuss.python.org/t/pep-750-tag-strings-for-writing-domain-specific-languages/60408/176>`__
from one of the :pep:`750` authors also suggests that static typecheckers will be able
to infer the use of particular domain specific languages just as readily from the form
that uses an explicit function call as they would be able to infer it from a directly
tagged string.
With the tagged string syntax at least arguably reducing clarity for human readers without
increasing the overall expressiveness of the construct, it seems reasonable to start with
the smallest viable proposal (a single new string prefix), and then revisit the potential
value of generalising to arbitrary prefixes in the future.
As a lesser, but still genuine, consideration, only using a single new string prefix for
this use case leaves open the possibility of defining alternate prefixes in the future that
still produce ``TemplateLiteral`` objects, but use a different syntax within the string to
define the interpolation fields (see the :ref:`i18n discussion <pep-501-defer-i18n>` below).
Deferring consideration of more concise delayed evaluation syntax
-----------------------------------------------------------------
During the discussions of delayed evaluation, ``{-> expr}`` was
`suggested <https://discuss.python.org/t/pep-750-tag-strings-for-writing-domain-specific-languages/60408/112>`__
as potential syntactic sugar for the already supported ``lambda`` based syntax:
``{(lambda: expr)}`` (the parentheses are required in the existing syntax to avoid
misinterpretation of the ``:`` character as indicating the start of the format specifier).
While adding such a spelling would complement the rendering time function call syntax
proposed in this PEP (that is, writing ``{-> expr!()}`` to evaluate arbitrary expressions
at rendering time), it is a topic that the PEP authors consider to be better left to a
future PEP if this PEP or :pep:`750` is accepted.
Deferring consideration of possible logging integration
-------------------------------------------------------
One of the challenges with the logging module has been that we have previously
been unable to devise a reasonable migration strategy away from the use of
printf-style formatting. While the logging module does allow formatters to specify the
use of :meth:`str.format` or :class:`string.Template` style substitution, it can be awkward
to ensure that messages written that way are only ever processed by log record formatters
that are expecting that syntax.
The runtime parsing and interpolation overhead for logging messages also poses a problem
for extensive logging of runtime events for monitoring purposes.
While beyond the scope of this initial PEP, template literal support
could potentially be added to the logging module's event reporting APIs,
permitting relevant details to be captured using forms like:
.. code-block:: python
logging.debug(t"Event: {event}; Details: {data}")
logging.critical(t"Error: {error}; Details: {data}")
Rather than the historical mod-formatting style:
.. code-block:: python
logging.debug("Event: %s; Details: %s", event, data)
logging.critical("Error: %s; Details: %s", error, data)
As the template literal is passed in as an ordinary argument, other
keyword arguments would also remain available:
.. code-block:: python
logging.critical(t"Error: {error}; Details: {data}", exc_info=True)
The approach to standardising lazy field evaluation described in this PEP is
primarily based on the anticipated needs of this hypothetical integration into
the logging module:
.. code-block:: python
logging.debug(t"Eager evaluation of {expensive_call()}")
logging.debug(t"Lazy evaluation of {expensive_call!()}")
logging.debug(t"Eager evaluation of {expensive_call_with_args(x, y, z)}")
logging.debug(t"Lazy evaluation of {(lambda: expensive_call_with_args(x, y, z))!()}")
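For comparison, deferred evaluation can be achieved with the current logging API by wrapping the callable in an object whose ``__str__`` performs the call. This is a sketch of an existing workaround, not of the hypothetical integration itself. ``LazyStr`` and ``ListHandler`` are illustrative helper classes:

```python
import logging


class LazyStr:
    """Defer calling ``func`` until the log message is actually formatted."""

    def __init__(self, func):
        self._func = func

    def __str__(self):
        return str(self._func())


class ListHandler(logging.Handler):
    """Collect formatted messages in a list (for demonstration)."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


calls = []


def expensive_call():
    calls.append("called")
    return 42


logger = logging.getLogger("pep501.lazy_demo")
logger.propagate = False
logger.setLevel(logging.INFO)
handler = ListHandler()
logger.addHandler(handler)

# Filtered out by the logger level, so expensive_call() never runs
logger.debug("Lazy evaluation of %s", LazyStr(expensive_call))
calls_after_debug = list(calls)
# Emitted, so the call happens when the message is formatted
logger.info("Lazy evaluation of %s", LazyStr(expensive_call))
```

The ``!()`` conversion specifier would let a template-aware logging API express the same intent directly in the message, without a wrapper class.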
It's an open question whether the definition of logging formatters would be updated to
support template strings, but if they were, the most likely way of defining fields which
should be :ref:`looked up on the log record <logrecord-attributes>` instead of being
interpreted eagerly is simply to escape them so they're available as part of the literal
text:
.. code-block:: python
proc_id = get_process_id()
formatter = logging.Formatter(t"{{asctime}}:{proc_id}:{{name}}:{{levelname}}{{message}}")
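The escaping approach can be tried today with an f-string and the existing ``style="{"`` formatter option. This is only an approximation of the idea: the process ID is a placeholder value, and a colon has been added between the last two fields for readability:

```python
import logging

proc_id = 1234  # placeholder for a real process identifier

# Doubled braces survive interpolation as literal text for the formatter
fmt = f"{{asctime}}:{proc_id}:{{name}}:{{levelname}}:{{message}}"
formatter = logging.Formatter(fmt, style="{")

record = logging.LogRecord(
    name="demo", level=logging.INFO, pathname="demo.py", lineno=1,
    msg="hello", args=(), exc_info=None,
)
line = formatter.format(record)
```

The eagerly interpolated ``proc_id`` is fixed when the format string is built, while the escaped fields are looked up on each log record.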
.. _pep-501-defer-i18n:
Deferring consideration of possible use in i18n use cases
---------------------------------------------------------
The initial motivating use case for this PEP was providing a cleaner syntax
for i18n (internationalization) translation, as that requires access to the original
unmodified template. As such, it focused on compatibility with the substitution syntax
used in Python's :class:`string.Template` formatting and Mozilla's l20n project.
However, subsequent discussion revealed there are significant additional
considerations to be taken into account in the i18n use case, which don't
impact the simpler cases of handling interpolation into security sensitive
contexts (like HTML, system shells, and database queries), or producing
application debugging messages in the preferred language of the development
team (rather than the native language of end users).
Due to that realisation, the PEP was switched to use the :meth:`str.format` substitution
syntax originally defined in :pep:`3101` and subsequently used as the basis for :pep:`498`.
While it would theoretically be possible to update :class:`string.Template` to support
the creation of instances from native template literals, and to implement the structural
``typing.InterpolationTemplate`` protocol, the PEP authors have not identified any practical benefit
in doing so.
However, one significant benefit of the "only one string prefix" approach used in this PEP
is that while it generalises the existing f-string interpolation syntax to support delayed
rendering through t-strings, it doesn't imply that that should be the *only* compiler
supported interpolation syntax that Python should ever offer.
Most notably, it leaves the door open to an alternate "t$-string" syntax that would allow
``TemplateLiteral`` instances to be created using a :pep:`292` based interpolation syntax
rather than a :pep:`3101` based syntax::

    template = t$"Substitute $words and ${other_values} at runtime"
The only runtime distinction between templates created that way and templates created from
regular t-strings would be in the contents of their ``raw_template`` attributes.
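For reference, the :pep:`292` substitution syntax is already available at runtime through :class:`string.Template`. The sketch below shows that existing behaviour only. A hypothetical "t$-string" would differ in evaluating the names where the literal appears rather than looking them up in arguments passed at substitution time:

```python
from string import Template

template = Template("Substitute $words and ${other_values} at runtime")
# Names are resolved from keyword arguments when substitute() is called
rendered = template.substitute(words="these words", other_values="those values")
```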
.. _pep-501-defer-non-posix-shells:
Deferring escaped rendering support for non-POSIX shells
--------------------------------------------------------
:func:`shlex.quote` works by classifying the regex character set ``[\w@%+=:,./-]`` to be
safe, deeming all other characters to be unsafe, and hence requiring quoting of the string
containing them. The quoting mechanism used is then specific to the way that string quoting
works in POSIX shells, so it cannot be trusted when running a shell that doesn't follow
POSIX shell string quoting rules.
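The classification described above can be reproduced with a short check. This is a sketch: ``needs_quoting`` is a hypothetical helper, and the use of ``re.ASCII`` is an assumption that matches the behaviour of :func:`shlex.quote` for the ASCII samples shown:

```python
import re
import shlex

# The safe character set quoted in the text, applied to the whole string
SAFE_ONLY = re.compile(r"[\w@%+=:,./-]+", re.ASCII)


def needs_quoting(text: str) -> bool:
    """True when ``text`` is empty or has a character outside the safe set."""
    return SAFE_ONLY.fullmatch(text) is None


samples = ["pwd", "; pwd", "'pwd'", ""]
# shlex.quote leaves a string unchanged exactly when no quoting is needed
results = {sample: shlex.quote(sample) for sample in samples}
```

Strings that need quoting are wrapped in single quotes, with any embedded single quote rewritten as ``'"'"'``, which is a POSIX shell idiom that other shells do not interpret the same way.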
For example, running ``subprocess.run(f'echo {shlex.quote(sys.argv[1])}', shell=True)`` is
safe when using a shell that follows POSIX quoting rules::

    $ cat > run_quoted.py
    import sys, shlex, subprocess
    subprocess.run(f"echo {shlex.quote(sys.argv[1])}", shell=True)
    $ python3 run_quoted.py pwd
    pwd
    $ python3 run_quoted.py '; pwd'
    ; pwd
    $ python3 run_quoted.py "'pwd'"
    'pwd'
but remains unsafe when the shell invoked from Python is ``cmd.exe`` (or PowerShell)::

    S:\> echo import sys, shlex, subprocess > run_quoted.py
    S:\> echo subprocess.run(f"echo {shlex.quote(sys.argv[1])}", shell=True) >> run_quoted.py
    S:\> type run_quoted.py
    import sys, shlex, subprocess
    subprocess.run(f"echo {shlex.quote(sys.argv[1])}", shell=True)
    S:\> python3 run_quoted.py "echo OK"
    'echo OK'
    S:\> python3 run_quoted.py "'& echo Oh no!"
    ''"'"'
    Oh no!'
Resolving this standard library limitation is beyond the scope of this PEP.
Acknowledgements
================
* Eric V. Smith for creating :pep:`498` and demonstrating the feasibility of
arbitrary expression substitution in string interpolation
* The authors of :pep:`750` for the substantial design improvements that tagged strings
inspired for this PEP, their general advocacy for the value of language level delayed
template rendering support, and their efforts to ensure that any native interpolation
template support lays a strong foundation for future efforts in providing robust syntax
highlighting and static type checking support for domain specific languages
* Barry Warsaw, Armin Ronacher, and Mike Miller for their contributions to
exploring the feasibility of using this model of delayed rendering in i18n
use cases (even though the ultimate conclusion was that it was a poor fit,
at least for current approaches to i18n in Python)
References
==========
* `%-formatting
  <https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting>`_
* `str.format
  <https://docs.python.org/3/library/string.html#formatstrings>`_
* `string.Template documentation
  <https://docs.python.org/3/library/string.html#template-strings>`_
* :pep:`215`: String Interpolation
* :pep:`292`: Simpler String Substitutions
* :pep:`3101`: Advanced String Formatting
* :pep:`498`: Literal string formatting
* :pep:`675`: Arbitrary Literal String Type
* :pep:`701`: Syntactic formalization of f-strings
* `FormattableString and C# native string interpolation
  <https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/tokens/interpolated>`_
* `IFormattable interface in C# (see remarks for globalization notes)
  <https://docs.microsoft.com/en-us/dotnet/api/system.iformattable>`_
* `TemplateLiterals in Javascript
  <https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals>`_
* `Running external commands in Julia
  <https://docs.julialang.org/en/v1/manual/running-external-programs/>`_
Copyright
=========
This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.