PEP: 750
Title: Template Strings
Author: Jim Baker <jim.baker@python.org>,
        Guido van Rossum <guido@python.org>,
        Paul Everitt <pauleveritt@me.com>,
        Koudai Aono <koxudaxi@gmail.com>,
        Lysandros Nikolaou <lisandrosnik@gmail.com>,
        Dave Peck <davepeck@davepeck.org>
Discussions-To: https://discuss.python.org/t/pep-750-tag-strings-for-writing-domain-specific-languages/60408
Status: Draft
Type: Standards Track
Created: 08-Jul-2024
Python-Version: 3.14
Post-History: `09-Aug-2024 <https://discuss.python.org/t/60408>`__,
              `17-Oct-2024 <https://discuss.python.org/t/60408/201>`__,
              `21-Oct-2024 <https://discuss.python.org/t/60408/226>`__

Abstract
========
This PEP introduces template strings for custom string processing.
Template strings are a generalization of f-strings, using a ``t`` in place of
the ``f`` prefix. Instead of evaluating to ``str``, t-strings evaluate to a new
type, ``Template``:

.. code-block:: python

    template: Template = t"Hello {name}"

Templates provide developers with access to the string and its interpolated
values *before* they are combined. This brings native flexible string
processing to the Python language and enables safety checks, web templating,
domain-specific languages, and more.
Relationship With Other PEPs
============================
Python introduced f-strings in Python 3.6 with :pep:`498`. The grammar was
then formalized in :pep:`701` which also lifted some restrictions. This PEP
is based on PEP 701.
At nearly the same time PEP 498 arrived, :pep:`501` was written to provide
"i-strings" -- that is, "interpolation template strings". The PEP was
deferred pending further experience with f-strings. Work on PEP 501 was
resumed by a different author in March 2023, introducing "t-strings" as template
literal strings built atop PEP 701.
The authors of this PEP consider it to be a generalization and simplification
of the updated work in PEP 501. (That PEP has also recently been updated to
reflect the new ideas in this PEP.)
Motivation
==========
Python f-strings are easy to use and very popular. Over time, however, developers
have encountered limitations that make them
`unsuitable for certain use cases <https://docs.djangoproject.com/en/5.1/ref/utils/#django.utils.html.format_html>`__.
In particular, f-strings provide no way to intercept and transform interpolated
values before they are combined into a final string.
As a result, incautious use of f-strings can lead to security vulnerabilities.
For example, a user executing a SQL query with :mod:`python:sqlite3`
may be tempted to use an f-string to embed values into their SQL expression,
which could lead to a `SQL injection attack <https://en.wikipedia.org/wiki/SQL_injection>`__.
Or, a developer building HTML may include unescaped user input in the string,
leading to a `cross-site scripting (XSS) <https://en.wikipedia.org/wiki/Cross-site_scripting>`__
vulnerability.
More broadly, the inability to transform interpolated values before they are
combined into a final string limits the utility of f-strings in more complex
string processing tasks.
Template strings address these problems by providing
developers with access to the string and its interpolated values.
For example, imagine we want to generate some HTML. Using template strings,
we can define an ``html()`` function that allows us to automatically sanitize
content:

.. code-block:: python

    evil = "<script>alert('evil')</script>"
    template = t"<p>{evil}</p>"
    assert html(template) == "<p>&lt;script&gt;alert('evil')&lt;/script&gt;</p>"

Likewise, our hypothetical ``html()`` function can make it easy for developers
to add attributes to HTML elements using a dictionary:

.. code-block:: python

    attributes = {"src": "shrubbery.jpg", "alt": "looks nice"}
    template = t"<img {attributes} />"
    assert html(template) == '<img src="shrubbery.jpg" alt="looks nice" />'

Neither of these examples is possible with f-strings. By providing a
mechanism to intercept and transform interpolated values, template strings
enable a wide range of string processing use cases.
Specification
=============
Template String Literals
------------------------
This PEP introduces a new string prefix, ``t``, to define template string literals.
These literals resolve to a new type, ``Template``, found in a new top-level
standard library module, ``templatelib``.
The following code creates a ``Template`` instance:

.. code-block:: python

    from templatelib import Template
    template = t"This is a template string."
    assert isinstance(template, Template)

Template string literals support the full syntax of :pep:`701`. This includes
the ability to nest template strings within interpolations, as well as the ability
to use all valid quote marks (``'``, ``"``, ``'''``, and ``"""``). Like other string
prefixes, the ``t`` prefix must immediately precede the quote. Like f-strings,
both lowercase ``t`` and uppercase ``T`` prefixes are supported. Like
f-strings, t-strings may not be combined with the ``b`` or ``u`` prefixes.
Additionally, f-strings and t-strings cannot be combined, so the ``ft``
prefix is invalid as well. t-strings *may* be combined with the ``r`` prefix;
see the `Raw Template Strings`_ section below for more information.
The ``Template`` Type
---------------------
Template strings evaluate to an instance of a new type, ``templatelib.Template``:

.. code-block:: python

    class Template:
        args: Sequence[str | Interpolation]

        def __init__(self, *args: str | Interpolation):
            ...

The ``args`` attribute provides access to the string parts and
any interpolations in the literal:

.. code-block:: python

    name = "World"
    template = t"Hello {name}"
    assert isinstance(template.args[0], str)
    assert isinstance(template.args[1], Interpolation)
    assert template.args[0] == "Hello "
    assert template.args[1].value == "World"

See `Interleaving of Template.args`_ below for more information on how the
``args`` attribute is structured.
The ``Template`` type is shallowly immutable: ``Template.args`` cannot be
reassigned or mutated, although the values held by its interpolations may
themselves be mutable objects.
The ``Interpolation`` Type
--------------------------
The ``Interpolation`` type represents an expression inside a template string.
Like ``Template``, it is a new concrete type found in the ``templatelib`` module:

.. code-block:: python

    class Interpolation:
        value: object
        expr: str
        conv: Literal["a", "r", "s"] | None
        format_spec: str

        __match_args__ = ("value", "expr", "conv", "format_spec")

        def __init__(
            self,
            value: object,
            expr: str,
            conv: Literal["a", "r", "s"] | None = None,
            format_spec: str = "",
        ):
            ...

Like ``Template``, ``Interpolation`` is shallowly immutable. Its attributes
cannot be reassigned.
The ``value`` attribute is the evaluated result of the interpolation:

.. code-block:: python

    name = "World"
    template = t"Hello {name}"
    assert template.args[1].value == "World"

The ``expr`` attribute is the *original text* of the interpolation:

.. code-block:: python

    name = "World"
    template = t"Hello {name}"
    assert template.args[1].expr == "name"

We expect that the ``expr`` attribute will not be used in most template processing
code. It is provided for completeness and for use in debugging and introspection.
See both the `Common Patterns Seen in Processing Templates`_ section and the
`Examples`_ section for more information on how to process template strings.
The ``conv`` attribute is the :ref:`optional conversion <python:formatstrings>`
to be used, one of ``r``, ``s``, and ``a``, corresponding to ``repr()``,
``str()``, and ``ascii()`` conversions. As with f-strings, no other conversions
are supported:

.. code-block:: python

    name = "World"
    template = t"Hello {name!r}"
    assert template.args[1].conv == "r"

If no conversion is provided, ``conv`` is ``None``.
The ``format_spec`` attribute is the :ref:`format specification <python:formatspec>`.
As with f-strings, this is an arbitrary string that defines how to present the value:

.. code-block:: python

    value = 42
    template = t"Value: {value:.2f}"
    assert template.args[1].format_spec == ".2f"

Format specifications in f-strings can themselves contain interpolations. This
is permitted in template strings as well; ``format_spec`` is set to the eagerly
evaluated result:

.. code-block:: python

    value = 42
    precision = 2
    template = t"Value: {value:.{precision}f}"
    assert template.args[1].format_spec == ".2f"

If no format specification is provided, ``format_spec`` defaults to an empty
string (``""``). This matches the ``format_spec`` parameter of Python's
:func:`python:format` built-in.
Unlike f-strings, it is up to code that processes the template to determine how to
interpret the ``conv`` and ``format_spec`` attributes.
Such code is not required to use these attributes, but when present they should
be respected, and to the extent possible match the behavior of f-strings.
It would be surprising if, for example, a template string that uses ``{value:.2f}``
did not round the value to two decimal places when processed.
Processing Template Strings
---------------------------
Developers can write arbitrary code to process template strings. For example,
the following function renders static parts of the template in lowercase and
interpolations in uppercase:

.. code-block:: python

    from templatelib import Template, Interpolation

    def lower_upper(template: Template) -> str:
        """Render static parts lowercased and interpolations uppercased."""
        parts: list[str] = []
        for arg in template.args:
            if isinstance(arg, Interpolation):
                parts.append(str(arg.value).upper())
            else:
                parts.append(arg.lower())
        return "".join(parts)

    name = "world"
    assert lower_upper(t"HELLO {name}") == "hello WORLD"

There is no requirement that template strings are processed in any particular
way. Code that processes templates has no obligation to return a string.
Template strings are a flexible, general-purpose feature.
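
As one illustration of a processor that does not return a string, the sketch
below turns a template into a parameterized SQL query plus a tuple of values to
bind, which is one way to avoid the injection risk described in the
`Motivation`_ section. The ``sql()`` function is hypothetical and not part of
this PEP, and the ``Template`` and ``Interpolation`` classes are simplified
stand-ins for the proposed ``templatelib`` types so that the example runs
without t-string syntax:

.. code-block:: python

    import sqlite3
    from dataclasses import dataclass


    @dataclass(frozen=True)
    class Interpolation:  # simplified stand-in for templatelib.Interpolation
        value: object
        expr: str = ""
        conv: str | None = None
        format_spec: str = ""


    @dataclass(frozen=True)
    class Template:  # simplified stand-in for templatelib.Template
        args: tuple


    def sql(template: Template) -> tuple[str, tuple]:
        """Hypothetical processor: return a ``?`` query and the values to bind."""
        query, params = [], []
        for arg in template.args:
            if isinstance(arg, Interpolation):
                query.append("?")         # never splice the value into the SQL text
                params.append(arg.value)  # hand it to the driver separately
            else:
                query.append(arg)
        return "".join(query), tuple(params)


    # Stand-in for: t"SELECT name FROM users WHERE name = {user}"
    user = "x' OR '1'='1"
    template = Template(
        ("SELECT name FROM users WHERE name = ", Interpolation(user, "user"), "")
    )
    query, params = sql(template)

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice')")
    rows = conn.execute(query, params).fetchall()

Because the malicious value travels as a bound parameter rather than as SQL
text, the query matches no rows instead of matching every row.
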
See the `Common Patterns Seen in Processing Templates`_ section for more
information on how to process template strings. See the `Examples`_ section
for detailed working examples.
Template String Concatenation
-----------------------------
Template strings support explicit concatenation using ``+``. Concatenation is
supported for two ``Template`` instances as well as for a ``Template`` instance
and a ``str``:

.. code-block:: python

    name = "World"
    template1 = t"Hello "
    template2 = t"{name}"
    assert template1 + template2 == t"Hello {name}"
    assert template1 + "!" == t"Hello !"
    assert "Hello " + template2 == t"Hello {name}"

Concatenation of templates is "viral": the concatenation of a ``Template`` and
a ``str`` always results in a ``Template`` instance.
Python's implicit concatenation syntax is also supported. The following code
will work as expected:

.. code-block:: python

    # A template literal followed by a plain string literal
    template = t"Hello " "World"
    assert template == t"Hello World"

    # Two adjacent template literals
    template2 = t"Hello " t"World"
    assert template2 == t"Hello World"

The ``Template`` type implements the ``__add__()`` and ``__radd__()`` methods
roughly as follows:

.. code-block:: python

    class Template:
        def __add__(self, other: object) -> Template:
            if isinstance(other, str):
                return Template(*self.args[:-1], self.args[-1] + other)
            if not isinstance(other, Template):
                return NotImplemented
            return Template(*self.args[:-1], self.args[-1] + other.args[0], *other.args[1:])

        def __radd__(self, other: object) -> Template:
            if not isinstance(other, str):
                return NotImplemented
            return Template(other + self.args[0], *self.args[1:])

Special care is taken to ensure that the interleaving of ``str`` and ``Interpolation``
instances is maintained when concatenating. (See the
`Interleaving of Template.args`_ section for more information.)
Template and Interpolation Equality
-----------------------------------
Two instances of ``Template`` are defined to be equal if their ``args`` attributes
contain the same strings and interpolations in the same order:

.. code-block:: python

    stilton, roquefort = "Stilton", "Roquefort"
    assert t"I love {stilton}" == t"I love {stilton}"
    assert t"I love {stilton}" != t"I love {roquefort}"
    assert t"I " + t"love {stilton}" == t"I love {stilton}"

The implementation of ``Template.__eq__()`` is roughly as follows:

.. code-block:: python

    class Template:
        def __eq__(self, other: object) -> bool:
            if not isinstance(other, Template):
                return NotImplemented
            return self.args == other.args

Two instances of ``Interpolation`` are defined to be equal if their ``value``,
``expr``, ``conv``, and ``format_spec`` attributes are equal:

.. code-block:: python

    class Interpolation:
        def __eq__(self, other: object) -> bool:
            if not isinstance(other, Interpolation):
                return NotImplemented
            return (
                self.value == other.value
                and self.expr == other.expr
                and self.conv == other.conv
                and self.format_spec == other.format_spec
            )

No Support for Ordering
-----------------------
The ``Template`` and ``Interpolation`` types do not support ordering. This is
unlike all other string literal types in Python, which support lexicographic
ordering. Because interpolations can contain arbitrary values, there is no
natural ordering for them. As a result, neither the ``Template`` nor the
``Interpolation`` type implements the standard comparison methods.
Support for the debug specifier (``=``)
---------------------------------------
The debug specifier, ``=``, is supported in template strings and behaves similarly
to how it behaves in f-strings, though due to limitations of the implementation
there is a slight difference.
In particular, ``t'{expr=}'`` is treated as ``t'expr={expr}'``, so the
expression text and the ``=`` are folded into the preceding string part:

.. code-block:: python

    name = "World"
    template = t"Hello {name=}"
    assert template.args[0] == "Hello name="
    assert template.args[1].value == "World"

Raw Template Strings
--------------------
Raw template strings are supported using the ``rt`` (or ``tr``) prefix:

.. code-block:: python

    trade = 'shrubberies'
    t = rt'Did you say "{trade}"?\n'
    assert t.args[0] == r'Did you say "'
    assert t.args[2] == r'"?\n'

In this example, the ``\n`` is treated as two separate characters
(a backslash followed by 'n') rather than a newline character. This is
consistent with Python's raw string behavior.
As with regular template strings, interpolations in raw template strings are
processed normally, allowing for the combination of raw string behavior and
dynamic content.
Interpolation Expression Evaluation
-----------------------------------
Expression evaluation for interpolations is the same as in :pep:`498#expression-evaluation`:
The expressions that are extracted from the string are evaluated in the context
where the template string appeared. This means the expression has full access to its
lexical scope, including local and global variables. Any valid Python expression
can be used, including function and method calls.
Template strings are evaluated eagerly from left to right, just like f-strings. This means that
interpolations are evaluated immediately when the template string literal itself is evaluated, not deferred
or wrapped in lambdas.
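
The evaluation order can be observed with side effects. Since t-string syntax
may not be available to the reader, the sketch below uses an f-string, which
this PEP says is evaluated in the same way; the ``record()`` helper is
hypothetical. With a t-string, the same calls would run at the same moment and
their results would be stored in ``Interpolation.value``:

.. code-block:: python

    calls: list[str] = []


    def record(label: str) -> str:
        """Remember the order in which interpolations are evaluated."""
        calls.append(label)
        return label


    # Both expressions run while the literal itself is evaluated, left to right,
    # before any later code gets a chance to inspect the result.
    text = f"{record('first')} then {record('second')}"
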
Exceptions
----------
Exceptions raised in t-string literals are the same as those raised in f-string
literals.
Interleaving of ``Template.args``
---------------------------------
In the ``Template`` type, the ``args`` attribute is a sequence that always
alternates between strings and ``Interpolation`` instances. Specifically:

- Even-indexed elements (0, 2, 4, ...) are always of type ``str``, representing
  the literal parts of the template.
- Odd-indexed elements (1, 3, 5, ...) are always ``Interpolation`` instances,
  representing the interpolated expressions.

For example, the following assertions hold:

.. code-block:: python

    name = "World"
    template = t"Hello {name}"
    assert len(template.args) == 3
    assert template.args[0] == "Hello "
    assert template.args[1].value == "World"
    assert template.args[2] == ""

These rules imply that the ``args`` attribute will always have an odd length.
As a consequence, empty strings are added to the sequence when the template
begins or ends with an interpolation, or when two interpolations are adjacent:

.. code-block:: python

    a, b = "a", "b"
    template = t"{a}{b}"
    assert len(template.args) == 5
    assert template.args[0] == ""
    assert template.args[1].value == "a"
    assert template.args[2] == ""
    assert template.args[3].value == "b"
    assert template.args[4] == ""

Most template processing code will not care about this detail and will use
either structural pattern matching or ``isinstance()`` checks to distinguish
between the two types of elements in the sequence.
The detail exists because it allows for performance optimizations in template
processing code. For example, a template processor could cache the static parts
of the template and only reprocess the dynamic parts when the template is
evaluated with different values. Access to the static parts can be done with
``template.args[::2]``.
Interleaving is an invariant maintained by the ``Template`` class. Developers can
take advantage of it but they are not required to themselves maintain it.
Specifically, ``Template.__init__()`` can be called with ``str`` and
``Interpolation`` instances in *any* order; the constructor will "interleave" them
as necessary before assigning them to ``args``.
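
A minimal sketch of how such interleaving might work is shown below. The
``interleave()`` helper is hypothetical, ``Interpolation`` is a simplified
stand-in for the proposed type, and merging adjacent strings into a single
static part is an assumption of this sketch rather than something this PEP
spells out:

.. code-block:: python

    from dataclasses import dataclass


    @dataclass(frozen=True)
    class Interpolation:  # simplified stand-in for templatelib.Interpolation
        value: object
        expr: str = ""
        conv: str | None = None
        format_spec: str = ""


    def interleave(*args: str | Interpolation) -> tuple[str | Interpolation, ...]:
        """Return args as str, Interpolation, str, ... with odd total length."""
        result: list[str | Interpolation] = [""]
        for arg in args:
            if isinstance(arg, str):
                # Assumption: adjacent strings are merged into one static part.
                result[-1] += arg
            else:
                # Every interpolation is followed by a (possibly empty) string.
                result.append(arg)
                result.append("")
        return tuple(result)


    a = Interpolation("a", "a")
    b = Interpolation("b", "b")
    adjacent = interleave(a, b)
    merged = interleave("x", "y", a)

Starting from a single empty string and appending an empty string after every
interpolation keeps even indexes as ``str`` and odd indexes as
``Interpolation`` regardless of the order of the inputs.
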
Examples
========
All examples in this section of the PEP have fully tested reference implementations
available in the public `pep750-examples <https://github.com/davepeck/pep750-examples>`_
git repository.
Example: Implementing f-strings with t-strings
----------------------------------------------
It is easy to "implement" f-strings using t-strings. That is, we can
write a function ``f(template: Template) -> str`` that processes a ``Template``
in much the same way as an f-string literal, returning the same result:

.. code-block:: python

    name = "World"
    value = 42
    templated = t"Hello {name!r}, value: {value:.2f}"
    formatted = f"Hello {name!r}, value: {value:.2f}"
    assert f(templated) == formatted

The ``f()`` function supports both conversion specifiers like ``!r`` and format
specifiers like ``:.2f``. The full code is fairly simple:

.. code-block:: python

    from templatelib import Template, Interpolation
    from typing import Literal

    def convert(value: object, conv: Literal["a", "r", "s"] | None) -> object:
        # Apply the !a, !r, or !s conversion, if any
        if conv == "a":
            return ascii(value)
        elif conv == "r":
            return repr(value)
        elif conv == "s":
            return str(value)
        return value

    def f(template: Template) -> str:
        parts = []
        for arg in template.args:
            match arg:
                case str() as s:
                    parts.append(s)
                case Interpolation(value, _, conv, format_spec):
                    # Convert first, then format, as f-strings do
                    value = convert(value, conv)
                    value = format(value, format_spec)
                    parts.append(value)
        return "".join(parts)


.. note:: Example code

    See `fstring.py`__ and `test_fstring.py`__.

    __ https://github.com/davepeck/pep750-examples/blob/main/pep/fstring.py
    __ https://github.com/davepeck/pep750-examples/blob/main/pep/test_fstring.py

Example: Structured Logging
---------------------------
Structured logging allows developers to log data in both a human-readable format
*and* a structured format (like JSON) using only a single logging call. This is
useful for log aggregation systems that process the structured format while
still allowing developers to easily read their logs.
We present two different approaches to implementing structured logging with
template strings.
Approach 1: Custom Log Messages
'''''''''''''''''''''''''''''''
The :ref:`Python Logging Cookbook <python:logging-cookbook>`
has a short section on `how to implement structured logging <https://docs.python.org/3/howto/logging-cookbook.html#implementing-structured-logging>`_.
The logging cookbook suggests creating a new "message" class, ``StructuredMessage``,
that is constructed with a simple text message and a separate dictionary of values:

.. code-block:: python

    message = StructuredMessage("user action", {
        "action": "traded",
        "amount": 42,
        "item": "shrubs"
    })
    logging.info(message)

    # Outputs:
    # user action >>> {"action": "traded", "amount": 42, "item": "shrubs"}

The ``StructuredMessage.__str__()`` method formats both the human-readable
message *and* the values, combining them into a final string. (See the
`logging cookbook <https://docs.python.org/3/howto/logging-cookbook.html#implementing-structured-logging>`_
for its full example.)
We can implement an improved version of ``StructuredMessage`` using template strings:

.. code-block:: python

    import json
    import logging
    from templatelib import Interpolation, Template
    from typing import Mapping

    class TemplateMessage:
        def __init__(self, template: Template) -> None:
            self.template = template

        @property
        def message(self) -> str:
            # Use the f() function from the previous example
            return f(self.template)

        @property
        def values(self) -> Mapping[str, object]:
            # Map each interpolation's source text to its evaluated value
            return {
                arg.expr: arg.value
                for arg in self.template.args
                if isinstance(arg, Interpolation)
            }

        def __str__(self) -> str:
            return f"{self.message} >>> {json.dumps(self.values)}"

    _ = TemplateMessage  # optional, to improve readability

    # The root logger drops INFO records unless it is configured to emit them
    logging.basicConfig(level=logging.INFO, format="%(message)s")

    action, amount, item = "traded", 42, "shrubs"
    logging.info(_(t"User {action}: {amount:.2f} {item}"))

    # Outputs:
    # User traded: 42.00 shrubs >>> {"action": "traded", "amount": 42, "item": "shrubs"}

Template strings give us a more elegant way to define the custom message
class. With template strings it is no longer necessary for developers to make
sure that their format string and values dictionary are kept in sync; a single
template string literal is all that is needed. The ``TemplateMessage``
implementation can automatically extract structured keys and values from
the ``Interpolation.expr`` and ``Interpolation.value`` attributes, respectively.
Approach 2: Custom Formatters
'''''''''''''''''''''''''''''
Custom messages are a reasonable approach to structured logging but can be a
little awkward. To use them, developers must wrap every log message they write
in a custom class. This can be easy to forget.
An alternative approach is to define custom ``logging.Formatter`` classes. This
approach is more flexible and allows for more control over the final output. In
particular, it's possible to take a single template string and output it in
multiple formats (human-readable and JSON) to separate log streams.
We define two simple formatters, a ``MessageFormatter`` for human-readable output
and a ``ValuesFormatter`` for JSON output:

.. code-block:: python

    import json
    from logging import Formatter, LogRecord
    from templatelib import Interpolation, Template
    from typing import Any, Mapping

    class MessageFormatter(Formatter):
        def message(self, template: Template) -> str:
            # Use the f() function from the previous example
            return f(template)

        def format(self, record: LogRecord) -> str:
            msg = record.msg
            if not isinstance(msg, Template):
                return super().format(record)
            return self.message(msg)

    class ValuesFormatter(Formatter):
        def values(self, template: Template) -> Mapping[str, Any]:
            return {
                arg.expr: arg.value
                for arg in template.args
                if isinstance(arg, Interpolation)
            }

        def format(self, record: LogRecord) -> str:
            msg = record.msg
            if not isinstance(msg, Template):
                return super().format(record)
            return json.dumps(self.values(msg))

We can then use these formatters when configuring our logger:

.. code-block:: python

    import logging
    import sys

    logger = logging.getLogger(__name__)
    # Without this, the logger inherits the root level (WARNING) and drops INFO
    logger.setLevel(logging.INFO)

    message_handler = logging.StreamHandler(sys.stdout)
    message_handler.setFormatter(MessageFormatter())
    logger.addHandler(message_handler)

    values_handler = logging.StreamHandler(sys.stderr)
    values_handler.setFormatter(ValuesFormatter())
    logger.addHandler(values_handler)

    action, amount, item = "traded", 42, "shrubs"
    logger.info(t"User {action}: {amount:.2f} {item}")

    # Outputs to sys.stdout:
    # User traded: 42.00 shrubs

    # At the same time, outputs to sys.stderr:
    # {"action": "traded", "amount": 42, "item": "shrubs"}

This approach has a couple of advantages over the custom message approach to
structured logging:

- Developers can log a t-string directly without wrapping it in a custom class.
- Human-readable and structured output can be sent to separate log streams. This
  is useful for log aggregation systems that process structured data independently
  from human-readable data.

.. note:: Example code

    See `logging.py`__ and `test_logging.py`__.

    __ https://github.com/davepeck/pep750-examples/blob/main/pep/logging.py
    __ https://github.com/davepeck/pep750-examples/blob/main/pep/test_logging.py

Example: HTML Templating
-------------------------
This PEP contains several short HTML templating examples. It turns out that the
"hypothetical" ``html()`` function mentioned in the `Motivation`_ section
(and a few other places in this PEP) exists and is available in the
`pep750-examples repository <https://github.com/davepeck/pep750-examples/>`_.
If you're thinking about parsing a complex grammar with template strings, we
hope you'll find it useful.
Backwards Compatibility
=======================
As with f-strings when they were introduced, template strings are new syntax:
code that uses them is a syntax error on earlier versions of Python.
Security Implications
=====================
The security implications of working with template strings, with respect to
interpolations, are as follows:

1. Scope lookup is the same as f-strings (lexical scope). This model has been
   shown to work well in practice.
2. Code that processes ``Template`` instances can ensure that any interpolations
   are processed in a safe fashion, including respecting the context in which
   they appear.

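
As a minimal sketch of safe processing, the function below escapes every
interpolated value before combining it with the static text, which the
template author controls. The ``safe_html()`` name is hypothetical, the
``Template`` and ``Interpolation`` classes are simplified stand-ins for the
proposed types, and escaping with :func:`python:html.escape` is only
appropriate for text content, not for every HTML context:

.. code-block:: python

    from dataclasses import dataclass
    from html import escape


    @dataclass(frozen=True)
    class Interpolation:  # simplified stand-in for templatelib.Interpolation
        value: object
        expr: str = ""
        conv: str | None = None
        format_spec: str = ""


    @dataclass(frozen=True)
    class Template:  # simplified stand-in for templatelib.Template
        args: tuple


    def safe_html(template: Template) -> str:
        """Hypothetical processor: trust static parts, escape dynamic ones."""
        parts = []
        for arg in template.args:
            if isinstance(arg, Interpolation):
                # quote=False escapes only &, < and >, which suits text content.
                parts.append(escape(str(arg.value), quote=False))
            else:
                parts.append(arg)
        return "".join(parts)


    # Stand-in for: t"<p>{evil}</p>"
    evil = "<script>alert('evil')</script>"
    template = Template(("<p>", Interpolation(evil, "evil"), "</p>"))
    rendered = safe_html(template)
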
How To Teach This
=================
Template strings have several audiences:
- Developers using template strings and processing functions
- Authors of template processing code
- Framework authors who build interesting machinery with template strings
We hope that teaching developers will be straightforward. At a glance,
template strings look just like f-strings. Their syntax is familiar and the
scoping rules remain the same.
The first thing developers must learn is that template string literals don't
evaluate to strings; instead, they evaluate to a new type, ``Template``. This
is a simple type intended to be used by template processing code. It's not until
developers call a processing function that they get the result they want:
typically, a string, although processing code can of course return any arbitrary
type.
Because developers will learn that t-strings are nearly always used in tandem
with processing functions, they don't necessarily need to understand the details
of the ``Template`` type. As with descriptors and decorators, we expect many more
developers will use t-strings than write t-string processing functions.
Over time, a small number of more advanced developers *will* wish to author their
own template processing code. Writing processing code often requires thinking
in terms of formal grammars. Developers will need to learn how to parse the
``args`` attribute of a ``Template`` instance and how to process interpolations
in a context-sensitive fashion. More sophisticated grammars will likely require
parsing to intermediate representations like an AST. Great template processing
code will handle format specifiers and conversions when appropriate. Writing
production-grade template processing code -- for instance, to support HTML
templates -- can be a large undertaking.
We expect that template strings will provide framework authors with a powerful
new tool in their toolbox. While the functionality of template strings overlaps
with existing tools like template engines, t-strings move that logic into
the language itself. Bringing the full power and generality of Python to bear on
string processing tasks opens new possibilities for framework authors.
Common Patterns Seen in Processing Templates
============================================
Structural Pattern Matching
---------------------------
Iterating over ``Template.args`` with structural pattern matching is the expected
best practice for many template function implementations:

.. code-block:: python

    from templatelib import Template, Interpolation
    from typing import Any

    def process(template: Template) -> Any:
        for arg in template.args:
            match arg:
                case str() as s:
                    ... # handle each string part
                case Interpolation() as interpolation:
                    ... # handle each interpolation

Processing code may also commonly sub-match on attributes of the ``Interpolation`` type:

.. code-block:: python

    match arg:
        case Interpolation(int()):
            ... # handle interpolations with integer values
        case Interpolation(value=str() as s):
            ... # handle interpolations with string values
        # etc.

Memoizing
---------
Template functions can efficiently process both static and dynamic parts of templates.
The structure of ``Template`` objects allows for effective memoization:

.. code-block:: python

    source = template.args[::2]  # Static string parts
    values = [i.value for i in template.args[1::2]]  # Dynamic interpolated values

This separation enables caching of processed static parts, while dynamic parts can be
inserted as needed. Authors of template processing code can use the static
``source`` as cache keys, leading to significant performance improvements when
similar templates are used repeatedly.
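
The sketch below shows one way this could look, using
:func:`python:functools.lru_cache` keyed on the static parts. The
``compile_static()`` and ``render()`` functions are hypothetical, the
lowercasing merely stands in for an expensive parsing step, and the
``Template`` and ``Interpolation`` classes are simplified stand-ins whose
``args`` is a tuple so that slices of it are hashable:

.. code-block:: python

    from dataclasses import dataclass
    from functools import lru_cache


    @dataclass(frozen=True)
    class Interpolation:  # simplified stand-in for templatelib.Interpolation
        value: object
        expr: str = ""
        conv: str | None = None
        format_spec: str = ""


    @dataclass(frozen=True)
    class Template:  # simplified stand-in for templatelib.Template
        args: tuple


    @lru_cache(maxsize=None)
    def compile_static(source: tuple[str, ...]) -> tuple[str, ...]:
        """Stand-in for costly work that depends only on the static parts."""
        return tuple(part.lower() for part in source)


    def render(template: Template) -> str:
        source = template.args[::2]                      # static string parts
        values = [i.value for i in template.args[1::2]]  # dynamic values
        compiled = compile_static(source)                # cached after first use
        out = [compiled[0]]
        for value, static in zip(values, compiled[1:]):
            out.append(str(value))
            out.append(static)
        return "".join(out)


    def greet(name: str) -> Template:
        # Stand-in for: t"HELLO {name}!"
        return Template(("HELLO ", Interpolation(name, "name"), "!"))


    first = render(greet("Ada"))
    second = render(greet("Grace"))
    info = compile_static.cache_info()

The second call reuses the cached result because both templates share the same
static parts and differ only in their interpolated values.
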
Parsing to Intermediate Representations
---------------------------------------
Code that processes templates can parse the template string into intermediate
representations, like an AST. We expect that many template processing libraries
will use this approach.
For instance, rather than returning a ``str``, our theoretical ``html()`` function
(see the `Motivation`_ section) could return an HTML ``Element`` defined in the
same package:

.. code-block:: python

    from dataclasses import dataclass
    from templatelib import Template
    from typing import Mapping, Sequence

    @dataclass(frozen=True)
    class Element:
        tag: str
        attributes: Mapping[str, str | bool]
        children: Sequence[str | Element]

        def __str__(self) -> str:
            ...

    def html(template: Template) -> Element:
        ...

Calling ``str(element)`` would then render the HTML but, in the meantime, the
``Element`` could be manipulated in a variety of ways.
Context-sensitive Processing of Interpolations
----------------------------------------------
Continuing with our hypothetical ``html()`` function, it could be made
context-sensitive. Interpolations could be processed differently depending
on where they appear in the template.
For example, our ``html()`` function could support multiple kinds of
interpolations:

.. code-block:: python

    attributes = {"id": "main"}
    attribute_value = "shrubbery"
    content = "hello"
    template = t"<div {attributes} data-value={attribute_value}>{content}</div>"
    element = html(template)
    assert str(element) == '<div id="main" data-value="shrubbery">hello</div>'

Because the ``{attributes}`` interpolation occurs in the context of an HTML tag,
and because there is no corresponding attribute name, it is treated as a dictionary
of attributes. The ``{attribute_value}`` interpolation is treated as a simple
string value and is quoted before inclusion in the final string. The
``{content}`` interpolation is treated as potentially unsafe content and is
escaped before inclusion in the final string.
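
A deliberately simplified sketch of this idea follows. It is not the reference
``html()`` implementation: it returns a ``str`` rather than an ``Element``,
decides whether it is inside a tag by looking only at the most recent angle
bracket in the static text, and uses simplified stand-ins for ``Template`` and
``Interpolation``. A production-grade version would need a real HTML parser:

.. code-block:: python

    from dataclasses import dataclass
    from html import escape


    @dataclass(frozen=True)
    class Interpolation:  # simplified stand-in for templatelib.Interpolation
        value: object
        expr: str = ""
        conv: str | None = None
        format_spec: str = ""


    @dataclass(frozen=True)
    class Template:  # simplified stand-in for templatelib.Template
        args: tuple


    def html(template: Template) -> str:
        """Hypothetical context-sensitive renderer (greatly simplified)."""
        parts = []
        in_tag = False
        for arg in template.args:
            if isinstance(arg, str):
                parts.append(arg)
                # We are inside a tag if the last bracket seen so far is "<".
                lt, gt = arg.rfind("<"), arg.rfind(">")
                if lt != -1 or gt != -1:
                    in_tag = lt > gt
            elif in_tag and isinstance(arg.value, dict):
                # A dictionary inside a tag becomes a run of attributes.
                parts.append(" ".join(
                    f'{escape(k)}="{escape(str(v))}"' for k, v in arg.value.items()
                ))
            elif in_tag:
                # Any other value inside a tag becomes a quoted attribute value.
                parts.append(f'"{escape(str(arg.value))}"')
            else:
                # Outside a tag, treat the value as untrusted text content.
                parts.append(escape(str(arg.value), quote=False))
        return "".join(parts)


    # Stand-in for:
    # t"<div {attributes} data-value={attribute_value}>{content}</div>"
    template = Template((
        "<div ", Interpolation({"id": "main"}, "attributes"),
        " data-value=", Interpolation("shrubbery", "attribute_value"),
        ">", Interpolation("hello", "content"), "</div>",
    ))
    rendered = html(template)
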
Nested Template Strings
-----------------------
Going a step further with our ``html()`` function, we could support nested
template strings. This would allow for more complex HTML structures to be
built up from simpler templates:

.. code-block:: python

    name = "World"
    content = html(t"<p>Hello {name}</p>")
    template = t"<div>{content}</div>"
    element = html(template)
    assert str(element) == '<div><p>Hello World</p></div>'

Because the ``{content}`` interpolation is an ``Element`` instance, it does
not need to be escaped before inclusion in the final string.
One could imagine a nice simplification: if the ``html()`` function is passed
a ``Template`` instance, it could automatically convert it to an ``Element``
by recursively calling itself on the nested template.
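
The recursive simplification could be sketched as follows. The ``render()``
function is hypothetical and returns a ``str`` instead of an ``Element`` to
keep the example short, and ``Template`` and ``Interpolation`` are simplified
stand-ins for the proposed types:

.. code-block:: python

    from dataclasses import dataclass
    from html import escape


    @dataclass(frozen=True)
    class Interpolation:  # simplified stand-in for templatelib.Interpolation
        value: object
        expr: str = ""
        conv: str | None = None
        format_spec: str = ""


    @dataclass(frozen=True)
    class Template:  # simplified stand-in for templatelib.Template
        args: tuple


    def render(template: Template) -> str:
        """Hypothetical renderer that recurses into nested templates."""
        parts = []
        for arg in template.args:
            if isinstance(arg, str):
                parts.append(arg)
            elif isinstance(arg.value, Template):
                # Nested template: render it recursively; its own
                # interpolations are escaped at that level, so the result is
                # not escaped a second time here.
                parts.append(render(arg.value))
            else:
                parts.append(escape(str(arg.value), quote=False))
        return "".join(parts)


    # Stand-ins for: inner = t"<p>Hello {name}</p>"; outer = t"<div>{inner}</div>"
    name = "<World>"
    inner = Template(("<p>Hello ", Interpolation(name, "name"), "</p>"))
    outer = Template(("<div>", Interpolation(inner, "inner"), "</div>"))
    result = render(outer)
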
We expect that nesting and composition of templates will be a common pattern
in template processing code and, where appropriate, used in preference to
simple string concatenation.
Approaches to Lazy Evaluation
-----------------------------
Like f-strings, interpolations in t-string literals are eagerly evaluated. However,
there are cases where lazy evaluation may be desirable.
If a single interpolation is expensive to evaluate, it can be explicitly wrapped
in a ``lambda`` in the template string literal:

.. code-block:: python

    name = "World"
    template = t"Hello {(lambda: name)}"
    assert callable(template.args[1].value)
    assert template.args[1].value() == "World"

This assumes, of course, that template processing code anticipates and handles
callable interpolation values. (One could imagine also supporting iterators,
awaitables, etc.) This is not a requirement of the PEP, but it is a common
pattern in template processing code.
In general, we hope that the community will develop best practices for lazy
evaluation of interpolations in template strings and that, when it makes sense,
common libraries will provide support for callable or awaitable values in
their template processing code.
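To make the pattern concrete, here is a hedged sketch of processing code that anticipates callable interpolation values. The ``Template`` and ``Interpolation`` classes are hypothetical stand-ins built by hand, and ``format_lazy()`` is an invented name; nothing here is specified by this PEP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interpolation:
    """Hypothetical stand-in for the PEP's Interpolation type."""
    value: object
    format_spec: str = ""


@dataclass(frozen=True)
class Template:
    """Hypothetical stand-in: strings and interpolations interleaved in args."""
    args: tuple


def format_lazy(template: Template) -> str:
    """Render a template, calling any callable value at render time."""
    parts = []
    for arg in template.args:
        if isinstance(arg, str):
            parts.append(arg)
            continue
        value = arg.value
        if callable(value):
            # Deferred work happens here, not when the template was created.
            value = value()
        parts.append(format(value, arg.format_spec))
    return "".join(parts)


calls = []


def expensive() -> str:
    calls.append("called")
    return "World"


# Stand-in for t"Hello {expensive}"; building it does not call expensive().
template = Template(("Hello ", Interpolation(expensive), ""))
calls_before_render = list(calls)
greeting = format_lazy(template)
```

One design caveat: this sketch calls *every* callable value, including classes and other objects that merely happen to be callable. Real processing code may want a narrower convention.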
Approaches to Asynchronous Evaluation
-------------------------------------
Closely related to lazy evaluation is asynchronous evaluation.
As with f-strings, the ``await`` keyword is allowed in interpolations:
.. code-block:: python

    import asyncio

    async def example():
        async def get_name() -> str:
            await asyncio.sleep(1)
            return "Sleepy"

        template = t"Hello {await get_name()}"
        # Use the f() function from the f-string example, above
        assert f(template) == "Hello Sleepy"

More sophisticated template processing code can take advantage of this to
perform asynchronous operations in interpolations. For example, a "smart"
processing function could anticipate that an interpolation's value is an
awaitable, or a callable that returns one, and await it before processing the
template string:
.. code-block:: python

    import asyncio

    async def example():
        async def get_name() -> str:
            await asyncio.sleep(1)
            return "Sleepy"

        template = t"Hello {get_name}"
        assert await aformat(template) == "Hello Sleepy"

This assumes that the template processing code in ``aformat()`` is asynchronous
and is able to ``await`` an interpolation's value.
.. note:: Example code

   See `aformat.py`__ and `test_aformat.py`__.

   __ https://github.com/davepeck/pep750-examples/blob/main/pep/aformat.py
   __ https://github.com/davepeck/pep750-examples/blob/main/pep/test_aformat.py

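For readers who want the gist without following the links, an asynchronous processing function might look roughly like the sketch below. This is *not* the linked ``aformat.py``: ``aformat_sketch()`` is an invented name, and ``Template`` and ``Interpolation`` are hypothetical hand-built stand-ins so that the code runs without t-string support.

```python
import asyncio
import inspect
from dataclasses import dataclass


@dataclass(frozen=True)
class Interpolation:
    """Hypothetical stand-in for the PEP's Interpolation type."""
    value: object
    format_spec: str = ""


@dataclass(frozen=True)
class Template:
    """Hypothetical stand-in: strings and interpolations interleaved in args."""
    args: tuple


async def aformat_sketch(template: Template) -> str:
    """Render a template, awaiting values that turn out to be awaitable."""
    parts = []
    for arg in template.args:
        if isinstance(arg, str):
            parts.append(arg)
            continue
        value = arg.value
        if callable(value):
            # e.g. a coroutine function such as get_name, passed uncalled
            value = value()
        if inspect.isawaitable(value):
            value = await value
        parts.append(format(value, arg.format_spec))
    return "".join(parts)


async def get_name() -> str:
    await asyncio.sleep(0)  # placeholder for real asynchronous work
    return "Sleepy"


# Stand-in for t"Hello {get_name}"
template = Template(("Hello ", Interpolation(get_name), ""))
result = asyncio.run(aformat_sketch(template))
```

This version awaits interpolations one at a time, in order. Processing code that wants concurrency could instead gather the awaitables before joining the parts.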
Approaches to Template Reuse
----------------------------
If developers wish to reuse template strings multiple times with different
values, they can write a function to return a ``Template`` instance:
.. code-block:: python

    def reusable(name: str, question: str) -> Template:
        return t"Hello {name}, {question}?"

    template = reusable("friend", "how are you")
    template = reusable("King Arthur", "what is your quest")

This is, of course, no different from how f-strings can be reused.
Reference Implementation
========================
At the time of this PEP's announcement, a fully working implementation is
`available <https://github.com/lysnikolaou/cpython/tree/tag-strings-rebased>`_.
There is also a public repository of `examples and tests <https://github.com/davepeck/pep750-examples>`_
built around the reference implementation. If you're interested in playing with
template strings, this repository is a great place to start.
Rejected Ideas
==============
This PEP has been through several significant revisions. In addition, quite a few interesting
ideas were considered both in revisions of :pep:`501` and in the `Discourse discussion <https://discuss.python.org/t/pep-750-tag-strings-for-writing-domain-specific-languages/60408/196>`_.
We attempt to document the most significant ideas that were considered and rejected.
Arbitrary String Literal Prefixes
---------------------------------
Inspired by `JavaScript tagged template literals <https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals#tagged_templates>`_,
an earlier version of this PEP allowed for arbitrary "tag" prefixes in front
of literal strings:
.. code-block:: python

    my_tag'Hello {name}'

The prefix was a special callable called a "tag function". Tag functions
received the parts of the template string in an argument list. They could then
process the string and return an arbitrary value:
.. code-block:: python

    def my_tag(*args: str | Interpolation) -> Any:
        ...

This approach was rejected for several reasons:
- It was deemed too complex to build in full generality. JavaScript allows for
arbitrary expressions to precede a template string, which is a significant
challenge to implement in Python.
- It precluded future introduction of new string prefixes.
- It seemed to needlessly pollute the namespace.
Use of a single ``t`` prefix was chosen as a simpler, more Pythonic approach and
more in keeping with template strings' role as a generalization of f-strings.
Delayed Evaluation of Interpolations
------------------------------------
An early version of this PEP proposed that interpolations should be lazily
evaluated. All interpolations were "wrapped" in implicit lambdas. Instead of
having an eagerly evaluated ``value`` attribute, interpolations had a
``getvalue()`` method that would resolve the value of the interpolation:
.. code-block:: python

    class Interpolation:
        ...
        _value: Callable[[], object]

        def getvalue(self) -> object:
            return self._value()

This was rejected for several reasons:
- The overwhelming majority of use cases for template strings naturally call
for immediate evaluation.
- Delayed evaluation would be a significant departure from the behavior of
f-strings.
- Implicit lambda wrapping leads to difficulties with type hints and
static analysis.
Most importantly, there are viable (if imperfect) alternatives to implicit
lambda wrapping when lazy evaluation is desired. See the section on
`Approaches to Lazy Evaluation`_, above, for more information.
Making ``Template`` and ``Interpolation`` Into Protocols
--------------------------------------------------------
An early version of this PEP proposed that the ``Template`` and ``Interpolation``
types be runtime checkable protocols rather than concrete types.
In the end, we felt that using concrete types was more straightforward.
An Additional ``Decoded`` Type
------------------------------
An early version of this PEP proposed an additional type, ``Decoded``, to represent
the "static string" parts of a template string. This type derived from ``str`` and
had a single extra ``raw`` attribute that provided the original text of the string.
We rejected this in favor of the simpler approach of using plain ``str`` and
allowing combination of ``r`` and ``t`` prefixes.
Other Homes for ``Template`` and ``Interpolation``
--------------------------------------------------
Previous versions of this PEP proposed that the ``Template`` and ``Interpolation``
types be placed in the ``types`` module. This was rejected in favor of creating
a new top-level standard library module, ``templatelib``. This was done to avoid
polluting the ``types`` module with seemingly unrelated types.
Enable Full Reconstruction of Original Template Literal
-------------------------------------------------------
Earlier versions of this PEP attempted to make it possible to fully reconstruct
the text of the original template string from a ``Template`` instance. This was
rejected as being overly complex.
There are several limitations with respect to round-tripping to the original
source text:
- ``Interpolation.format_spec`` defaults to ``""`` if not provided. It is therefore
impossible to distinguish ``t"{expr}"`` from ``t"{expr:}"``.
- The debug specifier, ``=``, is treated as a special case. It is therefore not
possible to distinguish ``t"{expr=}"`` from ``t"expr={expr}"``.
- Finally, format specifiers in f-strings allow arbitrary nesting. In this PEP
and in the reference implementation, the specifier is eagerly evaluated
to set the ``format_spec`` in the ``Interpolation``, thereby losing
the original expressions. For example:
.. code-block:: python

    value = 42
    precision = 2
    template = t"Value: {value:.{precision}f}"
    assert template.args[1].format_spec == ".2f"

We do not anticipate that these limitations will be a significant issue in practice.
Developers who need to obtain the original template string literal can always
use ``inspect.getsource()`` or similar tools.
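The third limitation can be illustrated with an ordinary f-string and the built-in ``format()``, both of which run on current interpreters. This is only an analogy for how the nested specifier is collapsed; it does not exercise the reference implementation.

```python
value = 42
precision = 2

# The nested replacement field is evaluated first, yielding a plain string.
# After this step, nothing records that the "2" came from `precision`.
format_spec = f".{precision}f"

# Formatting with the collapsed specifier matches the nested f-string form.
rendered_from_spec = format(value, format_spec)
rendered_from_fstring = f"{value:.{precision}f}"
```

Because only the resulting ``".2f"`` string is stored, a template written with a literal ``.2f`` specifier would be indistinguishable from one written with ``.{precision}f``.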
Disallowing String Concatenation
--------------------------------
Earlier versions of this PEP proposed that template strings should not support
concatenation. This was rejected in favor of allowing concatenation.
There are reasonable arguments in favor of rejecting one or all forms of
concatenation: namely, that it cuts off a class of potential bugs, particularly
when one takes the view that template strings will often contain complex grammars
for which concatenation doesn't always have the same meaning (or any meaning).
Moreover, the earliest versions of this PEP proposed a syntax closer to
JavaScript's tagged template literals, where an arbitrary callable could be used
as a prefix to a string literal. There was no guarantee that the callable would
return a type that supported concatenation.
In the end, we decided that the surprise to developers of a new string type
*not* supporting concatenation was likely to be greater than the theoretical
harm caused by supporting it. (Developers concatenate f-strings all the time,
after all, and while we are sure there are cases where this introduces bugs,
it's not clear that those bugs outweigh the benefits of supporting concatenation.)
While concatenation is supported, we expect that code that uses template strings
will more commonly build up larger templates through nesting and composition
rather than concatenation.
Arbitrary Conversion Values
---------------------------
Python allows only ``r``, ``s``, or ``a`` as conversion values. Using any
other value results in a ``SyntaxError``.
In theory, template functions could choose to handle other conversion types. But this
PEP adheres closely to :pep:`701`. Any changes to allowed values should be in a
separate PEP.
Removing ``conv`` From ``Interpolation``
----------------------------------------
During the authoring of this PEP, we considered removing the ``conv`` attribute
from ``Interpolation`` and specifying that the conversion should be performed
eagerly, before ``Interpolation.value`` is set.
The motivation was to simplify the work of writing template processing code. The
``conv`` attribute is of limited extensibility (it is typed as
``Literal["r", "s", "a"] | None``). It is not clear that it adds significant
value or flexibility to template strings that couldn't better be achieved with
custom format specifiers. Unlike with format specifiers, there is no
equivalent to Python's :func:`python:format` built-in. (Instead, we include a
sample implementation of ``convert()`` in the `Examples`_ section.)
Ultimately we decided to keep the ``conv`` attribute in the ``Interpolation`` type
to maintain compatibility with f-strings and to allow for future extensibility.
Alternate Interpolation Symbols
-------------------------------
In the early stages of this PEP, we considered allowing alternate symbols for
interpolations in template strings. For example, we considered allowing
``${name}`` as an alternative to ``{name}`` with the idea that it might be useful
for i18n or other purposes. See the
`Discourse thread <https://discuss.python.org/t/pep-750-tag-strings-for-writing-domain-specific-languages/60408/122>`_
for more information.
This was rejected in favor of keeping t-string syntax as close to f-string syntax
as possible.
A Lazy Conversion Specifier
---------------------------
We considered adding a new conversion specifier, ``!()``, that would explicitly
wrap the interpolation expression in a lambda.
This was rejected in favor of the simpler approach of using explicit lambdas
when lazy evaluation is desired.
Alternate Layouts for ``Template.args``
---------------------------------------
During the development of this PEP, we considered several alternate layouts for
the ``args`` attribute of the ``Template`` type. This included:
- Instead of ``args``, ``Template`` contains a ``strings`` attribute of type
  ``Sequence[str]`` and an ``interpolations`` attribute of type
  ``Sequence[Interpolation]``. There are zero or more interpolations and
  there is always one more string than there are interpolations. Utility code
  could build an interleaved sequence of strings and interpolations from these
  separate attributes. This was rejected as being overly complex.

- ``args`` is typed as a ``Sequence[tuple[str, Interpolation | None]]``. Each
  static string is paired with its neighboring interpolation. The final
  string part has no corresponding interpolation. This was rejected as being
  overly complex.

- ``args`` remains a ``Sequence[str | Interpolation]`` but does not support
  interleaving. As a result, empty strings are not added to the sequence. It is
  no longer possible to obtain static strings with ``args[::2]``; instead,
  instance checks or structural pattern matching must be used to distinguish
  between strings and interpolations. We believe this approach is easier to
  explain and, at first glance, more intuitive. However, it was rejected as
  offering less future opportunity for performance optimization. We also believe
  that ``args[::2]`` may prove to be a useful shortcut in template processing
  code.

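To make the trade-offs above more tangible, the sketch below builds the interleaved ``args`` layout from the separate ``strings`` and ``interpolations`` sequences described in the first alternative, then recovers each half by slicing. ``interleave()`` is a hypothetical helper and ``Interpolation`` is a hand-built stand-in; neither is part of this PEP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interpolation:
    """Hypothetical stand-in for the PEP's Interpolation type."""
    value: object


def interleave(strings: tuple, interpolations: tuple) -> tuple:
    """Merge N+1 strings and N interpolations into one alternating tuple."""
    if len(strings) != len(interpolations) + 1:
        raise ValueError("expected exactly one more string than interpolations")
    args = []
    for string, interpolation in zip(strings, interpolations):
        args.append(string)
        args.append(interpolation)
    args.append(strings[-1])  # the final string has no paired interpolation
    return tuple(args)


# Stand-in for t"{greeting}, {name}!". Note the leading empty string, which
# keeps strings at even indexes and interpolations at odd indexes.
strings = ("", ", ", "!")
interpolations = (Interpolation("Hello"), Interpolation("World"))
args = interleave(strings, interpolations)

static_parts = args[::2]     # every string, including empty ones
dynamic_parts = args[1::2]   # every interpolation
```

The slicing shortcuts work only because empty strings are retained. In the third alternative, where they are dropped, the same separation would require an ``isinstance()`` check on each element.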
Mechanism to Describe the "Kind" of Template
--------------------------------------------
If t-strings prove popular, it may be useful to have a way to describe the
"kind" of content found in a template string: "sql", "html", "css", etc.
This could enable powerful new features in tools such as linters, formatters,
type checkers, and IDEs. (Imagine, for example, ``black`` formatting HTML in
t-strings, or ``mypy`` checking whether a given attribute is valid for an HTML
tag.) While exciting, this PEP does not propose any specific mechanism. It is
our hope that, over time, the community will develop conventions for this purpose.
Acknowledgements
================
Thanks to Ryan Morshead for contributions during development of the ideas leading
to template strings. Special mention also to Dropbox's
`pyxl <https://github.com/dropbox/pyxl>`_ for tackling similar ideas years ago.
Finally, thanks to Joachim Viide for his pioneering work on the `tagged library
<https://github.com/jviide/tagged>`_. Tagged was not just the precursor to
template strings, but the place where the whole effort started via a GitHub issue
comment!
Copyright
=========
This document is placed in the public domain or under the CC0-1.0-Universal
license, whichever is more permissive.