diff --git a/peps/pep-0750.rst b/peps/pep-0750.rst index fc7d84675..3291da787 100644 --- a/peps/pep-0750.rst +++ b/peps/pep-0750.rst @@ -1,51 +1,46 @@ PEP: 750 -Title: Tag Strings For Writing Domain-Specific Languages -Author: Jim Baker , Guido van Rossum , Paul Everitt -Sponsor: Lysandros Nikolaou +Title: Template Strings +Author: Jim Baker , + Guido van Rossum , + Paul Everitt , + Koudai Aono , + Lysandros Nikolaou , + Dave Peck Discussions-To: https://discuss.python.org/t/pep-750-tag-strings-for-writing-domain-specific-languages/60408 Status: Draft Type: Standards Track Created: 08-Jul-2024 Python-Version: 3.14 +Post-History: `09-Aug-2024 `__, + `17-Oct-2024 `__, + `21-Oct-2024 `__ + Abstract ======== -This PEP introduces tag strings for custom, repeatable string processing. Tag strings -are an extension to f-strings, with a custom function -- the "tag" -- in place of the -``f`` prefix. This function can then provide rich features such as safety checks, lazy -evaluation, domain-specific languages (DSLs) for web templating, and more. +This PEP introduces template strings for custom string processing. -Tag strings are similar to `JavaScript tagged template literals `_ -and related ideas in other languages. The following tag string usage shows how similar it is to an ``f`` string, albeit -with the ability to process the literal string and embedded values: +Template strings are a generalization of f-strings, using a ``t`` in place of +the ``f`` prefix. Instead of evaluating to ``str``, t-strings evaluate to a new +type, ``Template``: .. code-block:: python - name = "World" - greeting = greet"hello {name}" - assert greeting == "Hello WORLD!" + template: Template = t"Hello {name}" +Templates provide developers with access to the string and its interpolated +values *before* they are combined. This brings native flexible string +processing to the Python language and enables safety checks, web templating, +domain-specific languages, and more. 
-Tag functions accept prepared arguments and return a string: - -.. code-block:: python - - def greet(*args): - """Tag function to return a greeting with an upper-case recipient.""" - salutation, recipient, *_ = args - getvalue, *_ = recipient - return f"{salutation.title().strip()} {getvalue().upper()}!" - -Below you can find richer examples. As a note, an implementation based on CPython 3.14 -exists, as discussed in this document. Relationship With Other PEPs ============================ Python introduced f-strings in Python 3.6 with :pep:`498`. The grammar was then formalized in :pep:`701` which also lifted some restrictions. This PEP -is based off of PEP 701. +is based on PEP 701. At nearly the same time PEP 498 arrived, :pep:`501` was written to provide "i-strings" -- that is, "interpolation template strings". The PEP was @@ -53,753 +48,995 @@ deferred pending further experience with f-strings. Work on this PEP was resumed by a different author in March 2023, introducing "t-strings" as template literal strings, and built atop PEP 701. -The authors of this PEP consider tag strings as a generalization of the -updated work in PEP 501. +The authors of this PEP consider it to be a generalization and simplification +of the updated work in PEP 501. (That PEP has also recently been updated to +reflect the new ideas in this PEP.) + Motivation ========== -Python f-strings became very popular, very fast. The syntax was simple, convenient, and -interpolated expressions had access to regular scoping rules. However, f-strings have -two main limitations - expressions are eagerly evaluated, and interpolated values -cannot be intercepted. The former means that f-strings cannot be re-used like templates, -and the latter means that how values are interpolated cannot be customized. +Python f-strings are easy to use and very popular. Over time, however, developers +have encountered limitations that make them +`unsuitable for certain use cases `__. 
+In particular, f-strings provide no way to intercept and transform interpolated +values before they are combined into a final string. -Templating in Python is currently achieved using packages like Jinja2 which bring their -own templating languages for generating dynamic content. In addition to being one more -thing to learn, these languages are not nearly as expressive as Python itself. This -means that business logic, which cannot be expressed in the templating language, must be -written in Python instead, spreading the logic across different languages and files. +As a result, incautious use of f-strings can lead to security vulnerabilities. +For example, a user executing a SQL query with :mod:`python:sqlite3` +may be tempted to use an f-string to embed values into their SQL expression, +which could lead to a `SQL injection attack `__. +Or, a developer building HTML may include unescaped user input in the string, +leading to a `cross-site scripting (XSS) `__ +vulnerability. -Likewise, the inability to intercept interpolated values means that they cannot be -sanitized or otherwise transformed before being integrated into the final string. Here, -the convenience of f-strings could be considered a liability. For example, a user -executing a query with `sqlite3 `__ -may be tempted to use an f-string to embed values into their SQL expression instead of -using the ``?`` placeholder and passing the values as a tuple to avoid an -`SQL injection attack `__. +More broadly, the inability to transform interpolated values before they are +combined into a final string limits the utility of f-strings in more complex +string processing tasks. -Tag strings address both these problems by extending the f-string syntax to provide -developers access to the string and its interpolated values before they are combined. In -doing so, tag strings may be interpreted in many different ways, opening up the -possibility for DSLs and other custom string processing. 
+Template strings address these problems by providing +developers with access to the string and its interpolated values. -Proposal -======== - -This PEP proposes customizable prefixes for f-strings. These f-strings then -become a "tag string": an f-string with a "tag function." The tag function is -a callable which is given a sequence of arguments for the parsed tokens in -the string. - -Here's a very simple example. Imagine we want a certain kind of string with -some custom business policies: uppercase the value and add an exclamation point. - -Let's start with a tag string which simply returns a static greeting: +For example, imagine we want to generate some HTML. Using template strings, +we can define an ``html()`` function that allows us to automatically sanitize +content: .. code-block:: python - def greet(*args): - """Give a static greeting.""" - return "Hello!" + evil = "<script>alert('evil')</script>" + template = t"<p>{evil}</p>" + assert html(template) == "<p>&lt;script&gt;alert('evil')&lt;/script&gt;</p>
" - assert greet"Hi" == "Hello!" # Use the custom "tag" on the string - -As you can see, ``greet`` is just a callable, in the place that the ``f`` -prefix would go. Let's look at the args: +Likewise, our hypothetical ``html()`` function can make it easy for developers +to add attributes to HTML elements using a dictionary: .. code-block:: python - def greet(*args): - """Uppercase and add exclamation.""" - salutation = args[0].upper() - return f"{salutation}!" + attributes = {"src": "shrubbery.jpg", "alt": "looks nice"} + template = t"" + assert html(template) == 'looks nice' - greeting = greet"Hello" # Use the custom "tag" on the string - assert greeting == "HELLO!" +Neither of these examples is possible with f-strings. By providing a +mechanism to intercept and transform interpolated values, template strings +enable a wide range of string processing use cases. -The tag function is passed a sequence of arguments. Since our tag string is simply -``"Hello"``, the ``args`` sequence only contains a string-like value of ``'Hello'``. - -With this in place, let's introduce an *interpolation*. That is, a place where -a value should be inserted: - -.. code-block:: python - - def greet(*args): - """Handle an interpolation.""" - # The first arg is the string-like value "Hello " with a space - salutation = args[0].strip() - # The second arg is an "interpolation" - interpolation = args[1] - # Interpolations are tuples, the first item is a lambda - getvalue = interpolation[0] - # It gets called in the scope where it was defined, so - # the interpolation returns "World" - result = getvalue() - recipient = result.upper() - return f"{salutation} {recipient}!" - - name = "World" - greeting = greet"Hello {name}" - assert greeting == "Hello WORLD!" 
- -The f-string interpolation of ``{name}`` leads to the new machinery in tag -strings: - -- ``args[0]`` is still the string-like ``'Hello '``, this time with a trailing space -- ``args[1]`` is an expression -- the ``{name}`` part -- Tag strings represent this part as an *interpolation* object as discussed below - -The ``*args`` list is a sequence of ``Decoded`` and ``Interpolation`` values. A "decoded" object -is a string-like object with extra powers, as described below. An "interpolation" object is a -tuple-like value representing how Python processed the interpolation into a form useful for your -tag function. Both are fully described below in `Specification`_. - -Here is a more generalized version using structural pattern matching and type hints: - -.. code-block:: python - - from typing import Decoded, Interpolation # Get the new protocols - - def greet(*args: Decoded | Interpolation) -> str: - """Handle arbitrary args using structural pattern matching.""" - result = [] - for arg in args: - match arg: - case Decoded() as decoded: - result.append(decoded) - case Interpolation() as interpolation: - value = interpolation.getvalue() - result.append(value.upper()) - - return f"{''.join(result)}!" - - name = "World" - greeting = greet"Hello {name} nice to meet you" - assert greeting == "Hello WORLD nice to meet you!" - -Tag strings extract more than just a callable from the ``Interpolation``. They also -provide Python string formatting info, as well as the original text: - -.. code-block:: python - - def greet(*args: Decoded | Interpolation) -> str: - """Interpolations can have string formatting specs and conversions.""" - result = [] - for arg in args: - match arg: - case Decoded() as decoded: - result.append(decoded) - case getvalue, raw, conversion, format_spec: # Unpack - gv = f"gv: {getvalue()}" - r = f"r: {raw}" - c = f"c: {conversion}" - f = f"f: {format_spec}" - result.append(", ".join([gv, r, c, f])) - - return f"{''.join(result)}!" 
- - name = "World" - assert greet"Hello {name!r:s}" == "Hello gv: World, r: name, c: r, f: s!" - -You can see each of the ``Interpolation`` parts getting extracted: - -- The lambda expression to call and get the value in the scope it was defined -- The raw string of the interpolation (``name``) -- The Python "conversion" field (``r``) -- Any `format specification `_ - (``s``) Specification ============= -In the rest of this specification, ``my_tag`` will be used for an arbitrary tag. -For example: +Template String Literals +------------------------ + +This PEP introduces a new string prefix, ``t``, to define template string literals. +These literals resolve to a new type, ``Template``, found in a new top-level +standard library module, ``templatelib``. + +The following code creates a ``Template`` instance: .. code-block:: python - def mytag(*args): - return args + from templatelib import Template + template = t"This is a template string." + assert isinstance(template, Template) - trade = 'shrubberies' - mytag'Did you say "{trade}"?' +Template string literals support the full syntax of :pep:`701`. This includes +the ability to nest template strings within interpolations, as well as the ability +to use all valid quote marks (``'``, ``"``, ``'''``, and ``"""``). Like other string +prefixes, the ``t`` prefix must immediately precede the quote. Like f-strings, +both lowercase ``t`` and uppercase ``T`` prefixes are supported. Like +f-strings, t-strings may not be combined with the ``b`` or ``u`` prefixes. +Additionally, f-strings and t-strings cannot be combined, so the ``ft`` +prefix is invalid as well. t-strings *may* be combined with the ``r`` prefix; +see the `Raw Template Strings`_ section below for more information. -Valid Tag Names ---------------- -The tag name can be any undotted name that isn't already an existing valid string or -bytes prefix, as seen in the `lexical analysis specification -`_. 
-Therefore these prefixes can't be used as a tag: +The ``Template`` Type +--------------------- -.. code-block:: text - - stringprefix: "r" | "u" | "R" | "U" | "f" | "F" - : | "fr" | "Fr" | "fR" | "FR" | "rf" | "rF" | "Rf" | "RF" - - bytesprefix: "b" | "B" | "br" | "Br" | "bR" | "BR" | "rb" | "rB" | "Rb" | "RB" - -Python `restricts certain keywords `_ from being -used as identifiers. This restriction also applies to tag names. Usage of keywords should -trigger a helpful error, as done in recent CPython releases. - -Tags Must Immediately Precede the Quote Mark --------------------------------------------- - -As with other string literal prefixes, no whitespace can be between the tag and the -quote mark. - -PEP 701 -------- - -Tag strings support the full syntax of :pep:`701` in that any string literal, -with any quote mark, can be nested in the interpolation. This nesting includes -of course tag strings. - -Evaluating Tag Strings ----------------------- - -When the tag string is evaluated, the tag must have a binding, or a ``NameError`` -is raised; and it must be a callable, or a ``TypeError`` is raised. The callable -must accept a sequence of positional arguments. This behavior follows from the -de-sugaring of: +Template strings evaluate to an instance of a new type, ``templatelib.Template``: .. code-block:: python - trade = 'shrubberies' - mytag'Did you say "{trade}"?' - -to: - -.. code-block:: python - - mytag(DecodedConcrete(r'Did you say "'), InterpolationConcrete(lambda: trade, 'trade', None, None), DecodedConcrete(r'"?')) - -.. note:: - - `DecodedConcrete` and `InterpolationConcrete` are just example implementations. If approved, - tag strings will have concrete types in `builtins`. - -Decoded Strings ---------------- - -In the ``mytag'Did you say "{trade}"?'`` example, there are two strings: ``r'Did you say "'`` -and ``r'"?'``. - -Strings are internally stored as objects with a ``Decoded`` structure, meaning: conforming to -a protocol ``Decoded``: - -.. 
code-block:: python - - @runtime_checkable - class Decoded(Protocol): - def __str__(self) -> str: + class Template: + args: Sequence[str | Interpolation] + + def __init__(self, *args: str | Interpolation): ... - raw: str - - -These ``Decoded`` objects have access to raw strings. Raw strings are used because tag strings -are meant to target a variety of DSLs, such as the shell and regexes. Such DSLs have their -own specific treatment of metacharacters, namely the backslash. - -However, often the "cooked" string is what is needed, by decoding the string as -if it were a standard Python string. In the proposed implementation, the decoded object's -``__new__`` will *store* the raw string and *store and return* the "cooked" string. - -The protocol is marked as ``@runtime_checkable`` to allow structural pattern matching to -test against the protocol instead of a type. This can incur a small performance penalty. -Since the ``case`` tests are in user-code tag functions, authors can choose to optimize by -testing for the implementation type discussed next. - -The ``Decoded`` protocol will be available from ``typing``. In CPython, ``Decoded`` -will be implemented in C, but for discussion of this PEP, the following is a compatible -implementation: +The ``args`` attribute provides access to the string parts and +any interpolations in the literal: .. code-block:: python - class DecodedConcrete(str): - _raw: str + name = "World" + template = t"Hello {name}" + assert isinstance(template.args[0], str) + assert isinstance(template.args[1], Interpolation) + assert template.args[0] == "Hello " + assert template.args[1].value == "World" - def __new__(cls, raw: str): - decoded = raw.encode("utf-8").decode("unicode-escape") - if decoded == raw: - decoded = raw - chunk = super().__new__(cls, decoded) - chunk._raw = raw - return chunk +See `Interleaving of Template.args`_ below for more information on how the +``args`` attribute is structured. 
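The ``t`` prefix and the ``templatelib`` module are proposals in this draft and may not exist in the interpreter at hand. As a rough sketch only, the shape of ``Template.args`` can be modelled with plain dataclasses. The two classes below are hypothetical stand-ins written for illustration, not the proposed implementation, and the template is built by hand rather than from a literal. The trailing empty string follows the interleaving rule described later in this PEP.

```python
from dataclasses import dataclass
from typing import Literal, Optional, Tuple, Union


@dataclass(frozen=True)
class Interpolation:
    """Hypothetical stand-in for the proposed templatelib.Interpolation."""
    value: object
    expr: str
    conv: Optional[Literal["a", "r", "s"]] = None
    format_spec: str = ""


@dataclass(frozen=True)
class Template:
    """Hypothetical stand-in for the proposed templatelib.Template."""
    args: Tuple[Union[str, Interpolation], ...]


name = "World"
# Hand-built approximation of what t"Hello {name}" is described as producing.
template = Template(args=("Hello ", Interpolation(value=name, expr="name"), ""))

# Separate the static text from the interpolated values before combining them.
static_parts = [arg for arg in template.args if isinstance(arg, str)]
values = [arg.value for arg in template.args if isinstance(arg, Interpolation)]
exprs = [arg.expr for arg in template.args if isinstance(arg, Interpolation)]
```

Here ``static_parts`` holds the literal text and ``values`` holds the evaluated results, which is the separation that processing code relies on.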
- @property - def raw(self): - return self._raw +The ``Template`` type is immutable. ``Template.args`` cannot be reassigned +or mutated. -Interpolation -------------- -An ``Interpolation`` is the data structure representing an expression inside the tag -string. Interpolations enable a delayed evaluation model, where the interpolation -expression is computed, transformed, memoized, or processed in any way. +The ``Interpolation`` Type +-------------------------- -In addition, the original text of the interpolation expression is made available to the -tag function. This can be useful for debugging or metaprogramming. - -``Interpolation`` is a ``Protocol`` which will be made available from ``typing``. It -has the following definition: +The ``Interpolation`` type represents an expression inside a template string. +Like ``Template``, it is a new concrete type found in the ``templatelib`` module: .. code-block:: python - @runtime_checkable - class Interpolation(Protocol): - def __len__(self): - ... - - def __getitem__(self, index: int): - ... - - def getvalue(self) -> Callable[[], Any]: - ... - + class Interpolation: + value: object expr: str conv: Literal["a", "r", "s"] | None - format_spec: str | None + format_spec: str -Given this example interpolation: + __match_args__ = ("value", "expr", "conv", "format_spec") + + def __init__( + self, + value: object, + expr: str, + conv: Literal["a", "r", "s"] | None = None, + format_spec: str = "", + ): + ... + +Like ``Template``, ``Interpolation`` is shallow immutable. Its attributes +cannot be reassigned. + +The ``value`` attribute is the evaluated result of the interpolation: .. code-block:: python - mytag'{trade!r:some-formatspec}' + name = "World" + template = t"Hello {name}" + assert template.args[1].value == "World" -these attributes are as follows: - -* ``getvalue`` is a zero argument closure for the interpolation. In this case, ``lambda: trade``. - -* ``expr`` is the *expression text* of the interpolation. 
Example: ``'trade'``. - -* ``conv`` is the - `optional conversion `_ - to be used by the tag function, one of ``r``, ``s``, and ``a``, corresponding to repr, str, - and ascii conversions. Note that as with f-strings, no other conversions are supported. - Example: ``'r'``. - -* ``format_spec`` is the optional `format_spec string `_. - A ``format_spec`` is eagerly evaluated if it contains any expressions before being passed to the tag - function. Example: ``'some-formatspec'``. - -In all cases, the tag function determines what to do with valid ``Interpolation`` -attributes. - -In the CPython reference implementation, implementing ``Interpolation`` in C would -use the equivalent `Struct Sequence Objects -`_ (see -such code as `os.stat_result -`_). For purposes of this -PEP, here is an example of a pure Python implementation: +The ``expr`` attribute is the *original text* of the interpolation: .. code-block:: python - class InterpolationConcrete(NamedTuple): - getvalue: Callable[[], Any] - expr: str - conv: Literal['a', 'r', 's'] | None = None - format_spec: str | None = None + name = "World" + template = t"Hello {name}" + assert template.args[1].expr == "name" + +We expect that the ``expr`` attribute will not be used in most template processing +code. It is provided for completeness and for use in debugging and introspection. +See both the `Common Patterns Seen in Processing Templates`_ section and the +`Examples`_ section for more information on how to process template strings. + +The ``conv`` attribute is the :ref:`optional conversion ` +to be used, one of ``r``, ``s``, and ``a``, corresponding to ``repr()``, +``str()``, and ``ascii()`` conversions. As with f-strings, no other conversions +are supported: + +.. code-block:: python + + name = "World" + template = t"Hello {name!r}" + assert template.args[1].conv == "r" + +If no conversion is provided, ``conv`` is ``None``. + +The ``format_spec`` attribute is the :ref:`format specification `. 
+As with f-strings, this is an arbitrary string that defines how to present the value: + +.. code-block:: python + + value = 42 + template = t"Value: {value:.2f}" + assert template.args[1].format_spec == ".2f" + +Format specifications in f-strings can themselves contain interpolations. This +is permitted in template strings as well; ``format_spec`` is set to the eagerly +evaluated result: + +.. code-block:: python + + value = 42 + precision = 2 + template = t"Value: {value:.{precision}f}" + assert template.args[1].format_spec == ".2f" + +If no format specification is provided, ``format_spec`` defaults to an empty +string (``""``). This matches the ``format_spec`` parameter of Python's +:func:`python:format` built-in. + +Unlike f-strings, it is up to code that processes the template to determine how to +interpret the ``conv`` and ``format_spec`` attributes. +Such code is not required to use these attributes, but when present they should +be respected, and to the extent possible match the behavior of f-strings. +It would be surprising if, for example, a template string that uses ``{value:.2f}`` +did not round the value to two decimal places when processed. + + +Processing Template Strings +--------------------------- + +Developers can write arbitrary code to process template strings. For example, +the following function renders static parts of the template in lowercase and +interpolations in uppercase: + +.. code-block:: python + + from templatelib import Template, Interpolation + + def lower_upper(template: Template) -> str: + """Render static parts lowercased and interpolations uppercased.""" + parts: list[str] = [] + for arg in template.args: + if isinstance(arg, Interpolation): + parts.append(str(arg.value).upper()) + else: + parts.append(arg.lower()) + return "".join(parts) + + name = "world" + assert lower_upper(t"HELLO {name}") == "hello WORLD" + +There is no requirement that template strings are processed in any particular +way. 
Code that processes templates has no obligation to return a string. +Template strings are a flexible, general-purpose feature. + +See the `Common Patterns Seen in Processing Templates`_ section for more +information on how to process template strings. See the `Examples`_ section +for detailed working examples. + + +Template String Concatenation +----------------------------- + +Template strings support explicit concatenation using ``+``. Concatenation is +supported for two ``Template`` instances as well as for a ``Template`` instance +and a ``str``: + +.. code-block:: python + + name = "World" + template1 = t"Hello " + template2 = t"{name}" + assert template1 + template2 == t"Hello {name}" + assert template1 + "!" == t"Hello !" + assert "Hello " + template2 == t"Hello {name}" + +Concatenation of templates is "viral": the concatenation of a ``Template`` and +a ``str`` always results in a ``Template`` instance. + +Python's implicit concatenation syntax is also supported. The following code +will work as expected: + +.. code-block:: python + + name = "World" + template = t"Hello " "World" + assert template == t"Hello World" + template2 = t"Hello " t"World" + assert template2 == t"Hello World" + + +The ``Template`` type implements the ``__add__()`` and ``__radd__()`` methods +roughly as follows: + +.. code-block:: python + + class Template: + def __add__(self, other: object) -> Template: + if isinstance(other, str): + return Template(*self.args[:-1], self.args[-1] + other) + if not isinstance(other, Template): + return NotImplemented + return Template(*self.args[:-1], self.args[-1] + other.args[0], *other.args[1:]) + + def __radd__(self, other: object) -> Template: + if not isinstance(other, str): + return NotImplemented + return Template(other + self.args[0], *self.args[1:]) + +Special care is taken to ensure that the interleaving of ``str`` and ``Interpolation`` +instances is maintained when concatenating. 
(See the +`Interleaving of Template.args`_ section for more information.) + + +Template and Interpolation Equality +----------------------------------- + +Two instances of ``Template`` are defined to be equal if their ``args`` attributes +contain the same strings and interpolations in the same order: + +.. code-block:: python + + assert t"I love {stilton}" == t"I love {stilton}" + assert t"I love {stilton}" != t"I love {roquefort}" + assert t"I " + t"love {stilton}" == t"I love {stilton}" + +The implementation of ``Template.__eq__()`` is roughly as follows: + +.. code-block:: python + + class Template: + def __eq__(self, other: object) -> bool: + if not isinstance(other, Template): + return NotImplemented + return self.args == other.args + +Two instances of ``Interpolation`` are defined to be equal if their ``value``, +``expr``, ``conv``, and ``format_spec`` attributes are equal: + +.. code-block:: python + + class Interpolation: + def __eq__(self, other: object) -> bool: + if not isinstance(other, Interpolation): + return NotImplemented + return ( + self.value == other.value + and self.expr == other.expr + and self.conv == other.conv + and self.format_spec == other.format_spec + ) + + +No Support for Ordering +----------------------- + +The ``Template`` and ``Interpolation`` types do not support ordering. This is +unlike all other string literal types in Python, which support lexicographic +ordering. Because interpolations can contain arbitrary values, there is no +natural ordering for them. As a result, neither the ``Template`` nor the +``Interpolation`` type implements the standard comparison methods. + + +Support for the debug specifier (``=``) +--------------------------------------- + +The debug specifier, ``=``, is supported in template strings and behaves similarly +to how it behaves in f-strings, though due to limitations of the implementation +there is a slight difference. + +In particular, ``t'{expr=}'`` is treated as ``t'expr={expr}'``: + +.. 
code-block:: python + + name = "World" + template = t"Hello {name=}" + assert template.args[0] == "Hello name=" + assert template.args[1].value == "World" + + +Raw Template Strings +-------------------- + +Raw template strings are supported using the ``rt`` (or ``tr``) prefix: + +.. code-block:: python + + trade = 'shrubberies' + t = rt'Did you say "{trade}"?\n' + assert t.args[0] == r'Did you say "' + assert t.args[2] == r'"?\n' + +In this example, the ``\n`` is treated as two separate characters +(a backslash followed by 'n') rather than a newline character. This is +consistent with Python's raw string behavior. + +As with regular template strings, interpolations in raw template strings are +processed normally, allowing for the combination of raw string behavior and +dynamic content. + Interpolation Expression Evaluation ----------------------------------- -Expression evaluation for interpolations is the same as in :pep:`498#expression-evaluation`, -except that all expressions are always implicitly wrapped with a ``lambda``: +Expression evaluation for interpolations is the same as in :pep:`498#expression-evaluation`: The expressions that are extracted from the string are evaluated in the context - where the tag string appeared. This means the expression has full access to its + where the template string appeared. This means the expression has full access to its lexical scope, including local and global variables. Any valid Python expression can be used, including function and method calls. -However, there's one additional nuance to consider, `function scope -`_ -versus `annotation scope -`_. -Consider this somewhat contrived example to configure captions: +Template strings are evaluated eagerly from left to right, just like f-strings. This means that +interpolations are evaluated immediately when the template string is processed, not deferred +or wrapped in lambdas. 
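Since evaluation is specified to match f-strings, the eager, left-to-right behavior can be observed with an ordinary f-string. The sketch below uses f-strings as an analogy only; the ``track()`` helper is hypothetical and exists just to record the order of evaluation.

```python
calls = []


def track(label):
    """Record that an interpolation expression was evaluated, then return it."""
    calls.append(label)
    return label


# Both expressions run immediately, left to right, when the literal is evaluated.
result = f"{track('first')} then {track('second')}"

name = "World"
greeting = f"Hello {name}"
# Rebinding the name afterwards has no effect: the value was captured eagerly.
name = "Mars"
```

By the same rule, the ``value`` of an interpolation in a template string would be fixed at the point where the literal is evaluated, not when the template is later processed.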
+ + +Exceptions +---------- + +Exceptions raised in t-string literals are the same as those raised in f-string +literals. + + +Interleaving of ``Template.args`` +--------------------------------- + +In the ``Template`` type, the ``args`` attribute is a sequence that will always +alternate between string literals and ``Interpolation`` instances. Specifically: + +- Even-indexed elements (0, 2, 4, ...) are always of type ``str``, representing + the literal parts of the template. +- Odd-indexed elements (1, 3, 5, ...) are always ``Interpolation`` instances, + representing the interpolated expressions. + +For example, the following assertions hold: .. code-block:: python - class CaptionConfig: - tag = 'b' - figure = f'<{tag}>Figure' + name = "World" + template = t"Hello {name}" + assert len(template.args) == 3 + assert template.args[0] == "Hello " + assert template.args[1].value == "World" + assert template.args[2] == "" -Let's now attempt to rewrite the above example to use tag strings: +These rules imply that the ``args`` attribute will always have an odd length. +As a consequence, empty strings are added to the sequence when the template +begins or ends with an interpolation, or when two interpolations are adjacent: .. code-block:: python - class CaptionConfig: - tag = 'b' - figure = html'<{tag}>Figure' + a, b = "a", "b" + template = t"{a}{b}" + assert len(template.args) == 5 + assert template.args[0] == "" + assert template.args[1].value == "a" + assert template.args[2] == "" + assert template.args[3].value == "b" + assert template.args[4] == "" -Unfortunately, this rewrite doesn't work if using the usual lambda wrapping to -implement interpolations, namely ``lambda: tag``. When the interpolations are -evaluated by the tag function, it will result in ``NameError: name 'tag' is not -defined``. The root cause of this name error is that ``lambda: tag`` uses function scope, -and it's therefore not able to use the class definition where ``tag`` is -defined. 
+Most template processing code will not care about this detail and will use +either structural pattern matching or ``isinstance()`` checks to distinguish +between the two types of elements in the sequence. + +The detail exists because it allows for performance optimizations in template +processing code. For example, a template processor could cache the static parts +of the template and only reprocess the dynamic parts when the template is +evaluated with different values. Access to the static parts can be done with +``template.args[::2]``. + +Interleaving is an invariant maintained by the ``Template`` class. Developers can +take advantage of it but they are not required to themselves maintain it. +Specifically, ``Template.__init__()`` can be called with ``str`` and +``Interpolation`` instances in *any* order; the constructor will "interleave" them +as necessary before assigning them to ``args``. + + +Examples +======== + +All examples in this section of the PEP have fully tested reference implementations +available in the public `pep750-examples `_ +git repository. + + +Example: Implementing f-strings with t-strings +---------------------------------------------- + +It is easy to "implement" f-strings using t-strings. That is, we can +write a function ``f(template: Template) -> str`` that processes a ``Template`` +in much the same way as an f-string literal, returning the same result: -Desugaring how the tag string could be evaluated will result in the same -``NameError`` even using f-strings; the lambda wrapping here also uses function -scoping: .. code-block:: python - class CaptionConfig: - tag = 'b' - figure = f'<{(lambda: tag)()}>Figure' + name = "World" + value = 42 + templated = t"Hello {name!r}, value: {value:.2f}" + formatted = f"Hello {name!r}, value: {value:.2f}" + assert f(templated) == formatted -For tag strings, getting such a ``NameError`` would be surprising. 
It would also -be a rough edge in using tag strings in this specific case of working with class -variables. After all, tag strings are supposed to support a superset of the -capabilities of f-strings. - -The solution is to use annotation scope for tag string interpolations. While the -name "annotation scope" suggests it's only about annotations, it solves this -problem by lexically resolving names in the class definition, such as ``tag``, -unlike function scope. - -.. note:: - - The use of annotation scope means it's not possible to fully desugar - interpolations into Python code. Instead it's as if one is writing - ``interpolation_lambda: tag``, not ``lambda: tag``, where a hypothetical - ``interpolation_lambda`` keyword variant uses annotation scope instead of - the standard function scope. - - This is more or less how the reference implementation implements this - concept (but without creating a new keyword of course). - -This PEP and its reference implementation therefore use the support for -annotation scope. Note that this usage is a separable part from the -implementation of :pep:`649` and :pep:`695` which provides a somewhat similar -deferred execution model for annotations. Instead it's up to the tag function to -evaluate any interpolations. - -With annotation scope in place, lambda-wrapped expressions in interpolations -then provide the usual lexical scoping seen with f-strings. So there's no need -to use ``locals()``, ``globals()``, or frame introspection with -``sys._getframe`` to evaluate the interpolation. In addition, the code of each -expression is available and does not have to be looked up with -``inspect.getsource`` or some other means. - -Format Specification --------------------- - -The ``format_spec`` is by default ``None`` if it is not specified in the tag string's -corresponding interpolation. 
- -Because the tag function is completely responsible for processing ``Decoded`` -and ``Interpolation`` values, there is no required interpretation for the format -spec and conversion in an interpolation. For example, this is a valid usage: +The ``f()`` function supports both conversion specifiers like ``!r`` and format +specifiers like ``:.2f``. The full code is fairly simple: .. code-block:: python - html'
{content:HTML|str}
' + from templatelib import Template, Interpolation -In this case the ``format_spec`` for the second interpolation is the string -``'HTML|str'``; it is up to the ``html`` tag to do something with the -"format spec" here, if anything. + def convert(value: object, conv: Literal["a", "r", "s"] | None) -> object: + if conv == "a": + return ascii(value) + elif conv == "r": + return repr(value) + elif conv == "s": + return str(value) + return value -f-string-style ``=`` Evaluation -------------------------------- -``mytag'{expr=}'`` is parsed to being the same as ``mytag'expr={expr}``', as -implemented in the issue `Add = to f-strings for -easier debugging `_. + def f(template: Template) -> str: + parts = [] + for arg in template.args: + match arg: + case str() as s: + parts.append(s) + case Interpolation(value, _, conv, format_spec): + value = convert(value, conv) + value = format(value, format_spec) + parts.append(value) + return "".join(parts) -Tag Function Arguments ----------------------- -The tag function has the following signature: +.. note:: Example code + + See `fstring.py`__ and `test_fstring.py`__. + + __ https://github.com/davepeck/pep750-examples/blob/main/pep/fstring.py + __ https://github.com/davepeck/pep750-examples/blob/main/pep/test_fstring.py + + +Example: Structured Logging +--------------------------- + +Structured logging allows developers to log data in both a human-readable format +*and* a structured format (like JSON) using only a single logging call. This is +useful for log aggregation systems that process the structured format while +still allowing developers to easily read their logs. + +We present two different approaches to implementing structured logging with +template strings. + +Approach 1: Custom Log Messages +''''''''''''''''''''''''''''''' + +The :ref:`Python Logging Cookbook ` +has a short section on `how to implement structured logging `_. 
+ +The logging cookbook suggests creating a new "message" class, ``StructuredMessage``, +that is constructed with a simple text message and a separate dictionary of values: .. code-block:: python - def mytag(*args: Decoded | Interpolation) -> Any: - ... + message = StructuredMessage("user action", { + "action": "traded", + "amount": 42, + "item": "shrubs" + }) + logging.info(message) -This corresponds to the following protocol: + # Outputs: + # user action >>> {"action": "traded", "amount": 42, "item": "shrubs"} + +The ``StructuredMessage.__str__()`` method formats both the human-readable +message *and* the values, combining them into a final string. (See the +`logging cookbook `_ +for its full example.) + +We can implement an improved version of ``StructuredMessage`` using template strings: .. code-block:: python - class TagFunction(Protocol): - def __call__(self, *args: Decoded | Interpolation) -> Any: - ... + import json + from templatelib import Interpolation, Template + from typing import Mapping -Because of subclassing, the signature for ``mytag`` can of course be widened to -the following, at the cost of losing some type specificity: + class TemplateMessage: + def __init__(self, template: Template) -> None: + self.template = template + + @property + def message(self) -> str: + # Use the f() function from the previous example + return f(self.template) + + @property + def values(self) -> Mapping[str, object]: + return { + arg.expr: arg.value + for arg in self.template.args + if isinstance(arg, Interpolation) + } + + def __str__(self) -> str: + return f"{self.message} >>> {json.dumps(self.values)}" + + _ = TemplateMessage # optional, to improve readability + action, amount, item = "traded", 42, "shrubs" + logging.info(_(t"User {action}: {amount:.2f} {item}")) + + # Outputs: + # User traded: 42.00 shrubs >>> {"action": "traded", "amount": 42, "item": "shrubs"} + +Template strings give us a more elegant way to define the custom message +class. 
With template strings it is no longer necessary for developers to make +sure that their format string and values dictionary are kept in sync; a single +template string literal is all that is needed. The ``TemplateMessage`` +implementation can automatically extract structured keys and values from +the ``Interpolation.expr`` and ``Interpolation.value`` attributes, respectively. + + +Approach 2: Custom Formatters +''''''''''''''''''''''''''''' + +Custom messages are a reasonable approach to structured logging but can be a +little awkward. To use them, developers must wrap every log message they write +in a custom class. This can be easy to forget. + +An alternative approach is to define custom ``logging.Formatter`` classes. This +approach is more flexible and allows for more control over the final output. In +particular, it's possible to take a single template string and output it in +multiple formats (human-readable and JSON) to separate log streams. + +We define two simple formatters, a ``MessageFormatter`` for human-readable output +and a ``ValuesFormatter`` for JSON output: .. code-block:: python - def mytag(*args: str | tuple) -> Any: - ... 
+ import json + from logging import Formatter, LogRecord + from templatelib import Interpolation, Template + from typing import Any, Mapping -A user might write a tag string as follows: + + class MessageFormatter(Formatter): + def message(self, template: Template) -> str: + # Use the f() function from the previous example + return f(template) + + def format(self, record: LogRecord) -> str: + msg = record.msg + if not isinstance(msg, Template): + return super().format(record) + return self.message(msg) + + + class ValuesFormatter(Formatter): + def values(self, template: Template) -> Mapping[str, Any]: + return { + arg.expr: arg.value + for arg in template.args + if isinstance(arg, Interpolation) + } + + def format(self, record: LogRecord) -> str: + msg = record.msg + if not isinstance(msg, Template): + return super().format(record) + return json.dumps(self.values(msg)) + + +We can then use these formatters when configuring our logger: .. code-block:: python - def tag(*args): - return args + import logging + import sys - tag"\N{{GRINNING FACE}}" + logger = logging.getLogger(__name__) + message_handler = logging.StreamHandler(sys.stdout) + message_handler.setFormatter(MessageFormatter()) + logger.addHandler(message_handler) -Tag strings will represent this as exactly one ``Decoded`` argument. In this case, ``Decoded.raw`` would be -``'\\N{GRINNING FACE}'``. The "cooked" representation via encode and decode would be: + values_handler = logging.StreamHandler(sys.stderr) + values_handler.setFormatter(ValuesFormatter()) + logger.addHandler(values_handler) -.. code-block:: python + action, amount, item = "traded", 42, "shrubs" + logger.info(t"User {action}: {amount:.2f} {item}") - '\\N{GRINNING FACE}'.encode('utf-8').decode('unicode-escape') - '😀' + # Outputs to sys.stdout: + # User traded: 42.00 shrubs -Named unicode characters immediately followed by more text will still produce -just one ``Decoded`` argument: - -.. 
code-block:: python - - def tag(*args): - return args - - assert tag"\N{{GRINNING FACE}}sometext" == (DecodedConcrete("😀sometext"),) + # At the same time, outputs to sys.stderr: + # {"action": "traded", "amount": 42, "item": "shrubs"} -Return Value ------------- +This approach has a couple advantages over the custom message approach to structured +logging: -Tag functions can return any type. Often they will return a string, but -richer systems can be built by returning richer objects. See below for -a motivating example. +- Developers can log a t-string directly without wrapping it in a custom class. +- Human-readable and structured output can be sent to separate log streams. This + is useful for log aggregation systems that process structured data independently + from human-readable data. -Function Application --------------------- -Tag strings desugar as follows: +.. note:: Example code -.. code-block:: python + See `logging.py`__ and `test_logging.py`__. + + __ https://github.com/davepeck/pep750-examples/blob/main/pep/logging.py + __ https://github.com/davepeck/pep750-examples/blob/main/pep/test_logging.py - mytag'Hi, {name!s:format_spec}!' -This is equivalent to: +Example: HTML Templating +------------------------- -.. code-block:: python +This PEP contains several short HTML templating examples. It turns out that the +"hypothetical" ``html()`` function mentioned in the `Motivation`_ section +(and a few other places in this PEP) exists and is available in the +`pep750-examples repository `_. +If you're thinking about parsing a complex grammar with template strings, we +hope you'll find it useful. - mytag(DecodedConcrete(r'Hi, '), InterpolationConcrete(lambda: name, 'name', - 's', 'format_spec'), DecodedConcrete(r'!')) - -.. note:: - - To keep it simple, this and subsequent desugaring omits an important scoping - aspect in how names in interpolation expressions are resolved, specifically - when defining classes. See `Interpolation Expression Evaluation`_. 
- -No Empty Decoded String ------------------------ - -Alternation between decodeds and interpolations is commonly seen, but it depends -on the tag string. Decoded strings will never have a value that is the empty string: - -.. code-block:: python - - mytag'{a}{b}{c}' - -...which results in this desugaring: - -.. code-block:: python - - mytag(InterpolationConcrete(lambda: a, 'a', None, None), InterpolationConcrete(lambda: b, 'b', None, None), InterpolationConcrete(lambda: c, 'c', None, None)) - -Likewise: - -.. code-block:: python - - mytag'' - -...results in this desugaring: - -.. code-block:: python - - mytag() - -HTML Example of Rich Return Types -================================= - -Tag functions can be a powerful part of larger processing chains by returning richer objects. -JavaScript tagged template literals, for example, are not constrained by a requirement to -return a string. As an example, let's look at an HTML generation system, with a usage and -"subcomponent": - -.. code-block:: - - def Menu(*, logo: str, class_: str) -> HTML: - return html'Site Logo' - - icon = 'acme.png' - result = html'
<{Menu} logo={icon} class="my-menu"/>
' - img = result.children[0] - assert img.tag == "img" - assert img.attrs == {"src": "acme.png", "class": "my-menu", "alt": "Site Logo"} - # We can also treat the return type as a string of specially-serialized HTML - assert str(result) = '
' # etc. - -This ``html`` tag function might have the following signature: - -.. code-block:: python - - def html(*args: Decoded | Interpolation) -> HTML: - ... - -The ``HTML`` return class might have the following shape as a ``Protocol``: - -.. code-block:: python - - @runtime_checkable - class HTML(Protocol): - tag: str - attrs: dict[str, Any] - children: Sequence[str | HTML] - -In summary, the returned instance can be used as: - -- A string, for serializing to the final output -- An iterable, for working with WSGI/ASGI for output streamed and evaluated - interpolations *in the order* they are written out -- A DOM (data) structure of nested Python data - -In each case, the result can be lazily and recursively composed in a safe fashion, because -the return value isn't required to be a string. Recommended practice is that -return values are "passive" objects. - -What benefits might come from returning rich objects instead of strings? A DSL for -a domain such as HTML templating can provide a toolchain of post-processing, as -`Babel `_ does for JavaScript -`with AST-based transformation plugins `_. -Similarly, systems that provide middleware processing can operate on richer, -standard objects with more capabilities. Tag string results can be tested as -nested Python objects, rather than string manipulation. Finally, the intermediate -results can be cached/persisted in useful ways. - -Tool Support -============ - -Python Semantics in Tag Strings -------------------------------- - -Python template languages and other DSLs have semantics quite apart from Python. -Different scope rules, different calling semantics e.g. for macros, their own -grammar for loops, and the like. - -This means all tools need to write special support for each language. Even then, -it is usually difficult to find all the possible scopes, for example to autocomplete -values. - -However, f-strings do not have this issue. An f-string is considered part of Python. 
-Expressions in curly braces behave as expected and values should resolve based on -regular scoping rules. Tools such as mypy can see inside f-string expressions, -but will likely never look inside a Jinja2 template. - -DSLs written with tag strings will inherit much of this value. While we can't expect -standard tooling to understand the "domain" in the DSL, they can still inspect -anything expressible in an f-string. Backwards Compatibility ======================= -Like f-strings, use of tag strings will be a syntactic backwards incompatibility +Like f-strings, use of template strings will be a syntactic backwards incompatibility with previous versions. + Security Implications ===================== -The security implications of working with interpolations, with respect to +The security implications of working with template strings, with respect to interpolations, are as follows: 1. Scope lookup is the same as f-strings (lexical scope). This model has been shown to work well in practice. -2. Tag functions can ensure that any interpolations are done in a safe fashion, - including respecting the context in the target DSL. +2. Code that processes ``Template`` instances can ensure that any interpolations + are processed in a safe fashion, including respecting the context in which + they appear. + How To Teach This ================= -Tag strings have several audiences: consumers of tag functions, authors of tag -functions, and framework authors who provide interesting machinery for tag -functions. +Template strings have several audiences: -All three groups can start from an important framing: +- Developers using template strings and processing functions +- Authors of template processing code +- Framework authors who build interesting machinery with template strings -- Existing solutions (such as template engines) can do parts of tag strings -- But tag strings move logic closer to "normal Python" +We hope that teaching developers will be straightforward. 
At a glance, +template strings look just like f-strings. Their syntax is familiar and the +scoping rules remain the same. -Consumers can look at tag strings as starting from f-strings: +The first thing developers must learn is that template string literals don't +evaluate to strings; instead, they evaluate to a new type, ``Template``. This +is a simple type intended to be used by template processing code. It's not until +developers call a processing function that they get the result they want: +typically, a string, although processing code can of course return any arbitrary +type. -- They look familiar -- Scoping and syntax rules are the same +Because developers will learn that t-strings are nearly always used in tandem +with processing functions, they don't necessarily need to understand the details +of the ``Template`` type. As with descriptors and decorators, we expect many more +developers will use t-strings than write t-string processing functions. -They first thing they need to absorb: unlike f-strings, the string isn't -immediately evaluated "in-place". Something else (the tag function) happens. -That's the second thing to teach: the tag functions do something particular. -Thus the concept of "domain specific languages" (DSLs). What's extra to -teach: you need to import the tag function before tagging a string. +Over time, a small number of more advanced developers *will* wish to author their +own template processing code. Writing processing code often requires thinking +in terms of formal grammars. Developers will need to learn how to parse the +``args`` attribute of a ``Template`` instance and how to process interpolations +in a context-sensitive fashion. More sophisticated grammars will likely require +parsing to intermediate representations like an AST. Great template processing +code will handle format specifiers and conversions when appropriate. 
Writing +production-grade template processing code -- for instance, to support HTML +templates -- can be a large undertaking. -Tag function authors think in terms of making a DSL. They have -business policies they want to provide in a Python-familiar way. With tag -functions, Python is going to do much of the pre-processing. This lowers -the bar for making a DSL. +We expect that template strings will provide framework authors with a powerful +new tool in their toolbox. While the functionality of template strings overlaps +with existing tools like template engines, t-strings move that logic into +the language itself. Bringing the full power and generality of Python to bear on +string processing tasks opens new possibilities for framework authors. -Tag authors can begin with simple use cases. After authors gain experience, tag strings can be used to add larger -patterns: lazy evaluation, intermediate representations, registries, and more. -Each of these points also match the teaching of decorators. In that case, -a learner consumes something which applies to the code just after it. They -don't need to know too much about decorator theory to take advantage of the -utility. - -Common Patterns Seen In Writing Tag Functions -============================================= +Common Patterns Seen in Processing Templates +============================================ Structural Pattern Matching --------------------------- -Iterating over the arguments with structural pattern matching is the expected -best practice for many tag function implementations: +Iterating over the ``Template.args`` with structural pattern matching is the expected +best practice for many template function implementations: .. code-block:: python - def tag(*args: Decoded | Interpolation) -> Any: - for arg in args: + from templatelib import Template, Interpolation + + def process(template: Template) -> Any: + for arg in template.args: match arg: - case Decoded() as decoded: - ... 
# handle each decoded string + case str() as s: + ... # handle each string part case Interpolation() as interpolation: ... # handle each interpolation -Lazy Evaluation ---------------- -The example tag functions above each call the interpolation's ``getvalue`` lambda -immediately. Python developers have frequently wished that f-strings could be -deferred, or lazily evaluated. It would be straightforward to write a wrapper that, -for example, defers calling the lambda until an ``__str__`` was invoked. +Processing code may also commonly sub-match on attributes of the ``Interpolation`` type: + +.. code-block:: python + + match arg: + case Interpolation(int()): + ... # handle interpolations with integer values + case Interpolation(value=str() as s): + ... # handle interpolations with string values + # etc. + Memoizing --------- -Tag function authors have control of processing the static string parts and -the dynamic interpolation parts. For higher performance, they can deploy approaches -for memoizing processing, for example by generating keys. +Template functions can efficiently process both static and dynamic parts of templates. +The structure of ``Template`` objects allows for effective memoization: -Order of Evaluation -------------------- +.. code-block:: python -Imagine a tag that generates a number of sections in HTML. The tag needs inputs for each -section. But what if the last input argument takes a while? You can't return the HTML for -the first section until all the arguments are available. + source = template.args[::2] # Static string parts + values = [i.value for i in template.args[1::2]] # Dynamic interpolated values + +This separation enables caching of processed static parts, while dynamic parts can be +inserted as needed. Authors of template processing code can use the static +``source`` as cache keys, leading to significant performance improvements when +similar templates are used repeatedly. 
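As a rough sketch of the memoization idea described above, the static parts can serve as a cache key for whatever expensive preparation a processor performs. Everything below is an assumption made for illustration: ``Template`` and ``Interpolation`` are minimal stand-in dataclasses, not the PEP's types, and ``prepare()`` and ``render()`` are hypothetical names. ``functools.lru_cache`` is used only as one convenient way to cache on a hashable key.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Interpolation:
    # Hypothetical stand-in for the PEP's Interpolation type.
    value: object


@dataclass(frozen=True)
class Template:
    # Hypothetical stand-in; args is assumed to already be interleaved.
    args: tuple


@lru_cache(maxsize=128)
def prepare(source: tuple[str, ...]) -> tuple[str, ...]:
    """Stand-in for an expensive step (e.g. parsing); runs once per distinct source."""
    return tuple(source)


def render(template: Template) -> str:
    source = template.args[::2]                      # static string parts (hashable tuple)
    values = [i.value for i in template.args[1::2]]  # dynamic interpolated values
    static = prepare(source)                         # cached after the first call
    out = [static[0]]
    for value, text in zip(values, static[1:]):
        out.append(str(value))
        out.append(text)
    return "".join(out)


def greet(name: str) -> Template:
    return Template(("Hello ", Interpolation(name), "!"))


assert render(greet("World")) == "Hello World!"
assert render(greet("Arthur")) == "Hello Arthur!"
# Both templates share the same static parts, so prepare() ran only once.
assert prepare.cache_info().misses == 1
assert prepare.cache_info().hits == 1
```

Slicing a tuple yields a tuple, so the static parts are hashable and can be used directly as the cache key without any conversion.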
+ + +Parsing to Intermediate Representations +--------------------------------------- + +Code that processes templates can parse the template string into intermediate +representations, like an AST. We expect that many template processing libraries +will use this approach. + +For instance, rather than returning a ``str``, our theoretical ``html()`` function +(see the `Motivation`_ section) could return an HTML ``Element`` defined in the +same package: + +.. code-block:: python + + @dataclass(frozen=True) + class Element: + tag: str + attributes: Mapping[str, str | bool] + children: Sequence[str | Element] + + def __str__(self) -> str: + ... + + + def html(template: Template) -> Element: + ... + +Calling ``str(element)`` would then render the HTML but, in the meantime, the +``Element`` could be manipulated in a variety of ways. + + +Context-sensitive Processing of Interpolations +---------------------------------------------- + +Continuing with our hypothetical ``html()`` function, it could be made +context-sensitive. Interpolations could be processed differently depending +on where they appear in the template. + +For example, our ``html()`` function could support multiple kinds of +interpolations: + +.. code-block:: python + + attributes = {"id": "main"} + attribute_value = "shrubbery" + content = "hello" + template = t"
{content}
" + element = html(template) + assert str(element) == '
hello
' + +Because the ``{attributes}`` interpolation occurs in the context of an HTML tag, +and because there is no corresponding attribute name, it is treated as a dictionary +of attributes. The ``{attribute_value}`` interpolation is treated as a simple +string value and is quoted before inclusion in the final string. The +``{content}`` interpolation is treated as potentially unsafe content and is +escaped before inclusion in the final string. + + +Nested Template Strings +----------------------- + +Going a step further with our ``html()`` function, we could support nested +template strings. This would allow for more complex HTML structures to be +built up from simpler templates: + +.. code-block:: python + + name = "World" + content = html(t"

Hello {name}

") + template = t"
{content}
" + element = html(template) + assert str(element) == '

Hello World

' + +Because the ``{content}`` interpolation is an ``Element`` instance, it does +not need to be escaped before inclusion in the final string. + +One could imagine a nice simplification: if the ``html()`` function is passed +a ``Template`` instance, it could automatically convert it to an ``Element`` +by recursively calling itself on the nested template. + +We expect that nesting and composition of templates will be a common pattern +in template processing code and, where appropriate, used in preference to +simple string concatenation. + + +Approaches to Lazy Evaluation +----------------------------- + +Like f-strings, interpolations in t-string literals are eagerly evaluated. However, +there are cases where lazy evaluation may be desirable. + +If a single interpolation is expensive to evaluate, it can be explicitly wrapped +in a ``lambda`` in the template string literal: + +.. code-block:: python + + name = "World" + template = t"Hello {(lambda: name)}" + assert callable(template.args[1].value) + assert template.args[1].value() == "World" + +This assumes, of course, that template processing code anticipates and handles +callable interpolation values. (One could imagine also supporting iterators, +awaitables, etc.) This is not a requirement of the PEP, but it is a common +pattern in template processing code. + +In general, we hope that the community will develop best practices for lazy +evaluation of interpolations in template strings and that, when it makes sense, +common libraries will provide support for callable or awaitable values in +their template processing code. + + +Approaches to Asynchronous Evaluation +------------------------------------- + +Closely related to lazy evaluation is asynchronous evaluation. + +As with f-strings, the ``await`` keyword is allowed in interpolations: + +.. 
code-block:: python + + async def example(): + async def get_name() -> str: + await asyncio.sleep(1) + return "Sleepy" + + template = t"Hello {await get_name()}" + # Use the f() function from the f-string example, above + assert f(template) == "Hello Sleepy" + +More sophisticated template processing code can take advantage of this to +perform asynchronous operations in interpolations. For example, a "smart" +processing function could anticipate that an interpolation is an awaitable +and await it before processing the template string: + +.. code-block:: python + + async def example(): + async def get_name() -> str: + await asyncio.sleep(1) + return "Sleepy" + + template = t"Hello {get_name}" + assert await aformat(template) == "Hello Sleepy" + +This assumes that the template processing code in ``aformat()`` is asynchronous +and is able to ``await`` an interpolation's value. + +.. note:: Example code + + See `aformat.py`__ and `test_aformat.py`__. + + __ https://github.com/davepeck/pep750-examples/blob/main/pep/aformat.py + __ https://github.com/davepeck/pep750-examples/blob/main/pep/test_aformat.py + + +Approaches to Template Reuse +---------------------------- + +If developers wish to reuse template strings multiple times with different +values, they can write a function to return a ``Template`` instance: + +.. code-block:: python + + def reusable(name: str, question: str) -> Template: + return t"Hello {name}, {question}?" + + template = reusable("friend", "how are you") + template = reusable("King Arthur", "what is your quest") + +This is, of course, no different from how f-strings can be reused. -You'd prefer to emit markup as the inputs are available. Some templating tools support -this approach, as does tag strings. Reference Implementation ======================== @@ -807,65 +1044,168 @@ Reference Implementation At the time of this PEP's announcement, a fully-working implementation is `available `_. 
-This implementation is not final, as the PEP discussion will likely provide changes. +There is also a public repository of `examples and tests `_ +built around the reference implementation. If you're interested in playing with +template strings, this repository is a great place to start. + Rejected Ideas ============== +This PEP has been through several significant revisions. In addition, quite a few interesting +ideas were considered both in revisions of :pep:`501` and in the `Discourse discussion `_. -Enable Exact Round-Tripping of ``conv`` and ``format_spec`` ------------------------------------------------------------ +We attempt to document the most significant ideas that were considered and rejected. -There are two limitations with respect to exactly round-tripping to the original -source text. -First, the ``format_spec`` can be arbitrarily nested: +Arbitrary String Literal Prefixes +--------------------------------- + +Inspired by `JavaScript tagged template literals `_, +an earlier version of this PEP allowed for arbitrary "tag" prefixes in front +of literal strings: .. code-block:: python - mytag'{x:{a{b{c}}}}' + my_tag'Hello {name}' -In this PEP and corresponding reference implementation, the format_spec -is eagerly evaluated to set the ``format_spec`` in the interpolation, thereby losing the -original expressions. - -While it would be feasible to preserve round-tripping in every usage, this would -require an extra flag ``equals`` to support, for example, ``{x=}``, and a -recursive ``Interpolation`` definition for ``format_spec``. The following is roughly the -pure Python equivalent of this type, including preserving the sequence -unpacking (as used in case statements): +The prefix was a special callable called a "tag function". Tag functions +received the parts of the template string in an argument list. They could then +process the string and return an arbitrary value: .. 
code-block:: python - class InterpolationConcrete(NamedTuple): - getvalue: Callable[[], Any] - raw: str - conv: str | None = None - format_spec: str | None | tuple[Decoded | Interpolation, ...] = None - equals: bool = False + def my_tag(*args: str | Interpolation) -> Any: + ... - def __len__(self): - return 4 +This approach was rejected for several reasons: - def __iter__(self): - return iter((self.getvalue, self.raw, self.conv, self.format_spec)) +- It was deemed too complex to build in full generality. JavaScript allows for + arbitrary expressions to precede a template string, which is a significant + challenge to implement in Python. +- It precluded future introduction of new string prefixes. +- It seemed to needlessly pollute the namespace. -However, the additional complexity to support exact round-tripping seems -unnecessary and is thus rejected. +Use of a single ``t`` prefix was chosen as a simpler, more Pythonic approach and +more in keeping with template strings' role as a generalization of f-strings. -No Implicit String Concatenation + +Delayed Evaluation of Interpolations +------------------------------------ + +An early version of this PEP proposed that interpolations should be lazily +evaluated. All interpolations were "wrapped" in implicit lambdas. Instead of +having an eagerly evaluated ``value`` attribute, interpolations had a +``getvalue()`` method that would resolve the value of the interpolation: + +.. code-block:: python + + class Interpolation: + ... + _value: Callable[[], object] + + def getvalue(self) -> object: + return self._value() + +This was rejected for several reasons: + +- The overwhelming majority of use cases for template strings naturally call + for immediate evaluation. +- Delayed evaluation would be a significant departure from the behavior of + f-strings. +- Implicit lambda wrapping leads to difficulties with type hints and + static analysis. 
+ +Most importantly, there are viable (if imperfect) alternatives to implicit +lambda wrapping when lazy evaluation is desired. See the section on +`Approaches to Lazy Evaluation`_, above, for more information. + + +Making ``Template`` and ``Interpolation`` Into Protocols +-------------------------------------------------------- + +An early version of this PEP proposed that the ``Template`` and ``Interpolation`` +types be runtime checkable protocols rather than concrete types. + +In the end, we felt that using concrete types was more straightforward. + + +An Additional ``Decoded`` Type +------------------------------ + +An early version of this PEP proposed an additional type, ``Decoded``, to represent +the "static string" parts of a template string. This type derived from ``str`` and +had a single extra ``raw`` attribute that provided the original text of the string. +We rejected this in favor of the simpler approach of using plain ``str`` and +allowing combination of ``r`` and ``t`` prefixes. + + +Other Homes for ``Template`` and ``Interpolation`` +-------------------------------------------------- + +Previous versions of this PEP proposed that the ``Template`` and ``Interpolation`` +types be placed in the ``types`` module. This was rejected in favor of creating +a new top-level standard library module, ``templatelib``. This was done to avoid +polluting the ``types`` module with seemingly unrelated types. + + +Enable Full Reconstruction of Original Template Literal +------------------------------------------------------- + +Earlier versions of this PEP attempted to make it possible to fully reconstruct +the text of the original template string from a ``Template`` instance. This was +rejected as being overly complex. + +There are several limitations with respect to round-tripping to the original +source text: + +- ``Interpolation.format_spec`` defaults to ``""`` if not provided. It is therefore + impossible to distinguish ``t"{expr}"`` from ``t"{expr:}"``. 
+- The debug specifier, ``=``, is treated as a special case. It is therefore not + possible to distinguish ``t"{expr=}"`` from ``t"expr={expr}"``. +- Finally, format specifiers in f-strings allow arbitrary nesting. In this PEP + and in the reference implementation, the specifier is eagerly evaluated + to set the ``format_spec`` in the ``Interpolation``, thereby losing + the original expressions. For example: + +.. code-block:: python + + value = 42 + precision = 2 + template = t"Value: {value:.{precision}f}" + assert template.args[1].format_spec == ".2f" + +We do not anticipate that these limitations will be a significant issue in practice. +Developers who need to obtain the original template string literal can always +use ``inspect.getsource()`` or similar tools. + + +Disallowing String Concatenation -------------------------------- -Implicit tag string concatenation isn't supported, which is `unlike other string literals -`_. +Earlier versions of this PEP proposed that template strings should not support +concatenation. This was rejected in favor of allowing concatenation. -The expectation is that triple quoting is sufficient. If implicit string -concatenation is supported, results from tag evaluations would need to -support the ``+`` operator with ``__add__`` and ``__radd__``. +There are reasonable arguments in favor of rejecting one or all forms of +concatenation: namely, that it cuts off a class of potential bugs, particularly +when one takes the view that template strings will often contain complex grammars +for which concatenation doesn't always have the same meaning (or any meaning). + +Moreover, the earliest versions of this PEP proposed a syntax closer to +JavaScript's tagged template literals, where an arbitrary callable could be used +as a prefix to a string literal. There was no guarantee that the callable would +return a type that supported concatenation. 
+ +In the end, we decided that the surprise to developers of a new string type +*not* supporting concatenation was likely to be greater than the theoretical +harm caused by supporting it. (Developers concatenate f-strings all the time, +after all, and while we are sure there are cases where this introduces bugs, +it's not clear that those bugs outweigh the benefits of supporting concatenation.) + +While concatenation is supported, we expect that code that uses template strings +will more commonly build up larger templates through nesting and composition +rather than concatenation. -Because tag strings target embedded DSLs, this complexity introduces other -issues, such as determining appropriate separators. This seems unnecessarily -complicated and is thus rejected. Arbitrary Conversion Values --------------------------- @@ -873,17 +1213,106 @@ Arbitrary Conversion Values Python allows only ``r``, ``s``, or ``a`` as possible conversion type values. Trying to assign a different value results in ``SyntaxError``. -In theory, tag functions could choose to handle other conversion types. But this +In theory, template functions could choose to handle other conversion types. But this PEP adheres closely to :pep:`701`. Any changes to allowed values should be in a separate PEP. + +Removing ``conv`` From ``Interpolation`` +---------------------------------------- + +During the authoring of this PEP, we considered removing the ``conv`` attribute +from ``Interpolation`` and specifying that the conversion should be performed +eagerly, before ``Interpolation.value`` is set. + +This was done to simplify the work of writing template processing code. The +``conv`` attribute is of limited extensibility (it is typed as +``Literal["r", "s", "a"] | None``). It is not clear that it adds significant +value or flexibility to template strings that couldn't better be achieved with +custom format specifiers. 
Unlike with format specifiers, there is no +equivalent to Python's :func:`python:format` built-in. (Instead, we include a +sample implementation of ``convert()`` in the `Examples`_ section.) + +Ultimately we decided to keep the ``conv`` attribute in the ``Interpolation`` type +to maintain compatibility with f-strings and to allow for future extensibility. + + +Alternate Interpolation Symbols +------------------------------- + +In the early stages of this PEP, we considered allowing alternate symbols for +interpolations in template strings. For example, we considered allowing +``${name}`` as an alternative to ``{name}`` with the idea that it might be useful +for i18n or other purposes. See the +`Discourse thread `_ +for more information. + +This was rejected in favor of keeping t-string syntax as close to f-string syntax +as possible. + + +A Lazy Conversion Specifier +--------------------------- + +We considered adding a new conversion specifier, ``!()``, that would explicitly +wrap the interpolation expression in a lambda. + +This was rejected in favor of the simpler approach of using explicit lambdas +when lazy evaluation is desired. + + +Alternate Layouts for ``Template.args`` +--------------------------------------- + +During the development of this PEP, we considered several alternate layouts for +the ``args`` attribute of the ``Template`` type. These included: + +- Instead of ``args``, ``Template`` contains a ``strings`` attribute of type + ``Sequence[str]`` and an ``interpolations`` attribute of type + ``Sequence[Interpolation]``. There are zero or more interpolations and + there is always one more string than there are interpolations. Utility code + could build an interleaved sequence of strings and interpolations from these + separate attributes. This was rejected as being overly complex. + +- ``args`` is typed as a ``Sequence[tuple[str, Interpolation | None]]``. Each + static string is paired with its neighboring interpolation. 
The final + string part has no corresponding interpolation. This was rejected as being + overly complex. + +- ``args`` remains a ``Sequence[str | Interpolation]`` but does not support + interleaving. As a result, empty strings are not added to the sequence. It is + no longer possible to obtain static strings with ``args[::2]``; instead, + instance checks or structural pattern matching must be used to distinguish + between strings and interpolations. We believe this approach is easier to + explain and, at first glance, more intuitive. However, it was rejected as + offering less future opportunity for performance optimization. We also believe + that ``args[::2]`` may prove to be a useful shortcut in template processing + code. + + +Mechanism to Describe the "Kind" of Template +-------------------------------------------- + +If t-strings prove popular, it may be useful to have a way to describe the +"kind" of content found in a template string: "sql", "html", "css", etc. +This could enable powerful new features in tools such as linters, formatters, +type checkers, and IDEs. (Imagine, for example, ``black`` formatting HTML in +t-strings, or ``mypy`` checking whether a given attribute is valid for an HTML +tag.) While exciting, this PEP does not propose any specific mechanism. It is +our hope that, over time, the community will develop conventions for this purpose. + + Acknowledgements ================ Thanks to Ryan Morshead for contributions during development of the ideas leading -to tag strings. Thanks also to Koudai Aono for infrastructure work on contributing -materials. Special mention also to Dropbox's `pyxl `_ -as tackling similar ideas years ago. +to template strings. Special mention also to Dropbox's +`pyxl `_ for tackling similar ideas years ago. +Finally, thanks to Joachim Viide for his pioneering work on the `tagged library +`_. Tagged was not just the precursor to +template strings, but the place where the whole effort started via a GitHub issue +comment! 
+ Copyright =========