Major redesign of PEP 501 interpolation
parent
673f1ce88d
commit
1bd21bb257
429
pep-0501.txt
|
@ -1,5 +1,5 @@
|
|||
PEP: 501
|
||||
Title: Translation ready string interpolation
|
||||
Title: General purpose string interpolation
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Nick Coghlan <ncoghlan@gmail.com>
|
||||
|
@ -18,53 +18,83 @@ transparent to the compiler, allow name references from the interpolation
|
|||
operation full access to containing namespaces (as with any other expression),
|
||||
rather than being limited to explicit name references.
|
||||
|
||||
This PEP agrees with the basic motivation of PEP 498, but proposes to focus
|
||||
both the syntax and the implementation on the il8n use case, drawing on the
|
||||
previous proposals in PEP 292 (which added string.Template) and its predecessor
|
||||
PEP 215 (which proposed syntactic support, rather than a runtime string
|
||||
manipulation based approach). The text of this PEP currently assumes that the
|
||||
reader is familiar with these three previous related proposals.
|
||||
However, it only offers this capability for string formatting, making it likely
|
||||
we will see code like the following::
|
||||
|
||||
The interpolation syntax proposed for this PEP is that of PEP 292, but expanded
|
||||
to allow arbitrary expressions and format specifiers when using the ``${ref}``
|
||||
interpolation syntax. The suggested new string prefix is "i" rather than "f",
|
||||
with the intended mnemonics being either "interpolated string" or
|
||||
"il8n string"::
|
||||
os.system(f"echo {user_message}")
|
||||
|
||||
This kind of code is superficially elegant, but poses a significant problem
|
||||
if the interpolated value ``user_message`` is in fact provided by a user: it's
|
||||
an opening for a form of code injection attack, where the supplied user data
|
||||
has not been properly escaped before being passed to the ``os.system`` call.
|
||||
|
||||
To address that problem (and a number of other concerns), this PEP proposes an
|
||||
alternative approach to compiler supported interpolation, based on a new
|
||||
``__interpolate__`` magic method, and using a substitution syntax inspired by
|
||||
that used in ``string.Template`` and ES6 JavaScript, rather than adding a 4th
|
||||
substitution variable syntax to Python.
|
||||
|
||||
Proposal
|
||||
========
|
||||
|
||||
This PEP proposes that the new syntax::
|
||||
|
||||
value = !interpolator "Substitute $names and ${expressions} at runtime"
|
||||
|
||||
be interpreted as::
|
||||
|
||||
_raw_template = "Substitute $names and ${expressions} at runtime"
|
||||
_parsed_fields = (
|
||||
("Substitute ", 0, "names", "", ""),
|
||||
(" and ", 1, "expressions", "", ""),
|
||||
(" at runtime", None, None, None, None),
|
||||
)
|
||||
_field_values = (names, expressions)
|
||||
value = interpolator.__interpolate__(_raw_template,
|
||||
_parsed_fields,
|
||||
_field_values)
|
||||
|
||||
Whitespace would be permitted between the interpolator name and the opening
|
||||
quote, but not required in most cases.
|
||||
|
||||
The ``str`` builtin type would gain an ``__interpolate__`` implementation that
|
||||
supported the following ``str.format`` based semantics::
|
||||
|
||||
>>> import datetime
|
||||
>>> name = 'Jane'
|
||||
>>> age = 50
|
||||
>>> anniversary = datetime.date(1991, 10, 12)
|
||||
>>> i'My name is $name, my age next year is ${age+1}, my anniversary is ${anniversary:%A, %B %d, %Y}.'
|
||||
>>> !str'My name is $name, my age next year is ${age+1}, my anniversary is ${anniversary:%A, %B %d, %Y}.'
|
||||
'My name is Jane, my age next year is 51, my anniversary is Saturday, October 12, 1991.'
|
||||
>>> i'She said her name is ${name!r}.'
|
||||
>>> !str'She said her name is ${name!r}.'
|
||||
"She said her name is 'Jane'."
|
||||
|
||||
This PEP also proposes the introduction of three new builtin functions,
|
||||
``__interpolate__``, ``__interpolateb__`` and ``__interpolateu__``, which
|
||||
implement key aspects of the interpolation process, and may be overridden in
|
||||
accordance with the usual mechanisms for shadowing builtin functions.
|
||||
The interpolation prefix could be used with single-quoted, double-quoted and
|
||||
triple quoted strings. It may also be used with raw strings, but in that case
|
||||
whitespace would be required between the interpolator name and the trailing
|
||||
string.
|
||||
|
||||
This PEP does not propose to remove or deprecate any of the existing
|
||||
string formatting mechanisms, as those will remain valuable when formatting
|
||||
strings that are not present directly in the source code of the application.
|
||||
|
||||
The key aim of this PEP that isn't inherited from PEP 498 is to help ensure
|
||||
that future Python applications are written in a "translation ready" way, where
|
||||
many interface strings that may need to be translated to allow an application
|
||||
to be used in multiple languages are flagged as a natural consequence of the
|
||||
development process, even though they won't be translated by default.
|
||||
|
||||
|
||||
Rationale
|
||||
=========
|
||||
|
||||
PEP 498 makes interpolating values into strings with full access to Python's
|
||||
lexical namespace semantics simpler, but it does so at the cost of introducing
|
||||
yet another string interpolation syntax.
|
||||
yet another string interpolation syntax, and also creates a situation where
|
||||
interpolating values into sensitive targets like SQL queries, shell commands
|
||||
and HTML templates will enjoy a much cleaner syntax when handled without
|
||||
regard for code injection attacks than when they are handled correctly.
|
||||
|
||||
This PEP proposes to handle the latter issue by always specifying an explicit
|
||||
interpolator for interpolation operations, and the former by adopting the
|
||||
``string.Template`` substitution syntax defined in PEP 292.
|
||||
|
||||
The interpolation syntax devised for PEP 292 is deliberately simple so that the
|
||||
template strings can be extracted into an il8n message catalog, and passed to
|
||||
template strings can be extracted into an i18n message catalog, and passed to
|
||||
translators who may not themselves be developers. For these use cases, it is
|
||||
important that the interpolation syntax be as simple as possible, as the
|
||||
translators are responsible for preserving the substitution markers, even as
|
||||
|
@ -77,31 +107,35 @@ introduced for general purpose string formatting in PEP 3101, so this PEP adds
|
|||
that flexibility to the ``${ref}`` construct in PEP 292, and allows translation
|
||||
tools the option of rejecting usage of that more advanced syntax at runtime,
|
||||
rather than categorically rejecting it at compile time. The proposed permitted
|
||||
expressions inside ``${ref}`` are exactly as defined in PEP 498.
|
||||
expressions, conversion specifiers, and format specifiers inside ``${ref}`` are
|
||||
exactly as defined in PEP 498.
|
||||
|
||||
The specific proposal in this PEP is also deliberately close in both syntax
|
||||
and semantics to the general purpose interpolation syntax introduced to
|
||||
JavaScript in ES6, as we can reasonably expect a great many Python developers to be
|
||||
regularly switching back and forth between user interface code written in
|
||||
JavaScript and core application code written in Python.
|
||||
|
||||
|
||||
Specification
|
||||
=============
|
||||
|
||||
In source code, i-strings are string literals that are prefixed by the
|
||||
letter 'i'. The string will be parsed into its components at compile time,
|
||||
which will then be passed to the new ``__interpolate__`` builtin at runtime.
|
||||
In source code, interpolation expressions are introduced by the new character
|
||||
``!``. This is a new kind of expression, consisting of::
|
||||
|
||||
The 'i' prefix may be combined with 'b', where the 'i' must appear first, in
|
||||
which case ``__interpolateb__`` will be called rather than ``__interpolate__``.
|
||||
Similarly, 'i' may also be combined with 'u' to call ``__interpolateu__``
|
||||
rather than ``__interpolate__``.
|
||||
!DOTTED_NAME TEMPLATE_STRING
|
||||
|
||||
The 'i' prefix may also be combined with 'r', with or without 'b' or 'u', to
|
||||
produce raw i-strings. This disables backslash escape sequences in the string
|
||||
literal as usual, but has no effect on the runtime interpolation behaviour.
|
||||
Similar to ``yield`` expressions, this construct can be used without
|
||||
parentheses as a standalone expression statement, as the sole expression on the
|
||||
right hand side of an assignment or return statement, and as the sole argument
|
||||
to a function. In other situations, it requires containing parentheses to avoid
|
||||
ambiguity.
|
||||
|
||||
In all cases, the only permitted location for the 'i' prefix is before all other
|
||||
prefix characters - it indicates a runtime operation, which is largely
|
||||
independent of the compile time prefixes (aside from calling different
|
||||
interpolation functions when combined with 'b' or 'u').
|
||||
The template string must be a Unicode string (byte strings are not permitted),
|
||||
and string literal concatenation operates as normal within the template string
|
||||
component of the expression.
|
||||
|
||||
i-strings are parsed into literals and expressions. Expressions
|
||||
The template string is parsed into literals and expressions. Expressions
|
||||
appear as either identifiers prefixed with a single "$" character, or
|
||||
surrounded by a leading '${' and a trailing '}'. The parts of the format string
|
||||
that are not expressions are separated out as string literals.
|
||||
|
@ -110,63 +144,68 @@ While parsing the string, any doubled ``$$`` is replaced with a single ``$``
|
|||
and is considered part of the literal text, rather than as introducing an
|
||||
expression.
|
||||
|
||||
These components are then organised into 3 parallel tuples:
|
||||
These components are then organised into a tuple of tuples, and passed to the
|
||||
``__interpolate__`` method of the interpolator identified by the given
|
||||
name::
|
||||
|
||||
* parsed format string fields
|
||||
* expression text
|
||||
* expression values
|
||||
DOTTED_NAME.__interpolate__(TEMPLATE_STRING,
|
||||
<parsed_fields>,
|
||||
<field_values>)
|
||||
|
||||
And then passed to the ``__interpolate__`` builtin at runtime::
|
||||
The template string field tuple is inspired by the interface of
|
||||
``string.Formatter.parse``, and consists of a series of 5-tuples each
|
||||
containing:
|
||||
|
||||
__interpolate__(fields, expressions, values)
|
||||
* a leading string literal (may be the empty string)
|
||||
* the substitution field position (zero-based enumeration)
|
||||
* the substitution expression text
|
||||
* the substitution conversion specifier (as defined by str.format)
|
||||
* the substitution format specifier (as defined by str.format)
|
||||
|
||||
The format string field tuple is inspired by the interface of
|
||||
``string.Formatter.parse``, and consists of a series of 4-tuples each containing
|
||||
a leading literal, together with a trailing field number, format specifier,
|
||||
and conversion specifier. If a given substition field has no leading literal
|
||||
section, format specifier or conversion specifier, then the corresponding
|
||||
elements in the tuple are the empty string. If the final part of the string
|
||||
has no trailing substitution field, then the field number, format specifier
|
||||
If a given substitution field has no leading literal section, format specifier
|
||||
or conversion specifier, then the corresponding elements in the tuple are the
|
||||
empty string. If the final part of the string has no trailing substitution
|
||||
field, then the field position, expression text, conversion specifier
|
||||
and format specifier will all be ``None``.
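The 5-tuple layout described above can be approximated in current Python. This is a minimal sketch only: ``parse_template`` is a hypothetical helper name, and the regular expressions are an assumption of this example rather than the parsing rules of the PEP. It does not handle nested braces, or ``:`` and ``!r`` style sequences that belong to the expression itself (slices, string literals and so on), and it silently leaves a stray ``$`` in the literal text rather than raising ``SyntaxError``.

```python
import re

# One substitution marker: "$$", "$name" or "${...}".
_FIELD = re.compile(r"""
    \$(?:
        (?P<escaped>\$)                      # doubled dollar sign
      | (?P<name>[A-Za-z_][A-Za-z0-9_]*)     # bare identifier
      | \{(?P<braced>[^{}]*)\}               # braced field, no nesting
    )
""", re.VERBOSE)

# Split "expr!conv:spec"; the conversion and the spec are both optional.
_BRACED = re.compile(
    r"(?P<expr>.*?)(?:!(?P<conv>[rsa]))?(?::(?P<spec>.*))?", re.DOTALL
)


def parse_template(template):
    """Return a tuple of (literal, position, expr, conversion, spec)."""
    fields = []
    literal = []
    pos = 0
    field_num = 0
    for match in _FIELD.finditer(template):
        literal.append(template[pos:match.start()])
        pos = match.end()
        if match.group("escaped"):
            literal.append("$")  # "$$" stays in the literal text as "$"
            continue
        if match.group("name") is not None:
            expr, conv, spec = match.group("name"), "", ""
        else:
            parts = _BRACED.fullmatch(match.group("braced"))
            expr = parts.group("expr")
            conv = parts.group("conv") or ""
            spec = parts.group("spec") or ""
        fields.append(("".join(literal), field_num, expr, conv, spec))
        literal = []
        field_num += 1
    literal.append(template[pos:])
    trailing = "".join(literal)
    if trailing:
        # Assumption: a trailing entry is only emitted for non-empty text.
        fields.append((trailing, None, None, None, None))
    return tuple(fields)


parsed = parse_template("Substitute $names and ${expressions} at runtime")
for entry in parsed:
    print(entry)
```

In the proposal this work happens at compile time, so an interpolator never needs a helper like this; the sketch is only a way to experiment with the tuple layout.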
|
||||
|
||||
The expression text is simply the text of each interpolated expression, as it
|
||||
appeared in the original string, but without the leading and/or surrounding
|
||||
expression markers.
|
||||
|
||||
The expression values are the result of evaluating the interpolated expressions
|
||||
in the exact runtime context where the i-string appears in the source code.
|
||||
The substitution field values tuple is created by evaluating the interpolated
|
||||
expressions in the exact runtime context where the interpolation expression
|
||||
appears in the source code.
|
||||
|
||||
For the following example i-string::
|
||||
For the following example interpolation expression::
|
||||
|
||||
i'abc${expr1:spec1}${expr2!r:spec2}def${expr3:!s}ghi $ident $$jkl'``,
|
||||
!str 'abc${expr1:spec1}${expr2!r:spec2}def${expr3!s}ghi $ident $$jkl'
|
||||
|
||||
the fields tuple would be::
|
||||
the parsed fields tuple would be::
|
||||
|
||||
(
|
||||
('abc', 0, 'spec1', ''),
|
||||
('', 1, 'spec2' 'r'),
|
||||
(def', 2, '', 's'),
|
||||
('ghi', 3, '', ''),
|
||||
('$jkl', None, None, None)
|
||||
('abc', 0, 'expr1', '', 'spec1'),
|
||||
('', 1, 'expr2', 'r', 'spec2'),
|
||||
('def', 2, 'expr3', 's', ''),
|
||||
('ghi ', 3, 'ident', '', ''),
|
||||
(' $jkl', None, None, None, None)
|
||||
)
|
||||
|
||||
For the same example, the expression text and value tuples would be::
|
||||
While the field values tuple would be::
|
||||
|
||||
('expr1', 'expr2', 'expr3', 'ident') # Expression text
|
||||
(expr1, expr2, expr2, ident) # Expression values
|
||||
(expr1, expr2, expr3, ident)
|
||||
|
||||
The fields and expression text tuples can be constant folded at compile time,
|
||||
while the expression values tuple will always need to be constructed at runtime.
|
||||
The parsed fields tuple can be constant folded at compile time, while the
|
||||
expression values tuple will always need to be constructed at runtime.
|
||||
|
||||
The default ``__interpolate__`` implementation would have the following
|
||||
The ``str.__interpolate__`` implementation would have the following
|
||||
semantics, with field processing being defined in terms of the ``format``
|
||||
builtin and ``str.format`` conversion specifiers::
|
||||
|
||||
_converter = string.Formatter().convert_field
|
||||
|
||||
def __interpolate__(fields, expressions, values):
|
||||
def __interpolate__(raw_template, fields, values):
|
||||
template_parts = []
|
||||
for leading_text, field_num, format_spec, conversion in fields:
|
||||
for leading_text, field_num, expr, conversion, format_spec in fields:
|
||||
template_parts.append(leading_text)
|
||||
if field_num is not None:
|
||||
value = values[field_num]
|
||||
|
@ -176,167 +215,162 @@ builtin and ``str.format`` conversion specifiers::
|
|||
template_parts.append(field_str)
|
||||
return "".join(template_parts)
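The ``str``-style semantics can be tried out today by writing the parsed fields by hand. The following is a sketch under assumptions: ``str_interpolate`` is a hypothetical stand-in for ``str.__interpolate__``, and the conversion and formatting steps are one plausible reading of "defined in terms of the ``format`` builtin and ``str.format`` conversion specifiers", not necessarily the exact lines of the proposed implementation.

```python
import string

_converter = string.Formatter().convert_field


def str_interpolate(raw_template, fields, values):
    """Render parsed fields with format() and str.format conversions."""
    template_parts = []
    for leading_text, field_num, expr, conversion, format_spec in fields:
        template_parts.append(leading_text)
        if field_num is not None:
            value = values[field_num]
            if conversion:
                # Apply !r, !s or !a before formatting, as str.format does.
                value = _converter(value, conversion)
            template_parts.append(format(value, format_spec))
    return "".join(template_parts)


name, age = "Jane", 50
raw = "She said her name is ${name!r} and her age next year is ${age+1:03d}."
# Hand-written equivalent of what the compiler would pass in.
fields = (
    ("She said her name is ", 0, "name", "r", ""),
    (" and her age next year is ", 1, "age+1", "", "03d"),
    (".", None, None, None, None),
)
result = str_interpolate(raw, fields, (name, age + 1))
print(result)
```

Note that this interpolator ignores both the raw template and the expression text; they are passed so that other interpolators (such as translation tools) can make use of them.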
|
||||
|
||||
The default ``__interpolateu__`` implementation would be the
|
||||
``__interpolate__`` builtin.
|
||||
Writing custom interpolators
|
||||
----------------------------
|
||||
|
||||
The default ``__interpolateb__`` implementation would be defined in terms of
|
||||
the binary mod-formatting reintroduced in PEP 461::
|
||||
To simplify the process of writing custom interpolators, it is proposed to add
|
||||
a new builtin decorator, ``interpolator``, which would be defined as::
|
||||
|
||||
def __interpolateb__(fields, expressions, values):
|
||||
template_parts = []
|
||||
for leading_data, field_num, format_spec, conversion in fields:
|
||||
template_parts.append(leading_data)
|
||||
if field_num is not None:
|
||||
if conversion:
|
||||
raise ValueError("Conversion specifiers not supported "
|
||||
"in default binary interpolation")
|
||||
value = values[field_num]
|
||||
field_data = ("%" + format_spec) % (value,)
|
||||
template_parts.append(field_data)
|
||||
return b"".join(template_parts)
|
||||
def interpolator(f):
|
||||
f.__interpolate__ = f.__call__
|
||||
return f
|
||||
|
||||
This definition permits examples like the following::
|
||||
This allows new interpolators to be written as::
|
||||
|
||||
>>> data = 10
|
||||
>>> ib'$data'
|
||||
b'10'
|
||||
>>> b'${data:%4x}'
|
||||
b' a'
|
||||
>>> b'${data:#4x}'
|
||||
b' 0xa'
|
||||
>>> b'${data:04X}'
|
||||
b'000A'
|
||||
@interpolator
|
||||
def my_custom_interpolator(raw_template, parsed_fields, field_values):
|
||||
...
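As an illustration of the decorator in use, the sketch below defines a hypothetical ``debug`` interpolator and calls it the way the desugared syntax would. The ``debug`` name, its output format, and the hand-written field tuple are all assumptions made for this example.

```python
def interpolator(f):
    # Same definition as proposed above: expose the callable itself
    # under the __interpolate__ name.
    f.__interpolate__ = f.__call__
    return f


@interpolator
def debug(raw_template, parsed_fields, field_values):
    """Hypothetical interpolator: show each expression next to its repr."""
    parts = []
    for leading_text, field_num, expr, conversion, format_spec in parsed_fields:
        parts.append(leading_text)
        if field_num is not None:
            parts.append(expr + "=" + repr(field_values[field_num]))
    return "".join(parts)


x = 4
# Hand-written equivalent of: !debug "x squared: ${x**2}"
result = debug.__interpolate__(
    "x squared: ${x**2}",
    (("x squared: ", 0, "x**2", "", ""),),
    (x ** 2,),
)
print(result)
```

Because the decorated object is still an ordinary function, it can also be called directly with the same three arguments.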
|
||||
|
||||
|
||||
Expression evaluation
|
||||
---------------------
|
||||
|
||||
The expressions that are extracted from the string are evaluated in
|
||||
the context where the i-string appeared. This means the expression has
|
||||
full access to local, nonlocal and global variables. Any valid Python
|
||||
expression can be used inside ``${}``, including function and method calls.
|
||||
References without the surrounding braces are limited to looking up single
|
||||
identifiers.
|
||||
The subexpressions that are extracted from the interpolation expression are
|
||||
evaluated in the context where the interpolation expression appears. This means
|
||||
the expression has full access to local, nonlocal and global variables. Any
|
||||
valid Python expression can be used inside ``${}``, including function and
|
||||
method calls. References without the surrounding braces are limited to looking
|
||||
up single identifiers.
|
||||
|
||||
Because the i-strings are evaluated where the string appears in the
|
||||
source code, there is no additional expressiveness available with
|
||||
i-strings. There are also no additional security concerns: you could
|
||||
have also just written the same expression, not inside of an
|
||||
i-string::
|
||||
Because the substitution expressions are evaluated where the string appears in
|
||||
the source code, there are no additional security concerns related to the
|
||||
contents of the expression itself, as you could have also just written the
|
||||
same expression and used runtime field parsing::
|
||||
|
||||
>>> bar=10
|
||||
>>> def foo(data):
|
||||
... return data + 20
|
||||
...
|
||||
>>> i'input=$bar, output=${foo(bar)}'
|
||||
>>> !str 'input=$bar, output=${foo(bar)}'
|
||||
'input=10, output=30'
|
||||
|
||||
Is equivalent to::
|
||||
Is essentially equivalent to::
|
||||
|
||||
>>> 'input={}, output={}'.format(bar, foo(bar))
|
||||
'input=10, output=30'
|
||||
|
||||
Format specifiers
|
||||
-----------------
|
||||
Handling code injection attacks
|
||||
-------------------------------
|
||||
|
||||
Format specifiers are not interpreted by the i-string parser - that is
|
||||
handling at runtime by the called interpolation function.
|
||||
The proposed interpolation expressions make it potentially attractive to write
|
||||
code like the following::
|
||||
|
||||
Concatenating strings
|
||||
---------------------
|
||||
myquery = !str "SELECT $column FROM $table;"
|
||||
mycommand = !str "cat $filename"
|
||||
mypage = !str "<html><body>$content</body></html>"
|
||||
|
||||
As i-strings are shorthand for a runtime builtin function call, implicit
|
||||
concatenation is a syntax error (similar to attempting implicit concatenation
|
||||
between bytes and str literals)::
|
||||
These all represent potential vectors for code injection attacks, if any of the
|
||||
variables being interpolated happen to come from an untrusted source. The
|
||||
specific proposal in this PEP is designed to make it straightforward to write
|
||||
use case specific interpolators that take care of quoting interpolated values
|
||||
appropriately for the relevant security context::
|
||||
|
||||
>>> i"interpolated" "not interpolated"
|
||||
File "<stdin>", line 1
|
||||
SyntaxError: cannot mix interpolation call with plain literal
|
||||
myquery = !sql "SELECT $column FROM $table;"
|
||||
mycommand = !sh "cat $filename"
|
||||
mypage = !html "<html><body>$content</body></html>"
|
||||
|
||||
This PEP does not cover adding such interpolators to the standard library,
|
||||
but instead ensures they can be readily provided by third party libraries.
|
||||
|
||||
(Although it's tempting to propose adding __interpolate__ implementations to
|
||||
``subprocess.call``, ``subprocess.check_call`` and ``subprocess.check_output``)
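A third party shell interpolator along these lines might look like the following sketch. The ``sh`` class is hypothetical, and the choice of ``shlex.quote`` and the decision to reject conversion and format specifiers are assumptions of this example rather than part of the proposal.

```python
import shlex


class sh:
    """Hypothetical interpolator that shell-quotes every substituted value."""

    @staticmethod
    def __interpolate__(raw_template, parsed_fields, field_values):
        parts = []
        for leading_text, field_num, expr, conversion, format_spec in parsed_fields:
            parts.append(leading_text)
            if field_num is None:
                continue
            if conversion or format_spec:
                raise ValueError("specifiers are not supported by this sketch")
            # Quote the value so it is treated as a single shell word.
            parts.append(shlex.quote(str(field_values[field_num])))
        return "".join(parts)


filename = "notes.txt; rm -rf ~"
# Hand-written equivalent of: mycommand = !sh "cat $filename"
mycommand = sh.__interpolate__(
    "cat $filename",
    (("cat ", 0, "filename", "", ""),),
    (filename,),
)
print(mycommand)
```

The literal parts of the template are trusted because they come from the source code, while every interpolated value is quoted regardless of where it came from.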
|
||||
|
||||
Format and conversion specifiers
|
||||
--------------------------------
|
||||
|
||||
Aside from separating them out from the substitution expression, format and
|
||||
conversion specifiers are otherwise treated as opaque strings by the
|
||||
interpolation template parser - assigning semantics to those (or, alternatively,
|
||||
prohibiting their use) is handled at runtime by the specified interpolator.
|
||||
|
||||
Error handling
|
||||
--------------
|
||||
|
||||
Either compile time or run time errors can occur when processing
|
||||
i-strings. Compile time errors are limited to those errors that can be
|
||||
detected when parsing an i-string into its component tuples. These errors all
|
||||
raise SyntaxError.
|
||||
Either compile time or run time errors can occur when processing interpolation
|
||||
expressions. Compile time errors are limited to those errors that can be
|
||||
detected when parsing a template string into its component tuples. These
|
||||
errors all raise SyntaxError.
|
||||
|
||||
Unmatched braces::
|
||||
|
||||
>>> i'x=${x'
|
||||
>>> !str 'x=${x'
|
||||
File "<stdin>", line 1
|
||||
SyntaxError: missing '}' in interpolation expression
|
||||
|
||||
Invalid expressions::
|
||||
|
||||
>>> i'x=${!x}'
|
||||
>>> !str 'x=${!x}'
|
||||
File "<fstring>", line 1
|
||||
!x
|
||||
^
|
||||
SyntaxError: invalid syntax
|
||||
|
||||
Run time errors occur when evaluating the expressions inside an
|
||||
i-string. See PEP 498 for some examples.
|
||||
template string. See PEP 498 for some examples.
|
||||
|
||||
Different interpolation functions may also impose additional runtime
|
||||
Different interpolators may also impose additional runtime
|
||||
constraints on acceptable interpolated expressions and other formatting
|
||||
details, which will be reported as runtime exceptions.
|
||||
|
||||
Leading whitespace in expressions is not skipped
|
||||
------------------------------------------------
|
||||
|
||||
Unlike PEP 498, leading whitespace in expressions doesn't need to be skipped -
|
||||
'$' is not a legal character in Python's syntax, so it can't appear inside
|
||||
a ``${}`` field except as part of another string, whether interpolated or not.
|
||||
|
||||
|
||||
Internationalising interpolated strings
|
||||
=======================================
|
||||
|
||||
So far, this PEP has said nothing practical about internationalisation - only
|
||||
formatting text using either str.format or bytes.__mod__ semantics depending
|
||||
on whether or not a str or bytes object is being interpolated.
|
||||
Since this PEP derives its interpolation syntax from the internationalisation
|
||||
focused PEP 292, it's worth considering the potential implications this PEP
|
||||
may have for the internationalisation use case.
|
||||
|
||||
Internationalisation enters the picture by overriding the ``__interpolate__``
|
||||
builtin on a module-by-module basis. For example, the following implementation
|
||||
would delegate interpolation calls to string.Template::
|
||||
Internationalisation enters the picture by writing a custom interpolator that
|
||||
performs internationalisation. For example, the following implementation
|
||||
would delegate interpolation calls to ``string.Template``::
|
||||
|
||||
def _interpolation_fields_to_template(fields, expressions):
|
||||
if not all(expr.isidentifier() for expr in expressions):
|
||||
raise ValueError("Only variable substitions permitted for il8n")
|
||||
template_parts = []
|
||||
for literal_text, field_num, format_spec, conversion in fields:
|
||||
if format_spec:
|
||||
raise ValueError("Format specifiers not permitted for il8n")
|
||||
if conversion:
|
||||
raise ValueError("Conversion specifiers not permitted for il8n")
|
||||
template_parts.append(literal_text)
|
||||
if field_num is not None:
|
||||
template_parts.append("${" + expressions[field_num] + "}")
|
||||
return "".join(template_parts)
|
||||
|
||||
def __interpolate__(fields, expressions, values):
|
||||
catalog_str = _interpolation_fields_to_template(fields, expressions)
|
||||
translated = _(catalog_str)
|
||||
values = {k:v for k, v in zip(expressions, values)}
|
||||
@interpolator
|
||||
def i18n(template, fields, values):
|
||||
translated = gettext.gettext(template)
|
||||
values = _build_interpolation_map(fields, values)
|
||||
return string.Template(translated).safe_substitute(values)
|
||||
|
||||
If a module were to import that definition of __interpolate__ into the
|
||||
module namespace, then:
|
||||
def _build_interpolation_map(fields, values):
|
||||
field_values = {}
|
||||
for literal_text, field_num, expr, conversion, format_spec in fields:
|
||||
if field_num is not None:
|
||||
assert expr.isidentifier() and not conversion and not format_spec
|
||||
field_values[expr] = values[field_num]
|
||||
return field_values
|
||||
|
||||
* Any i"translated & interpolated" strings would be translated
|
||||
* Any iu"untranslated & interpolated" strings would not be translated
|
||||
* Any ib"untranslated & interpolated" strings would not be translated
|
||||
* Any other string and bytes literals would not be translated unless explicitly
|
||||
passed to the relevant translation machinery at runtime
|
||||
And would then be invoked as::
|
||||
|
||||
This shifts the behaviour from the status quo, where translation support needs
|
||||
to be added explicitly to each string requiring translation to one where
|
||||
opting *in* to translation is done on a module by module basis, and
|
||||
individual interpolated strings can then be opted *out* of translation by
|
||||
adding the "u" prefix to the string literal in order to call
|
||||
``__interpolateu__`` instead of ``__interpolate__``.
|
||||
print(!i18n "This is a $translated $message")
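The same flow can be exercised in current Python with the fields written out by hand. In this sketch the ``_CATALOG`` dictionary and ``_translate`` function are hypothetical stand-ins for a real ``gettext`` catalog, and the validation raises ``ValueError`` rather than using ``assert``; both are assumptions of this example.

```python
import string

# Hypothetical in-memory catalog standing in for gettext.gettext.
_CATALOG = {
    "This is a $translated $message": "Ceci est un $message $translated",
}


def _translate(template):
    return _CATALOG.get(template, template)


def _build_interpolation_map(fields, values):
    field_values = {}
    for literal_text, field_num, expr, conversion, format_spec in fields:
        if field_num is None:
            continue  # trailing literal text has no field to validate
        if not expr.isidentifier() or conversion or format_spec:
            raise ValueError("only plain $name substitutions are allowed")
        field_values[expr] = values[field_num]
    return field_values


def i18n(template, fields, values):
    translated = _translate(template)
    mapping = _build_interpolation_map(fields, values)
    return string.Template(translated).safe_substitute(mapping)


translated, message = "traduit", "message"
# Hand-written equivalent of: !i18n "This is a $translated $message"
fields = (
    ("This is a ", 0, "translated", "", ""),
    (" ", 1, "message", "", ""),
)
result = i18n("This is a $translated $message", fields, (translated, message))
print(result)
```

The catalog lookup uses the raw template as the key, which is why the translated string is free to reorder the substitution markers.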
|
||||
|
||||
Any actual implementation would need to address other issues (most notably
|
||||
message catalog extraction), but this gives the general idea of what might be
|
||||
possible.
|
||||
|
||||
It's also worth noting that one of the benefits of the ``$`` based substitution
|
||||
syntax in this PEP is its compatibility with Mozilla's
|
||||
`l20n syntax <http://l20n.org/>`__, which uses ``{{ name }}`` for global
|
||||
substitution, and ``{{ $user }}`` for local context substitution.
|
||||
|
||||
With the syntax in this PEP, an l20n interpolator could be written as::
|
||||
|
||||
translated = !l20n "{{ $user }} is running {{ appname }}"
|
||||
|
||||
With the syntax proposed in PEP 498 (and neglecting the difficulty of doing
|
||||
catalog lookups using PEP 498's semantics), the necessary brace escaping would
|
||||
make the string look like this in order to interpolate the user variable
|
||||
while preserving all of the expected braces::
|
||||
|
||||
interpolated = "{{{{ ${user} }}}} is running {{{{ appname }}}}"
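The effect of the brace doubling can be checked with ``str.format``, which uses the same ``{{`` and ``}}`` escapes as PEP 498. This is only an illustration of the escaping rule, with a made-up value for ``user``; it does not perform any catalog lookup.

```python
user = "alice"
template = "{{{{ ${user} }}}} is running {{{{ appname }}}}"
# Each "{{" or "}}" pair collapses to a single literal brace.
rendered = template.format(user=user)
print(rendered)
```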
|
||||
|
||||
Discussion
|
||||
==========
|
||||
|
@ -344,19 +378,42 @@ Discussion
|
|||
Refer to PEP 498 for additional discussion, as several of the points there
|
||||
also apply to this PEP.
|
||||
|
||||
Preserving the unmodified format string
|
||||
---------------------------------------
|
||||
Compatibility with IPython magic strings
|
||||
----------------------------------------
|
||||
|
||||
A lot of the complexity in the il8n example is actually in recreating the
|
||||
original format string from its component parts. It may make sense to preserve
|
||||
and pass that entire string to the interpolation function, in addition to
|
||||
the broken down field definitions.
|
||||
IPython uses "!" to introduce custom interactive constructs. These are only
|
||||
used at statement level, and could continue to be special cased in the
|
||||
IPython runtime.
|
||||
|
||||
This approach would also allow translators to more consistently benefit from
|
||||
the simplicity of the PEP 292 approach to string formatting (in the example
|
||||
above, surrounding braces are added to the catalog strings even for cases that
|
||||
don't need them)
|
||||
This existing usage *did* help inspire the syntax proposed in this PEP.
|
||||
|
||||
Preserving the raw template string
|
||||
----------------------------------
|
||||
|
||||
Earlier versions of this PEP failed to make the raw template string available
|
||||
to interpolators. This greatly complicated the i18n example, as it needed to
|
||||
reconstruct the original template to pass to the message catalog lookup.
|
||||
|
||||
Using a magic method rather than a global name lookup
|
||||
-----------------------------------------------------
|
||||
|
||||
Earlier versions of this PEP used an ``__interpolate__`` builtin, rather than
|
||||
a magic method on an explicitly named interpolator. Naming the interpolator
|
||||
eliminated a lot of the complexity otherwise associated with shadowing the
|
||||
builtin function in order to modify the semantics of interpolation.
|
||||
|
||||
Relative order of conversion and format specifier in parsed fields
|
||||
------------------------------------------------------------------
|
||||
|
||||
The relative order of the conversion specifier and the format specifier in the
|
||||
substitution field 5-tuple is defined to match the order they appear in the
|
||||
format string, which is unfortunately the inverse of the way they appear in the
|
||||
``string.Formatter.parse`` 4-tuple.
|
||||
|
||||
I consider this a design defect in ``string.Formatter.parse``, so I think it's
|
||||
worth fixing it for the custom interpolator API, since the tuple already
|
||||
has other differences (like including both the field position number *and* the
|
||||
text of the expression).
|
||||
|
||||
References
|
||||
==========
|
||||
|
|