Added inline literal markup.

Eric V. Smith 2015-08-29 16:20:18 -04:00
parent 9143b2852d
commit 8546af530e
1 changed files with 135 additions and 127 deletions

@ -14,10 +14,10 @@ Abstract
========
Python supports multiple ways to format text strings. These include
%-formatting [#]_, str.format [#]_, and string.Template [#]_. Each of
these methods has its advantages, but in addition has
disadvantages that make it cumbersome to use in practice. This PEP
proposes to add a new string formatting mechanism: Literal String
%-formatting [#]_, ``str.format()`` [#]_, and ``string.Template``
[#]_. Each of these methods has its advantages, but in addition
has disadvantages that make it cumbersome to use in practice. This
PEP proposes to add a new string formatting mechanism: Literal String
Formatting. In this PEP, such strings will be referred to as
"f-strings", taken from the leading character used to denote such
strings.
@ -43,8 +43,8 @@ their values. Some examples are::
A similar feature was proposed in PEP 215 [#]_. PEP 215 proposed to
support a subset of Python expressions, and did not support the
type-specific string formatting (the __format__ method) which was
introduced with PEP 3101 [#]_.
type-specific string formatting (the ``__format__()`` method) which
was introduced with PEP 3101 [#]_.
Rationale
=========
@ -75,16 +75,16 @@ To be defensive, the following code should be used::
>>> 'error: %s' % (msg,)
"error: ('disk failure', 32)"
str.format() was added to address some of these problems with
``str.format()`` was added to address some of these problems with
%-formatting. In particular, it uses normal function call syntax (and
therefore supports multiple parameters) and it is extensible through
the __format__() method on the object being converted to a string. See
PEP 3101 for a detailed rationale. This PEP reuses much of the
str.format() syntax and machinery, in order to provide continuity with
an existing Python string formatting mechanism.
the ``__format__()`` method on the object being converted to a
string. See PEP 3101 for a detailed rationale. This PEP reuses much of
the ``str.format()`` syntax and machinery, in order to provide
continuity with an existing Python string formatting mechanism.
However, str.format() is not without its issues. Chief among them is
its verbosity. For example, the text 'value' is repeated here::
However, ``str.format()`` is not without its issues. Chief among them
is its verbosity. For example, the text ``value`` is repeated here::
>>> value = 4 * 20
>>> 'The value is {value}.'.format(value=value)
@ -105,20 +105,21 @@ With an f-string, this becomes::
f-strings provide a concise, readable way to include the value of
Python expressions inside strings.
In this sense, string.Template and %-formatting have similar
shortcomings to str.format(), but also support fewer formatting
options. In particular, they do not support the __format__ protocol,
so that there is no way to control how a specific object is converted
to a string, nor can it be extended to additional types that want to
control how they are converted to strings (such as Decimal and
datetime). This example is not possible with string.Template::
In this sense, ``string.Template`` and %-formatting have similar
shortcomings to ``str.format()``, but also support fewer formatting
options. In particular, they do not support the ``__format__``
protocol, so that there is no way to control how a specific object is
converted to a string, nor can it be extended to additional types that
want to control how they are converted to strings (such as ``Decimal``
and ``datetime``). This example is not possible with
``string.Template``::
>>> value = 1234
>>> f'input={value:#06x}'
'input=0x04d2'
And neither %-formatting nor string.Template can control formatting
such as::
And neither %-formatting nor ``string.Template`` can control
formatting such as::
>>> date = datetime.date(1991, 10, 12)
>>> f'{date} was on a {date:%A}'
@ -183,38 +184,38 @@ corresponding single brace. Doubled opening braces do not signify the
start of an expression.
Following the expression, an optional type conversion may be
specified. The allowed conversions are '!s', '!r', or '!a'. These are
treated the same as in str.format: '!s' calls str() on the expression,
'!r' calls repr() on the expression, and '!a' calls ascii() on the
expression. These conversions are applied before the call to
__format__. The only reason to use '!s' is if you want to specify a
format specifier that applies to str, not to the type of the
expression.
specified. The allowed conversions are ``'!s'``, ``'!r'``, or
``'!a'``. These are treated the same as in ``str.format()``: ``'!s'``
calls ``str()`` on the expression, ``'!r'`` calls ``repr()`` on the
expression, and ``'!a'`` calls ``ascii()`` on the expression. These
conversions are applied before the call to ``__format__``. The only
reason to use ``'!s'`` is if you want to specify a format specifier
that applies to ``str``, not to the type of the expression.
Similar to str.format, optional format specifiers may be included
inside the f-string, separated from the expression (or the type
conversion, if specified) by a colon. If a format specifier is not
provided, an empty string is used.
Similar to ``str.format()``, optional format specifiers may be
included inside the f-string, separated from the expression (or the
type conversion, if specified) by a colon. If a format specifier is
not provided, an empty string is used.
So, an f-string looks like::
f ' <text> { <expression> <optional !s, !r, or !a> <optional : format specifier> } text ... '
The resulting expression's __format__ method is called with the format
specifier. The resulting value is used when building the value of the
f-string.
The resulting expression's ``__format__`` method is called with the
format specifier. The resulting value is used when building the value
of the f-string.
Expressions cannot contain ':' or '!' outside of strings or parens,
brackets, or braces. The exception is that the '!=' operator is
special cased.
Expressions cannot contain ``':'`` or ``'!'`` outside of strings or
parens, brackets, or braces. The exception is that the ``'!='``
operator is allowed as a special case.
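As a minimal sketch of how the conversion and the format specifier interact, the following uses illustrative names (``price``, ``name``, ``a``, ``b``) that are assumptions and do not appear in the PEP:

```python
from decimal import Decimal

# Illustrative values; these names are assumptions, not part of the PEP.
price = Decimal('12.34567')
name = 'caf\u00e9'
a, b = 1, 2

# No conversion: the specifier is handed to Decimal.__format__().
rounded = f'{price:.2f}'

# '!s' calls str() first, so the specifier applies to str, not Decimal.
padded = f'{price!s:>10}'

# '!r' and '!a' call repr() and ascii() before any formatting happens.
quoted = f'{name!r}'
escaped = f'{name!a}'

# '!=' is the one place '!' may appear outside strings or brackets.
differs = f'{a != b}'

print(rounded, padded, quoted, escaped, differs, sep='\n')
```

The ``padded`` line is the case the text singles out: ``'>10'`` is a string alignment specifier, and ``'!s'`` guarantees it is interpreted by ``str`` rather than by the type of the expression.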
Escape sequences
----------------
Scanning an f-string for expressions happens after escape sequences
are decoded. Because hex(ord('{')) == '0x7b', the f-string
f'\\u007b4*10}' is decoded to f'{4*10}', which evaluates as the integer
40::
are decoded. Because ``hex(ord('{')) == '0x7b'``, the f-string
``f'\u007b4*10}'`` is decoded to ``f'{4*10}'``, which evaluates as
the integer 40::
>>> f'\u007b4*10}'
'40'
@ -232,8 +233,8 @@ Code equivalence
The exact code used to implement f-strings is not specified. However,
it is guaranteed that any embedded value that is converted to a string
will use that value's __format__ method. This is the same mechanism
that str.format() uses to convert values to strings.
will use that value's ``__format__`` method. This is the same
mechanism that ``str.format()`` uses to convert values to strings.
For example, this code::
@ -269,9 +270,9 @@ Is equivalent to::
'result=20'
After stripping leading and trailing whitespace (see below), the
expression is parsed with the equivalent of ast.parse(expression,
'<fstring>', 'eval') [#]_. Note that this restricts the expression: it
cannot contain any newlines, for example::
expression is parsed with the equivalent of ``ast.parse(expression,
'<fstring>', 'eval')`` [#]_. Note that this restricts the expression:
it cannot contain any newlines, for example::
>>> x = 0
>>> f'''{x
@ -282,7 +283,7 @@ cannot contain any newlines, for example::
SyntaxError: invalid syntax
But note that this works, since the newline is removed from the
string, and the spaces in front of the '1' are allowed in an
string, and the spaces in front of the ``'1'`` are allowed in an
expression::
>>> f'{x+\
@ -302,9 +303,9 @@ code such as::
'result: 12.35'
Once expressions in a format specifier are evaluated (if necessary),
format specifiers are not interpreted by the f-string evaluator. Just as
in str.format(), they are merely passed in to the __format__() method
of the object being formatted.
format specifiers are not interpreted by the f-string evaluator. Just
as in ``str.format()``, they are merely passed in to the
``__format__()`` method of the object being formatted.
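One way to observe this pass-through is a small recording object. The sketch below assumes a hypothetical ``Probe`` class (not part of the PEP or the standard library) whose ``__format__()`` simply stores whatever specifier it is given:

```python
class Probe:
    """Hypothetical helper that records each format specifier it receives."""

    def __init__(self):
        self.seen = []

    def __format__(self, spec):
        self.seen.append(spec)
        return f'<{spec}>'


probe = Probe()
width = 10

# The nested {width} is evaluated first; the rest of the specifier is
# passed to __format__() untouched, even though it means nothing to str.
result = f'{probe:%A|{width}|anything}'

# With no specifier, __format__() receives an empty string.
empty = f'{probe}'

print(result)
print(probe.seen)
```

Under these assumptions the f-string machinery only substitutes ``width``; deciding what ``'%A|10|anything'`` means is left entirely to the object.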
Concatenating strings
---------------------
@ -332,7 +333,7 @@ Error handling
Either compile time or run time errors can occur when processing
f-strings. Compile time errors are limited to those errors that can be
detected when scanning an f-string. These errors all raise
SyntaxError.
``SyntaxError``.
Unmatched braces::
@ -416,45 +417,48 @@ How to denote f-strings
Because the compiler must be involved in evaluating the expressions
contained in the interpolated strings, there must be some way to
denote to the compiler which strings should be evaluated. This PEP
chose a leading 'f' character preceding the string literal. This is
similar to how 'b' and 'r' prefixes change the meaning of the string
itself, at compile time. Other prefixes were suggested, such as 'i'. No
option seemed better than the other, so 'f' was chosen.
chose a leading ``'f'`` character preceding the string literal. This
is similar to how ``'b'`` and ``'r'`` prefixes change the meaning of
the string itself, at compile time. Other prefixes were suggested,
such as ``'i'``. No option seemed better than the other, so ``'f'``
was chosen.
Another option was to support special functions, known to the
compiler, such as Format(). This seems like too much magic for Python:
not only is there a chance for collision with existing identifiers,
the PEP author feels that it's better to signify the magic with a
string prefix character.
compiler, such as ``Format()``. This seems like too much magic for
Python: not only is there a chance for collision with existing
identifiers, the PEP author feels that it's better to signify the
magic with a string prefix character.
How to specify the location of expressions in f-strings
*******************************************************
This PEP supports the same syntax as str.format() for distinguishing
replacement text inside strings: expressions are contained inside
braces. There were other options suggested, such as string.Template's
$identifier or ${expression}.
This PEP supports the same syntax as ``str.format()`` for
distinguishing replacement text inside strings: expressions are
contained inside braces. There were other options suggested, such as
``string.Template``'s ``$identifier`` or ``${expression}``.
While $identifier is no doubt more familiar to shell scripters and
users of some other languages, in Python str.format() is heavily
While ``$identifier`` is no doubt more familiar to shell scripters and
users of some other languages, in Python ``str.format()`` is heavily
used. A quick search of Python's standard library shows only a handful
of uses of string.Template, but hundreds of uses of str.format().
of uses of ``string.Template``, but hundreds of uses of
``str.format()``.
Another proposed alternative was to have the substituted text between
\\{ and } or between \\{ and \\}. While this syntax would probably be
desirable if all string literals were to support interpolation, this
PEP only supports strings that are already marked with the leading
'f'. As such, the PEP is using unadorned braces to denote substituted
text, in order to leverage end user familiarity with str.format().
``\{`` and ``}`` or between ``\{`` and ``\}``. While this syntax
would probably be desirable if all string literals were to support
interpolation, this PEP only supports strings that are already marked
with the leading ``'f'``. As such, the PEP is using unadorned braces
to denote substituted text, in order to leverage end user familiarity
with ``str.format()``.
Supporting full Python expressions
**********************************
Many people on the python-ideas discussion wanted support for either
only single identifiers, or a limited subset of Python expressions
(such as the subset supported by str.format()). This PEP supports full
Python expressions inside the braces. Without full expressions, some
desirable usage would be cumbersome. For example::
(such as the subset supported by ``str.format()``). This PEP supports
full Python expressions inside the braces. Without full expressions,
some desirable usage would be cumbersome. For example::
>>> f'Column={col_idx+1}'
>>> f'number of items: {len(items)}'
@ -484,19 +488,20 @@ Differences between f-string and str.format expressions
-------------------------------------------------------
There is one small difference between the limited expressions allowed
in str.format() and the full expressions allowed inside f-strings. The
difference is in how index lookups are performed. In str.format(),
index values that do not look like numbers are converted to strings::
in ``str.format()`` and the full expressions allowed inside
f-strings. The difference is in how index lookups are performed. In
``str.format()``, index values that do not look like numbers are
converted to strings::
>>> d = {'a': 10, 'b': 20}
>>> 'a={d[a]}'.format(d=d)
'a=10'
Notice that the index value is converted to the string "a" when it is
looked up in the dict.
Notice that the index value is converted to the string ``'a'`` when it
is looked up in the dict.
However, in f-strings, you would need to use a literal for the value
of 'a'::
of ``'a'``::
>>> f'a={d["a"]}'
'a=10'
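The two lookup rules can be compared side by side. This is a minimal sketch; the variable ``key`` is an illustrative assumption and is not taken from the PEP:

```python
d = {'a': 10, 'b': 20}

# str.format(): the bare index a is treated as the string key 'a'.
via_format = 'a={d[a]}'.format(d=d)

# f-string: the index is an ordinary Python expression, so the key
# must be written as a string literal.
via_fstring = f'a={d["a"]}'

# Because it is an expression, a variable can supply the key.
key = 'b'
via_variable = f'b={d[key]}'

# str.format() instead looks up the literal string 'key' and fails.
try:
    'b={d[key]}'.format(d=d)
except KeyError as exc:
    missing = exc.args[0]

print(via_format, via_fstring, via_variable, missing)
```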
@ -511,7 +516,7 @@ use variables as index values::
See [#]_ for a further discussion. It was this observation that led to
full Python expressions being supported in f-strings.
Furthermore, the limited expressions that str.format() understands
Furthermore, the limited expressions that ``str.format()`` understands
need not be valid Python expressions. For example::
>>> '{i[";]}'.format(i={'";':4})
@ -525,7 +530,7 @@ Triple-quoted f-strings
Triple-quoted f-strings are allowed. These strings are parsed just as
normal triple-quoted strings are. After parsing, the normal f-string
logic is applied, and __format__() on each value is called.
logic is applied, and ``__format__()`` on each value is called.
Raw f-strings
-------------
@ -542,29 +547,30 @@ In addition, raw f-strings may be combined with triple-quoted strings.
No binary f-strings
-------------------
For the same reason that we don't support bytes.format(), you may not
combine 'f' with 'b' string literals. The primary problem is that an
object's __format__() method may return Unicode data that is not
compatible with a bytes string.
For the same reason that we don't support ``bytes.format()``, you may
not combine ``'f'`` with ``'b'`` string literals. The primary problem
is that an object's ``__format__()`` method may return Unicode data that
is not compatible with a bytes string.
Binary f-strings would first require a solution for
bytes.format(). This idea has been proposed in the past, most recently
in PEP 461 [#]_. The discussions of such a feature usually suggest either
``bytes.format()``. This idea has been proposed in the past, most
recently in PEP 461 [#]_. The discussions of such a feature usually
suggest either
- adding a method such as __bformat__() so an object can control how
it is converted to bytes, or
- adding a method such as ``__bformat__()`` so an object can control
how it is converted to bytes, or
- having bytes.format() not be as general purpose or extensible as
str.format().
- having ``bytes.format()`` not be as general purpose or extensible
as ``str.format()``.
Both of these remain as options in the future, if such functionality
is desired.
!s, !r, and !a are redundant
----------------------------
``!s``, ``!r``, and ``!a`` are redundant
----------------------------------------
The !s, !r, and !a are not strictly required. Because arbitrary
expressions are allowed inside the f-strings, this code::
The ``!s``, ``!r``, and ``!a`` are not strictly required. Because
arbitrary expressions are allowed inside the f-strings, this code::
>>> a = 'some string'
>>> f'{a!r}'
@ -575,23 +581,23 @@ Is identical to::
>>> f'{repr(a)}'
"'some string'"
Similarly, !s can be replaced by calls to str() and !a by calls to
ascii().
Similarly, ``!s`` can be replaced by calls to ``str()`` and ``!a`` by
calls to ``ascii()``.
However, !s, !r, and !a are supported by this PEP in order to minimize
the differences with str.format(). !s, !r, and !a are required in
str.format() because it does not allow the execution of arbitrary
expressions.
However, ``!s``, ``!r``, and ``!a`` are supported by this PEP in order
to minimize the differences with ``str.format()``. ``!s``, ``!r``, and
``!a`` are required in ``str.format()`` because it does not allow the
execution of arbitrary expressions.
Lambdas inside expressions
--------------------------
Because lambdas use the ':' character, they cannot appear outside of
parentheses in an expression. The colon is interpreted as the start of
the format specifier, which means the start of the lambda expression
is seen and is syntactically invalid. As there's no practical use for
a plain lambda in an f-string expression, this is not seen as much of
a limitation.
Because lambdas use the ``':'`` character, they cannot appear outside
of parentheses in an expression. The colon is interpreted as the start
of the format specifier, which means the start of the lambda
expression is seen and is syntactically invalid. As there's no
practical use for a plain lambda in an f-string expression, this is
not seen as much of a limitation.
If you feel you must use lambdas, they may be used inside of parens::
@ -602,27 +608,27 @@ Examples from Python's source code
==================================
Here are some examples from Python source code that currently use
str.format(), and how they would look with f-strings. This PEP does
not recommend wholesale converting to f-strings; these are just
examples of real-world usages of str.format() and how they'd look if
written from scratch using f-strings.
``str.format()``, and how they would look with f-strings. This PEP
does not recommend wholesale converting to f-strings; these are just
examples of real-world usages of ``str.format()`` and how they'd look
if written from scratch using f-strings.
Lib/asyncio/locks.py::
``Lib/asyncio/locks.py``::
extra = '{},waiters:{}'.format(extra, len(self._waiters))
extra = f'{extra},waiters:{len(self._waiters)}'
Lib/configparser.py::
``Lib/configparser.py``::
message.append(" [line {0:2d}]".format(lineno))
message.append(f" [line {lineno:2d}]")
Tools/clinic/clinic.py::
``Tools/clinic/clinic.py``::
methoddef_name = "{}_METHODDEF".format(c_basename.upper())
methoddef_name = f"{c_basename.upper()}_METHODDEF"
python-config.py::
``python-config.py``::
print("Usage: {0} [{1}]".format(sys.argv[0], '|'.join('--'+opt for opt in valid_opts)), file=sys.stderr)
print(f"Usage: {sys.argv[0]} [{'|'.join('--'+opt for opt in valid_opts)}]", file=sys.stderr)
@ -645,11 +651,13 @@ The same expression used multiple times
Every expression in braces in an f-string is evaluated exactly once
for each time it appears in the f-string. However, it's undefined
which result will show up in the resulting string value. For purposes
of this section, two expressions are the same if they have the exact
same literal text defining them. For example, '{i}' and '{i}' are the
same expression, but '{i}' and '{ i}' are not, due to the extra space
in the second expression.
which result will show up in the resulting string value. This only
matters for expressions with side effects.
For purposes of this section, two expressions are the same if they
have the exact same literal text defining them. For example, ``'{i}'``
and ``'{i}'`` are the same expression, but ``'{i}'`` and ``'{i }'``
are not, due to the extra space in the second expression.
For example, given::
@ -661,8 +669,8 @@ For example, given::
>>> f'{fn(lst)} {fn(lst)}'
'1 2'
The resulting f-string might have the value '1 2', '2 2', '1 1', or
even '2 1'.
The resulting f-string might have the value ``'1 2'``, ``'2 2'``,
``'1 1'``, or even ``'2 1'``.
However::
@ -670,9 +678,9 @@ However::
>>> f'{fn(lst)} { fn(lst)}'
'1 2'
This f-string will always have the value '1 2'. This is due to the two
expressions not being the same: the space in the second example makes
the two expressions distinct.
This f-string will always have the value ``'1 2'``. This is due to the
two expressions not being the same: the space in the second example
makes the two expressions distinct.
This restriction is in place in order to allow for a possible future
extension allowing translated strings, wherein the expression