PEP 701: Correct handling of format specifiers and nested expressions (#3151)

parent 69b9d2b3d3
commit 295769391a

pep-0701.rst
@@ -228,13 +228,28 @@ for details on the syntax):
 The new tokens (``FSTRING_START``, ``FSTRING_MIDDLE``, ``FSTRING_END``) are defined
 :ref:`later in this document <701-new-tokens>`.
 
-This PEP leaves up to the implementation the level of f-string nesting allowed but
-**specifies a lower bound of 5 levels of nesting**. This is to ensure that users can
-have a reasonable expectation of being able to nest f-strings with "reasonable" depth.
+This PEP leaves up to the implementation the level of f-string nesting allowed
+(f-strings within the expression parts of other f-strings) but **specifies a
+lower bound of 5 levels of nesting**. This is to ensure that users can have a
+reasonable expectation of being able to nest f-strings with "reasonable" depth.
+This PEP implies that limiting nesting is **not part of the language
+specification** but also the language specification **doesn't mandate arbitrary
+nesting**.
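
To make the 5-level lower bound concrete, here is a minimal sketch that builds source text with f-strings nested to a chosen depth and checks whether the running interpreter accepts it. The helper names (``nested_fstring_source``, ``evaluate_if_supported``) are illustrative and not part of the PEP. Interpreters that predate this PEP are expected to reject the nested literal, in which case the helper returns ``None``.

```python
from typing import Optional


def nested_fstring_source(depth: int) -> str:
    """Return source text for f-strings nested `depth` levels deep around the literal 1."""
    source = "1"
    for _ in range(depth):
        # Reuse the same quote character at every level, as the new grammar allows.
        source = 'f"{' + source + '}"'
    return source


def evaluate_if_supported(source: str) -> Optional[str]:
    """Evaluate `source`, or return None if this interpreter rejects the syntax."""
    try:
        code = compile(source, "<nested-fstring>", "eval")
    except SyntaxError:
        return None
    return eval(code)


five_levels = nested_fstring_source(5)
print(five_levels)
print(evaluate_if_supported(five_levels))
```

The literal is kept as a string and compiled at run time so that the example itself still parses on interpreters that do not implement the new grammar.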
+
+Similarly, this PEP leaves up to the implementation the level of expression nesting
+in format specifiers but **specifies a lower bound of 2 levels of nesting**. This means
+that the following should always be valid:
+
+.. code-block:: python
+
+    f"{'':*^{1:{1}}}"
+
+but the following can be valid or not depending on the implementation:
+
+.. code-block:: python
+
+    f"{'':*^{1:{1:{1}}}}"
 
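As a rough illustration of what the two-level example computes, the sketch below spells out the nested format specifier with explicit ``format()`` calls and then asks the running interpreter whether it accepts each literal. The helper ``accepts`` and the constant names are assumptions made for this example. Whether the three-level literal compiles is implementation-dependent, so the sketch only reports the answer.

```python
def accepts(source: str) -> bool:
    """Report whether the running interpreter compiles `source` as an expression."""
    try:
        compile(source, "<format-spec>", "eval")
    except SyntaxError:
        return False
    return True


# The two literals from the text, kept as source strings so that this file
# parses on any interpreter.
TWO_LEVELS = "f\"{'':*^{1:{1}}}\""
THREE_LEVELS = "f\"{'':*^{1:{1:{1}}}}\""

# The two-level literal, written out by hand from the inside out.
innermost = format(1)               # {1}      -> "1"
width = format(1, innermost)        # {1:{1}}  -> format(1, "1") -> "1"
by_hand = format("", "*^" + width)  # {'':*^1} -> empty string centered in width 1

print(by_hand)
print(accepts(TWO_LEVELS), accepts(THREE_LEVELS))
```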
 The new grammar will preserve the Abstract Syntax Tree (AST) of the current
 implementation. This means that no semantic changes will be introduced by this
 PEP on existing code that uses f-strings.
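
The AST shape referred to here can be inspected with the standard ``ast`` module. The sketch below parses a sample f-string (the string itself is just an example chosen here) and looks at the node types that represent it.

```python
import ast

# An f-string with a conversion and a nested format specifier.
tree = ast.parse('f"{x!r:>{w}}"', mode="eval")

joined = tree.body        # the whole f-string
field = joined.values[0]  # its single replacement field
spec = field.format_spec  # the format specifier, itself a joined string

print(type(joined).__name__, type(field).__name__, type(spec).__name__)
print(field.conversion == ord("r"))  # the !r conversion is stored as a character code
```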
@@ -362,8 +377,11 @@ tokens:
 2. Keep consuming tokens until one of the following is encountered:
 
    * A closing quote equal to the opening quote.
-   * An opening brace (``{``) or a closing brace (``}``) that is not immediately
-     followed by another opening/closing brace.
+   * If in "format specifier mode" (see step 3), an opening brace (``{``) or a
+     closing brace (``}``).
+   * If not in "format specifier mode" (see step 3), an opening brace (``{``) or
+     a closing brace (``}``) that is not immediately followed by another opening/closing
+     brace.
 
    In all cases, if the character buffer is not empty, emit a ``FSTRING_MIDDLE``
    token with the contents captured so far but transform any double
@@ -375,16 +393,15 @@ tokens:
      is encountered, go to step 3.
    * If a closing bracket (not immediately followed by another closing bracket)
      is encountered, emit a token for the closing bracket and go to step 2.
-
 3. Push a new tokenizer mode to the tokenizer mode stack for "Regular Python
    tokenization within f-string" and proceed to tokenize with it. This mode
-   tokenizes as the "Regular Python tokenization" until a ``!``, ``:``, ``=``
-   character is encountered or if a ``}`` character is encountered with the same
-   level of nesting as the opening bracket token that was pushed when we enter the
-   f-string part. Using this mode, emit tokens until one of the stop points are
-   reached. When this happens, emit the corresponding token for the stopping
-   character encountered and, pop the current tokenizer mode from the tokenizer mode
-   stack and go to step 2.
+   tokenizes as the "Regular Python tokenization" until a ``:`` or a ``}``
+   character is encountered with the same level of nesting as the opening
+   bracket token that was pushed when we enter the f-string part. Using this mode,
+   emit tokens until one of the stop points is reached. When this happens, emit
+   the corresponding token for the stopping character encountered, pop the
+   current tokenizer mode from the tokenizer mode stack and go to step 2. If the
+   stopping point is a ``:`` character, enter step 2 in "format specifier" mode.
 4. Emit a ``FSTRING_END`` token with the contents captured and pop the current
    tokenizer mode (corresponding to "F-string tokenization") and go back to
    "Regular Python mode".
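The steps above can be sketched in plain Python. This is a simplified illustration under several assumptions, not the reference tokenizer. It handles a single ``f`` prefix with a one-character quote. It emits each expression as one placeholder ``EXPR`` token and each brace or colon as a placeholder ``OP`` token, where a real tokenizer would emit ordinary Python tokens. It does not recognise colons or braces hidden inside nested string literals or lambdas. Recursion stands in for the tokenizer mode stack, and the function names are made up for this example.

```python
from __future__ import annotations


def tokenize_fstring(source: str) -> list[tuple[str, str]]:
    """Tokenize one f-string literal following steps 1-4 (simplified sketch)."""
    if len(source) < 3 or source[0] not in "fF" or source[1] not in "'\"":
        raise ValueError("expected an f-string with a one-character quote")
    quote = source[1]
    tokens = [("FSTRING_START", source[:2])]  # step 1
    end = _middle(source, 2, quote, False, tokens)
    if end != len(source):
        raise ValueError("unexpected text after the closing quote")
    return tokens


def _middle(source, pos, quote, in_spec, tokens):
    """Step 2: collect literal text; `in_spec` means "format specifier mode"."""
    while True:
        buffer = []
        while pos < len(source):
            ch = source[pos]
            if ch == quote:
                break
            if ch in "{}":
                doubled = source[pos + 1 : pos + 2] == ch
                if in_spec or not doubled:
                    break  # a real brace: in a format spec, every brace stops here
                buffer.append(ch)  # "{{" or "}}" outside a spec becomes one brace
                pos += 2
                continue
            buffer.append(ch)
            pos += 1
        else:
            raise ValueError("unterminated f-string")
        if buffer:
            tokens.append(("FSTRING_MIDDLE", "".join(buffer)))
        if ch == quote:
            if in_spec:
                raise ValueError("unterminated format specifier")
            tokens.append(("FSTRING_END", quote))  # step 4
            return pos + 1
        tokens.append(("OP", ch))
        pos += 1
        if ch == "{":
            pos = _expression(source, pos, quote, tokens)  # step 3
        elif in_spec:
            return pos  # this "}" closes the field that owns the format spec
        # Otherwise: a lone "}" outside a format spec, so continue with step 2.


def _expression(source, pos, quote, tokens):
    """Step 3: read an expression up to ":" or "}" at the opening brace's level."""
    start, depth = pos, 0
    while pos < len(source):
        ch = source[pos]
        if depth == 0 and ch in ":}":
            break
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        pos += 1
    else:
        raise ValueError("unterminated replacement field")
    if pos > start:
        tokens.append(("EXPR", source[start:pos]))
    tokens.append(("OP", ch))
    pos += 1
    if ch == ":":
        # A ":" stop point re-enters step 2 in format specifier mode.
        pos = _middle(source, pos, quote, True, tokens)
    return pos


for token in tokenize_fstring("f\"{'':*^{1:{1}}}\""):
    print(token)
```

The example input ends in three consecutive closing braces. The first is consumed in expression mode, and the remaining two are seen in format specifier mode, where they are emitted individually instead of being merged as an escaped ``}}``. That is the case the format specifier bullet in step 2 exists to cover.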