PEP 701: Add some clarifications to f-string debug expressions and tokens (#2929)
parent b4d7ea782d
commit 9de4efd734

pep-0701.rst
@ -213,7 +213,47 @@ This PEP leaves up to the implementation the level of f-string nesting allowed.
This means that limiting nesting is **not part of the language specification**
but also the language specification **doesn't mandate arbitrary nesting**.

Handling of f-string debug expressions
--------------------------------------

Since Python 3.8, f-strings can be used to debug expressions by using the
``=`` specifier. For example::

    >>> a = 1
    >>> f"{1+a=}"
    '1+a=2'

These semantics were never formally specified in a PEP; they were implemented
in the current string parser as a special case in `bpo-36817
<https://bugs.python.org/issue?@action=redirect&bpo=36817>`_ and documented in
`the f-string lexical analysis section
<https://docs.python.org/3/reference/lexical_analysis.html#f-strings>`_.

This feature is not affected by the changes proposed in this PEP, but it is
important to specify that formal handling of this feature requires the lexer
to be able to "untokenize" the expression part of the f-string. This is not a
problem for the current string parser, as it can operate directly on the string
token contents. However, incorporating this feature into a given parser
implementation requires the lexer to keep track of the raw string contents of
the expression part of the f-string and make them available to the parser when
the parse tree is constructed for f-string nodes. A pure "untokenization" is not
enough because, as currently specified, f-string debugging preserves whitespace,
including spaces after the ``{`` and the ``=`` characters. This means that the
raw string contents of the expression part of the f-string must be kept intact,
and not just the associated tokens.

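As an illustration of why the tokens alone are not enough, the following sketch compares what a debug f-string renders with what can be rebuilt from the token strings of the same expression. It uses the standard library ``tokenize`` module purely as a stand-in for a lexer; the helper name ``token_strings`` is made up for this example and is not part of the PEP.

```python
import io
import tokenize


def token_strings(expression):
    """Return the visible token strings of a one-line expression.

    Hypothetical helper: NEWLINE and ENDMARKER tokens carry no visible
    text, so they are filtered out.
    """
    readline = io.StringIO(expression).readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(readline)
        if tok.string.strip()
    ]


x = 1

# The debug specifier echoes the expression text exactly as written,
# including the irregular spacing and the space after "=".
rendered = f"{x  +   1 = }"

# The token strings no longer know how the operands were spaced.
tokens = token_strings("x  +   1")
rebuilt = " ".join(tokens)

print(repr(rendered))
print(tokens)
print(repr(rebuilt))
```

The rebuilt text is a valid spelling of the same expression, but it is not the text the user typed, which is what the debug output has to reproduce.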
How parser/lexer implementations deal with this problem is of course up to the
implementation.

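One possible approach, shown here only as a toy sketch, is to record the raw source slice of the replacement field while scanning and hand that slice to whatever builds the result. The functions ``raw_debug_field`` and ``render_debug_field`` are hypothetical names for this example; the scanner assumes a single, non-nested field with no conversion, no format spec and no ``{{``/``}}`` escapes, which is far simpler than a real lexer.

```python
def raw_debug_field(body):
    """Return the raw text between the first "{" and the next "}".

    Toy scanner (an assumption for illustration, not the PEP's design):
    it handles one non-nested field and raises ValueError when the field
    is not a debug expression.
    """
    start = body.index("{") + 1
    end = body.index("}", start)
    raw = body[start:end]
    # A debug field ends with "=", optionally followed by whitespace.
    if not raw.rstrip().endswith("="):
        raise ValueError("not a debug expression")
    return raw


def render_debug_field(body, namespace):
    """Render the debug field of ``body`` the way an f-string would."""
    raw = raw_debug_field(body)
    # Drop the trailing "=" and surrounding whitespace before evaluating.
    expression = raw.rstrip()[:-1].strip()
    value = eval(expression, {}, namespace)
    # The raw slice is emitted untouched, so all whitespace survives.
    return raw + repr(value)


x = 1
print(repr(render_debug_field("{ x + 1 = }", {"x": x})))
```

Because the raw slice is carried along unchanged, the sketch reproduces the whitespace after ``{`` and after ``=`` without needing to reconstruct anything from tokens.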
New tokens
----------

Three new tokens are introduced: ``FSTRING_START``, ``FSTRING_MIDDLE`` and
``FSTRING_END``. This PEP does not mandate the precise definitions of these tokens,
as different lexers may have implementations that are more efficient than the ones
proposed here given the context of the particular implementation. However,
the following definitions are provided as a reference so that the reader can
better understand the proposed grammar changes and how the tokens are used:

* ``FSTRING_START``: This token includes f-string character (``f``/``F``) and the open quote(s).
* ``FSTRING_MIDDLE``: This token includes the text between the opening quote
@ -254,6 +294,9 @@ while ``f"""some words"""`` will be tokenized simply as::
    FSTRING_START - 'f"""'
    FSTRING_END - 'some words'

Consequences of the new grammar
-------------------------------

All restrictions mentioned in the PEP are lifted from f-literals, as explained below:

* Expression portions may now contain strings delimited with the same kind of
@ -291,7 +334,7 @@ limited to be different from the quotes of the enclosing string, because this is
now allowed: as an arbitrary Python string can contain any possible choice of
quotes, so can any f-string expression. Additionally there is no need to clarify
that certain things are not allowed in the expression part because of
implementation restrictions such as comments, new line characters or
backslashes.

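A small sketch of the quote-reuse case follows. The f-string is kept as source text and evaluated with ``eval`` so that the example also loads on interpreters that predate this PEP (the change shipped in Python 3.12); the helper name ``evaluate_if_supported`` is an assumption made for this example.

```python
# Source text of an f-string whose expression reuses the enclosing
# double quotes.
SOURCE = 'f"{"-".join(["a", "b"])}"'


def evaluate_if_supported(source):
    """Evaluate ``source``; return None where the syntax is rejected."""
    try:
        return eval(source)
    except SyntaxError:
        # Older interpreters end the f-string at the second double quote
        # and report a syntax error.
        return None


result = evaluate_if_supported(SOURCE)
print(result)
```

On an interpreter implementing this PEP the result is the joined string; on older versions the helper returns ``None`` instead of failing at import time.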
The only "surprising" difference is that as f-strings allow specifying a