PEP 701: Incorporate feedback from the discussion thread (#2939)
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
parent 574e82c2f4
commit d547ef7ef4

pep-0701.rst (+120 −120)
@@ -143,9 +143,8 @@ f-string literals (as well as the Python language in general).

      >>> f"{f"{f"infinite"}"}" + " " + f"{f"nesting!!!"}"

   This choice not only allows for a more consistent and predictable behavior of what can be
   placed in f-strings but provides an intuitive way to manipulate string literals in a
   more flexible way without having to fight the limitations of the implementation.
   This "feature" is not universally agreed to be desirable, and some users find this unreadable.
   For a discussion on the different views on this, see the :ref:`701-considerations-of-quote-reuse` section.
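
   For instance, a quote-reuse sketch in the spirit of the PEP's own examples
   (the values here are illustrative only; this requires an implementation of
   this PEP, such as CPython 3.12)::

      >>> songs = ['Take me back to Eden', 'Alkaline', 'Ascensionism']
      >>> f"This is the playlist: {", ".join(songs)}"
      'This is the playlist: Take me back to Eden, Alkaline, Ascensionism'
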
#. Another issue that has felt unintuitive to most is the lack of support for backslashes
   within the expression component of an f-string. One example that keeps coming up is including

@@ -223,10 +222,17 @@ for details on the syntax):

       | FSTRING_MIDDLE
       | fstring_replacement_field

The new tokens (``FSTRING_START``, ``FSTRING_MIDDLE``, ``FSTRING_END``) are defined
:ref:`later in this document <701-new-tokens>`.

This PEP leaves up to the implementation the level of f-string nesting allowed.
This means that limiting nesting is **not part of the language specification**
but also the language specification **doesn't mandate arbitrary nesting**.

The new grammar will preserve the Abstract Syntax Tree (AST) of the current
implementation. This means that no semantic changes will be introduced by this
PEP on existing code that uses f-strings.
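
One way to see this invariant (an illustrative check, not part of the PEP's
specification) is that a simple f-string keeps parsing to the same
``JoinedStr`` node containing ``Constant`` and ``FormattedValue`` children::

   >>> import ast
   >>> tree = ast.parse('f"hello {name}"')
   >>> type(tree.body[0].value).__name__
   'JoinedStr'
   >>> [type(v).__name__ for v in tree.body[0].value.values]
   ['Constant', 'FormattedValue']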

Handling of f-string debug expressions
--------------------------------------

@@ -259,6 +265,8 @@ and not just the associated tokens.

How parser/lexer implementations deal with this problem is of course up to the
implementation.
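
For context, a debug expression reproduces the exact source text of the
expression in its output (standard behavior since Python 3.8, shown here only
as an illustration), which is why the raw text must be available and not just
the associated tokens::

   >>> user = "amelia"
   >>> f"{user=}"
   "user='amelia'"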

.. _701-new-tokens:

New tokens
----------

@@ -277,10 +285,10 @@ better understanding of the proposed grammar changes and how the tokens are used

These tokens are always string parts and they are semantically equivalent to the
``STRING`` token with the restrictions specified. These tokens must be produced by the lexer
-when lexing f-strings. This means that **the tokenizer cannot produce a single token for f-strings anymore**. How
-the lexer emits this token is **not specified** as this will heavily depend on every
-implementation (even the Python version of the lexer in the standard library is
-implemented differently to the one used by the PEG parser).
+when lexing f-strings. This means that **the tokenizer cannot produce a single token for f-strings anymore**.
+How the lexer emits this token is **not specified** as this will heavily depend on every
+implementation (even the Python version of the lexer in the standard library is implemented
+differently to the one used by the PEG parser).

As an example::

@@ -308,6 +316,20 @@ while ``f"""some words"""`` will be tokenized simply as::

   FSTRING_START - 'f"""'
   FSTRING_MIDDLE - 'some words'
   FSTRING_END - '"""'

One way existing lexers can be adapted to emit these tokens is to incorporate a stack of "lexer modes"
or to use a stack of different lexers. This is because the lexer needs to switch from "regular Python
lexing" to "f-string lexing" when it encounters an f-string start token and, as f-strings can be nested,
the context needs to be preserved until the f-string closes. Also, the "lexer mode" inside an f-string
expression part needs to behave as a "super-set" of the regular Python lexer (as it needs to be able to
switch back to f-string lexing when it encounters the ``}`` terminator for the expression part as well
as handling f-string formatting and debug expressions). Of course, as mentioned before, it is not possible to
provide a precise specification of how this should be done as it will depend on the specific implementation
and nature of the lexer to be changed.
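
To make the idea concrete, here is a deliberately minimal sketch of such a
mode stack (illustrative only, not CPython's implementation: it handles a toy
subset of double-quoted f-strings whose expression parts contain plain
identifiers or further f-strings, with no escapes, format specs, or debug
expressions)::

   def lex(src):
       """Tokenize a toy subset of f-strings using a stack of lexer modes."""
       tokens = []
       modes = []  # stack of "fstring" / "expr" lexer modes
       i = 0
       while i < len(src):
           c = src[i]
           if modes and modes[-1] == "fstring":
               if c == '"':  # the f-string closes: pop back to the outer mode
                   tokens.append(("FSTRING_END", '"'))
                   modes.pop()
                   i += 1
               elif c == "{":  # an expression part begins
                   tokens.append(("LBRACE", "{"))
                   modes.append("expr")
                   i += 1
               else:  # a run of literal text
                   j = i
                   while j < len(src) and src[j] not in '"{':
                       j += 1
                   tokens.append(("FSTRING_MIDDLE", src[i:j]))
                   i = j
           elif modes and modes[-1] == "expr":
               if c == "}":  # expression part ends: back to f-string lexing
                   tokens.append(("RBRACE", "}"))
                   modes.pop()
                   i += 1
               elif src.startswith('f"', i):  # a nested f-string: push a mode
                   tokens.append(("FSTRING_START", 'f"'))
                   modes.append("fstring")
                   i += 2
               elif c.isspace():
                   i += 1
               else:  # a plain identifier
                   j = i
                   while j < len(src) and (src[j].isalnum() or src[j] == "_"):
                       j += 1
                   if j == i:
                       raise SyntaxError(f"unexpected character: {c!r}")
                   tokens.append(("NAME", src[i:j]))
                   i = j
           else:  # regular top-level lexing (everything else is skipped here)
               if src.startswith('f"', i):
                   tokens.append(("FSTRING_START", 'f"'))
                   modes.append("fstring")
                   i += 2
               else:
                   i += 1
       return tokens

Because the mode stack preserves context, a nested, quote-reusing input such as
``lex('f"a{f"b"}c"')`` comes out as ``FSTRING_START``, ``FSTRING_MIDDLE``
(``'a'``), ``LBRACE``, ``FSTRING_START``, ``FSTRING_MIDDLE`` (``'b'``),
``FSTRING_END``, ``RBRACE``, ``FSTRING_MIDDLE`` (``'c'``), ``FSTRING_END``.
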
The specifics of how (or if) the ``tokenize`` module will emit these tokens (or others) and what
is included in the emitted tokens are left out of this document and must be decided later in a regular
CPython issue.

Consequences of the new grammar
-------------------------------

@@ -320,7 +342,76 @@ All restrictions mentioned in the PEP are lifted from f-string literals, as expl

  expanded when the innermost string is evaluated.
* Comments, using the ``#`` character, are possible only in multi-line f-string literals,
  since comments are terminated by the end of the line (which makes closing a
-  single-line f-string literal impossible)
+  single-line f-string literal impossible). Comments in multi-line f-string literals require
+  the closing ``}`` of the expression part to be present on a different line from the one the
+  comment is in.
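
  For example, the following is valid under this PEP (a sketch; any
  implementation of this PEP, such as CPython 3.12, should accept it)::

     >>> f"""{
     ...     1 + 1  # a comment inside the expression part
     ... }"""
     '2'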

.. _701-considerations-of-quote-reuse:

Considerations regarding quote reuse
------------------------------------

One of the consequences of the grammar proposed here is that, as mentioned above,
f-string expressions can now contain strings delimited with the same kind of quote
that is used to delimit the external f-string literal. For example::

>>> f" something { my_dict["key"] } something else "
|
||||
|
||||
In the `discussion thread for this PEP <https://discuss.python.org/t/pep-701-syntactic-formalization-of-f-strings/22046>`_,
several concerns have been raised regarding this aspect and we want to collect them here,
as these should be taken into consideration when accepting or rejecting this PEP.

Some of these objections include:

* Many people find quote reuse within the same string confusing and hard to read. This is because
  allowing quote reuse will violate a current property of Python as it stands today: the fact that
  strings are fully delimited by two consecutive pairs of the same kind of quote, which by itself is a very simple rule.
  One of the reasons quote reuse may be harder for humans to parse, leading to less readable
  code, is that the quote character is the same for both start and
  end (as opposed to other delimiters).

* Some users have raised concerns that quote reuse may break some lexer and syntax highlighting tools that rely
  on simple mechanisms to detect strings and f-strings, such as regular expressions or simple delimiter
  matching tools. Introducing quote reuse in f-strings will either make it trickier to keep these tools
  working or will break the tools altogether (as, for instance, regular expressions cannot parse arbitrary nested
  structures with delimiters). The IDLE editor, included in the standard library, is an example of a
  tool which may need some work to correctly apply syntax highlighting to f-strings.
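
  As an illustrative sketch of the failure mode, a naive regular expression
  that assumes a double-quoted f-string ends at the next ``"`` stops too
  early as soon as the quote is reused::

     >>> import re
     >>> naive_fstring = re.compile(r'f"[^"]*"')
     >>> naive_fstring.search('f"{ my_dict["key"] }"').group()
     'f"{ my_dict["'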

Here are some of the arguments in favour:

* Many languages that allow similar syntactic constructs (normally called "string interpolation") allow quote
  reuse and arbitrary nesting. These languages include JavaScript, Ruby, C#, Bash, Swift and many others.
  The fact that many languages allow quote reuse can be a compelling argument in favour of allowing it in Python. This
  is because it will make the language more familiar to users coming from other languages.

* As many other popular languages allow quote reuse in string interpolation constructs, this means that editors
  that support syntax highlighting for these languages will already have the necessary tools to support syntax
  highlighting for f-strings with quote reuse in Python. This means that although the files that handle syntax
  highlighting for Python will need to be updated to support this new feature, this is not expected to be impossible
  or very hard to do.

* One advantage of allowing quote reuse is that it composes cleanly with other syntax. Sometimes this is referred to
  as "referential transparency". An example of this is that if we have ``f(x+1)``, assuming ``a`` is a brand new variable, it
  should behave the same as ``a = x+1; f(a)``. And vice versa. So if we have::

      def py2c(source):
          prefix = source.removesuffix(".py")
          return f"{prefix}.c"

  It should be expected that if we replace the variable ``prefix`` with its definition, the answer should be the same::

      def py2c(source):
          return f"{source.removesuffix(".py")}.c"

* Limiting quote reuse will considerably increase the complexity of the implementation of the proposed changes. This is because
  it will force the parser to have the context that it is parsing an expression part of an f-string with a given quote in order
  to know if it needs to reject an expression that reuses the quote. Carrying this context around is not trivial in parsers that
  can backtrack arbitrarily (such as the PEG parser). The issue becomes even more complex if we consider that f-strings can be
  arbitrarily nested and therefore several quote types may need to be rejected.

To gather feedback from the community,
`a poll <https://discuss.python.org/t/pep-701-syntactic-formalization-of-f-strings/22046/24>`__
has been initiated to get a sense of how the community feels about this aspect of the PEP.

Backwards Compatibility
=======================

@@ -370,8 +461,18 @@ A reference implementation can be found in the implementation_ fork.

Rejected Ideas
==============

#. Although we think the readability arguments that have been raised against
   allowing quote reuse in f-string expressions are valid and very important,
   we have decided to propose not rejecting quote reuse in f-strings at the parser
   level. The reason is that one of the cornerstones of this PEP is to reduce the
   complexity and maintenance of parsing f-strings in CPython, and this will not
   only work against that goal, but may even make the implementation more
   complex than the current one. We believe that forbidding quote reuse should be
   done in linters and code style tools and not in the parser, the same way other
   confusing or hard-to-read constructs in the language are handled today.

#. We have decided not to lift the restriction that some expression portions
-   need to wrap ``':'`` and ``'!'`` in braces at the top level, e.g.::
+   need to wrap ``':'`` and ``'!'`` in parentheses at the top level, e.g.::

      >>> f'Useless use of lambdas: { lambda x: x*2 }'
      SyntaxError: unexpected EOF while parsing

@@ -390,7 +491,6 @@ Rejected Ideas

   be parenthesized if needed::

      >>> f'Useless use of lambdas: { (lambda x: x*2) }'

Open Issues
===========