diff --git a/pep-0536.txt b/pep-0536.txt new file mode 100644 index 000000000..5a954b7db --- /dev/null +++ b/pep-0536.txt @@ -0,0 +1,183 @@ +PEP: 536 +Title: Final Grammar for Literal String Interpolation +Version: $Revision$ +Last-Modified: $Date$ +Author: Philipp Angerer +Status: Draft +Type: Standards Track +Content-Type: text/x-rst +Created: 11-Dec-2016 +Python-Version: 3.7 +Post-History: 12-Dec-2016 + +Abstract +======== + +PEP 498 introduced Literal String Interpolation (or “f-strings”). +The expression portions of those literals however are subject to +certain restrictions. This PEP proposes a formal grammar lifting +those restrictions, promoting “f-strings” to “f expressions” or f-literals. + +This PEP expands upon the f-strings introduced by PEP 498, +so this text requires familiarity with PEP 498. + +Terminology +=========== + +This text will refer to the existing grammar as “f-strings”, +and the proposed one as “f-literals”. + +Furthermore, it will refer to the ``{}``-delimited expressions in +f-literals/f-strings as “expression portions” and the static string content +around them as “string portions”. + +Motivation +========== + +The current implementation of f-strings in CPython relies on the existing +string parsing machinery and a post processing of its tokens. This results in +several restrictions to the possible expressions usable within f-strings: + +#. It is impossible to use the quote character delimiting the f-string + within the expression portion:: + + >>> f'Magic wand: { bag['wand'] }' + ^ + SyntaxError: invalid syntax + +#. A previously considered way around it would lead to escape sequences + in executed code and is prohibited in f-strings:: + + >>> f'Magic wand { bag[\'wand\'] } string' + SyntaxError: f-string expression portion cannot include a backslash + +#. Comments are forbidden even in multi-line f-strings:: + + >>> f'''A complex trick: { + ... bag['bag'] # recursive bags! + ... }''' + SyntaxError: f-string expression part cannot include '#' + +#. Expression portions need to wrap ``':'`` and ``'!'`` in braces:: + + >>> f'Useless use of lambdas: { lambda x: x*2 }' + SyntaxError: unexpected EOF while parsing + +These limitations serve no purpose from a language user perspective and +can be lifted by giving f-literals a regular grammar without exceptions +and implementing it using dedicated parse code. + +Rationale +========= + +.. https://mail.python.org/pipermail/python-ideas/2016-August/041727.html + +The restrictions mentioned in Motivation_ are non-obvious and counter-intuitive +unless the user is familiar with the f-literals’ implementation details. + +As mentioned, a previous version of PEP 498 allowed escape sequences +anywhere in f-strings, including as ways to encode the braces delimiting +the expression portions and in their code. They would be expanded before +the code is parsed, which would have had several important ramifications: + +#. It would not be clear to human readers which portions are Expressions +and which are strings. Great material for an “obfuscated/underhanded +Python challenge” +#. Syntax highlighters are good in parsing nested grammar, but not +in recognizing escape sequences. ECMAScript 2016 (JavaScript) allows +escape sequences in its identifiers [1]_ and the author knows of no +syntax highlighter able to correctly highlight code making use of this. + +As a consequence, the expression portions would be harder to recognize +with and without the aid of syntax highlighting. With the new grammar, +it is easy to extend syntax highlighters to correctly parse +and display f-literals: + +.. raw:: html + +
f'Magic wand: {bag['wand']:^10}'
+ +.. This is the output of kate-syntax-highlighter when given that code + (with some quotes stripped) + +Highlighting expression portions with possible escape sequences would +mean to create a modified copy of all rules of the complete expression +grammar, accounting for the possibility of escape sequences in key words, +delimiters, and all other language syntax. One such duplication would +yield one level of escaping depth and have to be repeated for a deeper +escaping in a recursive f-literal. This is the case since no highlighting +engine known to the author supports expanding escape sequences before +applying rules to a certain context. Nesting contexts however is a +standard feature of all highlighting engines. + +Familiarity also plays a role: Arbitrary nesting of expressions +without expansion of escape sequences is available in every single +other language employing a string interpolation method that uses +expressions instead of just variable names. [2]_ + +Specification +============= + +PEP 498 specified f-strings as the following, but places restrictions on it:: + + f ' { } ... ' + +All restrictions mentioned in the PEP are lifted from f-literals, +as explained below: + +#. Expression portions may now contain strings delimited with the same + kind of quote that is used to delimit the f-literal. +#. Backslashes may now appear within expressions just like anywhere else + in Python code. In case of strings nested within f-literals, + escape sequences are expanded when the innermost string is evaluated. +#. Comments, using the ``'#'`` character, are possible only in multi-line + f-literals, since comments are terminated by the end of the line + (which makes closing a single-line f-literal impossible). +#. Expression portions may contain ``':'`` or ``'!'`` wherever + syntactically valid. The first ``':'`` or ``'!'`` that is not part + of an expression has to be followed a valid coercion or format specifier. + +A remaining restriction not explicitly mentioned by PEP 498 is line breaks +in expression portions. Since strings delimited by single ``'`` or ``"`` +characters are expected to be single line, line breaks remain illegal +in expression portions of single line strings. + +.. note:: Is lifting of the restrictions sufficient, + or should we specify a more complete grammar? + +Backwards Compatibility +======================= + +f-literals are fully backwards compatible to f-strings, +and expands the syntax considered legal. + +Reference Implementation +======================== + +TBD + +References +========== + +.. [1] ECMAScript ``IdentifierName`` specification + ( http://ecma-international.org/ecma-262/6.0/#sec-names-and-keywords ) + + Yes, ``const cthulhu = { H̹̙̦̮͉̩̗̗ͧ̇̏̊̾Eͨ͆͒̆ͮ̃͏̷̮̣̫̤̣Cͯ̂͐͏̨̛͔̦̟͈̻O̜͎͍͙͚̬̝̣̽ͮ͐͗̀ͤ̍̀͢M̴̡̲̭͍͇̼̟̯̦̉̒͠Ḛ̛̙̞̪̗ͥͤͩ̾͑̔͐ͅṮ̴̷̷̗̼͍̿̿̓̽͐H̙̙̔̄͜\u0042: 42 }`` is valid ECMAScript 2016 + +.. [2] Wikipedia article on string interpolation + ( https://en.wikipedia.org/wiki/String_interpolation ) + +Copyright +========= + +This document has been placed in the public domain. + + +.. + Local Variables: + mode: indented-text + indent-tabs-mode: nil + sentence-end-double-space: t + fill-column: 70 + coding: utf-8 + End: