From 1bd2f20a00d96538e90e566c41b1c20f2f6cd89f Mon Sep 17 00:00:00 2001
From: Chris Angelico
Date: Tue, 15 Sep 2015 11:37:22 +1000
Subject: [PATCH] Apply PEP 502 changes from Mike Miller

---
 pep-0502.txt | 449 ++++++++++++++++++---------------------------------
 1 file changed, 161 insertions(+), 288 deletions(-)

diff --git a/pep-0502.txt b/pep-0502.txt
index a51b7eba6..dbb1db34c 100644
--- a/pep-0502.txt
+++ b/pep-0502.txt
@@ -1,44 +1,46 @@
 PEP: 502
-Title: String Interpolation Redux
+Title: String Interpolation - Extended Discussion
 Version: $Revision$
 Last-Modified: $Date$
 Author: Mike G. Miller
 Status: Draft
-Type: Standards Track
+Type: Informational
 Content-Type: text/x-rst
 Created: 10-Aug-2015
 Python-Version: 3.6
 
-Note: Open issues below are stated with a question mark (?),
-and are therefore searchable.
-
 Abstract
 ========
 
-This proposal describes a new string interpolation feature for Python,
-called an *expression-string*,
-that is both concise and powerful,
-improves readability in most cases,
-yet does not conflict with existing code.
+PEP 498: *Literal String Interpolation*, which proposed "formatted strings" was
+accepted September 9th, 2015.
+Additional background and rationale given during its design phase is detailed
+below.
+
+To recap that PEP,
+a string prefix was introduced that marks the string as a template to be
+rendered.
+These formatted strings may contain one or more expressions
+built on `the existing syntax`_ of ``str.format()``.
+The formatted string expands at compile-time into a conventional string format
+operation,
+with the given expressions from its text extracted and passed instead as
+positional arguments.
 
-To achieve this end,
-a new string prefix is introduced,
-which expands at compile-time into an equivalent expression-string object,
-with requested variables from its context passed as keyword arguments.
 At runtime,
-the new object uses these passed values to render a string to given
-specifications, building on `the existing syntax`_ of ``str.format()``::
+the resulting expressions are evaluated to render a string to given
+specifications::
 
     >>> location = 'World'
-    >>> e'Hello, {location} !'      # new prefix: e''
-    'Hello, World !'                # interpolated result
+    >>> f'Hello, {location} !'      # new prefix: f''
+    'Hello, World !'                # interpolated result
+
+Format-strings may be thought of as merely syntactic sugar to simplify traditional
+calls to ``str.format()``.
 
 .. _the existing syntax: https://docs.python.org/3/library/string.html#format-string-syntax
 
-This PEP does not recommend to remove or deprecate any of the existing string
-formatting mechanisms.
-
 
 Motivation
 ==========
 
@@ -50,12 +52,16 @@ In comparison to other dynamic scripting languages
 with similar use cases,
 the amount of code necessary to build similar strings is substantially higher,
 while at times offering lower readability due to verbosity, dense syntax,
-or identifier duplication. [1]_
+or identifier duplication.
+
+These difficulties are described at moderate length in the original
+`post to python-ideas`_
+that started the snowball (that became PEP 498) rolling. [1]_
 
 Furthermore, replacement of the print statement with the more consistent print
 function of Python 3 (PEP 3105) has added one additional minor burden,
 an additional set of parentheses to type and read.
-Combined with the verbosity of current formatting solutions,
+Combined with the verbosity of current string formatting solutions,
 this puts an otherwise simple language at an unfortunate disadvantage to its
 peers::
 
@@ -66,7 +72,7 @@ peers::
     # Python 3, str.format with named parameters
     print('Hello, user: {user}, id: {id}, on host: {hostname}'.format(**locals()))
 
-    # Python 3, variation B, worst case
+    # Python 3, worst case
     print('Hello, user: {user}, id: {id}, on host: {hostname}'.format(user=user,
                                                                       id=id,
                                                                       hostname=
@@ -74,7 +80,7 @@ peers::
 
 In Python, the formatting and printing of a string with multiple variables in a
 single line of code of standard width is noticeably harder and more verbose,
-indentation often exacerbating the issue.
+with indentation exacerbating the issue.
 
 For use cases such as smaller projects, systems programming,
 shell script replacements, and even one-liners,
@@ -82,36 +88,17 @@ where message formatting complexity has yet to be encapsulated,
 this verbosity has likely lead a significant number of developers and
 administrators to choose other languages over the years.
 
+.. _post to python-ideas: https://mail.python.org/pipermail/python-ideas/2015-July/034659.html
+
 
 Rationale
 =========
 
 
-Naming
-------
-
-The term expression-string was chosen because other applicable terms,
-such as format-string and template are already well used in the Python standard
-library.
-
-The string prefix itself, ``e''`` was chosen to demonstrate that the
-specification enables expressions,
-is not limited to ``str.format()`` syntax,
-and also does not lend itself to `the shorthand term`_ "f-string".
-It is also slightly easier to type than other choices such as ``_''`` and
-``i''``,
-while perhaps `less odd-looking`_ to C-developers.
-``printf('')`` vs. ``print(f'')``.
-
-.. _the shorthand term: reference_needed
-.. _less odd-looking: https://mail.python.org/pipermail/python-dev/2015-August/141147.html
-
-
-
 Goals
 -------------
 
-The design goals of expression-strings are as follows:
+The design goals of format strings are as follows:
 
 #. Eliminate need to pass variables manually.
 #. Eliminate repetition of identifiers and redundant parentheses.
@@ -133,40 +120,44 @@ Python specified both single (``'``) and double (``"``) ASCII quote
 characters to enclose strings.
 It is not reasonable to choose one of them now to enable interpolation,
 while leaving the other for uninterpolated strings.
-"Backtick" characters (`````) are also `constrained by history`_ as a shortcut
-for ``repr()``.
+Other characters,
+such as the "Backtick" (or grave accent `````) are also
+`constrained by history`_
+as a shortcut for ``repr()``.
 
 This leaves a few remaining options for the design of such a feature:
 
 * An operator, as in printf-style string formatting via ``%``.
 * A class, such as ``string.Template()``.
-* A function, such as ``str.format()``.
-* New syntax
+* A method or function, such as ``str.format()``.
+* New syntax, or
 * A new string prefix marker, such as the well-known ``r''`` or ``u''``.
 
-The first three options above currently work well.
+The first three options above are mature.
 Each has specific use cases and drawbacks,
 yet also suffer from the verbosity and visual noise mentioned previously.
-All are discussed in the next section.
+All options are discussed in the next sections.
 
 .. _constrained by history: https://mail.python.org/pipermail/python-ideas/2007-January/000054.html
 
+
 Background
 -------------
 
-This proposal builds on several existing techniques and proposals and what
+Formatted strings build on several existing techniques and proposals and what
 we've collectively learned from them.
+In keeping with the design goals of readability and error-prevention,
+the following examples therefore use named,
+not positional arguments.
 
-The following examples focus on the design goals of readability and
-error-prevention using named parameters.
 Let's assume we have the following dictionary,
 and would like to print out its items as an informative string for end users::
 
     >>> params = {'user': 'nobody', 'id': 9, 'hostname': 'darkstar'}
 
 
-Printf-style formatting
-'''''''''''''''''''''''
+Printf-style formatting, via operator
+'''''''''''''''''''''''''''''''''''''
 
 This `venerable technique`_ continues to have its uses,
 such as with byte-based protocols,
@@ -178,7 +169,7 @@ and familiarity to many programmers::
 
 In this form, considering the prerequisite dictionary creation,
 the technique is verbose, a tad noisy,
-and relatively readable.
+yet relatively readable.
 Additional issues are that an operator can only take one argument besides the
 original string,
 meaning multiple parameters must be passed in a tuple or dictionary.
@@ -190,8 +181,8 @@ or forget the trailing type, e.g. (``s`` or ``d``).
 .. _venerable technique: https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting
 
 
-string.Template
-'''''''''''''''
+string.Template Class
+'''''''''''''''''''''
 
 The ``string.Template`` `class from`_ PEP 292
 (Simpler String Substitutions)
@@ -202,7 +193,7 @@ that finds its main use cases in shell and internationalization tools::
 
     Template('Hello, user: $user, id: ${id}, on host: $hostname').substitute(params)
 
-Also verbose, however the string itself is readable.
+While also verbose, the string itself is readable.
 Though functionality is limited,
 it meets its requirements well.
 It isn't powerful enough for many cases,
@@ -232,8 +223,8 @@ and likely contributed to the PEP's lack of acceptance.
 It was superseded by the following proposal.
 
 
-str.format()
-''''''''''''
+str.format() Method
+'''''''''''''''''''
 
 The ``str.format()`` `syntax of`_ PEP 3101 is the most recent and modern of the
 existing options.
@@ -253,36 +244,32 @@ string literals::
                                                                   host=hostname)
     'Hello, user: nobody, id: 9, on host: darkstar'
 
+The verbosity of the method-based approach is illustrated here.
+
 .. _syntax of: https://docs.python.org/3/library/string.html#format-string-syntax
 
 
 PEP 498 -- Literal String Formatting
 ''''''''''''''''''''''''''''''''''''
 
-PEP 498 discusses and delves partially into implementation details of
-expression-strings,
-which it calls f-strings,
-the idea and syntax
-(with exception of the prefix letter)
-of which is identical to that discussed here.
-The resulting compile-time transformation however
-returns a string joined from parts at runtime,
-rather than an object.
-
-It also, somewhat controversially to those first exposed to it,
-introduces the idea that these strings shall be augmented with support for
-arbitrary expressions,
-which is discussed further in the following sections.
+PEP 498 defines and discusses format strings,
+as also described in the `Abstract`_ above.
+It also, somewhat controversially to those first exposed,
+introduces the idea that format-strings shall be augmented with support for
+arbitrary expressions.
+This is discussed further in the
+Restricting Syntax section under
+`Rejected Ideas`_.
 
 
 PEP 501 -- Translation ready string interpolation
 '''''''''''''''''''''''''''''''''''''''''''''''''
 
 The complimentary PEP 501 brings internationalization into the discussion as a
-first-class concern, with its proposal of i-strings,
+first-class concern, with its proposal of the i-prefix,
 ``string.Template`` syntax integration compatible with ES6 (Javascript),
 deferred rendering,
-and a similar object return value.
+and an object return value.
 
 
 Implementations in Other Languages
@@ -374,7 +361,8 @@ ES6 (Javascript)
 Designers of `Template strings`_ faced the same issue as Python where single
 and double quotes were taken.
 Unlike Python however, "backticks" were not.
-They were chosen as part of the ECMAScript 2015 (ES6) standard::
+Despite `their issues`_,
+they were chosen as part of the ECMAScript 2015 (ES6) standard::
 
     console.log(`Fifteen is ${a + b} and\nnot ${2 * a + b}.`);
 
@@ -391,8 +379,10 @@ as the tag::
 * User implemented prefixes supported.
 * Arbitrary expressions are supported.
 
+.. _their issues: https://mail.python.org/pipermail/python-ideas/2007-January/000054.html
 .. _Template strings: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/template_strings
 
+
 
 C#, Version 6
 '''''''''''''
@@ -428,13 +418,14 @@ Arbitrary `interpolation under Swift`_ is available on all strings::
 Additional examples
 '''''''''''''''''''
 
-A number of additional examples may be `found at Wikipedia`_.
+A number of additional examples of string interpolation may be
+`found at Wikipedia`_.
+
+Now that background and history have been covered,
+let's continue on for a solution.
 
 .. _found at Wikipedia: https://en.wikipedia.org/wiki/String_interpolation#Examples
 
-Now that background and imlementation history have been covered,
-let's continue on for a solution.
-
 
 New Syntax
 ----------
@@ -442,178 +433,47 @@ New Syntax
 This should be an option of last resort,
 as every new syntax feature has a cost in terms of real-estate in a brain it
 inhabits.
-There is one alternative left on our list of possibilities,
+There is however one alternative left on our list of possibilities,
 which follows.
 
 
 
 New String Prefix
 -----------------
 
-Given the history of string formatting in Python,
-backwards-compatibility,
+Given the history of string formatting in Python and backwards-compatibility,
 implementations in other languages,
-and the avoidance of new syntax unless necessary,
+avoidance of new syntax unless necessary,
 an acceptable design is reached through elimination rather than unique insight.
-Therefore, we choose to explicitly mark interpolated string literals with a
-string prefix.
+Therefore, marking interpolated string literals with a string prefix is chosen.
 
-We also choose an expression syntax that reuses and builds on the strongest of
+We also choose an expression syntax that reuses and builds on the strongest of
 the existing choices,
-``str.format()`` to avoid further duplication.
-
-
-Specification
-=============
-
-String literals with the prefix of ``e`` shall be converted at compile-time to
-the construction of an ``estr`` (perhaps ``types.ExpressionString``?) object.
-Strings and values are parsed from the literal and passed as tuples to the
-constructor::
+``str.format()`` to avoid further duplication of functionality::
 
     >>> location = 'World'
-    >>> e'Hello, {location} !'
+    >>> f'Hello, {location} !'      # new prefix: f''
+    'Hello, World !'                # interpolated result
 
-    # becomes
-    # estr('Hello, {location} !',   # template
-           ('Hello, ', ' !'),       # string fragments
-           ('location',),           # expressions
-           ('World',),              # values
-         )
-
-The object interpolates its result immediately at run-time::
-
-    'Hello, World !'
+PEP 498 -- Literal String Formatting, delves into the mechanics and
+implementation of this design.
 
 
-ExpressionString Objects
-------------------------
-
-The ExpressionString object supports both immediate and deferred rendering of
-its given template and parameters.
-It does this by immediately rendering its inputs to its internal string and
-``.rendered`` string member (still necessary?),
-useful in the majority of use cases.
-To allow for deferred rendering and caller-specified escaping,
-all inputs are saved for later inspection,
-with convenience methods available.
-
-Notes:
-
-* Inputs are saved to the object as ``.template`` and ``.context`` members
-  for later use.
-* No explicit ``str(estr)`` call is necessary to render the result,
-  though doing so might be desired to free resources if significant.
-* Additional or deferred rendering is available through the ``.render()``
-  method, which allows template and context to be overriden for flexibility.
-* Manual escaping of potentially dangerous input is available through the
-  ``.escape(escape_function)`` method,
-  the rules of which may therefore be specified by the caller.
-  The given function should both accept and return a single modified string.
-
-* A sample Python implementation can `found at Bitbucket`_:
-
-.. _found at Bitbucket: https://bitbucket.org/mixmastamyk/docs/src/default/pep/estring_demo.py
-
-
-Inherits From ``str`` Type
-'''''''''''''''''''''''''''
-
-Inheriting from the ``str`` class is one of the techniques available to improve
-compatibility with code expecting a string object,
-as it will pass an ``isinstance(obj, str)`` test.
-ExpressionString implements this and also renders its result into the "raw"
-string of its string superclass,
-providing compatibility with a majority of code.
-
-
-Interpolation Syntax
---------------------
-
-The strongest of the existing string formatting syntaxes is chosen,
-``str.format()`` as a base to build on. [10]_ [11]_
-
-..
-
-* Additionally, single arbitrary expressions shall also be supported inside
-  braces as an extension::
-
-    >>> e'My age is {age + 1} years.'
-
-  See below for section on safety.
-
-* Triple quoted strings with multiple lines shall be supported::
-
-    >>> e'''Hello,
-            {location} !'''
-    'Hello,\n World !'
-
-* Adjacent implicit concatenation shall be supported;
-  interpolation does not `not bleed into`_ other strings::
-
-    >>> 'Hello {1, 2, 3} ' e'{location} !'
-    'Hello {1, 2, 3} World !'
-
-* Additional implementation details,
-  for example expression and error-handling,
-  are specified in the compatible PEP 498.
-
-.. _not bleed into: https://mail.python.org/pipermail/python-ideas/2015-July/034763.html
-
-
-Composition with Other Prefixes
--------------------------------
-
-* Expression-strings apply to unicode objects only,
-  therefore ``u''`` is never needed.
-  Should it be prevented?
-
-* Bytes objects are not included here and do not compose with e'' as they
-  do not support ``__format__()``.
-
-* Complimentary to raw strings,
-  backslash codes shall not be converted in the expression-string,
-  when combined with ``r''`` as ``re''``.
-
-
-Examples
---------
-
-A more complicated example follows::
-
-    n = 5; # t0, t1 = … TODO
-    a = e"Sliced {n} onions in {t1-t0:.3f} seconds."
-    # returns the equvalent of
-    estr("Sliced {n} onions in {t1-t0:.3f} seconds",        # template
-            ('Sliced ', ' onions in ', ' seconds'),         # strings
-            ('n', 't1-t0:.3f'),                             # expressions
-            (5, 0.555555)                                   # values
-        )
-
-With expressions only::
-
-    b = e"Three random numbers: {rand()}, {rand()}, {rand()}."
-    # returns the equvalent of
-    estr("Three random numbers: {rand():f}, {rand():f}, {rand():}.", # template
-            ('Three random numbers: ', ', ', ', ', '.'),    # strings
-            ('rand():f', 'rand():f', 'rand():f'),           # expressions
-            (rand(), rand(), rand())                        # values
-        )
+Additional Topics
+=================
 
 
 Safety
 -----------
 
 In this section we will describe the safety situation and precautions taken
-in support of expression-strings.
+in support of format-strings.
 
-#. Only string literals shall be considered here,
+#. Only string literals have been considered for format-strings,
    not variables to be taken as input or passed around,
    making external attacks difficult to accomplish.
 
-   * ``str.format()`` `already handles`_ this use-case.
-   * Direct instantiation of the ExpressionString object with non-literal input
-     shall not be allowed. (Practicality?)
+   ``str.format()`` and alternatives `already handle`_ this use-case.
 
 #. Neither ``locals()`` nor ``globals()`` are necessary nor used during the
    transformation,
@@ -622,37 +482,72 @@ in support of expression-strings.
 #. To eliminate complexity as well as ``RuntimeError`` (s) due to recursion
    depth, recursive interpolation is not supported.
 
-#. Restricted characters or expression classes?, such as ``=`` for assignment.
-
 However, mistakes or malicious code could be missed inside string literals.
 Though that can be said of code in general,
 that these expressions are inside strings means they are a bit more likely
 to be obscured.
 
-.. _already handles: https://mail.python.org/pipermail/python-ideas/2015-July/034729.html
+.. _already handle: https://mail.python.org/pipermail/python-ideas/2015-July/034729.html
 
 
 
-Mitigation via tools
+Mitigation via Tools
 ''''''''''''''''''''
 
 The idea is that tools or linters such as pyflakes, pylint, or Pycharm,
-could check inside strings for constructs that exceed project policy.
-As this is a common task with languages these days,
-tools won't have to implement this feature solely for Python,
+may check inside strings with expressions and mark them up appropriately.
+As this is a common task with programming languages today,
+multi-language tools won't have to implement this feature solely for Python,
 significantly shortening time to implementation.
 
-Additionally the Python interpreter could check(?) and warn with appropriate
-command-line parameters passed.
+Farther in the future,
+strings might also be checked for constructs that exceed the safety policy of
+a project.
+
+
+Style Guide/Precautions
+-----------------------
+
+As arbitrary expressions may accomplish anything a Python expression is
+able to,
+it is highly recommended to avoid constructs inside format-strings that could
+cause side effects.
+
+Further guidelines may be written once usage patterns and true problems are
+known.
+
+
+Reference Implementation(s)
+---------------------------
+
+The `say module on PyPI`_ implements string interpolation as described here
+with the small burden of a callable interface::
+
+    > pip install say
+
+    from say import say
+    nums = list(range(4))
+    say("Nums has {len(nums)} items: {nums}")
+
+A Python implementation of Ruby interpolation `is also available`_.
+It uses the codecs module to do its work::
+
+    > pip install interpy
+
+    # coding: interpy
+    location = 'World'
+    print("Hello #{location}.")
+
+.. _say module on PyPI: https://pypi.python.org/pypi/say/
+.. _is also available: https://github.com/syrusakbary/interpy
 
 
 Backwards Compatibility
 -----------------------
 
-By using existing syntax and avoiding use of current or historical features,
-expression-strings (and any associated sub-features),
-were designed so as to not interfere with existing code and is not expected
-to cause any issues.
+By using existing syntax and avoiding current or historical features,
+format strings were designed so as to not interfere with existing code and are
+not expected to cause any issues.
 
 
 Postponed Ideas
@@ -666,20 +561,12 @@ Though it was highly desired to integrate internationalization support,
 the finer details diverge at almost every point,
 making a common solution unlikely: [15]_
 
-* Use-cases
-* Compile and run-time tasks
-* Interpolation Syntax
+* Use-cases differ
+* Compile vs. run-time tasks
+* Interpolation syntax needs
 * Intended audience
 * Security policy
 
-Rather than try to fit a "square peg in a round hole,"
-this PEP attempts to allow internationalization to be supported in the future
-by not preventing it.
-In this proposal,
-expression-string inputs are saved for inspection and re-rendering at a later
-time,
-allowing for their use by an external library of any sort.
-
 
 Rejected Ideas
 --------------
@@ -687,18 +574,25 @@ Rejected Ideas
 Restricting Syntax to ``str.format()`` Only
 '''''''''''''''''''''''''''''''''''''''''''
 
-This was deemed not enough of a solution to the problem.
-It can be seen in the `Implementations in Other Languages`_ section that the
-developer community at large tends to agree.
+The common `arguments against`_ support of arbitrary expresssions were:
 
-The common `arguments against`_ arbitrary expresssions were:
-
-#. YAGNI, "You ain't gonna need it."
-#. The change is not congruent with historical Python conservatism.
+#. `YAGNI`_, "You aren't gonna need it."
+#. The feature is not congruent with historical Python conservatism.
 #. Postpone - can implement in a future version if need is demonstrated.
 
+.. _YAGNI: https://en.wikipedia.org/wiki/You_aren't_gonna_need_it
 .. _arguments against: https://mail.python.org/pipermail/python-ideas/2015-August/034913.html
 
+Support of only ``str.format()`` syntax however,
+was deemed not enough of a solution to the problem.
+Often a simple length or increment of an object, for example,
+is desired before printing.
+
+It can be seen in the `Implementations in Other Languages`_ section that the
+developer community at large tends to agree.
+String interpolation with arbitrary expresssions is becoming an industry
+standard in modern languages due to its utility.
+
 
 Additional/Custom String-Prefixes
 '''''''''''''''''''''''''''''''''
@@ -720,7 +614,7 @@ this was thought to create too much uncertainty of when and where string
 expressions could be used safely or not.
 The concept was also difficult to describe to others. [12]_
 
-Always consider expression-string variables to be unescaped,
+Always consider format string variables to be unescaped,
 unless the developer has explicitly escaped them.
 
 
@@ -735,33 +629,13 @@ and looking too much like bash/perl,
 which could encourage bad habits. [13]_
 
 
-Reference Implementation(s)
-===========================
-
-An expression-string implementation is currently attached to PEP 498,
-under the ``f''`` prefix,
-and may be available in nightly builds.
-
-A Python implementation of Ruby interpolation `is also available`_,
-which is similar to this proposal.
-It uses the codecs module to do its work::
-
-    > pip install interpy
-
-    # coding: interpy
-    location = 'World'
-    print("Hello #{location}.")
-
-.. _is also available: https://github.com/syrusakbary/interpy
-
-
 Acknowledgements
 ================
 
-* Eric V. Smith for providing invaluable implementation work and design
-  opinions, helping to focus this PEP.
-* Others on the python-ideas mailing list for rejecting the craziest of ideas,
-  also helping to achieve focus.
+* Eric V. Smith for the authoring and implementation of PEP 498.
+* Everyone on the python-ideas mailing list for rejecting the various crazy
+  ideas that came up,
+  helping to keep the final design in focus.
 
 
 References
@@ -771,7 +645,6 @@ References
    (https://mail.python.org/pipermail/python-ideas/2015-July/034659.html)
 
 
-
 .. [2] Briefer String Format
    (https://mail.python.org/pipermail/python-ideas/2015-July/034669.html)