python-peps/pep-0502.txt

728 lines
23 KiB
Plaintext
Raw Normal View History

PEP: 502
2015-09-14 21:37:22 -04:00
Title: String Interpolation - Extended Discussion
Version: $Revision$
Last-Modified: $Date$
Author: Mike G. Miller
2019-03-27 18:24:28 -04:00
Status: Rejected
2015-09-14 21:37:22 -04:00
Type: Informational
Content-Type: text/x-rst
Created: 10-Aug-2015
Python-Version: 3.6
Abstract
========
:pep:`498`: *Literal String Interpolation*, which proposed "formatted strings" was
2015-09-14 21:37:22 -04:00
accepted September 9th, 2015.
Additional background and rationale given during its design phase is detailed
below.
To recap that PEP,
a string prefix was introduced that marks the string as a template to be
rendered.
These formatted strings may contain one or more expressions
built on `the existing syntax`_ of ``str.format()``.
The formatted string expands at compile-time into a conventional string format
operation,
with the given expressions from its text extracted and passed instead as
positional arguments.
At runtime,
2015-09-14 21:37:22 -04:00
the resulting expressions are evaluated to render a string to given
specifications::
>>> location = 'World'
2015-09-14 21:37:22 -04:00
>>> f'Hello, {location} !' # new prefix: f''
'Hello, World !' # interpolated result
2015-09-14 21:37:22 -04:00
Format-strings may be thought of as merely syntactic sugar to simplify traditional
calls to ``str.format()``.
2015-09-14 21:37:22 -04:00
.. _the existing syntax: https://docs.python.org/3/library/string.html#format-string-syntax
2019-03-27 18:24:28 -04:00
PEP Status
==========
This PEP was rejected based on its using an opinion-based tone rather than a factual one.
This PEP was also deemed not critical as :pep:`498` was already written and should be the place
2019-03-27 18:24:28 -04:00
to house design decision details.
Motivation
==========
Though string formatting and manipulation features are plentiful in Python,
one area where it falls short
is the lack of a convenient string interpolation syntax.
In comparison to other dynamic scripting languages
with similar use cases,
the amount of code necessary to build similar strings is substantially higher,
while at times offering lower readability due to verbosity, dense syntax,
2015-09-14 21:37:22 -04:00
or identifier duplication.
These difficulties are described at moderate length in the original
`post to python-ideas`_
that started the snowball (that became :pep:`498`) rolling. [1]_
Furthermore, replacement of the print statement with the more consistent print
function of Python 3 (:pep:`3105`) has added one additional minor burden,
an additional set of parentheses to type and read.
2015-09-14 21:37:22 -04:00
Combined with the verbosity of current string formatting solutions,
this puts an otherwise simple language at an unfortunate disadvantage to its
peers::
echo "Hello, user: $user, id: $id, on host: $hostname" # bash
say "Hello, user: $user, id: $id, on host: $hostname"; # perl
puts "Hello, user: #{user}, id: #{id}, on host: #{hostname}\n" # ruby
# 80 ch -->|
# Python 3, str.format with named parameters
print('Hello, user: {user}, id: {id}, on host: {hostname}'.format(**locals()))
2015-09-14 21:37:22 -04:00
# Python 3, worst case
print('Hello, user: {user}, id: {id}, on host: {hostname}'.format(user=user,
id=id,
hostname=
hostname))
In Python, the formatting and printing of a string with multiple variables in a
single line of code of standard width is noticeably harder and more verbose,
2015-09-14 21:37:22 -04:00
with indentation exacerbating the issue.
For use cases such as smaller projects, systems programming,
shell script replacements, and even one-liners,
where message formatting complexity has yet to be encapsulated,
this verbosity has likely lead a significant number of developers and
administrators to choose other languages over the years.
2015-09-14 21:37:22 -04:00
.. _post to python-ideas: https://mail.python.org/pipermail/python-ideas/2015-July/034659.html
Rationale
=========
Goals
-------------
2015-09-14 21:37:22 -04:00
The design goals of format strings are as follows:
#. Eliminate need to pass variables manually.
#. Eliminate repetition of identifiers and redundant parentheses.
#. Reduce awkward syntax, punctuation characters, and visual noise.
#. Improve readability and eliminate mismatch errors,
by preferring named parameters to positional arguments.
#. Avoid need for ``locals()`` and ``globals()`` usage,
instead parsing the given string for named parameters,
then passing them automatically. [2]_ [3]_
Limitations
-------------
In contrast to other languages that take design cues from Unix and its
shells,
and in common with Javascript,
Python specified both single (``'``) and double (``"``) ASCII quote
characters to enclose strings.
It is not reasonable to choose one of them now to enable interpolation,
while leaving the other for uninterpolated strings.
2015-09-14 21:37:22 -04:00
Other characters,
such as the "Backtick" (or grave accent `````) are also
`constrained by history`_
as a shortcut for ``repr()``.
This leaves a few remaining options for the design of such a feature:
* An operator, as in printf-style string formatting via ``%``.
* A class, such as ``string.Template()``.
2015-09-14 21:37:22 -04:00
* A method or function, such as ``str.format()``.
* New syntax, or
* A new string prefix marker, such as the well-known ``r''`` or ``u''``.
2015-09-14 21:37:22 -04:00
The first three options above are mature.
Each has specific use cases and drawbacks,
yet also suffer from the verbosity and visual noise mentioned previously.
2015-09-14 21:37:22 -04:00
All options are discussed in the next sections.
.. _constrained by history: https://mail.python.org/pipermail/python-ideas/2007-January/000054.html
2015-09-14 21:37:22 -04:00
Background
-------------
2015-09-14 21:37:22 -04:00
Formatted strings build on several existing techniques and proposals and what
we've collectively learned from them.
2015-09-14 21:37:22 -04:00
In keeping with the design goals of readability and error-prevention,
the following examples therefore use named,
not positional arguments.
Let's assume we have the following dictionary,
and would like to print out its items as an informative string for end users::
>>> params = {'user': 'nobody', 'id': 9, 'hostname': 'darkstar'}
2015-09-14 21:37:22 -04:00
Printf-style formatting, via operator
'''''''''''''''''''''''''''''''''''''
This `venerable technique`_ continues to have its uses,
such as with byte-based protocols,
simplicity in simple cases,
and familiarity to many programmers::
>>> 'Hello, user: %(user)s, id: %(id)s, on host: %(hostname)s' % params
'Hello, user: nobody, id: 9, on host: darkstar'
In this form, considering the prerequisite dictionary creation,
the technique is verbose, a tad noisy,
2015-09-14 21:37:22 -04:00
yet relatively readable.
Additional issues are that an operator can only take one argument besides the
original string,
meaning multiple parameters must be passed in a tuple or dictionary.
Also, it is relatively easy to make an error in the number of arguments passed,
the expected type,
have a missing key,
or forget the trailing type, e.g. (``s`` or ``d``).
.. _venerable technique: https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting
2015-09-14 21:37:22 -04:00
string.Template Class
'''''''''''''''''''''
The ``string.Template`` `class from`_ :pep:`292`
(Simpler String Substitutions)
is a purposely simplified design,
using familiar shell interpolation syntax,
with `safe-substitution feature`_,
that finds its main use cases in shell and internationalization tools::
Template('Hello, user: $user, id: ${id}, on host: $hostname').substitute(params)
2015-09-14 21:37:22 -04:00
While also verbose, the string itself is readable.
Though functionality is limited,
it meets its requirements well.
It isn't powerful enough for many cases,
and that helps keep inexperienced users out of trouble,
as well as avoiding issues with moderately-trusted input (i18n) from
third-parties.
It unfortunately takes enough code to discourage its use for ad-hoc string
interpolation,
unless encapsulated in a `convenience library`_ such as ``flufl.i18n``.
.. _class from: https://docs.python.org/3/library/string.html#template-strings
.. _safe-substitution feature: https://docs.python.org/3/library/string.html#string.Template.safe_substitute
.. _convenience library: http://pythonhosted.org/flufl.i18n/
PEP 215 - String Interpolation
''''''''''''''''''''''''''''''
:pep:`215` was a former proposal of which this one shares a lot in common.
Apparently, the world was not ready for it at the time,
but considering recent support in a number of other languages,
its day may have come.
The large number of dollar sign (``$``) characters it included may have
led it to resemble Python's arch-nemesis Perl,
and likely contributed to the PEP's lack of acceptance.
It was superseded by the following proposal.
2015-09-14 21:37:22 -04:00
str.format() Method
'''''''''''''''''''
The ``str.format()`` `syntax of`_ :pep:`3101` is the most recent and modern of the
existing options.
It is also more powerful and usually easier to read than the others.
It avoids many of the drawbacks and limits of the previous techniques.
However, due to its necessary function call and parameter passing,
it runs from verbose to very verbose in various situations with
string literals::
>>> 'Hello, user: {user}, id: {id}, on host: {hostname}'.format(**params)
'Hello, user: nobody, id: 9, on host: darkstar'
# when using keyword args, var name shortening sometimes needed to fit :/
>>> 'Hello, user: {user}, id: {id}, on host: {host}'.format(user=user,
id=id,
host=hostname)
'Hello, user: nobody, id: 9, on host: darkstar'
2015-09-14 21:37:22 -04:00
The verbosity of the method-based approach is illustrated here.
.. _syntax of: https://docs.python.org/3/library/string.html#format-string-syntax
PEP 498 -- Literal String Formatting
''''''''''''''''''''''''''''''''''''
:pep:`498` defines and discusses format strings,
2015-09-14 21:37:22 -04:00
as also described in the `Abstract`_ above.
2015-09-14 21:37:22 -04:00
It also, somewhat controversially to those first exposed,
introduces the idea that format-strings shall be augmented with support for
arbitrary expressions.
This is discussed further in the
Restricting Syntax section under
`Rejected Ideas`_.
PEP 501 -- Translation ready string interpolation
'''''''''''''''''''''''''''''''''''''''''''''''''
The complimentary :pep:`501` brings internationalization into the discussion as a
2015-09-14 21:37:22 -04:00
first-class concern, with its proposal of the i-prefix,
``string.Template`` syntax integration compatible with ES6 (Javascript),
deferred rendering,
2015-09-14 21:37:22 -04:00
and an object return value.
Implementations in Other Languages
----------------------------------
String interpolation is now well supported by various programming languages
used in multiple industries,
and is converging into a standard of sorts.
It is centered around ``str.format()`` style syntax in minor variations,
with the addition of arbitrary expressions to expand utility.
In the `Motivation`_ section it was shown how convenient interpolation syntax
existed in Bash, Perl, and Ruby.
Let's take a look at their expression support.
Bash
''''
Bash supports a number of arbitrary, even recursive constructs inside strings::
> echo "user: $USER, id: $((id + 6)) on host: $(echo is $(hostname))"
user: nobody, id: 15 on host: is darkstar
* Explicit interpolation within double quotes.
* Direct environment variable access supported.
* Arbitrary expressions are supported. [4]_
* External process execution and output capture supported. [5]_
* Recursive expressions are supported.
Perl
''''
Perl also has arbitrary expression constructs, perhaps not as well known::
say "I have @{[$id + 6]} guanacos."; # lists
say "I have ${\($id + 6)} guanacos."; # scalars
say "Hello { @names.join(', ') } how are you?"; # Perl 6 version
* Explicit interpolation within double quotes.
* Arbitrary expressions are supported. [6]_ [7]_
Ruby
''''
Ruby allows arbitrary expressions in its interpolated strings::
puts "One plus one is two: #{1 + 1}\n"
* Explicit interpolation within double quotes.
* Arbitrary expressions are supported. [8]_ [9]_
* Possible to change delimiter chars with ``%``.
* See the Reference Implementation(s) section for an implementation in Python.
Others
''''''
Let's look at some less-similar modern languages recently implementing string
interpolation.
Scala
'''''
`Scala interpolation`_ is directed through string prefixes.
Each prefix has a different result::
s"Hello, $name ${1 + 1}" # arbitrary
f"$name%s is $height%2.2f meters tall" # printf-style
raw"a\nb" # raw, like r''
These prefixes may also be implemented by the user,
by extending Scala's ``StringContext`` class.
* Explicit interpolation within double quotes with literal prefix.
* User implemented prefixes supported.
* Arbitrary expressions are supported.
.. _Scala interpolation: http://docs.scala-lang.org/overviews/core/string-interpolation.html
ES6 (Javascript)
'''''''''''''''''''
Designers of `Template strings`_ faced the same issue as Python where single
and double quotes were taken.
Unlike Python however, "backticks" were not.
2015-09-14 21:37:22 -04:00
Despite `their issues`_,
they were chosen as part of the ECMAScript 2015 (ES6) standard::
console.log(`Fifteen is ${a + b} and\nnot ${2 * a + b}.`);
Custom prefixes are also supported by implementing a function the same name
as the tag::
function tag(strings, ...values) {
console.log(strings.raw[0]); // raw string is also available
return "Bazinga!";
}
tag`Hello ${ a + b } world ${ a * b}`;
* Explicit interpolation within backticks.
* User implemented prefixes supported.
* Arbitrary expressions are supported.
2015-09-14 21:37:22 -04:00
.. _their issues: https://mail.python.org/pipermail/python-ideas/2007-January/000054.html
.. _Template strings: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/template_strings
2015-09-14 21:37:22 -04:00
C#, Version 6
'''''''''''''
C# has a useful new `interpolation feature`_ as well,
with some ability to `customize interpolation`_ via the ``IFormattable``
interface::
$"{person.Name, 20} is {person.Age:D3} year{(p.Age == 1 ? "" : "s")} old.";
* Explicit interpolation with double quotes and ``$`` prefix.
* Custom interpolations are available.
* Arbitrary expressions are supported.
.. _interpolation feature: https://msdn.microsoft.com/en-us/library/Dn961160.aspx
.. _customize interpolation: http://www.thomaslevesque.com/2015/02/24/customizing-string-interpolation-in-c-6/
Apple's Swift
'''''''''''''
Arbitrary `interpolation under Swift`_ is available on all strings::
let multiplier = 3
let message = "\(multiplier) times 2.5 is \(Double(multiplier) * 2.5)"
// message is "3 times 2.5 is 7.5"
* Implicit interpolation with double quotes.
* Arbitrary expressions are supported.
* Cannot contain CR/LF.
.. _interpolation under Swift: https://developer.apple.com/library/ios/documentation/Swift/Conceptual/Swift_Programming_Language/StringsAndCharacters.html#//apple_ref/doc/uid/TP40014097-CH7-ID292
Additional examples
'''''''''''''''''''
2015-09-14 21:37:22 -04:00
A number of additional examples of string interpolation may be
`found at Wikipedia`_.
2015-09-14 21:37:22 -04:00
Now that background and history have been covered,
let's continue on for a solution.
2015-09-14 21:37:22 -04:00
.. _found at Wikipedia: https://en.wikipedia.org/wiki/String_interpolation#Examples
New Syntax
----------
This should be an option of last resort,
as every new syntax feature has a cost in terms of real-estate in a brain it
inhabits.
2015-09-14 21:37:22 -04:00
There is however one alternative left on our list of possibilities,
which follows.
New String Prefix
-----------------
2015-09-14 21:37:22 -04:00
Given the history of string formatting in Python and backwards-compatibility,
implementations in other languages,
2015-09-14 21:37:22 -04:00
avoidance of new syntax unless necessary,
an acceptable design is reached through elimination
rather than unique insight.
2015-09-14 21:37:22 -04:00
Therefore, marking interpolated string literals with a string prefix is chosen.
2015-09-14 21:37:22 -04:00
We also choose an expression syntax that reuses and builds on the strongest of
the existing choices,
2015-09-14 21:37:22 -04:00
``str.format()`` to avoid further duplication of functionality::
>>> location = 'World'
2015-09-14 21:37:22 -04:00
>>> f'Hello, {location} !' # new prefix: f''
'Hello, World !' # interpolated result
:pep:`498` -- Literal String Formatting, delves into the mechanics and
2015-09-14 21:37:22 -04:00
implementation of this design.
2015-09-14 21:37:22 -04:00
Additional Topics
=================
Safety
-----------
In this section we will describe the safety situation and precautions taken
2015-09-14 21:37:22 -04:00
in support of format-strings.
2015-09-14 21:37:22 -04:00
#. Only string literals have been considered for format-strings,
not variables to be taken as input or passed around,
making external attacks difficult to accomplish.
2015-09-14 21:37:22 -04:00
``str.format()`` and alternatives `already handle`_ this use-case.
#. Neither ``locals()`` nor ``globals()`` are necessary nor used during the
transformation,
avoiding leakage of information.
#. To eliminate complexity as well as ``RuntimeError`` (s) due to recursion
depth, recursive interpolation is not supported.
However,
mistakes or malicious code could be missed inside string literals.
Though that can be said of code in general,
that these expressions are inside strings means they are a bit more likely
to be obscured.
2015-09-14 21:37:22 -04:00
.. _already handle: https://mail.python.org/pipermail/python-ideas/2015-July/034729.html
2015-09-14 21:37:22 -04:00
Mitigation via Tools
''''''''''''''''''''
The idea is that tools or linters such as pyflakes, pylint, or Pycharm,
2015-09-14 21:37:22 -04:00
may check inside strings with expressions and mark them up appropriately.
As this is a common task with programming languages today,
multi-language tools won't have to implement this feature solely for Python,
significantly shortening time to implementation.
2015-09-14 21:37:22 -04:00
Farther in the future,
strings might also be checked for constructs that exceed the safety policy of
a project.
Style Guide/Precautions
-----------------------
As arbitrary expressions may accomplish anything a Python expression is
able to,
it is highly recommended to avoid constructs inside format-strings that could
cause side effects.
Further guidelines may be written once usage patterns and true problems are
known.
Reference Implementation(s)
---------------------------
The `say module on PyPI`_ implements string interpolation as described here
with the small burden of a callable interface::
pip install say
from say import say
nums = list(range(4))
say("Nums has {len(nums)} items: {nums}")
A Python implementation of Ruby interpolation `is also available`_.
It uses the codecs module to do its work::
pip install interpy
# coding: interpy
location = 'World'
print("Hello #{location}.")
.. _say module on PyPI: https://pypi.python.org/pypi/say/
.. _is also available: https://github.com/syrusakbary/interpy
Backwards Compatibility
-----------------------
2015-09-14 21:37:22 -04:00
By using existing syntax and avoiding current or historical features,
format strings were designed so as to not interfere with existing code and are
not expected to cause any issues.
Postponed Ideas
---------------
Internationalization
''''''''''''''''''''
Though it was highly desired to integrate internationalization support,
(see :pep:`501`),
the finer details diverge at almost every point,
making a common solution unlikely: [15]_
2015-09-14 21:37:22 -04:00
* Use-cases differ
* Compile vs. run-time tasks
* Interpolation syntax needs
* Intended audience
* Security policy
Rejected Ideas
--------------
Restricting Syntax to ``str.format()`` Only
'''''''''''''''''''''''''''''''''''''''''''
The common `arguments against`_ support of arbitrary expressions were:
2015-09-14 21:37:22 -04:00
#. `YAGNI`_, "You aren't gonna need it."
#. The feature is not congruent with historical Python conservatism.
#. Postpone - can implement in a future version if need is demonstrated.
2015-09-14 21:37:22 -04:00
.. _YAGNI: https://en.wikipedia.org/wiki/You_aren't_gonna_need_it
.. _arguments against: https://mail.python.org/pipermail/python-ideas/2015-August/034913.html
2015-09-14 21:37:22 -04:00
Support of only ``str.format()`` syntax however,
was deemed not enough of a solution to the problem.
Often a simple length or increment of an object, for example,
is desired before printing.
It can be seen in the `Implementations in Other Languages`_ section that the
developer community at large tends to agree.
String interpolation with arbitrary expressions is becoming an industry
2015-09-14 21:37:22 -04:00
standard in modern languages due to its utility.
Additional/Custom String-Prefixes
'''''''''''''''''''''''''''''''''
As seen in the `Implementations in Other Languages`_ section,
many modern languages have extensible string prefixes with a common interface.
This could be a way to generalize and reduce lines of code in common
situations.
Examples are found in ES6 (Javascript), Scala, Nim, and C#
(to a lesser extent).
This was rejected by the BDFL. [14]_
Automated Escaping of Input Variables
'''''''''''''''''''''''''''''''''''''
While helpful in some cases,
this was thought to create too much uncertainty of when and where string
expressions could be used safely or not.
The concept was also difficult to describe to others. [12]_
2015-09-14 21:37:22 -04:00
Always consider format string variables to be unescaped,
unless the developer has explicitly escaped them.
Environment Access and Command Substitution
'''''''''''''''''''''''''''''''''''''''''''
For systems programming and shell-script replacements,
it would be useful to handle environment variables and capture output of
commands directly in an expression string.
This was rejected as not important enough,
and looking too much like bash/perl,
which could encourage bad habits. [13]_
Acknowledgements
================
* Eric V. Smith for the authoring and implementation of :pep:`498`.
2015-09-14 21:37:22 -04:00
* Everyone on the python-ideas mailing list for rejecting the various crazy
ideas that came up,
helping to keep the final design in focus.
References
==========
.. [1] Briefer String Format
(https://mail.python.org/pipermail/python-ideas/2015-July/034659.html)
.. [2] Briefer String Format
(https://mail.python.org/pipermail/python-ideas/2015-July/034669.html)
.. [3] Briefer String Format
(https://mail.python.org/pipermail/python-ideas/2015-July/034701.html)
.. [4] Bash Docs
(http://www.tldp.org/LDP/abs/html/arithexp.html)
.. [5] Bash Docs
(http://www.tldp.org/LDP/abs/html/commandsub.html)
.. [6] Perl Cookbook
(http://docstore.mik.ua/orelly/perl/cookbook/ch01_11.htm)
.. [7] Perl Docs
(http://perl6maven.com/perl6-scalar-array-and-hash-interpolation)
.. [8] Ruby Docs
(http://ruby-doc.org/core-2.1.1/doc/syntax/literals_rdoc.html#label-Strings)
.. [9] Ruby Docs
(https://en.wikibooks.org/wiki/Ruby_Programming/Syntax/Literals#Interpolation)
.. [10] Python Str.Format Syntax
(https://docs.python.org/3/library/string.html#format-string-syntax)
.. [11] Python Format-Spec Mini Language
(https://docs.python.org/3/library/string.html#format-specification-mini-language)
.. [12] Escaping of Input Variables
(https://mail.python.org/pipermail/python-ideas/2015-August/035532.html)
.. [13] Environment Access and Command Substitution
(https://mail.python.org/pipermail/python-ideas/2015-August/035554.html)
.. [14] Extensible String Prefixes
(https://mail.python.org/pipermail/python-ideas/2015-August/035336.html)
.. [15] Literal String Formatting
(https://mail.python.org/pipermail/python-dev/2015-August/141289.html)
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: