PEP: 3126
Title: Remove Implicit String Concatenation
Version: $Revision$
Last-Modified: $Date$
Author: Jim J. Jewett <JimJJewett@gmail.com>,
        Raymond Hettinger <python@rcn.com>
Status: Rejected
Type: Standards Track
Content-Type: text/x-rst
Created: 29-Apr-2007
Post-History: 29-Apr-2007, 30-Apr-2007, 07-May-2007


Rejection Notice
================

This PEP is rejected.  There wasn't enough support in favor, the
feature to be removed isn't all that harmful, and there are some use
cases that would become harder.

Abstract
========

Python inherited many of its parsing rules from C.  While this has
been generally useful, there are some individual rules which are less
useful for Python, and should be eliminated.

This PEP proposes to eliminate implicit string concatenation based
only on the adjacency of literals.

Instead of::

    "abc" "def" == "abcdef"

authors will need to be explicit, and either add the strings::

    "abc" + "def" == "abcdef"

or join them::

    "".join(["abc", "def"]) == "abcdef"

Motivation
==========

One goal for Python 3000 should be to simplify the language by
removing unnecessary features.  Implicit string concatenation should
be dropped in favor of existing techniques.  This will simplify the
grammar and simplify a user's mental picture of Python.  The latter is
important for letting the language "fit in your head".  A large group
of current users do not even know about implicit concatenation.  Of
those who do know about it, a large portion never use it or habitually
avoid it.  Of those who both know about it and use it, very few could
state with confidence the implicit operator precedence and under what
circumstances it is computed when the definition is compiled versus
when it is run.

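
The precedence and timing questions raised above can be probed
directly.  The following is only a sketch, written in Python 3 syntax
and using the standard ``ast`` module; the helper name ``node_kind`` is
made up for illustration.  It suggests that adjacent literals are
merged while the source is parsed, before any operator is applied,
whereas the explicit ``+`` form is parsed as an ordinary binary
operation.

```python
import ast


def node_kind(expression):
    """Name the AST node type that a one-expression source string parses to."""
    tree = ast.parse(expression, mode="eval")
    return type(tree.body).__name__


# Adjacent literals reach the AST as a single constant node ...
implicit_kind = node_kind('"abc" "def"')
# ... while the explicit form is still an operation at that stage.
explicit_kind = node_kind('"abc" + "def"')

# Adjacency therefore binds tighter than any operator, including *.
merged_then_repeated = "a" "b" * 2     # the merged "ab" is repeated
repeated_then_added = "a" + "b" * 2    # only "b" is repeated

print(implicit_kind, explicit_kind)
print(merged_then_repeated, repeated_then_added)
```

The ``*`` comparison is the kind of precedence detail that few users
could predict with confidence.
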
History or Future
-----------------

Many Python parsing rules are intentionally compatible with C.  This
is a useful default, but Special Cases need to be justified based on
their utility in Python.  We should no longer assume that Python
programmers will also be familiar with C, so compatibility between
languages should be treated as a tie-breaker, rather than a
justification.

In C, implicit concatenation is the only way to join strings without
using a (run-time) function call to store into a variable.  In Python,
the strings can be joined (and still recognized as immutable) using
more standard Python idioms, such as ``+`` or ``"".join``.

Problem
-------

Implicit string concatenation leads to tuples and lists which are
shorter than they appear; this in turn can lead to confusing, or even
silent, errors.  For example, given a function which accepts a format
string followed by a variable number of arguments::

    def f(fmt, *args):
        print fmt % args

This looks like a valid call, but isn't::

    >>> f("User %s got a message %s",
          "Bob"
          "Time for dinner")

    Traceback (most recent call last):
      File "<pyshell#8>", line 2, in <module>
        "Bob"
      File "<pyshell#3>", line 2, in f
        print fmt % args
    TypeError: not enough arguments for format string


Calls to a function which offers a default value for some of its
parameters can silently do the wrong thing::

    def g(arg1, arg2=None):
        ...

    # silently transformed into the possibly very different
    # g("arg1 on this linearg2 on this line", None)
    g("arg1 on this line"
      "arg2 on this line")

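
Both failure modes can be reproduced in a few lines.  This is a sketch
in Python 3 syntax, not the code from the session quoted above; the
names ``format_message`` and ``describe`` are hypothetical stand-ins
for ``f`` and ``g``.

```python
def format_message(fmt, *args):
    """Python 3 analogue of ``f``: return the formatted string."""
    return fmt % args


def describe(arg1, arg2=None):
    """Analogue of ``g``: report which arguments actually arrived."""
    return (arg1, arg2)


# The missing comma after "Bob" merges two intended arguments into one,
# so the format string is short one value and the call raises TypeError.
error = None
try:
    format_message("User %s got a message %s",
                   "Bob"
                   "Time for dinner")
except TypeError as exc:
    error = str(exc)

# With a default for the second parameter there is no error at all:
# the call succeeds with one merged argument and arg2 left as None.
received = describe("arg1 on this line"
                    "arg2 on this line")

print(error)
print(received)
```

The first mistake is at least reported, although far from the line
that caused it; the second goes unnoticed unless the result is
inspected.
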
To quote Jason Orendorff [#Orendorff]_:

    Oh.  I just realized this happens a lot out here.  Where I work,
    we use scons, and each SConscript has a long list of filenames::

        sourceFiles = [
            'foo.c'
            'bar.c',
            #...many lines omitted...
            'q1000x.c']

    It's a common mistake to leave off a comma, and then scons
    complains that it can't find 'foo.cbar.c'.  This is pretty
    bewildering behavior even if you *are* a Python programmer,
    and not everyone here is.

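
A reduced version of that list shows the effect on its length.  The
file names are the ones from the quote plus a made-up ``baz.c``; the
sketch only illustrates the mechanism and is not taken from any real
SConscript.

```python
source_files = [
    'foo.c'      # missing comma: merged with the next literal
    'bar.c',
    'baz.c',
]

# Three names appear in the source, but the list holds only two items.
print(len(source_files))
print(source_files[0])
```

No error is raised at the point of the mistake; the merged name only
surfaces later, when something tries to open the file.
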
Solution
========

In Python, strings are objects and they support the ``+`` operator
(through the ``__add__`` method), so it is possible to write::

    "abc" + "def"

Because these are literals, this addition can still be optimized away
by the compiler; the CPython compiler already does so.
[#rcn-constantfold]_

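
One way to observe this folding is to look at the constants stored in
a compiled code object.  The sketch below assumes CPython and Python 3
syntax; ``co_consts`` is an implementation detail, other
implementations may behave differently, and the helper name
``stored_constants`` is invented for this example.

```python
def stored_constants(expression):
    """Compile one expression and return the constants kept in its code object."""
    code = compile(expression, "<sketch>", "eval")
    return code.co_consts


explicit = stored_constants('"abc" + "def"')
implicit = stored_constants('"abc" "def"')

# On CPython both tuples are expected to contain the joined string,
# meaning no addition is left to perform at run time.
print(explicit)
print(implicit)
```
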
Other existing alternatives include multiline (triple-quoted) strings,
and the join method::

    """This string
       extends across
       multiple lines, but you may want to use something like
       textwrap.dedent
       to clear out the leading spaces
       and/or reformat.
    """


    >>> "".join(["empty", "string", "joiner"]) == "emptystringjoiner"
    True

    >>> " ".join(["space", "string", "joiner"]) == "space string joiner"
    True

    >>> "\n".join(["multiple", "lines"]) == "multiple\nlines" == (
    """multiple
    lines""")
    True

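
The ``textwrap.dedent`` suggestion can be sketched as follows.  The
function name ``usage_text`` and the message are placeholders; the
point is only that the literal can be indented with the surrounding
code and cleaned up afterwards.

```python
import textwrap


def usage_text():
    """Return a multi-line message written as an indented triple-quoted literal."""
    raw = """\
        This string
        extends across
        multiple lines.
        """
    # dedent removes the whitespace prefix shared by every line.
    return textwrap.dedent(raw)


print(usage_text())
```

Unlike implicit concatenation, the cleanup happens at run time, which
may matter for strings built in a hot path.
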
Concerns
========


Operator Precedence
-------------------

Guido indicated [#rcn-constantfold]_ that this change should be
handled by a PEP, because there were a few edge cases with other
string operators, such as ``%``.  (Assuming that ``str %`` stays -- it
may be eliminated in favor of :pep:`3101` -- Advanced String
Formatting.  [#elimpercent]_)

The resolution is to use parentheses to enforce precedence -- the same
solution that can be used today::

    # Clearest, works today, continues to work, optimization is
    # already possible.
    ("abc %s def" + "ghi") % var

    # Already works today; precedence makes the optimization more
    # difficult to recognize, but does not change the semantics.
    "abc" + "def %s ghi" % var

as opposed to::

    # Already fails because modulus (%) is higher precedence than
    # addition (+)
    ("abc %s def" + "ghi" % var)

    # Works today only because adjacency is higher precedence than
    # modulus.  This will no longer be available.
    "abc %s" "def" % var

    # So the 2-to-3 translator can automatically replace it with the
    # (already valid):
    ("abc %s" + "def") % var

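
The three cases above can be checked with a concrete value.  This is a
sketch in Python 3 syntax in which ``var`` is simply assumed to be a
one-character string.

```python
var = "X"

# Adjacency binds tighter than %, so the whole merged literal is formatted.
implicit = "abc %s" "def" % var

# The parenthesized explicit form gives the same result.
explicit = ("abc %s" + "def") % var

# Without parentheses, % applies to "def" alone, which has no placeholder
# to consume the argument, so the expression raises TypeError.
unparenthesized_error = None
try:
    "abc %s" + "def" % var
except TypeError as exc:
    unparenthesized_error = str(exc)

print(implicit, explicit)
print(unparenthesized_error)
```
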
Long Commands
-------------

    ... build up (what I consider to be) readable SQL queries [#skipSQL]_::

        rows = self.executesql("select cities.city, state, country"
                               " from cities, venues, events, addresses"
                               " where cities.city like %s"
                               " and events.active = 1"
                               " and venues.address = addresses.id"
                               " and addresses.city = cities.id"
                               " and events.venue = venues.id",
                               (city,))

Alternatives again include triple-quoted strings, ``+``, and ``.join``::

    query = """select cities.city, state, country
               from cities, venues, events, addresses
               where cities.city like %s
               and events.active = 1
               and venues.address = addresses.id
               and addresses.city = cities.id
               and events.venue = venues.id"""

    query = ("select cities.city, state, country"
             + " from cities, venues, events, addresses"
             + " where cities.city like %s"
             + " and events.active = 1"
             + " and venues.address = addresses.id"
             + " and addresses.city = cities.id"
             + " and events.venue = venues.id"
             )

    query = "\n".join(["select cities.city, state, country",
                       " from cities, venues, events, addresses",
                       " where cities.city like %s",
                       " and events.active = 1",
                       " and venues.address = addresses.id",
                       " and addresses.city = cities.id",
                       " and events.venue = venues.id"])

    # And yes, you *could* inline any of the above querystrings
    # the same way the original was inlined.
    rows = self.executesql(query, (city,))

Regular Expressions
-------------------

Complex regular expressions are sometimes stated in terms of several
implicitly concatenated strings with each regex component on a
different line and followed by a comment.  The plus operator can be
inserted here but it does make the regex harder to read.  One
alternative is to use the ``re.VERBOSE`` option.  Another alternative
is to build up the regex with a series of ``+=`` lines::

    # Existing idiom which relies on implicit concatenation
    r = ('a{20}'  # Twenty A's
         'b{5}'   # Followed by Five B's
         )

    # Mechanical replacement
    r = ('a{20}'  +# Twenty A's
         'b{5}'   # Followed by Five B's
         )

    # already works today
    r = '''a{20}  # Twenty A's
           b{5}   # Followed by Five B's
        '''                 # Compiled with the re.VERBOSE flag

    # already works today
    r = 'a{20}'   # Twenty A's
    r += 'b{5}'   # Followed by Five B's

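
As a quick check that these spellings describe the same pattern, the
sketch below compiles each one and matches it against a sample string.
It uses Python 3 syntax; the variable names are chosen for the example
only.

```python
import re

# Existing idiom: adjacent literals, one commented component per line.
implicit = re.compile('a{20}'   # twenty a's
                      'b{5}')   # followed by five b's

# re.VERBOSE ignores unescaped whitespace and treats "#" as a comment start.
verbose = re.compile(r'''
    a{20}   # twenty a's
    b{5}    # followed by five b's
    ''', re.VERBOSE)

# Incremental construction with +=.
pattern = 'a{20}'    # twenty a's
pattern += 'b{5}'    # followed by five b's
incremental = re.compile(pattern)

sample = 'a' * 20 + 'b' * 5
print(all(regex.fullmatch(sample) for regex in (implicit, verbose, incremental)))
```

One caveat with ``re.VERBOSE``: a pattern that needs a literal space or
``#`` has to escape it or place it in a character class.
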
Internationalization
--------------------

Some internationalization tools -- notably xgettext -- have already
been special-cased for implicit concatenation, but not for Python's
explicit concatenation. [#barryi8]_

These tools will fail to extract the (already legal)::

    _("some string" +
      " and more of it")

but often have a special case for::

    _("some string"
      " and more of it")

It should also be possible to just use an overly long line (xgettext
limits messages to 2048 characters [#xgettext2048]_, which is less
than Python's enforced limit) or triple-quoted strings, but these
solutions sacrifice some readability in the code::

    # Lines over a certain length are unpleasant.
    _("some string and more of it")

    # Changing whitespace is not ideal.
    _("""Some string
         and more of it""")
    _("""Some string
    and more of it""")
    _("Some string \
    and more of it")

I do not see a good short-term resolution for this.

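
The whitespace concern can be made concrete by comparing the message
text each spelling produces.  In this sketch ``_`` is a hypothetical
stand-in that returns its argument unchanged; a real gettext setup
would look the string up in a catalog instead.

```python
def _(message):
    """Stand-in for a gettext-style lookup; it just returns the message."""
    return message


# Existing idiom, shown for comparison.
joined = _("Some string"
           " and more of it")

# Triple-quoted, continuation indented to line up with the code.
indented = _("""Some string
             and more of it""")

# Triple-quoted, continuation flush left.
flush_left = _("""Some string
and more of it""")

# Backslash continuation inside the literal, continuation flush left.
continued = _("Some string \
and more of it")

print(repr(joined))
print(repr(indented))
print(repr(flush_left))
print(repr(continued))
```

Only the backslash form reproduces the original message, and only as
long as the continuation line starts in column zero; the triple-quoted
forms introduce a newline, so existing translations keyed on the old
text would no longer match.
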
Transition
==========

The proposed new constructs are already legal in current Python, and
can be used immediately.

The 2 to 3 translator can be made to mechanically change::

    "str1" "str2"
    ("line1"  #comment
     "line2")

into::

    ("str1" + "str2")
    ("line1"   +#comment
     "line2")

If users want to use one of the other idioms, they can; as these
idioms are all already legal in Python 2, the edits can be made
to the original source, rather than patching up the translator.

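
Finding the places that need such a change is straightforward at the
token level.  The following is a rough sketch built on the standard
``tokenize`` module in Python 3; it is not the 2 to 3 translator's
implementation, the function name ``find_adjacent_literals`` is made
up, and it only reports locations rather than rewriting them.  It also
assumes plain string literals; f-strings are tokenized differently in
recent Python versions and are not handled.

```python
import io
import tokenize

# Tokens that can sit between two literals without separating them.
SKIPPABLE = {tokenize.NL, tokenize.COMMENT}


def find_adjacent_literals(source):
    """Return the (row, column) of every string literal that directly
    follows another string literal in ``source``."""
    hits = []
    previous_was_string = False
    for token in tokenize.generate_tokens(io.StringIO(source).readline):
        if token.type in SKIPPABLE:
            continue
        if token.type == tokenize.STRING and previous_was_string:
            hits.append(token.start)
        previous_was_string = token.type == tokenize.STRING
    return hits


sample = "\n".join([
    'a = "str1" "str2"',
    'b = ("line1"  # comment',
    '     "line2")',
    'c = "str1" + "str2"',
    '',
])
hits = find_adjacent_literals(sample)
print(hits)
```

Each reported position marks the second literal of a pair, which is
where a translator would need to insert ``+`` (and parentheses where
the surrounding expression requires them).
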
Open Issues
===========

Is there a better way to support external text extraction tools, or at
least ``xgettext`` [#gettext]_ in particular?

References
==========

.. [#Orendorff] Implicit String Concatenation, Orendorff
   https://mail.python.org/pipermail/python-ideas/2007-April/000397.html

.. [#rcn-constantfold] Reminder: Py3k PEPs due by April, Hettinger,
   van Rossum
   https://mail.python.org/pipermail/python-3000/2007-April/006563.html

.. [#elimpercent] ps to question Re: Need help completing ABC pep,
   van Rossum
   https://mail.python.org/pipermail/python-3000/2007-April/006737.html

.. [#skipSQL] (email Subject) PEP 30XZ: Simplified Parsing, Skip,
   https://mail.python.org/pipermail/python-3000/2007-May/007261.html

.. [#barryi8] (email Subject) PEP 30XZ: Simplified Parsing
   https://mail.python.org/pipermail/python-3000/2007-May/007305.html

.. [#gettext] GNU gettext manual
   http://www.gnu.org/software/gettext/

.. [#xgettext2048] Unix man page for xgettext -- Notes section
   http://www.scit.wlv.ac.uk/cgi-bin/mansec?1+xgettext

Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: