Many updates based on great comments by python-dev'ers.

Barry Warsaw 2002-06-20 03:58:03 +00:00
parent af7e5a7e46
commit 5fcdd18392
1 changed file with 72 additions and 37 deletions

@@ -54,10 +54,9 @@ A Simpler Proposal
as defined in [2]. The first non-identifier character after
the $ character terminates this placeholder specification.
-3. ${identifier} is equivalent to $identifier and for clarity,
-this is the preferred form. It is required for when valid
-identifier characters follow the placeholder but are not part of
-the placeholder, e.g. "${noun}ification".
+3. ${identifier} is equivalent to $identifier. It is required for
+when valid identifier characters follow the placeholder but are
+not part of the placeholder, e.g. "${noun}ification".
No other characters have special meaning.
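The placeholder rules above can be sketched with a regular expression. This is a minimal illustration in modern Python, not the PEP's own code; the names PLACEHOLDER and find_placeholders are hypothetical, and the sketch does not handle any escape sequence for a literal $.

```python
import re

# Hypothetical pattern: matches $identifier or ${identifier}.
PLACEHOLDER = re.compile(r'\$(?:([_a-z]\w*)|\{([_a-z]\w*)\})', re.IGNORECASE)

def find_placeholders(template):
    """Return the identifier named by each placeholder, in order."""
    return [m.group(m.lastindex) for m in PLACEHOLDER.finditer(template)]

print(find_placeholders('$noun was here'))    # ['noun']
print(find_placeholders('${noun}ification'))  # ['noun']
print(find_placeholders('$nounification'))    # ['nounification']
```

The last two calls show why the braced form is required when identifier characters follow the placeholder: without braces, the whole run of identifier characters is taken as the name.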
@@ -77,8 +76,8 @@ A Simpler Proposal
which the .sub() method is executed. For example:
 def birth(self, name):
-    country = self.countryOfOrigin['name']
-    return '${name} was born in ${country}'
+    country = self.countryOfOrigin[name]
+    return '${name} was born in ${country}'.sub()
 birth('Guido')
@@ -87,6 +86,20 @@ A Simpler Proposal
'Guido was born in the Netherlands'
+Why `$' and Braces?
+The BDFL said it best: The $ means "substitution" in so many
+languages besides Perl that I wonder where you've been. [...]
+We're copying this from the shell.
+Security Issues
+Never use no-arg .sub() on strings that come from untrusted
+sources. It could be used to gain unauthorized information about
+variables in your local or global scope.
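The risk described under Security Issues can be illustrated with a small sketch. It assumes a hypothetical helper, substitute(), and imitates the no-argument form by passing the merged globals and locals explicitly; the names render_greeting and api_key are invented for the example.

```python
import re

PLACEHOLDER = re.compile(r'\$(?:([_a-z]\w*)|\{([_a-z]\w*)\})', re.IGNORECASE)

def substitute(template, mapping):
    """Replace each placeholder with str(mapping[name])."""
    return PLACEHOLDER.sub(lambda m: str(mapping[m.group(m.lastindex)]), template)

def render_greeting(untrusted_template, user):
    api_key = 'hunter2'  # a local the template author should never see
    # Roughly what a no-argument .sub() would expose: the whole scope.
    leaky = substitute(untrusted_template, {**globals(), **locals()})
    # Safer: pass only the names the template is allowed to use.
    try:
        safe = substitute(untrusted_template, {'user': user})
    except KeyError:
        safe = None  # the template asked for a name it may not read
    return leaky, safe

print(render_greeting('Hi $user, key=${api_key}', 'alice'))
# ('Hi alice, key=hunter2', None)
```

With an explicit mapping, the untrusted template can only name what the caller chose to hand over.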
Reference Implementation
Here's a Python 2.2-based reference implementation. Of course the
@@ -97,29 +110,16 @@ Reference Implementation
 import sys
 import re
-dre = re.compile(r'(\$\$)|\$([_a-z]\w*)|\$\{([_a-z]\w*)\}', re.I)
-EMPTYSTRING = ''
 class dstr(str):
-    def sub(self, mapping=None):
-        # Default mapping is locals/globals of caller
-        if mapping is None:
-            frame = sys._getframe(1)
-            mapping = frame.f_globals.copy()
-            mapping.update(frame.f_locals)
-        # Escape %'s
-        s = self.replace('%', '%%')
-        # Convert $name and ${name} to $(name)s
-        parts = dre.split(s)
-        for i in range(1, len(parts), 4):
-            if parts[i] is not None:
-                parts[i] = '$'
-            elif parts[i+1] is not None:
-                parts[i+1] = '%(' + parts[i+1] + ')s'
-            else:
-                parts[i+2] = '%(' + parts[i+2] + ')s'
-        # Interpolate
-        return EMPTYSTRING.join(filter(None, parts)) % mapping
+    def sub(self, mapping=None):
+        # Default mapping is locals/globals of caller
+        if mapping is None:
+            frame = sys._getframe(1)
+            mapping = frame.f_globals.copy()
+            mapping.update(frame.f_locals)
+        def repl(m):
+            return mapping[m.group(m.lastindex)]
+        return re.sub(r'\$(?:([_a-z]\w*)|\{([_a-z]\w*)\})', repl, self)
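For readers on current Python, the reference implementation might be adapted roughly as follows. This is a sketch under stated assumptions, not the PEP's code: it keeps a $$ escape and case-insensitive matching in the spirit of the earlier regex, and it wraps values in str() so non-string values do not make re.sub fail. The standalone birth() function and its lookup table are hypothetical.

```python
import re
import sys

# Assumed pattern: "$$" escape, $identifier, or ${identifier}.
_PATTERN = re.compile(r'\$(?:(\$)|([_a-z]\w*)|\{([_a-z]\w*)\})', re.IGNORECASE)

class dstr(str):
    def sub(self, mapping=None):
        # Default mapping is the caller's globals overlaid with its locals.
        if mapping is None:
            frame = sys._getframe(1)
            mapping = dict(frame.f_globals)
            mapping.update(frame.f_locals)

        def repl(m):
            if m.group(1) is not None:  # "$$" collapses to a literal "$"
                return '$'
            return str(mapping[m.group(m.lastindex)])

        return _PATTERN.sub(repl, self)

def birth(name):
    # Hypothetical lookup table standing in for self.countryOfOrigin.
    country = {'Guido': 'the Netherlands'}[name]
    return dstr('${name} was born in ${country}').sub()

print(birth('Guido'))  # Guido was born in the Netherlands
```

A missing name raises KeyError here in both the explicit-mapping and the no-argument case; mapping that to NameError for the no-argument form is left out of the sketch.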
And here are some examples:
@@ -141,8 +141,7 @@ Handling Missing Keys
from the mapping (or the locals/globals namespace if no argument
is given)? There are two possibilities:
-- We can simply allow the exception (likely a NameError or
-KeyError) to propagate.
+- We can simply allow the exception.
- We can return the original substitution placeholder unchanged.
@@ -167,10 +166,8 @@ Handling Missing Keys
Bob was born in ${country}
-The PEP author would prefer the latter interpretation, although a
-case can be made for raising the exception instead. We could
-almost ignore the issue, since the latter example could be
-accomplished by passing in a "safe-dictionary" in instead of a
+We could almost ignore the issue, since the latter example could
+be accomplished by passing in a "safe-dictionary" in instead of a
normal dictionary, like so:
class safedict(dict):
@@ -207,9 +204,47 @@ Handling Missing Keys
this callable would be the identity function, but you could
easily pass in the safedict constructor instead.
-BDFL proto-pronouncement: It should always raise a NameError when
-the key is missing. There may not be sufficient use case for soft
-failures in the no-argument version.
+BDFL proto-pronouncement: Strongly in favor of raising the
+exception, with KeyError when a dict is used and NameError when
+locals/globals are used. There may not be sufficient use case for
+soft failures in the no-argument version.
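The two ways of handling a missing key can be compared side by side. The sketch below is illustrative only; the substitute() helper and its strict flag are hypothetical and are not part of the proposal.

```python
import re

PLACEHOLDER = re.compile(r'\$(?:([_a-z]\w*)|\{([_a-z]\w*)\})', re.IGNORECASE)

def substitute(template, mapping, strict=True):
    """Substitute placeholders; 'strict' selects the missing-key policy."""
    def repl(m):
        key = m.group(m.lastindex)
        if key in mapping:
            return str(mapping[key])
        if strict:
            raise KeyError(key)  # let the exception propagate
        return m.group(0)        # leave the placeholder text unchanged
    return PLACEHOLDER.sub(repl, template)

s = '${name} was born in ${country}'
print(substitute(s, {'name': 'Bob'}, strict=False))
# Bob was born in ${country}
```

With strict=True the same call raises KeyError('country'), which corresponds to the behavior favored in the proto-pronouncement when a dict is passed.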
+Open Issues, Comments, and Suggestions
+- Ka-Ping Yee makes the suggestion that .sub() should take keyword
+arguments instead of a dictionary, and that if a dictionary was
+to be passed in it should be done with **dict. For example:
+    s = '${name} was born in ${country}'
+    print s.sub(name='Guido', country='the Netherlands')
+or
+    print s.sub(**{'name': 'Guido', 'country': 'the Netherlands'})
+- Paul Prescod wonders whether having a method use sys._getframe()
+doesn't set a bad precedent.
+- Oren Tirosh suggests that .sub() take an optional argument which
+would be used as a default for missing keys. If the optional
+argument were not given, an exception would be raised. This may
+not play well with Ka-Ping's suggestion.
+- Other suggestions have been made as an alternative to a string
+method including: a builtin function, a function in a module, an
+operator (similar to "string % dict", e.g. "string / dict").
+One strong argument for making it a built-in is given by Paul
+Prescod:
+"I really hate putting things in modules that will be needed in
+a Python programmer's second program (the one after "Hello
+world"). If this is to be the *simpler* way of doing
+introspection then getting at it should be simpler than getting
+at "%". $ is taught in hour 2, import is taught on day 2.
+Some people may never make it to the metaphorical day 2 if they
+are doing simple text processing in some kind of
+embedded-Python environment."
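Two of the suggestions above, keyword arguments and a default for missing keys, can be sketched as plain functions in modern Python. The names sub_kw, sub_default and _RAISE are hypothetical; neither function is part of the proposal.

```python
import re

PLACEHOLDER = re.compile(r'\$(?:([_a-z]\w*)|\{([_a-z]\w*)\})', re.IGNORECASE)
_RAISE = object()  # sentinel meaning "no default supplied"

def sub_kw(template, **names):
    """Keyword-argument form, in the spirit of Ka-Ping Yee's suggestion."""
    return PLACEHOLDER.sub(lambda m: str(names[m.group(m.lastindex)]), template)

def sub_default(template, mapping, default=_RAISE):
    """Mapping plus optional default, in the spirit of Oren Tirosh's suggestion."""
    def repl(m):
        key = m.group(m.lastindex)
        if key in mapping:
            return str(mapping[key])
        if default is _RAISE:
            raise KeyError(key)
        return str(default)
    return PLACEHOLDER.sub(repl, template)

s = '${name} was born in ${country}'
print(sub_kw(s, name='Guido', country='the Netherlands'))
print(sub_kw(s, **{'name': 'Guido', 'country': 'the Netherlands'}))
print(sub_default(s, {'name': 'Guido'}, default='???'))
```

One possible reading of why the two ideas may not combine well: if a single function accepted both **names and a default parameter, a template could no longer use a placeholder called "default".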
Comparison to PEP 215