PEP: 3140
Title: str(container) should call str(item), not repr(item)
Version: $Revision$
Last-Modified: $Date: 2008-05-28 20:38:33 -0600 (Thu, 28 May 2008)$
Author: Oleg Broytmann <phd@phd.pp.ru>,
        Jim Jewett <jimjjewett@gmail.com>
Discussions-To: python-3000@python.org
Status: Rejected
Type: Standards Track
Content-Type: text/plain
Created: 27-May-2008
Post-History: 28-May-2008


Rejection

Guido said this would cause too much disturbance too close to beta. See
http://mail.python.org/pipermail/python-3000/2008-May/013876.html.


Abstract

This document discusses the advantages and disadvantages of the
current implementation of str(container). It also discusses the
pros and cons of a different approach: calling str(item) instead
of repr(item).


Motivation

Currently str(container) calls repr on items. Arguments for it:
-- containers refuse to guess what the user wants to see on
   str(container) - surroundings, delimiters, and so on;
-- repr(item) usually displays type information - apostrophes
   around strings, class names, etc.

Arguments against:
-- it's illogical; str() is expected to call __str__ if it exists,
   not __repr__;
-- there is no standard way to print a container's content by calling
   the items' __str__, which is inconvenient in cases where __str__
   and __repr__ return different results;
-- repr(item) sometimes does the wrong thing (it hex-escapes
   non-ASCII strings, for example).

This PEP proposes to change how str(container) works. It is
proposed to mimic how repr(container) works except for one detail:
call str on items instead of repr. This allows a user to choose
which result she wants to get, from item.__repr__ or item.__str__.


Current situation

Most container types (tuples, lists, dicts, sets, etc.) do not
implement a __str__ method, so str(container) calls
container.__repr__, and container.__repr__, once called, forgets
that it was called from str and always calls repr on the
container's items.

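The fall-through described above can be observed directly. The
following is a minimal sketch for Python 3 on CPython; the Item class
and the variable names are made up for illustration and are not part
of this PEP:

```python
class Item:
    """Hypothetical item whose str and repr differ."""

    def __str__(self):
        return "STR"

    def __repr__(self):
        return "REPR"


# list defines its own __repr__ but inherits __str__ from object ...
has_own_repr = "__repr__" in vars(list)
has_own_str = "__str__" in vars(list)

# ... so str() of a list ends up in list.__repr__, which calls repr
# on every item.
box = [Item()]
as_str = str(box)
as_repr = repr(box)
print(has_own_repr, has_own_str, as_str, as_repr)
```

The two strings come out identical because the inherited __str__
simply delegates to repr, and nothing tells list.__repr__ that the
request started as a str call.
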
This behaviour has advantages and disadvantages. One advantage is
that most items are represented with type information: strings
are surrounded by apostrophes, and instances may show both the class
name and the instance data:

    >>> from decimal import Decimal
    >>> from datetime import datetime
    >>> print([42, '42'])
    [42, '42']
    >>> print([Decimal('42'), datetime.now()])
    [Decimal("42"), datetime.datetime(2008, 5, 27, 19, 57, 43, 485028)]

The disadvantage is that __repr__ often returns technical data
(like '<object at address>') or an unreadable string (a hex-escaped
string if the input is a non-ASCII string):

    >>> print(['тест'])
    ['\xd4\xc5\xd3\xd4']

One of the motivations for PEP 3138 is that neither repr nor str
will allow the sensible printing of dicts whose keys are non-ASCII
text strings. Now that Unicode identifiers are allowed, this
includes Python's own attribute dicts. It also includes JSON
serialization (which forced the json lib through some hoops).

PEP 3138 proposes to fix this by breaking the "repr is safe ASCII"
invariant, and changing the way repr (which is used for
persistence) outputs some objects, with system-dependent failures.

Changing how str(container) works would allow easy debugging in
the normal case, and retain the safety of ASCII-only for the
machine-readable case. The only downside is that str(x) and
repr(x) would more often be different -- but only in those cases
where the current almost-the-same version is insufficient.

It also seems illogical that str(container) calls repr on items
instead of str. It's only logical to expect the following code

    class Test:
        def __str__(self):
            return "STR"

        def __repr__(self):
            return "REPR"


    test = Test()
    print(test)
    print(repr(test))
    print([test])
    print(str([test]))

to print

    STR
    REPR
    [STR]
    [STR]

whereas it actually prints

    STR
    REPR
    [REPR]
    [REPR]

It is especially illogical that print in Python 2 uses str when it
is called on what seems to be a tuple:

    >>> print Decimal('42'), datetime.now()
    42 2008-05-27 20:16:22.534285

whereas on an actual tuple it prints

    >>> print((Decimal('42'), datetime.now()))
    (Decimal("42"), datetime.datetime(2008, 5, 27, 20, 16, 27, 937911))


A different approach - call str(item)

For example, with numbers it is often only the value that people
care about.

    >>> print Decimal('3')
    3

But putting the value in a list forces users to read the type
information, exactly as if repr had been called for the benefit of
a machine:

    >>> print [Decimal('3')]
    [Decimal("3")]

After this change, the type information would not clutter the str
output:

    >>> print "%s" % ([Decimal('3')],)
    [3]
    >>> str([Decimal('3')])  # the same text, as a string
    '[3]'

But it would still be available if desired:

    >>> print "%r" % ([Decimal('3')],)
    [Decimal("3")]
    >>> repr([Decimal('3')])  # the same text, as a string
    '[Decimal("3")]'

There are a number of strategies to fix the problem. The most
radical is to change __repr__ so that it accepts a new parameter
(flag) meaning "called from str, so call str on items, not repr".
The drawback of this proposal is that every __repr__ implementation
must be changed. Introspection could help a bit (inspect __repr__
before calling it to see whether it accepts 2 or 3 parameters), but
introspection doesn't work on classes written in C, like all
built-in containers.

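To make the flag-based strategy concrete, here is a rough sketch in
Python 3. Everything in it is hypothetical: the Bag class, the
called_from_str parameter name, and the accepts_flag and to_str
helpers are invented for illustration, and to_str only emulates what
str might do under such a protocol.

```python
import inspect


class Bag:
    """Toy container whose __repr__ takes the hypothetical extra flag."""

    def __init__(self, *items):
        self.items = list(items)

    def __repr__(self, called_from_str=False):
        # Pick the per-item conversion based on the flag.
        convert = str if called_from_str else repr
        return "Bag(" + ", ".join(convert(item) for item in self.items) + ")"


def accepts_flag(obj):
    """Return True if type(obj).__repr__ takes a second parameter."""
    try:
        params = inspect.signature(type(obj).__repr__).parameters
    except (TypeError, ValueError):
        # No usable signature for this callable; assume the old protocol.
        return False
    return len(params) >= 2


def to_str(obj):
    """Emulate str() under the flag-based strategy (assumption, not real API)."""
    if accepts_flag(obj):
        return type(obj).__repr__(obj, True)
    return repr(obj)


bag_as_str = to_str(Bag("a", 1))
bag_as_repr = repr(Bag("a", 1))
list_as_str = to_str(["a"])
```

Types whose __repr__ does not take the flag keep their present
behaviour in this sketch, which is exactly the cost named above: each
__repr__ has to be updated before it benefits.
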
A less radical proposal is to implement __str__ methods for built-in
container types. The obvious drawback is a duplication of effort:
all those __str__ and __repr__ implementations differ in only one
small detail, namely whether they call str or repr on items.

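The effect of a container-level __str__ can be approximated today
with a subclass. This is only a sketch in Python 3 of what a built-in
implementation might look like; StrList is a hypothetical name, and
the built-in list is left untouched:

```python
from decimal import Decimal


class StrList(list):
    """Hypothetical list subclass: str() calls str on items, repr() is inherited."""

    def __str__(self):
        return "[" + ", ".join(str(item) for item in self) + "]"


values = StrList([Decimal("3"), "a"])
flat_str = str(values)
flat_repr = repr(values)

# A plain list nested inside still falls back to repr for its own items.
nested_str = str(StrList([[Decimal("3")]]))
```

Note that __str__ here repeats the bracket-and-comma logic that
list.__repr__ already has, and that the change does not reach nested
plain containers unless each container type gets the same treatment.
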
The most conservative proposal is not to change str at all but
to allow developers to implement their own application- or
library-specific pretty-printers. The drawback is again
a multiplication of effort and a proliferation of many small
specific container-traversal algorithms.

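A library-specific pretty-printer of the kind mentioned above might
look like the following Python 3 sketch. The function name str_items
is made up; it handles only lists, tuples and dicts, and it does not
guard against containers that contain themselves:

```python
from decimal import Decimal


def str_items(obj):
    """Render obj like repr() would, but call str on the leaf items."""
    if isinstance(obj, list):
        return "[" + ", ".join(str_items(x) for x in obj) + "]"
    if isinstance(obj, tuple):
        inner = ", ".join(str_items(x) for x in obj)
        # Keep the trailing comma that marks a one-element tuple.
        return "(" + inner + ("," if len(obj) == 1 else "") + ")"
    if isinstance(obj, dict):
        pairs = (str_items(k) + ": " + str_items(v) for k, v in obj.items())
        return "{" + ", ".join(pairs) + "}"
    # Anything else is treated as a leaf item.
    return str(obj)


rendered = str_items([Decimal("3"), (1,), {"k": [2]}])
```

Every project that wants this output has to write and maintain its
own variant of such a traversal, which is the proliferation the
paragraph above warns about.
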
Backward compatibility

In those cases where type information is more important than
usual, it will still be possible to get the current results by
calling repr explicitly.


Copyright

This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: