PEP: 3140
Title: str(container) should call str(item), not repr(item)
Version: $Revision$
Last-Modified: $Date$
Author: Oleg Broytman <phd@phdru.name>,
        Jim J. Jewett <jimjjewett@gmail.com>
Discussions-To: python-3000@python.org
Status: Rejected
Type: Standards Track
Content-Type: text/x-rst
Created: 27-May-2008
Post-History: 28-May-2008
Rejection
=========
Guido said this would cause too much disturbance too close to beta. See [1]_.
Abstract
========
This document discusses the advantages and disadvantages of the
current implementation of ``str(container)``. It also discusses the
pros and cons of a different approach - to call ``str(item)`` instead
of ``repr(item)``.
Motivation
==========
Currently ``str(container)`` calls ``repr`` on items. Arguments for it:

* containers refuse to guess what the user wants to see on
  ``str(container)`` - surroundings, delimiters, and so on;
* ``repr(item)`` usually displays type information - apostrophes
  around strings, class names, etc.

Arguments against:

* it's illogical; ``str()`` is expected to call ``__str__`` if it exists,
  not ``__repr__``;
* there is no standard way to print a container's content by calling
  the items' ``__str__``, which is inconvenient in cases where ``__str__``
  and ``__repr__`` return different results;
* ``repr(item)`` sometimes does the wrong thing (for example, it
  hex-escapes non-ASCII strings).

This PEP proposes to change how ``str(container)`` works. It is
proposed to mimic how ``repr(container)`` works except for one detail - call
``str`` on items instead of ``repr``. This allows a user to choose
what results she wants to get - from ``item.__repr__`` or ``item.__str__``.
Current situation
=================
Most container types (tuples, lists, dicts, sets, etc.) do not
implement a ``__str__`` method, so ``str(container)`` calls
``container.__repr__``, and ``container.__repr__``, once called, forgets
that it was called from ``str`` and always calls ``repr`` on the
container's items.
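The fallback described above can be observed directly in Python 3. The following is a minimal sketch rather than part of the original proposal; the ``Item`` class is a hypothetical example introduced only for illustration.

```python
class Item:
    """Hypothetical item whose str() and repr() differ."""

    def __str__(self):
        return "STR"

    def __repr__(self):
        return "REPR"


# list defines no __str__ of its own, so str(list_obj) falls back to
# object.__str__, which delegates to the container's __repr__.
list_defines_str = "__str__" in vars(list)

items = [Item()]
as_str = str(items)    # goes through list.__repr__, so items are repr()'d
as_repr = repr(items)

print(list_defines_str, as_str, as_repr)
```

Both conversions give the same text because the container never learns which of the two functions was called.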
This behaviour has advantages and disadvantages. One advantage is
that most items are represented with type information - strings
are surrounded by apostrophes, instances may show both the class name
and the instance data::

    >>> from decimal import Decimal
    >>> from datetime import datetime
    >>> print([42, '42'])
    [42, '42']
    >>> print([Decimal('42'), datetime.now()])
    [Decimal('42'), datetime.datetime(2008, 5, 27, 19, 57, 43, 485028)]

The disadvantage is that ``__repr__`` often returns technical data
(like '``<object at address>``') or an unreadable string (a hex-escaped
string if the input is a non-ASCII string)::

    >>> print(['тест'])
    ['\xd4\xc5\xd3\xd4']

One of the motivations for PEP 3138 is that neither ``repr`` nor ``str``
will allow the sensible printing of dicts whose keys are non-ASCII
text strings. Now that Unicode identifiers are allowed, this
includes Python's own attribute dicts. It also includes JSON
serialization (which forced the json library to jump through some hoops).
PEP 3138 proposes to fix this by breaking the "repr is safe ASCII"
invariant, and changing the way ``repr`` (which is used for
persistence) outputs some objects, with system-dependent failures.
Changing how ``str(container)`` works would allow easy debugging in
the normal case, and retain the safety of ASCII-only for the
machine-readable case. The only downside is that ``str(x)`` and
``repr(x)`` would more often be different -- but only in those cases
where the current almost-the-same version is insufficient.
It also seems illogical that ``str(container)`` calls ``repr`` on items
instead of ``str``. It's only logical to expect the following code::

    class Test:
        def __str__(self):
            return "STR"

        def __repr__(self):
            return "REPR"


    test = Test()
    print(test)
    print(repr(test))
    print([test])
    print(str([test]))

to print::

    STR
    REPR
    [STR]
    [STR]

where it actually prints::

    STR
    REPR
    [REPR]
    [REPR]

It is especially illogical that ``print`` in Python 2 uses ``str``
when it is called on what seems to be a tuple::

    >>> print Decimal('42'), datetime.now()
    42 2008-05-27 20:16:22.534285

whereas on an actual tuple it prints::

    >>> print((Decimal('42'), datetime.now()))
    (Decimal('42'), datetime.datetime(2008, 5, 27, 20, 16, 27, 937911))

A different approach - call ``str(item)``
=========================================
For example, with numbers it is often only the value that people
care about::

    >>> print(Decimal('3'))
    3

But putting the value in a list forces users to read the type
information, exactly as if ``repr`` had been called for the benefit of
a machine::

    >>> print([Decimal('3')])
    [Decimal('3')]

After this change, the type information would not clutter the ``str``
output::

    >>> print("{0!s}".format([Decimal('3')]))
    [3]
    >>> str([Decimal('3')])  # equivalent
    '[3]'

But it would still be available if desired::

    >>> print("{0!r}".format([Decimal('3')]))
    [Decimal('3')]
    >>> repr([Decimal('3')])  # equivalent
    "[Decimal('3')]"

There are a number of strategies to fix the problem. The most
radical is to change ``__repr__`` so that it accepts a new parameter
(a flag) meaning "called from ``str``, so call ``str`` on items, not
``repr``". The drawback of this proposal is that every ``__repr__``
implementation must be changed. Introspection could help a bit (inspect
``__repr__`` before calling it to see whether it accepts 2 or 3
parameters), but introspection doesn't work on classes written in C,
like all built-in containers.
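The flag-based strategy can be sketched as follows. This is only an illustration under stated assumptions: the ``Bag`` container, the ``from_str`` parameter name, and the ``accepts_flag`` helper are all hypothetical and do not appear in the PEP.

```python
import inspect
from decimal import Decimal


class Bag:
    """Hypothetical container whose __repr__ takes an extra flag."""

    def __init__(self, *items):
        self.items = list(items)

    def __repr__(self, from_str=False):
        # from_str is the hypothetical "called from str" flag.
        convert = str if from_str else repr
        return "Bag(" + ", ".join(convert(i) for i in self.items) + ")"

    def __str__(self):
        return self.__repr__(from_str=True)


class Plain:
    """Hypothetical class with a conventional one-parameter __repr__."""

    def __repr__(self):
        return "Plain()"


def accepts_flag(cls):
    """Return True if cls.__repr__ takes a parameter besides self."""
    try:
        params = inspect.signature(cls.__repr__).parameters
    except (TypeError, ValueError):
        # No signature could be retrieved for this callable.
        return False
    return len(params) > 1


bag = Bag(Decimal("3"), "spam")
print(str(bag))
print(repr(bag))
print(accepts_flag(Bag), accepts_flag(Plain))
```

The sketch also shows the cost named above: every class that wants the new behaviour has to change its ``__repr__`` signature, and callers need a check like ``accepts_flag`` before passing the flag.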
A less radical proposal is to implement ``__str__`` methods for built-in
container types. The obvious drawback is a duplication of effort - all
those ``__str__`` and ``__repr__`` implementations differ in only
one small detail - whether they call ``str`` or ``repr`` on items.
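As a rough illustration of this second strategy, a ``__str__`` can be added in a subclass without touching ``__repr__``. The ``StrList`` name is hypothetical, and the sketch assumes the output layout should match that of ``list.__repr__``.

```python
from decimal import Decimal


class StrList(list):
    """Hypothetical list subclass; only __str__ is added."""

    def __str__(self):
        # Same layout as list.__repr__, but str() is applied to each item.
        return "[" + ", ".join(str(item) for item in self) + "]"


values = StrList([Decimal("3"), "spam"])
print(str(values))   # items rendered with str()
print(repr(values))  # inherited list.__repr__, items rendered with repr()
```

Note that a plain ``list`` nested inside a ``StrList`` would still render its own items with ``repr``, and that this sketch does not guard against self-referencing containers.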
The most conservative proposal is not to change ``str`` at all but
to allow developers to implement their own application- or
library-specific pretty-printers. The drawback is again
a multiplication of effort and a proliferation of many small,
specific container-traversal algorithms.
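One such application-specific pretty-printer might look like the sketch below. The ``container_str`` function is a hypothetical helper, it handles only lists, tuples, and dicts, and it assumes the containers do not refer to themselves.

```python
from decimal import Decimal


def container_str(obj):
    """Format obj like repr() does, but call str() on non-container items."""
    if isinstance(obj, list):
        return "[" + ", ".join(container_str(x) for x in obj) + "]"
    if isinstance(obj, tuple):
        inner = ", ".join(container_str(x) for x in obj)
        # A one-element tuple keeps its trailing comma, as in repr().
        return "(" + inner + ("," if len(obj) == 1 else "") + ")"
    if isinstance(obj, dict):
        pairs = (
            container_str(k) + ": " + container_str(v) for k, v in obj.items()
        )
        return "{" + ", ".join(pairs) + "}"
    return str(obj)


result = container_str([Decimal("3"), (1,), {"a": [2]}])
print(result)
```

Each project that writes a helper like this has to decide again which container types to cover and how to traverse them, which is the multiplication of effort described above.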
Backward compatibility
======================
In those cases where type information is more important than
usual, it will still be possible to get the current results by
calling ``repr`` explicitly.
References
==========
.. [1] Guido van Rossum, PEP: str(container) should call str(item), not
   repr(item)
   https://mail.python.org/pipermail/python-3000/2008-May/013876.html
Copyright
=========
This document has been placed in the public domain.
..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: