PEP: 3140
Title: str(container) should call str(item), not repr(item)
Version: $Revision$
Last-Modified: $Date$
Author: Oleg Broytman <phd@phdru.name>,
        Jim J. Jewett <jimjjewett@gmail.com>
Discussions-To: python-3000@python.org
Status: Rejected
Type: Standards Track
Content-Type: text/x-rst
Created: 27-May-2008
Post-History: 28-May-2008
Rejection
=========
Guido said this would cause too much disturbance too close to beta. See [1]_.
Abstract
========
This document discusses the advantages and disadvantages of the
current implementation of ``str(container)``.  It also discusses the
pros and cons of a different approach: calling ``str(item)`` instead
of ``repr(item)``.
Motivation
==========
Currently ``str(container)`` calls ``repr`` on items.  Arguments for it:

* containers refuse to guess what the user wants to see on
  ``str(container)`` - surroundings, delimiters, and so on;
* ``repr(item)`` usually displays type information - apostrophes
  around strings, class names, etc.

Arguments against:

* it's illogical; ``str()`` is expected to call ``__str__`` if it exists,
  not ``__repr__``;
* there is no standard way to print a container's content by calling
  the items' ``__str__``, which is inconvenient in cases where ``__str__``
  and ``__repr__`` return different results;
* ``repr(item)`` sometimes does the wrong thing (for example, it
  hex-escapes non-ASCII strings).

This PEP proposes to change how ``str(container)`` works.  It is
proposed to mimic how ``repr(container)`` works except for one detail:
call ``str`` on items instead of ``repr``.  This allows a user to choose
what results she wants to get - from ``item.__repr__`` or ``item.__str__``.

Current situation
=================
Most container types (tuples, lists, dicts, sets, etc.) do not
implement a ``__str__`` method, so ``str(container)`` calls
``container.__repr__``, and ``container.__repr__``, once called, has no
record that it was called from ``str`` and always calls ``repr`` on the
container's items.
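
The fallback described above can be checked directly.  The following is a
minimal sketch in Python 3 syntax; the ``Item`` class is a hypothetical
example introduced only for illustration:

```python
class Item:
    """Hypothetical item type whose str() and repr() differ."""

    def __str__(self):
        return "STR"

    def __repr__(self):
        return "REPR"


# The list type defines __repr__ but no __str__ of its own, so
# str(list_object) ends up in list.__repr__.
has_own_str = "__str__" in vars(list)

items = [Item()]
as_str = str(items)    # goes through list.__repr__, which calls repr(item)
as_repr = repr(items)

print(has_own_str)
print(as_str)
print(as_repr)
```

Both ``as_str`` and ``as_repr`` are built by ``list.__repr__``, so the
item's ``__str__`` is never consulted.
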

This behaviour has advantages and disadvantages.  One advantage is
that most items are represented with type information - strings
are surrounded by apostrophes, instances may have both class name
and instance data::

    >>> print([42, '42'])
    [42, '42']
    >>> print([Decimal('42'), datetime.now()])
    [Decimal("42"), datetime.datetime(2008, 5, 27, 19, 57, 43, 485028)]

The disadvantage is that ``__repr__`` often returns technical data
(like '``<object at address>``') or an unreadable string (a hex-escaped
string if the input is a non-ASCII string)::

    >>> print(['тест'])
    ['\xd4\xc5\xd3\xd4']

One of the motivations for :pep:`3138` is that neither ``repr`` nor ``str``
will allow the sensible printing of dicts whose keys are non-ASCII
text strings.  Now that Unicode identifiers are allowed, this
includes Python's own attribute dicts.  It also affects JSON
serialization (and forced the json library to jump through some hoops).

:pep:`3138` proposes to fix this by breaking the "repr is safe ASCII"
invariant, and changing the way ``repr`` (which is used for
persistence) outputs some objects, with system-dependent failures.

Changing how ``str(container)`` works would allow easy debugging in
the normal case, and retain the safety of ASCII-only for the
machine-readable case.  The only downside is that ``str(x)`` and
``repr(x)`` would more often be different -- but only in those cases
where the current almost-the-same version is insufficient.

It also seems illogical that ``str(container)`` calls ``repr`` on items
instead of ``str``.  It's only logical to expect the following code::

    class Test:
        def __str__(self):
            return "STR"

        def __repr__(self):
            return "REPR"


    test = Test()
    print(test)
    print(repr(test))
    print([test])
    print(str([test]))

to print::

    STR
    REPR
    [STR]
    [STR]

where it actually prints::

    STR
    REPR
    [REPR]
    [REPR]

It is especially illogical that ``print`` in Python 2 uses ``str``
when it is called on what seems to be a tuple::

    >>> print Decimal('42'), datetime.now()
    42 2008-05-27 20:16:22.534285

whereas on an actual tuple it prints::

    >>> print((Decimal('42'), datetime.now()))
    (Decimal("42"), datetime.datetime(2008, 5, 27, 20, 16, 27, 937911))

A different approach - call ``str(item)``
=========================================
For example, with numbers it is often only the value that people
care about.

::

    >>> print Decimal('3')
    3

But putting the value in a list forces users to read the type
information, exactly as if ``repr`` had been called for the benefit of
a machine::

    >>> print [Decimal('3')]
    [Decimal("3")]

After this change, the type information would not clutter the ``str``
output::

    >>> print "%s" % ([Decimal('3')],)
    [3]
    >>> str([Decimal('3')])  # equivalent
    '[3]'

But it would still be available if desired::

    >>> print "%r" % ([Decimal('3')],)
    [Decimal("3")]
    >>> repr([Decimal('3')])  # equivalent
    '[Decimal("3")]'

There are a number of strategies to fix the problem.  The most
radical is to change ``__repr__`` so it accepts a new parameter (flag)
"called from ``str``, so call ``str`` on items, not ``repr``".  The
drawback of the proposal is that every ``__repr__`` implementation
must be changed.  Introspection could help a bit (inspect ``__repr__``
before calling it to see whether it accepts 2 or 3 parameters), but
introspection doesn't work on classes written in C, like all built-in
containers.

A less radical proposal is to implement ``__str__`` methods for built-in
container types.  The obvious drawback is a duplication of effort - all
those ``__str__`` and ``__repr__`` implementations differ in only one
small detail: whether they call ``str`` or ``repr`` on items.
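
As an illustration of this option, the sketch below emulates the idea in
pure Python with a list subclass (Python 3 syntax).  The name ``StrList``
is hypothetical; the proposal itself concerns the built-in types, which
are implemented in C:

```python
from decimal import Decimal


class StrList(list):
    """Hypothetical list subclass whose str() calls str() on each item."""

    def __str__(self):
        # Same layout as list.__repr__, but items are rendered with str().
        return "[" + ", ".join(str(item) for item in self) + "]"


numbers = StrList([Decimal("3"), Decimal("4")])
print(str(numbers))   # items rendered with str()
print(repr(numbers))  # unchanged: items rendered with repr()
```

The duplication mentioned above shows up quickly: tuples, dicts and sets
would each need a similar method, and a plain list nested inside a
``StrList`` would still render its own items with ``repr``.
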

The most conservative proposal is not to change ``str`` at all but
to allow developers to implement their own application- or
library-specific pretty-printers.  The drawback is again
a multiplication of effort and a proliferation of many small
specific container-traversal algorithms.
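
A minimal sketch of such an application-specific pretty-printer might look
like the following (Python 3 syntax).  The function name ``str_items`` is
hypothetical, only lists, tuples and dicts are handled, and recursive
containers are not detected:

```python
from decimal import Decimal


def str_items(obj):
    """Lay out obj like repr() would, but call str() on leaf items.

    Hypothetical helper: handles list, tuple and dict only, and does
    not guard against containers that contain themselves.
    """
    if isinstance(obj, list):
        return "[" + ", ".join(str_items(x) for x in obj) + "]"
    if isinstance(obj, tuple):
        inner = ", ".join(str_items(x) for x in obj)
        # A one-element tuple keeps its trailing comma, as in repr().
        return "(" + inner + ("," if len(obj) == 1 else "") + ")"
    if isinstance(obj, dict):
        pairs = (str_items(k) + ": " + str_items(v) for k, v in obj.items())
        return "{" + ", ".join(pairs) + "}"
    # Anything else is treated as a leaf and rendered with str().
    return str(obj)


data = [Decimal("3"), (Decimal("4"),), {"k": Decimal("5")}]
rendered = str_items(data)
print(rendered)
```

Note that strings lose their quotes in this output, which is exactly the
loss of type information that the arguments for the current behaviour
point to.
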
Backward compatibility
======================
In those cases where type information is more important than
usual, it will still be possible to get the current results by
calling ``repr`` explicitly.
References
==========
.. [1] Guido van Rossum, PEP: str(container) should call str(item), not
   repr(item)
   https://mail.python.org/pipermail/python-3000/2008-May/013876.html
Copyright
=========
This document has been placed in the public domain.
..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: