PEP: 3140 Title: str(container) should call str(item), not repr(item) Version: $Revision$ Last-Modified: $Date$ Author: Oleg Broytman , Jim J. Jewett Discussions-To: python-3000@python.org Status: Rejected Type: Standards Track Content-Type: text/x-rst Created: 27-May-2008 Post-History: 28-May-2008 Rejection ========= Guido said this would cause too much disturbance too close to beta. See [1]_. Abstract ======== This document discusses the advantages and disadvantages of the current implementation of ``str(container)``. It also discusses the pros and cons of a different approach - to call ``str(item)`` instead of ``repr(item)``. Motivation ========== Currently ``str(container)`` calls ``repr`` on items. Arguments for it: * containers refuse to guess what the user wants to see on ``str(container)`` - surroundings, delimiters, and so on; * ``repr(item)`` usually displays type information - apostrophes around strings, class names, etc. Arguments against: * it's illogical; ``str()`` is expected to call ``__str__`` if it exists, not ``__repr__``; * there is no standard way to print a container's content calling items' ``__str__``, that's inconvenient in cases where ``__str__`` and ``__repr__`` return different results; * ``repr(item)`` sometimes do wrong things (hex-escapes non-ASCII strings, e.g.) This PEP proposes to change how ``str(container)`` works. It is proposed to mimic how ``repr(container)`` works except one detail - call ``str`` on items instead of ``repr``. This allows a user to choose what results she want to get - from ``item.__repr__`` or ``item.__str__``. Current situation ================= Most container types (tuples, lists, dicts, sets, etc.) do not implement ``__str__`` method, so ``str(container)`` calls ``container.__repr__``, and ``container.__repr__``, once called, forgets it is called from ``str`` and always calls ``repr`` on the container's items. This behaviour has advantages and disadvantages. One advantage is that most items are represented with type information - strings are surrounded by apostrophes, instances may have both class name and instance data:: >>> print([42, '42']) [42, '42'] >>> print([Decimal('42'), datetime.now()]) [Decimal("42"), datetime.datetime(2008, 5, 27, 19, 57, 43, 485028)] The disadvantage is that ``__repr__`` often returns technical data (like '````') or unreadable string (hex-encoded string if the input is non-ASCII string):: >>> print(['ั‚ะตัั‚']) ['\xd4\xc5\xd3\xd4'] One of the motivations for :pep:`3138` is that neither ``repr`` nor ``str`` will allow the sensible printing of dicts whose keys are non-ASCII text strings. Now that Unicode identifiers are allowed, it includes Python's own attribute dicts. This also includes JSON serialization (and caused some hoops for the json lib). :pep:`3138` proposes to fix this by breaking the "repr is safe ASCII" invariant, and changing the way ``repr`` (which is used for persistence) outputs some objects, with system-dependent failures. Changing how ``str(container)`` works would allow easy debugging in the normal case, and retain the safety of ASCII-only for the machine-readable case. The only downside is that ``str(x)`` and ``repr(x)`` would more often be different -- but only in those cases where the current almost-the-same version is insufficient. It also seems illogical that ``str(container)`` calls ``repr`` on items instead of ``str``. It's only logical to expect following code:: class Test: def __str__(self): return "STR" def __repr__(self): return "REPR" test = Test() print(test) print(repr(test)) print([test]) print(str([test])) to print:: STR REPR [STR] [STR] where it actually prints:: STR REPR [REPR] [REPR] Especially it is illogical to see that print in Python 2 uses ``str`` if it is called on what seems to be a tuple:: >>> print Decimal('42'), datetime.now() 42 2008-05-27 20:16:22.534285 where on an actual tuple it prints:: >>> print((Decimal('42'), datetime.now())) (Decimal("42"), datetime.datetime(2008, 5, 27, 20, 16, 27, 937911)) A different approach - call ``str(item)`` ========================================= For example, with numbers it is often only the value that people care about. :: >>> print Decimal('3') 3 But putting the value in a list forces users to read the type information, exactly as if ``repr`` had been called for the benefit of a machine:: >>> print [Decimal('3')] [Decimal("3")] After this change, the type information would not clutter the ``str`` output:: >>> print "%s".format([Decimal('3')]) [3] >>> str([Decimal('3')]) # == [3] But it would still be available if desired:: >>> print "%r".format([Decimal('3')]) [Decimal('3')] >>> repr([Decimal('3')]) # == [Decimal('3')] There is a number of strategies to fix the problem. The most radical is to change ``__repr__`` so it accepts a new parameter (flag) "called from ``str``, so call ``str`` on items, not ``repr``". The drawback of the proposal is that every ``__repr__`` implementation must be changed. Introspection could help a bit (inspect ``__repr__`` before calling if it accepts 2 or 3 parameters), but introspection doesn't work on classes written in C, like all built-in containers. Less radical proposal is to implement ``__str__`` methods for built-in container types. The obvious drawback is a duplication of effort - all those ``__str__`` and ``__repr__`` implementations are only differ in one small detail - if they call ``str`` or ``repr`` on items. The most conservative proposal is not to change str at all but to allow developers to implement their own application- or library-specific pretty-printers. The drawback is again a multiplication of effort and proliferation of many small specific container-traversal algorithms. Backward compatibility ====================== In those cases where type information is more important than usual, it will still be possible to get the current results by calling ``repr`` explicitly. References ========== .. [1] Guido van Rossum, PEP: str(container) should call str(item), not repr(item) https://mail.python.org/pipermail/python-3000/2008-May/013876.html Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End: