python-peps/pep-3140.txt

PEP: 3140
Title: str(container) should call str(item), not repr(item)
Version: $Revision$
Last-Modified: $Date$
Author: Oleg Broytmann <phd@phd.pp.ru>,
        Jim J. Jewett <jimjjewett@gmail.com>
Discussions-To: python-3000@python.org
Status: Rejected
Type: Standards Track
Content-Type: text/plain
Created: 27-May-2008
Post-History: 28-May-2008


Rejection

   Guido said this would cause too much disturbance too close to beta. See
   http://mail.python.org/pipermail/python-3000/2008-May/013876.html.


Abstract

   This document discusses the advantages and disadvantages of the
   current implementation of str(container).  It also discusses the
   pros and cons of a different approach - to call str(item) instead
   of repr(item).


Motivation

   Currently str(container) calls repr on items.  Arguments for it:
   -- containers refuse to guess what the user wants to see on
      str(container) - surroundings, delimiters, and so on;
   -- repr(item) usually displays type information - apostrophes
      around strings, class names, etc.

   Arguments against:
   -- it's illogical; str() is expected to call __str__ if it exists,
      not __repr__;
   -- there is no standard way to print a container's content calling
      items' __str__, that's inconvenient in cases where __str__ and
      __repr__ return different results;
   -- repr(item) sometimes do wrong things (hex-escapes non-ASCII
      strings, e.g.)

   This PEP proposes to change how str(container) works.  It is
   proposed to mimic how repr(container) works except one detail
   - call str on items instead of repr.  This allows a user to choose
   what results she want to get - from item.__repr__ or item.__str__.


Current situation

   Most container types (tuples, lists, dicts, sets, etc.) do not
   implement __str__ method, so str(container) calls
   container.__repr__, and container.__repr__, once called, forgets
   it is called from str and always calls repr on the container's
   items.

   This behaviour has advantages and disadvantages.  One advantage is
   that most items are represented with type information - strings
   are surrounded by apostrophes, instances may have both class name
   and instance data:

       >>> print([42, '42'])
       [42, '42']
       >>> print([Decimal('42'), datetime.now()])
       [Decimal("42"), datetime.datetime(2008, 5, 27, 19, 57, 43, 485028)]

   The disadvantage is that __repr__ often returns technical data
   (like '<object at address>') or unreadable string (hex-encoded
   string if the input is non-ASCII string):

       >>> print(['тест'])
       ['\xd4\xc5\xd3\xd4']

   One of the motivations for PEP 3138 is that neither repr nor str
   will allow the sensible printing of dicts whose keys are non-ASCII
   text strings.  Now that Unicode identifiers are allowed, it
   includes Python's own attribute dicts.  This also includes JSON
   serialization (and caused some hoops for the json lib).

   PEP 3138 proposes to fix this by breaking the "repr is safe ASCII"
   invariant, and changing the way repr (which is used for
   persistence) outputs some objects, with system-dependent failures.

   Changing how str(container) works would allow easy debugging in
   the normal case, and retain the safety of ASCII-only for the
   machine-readable  case.  The only downside is that str(x) and
   repr(x) would more often be different -- but only in those cases
   where the current almost-the-same version is insufficient.

   It also seems illogical that str(container) calls repr on items
   instead of str.  It's only logical to expect following code

       class Test:
           def __str__(self):
               return "STR"

           def __repr__(self):
               return "REPR"


       test = Test()
       print(test)
       print(repr(test))
       print([test])
       print(str([test]))

   to print

       STR
       REPR
       [STR]
       [STR]

   where it actually prints

       STR
       REPR
       [REPR]
       [REPR]

   Especially it is illogical to see that print in Python 2 uses str
   if it is called on what seems to be a tuple:

       >>> print Decimal('42'), datetime.now()
       42 2008-05-27 20:16:22.534285

   where on an actual tuple it prints

       >>> print((Decimal('42'), datetime.now()))
       (Decimal("42"), datetime.datetime(2008, 5, 27, 20, 16, 27, 937911))


A different approach - call str(item)

   For example, with numbers it is often only the value that people
   care about.

       >>> print Decimal('3')
       3

   But putting the value in a list forces users to read the type
   information, exactly as if repr had been called for the benefit of
   a machine:

       >>> print [Decimal('3')]
       [Decimal("3")]

   After this change, the type information would not clutter the str
   output:

       >>> print "%s".format([Decimal('3')])
       [3]
       >>> str([Decimal('3')])  # ==
       [3]

   But it would still be available if desired:

       >>> print "%r".format([Decimal('3')])
       [Decimal('3')]
       >>> repr([Decimal('3')])  # ==
       [Decimal('3')]

   There is a number of strategies to fix the problem.  The most
   radical is to change __repr__ so it accepts a new parameter (flag)
   "called from str, so call str on items, not repr".  The
   drawback of the proposal is that every __repr__ implementation
   must be changed.  Introspection could help a bit (inspect __repr__
   before calling if it accepts 2 or 3 parameters), but introspection
   doesn't work on classes written in C, like all built-in containers.

   Less radical proposal is to implement __str__ methods for built-in
   container types.  The obvious drawback is a duplication of effort
   - all those __str__ and __repr__ implementations are only differ
   in one small detail - if they call str or repr on items.

   The most conservative proposal is not to change str at all but
   to allow developers to implement their own application- or
   library-specific pretty-printers.  The drawback is again
   a multiplication of effort and proliferation of many small
   specific container-traversal algorithms.


Backward compatibility

   In those cases where type information is more important than
   usual, it will still be possible to get the current results by
   calling repr explicitly.


Copyright

   This document has been placed in the public domain.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:
add PEP 3140 (rejected): str(container) should call str(item), not repr(item) 2008-05-29 17:03:57 -04:00			`PEP: 3140`
			`Title: str(container) should call str(item), not repr(item)`
			`Version: $Revision$`
fixed keyword 2008-06-03 11:35:18 -04:00			`Last-Modified: $Date$`
add PEP 3140 (rejected): str(container) should call str(item), not repr(item) 2008-05-29 17:03:57 -04:00			`Author: Oleg Broytmann <phd@phd.pp.ru>,`
add initial 2009-01-04 20:20:25 -05:00			`Jim J. Jewett <jimjjewett@gmail.com>`
add PEP 3140 (rejected): str(container) should call str(item), not repr(item) 2008-05-29 17:03:57 -04:00			`Discussions-To: python-3000@python.org`
			`Status: Rejected`
			`Type: Standards Track`
			`Content-Type: text/plain`
			`Created: 27-May-2008`
			`Post-History: 28-May-2008`


			`Rejection`

			`Guido said this would cause too much disturbance too close to beta. See`
			`http://mail.python.org/pipermail/python-3000/2008-May/013876.html.`


			`Abstract`

			`This document discusses the advantages and disadvantages of the`
			`current implementation of str(container). It also discusses the`
			`pros and cons of a different approach - to call str(item) instead`
			`of repr(item).`


			`Motivation`

			`Currently str(container) calls repr on items. Arguments for it:`
			`-- containers refuse to guess what the user wants to see on`
			`str(container) - surroundings, delimiters, and so on;`
			`-- repr(item) usually displays type information - apostrophes`
			`around strings, class names, etc.`

			`Arguments against:`
			`-- it's illogical; str() is expected to call __str__ if it exists,`
			`not __repr__;`
			`-- there is no standard way to print a container's content calling`
			`items' __str__, that's inconvenient in cases where __str__ and`
			`__repr__ return different results;`
typo, spelling, caps 2008-06-01 10:30:24 -04:00			`-- repr(item) sometimes do wrong things (hex-escapes non-ASCII`
add PEP 3140 (rejected): str(container) should call str(item), not repr(item) 2008-05-29 17:03:57 -04:00			`strings, e.g.)`

			`This PEP proposes to change how str(container) works. It is`
			`proposed to mimic how repr(container) works except one detail`
			`- call str on items instead of repr. This allows a user to choose`
			`what results she want to get - from item.__repr__ or item.__str__.`


			`Current situation`

			`Most container types (tuples, lists, dicts, sets, etc.) do not`
			`implement __str__ method, so str(container) calls`
			`container.__repr__, and container.__repr__, once called, forgets`
			`it is called from str and always calls repr on the container's`
			`items.`

			`This behaviour has advantages and disadvantages. One advantage is`
			`that most items are represented with type information - strings`
			`are surrounded by apostrophes, instances may have both class name`
			`and instance data:`

			`>>> print([42, '42'])`
			`[42, '42']`
			`>>> print([Decimal('42'), datetime.now()])`
			`[Decimal("42"), datetime.datetime(2008, 5, 27, 19, 57, 43, 485028)]`

			`The disadvantage is that __repr__ often returns technical data`
			`(like '<object at address>') or unreadable string (hex-encoded`
typo, spelling, caps 2008-06-01 10:30:24 -04:00			`string if the input is non-ASCII string):`
add PEP 3140 (rejected): str(container) should call str(item), not repr(item) 2008-05-29 17:03:57 -04:00
			`>>> print(['тест'])`
			`['\xd4\xc5\xd3\xd4']`

			`One of the motivations for PEP 3138 is that neither repr nor str`
typo, spelling, caps 2008-06-01 10:30:24 -04:00			`will allow the sensible printing of dicts whose keys are non-ASCII`
			`text strings. Now that Unicode identifiers are allowed, it`
add PEP 3140 (rejected): str(container) should call str(item), not repr(item) 2008-05-29 17:03:57 -04:00			`includes Python's own attribute dicts. This also includes JSON`
			`serialization (and caused some hoops for the json lib).`

			`PEP 3138 proposes to fix this by breaking the "repr is safe ASCII"`
			`invariant, and changing the way repr (which is used for`
			`persistence) outputs some objects, with system-dependent failures.`

			`Changing how str(container) works would allow easy debugging in`
typo, spelling, caps 2008-06-01 10:30:24 -04:00			`the normal case, and retain the safety of ASCII-only for the`
add PEP 3140 (rejected): str(container) should call str(item), not repr(item) 2008-05-29 17:03:57 -04:00			`machine-readable case. The only downside is that str(x) and`
			`repr(x) would more often be different -- but only in those cases`
			`where the current almost-the-same version is insufficient.`

			`It also seems illogical that str(container) calls repr on items`
			`instead of str. It's only logical to expect following code`

			`class Test:`
			`def __str__(self):`
			`return "STR"`

			`def __repr__(self):`
			`return "REPR"`


			`test = Test()`
			`print(test)`
			`print(repr(test))`
			`print([test])`
			`print(str([test]))`

			`to print`

			`STR`
			`REPR`
			`[STR]`
			`[STR]`

			`where it actually prints`

			`STR`
			`REPR`
			`[REPR]`
			`[REPR]`

			`Especially it is illogical to see that print in Python 2 uses str`
			`if it is called on what seems to be a tuple:`

			`>>> print Decimal('42'), datetime.now()`
			`42 2008-05-27 20:16:22.534285`

			`where on an actual tuple it prints`

			`>>> print((Decimal('42'), datetime.now()))`
			`(Decimal("42"), datetime.datetime(2008, 5, 27, 20, 16, 27, 937911))`


			`A different approach - call str(item)`

			`For example, with numbers it is often only the value that people`
			`care about.`

			`>>> print Decimal('3')`
			`3`

			`But putting the value in a list forces users to read the type`
			`information, exactly as if repr had been called for the benefit of`
			`a machine:`

			`>>> print [Decimal('3')]`
			`[Decimal("3")]`

			`After this change, the type information would not clutter the str`
			`output:`

			`>>> print "%s".format([Decimal('3')])`
			`[3]`
			`>>> str([Decimal('3')]) # ==`
			`[3]`

			`But it would still be available if desired:`

			`>>> print "%r".format([Decimal('3')])`
			`[Decimal('3')]`
			`>>> repr([Decimal('3')]) # ==`
			`[Decimal('3')]`

			`There is a number of strategies to fix the problem. The most`
			`radical is to change __repr__ so it accepts a new parameter (flag)`
			`"called from str, so call str on items, not repr". The`
			`drawback of the proposal is that every __repr__ implementation`
			`must be changed. Introspection could help a bit (inspect __repr__`
			`before calling if it accepts 2 or 3 parameters), but introspection`
typo, spelling, caps 2008-06-01 10:30:24 -04:00			`doesn't work on classes written in C, like all built-in containers.`
add PEP 3140 (rejected): str(container) should call str(item), not repr(item) 2008-05-29 17:03:57 -04:00
typo, spelling, caps 2008-06-01 10:30:24 -04:00			`Less radical proposal is to implement __str__ methods for built-in`
add PEP 3140 (rejected): str(container) should call str(item), not repr(item) 2008-05-29 17:03:57 -04:00			`container types. The obvious drawback is a duplication of effort`
			`- all those __str__ and __repr__ implementations are only differ`
			`in one small detail - if they call str or repr on items.`

			`The most conservative proposal is not to change str at all but`
			`to allow developers to implement their own application- or`
			`library-specific pretty-printers. The drawback is again`
			`a multiplication of effort and proliferation of many small`
			`specific container-traversal algorithms.`


			`Backward compatibility`

			`In those cases where type information is more important than`
			`usual, it will still be possible to get the current results by`
			`calling repr explicitly.`


			`Copyright`

			`This document has been placed in the public domain.`



			`Local Variables:`
			`mode: indented-text`
			`indent-tabs-mode: nil`
			`sentence-end-double-space: t`
			`fill-column: 70`
			`coding: utf-8`
			`End:`