From d98a48bde3071b9ec93a994700d84edd1ee3297b Mon Sep 17 00:00:00 2001
From: Ethan Furman
Date: Fri, 17 Jan 2014 09:07:32 -0800
Subject: [PATCH] PEP 461: removed .format; added markup

---
 pep-0461.txt | 120 +++++++++++++++++++++++----------------------------
 1 file changed, 53 insertions(+), 67 deletions(-)

diff --git a/pep-0461.txt b/pep-0461.txt
index 24c4a3a2c..c38a29673 100644
--- a/pep-0461.txt
+++ b/pep-0461.txt
@@ -1,5 +1,5 @@
 PEP: 461
-Title: Adding % and {} formatting to bytes
+Title: Adding % formatting to bytes
 Version: $Revision$
 Last-Modified: $Date$
 Author: Ethan Furman
@@ -8,25 +8,23 @@ Type: Standards Track
 Content-Type: text/x-rst
 Created: 2014-01-13
 Python-Version: 3.5
-Post-History: 2014-01-14, 2014-01-15
+Post-History: 2014-01-14, 2014-01-15, 2014-01-17
 Resolution:
 
 
 Abstract
 ========
 
-This PEP proposes adding the % and {} formatting operations from str to bytes [1].
+This PEP proposes adding % formatting operations similar to Python 2's ``str``
+type to ``bytes`` [1]_ [2]_.
 
 
 Overriding Principles
 =====================
 
-In order to avoid the problems of auto-conversion and value-generated exceptions,
-all object checking will be done via isinstance, not by values contained in a
-Unicode representation. In other words::
-
-  - duck-typing to allow/reject entry into a byte-stream
-  - no value generated errors
+In order to avoid the problems of auto-conversion and Unicode exceptions that
+could plague Py2 code, all object checking will be done by duck-typing, not by
+values contained in a Unicode representation [3]_.
 
 
 Proposed semantics for bytes formatting
@@ -35,17 +33,23 @@ Proposed semantics for bytes formatting
 %-interpolation
 ---------------
 
-All the numeric formatting codes (such as %x, %o, %e, %f, %g, etc.)
-will be supported, and will work as they do for str, including the
-padding, justification and other related modifiers, except locale.
+All the numeric formatting codes (such as ``%x``, ``%o``, ``%e``, ``%f``,
+``%g``, etc.) will be supported, and will work as they do for str, including
+the padding, justification and other related modifiers.
 
 Example::
 
     >>> b'%4x' % 10
     b'   a'
 
-%c will insert a single byte, either from an int in range(256), or from
-a bytes argument of length 1.
+    >>> '%#4x' % 10
+    ' 0xa'
+
+    >>> '%04X' % 10
+    '000A'
+
+``%c`` will insert a single byte, either from an ``int`` in range(256), or from
+a ``bytes`` argument of length 1, not from a ``str``.
 
 Example:
 
@@ -55,13 +59,13 @@ Example:
     >>> b'%c' % b'a'
     b'a'
 
-%s is restricted in what it will accept::
+``%s`` is restricted in what it will accept::
 
-  - input type supports Py_buffer?
+  - input type supports ``Py_buffer`` [4]_?
     use it to collect the necessary bytes
 
   - input type is something else?
-    use its __bytes__ method; if there isn't one, raise an exception [2]
+    use its ``__bytes__`` method [5]_ ; if there isn't one, raise a ``TypeError``
 
 Examples:
 
@@ -80,89 +84,71 @@ Examples:
 
 .. note::
 
-    Because the str type does not have a __bytes__ method, attempts to
-    directly use 'a string' as a bytes interpolation value will raise an
-    exception. To use 'string' values, they must be encoded or otherwise
-    transformed into a bytes sequence::
+    Because the ``str`` type does not have a ``__bytes__`` method, attempts to
+    directly use ``'a string'`` as a bytes interpolation value will raise an
+    exception. To use ``'string'`` values, they must be encoded or otherwise
+    transformed into a ``bytes`` sequence::
 
         'a string'.encode('latin-1')
 
-format
-------
-
-The format mini language codes, where they correspond with the %-interpolation codes,
-will be used as-is, with three exceptions::
-
-  - !s is not supported, as {} can mean the default for both str and bytes, in both
-    Py2 and Py3.
-  - !b is supported, and new Py3k code can use it to be explicit.
-  - no other __format__ method will be called.
 
 Numeric Format Codes
 --------------------
 
-To properly handle int and float subclasses, int(), index(), and float() will be called on the
-objects intended for (d, i, u), (b, o, x, X), and (e, E, f, F, g, G).
+To properly handle ``int`` and ``float`` subclasses, ``int()``, ``index()``,
+and ``float()`` will be called on the objects intended for (``d``, ``i``,
+``u``), (``b``, ``o``, ``x``, ``X``), and (``e``, ``E``, ``f``, ``F``, ``g``,
+``G``).
+
 
 Unsupported codes
 -----------------
 
-%r (which calls __repr__), and %a (which calls ascii() on __repr__) are not supported.
-
-!r and !a are not supported.
-
-The n integer and float format code is not supported.
-
-
-Open Questions
-==============
-
-Currently non-numeric objects go through::
-
-  - Py_buffer
-  - __bytes__
-  - failure
-
-Do we want to add a __format_bytes__ method in there?
-
-  - Guaranteed to produce only ascii (as in b'10', not b'\x0a')
-  - Makes more sense than using __bytes__ to produce ascii output
-  - What if an object has both __bytes__ and __format_bytes__?
-
-Do we need to support all the numeric format codes? The floating point
-exponential formats seem less appropriate, for example.
+``%r`` (which calls ``__repr__``), and ``%a`` (which calls ``ascii()`` on
+``__repr__``) are not supported.
 
 
 Proposed variations
 ===================
 
-It was suggested to let %s accept numbers, but since numbers have their own
+It was suggested to let ``%s`` accept numbers, but since numbers have their own
 format codes this idea was discarded.
 
-It has been suggested to use %b for bytes instead of %s.
+It has been suggested to use ``%b`` for bytes instead of ``%s``.
 
-  - Rejected as %b does not exist in Python 2.x %-interpolation, which is
-    why we are using %s.
+  - Rejected as ``%b`` does not exist in Python 2.x %-interpolation, which is
+    why we are using ``%s``.
 
-It has been proposed to automatically use .encode('ascii','strict') for str
-arguments to %s.
+It has been proposed to automatically use ``.encode('ascii','strict')`` for
+``str`` arguments to ``%s``.
 
   - Rejected as this would lead to intermittent failures. Better to have the
     operation always fail so the trouble-spot can be correctly fixed.
 
-It has been proposed to have %s return the ascii-encoded repr when the value
-is a str (b'%s' % 'abc' --> b"'abc'").
+It has been proposed to have ``%s`` return the ascii-encoded repr when the
+value is a ``str`` (b'%s' % 'abc' --> b"'abc'").
 
   - Rejected as this would lead to hard to debug failures far from the problem
     site. Better to have the operation always fail so the trouble-spot can be
     easily fixed.
 
+Originally this PEP also proposed adding format style formatting, but it
+was decided that format and its related machinery were all strictly text
+(aka ``str``) based, and it was dropped.
+
+Various new special methods were proposed, such as ``__ascii__``,
+``__format_bytes__``, etc.; such methods are not needed at this time, but can
+be visited again later if real-world use shows deficiencies with this solution.
+
 
 Footnotes
 =========
 
-.. [1] string.Template is not under consideration.
-.. [2] TypeError, ValueError, or UnicodeEncodeError?
+.. [1] http://docs.python.org/2/library/stdtypes.html#string-formatting
+.. [2] neither string.Template, format, nor str.format are under consideration.
+.. [3] %c is not an exception as neither of its possible arguments are unicode.
+.. [4] http://docs.python.org/3/c-api/buffer.html
+.. [5] http://docs.python.org/3/reference/datamodel.html#object.__bytes__
 
 
 Copyright