From 6f67f624784c9cd9518ca2e3c0a7261213e2cd45 Mon Sep 17 00:00:00 2001
From: Ethan Furman
Date: Wed, 15 Jan 2014 16:12:41 -0800
Subject: [PATCH] PEP 461: more updates

---
 pep-0461.txt | 77 +++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 55 insertions(+), 22 deletions(-)

diff --git a/pep-0461.txt b/pep-0461.txt
index cdf585082..24c4a3a2c 100644
--- a/pep-0461.txt
+++ b/pep-0461.txt
@@ -8,14 +8,25 @@ Type: Standards Track
 Content-Type: text/x-rst
 Created: 2014-01-13
 Python-Version: 3.5
-Post-History: 2014-01-13
+Post-History: 2014-01-14, 2014-01-15
 Resolution:
 
 
 Abstract
 ========
 
-This PEP proposes adding the % and {} formatting operations from str to bytes.
+This PEP proposes adding the % and {} formatting operations from str to bytes [1].
+
+
+Overriding Principles
+=====================
+
+In order to avoid the problems of auto-conversion and value-generated exceptions,
+all object checking will be done via isinstance, not by values contained in a
+Unicode representation. In other words::
+
+  - duck-typing to allow/reject entry into a byte-stream
+  - no value generated errors
 
 
 Proposed semantics for bytes formatting
@@ -26,7 +37,7 @@ Proposed semantics for bytes formatting
 
 All the numeric formatting codes (such as %x, %o, %e, %f, %g, etc.)
 will be supported, and will work as they do for str, including the
-padding, justification and other related modifiers.
+padding, justification and other related modifiers, except locale.
 
 Example::
 
@@ -44,13 +55,13 @@ Example:
     >>> b'%c' % b'a'
     b'a'
 
-%s is a restricted in what it will accept::
+%s is restricted in what it will accept::
 
   - input type supports Py_buffer?
     use it to collect the necessary bytes
 
   - input type is something else?
-    use its __bytes__ method; if there isn't one, raise an exception [3]
+    use its __bytes__ method; if there isn't one, raise an exception [2]
 
 Examples:
 
@@ -76,23 +87,49 @@ Examples:
     'a string'.encode('latin-1')
 
 
-Unsupported % format codes
-^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-%r (which calls __repr__) is not supported
-
-
 format
 ------
 
-The format mini language will be used as-is, with the behaviors as listed
-for %-interpolation.
+The format mini language codes, where they correspond with the %-interpolation codes,
+will be used as-is, with three exceptions::
+
+  - !s is not supported, as {} can mean the default for both str and bytes, in both
+    Py2 and Py3.
+  - !b is supported, and new Py3k code can use it to be explicit.
+  - no other __format__ method will be called.
+
+Numeric Format Codes
+--------------------
+
+To properly handle int and float subclasses, int(), index(), and float() will be called on the
+objects intended for (d, i, u), (b, o, x, X), and (e, E, f, F, g, G).
+
+Unsupported codes
+-----------------
+
+%r (which calls __repr__), and %a (which calls ascii() on __repr__) are not supported.
+
+!r and !a are not supported.
+
+The n integer and float format code is not supported.
 
 
 Open Questions
 ==============
 
-Do we need no support all the numeric format codes? The floating point
+Currently non-numeric objects go through::
+
+  - Py_buffer
+  - __bytes__
+  - failure
+
+Do we want to add a __format_bytes__ method in there?
+
+  - Guaranteed to produce only ascii (as in b'10', not b'\x0a')
+  - Makes more sense than using __bytes__ to produce ascii output
+  - What if an object has both __bytes__ and __format_bytes__?
+
+Do we need to support all the numeric format codes? The floating point
 exponential formats seem less appropriate, for example.
 
 
@@ -121,15 +158,11 @@ is a str (b'%s' % 'abc' --> b"'abc'").
 easily fixed.
 
 
-Foot notes
-==========
+Footnotes
+=========
 
-.. [1] Not sure if this should be the numeric __str__ or the numeric __repr__,
-       or if there's any difference
-.. [2] Any proper numeric class would then have to provide an ascii
-       representation of its value, either via __repr__ or __str__ (whichever
-       we choose in [1]).
-.. [3] TypeError, ValueError, or UnicodeEncodeError?
+.. [1] string.Template is not under consideration.
+.. [2] TypeError, ValueError, or UnicodeEncodeError?
 
 
 Copyright