From 7cc89487a0acf13b1d0226b79de637f3f6114c50 Mon Sep 17 00:00:00 2001
From: Ethan Furman
Date: Tue, 14 Jan 2014 11:04:10 -0800
Subject: [PATCH] PEP 461: Adding % and {} formatting to bytes

---
 pep-0461.txt | 142 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 142 insertions(+)
 create mode 100644 pep-0461.txt

diff --git a/pep-0461.txt b/pep-0461.txt
new file mode 100644
index 000000000..084e104b9
--- /dev/null
+++ b/pep-0461.txt
@@ -0,0 +1,142 @@
+PEP: XXX
+Title: Adding % and {} formatting to bytes
+Version: $Revision$
+Last-Modified: $Date$
+Author: Ethan Furman
+Status: Draft
+Type: Standards Track
+Content-Type: text/x-rst
+Created: 2014-01-13
+Python-Version: 3.5
+Post-History: 2014-01-13
+Resolution:
+
+
+Abstract
+========
+
+This PEP proposes adding the % and {} formatting operations from str to bytes.
+
+
+Proposed semantics for bytes formatting
+=======================================
+
+%-interpolation
+---------------
+
+All the numeric formatting codes (such as %x, %o, %e, %f, %g, etc.)
+will be supported, and will work as they do for str, including the
+padding, justification and other related modifiers.
+
+Example::
+
+    >>> b'%4x' % 10
+    b'   a'
+
+%c will insert a single byte, either from an int in range(256), or from
+a bytes argument of length 1.
+
+Example::
+
+    >>> b'%c' % 48
+    b'0'
+
+    >>> b'%c' % b'a'
+    b'a'
+
+%s, because it is the most general, has the most convoluted resolution:
+
+  - input type is bytes?
+    pass it straight through
+
+  - input type is numeric?
+    use its __xxx__ [1] [2] method and ascii-encode it (strictly)
+
+  - input type is something else?
+    use its __bytes__ method; if there isn't one, raise an exception [3]
+
+Examples::
+
+    >>> b'%s' % b'abc'
+    b'abc'
+
+    >>> b'%s' % 3.14
+    b'3.14'
+
+    >>> b'%s' % 'hello world!'
+    Traceback (most recent call last):
+    ...
+    TypeError: 'hello world!' has no __bytes__ method, perhaps you need to encode it?
+
+.. note::
+
+    Because the str type does not have a __bytes__ method, attempts to
+    directly use 'a string' as a bytes interpolation value will raise an
+    exception.  To use 'string' values, they must be encoded or otherwise
+    transformed into a bytes sequence::
+
+        'a string'.encode('latin-1')
+
+
+format
+------
+
+The format mini-language will be used as-is, with the behaviors as listed
+for %-interpolation.
+
+
+Open Questions
+==============
+
+For %s there has been some discussion of trying to use the buffer protocol
+(Py_buffer) before trying __bytes__.  This question should be answered before
+the PEP is implemented.
+
+
+Proposed variations
+===================
+
+It has been suggested to use %b for bytes instead of %s.
+
+  - Rejected as %b does not exist in Python 2.x %-interpolation, which is
+    why we are using %s.
+
+It has been proposed to automatically use .encode('ascii','strict') for str
+arguments to %s.
+
+  - Rejected as this would lead to intermittent failures.  Better to have the
+    operation always fail so the trouble-spot can be correctly fixed.
+
+It has been proposed to have %s return the ascii-encoded repr when the value
+is a str (b'%s' % 'abc'  --> b"'abc'").
+
+  - Rejected as this would lead to hard-to-debug failures far from the problem
+    site.  Better to have the operation always fail so the trouble-spot can be
+    easily fixed.
+
+
+Footnotes
+=========
+
+.. [1] Not sure if this should be the numeric __str__ or the numeric __repr__,
+       or if there's any difference
+.. [2] Any proper numeric class would then have to provide an ascii
+       representation of its value, either via __repr__ or __str__ (whichever
+       we choose in [1]).
+.. [3] TypeError, ValueError, or UnicodeEncodeError?
+
+
+Copyright
+=========
+
+This document has been placed in the public domain.
+
+
+..
+   Local Variables:
+   mode: indented-text
+   indent-tabs-mode: nil
+   sentence-end-double-space: t
+   fill-column: 70
+   coding: utf-8
+   End:
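
The %s resolution order proposed in this patch can be sketched in plain Python. This is only an illustration of the draft's three steps, not the implementation: ``resolve_percent_s`` is a hypothetical name, using ``str()`` for numbers assumes one answer to footnote [1], and raising ``TypeError`` assumes one answer to footnote [3]. The buffer-protocol question listed under Open Questions is not modeled.

```python
import numbers


def resolve_percent_s(value):
    """Hypothetical helper mirroring the draft's three-step %s resolution."""
    # Step 1: bytes pass straight through.
    if isinstance(value, bytes):
        return value

    # Step 2: numeric input is rendered as text and strictly ASCII-encoded.
    # Assumption: str() is used; the draft leaves str vs. repr open ([1]).
    if isinstance(value, numbers.Number):
        return str(value).encode('ascii', 'strict')

    # Step 3: anything else must provide __bytes__.
    # Assumption: TypeError is raised; the draft leaves the type open ([3]).
    to_bytes = getattr(type(value), '__bytes__', None)
    if to_bytes is None:
        raise TypeError(
            '%r has no __bytes__ method, perhaps you need to encode it?'
            % (value,)
        )
    return to_bytes(value)
```

Under these assumptions, ``resolve_percent_s(3.14)`` gives ``b'3.14'``, and a str argument such as ``'hello world!'`` raises ``TypeError`` until it is encoded explicitly, which matches the examples in the patch.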