PEP: XXX
Title: Adding % and {} formatting to bytes
Version: $Revision$
Last-Modified: $Date$
Author: Ethan Furman <ethan@stoneleaf.us>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 2014-01-13
Python-Version: 3.5
Post-History: 2014-01-13
Resolution:


Abstract
========

This PEP proposes adding the % and {} formatting operations from str to bytes.


Proposed semantics for bytes formatting
=======================================

%-interpolation
---------------

All the numeric formatting codes (such as %x, %o, %e, %f, %g, etc.)
will be supported, and will work as they do for str, including the
padding, justification and other related modifiers.

Example::

   >>> b'%4x' % 10
   b'   a'

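Because the numeric codes are meant to behave exactly as they do for str,
the expected results can be approximated by formatting with str and then
strictly encoding the result as ASCII.  The helper name numeric_to_bytes
below is hypothetical and not part of this proposal; it is only a sketch of
the intended equivalence.

```python
def numeric_to_bytes(fmt, value):
    """Hypothetical helper: emulate a numeric bytes format code via str.

    Formats ``value`` with the str version of ``fmt`` and strictly encodes
    the result as ASCII, which is what the proposal expects bytes to yield.
    """
    return (fmt % value).encode('ascii', 'strict')


print(numeric_to_bytes('%4x', 10))           # b'   a'
print(numeric_to_bytes('%05d', 42))          # b'00042'
print(numeric_to_bytes('%-8.2f|', 3.14159))  # b'3.14    |'
```

Padding, zero-fill and left-justification all carry over unchanged, since
the output of a numeric code is always pure ASCII.
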
%c will insert a single byte, either from an int in range(256), or from
a bytes argument of length 1.

Example::

    >>> b'%c' % 48
    b'0'

    >>> b'%c' % b'a'
    b'a'

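The %c rule can be modelled in a few lines of plain Python.  This is a
sketch only: the function name format_c is hypothetical, and the exception
types are assumptions, since this draft does not say what %c should raise
for an out-of-range int or a bytes argument of the wrong length.

```python
def format_c(value):
    """Hypothetical model of %c: return exactly one byte for ``value``."""
    if isinstance(value, int):
        if value not in range(256):
            # Assumption: the draft does not name the exception type.
            raise OverflowError('%c arg not in range(256)')
        return bytes([value])
    if isinstance(value, bytes) and len(value) == 1:
        return value
    # Assumption: any other argument is rejected with TypeError.
    raise TypeError('%c requires an int in range(256) or a bytes of length 1')


print(format_c(48))    # b'0'
print(format_c(b'a'))  # b'a'
```
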
%s, because it is the most general, has the most convoluted resolution:

  - input type is bytes?
    pass it straight through

  - input type is numeric?
    use its __xxx__ [1] [2] method and ascii-encode it (strictly)

  - input type is something else?
    use its __bytes__ method; if there isn't one, raise an exception [3]

Examples::

    >>> b'%s' % b'abc'
    b'abc'

    >>> b'%s' % 3.14
    b'3.14'

    >>> b'%s' % 'hello world!'
    Traceback (most recent call last):
    ...
    TypeError: 'hello world!' has no __bytes__ method, perhaps you need to encode it?

.. note::

   Because the str type does not have a __bytes__ method, attempts to
   directly use 'a string' as a bytes interpolation value will raise an
   exception.  To use 'string' values, they must be encoded or otherwise
   transformed into a bytes sequence::

      'a string'.encode('latin-1')

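The three-step %s resolution can be sketched as an ordinary function.  The
names resolve_s and Header are hypothetical.  Two details are assumptions
that the footnotes leave open: str() is used for numeric values, and
TypeError is raised when no __bytes__ method is found.

```python
import numbers


def resolve_s(value):
    """Hypothetical model of the three-step %s lookup."""
    # 1. bytes is passed straight through.
    if isinstance(value, bytes):
        return value
    # 2. Numeric input: take a text form and strictly ASCII-encode it.
    #    Assumption: str() is used; __str__ versus __repr__ is undecided.
    if isinstance(value, numbers.Number):
        return str(value).encode('ascii', 'strict')
    # 3. Anything else must supply __bytes__ (looked up on the type).
    to_bytes = getattr(type(value), '__bytes__', None)
    if to_bytes is None:
        # Assumption: TypeError; the exception type is still an open point.
        raise TypeError('%r has no __bytes__ method, perhaps you need '
                        'to encode it?' % (value,))
    return to_bytes(value)


class Header:
    """Illustrative type that opts in by defining __bytes__."""

    def __bytes__(self):
        return b'Content-Length'


print(resolve_s(b'abc'))     # b'abc'
print(resolve_s(3.14))       # b'3.14'
print(resolve_s(Header()))   # b'Content-Length'
```

A str argument falls through to the third step, finds no __bytes__ method,
and fails every time, which is the behavior the note above describes.
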

format
------

The format mini-language will be used as-is, with the behaviors as listed
for %-interpolation.

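bytes has no format method today, so the sketch below only models the
intended result for numeric values: it applies the existing format
mini-language through str and strictly ASCII-encodes the output.  The
helper name format_field is hypothetical and not part of this proposal.

```python
def format_field(value, spec=''):
    """Hypothetical helper: apply a format spec to a number, return bytes."""
    return format(value, spec).encode('ascii', 'strict')


print(format_field(10, '>4x'))       # b'   a'
print(format_field(3.14159, '.2f'))  # b'3.14'
print(format_field(42, '05d'))       # b'00042'
```
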

Open Questions
==============

For %s there has been some discussion of trying to use the buffer protocol
(Py_buffer) before trying __bytes__.  This question should be answered before
the PEP is implemented.

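To make the question concrete, here is a sketch of what a buffer-first
lookup could look like at the Python level, using memoryview as the
Python-visible face of the buffer protocol.  The function name
resolve_buffer_first is hypothetical, and the TypeError raised at the end is
an assumption; the draft does not settle either point.

```python
import array


def resolve_buffer_first(value):
    """Hypothetical variant: try the buffer protocol before __bytes__."""
    try:
        view = memoryview(value)
    except TypeError:
        view = None  # not a buffer exporter
    if view is not None:
        with view:
            # Copies the raw buffer contents into a new bytes object.
            return view.tobytes()
    to_bytes = getattr(type(value), '__bytes__', None)
    if to_bytes is None:
        # Assumption: TypeError when neither mechanism is available.
        raise TypeError('%r supports neither the buffer protocol nor '
                        '__bytes__' % (value,))
    return to_bytes(value)


print(resolve_buffer_first(bytearray(b'abc')))            # b'abc'
print(resolve_buffer_first(array.array('B', [104, 105])))  # b'hi'
```

With this ordering, bytearray, memoryview and array.array objects would be
accepted without defining __bytes__, while str would still be rejected
because it does not export a buffer.
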

Proposed variations
===================

It has been suggested to use %b for bytes instead of %s.

  - Rejected as %b does not exist in Python 2.x %-interpolation, which is
    why we are using %s.

It has been proposed to automatically use .encode('ascii','strict') for str
arguments to %s.

  - Rejected as this would lead to intermittent failures.  Better to have the
    operation always fail so the trouble-spot can be correctly fixed.

It has been proposed to have %s return the ascii-encoded repr when the value
is a str (b'%s' % 'abc'  --> b"'abc'").

  - Rejected as this would lead to hard-to-debug failures far from the problem
    site.  Better to have the operation always fail so the trouble-spot can be
    easily fixed.

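The two rejected str-handling variations can be modelled to show why each
is a hazard.  Both function names are hypothetical stand-ins for behavior
that this PEP does not adopt.

```python
def auto_encode(value):
    """Model of the rejected implicit .encode('ascii', 'strict')."""
    return value.encode('ascii', 'strict')


def ascii_repr(value):
    """Model of the rejected ascii-encoded repr for str values."""
    return repr(value).encode('ascii')


print(auto_encode('resume'))  # b'resume'
try:
    auto_encode('r\u00e9sum\u00e9')
except UnicodeEncodeError as exc:
    # The same call site fails only when non-ASCII data happens to arrive.
    print('intermittent failure:', type(exc).__name__)

# No error is raised, but the quotes silently become part of the data.
print(ascii_repr('abc'))      # b"'abc'"
```

In the first case the code passes its tests with ASCII input and breaks
later on real data; in the second the stray quote characters surface only
when something downstream consumes the bytes.
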

Footnotes
=========

.. [1] Not sure if this should be the numeric __str__ or the numeric __repr__,
       or if there's any difference.
.. [2] Any proper numeric class would then have to provide an ascii
       representation of its value, either via __repr__ or __str__ (whichever
       we choose in [1]).
.. [3] TypeError, ValueError, or UnicodeEncodeError?


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: