added list of currently supported codes; modified description for %a; added reference to competing PEP 460

This commit is contained in:
Ethan Furman 2014-03-26 07:46:34 -07:00
parent e274ee9805
commit c633a95251
1 changed files with 31 additions and 15 deletions

@ -50,11 +50,17 @@ Proposed semantics for ``bytes`` and ``bytearray`` formatting
%-interpolation
---------------
All the numeric formatting codes (such as ``%x``, ``%o``, ``%e``, ``%f``,
``%g``, etc.) will be supported, and will work as they do for str, including
the padding, justification and other related modifiers. The only difference
will be that the results from these codes will be ASCII-encoded text, not
unicode. In other words, for any numeric formatting code `%x`::
All the numeric formatting codes (``d``, ``i``, ``o``, ``u``, ``x``, ``X``,
``e``, ``E``, ``f``, ``F``, ``g``, ``G``, and any that are subsequently added
to Python 3) will be supported, and will work as they do for str, including
the padding, justification and other related modifiers (currently ``#``, ``0``,
``-``, `` `` (space), and ``+`` (plus any added to Python 3)). The only
non-numeric codes allowed are ``c``, ``s``, and ``a``.
For the numeric codes, the only difference between ``str`` and ``bytes`` (or
``bytearray``) interpolation is that the results from these codes will be
ASCII-encoded text, not Unicode. In other words, for any numeric formatting
code ``%x``::
b"%x" % val
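The rule stated in this hunk (numeric codes behave as they do for ``str``, but yield ASCII-encoded bytes) can be checked with a minimal sketch. It assumes an interpreter in which ``bytes`` %-interpolation is available (Python 3.5 or later); the helper name ``check_numeric_code`` and the sample values are hypothetical and chosen only for illustration.

```python
# Sketch only: assumes Python 3.5+, where bytes %-interpolation exists.
# check_numeric_code is a hypothetical helper, not part of the PEP.

def check_numeric_code(spec, value):
    """Return True if bytes and str interpolation agree for one numeric spec."""
    as_bytes = spec.encode("ascii") % value        # bytes interpolation
    as_text = (spec % value).encode("ascii")       # str interpolation, then ASCII-encode
    return as_bytes == as_text

# A few numeric codes combined with the modifiers listed above
# (#, 0, -, space, +). The trailing "|" is ordinary literal text.
samples = [
    ("%x", 255),
    ("%#o", 8),
    ("%+05d", 42),
    ("%-8.3f|", 3.14159),
    ("% e", 1234.5),
    ("%G", 1e-10),
]

results = {spec: check_numeric_code(spec, value) for spec, value in samples}
print(results)
```

Every entry in ``results`` should be ``True`` if the two forms of interpolation agree for the sampled codes.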
@ -116,18 +122,24 @@ Examples::
TypeError: b'%s' does not accept 'str', it must be encoded to `bytes`
``%a`` will call ``ascii()`` on the interpolated value. This is intended
as a debugging aid, rather than something that should be used in production.
Non-ASCII values will be encoded to either ``\xnn`` or ``\unnnn``
representation. Use cases include developing a new protocol and writing
landmarks into the stream; debugging data going into an existing protocol
to see if the problem is the protocol itself or bad data; a fall-back for a
serialization format; or even a rudimentary serialization format when
defining ``__bytes__`` would not be appropriate [8].
``%a`` will give the equivalent of
``repr(some_obj).encode('ascii', 'backslashreplace')`` on the interpolated
value. Use cases include developing a new protocol and writing landmarks
into the stream; debugging data going into an existing protocol to see if
the problem is the protocol itself or bad data; a fall-back for a serialization
format; or any situation where defining ``__bytes__`` would not be appropriate
but a readable/informative representation is needed [8].
.. note::
Examples::
If a ``str`` is passed into ``%a``, it will be surrounded by quotes.
>>> b'%a' % 3.14
b'3.14'
>>> b'%a' % b'abc'
b"b'abc'"
>>> b'%a' % 'def'
b"'def'"
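The equivalence described for ``%a`` can be illustrated with a short sketch, again assuming Python 3.5 or later. The helper ``ascii_repr_bytes`` is a hypothetical name that simply spells out the expression quoted above; the sample values are illustrative.

```python
# Sketch only: assumes Python 3.5+, where b'%a' is available.
# ascii_repr_bytes is a hypothetical helper mirroring the expression
# repr(some_obj).encode('ascii', 'backslashreplace') from the text.

def ascii_repr_bytes(obj):
    """Return the ASCII-only repr of obj as bytes."""
    return repr(obj).encode("ascii", "backslashreplace")

# Includes one non-ASCII string to exercise the backslash escaping.
values = [3.14, b"abc", "def", "caf\u00e9", None]

# Wrap each value in a 1-tuple so it is always treated as a single argument.
matches = [(b"%a" % (v,)) == ascii_repr_bytes(v) for v in values]
print(matches)
```

Note that because ``%a`` works from ``repr()``, a ``bytes`` argument keeps its ``b`` prefix and quotes in the output, and a ``str`` argument keeps its quotes.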
Unsupported codes
@ -166,6 +178,9 @@ Various new special methods were proposed, such as ``__ascii__``,
``__format_bytes__``, etc.; such methods are not needed at this time, but can
be visited again later if real-world use shows deficiencies with this solution.
A competing PEP, ``PEP 460 Add binary interpolation and formatting`` [9], also
exists.
Objections
==========
@ -204,6 +219,7 @@ Footnotes
examples: ``memoryview``, ``array.array``, ``bytearray``, ``bytes``
.. [7] http://docs.python.org/3/reference/datamodel.html#object.__bytes__
.. [8] https://mail.python.org/pipermail/python-dev/2014-February/132750.html
.. [9] http://python.org/dev/peps/pep-0460/
Copyright