PEP: 332
Title: Byte vectors and String/Unicode Unification
Version: $Revision$
Last-Modified: $Date$
Author: Skip Montanaro <skip@pobox.com>
Status: Rejected
Type: Standards Track
Content-Type: text/x-rst
Created: 11-Aug-2004
Python-Version: 2.5
Post-History:

Abstract
========
This PEP outlines the introduction of a raw ``bytes`` sequence object
and the unification of the current ``str`` and ``unicode`` objects.


Rejection Notice
================
This PEP is rejected in this form. The author has expressed lack of
time to continue to shepherd it, and discussion on python-dev has
moved to a slightly different proposal which will (eventually) be
written up as a new PEP. See the thread starting at
https://mail.python.org/pipermail/python-dev/2006-February/060930.html.


Rationale
=========
Python's current string objects are overloaded. They serve both to
hold ASCII and non-ASCII character data and to hold sequences of
raw bytes which have no reasonable interpretation as displayable
character sequences. This overlap hasn't been a big problem in the
past, but as Python moves closer to requiring source code to be
properly encoded, the use of strings to represent raw byte sequences
will become more problematic. In addition, as Python's Unicode support
has improved, it's easier to consider strings as ASCII-encoded Unicode
objects.


Proposed Implementation
=======================
The number in parentheses indicates the Python version in which the
feature will be introduced.

- Add a ``bytes`` builtin which is just a synonym for ``str``. (2.5)

- Add a ``b"..."`` string literal which is equivalent to raw string
  literals, with the exception that values which conflict with the
  source encoding of the containing file do not generate warnings. (2.5)

- Warn about the use of variables named "bytes". (2.5 or 2.6)

- Introduce a ``bytes`` builtin which refers to a sequence distinct
  from the ``str`` type. (2.6)

- Make ``str`` a synonym for ``unicode``. (3.0)

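
As a rough illustration of where the steps above lead, the following
sketch assumes a Python 3 interpreter, in which ``bytes`` is a type
distinct from ``str``, ``b"..."`` literals exist, and ``str`` holds
Unicode text. It shows only that end state, not the 2.5/2.6
transition steps listed here, and the sample values are arbitrary.

```python
# Sketch only: assumes Python 3 semantics, not the 2.x transition steps.

raw = b"\x89PNG\r\n"     # bytes literal: raw octets, no text meaning implied
text = "na\u00efve"      # str literal: a sequence of Unicode code points

# bytes and str are distinct types.
assert type(raw) is bytes
assert type(text) is str

# Indexing bytes yields an integer in range(256); indexing str yields str.
first_octet = raw[0]
third_char = text[2]

# Moving between the two requires naming an encoding explicitly.
encoded = text.encode("utf-8")
decoded = encoded.decode("utf-8")
```

The non-ASCII character occupies one position in ``text`` but two
octets in ``encoded``, which is the kind of distinction a single
overloaded string type obscures.
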

Bytes Object API
================

TBD.


Issues
======
- Can this be accomplished before Python 3.0?

- Should ``bytes`` objects be mutable or immutable? (Guido seems to
  like them to be mutable.)

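
The mutability question can be made concrete with a sketch that
assumes Python 3, which provides both an immutable ``bytes`` type and
a mutable ``bytearray`` type. It illustrates the trade-off only; it
is not the API this PEP would have specified, and the variable names
are hypothetical.

```python
# Sketch only: assumes Python 3, which ships both flavours.

frozen = bytes([0x01, 0x02, 0x03])       # immutable byte sequence
buffer = bytearray([0x01, 0x02, 0x03])   # mutable byte sequence

# Immutable: item assignment is rejected, but the object is hashable
# and can therefore be used as a dict key.
try:
    frozen[0] = 0xFF
    frozen_accepts_assignment = True
except TypeError:
    frozen_accepts_assignment = False
lookup = {frozen: "header"}

# Mutable: in-place updates work, but the object is unhashable.
buffer[0] = 0xFF
buffer.extend(b"\x04")
try:
    hash(buffer)
    buffer_is_hashable = True
except TypeError:
    buffer_is_hashable = False
```

A mutable sequence suits buffers that are filled or patched in place,
while an immutable one can serve as a dictionary key or set member.
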

Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End: