94 lines
2.3 KiB
ReStructuredText
94 lines
2.3 KiB
ReStructuredText
PEP: 332
|
||
Title: Byte vectors and String/Unicode Unification
|
||
Version: $Revision$
|
||
Last-Modified: $Date$
|
||
Author: Skip Montanaro <skip@pobox.com>
|
||
Status: Rejected
|
||
Type: Standards Track
|
||
Content-Type: text/x-rst
|
||
Created: 11-Aug-2004
|
||
Python-Version: 2.5
|
||
Post-History:
|
||
|
||
|
||
Abstract
|
||
========
|
||
|
||
This PEP outlines the introduction of a raw ``bytes`` sequence object
|
||
and the unification of the current ``str`` and ``unicode`` objects.
|
||
|
||
|
||
Rejection Notice
|
||
================
|
||
|
||
This PEP is rejected in this form. The author has expressed lack of
|
||
time to continue to shepherd it, and discussion on python-dev has
|
||
moved to a slightly different proposal which will (eventually) be
|
||
written up as a new PEP. See the thread starting at
|
||
https://mail.python.org/pipermail/python-dev/2006-February/060930.html.
|
||
|
||
|
||
Rationale
|
||
=========
|
||
|
||
Python's current string objects are overloaded. They serve both to
|
||
hold ASCII and non-ASCII character data and to also hold sequences of
|
||
raw bytes which have no reasonable interpretation as displayable
|
||
character sequences. This overlap hasn't been a big problem in the
|
||
past, but as Python moves closer to requiring source code to be
|
||
properly encoded, the use of strings to represent raw byte sequences
|
||
will be more problematic. In addition, as Python's Unicode support
|
||
has improved, it's easier to consider strings as ASCII-encoded Unicode
|
||
objects.
|
||
|
||
|
||
Proposed Implementation
|
||
=======================
|
||
|
||
The number in parentheses indicates the Python version in which the
|
||
feature will be introduced.
|
||
|
||
- Add a ``bytes`` builtin which is just a synonym for ``str``. (2.5)
|
||
|
||
- Add a ``b"..."`` string literal which is equivalent to raw string
|
||
literals, with the exception that values which conflict with the
|
||
source encoding of the containing file not generate warnings. (2.5)
|
||
|
||
- Warn about the use of variables named "bytes". (2.5 or 2.6)
|
||
|
||
- Introduce a ``bytes`` builtin which refers to a sequence distinct
|
||
from the ``str`` type. (2.6)
|
||
|
||
- Make ``str`` a synonym for ``unicode``. (3.0)
|
||
|
||
|
||
Bytes Object API
|
||
================
|
||
|
||
TBD.
|
||
|
||
|
||
Issues
|
||
======
|
||
|
||
- Can this be accomplished before Python 3.0?
|
||
|
||
- Should ``bytes`` objects be mutable or immutable? (Guido seems to
|
||
like them to be mutable.)
|
||
|
||
|
||
Copyright
|
||
=========
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
|
||
..
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
End:
|