Grammar fixes in the unicode vs bytes section of PEP 404
parent f2c1df7542
commit 6230377206
15
pep-0404.txt
@@ -92,13 +92,14 @@ they play a dual role in Python 2 as both ASCII text and as byte
 sequences. While Python 2 also has a unicode string type, the
 fundamental ambiguity of the core string type, coupled with Python 2's
 default behavior of supporting automatic coercion from 8-bit strings
-to unicodes when the two are combined, often leads to `UnicodeError`s.
-Python 3's standard string type is a unicode, and Python 3 adds a
-bytes type, but critically, no automatic coercion between bytes and
-unicodes is provided (the closest we get are a few text-based APIs that
-assume UTF-8 as the default encoding if no encoding is explicitly stated).
-Thus, the core interpreter, its I/O libraries, module names, etc. are clear
-in their distinction between unicode strings and bytes. Python 3's unicode
+to unicode objects when the two are combined, often leads to
+`UnicodeError`s. Python 3's standard string type is Unicode based, and
+Python 3 adds a dedicated bytes type, but critically, no automatic coercion
+between bytes and unicode strings is provided. The closest the language gets
+to implicit coercion are a few text-based APIs that assume a default
+encoding (usually UTF-8) if no encoding is explicitly stated. Thus, the core
+interpreter, its I/O libraries, module names, etc. are clear in their
+distinction between unicode strings and bytes. Python 3's unicode
 support even extends to the filesystem, so that non-ASCII file names are
 natively supported.
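
For reviewers who want to see the behavior the edited paragraph describes, here is a minimal sketch in Python 3. It is not part of the commit or of PEP 404; the helper `try_concat` and the sample string are hypothetical names chosen for illustration. It shows that combining `str` and `bytes` raises an error instead of coercing, and that conversion has to be requested explicitly.

```python
# Illustrative only: not part of the commit or of PEP 404.
# Python 3 keeps text (str) and binary data (bytes) strictly separate.

text = "caf\u00e9"              # a str holding four Unicode code points
data = text.encode("utf-8")     # explicit str -> bytes conversion


def try_concat(a, b):
    """Hypothetical helper: return a + b, or the name of the exception raised."""
    try:
        return a + b
    except TypeError as exc:
        return type(exc).__name__


# No automatic coercion in either direction.
mixed = try_concat(text, data)
reverse = try_concat(data, text)

# Going back to text also requires an explicit decode.
round_trip = data.decode("utf-8")

print(mixed, reverse, round_trip == text, len(text), len(data))
```

The encoding is named at both the `encode` and `decode` call, so nothing in the sketch relies on a default encoding.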