Revert Python 3 changes
parent fae7914313
commit 604910ad8c
143
pep-0333.txt
|
@ -142,51 +142,6 @@ callable was provided to it. Callables are only to be called, not
|
||||||
introspected upon.
|
introspected upon.
|
||||||
|
|
||||||
|
|
||||||
A Note On String Types
|
|
||||||
----------------------
|
|
||||||
|
|
||||||
In general, HTTP deals with bytes, which means that this specification
|
|
||||||
is mostly about handling bytes.
|
|
||||||
|
|
||||||
However, the content of those bytes often has some kind of textual
|
|
||||||
interpretation, and in Python, strings are the most convenient way
|
|
||||||
to handle text.
|
|
||||||
|
|
||||||
But in many Python versions and implementations, strings are Unicode,
|
|
||||||
rather than bytes. This requires a careful balance between a usable
|
|
||||||
API and correct translations between bytes and text in the context of
|
|
||||||
HTTP... especially to support porting code between Python
|
|
||||||
implementations with different ``str`` types.
|
|
||||||
|
|
||||||
WSGI therefore defines two kinds of "string":
|
|
||||||
|
|
||||||
* "Native" strings (which are always implemented using the type
|
|
||||||
named ``str``) that are used for request/response headers and
|
|
||||||
metadata
|
|
||||||
|
|
||||||
* "Bytestrings" (which are implemented using the ``bytes`` type
|
|
||||||
in Python 3, and ``str`` elsewhere), that are used for the bodies
|
|
||||||
of requests and responses (e.g. POST/PUT input data and HTML page
|
|
||||||
outputs).
|
|
||||||
|
|
||||||
Do not be confused however: even if Python's ``str`` type is actually
|
|
||||||
Unicode "under the hood", the *content* of native strings must
|
|
||||||
still be translatable to bytes via the Latin-1 encoding! (See
|
|
||||||
the section on `Unicode Issues`_ later in this document for more
|
|
||||||
details.)
|
|
||||||
|
|
||||||
In short: where you see the word "string" in this document, it refers
|
|
||||||
to a "native" string, i.e., an object of type ``str``, whether it is
|
|
||||||
internally implemented as bytes or unicode. Where you see references
|
|
||||||
to "bytestring", this should be read as "an object of type ``bytes``
|
|
||||||
under Python 3, or type ``str`` under Python 2".
|
|
||||||
|
|
||||||
And so, even though HTTP is in some sense "really just bytes", there
|
|
||||||
are many API conveniences to be had by using whatever Python's
|
|
||||||
default ``str`` type is.
|
|
||||||
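As a rough illustration of the two string kinds described in the note above, the sketch below shows how an application might look on Python 3: status and headers are native strings (``str``), while the body is a bytestring (``bytes``). It assumes Python 3, and the ``call_app`` helper is hypothetical, added only so the example can be exercised without a real server.

```python
def simple_app(environ, start_response):
    """Minimal WSGI application following the native-string/bytestring
    split described above (illustrative sketch, assumes Python 3)."""
    status = '200 OK'                                     # native string
    response_headers = [('Content-type', 'text/plain')]   # native strings
    start_response(status, response_headers)
    return [b'Hello world!\n']                            # bytestring body


def call_app(app):
    """Hypothetical helper (not part of WSGI): call the app once and
    return the status, the headers, and the joined response body."""
    captured = {}

    def start_response(status, response_headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = response_headers
        return lambda body_data: None   # stand-in for write(); unused here

    body = b''.join(app({}, start_response))
    return captured['status'], captured['headers'], body
```

Under the reverted wording, the same application would simply return ``['Hello world!\n']``, since on Python 2 the ``str`` type already holds bytes.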
|
|
||||||
|
|
||||||
|
|
||||||
The Application/Framework Side
|
The Application/Framework Side
|
||||||
------------------------------
|
------------------------------
|
||||||
|
|
||||||
|
@ -209,15 +164,13 @@ support application developers.)
|
||||||
Here are two example application objects; one is a function, and the
|
Here are two example application objects; one is a function, and the
|
||||||
other is a class::
|
other is a class::
|
||||||
|
|
||||||
# this would need to be a byte string in Python 3:
|
|
||||||
HELLO_WORLD = "Hello world!\n"
|
|
||||||
|
|
||||||
def simple_app(environ, start_response):
|
def simple_app(environ, start_response):
|
||||||
"""Simplest possible application object"""
|
"""Simplest possible application object"""
|
||||||
status = '200 OK'
|
status = '200 OK'
|
||||||
response_headers = [('Content-type', 'text/plain')]
|
response_headers = [('Content-type', 'text/plain')]
|
||||||
start_response(status, response_headers)
|
start_response(status, response_headers)
|
||||||
return [HELLO_WORLD]
|
return ['Hello world!\n']
|
||||||
|
|
||||||
|
|
||||||
class AppClass:
|
class AppClass:
|
||||||
"""Produce the same output, but using a class
|
"""Produce the same output, but using a class
|
||||||
|
@ -242,7 +195,7 @@ other is a class::
|
||||||
status = '200 OK'
|
status = '200 OK'
|
||||||
response_headers = [('Content-type', 'text/plain')]
|
response_headers = [('Content-type', 'text/plain')]
|
||||||
self.start(status, response_headers)
|
self.start(status, response_headers)
|
||||||
yield HELLO_WORLD
|
yield "Hello world!\n"
|
||||||
|
|
||||||
|
|
||||||
The Server/Gateway Side
|
The Server/Gateway Side
|
||||||
|
@ -290,7 +243,7 @@ server.
|
||||||
sys.stdout.write('%s: %s\r\n' % header)
|
sys.stdout.write('%s: %s\r\n' % header)
|
||||||
sys.stdout.write('\r\n')
|
sys.stdout.write('\r\n')
|
||||||
|
|
||||||
sys.stdout.write(data) # TODO: this needs to be binary on Py3
|
sys.stdout.write(data)
|
||||||
sys.stdout.flush()
|
sys.stdout.flush()
|
||||||
|
|
||||||
def start_response(status, response_headers, exc_info=None):
|
def start_response(status, response_headers, exc_info=None):
|
||||||
|
@ -373,7 +326,7 @@ a block boundary.)
|
||||||
"""Transform iterated output to piglatin, if it's okay to do so
|
"""Transform iterated output to piglatin, if it's okay to do so
|
||||||
|
|
||||||
Note that the "okayness" can change until the application yields
|
Note that the "okayness" can change until the application yields
|
||||||
its first non-empty bytestring, so 'transform_ok' has to be a mutable
|
its first non-empty string, so 'transform_ok' has to be a mutable
|
||||||
truth value.
|
truth value.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
|
@ -388,7 +341,7 @@ a block boundary.)
|
||||||
|
|
||||||
def next(self):
|
def next(self):
|
||||||
if self.transform_ok:
|
if self.transform_ok:
|
||||||
return piglatin(self._next()) # call must be byte-safe on Py3
|
return piglatin(self._next())
|
||||||
else:
|
else:
|
||||||
return self._next()
|
return self._next()
|
||||||
|
|
||||||
|
@ -423,7 +376,7 @@ a block boundary.)
|
||||||
|
|
||||||
if transform_ok:
|
if transform_ok:
|
||||||
def write_latin(data):
|
def write_latin(data):
|
||||||
write(piglatin(data)) # call must be byte-safe on Py3
|
write(piglatin(data))
|
||||||
return write_latin
|
return write_latin
|
||||||
else:
|
else:
|
||||||
return write
|
return write
|
||||||
|
@ -473,7 +426,7 @@ It is used only when the application has trapped an error and is
|
||||||
attempting to display an error message to the browser.
|
attempting to display an error message to the browser.
|
||||||
|
|
||||||
The ``start_response`` callable must return a ``write(body_data)``
|
The ``start_response`` callable must return a ``write(body_data)``
|
||||||
callable that takes one positional parameter: a bytestring to be written
|
callable that takes one positional parameter: a string to be written
|
||||||
as part of the HTTP response body. (Note: the ``write()`` callable is
|
as part of the HTTP response body. (Note: the ``write()`` callable is
|
||||||
provided only to support certain existing frameworks' imperative output
|
provided only to support certain existing frameworks' imperative output
|
||||||
APIs; it should not be used by new applications or frameworks if it
|
APIs; it should not be used by new applications or frameworks if it
|
||||||
|
@ -481,24 +434,24 @@ can be avoided. See the `Buffering and Streaming`_ section for more
|
||||||
details.)
|
details.)
|
||||||
|
|
||||||
When called by the server, the application object must return an
|
When called by the server, the application object must return an
|
||||||
iterable yielding zero or more bytestrings. This can be accomplished in a
|
iterable yielding zero or more strings. This can be accomplished in a
|
||||||
variety of ways, such as by returning a list of bytestrings, or by the
|
variety of ways, such as by returning a list of strings, or by the
|
||||||
application being a generator function that yields bytestrings, or
|
application being a generator function that yields strings, or
|
||||||
by the application being a class whose instances are iterable.
|
by the application being a class whose instances are iterable.
|
||||||
Regardless of how it is accomplished, the application object must
|
Regardless of how it is accomplished, the application object must
|
||||||
always return an iterable yielding zero or more bytestrings.
|
always return an iterable yielding zero or more strings.
|
||||||
|
|
||||||
The server or gateway must transmit the yielded bytestrings to the client
|
The server or gateway must transmit the yielded strings to the client
|
||||||
in an unbuffered fashion, completing the transmission of each bytestring
|
in an unbuffered fashion, completing the transmission of each string
|
||||||
before requesting another one. (In other words, applications
|
before requesting another one. (In other words, applications
|
||||||
**should** perform their own buffering. See the `Buffering and
|
**should** perform their own buffering. See the `Buffering and
|
||||||
Streaming`_ section below for more on how application output must be
|
Streaming`_ section below for more on how application output must be
|
||||||
handled.)
|
handled.)
|
||||||
|
|
||||||
The server or gateway should treat the yielded bytestrings as binary byte
|
The server or gateway should treat the yielded strings as binary byte
|
||||||
sequences: in particular, it should ensure that line endings are
|
sequences: in particular, it should ensure that line endings are
|
||||||
not altered. The application is responsible for ensuring that the
|
not altered. The application is responsible for ensuring that the
|
||||||
bytestring(s) to be written are in a format suitable for the client. (The
|
string(s) to be written are in a format suitable for the client. (The
|
||||||
server or gateway **may** apply HTTP transfer encodings, or perform
|
server or gateway **may** apply HTTP transfer encodings, or perform
|
||||||
other transformations for the purpose of implementing HTTP features
|
other transformations for the purpose of implementing HTTP features
|
||||||
such as byte-range transmission. See `Other HTTP Features`_, below,
|
such as byte-range transmission. See `Other HTTP Features`_, below,
|
||||||
|
@ -519,7 +472,7 @@ by the application. This protocol is intended to complement PEP 325's
|
||||||
generator support, and other common iterables with ``close()`` methods.
|
generator support, and other common iterables with ``close()`` methods.
|
||||||
|
|
||||||
(Note: the application **must** invoke the ``start_response()``
|
(Note: the application **must** invoke the ``start_response()``
|
||||||
callable before the iterable yields its first body bytestring, so that the
|
callable before the iterable yields its first body string, so that the
|
||||||
server can send the headers before any body content. However, this
|
server can send the headers before any body content. However, this
|
||||||
invocation **may** be performed by the iterable's first iteration, so
|
invocation **may** be performed by the iterable's first iteration, so
|
||||||
servers **must not** assume that ``start_response()`` has been called
|
servers **must not** assume that ``start_response()`` has been called
|
||||||
|
@ -612,7 +565,7 @@ have a fallback plan in the event such a variable is absent.
|
||||||
|
|
||||||
Note: missing variables (such as ``REMOTE_USER`` when no
|
Note: missing variables (such as ``REMOTE_USER`` when no
|
||||||
authentication has occurred) should be left out of the ``environ``
|
authentication has occurred) should be left out of the ``environ``
|
||||||
dictionary. Also note that CGI-defined variables must be native strings,
|
dictionary. Also note that CGI-defined variables must be strings,
|
||||||
if they are present at all. It is a violation of this specification
|
if they are present at all. It is a violation of this specification
|
||||||
for a CGI variable's value to be of any type other than ``str``.
|
for a CGI variable's value to be of any type other than ``str``.
|
||||||
|
|
||||||
|
@ -632,9 +585,9 @@ Variable Value
|
||||||
``"http"`` or ``"https"``, as appropriate.
|
``"http"`` or ``"https"``, as appropriate.
|
||||||
|
|
||||||
``wsgi.input`` An input stream (file-like object) from which
|
``wsgi.input`` An input stream (file-like object) from which
|
||||||
the HTTP request body bytes can be read. (The
|
the HTTP request body can be read. (The server
|
||||||
server or gateway may perform reads on-demand
|
or gateway may perform reads on-demand as
|
||||||
as requested by the application, or it may pre-
|
requested by the application, or it may pre-
|
||||||
read the client's request body and buffer it
|
read the client's request body and buffer it
|
||||||
in-memory or on disk, or use any other
|
in-memory or on disk, or use any other
|
||||||
technique for providing such an input stream,
|
technique for providing such an input stream,
|
||||||
|
@ -649,12 +602,6 @@ Variable Value
|
||||||
ending, and assume that it will be converted to
|
ending, and assume that it will be converted to
|
||||||
the correct line ending by the server/gateway.
|
the correct line ending by the server/gateway.
|
||||||
|
|
||||||
(On platforms where the ``str`` type is unicode,
|
|
||||||
the error stream **should** accept and log
|
|
||||||
arbitrary unicode without raising an error; it
|
|
||||||
is allowed, however, to substitute characters
|
|
||||||
that cannot be rendered in the stream's encoding.)
|
|
||||||
|
|
||||||
For many servers, ``wsgi.errors`` will be the
|
For many servers, ``wsgi.errors`` will be the
|
||||||
server's main error log. Alternatively, this
|
server's main error log. Alternatively, this
|
||||||
may be ``sys.stderr``, or a log file of some
|
may be ``sys.stderr``, or a log file of some
|
||||||
|
@ -798,7 +745,7 @@ headers, please see the `Other HTTP Features`_ section below.)
|
||||||
The ``start_response`` callable **must not** actually transmit the
|
The ``start_response`` callable **must not** actually transmit the
|
||||||
response headers. Instead, it must store them for the server or
|
response headers. Instead, it must store them for the server or
|
||||||
gateway to transmit **only** after the first iteration of the
|
gateway to transmit **only** after the first iteration of the
|
||||||
application return value that yields a non-empty bytestring, or upon
|
application return value that yields a non-empty string, or upon
|
||||||
the application's first invocation of the ``write()`` callable. In
|
the application's first invocation of the ``write()`` callable. In
|
||||||
other words, response headers must not be sent until there is actual
|
other words, response headers must not be sent until there is actual
|
||||||
body data available, or until the application's returned iterable is
|
body data available, or until the application's returned iterable is
|
||||||
|
@ -873,12 +820,12 @@ able to either generate a ``Content-Length`` header, or at least
|
||||||
avoid the need to close the client connection. If the application
|
avoid the need to close the client connection. If the application
|
||||||
does *not* call the ``write()`` callable, and returns an iterable
|
does *not* call the ``write()`` callable, and returns an iterable
|
||||||
whose ``len()`` is 1, then the server can automatically determine
|
whose ``len()`` is 1, then the server can automatically determine
|
||||||
``Content-Length`` by taking the length of the first bytestring yielded
|
``Content-Length`` by taking the length of the first string yielded
|
||||||
by the iterable.
|
by the iterable.
|
||||||
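The ``Content-Length`` shortcut described above could be sketched as follows. This is only one possible reading, assuming Python 3 and a re-iterable return value such as a list; the function name ``guess_content_length`` and its ``write_was_called`` flag are hypothetical and not defined by the specification.

```python
def guess_content_length(result, write_was_called):
    """Return an inferred Content-Length, or None if it cannot be inferred.

    The length is inferred only when write() was never called and the
    application's return value reports len() == 1.
    """
    if write_was_called:
        return None
    try:
        if len(result) != 1:
            return None
    except TypeError:
        # Generators and other unsized iterables have no len().
        return None
    # Peek at the only element; safe for lists and tuples, which can be
    # iterated again when the body is actually transmitted.
    first = next(iter(result))
    return len(first)
```

A server that cannot infer the length this way still has the other options the text mentions, such as chunked encoding or closing the connection.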
|
|
||||||
And, if the server and client both support HTTP/1.1 "chunked
|
And, if the server and client both support HTTP/1.1 "chunked
|
||||||
encoding" [3]_, then the server **may** use chunked encoding to send
|
encoding" [3]_, then the server **may** use chunked encoding to send
|
||||||
a chunk for each ``write()`` call or bytestring yielded by the iterable,
|
a chunk for each ``write()`` call or string yielded by the iterable,
|
||||||
thus generating a ``Content-Length`` header for each chunk. This
|
thus generating a ``Content-Length`` header for each chunk. This
|
||||||
allows the server to keep the client connection alive, if it wishes
|
allows the server to keep the client connection alive, if it wishes
|
||||||
to do so. Note that the server **must** comply fully with RFC 2616
|
to do so. Note that the server **must** comply fully with RFC 2616
|
||||||
|
@ -903,7 +850,7 @@ transmitted all at once, along with the response headers.
|
||||||
|
|
||||||
The corresponding approach in WSGI is for the application to simply
|
The corresponding approach in WSGI is for the application to simply
|
||||||
return a single-element iterable (such as a list) containing the
|
return a single-element iterable (such as a list) containing the
|
||||||
response body as a single bytestring. This is the recommended approach
|
response body as a single string. This is the recommended approach
|
||||||
for the vast majority of application functions, that render
|
for the vast majority of application functions, that render
|
||||||
HTML pages whose text easily fits in memory.
|
HTML pages whose text easily fits in memory.
|
||||||
|
|
||||||
|
@ -952,12 +899,12 @@ In order to better support asynchronous applications and servers,
|
||||||
middleware components **must not** block iteration waiting for
|
middleware components **must not** block iteration waiting for
|
||||||
multiple values from an application iterable. If the middleware
|
multiple values from an application iterable. If the middleware
|
||||||
needs to accumulate more data from the application before it can
|
needs to accumulate more data from the application before it can
|
||||||
produce any output, it **must** yield an empty bytestring.
|
produce any output, it **must** yield an empty string.
|
||||||
|
|
||||||
To put this requirement another way, a middleware component **must
|
To put this requirement another way, a middleware component **must
|
||||||
yield at least one value** each time its underlying application
|
yield at least one value** each time its underlying application
|
||||||
yields a value. If the middleware cannot yield any other value,
|
yields a value. If the middleware cannot yield any other value,
|
||||||
it must yield an empty bytestring.
|
it must yield an empty string.
|
||||||
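To make the "yield at least one value" rule concrete, here is a minimal sketch of a middleware-style generator that re-chunks application output into whole lines. It assumes Python 3 bytestrings, and it deliberately ignores ``start_response`` and ``close()`` handling; the name ``buffer_lines`` is hypothetical.

```python
def buffer_lines(app_iter):
    """Re-chunk an application's output so each non-empty value ends on
    a line boundary, yielding exactly one value per application value."""
    pending = b''
    for chunk in app_iter:
        pending += chunk
        if b'\n' in pending:
            # Emit everything up to the last newline; keep the remainder.
            complete, _, pending = pending.rpartition(b'\n')
            yield complete + b'\n'
        else:
            # Not enough data yet: yield an empty value instead of blocking.
            yield b''
    if pending:
        # Flush any trailing partial line once the application is done.
        yield pending
```

Because the generator never consumes two application values without yielding in between, a server can regain control after every step, which is the point of the requirement.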
|
|
||||||
This requirement ensures that asynchronous applications and servers
|
This requirement ensures that asynchronous applications and servers
|
||||||
can conspire to reduce the number of threads that are required
|
can conspire to reduce the number of threads that are required
|
||||||
|
@ -999,22 +946,22 @@ for web servers to interleave other tasks in the same Python thread,
|
||||||
potentially providing better throughput for the server as a whole.
|
potentially providing better throughput for the server as a whole.
|
||||||
|
|
||||||
The ``write()`` callable is returned by the ``start_response()``
|
The ``write()`` callable is returned by the ``start_response()``
|
||||||
callable, and it accepts a single parameter: a bytestring to be
|
callable, and it accepts a single parameter: a string to be
|
||||||
written as part of the HTTP response body, that is treated exactly
|
written as part of the HTTP response body, that is treated exactly
|
||||||
as though it had been yielded by the output iterable. In other
|
as though it had been yielded by the output iterable. In other
|
||||||
words, before ``write()`` returns, it must guarantee that the
|
words, before ``write()`` returns, it must guarantee that the
|
||||||
passed-in bytestring was either completely sent to the client, or
|
passed-in string was either completely sent to the client, or
|
||||||
that it is buffered for transmission while the application
|
that it is buffered for transmission while the application
|
||||||
proceeds onward.
|
proceeds onward.
|
||||||
|
|
||||||
An application **must** return an iterable object, even if it
|
An application **must** return an iterable object, even if it
|
||||||
uses ``write()`` to produce all or part of its response body.
|
uses ``write()`` to produce all or part of its response body.
|
||||||
The returned iterable **may** be empty (i.e. yield no non-empty
|
The returned iterable **may** be empty (i.e. yield no non-empty
|
||||||
bytestrings), but if it *does* yield non-empty bytestrings, that output
|
strings), but if it *does* yield non-empty strings, that output
|
||||||
must be treated normally by the server or gateway (i.e., it must be
|
must be treated normally by the server or gateway (i.e., it must be
|
||||||
sent or queued immediately). Applications **must not** invoke
|
sent or queued immediately). Applications **must not** invoke
|
||||||
``write()`` from within their return iterable, and therefore any
|
``write()`` from within their return iterable, and therefore any
|
||||||
bytestrings yielded by the iterable are transmitted after all bytestrings
|
strings yielded by the iterable are transmitted after all strings
|
||||||
passed to ``write()`` have been sent to the client.
|
passed to ``write()`` have been sent to the client.
|
||||||
|
|
||||||
|
|
||||||
|
@ -1023,9 +970,9 @@ Unicode Issues
|
||||||
|
|
||||||
HTTP does not directly support Unicode, and neither does this
|
HTTP does not directly support Unicode, and neither does this
|
||||||
interface. All encoding/decoding must be handled by the application;
|
interface. All encoding/decoding must be handled by the application;
|
||||||
all strings passed to or from the server must be of type ``str`` or
|
all strings passed to or from the server must be standard Python byte
|
||||||
``bytes``, never ``unicode``. The result of using a ``unicode``
|
strings, not Unicode objects. The result of using a Unicode object
|
||||||
object where a string object is required, is undefined.
|
where a string object is required, is undefined.
|
||||||
|
|
||||||
Note also that strings passed to ``start_response()`` as a status or
|
Note also that strings passed to ``start_response()`` as a status or
|
||||||
as response headers **must** follow RFC 2616 with respect to encoding.
|
as response headers **must** follow RFC 2616 with respect to encoding.
|
||||||
|
@ -1033,7 +980,7 @@ That is, they must either be ISO-8859-1 characters, or use RFC 2047
|
||||||
MIME encoding.
|
MIME encoding.
|
||||||
|
|
||||||
On Python platforms where the ``str`` or ``StringType`` type is in
|
On Python platforms where the ``str`` or ``StringType`` type is in
|
||||||
fact Unicode-based (e.g. Jython, IronPython, Python 3, etc.), all
|
fact Unicode-based (e.g. Jython, IronPython, Python 3000, etc.), all
|
||||||
"strings" referred to in this specification must contain only
|
"strings" referred to in this specification must contain only
|
||||||
code points representable in ISO-8859-1 encoding (``\u0000`` through
|
code points representable in ISO-8859-1 encoding (``\u0000`` through
|
||||||
``\u00FF``, inclusive). It is a fatal error for an application to
|
``\u00FF``, inclusive). It is a fatal error for an application to
|
||||||
|
@ -1041,18 +988,12 @@ supply strings containing any other Unicode character or code point.
|
||||||
Similarly, servers and gateways **must not** supply
|
Similarly, servers and gateways **must not** supply
|
||||||
strings to an application containing any other Unicode characters.
|
strings to an application containing any other Unicode characters.
|
||||||
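On a platform whose ``str`` type is Unicode-based, the ISO-8859-1 restriction above can be checked mechanically. The following is a minimal sketch assuming Python 3; the helper name ``is_wsgi_native_string`` is hypothetical and not part of the specification.

```python
def is_wsgi_native_string(value):
    """Return True if value is a str whose code points all lie in
    U+0000 through U+00FF, i.e. it can be encoded as ISO-8859-1."""
    if not isinstance(value, str):
        return False
    try:
        value.encode('iso-8859-1')
    except UnicodeEncodeError:
        return False
    return True
```

A server or validator might apply such a check to the status line and each header name and value before transmitting them.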
|
|
||||||
Again, all objects referred to in this specification as "strings"
|
Again, all strings referred to in this specification **must** be
|
||||||
**must** be of type ``str`` or ``StringType``, and **must not** be
|
of type ``str`` or ``StringType``, and **must not** be of type
|
||||||
of type ``unicode`` or ``UnicodeType``. And, even if a given platform
|
``unicode`` or ``UnicodeType``. And, even if a given platform allows
|
||||||
allows for more than 8 bits per character in ``str``/``StringType``
|
for more than 8 bits per character in ``str``/``StringType`` objects,
|
||||||
objects, only the lower 8 bits may be used, for any value referred
|
only the lower 8 bits may be used, for any value referred to in
|
||||||
to in this specification as a "string".
|
this specification as a "string".
|
||||||
|
|
||||||
For values referred to in this specification as "bytestrings"
|
|
||||||
(i.e., values read from ``wsgi.input``, passed to ``write()``
|
|
||||||
or yielded by the application), the value **must** be of type
|
|
||||||
``bytes`` under Python 3, and ``str`` in earlier versions of
|
|
||||||
Python.
|
|
||||||
|
|
||||||
|
|
||||||
Error Handling
|
Error Handling
|
||||||
|
@ -1507,7 +1448,7 @@ Questions and Answers
|
||||||
``environ`` dictionary. This is the recommended approach for
|
``environ`` dictionary. This is the recommended approach for
|
||||||
offering any such value-added services.
|
offering any such value-added services.
|
||||||
|
|
||||||
2. Why can you call ``write()`` *and* yield bytestrings/return an
|
2. Why can you call ``write()`` *and* yield strings/return an
|
||||||
iterable? Shouldn't we pick just one way?
|
iterable? Shouldn't we pick just one way?
|
||||||
|
|
||||||
If we supported only the iteration approach, then current
|
If we supported only the iteration approach, then current
|
||||||
|
|