2004-08-27 13:30:09 -04:00
|
|
|
|
PEP: 333
|
|
|
|
|
Title: Python Web Server Gateway Interface v1.0
|
|
|
|
|
Version: $Revision$
|
|
|
|
|
Last-Modified: $Date$
|
|
|
|
|
Author: Phillip J. Eby <pje@telecommunity.com>
|
|
|
|
|
Discussions-To: Python Web-SIG <web-sig@python.org>
|
|
|
|
|
Status: Draft
|
|
|
|
|
Type: Informational
|
|
|
|
|
Content-Type: text/x-rst
|
|
|
|
|
Created: 07-Dec-2003
|
2004-08-27 17:14:42 -04:00
|
|
|
|
Post-History: 07-Dec-2003, 08-Aug-2004, 20-Aug-2004, 27-Aug-2004
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Abstract
|
|
|
|
|
========
|
|
|
|
|
|
|
|
|
|
This document specifies a proposed standard interface between web
|
|
|
|
|
servers and Python web applications or frameworks, to promote web
|
|
|
|
|
application portability across a variety of web servers.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Rationale and Goals
|
|
|
|
|
===================
|
|
|
|
|
|
|
|
|
|
Python currently boasts a wide variety of web application frameworks,
|
|
|
|
|
such as Zope, Quixote, Webware, SkunkWeb, PSO, and Twisted Web -- to
|
|
|
|
|
name just a few [1]_. This wide variety of choices can be a problem
|
|
|
|
|
for new Python users, because generally speaking, their choice of web
|
|
|
|
|
framework will limit their choice of usable web servers, and vice
|
|
|
|
|
versa.
|
|
|
|
|
|
|
|
|
|
By contrast, although Java has just as many web application frameworks
|
|
|
|
|
available, Java's "servlet" API makes it possible for applications
|
|
|
|
|
written with any Java web application framework to run in any web
|
|
|
|
|
server that supports the servlet API.
|
|
|
|
|
|
|
|
|
|
The availability and widespread use of such an API in web servers for
|
|
|
|
|
Python -- whether those servers are written in Python (e.g. Medusa),
|
|
|
|
|
embed Python (e.g. mod_python), or invoke Python via a gateway
|
|
|
|
|
protocol (e.g. CGI, FastCGI, etc.) -- would separate choice of
|
|
|
|
|
framework from choice of web server, freeing users to choose a pairing
|
|
|
|
|
that suits them, while freeing framework and server developers to
|
|
|
|
|
focus on their preferred area of specialization.
|
|
|
|
|
|
|
|
|
|
This PEP, therefore, proposes a simple and universal interface between
|
|
|
|
|
web servers and web applications or frameworks: the Python Web Server
|
|
|
|
|
Gateway Interface (WSGI).
|
|
|
|
|
|
|
|
|
|
But the mere existence of a WSGI spec does nothing to address the
|
|
|
|
|
existing state of servers and frameworks for Python web applications.
|
|
|
|
|
Server and framework authors and maintainers must actually implement
|
|
|
|
|
WSGI for there to be any effect.
|
|
|
|
|
|
|
|
|
|
However, since no existing servers or frameworks support WSGI, there
|
|
|
|
|
is little immediate reward for an author who implements WSGI support.
|
2004-08-31 18:00:28 -04:00
|
|
|
|
Thus, WSGI **must** be easy to implement, so that an author's initial
|
2004-08-27 13:30:09 -04:00
|
|
|
|
investment in the interface can be reasonably low.
|
|
|
|
|
|
|
|
|
|
Thus, simplicity of implementation on *both* the server and framework
|
|
|
|
|
sides of the interface is absolutely critical to the utility of the
|
|
|
|
|
WSGI interface, and is therefore the principal criterion for any
|
|
|
|
|
design decisions.
|
|
|
|
|
|
|
|
|
|
Note, however, that simplicity of implementation for a framework
|
|
|
|
|
author is not the same thing as ease of use for a web application
|
|
|
|
|
author. WSGI presents an absolutely "no frills" interface to the
|
|
|
|
|
framework author, because bells and whistles like response objects and
|
|
|
|
|
cookie handling would just get in the way of existing frameworks'
|
|
|
|
|
handling of these issues. Again, the goal of WSGI is to facilitate
|
|
|
|
|
easy interconnection of existing servers and applications or
|
|
|
|
|
frameworks, not to create a new web framework.
|
|
|
|
|
|
|
|
|
|
Note also that this goal precludes WSGI from requiring anything that
|
|
|
|
|
is not already available in deployed versions of Python. Therefore,
|
|
|
|
|
new standard library modules are not proposed or required by this
|
|
|
|
|
specification, and nothing in WSGI requires a Python version greater
|
|
|
|
|
than 2.2.2. (It would be a good idea, however, for future versions
|
|
|
|
|
of Python to include support for this interface in web servers
|
|
|
|
|
provided by the standard library.)
|
|
|
|
|
|
|
|
|
|
In addition to ease of implementation for existing and future
|
|
|
|
|
frameworks and servers, it should also be easy to create request
|
|
|
|
|
preprocessors, response postprocessors, and other WSGI-based
|
|
|
|
|
"middleware" components that look like an application to their
|
|
|
|
|
containing server, while acting as a server for their contained
|
|
|
|
|
applications.
|
|
|
|
|
|
|
|
|
|
If middleware can be both simple and robust, and WSGI is widely
|
|
|
|
|
available in servers and frameworks, it allows for the possibility
|
|
|
|
|
of an entirely new kind of Python web application framework: one
|
|
|
|
|
consisting of loosely-coupled WSGI middleware components. Indeed,
|
|
|
|
|
existing framework authors may even choose to refactor their
|
|
|
|
|
frameworks' existing services to be provided in this way, becoming
|
|
|
|
|
more like libraries used with WSGI, and less like monolithic
|
|
|
|
|
frameworks. This would then allow application developers to choose
|
|
|
|
|
"best-of-breed" components for specific functionality, rather than
|
|
|
|
|
having to commit to all the pros and cons of a single framework.
|
|
|
|
|
|
|
|
|
|
Of course, as of this writing, that day is doubtless quite far off.
|
|
|
|
|
In the meantime, it is a sufficient short-term goal for WSGI to
|
|
|
|
|
enable the use of any framework with any server.
|
|
|
|
|
|
|
|
|
|
Finally, it should be mentioned that the current version of WSGI
|
|
|
|
|
does not prescribe any particular mechanism for "deploying" an
|
|
|
|
|
application for use with a web server or server gateway. At the
|
|
|
|
|
present time, this is necessarily implementation-defined by the
|
|
|
|
|
server or gateway. After a sufficient number of servers and
|
|
|
|
|
frameworks have implemented WSGI to provide field experience with
|
|
|
|
|
varying deployment requirements, it may make sense to create
|
|
|
|
|
another PEP, describing a deployment standard for WSGI servers and
|
|
|
|
|
application frameworks.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Specification Overview
|
|
|
|
|
======================
|
|
|
|
|
|
|
|
|
|
The WSGI interface has two sides: the "server" or "gateway" side, and
|
|
|
|
|
the "application" or "framework" side. The server side invokes a
|
|
|
|
|
callable object that is provided by the application side. The
|
|
|
|
|
specifics of how that object is provided are up to the server or
|
|
|
|
|
gateway. It is assumed that some servers or gateways will require an
|
|
|
|
|
application's deployer to write a short script to create an instance
|
|
|
|
|
of the server or gateway, and supply it with the application object.
|
|
|
|
|
Other servers and gateways may use configuration files or other
|
|
|
|
|
mechanisms to specify where an application object should be
|
|
|
|
|
imported from, or otherwise obtained.
|
|
|
|
|
|
2004-09-17 11:31:58 -04:00
|
|
|
|
In addition to "pure" servers/gateways and applications/frameworks,
|
|
|
|
|
it is also possible to create "middleware" components that implement
|
|
|
|
|
both sides of this specification. Such components act as an
|
|
|
|
|
application to their containing server, and as a server to a
|
|
|
|
|
contained application, and can be used to provide extended APIs,
|
|
|
|
|
content transformation, navigation, and other useful functions.
|
|
|
|
|
|
2004-09-17 12:32:17 -04:00
|
|
|
|
Throughout this specification, we will use the term "a callable" to
|
|
|
|
|
mean "a function, method, class, or an instance with a ``__call__``
|
|
|
|
|
method". It is up to the server, gateway, or application implementing
|
|
|
|
|
the callable to choose the appropriate implementation technique for
|
|
|
|
|
their needs. Conversely, a server, gateway, or application that is
|
|
|
|
|
invoking a callable **must not** have any dependency on what kind of
|
|
|
|
|
callable was provided to it. Callables are only to be called, not
|
|
|
|
|
introspected upon.
|
|
|
|
|
|
2004-09-17 11:31:58 -04:00
|
|
|
|
|
|
|
|
|
The Application/Framework Side
|
|
|
|
|
------------------------------
|
|
|
|
|
|
2004-08-27 13:30:09 -04:00
|
|
|
|
The application object is simply a callable object that accepts
|
|
|
|
|
two arguments. The term "object" should not be misconstrued as
|
|
|
|
|
requiring an actual object instance: a function, method, class,
|
|
|
|
|
or instance with a ``__call__`` method are all acceptable for
|
2004-09-17 12:32:17 -04:00
|
|
|
|
use as an application object. Application objects must be able
|
|
|
|
|
to be invoked more than once, as virtually all servers/gateways
|
|
|
|
|
(other than CGI) will make such repeated requests.
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
|
|
|
|
(Note: although we refer to it as an "application" object, this
|
|
|
|
|
should not be construed to mean that application developers will use
|
|
|
|
|
WSGI as a web programming API! It is assumed that application
|
|
|
|
|
developers will continue to use existing, high-level framework
|
|
|
|
|
services to develop their applications. WSGI is a tool for
|
|
|
|
|
framework and server developers, and is not intended to directly
|
|
|
|
|
support application developers.)
|
|
|
|
|
|
|
|
|
|
Here are two example application objects; one is a function, and the
|
|
|
|
|
other is a class::
|
|
|
|
|
|
|
|
|
|
def simple_app(environ, start_response):
|
|
|
|
|
"""Simplest possible application object"""
|
|
|
|
|
status = '200 OK'
|
|
|
|
|
headers = [('Content-type','text/plain')]
|
2004-08-31 18:00:28 -04:00
|
|
|
|
start_response(status, headers)
|
|
|
|
|
return ['Hello world!\n']
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
class AppClass:
|
2004-08-31 18:00:28 -04:00
|
|
|
|
"""Produce the same output, but using a class
|
|
|
|
|
|
|
|
|
|
(Note: 'AppClass' is the "application" here, so calling it
|
|
|
|
|
returns an instance of 'AppClass', which is then the iterable
|
|
|
|
|
return value of the "application callable" as required by
|
|
|
|
|
the spec.
|
|
|
|
|
|
|
|
|
|
If we wanted to use *instances* of 'AppClass' as application
|
|
|
|
|
objects instead, we would have to implement a '__call__'
|
|
|
|
|
method, which would be invoked to execute the application,
|
|
|
|
|
and we would need to create an instance for use by the
|
|
|
|
|
server or gateway.
|
2004-08-27 13:30:09 -04:00
|
|
|
|
"""
|
|
|
|
|
|
|
|
|
|
def __init__(self, environ, start_response):
|
|
|
|
|
self.environ = environ
|
|
|
|
|
self.start = start_response
|
|
|
|
|
|
|
|
|
|
def __iter__(self):
|
|
|
|
|
status = '200 OK'
|
|
|
|
|
headers = [('Content-type','text/plain')]
|
|
|
|
|
self.start(status, headers)
|
|
|
|
|
yield "Hello world!\n"
|
|
|
|
|
|
2004-09-17 11:31:58 -04:00
|
|
|
|
|
|
|
|
|
The Server/Gateway Side
|
|
|
|
|
-----------------------
|
|
|
|
|
|
2004-08-27 13:30:09 -04:00
|
|
|
|
The server or gateway invokes the application callable once for each
|
|
|
|
|
request it receives from an HTTP client, that is directed at the
|
|
|
|
|
application. To illustrate, here is a simple CGI gateway, implemented
|
2004-09-13 16:01:54 -04:00
|
|
|
|
as a function taking an application object. Note that this simple
|
|
|
|
|
example has limited error handling, because by default an uncaught
|
|
|
|
|
exception will be dumped to ``sys.stderr`` and logged by the web
|
|
|
|
|
server.
|
|
|
|
|
|
|
|
|
|
::
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
|
|
|
|
import os, sys
|
|
|
|
|
|
|
|
|
|
def run_with_cgi(application):
|
|
|
|
|
|
2004-09-15 21:56:44 -04:00
|
|
|
|
environ = dict(os.environ.items())
|
2004-08-27 13:30:09 -04:00
|
|
|
|
environ['wsgi.input'] = sys.stdin
|
|
|
|
|
environ['wsgi.errors'] = sys.stderr
|
|
|
|
|
environ['wsgi.version'] = (1,0)
|
|
|
|
|
environ['wsgi.multithread'] = False
|
|
|
|
|
environ['wsgi.multiprocess'] = True
|
|
|
|
|
environ['wsgi.last_call'] = True
|
|
|
|
|
|
2004-09-13 16:01:54 -04:00
|
|
|
|
if environ.get('HTTPS','off') in ('on','1'):
|
|
|
|
|
environ['wsgi.url_scheme'] = 'https'
|
|
|
|
|
else:
|
|
|
|
|
environ['wsgi.url_scheme'] = 'http'
|
|
|
|
|
|
|
|
|
|
headers_set = []
|
|
|
|
|
headers_sent = []
|
|
|
|
|
|
2004-08-31 18:00:28 -04:00
|
|
|
|
def write(data):
|
2004-09-13 16:01:54 -04:00
|
|
|
|
if not headers_set:
|
|
|
|
|
raise AssertionError("write() before start_response()")
|
|
|
|
|
|
|
|
|
|
elif not headers_sent:
|
|
|
|
|
# Before the first output, send the stored headers
|
|
|
|
|
status, headers = headers_sent[:] = headers_set
|
|
|
|
|
sys.stdout.write('Status: %s\r\n' % status)
|
|
|
|
|
for header in headers:
|
|
|
|
|
sys.stdout.write('%s: %s\r\n' % header)
|
|
|
|
|
sys.stdout.write('\r\n')
|
|
|
|
|
|
2004-09-01 00:59:21 -04:00
|
|
|
|
sys.stdout.write(data)
|
2004-08-31 18:00:28 -04:00
|
|
|
|
sys.stdout.flush()
|
|
|
|
|
|
2004-09-13 16:01:54 -04:00
|
|
|
|
def start_response(status,headers,exc_info=None):
|
|
|
|
|
if exc_info:
|
|
|
|
|
try:
|
|
|
|
|
if headers_sent:
|
|
|
|
|
# Re-raise original exception if headers sent
|
|
|
|
|
raise exc_info[0], exc_info[1], exc_info[2]
|
|
|
|
|
finally:
|
|
|
|
|
exc_info = None # avoid dangling circular ref
|
|
|
|
|
elif headers_sent:
|
|
|
|
|
raise AssertionError("Headers already sent!")
|
|
|
|
|
|
|
|
|
|
headers_set[:] = [status,headers]
|
2004-08-27 13:30:09 -04:00
|
|
|
|
return write
|
|
|
|
|
|
|
|
|
|
result = application(environ, start_response)
|
2004-08-31 18:00:28 -04:00
|
|
|
|
try:
|
|
|
|
|
for data in result:
|
2004-09-16 17:22:39 -04:00
|
|
|
|
if data: # don't send headers until body appears
|
|
|
|
|
write(data)
|
2004-09-15 21:56:44 -04:00
|
|
|
|
if not headers_sent:
|
2004-09-16 17:22:39 -04:00
|
|
|
|
write('') # send headers now if body was empty
|
2004-08-31 18:00:28 -04:00
|
|
|
|
finally:
|
|
|
|
|
if hasattr(result,'close'):
|
|
|
|
|
result.close()
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
2004-09-17 11:31:58 -04:00
|
|
|
|
|
|
|
|
|
Middleware: Components that Play Both Sides
|
|
|
|
|
-------------------------------------------
|
|
|
|
|
|
|
|
|
|
Note that a single object may play the role of a server with respect
|
|
|
|
|
to some application(s), while also acting as an application with
|
|
|
|
|
respect to some server(s). Such "middleware" components can perform
|
|
|
|
|
such functions as:
|
|
|
|
|
|
|
|
|
|
* Routing a request to different application objects based on the
|
|
|
|
|
target URL, after rewriting the ``environ`` accordingly.
|
|
|
|
|
|
|
|
|
|
* Allowing multiple applications or frameworks to run side-by-side
|
|
|
|
|
in the same process
|
|
|
|
|
|
|
|
|
|
* Load balancing and remote processing, by forwarding requests and
|
|
|
|
|
responses over a network
|
|
|
|
|
|
|
|
|
|
* Perform content postprocessing, such as applying XSL stylesheets
|
|
|
|
|
|
|
|
|
|
The presence of middleware in general is transparent to both the
|
|
|
|
|
"server/gateway" and the "application/framework" sides of the
|
|
|
|
|
interface, and should require no special support. A user who
|
|
|
|
|
desires to incorporate middleware into an application simply
|
|
|
|
|
provides the middleware component to the server, as if it were
|
|
|
|
|
an application, and configures the middleware component to
|
|
|
|
|
invoke the application, as if the middleware component were a
|
|
|
|
|
server. Of course, the "application" that the middleware wraps
|
|
|
|
|
may in fact be another middleware component wrapping another
|
|
|
|
|
application, and so on, creating what is referred to as a
|
|
|
|
|
"middleware stack".
|
|
|
|
|
|
|
|
|
|
For the most part, middleware must conform to the restrictions
|
|
|
|
|
and requirements of both the server and application sides of
|
|
|
|
|
WSGI. In some cases, however, requirements for middleware
|
|
|
|
|
are more stringent than for a "pure" server or application,
|
|
|
|
|
and these points will be noted in the specification.
|
|
|
|
|
|
|
|
|
|
Here is a (tongue-in-cheek) example of a middleware component that
|
|
|
|
|
converts ``text/plain`` responses to pig latin, using Joe Strout's
|
|
|
|
|
``piglatin.py``. (Note: a "real" middleware component would
|
|
|
|
|
probably use a more robust way of checking the content type, and
|
2004-09-17 12:32:17 -04:00
|
|
|
|
should also check for a content encoding. Also, this simple
|
|
|
|
|
example ignores the possibility that a word might be split across
|
|
|
|
|
a block boundary.)
|
2004-09-17 11:31:58 -04:00
|
|
|
|
|
|
|
|
|
::
|
|
|
|
|
|
|
|
|
|
from piglatin import piglatin
|
|
|
|
|
|
2004-09-17 12:32:17 -04:00
|
|
|
|
class LatinIter:
|
|
|
|
|
|
|
|
|
|
"""Transform iterated output to piglatin, if it's okay to do so
|
|
|
|
|
|
|
|
|
|
Note that the "okayness" can change until the application yields
|
|
|
|
|
its first non-empty string, so 'transform_ok' has to be a mutable
|
|
|
|
|
truth value."""
|
|
|
|
|
|
|
|
|
|
def __init__(self,result,transform_ok):
|
|
|
|
|
if hasattr(result,'close'):
|
|
|
|
|
self.close = result.close
|
|
|
|
|
self._next = iter(result).next
|
|
|
|
|
self.transform_ok = transform_ok
|
|
|
|
|
|
|
|
|
|
def __iter__(self):
|
|
|
|
|
return self
|
|
|
|
|
|
|
|
|
|
def next(self):
|
|
|
|
|
if self.transform_ok:
|
|
|
|
|
return piglatin(self._next())
|
|
|
|
|
else:
|
|
|
|
|
return self._next()
|
|
|
|
|
|
2004-09-17 11:31:58 -04:00
|
|
|
|
class Latinator:
|
|
|
|
|
|
|
|
|
|
# by default, don't transform output
|
2004-09-17 12:32:17 -04:00
|
|
|
|
transform = False
|
2004-09-17 11:31:58 -04:00
|
|
|
|
|
|
|
|
|
def __init__(self, application):
|
|
|
|
|
self.application = application
|
|
|
|
|
|
|
|
|
|
def __call__(environ, start_response):
|
2004-09-17 12:32:17 -04:00
|
|
|
|
|
|
|
|
|
transform_ok = []
|
|
|
|
|
|
2004-09-17 11:31:58 -04:00
|
|
|
|
def start_latin(status,headers,exc_info=None):
|
|
|
|
|
|
|
|
|
|
for name,value in headers:
|
|
|
|
|
if name.lower()=='content-type' and value=='text/plain':
|
2004-09-17 12:32:17 -04:00
|
|
|
|
transform_ok.append(True)
|
2004-09-17 11:31:58 -04:00
|
|
|
|
# Strip content-length if present, else it'll be wrong
|
|
|
|
|
headers = [(name,value)
|
|
|
|
|
for name,value in headers
|
|
|
|
|
if name.lower()<>'content-length'
|
|
|
|
|
]
|
|
|
|
|
break
|
|
|
|
|
|
2004-09-17 12:32:17 -04:00
|
|
|
|
write = start_response(status,headers,exc_info)
|
|
|
|
|
|
|
|
|
|
if transform_ok:
|
|
|
|
|
def write_latin(data):
|
|
|
|
|
write(piglatin(data))
|
|
|
|
|
return write_latin
|
|
|
|
|
else:
|
|
|
|
|
return write
|
|
|
|
|
|
|
|
|
|
return LatinIter(self.application(environ,start_latin),transform_ok)
|
2004-09-17 11:31:58 -04:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
# Run foo_app under a Latinator's control, using the example CGI gateway
|
|
|
|
|
from foo_app import foo_app
|
|
|
|
|
run_with_cgi(Latinator(foo_app))
|
|
|
|
|
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Specification Details
|
|
|
|
|
=====================
|
|
|
|
|
|
|
|
|
|
The application object must accept two positional arguments. For
|
|
|
|
|
the sake of illustration, we have named them ``environ`` and
|
|
|
|
|
``start_response``, but they are not required to have these names.
|
2004-08-31 18:00:28 -04:00
|
|
|
|
A server or gateway **must** invoke the application object using
|
2004-08-27 13:30:09 -04:00
|
|
|
|
positional (not keyword) arguments. (E.g. by calling
|
|
|
|
|
``result = application(environ,start_response)`` as shown above.)
|
|
|
|
|
|
|
|
|
|
The ``environ`` parameter is a dictionary object, containing CGI-style
|
2004-08-31 18:00:28 -04:00
|
|
|
|
environment variables. This object **must** be a builtin Python
|
2004-08-27 13:30:09 -04:00
|
|
|
|
dictionary (*not* a subclass, ``UserDict`` or other dictionary
|
|
|
|
|
emulation), and the application is allowed to modify the dictionary
|
|
|
|
|
in any way it desires. The dictionary must also include certain
|
|
|
|
|
WSGI-required variables (described in a later section), and may
|
|
|
|
|
also include server-specific extension variables, named according
|
|
|
|
|
to a convention that will be described below.
|
|
|
|
|
|
|
|
|
|
The ``start_response`` parameter is a callable accepting two
|
2004-09-13 16:01:54 -04:00
|
|
|
|
required positional arguments, and one optional argument. For the sake
|
|
|
|
|
of illustration, we have named these arguments ``status``, ``headers``,
|
|
|
|
|
and ``exc_info``, but they are not required to have these names, and
|
|
|
|
|
the application **must** invoke the ``start_response`` callable using
|
|
|
|
|
positional arguments (e.g. ``start_response(status,headers)``).
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
|
|
|
|
The ``status`` parameter is a status string of the form
|
2004-09-13 16:01:54 -04:00
|
|
|
|
``"999 Message here"``, and ``headers`` is a list of
|
|
|
|
|
``(header_name,header_value)`` tuples describing the HTTP response
|
|
|
|
|
header. The optional ``exc_info`` parameter is described below in the
|
|
|
|
|
sections on `The start_response() Callable`_ and `Error Handling`_.
|
|
|
|
|
It is used only when the application has trapped an error and is
|
|
|
|
|
attempting to display an error message to the browser.
|
|
|
|
|
|
|
|
|
|
The ``start_response`` callable must return a ``write(body_data)``
|
|
|
|
|
callable that takes one positional parameter: a string to be written
|
|
|
|
|
as part of the HTTP response body. (Note: the ``write()`` callable is
|
|
|
|
|
provided only to support certain existing frameworks' imperative output
|
|
|
|
|
APIs; it should not be used by new applications or frameworks if it
|
|
|
|
|
can be avoided. See the `Buffering and Streaming`_ section for more
|
|
|
|
|
details.)
|
2004-08-31 18:00:28 -04:00
|
|
|
|
|
|
|
|
|
The application object must return an iterable yielding strings.
|
|
|
|
|
(For example, it could be a generator-iterator that yields strings,
|
|
|
|
|
or it could be a sequence such as a list of strings.) The server
|
|
|
|
|
or gateway must transmit these strings to the client in an
|
|
|
|
|
unbuffered fashion, completing the transmission of each string
|
|
|
|
|
before requesting another one. (See the `Buffering and Streaming`_
|
|
|
|
|
section below for more on how application output must be handled.)
|
|
|
|
|
|
|
|
|
|
The server or gateway must not modify supplied strings in any way;
|
|
|
|
|
they must be treated as binary byte sequences with no character
|
|
|
|
|
interpretation, line ending changes, or other modification. The
|
|
|
|
|
application is responsible for ensuring that the string(s) to be
|
|
|
|
|
written are in a format suitable for the client.
|
|
|
|
|
|
|
|
|
|
If a call to ``len(iterable)`` succeeds, the server must be able
|
|
|
|
|
to rely on the result being accurate. That is, if the iterable
|
|
|
|
|
returned by the application provides a working ``__len__()``
|
2004-09-13 16:01:54 -04:00
|
|
|
|
method, it **must** return an accurate result. (See
|
|
|
|
|
the `Handling the Content-Length Header`_ section for information
|
|
|
|
|
on how this would normally be used.)
|
2004-08-31 18:00:28 -04:00
|
|
|
|
|
|
|
|
|
If the iterable returned by the application has a ``close()`` method,
|
|
|
|
|
the server or gateway **must** call that method upon completion of the
|
|
|
|
|
current request, whether the request was completed normally, or
|
|
|
|
|
terminated early due to an error. (This is to support resource release
|
2004-09-13 16:01:54 -04:00
|
|
|
|
by the application. This protocol is intended to complement PEP 325's
|
|
|
|
|
generator support, and other common iterables with ``close()`` methods.
|
2004-08-31 18:00:28 -04:00
|
|
|
|
|
|
|
|
|
(Note: the application **must** invoke the ``start_response()``
|
|
|
|
|
callable before the iterable yields its first body string, so that the
|
|
|
|
|
server can send the headers before any body content. However, this
|
|
|
|
|
invocation **may** be performed by the iterable's first iteration, so
|
|
|
|
|
servers **must not** assume that ``start_response()`` has been called
|
|
|
|
|
before they begin iterating over the iterable.)
|
|
|
|
|
|
2004-09-13 16:01:54 -04:00
|
|
|
|
Finally, servers and gateways **must not** directly use any other
|
|
|
|
|
attributes of the iterable returned by the application, unless it is an
|
|
|
|
|
instance of a type specific to that server or gateway, such as a "file
|
|
|
|
|
wrapper" returned by ``wsgi.file_wrapper`` (see `Optional
|
|
|
|
|
Platform-Specific File Handling`_). In the general case, only
|
|
|
|
|
attributes specified here, or accessed via e.g. the PEP 234 iteration
|
|
|
|
|
APIs are acceptable.
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
``environ`` Variables
|
|
|
|
|
---------------------
|
|
|
|
|
|
|
|
|
|
The ``environ`` dictionary is required to contain these CGI
|
|
|
|
|
environment variables, as defined by the Common Gateway Interface
|
2004-09-15 16:41:16 -04:00
|
|
|
|
specification [2]_. The following variables **must** be present,
|
|
|
|
|
unless their value would be an empty string, in which case they
|
|
|
|
|
**may** be omitted, except as otherwise noted below.
|
|
|
|
|
|
|
|
|
|
``REQUEST_METHOD``
|
|
|
|
|
The HTTP request method, such as ``"GET"`` or ``"POST"``. This
|
|
|
|
|
cannot ever be an empty string, and so is always required.
|
|
|
|
|
|
|
|
|
|
``SCRIPT_NAME``
|
|
|
|
|
The initial portion of the request URL's "path" that corresponds to
|
|
|
|
|
the application object, so that the application knows its virtual
|
|
|
|
|
"location". This **may** be an empty string, if the application
|
|
|
|
|
corresponds to the "root" of the server.
|
|
|
|
|
|
|
|
|
|
``PATH_INFO``
|
|
|
|
|
The remainder of the request URL's "path", designating the virtual
|
|
|
|
|
"location" of the request's target within the application. This
|
|
|
|
|
**may** be an empty string, if the request URL targets the
|
|
|
|
|
application root and does not have a trailing slash.
|
|
|
|
|
|
|
|
|
|
``QUERY_STRING``
|
|
|
|
|
The portion of the request URL that follows the ``"?"``, if any.
|
|
|
|
|
May be empty or absent.
|
|
|
|
|
|
|
|
|
|
``CONTENT_TYPE``
|
|
|
|
|
The contents of any ``Content-Type`` fields in the HTTP request.
|
|
|
|
|
May be empty or absent.
|
|
|
|
|
|
|
|
|
|
``CONTENT_LENGTH``
|
|
|
|
|
The contents of any ``Content-Length`` fields in the HTTP request.
|
|
|
|
|
May be empty or absent.
|
|
|
|
|
|
|
|
|
|
``SERVER_NAME``, ``SERVER_PORT``
|
|
|
|
|
When combined with ``SCRIPT_NAME`` and ``PATH_INFO``, these variables
|
|
|
|
|
can be used to complete the URL. Note, however, that ``HTTP_HOST``,
|
|
|
|
|
if present, should be used in preference to ``SERVER_NAME`` for
|
|
|
|
|
reconstructing the request URL. See the `URL Reconstruction`_
|
|
|
|
|
section below for more detail. ``SERVER_NAME`` and ``SERVER_PORT``
|
|
|
|
|
can never be empty strings, and so are always required.
|
|
|
|
|
|
|
|
|
|
``HTTP_`` Variables
|
|
|
|
|
Variables corresponding to the client-supplied HTTP headers (i.e.,
|
|
|
|
|
variables whose names begin with ``"HTTP_"``). The presence or
|
|
|
|
|
absence of these variables should correspond with the presence or
|
|
|
|
|
absence of the appropriate HTTP header in the request.
|
|
|
|
|
|
|
|
|
|
In general, a server or gateway **should** attempt to provide as many
|
2004-08-27 13:30:09 -04:00
|
|
|
|
other CGI variables as are applicable, including e.g. the nonstandard
|
|
|
|
|
SSL variables such as ``HTTPS=on``, if an SSL connection is in effect.
|
2004-09-15 16:41:16 -04:00
|
|
|
|
However, an application that uses any CGI variables other than the ones
|
2004-08-27 13:30:09 -04:00
|
|
|
|
listed above are necessarily non-portable to web servers that do not
|
|
|
|
|
support the relevant extensions.
|
|
|
|
|
|
2004-09-15 16:41:16 -04:00
|
|
|
|
A WSGI-compliant server or gateway **should** document what variables
|
|
|
|
|
it provides, along with their definitions as appropriate. Applications
|
|
|
|
|
**should** check for the presence of any variables they require, and
|
|
|
|
|
have a fallback plan in the event such a variable is absent.
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
|
|
|
|
Note: missing variables (such as ``REMOTE_USER`` when no
|
|
|
|
|
authentication has occurred) should be left out of the ``environ``
|
|
|
|
|
dictionary. Also note that CGI-defined variables must be strings,
|
|
|
|
|
if they are present at all. It is a violation of this specification
|
|
|
|
|
for a CGI variable's value to be of any type other than ``str``.
|
|
|
|
|
|
|
|
|
|
In addition to the CGI-defined variables, the ``environ`` dictionary
|
|
|
|
|
must also contain the following WSGI-defined variables:
|
|
|
|
|
|
|
|
|
|
===================== ===============================================
|
|
|
|
|
Variable Value
|
|
|
|
|
===================== ===============================================
|
|
|
|
|
``wsgi.version`` The tuple ``(1,0)``, representing WSGI
|
|
|
|
|
version 1.0.
|
|
|
|
|
|
2004-09-13 16:01:54 -04:00
|
|
|
|
``wsgi.url_scheme`` A string representing the "scheme" portion of
|
|
|
|
|
the URL at which the application is being
|
|
|
|
|
invoked. Normally, this will have the value
|
|
|
|
|
``"http"`` or ``"https"``, as appropriate.
|
|
|
|
|
|
2004-08-27 13:30:09 -04:00
|
|
|
|
``wsgi.input`` An input stream from which the HTTP request
|
|
|
|
|
body can be read. (The server or gateway may
|
|
|
|
|
perform reads on-demand as requested by the
|
|
|
|
|
application, or it may pre-read the client's
|
|
|
|
|
request body and buffer it in-memory or on
|
|
|
|
|
disk, or use any other technique for providing
|
|
|
|
|
such an input stream, according to its
|
|
|
|
|
preference.)
|
|
|
|
|
|
|
|
|
|
``wsgi.errors`` An output stream to which error output can be
|
2004-08-29 23:04:59 -04:00
|
|
|
|
written, for the purpose of recording program
|
|
|
|
|
or other errors in a standardized and possibly
|
|
|
|
|
centralized location. For many servers, this
|
|
|
|
|
will be the server's main error log.
|
|
|
|
|
|
|
|
|
|
Alternatively, this may be ``sys.stderr``, or
|
2004-09-13 16:01:54 -04:00
|
|
|
|
a log file of some sort. The server's
|
2004-08-29 23:04:59 -04:00
|
|
|
|
documentation should include an explanation of
|
|
|
|
|
how to configure this or where to find the
|
|
|
|
|
recorded output. A server or gateway may
|
|
|
|
|
supply different error streams to different
|
|
|
|
|
applications, if this is desired.
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
|
|
|
|
``wsgi.multithread`` This value should be true if the application
|
|
|
|
|
object may be simultaneously invoked by another
|
|
|
|
|
thread in the same process, and false
|
|
|
|
|
otherwise.
|
|
|
|
|
|
|
|
|
|
``wsgi.multiprocess`` This value should be true if an equivalent
|
|
|
|
|
application object may be simultaneously
|
|
|
|
|
invoked by another process, and false
|
|
|
|
|
otherwise.
|
|
|
|
|
|
|
|
|
|
``wsgi.run_once`` This value should be true if the server/gateway
|
|
|
|
|
expects (but does not guarantee!) that the
|
|
|
|
|
application will only be invoked this one time
|
|
|
|
|
during the life of its containing process.
|
|
|
|
|
Normally, this will only be true for a gateway
|
|
|
|
|
based on CGI (or something similar).
|
|
|
|
|
===================== ===============================================
|
|
|
|
|
|
|
|
|
|
Finally, the ``environ`` dictionary may also contain server-defined
|
|
|
|
|
variables. These variables should be named using only lower-case
|
|
|
|
|
letters, numbers, dots, and underscores, and should be prefixed with
|
|
|
|
|
a name that is unique to the defining server or gateway. For
|
|
|
|
|
example, ``mod_python`` might define variables with names like
|
|
|
|
|
``mod_python.some_variable``.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Input and Error Streams
|
|
|
|
|
~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
|
|
|
|
|
|
The input and error streams provided by the server must support
|
|
|
|
|
the following methods:
|
|
|
|
|
|
|
|
|
|
=================== ========== ========
|
|
|
|
|
Method Stream Notes
|
|
|
|
|
=================== ========== ========
|
|
|
|
|
``read(size)`` ``input`` 1
|
|
|
|
|
``readline()`` ``input`` 1,2
|
|
|
|
|
``readlines(hint)`` ``input`` 1,3
|
|
|
|
|
``__iter__()`` ``input``
|
|
|
|
|
``flush()`` ``errors`` 4
|
|
|
|
|
``write(str)`` ``errors``
|
|
|
|
|
``writelines(seq)`` ``errors``
|
|
|
|
|
=================== ========== ========
|
|
|
|
|
|
|
|
|
|
The semantics of each method are as documented in the Python Library
|
|
|
|
|
Reference, except for these notes as listed in the table above:
|
|
|
|
|
|
|
|
|
|
1. The server is not required to read past the client's specified
|
|
|
|
|
``Content-Length``, and is allowed to simulate an end-of-file
|
|
|
|
|
condition if the application attempts to read past that point.
|
2004-09-13 16:01:54 -04:00
|
|
|
|
The application **should not** attempt to read more data than is
|
2004-08-27 13:30:09 -04:00
|
|
|
|
specified by the ``CONTENT_LENGTH`` variable.
|
|
|
|
|
|
|
|
|
|
2. The optional "size" argument to ``readline()`` is not supported,
|
|
|
|
|
as it may be complex for server authors to implement, and is not
|
|
|
|
|
often used in practice.
|
|
|
|
|
|
|
|
|
|
3. Note that the ``hint`` argument to ``readlines()`` is optional for
|
|
|
|
|
both caller and implementer. The application is free not to
|
|
|
|
|
supply it, and the server or gateway is free to ignore it.
|
|
|
|
|
|
|
|
|
|
4. Since the ``errors`` stream may not be rewound, a container is
|
|
|
|
|
free to forward write operations immediately, without buffering.
|
|
|
|
|
In this case, the ``flush()`` method may be a no-op. Portable
|
|
|
|
|
applications, however, cannot assume that output is unbuffered
|
|
|
|
|
or that ``flush()`` is a no-op. They must call ``flush()`` if
|
|
|
|
|
they need to ensure that output has in fact been written. (For
|
|
|
|
|
example, to minimize intermingling of data from multiple processes
|
|
|
|
|
writing to the same error log.)
|
|
|
|
|
|
2004-08-31 18:00:28 -04:00
|
|
|
|
The methods listed in the table above **must** be supported by all
|
2004-08-27 13:30:09 -04:00
|
|
|
|
servers conforming to this specification. Applications conforming
|
2004-08-31 18:00:28 -04:00
|
|
|
|
to this specification **must not** use any other methods or attributes
|
2004-08-27 13:30:09 -04:00
|
|
|
|
of the ``input`` or ``errors`` objects. In particular, applications
|
2004-08-31 18:00:28 -04:00
|
|
|
|
**must not** attempt to close these streams, even if they possess
|
2004-08-27 13:30:09 -04:00
|
|
|
|
``close()`` methods.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
The ``start_response()`` Callable
|
|
|
|
|
---------------------------------
|
|
|
|
|
|
2004-09-13 16:01:54 -04:00
|
|
|
|
The second parameter passed to the application object is a callable
|
|
|
|
|
of the form ``start_response(status,headers,exc_info=None)``.
|
2004-08-27 13:30:09 -04:00
|
|
|
|
(As with all WSGI callables, the arguments must be supplied
|
|
|
|
|
positionally, not by keyword.) The ``start_response`` callable is
|
|
|
|
|
used to begin the HTTP response, and it must return a
|
2004-08-31 18:00:28 -04:00
|
|
|
|
``write(body_data)`` callable (see the `Buffering and Streaming`_
|
|
|
|
|
section, below).
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
|
|
|
|
The ``status`` argument is an HTTP "status" string like ``"200 OK"``
|
2004-09-13 16:01:54 -04:00
|
|
|
|
or ``"404 Not Found"``. The string **must not** contain control
|
|
|
|
|
characters, and must not be terminated with a carriage return,
|
|
|
|
|
linefeed, or combination thereof.
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
2004-08-31 18:00:28 -04:00
|
|
|
|
The ``headers`` argument is a list of ``(header_name,header_value)``
|
|
|
|
|
tuples. It must be a Python list; i.e. ``type(headers) is
|
|
|
|
|
ListType)``, and the server **may** change its contents in any way
|
|
|
|
|
it desires. Each ``header_name`` must be a valid HTTP header name,
|
|
|
|
|
without a trailing colon or other punctuation. Each ``header_value``
|
|
|
|
|
**must not** include *any* control characters, including carriage
|
|
|
|
|
returns or linefeeds, either embedded or at the end. (These
|
|
|
|
|
requirements are to minimize the complexity of any parsing that must
|
|
|
|
|
be performed by servers, gateways, and intermediate response
|
|
|
|
|
processors that need to inspect or modify response headers.)
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
|
|
|
|
In general, the server or gateway is responsible for ensuring that
|
|
|
|
|
correct headers are sent to the client: if the application omits
|
|
|
|
|
a needed header, the server or gateway *should* add it. For example,
|
|
|
|
|
the HTTP ``Date:`` and ``Server:`` headers would normally be supplied
|
2004-08-31 18:00:28 -04:00
|
|
|
|
by the server or gateway.
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
|
|
|
|
(A reminder for server/gateway authors: HTTP header names are
|
|
|
|
|
case-insensitive, so be sure to take that into consideration when
|
|
|
|
|
examining application-supplied headers!)
|
|
|
|
|
|
2004-09-16 18:04:35 -04:00
|
|
|
|
Applications and middleware are forbidden from using HTTP/1.1
|
|
|
|
|
"hop-by-hop" features or headers, any equivalent features in HTTP/1.0,
|
|
|
|
|
or any headers that would affect the persistence of the client's
|
|
|
|
|
connection to the web server. These features are the
|
|
|
|
|
exclusive province of the actual web server, and a server or gateway
|
|
|
|
|
**should** consider it a fatal error for an application to attempt
|
|
|
|
|
using them, and raise an error if they are supplied to
|
|
|
|
|
``start_response()``. (For more specifics on "hop-by-hop" features and
|
|
|
|
|
headers, please see the `Other HTTP Features`_ section below.)
|
2004-08-31 18:00:28 -04:00
|
|
|
|
|
2004-09-13 16:01:54 -04:00
|
|
|
|
The ``start_response`` callable **must not** actually transmit the
|
2004-09-17 12:32:17 -04:00
|
|
|
|
HTTP headers. It must store them until the first ``write`` call, or
|
|
|
|
|
until after the first iteration of the application return value that
|
|
|
|
|
yields a non-empty string. This is to ensure that buffered and
|
|
|
|
|
asynchronous applications can replace their originally intended output
|
|
|
|
|
with error output, up until the last possible moment.
|
2004-09-13 16:01:54 -04:00
|
|
|
|
|
|
|
|
|
The ``exc_info`` argument, if supplied, must be a Python
|
|
|
|
|
``sys.exc_info()`` tuple. This argument should be supplied by the
|
|
|
|
|
application only if ``start_response`` is being called by an error
|
|
|
|
|
handler. If ``exc_info`` is supplied, and no HTTP headers have been
|
|
|
|
|
output yet, ``start_response`` should replace the currently-stored
|
|
|
|
|
HTTP headers with the newly-supplied ones, thus allowing the
|
|
|
|
|
application to "change its mind" about the output when an error has
|
|
|
|
|
occurred.
|
|
|
|
|
|
|
|
|
|
However, if ``exc_info`` is provided, and the HTTP headers have already
|
|
|
|
|
been sent, ``start_response`` **must** raise an error, and **should**
|
|
|
|
|
raise the ``exc_info`` tuple. That is::
|
|
|
|
|
|
|
|
|
|
raise exc_info[0],exc_info[1],exc_info[2]
|
|
|
|
|
|
|
|
|
|
This will re-raise the exception trapped by the application, and in
|
|
|
|
|
principle should abort the application. (It is not safe for the
|
|
|
|
|
application to attempt error output to the browser once the HTTP
|
|
|
|
|
headers have already been sent.) The application **must not** trap
|
|
|
|
|
any exceptions raised by ``start_response``, if it called
|
|
|
|
|
``start_response`` with ``exc_info``. Instead, it should allow
|
|
|
|
|
such exceptions to propagate back to the server or gateway. See
|
|
|
|
|
`Error Handling`_ below, for more details.
|
|
|
|
|
|
|
|
|
|
The application **may** call ``start_response`` more than once, if and
|
|
|
|
|
only if the ``exc_info`` argument is provided. More precisely, it is
|
|
|
|
|
a fatal error to call ``start_response`` without the ``exc_info``
|
|
|
|
|
argument if ``start_response`` has already been called within the
|
|
|
|
|
current invocation of the application. (See the example CGI
|
|
|
|
|
gateway above for an illustration of the correct logic.)
|
|
|
|
|
|
|
|
|
|
Note: servers, gateways, or middleware implementing ``start_response``
|
|
|
|
|
**should** ensure that no reference is held to the ``exc_info``
|
|
|
|
|
parameter beyond the duration of the function's execution, to avoid
|
|
|
|
|
creating a circular reference through the traceback and frames
|
|
|
|
|
involved. The simplest way to do this is something like::
|
|
|
|
|
|
|
|
|
|
def start_response(status,headers,exc_info=None):
|
|
|
|
|
if exc_info:
|
|
|
|
|
try:
|
|
|
|
|
# do stuff w/exc_info here
|
|
|
|
|
finally:
|
|
|
|
|
exc_info = None # Avoid circular ref.
|
|
|
|
|
|
|
|
|
|
The example CGI gateway provides another illustration of this
|
|
|
|
|
technique.
|
|
|
|
|
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
|
|
|
|
Handling the ``Content-Length`` Header
|
|
|
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
|
|
|
|
|
|
If the application does not supply a ``Content-Length`` header, a
|
|
|
|
|
server or gateway may choose one of several approaches to handling
|
|
|
|
|
it. The simplest of these is to close the client connection when
|
|
|
|
|
the response is completed.
|
|
|
|
|
|
|
|
|
|
Under some circumstances, however, the server or gateway may be
|
|
|
|
|
able to either generate a ``Content-Length`` header, or at least
|
|
|
|
|
avoid the need to close the client connection. If the application
|
|
|
|
|
does *not* call the ``write()`` callable, and returns an iterable
|
|
|
|
|
whose ``len()`` is 1, then the server can automatically determine
|
|
|
|
|
``Content-Length`` by taking the length of the first string yielded
|
|
|
|
|
by the iterable.
|
|
|
|
|
|
|
|
|
|
And, if the server and client both support HTTP/1.1 "chunked
|
2004-08-31 18:00:28 -04:00
|
|
|
|
encoding" [3]_, then the server **may** use chunked encoding to send
|
2004-08-27 13:30:09 -04:00
|
|
|
|
a chunk for each ``write()`` call or string yielded by the iterable,
|
|
|
|
|
thus generating a ``Content-Length`` header for each chunk. This
|
|
|
|
|
allows the server to keep the client connection alive, if it wishes
|
2004-08-31 18:00:28 -04:00
|
|
|
|
to do so. Note that the server **must** comply fully with RFC 2616
|
|
|
|
|
when doing this, or else fall back to one of the other strategies for
|
2004-08-27 13:30:09 -04:00
|
|
|
|
dealing with the absence of ``Content-Length``.
|
|
|
|
|
|
2004-09-16 18:04:35 -04:00
|
|
|
|
(Note: applications and middleware **must not** apply any kind of
|
|
|
|
|
``Transfer-Encoding`` to their output, such as chunking or gzipping;
|
|
|
|
|
as "hop-by-hop" operations, these encodings are the province of the
|
|
|
|
|
actual web server/gateway. See `Other HTTP Features`_ below, for
|
|
|
|
|
more details.)
|
|
|
|
|
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
2004-08-31 18:00:28 -04:00
|
|
|
|
Buffering and Streaming
|
|
|
|
|
-----------------------
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
2004-08-31 18:00:28 -04:00
|
|
|
|
Generally speaking, applications will achieve the best throughput
|
|
|
|
|
by buffering their (modestly-sized) output and sending it all at
|
|
|
|
|
once. When this is the case, applications **should** simply
|
|
|
|
|
return a single-element iterable containing their entire output as
|
|
|
|
|
a single string.
|
|
|
|
|
|
|
|
|
|
(In addition to improved performance, buffering all of an application's
|
2004-08-27 13:30:09 -04:00
|
|
|
|
output has an advantage for error handling: the buffered output can
|
2004-08-31 18:00:28 -04:00
|
|
|
|
be discarded and replaced by an error page, rather than dumping an
|
2004-08-27 13:30:09 -04:00
|
|
|
|
error message in the middle of some partially-completed output. For
|
|
|
|
|
this and other reasons, many existing Python frameworks already
|
|
|
|
|
accumulate their output for a single write, unless the application
|
|
|
|
|
explicitly requests streaming, or the expected output is larger than
|
2004-08-31 18:00:28 -04:00
|
|
|
|
practical for buffering (e.g. multi-megabyte PDFs).)
|
|
|
|
|
|
|
|
|
|
For large files, however, or for specialized uses of HTTP streaming
|
|
|
|
|
(such as multipart "server push"), an application may need to provide
|
|
|
|
|
output in smaller blocks (e.g. to avoid loading a large file into
|
|
|
|
|
memory). It's also sometimes the case that part of a response may
|
|
|
|
|
be time-consuming to produce, but it would be useful to send ahead the
|
|
|
|
|
portion of the response that precedes it.
|
|
|
|
|
|
|
|
|
|
In these cases, applications **should** return an iterator (usually
|
|
|
|
|
a generator-iterator) that produces the output in a block-by-block
|
|
|
|
|
fashion. These blocks may be broken to coincide with mulitpart
|
|
|
|
|
boundaries (for "server push"), or just before time-consuming
|
|
|
|
|
tasks (such as reading another block of an on-disk file).
|
|
|
|
|
|
2004-09-16 17:22:39 -04:00
|
|
|
|
WSGI servers, gateways, and middleware **must not** delay the
|
|
|
|
|
transmission of any block; they **must** either fully transmit
|
|
|
|
|
the block to the client, or guarantee that they will continue
|
|
|
|
|
transmission even while the application is producing its next block.
|
|
|
|
|
A server/gateway or middleware may provide this guarantee in one of
|
|
|
|
|
three ways:
|
2004-08-31 18:00:28 -04:00
|
|
|
|
|
|
|
|
|
1. Send the entire block to the operating system (and request
|
|
|
|
|
that any O/S buffers be flushed) before returning control
|
|
|
|
|
to the application, OR
|
|
|
|
|
|
|
|
|
|
2. Use a different thread to ensure that the block continues
|
|
|
|
|
to be transmitted while the application produces the next
|
|
|
|
|
block.
|
2004-09-16 17:22:39 -04:00
|
|
|
|
|
|
|
|
|
3. (Middleware only) send the entire block to its parent
|
|
|
|
|
gateway/server
|
2004-08-31 18:00:28 -04:00
|
|
|
|
|
|
|
|
|
By providing this guarantee, WSGI allows applications to ensure
|
|
|
|
|
that transmission will not become stalled at an arbitrary point
|
|
|
|
|
in their output data. This is critical for proper functioning
|
|
|
|
|
of e.g. multipart "server push" streaming, where data between
|
|
|
|
|
multipart boundaries should be transmitted in full to the client.
|
|
|
|
|
|
|
|
|
|
|
2004-09-16 17:22:39 -04:00
|
|
|
|
Middleware Handling of Block Boundaries
|
|
|
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
|
|
|
|
|
|
In order to better support asynchronous applications and servers,
|
|
|
|
|
middleware components **must not** block iteration waiting for
|
|
|
|
|
multiple values from an application iterable. If the middleware
|
|
|
|
|
needs to accumulate more data from the application before it can
|
|
|
|
|
produce any output, it **must** yield an empty string.
|
|
|
|
|
|
|
|
|
|
To put this requirement another way, a middleware component **must
|
|
|
|
|
yield at least one value** each time its underlying application
|
|
|
|
|
yields a value. If the middleware cannot yield any other value,
|
|
|
|
|
it must yield an empty string.
|
|
|
|
|
|
|
|
|
|
This requirement ensures that asynchronous applications and servers
|
|
|
|
|
can conspire to reduce the number of threads that are required
|
|
|
|
|
to run a given number of application instances simultaneously.
|
|
|
|
|
|
|
|
|
|
Note also that this requirement means that middleware **must**
|
|
|
|
|
return an iterable as soon as its underlying application returns
|
|
|
|
|
an iterable. It is also forbidden for middleware to use the
|
|
|
|
|
``write()`` callable to transmit data that is yielded by an
|
|
|
|
|
underlying application. Middleware may only use their parent
|
|
|
|
|
server's ``write()`` callable to transmit data that the
|
|
|
|
|
underlying application sent using a middleware-provided ``write()``
|
|
|
|
|
callable.
|
|
|
|
|
|
|
|
|
|
|
2004-08-31 18:00:28 -04:00
|
|
|
|
The ``write()`` Callable
|
|
|
|
|
~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
|
|
|
|
|
|
Some existing application framework APIs support unbuffered
|
|
|
|
|
output in a different manner than WSGI. Specifically, they
|
|
|
|
|
provide a "write" function or method of some kind to write
|
|
|
|
|
an unbuffered block of data, or else they provide a buffered
|
|
|
|
|
"write" function and a "flush" mechanism to flush the buffer.
|
|
|
|
|
|
|
|
|
|
Unfortunately, such APIs cannot be implemented in terms of
|
|
|
|
|
WSGI's "iterable" application return value, unless threads
|
|
|
|
|
or other special mechanisms are used.
|
|
|
|
|
|
|
|
|
|
Therefore, to allow these frameworks to continue using an
|
|
|
|
|
imperative API, WSGI includes a special ``write()`` callable,
|
|
|
|
|
returned by the ``start_response`` callable.
|
|
|
|
|
|
|
|
|
|
New WSGI applications and frameworks **should not** use the
|
|
|
|
|
``write()`` callable if it is possible to avoid doing so. The
|
2004-09-13 16:01:54 -04:00
|
|
|
|
``write()`` callable is strictly a hack to support imperative
|
|
|
|
|
streaming APIs. In general, applications should either be
|
|
|
|
|
internally buffered, or produce iterable output, as this makes
|
|
|
|
|
it possible for web servers to interleave other tasks in the
|
|
|
|
|
same Python thread, potentially providing better throughput for
|
|
|
|
|
the server as a whole.
|
2004-08-31 18:00:28 -04:00
|
|
|
|
|
|
|
|
|
The ``write()`` callable is returned by the ``start_response()``
|
|
|
|
|
callable, and it accepts a single parameter: a string to be
|
|
|
|
|
written as part of the HTTP response body, that is treated exactly
|
|
|
|
|
as though it had been yielded by the output iterable. In other
|
|
|
|
|
words, before ``write()`` returns, it must guarantee that the
|
|
|
|
|
passed-in string was either completely sent to the client, or
|
|
|
|
|
that it is buffered for transmission while the application
|
|
|
|
|
proceeds forward.
|
|
|
|
|
|
|
|
|
|
An application **may** return a non-empty iterable even if it
|
|
|
|
|
invokes ``write()``, and that output must be treated normally
|
2004-09-13 16:01:54 -04:00
|
|
|
|
by the server or gateway (i.e., it must be sent or queued
|
|
|
|
|
immediately). Applications **must not** invoke ``write()``
|
|
|
|
|
from within their return iterable.
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Implementation/Application Notes
|
|
|
|
|
================================
|
|
|
|
|
|
|
|
|
|
Unicode
|
|
|
|
|
-------
|
|
|
|
|
|
|
|
|
|
HTTP does not directly support Unicode, and neither does this
|
|
|
|
|
interface. All encoding/decoding must be handled by the application;
|
2004-09-13 16:01:54 -04:00
|
|
|
|
all strings passed to or from the server must be standard Python byte
|
|
|
|
|
strings, not Unicode objects. The result of using a Unicode object
|
|
|
|
|
where a string object is required, is undefined.
|
|
|
|
|
|
|
|
|
|
Note also that strings passed to ``start_response()`` as a status or
|
|
|
|
|
as headers **must** follow RFC 2616 with respect to encoding. That
|
|
|
|
|
is, they must either be ISO-8859-1 characters, or use RFC 2047 MIME
|
|
|
|
|
encoding.
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
2004-09-15 13:05:22 -04:00
|
|
|
|
On Python platforms where the ``str`` or ``StringType`` type is in
|
|
|
|
|
fact Unicode-based (e.g. Jython, IronPython, Python 3000, etc.), all
|
|
|
|
|
"strings" referred to in this specification must contain only
|
|
|
|
|
code points representable in ISO-8859-1 encoding (``\u0000`` through
|
|
|
|
|
``\u00FF``, inclusive). It is a fatal error for an application to
|
|
|
|
|
supply strings containing any other Unicode character or code point.
|
|
|
|
|
Similarly, servers and gateways **must not** supply
|
|
|
|
|
strings to an application containing any other Unicode characters.
|
|
|
|
|
|
|
|
|
|
Again, all strings referred to in this specification **must** be
|
|
|
|
|
of type ``str`` or ``StringType``, and **must not** be of type
|
|
|
|
|
``unicode`` or ``UnicodeType``. And, even if a given platform allows
|
|
|
|
|
for more than 8 bits per character in ``str``/``StringType`` objects,
|
|
|
|
|
only the lower 8 bits may be used, for any value referred to in
|
|
|
|
|
this specification as a "string".
|
|
|
|
|
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
|
|
|
|
Error Handling
|
|
|
|
|
--------------
|
|
|
|
|
|
2004-09-13 16:01:54 -04:00
|
|
|
|
In general, applications **should** try to trap their own, internal
|
|
|
|
|
errors, and display a helpful message in the browser. (It is up
|
|
|
|
|
to the application to decide what "helpful" means in this context.)
|
|
|
|
|
|
|
|
|
|
However, to display such a message, the application must not have
|
|
|
|
|
actually sent any data to the browser yet, or else it risks corrupting
|
|
|
|
|
the response. WSGI therefore provides a mechanism to either allow the
|
|
|
|
|
application to send its error message, or be automatically aborted:
|
|
|
|
|
the ``exc_info`` argument to ``start_response``. Here is an example
|
|
|
|
|
of its use::
|
|
|
|
|
|
|
|
|
|
try:
|
|
|
|
|
# regular application code here
|
|
|
|
|
status = "200 Froody"
|
|
|
|
|
headers = [("content-type","text/plain")]
|
|
|
|
|
start_response(status, headers)
|
|
|
|
|
return ["normal body goes here"]
|
|
|
|
|
except:
|
2004-09-17 11:31:58 -04:00
|
|
|
|
# XXX should trap runtime issues like MemoryError, KeyboardInterrupt
|
|
|
|
|
# in a separate handler before this bare 'except:'...
|
2004-09-13 16:01:54 -04:00
|
|
|
|
status = "500 Oops"
|
|
|
|
|
headers = [("content-type","text/plain")]
|
|
|
|
|
start_response(status, headers, sys.exc_info())
|
|
|
|
|
return ["error body goes here"]
|
|
|
|
|
|
|
|
|
|
If no output has been written when an exception occurs, the call to
|
|
|
|
|
``start_response`` will return normally, and the application will
|
|
|
|
|
return an error body to be sent to the browser. However, if any output
|
|
|
|
|
has already been sent to the browser, ``start_response`` will reraise
|
|
|
|
|
the provided exception. This exception **should not** be trapped by
|
|
|
|
|
the application, and so the application will abort. The server or
|
|
|
|
|
gateway can then trap this (fatal) exception and abort the response.
|
|
|
|
|
|
|
|
|
|
Servers **should** trap and log any exception that aborts an
|
|
|
|
|
application or the iteration of its return value. If a partial
|
|
|
|
|
response has already been written to the browser when an application
|
|
|
|
|
error occurs, the server or gateway **may** attempt to add an error
|
|
|
|
|
message to the output, if the already-sent headers indicate a
|
|
|
|
|
``text/*`` content type that the server knows how to modify cleanly.
|
|
|
|
|
|
|
|
|
|
Some middleware may wish to provide additional exception handling
|
|
|
|
|
services, or intercept and replace application error messages. In
|
|
|
|
|
such cases, middleware may choose to **not** re-raise the ``exc_info``
|
|
|
|
|
supplied to ``start_response``, but instead raise a middleware-specific
|
|
|
|
|
exception, or simply return without an exception after storing the
|
|
|
|
|
supplied arguments. This will then cause the application to return
|
|
|
|
|
its error body iterable (or invoke ``write()``), allowing the middleware
|
|
|
|
|
to capture and modify the error output. These techniques will work as
|
|
|
|
|
long as application authors:
|
|
|
|
|
|
|
|
|
|
1. Always provide ``exc_info`` when beginning an error response
|
|
|
|
|
|
|
|
|
|
2. Never trap errors raised by ``start_response`` when ``exc_info`` is
|
|
|
|
|
being provided
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Thread Support
|
|
|
|
|
--------------
|
|
|
|
|
|
|
|
|
|
Thread support, or lack thereof, is also server-dependent.
|
2004-09-13 16:01:54 -04:00
|
|
|
|
Servers that can run multiple requests in parallel, **should** also
|
2004-08-27 13:30:09 -04:00
|
|
|
|
provide the option of running an application in a single-threaded
|
|
|
|
|
fashion, so that applications or frameworks that are not thread-safe
|
|
|
|
|
may still be used with that server.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
URL Reconstruction
|
|
|
|
|
------------------
|
|
|
|
|
|
|
|
|
|
If an application wishes to reconstruct a request's complete URL, it
|
|
|
|
|
may do so using the following algorithm, contributed by Ian Bicking::
|
|
|
|
|
|
2004-09-13 16:01:54 -04:00
|
|
|
|
url = environ['wsgi.url_scheme']+'://'
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
|
|
|
|
if environ.get('HTTP_HOST'):
|
|
|
|
|
url += environ['HTTP_HOST']
|
|
|
|
|
else:
|
|
|
|
|
url += environ['SERVER_NAME']
|
|
|
|
|
|
2004-09-13 16:01:54 -04:00
|
|
|
|
if environ['wsgi.url_scheme'] == 'https':
|
2004-08-27 13:30:09 -04:00
|
|
|
|
if environ['SERVER_PORT'] != '443'
|
|
|
|
|
url += ':' + environ['SERVER_PORT']
|
|
|
|
|
else:
|
|
|
|
|
if environ['SERVER_PORT'] != '80':
|
|
|
|
|
url += ':' + environ['SERVER_PORT']
|
|
|
|
|
|
2004-09-15 16:41:16 -04:00
|
|
|
|
url += environ.get('SCRIPT_NAME','')
|
|
|
|
|
url += environ.get('PATH_INFO','')
|
2004-08-27 13:30:09 -04:00
|
|
|
|
if environ.get('QUERY_STRING'):
|
|
|
|
|
url += '?' + environ['QUERY_STRING']
|
|
|
|
|
|
|
|
|
|
Note that such a reconstructed URL may not be precisely the same URI
|
|
|
|
|
as requested by the client. Server rewrite rules, for example, may
|
|
|
|
|
have modified the client's originally requested URL to place it in a
|
|
|
|
|
canonical form.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Application Configuration
|
|
|
|
|
-------------------------
|
|
|
|
|
|
|
|
|
|
This specification does not define how a server selects or obtains an
|
|
|
|
|
application to invoke. These and other configuration options are
|
|
|
|
|
highly server-specific matters. It is expected that server/gateway
|
|
|
|
|
authors will document how to configure the server to execute a
|
|
|
|
|
particular application object, and with what options (such as
|
|
|
|
|
threading options).
|
|
|
|
|
|
|
|
|
|
Framework authors, on the other hand, should document how to create an
|
|
|
|
|
application object that wraps their framework's functionality. The
|
|
|
|
|
user, who has chosen both the server and the application framework,
|
|
|
|
|
must connect the two together. However, since both the framework and
|
|
|
|
|
the server now have a common interface, this should be merely a
|
|
|
|
|
mechanical matter, rather than a significant engineering effort for
|
|
|
|
|
each new server/framework pair.
|
|
|
|
|
|
2004-09-13 16:01:54 -04:00
|
|
|
|
Finally, some applications, frameworks, and middleware may wish to
|
|
|
|
|
use the ``environ`` dictionary to receive simple string configuration
|
|
|
|
|
options. Servers and gateways **should** support this by allowing
|
|
|
|
|
an application's deployer to specify name-value pairs to be placed in
|
|
|
|
|
``environ``. In the simplest case, this support can consist merely of
|
|
|
|
|
copying all operating system-supplied environment variables from
|
|
|
|
|
``os.environ`` into the ``environ`` dictionary, since the deployer in
|
|
|
|
|
principle can configure these externally to the server, or in the
|
|
|
|
|
CGI case they may be able to be set via the server's configuration
|
|
|
|
|
files.
|
|
|
|
|
|
|
|
|
|
Applications **should** try to keep such required variables to a
|
|
|
|
|
minimum, since not all servers will support easy configuration of
|
|
|
|
|
them. Of course, even in the worst case, persons deploying an
|
|
|
|
|
application can create a script to supply the necessary configuration
|
|
|
|
|
values::
|
|
|
|
|
|
|
|
|
|
from the_app import application
|
|
|
|
|
|
|
|
|
|
def new_app(environ,start_response):
|
|
|
|
|
environ['the_app.configval1'] = 'something'
|
|
|
|
|
return application(environ,start_response)
|
|
|
|
|
|
|
|
|
|
But, most existing applications and frameworks will probably only need
|
|
|
|
|
a single configuration value from ``environ``, to indicate the location
|
|
|
|
|
of their application or framework-specific configuration file(s). (Of
|
|
|
|
|
course, applications should cache such configuration, to avoid having
|
|
|
|
|
to re-read it upon each invocation.)
|
|
|
|
|
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
2004-08-31 18:00:28 -04:00
|
|
|
|
Supporting Older (<2.2) Versions of Python
|
|
|
|
|
------------------------------------------
|
|
|
|
|
|
|
|
|
|
Some servers, gateways, or applications may wish to support older
|
|
|
|
|
(<2.2) versions of Python. This is especially important if Jython
|
|
|
|
|
is a target platform, since as of this writing a production-ready
|
|
|
|
|
version of Jython 2.2 is not yet available.
|
|
|
|
|
|
|
|
|
|
For servers and gateways, this is relatively straightforward:
|
|
|
|
|
servers and gateways targeting pre-2.2 versions of Python must
|
|
|
|
|
simply restrict themselves to using only a standard "for" loop to
|
|
|
|
|
iterate over any iterable returned by an application. This is the
|
|
|
|
|
only way to ensure source-level compatibility with both the pre-2.2
|
|
|
|
|
iterator protocol (discussed further below) and "today's" iterator
|
|
|
|
|
protocol (see PEP 234).
|
|
|
|
|
|
|
|
|
|
(Note that this technique necessarily applies only to servers,
|
|
|
|
|
gateways, or middleware that are written in Python. Discussion of
|
|
|
|
|
how to use iterator protocol(s) correctly from other languages is
|
|
|
|
|
outside the scope of this PEP.)
|
|
|
|
|
|
|
|
|
|
For applications, supporting pre-2.2 versions of Python is slightly
|
|
|
|
|
more complex:
|
|
|
|
|
|
|
|
|
|
* You may not return a file object and expect it to work as an iterable,
|
2004-09-13 16:01:54 -04:00
|
|
|
|
since before Python 2.2, files were not iterable. (In general, you
|
|
|
|
|
shouldn't do this anyway, because it will peform quite poorly most
|
|
|
|
|
of the time!) Use ``wsgi.file_wrapper`` or an application-specific
|
|
|
|
|
file wrapper class. (See `Optional Platform-Specific File Handling`_
|
|
|
|
|
for more on ``wsgi.file_wrapper``, and an example class you can use
|
|
|
|
|
to wrap a file as an iterable.)
|
2004-09-01 16:35:42 -04:00
|
|
|
|
|
|
|
|
|
* If you return a custom iterable, it **must** implement the pre-2.2
|
|
|
|
|
iterator protocol. That is, provide a ``__getitem__`` method that
|
|
|
|
|
accepts an integer key, and raises ``IndexError`` when exhausted.
|
|
|
|
|
(Note that built-in sequence types are also acceptable, since they
|
|
|
|
|
also implement this protocol.)
|
2004-08-31 18:00:28 -04:00
|
|
|
|
|
|
|
|
|
Finally, middleware that wishes to support pre-2.2 versions of Python,
|
|
|
|
|
and iterates over application return values or itself returns an
|
|
|
|
|
iterable (or both), must follow the appropriate recommendations above.
|
|
|
|
|
|
|
|
|
|
(Note: It should go without saying that to support pre-2.2 versions
|
|
|
|
|
of Python, any server, gateway, application, or middleware must also
|
|
|
|
|
use only language features available in the target version, use
|
|
|
|
|
1 and 0 instead of ``True`` and ``False``, etc.)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
2004-08-27 13:30:09 -04:00
|
|
|
|
Server Extension APIs
|
|
|
|
|
---------------------
|
|
|
|
|
|
|
|
|
|
Some server authors may wish to expose more advanced APIs, that
|
|
|
|
|
application or framework authors can use for specialized purposes.
|
|
|
|
|
For example, a gateway based on ``mod_python`` might wish to expose
|
|
|
|
|
part of the Apache API as a WSGI extension.
|
|
|
|
|
|
|
|
|
|
In the simplest case, this requires nothing more than defining an
|
|
|
|
|
``environ`` variable, such as ``mod_python.some_api``. But, in many
|
|
|
|
|
cases, the possible presence of middleware can make this difficult.
|
|
|
|
|
For example, an API that offers access to the same HTTP headers that
|
|
|
|
|
are found in ``environ`` variables, might return different data if
|
|
|
|
|
``environ`` has been modified by middleware.
|
|
|
|
|
|
|
|
|
|
In general, any extension API that duplicates, supplants, or bypasses
|
|
|
|
|
some portion of WSGI functionality runs the risk of being incompatible
|
|
|
|
|
with middleware components. Server/gateway developers should *not*
|
|
|
|
|
assume that nobody will use middleware, because some framework
|
|
|
|
|
developers specifically intend to organize or reorganize their
|
|
|
|
|
frameworks to function almost entirely as middleware of various kinds.
|
|
|
|
|
|
|
|
|
|
So, to provide maximum compatibility, servers and gateways that
|
2004-08-31 18:00:28 -04:00
|
|
|
|
provide extension APIs that replace some WSGI functionality, **must**
|
2004-08-27 13:30:09 -04:00
|
|
|
|
design those APIs so that they are invoked using the portion of the
|
|
|
|
|
API that they replace. For example, an extension API to access HTTP
|
|
|
|
|
request headers must require the application to pass in its current
|
|
|
|
|
``environ``, so that the server/gateway may verify that HTTP headers
|
|
|
|
|
accessible via the API have not been altered by middleware. If the
|
|
|
|
|
extension API cannot guarantee that it will always agree with
|
|
|
|
|
``environ`` about the contents of HTTP headers, it must refuse service
|
|
|
|
|
to the application, e.g. by raising an error, returning ``None``
|
|
|
|
|
instead of a header collection, or whatever is appropriate to the API.
|
|
|
|
|
|
|
|
|
|
Similarly, if an extension API provides an alternate means of writing
|
|
|
|
|
response data or headers, it should require the ``start_response``
|
|
|
|
|
callable to be passed in, before the application can obtain the
|
|
|
|
|
extended service. If the object passed in is not the same one that
|
|
|
|
|
the server/gateway originally supplied to the application, it cannot
|
|
|
|
|
guarantee correct operation and must refuse to provide the extended
|
|
|
|
|
service to the application.
|
|
|
|
|
|
|
|
|
|
These guidelines also apply to middleware that adds information such
|
|
|
|
|
as parsed cookies, form variables, sessions, and the like to
|
|
|
|
|
``environ``. Specifically, such middleware should provide these
|
|
|
|
|
features as functions which operate on ``environ``, rather than simply
|
|
|
|
|
stuffing values into ``environ``. This helps ensure that information
|
|
|
|
|
is calculated from ``environ`` *after* any middleware has done any URL
|
|
|
|
|
rewrites or other ``environ`` modifications.
|
|
|
|
|
|
|
|
|
|
It is very important that these "safe extension" rules be followed by
|
|
|
|
|
both server/gateway and middleware developers, in order to avoid a
|
|
|
|
|
future in which middleware developers are forced to delete any and all
|
|
|
|
|
extension APIs from ``environ`` to ensure that their mediation isn't
|
|
|
|
|
being bypassed by applications using those extensions!
|
|
|
|
|
|
|
|
|
|
|
2004-09-01 09:44:07 -04:00
|
|
|
|
Optional Platform-Specific File Handling
|
|
|
|
|
----------------------------------------
|
|
|
|
|
|
2004-09-13 16:01:54 -04:00
|
|
|
|
Some operating environments provide special high-performance file-
|
|
|
|
|
transmission facilities, such as the Unix ``sendfile()`` call.
|
|
|
|
|
Servers and gateways **may** expose this functionality via an optional
|
|
|
|
|
``wsgi.file_wrapper`` key in the ``environ``. An application
|
|
|
|
|
**may** use this "file wrapper" to convert a file or file-like object
|
|
|
|
|
into an iterable that it then returns, e.g.::
|
|
|
|
|
|
|
|
|
|
if 'wsgi.file_wrapper' in environ:
|
|
|
|
|
return environ['wsgi.file_wrapper'](filelike, block_size)
|
|
|
|
|
else:
|
|
|
|
|
return iter(lambda: filelike.read(block_size), '')
|
|
|
|
|
|
|
|
|
|
If the server or gateway supplies ``wsgi.file_wrapper``, it must be
|
|
|
|
|
a callable that accepts one required positional parameter, and one
|
|
|
|
|
optional positional parameter. The first parameter is the file-like
|
|
|
|
|
object to be sent, and the second parameter is an optional block
|
|
|
|
|
size "suggestion" (which the server/gateway need not use). The
|
|
|
|
|
callable **must** return an iterable object, and **must not** perform
|
|
|
|
|
any data transmission until and unless the server/gateway actually
|
|
|
|
|
receives the iterable as a return value from the application.
|
|
|
|
|
(To do otherwise would prevent middleware from being able to interpret
|
|
|
|
|
or override the response data.)
|
|
|
|
|
|
|
|
|
|
To be considered "file-like", the object supplied by the application
|
|
|
|
|
must have a ``read()`` method that takes an optional size argument.
|
|
|
|
|
It **may** have a ``close()`` method, and if so, the iterable returned
|
|
|
|
|
by ``wsgi.file_wrapper`` **must** have a ``close()`` method that
|
|
|
|
|
invokes the original file-like object's ``close()`` method. If the
|
|
|
|
|
"file-like" object has any other methods or attributes with names
|
|
|
|
|
matching those of Python built-in file objects (e.g. ``fileno()``),
|
|
|
|
|
the ``wsgi.file_wrapper`` **may** assume that these methods or
|
|
|
|
|
attributes have the same semantics as those of a built-in file object.
|
|
|
|
|
|
|
|
|
|
The actual implementation of any platform-specific file handling
|
|
|
|
|
must occur **after** the application returns, and the server or
|
|
|
|
|
gateway checks to see if a wrapper object was returned. (Again,
|
|
|
|
|
because of the presence of middleware, error handlers, and the like,
|
|
|
|
|
it is not guaranteed that any wrapper created will actually be used.)
|
|
|
|
|
|
|
|
|
|
Apart from the handling of ``close()``, the semantics of returning a
|
|
|
|
|
file wrapper from the application should be the same as if the
|
|
|
|
|
application had returned ``iter(filelike.read, '')``. In other words,
|
|
|
|
|
transmission should begin at the current position within the "file"
|
|
|
|
|
at the time that transmission begins, and continue until the end is
|
|
|
|
|
reached.
|
|
|
|
|
|
|
|
|
|
Of course, platform-specific file transmission APIs don't usually
|
|
|
|
|
accept arbitrary "file-like" objects. Therefore, a
|
|
|
|
|
``wsgi.file_wrapper`` has to introspect the supplied object for
|
|
|
|
|
things such as a ``fileno()`` (Unix-like OSes) or a
|
|
|
|
|
``java.nio.FileChannel`` (under Jython) in order to determine if
|
|
|
|
|
the file-like object is suitable for use with the platform-specific
|
|
|
|
|
API it supports.
|
|
|
|
|
|
|
|
|
|
Note that even if the object is *not* suitable for the platform API,
|
|
|
|
|
the ``wsgi.file_wrapper`` **must** still return an iterable that wraps
|
|
|
|
|
``read()`` and ``close()``, so that applications using file wrappers
|
|
|
|
|
are portable across platforms. Here's a simple platform-agnostic
|
|
|
|
|
file wrapper class, suitable for old (pre 2.2) and new Pythons alike::
|
|
|
|
|
|
|
|
|
|
class FileWrapper:
|
|
|
|
|
|
|
|
|
|
def __init__(self, filelike, blksize=8192):
|
|
|
|
|
self.filelike = filelike
|
|
|
|
|
self.blksize = blksize
|
|
|
|
|
if hasattr(filelike,'close'):
|
|
|
|
|
self.close = filelike.close
|
|
|
|
|
|
|
|
|
|
def __getitem__(self,key):
|
|
|
|
|
data = self.filelike.read(self.blksize)
|
|
|
|
|
if data:
|
|
|
|
|
return data
|
|
|
|
|
raise IndexError
|
|
|
|
|
|
|
|
|
|
and here is a snippet from a server/gateway that uses it to provide
|
|
|
|
|
access to a platform-specific API::
|
|
|
|
|
|
|
|
|
|
environ['wsgi.file_wrapper'] = FileWrapper
|
|
|
|
|
result = application(environ, start_response)
|
|
|
|
|
|
|
|
|
|
try:
|
|
|
|
|
if isinstance(result,FileWrapper):
|
|
|
|
|
# check if result.filelike is usable w/platform-specific
|
|
|
|
|
# API, and if so, use that API to transmit the result.
|
|
|
|
|
# If not, fall through to normal iterable handling
|
|
|
|
|
# loop below.
|
|
|
|
|
|
|
|
|
|
for data in result:
|
|
|
|
|
# etc.
|
|
|
|
|
|
|
|
|
|
finally:
|
|
|
|
|
if hasattr(result,'close'):
|
|
|
|
|
result.close()
|
2004-09-01 09:44:07 -04:00
|
|
|
|
|
|
|
|
|
|
2004-08-27 13:30:09 -04:00
|
|
|
|
HTTP 1.1 Expect/Continue
|
|
|
|
|
------------------------
|
|
|
|
|
|
2004-09-14 02:02:28 -04:00
|
|
|
|
Servers and gateways that implement HTTP 1.1 **must** provide
|
|
|
|
|
transparent support for HTTP 1.1's "expect/continue" mechanism. This
|
2004-08-31 18:00:28 -04:00
|
|
|
|
may be done in any of several ways:
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
2004-09-14 02:02:28 -04:00
|
|
|
|
1. Respond to requests containing an ``Expect: 100-continue`` request
|
2004-09-01 00:59:21 -04:00
|
|
|
|
with an immediate "100 Continue" response, and proceed normally.
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
2004-09-14 02:02:28 -04:00
|
|
|
|
2. Proceed with the request normally, but provide the application
|
2004-09-01 00:59:21 -04:00
|
|
|
|
with a ``wsgi.input`` stream that will send the "100 Continue"
|
|
|
|
|
response if/when the application first attempts to read from the
|
|
|
|
|
input stream. The read request must then remain blocked until the
|
|
|
|
|
client responds.
|
2004-09-14 02:02:28 -04:00
|
|
|
|
|
|
|
|
|
3. Wait until the client decides that the server does not support
|
|
|
|
|
expect/continue, and sends the request body on its own. (This
|
|
|
|
|
is suboptimal, and is not recommended.)
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
2004-09-14 01:06:44 -04:00
|
|
|
|
Note that these behavior restrictions do not apply for HTTP 1.0
|
2004-08-27 13:30:09 -04:00
|
|
|
|
requests, or for requests that are not directed to an application
|
|
|
|
|
object. For more information on HTTP 1.1 Expect/Continue, see RFC
|
|
|
|
|
2616, sections 8.2.3 and 10.1.1.
|
|
|
|
|
|
|
|
|
|
|
2004-09-14 02:02:28 -04:00
|
|
|
|
Other HTTP Features
|
|
|
|
|
-------------------
|
|
|
|
|
|
|
|
|
|
In general, servers and gateways should "play dumb" and allow the
|
|
|
|
|
application complete control over its output. They should only make
|
|
|
|
|
changes that do not alter the effective semantics of the application's
|
|
|
|
|
response. It is always possible for the application developer to add
|
|
|
|
|
middleware components to supply additional features, so server/gateway
|
|
|
|
|
developers should be conservative in their implementation. In a sense,
|
|
|
|
|
a server should consider itself to be like an HTTP "proxy server", with
|
|
|
|
|
the application being an HTTP "origin server".
|
|
|
|
|
|
2004-09-16 18:04:35 -04:00
|
|
|
|
However, because WSGI servers and applications do not communicate via
|
|
|
|
|
HTTP, what RFC 2616 calls "hop-by-hop" headers do not apply to WSGI
|
|
|
|
|
internal communications. WSGI applications **must not** generate any
|
|
|
|
|
"hop-by-hop" headers [4]_, attempt to use HTTP features that would
|
|
|
|
|
require them to generate such headers, or rely on the content of
|
|
|
|
|
any incoming "hop-by-hop" headers in the ``environ`` dictionary.
|
|
|
|
|
WSGI servers **must** handle any supported inbound "hop-by-hop" headers
|
|
|
|
|
on their own, such as by decoding any inbound ``Transfer-Encoding``,
|
|
|
|
|
including chunked encoding if applicable.
|
|
|
|
|
|
|
|
|
|
Applying these principles to a variety of HTTP features, it should be
|
2004-09-14 02:02:28 -04:00
|
|
|
|
clear that a server **may** handle cache validation via the
|
|
|
|
|
``If-None-Match`` and ``If-Modified-Since`` request headers and the
|
|
|
|
|
``Last-Modified`` and ``ETag`` response headers. However, it is
|
|
|
|
|
not required to do this, and the application **should** perform its
|
|
|
|
|
own cache validation if it wants to support that feature, since
|
|
|
|
|
the server/gateway is not required to do such validation.
|
|
|
|
|
|
|
|
|
|
Similarly, a server **may** re-encode or transport-encode an
|
|
|
|
|
application's response, but the application **should** use a
|
2004-09-16 18:04:35 -04:00
|
|
|
|
suitable content encoding on its own, and **must not** apply a
|
|
|
|
|
transport encoding. A server **may** transmit byte ranges of the
|
|
|
|
|
application's response if requested by the client, and the
|
|
|
|
|
application doesn't natively support byte ranges. Again, however,
|
|
|
|
|
the application **should** perform this function on its own if desired.
|
2004-09-14 02:02:28 -04:00
|
|
|
|
|
|
|
|
|
Note that these restrictions on applications do not necessarily mean
|
|
|
|
|
that every application must reimplement every HTTP feature; many HTTP
|
|
|
|
|
features can be partially or fully implemented by middleware
|
|
|
|
|
components, thus freeing both server and application authors from
|
|
|
|
|
implementing the same features over and over again.
|
|
|
|
|
|
|
|
|
|
|
2004-08-27 13:30:09 -04:00
|
|
|
|
Questions and Answers
|
|
|
|
|
=====================
|
|
|
|
|
|
|
|
|
|
1. Why must ``environ`` be a dictionary? What's wrong with using a
|
|
|
|
|
subclass?
|
|
|
|
|
|
|
|
|
|
The rationale for requiring a dictionary is to maximize portability
|
|
|
|
|
between servers. The alternative would be to define some subset of
|
|
|
|
|
a dictionary's methods as being the standard and portable
|
|
|
|
|
interface. In practice, however, most servers will probably find a
|
|
|
|
|
dictionary adequate to their needs, and thus framework authors will
|
|
|
|
|
come to expect the full set of dictionary features to be available,
|
|
|
|
|
since they will be there more often than not. But, if some server
|
|
|
|
|
chooses *not* to use a dictionary, then there will be
|
|
|
|
|
interoperability problems despite that server's "conformance" to
|
|
|
|
|
spec. Therefore, making a dictionary mandatory simplifies the
|
|
|
|
|
specification and guarantees interoperabilty.
|
|
|
|
|
|
|
|
|
|
Note that this does not prevent server or framework developers from
|
|
|
|
|
offering specialized services as custom variables *inside* the
|
|
|
|
|
``environ`` dictionary. This is the recommended approach for
|
|
|
|
|
offering any such value-added services.
|
|
|
|
|
|
|
|
|
|
2. Why can you call ``write()`` *and* yield strings/return an
|
2004-09-15 13:05:22 -04:00
|
|
|
|
iterable? Shouldn't we pick just one way?
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
|
|
|
|
If we supported only the iteration approach, then current
|
|
|
|
|
frameworks that assume the availability of "push" suffer. But, if
|
|
|
|
|
we only support pushing via ``write()``, then server performance
|
|
|
|
|
suffers for transmission of e.g. large files (if a worker thread
|
|
|
|
|
can't begin work on a new request until all of the output has been
|
|
|
|
|
sent). Thus, this compromise allows an application framework to
|
|
|
|
|
support both approaches, as appropriate, but with only a little
|
|
|
|
|
more burden to the server implementor than a push-only approach
|
|
|
|
|
would require.
|
|
|
|
|
|
|
|
|
|
3. What's the ``close()`` for?
|
|
|
|
|
|
2004-09-15 13:05:22 -04:00
|
|
|
|
When writes are done during the execution of an application
|
2004-08-27 13:30:09 -04:00
|
|
|
|
object, the application can ensure that resources are released
|
|
|
|
|
using a try/finally block. But, if the application returns an
|
2004-09-15 13:05:22 -04:00
|
|
|
|
iterable, any resources used will not be released until the
|
|
|
|
|
iterable is garbage collected. The ``close()`` idiom allows an
|
2004-08-27 13:30:09 -04:00
|
|
|
|
application to release critical resources at the end of a request,
|
|
|
|
|
and it's forward-compatible with the support for try/finally in
|
|
|
|
|
generators that's proposed by PEP 325.
|
|
|
|
|
|
|
|
|
|
4. Why is this interface so low-level? I want feature X! (e.g.
|
|
|
|
|
cookies, sessions, persistence, ...)
|
|
|
|
|
|
|
|
|
|
This isn't Yet Another Python Web Framework. It's just a way for
|
|
|
|
|
frameworks to talk to web servers, and vice versa. If you want
|
|
|
|
|
these features, you need to pick a web framework that provides the
|
|
|
|
|
features you want. And if that framework lets you create a WSGI
|
|
|
|
|
application, you should be able to run it in most WSGI-supporting
|
|
|
|
|
servers. Also, some WSGI servers may offer additional services via
|
|
|
|
|
objects provided in their ``environ`` dictionary; see the
|
|
|
|
|
applicable server documentation for details. (Of course,
|
|
|
|
|
applications that use such extensions will not be portable to other
|
|
|
|
|
WSGI-based servers.)
|
|
|
|
|
|
|
|
|
|
5. Why use CGI variables instead of good old HTTP headers? And why
|
|
|
|
|
mix them in with WSGI-defined variables?
|
|
|
|
|
|
|
|
|
|
Many existing web frameworks are built heavily upon the CGI spec,
|
|
|
|
|
and existing web servers know how to generate CGI variables. In
|
|
|
|
|
contrast, alternative ways of representing inbound HTTP information
|
|
|
|
|
are fragmented and lack market share. Thus, using the CGI
|
|
|
|
|
"standard" seems like a good way to leverage existing
|
|
|
|
|
implementations. As for mixing them with WSGI variables,
|
|
|
|
|
separating them would just require two dictionary arguments to be
|
|
|
|
|
passed around, while providing no real benefits.
|
|
|
|
|
|
|
|
|
|
6. What about the status string? Can't we just use the number,
|
|
|
|
|
passing in ``200`` instead of ``"200 OK"``?
|
|
|
|
|
|
|
|
|
|
Doing this would complicate the server or gateway, by requiring
|
|
|
|
|
them to have a table of numeric statuses and corresponding
|
|
|
|
|
messages. By contrast, it is easy for an application or framework
|
|
|
|
|
author to type the extra text to go with the specific response code
|
|
|
|
|
they are using, and existing frameworks often already have a table
|
|
|
|
|
containing the needed messages. So, on balance it seems better to
|
|
|
|
|
make the application/framework responsible, rather than the server
|
|
|
|
|
or gateway.
|
|
|
|
|
|
|
|
|
|
7. Why is ``wsgi.run_once`` not guaranteed to run the app only once?
|
|
|
|
|
|
|
|
|
|
Because it's merely a suggestion to the application that it should
|
|
|
|
|
"rig for infrequent running". This is intended for application
|
|
|
|
|
frameworks that have multiple modes of operation for caching,
|
|
|
|
|
sessions, and so forth. In a "multiple run" mode, such frameworks
|
|
|
|
|
may preload caches, and may not write e.g. logs or session data to
|
|
|
|
|
disk after each request. In "single run" mode, such frameworks
|
|
|
|
|
avoid preloading and flush all necessary writes after each request.
|
|
|
|
|
|
|
|
|
|
However, in order to test an application or framework to verify
|
|
|
|
|
correct operation in the latter mode, it may be necessary (or at
|
|
|
|
|
least expedient) to invoke it more than once. Therefore, an
|
|
|
|
|
application should not assume that it will definitely not be run
|
|
|
|
|
again, just because it is called with ``wsgi.run_once`` set to
|
|
|
|
|
``True``.
|
|
|
|
|
|
|
|
|
|
8. Feature X (dictionaries, callables, etc.) are ugly for use in
|
|
|
|
|
application code; why don't we use objects instead?
|
|
|
|
|
|
|
|
|
|
All of these implementation choices of WSGI are specifically
|
|
|
|
|
intended to *decouple* features from one another; recombining these
|
|
|
|
|
features into encapsulated objects makes it somewhat harder to
|
|
|
|
|
write servers or gateways, and an order of magnitude harder to
|
|
|
|
|
write middleware that replaces or modifies only small portions of
|
|
|
|
|
the overall functionality.
|
|
|
|
|
|
|
|
|
|
In essence, middleware wants to have a "Chain of Responsibility"
|
|
|
|
|
pattern, whereby it can act as a "handler" for some functions,
|
|
|
|
|
while allowing others to remain unchanged. This is difficult to do
|
|
|
|
|
with ordinary Python objects, if the interface is to remain
|
|
|
|
|
extensible. For example, one must use ``__getattr__`` or
|
|
|
|
|
``__getattribute__`` overrides, to ensure that extensions (such as
|
|
|
|
|
attributes defined by future WSGI versions) are passed through.
|
|
|
|
|
|
|
|
|
|
This type of code is notoriously difficult to get 100% correct, and
|
|
|
|
|
few people will want to write it themselves. They will therefore
|
|
|
|
|
copy other people's implementations, but fail to update them when
|
|
|
|
|
the person they copied from corrects yet another corner case.
|
|
|
|
|
|
|
|
|
|
Further, this necessary boilerplate would be pure excise, a
|
|
|
|
|
developer tax paid by middleware developers to support a slightly
|
|
|
|
|
prettier API for application framework developers. But,
|
|
|
|
|
application framework developers will typically only be updating
|
|
|
|
|
*one* framework to support WSGI, and in a very limited part of
|
|
|
|
|
their framework as a whole. It will likely be their first (and
|
|
|
|
|
maybe their only) WSGI implementation, and thus they will likely
|
|
|
|
|
implement with this specification ready to hand. Thus, the effort
|
|
|
|
|
of making the API "prettier" with object attributes and suchlike
|
|
|
|
|
would likely be wasted for this audience.
|
|
|
|
|
|
|
|
|
|
We encourage those who want a prettier (or otherwise improved) WSGI
|
|
|
|
|
interface for use in direct web application programming (as opposed
|
|
|
|
|
to web framework development) to develop APIs or frameworks that
|
|
|
|
|
wrap WSGI for convenient use by application developers. In this
|
|
|
|
|
way, WSGI can remain conveniently low-level for server and
|
|
|
|
|
middleware authors, while not being "ugly" for application
|
|
|
|
|
developers.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Acknowledgements
|
|
|
|
|
================
|
|
|
|
|
|
|
|
|
|
Thanks go to the many folks on the Web-SIG mailing list whose
|
|
|
|
|
thoughtful feedback made this revised draft possible. Especially:
|
|
|
|
|
|
2004-09-01 00:59:21 -04:00
|
|
|
|
* Gregory "Grisha" Trubetskoy, author of ``mod_python``, who beat up
|
|
|
|
|
on the first draft as not offering any advantages over "plain old
|
|
|
|
|
CGI", thus encouraging me to look for a better approach.
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
2004-09-01 00:59:21 -04:00
|
|
|
|
* Ian Bicking, who helped nag me into properly specifying the
|
|
|
|
|
multithreading and multiprocess options, as well as badgering me to
|
|
|
|
|
provide a mechanism for servers to supply custom extension data to
|
|
|
|
|
an application.
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
2004-09-01 00:59:21 -04:00
|
|
|
|
* Tony Lownds, who came up with the concept of a ``start_response``
|
|
|
|
|
function that took the status and headers, returning a ``write``
|
2004-09-13 16:01:54 -04:00
|
|
|
|
function. His input also guided the design of the exception handling
|
|
|
|
|
facilities, especially in the area of allowing for middleware that
|
|
|
|
|
overrides application error messages.
|
|
|
|
|
|
|
|
|
|
* Alan Kennedy, whose courageous attempts to implement WSGI-on-Jython
|
|
|
|
|
(well before the spec was finalized) helped to shape the "supporting
|
|
|
|
|
older versions of Python" section, as well as the optional
|
|
|
|
|
``wsgi.file_wrapper`` facility.
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
References
|
|
|
|
|
==========
|
|
|
|
|
|
|
|
|
|
.. [1] The Python Wiki "Web Programming" topic
|
|
|
|
|
(http://www.python.org/cgi-bin/moinmoin/WebProgramming)
|
|
|
|
|
|
|
|
|
|
.. [2] The Common Gateway Interface Specification, v 1.1, 3rd Draft
|
|
|
|
|
(http://cgi-spec.golux.com/draft-coar-cgi-v11-03.txt)
|
|
|
|
|
|
2004-09-16 18:04:35 -04:00
|
|
|
|
.. [3] "Chunked Transfer Coding" -- HTTP/1.1, section 3.6.1
|
2004-08-27 13:30:09 -04:00
|
|
|
|
(http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.6.1)
|
|
|
|
|
|
2004-09-16 18:04:35 -04:00
|
|
|
|
.. [4] "End-to-end and Hop-by-hop Headers" -- HTTP/1.1, Section 13.5.1
|
|
|
|
|
(http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.5.1)
|
|
|
|
|
|
2004-08-27 13:30:09 -04:00
|
|
|
|
|
|
|
|
|
Copyright
|
|
|
|
|
=========
|
|
|
|
|
|
|
|
|
|
This document has been placed in the public domain.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
..
|
|
|
|
|
Local Variables:
|
|
|
|
|
mode: indented-text
|
|
|
|
|
indent-tabs-mode: nil
|
|
|
|
|
sentence-end-double-space: t
|
|
|
|
|
fill-column: 70
|
|
|
|
|
End:
|