201 lines
8.3 KiB
Plaintext
201 lines
8.3 KiB
Plaintext
PEP: 268
|
||
Title: Extended HTTP functionality and WebDAV
|
||
Version: $Revision$
|
||
Last-Modified: $Date$
|
||
Author: gstein@lyra.org (Greg Stein)
|
||
Status: Draft
|
||
Type: Standards Track
|
||
Created: 20-Aug-2001
|
||
Python-Version: 2.2
|
||
Post-History: 21-Aug-2001
|
||
|
||
|
||
Abstract
|
||
|
||
This PEP discusses new modules and extended functionality for
|
||
Python's HTTP support. Notably, the addition of authenticated
|
||
requests, proxy support, authenticated proxy usage, and WebDAV [1]
|
||
capabilities.
|
||
|
||
|
||
Rationale
|
||
|
||
Python has been quite popular as a result of its "batteries
|
||
included" positioning. One of the most heavily used protocols,
|
||
HTTP (see RFC 2616), has been included with Python for years
|
||
(httplib). However, this support has not kept up with the full
|
||
needs and requirements of many HTTP-based applications and
|
||
systems. In addition, new protocols based on HTTP, such as WebDAV
|
||
and XML-RPC, are becoming useful and are seeing increasing
|
||
usage. Supplying this functionality meets Python's "batteries
|
||
included" role and also keeps Python at the leading edge of new
|
||
technologies.
|
||
|
||
While authentication and proxy support are two very notable
|
||
features missing from Python's core HTTP processing, they are
|
||
minimally handled as part of Python's URL handling (urllib and
|
||
urllib2). However, applications that need fine-grained or
|
||
sophisticated HTTP handling cannot make use of the features while
|
||
they reside in urllib. Refactoring these features into a location
|
||
where they can be directly associated with an HTTP connection will
|
||
improve their utility for both urllib and for sophisticated
|
||
applications.
|
||
|
||
The motivation for this PEP was from several people requesting
|
||
these features directly, and from a number of feature requests on
|
||
SourceForge. Since the exact form of the modules to be provided
|
||
and the classes/architecture used could be subject to debate, this
|
||
PEP was created to provide a focal point for those discussions.
|
||
|
||
|
||
Specification
|
||
|
||
Two modules will be added to the standard library: 'httpx' (HTTP
|
||
extended functionality), and 'davlib' (WebDAV library).
|
||
|
||
[ suggestions for module names are welcome; davlib has some
|
||
precedence, but something like 'webdav' might be desirable ]
|
||
|
||
|
||
HTTP Authentication
|
||
|
||
The httpx module will provide a mixin for performing HTTP
|
||
authentication (for both proxy and origin server
|
||
authentication). This mixin (httpx.HandleAuthentication) can be
|
||
combined with the HTTPConnection and the HTTPSConnection classes
|
||
(the mixin may possibly work with the HTTP and HTTPS compatibility
|
||
classes, but that is not a requirement).
|
||
|
||
The mixin will delegate the authentication process to one or more
|
||
"authenticator" objects, allowing multiple connections to share
|
||
authenticators. The use of a separate object allows for a long
|
||
term connection to an authentication system (e.g. LDAP). An
|
||
authenticator for the Basic and Digest mechanisms (see RFC 2617)
|
||
will be provided. User-supplied authenticator subclasses can be
|
||
registered and used by the connections.
|
||
|
||
A "credentials" object (httpx.Credentials) is also associated with
|
||
the mixin, and stores the credentials (e.g. username and password)
|
||
needed by the authenticators. Subclasses of Credentials can be
|
||
created to hold additional information (e.g. NT domain).
|
||
|
||
The mixin overrides the getresponse() method to detect 401
|
||
(Unauthorized) and 407 (Proxy Authentication Required)
|
||
responses. When this is found, the response object, the
|
||
connection, and the credentials are passed to the authenticator
|
||
corresponding with the authentication scheme specified in the
|
||
response (multiple authenticators are tried in decreasing order of
|
||
security if multiple schemes are in the response). Each
|
||
authenticator can examine the response headers and decide whether
|
||
and how to resend the request with the correct authentication
|
||
headers. If no authenticator can successfully handle the
|
||
authentication, then an exception is raised.
|
||
|
||
Resending a request, with the appropriate credentials, is one of
|
||
the more difficult portions of the authentication system. The
|
||
difficulty arises in recording what was sent originally: the
|
||
request line, the headers, and the body. By overriding putrequest,
|
||
putheader, and endheaders, we can capture all but the body. Once
|
||
the endheaders method is called, then we capture all calls to
|
||
send() (until the next putrequest method call) to hold the body
|
||
content. The mixin will have a configurable limit for the amount
|
||
of data to hold in this fashion (e.g. only hold up to 100k of body
|
||
content). Assuming that the entire body has been stored, then we
|
||
can resend the request with the appropriate authentication
|
||
information.
|
||
|
||
If the body is too large to be stored, then the getresponse()
|
||
simply returns the response object, indicating the 401 or 407
|
||
error. Since the authentication information has been computed and
|
||
cached (into the Credentials object; see below), the caller can
|
||
simply regenerate the request. The mixin will attach the
|
||
appropriate credentials.
|
||
|
||
A "protection space" (see RFC 2617, section 1.2) is defined as a
|
||
tuple of the host, port, and authentication realm. When a request
|
||
is initially sent to an HTTP server, we do not know the
|
||
authentication realm (the realm is only returned when
|
||
authentication fails). However, we do have the path from the URL,
|
||
and that can be useful in determining the credentials to send to
|
||
the server. The Basic authentication scheme is typically set up
|
||
hierarchically: the credentials for /path can be tried for
|
||
/path/subpath. The Digest authentication scheme has explicit
|
||
support for the hierarchical setup. The httpx.Credentials object
|
||
will store credentials for multiple protection spaces, and can be
|
||
looked up in two differents ways:
|
||
|
||
1) looked up using (host, port, path) -- this lookup scheme is
|
||
used when generating a request for a path where we don't know the
|
||
authentication realm.
|
||
|
||
2) looked up using (host, port, realm) -- this mechanism is used
|
||
during the authentication process when the server has specified
|
||
that the Request-URI resides within a specific realm.
|
||
|
||
The mixin will override endheaders() to automatically insert
|
||
credentials, if available.
|
||
|
||
It is also important to note that two sets of credentials are
|
||
important, and stored by the mixin. One set for any proxy that may
|
||
be used, and one used for the target origin server.
|
||
|
||
|
||
Proxy Handling
|
||
|
||
The httpx module will provide a mixin for using a proxy to perform
|
||
HTTP(S) operations. This mixin (httpx.UseProxy) can be combined
|
||
with the HTTPConnection and the HTTPSConnection classes (the mixin
|
||
may possibly work with the HTTP and HTTPS compatibility classes,
|
||
but that is not a requirement).
|
||
|
||
The mixin will record the (host, port) of the proxy to use. XXX
|
||
will be overridden to use this host/port combination for
|
||
connections and to rewrite request URLs into the absoluteURIs
|
||
referring to the origin server (these URIs are passed to the proxy
|
||
server).
|
||
|
||
Proxy authentication is handled by the httpx.HandleAuthentication
|
||
class since a user may directly use HTTP(S)Connection to speak
|
||
with proxies.
|
||
|
||
|
||
WebDAV Features
|
||
|
||
The davlib module will provide a mixin for sending WebDAV requests
|
||
to a WebDAV-enabled server. This mixin (davlib.DAVClient) can be
|
||
combined with the HTTPConnection and the HTTPSConnection classes
|
||
(the mixin may possibly work with the HTTP and HTTPS compatibility
|
||
classes, but that is not a requirement).
|
||
|
||
The mixin provides methods to perform the various HTTP methods
|
||
defined by HTTP in RFC 2616, and by WebDAV in RFC 2518.
|
||
|
||
A custom response object is used to decode 207 (Multi-Status)
|
||
responses. The response object will use the standard library's xml
|
||
package to parse the multistatus XML information, producing a
|
||
simple structure of objects to hold the multistatus data. Multiple
|
||
parsing schemes will be tried/used, in order of decreasing speed.
|
||
|
||
|
||
Reference Implementation
|
||
|
||
reference impl will probably go into /nondist/sandbox/ until
|
||
accepted into the main Lib directory ...
|
||
|
||
|
||
References
|
||
|
||
[1] http://www.webdav.org/
|
||
|
||
|
||
Copyright
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
End:
|