PEP: 268
Title: Extended HTTP functionality and WebDAV
Version: $Revision$
Last-Modified: $Date$
Author: gstein@lyra.org (Greg Stein)
Status: Draft
Type: Standards Track
Created: 20-Aug-2001
Python-Version: 2.x
Post-History: 21-Aug-2001
Abstract
This PEP discusses new modules and extended functionality for
Python's HTTP support, notably the addition of authenticated
requests, proxy support, authenticated proxy usage, and WebDAV [1]
capabilities.
Rationale
Python has been quite popular as a result of its "batteries
included" positioning. One of the most heavily used protocols,
HTTP (see RFC 2616), has been included with Python for years
(httplib). However, this support has not kept up with the full
needs and requirements of many HTTP-based applications and
systems. In addition, new protocols based on HTTP, such as WebDAV
and XML-RPC, are becoming useful and are seeing increasing
usage. Supplying this functionality meets Python's "batteries
included" role and also keeps Python at the leading edge of new
technologies.
While authentication and proxy support are two very notable
features missing from Python's core HTTP processing, they are
minimally handled as part of Python's URL handling (urllib and
urllib2). However, applications that need fine-grained or
sophisticated HTTP handling cannot make use of the features while
they reside in urllib. Refactoring these features into a location
where they can be directly associated with an HTTP connection will
improve their utility for both urllib and for sophisticated
applications.
The motivation for this PEP came from several people requesting
these features directly, and from a number of feature requests on
SourceForge. Since the exact form of the modules to be provided
and the classes/architecture used could be subject to debate, this
PEP was created to provide a focal point for those discussions.
Specification
Two modules will be added to the standard library: 'httpx' (HTTP
extended functionality), and 'davlib' (WebDAV library).
[ suggestions for module names are welcome; davlib has some
precedent, but something like 'webdav' might be desirable ]
HTTP Authentication
The httpx module will provide a mixin for performing HTTP
authentication (for both proxy and origin server
authentication). This mixin (httpx.HandleAuthentication) can be
combined with the HTTPConnection and the HTTPSConnection classes
(the mixin may possibly work with the HTTP and HTTPS compatibility
classes, but that is not a requirement).
The mixin will delegate the authentication process to one or more
"authenticator" objects, allowing multiple connections to share
authenticators. The use of a separate object allows for a
long-term connection to an authentication system (e.g. LDAP). An
authenticator for the Basic and Digest mechanisms (see RFC 2617)
will be provided. User-supplied authenticator subclasses can be
registered and used by the connections.
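As an illustration of what the Basic authenticator has to produce,
the following is a minimal sketch (not the proposed httpx API) of
building the Authorization header value described in RFC 2617. The
function name basic_authorization is hypothetical, and the sketch
is written for a current Python 3 interpreter rather than the 2.x
series this PEP targets.

```python
import base64


def basic_authorization(username, password):
    """Return an Authorization header value for the Basic scheme.

    Hypothetical helper: the PEP does not define this function.
    """
    # RFC 2617: base64 of "userid:password", prefixed with the scheme name.
    token = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(token).decode("ascii")
```

A Digest authenticator would additionally need the nonce and realm
from the server's challenge, which is why the PEP passes the
response object to the authenticator.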
A "credentials" object (httpx.Credentials) is also associated with
the mixin, and stores the credentials (e.g. username and password)
needed by the authenticators. Subclasses of Credentials can be
created to hold additional information (e.g. NT domain).
The mixin overrides the getresponse() method to detect 401
(Unauthorized) and 407 (Proxy Authentication Required)
responses. When either is found, the response object, the
connection, and the credentials are passed to the authenticator
corresponding with the authentication scheme specified in the
response (multiple authenticators are tried in decreasing order of
security if multiple schemes are in the response). Each
authenticator can examine the response headers and decide whether
and how to resend the request with the correct authentication
headers. If no authenticator can successfully handle the
authentication, then an exception is raised.
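The ordering rule above ("decreasing order of security") could be
sketched as follows. This is only an assumption about one possible
shape: the names SCHEME_STRENGTH, offered_schemes and pick_order are
hypothetical, and the parsing assumes one challenge per header
value, which real servers do not guarantee.

```python
# Hypothetical ranking: a higher number means a stronger scheme.
SCHEME_STRENGTH = {"digest": 2, "basic": 1}


def offered_schemes(challenge_headers):
    """Return lowercased scheme names from WWW-Authenticate (or
    Proxy-Authenticate) header values, one challenge per value."""
    schemes = []
    for value in challenge_headers:
        parts = value.strip().split(None, 1)
        if parts:
            schemes.append(parts[0].lower())
    # Drop duplicates while keeping the order the server used.
    return list(dict.fromkeys(schemes))


def pick_order(challenge_headers, registered):
    """Return the registered schemes to try, strongest first."""
    usable = [s for s in offered_schemes(challenge_headers) if s in registered]
    return sorted(usable, key=lambda s: SCHEME_STRENGTH.get(s, 0), reverse=True)
```

An empty result corresponds to the case where no authenticator can
handle the response, which the mixin reports by raising an
exception.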
Resending a request, with the appropriate credentials, is one of
the more difficult portions of the authentication system. The
difficulty arises in recording what was sent originally: the
request line, the headers, and the body. By overriding putrequest,
putheader, and endheaders, we can capture all but the body. Once
the endheaders method is called, then we capture all calls to
send() (until the next putrequest method call) to hold the body
content. The mixin will have a configurable limit for the amount
of data to hold in this fashion (e.g. only hold up to 100k of body
content). Assuming that the entire body has been stored, then we
can resend the request with the appropriate authentication
information.
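The capture-and-resend idea can be sketched as below. This is a
sketch under stated assumptions, not the reference implementation:
RecordingMixin, StubConnection, can_replay and replay are
hypothetical names, and StubConnection stands in for a real
connection class so the example runs without a network. With the
real connection classes more care is needed, because endheaders()
may itself call send() to transmit the header block.

```python
class StubConnection:
    """Stand-in for an HTTP connection; records what would be sent."""

    def __init__(self):
        self.wire = []

    def putrequest(self, method, url):
        self.wire.append(("request", method, url))

    def putheader(self, header, *values):
        self.wire.append(("header", header, values))

    def endheaders(self):
        self.wire.append(("endheaders",))

    def send(self, data):
        self.wire.append(("data", data))


class RecordingMixin:
    """Hypothetical mixin: remember the last request for resending."""

    max_body = 100 * 1024      # configurable cap on stored body bytes
    _rec_in_body = False
    _rec_overflow = False

    def putrequest(self, method, url, *args, **kwargs):
        # A new request starts: reset everything recorded so far.
        self._rec_line = (method, url)
        self._rec_headers = []
        self._rec_body = []
        self._rec_size = 0
        self._rec_in_body = False
        self._rec_overflow = False
        super().putrequest(method, url, *args, **kwargs)

    def putheader(self, header, *values):
        self._rec_headers.append((header, values))
        super().putheader(header, *values)

    def endheaders(self, *args, **kwargs):
        super().endheaders(*args, **kwargs)
        # From here until the next putrequest(), send() carries body data.
        self._rec_in_body = True

    def send(self, data):
        if self._rec_in_body and not self._rec_overflow:
            if self._rec_size + len(data) > self.max_body:
                # Too large to hold: give up on replaying this request.
                self._rec_overflow = True
                self._rec_body = []
            else:
                self._rec_body.append(data)
                self._rec_size += len(data)
        super().send(data)

    def can_replay(self):
        return not self._rec_overflow

    def replay(self, extra_headers=()):
        """Resend the recorded request, adding e.g. an Authorization header."""
        if self._rec_overflow:
            raise ValueError("request body was too large to record")
        method, url = self._rec_line
        headers, body = list(self._rec_headers), list(self._rec_body)
        self.putrequest(method, url)
        for name, values in headers:
            self.putheader(name, *values)
        for name, value in extra_headers:
            self.putheader(name, value)
        self.endheaders()
        for chunk in body:
            self.send(chunk)


class RecordingConnection(RecordingMixin, StubConnection):
    pass
```

The state is copied before replay() calls putrequest() because
putrequest() resets the recording, so the resent request is itself
recorded and could be replayed again.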
If the body is too large to be stored, then getresponse()
simply returns the response object, indicating the 401 or 407
error. Since the authentication information has been computed and
cached (into the Credentials object; see below), the caller can
simply regenerate the request. The mixin will attach the
appropriate credentials.
A "protection space" (see RFC 2617, section 1.2) is defined as a
tuple of the host, port, and authentication realm. When a request
is initially sent to an HTTP server, we do not know the
authentication realm (the realm is only returned when
authentication fails). However, we do have the path from the URL,
and that can be useful in determining the credentials to send to
the server. The Basic authentication scheme is typically set up
hierarchically: the credentials for /path can be tried for
/path/subpath. The Digest authentication scheme has explicit
support for the hierarchical setup. The httpx.Credentials object
will store credentials for multiple protection spaces, and can be
looked up in two different ways:
1) looked up using (host, port, path) -- this lookup scheme is
used when generating a request for a path where we don't know the
authentication realm.
2) looked up using (host, port, realm) -- this mechanism is used
during the authentication process when the server has specified
that the Request-URI resides within a specific realm.
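The two lookups above could be sketched as follows. This is a
minimal illustration, assuming a simple in-memory store; the method
names add, for_realm and for_path are hypothetical and are not
defined by this PEP.

```python
def _is_under(prefix, path):
    """True if path equals prefix or lies below it in the hierarchy."""
    base = prefix.rstrip("/")
    return path == base or path.startswith(base + "/")


class Credentials:
    """Hypothetical sketch of a store covering several protection spaces."""

    def __init__(self):
        self._by_realm = {}   # (host, port, realm) -> (username, password)
        self._by_path = {}    # (host, port) -> {path prefix: realm}

    def add(self, host, port, realm, username, password, path="/"):
        self._by_realm[(host, port, realm)] = (username, password)
        self._by_path.setdefault((host, port), {})[path] = realm

    def for_realm(self, host, port, realm):
        """Lookup 2: used once the server has named the realm."""
        return self._by_realm.get((host, port, realm))

    def for_path(self, host, port, path):
        """Lookup 1: used before the realm is known; longest prefix wins."""
        prefixes = self._by_path.get((host, port), {})
        best = None
        for prefix in prefixes:
            if _is_under(prefix, path) and (best is None or len(prefix) > len(best)):
                best = prefix
        if best is None:
            return None
        return self._by_realm[(host, port, prefixes[best])]
```

With credentials stored for /path, a request for /path/subpath
finds them through the path lookup, while /pathology does not match
because the comparison is made on whole path segments.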
The HandleAuthentication mixin will override putrequest() to
automatically insert credentials, if available. The URL from the
putrequest is used to determine the appropriate authentication
information to use.
It is also important to note that two sets of credentials are
used and stored by the mixin: one set for any proxy that may be
used, and one for the target origin server. Since proxies do
not have paths, the protection spaces in the proxy credentials
will always use "/" for storing and looking up via a path.
Proxy Handling
The httpx module will provide a mixin for using a proxy to perform
HTTP(S) operations. This mixin (httpx.UseProxy) can be combined
with the HTTPConnection and the HTTPSConnection classes (the mixin
may possibly work with the HTTP and HTTPS compatibility classes,
but that is not a requirement).
The mixin will record the (host, port) of the proxy to use. XXX
will be overridden to use this host/port combination for
connections and to rewrite request URLs into the absoluteURIs
referring to the origin server (these URIs are passed to the proxy
server).
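The URL rewriting step can be sketched as a small helper. The
function name absolute_uri is hypothetical, and which connection
method would call it is deliberately left open, as in the text
above.

```python
def absolute_uri(scheme, host, port, path):
    """Build the absoluteURI that is sent to a proxy in the request line.

    Hypothetical helper: host and port are those of the origin server,
    not of the proxy.
    """
    default_ports = {"http": 80, "https": 443}
    if not path.startswith("/"):
        path = "/" + path
    # Omit the port when it is the default for the scheme.
    netloc = host if port == default_ports.get(scheme) else f"{host}:{port}"
    return f"{scheme}://{netloc}{path}"
```

The connection itself is opened to the proxy's (host, port), while
the request line carries the origin server's full URI.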
Proxy authentication is handled by the httpx.HandleAuthentication
class since a user may directly use HTTP(S)Connection to speak
with proxies.
WebDAV Features
The davlib module will provide a mixin for sending WebDAV requests
to a WebDAV-enabled server. This mixin (davlib.DAVClient) can be
combined with the HTTPConnection and the HTTPSConnection classes
(the mixin may possibly work with the HTTP and HTTPS compatibility
classes, but that is not a requirement).
The mixin provides methods to perform the various HTTP methods
defined by HTTP in RFC 2616, and by WebDAV in RFC 2518.
A custom response object is used to decode 207 (Multi-Status)
responses. The response object will use the standard library's xml
package to parse the multistatus XML information, producing a
simple structure of objects to hold the multistatus data. Multiple
parsing schemes will be tried/used, in order of decreasing speed.
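As a rough illustration of the "simple structure of objects", the
sketch below decodes a 207 body with xml.etree.ElementTree from a
current standard library. It is an assumption about one possible
parsing scheme, not the davlib design; parse_multistatus and the
dictionary layout it returns are hypothetical.

```python
import xml.etree.ElementTree as ET

DAV = "{DAV:}"  # ElementTree spells namespaced tags as {uri}localname


def parse_multistatus(xml_text):
    """Return one dict per DAV:response element in a multistatus body."""
    root = ET.fromstring(xml_text)
    if root.tag != DAV + "multistatus":
        raise ValueError("not a DAV: multistatus document")
    results = []
    for response in root.findall(DAV + "response"):
        propstats = []
        for propstat in response.findall(DAV + "propstat"):
            prop = propstat.find(DAV + "prop")
            names = [child.tag for child in prop] if prop is not None else []
            propstats.append((propstat.findtext(DAV + "status"), names))
        results.append({
            "href": response.findtext(DAV + "href"),
            "status": response.findtext(DAV + "status"),  # None if absent
            "propstats": propstats,
        })
    return results
```

Each entry pairs a resource's href with its per-property status
lines, which is the information a caller needs to tell which parts
of a multi-resource request succeeded.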
Reference Implementation
The actual (future/final) implementation is being developed in the
/nondist/sandbox/Lib directory, until it is accepted and moved
into the main Lib directory.
References
[1] http://www.webdav.org/
Copyright
This document has been placed in the public domain.
Local Variables:
mode: indented-text
indent-tabs-mode: nil
End: