Per PEP 12, the body of paragraphs should not be indented. This commit
reflows all of the text, but makes no content changes.

The numbered list was indented appropriately for recognition, but
(again) no content changes occurred.
Greg Stein 2002-09-05 06:50:07 +00:00
parent 8bff1bdaa0
commit b30961a2fe
1 changed files with 127 additions and 133 deletions

@@ -14,186 +14,180 @@ Content-Type: text/x-rst
Abstract
========
This PEP discusses new modules and extended functionality for
Python's HTTP support. Notable additions are authenticated
requests, proxy support, authenticated proxy usage, and WebDAV_
capabilities.
This PEP discusses new modules and extended functionality for Python's
HTTP support. Notable additions are authenticated requests, proxy
support, authenticated proxy usage, and WebDAV_ capabilities.
Rationale
=========
Python has been quite popular as a result of its "batteries
included" positioning. One of the most heavily used protocols,
HTTP (see RFC 2616), has been included with Python for years
(``httplib``). However, this support has not kept up with the full
needs and requirements of many HTTP-based applications and
systems. In addition, new protocols based on HTTP, such as WebDAV
and XML-RPC, are becoming useful and are seeing increasing
usage. Supplying this functionality meets Python's "batteries
included" role and also keeps Python at the leading edge of new
technologies.
Python has been quite popular as a result of its "batteries included"
positioning. One of the most heavily used protocols, HTTP (see RFC
2616), has been included with Python for years (``httplib``). However,
this support has not kept up with the full needs and requirements of
many HTTP-based applications and systems. In addition, new protocols
based on HTTP, such as WebDAV and XML-RPC, are becoming useful and are
seeing increasing usage. Supplying this functionality meets Python's
"batteries included" role and also keeps Python at the leading edge of
new technologies.
While authentication and proxy support are two very notable
features missing from Python's core HTTP processing, they are
minimally handled as part of Python's URL handling (``urllib`` and
``urllib2``). However, applications that need fine-grained or
sophisticated HTTP handling cannot make use of the features while
they reside in urllib. Refactoring these features into a location
where they can be directly associated with an HTTP connection will
improve their utility for both urllib and for sophisticated
applications.
While authentication and proxy support are two very notable features
missing from Python's core HTTP processing, they are minimally handled
as part of Python's URL handling (``urllib`` and
``urllib2``). However, applications that need fine-grained or
sophisticated HTTP handling cannot make use of the features while they
reside in urllib. Refactoring these features into a location where
they can be directly associated with an HTTP connection will improve
their utility for both urllib and for sophisticated applications.
The motivation for this PEP came from several people requesting
these features directly, and from a number of feature requests on
SourceForge. Since the exact form of the modules to be provided
and the classes/architecture used could be subject to debate, this
PEP was created to provide a focal point for those discussions.
The motivation for this PEP came from several people requesting these
features directly, and from a number of feature requests on
SourceForge. Since the exact form of the modules to be provided and
the classes/architecture used could be subject to debate, this PEP was
created to provide a focal point for those discussions.
Specification
=============
Two modules will be added to the standard library: ``httpx`` (HTTP
extended functionality), and ``davlib`` (WebDAV library).
Two modules will be added to the standard library: ``httpx`` (HTTP
extended functionality), and ``davlib`` (WebDAV library).
[ suggestions for module names are welcome; ``davlib`` has some
precedent, but something like ``webdav`` might be desirable ]
[ suggestions for module names are welcome; ``davlib`` has some
precedent, but something like ``webdav`` might be desirable ]
HTTP Authentication
-------------------
The ``httpx`` module will provide a mixin for performing HTTP
authentication (for both proxy and origin server
authentication). This mixin (``httpx.HandleAuthentication``) can be
combined with the ``HTTPConnection`` and the ``HTTPSConnection`` classes
(the mixin may possibly work with the HTTP and HTTPS compatibility
classes, but that is not a requirement).
The ``httpx`` module will provide a mixin for performing HTTP
authentication (for both proxy and origin server authentication). This
mixin (``httpx.HandleAuthentication``) can be combined with the
``HTTPConnection`` and the ``HTTPSConnection`` classes (the mixin may
possibly work with the HTTP and HTTPS compatibility classes, but that
is not a requirement).
The mixin will delegate the authentication process to one or more
"authenticator" objects, allowing multiple connections to share
authenticators. The use of a separate object allows for a long
term connection to an authentication system (e.g. LDAP). An
authenticator for the Basic and Digest mechanisms (see RFC 2617)
will be provided. User-supplied authenticator subclasses can be
registered and used by the connections.
The mixin will delegate the authentication process to one or more
"authenticator" objects, allowing multiple connections to share
authenticators. The use of a separate object allows for a long term
connection to an authentication system (e.g. LDAP). An authenticator
for the Basic and Digest mechanisms (see RFC 2617) will be
provided. User-supplied authenticator subclasses can be registered and
used by the connections.
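The PEP does not define the authenticator interface, so the following is only a minimal sketch of what a shared authenticator for the Basic scheme might look like. The class name ``BasicAuthenticator``, the method ``authorization_header``, and the ``AUTHENTICATORS`` registry are all hypothetical names chosen for illustration.

```python
import base64


class BasicAuthenticator:
    """Hypothetical authenticator for the Basic scheme (RFC 2617, section 2)."""

    scheme = "basic"

    def authorization_header(self, username, password):
        # Basic credentials are base64("username:password").
        token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
        return "Basic " + token.decode("ascii")


# Hypothetical registry keyed by lower-cased scheme name, so that several
# connections can share a single authenticator instance.
AUTHENTICATORS = {BasicAuthenticator.scheme: BasicAuthenticator()}
```

Keeping the authenticator stateless, as here, is one way to make sharing between connections safe; an authenticator that holds a long-term connection to a backing system would need more care.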
A "credentials" object (``httpx.Credentials``) is also associated with
the mixin, and stores the credentials (e.g. username and password)
needed by the authenticators. Subclasses of Credentials can be
created to hold additional information (e.g. NT domain).
A "credentials" object (``httpx.Credentials``) is also associated with
the mixin, and stores the credentials (e.g. username and password)
needed by the authenticators. Subclasses of Credentials can be created
to hold additional information (e.g. NT domain).
The mixin overrides the ``getresponse()`` method to detect ``401
(Unauthorized)`` and ``407 (Proxy Authentication Required)``
responses. When this is found, the response object, the
connection, and the credentials are passed to the authenticator
corresponding with the authentication scheme specified in the
response (multiple authenticators are tried in decreasing order of
security if multiple schemes are in the response). Each
authenticator can examine the response headers and decide whether
and how to resend the request with the correct authentication
headers. If no authenticator can successfully handle the
authentication, then an exception is raised.
The mixin overrides the ``getresponse()`` method to detect ``401
(Unauthorized)`` and ``407 (Proxy Authentication Required)``
responses. When this is found, the response object, the connection,
and the credentials are passed to the authenticator corresponding with
the authentication scheme specified in the response (multiple
authenticators are tried in decreasing order of security if multiple
schemes are in the response). Each authenticator can examine the
response headers and decide whether and how to resend the request with
the correct authentication headers. If no authenticator can
successfully handle the authentication, then an exception is raised.
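The "decreasing order of security" rule could be sketched roughly as below. The ranking table ``SCHEME_STRENGTH`` and the function ``order_challenges`` are hypothetical, and the sketch assumes that each challenge header value carries exactly one challenge, which real servers do not guarantee.

```python
# Hypothetical ranking: a lower number means a stronger scheme, tried first.
SCHEME_STRENGTH = {"digest": 0, "basic": 1}


def order_challenges(header_values):
    """Return (scheme, params_text) pairs, strongest known scheme first.

    Assumes each header value holds exactly one challenge. Unknown schemes
    are dropped because no registered authenticator could handle them.
    """
    challenges = []
    for value in header_values:
        scheme, _, params = value.strip().partition(" ")
        scheme = scheme.lower()
        if scheme in SCHEME_STRENGTH:
            challenges.append((scheme, params.strip()))
    return sorted(challenges, key=lambda item: SCHEME_STRENGTH[item[0]])
```

If the returned list is empty, the mixin would have no authenticator to try, which corresponds to the case where an exception is raised.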
Resending a request, with the appropriate credentials, is one of
the more difficult portions of the authentication system. The
difficulty arises in recording what was sent originally: the
request line, the headers, and the body. By overriding putrequest,
putheader, and endheaders, we can capture all but the body. Once
the endheaders method is called, then we capture all calls to
send() (until the next putrequest method call) to hold the body
content. The mixin will have a configurable limit for the amount
of data to hold in this fashion (e.g. only hold up to 100k of body
content). Assuming that the entire body has been stored, then we
can resend the request with the appropriate authentication
information.
Resending a request, with the appropriate credentials, is one of the
more difficult portions of the authentication system. The difficulty
arises in recording what was sent originally: the request line, the
headers, and the body. By overriding putrequest, putheader, and
endheaders, we can capture all but the body. Once the endheaders
method is called, then we capture all calls to send() (until the next
putrequest method call) to hold the body content. The mixin will have
a configurable limit for the amount of data to hold in this fashion
(e.g. only hold up to 100k of body content). Assuming that the entire
body has been stored, then we can resend the request with the
appropriate authentication information.
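The recording scheme described above might be sketched as a cooperative mixin like the one below. It is written against the modern ``http.client`` module rather than ``httplib``, and the names ``RequestRecorder``, ``max_body``, and ``can_resend`` are hypothetical. One known gap in this sketch: a body passed directly to ``endheaders()`` (as ``HTTPConnection.request()`` does) is not captured; only explicit ``send()`` calls made after ``endheaders()`` are.

```python
import http.client


class RequestRecorder:
    """Hypothetical mixin that remembers the last request so it can be resent."""

    max_body = 100 * 1024  # configurable cap on buffered body bytes

    def putrequest(self, method, url, *args, **kwargs):
        # A new request starts: forget whatever was recorded before.
        self._recorded = {"line": (method, url), "headers": [], "body": []}
        self._body_size = 0
        self._in_body = False
        self._body_complete = True
        super().putrequest(method, url, *args, **kwargs)

    def putheader(self, header, *values):
        self._recorded["headers"].append((header, values))
        super().putheader(header, *values)

    def endheaders(self, *args, **kwargs):
        super().endheaders(*args, **kwargs)
        # Only send() calls made after this point carry body content.
        self._in_body = True

    def send(self, data):
        if getattr(self, "_in_body", False) and self._body_complete:
            if self._body_size + len(data) > self.max_body:
                # Too large to replay: drop the buffer and stop recording.
                self._recorded["body"] = []
                self._body_complete = False
            else:
                self._recorded["body"].append(data)
                self._body_size += len(data)
        super().send(data)

    def can_resend(self):
        # True once the headers are finished and the whole body was kept.
        return self._in_body and self._body_complete


class RecordingHTTPConnection(RequestRecorder, http.client.HTTPConnection):
    """The mixin listed first so its overrides wrap the real methods."""
```

When ``can_resend()`` is false, the caller falls back to the behaviour where the 401 or 407 response is simply handed back.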
If the body is too large to be stored, then the ``getresponse()``
simply returns the response object, indicating the 401 or 407
error. Since the authentication information has been computed and
cached (into the Credentials object; see below), the caller can
simply regenerate the request. The mixin will attach the
appropriate credentials.
If the body is too large to be stored, then the ``getresponse()``
simply returns the response object, indicating the 401 or 407
error. Since the authentication information has been computed and
cached (into the Credentials object; see below), the caller can simply
regenerate the request. The mixin will attach the appropriate
credentials.
A "protection space" (see RFC 2617, section 1.2) is defined as a
tuple of the host, port, and authentication realm. When a request
is initially sent to an HTTP server, we do not know the
authentication realm (the realm is only returned when
authentication fails). However, we do have the path from the URL,
and that can be useful in determining the credentials to send to
the server. The Basic authentication scheme is typically set up
hierarchically: the credentials for ``/path`` can be tried for
``/path/subpath``. The Digest authentication scheme has explicit
support for the hierarchical setup. The ``httpx.Credentials`` object
will store credentials for multiple protection spaces, which can be
looked up in two different ways:
A "protection space" (see RFC 2617, section 1.2) is defined as a tuple
of the host, port, and authentication realm. When a request is
initially sent to an HTTP server, we do not know the authentication
realm (the realm is only returned when authentication fails). However,
we do have the path from the URL, and that can be useful in
determining the credentials to send to the server. The Basic
authentication scheme is typically set up hierarchically: the
credentials for ``/path`` can be tried for ``/path/subpath``. The
Digest authentication scheme has explicit support for the hierarchical
setup. The ``httpx.Credentials`` object will store credentials for
multiple protection spaces, which can be looked up in two different
ways:
1) looked up using ``(host, port, path)`` -- this lookup scheme is
used when generating a request for a path where we don't know the
authentication realm.
1. looked up using ``(host, port, path)`` -- this lookup scheme is
used when generating a request for a path where we don't know the
authentication realm.
2) looked up using ``(host, port, realm)`` -- this mechanism is used
during the authentication process when the server has specified
that the Request-URI resides within a specific realm.
2. looked up using ``(host, port, realm)`` -- this mechanism is used
during the authentication process when the server has specified that
the Request-URI resides within a specific realm.
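The two lookup schemes listed above could be sketched as follows. The PEP names the ``Credentials`` class but not its methods, so ``add``, ``lookup_realm``, and ``lookup_path`` are hypothetical, as is the choice of longest-prefix matching for the path-based lookup.

```python
class Credentials:
    """Hypothetical store of (username, password) pairs per protection space."""

    def __init__(self):
        self._by_realm = {}  # (host, port, realm) -> (username, password)
        self._by_path = {}   # (host, port) -> list of (path_prefix, realm)

    def add(self, host, port, realm, path, username, password):
        self._by_realm[(host, port, realm)] = (username, password)
        self._by_path.setdefault((host, port), []).append((path, realm))

    def lookup_realm(self, host, port, realm):
        # Used once the server has named the realm in its challenge.
        return self._by_realm.get((host, port, realm))

    def lookup_path(self, host, port, path):
        # Used before the realm is known: pick the longest registered
        # prefix that contains the request path.
        best = None
        for prefix, realm in self._by_path.get((host, port), []):
            base = prefix.rstrip("/")
            if path == base or path.startswith(base + "/"):
                if best is None or len(base) > len(best[0]):
                    best = (base, realm)
        if best is None:
            return None
        return self._by_realm[(host, port, best[1])]
```

Matching on whole path segments means that credentials registered for ``/path`` apply to ``/path/subpath`` but not to an unrelated sibling such as ``/pathology``.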
The ``HandleAuthentication`` mixin will override ``putrequest()`` to
automatically insert credentials, if available. The URL from the
putrequest is used to determine the appropriate authentication
information to use.
The ``HandleAuthentication`` mixin will override ``putrequest()`` to
automatically insert credentials, if available. The URL from the
putrequest is used to determine the appropriate authentication
information to use.
It is also important to note that two sets of credentials are
used, and stored by the mixin. One set for any proxy that may be
used, and one used for the target origin server. Since proxies do
not have paths, the protection spaces in the proxy credentials
will always use "/" for storing and looking up via a path.
It is also important to note that two sets of credentials are used,
and stored by the mixin. One set for any proxy that may be used, and
one used for the target origin server. Since proxies do not have
paths, the protection spaces in the proxy credentials will always use
"/" for storing and looking up via a path.
Proxy Handling
--------------
The ``httpx`` module will provide a mixin for using a proxy to perform
HTTP(S) operations. This mixin (``httpx.UseProxy``) can be combined
with the ``HTTPConnection`` and the ``HTTPSConnection`` classes (the mixin
may possibly work with the HTTP and HTTPS compatibility classes,
but that is not a requirement).
The ``httpx`` module will provide a mixin for using a proxy to perform
HTTP(S) operations. This mixin (``httpx.UseProxy``) can be combined
with the ``HTTPConnection`` and the ``HTTPSConnection`` classes (the
mixin may possibly work with the HTTP and HTTPS compatibility classes,
but that is not a requirement).
The mixin will record the ``(host, port)`` of the proxy to use. XXX
will be overridden to use this host/port combination for
connections and to rewrite request URLs into the absoluteURIs
referring to the origin server (these URIs are passed to the proxy
server).
The mixin will record the ``(host, port)`` of the proxy to use. XXX
will be overridden to use this host/port combination for connections
and to rewrite request URLs into the absoluteURIs referring to the
origin server (these URIs are passed to the proxy server).
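Since the method to be overridden is still left open here, the sketch below only illustrates the URL rewriting step: turning an origin-server path into the absoluteURI form placed on the request line sent to a proxy. The helper name ``to_absolute_uri`` and its parameters are hypothetical.

```python
def to_absolute_uri(scheme, host, port, path):
    """Build the absoluteURI for a request line that goes to a proxy."""
    default_port = {"http": 80, "https": 443}.get(scheme)
    # Omit the port when it is the default for the scheme.
    netloc = host if port in (None, default_port) else f"{host}:{port}"
    if not path.startswith("/"):
        path = "/" + path
    return f"{scheme}://{netloc}{path}"
```

A proxy-aware connection would open its socket to the recorded proxy ``(host, port)`` and pass the result of this helper, instead of the bare path, as the request URL.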
Proxy authentication is handled by the ``httpx.HandleAuthentication``
class since a user may directly use ``HTTP(S)Connection`` to speak
with proxies.
Proxy authentication is handled by the ``httpx.HandleAuthentication``
class since a user may directly use ``HTTP(S)Connection`` to speak
with proxies.
WebDAV Features
---------------
The ``davlib`` module will provide a mixin for sending WebDAV requests
to a WebDAV-enabled server. This mixin (``davlib.DAVClient``) can be
combined with the ``HTTPConnection`` and the ``HTTPSConnection`` classes
(the mixin may possibly work with the HTTP and HTTPS compatibility
classes, but that is not a requirement).
The ``davlib`` module will provide a mixin for sending WebDAV requests
to a WebDAV-enabled server. This mixin (``davlib.DAVClient``) can be
combined with the ``HTTPConnection`` and the ``HTTPSConnection``
classes (the mixin may possibly work with the HTTP and HTTPS
compatibility classes, but that is not a requirement).
The mixin provides methods to perform the various HTTP methods
defined by HTTP in RFC 2616, and by WebDAV in RFC 2518.
The mixin provides methods to perform the various HTTP methods defined
by HTTP in RFC 2616, and by WebDAV in RFC 2518.
A custom response object is used to decode ``207 (Multi-Status)``
responses. The response object will use the standard library's xml
package to parse the multistatus XML information, producing a
simple structure of objects to hold the multistatus data. Multiple
parsing schemes will be tried/used, in order of decreasing speed.
A custom response object is used to decode ``207 (Multi-Status)``
responses. The response object will use the standard library's xml
package to parse the multistatus XML information, producing a simple
structure of objects to hold the multistatus data. Multiple parsing
schemes will be tried/used, in order of decreasing speed.
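As a rough illustration of decoding a ``207 (Multi-Status)`` body, the sketch below uses ``xml.etree.ElementTree``, which is an assumption on our part (it is one parser in today's standard library ``xml`` package, not necessarily one of the parsing schemes intended here). The function name ``parse_multistatus`` and its flat ``(href, status)`` result are hypothetical simplifications; only the first ``propstat`` status of each response is reported.

```python
import xml.etree.ElementTree as ET

DAV = "{DAV:}"  # ElementTree spelling of the "DAV:" XML namespace


def parse_multistatus(xml_text):
    """Return a list of (href, status_line) pairs from a 207 response body."""
    root = ET.fromstring(xml_text)
    results = []
    for response in root.findall(DAV + "response"):
        href = response.findtext(DAV + "href")
        # The status may sit directly under <response> or inside <propstat>.
        status = response.findtext(DAV + "status")
        if status is None:
            status = response.findtext(DAV + "propstat/" + DAV + "status")
        results.append((href, status))
    return results
```

A fuller response object would keep the per-property data from each ``propstat`` element rather than collapsing it to a single status line.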
Reference Implementation
========================
The actual (future/final) implementation is being developed in the
``/nondist/sandbox/Lib`` directory, until it is accepted and moved
into the main Lib directory.
The actual (future/final) implementation is being developed in the
``/nondist/sandbox/Lib`` directory, until it is accepted and moved
into the main Lib directory.
References