Per PEP 12, the body of paragraphs should not be indented. This commit
reflows all of the text, but makes no content changes. The numbered list was indented appropriately for recognition, but (again) no content changes occurred.
parent 8bff1bdaa0
commit b30961a2fe

260 pep-0268.txt

@@ -14,186 +14,180 @@ Content-Type: text/x-rst
 Abstract
 ========

-This PEP discusses new modules and extended functionality for
-Python's HTTP support. Notably, the addition of authenticated
-requests, proxy support, authenticated proxy usage, and WebDAV_
-capabilities.
+This PEP discusses new modules and extended functionality for Python's
+HTTP support. Notably, the addition of authenticated requests, proxy
+support, authenticated proxy usage, and WebDAV_ capabilities.


 Rationale
 =========

-Python has been quite popular as a result of its "batteries
-included" positioning. One of the most heavily used protocols,
-HTTP (see RFC 2616), has been included with Python for years
-(``httplib``). However, this support has not kept up with the full
-needs and requirements of many HTTP-based applications and
-systems. In addition, new protocols based on HTTP, such as WebDAV
-and XML-RPC, are becoming useful and are seeing increasing
-usage. Supplying this functionality meets Python's "batteries
-included" role and also keeps Python at the leading edge of new
-technologies.
+Python has been quite popular as a result of its "batteries included"
+positioning. One of the most heavily used protocols, HTTP (see RFC
+2616), has been included with Python for years (``httplib``). However,
+this support has not kept up with the full needs and requirements of
+many HTTP-based applications and systems. In addition, new protocols
+based on HTTP, such as WebDAV and XML-RPC, are becoming useful and are
+seeing increasing usage. Supplying this functionality meets Python's
+"batteries included" role and also keeps Python at the leading edge of
+new technologies.

-While authentication and proxy support are two very notable
-features missing from Python's core HTTP processing, they are
-minimally handled as part of Python's URL handling (``urllib`` and
-``urllib2``). However, applications that need fine-grained or
-sophisticated HTTP handling cannot make use of the features while
-they reside in urllib. Refactoring these features into a location
-where they can be directly associated with an HTTP connection will
-improve their utility for both urllib and for sophisticated
-applications.
+While authentication and proxy support are two very notable features
+missing from Python's core HTTP processing, they are minimally handled
+as part of Python's URL handling (``urllib`` and
+``urllib2``). However, applications that need fine-grained or
+sophisticated HTTP handling cannot make use of the features while they
+reside in urllib. Refactoring these features into a location where
+they can be directly associated with an HTTP connection will improve
+their utility for both urllib and for sophisticated applications.

-The motivation for this PEP was from several people requesting
-these features directly, and from a number of feature requests on
-SourceForge. Since the exact form of the modules to be provided
-and the classes/architecture used could be subject to debate, this
-PEP was created to provide a focal point for those discussions.
+The motivation for this PEP was from several people requesting these
+features directly, and from a number of feature requests on
+SourceForge. Since the exact form of the modules to be provided and
+the classes/architecture used could be subject to debate, this PEP was
+created to provide a focal point for those discussions.


 Specification
 =============

-Two modules will be added to the standard library: ``httpx`` (HTTP
-extended functionality), and ``davlib`` (WebDAV library).
+Two modules will be added to the standard library: ``httpx`` (HTTP
+extended functionality), and ``davlib`` (WebDAV library).

-[ suggestions for module names are welcome; ``davlib`` has some
-precedent, but something like ``webdav`` might be desirable ]
+[ suggestions for module names are welcome; ``davlib`` has some
+precedent, but something like ``webdav`` might be desirable ]


 HTTP Authentication
 -------------------

-The ``httpx`` module will provide a mixin for performing HTTP
-authentication (for both proxy and origin server
-authentication). This mixin (``httpx.HandleAuthentication``) can be
-combined with the ``HTTPConnection`` and the ``HTTPSConnection`` classes
-(the mixin may possibly work with the HTTP and HTTPS compatibility
-classes, but that is not a requirement).
+The ``httpx`` module will provide a mixin for performing HTTP
+authentication (for both proxy and origin server authentication). This
+mixin (``httpx.HandleAuthentication``) can be combined with the
+``HTTPConnection`` and the ``HTTPSConnection`` classes (the mixin may
+possibly work with the HTTP and HTTPS compatibility classes, but that
+is not a requirement).

-The mixin will delegate the authentication process to one or more
-"authenticator" objects, allowing multiple connections to share
-authenticators. The use of a separate object allows for a long
-term connection to an authentication system (e.g. LDAP). An
-authenticator for the Basic and Digest mechanisms (see RFC 2617)
-will be provided. User-supplied authenticator subclasses can be
-registered and used by the connections.
+The mixin will delegate the authentication process to one or more
+"authenticator" objects, allowing multiple connections to share
+authenticators. The use of a separate object allows for a long term
+connection to an authentication system (e.g. LDAP). An authenticator
+for the Basic and Digest mechanisms (see RFC 2617) will be
+provided. User-supplied authenticator subclasses can be registered and
+used by the connections.
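
The delegation described above could look roughly like the following sketch. It is written for Python 3 rather than the ``httplib`` era of the PEP, and every class name in it (``Credentials``, ``BasicAuthenticator``, ``AuthenticatorRegistry``) is a hypothetical stand-in, not the API the PEP defines. Only the Basic header format comes from RFC 2617.

```python
import base64


class Credentials:
    """Hypothetical stand-in for httpx.Credentials: one username/password."""

    def __init__(self, username, password):
        self.username = username
        self.password = password


class BasicAuthenticator:
    """Builds the Authorization value for the Basic scheme (RFC 2617)."""

    scheme = "basic"

    def authorization(self, credentials):
        # Basic credentials are base64("username:password").
        raw = f"{credentials.username}:{credentials.password}".encode("utf-8")
        return "Basic " + base64.b64encode(raw).decode("ascii")


class AuthenticatorRegistry:
    """Maps scheme names to authenticator objects.

    One registry (and the authenticators in it) can be shared by any
    number of connections, which is the point of keeping them separate.
    """

    def __init__(self):
        self._by_scheme = {}

    def register(self, authenticator):
        self._by_scheme[authenticator.scheme.lower()] = authenticator

    def lookup(self, scheme):
        # Returns None when no authenticator handles the scheme.
        return self._by_scheme.get(scheme.lower())
```

A user-supplied authenticator would, under these assumptions, only need a ``scheme`` attribute and an ``authorization()`` method before being passed to ``register()``.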

-A "credentials" object (``httpx.Credentials``) is also associated with
-the mixin, and stores the credentials (e.g. username and password)
-needed by the authenticators. Subclasses of Credentials can be
-created to hold additional information (e.g. NT domain).
+A "credentials" object (``httpx.Credentials``) is also associated with
+the mixin, and stores the credentials (e.g. username and password)
+needed by the authenticators. Subclasses of Credentials can be created
+to hold additional information (e.g. NT domain).

-The mixin overrides the ``getresponse()`` method to detect ``401
-(Unauthorized)`` and ``407 (Proxy Authentication Required)``
-responses. When this is found, the response object, the
-connection, and the credentials are passed to the authenticator
-corresponding with the authentication scheme specified in the
-response (multiple authenticators are tried in decreasing order of
-security if multiple schemes are in the response). Each
-authenticator can examine the response headers and decide whether
-and how to resend the request with the correct authentication
-headers. If no authenticator can successfully handle the
-authentication, then an exception is raised.
+The mixin overrides the ``getresponse()`` method to detect ``401
+(Unauthorized)`` and ``407 (Proxy Authentication Required)``
+responses. When this is found, the response object, the connection,
+and the credentials are passed to the authenticator corresponding with
+the authentication scheme specified in the response (multiple
+authenticators are tried in decreasing order of security if multiple
+schemes are in the response). Each authenticator can examine the
+response headers and decide whether and how to resend the request with
+the correct authentication headers. If no authenticator can
+successfully handle the authentication, then an exception is raised.
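
One way to picture the "decreasing order of security" step is the sketch below. It assumes a simple ranking in which Digest outranks Basic, and it assumes each challenge header value carries exactly one challenge; real servers may pack several challenges into one header, which this sketch does not parse. The names ``SCHEME_STRENGTH`` and ``offered_schemes`` are illustrative only.

```python
# Assumed ranking: a higher number means a stronger scheme.
SCHEME_STRENGTH = {"digest": 2, "basic": 1}

# 401 challenges arrive in WWW-Authenticate, 407 in Proxy-Authenticate.
CHALLENGE_HEADER = {401: "WWW-Authenticate", 407: "Proxy-Authenticate"}


def offered_schemes(status, headers):
    """Return the challenge schemes for a 401/407, strongest first.

    `headers` is a list of (name, value) pairs; each value is assumed to
    hold a single challenge such as 'Basic realm="x"'.
    """
    wanted = CHALLENGE_HEADER.get(status)
    if wanted is None:
        return []  # not an authentication failure
    schemes = []
    for name, value in headers:
        if name.lower() == wanted.lower() and value.strip():
            # The scheme is the first whitespace-delimited token.
            schemes.append(value.split(None, 1)[0].lower())
    # Unknown schemes get strength 0 and therefore sort last.
    return sorted(schemes, key=lambda s: SCHEME_STRENGTH.get(s, 0), reverse=True)
```

An overridden ``getresponse()`` could walk this list, hand the response to the matching authenticator for each scheme, and raise if none of them manages to resend the request successfully.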

-Resending a request, with the appropriate credentials, is one of
-the more difficult portions of the authentication system. The
-difficulty arises in recording what was sent originally: the
-request line, the headers, and the body. By overriding putrequest,
-putheader, and endheaders, we can capture all but the body. Once
-the endheaders method is called, then we capture all calls to
-send() (until the next putrequest method call) to hold the body
-content. The mixin will have a configurable limit for the amount
-of data to hold in this fashion (e.g. only hold up to 100k of body
-content). Assuming that the entire body has been stored, then we
-can resend the request with the appropriate authentication
-information.
+Resending a request, with the appropriate credentials, is one of the
+more difficult portions of the authentication system. The difficulty
+arises in recording what was sent originally: the request line, the
+headers, and the body. By overriding putrequest, putheader, and
+endheaders, we can capture all but the body. Once the endheaders
+method is called, then we capture all calls to send() (until the next
+putrequest method call) to hold the body content. The mixin will have
+a configurable limit for the amount of data to hold in this fashion
+(e.g. only hold up to 100k of body content). Assuming that the entire
+body has been stored, then we can resend the request with the
+appropriate authentication information.
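
The recording scheme above can be sketched without touching the network. The class below is a hypothetical recorder, not the mixin itself: it mirrors the ``putrequest``/``putheader``/``endheaders``/``send`` call sequence, keeps body data only up to an assumed limit, and reports whether a resend is possible.

```python
class RequestRecorder:
    """Records one request so it can be replayed after a 401/407."""

    def __init__(self, max_body=100 * 1024):
        self.max_body = max_body  # assumed default of 100k, per the text
        self.reset()

    def reset(self):
        self.request_line = None
        self.headers = []
        self.body = []
        self.body_size = 0
        self.headers_done = False
        self.complete = True  # False once the body exceeded max_body

    def putrequest(self, method, url):
        # A new request discards whatever was recorded before.
        self.reset()
        self.request_line = (method, url)

    def putheader(self, name, value):
        self.headers.append((name, value))

    def endheaders(self):
        self.headers_done = True

    def send(self, data):
        # Only data sent after endheaders() counts as body content.
        if not self.headers_done or not self.complete:
            return
        if self.body_size + len(data) > self.max_body:
            # Too large to hold: drop the copy and remember that we did.
            self.body = []
            self.complete = False
            return
        self.body.append(data)
        self.body_size += len(data)

    def can_resend(self):
        return self.request_line is not None and self.complete
```

When ``can_resend()`` is false, the only option left is the fallback the text goes on to describe: return the 401 or 407 response and let the caller regenerate the request.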

-If the body is too large to be stored, then the ``getresponse()``
-simply returns the response object, indicating the 401 or 407
-error. Since the authentication information has been computed and
-cached (into the Credentials object; see below), the caller can
-simply regenerate the request. The mixin will attach the
-appropriate credentials.
+If the body is too large to be stored, then the ``getresponse()``
+simply returns the response object, indicating the 401 or 407
+error. Since the authentication information has been computed and
+cached (into the Credentials object; see below), the caller can simply
+regenerate the request. The mixin will attach the appropriate
+credentials.

-A "protection space" (see RFC 2617, section 1.2) is defined as a
-tuple of the host, port, and authentication realm. When a request
-is initially sent to an HTTP server, we do not know the
-authentication realm (the realm is only returned when
-authentication fails). However, we do have the path from the URL,
-and that can be useful in determining the credentials to send to
-the server. The Basic authentication scheme is typically set up
-hierarchically: the credentials for ``/path`` can be tried for
-``/path/subpath``. The Digest authentication scheme has explicit
-support for the hierarchical setup. The ``httpx.Credentials`` object
-will store credentials for multiple protection spaces, and can be
-looked up in two different ways:
+A "protection space" (see RFC 2617, section 1.2) is defined as a tuple
+of the host, port, and authentication realm. When a request is
+initially sent to an HTTP server, we do not know the authentication
+realm (the realm is only returned when authentication fails). However,
+we do have the path from the URL, and that can be useful in
+determining the credentials to send to the server. The Basic
+authentication scheme is typically set up hierarchically: the
+credentials for ``/path`` can be tried for ``/path/subpath``. The
+Digest authentication scheme has explicit support for the hierarchical
+setup. The ``httpx.Credentials`` object will store credentials for
+multiple protection spaces, and can be looked up in two different
+ways:

-1) looked up using ``(host, port, path)`` -- this lookup scheme is
-used when generating a request for a path where we don't know the
-authentication realm.
+1. looked up using ``(host, port, path)`` -- this lookup scheme is
+used when generating a request for a path where we don't know the
+authentication realm.

-2) looked up using ``(host, port, realm)`` -- this mechanism is used
-during the authentication process when the server has specified
-that the Request-URI resides within a specific realm.
+2. looked up using ``(host, port, realm)`` -- this mechanism is used
+during the authentication process when the server has specified that
+the Request-URI resides within a specific realm.
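
The two lookups listed above might be sketched as follows. ``CredentialStore`` and its method names are hypothetical; the PEP names only ``httpx.Credentials`` and does not spell out its interface. The path lookup walks up the hierarchy one segment at a time, which is one plausible reading of "the credentials for ``/path`` can be tried for ``/path/subpath``".

```python
class CredentialStore:
    """Stores (username, password) pairs for several protection spaces."""

    def __init__(self):
        self._by_realm = {}
        self._by_path = {}

    def add(self, host, port, realm, path, username, password):
        entry = (username, password)
        self._by_realm[(host, port, realm)] = entry
        self._by_path[(host, port, path.rstrip("/") or "/")] = entry

    def for_realm(self, host, port, realm):
        """Lookup 2: used once the server has named the realm."""
        return self._by_realm.get((host, port, realm))

    def for_path(self, host, port, path):
        """Lookup 1: used before the realm is known.

        Tries /a/b/c, then /a/b, then /a, then /.
        """
        if not path.startswith("/"):
            path = "/" + path
        current = path.rstrip("/") or "/"
        while True:
            entry = self._by_path.get((host, port, current))
            if entry is not None or current == "/":
                return entry
            # Drop the last path segment and try the parent.
            current = current.rsplit("/", 1)[0] or "/"
```

Matching on whole segments rather than string prefixes keeps credentials stored for ``/path`` from being offered to an unrelated resource such as ``/pathology``.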

-The ``HandleAuthentication`` mixin will override ``putrequest()`` to
-automatically insert credentials, if available. The URL from the
-putrequest is used to determine the appropriate authentication
-information to use.
+The ``HandleAuthentication`` mixin will override ``putrequest()`` to
+automatically insert credentials, if available. The URL from the
+putrequest is used to determine the appropriate authentication
+information to use.

-It is also important to note that two sets of credentials are
-used, and stored by the mixin. One set for any proxy that may be
-used, and one used for the target origin server. Since proxies do
-not have paths, the protection spaces in the proxy credentials
-will always use "/" for storing and looking up via a path.
+It is also important to note that two sets of credentials are used,
+and stored by the mixin. One set for any proxy that may be used, and
+one used for the target origin server. Since proxies do not have
+paths, the protection spaces in the proxy credentials will always use
+"/" for storing and looking up via a path.


 Proxy Handling
 --------------

-The ``httpx`` module will provide a mixin for using a proxy to perform
-HTTP(S) operations. This mixin (``httpx.UseProxy``) can be combined
-with the ``HTTPConnection`` and the ``HTTPSConnection`` classes (the mixin
-may possibly work with the HTTP and HTTPS compatibility classes,
-but that is not a requirement).
+The ``httpx`` module will provide a mixin for using a proxy to perform
+HTTP(S) operations. This mixin (``httpx.UseProxy``) can be combined
+with the ``HTTPConnection`` and the ``HTTPSConnection`` classes (the
+mixin may possibly work with the HTTP and HTTPS compatibility classes,
+but that is not a requirement).

-The mixin will record the ``(host, port)`` of the proxy to use. XXX
-will be overridden to use this host/port combination for
-connections and to rewrite request URLs into the absoluteURIs
-referring to the origin server (these URIs are passed to the proxy
-server).
+The mixin will record the ``(host, port)`` of the proxy to use. XXX
+will be overridden to use this host/port combination for connections
+and to rewrite request URLs into the absoluteURIs referring to the
+origin server (these URIs are passed to the proxy server).
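
The rewrite into an absoluteURI can be illustrated with a small helper. This is only a sketch: the function name ``absolute_uri`` is hypothetical, the text deliberately leaves the overridden method as "XXX", and special request targets such as ``*`` for ``OPTIONS`` are not handled here.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def absolute_uri(scheme, host, port, url):
    """Turn an origin-server request URL into the form sent to a proxy.

    `scheme`, `host` and `port` describe the origin server, not the proxy.
    """
    if urlsplit(url).scheme:
        return url  # already absolute, pass it through unchanged
    # Leave the port out when it is the default for the scheme.
    if port is None or port == DEFAULT_PORTS.get(scheme):
        authority = host
    else:
        authority = f"{host}:{port}"
    if not url.startswith("/"):
        url = "/" + url
    return f"{scheme}://{authority}{url}"
```

For example, a request for ``/index.html`` on ``www.example.com`` port 80 would be sent to the proxy as ``http://www.example.com/index.html``, while the TCP connection itself goes to the recorded proxy ``(host, port)``.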

-Proxy authentication is handled by the ``httpx.HandleAuthentication``
-class since a user may directly use ``HTTP(S)Connection`` to speak
-with proxies.
+Proxy authentication is handled by the ``httpx.HandleAuthentication``
+class since a user may directly use ``HTTP(S)Connection`` to speak
+with proxies.


 WebDAV Features
 ---------------

-The ``davlib`` module will provide a mixin for sending WebDAV requests
-to a WebDAV-enabled server. This mixin (``davlib.DAVClient``) can be
-combined with the ``HTTPConnection`` and the ``HTTPSConnection`` classes
-(the mixin may possibly work with the HTTP and HTTPS compatibility
-classes, but that is not a requirement).
+The ``davlib`` module will provide a mixin for sending WebDAV requests
+to a WebDAV-enabled server. This mixin (``davlib.DAVClient``) can be
+combined with the ``HTTPConnection`` and the ``HTTPSConnection``
+classes (the mixin may possibly work with the HTTP and HTTPS
+compatibility classes, but that is not a requirement).

-The mixin provides methods to perform the various HTTP methods
-defined by HTTP in RFC 2616, and by WebDAV in RFC 2518.
+The mixin provides methods to perform the various HTTP methods defined
+by HTTP in RFC 2616, and by WebDAV in RFC 2518.

-A custom response object is used to decode ``207 (Multi-Status)``
-responses. The response object will use the standard library's xml
-package to parse the multistatus XML information, producing a
-simple structure of objects to hold the multistatus data. Multiple
-parsing schemes will be tried/used, in order of decreasing speed.
+A custom response object is used to decode ``207 (Multi-Status)``
+responses. The response object will use the standard library's xml
+package to parse the multistatus XML information, producing a simple
+structure of objects to hold the multistatus data. Multiple parsing
+schemes will be tried/used, in order of decreasing speed.
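
As an illustration of decoding a ``207 (Multi-Status)`` body, the sketch below uses ``xml.etree.ElementTree`` from today's standard library, which is an assumption: the text says only "the standard library's xml package". It extracts just the ``href`` and the response-level ``status`` of each ``response`` element and ignores ``propstat``, so it is far simpler than a full multistatus structure. The function name ``parse_multistatus`` and the sample document are illustrative.

```python
import xml.etree.ElementTree as ET

DAV = "{DAV:}"  # ElementTree spelling of the "DAV:" namespace


def parse_multistatus(xml_text):
    """Return a list of (href, status_line) pairs from a multistatus body."""
    root = ET.fromstring(xml_text)
    if root.tag != DAV + "multistatus":
        raise ValueError("not a DAV: multistatus document")
    results = []
    for response in root.findall(DAV + "response"):
        href = response.findtext(DAV + "href", default="").strip()
        status = response.findtext(DAV + "status")
        results.append((href, status.strip() if status else None))
    return results


SAMPLE = """\
<D:multistatus xmlns:D="DAV:">
  <D:response>
    <D:href>http://www.example.com/a</D:href>
    <D:status>HTTP/1.1 423 Locked</D:status>
  </D:response>
</D:multistatus>"""

for href, status in parse_multistatus(SAMPLE):
    print(href, status)
```

Trying several parsers "in order of decreasing speed" could then mean attempting the fastest available one first and falling back to a slower one when it cannot be imported.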


 Reference Implementation
 ========================

-The actual (future/final) implementation is being developed in the
-``/nondist/sandbox/Lib`` directory, until it is accepted and moved
-into the main Lib directory.
+The actual (future/final) implementation is being developed in the
+``/nondist/sandbox/Lib`` directory, until it is accepted and moved
+into the main Lib directory.


 References