Migrated remaining chapters of the HttpClient tutorial from WIKI

git-svn-id: https://svn.apache.org/repos/asf/httpcomponents/httpclient/trunk@792267 13f79535-47bb-0310-9956-ffa450edef68
Oleg Kalnichevski 2009-07-08 19:07:17 +00:00
parent de445e3232
commit 4c9f497316
9 changed files with 1962 additions and 42 deletions


@ -193,7 +193,6 @@
<plugin>
<groupId>com.agilejava.docbkx</groupId>
<artifactId>docbkx-maven-plugin</artifactId>
<version>2.0.8</version>
<executions>
<execution>
<goals>

223
src/docbkx/advanced.xml Normal file

@ -0,0 +1,223 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE preface PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">
<!--
====================================================================
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
====================================================================
-->
<chapter>
<title>Advanced topics</title>
<section>
<title>Custom client connections</title>
<para>In certain situations it may be necessary to customize the way HTTP messages get
transmitted across the wire beyond what is possible using HTTP parameters in
order to be able to deal with non-standard, non-compliant behaviours. For instance, for web
crawlers it may be necessary to force HttpClient into accepting malformed response heads
in order to salvage the content of the messages.</para>
<para>Usually the process of plugging in a custom message parser or a custom connection
implementation involves several steps:</para>
<itemizedlist>
<listitem>
<para>Provide a custom <interfacename>LineParser</interfacename> /
<interfacename>LineFormatter</interfacename> interface implementation.
Implement message parsing / formatting logic as required.</para>
<programlisting><![CDATA[
class MyLineParser extends BasicLineParser {
@Override
public Header parseHeader(
final CharArrayBuffer buffer) throws ParseException {
try {
return super.parseHeader(buffer);
} catch (ParseException ex) {
// Suppress the ParseException and keep the raw line as an "invalid" header
return new BasicHeader("invalid", buffer.toString());
}
}
}
]]></programlisting>
</listitem>
<listitem>
<para>Provide a custom <interfacename>OperatedClientConnection</interfacename>
implementation. Replace default request / response parsers, request / response
formatters with custom ones as required. Implement different message writing /
reading code if necessary.</para>
<programlisting><![CDATA[
class MyClientConnection extends DefaultClientConnection {
@Override
protected HttpMessageParser createResponseParser(
final SessionInputBuffer buffer,
final HttpResponseFactory responseFactory,
final HttpParams params) {
return new DefaultResponseParser(
buffer,
new MyLineParser(),
responseFactory,
params);
}
}
]]></programlisting>
</listitem>
<listitem>
<para>Provide a custom <interfacename>ClientConnectionOperator</interfacename>
interface implementation in order to create connections of the new class. Implement
different socket initialization code if necessary.</para>
<programlisting><![CDATA[
class MyClientConnectionOperator extends DefaultClientConnectionOperator {
public MyClientConnectionOperator(final SchemeRegistry sr) {
super(sr);
}
@Override
public OperatedClientConnection createConnection() {
return new MyClientConnection();
}
}
]]></programlisting>
</listitem>
<listitem>
<para>Provide a custom <interfacename>ClientConnectionManager</interfacename>
interface implementation in order to create a connection operator of the new
class.</para>
<programlisting><![CDATA[
class MyClientConnManager extends SingleClientConnManager {
public MyClientConnManager(
final HttpParams params,
final SchemeRegistry sr) {
super(params, sr);
}
@Override
protected ClientConnectionOperator createConnectionOperator(
final SchemeRegistry sr) {
return new MyClientConnectionOperator(sr);
}
}
]]></programlisting>
</listitem>
</itemizedlist>
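<para>The fallback idea behind the custom line parser above can be tried in isolation. The
following is a minimal sketch that uses only the JDK and does not depend on HttpClient; the
class <classname>LenientHeaderSketch</classname> and its methods are hypothetical names
introduced for illustration only.</para>

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

// Stand-alone illustration only; the class and method names are hypothetical
// and are not part of HttpClient.
class LenientHeaderSketch {

    // Strict variant: rejects any line that is not of the form "Name: value".
    static Map.Entry<String, String> parseStrict(String line) {
        int colon = line.indexOf(':');
        String name = colon > 0 ? line.substring(0, colon).trim() : "";
        if (name.isEmpty()) {
            throw new IllegalArgumentException("Invalid header: " + line);
        }
        return new SimpleImmutableEntry<>(name, line.substring(colon + 1).trim());
    }

    // Lenient variant: falls back to a placeholder header instead of failing,
    // so that the rest of the message can still be processed.
    static Map.Entry<String, String> parseLenient(String line) {
        try {
            return parseStrict(line);
        } catch (IllegalArgumentException ex) {
            return new SimpleImmutableEntry<>("invalid", line);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseLenient("Content-Type: text/html"));
        System.out.println(parseLenient("garbage without a colon"));
    }
}
```

<para>Keeping the raw line under a placeholder name means the malformed input is not lost
and can still be inspected later.</para>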
</section>
<section>
<title>Stateful HTTP connections</title>
<para>While the HTTP specification assumes that session state information is always embedded in
HTTP messages in the form of HTTP cookies and therefore HTTP connections are always
stateless, this assumption does not always hold true in real life. There are cases when
HTTP connections are created with a particular user identity or within a particular
security context and therefore cannot be shared with other users and can be reused by
the same user only. Examples of such stateful HTTP connections are
<literal>NTLM</literal> authenticated connections and SSL connections with client
certificate authentication.</para>
<section>
<title>User token handler</title>
<para>HttpClient relies on the <interfacename>UserTokenHandler</interfacename> interface to
determine if the given execution context is user specific or not. The token object
returned by this handler is expected to uniquely identify the current user if the
context is user specific or to be null if the context does not contain any resources
or details specific to the current user. The user token will be used to ensure that
user specific resources will not be shared with or reused by other users.</para>
<para>The default implementation of the <interfacename>UserTokenHandler</interfacename>
interface uses an instance of the Principal class to represent a state object for HTTP
connections, if it can be obtained from the given execution context.
<classname>DefaultUserTokenHandler</classname> will use the user principal of
connection based authentication schemes such as <literal>NTLM</literal> or that of
the SSL session with client authentication turned on. If both are unavailable, a null
token will be returned.</para>
<para>Users can provide a custom implementation if the default one does not satisfy
their needs:</para>
<programlisting><![CDATA[
DefaultHttpClient httpclient = new DefaultHttpClient();
httpclient.setUserTokenHandler(new UserTokenHandler() {
public Object getUserToken(HttpContext context) {
return context.getAttribute("my-token");
}
});
]]></programlisting>
</section>
<section>
<title>User token and execution context</title>
<para>In the course of HTTP request execution HttpClient adds the following user
identity related objects to the execution context: </para>
<itemizedlist>
<listitem>
<formalpara>
<title>'http.user-token':</title>
<para>Object instance representing the actual user identity, usually
expected to be an instance of the <interfacename>Principal</interfacename>
interface</para>
</formalpara>
</listitem>
</itemizedlist>
<para>One can find out whether or not the connection used to execute the request was
stateful by examining the content of the local HTTP context after the request has
been executed.</para>
<programlisting><![CDATA[
DefaultHttpClient httpclient = new DefaultHttpClient();
HttpContext localContext = new BasicHttpContext();
HttpGet httpget = new HttpGet("http://localhost:8080/");
HttpResponse response = httpclient.execute(httpget, localContext);
HttpEntity entity = response.getEntity();
if (entity != null) {
entity.consumeContent();
}
Object userToken = localContext.getAttribute(ClientContext.USER_TOKEN);
System.out.println(userToken);
]]></programlisting>
<section>
<title>Persistent stateful connections</title>
<para>Please note that a persistent connection that carries a state object can be reused
only if the same state object is bound to the execution context when requests
are executed. So, it is really important to ensure that either the same context is
reused for execution of subsequent HTTP requests by the same user or the user
token is bound to the context prior to request execution.</para>
<programlisting><![CDATA[
DefaultHttpClient httpclient = new DefaultHttpClient();
HttpContext localContext1 = new BasicHttpContext();
HttpGet httpget1 = new HttpGet("http://localhost:8080/");
HttpResponse response1 = httpclient.execute(httpget1, localContext1);
HttpEntity entity1 = response1.getEntity();
if (entity1 != null) {
entity1.consumeContent();
}
Principal principal = (Principal) localContext1.getAttribute(
ClientContext.USER_TOKEN);
HttpContext localContext2 = new BasicHttpContext();
localContext2.setAttribute(ClientContext.USER_TOKEN, principal);
HttpGet httpget2 = new HttpGet("http://localhost:8080/");
HttpResponse response2 = httpclient.execute(httpget2, localContext2);
HttpEntity entity2 = response2.getEntity();
if (entity2 != null) {
entity2.consumeContent();
}
]]></programlisting>
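<para>The reuse rule described above can be pictured with a small stand-alone sketch. It
assumes a simplified pool in which an idle connection is handed out only when its state
object equals the requested user token; <classname>StatefulPoolSketch</classname> and
<classname>PooledConnection</classname> are hypothetical names, and the connection
managers shipped with HttpClient may apply additional rules.</para>

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Objects;

// Simplified illustration; not the HttpClient connection manager.
class StatefulPoolSketch {

    static final class PooledConnection {
        final String id;
        final Object state; // user token, or null for a stateless connection

        PooledConnection(String id, Object state) {
            this.id = id;
            this.state = state;
        }
    }

    private final List<PooledConnection> idle = new ArrayList<>();

    // Returns a connection to the pool together with the state it carries.
    void release(PooledConnection conn) {
        idle.add(conn);
    }

    // Hands out an idle connection only if its state equals the given token;
    // returns null when no such connection is available.
    PooledConnection lease(Object userToken) {
        Iterator<PooledConnection> it = idle.iterator();
        while (it.hasNext()) {
            PooledConnection conn = it.next();
            if (Objects.equals(conn.state, userToken)) {
                it.remove();
                return conn;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        StatefulPoolSketch pool = new StatefulPoolSketch();
        pool.release(new PooledConnection("conn-1", "alice"));
        System.out.println(pool.lease("bob") == null);   // other user: no reuse
        System.out.println(pool.lease("alice") != null); // same user: reused
    }
}
```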
</section>
</section>
</section>
</chapter>


@ -0,0 +1,318 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE preface PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">
<!--
====================================================================
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
====================================================================
-->
<chapter>
<title>HTTP authentication</title>
<para>HttpClient provides full support for authentication schemes defined by the HTTP standard
specification. HttpClient's authentication framework can also be extended to support
non-standard authentication schemes such as <literal>NTLM</literal> and
<literal>SPNEGO</literal>.</para>
<section>
<title>User credentials</title>
<para>Any process of user authentication requires a set of credentials that can be used to
establish user identity. In the simplest form user credentials can be just a user name /
password pair. <classname>UsernamePasswordCredentials</classname> represents a set of
credentials consisting of a security principal and a password in clear text. This
implementation is sufficient for standard authentication schemes defined by the HTTP
standard specification.</para>
<programlisting><![CDATA[
UsernamePasswordCredentials creds = new UsernamePasswordCredentials("user", "pwd");
System.out.println(creds.getUserPrincipal().getName());
System.out.println(creds.getPassword());
]]></programlisting>
<para>stdout &gt;</para>
<programlisting><![CDATA[
user
pwd
]]></programlisting>
<para><classname>NTCredentials</classname> is a Microsoft Windows specific implementation
that includes, in addition to the user name / password pair, a set of additional Windows
specific attributes such as the name of the user domain, as in a Microsoft Windows network
the same user can belong to multiple domains with a different set of
authorizations.</para>
<programlisting><![CDATA[
NTCredentials creds = new NTCredentials("user", "pwd", "workstation", "domain");
System.out.println(creds.getUserPrincipal().getName());
System.out.println(creds.getPassword());
]]></programlisting>
<para>stdout &gt;</para>
<programlisting><![CDATA[
DOMAIN/user
pwd
]]></programlisting>
</section>
<section>
<title>Authentication schemes</title>
<para>The <interfacename>AuthScheme</interfacename> interface represents an abstract
challenge-response oriented authentication scheme. An authentication scheme is expected
to support the following functions:</para>
<itemizedlist>
<listitem>
<para>Parse and process the challenge sent by the target server in response to
a request for a protected resource.</para>
</listitem>
<listitem>
<para>Provide properties of the processed challenge: the authentication scheme type
and its parameters, such as the realm this authentication scheme is applicable to,
if available.</para>
</listitem>
<listitem>
<para>Generate an authorization string for the given set of credentials and the HTTP
request in response to the actual authorization challenge.</para>
</listitem>
</itemizedlist>
<para>Please note that authentication schemes may be stateful, involving a series of
challenge-response exchanges.</para>
<para>HttpClient ships with several <interfacename>AuthScheme</interfacename>
implementations:</para>
<itemizedlist>
<listitem>
<formalpara>
<title>Basic:</title>
<para>Basic authentication scheme as defined in RFC 2617. This authentication
scheme is insecure, as the credentials are transmitted in clear text.
Despite its insecurity, the Basic authentication scheme is perfectly adequate if
used in combination with TLS/SSL encryption.</para>
</formalpara>
<formalpara>
<title>Digest:</title>
<para>Digest authentication scheme as defined in RFC 2617. Digest authentication
scheme is significantly more secure than Basic and can be a good choice for
those applications that do not want the overhead of full transport security
through TLS/SSL encryption.</para>
</formalpara>
<formalpara>
<title>NTLM:</title>
<para>NTLM is a proprietary authentication scheme developed by Microsoft and
optimized for Windows platforms. NTLM is believed to be more secure than
Digest. This scheme is supported only partially and requires an external
NTLM engine. For details please refer to the
<literal>NTLM_SUPPORT.txt</literal> document included with HttpClient
distributions.</para>
</formalpara>
</listitem>
</itemizedlist>
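<para>To make the clear-text nature of the Basic scheme concrete, the following stand-alone
sketch builds the value of an <literal>Authorization</literal> header as described in RFC
2617: the user name and password are joined with a colon and Base64 encoded, which is an
encoding and not encryption. The class <classname>BasicAuthSketch</classname> is a
hypothetical name, and the sketch uses the JDK's <classname>java.util.Base64</classname>
(Java 8 or later) rather than HttpClient's own <classname>BasicScheme</classname>.</para>

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Stand-alone illustration of the Basic scheme; not HttpClient's BasicScheme.
class BasicAuthSketch {

    // Builds the Authorization header value for a user name / password pair.
    static String authorizationValue(String user, String password) {
        String pair = user + ":" + password;
        byte[] raw = pair.getBytes(StandardCharsets.US_ASCII);
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }

    public static void main(String[] args) {
        // prints "Basic dXNlcjpwd2Q="
        System.out.println(authorizationValue("user", "pwd"));
    }
}
```

<para>Anyone who can observe the header can decode it, which is why the scheme should only
be used over an encrypted transport.</para>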
</section>
<section>
<title>HTTP authentication parameters</title>
<para>These are parameters that can be used to customize the HTTP authentication process and
behaviour of individual authentication schemes:</para>
<itemizedlist>
<listitem>
<formalpara>
<title>'http.protocol.handle-authentication':</title>
<para>defines whether authentication should be handled automatically. This
parameter expects a value of type <classname>java.lang.Boolean</classname>.
If this parameter is not set HttpClient will handle authentication
automatically.</para>
</formalpara>
<formalpara>
<title>'http.auth.credential-charset':</title>
<para>defines the charset to be used when encoding user credentials. This
parameter expects a value of type <literal>java.lang.String</literal>. If
this parameter is not set <literal>US-ASCII</literal> will be used.</para>
</formalpara>
</listitem>
</itemizedlist>
</section>
<section>
<title>Authentication scheme registry</title>
<para>HttpClient maintains a registry of available authentication schemes using the
<classname>AuthSchemeRegistry</classname> class. The following schemes are
registered by default:</para>
<itemizedlist>
<listitem>
<formalpara>
<title>Basic:</title>
<para>Basic authentication scheme</para>
</formalpara>
<formalpara>
<title>Digest:</title>
<para>Digest authentication scheme</para>
</formalpara>
</listitem>
</itemizedlist>
<para>Please note that the <literal>NTLM</literal> scheme is <emphasis>NOT</emphasis> registered
by default. For details on how to enable <literal>NTLM</literal> support please refer to
the <literal>NTLM_SUPPORT.txt</literal> document included with HttpClient
distributions.</para>
</section>
<section>
<title>Credentials provider</title>
<para>Credentials providers are intended to maintain a set of user credentials and to be
able to produce user credentials for a particular authentication scope. Authentication
scope consists of a host name, a port number, a realm name and an authentication scheme
name. When registering credentials with the credentials provider one can provide a wild
card (any host, any port, any realm, any scheme) instead of a concrete attribute value.
The credentials provider is then expected to be able to find the closest match for a
particular scope if the direct match cannot be found.</para>
<para>HttpClient can work with any physical representation of a credentials provider that
implements the <interfacename>CredentialsProvider</interfacename> interface. The default
<interfacename>CredentialsProvider</interfacename> implementation called
<classname>BasicCredentialsProvider</classname> is a simple implementation backed by
a <classname>java.util.HashMap</classname>.</para>
<programlisting><![CDATA[
CredentialsProvider credsProvider = new BasicCredentialsProvider();
credsProvider.setCredentials(
new AuthScope("somehost", AuthScope.ANY_PORT),
new UsernamePasswordCredentials("u1", "p1"));
credsProvider.setCredentials(
new AuthScope("somehost", 8080),
new UsernamePasswordCredentials("u2", "p2"));
credsProvider.setCredentials(
new AuthScope("otherhost", 8080, AuthScope.ANY_REALM, "ntlm"),
new UsernamePasswordCredentials("u3", "p3"));
System.out.println(credsProvider.getCredentials(
new AuthScope("somehost", 80, "realm", "basic")));
System.out.println(credsProvider.getCredentials(
new AuthScope("somehost", 8080, "realm", "basic")));
System.out.println(credsProvider.getCredentials(
new AuthScope("otherhost", 8080, "realm", "basic")));
System.out.println(credsProvider.getCredentials(
new AuthScope("otherhost", 8080, null, "ntlm")));
]]></programlisting>
<para>stdout &gt;</para>
<programlisting><![CDATA[
[principal: u1]
[principal: u2]
null
[principal: u3]
]]></programlisting>
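<para>The closest-match lookup can be approximated with a small stand-alone sketch. It
assumes a scoring rule in which a wildcard matches anything without adding to the score, a
concrete attribute that differs rules the entry out, and host, port, realm and scheme
matches carry decreasing weights. The weights and the names
<classname>ScopeMatchSketch</classname> and <classname>Scope</classname> are assumptions
made for illustration and are not taken from the HttpClient sources.</para>

```java
import java.util.ArrayList;
import java.util.List;

// Simplified illustration of wildcard scope matching; not HttpClient code.
class ScopeMatchSketch {

    static final int ANY_PORT = -1; // a null string acts as the "any" wildcard

    static final class Scope {
        final String host;
        final int port;
        final String realm;
        final String scheme;

        Scope(String host, int port, String realm, String scheme) {
            this.host = host;
            this.port = port;
            this.realm = realm;
            this.scheme = scheme;
        }

        // Returns -1 if the scopes conflict, otherwise a larger value for a closer match.
        int match(Scope other) {
            int h = text(host, other.host, 8);
            int p = (port < 0 || other.port < 0) ? 0 : (port == other.port ? 4 : -1);
            int r = text(realm, other.realm, 2);
            int s = text(scheme, other.scheme, 1);
            if (h < 0 || p < 0 || r < 0 || s < 0) {
                return -1;
            }
            return h + p + r + s;
        }

        private static int text(String a, String b, int weight) {
            if (a == null || b == null) {
                return 0; // wildcard on either side
            }
            return a.equalsIgnoreCase(b) ? weight : -1;
        }
    }

    private final List<Scope> scopes = new ArrayList<>();
    private final List<String> users = new ArrayList<>();

    void set(Scope scope, String user) {
        scopes.add(scope);
        users.add(user);
    }

    // Returns the user registered for the best-scoring scope, or null if none matches.
    String find(Scope query) {
        String best = null;
        int bestScore = -1;
        for (int i = 0; i < scopes.size(); i++) {
            int score = scopes.get(i).match(query);
            if (score > bestScore) {
                bestScore = score;
                best = users.get(i);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        ScopeMatchSketch store = new ScopeMatchSketch();
        store.set(new Scope("somehost", ANY_PORT, null, null), "u1");
        store.set(new Scope("somehost", 8080, null, null), "u2");
        System.out.println(store.find(new Scope("somehost", 80, "realm", "basic")));
        System.out.println(store.find(new Scope("somehost", 8080, "realm", "basic")));
    }
}
```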
</section>
<section>
<title>HTTP authentication and execution context</title>
<para>HttpClient relies on the <classname>AuthState</classname> class to keep track of
detailed information about the state of the authentication process. HttpClient creates
two instances of <classname>AuthState</classname> in the course of HTTP request
execution: one for target host authentication and another one for proxy authentication.
In case the target server or the proxy require user authentication the respective
<classname>AuthState</classname> instance will be populated with the
<classname>AuthScope</classname>, <interfacename>AuthScheme</interfacename> and
<interfacename>Credentials</interfacename> used during the authentication process.
The <classname>AuthState</classname> can be examined in order to find out what kind of
authentication was requested, whether a matching
<interfacename>AuthScheme</interfacename> implementation was found and whether the
credentials provider managed to find user credentials for the given authentication
scope.</para>
<para>In the course of HTTP request execution HttpClient adds the following authentication
related objects to the execution context:</para>
<itemizedlist>
<listitem>
<formalpara>
<title>'http.authscheme-registry':</title>
<para><classname>AuthSchemeRegistry</classname> instance representing the actual
authentication scheme registry. The value of this attribute set in the local
context takes precedence over the default one.</para>
</formalpara>
<formalpara>
<title>'http.auth.credentials-provider':</title>
<para><interfacename>CredentialsProvider</interfacename> instance representing the actual
credentials provider. The value of this attribute set in the local context
takes precedence over the default one.</para>
</formalpara>
<formalpara>
<title>'http.auth.target-scope':</title>
<para><classname>AuthState</classname> instance representing the actual target
authentication state. The value of this attribute set in the local context
takes precedence over the default one.</para>
</formalpara>
<formalpara>
<title>'http.auth.proxy-scope':</title>
<para><classname>AuthState</classname> instance representing the actual proxy
authentication state. The value of this attribute set in the local context
takes precedence over the default one.</para>
</formalpara>
</listitem>
</itemizedlist>
<para>The local <interfacename>HttpContext</interfacename> object can be used to customize
the HTTP authentication context prior to request execution or examine its state after
the request has been executed:</para>
<programlisting><![CDATA[
HttpClient httpclient = new DefaultHttpClient();
HttpContext localContext = new BasicHttpContext();
HttpGet httpget = new HttpGet("http://localhost:8080/");
HttpResponse response = httpclient.execute(httpget, localContext);
AuthState proxyAuthState = (AuthState) localContext.getAttribute(
ClientContext.PROXY_AUTH_STATE);
System.out.println("Proxy auth scope: " + proxyAuthState.getAuthScope());
System.out.println("Proxy auth scheme: " + proxyAuthState.getAuthScheme());
System.out.println("Proxy auth credentials: " + proxyAuthState.getCredentials());
AuthState targetAuthState = (AuthState) localContext.getAttribute(
ClientContext.TARGET_AUTH_STATE);
System.out.println("Target auth scope: " + targetAuthState.getAuthScope());
System.out.println("Target auth scheme: " + targetAuthState.getAuthScheme());
System.out.println("Target auth credentials: " + targetAuthState.getCredentials());
]]></programlisting>
</section>
<section>
<title>Preemptive authentication</title>
<para>HttpClient does not support preemptive authentication out of the box, because if
misused or used incorrectly the preemptive authentication can lead to significant
security issues, such as sending user credentials in clear text to an unauthorized third
party. Therefore, users are expected to evaluate potential benefits of preemptive
authentication versus security risks in the context of their specific application
environment and are required to add support for preemptive authentication using standard
HttpClient extension mechanisms such as protocol interceptors.</para>
<para>This is an example of a simple protocol interceptor that preemptively introduces an
instance of <classname>BasicScheme</classname> to the execution context, if no
authentication has been attempted yet. Please note that this interceptor must be added
to the protocol processing chain before the standard authentication interceptors.</para>
<programlisting><![CDATA[
HttpRequestInterceptor preemptiveAuth = new HttpRequestInterceptor() {
public void process(
final HttpRequest request,
final HttpContext context) throws HttpException, IOException {
AuthState authState = (AuthState) context.getAttribute(
ClientContext.TARGET_AUTH_STATE);
CredentialsProvider credsProvider = (CredentialsProvider) context.getAttribute(
ClientContext.CREDS_PROVIDER);
HttpHost targetHost = (HttpHost) context.getAttribute(
ExecutionContext.HTTP_TARGET_HOST);
// If no auth scheme has been initialized yet
if (authState.getAuthScheme() == null) {
AuthScope authScope = new AuthScope(
targetHost.getHostName(),
targetHost.getPort());
// Obtain credentials matching the target host
Credentials creds = credsProvider.getCredentials(authScope);
// If found, generate BasicScheme preemptively
if (creds != null) {
authState.setAuthScheme(new BasicScheme());
authState.setCredentials(creds);
}
}
}
};
DefaultHttpClient httpclient = new DefaultHttpClient();
// Add as the very first interceptor in the protocol chain
httpclient.addRequestInterceptor(preemptiveAuth, 0);
]]></programlisting>
</section>
</chapter>

806
src/docbkx/connmgmt.xml Normal file

@ -0,0 +1,806 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE preface PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">
<!--
====================================================================
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
====================================================================
-->
<chapter>
<title>Connection management</title>
<para>HttpClient has complete control over the process of connection initialization and
termination as well as I/O operations on active connections. However, various aspects of
connection operations can be controlled using a number of parameters.</para>
<section>
<title>Connection parameters</title>
<para>These are parameters that can influence connection operations:</para>
<itemizedlist>
<listitem>
<formalpara>
<title>'http.socket.timeout':</title>
<para>defines the socket timeout (<literal>SO_TIMEOUT</literal>) in
milliseconds, which is the timeout for waiting for data or, put differently,
the maximum period of inactivity between two consecutive data packets. A timeout
value of zero is interpreted as an infinite timeout. This parameter expects
a value of type <classname>java.lang.Integer</classname>. If this parameter
is not set, read operations will not time out (infinite timeout).</para>
</formalpara>
<formalpara>
<title>'http.tcp.nodelay':</title>
<para>determines whether Nagle's algorithm is to be used. Nagle's algorithm
tries to conserve bandwidth by minimizing the number of segments that are
sent. When applications wish to decrease network latency and increase
performance, they can disable Nagle's algorithm (that is, enable
<literal>TCP_NODELAY</literal>). Data will be sent earlier, at the cost
of an increase in bandwidth consumption. This parameter expects a value of
type <classname>java.lang.Boolean</classname>. If this parameter is not set,
<literal>TCP_NODELAY</literal> will be enabled (no delay).</para>
</formalpara>
<formalpara>
<title>'http.socket.buffer-size':</title>
<para>determines the size of the internal socket buffer used to buffer data
while receiving / transmitting HTTP messages. This parameter expects a value
of type <classname>java.lang.Integer</classname>. If this parameter is not
set HttpClient will allocate 8192 byte socket buffers.</para>
</formalpara>
<formalpara>
<title>'http.socket.linger':</title>
<para>sets <literal>SO_LINGER</literal> with the specified linger time in
seconds. The maximum timeout value is platform specific. Value 0 implies
that the option is disabled. Value -1 implies that the JRE default is used.
The setting only affects the socket close operation. If this parameter is
not set value -1 (JRE default) will be assumed.</para>
</formalpara>
<formalpara>
<title>'http.connection.timeout':</title>
<para>determines the timeout in milliseconds until a connection is established.
A timeout value of zero is interpreted as an infinite timeout. This
parameter expects a value of type <classname>java.lang.Integer</classname>.
If this parameter is not set connect operations will not time out (infinite
timeout).</para>
</formalpara>
<formalpara>
<title>'http.connection.stalecheck':</title>
<para>determines whether the stale connection check is to be used. Disabling the stale
connection check may result in a noticeable performance improvement (the
check can cause up to 30 millisecond overhead per request) at the risk of
getting an I/O error when executing a request over a connection that has
been closed at the server side. This parameter expects a value of type
<classname>java.lang.Boolean</classname>. For performance critical
operations the check should be disabled. If this parameter is not set, the
stale connection check will be performed before each request execution.</para>
</formalpara>
<formalpara>
<title>'http.connection.max-line-length':</title>
<para>determines the maximum line length limit. If set to a positive value, any
HTTP line exceeding this limit will cause a
<exceptionname>java.io.IOException</exceptionname>. A negative or zero
value will effectively disable the check. This parameter expects a value of
type <classname>java.lang.Integer</classname>. If this parameter is not set,
no limit will be enforced.</para>
</formalpara>
<formalpara>
<title>'http.connection.max-header-count':</title>
<para>determines the maximum HTTP header count allowed. If set to a positive
value, the number of HTTP headers received from the data stream exceeding
this limit will cause a <exceptionname>java.io.IOException</exceptionname>.
A negative or zero value will effectively disable the check. This parameter
expects a value of type <classname>java.lang.Integer</classname>. If this
parameter is not set, no limit will be enforced.</para>
</formalpara>
<formalpara>
<title>'http.connection.max-status-line-garbage':</title>
<para>defines the maximum number of ignorable lines before we expect an HTTP
response's status line. With HTTP/1.1 persistent connections, the problem
arises that broken scripts could return a wrong
<literal>Content-Length</literal> (there are more bytes sent than
specified). Unfortunately, in some cases, this cannot be detected after the
bad response, but only before the next one. So HttpClient must be able to
skip those surplus lines this way. This parameter expects a value of type
<classname>java.lang.Integer</classname>. 0 disallows all garbage/empty lines before the status
line. Use <constant>java.lang.Integer#MAX_VALUE</constant> for an unlimited
number. If this parameter is not set, an unlimited number will be
assumed.</para>
</formalpara>
</listitem>
</itemizedlist>
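<para>Several of the parameters above correspond to standard socket options. The following
stand-alone sketch applies three of them directly to a
<classname>java.net.Socket</classname>, without going through HttpClient, to show what the
settings mean at the socket level. The class <classname>SocketOptionsSketch</classname>
and the chosen values are illustrative assumptions, and no connection is opened.</para>

```java
import java.io.IOException;
import java.net.Socket;

// Stand-alone illustration using plain JDK sockets; not HttpClient code.
class SocketOptionsSketch {

    // Applies the socket-level counterparts of 'http.socket.timeout',
    // 'http.tcp.nodelay' and 'http.socket.linger' to the given socket.
    static void configure(Socket socket, int timeoutMillis, boolean noDelay, int lingerSeconds)
            throws IOException {
        socket.setSoTimeout(timeoutMillis);   // SO_TIMEOUT; 0 means wait indefinitely
        socket.setTcpNoDelay(noDelay);        // true disables Nagle's algorithm
        if (lingerSeconds >= 0) {
            // SO_LINGER is enabled only for a positive linger time
            socket.setSoLinger(lingerSeconds > 0, lingerSeconds);
        }                                     // negative: keep the JRE default
    }

    public static void main(String[] args) throws IOException {
        try (Socket socket = new Socket()) {  // unconnected socket
            configure(socket, 5000, true, -1);
            System.out.println("SO_TIMEOUT:  " + socket.getSoTimeout());
            System.out.println("TCP_NODELAY: " + socket.getTcpNoDelay());
        }
    }
}
```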
</section>
<section>
<title>Connection persistence</title>
<para>The process of establishing a connection from one host to another is quite complex and
involves multiple packet exchanges between two endpoints, which can be quite time
consuming. The overhead of connection handshaking can be significant, especially for
small HTTP messages. One can achieve a much higher data throughput if open connections
can be re-used to execute multiple requests.</para>
<para>HTTP/1.1 states that HTTP connections can be re-used for multiple requests by
default. HTTP/1.0 compliant endpoints can also use a similar mechanism to explicitly
communicate their preference to keep the connection alive and use it for multiple requests.
HTTP agents can also keep idle connections alive for a certain period of time in case a
connection to the same target host may be needed for subsequent requests. The ability to
keep connections alive is usually referred to as connection persistence. HttpClient fully
supports connection persistence.</para>
</section>
<section>
<title>HTTP connection routing</title>
<para>HttpClient is capable of establishing connections to the target host either directly
or via a route that may involve multiple intermediate connections also referred to as
hops. HttpClient differentiates connections of a route into plain, tunneled and layered.
The use of multiple intermediate proxies to tunnel connections to the target host is
referred to as proxy chaining.</para>
<para>Plain routes are established by connecting to the target or the first and only proxy.
Tunnelled routes are established by connecting to the first proxy and tunnelling through a
chain of proxies to the target. Routes without a proxy cannot be tunnelled. Layered
routes are established by layering a protocol over an existing connection. Protocols can
only be layered over a tunnel to the target, or over a direct connection without
proxies.</para>
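<para>The two constraints stated above (no tunnelling without a proxy, and layering only
over a tunnel or a direct connection) can be expressed as simple checks. The following
stand-alone sketch is a hypothetical model written for illustration;
<classname>RouteSketch</classname> is not an HttpClient class and only mirrors the rules
as worded in this section.</para>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model of the routing rules described in the text.
class RouteSketch {

    final String target;
    final List<String> proxies;
    final boolean tunnelled;
    final boolean layered;

    RouteSketch(String target, List<String> proxies, boolean tunnelled, boolean layered) {
        if (tunnelled && proxies.isEmpty()) {
            throw new IllegalArgumentException("Routes without a proxy cannot be tunnelled");
        }
        if (layered && !proxies.isEmpty() && !tunnelled) {
            throw new IllegalArgumentException(
                    "Protocols can only be layered over a tunnel or a direct connection");
        }
        this.target = target;
        this.proxies = Collections.unmodifiableList(new ArrayList<>(proxies));
        this.tunnelled = tunnelled;
        this.layered = layered;
    }

    // Number of hops: every proxy plus the target itself.
    int hopCount() {
        return proxies.size() + 1;
    }

    public static void main(String[] args) {
        RouteSketch direct = new RouteSketch(
                "target", Collections.<String>emptyList(), false, true);
        RouteSketch chained = new RouteSketch(
                "target", Arrays.asList("proxy1", "proxy2"), true, true);
        System.out.println(direct.hopCount());  // 1
        System.out.println(chained.hopCount()); // 3
    }
}
```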
<section>
<title>Route computation</title>
<para>The <interfacename>RouteInfo</interfacename> interface represents information about a
definitive route to a target host involving one or more intermediate steps or hops.
<classname>HttpRoute</classname> is a concrete implementation of
<interfacename>RouteInfo</interfacename>, which cannot be changed (is
immutable). <classname>RouteTracker</classname> is a mutable
<interfacename>RouteInfo</interfacename> implementation used internally by
HttpClient to track the remaining hops to the ultimate route target.
<classname>RouteTracker</classname> can be updated after a successful execution
of the next hop towards the route target. <classname>HttpRouteDirector</classname>
is a helper class that can be used to compute the next step in a route. This class
is used internally by HttpClient.</para>
<para><interfacename>HttpRoutePlanner</interfacename> is an interface representing a
strategy to compute a complete route to a given target based on the execution
context. HttpClient ships with two default
<interfacename>HttpRoutePlanner</interfacename> implementation.
<classname>ProxySelectorRoutePlanner</classname> is based on
<classname>java.net.ProxySelector</classname>. By default, it will pick up the
proxy settings of the JVM, either from system properties or from the browser running
the application. <classname>DefaultHttpRoutePlanner</classname> implementation does
not make use of any Java system properties, nor of system or browser proxy settings.
It computes routes based exclusively on HTTP parameters described below.</para>
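<para>To give an idea of what a planner based on
<classname>java.net.ProxySelector</classname> has to do, the following sketch asks a
selector for the proxies applicable to a URI and picks the first HTTP proxy. It uses
only JDK classes; the <classname>ProxyLookup</classname> class and its method are
hypothetical names chosen for this example and do not show how
<classname>ProxySelectorRoutePlanner</classname> is actually implemented.</para>
<programlisting><![CDATA[
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

// Hypothetical helper, for illustration only.
final class ProxyLookup {

    // Returns the address of the first HTTP proxy the selector offers for the
    // given URI, or null when the selector asks for a direct connection.
    static InetSocketAddress firstHttpProxy(ProxySelector selector, URI uri) {
        List<Proxy> candidates = selector.select(uri);
        for (Proxy proxy : candidates) {
            if (proxy.type() == Proxy.Type.HTTP
                    && proxy.address() instanceof InetSocketAddress) {
                return (InetSocketAddress) proxy.address();
            }
        }
        return null;
    }
}
]]></programlisting>
<para>Passing <methodname>ProxySelector#getDefault()</methodname> as the selector makes the
lookup follow the proxy settings of the JVM.</para>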
</section>
<section>
<title>Secure HTTP connections</title>
<para>HTTP connections can be considered secure if information transmitted between two
connection endpoints cannot be read or tampered with by an unauthorized third party.
The SSL/TLS protocol is the most widely used technique to ensure HTTP transport
security. However, other encryption techniques could be employed as well. Usually,
HTTP transport is layered over the SSL/TLS encrypted connection.</para>
</section>
</section>
<section>
<title>HTTP route parameters</title>
<para>These are parameters that can influence route computation:</para>
<itemizedlist>
<listitem>
<formalpara>
<title>'http.route.default-proxy':</title>
<para>defines a proxy host to be used by default route planners that do not make
use of JRE settings. This parameter expects a value of type
<classname>HttpHost</classname>. If this parameter is not set, direct
connections to the target will be attempted.</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>'http.route.local-address':</title>
<para>defines a local address to be used by all default route planners. On
machines with multiple network interfaces, this parameter can be used to
select the network interface from which the connection originates. This
parameter expects a value of type
<classname>java.net.InetAddress</classname>. If this parameter is not
set, a default local address will be used automatically.</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>'http.route.forced-route':</title>
<para>defines a forced route to be used by all default route planners. Instead
of computing a route, the given forced route will be returned, even if it
points to a completely different target host. This parameter expects a value
of type <classname>HttpRoute</classname>.</para>
</formalpara>
</listitem>
</itemizedlist>
</section>
<section>
<title>Socket factories</title>
<para>HTTP connections make use of a <classname>java.net.Socket</classname> object
internally to handle transmission of data across the wire. They, however, rely on the
<interfacename>SocketFactory</interfacename> interface to create, initialize and
connect sockets. This enables the users of HttpClient to provide application specific
socket initialization code at runtime. <classname>PlainSocketFactory</classname> is the
default factory for creating and initializing plain (unencrypted) sockets.</para>
<para>The process of creating a socket and that of connecting it to a host are decoupled, so
that the socket can be closed while it is blocked in the connect operation.</para>
<programlisting><![CDATA[
PlainSocketFactory sf = PlainSocketFactory.getSocketFactory();
// Create the socket first; it is not connected yet
Socket socket = sf.createSocket();
HttpParams params = new BasicHttpParams();
// The connection timeout is an Integer value in milliseconds
params.setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, 1000);
sf.connectSocket(socket, "localhost", 8080, null, -1, params);
]]></programlisting>
<section>
<title>Secure socket layering</title>
<para><interfacename>LayeredSocketFactory</interfacename> is an extension of the
<interfacename>SocketFactory</interfacename> interface. Layered socket factories
are capable of creating sockets that are layered over an existing plain socket.
Socket layering is used primarily for creating secure sockets through proxies.
HttpClient ships with <classname>SSLSocketFactory</classname>, which implements
SSL/TLS layering. Please note that HttpClient does not use any custom encryption
functionality. It is fully reliant on the standard Java Cryptography (JCE) and
Secure Sockets (JSSE) extensions.</para>
</section>
<section>
<title>SSL/TLS customization</title>
<para>HttpClient makes use of SSLSocketFactory to create SSL connections.
<classname>SSLSocketFactory</classname> allows for a high degree of
customization. It can take an instance of
<interfacename>javax.net.ssl.SSLContext</interfacename> as a parameter and use
it to create custom configured SSL connections.</para>
<programlisting><![CDATA[
// WARNING: this trust manager accepts any certificate chain.
// It is meant for testing only.
TrustManager easyTrustManager = new X509TrustManager() {

    @Override
    public void checkClientTrusted(
            X509Certificate[] chain,
            String authType) throws CertificateException {
        // Oh, I am easy!
    }

    @Override
    public void checkServerTrusted(
            X509Certificate[] chain,
            String authType) throws CertificateException {
        // Oh, I am easy!
    }

    @Override
    public X509Certificate[] getAcceptedIssuers() {
        // The X509TrustManager contract asks for a non-null array
        return new X509Certificate[0];
    }

};
SSLContext sslcontext = SSLContext.getInstance("TLS");
sslcontext.init(null, new TrustManager[] { easyTrustManager }, null);
SSLSocketFactory sf = new SSLSocketFactory(sslcontext);
SSLSocket socket = (SSLSocket) sf.createSocket();
socket.setEnabledCipherSuites(new String[] { "SSL_RSA_WITH_RC4_128_MD5" });
HttpParams params = new BasicHttpParams();
// The connection timeout is an Integer value in milliseconds
params.setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, 1000);
sf.connectSocket(socket, "localhost", 443, null, -1, params);
]]></programlisting>
<para>Customization of SSLSocketFactory implies a certain degree of familiarity with the
concepts of the SSL/TLS protocol, a detailed explanation of which is out of scope
for this document. Please refer to the <ulink
url="http://java.sun.com/j2se/1.5.0/docs/guide/security/jsse/JSSERefGuide.html"
>Java Secure Socket Extension</ulink> for a detailed description of
<interfacename>javax.net.ssl.SSLContext</interfacename> and related
tools.</para>
</section>
<section>
<title>Hostname verification</title>
<para>In addition to the trust verification and the client authentication performed on
the SSL/TLS protocol level, HttpClient can optionally verify whether the target
hostname matches the names stored inside the server's X.509 certificate, once the
connection has been established. This verification can provide additional guarantees
of authenticity of the server trust material. The
<interfacename>X509HostnameVerifier</interfacename> interface represents a strategy
for hostname verification. HttpClient ships with three
<interfacename>X509HostnameVerifier</interfacename> implementations. Important:
hostname verification should not be confused with SSL trust verification.</para>
<itemizedlist>
<listitem>
<formalpara>
<title><classname>StrictHostnameVerifier</classname>:</title>
<para>The strict hostname verifier works the same way as Sun Java 1.4, Sun
Java 5, Sun Java 6. It's also pretty close to IE6. This implementation
appears to be compliant with RFC 2818 for dealing with wildcards. The
hostname must match either the first CN, or any of the subject-alts. A
wildcard can occur in the CN, and in any of the subject-alts.</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title><classname>BrowserCompatHostnameVerifier</classname>:</title>
<para>This hostname verifier works the same way as Curl and Firefox. The
hostname must match either the first CN, or any of the subject-alts. A
wildcard can occur in the CN, and in any of the subject-alts. The only
difference between <classname>BrowserCompatHostnameVerifier</classname>
and <classname>StrictHostnameVerifier</classname> is that a wildcard
(such as "*.foo.com") with
<classname>BrowserCompatHostnameVerifier</classname> matches all
subdomains, including "a.b.foo.com".</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title><classname>AllowAllHostnameVerifier</classname>:</title>
<para>This hostname verifier essentially turns hostname verification off.
This implementation is a no-op, and never throws the
<exceptionname>javax.net.ssl.SSLException</exceptionname>.</para>
</formalpara>
</listitem>
</itemizedlist>
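<para>The difference in wildcard handling between the strict and the browser compatible
verifier can be illustrated with a small piece of plain Java. This is a simplified
sketch of the rule described above and not the code of the verifiers themselves; the
<classname>WildcardMatch</classname> class and its <literal>strict</literal> flag are
hypothetical names, and checks a real verifier performs on the certificate (such as
choosing between the CN and the subject-alts) are left out.</para>
<programlisting><![CDATA[
import java.util.Locale;

// Hypothetical helper, for illustration only.
final class WildcardMatch {

    // Matches a host name against a certificate name such as "*.foo.com".
    // When strict is true the wildcard covers exactly one label,
    // otherwise it covers any number of sub-domain labels.
    static boolean matches(String host, String certName, boolean strict) {
        String h = host.toLowerCase(Locale.ROOT);
        String c = certName.toLowerCase(Locale.ROOT);
        if (!c.startsWith("*.")) {
            // No wildcard: the names must be identical
            return h.equals(c);
        }
        String suffix = c.substring(1);   // for example ".foo.com"
        if (!h.endsWith(suffix) || h.length() == suffix.length()) {
            return false;
        }
        // The part of the host name covered by the wildcard
        String prefix = h.substring(0, h.length() - suffix.length());
        return !strict || prefix.indexOf('.') < 0;
    }
}
]]></programlisting>
<para>In this sketch "a.foo.com" matches "*.foo.com" in both modes, whereas "a.b.foo.com"
matches only when <literal>strict</literal> is <literal>false</literal>.</para>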
<para>By default HttpClient uses the <classname>BrowserCompatHostnameVerifier</classname>
implementation. One can specify a different hostname verifier implementation if
desired:</para>
<programlisting><![CDATA[
SSLSocketFactory sf = new SSLSocketFactory(SSLContext.getInstance("TLS"));
sf.setHostnameVerifier(SSLSocketFactory.STRICT_HOSTNAME_VERIFIER);
]]></programlisting>
</section>
</section>
<section>
<title>Protocol schemes</title>
<para>The <classname>Scheme</classname> class represents a protocol scheme such as "http" or
"https" and contains a number of protocol properties such as the default port and the
socket factory to be used for creating <classname>java.net.Socket</classname> instances
for the given protocol. The <classname>SchemeRegistry</classname> class is used to maintain
a set of <classname>Scheme</classname>s HttpClient can choose from when trying to
establish a connection by a request URI:</para>
<programlisting><![CDATA[
Scheme http = new Scheme("http", PlainSocketFactory.getSocketFactory(), 80);
SSLSocketFactory sf = new SSLSocketFactory(SSLContext.getInstance("TLS"));
sf.setHostnameVerifier(SSLSocketFactory.STRICT_HOSTNAME_VERIFIER);
Scheme https = new Scheme("https", sf, 443);
SchemeRegistry sr = new SchemeRegistry();
sr.register(http);
sr.register(https);
]]></programlisting>
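<para>The role the default port of a scheme plays when a request URI does not name a port
can be sketched with JDK classes alone. The <classname>DefaultPorts</classname> class
below is a hypothetical stand-in used only for illustration; it is not how
<classname>SchemeRegistry</classname> is implemented.</para>
<programlisting><![CDATA[
import java.net.URI;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for a scheme registry, for illustration only.
final class DefaultPorts {

    private final Map<String, Integer> ports = new HashMap<String, Integer>();

    void register(String scheme, int defaultPort) {
        ports.put(scheme.toLowerCase(Locale.ROOT), defaultPort);
    }

    // Returns the explicit port of the URI, or the default port
    // registered for its scheme when the URI does not name one.
    int resolvePort(URI uri) {
        if (uri.getPort() != -1) {
            return uri.getPort();
        }
        String scheme = uri.getScheme();
        Integer port = scheme == null ? null : ports.get(scheme.toLowerCase(Locale.ROOT));
        if (port == null) {
            throw new IllegalStateException("Scheme not registered: " + scheme);
        }
        return port;
    }
}
]]></programlisting>
<para>After registering "http" with port 80 and "https" with port 443, a URI such as
"https://example.com/" resolves to port 443, while "http://example.com:8080/" keeps
its explicit port.</para>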
</section>
<section>
<title>HttpClient proxy configuration</title>
<para>Even though HttpClient is aware of complex routing schemes and proxy chaining, it
supports only simple direct or one hop proxy connections out of the box.</para>
<para>The simplest way to tell HttpClient to connect to the target host via a proxy is by
setting the default proxy parameter:</para>
<programlisting><![CDATA[
DefaultHttpClient httpclient = new DefaultHttpClient();
HttpHost proxy = new HttpHost("someproxy", 8080);
httpclient.getParams().setParameter(ConnRoutePNames.DEFAULT_PROXY, proxy);
]]></programlisting>
<para>One can also instruct HttpClient to use standard JRE proxy selector to obtain proxy
information:</para>
<programlisting><![CDATA[
DefaultHttpClient httpclient = new DefaultHttpClient();
ProxySelectorRoutePlanner routePlanner = new ProxySelectorRoutePlanner(
httpclient.getConnectionManager().getSchemeRegistry(),
ProxySelector.getDefault());
httpclient.setRoutePlanner(routePlanner);
]]></programlisting>
<para>Alternatively, one can provide a custom <interfacename>HttpRoutePlanner</interfacename>
implementation in order to have complete control over the process of HTTP route
computation:</para>
<programlisting><![CDATA[
DefaultHttpClient httpclient = new DefaultHttpClient();
httpclient.setRoutePlanner(new HttpRoutePlanner() {

    public HttpRoute determineRoute(
            HttpHost target,
            HttpRequest request,
            HttpContext context) throws HttpException {
        return new HttpRoute(target, null, new HttpHost("someproxy", 8080),
                "https".equalsIgnoreCase(target.getSchemeName()));
    }

});
]]></programlisting>
</section>
<section>
<title>HTTP connection managers</title>
<section>
<title>Connection operators</title>
<para>Operated connections are client side connections whose underlying socket or its
state can be manipulated by an external entity, usually referred to as a connection
operator. The <interfacename>OperatedClientConnection</interfacename> interface extends
the <interfacename>HttpClientConnection</interfacename> interface and defines
additional methods to manage the connection socket. The
<interfacename>ClientConnectionOperator</interfacename> interface represents a
strategy for creating <interfacename>OperatedClientConnection</interfacename>
instances and updating the underlying socket of those objects. Implementations will
most likely make use of <interfacename>SocketFactory</interfacename>s to create
<classname>java.net.Socket</classname> instances. The
<interfacename>ClientConnectionOperator</interfacename> interface enables the
users of HttpClient to provide a custom strategy for connection operators as well as
an alternative implementation of the
<interfacename>OperatedClientConnection</interfacename> interface.</para>
</section>
<section>
<title>Managed connections and connection managers</title>
<para>HTTP connections are complex, stateful, thread-unsafe objects which need to be
properly managed to function correctly. HTTP connections can only be used by one
execution thread at a time. HttpClient employs a special entity to manage access to
HTTP connections called HTTP connection manager and represented by the
<interfacename>ClientConnectionManager</interfacename> interface. The purpose of
an HTTP connection manager is to serve as a factory for new HTTP connections, manage
persistent connections and synchronize access to persistent connections making sure
that only one thread can have access to a connection at a time.</para>
<para>Internally HTTP connection managers work with instances of
<interfacename>OperatedClientConnection</interfacename>, but they hand out
instances of <interfacename>ManagedClientConnection</interfacename> to the service
consumers. <interfacename>ManagedClientConnection</interfacename> acts as a wrapper
for an <interfacename>OperatedClientConnection</interfacename> instance that manages
its state and controls all I/O operations on that connection. It also abstracts away
socket operations and provides convenience methods for opening and updating sockets
in order to establish a route.
<interfacename>ManagedClientConnection</interfacename> instances are aware of
their link to the connection manager that spawned them and of the fact that they
must be returned to the manager when no longer in use.
<interfacename>ManagedClientConnection</interfacename> classes also implement
<interfacename>ConnectionReleaseTrigger</interfacename> interface that can be
used to trigger the release of the connection back to the manager. Once the
connection release has been triggered the wrapped connection gets detached from the
<interfacename>ManagedClientConnection</interfacename> wrapper and the
<interfacename>OperatedClientConnection</interfacename> instance is returned
back to the manager. Even though the service consumer still holds a reference to the
<interfacename>ManagedClientConnection</interfacename> instance, it is no longer
able to execute any I/O operation or change the state of the
<interfacename>OperatedClientConnection</interfacename> either intentionally or
unintentionally.</para>
<para>This is an example of acquiring a connection from a connection manager:</para>
<programlisting><![CDATA[
HttpParams params = new BasicHttpParams();
Scheme http = new Scheme("http", PlainSocketFactory.getSocketFactory(), 80);
SchemeRegistry sr = new SchemeRegistry();
sr.register(http);

ClientConnectionManager connMrg = new SingleClientConnManager(params, sr);

// Request new connection. This can be a long process
ClientConnectionRequest connRequest = connMrg.requestConnection(
        new HttpRoute(new HttpHost("localhost", 80)), null);

// Wait for connection up to 10 sec
ManagedClientConnection conn = connRequest.getConnection(10, TimeUnit.SECONDS);
try {
    // Do useful things with the connection.
    // Release it when done.
    conn.releaseConnection();
} catch (IOException ex) {
    // Abort connection upon an I/O error.
    conn.abortConnection();
    throw ex;
}
]]></programlisting>
<para>The connection request can be terminated prematurely by calling
<methodname>ClientConnectionRequest#abortRequest()</methodname> if necessary.
This will unblock the thread blocked in the
<methodname>ClientConnectionRequest#getConnection()</methodname> method.</para>
<para><classname>BasicManagedEntity</classname> wrapper class can be used to ensure
automatic release of the underlying connection once the response content has been
fully consumed. HttpClient uses this mechanism internally to achieve transparent
connection release for all responses obtained from
<methodname>HttpClient#execute()</methodname> methods:</para>
<programlisting><![CDATA[
ClientConnectionRequest connRequest = connMrg.requestConnection(
        new HttpRoute(new HttpHost("localhost", 80)), null);
ManagedClientConnection conn = connRequest.getConnection(10, TimeUnit.SECONDS);
try {
    BasicHttpRequest request = new BasicHttpRequest("GET", "/");
    conn.sendRequestHeader(request);
    HttpResponse response = conn.receiveResponseHeader();
    conn.receiveResponseEntity(response);
    HttpEntity entity = response.getEntity();
    if (entity != null) {
        BasicManagedEntity managedEntity = new BasicManagedEntity(entity, conn, true);
        // Replace entity
        response.setEntity(managedEntity);
    }
    // Do something useful with the response
    // The connection will be released automatically
    // as soon as the response content has been consumed
} catch (IOException ex) {
    // Abort connection upon an I/O error.
    conn.abortConnection();
    throw ex;
}
]]></programlisting>
</section>
<section>
<title>Simple connection manager</title>
<para><classname>SingleClientConnManager</classname> is a simple connection manager that
maintains only one connection at a time. Even though this class is thread-safe it
ought to be used by one execution thread only.
<classname>SingleClientConnManager</classname> will make an effort to reuse the
connection for subsequent requests with the same route. It will, however, close the
existing connection and reopen it for the given route, if the route of the persistent
connection does not match that of the connection request. If the connection has
already been allocated,
<exceptionname>java.lang.IllegalStateException</exceptionname> is thrown.</para>
<para><classname>SingleClientConnManager</classname> is used by HttpClient by
default.</para>
</section>
<section>
<title>Pooling connection manager</title>
<para><classname>ThreadSafeClientConnManager</classname> is a more complex
implementation that manages a pool of client connections and is able to service
connection requests from multiple execution threads. Connections are pooled on a per
route basis. A request for a route for which the manager already has a persistent
connection available in the pool will be serviced by leasing a connection from
the pool rather than creating a brand new connection.</para>
<para><classname>ThreadSafeClientConnManager</classname> maintains a maximum limit of
connections on a per route basis and in total. By default this implementation will
create no more than 2 concurrent connections per given route and no more than 20
connections in total. For many real-world applications these limits may prove too
constraining, especially if they use HTTP as a transport protocol for their
services. Connection limits, however, can be adjusted using HTTP parameters.</para>
<para>This example shows how the connection pool parameters can be adjusted:</para>
<programlisting><![CDATA[
HttpParams params = new BasicHttpParams();
// Increase max total connection to 200
ConnManagerParams.setMaxTotalConnections(params, 200);
// Increase default max connection per route to 20
ConnPerRouteBean connPerRoute = new ConnPerRouteBean(20);
// Increase max connections for localhost:80 to 50
HttpHost localhost = new HttpHost("localhost", 80);
connPerRoute.setMaxForRoute(new HttpRoute(localhost), 50);
ConnManagerParams.setMaxConnectionsPerRoute(params, connPerRoute);
SchemeRegistry schemeRegistry = new SchemeRegistry();
schemeRegistry.register(
new Scheme("http", PlainSocketFactory.getSocketFactory(), 80));
schemeRegistry.register(
new Scheme("https", SSLSocketFactory.getSocketFactory(), 443));
ClientConnectionManager cm = new ThreadSafeClientConnManager(params, schemeRegistry);
HttpClient httpClient = new DefaultHttpClient(cm, params);
]]></programlisting>
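<para>How a per route limit and a total limit interact can be shown with a small
bookkeeping class. This is a sketch using only JDK classes; the
<classname>LeaseCounter</classname> class is a hypothetical name for this example, it
identifies routes by plain strings, and it does not reproduce the internals of
<classname>ThreadSafeClientConnManager</classname>.</para>
<programlisting><![CDATA[
import java.util.HashMap;
import java.util.Map;

// Hypothetical bookkeeping of leased connections, for illustration only.
final class LeaseCounter {

    private final int maxPerRoute;
    private final int maxTotal;
    private final Map<String, Integer> leasedPerRoute = new HashMap<String, Integer>();
    private int leasedTotal;

    LeaseCounter(int maxPerRoute, int maxTotal) {
        this.maxPerRoute = maxPerRoute;
        this.maxTotal = maxTotal;
    }

    // Records a lease and returns true if both limits allow another connection.
    synchronized boolean tryLease(String route) {
        int current = leasedFor(route);
        if (current >= maxPerRoute || leasedTotal >= maxTotal) {
            return false;
        }
        leasedPerRoute.put(route, current + 1);
        leasedTotal++;
        return true;
    }

    // Gives a previously leased connection back.
    synchronized void release(String route) {
        int current = leasedFor(route);
        if (current == 0) {
            throw new IllegalStateException("Nothing leased for " + route);
        }
        leasedPerRoute.put(route, current - 1);
        leasedTotal--;
    }

    private int leasedFor(String route) {
        Integer n = leasedPerRoute.get(route);
        return n == null ? 0 : n;
    }
}
]]></programlisting>
<para>With a limit of 2 per route and 3 in total, a third lease for the same route is
refused even though the total limit has not been reached, and a lease for a new route
is refused once three connections are out in total.</para>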
</section>
<section>
<title>Connection manager shutdown</title>
<para>When an HttpClient instance is no longer needed and is about to go out of scope it
is important to shut down its connection manager to ensure that all connections kept
alive by the manager get closed and system resources allocated by those connections
are released.</para>
<programlisting><![CDATA[
DefaultHttpClient httpclient = new DefaultHttpClient();
HttpGet httpget = new HttpGet("http://www.google.com/");
HttpResponse response = httpclient.execute(httpget);
HttpEntity entity = response.getEntity();
System.out.println(response.getStatusLine());
if (entity != null) {
entity.consumeContent();
}
httpclient.getConnectionManager().shutdown();
]]></programlisting>
</section>
</section>
<section>
<title>Connection management parameters</title>
<para>These are parameters that can be used to customize standard HTTP connection manager
implementations:</para>
<itemizedlist>
<listitem>
<formalpara>
<title>'http.conn-manager.timeout':</title>
<para>defines the timeout in milliseconds used when retrieving an instance of
<interfacename>ManagedClientConnection</interfacename> from the
<interfacename>ClientConnectionManager</interfacename>. This parameter
expects a value of type <classname>java.lang.Long</classname>. If this
parameter is not set, connection requests will not time out (infinite
timeout).</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>'http.conn-manager.max-per-route':</title>
<para>defines the maximum number of connections per route. This limit is
interpreted by client connection managers and applies to individual manager
instances. This parameter expects a value of type
<interfacename>ConnPerRoute</interfacename>.</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>'http.conn-manager.max-total':</title>
<para>defines the maximum number of connections in total. This limit is
interpreted by client connection managers and applies to individual manager
instances. This parameter expects a value of type
<classname>java.lang.Integer</classname>.</para>
</formalpara>
</listitem>
</itemizedlist>
</section>
<section>
<title>Multithreaded request execution</title>
<para>When equipped with a pooling connection manager such as
<classname>ThreadSafeClientConnManager</classname>, HttpClient can be used to execute
multiple requests simultaneously using multiple threads of execution.</para>
<para><classname>ThreadSafeClientConnManager</classname> will allocate connections based on
its configuration. If all connections for a given route have already been leased, a
request for connection will block until a connection is released back to the pool. One
can ensure the connection manager does not block indefinitely in the connection request
operation by setting <literal>'http.conn-manager.timeout'</literal> to a positive value.
If the connection request cannot be serviced within the given time period
<exceptionname>ConnectionPoolTimeoutException</exceptionname> will be thrown.</para>
<programlisting><![CDATA[
HttpParams params = new BasicHttpParams();
SchemeRegistry schemeRegistry = new SchemeRegistry();
schemeRegistry.register(
        new Scheme("http", PlainSocketFactory.getSocketFactory(), 80));

ClientConnectionManager cm = new ThreadSafeClientConnManager(params, schemeRegistry);
HttpClient httpClient = new DefaultHttpClient(cm, params);

// URIs to perform GETs on
String[] urisToGet = {
    "http://www.domain1.com/",
    "http://www.domain2.com/",
    "http://www.domain3.com/",
    "http://www.domain4.com/"
};

// create a thread for each URI
GetThread[] threads = new GetThread[urisToGet.length];
for (int i = 0; i < threads.length; i++) {
    HttpGet httpget = new HttpGet(urisToGet[i]);
    threads[i] = new GetThread(httpClient, httpget);
}

// start the threads
for (int j = 0; j < threads.length; j++) {
    threads[j].start();
}

// join the threads
for (int j = 0; j < threads.length; j++) {
    threads[j].join();
}
]]></programlisting>
<programlisting><![CDATA[
static class GetThread extends Thread {

    private final HttpClient httpClient;
    private final HttpContext context;
    private final HttpGet httpget;

    public GetThread(HttpClient httpClient, HttpGet httpget) {
        this.httpClient = httpClient;
        this.context = new BasicHttpContext();
        this.httpget = httpget;
    }

    @Override
    public void run() {
        try {
            HttpResponse response = this.httpClient.execute(this.httpget, this.context);
            HttpEntity entity = response.getEntity();
            if (entity != null) {
                // do something useful with the entity
                // ...
                // ensure the connection gets released to the manager
                entity.consumeContent();
            }
        } catch (Exception ex) {
            this.httpget.abort();
        }
    }

}
]]></programlisting>
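<para>The blocking behaviour of a connection request with a timeout can be approximated
with a counting semaphore from the JDK. This is only a sketch of the idea: the
<classname>BoundedLease</classname> class is a hypothetical name, and it reports a
timeout through its return value where a connection manager would throw
<exceptionname>ConnectionPoolTimeoutException</exceptionname>.</para>
<programlisting><![CDATA[
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical bounded pool of permits, for illustration only.
final class BoundedLease {

    private final Semaphore permits;

    BoundedLease(int maxConnections) {
        // A fair semaphore serves waiting threads in arrival order
        this.permits = new Semaphore(maxConnections, true);
    }

    // Waits up to the given time for a free slot. Returns false if the wait
    // timed out or the waiting thread was interrupted.
    boolean lease(long timeout, TimeUnit unit) {
        try {
            return permits.tryAcquire(timeout, unit);
        } catch (InterruptedException ex) {
            // Preserve the interrupt status for the caller
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Returns a slot to the pool and wakes up one waiting thread, if any.
    void release() {
        permits.release();
    }
}
]]></programlisting>
<para>A thread that cannot obtain a slot within the timeout gets
<literal>false</literal> back instead of blocking indefinitely, which mirrors the effect
of setting <literal>'http.conn-manager.timeout'</literal> to a positive value.</para>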
</section>
<section>
<title>Connection eviction policy</title>
<para>One of the major shortcomings of the classic blocking I/O model is that a network
socket can react to I/O events only when blocked in an I/O operation. When a connection
is released back to the manager, it can be kept alive; however, it is unable to monitor
the status of the socket and react to any I/O events. If the connection gets closed on
the server side, the client side connection is unable to detect the change in the
connection state and react appropriately by closing the socket on its end.</para>
<para>HttpClient tries to mitigate the problem by testing whether the connection is 'stale',
that is, no longer valid because it was closed on the server side, prior to using the
connection for executing an HTTP request. The stale connection check is not 100%
reliable and adds 10 to 30 ms overhead to each request execution. The only feasible
solution that does not involve a one thread per socket model for idle connections is a
dedicated monitor thread used to evict connections that are considered expired due to a
long period of inactivity. The monitor thread can periodically call the
<methodname>ClientConnectionManager#closeExpiredConnections()</methodname> method to
close all expired connections and evict closed connections from the pool. It can also
optionally call the <methodname>ClientConnectionManager#closeIdleConnections()</methodname>
method to close all connections that have been idle over a given period of time.</para>
<programlisting><![CDATA[
public static class IdleConnectionMonitorThread extends Thread {

    private final ClientConnectionManager connMgr;
    private volatile boolean shutdown;

    public IdleConnectionMonitorThread(ClientConnectionManager connMgr) {
        super();
        this.connMgr = connMgr;
    }

    @Override
    public void run() {
        try {
            while (!shutdown) {
                synchronized (this) {
                    wait(5000);
                    // Close expired connections
                    connMgr.closeExpiredConnections();
                    // Optionally, close connections
                    // that have been idle longer than 30 sec
                    connMgr.closeIdleConnections(30, TimeUnit.SECONDS);
                }
            }
        } catch (InterruptedException ex) {
            // terminate
        }
    }

    public void shutdown() {
        shutdown = true;
        synchronized (this) {
            notifyAll();
        }
    }

}
]]></programlisting>
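<para>The distinction between expired and idle connections can be made concrete with a
small bookkeeping sketch in plain Java. The <classname>IdlePool</classname> class is a
hypothetical name for this example; it tracks connections by string identifiers and
takes the current time as an argument, and it is not the eviction code of the
HttpClient connection managers.</para>
<programlisting><![CDATA[
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical bookkeeping of pooled connections, for illustration only.
final class IdlePool {

    // When a connection was released and when its keep-alive period ends.
    private static final class Idle {
        final long releasedAt;
        final long expiresAt;

        Idle(long releasedAt, long expiresAt) {
            this.releasedAt = releasedAt;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Idle> idle = new LinkedHashMap<String, Idle>();

    synchronized void release(String id, long now, long keepAliveMs) {
        idle.put(id, new Idle(now, now + keepAliveMs));
    }

    // Evicts connections whose keep-alive deadline has passed.
    synchronized List<String> closeExpired(long now) {
        List<String> closed = new ArrayList<String>();
        Iterator<Map.Entry<String, Idle>> it = idle.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Idle> e = it.next();
            if (e.getValue().expiresAt <= now) {
                closed.add(e.getKey());
                it.remove();
            }
        }
        return closed;
    }

    // Evicts connections that have been idle for longer than maxIdleMs.
    synchronized List<String> closeIdle(long now, long maxIdleMs) {
        List<String> closed = new ArrayList<String>();
        Iterator<Map.Entry<String, Idle>> it = idle.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Idle> e = it.next();
            if (now - e.getValue().releasedAt > maxIdleMs) {
                closed.add(e.getKey());
                it.remove();
            }
        }
        return closed;
    }

    synchronized int size() {
        return idle.size();
    }
}
]]></programlisting>
<para>A connection released with a keep-alive of 5 seconds is evicted by the expiry pass
after that time, whereas a connection with a longer keep-alive stays in the pool until
the idle pass removes it.</para>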
</section>
<section>
<title>Connection keep alive strategy</title>
<para>The HTTP specification does not specify how long a persistent connection may be and
should be kept alive. Some HTTP servers use the non-standard <literal>Keep-Alive</literal>
header to communicate to the client the period of time in seconds they intend to keep
the connection alive on the server side. HttpClient makes use of this information if
available. If the <literal>Keep-Alive</literal> header is not present in the response,
HttpClient assumes the connection can be kept alive indefinitely. However, many HTTP
servers are configured to drop persistent connections after a certain period
of inactivity in order to conserve system resources, quite often without informing the
client. In case the default strategy turns out to be too optimistic, one may want to
provide a custom keep-alive strategy.</para>
<programlisting><![CDATA[
DefaultHttpClient httpclient = new DefaultHttpClient();
httpclient.setKeepAliveStrategy(new ConnectionKeepAliveStrategy() {

    public long getKeepAliveDuration(HttpResponse response, HttpContext context) {
        // Honor 'keep-alive' header
        HeaderElementIterator it = new BasicHeaderElementIterator(
                response.headerIterator(HTTP.CONN_KEEP_ALIVE));
        while (it.hasNext()) {
            HeaderElement he = it.nextElement();
            String param = he.getName();
            String value = he.getValue();
            if (value != null && param.equalsIgnoreCase("timeout")) {
                try {
                    return Long.parseLong(value) * 1000;
                } catch(NumberFormatException ignore) {
                }
            }
        }
        HttpHost target = (HttpHost) context.getAttribute(
                ExecutionContext.HTTP_TARGET_HOST);
        if ("www.naughty-server.com".equalsIgnoreCase(target.getHostName())) {
            // Keep alive for 5 seconds only
            return 5 * 1000;
        } else {
            // otherwise keep alive for 30 seconds
            return 30 * 1000;
        }
    }

});
]]></programlisting>
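<para>The header handling performed by the strategy above can also be tried out without
HttpClient. The sketch below parses a <literal>Keep-Alive</literal> header value such
as "timeout=5, max=100" with plain string operations. The
<classname>KeepAliveHeader</classname> class is a hypothetical name, and returning -1
for a missing timeout is a convention of this sketch.</para>
<programlisting><![CDATA[
// Hypothetical helper, for illustration only.
final class KeepAliveHeader {

    // Parses a value such as "timeout=5, max=100" and returns the timeout in
    // milliseconds, or -1 when no usable timeout parameter is present.
    static long timeoutMillis(String headerValue) {
        if (headerValue == null) {
            return -1;
        }
        for (String element : headerValue.split(",")) {
            String[] pair = element.trim().split("=", 2);
            if (pair.length == 2 && pair[0].trim().equalsIgnoreCase("timeout")) {
                try {
                    // The header carries seconds, callers expect milliseconds
                    return Long.parseLong(pair[1].trim()) * 1000;
                } catch (NumberFormatException ignore) {
                    // Not a number: keep looking at the remaining elements
                }
            }
        }
        return -1;
    }
}
]]></programlisting>
<para>A caller can fall back to a fixed duration of its own choosing whenever the method
returns -1, in the same way the strategy above falls back to 5 or 30 seconds.</para>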
</section>
</chapter>


@ -503,54 +503,42 @@ byte[] response = httpclient.execute(httpget, handler);
<itemizedlist>
<listitem>
<formalpara>
<title>
<literal>http.connection</literal>
</title>
<title>'http.connection':</title>
<para><interfacename>HttpConnection</interfacename> instance representing the
actual connection to the target server.</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>
<literal>http.target_host</literal>
</title>
<title>'http.target_host':</title>
<para><classname>HttpHost</classname> instance representing the connection
target.</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>
<literal>http.proxy_host</literal>
</title>
<title>'http.proxy_host':</title>
<para><classname>HttpHost</classname> instance representing the connection
proxy, if used</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>
<literal>http.request</literal>
</title>
<title>'http.request':</title>
<para><interfacename>HttpRequest</interfacename> instance representing the
actual HTTP request.</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>
<literal>http.response</literal>
</title>
<title>'http.response':</title>
<para><interfacename>HttpResponse</interfacename> instance representing the
actual HTTP response.</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>
<literal>http.request_sent</literal>
</title>
<title>'http.request_sent':</title>
<para><classname>java.lang.Boolean</classname> object representing the flag
indicating whether the actual request has been fully transmitted to the
connection target.</para>
@ -889,9 +877,7 @@ null
<itemizedlist>
<listitem>
<formalpara>
<title>
<literal>http.protocol.version</literal>
</title>
<title>'http.protocol.version':</title>
<para>defines HTTP protocol version used if not set explicitly on the request
object. This parameter expects a value of type
<interfacename>ProtocolVersion</interfacename>. If this parameter is not
@ -900,9 +886,7 @@ null
</listitem>
<listitem>
<formalpara>
<title>
<literal>http.protocol.element-charset</literal>
</title>
<title>'http.protocol.element-charset':</title>
<para>defines the charset to be used for encoding HTTP protocol elements. This
parameter expects a value of type <classname>java.lang.String</classname>.
If this parameter is not set <literal>US-ASCII</literal> will be
@ -911,9 +895,7 @@ null
</listitem>
<listitem>
<formalpara>
<title>
<literal>http.protocol.content-charset</literal>
</title>
<title>'http.protocol.content-charset':</title>
<para>defines the charset to be used per default for content body coding. This
parameter expects a value of type <classname>java.lang.String</classname>.
If this parameter is not set <literal>ISO-8859-1</literal> will be
@ -922,9 +904,7 @@ null
</listitem>
<listitem>
<formalpara>
<title>
<literal>http.useragent</literal>
</title>
<title>'http.useragent':</title>
<para>defines the content of the <literal>User-Agent</literal> header. This
parameter expects a value of type <classname>java.lang.String</classname>.
If this parameter is not set, HttpClient will automatically generate a value
@ -933,9 +913,7 @@ null
</listitem>
<listitem>
<formalpara>
<title>
<literal>http.protocol.strict-transfer-encoding</literal>
</title>
<title>'http.protocol.strict-transfer-encoding':</title>
<para>defines whether responses with an invalid
<literal>Transfer-Encoding</literal> header should be rejected. This
parameter expects a value of type <classname>java.lang.Boolean</classname>.
@ -945,9 +923,7 @@ null
</listitem>
<listitem>
<formalpara>
<title>
<literal>http.protocol.expect-continue</literal>
</title>
<title>'http.protocol.expect-continue':</title>
<para>activates <literal>Expect: 100-Continue</literal> handshake for the entity
enclosing methods. The purpose of the <literal>Expect:
100-Continue</literal> handshake is to allow the client that is sending
@ -966,9 +942,7 @@ null
</listitem>
<listitem>
<formalpara>
<title>
<literal>http.protocol.wait-for-continue</literal>
</title>
<title>'http.protocol.wait-for-continue':</title>
<para>defines the maximum period of time in milliseconds the client should spend
waiting for a <literal>100-continue</literal> response. This parameter
expects a value of type <classname>java.lang.Integer</classname>. If this

203
src/docbkx/httpagent.xml Normal file
View File

@ -0,0 +1,203 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE preface PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">
<!--
====================================================================
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
====================================================================
-->
<chapter>
<title>HTTP client service</title>
<section>
<title>HttpClient facade</title>
<para><interfacename>HttpClient</interfacename> interface represents the most essential
contract for HTTP request execution. It imposes no restrictions or particular details on
the request execution process and leaves the specifics of connection management, state
management, authentication and redirect handling up to individual implementations. This
should make it easier to decorate the interface with additional functionality such as
response content caching.</para>
<para><classname>DefaultHttpClient</classname> is the default implementation of the
<interfacename>HttpClient</interfacename> interface. This class acts as a facade to
a number of special purpose handler or strategy interface implementations, each responsible
for handling a particular aspect of the HTTP protocol such as redirect or
authentication handling or making decisions about connection persistence and keep-alive
duration. This enables users to selectively replace the default implementations of those
aspects with custom, application-specific ones.</para>
<programlisting><![CDATA[
DefaultHttpClient httpclient = new DefaultHttpClient();
httpclient.setKeepAliveStrategy(new DefaultConnectionKeepAliveStrategy() {

    @Override
    public long getKeepAliveDuration(
            HttpResponse response,
            HttpContext context) {
        long keepAlive = super.getKeepAliveDuration(response, context);
        if (keepAlive == -1) {
            // Keep connections alive 5 seconds if a keep-alive value
            // has not been explicitly set by the server
            keepAlive = 5000;
        }
        return keepAlive;
    }

});
]]></programlisting>
<para><classname>DefaultHttpClient</classname> also maintains a list of protocol
interceptors intended for processing outgoing requests and incoming responses and
provides methods for managing those interceptors. New protocol interceptors can be
introduced to the protocol processor chain or removed from it if needed. Internally
protocol interceptors are stored in a simple <classname>java.util.ArrayList</classname>.
They are executed in the same natural order as they are added to the list.</para>
<programlisting><![CDATA[
DefaultHttpClient httpclient = new DefaultHttpClient();
httpclient.removeRequestInterceptorByClass(RequestUserAgent.class);
httpclient.addRequestInterceptor(new HttpRequestInterceptor() {

    public void process(
            HttpRequest request, HttpContext context)
            throws HttpException, IOException {
        request.setHeader(HTTP.USER_AGENT, "My-own-client");
    }

});
]]></programlisting>
<para><classname>DefaultHttpClient</classname> is thread safe. It is recommended that the
same instance of this class is reused for multiple request executions. When an instance
of <classname>DefaultHttpClient</classname> is no longer needed and is about to go out
of scope the connection manager associated with it must be shut down by calling the
<methodname>ClientConnectionManager#shutdown()</methodname> method.</para>
<programlisting><![CDATA[
HttpClient httpclient = new DefaultHttpClient();
// Do something useful
httpclient.getConnectionManager().shutdown();
]]></programlisting>
</section>
<section>
<title>HttpClient parameters</title>
<para>These are parameters that can be used to customize the behaviour of the default HttpClient
implementation:</para>
<itemizedlist>
<listitem>
<formalpara>
<title>'http.protocol.handle-redirects':</title>
<para>defines whether redirects should be handled automatically. This parameter
expects a value of type <classname>java.lang.Boolean</classname>. If this
parameter is not set, HttpClient will handle redirects automatically.</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>'http.protocol.reject-relative-redirect':</title>
<para>defines whether relative redirects should be rejected. HTTP specification
requires the location value be an absolute URI. This parameter expects a
value of type <classname>java.lang.Boolean</classname>. If this parameter is
not set relative redirects will be allowed.</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>'http.protocol.max-redirects':</title>
<para>defines the maximum number of redirects to be followed. The limit on
number of redirects is intended to prevent infinite loops caused by broken
server side scripts. This parameter expects a value of type
<classname>java.lang.Integer</classname>. If this parameter is not set
no more than 100 redirects will be allowed.</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>'http.protocol.allow-circular-redirects':</title>
<para>defines whether circular redirects (redirects to the same location) should
be allowed. The HTTP spec is not sufficiently clear whether circular
redirects are permitted, therefore optionally they can be enabled. This
parameter expects a value of type <classname>java.lang.Boolean</classname>.
If this parameter is not set circular redirects will be disallowed.</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>'http.connection-manager.factory-class-name':</title>
<para>defines the class name of the default
<interfacename>ClientConnectionManager</interfacename> implementation.
This parameter expects a value of type
<classname>java.lang.String</classname>. If this parameter is not set
<classname>SingleClientConnManager</classname> will be used per
default.</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>'http.virtual-host':</title>
<para>defines the virtual host name to be used in the <literal>Host</literal>
header instead of the physical host name. This parameter expects a value of
type <classname>HttpHost</classname>. If this parameter is not set, the name or
IP address of the target host will be used.</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>'http.default-headers':</title>
<para>defines the request headers to be sent per default with each request. This
parameter expects a value of type
<interfacename>java.util.Collection</interfacename> containing
<interfacename>Header</interfacename> objects.</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>'http.default-host':</title>
<para>defines the default host. The default value will be used if the target
host is not explicitly specified in the request URI (relative URIs). This
parameter expects a value of type <classname>HttpHost</classname>.</para>
</formalpara>
</listitem>
</itemizedlist>
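<para>The redirect-related parameters above can be set on the parameter collection of the
client. The following is a minimal sketch, assuming the constants of
<interfacename>ClientPNames</interfacename> map to the parameter names listed above; the
class name <classname>RedirectParamsSketch</classname> and the chosen values are
illustrative only.</para>
<programlisting><![CDATA[
import org.apache.http.HttpHost;
import org.apache.http.client.params.ClientPNames;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.params.HttpParams;

// Hypothetical example class, not part of HttpClient
public class RedirectParamsSketch {

    // Applies a stricter redirect policy to the given parameter collection
    static HttpParams configure(HttpParams params) {
        // 'http.protocol.handle-redirects'
        params.setBooleanParameter(ClientPNames.HANDLE_REDIRECTS, true);
        // 'http.protocol.reject-relative-redirect'
        params.setBooleanParameter(ClientPNames.REJECT_RELATIVE_REDIRECT, true);
        // 'http.protocol.max-redirects' (illustrative value)
        params.setIntParameter(ClientPNames.MAX_REDIRECTS, 10);
        // 'http.protocol.allow-circular-redirects'
        params.setBooleanParameter(ClientPNames.ALLOW_CIRCULAR_REDIRECTS, false);
        // 'http.default-host', used for requests with relative URIs
        params.setParameter(ClientPNames.DEFAULT_HOST,
                new HttpHost("localhost", 8080));
        return params;
    }

    public static void main(String[] args) {
        DefaultHttpClient httpclient = new DefaultHttpClient();
        try {
            HttpParams params = configure(httpclient.getParams());
            System.out.println("Max redirects: "
                    + params.getIntParameter(ClientPNames.MAX_REDIRECTS, 100));
        } finally {
            httpclient.getConnectionManager().shutdown();
        }
    }

}
]]></programlisting>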
</section>
<section>
<title>Automatic redirect handling</title>
<para>HttpClient handles all types of redirects automatically, except those explicitly
prohibited by the HTTP specification as requiring user intervention.
<literal>See Other</literal> (status code 303) redirects on <literal>POST</literal> and
<literal>PUT</literal> requests are converted to <literal>GET</literal> requests as
required by the HTTP specification.</para>
</section>
<section>
<title>HTTP client and execution context</title>
<para>The <classname>DefaultHttpClient</classname> treats HTTP requests as immutable objects
that are never supposed to change in the course of request execution. Instead, it
creates a private mutable copy of the original request object, whose properties can be
updated depending on the execution context. Therefore the final request properties such
as the target host and request URI can be determined by examining the content of the
local HTTP context after the request has been executed.</para>
<programlisting><![CDATA[
DefaultHttpClient httpclient = new DefaultHttpClient();

HttpContext localContext = new BasicHttpContext();
HttpGet httpget = new HttpGet("http://localhost:8080/");
HttpResponse response = httpclient.execute(httpget, localContext);

HttpHost target = (HttpHost) localContext.getAttribute(
        ExecutionContext.HTTP_TARGET_HOST);
HttpUriRequest req = (HttpUriRequest) localContext.getAttribute(
        ExecutionContext.HTTP_REQUEST);

System.out.println("Target host: " + target);
System.out.println("Final request URI: " + req.getURI());
System.out.println("Final request method: " + req.getMethod());
]]></programlisting>
</section>
</chapter>

View File

@ -62,5 +62,10 @@
<xi:include href="preface.xml"/>
<xi:include href="fundamentals.xml"/>
<xi:include href="connmgmt.xml"/>
<xi:include href="statemgmt.xml"/>
<xi:include href="authentication.xml"/>
<xi:include href="httpagent.xml"/>
<xi:include href="advanced.xml"/>
</book>

View File

@ -48,7 +48,7 @@
<listitem>
<para>
Client-side HTTP transport library based on <ulink
url="http://hc.apache.org/httpcomponents-core/index.html/">HttpCore</ulink>
url="http://hc.apache.org/httpcomponents-core/index.html">HttpCore</ulink>
</para>
</listitem>
<listitem>

392
src/docbkx/statemgmt.xml Normal file
View File

@ -0,0 +1,392 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE preface PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">
<!--
====================================================================
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
====================================================================
-->
<chapter>
<title>HTTP state management</title>
<para>Originally HTTP was designed as a stateless, request / response oriented protocol that
made no special provisions for stateful sessions spanning across several logically related
request / response exchanges. As the HTTP protocol grew in popularity and adoption, more and more
systems began to use it for applications it was never intended for, for instance as a
transport for e-commerce applications. Thus, the support for state management became a
necessity.</para>
<para>Netscape Communications, at that time a leading developer of web client and server
software, implemented support for HTTP state management in their products based on a
proprietary specification. Later, Netscape tried to standardise the mechanism by publishing
a specification draft. Those efforts contributed to the formal specification defined through
the RFC standard track. However, state management in a significant number of applications is
still largely based on the Netscape draft and is incompatible with the official
specification. All major developers of web browsers felt compelled to retain compatibility
with those applications, which greatly contributed to the fragmentation of standards
compliance.</para>
<section>
<title>HTTP cookies</title>
<para>An HTTP cookie is a token or short packet of state information that the HTTP agent and the
target server can exchange to maintain a session. Netscape engineers used to refer to it
as a "magic cookie" and the name stuck.</para>
<para>HttpClient uses the <interfacename>Cookie</interfacename> interface to represent an
abstract cookie token. In its simplest form an HTTP cookie is merely a name / value pair.
Usually an HTTP cookie also contains a number of attributes such as a version, a domain
for which it is valid, a path that specifies the subset of URLs on the origin server to
which this cookie applies, and the maximum period of time the cookie is valid for.</para>
<para><interfacename>SetCookie</interfacename> interface represents a
<literal>Set-Cookie</literal> response header sent by the origin server to the HTTP
agent in order to maintain a conversational state.
<interfacename>SetCookie2</interfacename> interface extends SetCookie with
<literal>Set-Cookie2</literal> specific methods.</para>
<para><interfacename>ClientCookie</interfacename> interface extends
<interfacename>Cookie</interfacename> interface with additional client specific
functionality such as the ability to retrieve original cookie attributes exactly as they were
specified by the origin server. This is important for generating the
<literal>Cookie</literal> header because some cookie specifications require that the
<literal>Cookie</literal> header should include certain attributes only if they were
specified in the <literal>Set-Cookie</literal> or <literal>Set-Cookie2</literal>
header.</para>
<section>
<title>Cookie versions</title>
<para>Cookies compatible with Netscape draft specification but non-compliant with the
official specification are considered to be of version 0. Standard compliant cookies
are expected to have version 1. HttpClient may handle cookies differently depending
on the version.</para>
<para>Here is an example of re-creating a Netscape cookie:</para>
<programlisting><![CDATA[
BasicClientCookie netscapeCookie = new BasicClientCookie("name", "value");
netscapeCookie.setVersion(0);
netscapeCookie.setDomain(".mycompany.com");
netscapeCookie.setPath("/");
]]></programlisting>
<para>Here is an example of re-creating a standard cookie. Please note that a standard
compliant cookie must retain all attributes as sent by the origin server:</para>
<programlisting><![CDATA[
BasicClientCookie stdCookie = new BasicClientCookie("name", "value");
stdCookie.setVersion(1);
stdCookie.setDomain(".mycompany.com");
stdCookie.setPath("/");
stdCookie.setSecure(true);
// Set attributes EXACTLY as sent by the server
stdCookie.setAttribute(ClientCookie.VERSION_ATTR, "1");
stdCookie.setAttribute(ClientCookie.DOMAIN_ATTR, ".mycompany.com");
]]></programlisting>
<para>Here is an example of re-creating a <literal>Set-Cookie2</literal> compliant
cookie. Please note that a standard compliant cookie must retain all attributes as
sent by the origin server:</para>
<programlisting><![CDATA[
BasicClientCookie2 stdCookie = new BasicClientCookie2("name", "value");
stdCookie.setVersion(1);
stdCookie.setDomain(".mycompany.com");
stdCookie.setPorts(new int[] {80,8080});
stdCookie.setPath("/");
stdCookie.setSecure(true);
// Set attributes EXACTLY as sent by the server
stdCookie.setAttribute(ClientCookie.VERSION_ATTR, "1");
stdCookie.setAttribute(ClientCookie.DOMAIN_ATTR, ".mycompany.com");
stdCookie.setAttribute(ClientCookie.PORT_ATTR, "80,8080");
]]></programlisting>
</section>
</section>
<section>
<title>Cookie specifications</title>
<para><interfacename>CookieSpec</interfacename> interface represents a cookie management
specification. Cookie management specification is expected to enforce:</para>
<itemizedlist>
<listitem>
<para>rules of parsing <literal>Set-Cookie</literal> and optionally
<literal>Set-Cookie2</literal> headers.</para>
</listitem>
<listitem>
<para>rules of validation of parsed cookies.</para>
</listitem>
<listitem>
<para>formatting of <literal>Cookie</literal> header for a given host, port and path
of origin.</para>
</listitem>
</itemizedlist>
<para>HttpClient ships with several <interfacename>CookieSpec</interfacename>
implementations:</para>
<itemizedlist>
<listitem>
<formalpara>
<title>Netscape draft:</title>
<para>This specification conforms to the original draft specification published
by Netscape Communications. It should be avoided unless absolutely necessary
for compatibility with legacy code.</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>RFC 2109:</title>
<para>Older version of the official HTTP state management specification
superseded by RFC 2965.</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>RFC 2965:</title>
<para>The official HTTP state management specification.</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>Browser compatibility:</title>
<para>This implementation strives to closely mimic the (mis)behavior of common web
browser applications such as Microsoft Internet Explorer and Mozilla
Firefox.</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>Best match:</title>
<para>'Meta' cookie specification that picks up a cookie policy based on the
format of cookies sent with the HTTP response. It basically aggregates all
above implementations into one class.</para>
</formalpara>
</listitem>
</itemizedlist>
<para>It is strongly recommended to use the <literal>Best Match</literal> policy and let
HttpClient pick up an appropriate compliance level at runtime based on the execution
context.</para>
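<para>The three responsibilities of a cookie specification listed above (parsing,
validation and formatting) can be exercised directly, without executing a request. The
following is a minimal sketch, assuming the <classname>BrowserCompatSpec</classname>
implementation; the class name <classname>CookieSpecSketch</classname>, the host name and
the header value are illustrative only.</para>
<programlisting><![CDATA[
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.apache.http.Header;
import org.apache.http.cookie.Cookie;
import org.apache.http.cookie.CookieOrigin;
import org.apache.http.cookie.CookieSpec;
import org.apache.http.cookie.MalformedCookieException;
import org.apache.http.impl.cookie.BrowserCompatSpec;
import org.apache.http.message.BasicHeader;

// Hypothetical example class, not part of HttpClient
public class CookieSpecSketch {

    // Parses a Set-Cookie header, validates the cookies against the origin
    // and formats the matching ones as Cookie request header(s)
    static List<Header> roundTrip(
            CookieSpec spec, Header setCookie, CookieOrigin origin)
            throws MalformedCookieException {
        List<Cookie> accepted = new ArrayList<Cookie>();
        for (Cookie cookie : spec.parse(setCookie, origin)) {
            // Throws MalformedCookieException if the cookie violates the spec
            spec.validate(cookie, origin);
            if (spec.match(cookie, origin)) {
                accepted.add(cookie);
            }
        }
        if (accepted.isEmpty()) {
            // Nothing to format
            return Collections.<Header>emptyList();
        }
        return spec.formatCookies(accepted);
    }

    public static void main(String[] args) throws MalformedCookieException {
        CookieSpec spec = new BrowserCompatSpec();
        // Host, port, path and security flag of the request in question
        CookieOrigin origin = new CookieOrigin("www.mycompany.com", 80, "/", false);
        Header setCookie = new BasicHeader(
                "Set-Cookie", "name=value; path=/; domain=.mycompany.com");
        for (Header header : roundTrip(spec, setCookie, origin)) {
            System.out.println(header);
        }
    }

}
]]></programlisting>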
</section>
<section>
<title>HTTP cookie and state management parameters</title>
<para>These are parameters that can be used to customize HTTP state management and behaviour of
individual cookie specifications:</para>
<itemizedlist>
<listitem>
<formalpara>
<title>'http.protocol.cookie-datepatterns':</title>
<para>defines valid date patterns to be used for parsing non-standard
<literal>expires</literal> attribute. Only required for compatibility
with non-compliant servers that still use <literal>expires</literal> defined
in the Netscape draft instead of the standard <literal>max-age</literal>
attribute. This parameter expects a value of type
<interfacename>java.util.Collection</interfacename>. The collection
elements must be of type <classname>java.lang.String</classname> compatible
with the syntax of <classname>java.text.SimpleDateFormat</classname>. If
this parameter is not set the choice of a default value is
<interfacename>CookieSpec</interfacename> implementation specific.
Please note this parameter applies</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>'http.protocol.single-cookie-header':</title>
<para>defines whether cookies should be forced into a single
<literal>Cookie</literal> request header. Otherwise, each cookie is
formatted as a separate <literal>Cookie</literal> header. This parameter
expects a value of type <classname>java.lang.Boolean</classname>. If this
parameter is not set the choice of a default value is CookieSpec
implementation specific. Please note this parameter applies to strict cookie
specifications (RFC 2109 and RFC 2965) only. Browser compatibility and
netscape draft policies will always put all cookies into one request
header.</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>'http.protocol.cookie-policy':</title>
<para>defines the name of a cookie specification to be used for HTTP state
management. This parameter expects a value of type
<classname>java.lang.String</classname>. If this parameter is not set
the <literal>best-match</literal> policy will be used.</para>
</formalpara>
</listitem>
</itemizedlist>
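<para>The parameters above can be combined in a single parameter collection. The following is
a minimal sketch, assuming the constants of <interfacename>CookieSpecPNames</interfacename>
and <interfacename>ClientPNames</interfacename> map to the parameter names listed above; the
class name <classname>CookieParamsSketch</classname> and the date patterns are illustrative
only.</para>
<programlisting><![CDATA[
import java.util.Arrays;

import org.apache.http.client.params.ClientPNames;
import org.apache.http.client.params.CookiePolicy;
import org.apache.http.cookie.params.CookieSpecPNames;
import org.apache.http.params.BasicHttpParams;
import org.apache.http.params.HttpParams;

// Hypothetical example class, not part of HttpClient
public class CookieParamsSketch {

    static HttpParams cookieParams() {
        HttpParams params = new BasicHttpParams();
        // 'http.protocol.cookie-policy'
        params.setParameter(ClientPNames.COOKIE_POLICY, CookiePolicy.RFC_2965);
        // 'http.protocol.single-cookie-header'
        params.setBooleanParameter(CookieSpecPNames.SINGLE_COOKIE_HEADER, true);
        // 'http.protocol.cookie-datepatterns', in SimpleDateFormat syntax
        // (illustrative patterns)
        params.setParameter(CookieSpecPNames.DATE_PATTERNS, Arrays.asList(
                "EEE, dd MMM yyyy HH:mm:ss zzz",
                "EEE, dd-MMM-yy HH:mm:ss zzz"));
        return params;
    }

    public static void main(String[] args) {
        HttpParams params = cookieParams();
        System.out.println("Cookie policy: "
                + params.getParameter(ClientPNames.COOKIE_POLICY));
    }

}
]]></programlisting>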
</section>
<section>
<title>Cookie specification registry</title>
<para>HttpClient maintains a registry of available cookie specifications using
<classname>CookieSpecRegistry</classname> class. The following specifications are
registered per default:</para>
<itemizedlist>
<listitem>
<formalpara>
<title>compatibility:</title>
<para> Browser compatibility (lenient policy).</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>netscape:</title>
<para>Netscape draft.</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>rfc2109:</title>
<para>RFC 2109 (outdated strict policy).</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>rfc2965:</title>
<para>RFC 2965 (standard conformant strict policy).</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>best-match:</title>
<para>Best match meta-policy.</para>
</formalpara>
</listitem>
</itemizedlist>
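<para>The content of the registry can be inspected at runtime. The following is a minimal
sketch, assuming the registry of a freshly created
<classname>DefaultHttpClient</classname>; the class name
<classname>CookieSpecRegistrySketch</classname> is illustrative only.</para>
<programlisting><![CDATA[
import java.util.List;

import org.apache.http.client.params.CookiePolicy;
import org.apache.http.cookie.CookieSpec;
import org.apache.http.cookie.CookieSpecRegistry;
import org.apache.http.impl.client.DefaultHttpClient;

// Hypothetical example class, not part of HttpClient
public class CookieSpecRegistrySketch {

    // Returns the names of the cookie specifications registered per default
    static List<String> registeredNames() {
        DefaultHttpClient httpclient = new DefaultHttpClient();
        try {
            return httpclient.getCookieSpecs().getSpecNames();
        } finally {
            httpclient.getConnectionManager().shutdown();
        }
    }

    public static void main(String[] args) {
        for (String name : registeredNames()) {
            System.out.println("Registered cookie spec: " + name);
        }
        DefaultHttpClient httpclient = new DefaultHttpClient();
        try {
            CookieSpecRegistry registry = httpclient.getCookieSpecs();
            // Look up a specification by its registered name
            CookieSpec spec = registry.getCookieSpec(
                    CookiePolicy.BEST_MATCH, httpclient.getParams());
            System.out.println("Best match implementation: "
                    + spec.getClass().getName());
        } finally {
            httpclient.getConnectionManager().shutdown();
        }
    }

}
]]></programlisting>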
</section>
<section>
<title>Choosing cookie policy</title>
<para>The cookie policy can be set at the HTTP client level and overridden at the HTTP
request level if required.</para>
<programlisting><![CDATA[
HttpClient httpclient = new DefaultHttpClient();
// force strict cookie policy per default
httpclient.getParams().setParameter(
        ClientPNames.COOKIE_POLICY, CookiePolicy.RFC_2965);

HttpGet httpget = new HttpGet("http://www.broken-server.com/");
// Override the default policy for this request
httpget.getParams().setParameter(
        ClientPNames.COOKIE_POLICY, CookiePolicy.BROWSER_COMPATIBILITY);
]]></programlisting>
</section>
<section>
<title>Custom cookie policy</title>
<para>In order to implement a custom cookie policy one should create a custom implementation
of <interfacename>CookieSpec</interfacename> interface, create a
<interfacename>CookieSpecFactory</interfacename> implementation to create and
initialize instances of the custom specification and register the factory with
HttpClient. Once the custom specification has been registered, it can be activated the
same way as the standard cookie specifications.</para>
<programlisting><![CDATA[
CookieSpecFactory csf = new CookieSpecFactory() {
    public CookieSpec newInstance(HttpParams params) {
        return new BrowserCompatSpec() {
            @Override
            public void validate(Cookie cookie, CookieOrigin origin)
                    throws MalformedCookieException {
                // Oh, I am easy
            }
        };
    }
};

DefaultHttpClient httpclient = new DefaultHttpClient();
httpclient.getCookieSpecs().register("easy", csf);
httpclient.getParams().setParameter(
        ClientPNames.COOKIE_POLICY, "easy");
]]></programlisting>
</section>
<section>
<title>Cookie persistence</title>
<para>HttpClient can work with any physical representation of a persistent cookie store that
implements the <interfacename>CookieStore</interfacename> interface. The default
<interfacename>CookieStore</interfacename> implementation called
<classname>BasicCookieStore</classname> is a simple implementation backed by a
<classname>java.util.ArrayList</classname>. Cookies stored in a
<classname>BasicCookieStore</classname> object are lost when the container object
gets garbage collected. Users can provide more complex implementations if
necessary.</para>
<programlisting><![CDATA[
DefaultHttpClient httpclient = new DefaultHttpClient();
// Create a local instance of cookie store
CookieStore cookieStore = new MyCookieStore();
// Populate cookies if needed
BasicClientCookie cookie = new BasicClientCookie("name", "value");
cookie.setVersion(0);
cookie.setDomain(".mycompany.com");
cookie.setPath("/");
cookieStore.addCookie(cookie);
// Set the store
httpclient.setCookieStore(cookieStore);
]]></programlisting>
</section>
<section>
<title>HTTP state management and execution context</title>
<para>In the course of HTTP request execution HttpClient adds the following state management
related objects to the execution context:</para>
<itemizedlist>
<listitem>
<formalpara>
<title>'http.cookiespec-registry':</title>
<para><classname>CookieSpecRegistry</classname> instance representing the actual
cookie specification registry. The value of this attribute set in the local
context takes precedence over the default one.</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>'http.cookie-spec':</title>
<para><interfacename>CookieSpec</interfacename> instance representing the actual
cookie specification.</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>'http.cookie-origin':</title>
<para><classname>CookieOrigin</classname> instance representing the actual
details of the origin server.</para>
</formalpara>
</listitem>
<listitem>
<formalpara>
<title>'http.cookie-store':</title>
<para><interfacename>CookieStore</interfacename> instance representing the actual
cookie store. The value of this attribute set in the local context takes
precedence over the default one.</para>
</formalpara>
</listitem>
</itemizedlist>
<para>The local <interfacename>HttpContext</interfacename> object can be used to customize
the HTTP state management context prior to request execution or examine its state after
the request has been executed:</para>
<programlisting><![CDATA[
HttpClient httpclient = new DefaultHttpClient();
HttpContext localContext = new BasicHttpContext();
HttpGet httpget = new HttpGet("http://localhost:8080/");
HttpResponse response = httpclient.execute(httpget, localContext);

CookieOrigin cookieOrigin = (CookieOrigin) localContext.getAttribute(
        ClientContext.COOKIE_ORIGIN);
System.out.println("Cookie origin: " + cookieOrigin);
CookieSpec cookieSpec = (CookieSpec) localContext.getAttribute(
        ClientContext.COOKIE_SPEC);
System.out.println("Cookie spec used: " + cookieSpec);
]]></programlisting>
</section>
<section>
<title>Per user / thread state management</title>
<para>One can use an individual local execution context in order to implement per user (or
per thread) state management. Cookie specification registry and cookie store defined in
the local context will take precedence over the default ones set at the HTTP client
level.</para>
<programlisting><![CDATA[
HttpClient httpclient = new DefaultHttpClient();
// Create a local instance of cookie store
CookieStore cookieStore = new BasicCookieStore();
// Create local HTTP context
HttpContext localContext = new BasicHttpContext();
// Bind custom cookie store to the local context
localContext.setAttribute(ClientContext.COOKIE_STORE, cookieStore);
HttpGet httpget = new HttpGet("http://www.google.com/");
// Pass local context as a parameter
HttpResponse response = httpclient.execute(httpget, localContext);
]]></programlisting>
</section>
</chapter>