Improved docbook for caching module.

git-svn-id: https://svn.apache.org/repos/asf/httpcomponents/httpclient/trunk@1057314 13f79535-47bb-0310-9956-ffa450edef68
Jonathan Moore 2011-01-10 19:02:33 +00:00
parent 6e1d7a0ad3
commit 438411d60d
1 changed files with 99 additions and 9 deletions


@ -27,11 +27,29 @@
<section id="generalconcepts">
<title>General Concepts</title>
<para>HttpClient Cache provides an HTTP 1.1 compliant caching layer to be
used with HttpClient. It is implemented as a decorator of HttpClient. It
provides basic HTTP 1.1 caching capability. You can specify a limit on the
maximum cacheable object size to have some control over the size of your
cache.</para>
<para>HttpClient Cache provides an HTTP/1.1-compliant caching layer to be
used with HttpClient--the Java equivalent of a browser cache. The
implementation follows the Decorator design pattern, where the
CachingHttpClient class is a drop-in replacement for
a DefaultHttpClient; requests that can be satisfied entirely from the cache
will not result in actual origin requests. Stale cache entries are
automatically validated with the origin where possible, using conditional GETs
and the If-Modified-Since and/or If-None-Match request headers.
</para>
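<para>The decorator arrangement described above can be sketched in a few lines.
This is a minimal illustration only: the <literal>Fetcher</literal> interface and
<literal>CachingFetcher</literal> class are hypothetical stand-ins, not part of
HttpClient, and the sketch leaves out freshness checks and conditional
revalidation entirely.</para>

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for an HTTP client: maps a URI to a response body.
interface Fetcher {
    String fetch(String uri);
}

// Decorator: exposes the same interface as the backend it wraps,
// so callers can swap it in without changing their code.
class CachingFetcher implements Fetcher {
    private final Fetcher backend;
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    CachingFetcher(Fetcher backend) {
        this.backend = backend;
    }

    @Override
    public String fetch(String uri) {
        // Only a cache miss results in a call to the backend (the "origin").
        return cache.computeIfAbsent(uri, backend::fetch);
    }
}
```

<para>Because <literal>CachingFetcher</literal> is itself a <literal>Fetcher</literal>,
code written against the plain interface keeps working when the caching layer
is dropped in.</para>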
<para>
HTTP/1.1 caching in general is designed to be <emphasis>semantically
transparent</emphasis>; that is, a cache should not change the meaning of
the request-response exchange between client and server. As such, it should
be safe to drop a CachingHttpClient into an existing compliant client-server
relationship. Although the caching module is part of the client from an
HTTP protocol point of view, the implementation aims to be compatible with
the requirements placed on a transparent caching proxy.
</para>
<para>Finally, CachingHttpClient includes support for the Cache-Control
extensions specified by RFC 5861 (stale-if-error and stale-while-revalidate).
</para>
<para>When CachingHttpClient executes a request, it goes through the
following flow:</para>
@ -77,7 +95,7 @@
<orderedlist>
<listitem>
<para>Examing the response for protocol compliance</para>
<para>Examining the response for protocol compliance</para>
</listitem>
<listitem>
@ -105,12 +123,17 @@
<section id="rfc2616compliance">
<title>RFC-2616 Compliance</title>
<para>HttpClient Cache makes an effort to be at least conditionally
compliant with <ulink
<para>HttpClient Cache makes an effort to be at least <emphasis>conditionally
compliant</emphasis> with <ulink
url="http://www.ietf.org/rfc/rfc2616.txt">RFC-2616</ulink>. That is,
wherever the specification indicates MUST or MUST NOT for HTTP caches, the
caching layer attempts to behave in a way that satisfies those
requirements.</para>
requirements. This means the caching module won't produce incorrect
behavior when you drop it in. At the same time, the project is continuing
to work on unconditional compliance, which would add compliance with all the
SHOULDs and SHOULD NOTs, many of which we already comply with. We just can't
claim fully unconditional compliance until we satisfy <emphasis>all</emphasis>
of them.</para>
</section>
<section>
@ -155,4 +178,71 @@ case VALIDATED:
]]>
</programlisting>
</section>
<section id="configuration">
<title>Configuration</title>
<para>As the CachingHttpClient is a decorator, much of the configuration you may
want to do can be done on the HttpClient used as the "backend" by the
CachingHttpClient (this includes setting options like timeouts and connection
pool sizes). For caching-specific configuration, you can provide a CacheConfig
instance to customize behavior across the following areas:</para>
<para><emphasis>Cache size.</emphasis> If the backend storage supports these limits,
you can specify the maximum number of cache entries as well as the maximum cacheable
response body size.</para>
<para><emphasis>Public/private caching.</emphasis> By default, the caching module
considers itself to be a shared (public) cache, and will not, for example, cache
responses to requests with Authorization headers or responses marked with
"Cache-Control: private". If, however, the cache is only going to be used by one
logical "user" (behaving similarly to a browser cache), then you will want to turn
off the shared cache setting.</para>
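<para>As a rough illustration of the shared-cache rule described above, the
following sketch checks only the two conditions mentioned in the text. The
<literal>SharedCachePolicy</literal> class and its method are hypothetical and
are not the module's actual logic; other rules that affect cacheability (such
as no-store, or the directives that re-enable caching of authorized
responses) are deliberately ignored, and the header parsing is naive.</para>

```java
import java.util.Locale;

class SharedCachePolicy {
    // Returns false when the shared-cache restrictions described in the text
    // forbid storing the response; returns true otherwise.
    static boolean sharedCacheAllows(boolean sharedCache,
                                     boolean requestHasAuthorization,
                                     String cacheControl) {
        if (!sharedCache) {
            return true; // a private (single-user) cache is not restricted here
        }
        if (requestHasAuthorization) {
            return false;
        }
        String header = cacheControl == null ? "" : cacheControl;
        // Naive split on commas; quoted values containing commas are not handled.
        for (String directive : header.split(",")) {
            String name = directive.trim().toLowerCase(Locale.ROOT);
            int eq = name.indexOf('=');
            if (eq >= 0) {
                name = name.substring(0, eq).trim();
            }
            if (name.equals("private")) {
                return false;
            }
        }
        return true;
    }
}
```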
<para><emphasis>Heuristic caching.</emphasis> Per RFC2616, a cache MAY cache
certain responses even if no explicit cache control headers are set by the
origin. This behavior is off by default, but you may want to turn it on if you
are working with an origin that doesn't set proper headers but where you still
want to cache the responses. To do so, enable heuristic caching, then specify
a default freshness lifetime, a fraction of the time since the resource was
last modified, or both. See Sections 13.2.2 and 13.2.4 of the HTTP/1.1
RFC for more details on heuristic caching.</para>
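<para>To make the two heuristic settings concrete, here is a small sketch of how
a freshness lifetime could be derived from them. The
<literal>HeuristicFreshness</literal> class and its parameter names are
assumptions made for illustration, not the module's API. Section 13.2.4 of
RFC 2616 mentions 10% as a typical value for the fraction.</para>

```java
import java.time.Duration;
import java.time.Instant;

class HeuristicFreshness {
    // Uses coefficient * (Date - Last-Modified) when both timestamps are usable,
    // and falls back to the configured default lifetime otherwise.
    static Duration lifetime(Instant date, Instant lastModified,
                             double coefficient, Duration defaultLifetime) {
        if (date != null && lastModified != null && lastModified.isBefore(date)) {
            long sinceModifiedMillis = Duration.between(lastModified, date).toMillis();
            return Duration.ofMillis(Math.round(sinceModifiedMillis * coefficient));
        }
        return defaultLifetime;
    }
}
```

<para>With a fraction of 0.1, a response whose Last-Modified is ten hours older
than its Date would be treated as fresh for one hour.</para>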
<para><emphasis>Background validation.</emphasis> The cache module supports the
stale-while-revalidate directive of RFC5861, which allows certain cache entry
revalidations to happen in the background. You may want to tweak the settings
for the minimum and maximum number of background worker threads, as well as the
maximum time they can be idle before being reclaimed. You can also control the
size of the queue used for revalidations when there aren't enough workers to
keep up with demand.</para>
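<para>The four knobs mentioned above (minimum workers, maximum workers, idle
time, and queue size) map naturally onto a standard
<literal>java.util.concurrent.ThreadPoolExecutor</literal>. The sketch below
shows that mapping under the assumption that a plain bounded executor is used;
the <literal>RevalidationPool</literal> class is hypothetical and is not a claim
about how the module is implemented internally.</para>

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class RevalidationPool {
    // minWorkers  -> core pool size
    // maxWorkers  -> maximum pool size
    // idleSeconds -> how long extra workers may sit idle before being reclaimed
    // queueSize   -> bounded queue of pending revalidations
    static ThreadPoolExecutor create(int minWorkers, int maxWorkers,
                                     long idleSeconds, int queueSize) {
        return new ThreadPoolExecutor(
                minWorkers, maxWorkers,
                idleSeconds, TimeUnit.SECONDS,
                new ArrayBlockingQueue<Runnable>(queueSize));
    }
}
```

<para>With this kind of executor, tasks are queued once the core workers are
busy, extra workers are started only when the queue is full, and submissions
are rejected when both the queue and the maximum worker count are
exhausted.</para>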
</section>
<section id="storage">
<title>Storage Backends</title>
<para>The default implementation of CachingHttpClient stores cache entries and
cached response bodies in memory in the JVM of your application. While this
offers high performance, it may not be appropriate for your application due to
the limitation on size or because the cache entries are ephemeral and don't
survive an application restart. The current release includes support for storing
cache entries using Ehcache and memcached implementations, which allow for
spilling cache entries to disk or storing them in an external process.</para>
<para>If none of those options are suitable for your application, it is
possible to provide your own storage backend by implementing the HttpCacheStorage
interface and then supplying that to CachingHttpClient at construction time. In
this case, the cache entries will be stored using your scheme but you will get to
reuse all of the logic surrounding HTTP/1.1 compliance and cache handling.
Generally speaking, it should be possible to create an HttpCacheStorage
implementation out of anything that supports a key/value store (similar to the
Java Map interface) with the ability to apply atomic updates.</para>
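<para>To illustrate what "a key/value store with atomic updates" means in
practice, here is a minimal in-memory sketch. The
<literal>SimpleCacheStorage</literal> interface is a simplified, hypothetical
shape invented for this example; the real HttpCacheStorage interface has its
own method names and entry types, which you should take from the API
documentation.</para>

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.UnaryOperator;

// Hypothetical, simplified storage contract (not the real HttpCacheStorage).
interface SimpleCacheStorage<V> {
    V get(String key);
    void put(String key, V entry);
    void remove(String key);
    // Applies the updater to the current entry (null if absent) atomically.
    void update(String key, UnaryOperator<V> updater);
}

class MapCacheStorage<V> implements SimpleCacheStorage<V> {
    private final ConcurrentMap<String, V> entries = new ConcurrentHashMap<>();

    @Override
    public V get(String key) {
        return entries.get(key);
    }

    @Override
    public void put(String key, V entry) {
        entries.put(key, entry);
    }

    @Override
    public void remove(String key) {
        entries.remove(key);
    }

    @Override
    public void update(String key, UnaryOperator<V> updater) {
        // compute() runs the updater atomically for this key;
        // a null result removes the mapping.
        entries.compute(key, (k, existing) -> updater.apply(existing));
    }
}
```

<para>A backend built on an external store would need an equivalent primitive,
such as a compare-and-swap operation, to provide the same guarantee.</para>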
<para>Finally, because the CachingHttpClient is a decorator for HttpClient,
it's entirely possible to set up a multi-tier caching hierarchy; for example,
wrapping an in-memory CachingHttpClient around one that stores cache entries on
disk or remotely in memcached, following a pattern similar to virtual memory,
L1/L2 processor caches, etc.
</para>
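<para>The tiering idea can be sketched by stacking two instances of the same
caching wrapper. The <literal>Tier</literal> class below is a hypothetical,
single-threaded illustration that uses plain maps for both levels; in a real
setup the outer tier would be an in-memory CachingHttpClient and the inner one
would use a disk-backed or remote storage backend.</para>

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// One cache level: answers from its own store, else asks the next level.
// Not thread-safe; for illustration only.
class Tier implements Function<String, String> {
    private final Map<String, String> store = new HashMap<>();
    private final Function<String, String> next;
    int misses = 0;

    Tier(Function<String, String> next) {
        this.next = next;
    }

    @Override
    public String apply(String uri) {
        String cached = store.get(uri);
        if (cached == null) {
            misses++;
            cached = next.apply(uri); // fall through to the slower level
            store.put(uri, cached);
        }
        return cached;
    }
}
```

<para>For example, with <literal>Tier l2 = new Tier(origin)</literal> and
<literal>Tier l1 = new Tier(l2)</literal>, a fresh <literal>l1</literal> created
after a restart can still be filled from <literal>l2</literal> without going
back to the origin.</para>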
</section>
</chapter>