Improved docbook for caching module.
git-svn-id: https://svn.apache.org/repos/asf/httpcomponents/httpclient/trunk@1057314 13f79535-47bb-0310-9956-ffa450edef68
@@ -27,11 +27,29 @@
<section id="generalconcepts">
<title>General Concepts</title>

<para>HttpClient Cache provides an HTTP 1.1 compliant caching layer to be
used with HttpClient. It is implemented as a decorator of HttpClient. It
provides basic HTTP 1.1 caching capability. You can specify a limit on the
maximum cacheable object size to have some control over the size of your
cache.</para>
<para>HttpClient Cache provides an HTTP/1.1-compliant caching layer to be
used with HttpClient--the Java equivalent of a browser cache. The
implementation follows the Decorator design pattern, where the
CachingHttpClient class is a drop-in replacement for
a DefaultHttpClient; requests that can be satisfied entirely from the cache
will not result in actual origin requests. Stale cache entries are
automatically validated with the origin where possible, using conditional GETs
and the If-Modified-Since and/or If-None-Match request headers.
</para>
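<para>As a quick illustration, a minimal sketch of wrapping an existing
backend client might look like the following. The URL and configuration values
are placeholders, and the CacheConfig setter names should be verified against
the module's javadocs:</para>
<programlisting><![CDATA[
CacheConfig cacheConfig = new CacheConfig();
cacheConfig.setMaxCacheEntries(1000);
cacheConfig.setMaxObjectSizeBytes(8192);

// CachingHttpClient decorates the backend client and can be used
// anywhere an HttpClient is expected.
HttpClient cachingClient = new CachingHttpClient(new DefaultHttpClient(), cacheConfig);

HttpGet httpget = new HttpGet("http://www.mydomain.com/content/");
HttpResponse response = cachingClient.execute(httpget);
EntityUtils.consume(response.getEntity());
]]>
</programlisting>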

<para>
HTTP/1.1 caching in general is designed to be <emphasis>semantically
transparent</emphasis>; that is, a cache should not change the meaning of
the request-response exchange between client and server. As such, it should
be safe to drop a CachingHttpClient into an existing compliant client-server
relationship. Although the caching module is part of the client from an
HTTP protocol point of view, the implementation aims to be compatible with
the requirements placed on a transparent caching proxy.
</para>

<para>Finally, CachingHttpClient includes support for the Cache-Control
extensions specified by RFC 5861 (stale-if-error and stale-while-revalidate).
</para>

<para>When CachingHttpClient executes a request, it goes through the
following flow:</para>
@@ -77,7 +95,7 @@

<orderedlist>
<listitem>
<para>Examing the response for protocol compliance</para>
<para>Examining the response for protocol compliance</para>
</listitem>

<listitem>
@@ -105,12 +123,17 @@
<section id="rfc2616compliance">
<title>RFC-2616 Compliance</title>

<para>HttpClient Cache makes an effort to be at least conditionally
compliant with <ulink
<para>HttpClient Cache makes an effort to be at least <emphasis>conditionally
compliant</emphasis> with <ulink
url="http://www.ietf.org/rfc/rfc2616.txt">RFC-2616</ulink>. That is,
wherever the specification indicates MUST or MUST NOT for HTTP caches, the
caching layer attempts to behave in a way that satisfies those
requirements.</para>
requirements. This means the caching module won't produce incorrect
behavior when you drop it in. At the same time, the project is continuing
to work on unconditional compliance, which would add compliance with all the
SHOULDs and SHOULD NOTs, many of which we already comply with. We just can't
claim fully unconditional compliance until we satisfy <emphasis>all</emphasis>
of them.</para>
</section>

<section>

@@ -155,4 +178,71 @@ case VALIDATED:
]]>
</programlisting>
</section>

<section id="configuration">
<title>Configuration</title>

<para>As the CachingHttpClient is a decorator, much of the configuration you may
want to do can be done on the HttpClient used as the "backend" by the
CachingHttpClient (this includes setting options like timeouts and connection
pool sizes). For caching-specific configuration, you can provide a CacheConfig
instance to customize behavior across the following areas:</para>

<para><emphasis>Cache size.</emphasis> If the backend storage supports these limits,
you can specify the maximum number of cache entries as well as the maximum cacheable
response body size.</para>
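<para>For example, a sketch of setting those two limits; the setter names are
assumptions to be checked against the CacheConfig javadocs, and the values are
arbitrary:</para>
<programlisting><![CDATA[
CacheConfig cacheConfig = new CacheConfig();
// Keep at most 1000 entries, and only cache response bodies up to 8 KB.
cacheConfig.setMaxCacheEntries(1000);
cacheConfig.setMaxObjectSizeBytes(8192);
]]>
</programlisting>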

<para><emphasis>Public/private caching.</emphasis> By default, the caching module
considers itself to be a shared (public) cache, and will not, for example, cache
responses to requests with Authorization headers or responses marked with
"Cache-Control: private". If, however, the cache is only going to be used by one
logical "user" (behaving similarly to a browser cache), then you will want to turn
off the shared cache setting.</para>
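<para>A sketch of that configuration, assuming a setSharedCache setter along
the lines described in the CacheConfig javadocs:</para>
<programlisting><![CDATA[
// Behave like a private (browser-style) cache for a single logical user.
cacheConfig.setSharedCache(false);
]]>
</programlisting>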

<para><emphasis>Heuristic caching.</emphasis> Per RFC 2616, a cache MAY cache
certain cache entries even if no explicit cache control headers are set by the
origin. This behavior is off by default, but you may want to turn it on if you
are working with an origin that doesn't set proper headers but where you still
want to cache the responses. To do so, enable heuristic caching and then
specify a default freshness lifetime and/or a fraction of the time since
the resource was last modified. See Sections 13.2.2 and 13.2.4 of the HTTP/1.1
RFC for more details on heuristic caching.</para>
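<para>A hedged sketch, assuming heuristic-related setters along these lines
exist on CacheConfig (verify the exact names and units in the javadocs):</para>
<programlisting><![CDATA[
cacheConfig.setHeuristicCachingEnabled(true);
// Fall back to a one-day freshness lifetime when the origin gives no hints...
cacheConfig.setHeuristicDefaultLifetime(86400);
// ...or consider a response fresh for 10% of the time since it was last modified.
cacheConfig.setHeuristicCoefficient(0.1f);
]]>
</programlisting>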

<para><emphasis>Background validation.</emphasis> The cache module supports the
stale-while-revalidate directive of RFC 5861, which allows certain cache entry
revalidations to happen in the background. You may want to tweak the settings
for the minimum and maximum number of background worker threads, as well as the
maximum time they can be idle before being reclaimed. You can also control the
size of the queue used for revalidations when there aren't enough workers to
keep up with demand.</para>
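<para>A sketch of tuning the background revalidation settings; the setter names
below are assumptions based on the options just described, so verify them
against the CacheConfig javadocs before use:</para>
<programlisting><![CDATA[
cacheConfig.setAsynchronousWorkersCore(1);              // minimum worker threads
cacheConfig.setAsynchronousWorkersMax(4);               // maximum worker threads
cacheConfig.setAsynchronousWorkerIdleLifetimeSecs(60);  // idle time before reclaim
cacheConfig.setRevalidationQueueSize(100);              // pending revalidations
]]>
</programlisting>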
</section>

<section id="storage">
<title>Storage Backends</title>

<para>The default implementation of CachingHttpClient stores cache entries and
cached response bodies in memory in the JVM of your application. While this
offers high performance, it may not be appropriate for your application due to
the limitation on size or because the cache entries are ephemeral and don't
survive an application restart. The current release includes support for storing
cache entries using Ehcache and memcached implementations, which allow for
spilling cache entries to disk or storing them in an external process.</para>
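<para>As a rough sketch of wiring up the Ehcache backend; the
EhcacheHttpCacheStorage class and constructor are assumed from the optional
Ehcache support module, and the cache name is arbitrary:</para>
<programlisting><![CDATA[
CacheManager cacheManager = CacheManager.create();
cacheManager.addCache("httpCache");
Ehcache ehcache = cacheManager.getEhcache("httpCache");

HttpCacheStorage ehcacheStorage = new EhcacheHttpCacheStorage(ehcache);
HttpClient cachingClient =
    new CachingHttpClient(new DefaultHttpClient(), ehcacheStorage, cacheConfig);
]]>
</programlisting>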

<para>If none of those options are suitable for your application, it is
possible to provide your own storage backend by implementing the HttpCacheStorage
interface and then supplying that to CachingHttpClient at construction time. In
this case, the cache entries will be stored using your scheme but you will get to
reuse all of the logic surrounding HTTP/1.1 compliance and cache handling.
Generally speaking, it should be possible to create an HttpCacheStorage
implementation out of anything that supports a key/value store (similar to the
Java Map interface) with the ability to apply atomic updates.</para>
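<para>For illustration, a minimal in-memory backend built on a ConcurrentHashMap
might look like the sketch below. The four operations mirror the put/get/remove/update
behavior described above; check the HttpCacheStorage javadocs for the exact method
signatures and declared exceptions before relying on this shape:</para>
<programlisting><![CDATA[
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.http.client.cache.HttpCacheEntry;
import org.apache.http.client.cache.HttpCacheStorage;
import org.apache.http.client.cache.HttpCacheUpdateCallback;

public class MapCacheStorage implements HttpCacheStorage {

    private final Map<String, HttpCacheEntry> store =
        new ConcurrentHashMap<String, HttpCacheEntry>();

    public void putEntry(String key, HttpCacheEntry entry) throws IOException {
        store.put(key, entry);
    }

    public HttpCacheEntry getEntry(String key) throws IOException {
        return store.get(key);
    }

    public void removeEntry(String key) throws IOException {
        store.remove(key);
    }

    public synchronized void updateEntry(String key, HttpCacheUpdateCallback callback)
            throws IOException {
        // Atomic read-modify-write; a distributed backend would use
        // compare-and-swap or an equivalent mechanism instead.
        HttpCacheEntry updated = callback.update(store.get(key));
        if (updated == null) {
            store.remove(key);
        } else {
            store.put(key, updated);
        }
    }
}
]]>
</programlisting>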

<para>Finally, because the CachingHttpClient is a decorator for HttpClient,
it's entirely possible to set up a multi-tier caching hierarchy; for example,
wrapping an in-memory CachingHttpClient around one that stores cache entries on
disk or remotely in memcached, following a pattern similar to virtual memory,
L1/L2 processor caches, etc.
</para>
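<para>A hypothetical two-tier setup along those lines; the
MemcachedHttpCacheStorage class and the spymemcached MemcachedClient shown here
are assumptions that depend on the optional memcached support being on your
classpath:</para>
<programlisting><![CDATA[
CacheConfig l1Config = new CacheConfig();   // small, fast in-memory tier
CacheConfig l2Config = new CacheConfig();   // larger, shared memcached tier

MemcachedClient mc = new MemcachedClient(new InetSocketAddress("localhost", 11211));
HttpClient l2Cache = new CachingHttpClient(new DefaultHttpClient(),
        new MemcachedHttpCacheStorage(mc), l2Config);

// Wrap the remote tier with an in-memory tier, like an L1/L2 cache hierarchy.
HttpClient client = new CachingHttpClient(l2Cache, l1Config);
]]>
</programlisting>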
</section>
</chapter>