From 438411d60d919a8ab02db5a9a9947b9eb452238d Mon Sep 17 00:00:00 2001
From: Jonathan Moore
Date: Mon, 10 Jan 2011 19:02:33 +0000
Subject: [PATCH] Improved docbook for caching module.

git-svn-id: https://svn.apache.org/repos/asf/httpcomponents/httpclient/trunk@1057314 13f79535-47bb-0310-9956-ffa450edef68
---
 src/docbkx/caching.xml | 108 +++++++++++++++++++++++++++++++++++++----
 1 file changed, 99 insertions(+), 9 deletions(-)

diff --git a/src/docbkx/caching.xml b/src/docbkx/caching.xml
index f04a40f51..90a81ea35 100644
--- a/src/docbkx/caching.xml
+++ b/src/docbkx/caching.xml
@@ -27,11 +27,29 @@
General Concepts - HttpClient Cache provides an HTTP 1.1 compliant caching layer to be - used with HttpClient. It is implemented as a decorator of HttpClient. It - provides basic HTTP 1.1 caching capability. You can specify a limit on the - maximum cacheable object size to have some control over the size of your - cache. + HttpClient Cache provides an HTTP/1.1-compliant caching layer to be + used with HttpClient--the Java equivalent of a browser cache. The + implementation follows the Decorator design pattern, where the + CachingHttpClient class is a drop-in replacement for + a DefaultHttpClient; requests that can be satisfied entirely from the cache + will not result in actual origin requests. Stale cache entries are + automatically validated with the origin where possible, using conditional GETs + and the If-Modified-Since and/or If-None-Match request headers. + + + + HTTP/1.1 caching in general is designed to be semantically + transparent; that is, a cache should not change the meaning of + the request-response exchange between client and server. As such, it should + be safe to drop a CachingHttpClient into an existing compliant client-server + relationship. Although the caching module is part of the client from an + HTTP protocol point of view, the implementation aims to be compatible with + the requirements placed on a transparent caching proxy. + + + Finally, CachingHttpClient includes support for the Cache-Control + extensions specified by RFC 5861 (stale-if-error and stale-while-revalidate). + When CachingHttpClient executes a request, it goes through the following flow: @@ -77,7 +95,7 @@ - Examing the response for protocol compliance + Examining the response for protocol compliance @@ -105,12 +123,17 @@
RFC-2616 Compliance - HttpClient Cache makes an effort to be at least conditionally - compliant with HttpClient Cache makes an effort to be at least conditionally + compliant with RFC-2616. That is, wherever the specification indicates MUST or MUST NOT for HTTP caches, the caching layer attempts to behave in a way that satisfies those - requirements. + requirements. This means the caching module won't produce incorrect + behavior when you drop it in. At the same time, the project is continuing + to work on unconditional compliance, which would add compliance with all the + SHOULDs and SHOULD NOTs, many of which we already comply with. We just can't + claim fully unconditional compliance until we satisfy all + of them.
@@ -155,4 +178,71 @@ case VALIDATED: ]]>
+ +
+ Configuration + + As the CachingHttpClient is a decorator, much of the configuration you may + want to do can be done on the HttpClient used as the "backend" by the CachingHttpClient + (this includes setting options like timeouts and connection pool sizes). For + caching-specific configuration, you can provide a CacheConfig instance to + customize behavior across the following areas: + + Cache size. If the backend storage supports these limits, + you can specify the maximum number of cache entries as well as the maximum cacheable + response body size. + + + Public/private caching. By default, the caching module + considers itself to be a shared (public) cache, and will not, for example, cache + responses to requests with Authorization headers or responses marked with + "Cache-Control: private". If, however, the cache is only going to be used by one + logical "user" (behaving similarly to a browser cache), then you will want to turn + off the shared cache setting. + + Heuristic caching. Per RFC2616, a cache MAY cache + certain responses even if no explicit cache control headers are set by the + origin. This behavior is off by default, but you may want to turn this on if you + are working with an origin that doesn't set proper headers but where you still + want to cache the responses. You will want to enable heuristic caching, then + specify a default freshness lifetime and/or a fraction of the time since + the resource was last modified. See Sections 13.2.2 and 13.2.4 of the HTTP/1.1 + RFC for more details on heuristic caching. + + Background validation. The cache module supports the + stale-while-revalidate directive of RFC5861, which allows certain cache entry + revalidations to happen in the background. You may want to tweak the settings + for the minimum and maximum number of background worker threads, as well as the + maximum time they can be idle before being reclaimed. 
You can also control the + size of the queue used for revalidations when there aren't enough workers to + keep up with demand. +
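To make the heuristic caching settings above more concrete, here is a minimal sketch of how a freshness lifetime could be derived from a fraction of the time since last modification, with a default lifetime as the fallback. The class and method names (HeuristicFreshness, heuristicLifetime) and the parameter names are assumptions made for illustration; this is not the caching module's actual code, and the module's own calculation may differ in detail.

```java
import java.time.Duration;
import java.time.Instant;

// Illustrative sketch only; names are hypothetical, not part of HttpClient Cache.
final class HeuristicFreshness {

    /**
     * Estimates a freshness lifetime for a response that carries no explicit
     * cache-control information.
     *
     * @param date            value of the response's Date header, or null
     * @param lastModified    value of the Last-Modified header, or null
     * @param coefficient     fraction of the time since last modification (e.g. 0.1)
     * @param defaultLifetime lifetime to use when the fraction cannot be computed
     */
    static Duration heuristicLifetime(Instant date, Instant lastModified,
                                      double coefficient, Duration defaultLifetime) {
        // Without both timestamps, or with a Last-Modified that is not in the
        // past relative to Date, fall back to the configured default.
        if (date == null || lastModified == null || !lastModified.isBefore(date)) {
            return defaultLifetime;
        }
        long secondsSinceModified = Duration.between(lastModified, date).getSeconds();
        return Duration.ofSeconds((long) (coefficient * secondsSinceModified));
    }

    public static void main(String[] args) {
        Instant date = Instant.parse("2011-01-10T19:00:00Z");
        Instant lastModified = date.minus(Duration.ofHours(10));
        // A resource unchanged for 10 hours with a 10% fraction is treated as
        // fresh for one hour.
        System.out.println(heuristicLifetime(date, lastModified, 0.1, Duration.ZERO).getSeconds());
    }
}
```

A small fraction keeps the cache conservative: the longer a resource has gone unchanged, the longer it is assumed to stay fresh, but only in proportion.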
+ +
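The background validation settings described above (minimum and maximum worker threads, idle time, and revalidation queue size) correspond closely to the parameters of a standard java.util.concurrent.ThreadPoolExecutor. The sketch below shows that correspondence only; RevalidationPoolSketch and its parameter names are hypothetical, and how the caching module actually wires these values internally is not shown here.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only; the class and parameter names are assumptions.
final class RevalidationPoolSketch {

    /**
     * Builds a worker pool whose knobs mirror the background validation settings:
     * minWorkers/maxWorkers bound the thread count, idleSeconds is how long an
     * extra worker may sit idle before being reclaimed, and queueSize bounds the
     * backlog of pending revalidations.
     */
    static ThreadPoolExecutor newPool(int minWorkers, int maxWorkers,
                                      long idleSeconds, int queueSize) {
        return new ThreadPoolExecutor(
                minWorkers,
                maxWorkers,
                idleSeconds, TimeUnit.SECONDS,
                new ArrayBlockingQueue<Runnable>(queueSize));
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newPool(1, 4, 60, 100);
        // Submit a stand-in for a revalidation task, then shut down cleanly.
        pool.execute(() -> System.out.println("revalidating in background"));
        pool.shutdown();
    }
}
```

With a bounded queue, tasks submitted when all workers are busy and the queue is full are rejected, so the queue size effectively caps how much revalidation work can pile up.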
+ Storage Backends + + The default implementation of CachingHttpClient stores cache entries and + cached response bodies in memory in the JVM of your application. While this + offers high performance, it may not be appropriate for your application due to + the limitation on size or because the cache entries are ephemeral and don't + survive an application restart. The current release includes support for storing + cache entries using Ehcache and memcached implementations, which allow for + spilling cache entries to disk or storing them in an external process. + + If none of those options are suitable for your application, it is + possible to provide your own storage backend by implementing the HttpCacheStorage + interface and then supplying that to CachingHttpClient at construction time. In + this case, the cache entries will be stored using your scheme but you will get to + reuse all of the logic surrounding HTTP/1.1 compliance and cache handling. + Generally speaking, it should be possible to create an HttpCacheStorage + implementation out of anything that supports a key/value store (similar to the + Java Map interface) with the ability to apply atomic updates. + + Finally, because the CachingHttpClient is a decorator for HttpClient, + it's entirely possible to set up a multi-tier caching hierarchy; for example, + wrapping an in-memory CachingHttpClient around one that stores cache entries on + disk or remotely in memcached, following a pattern similar to virtual memory, + L1/L2 processor caches, etc. + +
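The two ideas above, a key/value store with atomic updates and a multi-tier hierarchy built by nesting decorators, can be sketched in plain Java. Everything here is a simplified stand-in: KeyValueStorage, MapStorage, Fetcher, and CachingFetcher are hypothetical names, string values replace real cache entries, and no freshness or validation logic is modeled. The real HttpCacheStorage interface and CachingHttpClient differ in detail.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.UnaryOperator;

// Hypothetical, simplified storage contract (not the real HttpCacheStorage).
interface KeyValueStorage {
    String get(String key);
    void put(String key, String value);
    void remove(String key);
    // Atomically replaces the entry; the updater sees null when no entry exists.
    void update(String key, UnaryOperator<String> updater);
}

// In-memory backend built on a concurrent map.
final class MapStorage implements KeyValueStorage {
    private final Map<String, String> entries = new ConcurrentHashMap<>();

    @Override public String get(String key) { return entries.get(key); }
    @Override public void put(String key, String value) { entries.put(key, value); }
    @Override public void remove(String key) { entries.remove(key); }
    @Override public void update(String key, UnaryOperator<String> updater) {
        // compute() applies the function atomically for this key.
        entries.compute(key, (k, existing) -> updater.apply(existing));
    }
}

// Hypothetical stand-in for "something that executes a request".
interface Fetcher {
    String fetch(String uri);
}

// Decorator: answers from its own storage when possible, otherwise delegates.
final class CachingFetcher implements Fetcher {
    private final KeyValueStorage storage;
    private final Fetcher backend;

    CachingFetcher(KeyValueStorage storage, Fetcher backend) {
        this.storage = storage;
        this.backend = backend;
    }

    @Override public String fetch(String uri) {
        String cached = storage.get(uri);
        if (cached != null) {
            return cached;
        }
        String fresh = backend.fetch(uri);
        storage.put(uri, fresh);
        return fresh;
    }
}

final class TieredCacheDemo {
    // Wires memory tier -> second tier -> origin, fetches the same URI twice,
    // and reports how many requests reached the origin.
    static int originHitsForRepeatedFetch() {
        AtomicInteger originHits = new AtomicInteger();
        Fetcher origin = uri -> {
            originHits.incrementAndGet();
            return "body of " + uri;
        };
        KeyValueStorage secondTier = new MapStorage(); // stand-in for disk or memcached
        Fetcher client = new CachingFetcher(new MapStorage(),
                new CachingFetcher(secondTier, origin));
        client.fetch("/index.html");
        client.fetch("/index.html");
        return originHits.get();
    }

    public static void main(String[] args) {
        System.out.println(originHitsForRepeatedFetch());
    }
}
```

Because each tier only sees the misses of the tier in front of it, the second fetch in the demo is answered by the in-memory tier and never reaches the second tier or the origin.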