[Docs] Improve documentation of the new caching policy for filters.
This commit is contained in:
parent
391b5f3f5e
commit
fb6c3b7c29
|
@ -15,37 +15,76 @@ filter does not require a lot of memory, and will cause other queries
|
|||
executing against the same filter (same parameters) to be blazingly
|
||||
fast.
|
||||
|
||||
Some filters already produce a result that is easily cacheable, and the
|
||||
difference between caching and not caching them is the act of placing
|
||||
the result in the cache or not. These filters, which include the
|
||||
<<query-dsl-term-filter,term>>,
|
||||
However the cost of caching is not the same for all filters. For
|
||||
instance some filters are already fast out of the box while caching could
|
||||
add significant overhead, and some filters produce results that are already
|
||||
cacheable so caching them is just a matter of putting the result in the
|
||||
cache.
|
||||
|
||||
The default caching policy, `_cache: auto`, tracks the 1000 most recently
|
||||
used filters on a per-index basis and makes decisions based on their
|
||||
frequency.
|
||||
|
||||
[float]
|
||||
==== Filters that read directly the index structure
|
||||
|
||||
Some filters can directly read the index structure and potentially jump
|
||||
over large sequences of documents that are not worth evaluating (for
|
||||
instance when these documents do not match the query). Caching these
|
||||
filters introduces overhead given that all documents that the filter
|
||||
matches need to be consumed in order to be loaded into the cache.
|
||||
|
||||
These filters, which include the <<query-dsl-term-filter,term>> and
|
||||
<<query-dsl-term-query,query>> filters, are only cached after they
|
||||
appear 5 times or more in the history of the 1000 most recently used
|
||||
filters.
|
||||
|
||||
[float]
|
||||
==== Filters that produce results that are already cacheable
|
||||
|
||||
Some filters produce results that are already cacheable, and the difference
|
||||
between caching and not caching them is the act of placing the result in
|
||||
the cache or not. These filters, which include the
|
||||
<<query-dsl-terms-filter,terms>>,
|
||||
<<query-dsl-prefix-filter,prefix>>, and
|
||||
<<query-dsl-range-filter,range>> filters, are by
|
||||
default cached and are recommended to use (compared to the equivalent
|
||||
query version) when the same filter (same parameters) will be used
|
||||
across multiple different queries (for example, a range filter with age
|
||||
higher than 10).
|
||||
<<query-dsl-range-filter,range>> filters, are by default cached after they
|
||||
appear twice or more in the history of the most 1000 recently used filters.
|
||||
|
||||
Other filters, usually already working with the field data loaded into
|
||||
memory, are not cached by default. Those filters are already very fast,
|
||||
and the process of caching them requires extra processing in order to
|
||||
allow the filter result to be used with different queries than the one
|
||||
executed. These filters, including the geo,
|
||||
and <<query-dsl-script-filter,script>> filters
|
||||
are not cached by default.
|
||||
[float]
|
||||
==== Computational filters
|
||||
|
||||
The last type of filters are those working with other filters. The
|
||||
Some filters need to run some computation in order to figure out whether
|
||||
a given document matches a filter. These filters, which include the geo and
|
||||
<<query-dsl-script-filter,script>> filters, but also the
|
||||
<<query-dsl-terms-filter,terms>> and <<query-dsl-range-filter,range>>
|
||||
filters when using the `fielddata` execution mode are never cached by default,
|
||||
as it would require to evaluate the filter on all documents in your indices
|
||||
while they can otherwise be only evaluated on documents that match the query.
|
||||
|
||||
[float]
|
||||
==== Compound filters
|
||||
|
||||
The last type of filters are those working with other filters, and includes
|
||||
the <<query-dsl-bool-filter,bool>>,
|
||||
<<query-dsl-and-filter,and>>,
|
||||
<<query-dsl-not-filter,not>> and
|
||||
<<query-dsl-or-filter,or>> filters are not
|
||||
cached as they basically just manipulate the internal filters.
|
||||
<<query-dsl-or-filter,or>> filters.
|
||||
|
||||
There is no general rule about these filters. Depending on the filters that
|
||||
they wrap, they will sometimes return a filter that dynamically evaluates the
|
||||
sub filters and sometimes evaluate the sub filters eagerly in order to return
|
||||
a result that is already cacheable, so depending on the case, these filters
|
||||
will be cached after they appear 2+ or 5+ times in the history of the most
|
||||
1000 recently used filters.
|
||||
|
||||
[float]
|
||||
==== Overriding the default behaviour
|
||||
|
||||
All filters allow to set `_cache` element on them to explicitly control
|
||||
caching. It accepts 3 values: `true` in order to cache the filter, `false`
|
||||
to make sure that the filter will not be cached, and `auto`, which is the
|
||||
default and will decide on whether to cache the filter based on the cost
|
||||
to cache the filter and how often the filter has been used.
|
||||
to cache it and how often it has been used as explained above.
|
||||
|
||||
Filters also allow to set `_cache_key` which will be used as the
|
||||
caching key for that filter. This can be handy when using very large
|
||||
|
|
Loading…
Reference in New Issue