Docs: Rewrote the filtered query docs to be clearer

Closes #1688
This commit is contained in:
Clinton Gormley 2014-06-19 16:33:57 +02:00
parent 8ccfca3a2f
commit adf6e794b6
1 changed files with 134 additions and 29 deletions

View File

@ -1,54 +1,159 @@
[[query-dsl-filtered-query]] [[query-dsl-filtered-query]]
=== Filtered Query === Filtered Query
A query that applies a filter to the results of another query. This The `filtered` query is used to combine another query with any
query maps to Lucene `FilteredQuery`. <<query-dsl-filters,filter>>). Filters are usually faster than queries because:
* they don't have to calculate the relevance `_score` for each document --
the answer is just a boolean ``Yes, the document matches the filter'' or
``No, the document does not match the filter''.
* the results from most filters can be cached in memory, making subsequent
executions faster.
TIP: Exclude as many document as you can with a filter, then query just the
documents that remain.
[source,js] [source,js]
-------------------------------------------------- --------------------------------------------------
{ {
"filtered" : { "filtered": {
"query" : { "query": {
"term" : { "tag" : "wow" } "match": { "tweet": "full text search" }
}, },
"filter" : { "filter": {
"range" : { "range": { "created": { "gte": "now - 1d / d" }}
"age" : { "from" : 10, "to" : 20 } }
}
}
--------------------------------------------------
The `filtered` query can be used wherever a `query` is expected, for instance,
to use the above example in search request:
[source,js]
--------------------------------------------------
curl -XGET localhost:9200/_search -d '
{
"query": {
"filtered": { <1>
"query": {
"match": { "tweet": "full text search" }
},
"filter": {
"range": { "created": { "gte": "now - 1d / d" }}
}
}
}
}
'
--------------------------------------------------
<1> The `filtered` query is passed as the value of the `query`
parameter in the search request.
==== Filtering without a query
If a `query` is not specified, it defaults to the
<<query-dsl-match-all-query,`match_all` query>>. This means that the
`filtered` query can be used to wrap just a filter, so that it can be used
wherever a query is expected.
[source,js]
--------------------------------------------------
curl -XGET localhost:9200/_search -d '
{
"query": {
"filtered": { <1>
"filter": {
"range": { "created": { "gte": "now - 1d / d" }}
}
}
}
}
'
--------------------------------------------------
<1> No `query` has been specfied, so this request applies just the filter,
returning all documents created since yesterday.
==== Multiple filters
Multiple filters can be applied by wrapping them in a
<<query-dsl-bool-filter,`bool` filter>>, for example:
[source,js]
--------------------------------------------------
{
"filtered": {
"query": { "match": { "tweet": "full text search" }},
"filter": {
"bool": {
"must": { "range": { "created": { "gte": "now - 1d / d" }}},
"should": [
{ "term": { "featured": true }},
{ "term": { "starred": true }}
],
"must_not": { "term": { "deleted": false }}
} }
} }
} }
} }
-------------------------------------------------- --------------------------------------------------
The filter object can hold only filter elements, not queries. Filters Similarly, multiple queries can be combined with a
can be much faster compared to queries since they don't perform any <<query-dsl-bool-query,`bool` query>>.
scoring, especially when they are cached.
==== Filter strategy ==== Filter strategy
The filtered query allows to configure how to intersect the filter with the query: You can control how the filter and query are executed with the `strategy`
parameter:
[source,js] [source,js]
-------------------------------------------------- --------------------------------------------------
{ {
"filtered" : { "filtered" : {
"query" : { "query" : { ... },
// query definition "filter" : { ... ],
},
"filter" : {
// filter definition
},
"strategy": "leap_frog" "strategy": "leap_frog"
} }
} }
-------------------------------------------------- --------------------------------------------------
[horizontal] IMPORTANT: This is an _expert-level_ setting. Most users can simply ignore it.
`leap_frog_query_first`:: Look for the first document matching the query, and then alternatively advance the query and the filter to find common matches.
`leap_frog_filter_first`:: Look for the first document matching the filter, and then alternatively advance the query and the filter to find common matches.
`leap_frog`:: Same as `leap_frog_query_first`.
`query_first`:: If the filter supports random access, then search for documents using the query, and then consult the filter to check whether there is a match. Otherwise fall back to `leap_frog_query_first`.
`random_access_${threshold}`:: If the filter supports random access and if there is at least one matching document among the first `threshold` ones, then apply the filter first. Otherwise fall back to `leap_frog_query_first`. `${threshold}` must be greater than or equal to `1`.
`random_access_always`:: Apply the filter first if it supports random access. Otherwise fall back to `leap_frog_query_first`.
The default strategy is to use `query_first` on filters that are not advanceable such as geo filters and script filters, and `random_access_100` on other filters. The `strategy` parameter accepts the following options:
[horizontal]
`leap_frog_query_first`::
Look for the first document matching the query, and then alternatively
advance the query and the filter to find common matches.
`leap_frog_filter_first`::
Look for the first document matching the filter, and then alternatively
advance the query and the filter to find common matches.
`leap_frog`::
Same as `leap_frog_query_first`.
`query_first`::
If the filter supports random access, then search for documents using the
query, and then consult the filter to check whether there is a match.
Otherwise fall back to `leap_frog_query_first`.
`random_access_${threshold}`::
If the filter supports random access and if there is at least one matching
document among the first `threshold` ones, then apply the filter first.
Otherwise fall back to `leap_frog_query_first`. `${threshold}` must be
greater than or equal to `1`.
`random_access_always`::
Apply the filter first if it supports random access. Otherwise fall back
to `leap_frog_query_first`.
The default strategy is to use `query_first` on filters that are not
advanceable such as geo filters and script filters, and `random_access_100` on
other filters.