465 lines
20 KiB
Plaintext
465 lines
20 KiB
Plaintext
--
|
|
:api: search
|
|
:request: SearchRequest
|
|
:response: SearchResponse
|
|
--
|
|
|
|
[id="{upid}-{api}"]
|
|
=== Search API
|
|
|
|
[id="{upid}-{api}-request"]
|
|
==== Search Request
|
|
|
|
The +{request}+ is used for any operation that has to do with searching
|
|
documents, aggregations, suggestions and also offers ways of requesting
|
|
highlighting on the resulting documents.
|
|
|
|
In its most basic form, we can add a query to the request:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-request-basic]
|
|
--------------------------------------------------
|
|
|
|
<1> Creates the `SeachRequest`. Without arguments this runs against all indices.
|
|
<2> Most search parameters are added to the `SearchSourceBuilder`. It offers setters for everything that goes into the search request body.
|
|
<3> Add a `match_all` query to the `SearchSourceBuilder`.
|
|
<4> Add the `SearchSourceBuilder` to the `SeachRequest`.
|
|
|
|
[id="{upid}-{api}-request-optional"]
|
|
===== Optional arguments
|
|
|
|
Let's first look at some of the optional arguments of a +{request}+:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-request-indices]
|
|
--------------------------------------------------
|
|
<1> Restricts the request to an index
|
|
|
|
There are a couple of other interesting optional parameters:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-request-routing]
|
|
--------------------------------------------------
|
|
<1> Set a routing parameter
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-request-indicesOptions]
|
|
--------------------------------------------------
|
|
<1> Setting `IndicesOptions` controls how unavailable indices are resolved and
|
|
how wildcard expressions are expanded
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-request-preference]
|
|
--------------------------------------------------
|
|
<1> Use the preference parameter e.g. to execute the search to prefer local
|
|
shards. The default is to randomize across shards.
|
|
|
|
===== Using the SearchSourceBuilder
|
|
|
|
Most options controlling the search behavior can be set on the
|
|
`SearchSourceBuilder`,
|
|
which contains more or less the equivalent of the options in the search request
|
|
body of the Rest API.
|
|
|
|
Here are a few examples of some common options:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-source-basics]
|
|
--------------------------------------------------
|
|
<1> Create a `SearchSourceBuilder` with default options.
|
|
<2> Set the query. Can be any type of `QueryBuilder`
|
|
<3> Set the `from` option that determines the result index to start searching
|
|
from. Defaults to 0.
|
|
<4> Set the `size` option that determines the number of search hits to return.
|
|
Defaults to 10.
|
|
<5> Set an optional timeout that controls how long the search is allowed to
|
|
take.
|
|
|
|
After this, the `SearchSourceBuilder` only needs to be added to the
|
|
+{request}+:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-source-setter]
|
|
--------------------------------------------------
|
|
|
|
[id="{upid}-{api}-request-building-queries"]
|
|
===== Building queries
|
|
|
|
Search queries are created using `QueryBuilder` objects. A `QueryBuilder` exists
|
|
for every search query type supported by Elasticsearch's {ref}/query-dsl.html[Query DSL].
|
|
|
|
A `QueryBuilder` can be created using its constructor:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-query-builder-ctor]
|
|
--------------------------------------------------
|
|
<1> Create a full text {ref}/query-dsl-match-query.html[Match Query] that matches
|
|
the text "kimchy" over the field "user".
|
|
|
|
Once created, the `QueryBuilder` object provides methods to configure the options
|
|
of the search query it creates:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-query-builder-options]
|
|
--------------------------------------------------
|
|
<1> Enable fuzzy matching on the match query
|
|
<2> Set the prefix length option on the match query
|
|
<3> Set the max expansion options to control the fuzzy
|
|
process of the query
|
|
|
|
`QueryBuilder` objects can also be created using the `QueryBuilders` utility class.
|
|
This class provides helper methods that can be used to create `QueryBuilder` objects
|
|
using a fluent programming style:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-query-builders]
|
|
--------------------------------------------------
|
|
|
|
Whatever the method used to create it, the `QueryBuilder` object must be added
|
|
to the `SearchSourceBuilder` as follows:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-query-setter]
|
|
--------------------------------------------------
|
|
|
|
The <<{upid}-query-builders, Building Queries>> page gives a list of all available search queries with
|
|
their corresponding `QueryBuilder` objects and `QueryBuilders` helper methods.
|
|
|
|
|
|
===== Specifying Sorting
|
|
|
|
The `SearchSourceBuilder` allows to add one or more `SortBuilder` instances. There are four special implementations (Field-, Score-, GeoDistance- and ScriptSortBuilder).
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-source-sorting]
|
|
--------------------------------------------------
|
|
<1> Sort descending by `_score` (the default)
|
|
<2> Also sort ascending by `_id` field
|
|
|
|
===== Source filtering
|
|
|
|
By default, search requests return the contents of the document `_source` but like in the Rest API you can overwrite this behavior. For example, you can turn off `_source` retrieval completely:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-source-filtering-off]
|
|
--------------------------------------------------
|
|
|
|
The method also accepts an array of one or more wildcard patterns to control which fields get included or excluded in a more fine grained way:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-source-filtering-includes]
|
|
--------------------------------------------------
|
|
|
|
[id="{upid}-{api}-request-highlighting"]
|
|
===== Requesting Highlighting
|
|
|
|
Highlighting search results can be achieved by setting a `HighlightBuilder` on the
|
|
`SearchSourceBuilder`. Different highlighting behaviour can be defined for each
|
|
fields by adding one or more `HighlightBuilder.Field` instances to a `HighlightBuilder`.
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-request-highlighting]
|
|
--------------------------------------------------
|
|
<1> Creates a new `HighlightBuilder`
|
|
<2> Create a field highlighter for the `title` field
|
|
<3> Set the field highlighter type
|
|
<4> Add the field highlighter to the highlight builder
|
|
|
|
There are many options which are explained in detail in the Rest API documentation. The Rest
|
|
API parameters (e.g. `pre_tags`) are usually changed by
|
|
setters with a similar name (e.g. `#preTags(String ...)`).
|
|
|
|
Highlighted text fragments can <<{upid}-{api}-response-highlighting,later be retrieved>> from the +{response}+.
|
|
|
|
[id="{upid}-{api}-request-building-aggs"]
|
|
===== Requesting Aggregations
|
|
|
|
Aggregations can be added to the search by first creating the appropriate
|
|
`AggregationBuilder` and then setting it on the `SearchSourceBuilder`. In the
|
|
following example we create a `terms` aggregation on company names with a
|
|
sub-aggregation on the average age of employees in the company:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-request-aggregations]
|
|
--------------------------------------------------
|
|
|
|
The <<{upid}-aggregation-builders, Building Aggregations>> page gives a list of all available aggregations with
|
|
their corresponding `AggregationBuilder` objects and `AggregationBuilders` helper methods.
|
|
|
|
We will later see how to <<{upid}-{api}-response-aggs,access aggregations>> in the +{response}+.
|
|
|
|
===== Requesting Suggestions
|
|
|
|
To add Suggestions to the search request, use one of the `SuggestionBuilder` implementations
|
|
that are easily accessible from the `SuggestBuilders` factory class. Suggestion builders
|
|
need to be added to the top level `SuggestBuilder`, which itself can be set on the `SearchSourceBuilder`.
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-request-suggestion]
|
|
--------------------------------------------------
|
|
<1> Creates a new `TermSuggestionBuilder` for the `user` field and
|
|
the text `kmichy`
|
|
<2> Adds the suggestion builder and names it `suggest_user`
|
|
|
|
We will later see how to <<{upid}-{api}-response-suggestions,retrieve suggestions>> from the
|
|
+{response}+.
|
|
|
|
===== Profiling Queries and Aggregations
|
|
|
|
The {ref}/search-profile.html[Profile API] can be used to profile the execution of queries and aggregations for
|
|
a specific search request. in order to use it, the profile flag must be set to true on the `SearchSourceBuilder`:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-request-profiling]
|
|
--------------------------------------------------
|
|
|
|
Once the +{request}+ is executed the corresponding +{response}+ will
|
|
<<{upid}-{api}-response-profile,contain the profiling results>>.
|
|
|
|
include::../execution.asciidoc[]
|
|
|
|
[id="{upid}-{api}-response"]
|
|
==== {response}
|
|
|
|
The +{response}+ that is returned by executing the search provides details
|
|
about the search execution itself as well as access to the documents returned.
|
|
First, there is useful information about the request execution itself, like the
|
|
HTTP status code, execution time or whether the request terminated early or timed
|
|
out:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-response-1]
|
|
--------------------------------------------------
|
|
|
|
Second, the response also provides information about the execution on the
|
|
shard level by offering statistics about the total number of shards that were
|
|
affected by the search, and the successful vs. unsuccessful shards. Possible
|
|
failures can also be handled by iterating over an array off
|
|
`ShardSearchFailures` like in the following example:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-response-2]
|
|
--------------------------------------------------
|
|
|
|
[id="{upid}-{api}-response-search-hits"]
|
|
===== Retrieving SearchHits
|
|
|
|
To get access to the returned documents, we need to first get the `SearchHits`
|
|
contained in the response:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-hits-get]
|
|
--------------------------------------------------
|
|
|
|
The `SearchHits` provides global information about all hits, like total number
|
|
of hits or the maximum score:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-hits-info]
|
|
--------------------------------------------------
|
|
|
|
Nested inside the `SearchHits` are the individual search results that can
|
|
be iterated over:
|
|
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-hits-singleHit]
|
|
--------------------------------------------------
|
|
|
|
The `SearchHit` provides access to basic information like index, document ID
|
|
and score of each search hit:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-hits-singleHit-properties]
|
|
--------------------------------------------------
|
|
|
|
Furthermore, it lets you get back the document source, either as a simple
|
|
JSON-String or as a map of key/value pairs. In this map, regular fields
|
|
are keyed by the field name and contain the field value. Multi-valued fields are
|
|
returned as lists of objects, nested objects as another key/value map. These
|
|
cases need to be cast accordingly:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-hits-singleHit-source]
|
|
--------------------------------------------------
|
|
|
|
[id="{upid}-{api}-response-highlighting"]
|
|
===== Retrieving Highlighting
|
|
|
|
If <<{upid}-{api}-request-highlighting,requested>>, highlighted text fragments can be retrieved from each `SearchHit` in the result. The hit object offers
|
|
access to a map of field names to `HighlightField` instances, each of which contains one
|
|
or many highlighted text fragments:
|
|
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-request-highlighting-get]
|
|
--------------------------------------------------
|
|
<1> Get the highlighting for the `title` field
|
|
<2> Get one or many fragments containing the highlighted field content
|
|
|
|
[id="{upid}-{api}-response-aggs"]
|
|
===== Retrieving Aggregations
|
|
|
|
Aggregations can be retrieved from the +{response}+ by first getting the
|
|
root of the aggregation tree, the `Aggregations` object, and then getting the
|
|
aggregation by name.
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-request-aggregations-get]
|
|
--------------------------------------------------
|
|
<1> Get the `by_company` terms aggregation
|
|
<2> Get the buckets that is keyed with `Elastic`
|
|
<3> Get the `average_age` sub-aggregation from that bucket
|
|
|
|
Note that if you access aggregations by name, you need to specify the
|
|
aggregation interface according to the type of aggregation you requested,
|
|
otherwise a `ClassCastException` will be thrown:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[search-request-aggregations-get-wrongCast]
|
|
--------------------------------------------------
|
|
<1> This will throw an exception because "by_company" is a `terms` aggregation
|
|
but we try to retrieve it as a `range` aggregation
|
|
|
|
It is also possible to access all aggregations as a map that is keyed by the
|
|
aggregation name. In this case, the cast to the proper aggregation interface
|
|
needs to happen explicitly:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-request-aggregations-asMap]
|
|
--------------------------------------------------
|
|
|
|
There are also getters that return all top level aggregations as a list:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-request-aggregations-asList]
|
|
--------------------------------------------------
|
|
|
|
And last but not least you can iterate over all aggregations and then e.g.
|
|
decide how to further process them based on their type:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-request-aggregations-iterator]
|
|
--------------------------------------------------
|
|
|
|
[id="{upid}-{api}-response-suggestions"]
|
|
===== Retrieving Suggestions
|
|
|
|
To get back the suggestions from a +{response}+, use the `Suggest` object as an entry point and then retrieve the nested suggestion objects:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-request-suggestion-get]
|
|
--------------------------------------------------
|
|
<1> Use the `Suggest` class to access suggestions
|
|
<2> Suggestions can be retrieved by name. You need to assign them to the correct
|
|
type of Suggestion class (here `TermSuggestion`), otherwise a `ClassCastException` is thrown
|
|
<3> Iterate over the suggestion entries
|
|
<4> Iterate over the options in one entry
|
|
|
|
[id="{upid}-{api}-response-profile"]
|
|
===== Retrieving Profiling Results
|
|
|
|
Profiling results are retrieved from a +{response}+ using the `getProfileResults()` method. This
|
|
method returns a `Map` containing a `ProfileShardResult` object for every shard involved in the
|
|
+{request}+ execution. `ProfileShardResult` are stored in the `Map` using a key that uniquely
|
|
identifies the shard the profile result corresponds to.
|
|
|
|
Here is a sample code that shows how to iterate over all the profiling results of every shard:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-request-profiling-get]
|
|
--------------------------------------------------
|
|
<1> Retrieve the `Map` of `ProfileShardResult` from the +{response}+
|
|
<2> Profiling results can be retrieved by shard's key if the key is known, otherwise it might be simpler
|
|
to iterate over all the profiling results
|
|
<3> Retrieve the key that identifies which shard the `ProfileShardResult` belongs to
|
|
<4> Retrieve the `ProfileShardResult` for the given shard
|
|
|
|
The `ProfileShardResult` object itself contains one or more query profile results, one for each query
|
|
executed against the underlying Lucene index:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-request-profiling-queries]
|
|
--------------------------------------------------
|
|
<1> Retrieve the list of `QueryProfileShardResult`
|
|
<2> Iterate over each `QueryProfileShardResult`
|
|
|
|
Each `QueryProfileShardResult` gives access to the detailed query tree execution, returned as a list of
|
|
`ProfileResult` objects:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-request-profiling-queries-results]
|
|
--------------------------------------------------
|
|
<1> Iterate over the profile results
|
|
<2> Retrieve the name of the Lucene query
|
|
<3> Retrieve the time in millis spent executing the Lucene query
|
|
<4> Retrieve the profile results for the sub-queries (if any)
|
|
|
|
The Rest API documentation contains more information about {ref}/search-profile-queries.html[Profiling Queries] with
|
|
a description of the query profiling information.
|
|
|
|
The `QueryProfileShardResult` also gives access to the profiling information for the Lucene collectors:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-request-profiling-queries-collectors]
|
|
--------------------------------------------------
|
|
<1> Retrieve the profiling result of the Lucene collector
|
|
<2> Retrieve the name of the Lucene collector
|
|
<3> Retrieve the time in millis spent executing the Lucene collector
|
|
<4> Retrieve the profile results for the sub-collectors (if any)
|
|
|
|
The Rest API documentation contains more information about profiling information
|
|
for Lucene collectors. See {ref}/search-profile-queries.html[Profiling queries].
|
|
|
|
In a very similar manner to the query tree execution, the `QueryProfileShardResult` objects gives access
|
|
to the detailed aggregations tree execution:
|
|
|
|
["source","java",subs="attributes,callouts,macros"]
|
|
--------------------------------------------------
|
|
include-tagged::{doc-tests-file}[{api}-request-profiling-aggs]
|
|
--------------------------------------------------
|
|
<1> Retrieve the `AggregationProfileShardResult`
|
|
<2> Iterate over the aggregation profile results
|
|
<3> Retrieve the type of the aggregation (corresponds to Java class used to execute the aggregation)
|
|
<4> Retrieve the time in millis spent executing the Lucene collector
|
|
<5> Retrieve the profile results for the sub-aggregations (if any)
|
|
|
|
The Rest API documentation contains more information about
|
|
{ref}/search-profile-aggregations.html[Profiling aggregations].
|