OpenSearch/docs/reference/search/suggesters/context-suggest.asciidoc

388 lines
12 KiB
Plaintext
Raw Normal View History

[[suggester-context]]
=== Context Suggester
The completion suggester considers all documents in the index, but it is often
desirable to serve suggestions filtered and/or boosted by some criteria.
For example, you want to suggest song titles filtered by certain artists or
you want to boost song titles based on their genre.
To achieve suggestion filtering and/or boosting, you can add context mappings while
configuring a completion field. You can define multiple context mappings for a
completion field.
Every context mapping has a unique name and a type. There are two types: `category`
and `geo`. Context mappings are configured under the `contexts` parameter in
the field mapping.
NOTE: It is mandatory to provide a context when indexing and querying
a context enabled completion field.
The following defines types, each with two context mappings for a completion
field:
[source,js]
--------------------------------------------------
Update the default for include_type_name to false. (#37285) * Default include_type_name to false for get and put mappings. * Default include_type_name to false for get field mappings. * Add a constant for the default include_type_name value. * Default include_type_name to false for get and put index templates. * Default include_type_name to false for create index. * Update create index calls in REST documentation to use include_type_name=true. * Some minor clean-ups around the get index API. * In REST tests, use include_type_name=true by default for index creation. * Make sure to use 'expression == false'. * Clarify the different IndexTemplateMetaData toXContent methods. * Fix FullClusterRestartIT#testSnapshotRestore. * Fix the ml_anomalies_default_mappings test. * Fix GetFieldMappingsResponseTests and GetIndexTemplateResponseTests. We make sure to specify include_type_name=true during xContent parsing, so we continue to test the legacy typed responses. XContent generation for the typeless responses is currently only covered by REST tests, but we will be adding unit test coverage for these as we implement each typeless API in the Java HLRC. This commit also refactors GetMappingsResponse to follow the same appraoch as the other mappings-related responses, where we read include_type_name out of the xContent params, instead of creating a second toXContent method. This gives better consistency in the response parsing code. * Fix more REST tests. * Improve some wording in the create index documentation. * Add a note about types removal in the create index docs. * Fix SmokeTestMonitoringWithSecurityIT#testHTTPExporterWithSSL. * Make sure to mention include_type_name in the REST docs for affected APIs. * Make sure to use 'expression == false' in FullClusterRestartIT. * Mention include_type_name in the REST templates docs.
2019-01-14 16:08:01 -05:00
PUT place?include_type_name=true
{
"mappings": {
"_doc" : {
"properties" : {
"suggest" : {
"type" : "completion",
"contexts": [
{ <1>
"name": "place_type",
"type": "category"
},
{ <2>
"name": "location",
"type": "geo",
"precision": 4
}
]
}
}
}
}
}
Update the default for include_type_name to false. (#37285) * Default include_type_name to false for get and put mappings. * Default include_type_name to false for get field mappings. * Add a constant for the default include_type_name value. * Default include_type_name to false for get and put index templates. * Default include_type_name to false for create index. * Update create index calls in REST documentation to use include_type_name=true. * Some minor clean-ups around the get index API. * In REST tests, use include_type_name=true by default for index creation. * Make sure to use 'expression == false'. * Clarify the different IndexTemplateMetaData toXContent methods. * Fix FullClusterRestartIT#testSnapshotRestore. * Fix the ml_anomalies_default_mappings test. * Fix GetFieldMappingsResponseTests and GetIndexTemplateResponseTests. We make sure to specify include_type_name=true during xContent parsing, so we continue to test the legacy typed responses. XContent generation for the typeless responses is currently only covered by REST tests, but we will be adding unit test coverage for these as we implement each typeless API in the Java HLRC. This commit also refactors GetMappingsResponse to follow the same appraoch as the other mappings-related responses, where we read include_type_name out of the xContent params, instead of creating a second toXContent method. This gives better consistency in the response parsing code. * Fix more REST tests. * Improve some wording in the create index documentation. * Add a note about types removal in the create index docs. * Fix SmokeTestMonitoringWithSecurityIT#testHTTPExporterWithSSL. * Make sure to mention include_type_name in the REST docs for affected APIs. * Make sure to use 'expression == false' in FullClusterRestartIT. * Mention include_type_name in the REST templates docs.
2019-01-14 16:08:01 -05:00
PUT place_path_category?include_type_name=true
{
"mappings": {
"_doc" : {
"properties" : {
"suggest" : {
"type" : "completion",
"contexts": [
{ <3>
"name": "place_type",
"type": "category",
"path": "cat"
},
{ <4>
"name": "location",
"type": "geo",
"precision": 4,
"path": "loc"
}
]
},
"loc": {
"type": "geo_point"
}
}
}
}
}
--------------------------------------------------
// CONSOLE
// TESTSETUP
<1> Defines a `category` context named 'place_type' where the categories must be
sent with the suggestions.
<2> Defines a `geo` context named 'location' where the categories must be sent
with the suggestions.
<3> Defines a `category` context named 'place_type' where the categories are
read from the `cat` field.
<4> Defines a `geo` context named 'location' where the categories are read from
the `loc` field.
NOTE: Adding context mappings increases the index size for completion field. The completion index
is entirely heap resident, you can monitor the completion field index size using <<indices-stats>>.
[[suggester-context-category]]
[float]
==== Category Context
The `category` context allows you to associate one or more categories with suggestions at index
time. At query time, suggestions can be filtered and boosted by their associated categories.
The mappings are set up like the `place_type` fields above. If `path` is defined
then the categories are read from that path in the document, otherwise they must
be sent in the suggest field like this:
[source,js]
--------------------------------------------------
PUT place/_doc/1
{
"suggest": {
"input": ["timmy's", "starbucks", "dunkin donuts"],
"contexts": {
"place_type": ["cafe", "food"] <1>
}
}
}
--------------------------------------------------
// CONSOLE
<1> These suggestions will be associated with 'cafe' and 'food' category.
If the mapping had a `path` then the following index request would be enough to
add the categories:
[source,js]
--------------------------------------------------
PUT place_path_category/_doc/1
{
"suggest": ["timmy's", "starbucks", "dunkin donuts"],
"cat": ["cafe", "food"] <1>
}
--------------------------------------------------
// CONSOLE
<1> These suggestions will be associated with 'cafe' and 'food' category.
NOTE: If context mapping references another field and the categories
are explicitly indexed, the suggestions are indexed with both set
of categories.
[float]
===== Category Query
Suggestions can be filtered by one or more categories. The following
filters suggestions by multiple categories:
[source,js]
--------------------------------------------------
POST place/_search?pretty
{
"suggest": {
"place_suggestion" : {
"prefix" : "tim",
"completion" : {
"field" : "suggest",
"size": 10,
"contexts": {
"place_type": [ "cafe", "restaurants" ]
}
}
}
}
}
--------------------------------------------------
// CONSOLE
// TEST[continued]
NOTE: If multiple categories or category contexts are set on the query
they are merged as a disjunction. This means that suggestions match
if they contain at least one of the provided context values.
Suggestions with certain categories can be boosted higher than others.
The following filters suggestions by categories and additionally boosts
suggestions associated with some categories:
[source,js]
--------------------------------------------------
POST place/_search?pretty
{
"suggest": {
"place_suggestion" : {
"prefix" : "tim",
"completion" : {
"field" : "suggest",
"size": 10,
"contexts": {
"place_type": [ <1>
{ "context" : "cafe" },
{ "context" : "restaurants", "boost": 2 }
]
}
}
}
}
}
--------------------------------------------------
// CONSOLE
// TEST[continued]
<1> The context query filter suggestions associated with
categories 'cafe' and 'restaurants' and boosts the
suggestions associated with 'restaurants' by a
factor of `2`
In addition to accepting category values, a context query can be composed of
multiple category context clauses. The following parameters are supported for a
`category` context clause:
[horizontal]
`context`::
The value of the category to filter/boost on.
This is mandatory.
`boost`::
The factor by which the score of the suggestion
should be boosted, the score is computed by
multiplying the boost with the suggestion weight,
defaults to `1`
`prefix`::
Whether the category value should be treated as a
prefix or not. For example, if set to `true`,
you can filter category of 'type1', 'type2' and
so on, by specifying a category prefix of 'type'.
Defaults to `false`
NOTE: If a suggestion entry matches multiple contexts the final score is computed as the
maximum score produced by any matching contexts.
[[suggester-context-geo]]
[float]
==== Geo location Context
A `geo` context allows you to associate one or more geo points or geohashes with suggestions
at index time. At query time, suggestions can be filtered and boosted if they are within
a certain distance of a specified geo location.
Internally, geo points are encoded as geohashes with the specified precision.
[float]
===== Geo Mapping
In addition to the `path` setting, `geo` context mapping accepts the following settings:
[horizontal]
`precision`::
This defines the precision of the geohash to be indexed and can be specified
as a distance value (`5m`, `10km` etc.), or as a raw geohash precision (`1`..`12`).
Defaults to a raw geohash precision value of `6`.
NOTE: The index time `precision` setting sets the maximum geohash precision that
can be used at query time.
[float]
===== Indexing geo contexts
`geo` contexts can be explicitly set with suggestions or be indexed from a geo point field in the
document via the `path` parameter, similar to `category` contexts. Associating multiple geo location context
with a suggestion, will index the suggestion for every geo location. The following indexes a suggestion
with two geo location contexts:
[source,js]
--------------------------------------------------
PUT place/_doc/1
{
"suggest": {
"input": "timmy's",
"contexts": {
"location": [
{
"lat": 43.6624803,
"lon": -79.3863353
},
{
"lat": 43.6624718,
"lon": -79.3873227
}
]
}
}
}
--------------------------------------------------
// CONSOLE
[float]
===== Geo location Query
Suggestions can be filtered and boosted with respect to how close they are to one or
more geo points. The following filters suggestions that fall within the area represented by
the encoded geohash of a geo point:
[source,js]
--------------------------------------------------
POST place/_search
{
"suggest": {
"place_suggestion" : {
"prefix" : "tim",
"completion" : {
"field" : "suggest",
"size": 10,
"contexts": {
"location": {
"lat": 43.662,
"lon": -79.380
}
}
}
}
}
}
--------------------------------------------------
// CONSOLE
// TEST[continued]
NOTE: When a location with a lower precision at query time is specified, all suggestions
that fall within the area will be considered.
NOTE: If multiple categories or category contexts are set on the query
they are merged as a disjunction. This means that suggestions match
if they contain at least one of the provided context values.
Suggestions that are within an area represented by a geohash can also be boosted higher
than others, as shown by the following:
[source,js]
--------------------------------------------------
POST place/_search?pretty
{
"suggest": {
"place_suggestion" : {
"prefix" : "tim",
"completion" : {
"field" : "suggest",
"size": 10,
"contexts": {
"location": [ <1>
{
"lat": 43.6624803,
"lon": -79.3863353,
"precision": 2
},
{
"context": {
"lat": 43.6624803,
"lon": -79.3863353
},
"boost": 2
}
]
}
}
}
}
}
--------------------------------------------------
// CONSOLE
// TEST[continued]
<1> The context query filters for suggestions that fall under
the geo location represented by a geohash of '(43.662, -79.380)'
with a precision of '2' and boosts suggestions
that fall under the geohash representation of '(43.6624803, -79.3863353)'
with a default precision of '6' by a factor of `2`
NOTE: If a suggestion entry matches multiple contexts the final score is computed as the
maximum score produced by any matching contexts.
In addition to accepting context values, a context query can be composed of
multiple context clauses. The following parameters are supported for a
`category` context clause:
[horizontal]
`context`::
A geo point object or a geo hash string to filter or
boost the suggestion by. This is mandatory.
`boost`::
The factor by which the score of the suggestion
should be boosted, the score is computed by
multiplying the boost with the suggestion weight,
defaults to `1`
`precision`::
The precision of the geohash to encode the query geo point.
This can be specified as a distance value (`5m`, `10km` etc.),
or as a raw geohash precision (`1`..`12`).
Defaults to index time precision level.
`neighbours`::
Accepts an array of precision values at which
neighbouring geohashes should be taken into account.
precision value can be a distance value (`5m`, `10km` etc.)
or a raw geohash precision (`1`..`12`). Defaults to
generating neighbours for index time precision level.