[[search-field-caps]]
=== Field capabilities API
++++
<titleabbrev>Field capabilities</titleabbrev>
++++
Allows you to retrieve the capabilities of fields among multiple indices.
For data streams, the API returns field capabilities among the stream's backing
indices.
[source,console]
--------------------------------------------------
GET /_field_caps?fields=rating
--------------------------------------------------
[[search-field-caps-api-request]]
==== {api-request-title}
`GET /_field_caps?fields=<fields>`
`POST /_field_caps?fields=<fields>`
`GET /<target>/_field_caps?fields=<fields>`
`POST /<target>/_field_caps?fields=<fields>`
[[search-field-caps-api-desc]]
==== {api-description-title}
The field capabilities API returns information about the capabilities of
fields across multiple indices.
[[search-field-caps-api-path-params]]
==== {api-path-parms-title}
`<target>`::
(Optional, string)
Comma-separated list of data streams, indices, and index aliases used to limit
the request. Wildcard expressions (`*`) are supported.
+
To target all data streams and indices in a cluster, omit this parameter or use
`_all` or `*`.
[[search-field-caps-api-query-params]]
==== {api-query-parms-title}
`fields`::
(Required, string)
Comma-separated list of fields to retrieve capabilities for. Wildcard (`*`)
expressions are supported.
include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=allow-no-indices]
+
Defaults to `true`.
include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=expand-wildcards]
+
--
Defaults to `open`.
--
include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=index-ignore-unavailable]
`include_unmapped`::
(Optional, boolean) If `true`, unmapped fields are included in the response.
Defaults to `false`.
[[search-field-caps-api-request-body]]
==== {api-request-body-title}
`index_filter`::
(Optional, <<query-dsl,query object>>) Filters out indices for which the
provided query rewrites to `match_none` on every shard.
[[search-field-caps-api-response-body]]
==== {api-response-body-title}
The types used in the response describe _families_ of field types.
Normally a type family is the same as the field type declared in the mapping,
but to simplify matters certain field types that behave identically are
described using a type family. For example, `keyword`, `constant_keyword` and `wildcard`
field types are all described as the `keyword` type family.
`searchable`::
Whether this field is indexed for search on all indices.
`aggregatable`::
Whether this field can be aggregated on all indices.
`indices`::
The list of indices where this field has the same type family, or null if all indices
have the same type family for the field.
`non_searchable_indices`::
The list of indices where this field is not searchable, or null if all indices
have the same definition for the field.
`non_aggregatable_indices`::
The list of indices where this field is not aggregatable, or null if all
indices have the same definition for the field.
`meta`::
Merged metadata across all indices as a map of string keys to arrays of values.
An array of length 1 indicates that all indices had the same value for this key,
while an array of length 2 or more indicates that not all indices had the same
value for this key.
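As a minimal sketch of how a client might read the `meta` map, the following
Python function reports the keys whose value arrays hold more than one entry.
The function name, the `latency_caps` sample, and its `unit` and `metric_type`
keys are hypothetical and only illustrate the shape described above.

```python
def conflicting_meta_keys(capabilities):
    """Return the meta keys whose values differ across indices."""
    # "meta" is absent when no index defines metadata for the field.
    meta = capabilities.get("meta", {})
    # An array with more than one value means the indices disagree on that key.
    return sorted(key for key, values in meta.items() if len(values) > 1)


# Hypothetical capabilities entry for one type family of one field.
latency_caps = {
    "searchable": True,
    "aggregatable": True,
    "meta": {"unit": ["ms", "ns"], "metric_type": ["gauge"]},
}

print(conflicting_meta_keys(latency_caps))  # ['unit']
```

The response does not indicate which index contributed which value, so a
client that needs that detail has to inspect the mappings of each index.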
[[search-field-caps-api-example]]
==== {api-examples-title}
The request can be restricted to specific data streams and indices:
[source,console]
--------------------------------------------------
GET my-index-000001/_field_caps?fields=rating
--------------------------------------------------
// TEST[setup:my_index]
The next example API call requests information about the `rating` and the
`title` fields:
[source,console]
--------------------------------------------------
GET _field_caps?fields=rating,title
--------------------------------------------------
The API returns the following response:
[source,console-result]
--------------------------------------------------
{
"indices": [ "index1", "index2", "index3", "index4", "index5" ],
"fields": {
"rating": { <1>
"long": {
"searchable": true,
"aggregatable": false,
"indices": [ "index1", "index2" ],
"non_aggregatable_indices": [ "index1" ] <2>
},
"keyword": {
"searchable": false,
"aggregatable": true,
"indices": [ "index3", "index4" ],
"non_searchable_indices": [ "index4" ] <3>
}
},
"title": { <4>
"text": {
"searchable": true,
"aggregatable": false
}
}
}
}
--------------------------------------------------
// TESTRESPONSE[skip:historically skipped]
<1> The field `rating` is defined as a `long` in `index1` and `index2`
and as a `keyword` in `index3` and `index4`.
<2> The field `rating` is not aggregatable in `index1`.
<3> The field `rating` is not searchable in `index4`.
<4> The field `title` is defined as `text` in all indices.
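A response like the one above can be checked for type conflicts on the client
side. The following Python sketch assumes the response has already been parsed
into a dictionary. The function name is hypothetical, and the sample data is a
trimmed copy of the example response.

```python
def fields_with_type_conflicts(response):
    """Map each field that has more than one type family to those families."""
    conflicts = {}
    for field, families in response["fields"].items():
        # "unmapped" is not a real type family, so it is not counted here.
        mapped = sorted(name for name in families if name != "unmapped")
        if len(mapped) > 1:
            conflicts[field] = mapped
    return conflicts


# Trimmed copy of the example response.
response = {
    "indices": ["index1", "index2", "index3", "index4", "index5"],
    "fields": {
        "rating": {
            "long": {"searchable": True, "aggregatable": False,
                     "indices": ["index1", "index2"]},
            "keyword": {"searchable": False, "aggregatable": True,
                        "indices": ["index3", "index4"]},
        },
        "title": {"text": {"searchable": True, "aggregatable": False}},
    },
}

print(fields_with_type_conflicts(response))  # {'rating': ['keyword', 'long']}
```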
By default, unmapped fields are ignored. You can include them in the response by
adding the `include_unmapped` parameter to the request:
[source,console]
--------------------------------------------------
GET _field_caps?fields=rating,title&include_unmapped
--------------------------------------------------
The response then contains an `unmapped` entry for each field that is present
in some indices but not all:
[source,console-result]
--------------------------------------------------
{
"indices": [ "index1", "index2", "index3" ],
"fields": {
"rating": {
"long": {
"searchable": true,
"aggregatable": false,
"indices": [ "index1", "index2" ],
"non_aggregatable_indices": [ "index1" ]
},
"keyword": {
"searchable": false,
"aggregatable": true,
"indices": [ "index3", "index4" ],
"non_searchable_indices": [ "index4" ]
},
"unmapped": { <1>
"indices": [ "index5" ],
"searchable": false,
"aggregatable": false
}
},
"title": {
"text": {
"indices": [ "index1", "index2", "index3", "index4" ],
"searchable": true,
"aggregatable": false
},
"unmapped": { <2>
"indices": [ "index5" ],
"searchable": false,
"aggregatable": false
}
}
}
}
--------------------------------------------------
// TESTRESPONSE[skip:historically skipped]
<1> The `rating` field is unmapped in `index5`.
<2> The `title` field is unmapped in `index5`.
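The `unmapped` entries can be used to find the indices that lack a field. The
following Python sketch uses a hypothetical helper and a trimmed sample
response. It falls back to the top-level `indices` list when the entry has no
`indices` of its own, which is an assumption based on the description of
`indices` in the response body section.

```python
def unmapped_indices(response, field):
    """Return the indices in which `field` has no mapping."""
    entry = response["fields"].get(field, {}).get("unmapped")
    if entry is None:
        # No "unmapped" entry: the field is mapped wherever it was reported.
        return []
    # Assumption: a missing or null "indices" means the entry covers
    # every index returned by the request.
    return list(entry.get("indices") or response["indices"])


# Trimmed sample modeled on the example response.
response = {
    "indices": ["index1", "index2", "index3", "index4", "index5"],
    "fields": {
        "title": {
            "text": {"indices": ["index1", "index2", "index3", "index4"],
                     "searchable": True, "aggregatable": False},
            "unmapped": {"indices": ["index5"],
                         "searchable": False, "aggregatable": False},
        },
    },
}

print(unmapped_indices(response, "title"))  # ['index5']
```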
It is also possible to filter indices with a query:
[source,console]
--------------------------------------------------
POST my-index-*/_field_caps?fields=rating
{
"index_filter": {
"range": {
"@timestamp": {
"gte": "2018"
}
}
}
}
--------------------------------------------------
// TEST[setup:my_index]
Indices for which the provided filter rewrites to `match_none` on every shard
are then excluded from the response.
--
[IMPORTANT]
====
The filtering is done on a best-effort basis: it uses index statistics and mappings
to rewrite queries to `match_none` instead of fully executing the request.
For instance, a `range` query over a `date` field can rewrite to `match_none`
if all documents within a shard (including deleted documents) are outside
of the provided range.
However, not all queries can rewrite to `match_none`, so this API may return
an index even if the provided filter matches no document.
====
--
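The request shapes documented on this page can be assembled programmatically.
The following Python sketch builds the method, path, and JSON body for a field
capabilities request without sending it. The function name and its signature
are hypothetical. Only the `fields`, `include_unmapped`, and `index_filter`
names come from this page.

```python
import json
from urllib.parse import urlencode


def build_field_caps_request(target, fields, index_filter=None,
                             include_unmapped=False):
    """Return (method, path, body) for a field capabilities request."""
    params = {"fields": ",".join(fields)}
    if include_unmapped:
        params["include_unmapped"] = "true"
    # Keep commas and wildcards readable in the query string.
    query = urlencode(params, safe=",*")
    path = f"/{target}/_field_caps?{query}"
    # A body is only needed when an index filter is supplied.
    body = json.dumps({"index_filter": index_filter}) if index_filter else None
    method = "POST" if body else "GET"
    return method, path, body


method, path, body = build_field_caps_request(
    "my-index-*",
    ["rating"],
    index_filter={"range": {"@timestamp": {"gte": "2018"}}},
)
print(method, path)
print(body)
```

Because the filter is applied on a best-effort basis, a client should still be
prepared to receive indices in which the filter matches no document.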