Updating documentation around keyword fields (#5453)

* Updating documentation around keyword fields

Signed-off-by: Harsha Vamsi Kalluri <hvamsi@amazon.com>

* Rewording with additional info

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Copyedits

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

---------

Signed-off-by: Harsha Vamsi Kalluri <hvamsi@amazon.com>
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Fanit Kolchina <kolchfa@amazon.com>
Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
This commit is contained in:
Harsha Vamsi Kalluri 2024-02-01 12:08:57 -08:00 committed by GitHub
parent 95a5f3e5a9
commit 103bb964ae
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
1 changed files with 9 additions and 5 deletions

View File

@ -14,12 +14,14 @@ redirect_from:
A keyword field type contains a string that is not analyzed. It allows only exact, case-sensitive matches. A keyword field type contains a string that is not analyzed. It allows only exact, case-sensitive matches.
By default, keyword fields are both indexed (because `index` is enabled) and stored on disk (because `doc_values` is enabled). To reduce disk space, you can specify not to index keyword fields by setting `index` to `false`.
If you need to use a field for full-text search, map it as [`text`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/text/) instead. If you need to use a field for full-text search, map it as [`text`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/text/) instead.
{: .note } {: .note }
## Example ## Example
Create a mapping with a keyword field: The following query creates a mapping with a keyword field. Setting `index` to `false` specifies to store the `genre` field on disk and to retrieve it using `doc_values`:
```json ```json
PUT movies PUT movies
@ -27,7 +29,8 @@ PUT movies
"mappings" : { "mappings" : {
"properties" : { "properties" : {
"genre" : { "genre" : {
"type" : "keyword" "type" : "keyword",
"index" : false
} }
} }
} }
@ -46,12 +49,13 @@ Parameter | Description
`eager_global_ordinals` | Specifies whether global ordinals should be loaded eagerly on refresh. If the field is often used for aggregations, this parameter should be set to `true`. Default is `false`. `eager_global_ordinals` | Specifies whether global ordinals should be loaded eagerly on refresh. If the field is often used for aggregations, this parameter should be set to `true`. Default is `false`.
`fields` | To index the same string in several ways (for example, as a keyword and text), provide the fields parameter. You can specify one version of the field to be used for search and another to be used for sorting and aggregations. `fields` | To index the same string in several ways (for example, as a keyword and text), provide the fields parameter. You can specify one version of the field to be used for search and another to be used for sorting and aggregations.
`ignore_above` | Any string longer than this integer value should not be indexed. Default is 2147483647. Default dynamic mapping creates a keyword subfield for which `ignore_above` is set to 256. `ignore_above` | Any string longer than this integer value should not be indexed. Default is 2147483647. Default dynamic mapping creates a keyword subfield for which `ignore_above` is set to 256.
`index` | A Boolean value that specifies whether the field should be searchable. Default is `true`. `index` | A Boolean value that specifies whether the field should be searchable. Default is `true`. To reduce disk space, set `index` to `false`.
`index_options` | Information to be stored in the index that will be considered when calculating relevance scores. Can be set to `freqs` for term frequency. Default is `docs`. `index_options` | Information to be stored in the index that will be considered when calculating relevance scores. Can be set to `freqs` for term frequency. Default is `docs`.
`meta` | Accepts metadata for this field. `meta` | Accepts metadata for this field.
`normalizer` | Specifies how to preprocess this field before indexing (for example, make it lowercase). Default is `null` (no preprocessing). `normalizer` | Specifies how to preprocess this field before indexing (for example, make it lowercase). Default is `null` (no preprocessing).
`norms` | A Boolean value that specifies whether the field length should be used when calculating relevance scores. Default is `false`. `norms` | A Boolean value that specifies whether the field length should be used when calculating relevance scores. Default is `false`.
[`null_value`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/index#null-value) | A value to be used in place of `null`. Must be of the same type as the field. If this parameter is not specified, the field is treated as missing when its value is `null`. Default is `null`. [`null_value`]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/index#null-value) | A value to be used in place of `null`. Must be of the same type as the field. If this parameter is not specified, the field is treated as missing when its value is `null`. Default is `null`.
`similarity` | The ranking algorithm for calculating relevance scores. Default is `BM25`. `similarity` | The ranking algorithm for calculating relevance scores. Default is `BM25`.
`split_queries_on_whitespace` | A Boolean value that specifies whether full-text queries should be split on white space. Default is `false`. `split_queries_on_whitespace` | A Boolean value that specifies whether full-text queries should be split on white space. Default is `false`.
`store` | A Boolean value that specifies whether the field value should be stored and can be retrieved separately from the _source field. Default is `false`. `store` | A Boolean value that specifies whether the field value should be stored and can be retrieved separately from the `_source` field. Default is `false`.