[DOCS] Add identifier mapping tip to numeric and keyword datatype docs (#49933)

Users often mistakenly map numeric IDs to numeric datatypes. However,
numeric fields are often slow for `term` and other term-level queries.

The "Tune for search speed" docs include advice for mapping numeric
IDs to `keyword` fields. However, this tip is not included in the
`numeric` or `keyword` field datatype doc pages.

This rewords the tip in the "Tune for search speed" docs, relocates it
to the `numeric` field docs, and reuses it elsewhere through tagged regions.
James Rodewig 2019-12-17 09:31:07 -05:00
parent 55cc5432d6
commit 726c35dfd0
3 changed files with 34 additions and 9 deletions

@@ -159,13 +159,7 @@ GET index/_search
[[map-ids-as-keyword]]
=== Consider mapping identifiers as `keyword`
The fact that some data is numeric does not mean it should always be mapped as a
<<number,numeric field>>. The way that Elasticsearch indexes numbers optimizes
for `range` queries while `keyword` fields are better at `term` queries. Typically,
fields storing identifiers such as an `ISBN` or any number identifying a record
from another database are rarely used in `range` queries or aggregations. This is
why they might benefit from being mapped as <<keyword,`keyword`>> rather than as
`integer` or `long`.
include::../mapping/types/numeric.asciidoc[tag=map-ids-as-keyword]
[float]
=== Avoid scripts

@@ -4,8 +4,8 @@
<titleabbrev>Keyword</titleabbrev>
++++
A field to index structured content such as email addresses, hostnames, status
codes, zip codes or tags.
A field to index structured content such as IDs, email addresses, hostnames,
status codes, zip codes or tags.
They are typically used for filtering (_Find me all blog posts where
++status++ is ++published++_), for sorting, and for aggregations. Keyword
@@ -30,6 +30,12 @@ PUT my_index
}
--------------------------------
[TIP]
.Mapping numeric identifiers
====
include::numeric.asciidoc[tag=map-ids-as-keyword]
====
[[keyword-params]]
==== Parameters for keyword fields

@@ -80,6 +80,31 @@ to help make a decision.
|`half_float`|+2^-24^+ |+65504+ |+11+ / +3.31+
|=======================================================================
[TIP]
.Mapping numeric identifiers
====
// tag::map-ids-as-keyword[]
Not all numeric data should be mapped as a <<number,numeric>> field datatype.
{es} optimizes numeric fields, such as `integer` or `long`, for
<<query-dsl-range-query,`range`>> queries. However, <<keyword,`keyword`>> fields
are better for <<query-dsl-term-query,`term`>> and other
<<term-level-queries,term-level>> queries.
Identifiers, such as an ISBN or a product ID, are rarely used in `range`
queries. However, they are often retrieved using term-level queries.
Consider mapping a numeric identifier as a `keyword` if:
* You don't plan to search for the identifier data using
<<query-dsl-range-query,`range`>> queries.
* Fast retrieval is important. `term` query searches on `keyword` fields are
often faster than `term` searches on numeric fields.
If you're unsure which to use, you can use a <<multi-fields,multi-field>> to map
the data as both a `keyword` _and_ a numeric datatype.
// end::map-ids-as-keyword[]
====
[[number-params]]
==== Parameters for numeric fields
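
For illustration only (this snippet is not part of the diff above), the multi-field suggestion in the tip could be sketched as the following mapping. The index name `my_index` follows the existing docs examples; the field name `product_id` and the sub-field name `as_long` are hypothetical. The sketch assumes the `keyword` main field would serve `term` and other term-level queries, while the `long` sub-field would stay available for `range` queries.

[source,console]
--------------------------------
PUT my_index
{
  "mappings": {
    "properties": {
      "product_id": {
        "type": "keyword",
        "fields": {
          "as_long": {
            "type": "long"
          }
        }
      }
    }
  }
}
--------------------------------

With a mapping like this, a `term` query would target `product_id`, and a `range` query would target `product_id.as_long`.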