From 978d1ed2575d622cb5055f6fa7fbe845ce58572c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christoph=20B=C3=BCscher?=
Date: Mon, 3 Sep 2018 11:09:30 +0200
Subject: [PATCH] [Docs] Improve tuning for speed advice (#33315)

This change merges two sections in the "Tune for search speed" documentation
that recommend mapping numeric identifiers as keywords. Both sections contain
mostly the same advice, so they can be merged.

Closes #32733
---
 docs/reference/how-to/search-speed.asciidoc | 20 +++++++-------------
 1 file changed, 7 insertions(+), 13 deletions(-)

diff --git a/docs/reference/how-to/search-speed.asciidoc b/docs/reference/how-to/search-speed.asciidoc
index cd37d0735e9..bb5b0edd2e6 100644
--- a/docs/reference/how-to/search-speed.asciidoc
+++ b/docs/reference/how-to/search-speed.asciidoc
@@ -165,12 +165,15 @@ GET index/_search
 // TEST[continued]

 [float]
-=== Mappings
+=== Consider mapping identifiers as `keyword`

 The fact that some data is numeric does not mean it should always be mapped as a
-<>. Typically, fields storing identifiers such as an `ISBN`
-or any number identifying a record from another database, might benefit from
-being mapped as <> rather than `integer` or `long`.
+<>. The way that Elasticsearch indexes numbers optimizes
+for `range` queries while `keyword` fields are better at `term` queries. Typically,
+fields storing identifiers such as an `ISBN` or any number identifying a record
+from another database are rarely used in `range` queries or aggregations. This is
+why they might benefit from being mapped as <> rather than as
+`integer` or `long`.

 [float]
 === Avoid scripts
@@ -349,15 +352,6 @@ WARNING: Loading data into the filesystem cache eagerly on too many indices or
 too many files will make search _slower_ if the filesystem cache is not large
 enough to hold all the data. Use with caution.

-[float]
-=== Map identifiers as `keyword`
-
-When you have numeric identifiers in your documents, it is tempting to map them
-as numbers, which is consistent with their json type. However, the way that
-Elasticsearch indexes numbers optimizes for `range` queries while `keyword`
-fields are better at `term` queries. Since identifiers are never used in `range`
-queries, they should be mapped as a `keyword`.
-
 [float]
 === Use index sorting to speed up conjunctions
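
Reviewer note, not part of the patch: the merged section recommends mapping identifiers such as an `ISBN` as `keyword` because they are typically looked up with `term` queries rather than `range` queries. The sketch below shows, in Python, what the corresponding request bodies might look like. The helper names `keyword_mapping` and `term_query`, the field name `isbn`, and the sample value are hypothetical and chosen only for illustration. The optional `doc_type` argument reflects the assumption that a 6.x cluster nests properties under a mapping type such as `_doc`, while later versions accept a typeless mapping.

```python
import json


def keyword_mapping(field, doc_type=None):
    """Build an index-creation body that maps `field` as a keyword.

    Hypothetical helper: `doc_type` is only needed on versions that
    still nest properties under a mapping type name (e.g. "_doc").
    """
    properties = {"properties": {field: {"type": "keyword"}}}
    mappings = {doc_type: properties} if doc_type else properties
    return {"mappings": mappings}


def term_query(field, value):
    """Build a search body that matches `field` exactly against `value`."""
    return {"query": {"term": {field: value}}}


if __name__ == "__main__":
    # Body one might send when creating the index (assumed 6.x style).
    print(json.dumps(keyword_mapping("isbn", doc_type="_doc"), indent=2))
    # Exact-match lookup, the access pattern keyword fields are good at.
    print(json.dumps(term_query("isbn", "9780134685991"), indent=2))
```

The identifier is passed as a string in the `term` query on purpose: once a field is mapped as `keyword`, it is matched as an exact term, so numeric ordering and `range` semantics no longer apply to it.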