OpenSearch/x-pack
Adrien Grand f993ef80f8
Move the terms index of `_id` off-heap. (#52518)
In #42838 we moved the terms index of all fields off-heap except the
`_id` field because we were worried it might make indexing slower. In
general, the indexing rate is only affected if explicit IDs are used, as
otherwise Elasticsearch almost never performs lookups in the terms
dictionary for the purpose of indexing. So it's quite wasteful to
require the terms index of `_id` to be loaded on-heap for users who have
append-only workloads. Furthermore I've been conducting benchmarks when
indexing with explicit ids on the http_logs dataset that suggest that
the slowdown is low enough that it's probably not worth forcing the terms
index to be kept on-heap. Here are some numbers for the median indexing
rate in docs/s:

| Run | Master  | Patch   |
| --- | ------- | ------- |
| 1   | 45851.2 | 46401.4 |
| 2   | 45192.6 | 44561.0 |
| 3   | 45635.2 | 44137.0 |
| 4   | 46435.0 | 44692.8 |
| 5   | 45829.0 | 44949.0 |

And now heap usage in MB for segments:

| Run | Master  | Patch    |
| --- | ------- | -------- |
| 1   | 41.1720 | 0.352083 |
| 2   | 45.1545 | 0.382534 |
| 3   | 41.7746 | 0.381285 |
| 4   | 45.3673 | 0.412737 |
| 5   | 45.4616 | 0.375063 |

Indexing rate decreased by 1.8% on average, while memory usage decreased
by more than 100x.

The `http_logs` dataset contains small documents and has a simple
indexing chain. More complex indexing chains, e.g. with more fields,
ingest pipelines, etc. would see an even lower decrease of indexing rate.
2020-02-24 18:14:12 +01:00
..
dev-tools Build: Merge xpack checkstyle config into core (#33399) 2018-09-05 09:17:02 -04:00
docs [DOCS] Adds certutil http command to TLS setup steps (#51241) 2020-02-21 10:11:59 -08:00
license-tools Support "enterprise" license types (#49474) 2019-12-12 14:37:44 +11:00
plugin Move the terms index of `_id` off-heap. (#52518) 2020-02-24 18:14:12 +01:00
qa Skip 'setupPorts' tasks when Docker is unavailable (#52679) 2020-02-22 18:31:36 -08:00
snapshot-tool Fix S3 3rd Party Tests (#50983) 2020-01-14 17:46:47 +01:00
test Document SAML APIs (#45105) (#47909) 2019-10-11 16:34:11 +03:00
transport-client Apply 2-space indent to all gradle scripts (#49071) 2019-11-14 11:01:23 +00:00
NOTICE.txt
README.md
build.gradle [7.x] Update opensaml dependency (#44972) (#49512) 2019-11-29 00:17:16 +02:00

README.md

Elastic License Functionality

This directory tree contains files subject to the Elastic License. The files subject to the Elastic License are grouped in this directory to clearly separate them from files licensed under the Apache License 2.0.