[role="xpack"]
[[ml-settings]]
=== Machine learning settings in Elasticsearch
++++
<titleabbrev>Machine learning settings</titleabbrev>
++++

[[ml-settings-description]]
// tag::ml-settings-description-tag[]
You do not need to configure any settings to use {ml}. It is enabled by default.

IMPORTANT: {ml-cap} uses SSE4.2 instructions, so it works only on machines whose
CPUs {wikipedia}/SSE4#Supporting_CPUs[support] SSE4.2. If you run {es} on older
hardware, you must disable {ml} (by setting `xpack.ml.enabled` to `false`).

// end::ml-settings-description-tag[]

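
As a minimal sketch of the case described above, a node that runs on hardware
without SSE4.2 support could disable {ml} in its configuration file. The file
name `elasticsearch.yml` is an assumption based on a standard {es} installation;
the setting and its value come from the preceding note.

[source,yaml]
----
# Assumed location: the node's elasticsearch.yml configuration file.
# Disables the machine learning APIs on this node.
xpack.ml.enabled: false
----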
[discrete]
[[general-ml-settings]]
==== General machine learning settings

`node.roles: [ ml ]`::
(<<static-cluster-setting,Static>>) Set `node.roles` to contain `ml` to identify
the node as a _{ml} node_ that is capable of running jobs. Every node is a {ml}
node by default.
+
If you use the `node.roles` setting, you must explicitly list every role that
the node requires. For more information, see <<modules-node>>.
+
IMPORTANT: On dedicated coordinating nodes or dedicated master nodes, do not set
the `ml` role.

`xpack.ml.enabled`::
(<<static-cluster-setting,Static>>) Set to `true` (default) to enable {ml} APIs
on the node.
+
If set to `false`, the {ml} APIs are disabled on the node. Therefore, the node
cannot open jobs, start {dfeeds}, or receive transport (internal) communication
requests related to {ml} APIs. If the node is a coordinating node, {ml} requests
from clients (including {kib}) also fail. For more information about disabling
{ml} in specific {kib} instances, see
{kibana-ref}/ml-settings-kb.html[{kib} {ml} settings].
+
IMPORTANT: If you want to use {ml-features} in your cluster, set
`xpack.ml.enabled` to `true` on all nodes. This is the default behavior. At a
minimum, it must be enabled on all master-eligible nodes. If you want to use
{ml-features} in clients or {kib}, it must also be enabled on all coordinating
nodes.

`xpack.ml.inference_model.cache_size`::
(<<static-cluster-setting,Static>>) The maximum inference cache size allowed.
The inference cache exists in the JVM heap on each ingest node. The cache
affords faster processing times for the `inference` processor. The value can be
a static byte-sized value (for example, `2gb`) or a percentage of the total
allocated heap. Defaults to `40%`. See also <<model-inference-circuit-breaker>>.

[[xpack-interference-model-ttl]]
// tag::interference-model-ttl-tag[]
`xpack.ml.inference_model.time_to_live` {ess-icon}::
(<<static-cluster-setting,Static>>) The time to live (TTL) for models in the
inference model cache. The TTL is calculated from the last access. The
`inference` processor attempts to load the model from the cache. If the
`inference` processor does not receive any documents for the duration of the
TTL, the referenced model is flagged for eviction from the cache. If a document
is processed later, the model is loaded into the cache again. Defaults to `5m`.
// end::interference-model-ttl-tag[]

`xpack.ml.max_inference_processors`::
(<<cluster-update-settings,Dynamic>>) The total number of `inference`
processors allowed across all ingest pipelines. Once the limit is reached, you
cannot add an `inference` processor to a pipeline. Defaults to `50`.

`xpack.ml.max_machine_memory_percent`::
(<<cluster-update-settings,Dynamic>>) The maximum percentage of the machine's
memory that {ml} may use for running analytics processes. (These processes are
separate from the {es} JVM.) Defaults to `30` percent. The limit is based on the
total memory of the machine, not current free memory. Jobs are not allocated to
a node if doing so would cause the estimated memory use of {ml} jobs to exceed
the limit.

`xpack.ml.max_model_memory_limit`::
(<<cluster-update-settings,Dynamic>>) The maximum `model_memory_limit` property
value that can be set for any job on this node. If you try to create a job with
a `model_memory_limit` property value that is greater than this setting value,
an error occurs. Existing jobs are not affected when you update this setting.
For more information about the `model_memory_limit` property, see
<<put-analysislimits>>.

[[xpack.ml.max_open_jobs]]
`xpack.ml.max_open_jobs`::
(<<cluster-update-settings,Dynamic>>) The maximum number of jobs that can run
simultaneously on a node. Defaults to `20`. In this context, jobs include both
{anomaly-jobs} and {dfanalytics-jobs}. The maximum number of jobs is also
constrained by memory usage. If the estimated memory usage of the jobs would be
higher than allowed, fewer jobs run on the node. Prior to version 7.1, this
setting was a per-node non-dynamic setting. It became a cluster-wide dynamic
setting in version 7.1. As a result, changes to its value after node startup
are used only after every node in the cluster is running version 7.1 or later.
The maximum permitted value is `512`.

`xpack.ml.nightly_maintenance_requests_per_second`::
(<<cluster-update-settings,Dynamic>>) The rate at which the nightly maintenance
task deletes expired model snapshots and results. The setting is a proxy to the
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html#_throttling_delete_requests[requests_per_second]
parameter used in delete by query requests and controls throttling. Valid
values are greater than `0.0` or equal to `-1.0`, where `-1.0` means that a
default value is used. Defaults to `-1.0`.

`xpack.ml.node_concurrent_job_allocations`::
(<<cluster-update-settings,Dynamic>>) The maximum number of jobs that can
concurrently be in the `opening` state on each node. Typically, jobs spend a
small amount of time in this state before they move to the `open` state. Jobs
that must restore large models when they are opening spend more time in the
`opening` state. Defaults to `2`.

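
The following fragment is an illustrative sketch of how several of the general
settings might appear in the `elasticsearch.yml` file of a dedicated {ml} node.
The file name and the choice of a dedicated {ml} node are assumptions made for
the example. Apart from `node.roles`, every value shown is the default
documented above, so the fragment only demonstrates the syntax. Dynamic settings
can also be changed on a running cluster through the
<<cluster-update-settings,cluster update settings API>>.

[source,yaml]
----
# Assumed scenario: a node dedicated to machine learning.
# When node.roles is set, every required role must be listed explicitly.
node.roles: [ ml ]

# Static setting; true is the default.
xpack.ml.enabled: true

# Dynamic settings, shown with their documented defaults.
xpack.ml.max_machine_memory_percent: 30
xpack.ml.max_open_jobs: 20
xpack.ml.node_concurrent_job_allocations: 2
----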
[discrete]
[[advanced-ml-settings]]
==== Advanced machine learning settings

These settings are for advanced use cases; the default values are generally
sufficient:

`xpack.ml.enable_config_migration`::
(<<cluster-update-settings,Dynamic>>) Reserved.

`xpack.ml.max_anomaly_records`::
(<<cluster-update-settings,Dynamic>>) The maximum number of records that are
output per bucket. Defaults to `500`.

`xpack.ml.max_lazy_ml_nodes`::
(<<cluster-update-settings,Dynamic>>) The number of lazily spun up {ml} nodes.
This setting is useful when {ml} nodes are not desired until the first {ml} job
opens. It defaults to `0` and has a maximum acceptable value of `3`. If the
current number of {ml} nodes is greater than or equal to this setting, it is
assumed that there are no more lazy nodes available because the desired number
of nodes has already been provisioned. If a job is opened and this setting has
a value greater than zero and there are no nodes that can accept the job, the
job stays in the `opening` state until a new {ml} node is added to the cluster
and the job is assigned to run on that node.
+
IMPORTANT: This setting assumes that some external process is capable of adding
{ml} nodes to the cluster. This setting is only useful when used in conjunction
with such an external process.

`xpack.ml.process_connect_timeout`::
(<<cluster-update-settings,Dynamic>>) The connection timeout for {ml} processes
that run separately from the {es} JVM. When such processes are started, they
must connect to the {es} JVM. If a process does not connect within the time
period specified by this setting, it is assumed to have failed. Defaults to
`10s`. The minimum value for this setting is `5s`.

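
As a sketch of the syntax only, the advanced settings above could be written in
`elasticsearch.yml` as follows. The file name is an assumption based on a
standard {es} installation, and every value is the default documented above, so
applying the fragment would not change behavior.

[source,yaml]
----
# Dynamic settings, shown with their documented defaults.
xpack.ml.max_anomaly_records: 500

# 0 means that no lazily added machine learning nodes are expected (maximum 3).
xpack.ml.max_lazy_ml_nodes: 0

# Must be at least 5s.
xpack.ml.process_connect_timeout: 10s
----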
[discrete]
[[model-inference-circuit-breaker]]
==== {ml-cap} circuit breaker settings

`breaker.model_inference.limit`::
(<<cluster-update-settings,Dynamic>>) Limit for the model inference breaker,
which defaults to `50%` of the JVM heap. If the parent circuit breaker limit is
less than 50% of the JVM heap, the model inference breaker is bound to that
limit instead. See <<circuit-breaker>>.

`breaker.model_inference.overhead`::
(<<cluster-update-settings,Dynamic>>) A constant that all accounting estimations
are multiplied by to determine a final estimation. Defaults to `1`. See
<<circuit-breaker>>.

`breaker.model_inference.type`::
(<<static-cluster-setting,Static>>) The underlying type of the circuit breaker.
There are two valid options: `noop` and `memory`. `noop` means the circuit
breaker does nothing to limit memory usage. `memory` means the circuit breaker
tracks the memory used by inference models and can potentially break and
prevent `OutOfMemory` errors. The default is `memory`.
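
For illustration, the circuit breaker settings above might be expressed in
`elasticsearch.yml` as in the following sketch. The file name is an assumption
based on a standard {es} installation, and each value is the default documented
above.

[source,yaml]
----
# Dynamic: breaker limit as a percentage of the JVM heap.
breaker.model_inference.limit: 50%

# Dynamic: multiplier applied to all accounting estimations.
breaker.model_inference.overhead: 1

# Static: "memory" tracks model memory use, "noop" disables the breaker.
breaker.model_inference.type: memory
----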