OpenSearch/docs/reference/migration/migrate_5_0/settings.asciidoc

324 lines
15 KiB
Plaintext
Raw Normal View History

2016-03-13 16:17:48 -04:00
[[breaking_50_settings_changes]]
=== Settings changes
From Elasticsearch 5.0 on all settings are validated before they are applied.
Node level and default index level settings are validated on node startup,
dynamic cluster and index setting are validated before they are updated/added
to the cluster state.
Every setting must be a *known* setting. All settings must have been
registered with the node or transport client they are used with. This implies
that plugins that define custom settings must register all of their settings
during plugin loading using the `SettingsModule#registerSettings(Setting)`
method.
Improve upgrade experience of node level index settings In 5.0 we don't allow index settings to be specified on the node level ie. in yaml files or via commandline argument. This can cause problems during upgrade if this was used extensively. For instance if analyzers where specified on a node level this might cause the index to be closed when imported (see #17187). In such a case all indices relying on this must be updated via `PUT /${index}/_settings`. Yet, this API has slightly different semantics since it overrides existing settings. To make this less painful this change adds a `preserve_existing` parameter on that API to ensure we have the same semantics as if the setting was applied on the node level. This change also adds a better error message and a change to the migration guide to ensure upgrades are smooth if index settings are specified on the node level. If a index setting is detected this change fails the node startup and prints a message like this: ``` ************************************************************************************* Found index level settings on node level configuration. Since elasticsearch 5.x index level settings can NOT be set on the nodes configuration like the elasticsearch.yaml, in system properties or command line arguments.In order to upgrade all indices the settings must be updated via the /${index}/_settings API. Unless all settings are dynamic all indices must be closed in order to apply the upgradeIndices created in the future should use index templates to set default values. Please ensure all required values are updated on all indices by executing: curl -XPUT 'http://localhost:9200/_all/_settings?preserve_existing=true' -d '{ "index.number_of_shards" : "1", "index.query.default_field" : "main_field", "index.translog.durability" : "async", "index.ttl.disable_purge" : "true" }' ************************************************************************************* ```
2016-03-21 08:02:15 -04:00
==== Index Level Settings
In previous versions Elasticsearch allowed to specify index level setting
as _defaults_ on the node level, inside the `elasticsearch.yaml` file or even via
command-line parameters. From Elasticsearch 5.0 on only selected settings like
for instance `index.codec` can be set on the node level. All other settings must be
set on each individual index. To set default values on every index, index templates
should be used instead.
2016-03-13 16:17:48 -04:00
==== Node settings
The `name` setting has been removed and is replaced by `node.name`. Usage of
`-Dname=some_node_name` is not supported anymore.
Persistent Node Ids (#19140) Node IDs are currently randomly generated during node startup. That means they change every time the node is restarted. While this doesn't matter for ES proper, it makes it hard for external services to track nodes. Another, more minor, side effect is that indexing the output of, say, the node stats API results in creating new fields due to node ID being used as keys. The first approach I considered was to use the node's published address as the base for the id. We already [treat nodes with the same address as the same](https://github.com/elastic/elasticsearch/blob/master/core/src/main/java/org/elasticsearch/discovery/zen/NodeJoinController.java#L387) so this is a simple change (see [here](https://github.com/elastic/elasticsearch/compare/master...bleskes:node_persistent_id_based_on_address)). While this is simple and it works for probably most cases, it is not perfect. For example, if after a node restart, the node is not able to bind to the same port (because it's not yet freed by the OS), it will cause the node to still change identity. Also in environments where the host IP can change due to a host restart, identity will not be the same. Due to those limitation, I opted to go with a different approach where the node id will be persisted in the node's data folder. This has the upside of connecting the id to the nodes data. It also means that the host can be adapted in any way (replace network cards, attach storage to a new VM). I It does however also have downsides - we now run the risk of two nodes having the same id, if someone copies clones a data folder from one node to another. To mitigate this I changed the semantics of the protection against multiple nodes with the same address to be stricter - it will now reject the incoming join if a node exists with the same id but a different address. Note that if the existing node doesn't respond to pings (i.e., it's not alive) it will be removed and the new node will be accepted when it tries another join. Last, and most importantly, this change requires that *all* nodes persist data to disk. This is a change from current behavior where only data & master nodes store local files. This is the main reason for marking this PR as breaking. Other less important notes: - DummyTransportAddress is removed as we need a unique network address per node. Use `LocalTransportAddress.buildUnique()` instead. - I renamed `node.add_lid_to_custom_path` to `node.add_lock_id_to_custom_path` to avoid confusion with the node ID which is now part of the `NodeEnvironment` logic. - I removed the `version` paramater from `MetaDataStateFormat#write` , it wasn't really used and was just in the way :) - TribeNodes are special in the sense that they do start multiple sub-nodes (previously known as client nodes). Those sub-nodes do not store local files but derive their ID from the parent node id, so they are generated consistently.
2016-07-04 15:09:25 -04:00
The `node.add_id_to_custom_path` was renamed to `add_lock_id_to_custom_path`.
Persistent Node Names (#19456) With #19140 we started persisting the node ID across node restarts. Now that we have a "stable" anchor, we can use it to generate a stable default node name and make it easier to track nodes over a restarts. Sadly, this means we will not have those random fun Marvel characters but we feel this is the right tradeoff. On the implementation side, this requires a bit of juggling because we now need to read the node id from disk before we can log as the node node is part of each log message. The PR move the initialization of NodeEnvironment as high up in the starting sequence as possible, with only one logging message before it to indicate we are initializing. Things look now like this: ``` [2016-07-15 19:38:39,742][INFO ][node ] [_unset_] initializing ... [2016-07-15 19:38:39,826][INFO ][node ] [aAmiW40] node name set to [aAmiW40] by default. set the [node.name] settings to change it [2016-07-15 19:38:39,829][INFO ][env ] [aAmiW40] using [1] data paths, mounts [[ /(/dev/disk1)]], net usable_space [5.5gb], net total_space [232.6gb], spins? [unknown], types [hfs] [2016-07-15 19:38:39,830][INFO ][env ] [aAmiW40] heap size [1.9gb], compressed ordinary object pointers [true] [2016-07-15 19:38:39,837][INFO ][node ] [aAmiW40] version[5.0.0-alpha5-SNAPSHOT], pid[46048], build[473d3c0/2016-07-15T17:38:06.771Z], OS[Mac OS X/10.11.5/x86_64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_51/25.51-b03] [2016-07-15 19:38:40,980][INFO ][plugins ] [aAmiW40] modules [percolator, lang-mustache, lang-painless, reindex, aggs-matrix-stats, lang-expression, ingest-common, lang-groovy, transport-netty], plugins [] [2016-07-15 19:38:43,218][INFO ][node ] [aAmiW40] initialized ``` Needless to say, settings `node.name` explicitly still works as before. The commit also contains some clean ups to the relationship between Environment, Settings and Plugins. The previous code suggested the path related settings could be changed after the initial Environment was changed. This did not have any effect as the security manager already locked things down.
2016-07-23 16:46:48 -04:00
The default for the `node.name` settings is now the first 7 charachters of the node id,
which is in turn a randomly generated UUID.
2016-07-14 07:21:10 -04:00
The settings `node.mode` and `node.local` are removed. Local mode should be configured via
`discovery.type: local` and `transport.type:local`. In order to disable _http_ please use `http.enabled: false`
==== Node attribute settings
Node level attributes used for allocation filtering, forced awareness or other node identification / grouping
must be prefixed with `node.attr`. In previous versions it was possible to specify node attributes with the `node.`
prefix. All node attributes except of `node.master`, `node.data` and `node.ingest` must be moved to the new `node.attr.`
namespace.
==== Node types settings
The `node.client` setting has been removed. A node with such a setting set will not
start up. Instead, each node role needs to be set separately using the existing
`node.master`, `node.data` and `node.ingest` supported static settings.
==== Gateway settings
The `gateway.format` setting for configuring global and index state serialization
format has been removed. By default, `smile` is used as the format.
2016-03-13 16:17:48 -04:00
==== Transport Settings
All settings with a `netty` infix have been replaced by their already existing
`transport` synonyms. For instance `transport.netty.bind_host` is no longer
supported and should be replaced by the superseding setting
`transport.bind_host`.
==== Security manager settings
The option to disable the security manager `security.manager.enabled` has been
removed. In order to grant special permissions to elasticsearch users must
edit the local Java Security Policy.
==== Network settings
The `_non_loopback_` value for settings like `network.host` would arbitrarily
pick the first interface not marked as loopback. Instead, specify by address
scope (e.g. `_local_,_site_` for all loopback and private network addresses)
or by explicit interface names, hostnames, or addresses.
The `netty.epollBugWorkaround` settings is removed. This settings allow people to enable
a netty work around for https://github.com/netty/netty/issues/327[a high CPU usage issue] with early JVM versions.
This bug was http://bugs.java.com/view_bug.do?bug_id=6403933[fixed in Java 7]. Since Elasticsearch 5.0 requires Java 8 the settings is removed. Note that if the workaround needs to be reintroduced you can still set the `org.jboss.netty.epollBugWorkaround` system property to control Netty directly.
2016-03-13 16:17:48 -04:00
==== Forbid changing of thread pool types
Previously, <<modules-threadpool,thread pool types>> could be dynamically
adjusted. The thread pool type effectively controls the backing queue for the
thread pool and modifying this is an expert setting with minimal practical
benefits and high risk of being misused. The ability to change the thread pool
type for any thread pool has been removed. It is still possible to adjust
relevant thread pool parameters for each of the thread pools (e.g., depending
on the thread pool type, `keep_alive`, `queue_size`, etc.).
==== Threadpool settings
The `suggest` threadpool has been removed, now suggest requests use the
`search` threadpool.
2016-03-13 16:17:48 -04:00
The prefix on all thread pool settings has been changed from
`threadpool` to `thread_pool`.
The minimum size setting for a scaling thread pool has been changed
from `min` to `core`.
The maximum size setting for a scaling thread pool has been changed
from `size` to `max`.
The queue size setting for a fixed thread pool must be `queue_size`
(all other variants that were previously supported are no longer
supported).
Thread pool settings are now node-level settings. As such, it is not
possible to update thread pool settings via the cluster settings API.
2016-03-13 16:17:48 -04:00
==== Analysis settings
The `index.analysis.analyzer.default_index` analyzer is not supported anymore.
If you wish to change the analyzer to use for indexing, change the
`index.analysis.analyzer.default` analyzer instead.
==== Ping settings
2016-03-13 16:17:48 -04:00
Previously, there were three settings for the ping timeout:
`discovery.zen.initial_ping_timeout`, `discovery.zen.ping.timeout` and
`discovery.zen.ping_timeout`. The former two have been removed and the only
setting key for the ping timeout is now `discovery.zen.ping_timeout`. The
default value for ping timeouts remains at three seconds.
`discovery.zen.master_election.filter_client` and `discovery.zen.master_election.filter_data` have
been removed in favor of the new `discovery.zen.master_election.ignore_non_master_pings`. This setting control how ping responses
are interpreted during master election and should be used with care and only in extreme cases. See documentation for details.
2016-03-13 16:17:48 -04:00
==== Recovery settings
Recovery settings deprecated in 1.x have been removed:
* `index.shard.recovery.translog_size` is superseded by `indices.recovery.translog_size`
* `index.shard.recovery.translog_ops` is superseded by `indices.recovery.translog_ops`
* `index.shard.recovery.file_chunk_size` is superseded by `indices.recovery.file_chunk_size`
* `index.shard.recovery.concurrent_streams` is superseded by `indices.recovery.concurrent_streams`
* `index.shard.recovery.concurrent_small_file_streams` is superseded by `indices.recovery.concurrent_small_file_streams`
* `indices.recovery.max_size_per_sec` is superseded by `indices.recovery.max_bytes_per_sec`
If you are using any of these settings please take the time to review their
purpose. All of the settings above are considered _expert settings_ and should
only be used if absolutely necessary. If you have set any of the above setting
as persistent cluster settings please use the settings update API and set
their superseded keys accordingly.
The following settings have been removed without replacement
* `indices.recovery.concurrent_small_file_streams` - recoveries are now single threaded. The number of concurrent outgoing recoveries are throttled via allocation deciders
* `indices.recovery.concurrent_file_streams` - recoveries are now single threaded. The number of concurrent outgoing recoveries are throttled via allocation deciders
==== Translog settings
The `index.translog.flush_threshold_ops` setting is not supported anymore. In
order to control flushes based on the transaction log growth use
`index.translog.flush_threshold_size` instead.
Changing the translog type with `index.translog.fs.type` is not supported
anymore, the `buffered` implementation is now the only available option and
uses a fixed `8kb` buffer.
The translog by default is fsynced after every `index`, `create`, `update`,
`delete`, or `bulk` request. The ability to fsync on every operation is not
necessary anymore. In fact, it can be a performance bottleneck and it's trappy
since it enabled by a special value set on `index.translog.sync_interval`.
Now, `index.translog.sync_interval` doesn't accept a value less than `100ms`
which prevents fsyncing too often if async durability is enabled. The special
value `0` is no longer supported.
`index.translog.interval` has been removed.
2016-03-13 16:17:48 -04:00
==== Request Cache Settings
The deprecated settings `index.cache.query.enable` and
`indices.cache.query.size` have been removed and are replaced with
`index.requests.cache.enable` and `indices.requests.cache.size` respectively.
`indices.requests.cache.clean_interval` has been replaced with
`indices.cache.clean_interval` and is no longer supported.
2016-03-13 16:17:48 -04:00
==== Field Data Cache Settings
The `indices.fielddata.cache.clean_interval` setting has been replaced with
`indices.cache.clean_interval`.
==== Allocation settings
The `cluster.routing.allocation.concurrent_recoveries` setting has been
replaced with `cluster.routing.allocation.node_concurrent_recoveries`.
==== Similarity settings
The 'default' similarity has been renamed to 'classic'.
==== Indexing settings
The `indices.memory.min_shard_index_buffer_size` and
`indices.memory.max_shard_index_buffer_size` have been removed as
Elasticsearch now allows any one shard to use amount of heap as long as the
total indexing buffer heap used across all shards is below the node's
`indices.memory.index_buffer_size` (defaults to 10% of the JVM heap).
==== Removed es.max-open-files
Setting the system property es.max-open-files to true to get
Elasticsearch to print the number of maximum open files for the
Elasticsearch process has been removed. This same information can be
obtained from the <<cluster-nodes-info>> API, and a warning is logged
on startup if it is set too low.
==== Removed es.netty.gathering
2016-03-15 10:05:28 -04:00
Disabling Netty from using NIO gathering could be done via the escape
2016-03-13 16:17:48 -04:00
hatch of setting the system property "es.netty.gathering" to "false".
Time has proven enabling gathering by default is a non-issue and this
non-documented setting has been removed.
==== Removed es.useLinkedTransferQueue
The system property `es.useLinkedTransferQueue` could be used to
control the queue implementation used in the cluster service and the
handling of ping responses during discovery. This was an undocumented
setting and has been removed.
==== Cache concurrency level settings removed
Two cache concurrency level settings
`indices.requests.cache.concurrency_level` and
`indices.fielddata.cache.concurrency_level` because they no longer apply to
the cache implementation used for the request cache and the field data cache.
==== Using system properties to configure Elasticsearch
Elasticsearch can no longer be configured by setting system properties.
Instead, use `-Ename.of.setting=value.of.setting`.
==== Removed using double-dashes to configure Elasticsearch
Elasticsearch could previously be configured on the command line by
setting settings via `--name.of.setting value.of.setting`. This feature
has been removed. Instead, use `-Ename.of.setting=value.of.setting`.
==== Remove support for .properties config files
The Elasticsearch configuration and logging configuration can no longer be stored in the Java
properties file format (line-delimited key=value pairs with a `.properties` extension).
==== Discovery Settings
The `discovery.zen.minimum_master_node` must be set for nodes that have
`network.host`, `network.bind_host`, `network.publish_host`,
`transport.host`, `transport.bind_host`, or `transport.publish_host`
configuration options set. We see those nodes as in "production" mode
and thus require the setting.
==== Realtime get setting
The `action.get.realtime` setting has been removed. This setting was
a fallback realtime setting for the get and mget APIs when realtime
wasn't specified. Now if the parameter isn't specified we always
default to true.
=== Script settings
==== Indexed script settings
Due to the fact that indexed script has been replaced by stored
scripts the following settings have been replaced to:
* `script.indexed` has been replaced by `script.stored`
* `script.engine.*.indexed.aggs` has been replaced by `script.engine.*.stored.aggs` (where `*` represents the script language, like `groovy`, `mustache`, `painless` etc.)
* `script.engine.*.indexed.mapping` has been replaced by `script.engine.*.stored.mapping` (where `*` represents the script language, like `groovy`, `mustache`, `painless` etc.)
* `script.engine.*.indexed.search` has been replaced by `script.engine.*.stored.search` (where `*` represents the script language, like `groovy`, `mustache`, `painless` etc.)
* `script.engine.*.indexed.update` has been replaced by `script.engine.*.stored.update` (where `*` represents the script language, like `groovy`, `mustache`, `painless` etc.)
* `script.engine.*.indexed.plugin` has been replaced by `script.engine.*.stored.plugin` (where `*` represents the script language, like `groovy`, `mustache`, `painless` etc.)
==== Script mode settings
Previously script mode settings (e.g., "script.inline: true",
"script.engine.groovy.inline.aggs: false", etc.) accepted a wide range of
"truthy" or "falsy" values. This is now much stricter and supports only the
`true` and `false` options.
==== Script sandbox settings removed
Prior to 5.0 a third option could be specified for the `script.inline` and
`script.stored` settings ("sandbox"). This has been removed, You can now only
set `script.line: true` or `script.stored: true`.
==== Search settings
The setting `index.query.bool.max_clause_count` has been removed. In order to
set the maximum number of boolean clauses `indices.query.bool.max_clause_count`
should be used instead.
==== Memory lock settings
The setting `bootstrap.mlockall` has been renamed to
`bootstrap.memory_lock`.
==== Snapshot settings
The default setting `include_global_state` for restoring snapshots has been
changed from `true` to `false`. It has not been changed for taking snapshots and
still defaults to `true` in that case.
==== Time value parsing
The unit 'w' representing weeks is no longer supported.
Fractional time values (e.g., 0.5s) are no longer supported. For example, this means when setting
timeouts "0.5s" will be rejected and should instead be input as "500ms".
==== Node max local storage nodes
Previous versions of Elasticsearch defaulted to allowing multiple nodes to share the same data
directory (up to 50). This can be confusing where users accidentally startup multiple nodes and end
up thinking that they've lost data because the second node will start with an empty data directory.
While the default of allowing multiple nodes is friendly to playing with forming a small cluster on
a laptop, and end-users do sometimes run multiple nodes on the same host, this tends to be the
exception. Keeping with Elasticsearch's continual movement towards safer out-of-the-box defaults,
and optimizing for the norm instead of the exception, the default for
`node.max_local_storage_nodes` is now one.