Daniel Mitterdorfer f174f72fee
Circuit-break based on real memory usage
With this commit we introduce a new circuit-breaking strategy to the parent
circuit breaker. Contrary to the current implementation which only accounts for
memory reserved via child circuit breakers, the new strategy measures real heap
memory usage at the time of reservation. This allows us to be much more
aggressive with the circuit breaker limit so we bump it to 95% by default. The
new strategy is turned on by default and can be controlled with the new cluster
setting `indices.breaker.total.use_real_memory`.

Note that we turn it off for all integration tests with an internal test cluster
because it leads to spurious test failures which are of no value (we cannot
fully control heap memory usage in tests). All REST tests, however, will make
use of the real memory circuit breaker.

Relates #31767
2018-07-13 10:08:28 +02:00


[[breaking_70_indices_changes]]
=== Indices changes
==== `:` is no longer allowed in index name
Due to cross-cluster search using `:` to separate a cluster and index name,
index names may no longer contain `:`.
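
The ambiguity can be illustrated with a minimal sketch. Both helper names below are hypothetical and are not part of Elasticsearch; the sketch only assumes that a cross-cluster expression has the form `cluster:index`.

[source,python]
----
from typing import Optional, Tuple


def split_index_expression(expression: str) -> Tuple[Optional[str], str]:
    """Split 'cluster:index' into (cluster, index).

    The cluster part is None when the expression has no ':' and therefore
    refers to a local index.
    """
    cluster, separator, index = expression.partition(":")
    if not separator:
        return None, expression
    return cluster, index


def validate_index_name(name: str) -> None:
    """Reject index names containing ':' (illustrative client-side check only)."""
    if ":" in name:
        raise ValueError(f"index name [{name}] must not contain ':'")
----

With this split rule, a local index named `cluster_one:logs` would be indistinguishable from the index `logs` on the remote cluster `cluster_one`, which is why the character is now rejected.
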
==== `index.unassigned.node_left.delayed_timeout` may no longer be negative
Negative values were interpreted as zero in earlier versions but are no
longer accepted.
==== `_flush` and `_force_merge` will no longer refresh
In previous versions, issuing a `_flush` or `_force_merge` (with `flush=true`)
had the undocumented side effect of refreshing the index, which made new documents
visible to searches and non-realtime GET operations. These operations no longer
have this side effect. To make documents visible, an explicit `_refresh`
call is needed unless the index is refreshed by the internal scheduler.
==== Limit to the difference between max_size and min_size in NGramTokenFilter and NGramTokenizer
To safeguard against creating too many index terms, the difference between `max_gram` and
`min_gram` in `NGramTokenFilter` and `NGramTokenizer` has been limited to 1. This default
limit can be changed with the index setting `index.max_ngram_diff`. Note that if the limit is
exceeded, an error is thrown only for new indices. For existing pre-7.0 indices, a deprecation
warning is logged.
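
To see why a large difference is costly, the following sketch counts the character n-grams produced for a single token. The function names are hypothetical, and the check only mirrors the rule described above (using the tokenizer's `min_gram` and `max_gram` parameters); it is not the server's implementation.

[source,python]
----
def count_ngrams(token_length: int, min_gram: int, max_gram: int) -> int:
    """Number of character n-grams emitted for one token of the given length."""
    return sum(
        token_length - n + 1
        for n in range(min_gram, max_gram + 1)
        if n <= token_length
    )


def check_ngram_diff(min_gram: int, max_gram: int, max_ngram_diff: int = 1) -> None:
    """Raise if max_gram - min_gram exceeds the allowed difference (default 1)."""
    diff = max_gram - min_gram
    if diff > max_ngram_diff:
        raise ValueError(
            f"difference between max_gram and min_gram is {diff}, "
            f"but must not exceed {max_ngram_diff}"
        )
----

For a 10-character token, `min_gram=1` and `max_gram=2` yield 19 terms, while `min_gram=1` and `max_gram=10` yield 55.
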
==== Limit to the difference between max_size and min_size in ShingleTokenFilter
To safeguard against creating too many tokens, the difference between `max_shingle_size` and
`min_shingle_size` in `ShingleTokenFilter` has been limited to 3. This default
limit can be changed with the index setting `index.max_shingle_diff`. Note that if the limit is
exceeded, an error is thrown only for new indices. For existing pre-7.0 indices, a deprecation
warning is logged.
==== Document distribution changes
Indices created with version `7.0.0` onwards will have an automatic `index.number_of_routing_shards`
value set. This might change how documents are distributed across shards depending on how many
shards the index has. In order to maintain the exact same distribution as a pre-`7.0.0` index,
`index.number_of_routing_shards` must be set to the value of `index.number_of_shards` at index creation time.
Note: if the number of routing shards equals the number of shards, `_split` operations are not supported.
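
The effect on distribution can be sketched as follows. The sketch assumes that a document's shard is derived roughly as `(hash % number_of_routing_shards) / (number_of_routing_shards / number_of_shards)`; the function name, the hash values, and the routing shard count of 640 are illustrative assumptions, not values taken from this document.

[source,python]
----
def shard_for(routing_hash: int, number_of_shards: int,
              number_of_routing_shards: int) -> int:
    """Map a routing hash to a shard number under the assumed routing formula."""
    if number_of_routing_shards % number_of_shards != 0:
        raise ValueError("number_of_routing_shards must be a multiple of number_of_shards")
    routing_factor = number_of_routing_shards // number_of_shards
    # Python's % already returns a non-negative result for a positive modulus.
    return (routing_hash % number_of_routing_shards) // routing_factor
----

With five shards, a routing hash of 7 lands on shard 2 when the routing shard count equals the shard count, but on shard 0 when the routing shard count is 640, so the same document can end up on a different shard.
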
==== Skipped background refresh on search idle shards
Shards belonging to an index that does not have an explicit
`index.refresh_interval` configured will no longer refresh in the background
once the shard becomes "search idle", that is, the shard hasn't seen any search
traffic for the period set by `index.search.idle.after` (defaults to `30s`). Searches
that access a search idle shard will be "parked" until the next refresh
happens. Indexing requests with `wait_for_refresh` will also trigger
a background refresh.
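
The rule described above can be summarized in a small decision sketch. The function names and the representation of time as plain seconds are assumptions made for illustration; the sketch does not model parked searches or refreshes triggered by indexing requests.

[source,python]
----
def is_search_idle(seconds_since_last_search: float,
                   idle_after_seconds: float = 30.0) -> bool:
    """A shard counts as search idle once it has seen no search for the idle period."""
    return seconds_since_last_search >= idle_after_seconds


def background_refresh_runs(has_explicit_refresh_interval: bool,
                            seconds_since_last_search: float,
                            idle_after_seconds: float = 30.0) -> bool:
    """Whether the scheduled background refresh still runs for a shard."""
    if has_explicit_refresh_interval:
        # An explicitly configured index.refresh_interval opts out of the idle behavior.
        return True
    return not is_search_idle(seconds_since_last_search, idle_after_seconds)
----
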
==== Remove deprecated url parameters for Clear Indices Cache API
The following previously deprecated URL parameters have been removed:
* `filter` - use `query` instead
* `filter_cache` - use `query` instead
* `request_cache` - use `request` instead
* `field_data` - use `fielddata` instead
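
Client code that still sends the removed parameters could be migrated with a small mapping such as the one below. The helper name is hypothetical; only the parameter names come from the list above.

[source,python]
----
# Removed parameter name -> replacement, as listed above.
RENAMED_PARAMS = {
    "filter": "query",
    "filter_cache": "query",
    "request_cache": "request",
    "field_data": "fielddata",
}


def migrate_clear_cache_params(params: dict) -> dict:
    """Return a copy of the request parameters with removed names replaced."""
    migrated = {}
    for name, value in params.items():
        migrated[RENAMED_PARAMS.get(name, name)] = value
    return migrated
----
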
==== `network.breaker.inflight_requests.overhead` increased to 2
Previously, the in-flight requests circuit breaker considered only the raw byte representation.
With the value of `network.breaker.inflight_requests.overhead` bumped from 1 to 2, this circuit
breaker now also accounts for the memory overhead of representing the request as a structured object.
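
As a rough illustration, assuming the breaker multiplies the raw request size by the overhead factor, the accounted size of a request doubles. The function name is hypothetical.

[source,python]
----
def estimated_in_flight_bytes(raw_request_bytes: int, overhead: float = 2.0) -> float:
    """Bytes accounted by the breaker: raw size multiplied by the overhead factor."""
    return raw_request_bytes * overhead
----

Under this assumption, a 100 MiB request that was previously accounted as 100 MiB now counts as 200 MiB against the breaker limit.
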
==== Parent circuit breaker changes
The parent circuit breaker defines a new setting `indices.breaker.total.use_real_memory` which is
`true` by default. This means that the parent circuit breaker will trip based on currently used
heap memory instead of only considering the memory reserved by child circuit breakers. When this
setting is `true`, the default parent breaker limit also changes from 70% to 95% of the JVM heap size.
The previous behavior can be restored by setting `indices.breaker.total.use_real_memory` to `false`.
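
The difference between the two strategies can be sketched as follows. This is a simplified model under stated assumptions, not the actual implementation: the function names are hypothetical, the new reservation is added to the measured usage in both modes, and the 70% limit is applied whenever the setting is `false`.

[source,python]
----
def parent_limit_bytes(heap_max: int, use_real_memory: bool = True) -> int:
    """Default parent breaker limit: 95% of the heap with real memory, else 70%."""
    percent = 95 if use_real_memory else 70
    return heap_max * percent // 100


def parent_breaker_trips(heap_max: int, heap_used: int, child_reserved: int,
                         new_bytes: int, use_real_memory: bool = True) -> bool:
    """Decide whether reserving new_bytes would trip the parent breaker."""
    if use_real_memory:
        # Real memory mode: look at the heap actually in use right now.
        usage = heap_used + new_bytes
    else:
        # Previous behavior: only memory reserved via child breakers counts.
        usage = child_reserved + new_bytes
    return usage > parent_limit_bytes(heap_max, use_real_memory)
----

For example, with a heap of 1000 units, 900 units in use, and 100 units reserved by child breakers, a reservation of 60 units trips the breaker in real memory mode (960 exceeds 950) but not under the previous behavior (160 does not exceed 700).
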