When a search on a shard sent to a remote node fails, and a replica exists on the local node, the search is then executed on the network thread. This is problematic since we need to execute it on the actual search thread pool, but it can also explain #4519, where the get happens on the network thread and waits to send the get request until the network thread we use is freed (deadlock...)
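A minimal, self-contained sketch of the principle behind the fix (plain Java executors standing in for Elasticsearch's network and SEARCH thread pools; the class and method names are illustrative, not the actual code):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class HandOffToSearchPool {
    // Stand-in for the dedicated search thread pool (ThreadPool.Names.SEARCH in Elasticsearch).
    private static final ExecutorService SEARCH_POOL = Executors.newFixedThreadPool(4);

    // Invoked on the network (transport) thread when the remote shard attempt fails.
    static void onRemoteShardFailure(Runnable localReplicaSearch) {
        // Running localReplicaSearch.run() inline here would occupy the network thread,
        // and can deadlock if the search itself needs that thread to send more requests.
        SEARCH_POOL.execute(localReplicaSearch); // hand the work off to the search pool instead
    }

    public static void main(String[] args) {
        onRemoteShardFailure(() ->
                System.out.println("local replica search running on " + Thread.currentThread().getName()));
        SEARCH_POOL.shutdown();
    }
}
```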
fixes #4526
Note, this also re-enables the geo shape fetch test; this fix should solve it as well
Add a new index-level setting, `index.codec.bloom.load` (defaults to `true`), that controls whether the bloom filters will be loaded or not. This is an updateable setting that can be changed on a live index using the update settings API.
Note though, when this setting is updated, a fresh Lucene index will be reopened, potentially causing associated caches to be dropped.
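For example, turning bloom filter loading off through the Java API might look like this (a sketch assuming an already connected `Client`; the REST update settings API works the same way):

```java
import org.elasticsearch.client.Client;
import org.elasticsearch.common.settings.ImmutableSettings;

public class BloomLoadExample {
    // Stop loading bloom filters on a live index (assumes a connected Client).
    static void disableBloomLoading(Client client, String index) {
        client.admin().indices().prepareUpdateSettings(index)
                .setSettings(ImmutableSettings.settingsBuilder()
                        .put("index.codec.bloom.load", false) // updateable, takes effect on reopen
                        .build())
                .execute().actionGet();
    }
}
```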
closes #4525
Note, this change also disables returning the Lucene RAM usage stats, due to a bug in Lucene; relates to #4512
The shards in the set are mutated after they are added to the
set, so that the hash code no longer matches. For this reason
this used an identity hashset before, but the downside of that is
that the iteration order is not deterministic. We can just use a list,
since shard removal is a very rare action and the size of the list is
very small, so iteration is fast.
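A small standalone illustration of the underlying problem (the `MutableShard` class is hypothetical, not the actual shard routing class): once an element's hash-relevant state changes after insertion, a regular HashSet can no longer find it.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class MutatedHashDemo {
    static class MutableShard {
        int shardId;
        String state;

        MutableShard(int shardId, String state) { this.shardId = shardId; this.state = state; }

        @Override public boolean equals(Object o) {
            return o instanceof MutableShard
                    && ((MutableShard) o).shardId == shardId
                    && Objects.equals(((MutableShard) o).state, state);
        }

        @Override public int hashCode() { return Objects.hash(shardId, state); }
    }

    public static void main(String[] args) {
        Set<MutableShard> shards = new HashSet<>();
        MutableShard shard = new MutableShard(0, "INITIALIZING");
        shards.add(shard);

        shard.state = "STARTED";                    // mutation after insertion changes the hash code
        System.out.println(shards.contains(shard)); // false: the set looks in the wrong bucket
        System.out.println(shards.remove(shard));   // false: the element can no longer be removed
    }
}
```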
The memory used for the Lucene index (term dict, bloom filter, ...) can now be reported per segment using the segments API, and via the segments flag on node/indices stats.
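For reference, fetching the segments through the Java API looks roughly like this (a sketch assuming a connected `Client`; the accessors exposing the new per-segment memory figure on the response are not shown here):

```java
import org.elasticsearch.action.admin.indices.segments.IndicesSegmentResponse;
import org.elasticsearch.client.Client;

public class SegmentsExample {
    // Fetch the segments of an index; the response now also carries the Lucene
    // memory usage reported per segment (assumes a connected Client).
    static IndicesSegmentResponse segments(Client client, String index) {
        return client.admin().indices().prepareSegments(index).execute().actionGet();
    }
}
```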
closes #4512
A previous change introduced an identity hashset that has non-deterministic
iteration order, which kills the reproducibility of our unit tests if they fail.
This patch adds back deterministic allocations.
Make sure to evict an existing node with the same transport address as a new node that joins. This can happen, for example, when there is a bug in a cluster state event handler that causes the "old" node to not be evicted, or load on the master node that causes the "old" node's departure to take time to be processed.
closes #4503
Currently we try to find a replica for an allocated primary by
running through all shards in the cluster, while RoutingNodes already has
a data structure keyed by shard ID for this. We should look this up
directly rather than doing a linear scan. This improves shard allocation performance
by 5x.
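Conceptually the change is the difference between the two lookups below (hypothetical types and names, not the actual RoutingNodes code): a linear scan over every shard in the cluster versus a direct map lookup keyed by shard ID.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReplicaLookup {
    record ShardCopy(String shardId, boolean primary) {}

    // Before: O(number of shards in the cluster) per lookup.
    static List<ShardCopy> replicasByScan(List<ShardCopy> allShards, String shardId) {
        List<ShardCopy> replicas = new ArrayList<>();
        for (ShardCopy shard : allShards) {
            if (shard.shardId().equals(shardId) && !shard.primary()) {
                replicas.add(shard);
            }
        }
        return replicas;
    }

    // After: one direct lookup in a structure kept up to date and keyed by shard ID.
    static List<ShardCopy> replicasByMap(Map<String, List<ShardCopy>> assignedShardsById, String shardId) {
        List<ShardCopy> replicas = new ArrayList<>();
        List<ShardCopy> group = assignedShardsById.get(shardId);
        if (group != null) {
            for (ShardCopy shard : group) {
                if (!shard.primary()) {
                    replicas.add(shard);
                }
            }
        }
        return replicas;
    }
}
```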
This is an extreme case, exposed by a bug we had in our allocation in the local gateway, causing a cluster state that doesn't include a node in the nodes list but still has a shard in the routing table pointing at the non-existent node. Then, when a node on the same box comes back, it will cause the local shard data to be deleted because it thinks it is fully allocated on other nodes.
fixes #4502
We support three different notations for settings in templates:
* "settings" : { "index" : { "number_of_shards" : 12 } }
* "settings" : { "index.number_of_shards" : 12 }
* "settings" : { "number_of_shards" : 12 }
The last one was not supported by the fix in #4235.
This commit fixes the issue and uses randomized testing to exercise any of the three notations above when running integration tests.
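A minimal sketch of how the three notations can be normalized to the same flat, dotted keys before the template is applied (illustrative only, not the actual settings builder code):

```java
import java.util.HashMap;
import java.util.Map;

public class FlattenSettings {
    // Flatten nested settings maps into dotted keys and prefix bare keys with "index.".
    static Map<String, Object> normalize(Map<String, Object> settings) {
        Map<String, Object> flat = new HashMap<>();
        flatten("", settings, flat);
        Map<String, Object> normalized = new HashMap<>();
        for (Map.Entry<String, Object> e : flat.entrySet()) {
            String key = e.getKey().startsWith("index.") ? e.getKey() : "index." + e.getKey();
            normalized.put(key, e.getValue());
        }
        return normalized; // all three notations end up as {"index.number_of_shards": 12}
    }

    @SuppressWarnings("unchecked")
    private static void flatten(String prefix, Map<String, Object> source, Map<String, Object> target) {
        for (Map.Entry<String, Object> e : source.entrySet()) {
            String key = prefix.isEmpty() ? e.getKey() : prefix + "." + e.getKey();
            if (e.getValue() instanceof Map) {
                flatten(key, (Map<String, Object>) e.getValue(), target);
            } else {
                target.put(key, e.getValue());
            }
        }
    }
}
```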
Closes #4411
When we allocate unassigned shards we can terminate early for some
shards: if we already tried to allocate a replica and it got rejected,
we don't need to try the same replica again. We also
can check if certain nodes can't allocate any primaries or shards
at all and take those nodes out of the picture for the current round,
since that will not change within the round.
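Roughly, the first optimization looks like the sketch below (hypothetical names, not the actual allocator); the second, dropping nodes that cannot take anything this round, is indicated by a comment.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class EarlyTerminationSketch {
    record ShardCopy(String shardId) {}
    interface Node { boolean canAllocate(ShardCopy copy); }

    // One allocation round; returns the copies that remain unassigned.
    static List<ShardCopy> allocateRound(List<ShardCopy> unassigned, List<Node> nodes) {
        Set<String> rejectedEverywhere = new HashSet<>(); // shard IDs where a copy was rejected by every node
        List<ShardCopy> stillUnassigned = new ArrayList<>();

        for (ShardCopy copy : unassigned) {
            // An identical copy of this shard was already rejected by all nodes:
            // skip it instead of re-running the same decisions again this round.
            if (rejectedEverywhere.contains(copy.shardId())) {
                stillUnassigned.add(copy);
                continue;
            }
            boolean allocated = false;
            for (Node node : nodes) {
                if (node.canAllocate(copy)) {
                    allocated = true; // actual assignment omitted in this sketch
                    break;
                }
            }
            if (!allocated) {
                rejectedEverywhere.add(copy.shardId());
                stillUnassigned.add(copy);
            }
        }
        // Similarly, nodes that rejected every shard this round could be dropped from
        // `nodes` for the rest of the round, since their answer will not change.
        return stillUnassigned;
    }
}
```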
This commit allows trading precision for memory when storing geo points.
This new field data impl accepts a `precision` parameter that controls the
maximum expected error for storing coordinates. This option can be updated on
a live index with the PUT mapping API.
Default precision is 1cm, which requires 8 bytes per geo-point (50% memory
saving compared to using 2 doubles).
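The 8-byte figure can be sanity-checked with a bit of arithmetic (illustrative only, not the actual encoding used by the field data implementation): at 1cm resolution, longitude fits in about 32 bits and latitude in about 31, so a point takes roughly 8 bytes instead of the 16 bytes needed for two doubles.

```java
public class GeoPrecisionMath {
    public static void main(String[] args) {
        double circumferenceCm = 4_007_500_000d;  // ~40,075 km around the equator, in cm
        double meridianCm = circumferenceCm / 2;  // latitude only spans half the range

        long lonBits = (long) Math.ceil(Math.log(circumferenceCm) / Math.log(2)); // ~32 bits
        long latBits = (long) Math.ceil(Math.log(meridianCm) / Math.log(2));      // ~31 bits

        System.out.println("bits per point ~ " + (lonBits + latBits));         // ~63 bits -> 8 bytes
        System.out.println("vs 2 doubles   = " + 2 * Double.BYTES + " bytes"); // 16 bytes
    }
}
```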
Close #4386
The `text` query was replaced by the `match` query and has been
deprecated for quite a while.
The `field` query should be replaced by a `query_string` query with
the `default_field` specified.
Fixes #4033
If the term suggester is used, the results are merged depending on
the number of terms produced by the tokenizer / token filter. If a
term suggester is executed across multiple indices that share the
same field but with different analysis chains, we can't merge the
results anymore since tokens are out of order or have a different size.
This commit throws ESIllegalArgumentException if the number of entries
is not the same across all results.
Closes #3196
This commit changes field data configuration updates so that they are
immediately taken into account for loading new segments. The way it works
is that field data configuration is now cached separately from the field
data cache, meaning that it is now possible to clear the field data
configuration from IndexFieldDataService while the cache will stay around. The
next time that Elasticsearch reloads the field data configuration, it will
check if there is already a cache entry, and reuse it if it exists.
To disable field data loading, all that is required is to change the field
data format to "none" (supported by all field data types) using the update
mapping API. Elasticsearch will then refuse to load field data on any new
segment, but field data which has been loaded on the previous segments will
remain available. So you need to clear the field data cache in order to
reclaim memory (otherwise memory will be reclaimed more slowly, as segments get
merged).
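For example, via the Java API (a sketch assuming a connected `Client` and a string field named `tags`; the mapping body uses the `fielddata.format` notation, and the clear-cache call targets the field data cache):

```java
import org.elasticsearch.client.Client;

public class DisableFieldDataExample {
    // Switch the field data format of `tags` to "none" so new segments stop loading
    // field data for it, then clear the existing field data cache to reclaim memory.
    static void disableFieldData(Client client, String index, String type) {
        client.admin().indices().preparePutMapping(index)
                .setType(type)
                .setSource("{\"" + type + "\":{\"properties\":{"
                        + "\"tags\":{\"type\":\"string\",\"fielddata\":{\"format\":\"none\"}}}}}")
                .execute().actionGet();

        client.admin().indices().prepareClearCache(index)
                .setFieldDataCache(true)
                .execute().actionGet();
    }
}
```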
Close #4430, Close #4431
Currently we fail to reset the source shard's status to ACTIVE if we cancel
a relocation. If the shard is RELOCATING we need to reset it to state ACTIVE.
Closes #4457
Currently the RoutingNodes API allows modification of its internal state outside of the class.
This commit improves the APIs of `RoutingNodes` and `RoutingNode` so that internal state
is changed only within the classes themselves.
Closes #4458