Today if a shard fails during initialization phase due to misconfiguration, broken disks,
missing analyzers, not installed plugins etc. elasticsaerch keeps on trying to initialize
or rather allocate that shard. Yet, in the worst case scenario this ends in an endless
allocation loop. To prevent this loop and all it's sideeffects like spamming log files over
and over again this commit adds an allocation decider that stops allocating a shard that
failed more than N times in a row to allocate. The number or retries can be configured via
`index.allocation.max_retry` and it's default is set to `5`. Once the setting is updated
shards with less failures than the number set per index will be allowed to allocate again.
Internally we maintain a counter on the UnassignedInfo that is reset to `0` once the shards
has been started.
Relates to #18417
This commit adds a variety of real disk metrics for the block devices
that back Elasticsearch data paths. A collection of statistics are read
from /proc/diskstats and are used to report the raw metrics for
operations and read/write bytes.
Relates #15915
Adds infrastructure so `gradle :docs:check` will extract tests from
snippets in the documentation and execute the tests. This is included
in `gradle check` so it should happen on CI and during a normal build.
By default each `// AUTOSENSE` snippet creates a unique REST test. These
tests are executed in a random order and the cluster is wiped between
each one. If multiple snippets chain together into a test you can annotate
all snippets after the first with `// TEST[continued]` to have the
generated tests for both snippets joined.
Snippets marked as `// TESTRESPONSE` are checked against the response
of the last action.
See docs/README.asciidoc for lots more.
Closes#12583. That issue is about catching bugs in the docs during build.
This catches *some* bugs in the docs during build which is a good start.
By default, tasks are grouped by node. However, task execution in elasticsearch can be quite complex and an individual task that runs on a coordinating node can have many subtasks running on other nodes in the cluster. This commit makes it possible to list task grouped by common parents instead of by node. When this option is enabled all subtask are grouped under the coordinating node task that started all subtasks in the group. To group tasks by common parents, use the following syntax:
GET /tasks?group_by=parents
This adds a new `/_cluster/allocation/explain` API that explains why a
shard can or cannot be allocated to nodes in the cluster. Additionally,
it will show where the master *desires* to put the shard, according to
the `ShardsAllocator`.
It looks like this:
```
GET /_cluster/allocation/explain?pretty
{
"index": "only-foo",
"shard": 0,
"primary": false
}
```
Though, you can optionally send an empty body, which means "explain the
allocation for the first unassigned shard you find".
The output when a shard is unassigned looks like this:
```
{
"shard" : {
"index" : "only-foo",
"index_uuid" : "KnW0-zELRs6PK84l0r38ZA",
"id" : 0,
"primary" : false
},
"assigned" : false,
"unassigned_info" : {
"reason" : "INDEX_CREATED",
"at" : "2016-03-22T20:04:23.620Z"
},
"nodes" : {
"V-Spi0AyRZ6ZvKbaI3691w" : {
"node_name" : "Susan Storm",
"node_attributes" : {
"bar" : "baz"
},
"final_decision" : "NO",
"weight" : 0.06666675,
"decisions" : [ {
"decider" : "filter",
"decision" : "NO",
"explanation" : "node does not match index include filters [foo:\"bar\"]"
} ]
},
"Qc6VL8c5RWaw1qXZ0Rg57g" : {
"node_name" : "Slipstream",
"node_attributes" : {
"bar" : "baz",
"foo" : "bar"
},
"final_decision" : "NO",
"weight" : -1.3833332,
"decisions" : [ {
"decider" : "same_shard",
"decision" : "NO",
"explanation" : "the shard cannot be allocated on the same node id [Qc6VL8c5RWaw1qXZ0Rg57g] on which it already exists"
} ]
},
"PzdyMZGXQdGhqTJHF_hGgA" : {
"node_name" : "The Symbiote",
"node_attributes" : { },
"final_decision" : "NO",
"weight" : 2.3166666,
"decisions" : [ {
"decider" : "filter",
"decision" : "NO",
"explanation" : "node does not match index include filters [foo:\"bar\"]"
} ]
}
}
}
```
And when the shard *is* assigned, the output looks like:
```
{
"shard" : {
"index" : "only-foo",
"index_uuid" : "KnW0-zELRs6PK84l0r38ZA",
"id" : 0,
"primary" : true
},
"assigned" : true,
"assigned_node_id" : "Qc6VL8c5RWaw1qXZ0Rg57g",
"nodes" : {
"V-Spi0AyRZ6ZvKbaI3691w" : {
"node_name" : "Susan Storm",
"node_attributes" : {
"bar" : "baz"
},
"final_decision" : "NO",
"weight" : 1.4499999,
"decisions" : [ {
"decider" : "filter",
"decision" : "NO",
"explanation" : "node does not match index include filters [foo:\"bar\"]"
} ]
},
"Qc6VL8c5RWaw1qXZ0Rg57g" : {
"node_name" : "Slipstream",
"node_attributes" : {
"bar" : "baz",
"foo" : "bar"
},
"final_decision" : "CURRENTLY_ASSIGNED",
"weight" : 0.0,
"decisions" : [ {
"decider" : "same_shard",
"decision" : "NO",
"explanation" : "the shard cannot be allocated on the same node id [Qc6VL8c5RWaw1qXZ0Rg57g] on which it already exists"
} ]
},
"PzdyMZGXQdGhqTJHF_hGgA" : {
"node_name" : "The Symbiote",
"node_attributes" : { },
"final_decision" : "NO",
"weight" : 3.6999998,
"decisions" : [ {
"decider" : "filter",
"decision" : "NO",
"explanation" : "node does not match index include filters [foo:\"bar\"]"
} ]
}
}
}
```
Only "NO" decisions are returned by default, but all decisions can be
shown by specifying the `?include_yes_decisions=true` parameter in the
request.
Resolves#14593
The ingest stats include the following statistics:
* `ingest.total.count`- The total number of document ingested during the lifetime of this node
* `ingest.total.time_in_millis` - The total time spent on ingest preprocessing documents during the lifetime of this node
* `ingest.total.current` - The total number of documents currently being ingested.
* `ingest.total.failed` - The total number ingest preprocessing operations failed during the lifetime of this node
Also these stats are returned on a per pipeline basis.
Internally the put pipeline API uses this information in node info API to validate if all specified processors in a pipeline exist on all nodes in the cluster.
The cluster stats api now returns counts for each node role. The `master_data`, `master_only`, `data_only` and `client` fields have been removed from the response in favour of `master`, `data`, `ingest` and `coordinating_only`. The same node can have multiple roles, hence contribute to multiple roles counts. Every node is implicitly a coordinating node, so whenever a node has no explicit roles, it will be counted as coordinating only.
This commit modifies the load_average in the node stats API response
to be an object containing the one-minute, five-minute and
fifteen-minute load averages as fields (if those values are
available). Additionally, this commit modifies the cat nodes API
response to format the one-minute, five-minute and fifteen-minute load
averages as null if any of the respective values are not available.
Current processors setting is not reflected in nodes info API
("os.available_processors"). Add os.allocated_processors to shows
actual number of processors that we are using.
This commit consolidates several abstractions on the shard level in
ordinary classes not managed by the shard level guice injector.
Several classes have been collapsed into IndexShard and IndexShardGatewayService
was cleaned up to be more lightweight and self-contained. It has also been moved into
the index.shard package and it's operation is renamed from recovery from "gateway" to recovery
from "store" or "shard_store".
Closes#11847
we currently don't expose this.
This adds the following to the OS section of `_nodes`:
```
"os": {
"name": "Mac OS X",
...
}
```
and the following to the OS section of `_cluster/stats`:
```
"os": {
...
"names": [
{
"name": "Mac OS X",
"count": 1
}
],
...
},
```
Closes#11807
This removes Elasticsearch's filter cache and uses Lucene's instead. It has some
implications:
- custom cache keys (`_cache_key`) are unsupported
- decisions are made internally and can't be overridden by users ('_cache`)
- not only filters can be cached but also all queries that do not need scores
- parent/child queries can now be cached, however cached entries are only
valid for the current top-level reader so in practice it will likely only
be used on read-only indices
- the cache deduplicates filters, which plays nicer with large keys (eg. `terms`)
- better stats: we already had ram usage and evictions, but now also hit count,
miss count, lookup count, number of cached doc id sets and current number of
doc id sets in the cache
- dynamically changing the filter cache size is not supported anymore
Internally, an important change is that it removes the NoCacheFilter infrastructure
in favour of making Query.rewrite specializing the query for the current reader so
that it will only be cached on this reader (look for IndexCacheableQuery).
Note that consuming filters with the query API (createWeight/scorer) instead of
the filter API (getDocIdSet) is important for parent/child queries because
otherwise a QueryWrapperFilter(ParentQuery) would run the wrapped query per
segment while relations might be cross segments.