Unassigned meta includes additional information as to why a shard is unassigned, this is especially handy when a shard moves to unassigned due to node leaving or shard failure.
The additional data is provided as part of the cluster state, and as part of `_cat/shards` API.
The additional meta includes the timestamp that the shard has moved to unassigned, allowing us in the future to build functionality such as delay allocation due to node leaving until a copy of the shard is found.
closes#11653
Since the /var/run/elasticsearch directory is cleaned when the operating system starts, the init.d script must ensure that the PID_DIR is correctly created.
Closes#11594
The cache we are using for fielddata reclaims memory lazily/asynchronously. While this
helps with throughput this is an issue when a clear operation is issued manually since
memory is not reclaimed immediately. Since our clear methods already perform in linear
time anyway, this commit changes the fielddata cache to reclaim memory synchronously
when a clear command is issued manually. However, it remains lazy in case cache entries
are invalidated because segments or readers are closed, which is important since such
events happen all the time.
Close#11695
Some of our Java api builders had wrong logic when it comes to serializing the query in json format, resulting in missing fields like _name. Also, regexp parser was ignoring the _name field.
Closes#11694
`_timestamp` uses NumericDocValues instead of SortedNumericDocValues like other
numeric fields since it is guaranteed to be single-valued. However, we don't
need a different fielddata impl for it since DocValues.getSortedNumeric already
falls back to NUMERIC doc values if SORTED_NUMERIC are not available.
[maven] clean pom.xml
In Maven parent project, in dependency management, we should only declare which versions of 3rd party jars we want to use but not force any scope.
It makes then more obvious in modules what is exactly the scope of any dependency.
For example, one could imagine importing `jimfs` as a `compile` dependency in another module/plugin with:
```xml
<dependency>
<groupId>com.google.jimfs</groupId>
<artifactId>jimfs</artifactId>
</dependency>
```
But it won't work as expected as the default maven `scope` should be `compile` but here it's `test` as defined in the parent project.
So, if you want to use this lib for tests, you should simply define:
```xml
<dependency>
<groupId>com.google.jimfs</groupId>
<artifactId>jimfs</artifactId>
<scope>test</scope>
</dependency>
```
We also remove `maven-s3-wagon` from gce plugin as it's not used.
This was previously a container for an ObjectMapper, along with the
DocumentMapper that ObjectMapper came from. However, there was
only one use of needing the associated DocumentMapper, and that
wasn't actually used.
In Maven parent project, in dependency management, we should only declare which versions of 3rd party jars we want to use but not force any scope.
It makes then more obvious in modules what is exactly the scope of any dependency.
For example, one could imagine importing `jimfs` as a `compile` dependency in another module/plugin with:
```xml
<dependency>
<groupId>com.google.jimfs</groupId>
<artifactId>jimfs</artifactId>
</dependency>
```
But it won't work as expected as the default maven `scope` should be `compile` but here it's `test` as defined in the parent project.
So, if you want to use this lib for tests, you should simply define:
```xml
<dependency>
<groupId>com.google.jimfs</groupId>
<artifactId>jimfs</artifactId>
<scope>test</scope>
</dependency>
```
We also remove `maven-s3-wagon` from gce plugin as it's not used.
Now that doc values are the default for fielddata, specialized in-memory
formats are becoming an esoteric option. This commit removes such formats:
- `fst` on string fields,
- `compressed` on geo points.
I also removed documentation and tests that the fielddata cache is shared if
you change the format, since this is only true for in-memory fielddata formats
(given that for doc values, the caching is done directly in Lucene).
The script APIs have been deprecated long ago we can now remove them.
This commit still keeps the parsing code since it might be used in a
query that is still stuck in transaction log. This issue should be discussed
elsewhere.
Closes#11619
In order to restrict a single set of field type settings for a given
field name across an index, we need the ability to compare field types.
This change adds equals and hashcode, as well as tests for every field
type.
Make sure there is a single place where shard routing move to unassigned, so we can add additional metadata when it does, also, simplify shard routing implementations a bit
closes#11634
The AsyncShardFetch retrieves shard information from the different nodes in order to detirment the best location for unassigned shards. The class uses TransportNodesListGatewayStartedShards and TransportNodesListShardStoreMetaData in order to fetch this information. These actions, inherit from TransportNodesAction and are activated using a list of node ids. Those node ids are extracted from the cluster state that is used to assign shards.
If we perform a reroute and adding new news in the same cluster state update task, it is possible that the AsyncShardFetch administration is based on
a different cluster state then the one used by TransportNodesAction to resolve nodes. This can cause a problem since TransportNodesAction filters away unkown nodes, causing the administration in AsyncShardFetch to get confused.
This commit fixes this allowing to override node resolving in TransportNodesAction and uses the exact node ids transfered by AsyncShardFetch
Information about in-progress snapshot and restore processes is not really metadata and should be represented as a part of the cluster state similar to discovery nodes, routing table, and cluster blocks. Since in-progress snapshot and restore information is no longer part of metadata, this refactoring also enables us to handle cluster blocks in more consistent manner and allow creation of snapshots of a read-only cluster.
Closes#8102
This commit cleans up all the MergeScheduler infrastructure
and simplifies / removes all unneeded abstractions. The MergeScheduler
itself is now private to the Engine and all abstractions like Providers
that had support for multiple merge schedulers etc. are removed.
Closes#11602
This is helpful to track down the origin of pending_tasks that aren't
expected. In tests we catch this with an assert, but in production
asserts may not be enabled so we should at least add the class name.
Today we provide the ability to plug in MergePolicy and
we provide the once lucene ships with. We do not recommend to change
the default and even only a small number of expert users would ever touch
this. This commit removes the ancient log byte size and log doc count
merge policy providers, simplifies the MergePolicy wiring and makes the
tiered MP the one and only default. All notions of a merge policy has been
removed from the docs and should be deprecated in the previous version.
Closes#11588