Since #13068 refresh and flush requests go to the primary first and are then replicated.
One difference to before is though that if a shard is not available (INITIALIZING for example)
we wait a little for an indexing request but for refresh we don't and just give up immediately.
Before, refresh requests were just send to the shards regardless of what their state is.
In tests we sometimes create an index, issue an indexing request, refresh and
then get the document. But we do not wait until all nodes know that all primaries have ben assigned.
Now potentially one node can be one cluster state behind and not know yet that
the shards have ben started. If the refresh is executed through this node then the
refresh request will silently fail on shards that are started already because from
the nodes perspective they are still initializing. As a consequence, documents
that expected to be available in the test are now not.
Example test failures are here: http://build-us-00.elastic.co/job/elasticsearch-20-oracle-jdk7/395/
This commit changes the timeout to 1m (default) to make sure we don't miss shards
when we refresh. This will trigger the same retry mechanism as for indexing requests.
We still have to make a decision if this change of behavior is acceptable.
see #13238
This allows reducing privileges with doPrivileged to work,
otherwise it will fail with NPE.
In general, if some code wants to do that, let it. The null
check is needed, even though ProtectionDomain(CodeSource, PermissionCollection)
is more than a bit misleading: "the current Policy will not be consulted".
Additionally add a defensive check for location, since the docs
there are even more confusing: https://bugs.openjdk.java.net/browse/JDK-8129972
The jdk policy impl has both these checks.
Several shard-level operations that previously broadcasted a request
per shard were converted to broadcast a request per node. This commit
converts upgrade action to this new model as well.
Closes#13204
Switch to use HTTPS by default for all hardcoded plugin URLs.
If users want to install via HTTP they can still specify a HTTP
URL manually.
Closes#12748
The current netty multiport tests bind on localhost and then try to connect
to 127.0.0.1, which may fail, if localhost is resolved to ipv6 by default.
This randomly chooses between 127.0.0.1, localhost and ::1 (if available) for
binding and then uses this throughout the test.
The path that a shard is allocated on is not taken into account when
we decide to move a shard away from a node because it passed a watermark.
Even worse we potentially moved away (relocated) a shard that was not even
allocated on that disk but on another on the node in question. This commit
adds a ShardRouting -> dataPath mapping to ClusterInfo that allows to identify
on which disk the shards are allocated on.
Relates to #13106
```sh
bin/plugin install lmenezes/elasticsearch-kopf/develop
-> Installing lmenezes/elasticsearch-kopf/develop...
Trying http://download.elastic.co/lmenezes/elasticsearch-kopf/elasticsearch-kopf-develop.zip ...
Trying http://search.maven.org/remotecontent?filepath=lmenezes/elasticsearch-kopf/develop/elasticsearch-kopf-develop.zip ...
Trying https://oss.sonatype.org/service/local/repositories/releases/content/lmenezes/elasticsearch-kopf/develop/elasticsearch-kopf-develop.zip ...
Trying https://github.com/lmenezes/elasticsearch-kopf/archive/develop.zip ...
Downloading .................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................DONE
Verifying https://github.com/lmenezes/elasticsearch-kopf/archive/develop.zip checksums if available ...
Trying https://github.com/lmenezes/elasticsearch-kopf/archive/master.zip ...
Downloading ....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................DONE
Verifying https://github.com/lmenezes/elasticsearch-kopf/archive/master.zip checksums if available ...
```
This happens because we don't have anymore ElasticsearchWrapperException here but standard java exceptions.
Closes#13196.
Currently, many shard-level operations are transported with a request
per shard via TransportBroadcastAction. These shard-level requests are
then submitted to unbounded execution queues for asynchronous execution
on the receiving node. This transport mechanism and stuffing of the
execution queues can be problematic on large clusters. A better
mechanism would be to aggregate the shard-level requests, transport
them via a single request per node, and execute the shard-level
operations serially on the receiving node.
This commit introduces TransportNodeBroadcastAction which is the
high-level mechanism for transporting the shard-level operations in a
single request per node. The shard-level operations are executed
serially on the receiving node and per-node shard-level results are
aggregated into a single response per node. These node-level results
are then aggregated into a single response to the initial request.
One item of note is a new mechanism for registering request handlers.
This mechanism enables registrants to provide a callback for
instantiating new instances of the request class. Doing this enables
the inner class to be instantiated with the context of its outer class.
This is done so that a single NodeRequest class can be defined rather
than defining a class per operation.
Closes#7990
Today we sum up the disk usage for the allocation decider which is broken since
we don't stripe across multiple data paths. Each shard has it's own private path
now but the allocation deciders still treat all paths as one big disk. This commit
adds allows allocation deciders to access the least used and most used path to make
better allocation decidsions upon canRemain and canAllocate calls.
Yet, this commit doesn't fix all the issues since we still can't tell which shard
can remain and which can't. This problem is out of scope in this commit and will be solved
in a followup commit.
Relates to #13106
Previously multiple clusters in the same JVM reused the same port ranges, leading to potential big gaps in port selection, which in turns causes unicast based discovery to fail, missing to find another node in the default 5 port range.
Also the previous logic had http use a range that is assigned to another JVMs.
We default the value to be 20x the value of a ping timeout, however we only use the legacy ping timeout settings value for the calculation.
Closes#13162
This commit removes and now forbids all uses of
com.google.common.collect.Lists across the codebase. This is the first
of many steps in the eventual removal of Guava as a dependency.
lessThanOrEqualTo is more appropriate when comparing _ttl than lessThan
because in rare cases, when tests run very fast, the ttl you fetch will
still equal the one you sent.