By default term vectors are now realtime, as opposed to previously near
realtime. If they are not found in the index, they will be generated on the
fly. The document is fetched from the transaction log and treated as an
artificial document. One can set `realtime` parameter to `false` in order to
disable this functionality. This consequently makes the MLT query realtime in
fetching documents, as it previsouly used to be before switching from using
the multi get API to the mtv API.
Closes#7846
Currently DEBUG logs can get very verbose because IndicesClusterStateService logs the complete mapping with every mapping update. We should suppress it if long in DEBUG mode and always log the full one in TRACE.
Closes#7949
When sending a multicast ping, there is no way to determine how long it will take before all nodes will respond. Currently we send two pings (one at start, one after half timeout) and wait until the ping timeout has passed for all responses to come back. However, if all nodes are fast to respond, there is a gap relatively large between the moment that pings were gathered and the election that is based on them. This commits adds a last ping round (at timeout) where we know the number of nodes we expect to receive answers from. Once all nodes responded, we complete the pinging.
Closes#7924
Due to component start order we may process an incoming ping while the ZenDiscovery module is not yet started. This leads to exception (from which we recover correctly, but the logs are note nice). UnicastZenPing should only start processing pings if it is started. We previously processed if not closed or stopped.
Closes#7950
This commit makes the lookup structures that are used for mappings immutable.
When changes are required, a new instance is created while the current instance
is left unmodified. This is done efficiently thanks to a hash table
implementation based on a array hash trie, see
org.elasticsearch.common.collect.CopyOnWriteHashMap.
ManyMappingsBenchmark returns indexing times that are similar to the ones that
can be observed in current master.
Ultimately, I would like to see if we can make mappings completely immutable as
well and updated atomically. This is not trivial however, eg. because of dynamic
mappings. So here is a first baby step that should help move towards that
direction.
Close#7486
Don't insist on log file removal until after usage is printed.
Some simple Python code improvements (x.find(y) != -1 --> y in x)
Make sure the git area is "clean" (has no unpushed changes, has pulled
all changes, has no untracked files)
Add label color detail when creating next github version label.
Closes#7913
The Put Warmer API executes the search encapsulated in the warmer before accepting it. This requires that at least one shard will be started. The tests used to use ensureGreen to check for that because of a publish timeout of 0 (needed to check the ack mechanism) that doesn't guarantee the shard is really started - just that the master has changed the CS to say so. This commit changes the ensureGreen to a the indexing of a single document.
Adds a check to make sure that all ids in the query are either strings
or numbers. This is to prevent the case where a user accidentally
specifies:
"ids": [["1", "2"]]
(note the double array)
With this change, an exception will be thrown since the second "[" is
not a string or number, it is a Token.START_ARRAY.
Fixes#7686
The PluginManager had a subtle bug in case the config directory was not in the
es home directory - which is always true in case of packaging.
This fixes the plugin manager, so that when specifying a path.home and a
path.conf variable on the commandline, the plugin manager acts
appropriately.
Before this change all persistent custom metadata is stored as part of snapshot. It requires us to remove repositories metadata later during recovery process. This change allows custom metadata to specify whether or not it should be stored as part of a snapshot.
Fixes#7900
Failing a parent breaker check is eventually consistent, so the test
could fail the parent limit, throw an exception, and before being
adjusted back down, increment more and throw a circuit breaking
exception on the child. This increases the child's limit, to ensure
we're only testing the parent limit.
It adds an additional assert to ensure that the breaker total is
correctly re-adjusted when the parent breaker has been tripped.
This does the following:
* Make 'force' flag only build a merge if the delegate MP returned no merges
* Add async handling for 'flush' when 'waitForMerges' is false
* Remove flush at the beginning of optimize. This is something the user can
do if they wish, before calling optimize.
closes#7886closes#7904closes#7920
Remove the creation of a node client if not there before each test through setup method. `numClientNodes` makes sure that the client node gets created during suite cluster initialization.
Previously, the only way to specify a document not present in the index was to
use `like_text`. This would usually lead to complex queries made of multiple
MLT queries per document field. This commit adds the ability to the MLT query
to directly specify documents not present in the index (artificial documents).
The syntax is similar to the Percolator API or to the Multi Term Vector API.
Closes#7725
Create client nodes using `node.client: true` instead of `node.data: false` and `node.master: false`.
We should create client nodes in our test infra using the `node.client:true` settings as that is the one that users use, and the one that we use as well in `ClientNodePredicate` thus we end up not finding client nodes otherwise as they weren't created with the proper setting.
Updated also the `DataNodePredicate` so that `client: true` is enough, no need for `data: false` as well.
Closes#7911
The minimum number of optional should clauses of the generated query to match
can now be set using the more extensive minimum should match syntax. This
makes the `percent_terms_to_match` parameter deprecated, and replaced in favor
to a new `minimum_should_match` parameter.
Closes#7898
With #7834, we simplified ZenDiscovery by making it use the current cluster state for all it's decision. This had the side effect a node may start it's Master FD before the master has fully processed that cluster state update that adds that node (or elects the master master). This is due to the fact that master FD is started when a node receives a cluster state from the master but the master it self may still be publishing to other node.
This commit makes sure that a master FD ping is only failed once we know that there is no current cluster state update in progress.
Closes#7908
It was only used by `readSource`, it has been changed to return a
Translog.Operation, which can have .getSource() called on it to return
the source. `readSource` has been removed.
This also removes the checked IOException, any exception thrown is
unexpected and should throw a runtime exception.
Moves the ReleasableBytesStreamOutput allocation into the body of the
try-catch block so the lock can be released in the event of an exception
during allocation.