This commit adds support for random merge policies set for every
index created in an AbstractIntegrationTest. It will either set
'logbyte', 'logdoc' or 'tiered' merge policy as well as a random
value for compound files.
Added checks to IndexRequest#source(Object...) to ensure and even number
of parameters. This method otherwise throws an AIOOBException which is
confusing to users and doesn't report the root cause of the problem.
This commit allows for using Lucene doc values as a backend for field data,
moving the cost of building field data from the refresh operation to indexing.
In addition, Lucene doc values can be stored on disk (partially, or even
entirely), so that memory management is done at the operating system level
(file-system cache) instead of the JVM, avoiding long pauses during major
collections due to large heaps.
So far doc values are supported on numeric types and non-analyzed strings
(index:no or index:not_analyzed). Under the hood, it uses SORTED_SET doc values
which is the only type to support multi-valued fields. Since the field data API
set is a bit wider than the doc values API set, some operations are not
supported:
- field data filtering: this will fail if doc values are enabled,
- field data cache clearing, even for memory-based doc values formats,
- getting the memory usage for a specific field,
- knowing whether a field is actually multi-valued.
This commit also allows for configuring doc-values formats on a per-field basis
similarly to postings formats. In particular the doc values format of the
_version field can be configured through its own field mapper (it used to be
handled in UidFieldMapper previously).
Closes#3806
Migrate SearchContext.Rewrite to Releasable. All the parent child queries are now implemented as Lucene queries and the state is kept in the Weight.
Completely disable caching for `has_child` and `has_parent` filters, this has never worked and it also can also never work. The matching docIds are cached per segment while the collection of parent ids is top level.
Closes#3822
* Merged segments are now warmed-up at the end of the merge operation instead
of _refresh, so that _refresh doesn't pay the price for the warm-up of merged
segments, which is often higher than flushed segments because of their size.
* Even when no _warmer is registered, some basic warm-up of the segments is
performed: norms, doc values (_version). This should help a bit people who
forget to register warmers.
* Eager loading support for the parent id cache and field data: when one
can't predict what terms will be present in the index, it is tempting to use
a match_all query in a warmer, but in that case, query execution might not be
much faster than field data loading so having a warmer that only loads field
data without running a query can be useful.
Closes#3819
I tried to install a plugin directly from disk, but used the wrong arguments. Totally my fault for not RTFM properly.
However, by following the instructions printed out by `bin/plugin`, I ended up deleting my $ES/bin directory.
```
$ pwd
/tmp/elasticsearch-0.90.5
$ ls
LICENSE.txt NOTICE.txt README.textile bin config lib
$ bin/plugin --install file:///tmp/foo.zip
-> Installing file:///tmp/foo.zip...
Failed to install file:///tmp/foo.zip, reason: plugin directory /tmp/elasticsearch-0.90.5/plugins already exists. To update the plugin, uninstall it first using -remove file:///tmp/foo.zip command
$ bin/plugin -remove file:///tmp/foo.zip
-> Removing file:///tmp/foo.zip
Removed file:///tmp/foo.zip
$ ls
LICENSE.txt NOTICE.txt README.textile config lib
```
I reproduced the problem in 0.90.5 and the latest master.
Closes#3847.
Adding the mappers first could cause the wrong FieldMappers to be used during object parsing
Also:
Removed MergeContext.newFieldMappers, MergeContext.newObjectMappers and relatives as they are not used anymore.
Similarly removed MergeContext.addFieldDataChange & fieldMapperChanges as they were not used.
Closes#3844
Due to a change in 0.90 some exceptions may show up when starting a river,
even though this is not a problem as the shard has not been initialized
yet and a retry is scheduled.
Only valid fields (input, output, weight and payload) are now allowed inside a completion field.
This makes sure that typos like "inputs" are caught on indexing.
When parsing a filter, we effectively have a case where we need to ignore a filter. For example, when an "and" filter has no elements. Ignoring a filter is different compared to matching on all or matching on none, because an and filter with no elements should simply be ignored in the context of a bool filter for example, regardless if its used within a must or a must_not filter.
We should be consistent in our codebase when handling these use case. See also #3809closes#3838
Added a load_average_format=array|hash param to OsStats to allow serializing load averages into a hash (with keys 1m,5m, 15m).
Added node_info_format to NodeStats, allowing to suppress node info output
Also:
We always return a valid json back
If there are no templates define we return 200 OK ( as opposed to a request for a specific template id which returns 404)
Closes#3812
If 'pattern_capture' tokenfilter is create / mapped without a 'patterns'
settings we now throw an exception since this is a misconfiguration and
likely due to the similar settings on related token filters.
Closes#3808
If nodes are shutting down we close thread pools and throw
'EsRejectedExcutionException'. This commit handles these exceptions
gracefully if throw during Ping execution.
Index settings can now override default behavior for 'double write'
and 'no delete open files' on MockDirectoryWrapper if tests use these
options in a legit. way. ie. in a restore situation a double write
is a legit operation.
This can go wrong if indices with the same name are repeatably created and deleted.
UUIDs can not be null anymore. If UUID is not available `_na_` will be used as a value.
Also - some minor clean up in ShardStateAction where shard started events could be added twice to the to-be-applied list where the second instance will be ignored.
Closes#3783
instead of relying on the current cluster state from the cluster service, make sure to rely on the cluster state we get from the change event, this will allow us to move processing of the cluster event around potentially to before the local cluster state has been updated
Currently we fail tests is any searcher reference is pending. Yet,
on a slow machine the freeContext calls that are async could still be
in flight so if there are pending searchers we wait for a bit to make
sure we don't fail if a freeContext call is in flight.
The MockEngine now also contains the stack trace of the first close call
if a searcher is closed twice.
DateHistogramFacets can be used on fields that store dates in a numeric
(long) field using different resolutions like seconds instead of
milliseconds. This commit adds a test that checks if the factor is
applied correctly to scale it up to milliseconds.
Introduce a new internal, index shard level, post recovery state, where the shard moves to when its done with recovery. The shard will now move to started only once the cluster state with its respective cluster state level state is started.
This change allow to have more fine grained control over when to allow reads on a shard, resolving potential refresh temporal visibility aspects while indexing and issuign a refresh. By only allowing reads on started shards, and making sure we refresh right before we move to started
Added node restart capabilities to TestCluster
Trigger retry mechanism (onFailure method) instead of invoking transport service with null DiscoveryNode when no DiscoveryNode can be found for a ShardRouting.
Currently tests only run with node clients but eventually we want to
run all tests with randomly choosen node / transport clients. To enable
this during development and on test servers as a transition phase this
commit adds the ability to allow a fraction of the clients used in
tests to be transport clients. By default this is still disabled.
To enable transport clients in tests pass '-Dtests.client.ratio=[0..1]'
where '1.0' will force transport clients and '0.0' completely disables
them. If an empty string is passed the ratio is chosen at random for
each test.
Catching Throwable instead of Exception in TransportClient
and TransportClientNodesService and restore interrupted flag
if interrupt exception is caught and ignored
TransportClient doesn't add the initial nodes to the nodes list
if it doesn't retrieve any nodes from the listeners which can cause
the transport client to throw a 'NoNodeAvailableException' if the
'sniff' response didn't return any nodes. This situation can occure
if the client tries to get the listener nodes cluster state while that
node is not yet connected to any other nodes.