Date math index name resolution enables you to search a range of time-series indices, rather than searching all of your time-series indices and filtering the results or maintaining aliases. Limiting the number of indices that are searched reduces the load on the cluster and improves execution performance. For example, if you are searching for errors in your daily logs, you can use a date math name template to restrict the search to the past two days.
This adds an `ExpressionResolver` implementation that is responsible for resolving date math expressions in index names. This resolver is evaluated before wildcard expressions are evaluated.
The supported format is `<static_name{date_math_expr{date_format|timezone_id}}>`; date math expressions must be enclosed within angle brackets. The `date_format` is optional and defaults to `YYYY.MM.dd`. The `timezone_id` is optional too and defaults to `utc`.
The `{` character can be escaped by placing `\\` before it.
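For illustration, a minimal sketch of what resolution does (not the actual resolver; the class name is made up and java.time stands in for the real date handling):

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

public class DateMathSketch {
    public static void main(String[] args) {
        // <logstash-{now/d}> -> static name plus the current date rounded down to the day,
        // formatted with the default pattern (java.time spells it "yyyy.MM.dd") in UTC.
        DateTimeFormatter format = DateTimeFormatter.ofPattern("yyyy.MM.dd");
        ZonedDateTime today = ZonedDateTime.now(ZoneOffset.UTC).truncatedTo(ChronoUnit.DAYS);
        System.out.println("logstash-" + format.format(today)); // e.g. logstash-2015.07.27
    }
}
```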
Closes#12059
When we rewrite to a MatchNoTermsQuery we were throwing out the boost, which
could lead to funky changes when the query against _all was in a bool query.
Previously we would write the entire ByteBuffer to the stream to serialise the HDRHistogram even if it was not all needed. Now we only write the bytes that were actually used in the ByteBuffer.
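For illustration, here is a minimal sketch of the idea using the HdrHistogram API directly (not the actual StreamOutput-based code; it relies on encodeIntoCompressedByteBuffer returning the number of bytes written):

```java
import java.io.DataOutput;
import java.io.IOException;
import java.nio.ByteBuffer;
import org.HdrHistogram.DoubleHistogram;

class HistogramSerializationSketch {
    // Write only the bytes the histogram actually encoded, not the buffer's whole backing array.
    static void writeHistogram(DoubleHistogram histogram, DataOutput out) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(histogram.getNeededByteBufferCapacity());
        int bytesWritten = histogram.encodeIntoCompressedByteBuffer(buffer, 9); // bytes actually used
        out.writeInt(bytesWritten);
        out.write(buffer.array(), 0, bytesWritten);
    }
}
```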
We currently test the toQuery method by comparing its return value with the output of createExpectedQuery. This seemed at first like a good deep test for the lucene generated queries, but it ends up causing a lot of code duplication between tests and prod code, which suggests that this test is not that useful after all.
This commit removes the requirement for createExpectedQuery; we test the lucene generated queries a bit less thoroughly, but we do still have a testToQuery method in BaseQueryTestCase. This test acts as a smoke test to verify that there are no issues with our toQuery method implementation for each query: it verifies that two equal queries result in the same lucene query, and that changing the boost value of a query affects the result of toQuery.
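Roughly, the smoke test looks like the sketch below; QB, copyQuery, createTestQueryBuilder and createContext are placeholders rather than the actual test infra:

```java
public void testToQuery() throws IOException {
    QB firstQuery = createTestQueryBuilder();
    QB secondQuery = copyQuery(firstQuery);              // equal to the first query
    Query first = firstQuery.toQuery(createContext());
    Query second = secondQuery.toQuery(createContext());
    assertEquals(first, second);                         // equal builders -> equal lucene queries

    secondQuery.boost(firstQuery.boost() + 1.0f);        // changing the boost must change the result
    assertNotEquals(first, secondQuery.toQuery(createContext()));
}
```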
Part of the previous createExpectedQuery logic can be moved to assertLuceneQuery using normal assertions, though we only do that where it makes sense.
This commit also adds support for boost and queryName to SpanMultiTermQueryParser and SpanMultiTermQueryBuilder, plus it removes no-op setters for queryName and boost from QueryFilterBuilder. Instead we simply declare in our test infra that the query filter doesn't support boost and queryName and we disable tests around these fields in that specific case. Boost and queryName support is also moved down to the QueryBuilder interface.
Closes#12473
A query against the _all field can end up with nothing to match in two ways:
1. The _all field is disabled.
2. There are documents in the index, the _all field is enabled, but there are
no fields in any of the documents.
In both of these cases we now rewrite the query to a MatchNoDocsQuery which
should be safe because there isn't anything to match.
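A minimal sketch of that rewrite; the two helper checks are hypothetical stand-ins for the real mapping lookups:

```java
import org.apache.lucene.search.MatchNoDocsQuery;
import org.apache.lucene.search.Query;

// When nothing can possibly match against _all, rewrite to MatchNoDocsQuery.
Query rewriteAllQuery(Query original) {
    boolean allFieldDisabled = !allFieldEnabled();            // case 1 above
    boolean noFieldsInAnyDocument = !anyDocumentHasFields();  // case 2 above
    if (allFieldDisabled || noFieldsInAnyDocument) {
        return new MatchNoDocsQuery();                        // safe: there isn't anything to match
    }
    return original;
}
```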
Closes#12439
When a node discovers shard content on disk which isn't used, we reach out to all other nodes that are supposed to have the shard active. Only once all of those have confirmed that the shard is active, the shard has no unassigned copies *and* no cluster state change has happened in the meantime, do we go and delete the shard folder.
Currently, after removing a shard, the IndicesStore checks with the indices service whether there are any shards still active for this index, and if not it tries to delete the entire index folder (unless on the master node, where we keep the index metadata around). This is wrong, as both that check and the protections in IndicesService.deleteIndexStore only make sure that there isn't any shard *in use* from that index. However, we may erroneously delete other unused shard copies on disk, without the proper safety guards described above.
Normally this is not a problem, as the missing copy will be recovered from another shard copy on another node (although a shame). However, in extremely rare cases involving multiple node failures/restarts where no shard copy is available (i.e., the shard is red), there are race conditions which can cause all shard copies to be deleted.
Instead, we should change the decision to clean up an index folder to be based on checking that the index directory is empty and contains no shards.
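A sketch of that check; the helper shape is illustrative, not the actual IndicesService code:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Only consider the index folder deletable once it no longer contains any shard directories.
static boolean indexFolderContainsNoShards(Path indexPath) throws IOException {
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(indexPath)) {
        for (Path child : stream) {
            if (Files.isDirectory(child)) {
                return false; // a shard folder is still present
            }
        }
    }
    return true;
}
```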
This change creates a proper `distribution` module in which we have the packaging for all four of our current packages:
* zip
* tar.gz
* rpm
* deb
Licenses have moved into the distribution project as well, as have the config/ and bin/ directories from the core/ project.
The RPM package is now built, if rpmbuild exists.
The bats tests have been moved as well.
Also the zip distribution now executes the REST integration tests.
As Robert pointed out on #12465, it has the undesirable property of relying on
the operating system. So it would be better to use a simple rule such as
checking whether the file name starts with a dot.
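For example, such a rule could be as simple as this sketch:

```java
import java.nio.file.Path;

// OS-independent "hidden file" rule based purely on the file name.
static boolean isHidden(Path path) {
    Path fileName = path.getFileName();
    return fileName != null && fileName.toString().startsWith(".");
}
```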
When an index is opened it will not be assigned to a node yet, but it also no longer has the closed state. Before, we only checked whether an index was either closed or assigned to the data node, and therefore the change from close->open was not written.
Some of the tests for meta data are redundant. Also, since they somewhat test service disruptions (starting a master with an empty data folder) we might move them to DiscoveryWithServiceDisruptionsTests.
Also, this commit adds a test for
https://github.com/elastic/elasticsearch/issues/11665
If we cache the type filter and we e.g. set its boost, which is now settable on all queries, the boost will change for all subsequent queries. We should rather create a new query every time.
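A sketch of the intended behaviour; the field name and query shape are illustrative:

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

// Hand out a fresh query instance per request, so a caller mutating e.g. the boost
// cannot affect later requests that would otherwise share a cached instance.
static Query typeFilter(String type) {
    return new TermQuery(new Term("_type", type));
}
```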
Major changes:
* Changed MetaData to hold alias and index lookup information in a single TreeMap instead of two separate maps.
* Moved the building of the alias / index lookup to the metadata builder.
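A simplified, self-contained sketch of the combined lookup idea (the real value types differ; here an index name maps to itself and an alias name maps to the indices it points at):

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.TreeSet;

class LookupSketch {
    static SortedMap<String, Set<String>> buildLookup(Map<String, List<String>> aliasesByIndex) {
        SortedMap<String, Set<String>> aliasAndIndexLookup = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : aliasesByIndex.entrySet()) {
            String index = entry.getKey();
            aliasAndIndexLookup.computeIfAbsent(index, k -> new TreeSet<>()).add(index);
            for (String alias : entry.getValue()) {
                aliasAndIndexLookup.computeIfAbsent(alias, k -> new TreeSet<>()).add(index);
            }
        }
        return aliasAndIndexLookup;
    }
}
```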
Settings are currently parsed by looping over the tokens until an END_OBJECT token is reached. However, this does not mean that the end of
the settings stream was reached. This can occur, for example, when parsing a YAML settings file with inconsistent indentation. Currently
in this case, some settings will be silently ignored. This commit forces a check that we have in fact reached the end of the settings
stream.
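The guard looks roughly like the fragment below, assuming an XContentParser-style API where nextToken() returns null once the input is exhausted:

```java
// After the top-level settings object has been consumed, any remaining token means the
// stream was not fully parsed (e.g. due to inconsistent YAML indentation).
if (parser.nextToken() != null) {
    throw new ElasticsearchParseException("malformed settings, expected end of settings stream but found additional content");
}
```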
Closes#12382
HDRHistogram has been added as an option in the percentiles and percentile_ranks aggregations. It has one option, `number_significant_digits`, which controls the accuracy and memory footprint of the algorithm.
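For reference, a small sketch of the underlying HdrHistogram behaviour the option maps to: more significant digits buys accuracy at the cost of a larger histogram.

```java
import org.HdrHistogram.DoubleHistogram;

public class HdrSketch {
    public static void main(String[] args) {
        // 3 significant value digits gives roughly 0.1% worst-case value precision and uses
        // more memory than a histogram configured with 1 or 2 significant digits.
        DoubleHistogram histogram = new DoubleHistogram(3);
        histogram.recordValue(42.5);
        histogram.recordValue(1250.0);
        System.out.println(histogram.getValueAtPercentile(99.0));
    }
}
```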
Closes#8324
Following up on #12427, this PR makes the same changes, moving null-checks from constructors and setters in query builders to the validate() method.
PR against query-refactoring branch
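The resulting pattern looks roughly like the sketch below; the builder fields and the addValidationError helper are illustrative of the approach, not copied from the branch:

```java
// The setters no longer throw; problems are collected and reported by validate().
@Override
public QueryValidationException validate() {
    QueryValidationException validationException = null;
    if (fieldName == null) {
        validationException = addValidationError("field name cannot be null", validationException);
    }
    if (value == null) {
        validationException = addValidationError("value cannot be null", validationException);
    }
    return validationException;
}
```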
The ThrottlingAllocationDecider is responsible for limiting the number of incoming/local recoveries on a node. It therefore shouldn't count shards marked as relocating, which represent the source of the recovery.
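Conceptually the counting then works like the sketch below (the real decider iterates the RoutingNode; this fragment only illustrates which states count):

```java
// Count only shards being recovered onto this node; shards in RELOCATING state are the
// *source* of a recovery happening elsewhere and must not count against the limit.
int onGoingRecoveries = 0;
for (ShardRouting shard : routingNode) {
    if (shard.initializing()) {
        onGoingRecoveries++;
    }
    // shard.relocating() copies are deliberately not counted
}
```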
Closes#12409
If there are multiple levels of filtered leaf readers, the unwrap() method immediately returns the innermost leaf reader, thus skipping over any other filtered leaf reader that may be an instance of ElasticsearchLeafReader. By using the #getDelegate() method instead, we can check each filter reader layer for being an instance of ElasticsearchLeafReader, so that we never skip over a wrapped filtered leaf reader and lose the shard id.
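A sketch of that walk (the helper shape is illustrative):

```java
import org.apache.lucene.index.FilterLeafReader;
import org.apache.lucene.index.LeafReader;

// Walk the delegate chain one layer at a time instead of unwrapping straight to the innermost
// reader, so an ElasticsearchLeafReader somewhere in the middle is never skipped.
static ElasticsearchLeafReader findElasticsearchLeafReader(LeafReader reader) {
    while (reader != null) {
        if (reader instanceof ElasticsearchLeafReader) {
            return (ElasticsearchLeafReader) reader;
        }
        reader = reader instanceof FilterLeafReader ? ((FilterLeafReader) reader).getDelegate() : null;
    }
    return null;
}
```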
Also, a set of utility methods was introduced on ShardRouting to do various types of matching with other shard routings, giving control over what exactly should be matched (same shard id, same allocation id, all but version and shard info, etc.). This is useful here, but also prepares the ground for the change needed in #12387 (making ShardRouting.equals strict so it performs exact equality).
Closes#12397
When a replica is initializing from the primary and we find a better node that has a full sync id match, it is better to cancel the existing replica allocation and allocate it to the new node with the sync id match (eventually).
Today, when a user provides settings and specifies a classloader to be used, the classloader gets
dropped when we copy the settings to check for prompt entries. This change copies the classloader
when replacing the prompt placeholders and adds a test to ensure the InternalSettingsPreparer
always retains the classloader.
Closes#12340
JarHell has a low level check, but it's more of a best-effort one, only checking if X-Compile-Target-JDK is set in the manifest. This is the case for all lucene- and elasticsearch-generated jars, but let's just be explicit for plugins.
The TransportSingleCustomOperationAction `prefer_local` option has been removed as it isn't worth the effort.
The TransportSingleShardAction will execute the operation on the receiving node if no list of candidate shard routings to perform the operation on is provided.
Change #12371 broke fielddata on the `_parent` field for indices created before 2.0. This pull request adds back caching of the `_parent` fielddata for indices created before 2.0 and cleans up some related code. For instance
DocumentTypeListener doesn't need to take care of removed mappings anymore since
mappings can't be removed in 2.0.
The rangeQuery() method in MappedFieldType and some overriding subtypes takes a nullable QueryParseContext argument which is unused and can be deleted.
This is also useful for the current query parsing refactoring, since
we want to avoid passing the context object around unnecessarily.
IndicesStoreIntegrationTests.indexCleanup() tests if the shard files on disk are actually removed
after relocation. In particular it tests the following:
Whenever a node deletes a shard because it was relocated somewhere else, it first
checks if enough other copies are started somewhere else. The node sends a
ShardActiveRequest to the nodes that should have a copy.
The nodes that receive this request check if the shard is in state STARTED, in which case they respond with true. If they have the shard in POST_RECOVERY, they register a cluster state observer that checks at each update whether the shard has moved to STARTED, and they respond with true when this happens.
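In sketch form, the responding side's decision looks like this (simplified; the real handler also deals with timeouts and uses the cluster state observer mentioned above):

```java
// Simplified sketch of the decision a node makes when it receives a ShardActiveRequest.
boolean shardActive(IndexShard indexShard) {
    return indexShard != null && indexShard.state() == IndexShardState.STARTED;
}
// If the shard is still in POST_RECOVERY, the node registers a cluster state observer and
// answers true only once an applied cluster state shows the shard as STARTED.
```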
To test that the cluster state observer mechanism actually works, the latter can be triggered by blocking the cluster state processing when a recovery starts and only unblocking it shortly after the node receives the ShardActiveRequest.
This is more reliable than using random cluster state processing delays
because the random delays make it hard to reason about different timeouts
that can be reached.
closes#11989