This chance adds several random test infrastructure improvements that caused
issues in on-going developments but are generally useful. For instance is it impossible
to restart a node with a secure setting source since we close it after the node is started.
This change makes it cloneable such that we can reuse it for a restart.
Currently the `percentiles` aggregation allows specifying both possible methods
in the query DSL, but only the later one is used. This changes it to rejecting
such requests with an error. Setting the method multiple times via the java API
still works (and the last one wins).
Closes#26095
This simply removes the default identity hashcode and equals methods in InternalAggregation which where only temporarily put there while we implmeneted the methods in the subclasses.
The node setting `cluster.indices.tombstones.size` was not registered with the settings infrastructure, making it impossible for it to be set by a user.
Closes#26191
For CLI tools, we configure logging without reading the
log4j2.properties file. This because any log statements in a CLI tool
should dump to the console while reading from the log4j2.properties file
would cause them to dump whereever the log configuration there indicates
(e.g., possibly a remote machine). To do this, we added some code to the
base implementation of all CLI tools to configure logging without a
config file. This code is also executed when Elasticsearch starts up. In
the past this was fine yet we previously added detection to
Elasticsearch to find cases where we use logging before it is
configured. Because of configuring logging without a config, this means
we only catch uses of logging before the logging without config is
performed. To correct this, we enable a CLI tool to skip enabling
logging without a config and then in the Elasticsearch CLI we indeed
utilize this to skip configuring logging without a config.
Relates #26209
The environment variable CONF_DIR was previously inconsistently used in
our packaging to customize the location of Elasticsearch configuration
files. The importance of this environment variable has increased
starting in 6.0.0 as it's now used consistently to ensure Elasticsearch
and all secondary scripts (e.g., elasticsearch-keystore) all use the
same configuration. The name CONF_DIR is there for legacy reasons yet
it's too generic. This commit renames CONF_DIR to ES_PATH_CONF.
Relates #26197
The flood warning checks the wrong threshold, namely the high
watermark. This would impact any node for which the disk usage is above
the high watermark and below the flood stage watermark. This commit
fixes this so that it compares to the flood threshold.
Relates #26204
In bin/elasticsearch, we grep the command line looking for various flags
that indicate the process should be daemonized. To do this, we simply
test command status from the grep. Sadly, this is utterly broken
(unreleased) as instead we are testing the output of the command, not
the command status. This commit fixes this issue.
Relates #26196
This commit clarifies the minimum IDE versions that we support for
development. We need to formally state that the minimum that we support
for Eclipse is Eclipse Oxygen because Eclipse Neon and prior releases
have type inference bugs that lead to compilation issues that cause us
to have to contort our code to support Eclipse and it appears that
Eclipse Oxygen is less-prone to these issue. And the recent high-level
REST shading work seems to work best in Intellij 2017.2. Therefore, we
state these versions explicitly.
Relates #26194
* Rewrite range queries with open bounds to exists query
This change rewrites range query with open bounds to an exists query that should be faster to execute.
Fixes#22640
`epoch_millis` and `epoch_second` date formats truncate float values, as numbers or as strings.
The `coerce` parameter is not defined for `date` field type and this is not changing.
See PR #26119Closes#14641
This test was too lenient with its randomization of targetFieldName and
resulting in a conflict with the original existing fields. This commit
fixes that.
Closes#26177.
The following token filters were moved: arabic_stem, brazilian_stem, czech_stem, dutch_stem, french_stem, german_stem and russian_stem.
Relates to #23658
In reindex APIs, when using the `slices` parameter to choose the number of slices, adds the option to specify `slices` as "auto" which will choose a reasonable number of slices. It uses the number of shards in the source index, up to a ceiling. If there is more than one source index, it uses the smallest number of shards among them.
This gives users an easy way to use slicing in these APIs without having to make decisions about how to configure it, as it provides a good-enough configuration for them out of the box. This may become the default behavior for these APIs in the future.
By default we only serialize analyzers if the index analyzer is not the
`default` analyzer or if the `search_analyzer` is different from the index
`analyzer`. This raises issues with the `_all` field when the
`index.analysis.analyzer.default_search` is set, since it automatically makes
the `search_analyzer` different from the index `analyzer`. Then there are
exceptions since we expect the `_all` configuration to be empty on 6.0 indices.
Closes#26136
The percolator field mapper doesn't need to extract all terms and ranges from a bool query with must or filter clauses.
In order to help to default extraction behavior, boost fields can be configured, so that fields that are known for not being
selective enough can be ignored in favor for other fields or clauses with specific fields can forcefully take precedence over other clauses.
This can help selecting clauses for fields that don't match with a lot of percolator queries over other clauses and thus improving performance of the percolate query.
For example a status like field is something that should configured as an ignore field.
Queries on this field tend to match with more documents and so if clauses for this fields
get selected as best clause then that isn't very helpful for the candidate query that the
percolate query generates to filter out percolator queries that are likely not going to match.
With this commit we remove the following three previously unused
(and undocumented) Netty 4 related settings:
* transport.netty.max_cumulation_buffer_capacity,
* transport.netty.max_composite_buffer_components and
* http.netty.max_cumulation_buffer_capacity
from Elasticsearch.
Today we have a `null` invariant on all `ClusterState.Custom`. This makes
several code paths complicated and requires complex state handling in some cases.
This change allows to register a custom supplier that is used to initialize the
initial clusterstate with these transient customs.
The build was ignoring suffixes like "beta1" and "rc1" on the version numbers which was causing the backwards compatibility packaging tests to fail because they expected to be upgrading from 6.0.0 even though they were actually upgrading from 6.0.0-beta1. This adds the suffixes to the information that the build scrapes from Version.java. It then uses those suffixes when it resolves artifacts build from the bwc branch and for testing.
Closes#26017
When using the High Level Rest Client 6.0.0-beta1, we are missing some transitive dependencies for Lucene as Lucene 7 has not been released yet. See the following `pom.xml`:
```xml
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-client</artifactId>
<version>6.0.0-beta1</version>
</dependency>
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
<version>6.0.0-beta1</version>
</dependency>
```
It gives:
```
[ERROR] Failed to execute goal on project fscrawler: Could not resolve dependencies for project fr.pilato.elasticsearch.crawler:fscrawler:jar:2.4-SNAPSHOT: The following artifacts could not be resolved: org.apache.lucene:lucene-analyzers-common:jar:7.0.0-snapshot-00142c9, org.apache.lucene:lucene-backward-codecs:jar:7.0.0-snapshot-00142c9, org.apache.lucene:lucene-grouping:jar:7.0.0-snapshot-00142c9, org.apache.lucene:lucene-highlighter:jar:7.0.0-snapshot-00142c9, org.apache.lucene:lucene-join:jar:7.0.0-snapshot-00142c9, org.apache.lucene:lucene-memory:jar:7.0.0-snapshot-00142c9, org.apache.lucene:lucene-misc:jar:7.0.0-snapshot-00142c9, org.apache.lucene:lucene-queries:jar:7.0.0-snapshot-00142c9, org.apache.lucene:lucene-queryparser:jar:7.0.0-snapshot-00142c9, org.apache.lucene:lucene-sandbox:jar:7.0.0-snapshot-00142c9, org.apache.lucene:lucene-spatial:jar:7.0.0-snapshot-00142c9, org.apache.lucene:lucene-spatial-extras:jar:7.0.0-snapshot-00142c9, org.apache.lucene:lucene-spatial3d:jar:7.0.0-snapshot-00142c9, org.apache.lucene:lucene-suggest:jar:7.0.0-snapshot-00142c9: Failure to find org.apache.lucene:lucene-analyzers-common:jar:7.0.0-snapshot-00142c9 in https://artifacts.elastic.co/maven/ was cached in the local repository, resolution will not be reattempted until the update interval of elastic-download-service has elapsed or updates are forced -
```
We need to add some temporary documentation on how to add the missing repository to a gradle or maven project:
```xml
<repository>
<id>elastic-lucene-snapshots</id>
<name>Elastic Lucene Snapshots</name>
<url>http://s3.amazonaws.com/download.elasticsearch.org/lucenesnapshots/00142c9</url>
<releases><enabled>true</enabled></releases>
<snapshots><enabled>false</enabled></snapshots>
</repository>
```
This also applies to the transport client.
Closes#26106.
This is a safer default since sorting by sub aggregations prevents these
aggregations from being deferred. `global_ordinals_hash` will at least
make sure that we do not use memory for buckets that are not collected.
Closes#24359
Two tests were still using the static indices:
* IndexFolderUpgraderTests#testUpgradeRealIndex()
* InternalEngineTests#testUpgradeOldIndex()
I removed these tests too, because these tests functionally overlap
with the full-cluster-restart qa tests.
Relates to #24939
This occasionally fails now because if `top` is `-Infinity` (which we sometimes
test for in randomization), the value might not get changed for the
equals/hashCode tests.
Closes#26107
* Adds ToXContentFragment
This interface is meant for objects that implement `ToXContent` but are not complete objects. It is basically the opposite of `ToXContentObject`. It means that it will be easier to track the migration of classes over to the fragment/not fragment ToXContent model as it will be clear which classes are not migrated. When no classes directly implement `ToXContent` we can make `ToXContent` package private to be sure that all new classes must implement `ToXContentObject` or `ToXContentFragment`.
* review comments
* more review comments
* javadocs
* iter
* Adds tests
* iter
* adds toString test for aggs
* improves tests following review comments
* iter
* iter
* validate half float values
* test upper bound for numeric mapper
* test for upper bound for float, double and half_float
* more tests on NaN and Infinity for NumberFieldMapper
* fix checkstyle errors
* minor renaming
* comments for disabled test
* tests for byte/short/integer/long removed and will be added in separate PR
* remove unused import
* Fix scaledfloat out of range validation message
* 1) delayed autoboxing in numbertype.parse(...)
2) no redudant checks in half_float validation
3) tests with negative values for half_float/float/double
* Add support for auto_generate_synonyms_phrase_query in match_query, multi_match_query, query_string and simple_query_string
This change adds a new parameter called auto_generate_synonyms_phrase_query (defaults to true).
This option can be used in conjunction with synonym_graph token filter to generate phrase queries
when multi terms synonyms are encountered.
For example, a synonym like "ny, new york" would produce the following boolean query when "ny city" is parsed:
((ny OR "new york") AND city)
Note how the multi terms synonym "new york" produces a phrase query.
We should have the same behavior for Azure repositories as we have for S3 (see #22762).
Instead of:
```yml
cloud:
azure:
storage:
my_account1:
account: your_azure_storage_account1
key: your_azure_storage_key1
default: true
my_account2:
account: your_azure_storage_account2
key: your_azure_storage_key2
```
Support something like:
```
azure.client:
default:
account: your_azure_storage_account1
key: your_azure_storage_key1
my_account2:
account: your_azure_storage_account2
key: your_azure_storage_key2
```
Then instead of:
```
PUT _snapshot/my_backup3
{
"type": "azure",
"settings": {
"account": "my_account2"
}
}
```
Use:
```
PUT _snapshot/my_backup3
{
"type": "azure",
"settings": {
"config": "my_account2"
}
}
```
If someone uses:
```
PUT _snapshot/my_backup3
{
"type": "azure"
}
```
It will use the `default` azure repository settings.
And mark as deprecated old settings.
Closes#22763.