SimpleQueryStringParser applies whatever boost the query holds, even if the default 1, to the query obtained from parsing of the query string. that might contain its boost, for instance if it resolved to a simple query like term (single term query against a single field). We should rather multiply the existing boost with the boost set to the query, same as we do in query_string
Relates to #13272Closes#13331
doParse() was supposed to allow aggs to perform extra parsing. Unfortunately, this forced the
parser to carry instance-level state, which would carry-over and "corrupt" any other aggs of the
same type in the same query.
Instead, we are now collecting all unknown params and pasing them as a Map<String, Object>
to buildFactory(). The agg may then parse them and instantiate a factory. Each param the
agg uses, it should unset from the unusedParams object.
After building the factory, the parser verifies that unusedParams is empty. If it is not empty,
an exception is raised so the user knows they provided unknown params.
Fixes#13337
This changes construction of Phrase and Boolean queries to use the builder,
and replaces BitDocIdSetFilter with BitSetProducer for nested and parent/child
queries. I had to remove the ParentIdsFilter for the case when there was a
single parent as it was using the source of BitSets for parents as a regular
Filter, which is not possible anymore now. I don't think this is an issue since
this case rarely occurs, and the alternative logic for when there are several
matching parent ids should not be much worse.
This pipeline will calculate percentiles over a set of sibling buckets. This is an exact
implementation, meaning it needs to cache a copy of the series in memory and sort it to determine
the percentiles.
This comes with a few limitations: to prevent serializing data around, only the requested percentiles
are calculated (unlike the TDigest version, which allows the java API to ask for any percentile).
It also needs to store the data in-memory, resulting in some overhead if the requested series is
very large.
This commit moves ignore_malformed and coerce options from the GeoPointFieldType to the Builder in GeoPointFieldMapper. This makes these options consistent with other types in 2.0.
This was supposed to just help the user, in case they misconfigured something.
Broadcast is an ipv4 only thing, the only way you can really detect its a broadcast
address, is to look and see if an interface has that address as its broadcast address.
But we cannot trust that container interfaces won't have a crazy setup...
Closes#13327
The match_phrase_prefix query properly parses the boost etc. but it loses it in its rewrite method. Fixed that by setting the orginal boost to the rewritten query before returning it. Also cleaned up some warning in MultiPhrasePrefixQuery.
Closes#13129Closes#13142
Target-type inference has been improved in Java 8. This leads to these
lines now being interpreted as invoking String#valueOf(char[]) whereas
they previously were interpreted as invoking String#valueOf(Object).
This change leads to ClassCastExceptions during test execution. Simply
casting the parameter to Object restores the old invocation.
Closes#13315
We have some optimization in FilteredQueryParser that tries to mimic what the rewrite method in lucene does, based on what gets parsed we return the simplest query possible. That might cause issues with boost values though, if specified in both the main query and the inner query that we shortcut to. We should rather rely on lucene's rewrite method to simplify the lucene representation of the query, and always build a filtered query instead.
relates to #13272Closes#13312
We currently optimize scroll when sort=_doc because docs are returned in order.
But documents are also returned in order when sorting by score and the query
gives constant scores. This optimization has the nice side-effect of also
optimizing scrolls with the default `match_all` query.
Until now we had a cloud-aws plugin which is providing 2 disctinct features:
* discovery on EC2
* snapshot/restore on S3
This commit splits the plugin by feature so people can use either one or the other or both features.
Doc is updated accordingly.
Before this change the check would check that all test classes end in Tests but the message would say they need to end in Test or Tests which was confusing.
Today we always collect in order to compute counts, but some of them can be
easily optimized by using pre-computed index statistics. This is especially
true in the case that there are no deletions, which should be common for the
time-based data use-case.
Counts on match_all queries can always be optimized, so requests like
```
GET index/_search?size=0
GET index/_search
{
"size": 0,
"query" : {
"match_all": {}
}
}
```
should now return almost instantly. Additionally, when there are no deletions,
term queries are also optimized, so the below queries which all boil down to a
single term query would also return almost immediately:
```
GET index/type/_search?size=0
GET index/_search
{
"size": 0,
"query" : {
"match": {
"foo": "bar"
}
}
}
GET index/_search
{
"size": 0,
"query" : {
"constant_score": {
"filter": {
"exists": {
"field": "foo"
}
}
}
}
}
```
Users might specify something like -Des.network.host=0.0.0.0, as that
was the old default with previous versions of elasticsearch. This means
to bind to all interfaces, but it makes no sense as a publish address.
Pick a good one in this case, just like we do in other cases where
publish isn't explicitly specified and we are bound to multiple (e.g.
when configured by interface, or dns hostname with multiple addresses).
However, in this case warn the user about it: since its arbitrarily
picking the first non-loopback address like the old versions
did, thats a little too heuristical, but lets make the cutover easy.
Separately, fail hard if things like multicast or broadcast addresses are
configured as bind or publish addresses, as that is simply invalid.
Closes#13274
The number and distribution of errors in some restore test may cause restore process to continue to fail for a prolong time. This test caps the total number of simulated failures to make sure that restore is guaranteed to eventually succeed after a limited number of retries.
The `-Xloggc:filename.log` parameter has very strict filename semantics:
```
[A-Z][a-z][0-9]-_.%[p|t]
```
Our script specifies \" and \" to surround it, which makes Java think we
are sending: -Xloggc:"foo.log" and it fails with:
```
Invalid file name for use with -Xloggc: Filename can only contain the characters [A-Z][a-z][0-9]-_.%[p|t] but it has been "foo.log"
Note %p or %t can only be used once
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
```
We can't quote this, and we should not need to since the valid
characters don't include a space character, so we don't need to worry
about quoting.
Resolves#13277
We currently have a small number of test classes with the suffix "Test",
yet most use the suffix "Tests". This change renames all the "Test"
classes, so that we have a simple rule: "Non-inner classes ending with
Tests".