This commit removes and now forbids all uses of
com.google.common.collect.Maps across the codebase. This is one of many
steps in the eventual removal of Guava as a dependency.
Relates #13224
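As a rough illustration of the kind of rewrite this implies, the sketch below uses plain JDK collections where a Guava `Maps` factory method might have been used before. The class and method names are hypothetical and are not taken from the codebase.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example; not code from this commit.
final class MapsMigrationSketch {
    // Before (Guava): Map<String, Integer> counts = Maps.newHashMap();
    // After (JDK only): instantiate the concrete type with the diamond operator.
    static Map<String, Integer> countWords(String... words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }
}
```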
This commit addresses several bugs that prevented the Windows
service from being started or stopped:
- Extra white space in the concatenation of java options in
elasticsearch.in.bat which tripped up Apache Commons Daemon
and caused ES to start up without any params, eventually leading
to the "path.home is not configured" exception.
- service.bat was not passing the start argument to ES
- The service could not be stopped gracefully via the stop command
because there wasn't a method for procrun to call.
Closes #13247
Closes #13401
Today, allocation filtering by IP only works with the node host address. In some cases you might want to filter on the publish address instead, which can be different.
Previously the parser could take any Term Vectors request, but this was
not the case for the builder, which still used MultiGetRequest.Item. This
introduces a new Item class which is used by both the builder and parser.
Beyond that the rest is mostly cleanups such as:
1) Deprecating the ignoreLike methods, in favor of using unlike.
2) Deprecating and renaming MoreLikeThisBuilder#addItem to addLikeItem.
3) Ordering the methods of MoreLikeThisBuilder more logically.
This change is needed for the upcoming query refactoring of MLT.
Closes #13372
1) A shared immutable fieldtype for the _parent field (used for direct access to that field in the dsl). This field type is stored and indexed.
2) A per-type field type for the child join field. The field type has doc values enabled if the index was created on or after 2.0, and the field data type is allowed to be changed.
3) A per-type field type for the parent join field. The field type has doc values enabled if the index was created on or after 2.0.
This resolves the issue that a mapping is not compatible if parent and child types have different field data loading settings.
Closes #13169
When we commit the translog, documents that were in it before cannot be retrieved from
it anymore via get and have to be retrieved from the index instead. But they are only
visible there if a refresh was called between the index and the get. Therefore we have
to call refresh first and then translog.commit(), because otherwise there is a small gap
in which we can read neither from the translog nor from the index.
Closes #13379
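The ordering argument can be sketched with a toy model. This is only an illustration under simplifying assumptions: the class, its three maps, and the `flush` method are hypothetical stand-ins and do not mirror the real engine or translog classes.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the ordering problem; none of these names come from Elasticsearch.
final class RefreshBeforeCommitSketch {
    private final Map<String, String> translog = new HashMap<>(); // readable via get until committed
    private final Map<String, String> indexed = new HashMap<>();  // written to the index, not yet searchable
    private final Map<String, String> visible = new HashMap<>();  // visible in the index after a refresh

    void index(String id, String doc) {
        translog.put(id, doc);
        indexed.put(id, doc);
    }

    void refresh() {
        visible.putAll(indexed);
    }

    void commitTranslog() {
        translog.clear();
    }

    // A get consults the translog first and falls back to the index.
    String get(String id) {
        String doc = translog.get(id);
        return doc != null ? doc : visible.get(id);
    }

    // Refresh first, then commit, so a document is never missing from both places.
    void flush() {
        refresh();
        commitTranslog();
    }
}
```

Calling `commitTranslog()` without a preceding `refresh()` in this model makes `get` return null for a document that was just indexed, which is the gap described above.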
We have a handful of compiler warnings, mostly because of passing an
array to varargs methods. This change fixes these warnings and adds
-Werror so we don't get any more of these warnings.
Note this does *not* enable deprecation or unchecked type warnings, so
these remain "hidden". We should work towards removing those as well,
but this is a first step.
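For readers unfamiliar with this class of warning, here is a small, hypothetical example of passing an array to a varargs method and one way to make the intent explicit. It is not taken from the codebase.

```java
// Hypothetical example of the array-to-varargs warning and a possible fix.
final class VarargsWarningSketch {
    static String join(String[] parts) {
        // Passing a String[] where Object... is expected compiles with a warning
        // ("non-varargs call of varargs method with inexact argument type"):
        //     String.format("%s-%s", parts);
        // An explicit cast states that the array holds the individual arguments.
        return String.format("%s-%s", (Object[]) parts);
    }
}
```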
This commit removes and now forbids all uses of
com.google.common.base.Throwables across the codebase.
For uses of com.google.common.base.Throwables#getStackTraceAsString,
use org.elasticsearch.ExceptionsHelper#stackTrace.
Relates #13224
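A JDK-only helper of this kind could look roughly like the sketch below. It is an assumption about the general shape of such a method; the actual implementation of ExceptionsHelper#stackTrace may differ.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical stand-in; not the real ExceptionsHelper.
final class StackTraceSketch {
    // Renders the throwable's stack trace, including causes, into a String.
    static String stackTrace(Throwable t) {
        StringWriter writer = new StringWriter();
        PrintWriter printer = new PrintWriter(writer);
        t.printStackTrace(printer);
        printer.flush();
        return writer.toString();
    }
}
```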
When we call optimize we ignore Exceptions that indicate a closed shard.
However, when a shard is closed while an optimize request is in flight it
might also trigger an AlreadyClosedException from the IndexWriter when we
get the config or ForceMergeFailedEngineException with the EngineClosedException
wrapped inside. Because these are not identified as exceptions that indicate
a closed shard (TransportActions.isShardNotAvailableException(..)) optimize
would sometimes report failures when shards were relocating while optimize was called
and sometimes not. This caused weird test failures, see #13266.
Instead, we should let EngineClosedException bubble up and also recognize
AlreadyClosedException as an indicator for a closed shard.
Today we try to allocate primaries first and then replicas
but don't take the index creation date and priority into account
as we do in the GatewayAllocator.
Closes #13249
Adds a listener to each of the caches, which allows us to remove the dependency on IndexService. That dependency is
cyclic, since the IndexService depends on both of these caches. This cyclic dependency makes
testing the individual parts very hard and was only added for the sake of
incrementing some stats.
Transport clients run embedded within external applications, so
elasticsearch should not be doing anything with the filesystem, as there
is no elasticsearch home.
This change makes a number of cleanups to the internal API for loading
settings and creating an environment. The loadFromConfig option was
removed, since it was always true except for tests. We now always
attempt to load settings from a config file when an environment is
created. The prepare methods were also simplified so there is now
prepareSettingsAndEnvironment which nodes use, and prepareSettings which
the transport client uses. I also attempted to improve the tests, but
there is still a lot of follow-up work to do there.
Closes #13155
This commit removes and now forbids all uses of
com.google.common.base.Strings across the codebase.
For uses of com.google.common.base.Strings.isNullOrEmpty, use
org.elasticsearch.common.Strings.isNullOrEmpty.
For uses of com.google.common.base.Strings.padStart use
org.elasticsearch.common.Strings.padStart.
For uses of com.google.common.base.Strings.nullToEmpty use
org.elasticsearch.common.Strings.coalesceToEmpty.
Relates #13224
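To make the mapping concrete, the following sketch shows what the three helpers are expected to do, assuming they keep the semantics of their Guava counterparts. It is a standalone illustration, not the code in org.elasticsearch.common.Strings.

```java
// Hypothetical stand-in mirroring the Guava semantics; the real helpers may differ.
final class StringsSketch {
    static boolean isNullOrEmpty(String s) {
        return s == null || s.isEmpty();
    }

    // Counterpart of Guava's nullToEmpty.
    static String coalesceToEmpty(String s) {
        return s == null ? "" : s;
    }

    // Prepends padChar until the result is at least minimumLength characters long.
    static String padStart(String s, int minimumLength, char padChar) {
        if (s == null) {
            throw new NullPointerException("s");
        }
        if (s.length() >= minimumLength) {
            return s;
        }
        StringBuilder sb = new StringBuilder(minimumLength);
        for (int i = s.length(); i < minimumLength; i++) {
            sb.append(padChar);
        }
        return sb.append(s).toString();
    }
}
```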
Before #13068 refresh and flush ignored all exceptions that matched
TransportActions.isShardNotAvailableException(e) and this should not change.
In addition, refresh and flush which are based on broadcast replication
might now get UnavailableShardsException from TransportReplicationAction if a shard
is unavailable and this is not caught by TransportActions.isShardNotAvailableException(e).
This must be ignored as well.
This commit removes and now forbids all uses of
com.google.common.base.Predicate and com.google.common.base.Predicates
across the codebase. This is one of the many steps in the eventual
removal of Guava as a dependency. This was enabled by #13314.
Relates #13224
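A migration of this kind might look like the sketch below, assuming the replacement is java.util.function.Predicate from the JDK (the message above does not say so explicitly). The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical example; assumes java.util.function.Predicate is the replacement.
final class PredicateMigrationSketch {
    // Before (Guava): Predicates.and(Predicates.notNull(), p), evaluated with p.apply(value)
    // After (JDK):    notNull.and(p), evaluated with p.test(value)
    static List<String> nonEmpty(List<String> values) {
        Predicate<String> notNull = s -> s != null;
        Predicate<String> notEmpty = notNull.and(s -> !s.isEmpty());
        List<String> result = new ArrayList<>();
        for (String value : values) {
            if (notEmpty.test(value)) {
                result.add(value);
            }
        }
        return result;
    }
}
```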
This commit removes and now forbids all uses of
com.google.common.base.Objects across the codebase. This is a small
step in the eventual removal of Guava as a dependency.
Relates #13224
Previously we skipped deleting the index store for indices on a shared
filesystem, because we don't want to delete the data when the shard is
relocating around the cluster. This adds a flag to the
`deleteIndexStore` method signifying that the index is closed and that
we should allow deleting the contents even if it is on a shared
filesystem.
Includes a unit test for IndicesService.canDeleteIndexContents and
integration tests to ensure that a closed shadow replica index deletes files
correctly.
Resolves #13297
As a refinement to Project Coin (JEP-213, JDK-8042880), Java 9 is going
to disallow the use of ‘_’ as a one-character identifier. This will be
done by adding ‘_’ as a keyword to the Java language (JDK-8065599).
Currently, uses of ‘_’ as a one-character identifier are warnings in
the Java 8 compiler. This commit removes all uses of ‘_’ as a
one-character identifier from the codebase.
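A typical rewrite is a plain rename, as in this hypothetical example (not taken from the codebase), where an unused catch parameter previously named ‘_’ gets a descriptive name instead.

```java
// Hypothetical example of renaming a '_' identifier.
final class UnderscoreIdentifierSketch {
    static int parseOrDefault(String value, int fallback) {
        try {
            return Integer.parseInt(value);
        } catch (NumberFormatException ignored) { // previously: catch (NumberFormatException _)
            return fallback;
        }
    }
}
```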
SpanContainingQueryParser and SpanWithinQueryParser always set the boost on the parsed lucene query, even if it is the default one. However, the default boost of the main query is the boost coming from the inner little query, a value that we end up overriding every time. We should instead set the boost on the main query only if it differs from the default, to mimic lucene's behaviour.
Relates to #13272
Closes #13339
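The intended rule can be sketched as follows. `Query` here is a minimal hypothetical stand-in for a Lucene query that carries a boost, and `applyBoost` is not a method from the parsers.

```java
// Hypothetical sketch; Query is a stand-in, not the Lucene class.
final class ConditionalBoostSketch {
    static final float DEFAULT_BOOST = 1.0f;

    static final class Query {
        float boost;

        Query(float boost) {
            this.boost = boost;
        }
    }

    // Keep the boost inherited from the inner query unless a non-default boost was requested.
    static Query applyBoost(Query parsed, float requestedBoost) {
        if (requestedBoost != DEFAULT_BOOST) {
            parsed.boost = requestedBoost;
        }
        return parsed;
    }
}
```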
SimpleQueryStringParser applies whatever boost the query holds, even the default of 1, to the query obtained from parsing the query string. That query might already carry its own boost, for instance if it resolved to a simple query like term (a single term query against a single field). We should instead multiply the existing boost by the boost set on the query, same as we do in query_string.
Relates to #13272
Closes #13331
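In code, the difference between overwriting and combining the boosts is a one-line change. The sketch below uses a hypothetical `Query` stand-in rather than the real Lucene class.

```java
// Hypothetical sketch; Query is a stand-in, not the Lucene class.
final class MultiplyBoostSketch {
    static final class Query {
        float boost;

        Query(float boost) {
            this.boost = boost;
        }
    }

    // Combine the boost the parsed query already carries with the requested one,
    // instead of overwriting it. A requested boost of 1 leaves the query unchanged.
    static Query applyBoost(Query parsed, float requestedBoost) {
        parsed.boost = parsed.boost * requestedBoost;
        return parsed;
    }
}
```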
doParse() was supposed to allow aggs to perform extra parsing. Unfortunately, this forced the
parser to carry instance-level state, which would carry-over and "corrupt" any other aggs of the
same type in the same query.
Instead, we are now collecting all unknown params and passing them as a Map<String, Object>
to buildFactory(). The agg may then parse them and instantiate a factory. The agg should
remove each param it uses from the unusedParams object.
After building the factory, the parser verifies that unusedParams is empty. If it is not empty,
an exception is raised so the user knows they provided unknown params.
Fixes #13337
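The flow described above can be sketched as follows. This is a simplified illustration: the `gap_policy` parameter, the String "factory", and the method signatures are hypothetical and do not match the real parser API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the unused-params check; not the real parser classes.
final class UnusedParamsSketch {
    // Stand-in for an agg-specific buildFactory(): consume known params and remove them.
    static String buildFactory(Map<String, Object> unusedParams) {
        Object gapPolicy = unusedParams.remove("gap_policy"); // hypothetical parameter
        return "factory(gap_policy=" + gapPolicy + ")";
    }

    // Stand-in for the shared parser: no instance-level state, everything is local.
    static String parse(Map<String, Object> unknownParams) {
        Map<String, Object> unusedParams = new HashMap<>(unknownParams);
        String factory = buildFactory(unusedParams);
        if (!unusedParams.isEmpty()) {
            throw new IllegalArgumentException("Unknown parameters: " + unusedParams.keySet());
        }
        return factory;
    }
}
```

Because the map is created per call, nothing leaks between two aggs of the same type in the same request.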
This changes construction of Phrase and Boolean queries to use the builder,
and replaces BitDocIdSetFilter with BitSetProducer for nested and parent/child
queries. I had to remove the ParentIdsFilter for the case when there was a
single parent as it was using the source of BitSets for parents as a regular
Filter, which is not possible anymore now. I don't think this is an issue since
this case rarely occurs, and the alternative logic for when there are several
matching parent ids should not be much worse.
This pipeline will calculate percentiles over a set of sibling buckets. This is an exact
implementation, meaning it needs to cache a copy of the series in memory and sort it to determine
the percentiles.
This comes with a few limitations: to prevent serializing data around, only the requested percentiles
are calculated (unlike the TDigest version, which allows the java API to ask for any percentile).
It also needs to store the data in-memory, resulting in some overhead if the requested series is
very large.
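A minimal sketch of an exact computation follows. It assumes the nearest-rank definition of a percentile, which may not be the definition the pipeline actually uses, and all names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch; assumes nearest-rank percentiles.
final class ExactPercentilesSketch {
    // Copies and sorts the series, then reads each requested percentile off the sorted copy.
    static double[] percentiles(double[] series, double[] requested) {
        if (series.length == 0) {
            throw new IllegalArgumentException("series must not be empty");
        }
        double[] sorted = series.clone();
        Arrays.sort(sorted);
        double[] result = new double[requested.length];
        for (int i = 0; i < requested.length; i++) {
            // Nearest rank: ceil(p / 100 * n), clamped to [1, n], then made zero-based.
            int rank = (int) Math.ceil(requested[i] / 100.0 * sorted.length);
            int index = Math.max(1, Math.min(sorted.length, rank)) - 1;
            result[i] = sorted[index];
        }
        return result;
    }
}
```

The in-memory copy and the sort are where the overhead mentioned above comes from.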
This commit moves ignore_malformed and coerce options from the GeoPointFieldType to the Builder in GeoPointFieldMapper. This makes these options consistent with other types in 2.0.
This was supposed to just help the user, in case they misconfigured something.
Broadcast is an IPv4-only thing, and the only way you can really detect that an address is a
broadcast address is to check whether an interface has it as its broadcast address.
But we cannot trust that container interfaces won't have a crazy setup...
Closes #13327
The match_phrase_prefix query properly parses the boost etc. but loses it in its rewrite method. Fixed that by setting the original boost on the rewritten query before returning it. Also cleaned up some warnings in MultiPhrasePrefixQuery.
Closes #13129
Closes #13142
Target-type inference has been improved in Java 8. This leads to these
lines now being interpreted as invoking String#valueOf(char[]) whereas
they previously were interpreted as invoking String#valueOf(Object).
This change leads to ClassCastExceptions during test execution. Simply
casting the parameter to Object restores the old invocation.
Closes #13315
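The pattern can be reproduced with a generic method whose return type is inferred from the call site. The sketch below is a hypothetical reduction, not the affected code. The problematic call is left as a comment, and only the fixed form is executed.

```java
// Hypothetical reduction of the inference problem described above.
final class ValueOfInferenceSketch {
    // Return type is inferred from the call site.
    @SuppressWarnings("unchecked")
    static <T> T field(Object value) {
        return (T) value;
    }

    static String describe(Object value) {
        // String.valueOf(field(value)) would select String#valueOf(char[]) under the
        // improved inference described above and fail at runtime for non-char[] values.
        // Casting to Object restores the String#valueOf(Object) invocation.
        return String.valueOf((Object) field(value));
    }
}
```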
We have some optimization in FilteredQueryParser that tries to mimic what the rewrite method in lucene does: based on what gets parsed, we return the simplest query possible. That might cause issues with boost values though, if they are specified in both the main query and the inner query that we shortcut to. We should rather rely on lucene's rewrite method to simplify the lucene representation of the query, and always build a filtered query instead.
Relates to #13272
Closes #13312
We currently optimize scroll when sort=_doc because docs are returned in order.
But documents are also returned in order when sorting by score and the query
gives constant scores. This optimization has the nice side-effect of also
optimizing scrolls with the default `match_all` query.
Until now we had a cloud-aws plugin which provided 2 distinct features:
* discovery on EC2
* snapshot/restore on S3
This commit splits the plugin by feature so people can use either one or the other or both features.
Doc is updated accordingly.
Before this change, the check verified that all test class names end in Tests, but the message said they need to end in Test or Tests, which was confusing.
Today we always collect in order to compute counts, but some of them can be
easily optimized by using pre-computed index statistics. This is especially
true in the case that there are no deletions, which should be common for the
time-based data use-case.
Counts on match_all queries can always be optimized, so requests like
```
GET index/_search?size=0

GET index/_search
{
  "size": 0,
  "query" : {
    "match_all": {}
  }
}
```
should now return almost instantly. Additionally, when there are no deletions,
term queries are also optimized, so the below queries which all boil down to a
single term query would also return almost immediately:
```
GET index/type/_search?size=0

GET index/_search
{
  "size": 0,
  "query" : {
    "match": {
      "foo": "bar"
    }
  }
}

GET index/_search
{
  "size": 0,
  "query" : {
    "constant_score": {
      "filter": {
        "exists": {
          "field": "foo"
        }
      }
    }
  }
}
```
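The idea behind the shortcut can be sketched with a toy statistics holder. The names below are hypothetical and only loosely modelled on Lucene's index statistics. The sketch assumes that the live document count reflects deletions while per-term document frequencies do not, which would explain why the term shortcut is limited to the no-deletions case.

```java
import java.util.Map;
import java.util.OptionalInt;

// Hypothetical sketch; IndexStats is a toy stand-in for pre-computed index statistics.
final class CountShortcutSketch {
    static final class IndexStats {
        final int maxDoc;   // all documents, including deleted ones
        final int numDocs;  // live documents only
        final Map<String, Integer> docFreq; // per-term frequency, ignores deletions

        IndexStats(int maxDoc, int numDocs, Map<String, Integer> docFreq) {
            this.maxDoc = maxDoc;
            this.numDocs = numDocs;
            this.docFreq = docFreq;
        }

        boolean hasDeletions() {
            return numDocs < maxDoc;
        }
    }

    // match_all can always be answered from the live document count.
    static int countMatchAll(IndexStats stats) {
        return stats.numDocs;
    }

    // A term count is only exact when nothing was deleted; otherwise signal a fallback to collecting.
    static OptionalInt countTerm(IndexStats stats, String term) {
        if (stats.hasDeletions()) {
            return OptionalInt.empty();
        }
        return OptionalInt.of(stats.docFreq.getOrDefault(term, 0));
    }
}
```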
Users might specify something like -Des.network.host=0.0.0.0, as that
was the old default with previous versions of elasticsearch. This means
to bind to all interfaces, but it makes no sense as a publish address.
Pick a good one in this case, just like we do in other cases where
publish isn't explicitly specified and we are bound to multiple (e.g.
when configured by interface, or dns hostname with multiple addresses).
However, in this case warn the user about it: since it's arbitrarily
picking the first non-loopback address like the old versions
did, that's a little too heuristic, but let's make the cutover easy.
Separately, fail hard if things like multicast or broadcast addresses are
configured as bind or publish addresses, as that is simply invalid.
Closes #13274
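The two checks can be approximated with standard java.net.InetAddress methods, as in the sketch below. The class and method names are hypothetical, the broadcast case is left out, and the actual selection of a fallback publish address is not shown.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical sketch of the address checks; not the real network service code.
final class PublishAddressSketch {
    // Parses an IP literal, wrapping the checked exception for convenience.
    static InetAddress parse(String ipLiteral) {
        try {
            return InetAddress.getByName(ipLiteral);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("invalid address: " + ipLiteral, e);
        }
    }

    // Fail hard: a multicast address is never valid as a bind or publish address.
    static void rejectMulticast(InetAddress address) {
        if (address.isMulticastAddress()) {
            throw new IllegalArgumentException("multicast address is invalid here: " + address);
        }
    }

    // A wildcard address such as 0.0.0.0 can be bound to, but another address must be published.
    static boolean needsPublishFallback(InetAddress bindAddress) {
        return bindAddress.isAnyLocalAddress();
    }
}
```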
The number and distribution of errors in some restore tests may cause the restore process to keep failing for a prolonged time. The test now caps the total number of simulated failures so that the restore is guaranteed to eventually succeed after a limited number of retries.
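One way to cap simulated failures is a shared failure budget, sketched below. The class is hypothetical and does not reflect how the test's mock repository is actually implemented.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a failure budget for fault-injecting tests.
final class CappedFailureSketch {
    private final AtomicLong remainingFailures;

    CappedFailureSketch(long maxFailures) {
        this.remainingFailures = new AtomicLong(maxFailures);
    }

    // Returns true only if a failure was requested and the budget is not yet used up.
    boolean shouldFail(boolean failureRequested) {
        if (!failureRequested) {
            return false;
        }
        while (true) {
            long remaining = remainingFailures.get();
            if (remaining <= 0) {
                return false; // budget exhausted: let the operation succeed
            }
            if (remainingFailures.compareAndSet(remaining, remaining - 1)) {
                return true;
            }
        }
    }
}
```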
We currently have a small number of test classes with the suffix "Test",
yet most use the suffix "Tests". This change renames all the "Test"
classes, so that we have a simple rule: "Non-inner classes ending with
Tests".
These are not actually tests, but command line applications that must be
run manually. This change removes the entire stresstest package. We can
add back individual tests that we find necessary, and make them real
tests (whether integ or not).