Commit Graph

5086 Commits

Author SHA1 Message Date
Alan Woodward 71b8494181
Upgrade to lucene 8.0.0-snapshot-ff9509a8df (#39444)
Backport of #39350

Contains the following:

* LUCENE-8635: Move terms dictionary off-heap for non-primary-key fields in `MMapDirectory`
* LUCENE-8292: `TermsEnum` is fully abstract
* LUCENE-8679: Return WITHIN in `EdgeTree#relateTriangle` only when polygon and triangle share one edge
* LUCENE-8676: Nori tokenizer deals correctly with large buffers
* LUCENE-8697: `GraphTokenStreamFiniteStrings` better handles side paths with gaps
* LUCENE-8664: Add `equals` and `hashCode` to `TotalHits`
* LUCENE-8660: `TopDocsCollector` returns accurate hit counts if the total equals the threshold
* LUCENE-8654: `Polygon2D#relateTriangle` fix for when the polygon is inside the triangle
* LUCENE-8645: `Intervals#fixField` can merge intervals from different fields
* LUCENE-8585: Create jump-tables for DocValues at index time
2019-02-27 14:36:08 +00:00
Marios Trivyzas 11fe8cd16f
[Tests] Fix flakiness by ensuring stable cluster (#39300) (#39356)
In integration tests where `setBootstrapMasterNodeIndex()` is used in
combination with `autoMinMasterNodes = false` the cluster can start
bootstrapping once the number of nodes set with the
`setBootstrapMasterNodeIndex` have been started but it's not ensured
that all nodes have successfully joined to form the cluster.

This behaviour was introduced with 5db7ed22a0
and in order to ensure that the cluster is properly formed before proceeding
with the integration test, use `ensureStableCluster()` with the
appropriate number of expected nodes.

Fixes: #39220
2019-02-25 17:26:15 +01:00
Mayya Sharipova e80284231d
Backport distance functions vectors (#39330)
Distance functions for dense and sparse vectors

Backport for #37947, #39313
2019-02-23 11:52:43 -05:00
Zachary Tong 8af0e7c4b6
Only create final MatrixStatsResults on final reduction (#39205)
MatrixStatsResults is the "final" result object, and runs an additional
computation in it's ctor to calculate covariance, etc.  This means
it should only run on the final reduction instead of on every reduce.
2019-02-21 14:18:45 -05:00
Christoph Büscher 4b77d0434a Remove `nGram` and `edgeNGram` token filter names (#39070)
In #30209 we deprecated the camel case `nGram` filter name in favour of `ngram` and
did the same for `edgeNGram` and `edge_ngram` and we are removing those names in
8.0. This change disallows using the deprecated names for new indices created in 7.0 by
throwing an error if these filters are used.

Relates to #38911
2019-02-21 16:55:40 +01:00
Jason Tedor 751c05eff9
Bump jackson-databind version for ingest-geoip (#39182)
This commit bumps the jackson-databind version for ingest-geoip to
2.8.11.3.
2019-02-20 11:40:31 -05:00
Henning Andersen 00a26b9dd2 Blob store compression fix (#39073)
Blob store compression was not enabled for some of the files in
snapshots due to constructor accessing sub-class fields. Fixed to
instead accept compress field as constructor param. Also fixed chunk
size validation to work.

Deprecated repositories.fs.compress setting as well to be able to unify
in a future commit.
2019-02-20 09:24:41 +01:00
Ioannis Kakavas ec2b64af63 Disable date parsing test in non english locale (#39052)
This ensures we do not attempt to parse non english locale dates
in FIPS mode. The error, originally assumed to affect only Joda,
affects Java time in the same manner and manifests only with the
version of BouncyCastle FIPS certified provider we use in tests.
The upstream issue https://github.com/bcgit/bc-java/issues/405
indicates that the behavior is resolved in later versions of the
BouncyCastle library and should be tested again when the new
versions become FIPS 140 certified
2019-02-20 09:02:37 +02:00
Tal Levy f30f1fe9b6
fix RethrottleTests retry (#38978) (#39131)
the RethrottleTests assumed that tasks that were
unprepared to rethrottle would bubble up into the
Rethrottle response as an ElasticsearchException
wrapping an IllegalArgumentException. This seems to
have changed to potentially involve further levels of
wrapping.

This change makes the retry logic more resilient to
arbitrary nesting of the underlying IllegalArgumentException
2019-02-19 11:10:39 -08:00
Jake Landis 46bb663a09
Make 7.x like 6.7 user agent ecs, but default to true (#38828)
Forward port of https://github.com/elastic/elasticsearch/pull/38757

This change reverts the initial 7.0 commits and replaces them
with the 6.7 variant that still allows for the ecs flag. 
This commit differs from the 6.7 variants in that ecs flag will 
now default to true. 

6.7: `ecs` : default `false`
7.x: `ecs` : default `true`
8.0: no option, but behaves as `true`

* Revert "Ingest node - user agent, move device to an object (#38115)"
This reverts commit 5b008a34aa.

* Revert "Add ECS schema for user-agent ingest processor (#37727) (#37984)"
This reverts commit cac6b8e06f.

* cherry-pick 5dfe1935345da3799931fd4a3ebe0b6aa9c17f57 
Add ECS schema for user-agent ingest processor (#37727)

* cherry-pick ec8ddc890a34853ee8db6af66f608b0ad0cd1099 
Ingest node - user agent, move device to an object (#38115) (#38121)
  
* cherry-pick f63cbdb9b426ba24ee4d987ca767ca05a22f2fbb (with manual merge fixes)
Dep. check for ECS changes to User Agent processor (#38362)

* make true the default for the ecs option, and update 7.0 references and tests
2019-02-13 10:28:01 -06:00
Alexander Reelsen 884b5063a4 Create ISO8601 joda compatible java time formatter (#38434)
The existing formatter being used was not on par with the joda formatter
as it was missing the ability to parse a comma as a separator between
seconds and milliseconds.

While a real iso8601 would be much more complex, this might be
sufficient for some more use-cases.

The ingest date formatter now also uses the iso8601 formatter by
default.

Closes #38345
2019-02-11 15:11:26 +01:00
Christoph Büscher f61420140d
Use only default type in rank_eval API (#38530)
Currently tests still use custom type names. In preparation for the final types
removal this change moves all of them to use the default "_doc" type in tests.
2019-02-11 10:18:13 +01:00
Alexander Reelsen 56edc8e37f
Fix timezone fallback in ingest processor (#38407) (#38664)
If no timezone was specified in the date processor, then the conversion
would lead to wrong time, as UTC was assumed by default, leading to
incorrectly parsed dates.

This commit does not assume a default timezone and will thus not format
the dates in a wrong way.
2019-02-09 20:28:59 +01:00
Luca Cavanna a7046e001c
Remove support for maxRetryTimeout from low-level REST client (#38085)
We have had various reports of problems caused by the maxRetryTimeout
setting in the low-level REST client. Such setting was initially added
in the attempts to not have requests go through retries if the request
already took longer than the provided timeout.

The implementation was problematic though as such timeout would also
expire in the first request attempt (see #31834), would leave the
request executing after expiration causing memory leaks (see #33342),
and would not take into account the http client internal queuing (see #25951).

Given all these issues, it seems that this custom timeout mechanism 
gives little benefits while causing a lot of harm. We should rather rely 
on connect and socket timeout exposed by the underlying http client 
and accept that a request can overall take longer than the configured 
timeout, which is the case even with a single retry anyways.

This commit removes the `maxRetryTimeout` setting and all of its usages.
2019-02-06 08:43:47 +01:00
Boaz Leskes 033ba725af
Remove support for internal versioning for concurrency control (#38254)
Elasticsearch has long [supported](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-versioning) compare and set (a.k.a optimistic concurrency control) operations using internal document versioning. Sadly that approach is flawed and can sometime do the wrong thing. Here's the relevant excerpt from the resiliency status page:

> When a primary has been partitioned away from the cluster there is a short period of time until it detects this. During that time it will continue indexing writes locally, thereby updating document versions. When it tries to replicate the operation, however, it will discover that it is partitioned away. It won’t acknowledge the write and will wait until the partition is resolved to negotiate with the master on how to proceed. The master will decide to either fail any replicas which failed to index the operations on the primary or tell the primary that it has to step down because a new primary has been chosen in the meantime. Since the old primary has already written documents, clients may already have read from the old primary before it shuts itself down. The version numbers of these reads may not be unique if the new primary has already accepted writes for the same document 

We recently [introduced](https://www.elastic.co/guide/en/elasticsearch/reference/6.x/optimistic-concurrency-control.html) a new sequence number based approach that doesn't suffer from this dirty reads problem. 

This commit removes support for internal versioning as a concurrency control mechanism in favor of the sequence number approach.

Relates to #1078
2019-02-05 20:53:35 +01:00
Julie Tibshirani 3ce7d2c9b6
Make sure to reject mappings with type _doc when include_type_name is false. (#38270)
`CreateIndexRequest#source(Map<String, Object>, ... )`, which is used when
deserializing index creation requests, accidentally accepts mappings that are
nested twice under the type key (as described in the bug report #38266).

This in turn causes us to be too lenient in parsing typeless mappings. In
particular, we accept the following index creation request, even though it
should not contain the type key `_doc`:

```
PUT index?include_type_name=false
{
  "mappings": {
    "_doc": {
      "properties": { ... }
    }
  }
}
```

There is a similar issue for both 'put templates' and 'put mappings' requests
as well.

This PR makes the minimal changes to detect and reject these typed mappings in
requests. It does not address #38266 generally, or attempt a larger refactor
around types in these server-side requests, as I think this should be done at a
later time.
2019-02-05 10:52:32 -08:00
David Turner f2dd5dd6eb
Remove DiscoveryPlugin#getDiscoveryTypes (#38414)
With this change we no longer support pluggable discovery implementations. No
known implementations of `DiscoveryPlugin` actually override this method, so in
practice this should have no effect on the wider world. However, we were using
this rather extensively in tests to provide the `test-zen` discovery type. We
no longer need a separate discovery type for tests as we no longer need to
customise its behaviour.

Relates #38410
2019-02-05 17:42:24 +00:00
David Turner 3b2a0d7959
Rename no-master-block setting (#38350)
Replaces `discovery.zen.no_master_block` with `cluster.no_master_block`. Any
value set for the old setting is now ignored.
2019-02-05 08:47:56 +00:00
Christoph Büscher 820029522b
Mute DateProcessorTests#testJodaPatternLocale (#38265)
Only fails on FIPS 8, muting this selectively.
2019-02-03 19:52:53 +01:00
Jack Conradson 630889baec
Remove extraneous test from Painless lambda tests (#38111)
This test has been awaiting a fix that isn't currently relevant because incoming
lambda parameters are read-only. If this ever changes a new set of tests can
be added that are up-to-date.
2019-02-01 15:10:59 -08:00
Ioannis Kakavas 78a65c340d
Correctly disable tests for FIPS JVMs (#38214)
Replace assertFalse with assumeFalse

Resolves: #38212
2019-02-01 23:56:35 +02:00
Julie Tibshirani c2e9d13ebd
Default include_type_name to false in the yml test harness. (#38058)
This PR removes the temporary change we made to the yml test harness in #37285
to automatically set `include_type_name` to `true` in index creation requests
if it's not already specified. This is possible now that the vast majority of
index creation requests were updated to be typeless in #37611. A few additional
tests also needed updating here.

Additionally, this PR updates the test harness to set `include_type_name` to
`false` in index creation requests when communicating with 6.x nodes. This
mirrors the logic added in #37611 to allow for typeless document write requests
in test set-up code. With this update in place, we can remove many references
to `include_type_name: false` from the yml tests.
2019-02-01 11:44:13 -08:00
Nhat Nguyen 70235838d1
AwaitsFix testClientSucceedsWithVerificationDisabled (#38213)
Tracked at #38212
2019-02-01 12:50:07 -05:00
Desmond Vehar c1c4abae10 Throw if two inner_hits have the same name (#37645)
This change throws an error if two inner_hits have the same name

Closes #37584
2019-02-01 15:53:50 +01:00
Andrey Ershov bfd618cf83
Universal cluster bootstrap method for tests with autoMinMasterNodes=false (#38038)
Currently, there are a few tests that use autoMinMasterNodes=false and
hence override addExtraClusterBootstrapSettings, mostly this is 10-30
lines of codes that are copy-pasted from class to class.

This PR introduces `InternalTestCluster.setBootstrapMasterNodeIndex`
which is suitable for all classes and copy-paste could be removed.

Removing code is always a good thing!
2019-02-01 11:34:31 +01:00
Alexander Reelsen 6c5a7387af
Replace joda time in ingest-common module (#38088)
This commit fully replaces any remaining joda time time classes with
java time implementations.

Relates #27330
2019-02-01 10:15:18 +01:00
Jake Landis 5b008a34aa
Ingest node - user agent, move device to an object (#38115)
When the ingest node user agent parses the device field, it
will result in a string value. To match the ecs schema
this commit moves the value of the parsed device to an
object with an inner field named 'name'. There are not
any passivity concerns since this modifies an unreleased change.

closes #38094
relates #37329
2019-01-31 13:54:34 -06:00
Henning Andersen 68ed72b923
Handle scheduler exceptions (#38014)
Scheduler.schedule(...) would previously assume that caller handles
exception by calling get() on the returned ScheduledFuture.
schedule() now returns a ScheduledCancellable that no longer gives
access to the exception. Instead, any exception thrown out of a
scheduled Runnable is logged as a warning.

This is a continuation of #28667, #36137 and also fixes #37708.
2019-01-31 17:51:45 +01:00
Jack Conradson e066a59c89
Fix Painless void return bug (#38046)
Painless now allows void functions and contexts to with a void return type to use
a return statement without a following expression.
2019-01-31 08:32:38 -08:00
Boaz Leskes 91d7050a5b remove unused parser fields in RemoteResponseParsers 2019-01-31 15:27:42 +01:00
Luca Cavanna 622fb7883b
Introduce ability to minimize round-trips in CCS (#37828)
With #37566 we have introduced the ability to merge multiple search responses into one. That makes it possible to expose a new way of executing cross-cluster search requests, that makes CCS much faster whenever there is network latency between the CCS coordinating node and the remote clusters. The coordinating node can now send a single search request to each remote cluster, which gets reduced by each one of them. from + size results are requested to each cluster, and the reduce phase in each cluster is non final (meaning that buckets are not pruned and pipeline aggs are not executed). The CCS coordinating node performs an additional, final reduction, which produces one search response out of the multiple responses received from the different clusters.

This new execution path will be activated by default for any CCS request unless a scroll is provided or inner hits are requested as part of field collapsing. The search API accepts now a new parameter called ccs_minimize_roundtrips that allows to opt-out of the default behaviour.

Relates to #32125
2019-01-31 15:12:14 +01:00
Tal Levy e0d5de33da fix DateIndexNameProcessorTests offset pattern (#38069)
`XX` was being used to represent an offset pattern,
it should be `ZZ`

Fixes #38067.
2019-01-31 08:57:56 +01:00
Alexander Reelsen b94acb608b
Speed up converting of temporal accessor to zoned date time (#37915)
The existing implementation was slow due to exceptions being thrown if
an accessor did not have a time zone. This implementation queries for
having a timezone, local time and local date and also checks for an
instant preventing to throw an exception and thus speeding up the conversion.

This removes the existing method and create a new one named
DateFormatters.from(TemporalAccessor accessor) to resemble the naming of
the java time ones.

Before this change an epoch millis parser using the toZonedDateTime
method took approximately 50x longer.

Relates #37826
2019-01-31 08:55:40 +01:00
Tim Vernum a8596de31f
Introduce ssl settings to reindex from remote (#37527)
Adds reindex.ssl.* settings for reindex from remote.

This uses the ssl-config/ internal library to parse and load SSL
configuration and files. This is applied when using the low level
rest client to connect to a remote ES node

Relates: #37287
Resolves: #29755
2019-01-31 18:06:05 +11:00
Jason Tedor 89bffc25de
Mute failing date index name processor test
This test is repeatedly failing, so this commit mutes it.

Relates #38067
2019-01-30 20:37:52 -05:00
David Turner 81c443c9de
Deprecate minimum_master_nodes (#37868)
Today we pass `discovery.zen.minimum_master_nodes` to nodes started up in
tests, but for 7.x nodes this setting is not required as it has no effect.
This commit removes this setting so that nodes are started with more realistic
configurations, and deprecates it.
2019-01-30 20:09:15 +00:00
Jack Conradson 36ee78d924
Add test coverage for Painless general casting of boolean and Boolean (#37780)
This adds test coverage for general casts in Painless between boolean and other types and Boolean and other types.
2019-01-30 11:01:45 -08:00
Lee Hinman cac6b8e06f
Add ECS schema for user-agent ingest processor (#37727) (#37984)
* Add ECS schema for user-agent ingest processor (#37727)

This switches the format of the user agent processor to use the schema from [ECS](https://github.com/elastic/ecs).
So rather than something like this:

```
{
  "patch" : "3538",
  "major" : "70",
  "minor" : "0",
  "os" : "Mac OS X 10.14.1",
  "os_minor" : "14",
  "os_major" : "10",
  "name" : "Chrome",
  "os_name" : "Mac OS X",
  "device" : "Other"
}
```

The structure is now like this:

```
{
  "name" : "Chrome",
  "original" : "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36",
  "os" : {
    "name" : "Mac OS X",
    "version" : "10.14.1",
    "full" : "Mac OS X 10.14.1"
  },
  "device" : "Other",
  "version" : "70.0.3538.102"
}
```

This is now the default for 7.0. The deprecated `ecs` setting in 6.x is not
supported.

Resolves #37329

* Remove `ecs` setting from docs
2019-01-30 11:24:18 -07:00
Nik Everett e97718245d
Test: Enable strict deprecation on all tests (#36558)
This drops the option for tests to disable strict deprecation mode in
the low level rest client in favor of configuring expected warnings on
any calls that should expect warnings. This behavior is
paranoid-by-default which is generally the right way to handle
deprecations and tests in general.
2019-01-30 11:48:34 -05:00
Colin Goodheart-Smithe 21e392e95e
Removes typed calls from YAML REST tests (#37611)
This PR attempts to remove all typed calls from our YAML REST tests. The PR adds include_type_name: false to create index requests that use a mapping and also to put mapping requests. It also removes _type from index requests where they haven't already been removed. The PR ignores tests named *_with_types.yml since this are specifically testing typed API behaviour.

The change also includes changing the test harness to add the type _doc to index, update, get and bulk requests that do not specify the document type when the test is running against a mixed 7.x/6.x cluster.
2019-01-30 16:32:58 +00:00
David Roberts 2f7776c8b7
Switch default time format for ingest from Joda to Java for v7 (#37934)
Date formats with and without the "8" prefix are now all treated
as Java time formats, so that ingest does the same as mappings
in this respect.
2019-01-30 16:26:28 +00:00
Luca Cavanna b91d587275
Move SearchHit and SearchHits to Writeable (#37931)
This allowed to make SearchHits immutable, while quite a few fields in
SearchHit have to stay mutable unfortunately.

Relates to #34389
2019-01-30 12:05:54 +01:00
Boaz Leskes 218df3009a
Move update and delete by query to use seq# for optimistic concurrency control (#37857)
The delete and update by query APIs both offer protection against overriding concurrent user changes to the documents they touch. They currently are using internal versioning. This PR changes that to rely on sequences numbers and primary terms.

Relates #37639 
Relates #36148 
Relates #10708
2019-01-29 10:23:05 -05:00
Luca Cavanna 2325fb9cb3
Remove test only SearchShardTarget constructor (#37912)
Remove SearchShardTarget test only constructor and replace all the usages with calls to the other constructor that accepts a ShardId.
2019-01-29 14:58:11 +01:00
Christoph Büscher b4b4cd6ebd
Clean codebase from empty statements (#37822)
* Remove empty statements

There are a couple of instances of undocumented empty statements all across the
code base. While they are mostly harmless, they make the code hard to read and
are potentially error-prone. Removing most of these instances and marking blocks
that look empty by intention as such.

* Change test, slightly more verbose but less confusing
2019-01-25 14:23:02 +01:00
Alexander Reelsen 9e350d027e
Add BWC compatible processing to ingest date processors (#37407)
The ingest date processor is currently only able to parse joda formats.
However it is not using the existing elasticsearch classes but access
joda directly. This means that our existing BWC layer does not notify
the user about deprecated formats. This commit switches to use the
exising Elasticsearch Joda methods to acquire a date format, that
includes the BWC check and the ability to parse java 8 dates.

The date parsing in ingest has also another extra feature, that the
fallback year, when a date format without a year is used, is the current
year, and not 1970 like usual. This is currently not properly supported
in the DateFormatter class. As this is the only case for this feature
and java time can take care of this using the toZonedDateTime() method,
a workaround just for the joda time parser has been created, that can be
removed soon again from 7.0.
2019-01-25 13:50:19 +01:00
Mayya Sharipova a30ce6a00a
Rename feature, feature_vector and feature_query (#37794)
Ranaming as follows:
feature -> rank_feature
feature_vector -> rank_features
feature query -> rank_feature query

Ranaming is done to distinguish from other vector types.

Closes #36723
2019-01-24 19:18:48 -05:00
Boaz Leskes af2f4c8f73 enable bwc tests and bump versions after backporting https://github.com/elastic/elasticsearch/pull/37639 2019-01-24 20:55:55 +01:00
Mayya Sharipova fdb66039d4
Change `rational` to `saturation` in script_score (#37766)
This change of the function name is necessary for conformity
with feature queries.

Closes #37714
2019-01-23 14:28:20 -05:00
Alexander Reelsen daa2ec8a60
Switch mapping/aggregations over to java time (#36363)
This commit moves the aggregation and mapping code from joda time to
java time. This includes field mappers, root object mappers, aggregations with date
histograms, query builders and a lot of changes within tests.

The cut-over to java time is a requirement so that we can support nanoseconds
properly in a future field mapper.

Relates #27330
2019-01-23 10:40:05 +01:00