48071 Commits

Author SHA1 Message Date
Dimitris Athanasiou 36884a3c32
[7.x][ML] Restore analytics state if available (#47128) (#47393)
This commit restores the model state if available in data
frame analytics jobs.

In addition, this changes the start API so that a stopped job
can be restarted. As we now store the progress in the state index
when the task is stopped, we can use it to determine what state
the job was in when it got stopped.

Note that in order to be able to distinguish between a job
that runs for the first time and another that is restarting,
we ensure reindexing progress is reported to be at least 1
for a running task.
2019-10-02 10:24:05 +03:00
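The "at least 1" rule in the commit above can be illustrated with a minimal sketch. It assumes progress is an integer percentage and that a stored value of 0 means the job never ran. The function names are hypothetical and not taken from the Elasticsearch code base.

```python
def reported_reindexing_progress(actual_percent: int, task_running: bool) -> int:
    """Progress value to report for the reindexing phase (hypothetical helper)."""
    if task_running:
        # A running task never reports 0, so a stored 0 can only mean "never started".
        return max(1, actual_percent)
    return actual_percent


def is_restart(stored_reindexing_progress: int) -> bool:
    """A job is restarting if its stored progress shows it has run before."""
    return stored_reindexing_progress > 0
```

With this convention, a job stopped immediately after starting still stores a progress of 1, so `is_restart` can tell it apart from a job that has never been started.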
Ryan Ernst bd5f64848e Clarify missing java error message (#46160)
Since the bundled jdk was added to Elasticsearch, there are now 2 ways
java can be missing. Either JAVA_HOME is set but does not exist, or the
bundled jdk does not exist. This commit improves the error messages in
those two cases, and also ensures our tests cover both cases.
2019-10-01 22:10:19 -07:00
Nhat Nguyen 5cfcd7c458 Re-fetch shard info of primary when new node joins (#47035)
Today, we don't clear the shard info of the primary shard when a new
node joins, so we risk making replica allocation decisions based on
stale information about the primary. The serious problem is that we can
cancel a current recovery that is more advanced than the copy on the
new node because of the old info we have from the primary.

With this change, we ensure the shard info from the primary is not older
than any node when allocating replicas.

Relates #46959

This work was done by Henning in #42518.

Co-authored-by: Henning Andersen <henning.andersen@elastic.co>
2019-10-01 22:16:26 -04:00
Mark Vieira ff15495b98
Remove empty buildSrc subproject (#47415) 2019-10-01 16:34:31 -07:00
James Rodewig 079bf887c0
[DOCS] Reorder index APIs alphabetically (#46981) (#47402) 2019-10-01 17:07:28 -04:00
Gordon Brown ba6ee2d40d
[7.x] Adjust randomization in cluster shard limit tests (#47254)
This commit adjusts randomization for the cluster shard limit tests so
that there is often more of a gap left between the limit and the size of
the first index. This allows the same randomization to be used for all
tests, and alleviates flakiness in
`testIndexCreationOverLimitFromTemplate`.
2019-10-01 14:53:10 -06:00
David Turner 99b25d3740 Keep nodes above watermark in testAutomaticReleaseOfIndexBlock (#47387)
Today the comment boldly claims that this line of code keeps nodes above the
10-byte low watermark when in fact this is not true at all. This change fixes
this so that it really does keep nodes above the low watermark.

Fixes #45338. Again.
2019-10-01 19:58:23 +01:00
James Rodewig 0179f93544
[DOCS] Reformat simulate pipeline API (#47301) (#47398) 2019-10-01 14:49:14 -04:00
James Rodewig aeb4edce3a
[DOCS] Reformat put pipeline API (#47171) (#47395) 2019-10-01 14:48:18 -04:00
Benjamin Trent f5fe5e7cd6
[7.x] [ML][Inference] Adding preprocessors to definition object (#47320) (#47370)
* [ML][Inference] Adding preprocessors to definition object (#47320)

* [ML][Inference] Adding preprocessors to definition object

* Update TrainedModelConfig.java

* adjusting for backport
2019-10-01 13:31:25 -04:00
lcawl 66116e39ba [DOCS] Edits ML release notes 2019-10-01 10:15:06 -07:00
Armin Braun 3d6ef6a90e
Speed up and Reorder Snapshot Delete Operations (#47293) (#47350)
This is a preliminary step for #46250, making the snapshot
delete work by doing all the metadata updates first
and then bulk deleting all of the now unreferenced
blobs.
Before this change, the metadata updates for each shard
and subsequent deletion of the blobs that have become unreferenced
due to the delete would happen sequentially shard-by-shard
parallelising only over all the indices in the snapshot.
This change makes it so that all the metadata updates
happen in parallel on a shard level first.
Once all of the updates of shard-level metadata have finished,
all the now unreferenced blobs are deleted in bulk.
This has two benefits (outside of making #46250 a smaller change):
* We have a lower likelihood of failing to update shard level metadata because
it happens with priority and a higher degree of parallelism
* Deleting unreferenced data in the shards should also go much faster in many cases (rolling indices, large number of indices with many unchanged shards) because a number of small bulk deletions (just two blobs for `index-N` and `snap-` for each unchanged shard) are grouped into larger bulk deletes of `100-1000` blobs depending on the cloud provider. Even though the final bulk deletes happen sequentially, this should be much faster in almost all cases, as you'd need a parallelism of 50 (GCS) to 500 (S3) snapshot threads to achieve the same delete rates when deleting from unchanged shards.
2019-10-01 19:05:43 +02:00
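The two-phase ordering described in the commit above can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository code: `update_shard_metadata` and `bulk_delete` are hypothetical callbacks, and the batch size stands in for the provider-dependent limit of 100 to 1000 blobs mentioned in the message.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def delete_snapshot_blobs(
    shards: Sequence[str],
    update_shard_metadata: Callable[[str], List[str]],  # hypothetical: returns blobs that became unreferenced
    bulk_delete: Callable[[List[str]], None],           # hypothetical: deletes one batch of blobs
    batch_size: int = 1000,
    max_workers: int = 8,
) -> int:
    """Update all shard-level metadata first, then bulk delete; returns the number of batches."""
    # Phase 1: run the metadata updates for every shard in parallel.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_shard = list(pool.map(update_shard_metadata, shards))

    # Collect every blob that is no longer referenced by any shard metadata.
    unreferenced = [blob for blobs in per_shard for blob in blobs]

    # Phase 2: delete sequentially, but in large batches instead of shard by shard.
    batches = 0
    for start in range(0, len(unreferenced), batch_size):
        bulk_delete(unreferenced[start:start + batch_size])
        batches += 1
    return batches
```

The point of the ordering is that no blob is deleted until every metadata update has finished, and that many two-blob deletions collapse into a few large requests.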
James Rodewig e70220857d
[DOCS] Document cat tasks API (#47321) (#47375) 2019-10-01 12:22:50 -04:00
Colin Goodheart-Smithe c93b39c65b
Adds version 7.4.1 2019-10-01 16:03:11 +01:00
Mark Tozzi 5bdf25320a
Documentation notes for Range field histograms (#46890) (#47366) 2019-10-01 10:58:44 -04:00
Lisa Cawley 5ba543fd6c [DOCS] Adds machine learning PRs to release notes (#47316) 2019-10-01 10:17:41 -04:00
James Rodewig 2ca075dee4 [DOCS] Remove coming tags for 7.4.0 release (#47318) 2019-10-01 10:17:36 -04:00
Albert Zaharovits 78558a7b2f
Fix AD realm additional metadata (#47179)
Due to a regression, the metadata setting of the Active Directory
realm is ignored (it works correctly for the LDAP realm type).
This commit fixes it.

Closes #45848
2019-10-01 17:05:25 +03:00
Marios Trivyzas f792dbf239 SQL: Implement DATE_PART function (#47206)
DATE_PART(<datetime unit>, <date/datetime>) is a function that allows
the user to extract the specified unit from a date/datetime field
similar to the EXTRACT (<datetime unit> FROM <date/datetime>) but
with different names and aliases for the units and it also provides more
options like `DATE_PART('tzoffset', datetimeField)`.

Implemented following the SQL server's spec: https://docs.microsoft.com/en-us/sql/t-sql/functions/datepart-transact-sql?view=sql-server-2017
with the difference that the <datetime unit> argument is either a
literal single quoted string or gets a value from a table field, whereas
in SQL server keywords are used (unquoted identifiers) and it's not
possible to use a value coming from a table column.

Closes: #46372
(cherry picked from commit ead743d3579eb753fd314d4a58fae205e465d72e)
2019-10-01 16:28:27 +03:00
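The behaviour described in the DATE_PART commit above, where the unit arrives as a string value rather than a keyword, can be mimicked in a few lines. This is only a sketch: the unit table is a small hypothetical subset rather than the function's real list of names and aliases, and reporting `tzoffset` in minutes is an assumption based on SQL Server's DATEPART.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical subset of units; the real function supports more names and aliases.
_UNITS = {
    "year": lambda d: d.year,
    "month": lambda d: d.month,
    "day": lambda d: d.day,
    "hour": lambda d: d.hour,
    "minute": lambda d: d.minute,
    # Offset from UTC in minutes (assumed convention); naive datetimes report 0.
    "tzoffset": lambda d: int(d.utcoffset().total_seconds() // 60) if d.utcoffset() is not None else 0,
}


def date_part(unit: str, value: datetime) -> int:
    """Extract `unit` from `value`; the unit is an ordinary string, so it may come from data."""
    try:
        extractor = _UNITS[unit.lower()]
    except KeyError:
        raise ValueError(f"unknown datetime unit: {unit!r}") from None
    return extractor(value)
```

Because the unit is a plain string, it can be supplied per row, which is the difference from keyword-based syntax that the commit message points out.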
Benjamin Trent 4335e07716
[7.x] [ML][Inference] adding .ml-inference* index and storage (#47267) (#47310)
* [ML][Inference] adding .ml-inference* index and storage (#47267)

* [ML][Inference] adding .ml-inference* index and storage

* Addressing PR comments

* Allowing null definition, adding validation tests for model config

* fixing line length

* adjusting for backport
2019-10-01 08:20:33 -04:00
Tanguy Leroux c43e932a0c Fix CharArraysTests.testConstantTimeEquals() (#47346)
The change #47238 fixed a first issue (#47076) but introduced 
another one that can be reproduced using:

org.elasticsearch.common.CharArraysTests > testConstantTimeEquals FAILED

java.lang.StringIndexOutOfBoundsException: String index out of range: 1
at __randomizedtesting.SeedInfo.seed([DFCA64FE2C786BE3:ED987E883715C63B]:0)
at java.lang.String.substring(String.java:1963)
at org.elasticsearch.common.CharArraysTests.testConstantTimeEquals(CharArraysTests.java:74)

REPRODUCE WITH: ./gradlew ':libs:elasticsearch-core:test' --tests 
"org.elasticsearch.common.CharArraysTests.testConstantTimeEquals" 
-Dtests.seed=DFCA64FE2C786BE3 -Dtests.security.manager=true -Dtests.locale=fr-CA 
-Dtests.timezone=Pacific/Johnston -Dcompiler.java=12 -Druntime.java=8

that happens when the first randomized string has a length of 0.
2019-10-01 12:49:15 +02:00
Ioannis Kakavas 3b06916fcd Revert "Fix Active Directory tests (#47266)"
This reverts commit 7d9c064218.
2019-10-01 13:32:31 +03:00
Howard a9cd42c05d Cancel recoveries even if all shards assigned (#46520)
We cancel ongoing peer recoveries if a node joins the cluster with a completely
up-to-date copy of a shard, because we can use such a copy to recover a replica
instantly. However, today we only look for recoveries to cancel while there are
unassigned shards in the cluster. This means that we do not contemplate the
cancellation of the last few recoveries since recovering shards are not
unassigned.  It might take much longer for these recoveries to complete than
would be necessary if they were cancelled.

This commit fixes this by checking for cancellable recoveries even if all
shards are assigned.
2019-10-01 10:55:32 +01:00
Ignacio Vera 03d717dc32
Provide better error when updating geo_shape field mapper settings (#47281) (#47338) 2019-10-01 10:52:39 +02:00
Ioannis Kakavas 7d9c064218 Fix Active Directory tests (#47266)
Fixes multiple Active Directory related tests that run against the
samba fixture. Some were failing since we changed the realm settings
format in 7.0 and a few were slightly broken in other ways.
We can move to clean up the tests in a follow-up but this work fits
better to be done with or after we move the tests from a Samba
based fixture to a real(-ish) Microsoft Active Directory based
fixture.

Resolves: #33425, #35738
2019-10-01 10:52:07 +03:00
Tanguy Leroux f5c5411fe8
Differentiate base paths in repository integration tests (#47284) (#47300)
This commit changes the repositories' base paths used in Azure/S3/GCS
integration tests so that they don't conflict with each other when tests
run in parallel on real storage services.

Closes #47202
2019-10-01 08:39:55 +02:00
Yannick Welsch dd0af2e425 Fix CloseIndexIT.testRelocatedClosedIndexIssue (#47169)
Closes #47330
2019-10-01 08:34:27 +02:00
Ioannis Kakavas 33c5e5b09d Fix SSLErrorMessageTests in Windows (#47315)
- Build paths with PathUtils#get instead of hard-coding a string with
forward slashes.
- Do not try to match the whole message that includes paths. The
file separator is `\\` in windows but when we throw an Elasticsearch
Exception, the message is formatted with LoggerMessageFormat#format
which replaces `\\` with `\` in Path names. That means that in Windows
the Exception message will contain paths with single backslashes while
the expected string that comes from Path#toString on filename and
env.configFile will contain double backslashes. There is no point in
attempting to match the whole message string for the purpose of this test.

Resolves: #45598
2019-10-01 09:14:36 +03:00
Marios Trivyzas fa0b1b641a
SQL: Add examples of muting sql/csv integ tests (#47291)
Add examples of failures for both sql and csv integration
tests and instructions on how to mute them.

(cherry picked from commit 591bba46516d770f5fc95a4c536dd7448b74dd49)
2019-10-01 09:12:20 +03:00
István Zoltán Szabó 170b102ab5 [DOCS] Changes wording to move away from data frame terminology in the ES repo (#47093)
* [DOCS] Changes wording to move away from data frame terminology in the ES repo.
Co-Authored-By: Lisa Cawley <lcawley@elastic.co>
2019-10-01 08:08:17 +02:00
Mark Vieira 45605cfd7a
Use VAULT_TOKEN environment variable if it exists (#45525)
(cherry picked from commit b57f2ff68049a4d86ab37399531616b7ad26cfd9)
2019-09-30 15:16:05 -07:00
Alpar Torok f4e32a5f5f
Disable the use of artifactory in CI (#47100)
(cherry picked from commit cbe8b3645f74da0a7e2794536d84a3c46905e49e)
2019-09-30 15:15:55 -07:00
Ryan Ernst 67f0ffd134 Ensure char array test uses different values (#47238)
The test of constantTimeEquals could get unlucky and randomly produce
the same two strings. This commit tweaks the test to ensure the two
strings are unique, and the loop inside constantTimeEquals is actually
executed (which requires that the strings be of the same length).

fixes #47076
2019-09-30 14:46:53 -07:00
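To make the test constraint in the commit above concrete, here is a minimal sketch of a constant-time comparison together with a helper that produces a different string of the same length. Both functions are illustrative and are not the Elasticsearch `CharArrays` code. The helper rejects empty input explicitly, since a zero-length string has no distinct string of equal length.

```python
def constant_time_equals(a: str, b: str) -> bool:
    """Compare two strings without returning early on the first mismatch."""
    if len(a) != len(b):
        # Lengths differ: the comparison loop below is never entered.
        return False
    diff = 0
    for x, y in zip(a, b):
        # Accumulate differences so every character is always inspected.
        diff |= ord(x) ^ ord(y)
    return diff == 0


def different_string_same_length(original: str) -> str:
    """Hypothetical test helper: a string of equal length that differs from `original`."""
    if not original:
        raise ValueError("an empty string has no distinct string of the same length")
    # Replace the first character so the result is guaranteed to differ.
    replacement = "a" if original[0] != "a" else "b"
    return replacement + original[1:]
```

A test that wants to exercise the loop needs inputs of equal length that still differ, which is what the helper provides. In production Python code, the standard library's `hmac.compare_digest` is the usual choice for this kind of comparison.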
Armin Braun 3d23cb44a3
Speed up Snapshot Finalization (#47283) (#47309)
As a result of #45689 snapshot finalization started to
take significantly longer than before. This may be a
little unfortunate since it increases the likelihood
of failing to finalize after having written out all
the segment blobs.
This change parallelizes all the metadata writes that
can safely run in parallel in the finalization step to
speed the finalization step up again. Also, this will
generally speed up the snapshot process overall in case
of large number of indices.

This is also a nice-to-have for #46250 since we add yet
another step (deleting of old index- blobs in the shards)
to the finalization.
2019-09-30 23:28:59 +02:00
Marios Trivyzas bd2abeef40
SQL: [TESTS] Improve error messages on failures (#47308)
When an integration test fails before the assertion of the results it's
missing information, like the file name and the line in the file where
the test resides.

(cherry picked from commit 683dc7213311d13c81e06829e08f3f9f80ebf73a)
2019-09-30 22:18:39 +03:00
Jason Tedor 890951113f
Make Setting#getRaw have private access (#47287)
The method Setting#getRaw leaks implementation details about settings,
namely that they are backed by strings. We do not want code to rely upon
this, so this commit makes Setting#getRaw private as a first step
towards hiding the implementation details of settings from the rest of
the codebase.
2019-09-30 14:14:30 -04:00
James Rodewig fd421bd12d
[7.x] [DOCS] Add response body parms to search API docs (#47042) (#47303) 2019-09-30 13:54:06 -04:00
Lisa Cawley 0c3ee0b15c
[DOCS] Moves Watcher content into Elasticsearch book (#47147) (#47255)
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-09-30 10:18:50 -07:00
Jack Conradson 8f1a80a43d Move Painless local methods to a dedicated FunctionTable (#46889)
This moves the way Painless maintains function headers for use
across compilation into its own class - FunctionTable. This
allows us to store a dedicated object for function lookup at
runtime for the def type instead of a loose Map of functions.
2019-09-30 09:06:40 -07:00
David Turner 41bc878738 Clarify rolling-upgrade docs (#47279)
Note to upgrade the master-eligible nodes last, and note that
`cluster.initial_master_nodes` should not be set.
2019-09-30 17:02:58 +01:00
David Turner 72b63635de
Remove unused pluggable metadata upgraders (#47277)
Today plugins may provide upgraders for custom metadata and index metadata, but
these upgraders are bypassed during a rolling restart. Fortunately this
extension mechanism is unused by all known plugins. This commit removes these
extension points.

Relates #47297
2019-09-30 16:58:29 +01:00
James Rodewig 024d1f2ab9
[DOCS] Reformat delete pipeline API (#47172) (#47294) 2019-09-30 11:38:46 -04:00
James Rodewig 312e32a3d7 [DOCS] Correct snippet in query string syntax 2019-09-30 11:30:33 -04:00
Andrew Naguib ae85a0e29a [DOCS] Note double backslashes (`\\`) are required to escape JSON chars (#46863) 2019-09-30 11:20:07 -04:00
David Roberts 24b3703005
[TEST] Only wait for 6.6 prerequisites if BWC version is 6.6 or higher (#47289)
With this change the test setup for ML config upgrade
tests only waits for v6.6+ ML index templates to be
installed if the old cluster is running version 6.6.0
or higher.

Previously it was always waiting, but timing out without
failing the test if the templates were not installed
within 10 seconds, effectively just adding a pointless
10 second sleep to BWC tests against versions earlier
than 6.6.0. This problem was exposed by #47112.

Fixes #47286
2019-09-30 14:55:50 +01:00
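The version gate described in the commit above amounts to a numeric version comparison. A minimal sketch, assuming plain `major.minor.patch` version strings and a hypothetical helper name:

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '6.5.4' into (6, 5, 4) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def should_wait_for_ml_templates(old_cluster_version: str) -> bool:
    """Hypothetical gate: only wait when the old cluster can have the v6.6+ templates."""
    return parse_version(old_cluster_version) >= (6, 6, 0)
```

Comparing tuples of integers avoids the pitfall of string comparison, where `"6.10.0"` would sort before `"6.6.0"`.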
Gaurav614 052c523d41 Fail allocation of new primaries in empty cluster (#43284)
Today if you create an index in a cluster without any data nodes then it will
report yellow health because it never attempts to assign any shards if there
are no data nodes, so the new shards remain at `AllocationStatus.NO_ATTEMPT`.
This commit moves the new primaries to `AllocationStatus.DECIDERS_NO` in this
situation, causing the cluster health to move to red.

Fixes #41073
2019-09-30 14:27:12 +01:00
emasab 87156ad93b
SQL: Fix issue with duplicate columns in SELECT (#42122)
Previously, if a column (field, scalar, alias) appeared more than once in the
SELECT list, the value was returned only once (1st appearance) in each row.

Fixes: #41811

(cherry picked from commit 097ea36581a751605fc4f2088319d954ce35b5d1)
2019-09-30 15:56:29 +03:00
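The symptom in the commit above, a repeated column appearing only once per row, is what happens whenever a projection is keyed by column name. The sketch below illustrates that general failure mode. It is an assumption for illustration and does not describe the actual cause or fix in the SQL plugin.

```python
from typing import Any, Dict, List, Sequence


def project_row_by_name(select_list: Sequence[str], source: Dict[str, Any]) -> List[Any]:
    """Buggy variant: a mapping keyed by name collapses repeated columns."""
    return list({column: source[column] for column in select_list}.values())


def project_row_by_position(select_list: Sequence[str], source: Dict[str, Any]) -> List[Any]:
    """Position-based variant: one output value per entry in the SELECT list."""
    return [source[column] for column in select_list]
```

For a select list of `a, b, a`, the name-keyed variant yields two values per row while the position-based variant yields three.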
James Rodewig 5990532cb7
[DOCS] Reformat flush API docs (#46875) (#47230) 2019-09-30 08:42:52 -04:00
David Roberts 0807d409bf [ML] Reinstate ML daily maintenance actions (#47103)
A refactoring in 6.6 meant that the ML daily
maintenance actions have not been run at all
since then. This change installs the local
master listener that schedules the ML daily
maintenance, and also defends against some
subtle race conditions that could occur in the
future if a node flipped very quickly between
master and non-master.

Fixes #47003
2019-09-30 13:12:32 +01:00
Yannick Welsch 467596871a Omit writing index metadata for non-replicated closed indices on data-only node (#47285)
Fixes a bug related to how "closed replicated indices" (introduced in 7.2) interact with the index
metadata storage mechanism, which has special handling for closed indices (but incorrectly
handles replicated closed indices). On non-master-eligible data nodes, it's possible for the
node's manifest file (which tracks the relevant metadata state that the node should persist) to
become out of sync with what's actually stored on disk, leading to an inconsistency that is then
detected at startup and causes the node to refuse to start.

Closes #47276
2019-09-30 13:56:52 +02:00