Commit Graph

1337 Commits

Author SHA1 Message Date
Nhat Nguyen 624b6bb487
Copy and validatie soft-deletes setting on resize (#33517)
This change copies and validates the soft-deletes setting during resize.
If the source enables soft-deletes, the target must also enable it.

Closes #33321
2018-09-10 17:38:58 -04:00
Alan Woodward 39c3234c2f
Upgrade to latest Lucene snapshot (#33505)
* LeafCollector.setScorer() now takes a Scorable
* Scorers may not have null Weights
* IndexWriter.getFlushingBytes() reports how much memory is being used by IW threads writing to disk
2018-09-10 20:51:55 +01:00
Armin Braun 9a2c77d1c3
MINOR: Remove Dead Code in SearchScript (#33569)
* `lookup` is not used anywhere
* `getLeafContext` is not used anywhere
2018-09-10 18:56:21 +02:00
Tanguy Leroux 079d130d8c
[Test] Remove duplicate method in TestShardRouting (#32815) 2018-09-10 18:29:00 +02:00
David Turner 284c45a6ff
Strengthen FilterRoutingTests (#33149)
Today the FilterRoutingTests take the belt-and-braces approach of excluding
some node attribute values and including some others. This means that we don't
really test that both inclusion and exclusion work correctly: as long as one of
them works as expected then the test will pass. This change improves these
tests by only using one approach at once, demonstrating that both do indeed
work, and adds tests for various other scenarios too.
2018-09-10 11:23:05 +02:00
Nhat Nguyen e6ca55bca6 Adjust bwc for stale primary recovery source (#33432)
Relates #33432
2018-09-09 21:34:32 -04:00
Jason Tedor 6bb817004b
Add infrastructure to upgrade settings (#33536)
In some cases we want to deprecate a setting, and then automatically
upgrade uses of that setting to a replacement setting. This commit adds
infrastructure for this so that we can upgrade settings when recovering
the cluster state, as well as when such settings are dynamically applied
on cluster update settings requests. This commit only focuses on cluster
settings, index settings can build on this infrastructure in a
follow-up.
2018-09-09 20:49:19 -04:00
Armin Braun d4b212c4c9
CORE: Make Pattern Exclusion Work with Aliases (#33518)
* CORE: Make Pattern Exclusion Work with Aliases

* Adds the pattern exclusion logic to finding aliases
* Closes #33395
2018-09-09 17:31:02 +02:00
S.Y. Wang 9073dbefd6 HLRC: Add put stored script support to high-level rest client (#31323)
Relates to #27205
2018-09-09 13:47:47 +02:00
Nhat Nguyen 94e4cb64c2
Bootstrap a new history_uuid when force allocating a stale primary (#33432)
This commit ensures that we bootstrap a new history_uuid when force
allocating a stale primary. A stale primary should never be the source
of an operation-based recovery to another shard which exists before the
forced-allocation.

Closes #26712
2018-09-08 19:29:31 -04:00
Armin Braun f27c3dcf88
INGEST: Remove Outdated TODOs (#33458)
* CompoundProcessor is in the ingest package now
-> resolved
* Java generics don't offer type checking so nothing
can be done here -> remvoed TODO and test
* #16019 was closed and not acted on
-> todo can go away
2018-09-08 10:18:45 +02:00
Jason Tedor 9a404f3def
Include fallback settings when checking dependencies (#33522)
Today when checking settings dependencies, we do not check if fallback
settings are present. This means, for example, that if
cluster.remote.*.seeds falls back to search.remote.*.seeds, and
cluster.remote.*.skip_unavailable and search.remote.*.skip_unavailable
depend on cluster.remote.*.seeds, and we have set search.remote.*.seeds
and search.remote.*.skip_unavailable, then validation will fail because
it is expected that cluster.ermote.*.seeds is set here. This commit
addresses this by also checking fallback settings when validating
dependencies. To do this, we adjust the settings exist method to also
check for fallback settings, a case that it was not handling previously.
2018-09-07 20:09:53 -04:00
Nik Everett 190ea9a6de
Logging: Configure the node name when we have it (#32983)
Change the logging infrastructure to handle when the node name isn't
available in `elasticsearch.yml`. In that case the node name is not
available until long after logging is configured. The biggest change is
that the node name logging no longer fixed at pattern build time.
Instead it is read from a `SetOnce` on every print. If it is unset it is
printed as `unknown` so we have something that fits in the pattern.
On normal startup we don't log anything until the node name is available
so we never see the `unknown`s.
2018-09-07 14:31:23 -04:00
Nhat Nguyen ab7e696108
TEST: Ensure merge triggered in _source retention test (#33487)
We invoke force merge twice in the test to verify that recovery sources
are pruned when the global checkpoint advanced. However, if the global
checkpoint equals to the local checkpoint in the first force-merge, the
second force-merge will be a noop because all deleted docs are expunged
in the first merge already. We need to flush a new segment to make merge
happen so we can verify that all recovery sources are pruned.
2018-09-07 12:58:00 -04:00
Simon Willnauer c12d232215
Pass Directory instead of DirectoryService to Store (#33466)
Instead of passing DirectoryService which causes yet another dependency
on Store we can just pass in a Directory since we will just call
`DirectoryService#newDirectory()` on it anyway.
2018-09-07 14:00:24 +02:00
Jim Ferenczi 79cd6385fe
Collapse package structure for metrics aggs (#33463)
This change collapses all metrics aggregations classes into a single package `org.elasticsearch.aggregations.metrics`.
It also restricts the visibility of some classes (aggregators and factories) that should not be used outside of the package.

Relates #22868
2018-09-07 10:58:06 +02:00
Jim Ferenczi 34859414a0 Fix bwc serialization of total hits when track_total_hits is false 2018-09-07 10:30:53 +02:00
Nik Everett 0d45752e50
Fix IndexMetaData loads after rollover (#33394)
When we rollover and index we write the conditions of the rollover that
the old index met into the old index. Loading this index metadata
requires a working `NamedXContentRegistry` that has been populated with
parsers from the rollover infrastructure. We had a few loads that didn't
use a working `NamedXContentRegistry` and so would fail if they ever
encountered an index that had been rolled over. Here are the locations
of the loads and how I fixed them:

* IndexFolderUpgrader - removed entirely. It existed to support opening
indices made in Elasticsearch 2.x. Since we only need this change as far
back as 6.4.1 which will supports reading from indices created as far
back as 5.0.0 we should be good here.
* TransportNodesListGatewayStartedShards - wired the
`NamedXContentRegistry` into place.
* TransportNodesListShardStoreMetaData - wired the
`NamedXContentRegistry` into place.
* OldIndexUtils - removed entirely. It existed to support the zip based
index backwards compatibility tests which we've since replaced with code
that actually runs old versions of Elasticsearch.

In addition to fixing the actual problem I added full cluster restart
integration tests for rollover which would have caught this problem and
I added an extra assertion to IndexMetaData's deserialization code which
will trip if we try to deserialize and index's metadata without a fully
formed `NamedXContentRegistry`. It won't catch if use the *wrong*
`NamedXContentRegistry` but it is better than nothing.

Closes #33316
2018-09-06 17:55:24 -04:00
Simon Willnauer c6c456e8cb
Move up acquireSearcher logic to Engine (#33453)
By moving the logic to acquire the searcher up to the engine
it's simpler to build new engines that are for instance read-only.
2018-09-06 18:48:05 +02:00
Nhat Nguyen 8afe09a749
Pass TranslogRecoveryRunner to engine from outside (#33449)
This commit allows us to use different TranslogRecoveryRunner when
recovering an engine from its local translog. This change is a
prerequisite for the commit-based rollback PR.

Relates #32867
2018-09-06 11:59:16 -04:00
Jim Ferenczi 7ad71f906a
Upgrade to a Lucene 8 snapshot (#33310)
The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However this optimization is not activated in this change, there is another issue opened to discuss how it should be integrated smoothly.
Some comments about the change:
* Tests that can produce negative scores have been adapted but we need to forbid them completely: #33309

Closes #32899
2018-09-06 14:42:06 +02:00
Alan Woodward e134f9b5f3
Fix generics in ScriptPlugin#getContexts() (#33426)
Changes the return value from List<ScriptContext> to List<ScriptContext<?>> to remove raw-types warnings.
2018-09-06 09:04:22 +01:00
Alexander Reelsen 82fab40099
Core: Fix IndicesSegmentResponse.toXcontent() serialization (#33414)
When index sorting is enabled, toXContent tried to serialize an
SortField object, resulting in an exception, when using the _segments
endpoint.

Relates #29120
2018-09-06 09:56:20 +02:00
Daniel Mitterdorfer 5236f2b1af Improve reproducability of RestControllerTests
With this commit we use the classic parent circuit breaker which does
not account for real memory usage. In those tests we want to have
reproducible results and hence it makes sense to disable the real memory
circuit breaker there.
2018-09-06 09:44:05 +02:00
Martijn van Groningen a721d09c81
[CCR] Added auto follow patterns feature (#33118)
Auto Following Patterns is a cross cluster replication feature that
keeps track whether in the leader cluster indices are being created with
names that match with a specific pattern and if so automatically let
the follower cluster follow these newly created indices.

This change adds an `AutoFollowCoordinator` component that is only active
on the elected master node. Periodically this component checks the
 the cluster state of remote clusters if there new leader indices that
match with configured auto follow patterns that have been defined in
`AutoFollowMetadata` custom metadata.

This change also adds two new APIs to manage auto follow patterns. A put
auto follow pattern api:

```
PUT /_ccr/_autofollow/{{remote_cluster}}
{
   "leader_index_pattern": ["logs-*", ...],
   "follow_index_pattern": "{{leader_index}}-copy",
   "max_concurrent_read_batches": 2
   ... // other optional parameters
}
```

and delete auto follow pattern api:

```
DELETE /_ccr/_autofollow/{{remote_cluster_alias}}
```

The auto follow patterns are directly tied to the remote cluster aliases
configured in the follow cluster.

Relates to #33007


Co-authored-by: Jason Tedor jason@tedor.me
2018-09-06 08:01:58 +02:00
Jason Tedor d71ced1b00
Generalize search.remote settings to cluster.remote (#33413)
With features like CCR building on the CCS infrastructure, the settings
prefix search.remote makes less sense as the namespace for these remote
cluster settings than does a more general namespace like
cluster.remote. This commit replaces these settings with cluster.remote
with a fallback to the deprecated settings search.remote.
2018-09-05 20:43:44 -04:00
Nhat Nguyen 39e3bd93c7
TEST: Create following engines in the main thread (#33391)
There are two races in the testUpdateAndReadChangesConcurrently if the
following engines are created in the worker threads. We fixed the
translog issue in #33352, but there is still another race with
createStore.

This commit ensures that we create all engines in the main thread.

Relates #33352
Closes #33344
2018-09-05 19:05:41 -04:00
Nhat Nguyen 41839cf9a8
Acquire seacher on closing engine should throw ACE (#33331)
Closes #33330
2018-09-05 19:03:34 -04:00
Tim Brooks b697f485bb
Introduce `TransportLogger` for common logging (#32725)
Historically we have had a ESLoggingHandler in the netty module that
logs low-level connection operations. This class just extends the netty
logging handler with some (broken) message deserialization. This commit
fixes this message serialization and moves the class to server.

This new logger logs inbound and outbound messages. Eventually, we
should move other event logging to this class (connect, close, flush).
That way we will have consistent logging regards of which transport is
loaded.

Resolves #27306 on master. Older branches will need a different fix.
2018-09-05 16:12:37 -06:00
Tim Brooks 88c178dca6
Add sni name to SSLEngine in netty transport (#33144)
This commit is related to #32517. It allows an "server_name"
attribute on a DiscoveryNode to be propagated to the server using
the TLS SNI extentsion. This functionality is only implemented for
the netty security transport.
2018-09-05 16:12:10 -06:00
Armin Braun ef1066d7f8
INGEST: Allow Repeated Invocation of Pipeline (#33419)
* Allows repeated, non-recursive invocation
of the same pipeline
2018-09-05 22:04:53 +02:00
Jim Ferenczi 50e07dd413
Add an index setting to control TieredMergePolicy#deletesPctAllowed (#32907)
This change adds an expert index setting called `index.merge.policy.deletes_pct_allowed`.
It controls the maximum percentage of deleted documents that is tolerated in the index.
Lower values make the index more space efficient at the expense of increased CPU and I/O activity.
Values must be between `20` and `50`. Default value is `33`.
2018-09-05 19:57:36 +02:00
Nik Everett 5c624bc55b
Logging: Further clean up logging ctors (#33378)
Drops and unused logging constructor, simplifies a rarely used one, and
removes `Settings` from a third. There is now only a single logging ctor
that takes `Settings` and we'll remove that one in a follow up change.
2018-09-05 13:04:26 -04:00
Adrien Grand 46ac8d1a51 Make test less GC-intensive. 2018-09-05 18:59:43 +02:00
Christoph Büscher eafc2a5470
Don't count metadata fields towards index.mapping.total_fields.limit (#33386)
The maximum number of fields per index is limited to 1000 by default by the
`index.mapping.total_fields.limit` setting to prevent accidental mapping
explosions due to too many fields. Currently all metadata fields also count
towards this limit, which can lead to some confusion when using lower limits.
It is not obvious for users that they cannot actually add as many fields as
are specified by the limit in this case.

This change takes the number of metadata fields out of the field count that we
check against the field limit. It also adds tests that check that we can add
fields up to the specified limit, but throw an exception for any additional field added.

Closes #24096
2018-09-05 18:27:21 +02:00
Jason Tedor 23934e39d2
Fix deprecated setting specializations (#33412)
Deprecating a some setting specializations (e.g., list settings) does
not cause deprecation warning headers and deprecation log messages to
appear. This is due to a missed check for deprecation. This commit fixes
this for all setting specializations, and ensures that this can not be
missed again.
2018-09-05 11:01:58 -04:00
Adrien Grand 913d5fd820 Disable IndexRecoveryIT.testRerouteRecovery.
Relates #32686.
2018-09-05 14:53:22 +02:00
Armin Braun 46774098d9
INGEST: Implement Drop Processor (#32278)
* INGEST: Implement Drop Processor
* Adjust Processor API
* Implement Drop Processor
* Closes #23726
2018-09-05 14:25:29 +02:00
Paul Sanwald c303006e6b
Add interval response parameter to AutoDateInterval histogram (#33254)
Adds the interval used to the aggregation response.
2018-09-05 07:35:59 -04:00
Armin Braun 4156cc3fae
MINOR+CORE: Remove Dead Methods ClusterService (#33346)
* None of these methods are used anywhere
2018-09-05 12:08:28 +02:00
Gordon Brown cfd3fa72ed
Add user-defined cluster metadata (#33325)
Adds a place for users to store cluster-wide data they wish to associate
with the cluster via the Cluster Settings API. This is strictly for
user-defined data, Elasticsearch makes no other other use of these
settings.
2018-09-04 16:14:18 -06:00
Jim Ferenczi dbc7102c86
Fix inner hits retrieval when stored fields are disabled (_none_) (#33018)
Now that types are unique per mapping we can retrieve the document mapper
without referencing the type. This fixes an NPE when stored fields are disabled.
For 6x we'll need a different fix since mappings can still have multiple types.

Relates #32941
2018-09-04 16:25:52 +02:00
Sohaib Iftikhar 761e8c461f HLRC: Add delete by query API (#32782)
Adds the delete-by-query API to the High Level REST Client.
2018-09-04 08:56:26 -04:00
Julie Tibshirani 78df00ff24
Simplify the return type of FieldMapper#parse. (#32654) 2018-09-04 01:15:19 +00:00
Jason Tedor 09bf4e5f00
Introduce private settings (#33327)
This commit introduces the formal notion of a private setting. This
enables us to register some settings that we had previously not
registered as fully-fledged settings to avoid them being exposed via
APIs such as the create index API. For example, we had hacks in the
codebase to allow index.version.created to be passed around inside of
settings objects, but was not registered as a setting so that if a user
tried to use the setting on any API then they would get an
exception. This prevented users from setting index.version.created on
index creation, or updating it via the index settings API. By
introducing private settings, we can continue to reject these attempts,
yet now we can represent these settings as actual settings. In this
change, we register index.version.created as an actual setting. We do
not cutover all settings that we had been treating as private in this
pull request, it is already quite large due to moving some tests around
to account for the fact that some tests need to be able to set the
index.version.created. This can be done in a follow-up change.
2018-09-03 19:17:57 -04:00
Armin Braun 1f046617bf
TESTS: Fix Race Condition in Temp Path Creation (#33352)
* TESTS: Fix Race Condition in Temp Path Creation

* Calling `createTempDir` concurrently here in
the `Follower`s causes collisions at times
which lead to `createEngine` throwing because
of unexpected files in the newly created temp
dir
   * Fixed by creating all temp dirs in the main test thread
* closes #33344
2018-09-03 19:55:59 +02:00
Nhat Nguyen 24d60c7f4b
Fix from_range in search_after in changes snapshot (#33335)
We can have multiple documents in Lucene with the same seq_no for
parent-child documents (or without rollback). In this case, the usage
"lastSeenSeqNo + 1" is an off-by-one error as it may miss some
documents. This error merely affects the `skippedOperations` contract.

See: https://github.com/elastic/elasticsearch/pull/33222#discussion_r213842257

Closes #33318
2018-09-03 11:58:49 -04:00
Armin Braun 42424aff21
TESTS+DISTR.: Fix testIndexCheckOnStartup Flake (#33349)
* Ignore all `RuntimeException` since random
file corruption triggers other RTE in addition
to the randomly caught one
* closes #33345
2018-09-03 17:06:12 +02:00
tony-dillon a9d2b1dde8 Null completion field should not throw IAE (#33268)
Ignore null value on the completion field

Closes #33200
2018-09-03 16:49:53 +02:00
Colin Goodheart-Smithe 0bf36253a9
Adds code to help with IndicesRequestCacheIT failures (#33313)
* Adds code to help with IndicesRequestCacheIT failures

Relates to #32827

* Adds comment

* Fixes test failure
2018-09-03 14:54:17 +01:00
Alexander Reelsen 246a7df8c2
Core: Fix epoch millis java time formatter (#33302)
The existing implemention could not deal with negative numbers as well
as +- 999 milliseconds around the epoch.

This commit uses Instant.ofEpochMilli() and parses the input to
a number instead of using a date formatter.
2018-09-03 13:13:19 +02:00
Jim Ferenczi 9310d2eaf3 [CI] Mute IndexShardTests#testIndexCheckOnStartup fails #33345 2018-09-03 10:27:42 +02:00
Jim Ferenczi 2fa75b4438 [CI] Mute LuceneChangesSnapshotTests#testUpdateAndReadChangesConcurrently 2018-09-03 10:14:00 +02:00
Jim Ferenczi 713c07e14d
Add early termination support to BucketCollector (#33279)
This commit adds the support to early terminate the collection of a leaf
in the aggregation framework. This change introduces a MultiBucketCollector which
handles CollectionTerminatedException exactly like the Lucene MultiCollector.
Any aggregator can now throw a CollectionTerminatedException without stopping
the collection of a sibling aggregator. This is useful for aggregators that
can infer their result without visiting all documents (e.g.: a min/max aggregation on a match_all query).
2018-09-03 09:34:35 +02:00
Nik Everett f8b7a4dbc8
Logging: Drop Settings from some logging ctors (#33332)
Drops `Settings` from some logging ctors now that they are no longer
needed. This should allow us to stop passing `Settings` around to quite
as many places.
2018-09-02 16:51:26 -04:00
Jason Tedor ea4eef8641
Merge branch 'master' into ccr
* master:
  HLREST: add update by query API (#32760)
2018-09-02 16:07:50 -04:00
Sohaib Iftikhar 389bf67275 HLREST: add update by query API (#32760)
Adds update by query to the high level rest client.
2018-09-02 15:15:00 -04:00
Nhat Nguyen 3197a6bbdd Merge branch 'master' into ccr
* master:
  HLRC: ML Flush job (#33187)
  HLRC: Adding ML Job stats (#33183)
  LLREST: Drop deprecated methods (#33223)
  Mute testSyncerOnClosingShard
  [DOCS] Moves machine learning APIs to docs folder (#31118)
2018-09-02 09:30:51 -04:00
Nhat Nguyen ce635f5f15 Mute testSyncerOnClosingShard
Tracked at #33330
2018-09-01 09:53:31 -04:00
Nhat Nguyen b93507608a Merge branch 'master' into ccr
* master:
  Mute test watcher usage stats output
  [Rollup] Fix FullClusterRestart test
  Adjust soft-deletes version after backport into 6.5
  completely drop `index.shard.check_on_startup: fix` for 7.0 (#33194)
  Fix AwaitsFix issue number
  Mute SmokeTestWatcherWithSecurityIT testsi
  drop `index.shard.check_on_startup: fix` (#32279)
  tracked at
  [DOCS] Moves ml folder from x-pack/docs to docs (#33248)
  [DOCS] Move rollup APIs to docs (#31450)
  [DOCS] Rename X-Pack Commands section (#33005)
  TEST: Disable soft-deletes in ParentChildTestCase
  Fixes SecurityIntegTestCase so it always adds at least one alias (#33296)
  Fix pom for build-tools (#33300)
  Lazy evaluate java9home (#33301)
  SQL: test coverage for JdbcResultSet (#32813)
  Work around to be able to generate eclipse projects (#33295)
  Highlight that index_phrases only works if no slop is used (#33303)
  Different handling for security specific errors in the CLI. Fix for https://github.com/elastic/elasticsearch/issues/33230 (#33255)
  [ML] Refactor delimited file structure detection (#33233)
  SQL: Support multi-index format as table identifier (#33278)
  MINOR: Remove Dead Code from PathTrie (#33280)
  Enable forbiddenapis server java9 (#33245)
2018-08-31 19:03:04 -04:00
Nhat Nguyen 08b9247ce2 Adjust soft-deletes version after backport into 6.5
Relates #33222
2018-08-31 16:50:08 -04:00
Vladimir Dolzhenko 00b272af32 completely drop `index.shard.check_on_startup: fix` for 7.0 (#33194)
Relates to #32279
2018-08-31 22:08:28 +02:00
Vladimir Dolzhenko 3d82a30fad
drop `index.shard.check_on_startup: fix` (#32279)
drop `index.shard.check_on_startup: fix`

Relates #31389
2018-08-31 21:29:06 +02:00
Armin Braun c6cfa08a61
MINOR: Remove Dead Code from PathTrie (#33280)
* The array size checks are redundant since the array sizes
are checked earlier in those methods too
* The removed methods are just not used anywhere
2018-08-31 08:40:27 +02:00
Alpar Torok 44ed5f6306
Enable forbiddenapis server java9 (#33245) 2018-08-31 09:31:55 +03:00
Nhat Nguyen ad4dd086d2 Integrates soft-deletes into Elasticsearch (#33222)
This PR integrates Lucene soft-deletes(LUCENE-8200) into Elasticsearch.
Highlight works in this PR include:

- Replace hard-deletes by soft-deletes in InternalEngine
- Use _recovery_source if _source is disabled or modified (#31106)
- Soft-deletes retention policy based on the global checkpoint (#30335)
- Read operation history from Lucene instead of translog (#30120)
- Use Lucene history in peer-recovery (#30522)

Relates #30086
Closes #29530

---
These works have been done by the whole team; however, these individuals
(lexical order) have significant contribution in coding and reviewing:

Co-authored-by: Adrien Grand <jpountz@gmail.com>
Co-authored-by: Boaz Leskes <b.leskes@gmail.com>
Co-authored-by: Jason Tedor <jason@tedor.me>
Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
Co-authored-by: Nhat Nguyen <nhat.nguyen@elastic.co>
Co-authored-by: Simon Willnauer <simonw@apache.org>
2018-08-30 23:46:07 -04:00
Nhat Nguyen 547de71d59 Revert "Integrates soft-deletes into Elasticsearch (#33222)"
Revert to correct co-author tags.
This reverts commit 6dd0aa54f6.
2018-08-30 23:44:57 -04:00
Nhat Nguyen d3f32273eb Merge branch 'master' into ccr 2018-08-30 23:22:58 -04:00
Nhat Nguyen 6dd0aa54f6
Integrates soft-deletes into Elasticsearch (#33222)
This PR integrates Lucene soft-deletes(LUCENE-8200) into Elasticsearch.
Highlight works in this PR include:

- Replace hard-deletes by soft-deletes in InternalEngine
- Use _recovery_source if _source is disabled or modified (#31106)
- Soft-deletes retention policy based on the global checkpoint (#30335)
- Read operation history from Lucene instead of translog (#30120)
- Use Lucene history in peer-recovery (#30522)

Relates #30086
Closes #29530

---
These works have been done by the whole team; however, these individuals
(lexical order) have significant contribution in coding and reviewing:

Co-authored-by: Adrien Grand jpountz@gmail.com
Co-authored-by: Boaz Leskes b.leskes@gmail.com
Co-authored-by: Jason Tedor jason@tedor.me
Co-authored-by: Martijn van Groningen martijn.v.groningen@gmail.com
Co-authored-by: Nhat Nguyen nhat.nguyen@elastic.co
Co-authored-by: Simon Willnauer simonw@apache.org
2018-08-30 22:11:23 -04:00
Lee Hinman 8a2d154bad Update serialization versions for custom IndexMetaData backport 2018-08-30 15:56:53 -06:00
Igor Motov 001b78f704 Replace IndexMetaData.Custom with Map-based custom metadata (#32749)
This PR removes the deprecated `Custom` class in `IndexMetaData`, in favor
of a `Map<String, DiffableStringMap>` that is used to store custom index
metadata. As part of this, there is now no way to set this metadata in a
template or create index request (since it's only set by plugins, or dedicated
REST endpoints).

The `Map<String, DiffableStringMap>` is intended to be a namespaced `Map<String,
String>` (`DiffableStringMap` implements `Map<String, String>`, so the signature
is more like `Map<String, Map<String, String>>`). This is so we can do things
like:

``` java
Map<String, String> ccrMeta = indexMetaData.getCustom("ccr");
```

And then have complete control over the metadata. This also means any
plugin/feature that uses this has to manage its own BWC, as the map is just
serialized as a map. It also means that if metadata is put in the map that isn't
used (for instance, if a plugin were removed), it causes no failures the way
an unregistered `Setting` would.

The reason I use a custom `DiffableStringMap` here rather than a plain
`Map<String, String>` is so the map can be diffed with previous cluster state
updates for serialization.

Supersedes #32683
2018-08-30 13:57:00 -06:00
Simon Willnauer af2eaf2a6c
Remove usage of `index.shrink.source.*` in 7.x (#33271)
We cut over to `index.resize.source.*` but still have these constants
being public in `IndexMetaData`. Those Settings and constants are not needed
in 7.x while we still need to keep the keys known to private settings since
they might be part of the index settings of old indices. We can remove that
in 8.0. Yet, we should remove the settings to make sure they are not used again.
2018-08-30 21:08:35 +02:00
Jim Ferenczi d0630093cd
Fix serialization of empty field capabilities response (#33263)
Fix serialization of empty field capabilities response

When no response are required (no indices match the requested patterns) the
empty response throws an NPE in the transport serialization (writeTo).
2018-08-30 18:07:58 +02:00
Jim Ferenczi 1404dd2a42
Fix nested _source retrieval with includes/excludes (#33180)
If an exclude or an include clause removes an entry to a nested field in the original source at query time,
the creation of nested hits fails with an NPE. This change fixes this exception and replaces the nested document
source with an empty map.

Closes #33163
Closes #33170
2018-08-30 15:15:50 +02:00
Nhat Nguyen 13261996ce
Add NoOps to Lucene for failed delete ops (#33217)
Today we add a NoOp to Lucene and translog if we fail to process an
indexing operation. However, we are only adding NoOps to translog for
delete operations. In order to have a complete history in Lucene, we
should add NoOps of failed delete operations to both Lucene and translog.

Relates #29530
2018-08-30 07:55:13 -04:00
David Turner 47859e56ac
Move file-based discovery to core (#33241)
Today we support a static list of seed hosts in core Elasticsearch, and allow a
dynamic list of seed hosts to be provided via a file using the `discovery-file`
plugin. In fact the ability to provide a dynamic list of seed hosts is
increasingly useful, so this change moves this functionality to core
Elasticsearch to avoid the need for a plugin.

Furthermore, in order to start up nodes in integration tests we currently
assign a known port to each node before startup, which unfortunately sometimes
fails if another process grabs the selected port in the meantime. By moving the
`discovery-file` functionality into the core product we can use it to avoid
this race.

This change also moves the expected path to the file from
`$ES_PATH_CONF/discovery-file/unicast_hosts.txt` to
`$ES_PATH_CONF/unicast_hosts.txt`. An example of this file is not included in
distributions.

For BWC purposes the plugin still exists, but does nothing more than create the
example file in the old location, and issue a warning when it is used. We also
continue to support the old location for the file, but warn about its
deprecation.

Relates #29244
Closes #33030
2018-08-30 06:43:04 +01:00
Armin Braun cc4d7059bf
Ingest: Add conditional per processor (#32398)
* Ingest: Add conditional per processor
* closes #21248
2018-08-30 03:46:39 +02:00
Jason Tedor 0f22dbb1cc
Apply settings filter to get cluster settings API (#33247)
Some settings have filters applied to them and we use this in logs and
the get nodes info API. For consistency, we should apply this in the get
cluster settings API too.
2018-08-29 15:56:13 -04:00
Nhat Nguyen 5632e31c74 Merge branch 'master' into ccr
* master:
  Painless: Add Bindings (#33042)
  Update version after client credentials backport
  Fix forbidden apis on FIPS (#33202)
  Remote 6.x transport BWC Layer for `_shrink` (#33236)
  Test fix - Graph HLRC tests needed another field adding to randomisation exception list
  HLRC: Add ML Get Records API (#33085)
  [ML] Fix character set finder bug with unencodable charsets (#33234)
  TESTS: Fix overly long lines (#33240)
  Test fix - Graph HLRC test was missing field name to be excluded from randomisation logic
  Remove unsupported group_shard_failures parameter (#33208)
  Update BucketUtils#suggestShardSideQueueSize signature (#33210)
  Parse PEM Key files leniantly (#33173)
  INGEST: Add Pipeline Processor (#32473)
  Core: Add java time xcontent serializers (#33120)
  Consider multi release jars when running third party audit (#33206)
  Update MSI documentation (#31950)
  HLRC: create base timed request class (#33216)
  [DOCS] Fixes command page titles
  HLRC: Move ML protocol classes into client ml package (#33203)
  Scroll queries asking for rescore are considered invalid (#32918)
  Painless: Fix Semicolon Regression (#33212)
  ingest: minor - update test to include dissect (#33211)
  Switch remaining LLREST usage to new style Requests (#33171)
  HLREST: add reindex API (#32679)
2018-08-29 12:30:24 -04:00
Simon Willnauer 6a0d4b4a77
Remote 6.x transport BWC Layer for `_shrink` (#33236)
The shrink action was renamed to `_resize` with the addition
or split. This bwc layer is unnecessary on 7.x since 6.latest will
always use the resize action.
2018-08-29 16:43:13 +02:00
Luca Cavanna 49109187e2
Remove unsupported group_shard_failures parameter (#33208)
We have had support for the `group_shard_failures` parameter in our code for a while, since we introduced failures grouping. When we introduced validation of parameters at REST, we seem to have forgotten to expose such parameter. Given that the parameter is effectively not supported for many months now, that no user has complained about that and that grouping is the expected behaviour, this commit removes support for the parameter.
2018-08-29 14:05:41 +02:00
Luca Cavanna 034fdbca28
Update BucketUtils#suggestShardSideQueueSize signature (#33210)
`BucketUtils#suggestShardSideQueueSize` used to calculate the shard_size based on the number of shards. It returns now a different value only based on whether we are querying a single shard or multiple shards. This commit replaces the numberOfShards argument with a boolean that tells whether we are querying a single shard or not.
2018-08-29 13:51:54 +02:00
Armin Braun f690b492e7
INGEST: Add Pipeline Processor (#32473)
* INGEST: Add Pipeline Processor

* Adds Processor capable of invoking other pipelines
* Closes #31842
2018-08-29 11:03:10 +02:00
Alexander Reelsen 48b388ce82
Core: Add java time xcontent serializers (#33120)
This ensures that the java time class exposed by painless have proper
serialization/string representations.

Closes #31853
2018-08-29 10:00:16 +02:00
Alpar Torok f29f0af7bc
Consider multi release jars when running third party audit (#33206)
Exclude classes meant for newer versions than what we are auditing against, those classes won't be found. There's no reason to exclude JDK classes from newer versions, with this PR, we will not extract them in the first place.
2018-08-29 09:53:04 +03:00
Mark Tozzi 84b61d0738
Scroll queries asking for rescore are considered invalid (#32918)
This PR changes our behavior from silently ignoring rescore in a scroll query to instead report to the user that such a query is invalid.

Closes #31775
2018-08-28 15:48:23 -04:00
Nhat Nguyen c42dc77896 Merge branch 'master' into ccr
* master:
  [Rollup] Better error message when trying to set non-rollup index (#32965)
  HLRC: Use Optional in validation logic (#33104)
  Remove unused User class from protocol (#33137)
  ingest: Introduce the dissect processor (#32884)
  [Docs] Add link to es-kotlin-wrapper-client (#32618)
  [Docs] Remove repeating words (#33087)
  Minor spelling and grammar fix (#32931)
  Remove support for deprecated params._agg/_aggs for scripted metric aggregations (#32979)
  Watcher: Simplify finding next date in cron schedule (#33015)
  Run Third party audit with forbidden APIs CLI  (part3/3) (#33052)
  Fix plugin build test on Windows (#33078)
  HLRC+MINOR: Remove Unused Private Method (#33165)
  Remove old unused test script files (#32970)
  Build analysis-icu client JAR (#33184)
  Ensure to generate identical NoOp for the same failure (#33141)
  ShardSearchFailure#readFrom to set index and shardId (#33161)
2018-08-28 13:56:38 -04:00
Sohaib Iftikhar 7f5e29ddb2 HLREST: add reindex API (#32679)
Adds the reindex API to the high level REST client.
2018-08-28 13:02:23 -04:00
Nhat Nguyen e39689a198
Send only ops after checkpoint in file-based recovery with soft-deletes (#33190)
Today a file-based recovery will replay all existing translog operations
from the primary on a replica so that that replica can have a full
history in translog as the primary. However, with soft-deletes enabled,
we should not do it because:

1. All operations before the local checkpoint of the safe commit exist in
the commit already.

2. The number of operations before the local checkpoint may be considerable
and requires a significant amount of time to replay on a replica.

Relates #30522
Relates #29530
2018-08-28 12:32:09 -04:00
Nhat Nguyen e2b931e80b
Use Lucene history in primary-replica resync (#33178)
This commit makes primary-replica resyncer use Lucene as the source of
history operation instead of translog if soft-deletes is enabled. With
this change, we no longer expose translog snapshot directly in IndexShard.

Relates #29530
2018-08-28 10:44:15 -04:00
Nhat Nguyen d8a1b7cb17
Make soft-deletes settings final (#33172)
For now, we do not support changing the soft-deletes setting even with
closed indices. Therefore we should make it a final setting.

Relates #29530
2018-08-28 08:48:42 -04:00
Jonathan Little 9d92a87ae6 Remove support for deprecated params._agg/_aggs for scripted metric aggregations (#32979) 2018-08-28 09:27:43 +01:00
Alpar Torok 2cc611604f
Run Third party audit with forbidden APIs CLI (part3/3) (#33052)
The new implementation is functional equivalent with the old, ant based one.
It parses task standard error to get the missing classes and violations in the same way.
I considered re-using ForbiddenApisCliTask but Gradle makes it hard to build inheritance with tasks that have task actions , since the order of the task actions can't be controlled.
This inheritance isn't dully desired either as the third party audit task is much more opinionated and we don't want to expose some of the configuration.
We could probably extract a common base class without any task actions, but probably more trouble than it's worth.

Closes #31715
2018-08-28 10:03:30 +03:00
Nhat Nguyen 014b3236dc
Ensure to generate identical NoOp for the same failure (#33141)
We generate slightly different NoOps in InternalEngine and
TransportShardBulkAction for the same failure.

1. InternalEngine uses Exception#getFailure to generate a message
without the class name: newOp [NoOp{seqNo=1, primaryTerm=1,
reason='Contexts are mandatory in context enabled completion field
[suggest_context]'}].

2. TransportShardBulkAction uses Exception#toString to generate a
message with the class name: NoOp{seqNo=1, primaryTerm=1,
reason='java.lang.IllegalArgumentException: Contexts are mandatory in
context enabled completion field [suggest_context]'}.

If a write operation fails while a replica is recovering, that replica
will possibly receive two different NoOps: one from recovery and one
from replication. These two different NoOps will trip
TranslogWriter#assertNoSeqNumberConflict assertion.

This commit ensures that we generate the same Noop for the same failure.

Closes #32986
2018-08-27 15:59:42 -04:00
Luca Cavanna ed0571e16c
ShardSearchFailure#readFrom to set index and shardId (#33161)
As part of recent changes made to `ShardOperationFailedException` we introduced `index` and `shardId` members to the base class, but the subclasses are entirely responsible for the serialization of such fields. In the case of `ShardSearchFailure`, we have an additional `SearchShardTarget` instance member which also holds the index and the shardId, hence they get serialized as part of `SearchShardTarget` itself. When de-serializing a `ShardSearchFailure` though, we need to remember to also set the parent class `index` and `shardId` fields otherwise they get lost

Relates to #32640
2018-08-27 20:31:27 +02:00
Jason Tedor 0e5d42ca38
Merge branch 'master' into ccr
* master:
  Adjust BWC version on mapping version
  Token API supports the client_credentials grant (#33106)
  Build: forked compiler max memory matches jvmArgs (#33138)
  Introduce mapping version to index metadata (#33147)
  SQL: Enable aggregations to create a separate bucket for missing values (#32832)
  Fix grammar in contributing docs
  SECURITY: Fix Compile Error in ReservedRealmTests (#33166)
  APM server monitoring (#32515)
  Support only string `format` in date, root object & date range (#28117)
  [Rollup] Move toBuilders() methods out of rollup config objects (#32585)
  Fix forbiddenapis on java 11  (#33116)
  Apply publishing to genreate pom (#33094)
  Have circuit breaker succeed on unknown mem usage
  Do not lose default mapper on metadata updates (#33153)
  Fix a mappings update test (#33146)
  Reload Secure Settings REST specs & docs (#32990)
  Refactor CachingUsernamePassword realm (#32646)
2018-08-27 13:49:59 -04:00
Jason Tedor 318df2a107
Adjust BWC version on mapping version
The introduction of mapping version on index metadata has been
backported to 6.x. This commit adjusts the BWC version around mapping
version to account for this backport.
2018-08-27 13:17:15 -04:00
Jason Tedor 2aef7e0900
Introduce mapping version to index metadata (#33147)
This commit introduces mapping version to index metadata. This value is
monotonically increasing and is updated on mapping updates. This will be
useful in cross-cluster replication so that we can request mapping
updates from the leader only when there is a mapping update as opposed
to the strategy we employ today which is to request a mapping update any
time there is an index metadata update. As index metadata updates can
occur for many reasons other than mapping updates, this leads to some
unnecessary requests and work in cross-cluster replication.
2018-08-27 12:21:11 -04:00
Mikita Karaliou f1f6d4ed33 Support only string `format` in date, root object & date range (#28117)
Limit date `format` attribute to String values only.

Closes #23650
2018-08-27 12:24:51 +02:00
Daniel Mitterdorfer 06c0055c0f
Have circuit breaker succeed on unknown mem usage
With this commit we implement a workaround for
https://bugs.openjdk.java.net/browse/JDK-8207200 which is a race
condition in the JVM that results in `IllegalArgumentException` to be
thrown in rare cases when we determine memory usage via `MemoryMXBean`.
As we do not want to fail requests in those cases we always return zero
memory usage.

Relates #31767
Relates #33125
2018-08-27 07:09:27 +02:00