Commit Graph

1745 Commits

Author SHA1 Message Date
Jim Ferenczi 544de13d8e
Disallow negative query boost (#34486)
This change disallows negative query boosts. Negative scores are not allowed in Lucene 8 so
it is easier to just disallow negative boosts entirely. We should also deprecate negative boosts
in 6x in order to ensure that users are aware when they'll upgrade to ES 7.

Relates #33309
2018-10-16 11:31:53 +01:00
Armin Braun ebca27371c
SCRIPTING: Move Aggregation Script Context to its own class (#33820)
* SCRIPTING: Move Aggregation Script Context to its own class
2018-10-15 17:28:05 +01:00
Colin Goodheart-Smithe 0b42eda0e3
Merge branch 'master' into index-lifecycle 2018-10-15 16:03:37 +01:00
Andrey Ershov e3a1981a57 Mute testToQuery test 2018-10-15 14:08:04 +02:00
Nhat Nguyen 33791ac27c
CCR: Following primary should process operations once (#34288)
Today we rewrite the operations from the leader with the term of the
following primary because the follower should own its history. The
problem is that a newly promoted primary may re-assign its term to
operations which were replicated to replicas before by the previous
primary. If this happens, some operations with the same seq_no may be
assigned different terms. This is not good for the future optimistic
locking using a combination of seqno and term.

This change ensures that the primary of a follower only processes an
operation if that operation was not processed before. The skipped
operations are guaranteed to be delivered to replicas via either
primary-replica resync or peer-recovery. However, the primary must not
acknowledge until the global checkpoint is at least the highest seqno of
all skipped ops (i.e., they all have been processed on every replica).

Relates #31751
Relates #31113
2018-10-10 15:39:57 -04:00
Nik Everett 06993e0c35
Logging: Make ESLoggerFactory package private (#34199)
Since all calls to `ESLoggerFactory` outside of the logging package were
deprecated, it seemed like it'd simplify things to migrate all of the
deprecated calls and declare `ESLoggerFactory` to be package private.
This does that.
2018-10-06 09:54:08 -04:00
Kazuhiro Sera d45fe43a68 Fix a variety of typos and misspelled words (#32792) 2018-10-03 18:11:38 +01:00
Gordon Brown fb907706ec Merge branch 'master' into index-lifecycle 2018-10-02 13:43:46 -06:00
Nik Everett f904c41506
HLRC: Add get rollup job (#33921)
Adds support for the get rollup job to the High Level REST Client. I had
to do three interesting and unexpected things:
1. I ported the rollup state wiping code into the high level client
tests. I'll move this into the test framework in a followup and remove
the x-pack version.
2. The `timeout` in the rollup config was serialized using the
`toString` representation of `TimeValue` which produces fractional time
values which are more human readable but aren't supported by parsing. So
I switched it to `getStringRep`.
3. Refactor the xcontent round trip testing utilities so we can test
parsing of classes that don't implements `ToXContent`.
2018-10-02 09:11:29 -04:00
Lee Hinman 2d9cb21490 Merge remote-tracking branch 'origin/master' into index-lifecycle 2018-10-01 14:10:09 -06:00
Nhat Nguyen ad61398879
CCR: Optimize indexing ops using seq_no on followers (#34099)
This change introduces the indexing optimization using sequence numbers
in the FollowingEngine. This optimization uses the max_seq_no_updates
which is tracked on the primary of the leader and replicated to replicas
and followers.

Relates #33656
2018-09-28 20:42:26 -04:00
Ryan Ernst 47cbae9b26
Scripting: Remove ExecutableScript (#34154)
This commit removes the legacy ExecutableScript, which was no longer
used except in tests. All uses have previously been converted to script
contexts.
2018-09-28 17:13:08 -07:00
Lee Hinman 6ea396a476 Merge remote-tracking branch 'origin/master' into index-lifecycle 2018-09-28 15:40:12 -06:00
Hendrik Muhs e2f310b56c
Fix AggregationFactories.Builder equality and hash regarding order (#34005)
Fixes the equals and hash function to ignore the order of aggregations to ensure equality after serialization
and deserialization. This ensures storing configs with aggregation works properly.

This also addresses a potential issue in caching when the same query contains aggregations but in 
different order. 1st it will not hit in the cache, 2nd cache objects which shall be equal might end up twice in 
the cache.
2018-09-28 13:30:50 +02:00
Alan Woodward f243d75f59
Remove special-casing of Synonym filters in AnalysisRegistry (#34034)
The synonym filters no longer need access to the AnalysisRegistry in their
constructors, so we can remove the special-case code and move them to the
common analysis module.

This commit means that synonyms are no longer available for `server` integration tests,
so several of these are either rewritten or migrated to the common analysis module
as rest-spec-api tests
2018-09-28 09:02:47 +01:00
Ryan Ernst a2c941806b
Tests: Add support for custom contexts to mock scripts (#34100)
This commit adds the ability to plug in compilation of custom contexts
in mock script engine. This is needed for testing plugins which add
custom contexts like watcher.
2018-09-27 12:23:59 -07:00
Lee Hinman a26cc1a242 Merge remote-tracking branch 'origin/master' into index-lifecycle 2018-09-27 11:00:37 -06:00
Jim Ferenczi 269ae0bc15
Handle MatchNoDocsQuery in span query wrappers (#34106)
* Handle MatchNoDocsQuery in span query wrappers

This change adds a new SpanMatchNoDocsQuery query that replaces
MatchNoDocsQuery in the span query wrappers.
The `wildcard` query now returns MatchNoDocsQuery if the target field is not
in the mapping (#34093) so we need the equivalent span query in order to
be able to pass it to other span wrappers.

Closes #34105
2018-09-27 14:19:08 +02:00
Simon Willnauer bda7bc145b
Fold EngineSearcher into Engine.Searcher (#34082)
EngineSearcher can be easily folded into Engine.Searcher which removes
a level of inheritance that is necessary for most of it's subclasses.
This change folds it into Engine.Searcher and removes the dependency on
ReferenceManager.
2018-09-27 09:06:04 +02:00
Yogesh Gaikwad 0301062c6e Mute SpanMultiTermQueryBuilderTests#testToQuery 2018-09-27 15:26:06 +10:00
Nik Everett ddce9704d4
Logging: Drop two deprecated methods (#34055)
This drops two deprecated methods from `ESLoggerFactory`, switching all
calls to those methods to calls to methods of the same name on
`LogManager`.
2018-09-26 11:20:52 -04:00
Adrien Grand 3c2841d493
REST test for typeless APIs. (#33934)
This commit duplicates REST tests for the
 - `indices.create`
 - `indices.put_mapping`
 - `indices.get_mapping`
 - `index`
 - `get`
 - `delete`
 - `update`
 - `bulk`
APIs, so that we both test them when used without types (include_type_name=false)
and with types, mostly for mixed-version cluster tests.

Given a suite called `X_test_name.yml`, I first copied it to
`(X+1)_test_name_with_types.yml` and then changed `X_test_name.yml` to set
`include_type_name=false` on every API that supports it.

Relates #15613
2018-09-26 17:11:37 +02:00
Ryan Ernst 7800b4fa91
Core: Abstract DateMathParser in an interface (#33905)
This commits creates a DateMathParser interface, which is already
implemented for both joda and java time. While currently the java time
DateMathParser is not used, this change will allow a followup which will
create a DateMathParser from a DateFormatter, so the caller does not
need to know the internals of the DateFormatter they have.
2018-09-26 07:56:25 -07:00
Zachary Tong 25d74bd0cb
Prefer mapped aggs to lead reductions (#33528)
Previously, unmapped aggs try to delegate reduction to a sibling agg that is
mapped.  That delegated agg will run the reductions, and also
reduce any pipeline aggs.  But because delegation comes before running
pipelines, the unmapped agg _also_ tries to run pipeline aggs.

This causes the pipeline to run twice, and potentially double it's output
in buckets which can create invalid JSON (e.g. same key multiple times)
and break when converting to maps.

This fixes by sorting the list of aggregations ahead of time so that mapped
aggs appear first, meaning they preferentially lead the reduction.  If all aggs
are unmapped, the first unmapped agg simply creates a new unmapped object
and returns that for the reduction.

This means that unmapped aggs no longer defer and there is no chance for 
a secondary execution of pipelines (or other side effects caused by deferring
execution).

Closes #33514
2018-09-26 10:09:31 -04:00
Christoph Büscher ba3ceeaccf
Clean up "unused variable" warnings (#31876)
This change cleans up "unused variable" warnings. There are several cases were we 
most likely want to suppress the warnings (especially in the client documentation test
where the snippets contain many unused variables). In a lot of cases the unused
variables can just be deleted though.
2018-09-26 14:09:32 +02:00
Ryan Ernst be8475955e
Scripting: Use ParameterMap for deprecated ctx var in update scripts (#34065)
This commit removes the sysprop controlling whether ctx is in params for
update scripts and replaces it with use of the new ParameterMap, which
outputs a deprecation warning whenever params.ctx is used.
2018-09-25 22:08:02 -07:00
Nhat Nguyen 5166dd0a4c
Replicate max seq_no of updates to replicas (#33967)
We start tracking max seq_no_of_updates on the primary in #33842. This
commit replicates that value from a primary to its replicas in replication 
requests or the translog phase of peer-recovery.

With this change, we guarantee that the value of max seq_no_of_updates
on a replica when any index/delete operation is performed at least the
max_seq_no_of_updates on the primary when that operation was executed.

Relates #33656
2018-09-25 08:07:57 -04:00
Lee Hinman 243e863f6e Merge remote-tracking branch 'origin/master' into index-lifecycle 2018-09-24 10:33:51 -06:00
Tim Brooks 78e483e8d8
Introduce abstract security transport testcase (#33878)
This commit introduces an AbstractSimpleSecurityTransportTestCase for
security transports. This classes provides transport tests that are
specific for security transports. Additionally, it fixes the tests referenced in
#33285.
2018-09-24 09:44:44 -06:00
Nhat Nguyen 7944a0cb25
Track max seq_no of updates or deletes on primary (#33842)
This PR is the first step to use seq_no to optimize indexing operations.
The idea is to track the max seq_no of either update or delete ops on a
primary, and transfer this information to replicas, and replicas use it
to optimize indexing plan for index operations (with assigned seq_no).

The max_seq_no_of_updates on primary is initialized once when a primary
finishes its local recovery or peer recovery in relocation or being
promoted. After that, the max_seq_no_of_updates is only advanced internally
inside an engine when processing update or delete operations.

Relates #33656
2018-09-22 08:02:57 -04:00
Vladimir Dolzhenko 477391d751
Don't test corruption detection within CFS checksum (#33911)
Closes #33881
2018-09-22 10:21:36 +02:00
lipsill b48d5a8942 [TEST] ClientYamlSuiteRestApiParser to parse spec without path parts (#33720)
Previously ClientYamlSuiteRestApiParser threw an exception when an api
spec contained neither path parts nor url parameter sections.

Closes #31649
2018-09-21 17:26:55 +02:00
Alexander Reelsen 1de2a925ce
Watcher: Ensure that execution triggers properly on initial setup (#33360)
This commit reverts most of #33157 as it introduces another race
condition and breaks a common case of watcher, when the first watch is
added to the system and the index does not exist yet.

This means, that the index will be created, which triggers a reload, but
during this time the put watch operation that triggered this is not yet
indexed, so that both processes finish roughly add the same time and
should not overwrite each other but act complementary.

This commit reverts the logic of cleaning out the ticker engine watches
on start up, as this is done already when the execution is paused -
which also gets paused on the cluster state listener again, as we can be
sure here, that the watches index has not yet been created.

This also adds a new test, that starts a one node cluster and emulates
the case of a non existing watches index and a watch being added, which
should result in proper execution.

Closes #33320
2018-09-21 14:22:34 +02:00
Armin Braun 3a5b8a71b4
NETWORKING: Fix Portability of SO_LINGER=0 in Tests (#33895)
* Setting SO_LINGER for open but not connected non-blocking sockets
throws on OSX
  * Fixed by only applying setting to connected sockets which will save
the same number of FDs as doing it on open sockets anyway
* closes #33879
2018-09-21 10:08:16 +02:00
Nhat Nguyen 5f7f793f43
Propagate max_auto_id_timestamp in peer recovery (#33693)
Today we don't store the auto-generated timestamp of append-only
operations in Lucene; and assign -1 to every index operations
constructed from LuceneChangesSnapshot. This looks innocent but it
generates duplicate documents on a replica if a retry append-only
arrives first via peer-recovery; then an original append-only arrives
via replication. Since the retry append-only (delivered via recovery)
does not have timestamp, the replica will happily optimizes the original
request while it should not.

This change transmits the max auto-generated timestamp from the primary
to replicas before translog phase in peer recovery. This timestamp will
prevent replicas from optimizing append-only requests if retry
counterparts have been processed.

Relates #33656 
Relates #33222
2018-09-20 19:53:30 -04:00
Nhat Nguyen 76a1a863e3
TEST: stop assertSeqNos if shards movement (#33875)
Currently, assertSeqNos assumes that the cluster is stable at the end of
the test (i.e., no more shard movement). However, this assumption does
not always hold. In these cases, we can stop the assertion instead of
failing a test.

Closes #33704
2018-09-20 13:44:26 -04:00
Tim Vernum ff934e3dcd Mute broken test on MacOS
Seems to be triggered by 0cf0d73
See: https://github.com/elastic/elasticsearch/issues/33879
2018-09-20 14:06:40 +10:00
Nik Everett 26c4f1fb6c
Core: Default node.name to the hostname (#33677)
Changes the default of the `node.name` setting to the hostname of the
machine on which Elasticsearch is running. Previously it was the first 8
characters of the node id. This had the advantage of producing a unique
name even when the node name isn't configured but the disadvantage of
being unrecognizable and not being available until fairly late in the
startup process. Of particular interest is that it isn't available until
after logging is configured. This forces us to use a volatile read
whenever we add the node name to the log.

Using the hostname is available immediately on startup and is generally
recognizable but has the disadvantage of not being unique when run on
machines that don't set their hostname or when multiple elasticsearch
processes are run on the same host. I believe that, taken together, it
is better to default to the hostname.

1. Running multiple copies of Elasticsearch on the same node is a fairly
advanced feature. We do it all the as part of the elasticsearch build
for testing but we make sure to set the node name then.
2. That the node.name defaults to some flavor of "localhost" on an
unconfigured box feels like it isn't going to come up too much in
production. I expect most production deployments to at least set the
hostname.

As a bonus, production deployments need no longer set the node name in
most cases. At least in my experience most folks set it to the hostname
anyway.
2018-09-19 15:21:29 -04:00
Nik Everett 3ede13a454
Test framework fall cleaning (#33423)
Wraps all lines in our test framework at 140 characters because that is
our standard line length and removes all of the checkstyle suppressions
for the test framework.

Drops most of `ModuleTestCase` because it isn't used and we're moving
away from using guice in the way that it wants to test anyway. Also
switches a few classes that extend it but don't use it to extend
`ESTestCase` instead.
2018-09-19 14:34:02 -04:00
Lee Hinman 81e9150c7a Merge remote-tracking branch 'origin/master' into index-lifecycle 2018-09-19 09:43:26 -06:00
Vladimir Dolzhenko a3e8b831ee
add elasticsearch-shard tool (#32281)
Relates #31389
2018-09-19 10:28:22 +02:00
Armin Braun 0cf0d73813
TESTS: Set SO_LINGER = 0 for MockNioTransport (#32560)
* TESTS: Set SO_LINGER = 0 for MockNioTransport

* Prevents lingering sockets in TIME_WAIT piling up during test runs and leading to port collisions that manifest as timeouts
* Fixes #32552
2018-09-19 06:05:36 +02:00
Lee Hinman c87cff22b4 Merge remote-tracking branch 'origin/master' into index-lifecycle 2018-09-18 13:57:41 -06:00
Or Bin a5bad4d92c Docs: Fixed a grammatical mistake: 'a HTTP ...' -> 'an HTTP ...' (#33744)
Fixed a grammatical mistake: 'a HTTP ...' -> 'an HTTP ...'

Closes #33728
2018-09-17 15:35:54 -04:00
Lee Hinman 7ff11b4ae1 Merge remote-tracking branch 'origin/master' into index-lifecycle 2018-09-17 10:41:10 -06:00
Alpar Torok 5ca6f31205
Move precommit task implementation to java (#33407)
Replace precommit tasks that execute with Java implementations
2018-09-17 14:09:28 +03:00
Lee Hinman e6cbaa5a78 Merge remote-tracking branch 'origin/master' into index-lifecycle 2018-09-14 16:27:37 -06:00
Armin Braun 0b4960ff6b
SCRIPTING: Move terms_set Context to its Own Class (#33602)
* SCRIPTING: Move terms_set Context to its Own Class

* Extracted TermsSetQueryScript
* Kept mechanics close to what they were with SearchScript
2018-09-14 06:21:18 +02:00
Colin Goodheart-Smithe 8e59de3eb2
Merge branch 'master' into index-lifecycle 2018-09-13 09:46:14 +01:00
Jim Ferenczi 6ca36bba15
Fix field mapping updates with similarity (#33634)
This change fixes a bug introduced in 6.3 that prevents fields with an explicit
similarity to be updated. It also adds a test that checks this case for similarities
but also for analyzers since they could suffer from the same problem.

Closes #33611
2018-09-13 09:21:27 +02:00
David Turner 5a3fd8e4e7
Use file-based discovery not MockUncasedHostsProvider (#33554)
Today we use a special unicast hosts provider, the `MockUncasedHostsProvider`,
in many integration tests, to deal with the dynamic nature of the allocation of
ports to nodes. However #33241 allows us to use file-based discovery to achieve
the same goal, so the special test-only `MockUncasedHostsProvider` is no longer
required.

This change removes `MockUncasedHostProvider` and replaces it with file-based
discovery in tests based on `EsIntegTestCase`.
2018-09-13 07:37:15 +02:00
Martijn van Groningen 5fa81310cc
[CCR] Added history uuid validation (#33546)
For correctness we need to verify whether the history uuid of the leader
index shards never changes while that index is being followed.

* The history UUIDs are recorded as custom index metadata in the follow index.
* The follow api validates whether the current history UUIDs of the leader
  index shards are the same as the recorded history UUIDs.
  If not the follow api fails.
* While a follow index is following a leader index; shard follow tasks
  on each shard changes api call verify whether their current history uuid
  is the same as the recorded history uuid.

Relates to #30086 

Co-authored-by: Nhat Nguyen <nhat.nguyen@elastic.co>
2018-09-12 19:42:00 +02:00
Simon Willnauer c783488e97
Add `_source`-only snapshot repository (#32844)
This change adds a `_source` only snapshot repository that allows to wrap
any existing repository as a _backend_ to snapshot only the `_source` part
including live docs markers. Snapshots taken with the `source` repository
won't include any indices,  doc-values or points. The snapshot will be reduced in size and
functionality such that it requires full re-indexing after it's successfully restored.

The restore process will copy the `_source` data locally starts a special shard and engine
to allow `match_all` scrolls and searches. Any other query, or get call will fail with and unsupported operation exception.  The restored index is also marked as read-only.

This feature aims mainly for disaster recovery use-cases where snapshot size is
a concern or where time to restore is less of an issue.

**NOTE**: The snapshot produced by this repository is still a valid lucene index. This change doesn't allow for any longer retention policies which is out of scope for this change.
2018-09-12 17:47:10 +02:00
Nhat Nguyen 743327efc2
Reset replica engine to global checkpoint on promotion (#33473)
When a replica starts following a newly promoted primary, it may have
some operations which don't exist on the new primary. Thus we need to
throw those operations to align a replica with the new primary. This can
be done by first resetting an engine from the safe commit, then replaying
the local translog up to the global checkpoint.

Relates #32867
2018-09-11 22:09:37 -04:00
Jason Tedor 73c75bef21
Preserve cluster settings on full restart tests (#33590)
Today the full cluster restart tests do not preserve cluster settings on
restart. This is a mistake because it is not an accurate reflection of
reality, we do not expect users to clear cluster settings when they
perform a full cluster restart. This commit makes it so that all full
cluster restart tests preserve settings on upgrade.
2018-09-11 08:40:22 -04:00
Jason Tedor ea3fdc90c6
Add full cluster restart base class (#33577)
This commit adds a base class for full cluster restart tests.
2018-09-10 20:06:42 -04:00
Colin Goodheart-Smithe cdc4f57a77
Merge branch 'master' into index-lifecycle 2018-09-10 21:30:44 +01:00
Alan Woodward 39c3234c2f
Upgrade to latest Lucene snapshot (#33505)
* LeafCollector.setScorer() now takes a Scorable
* Scorers may not have null Weights
* IndexWriter.getFlushingBytes() reports how much memory is being used by IW threads writing to disk
2018-09-10 20:51:55 +01:00
Jason Tedor 5f4244755e
Enable not wiping cluster settings after REST test (#33575)
In some cases we want to skip wiping cluster settings after a REST
test. For example, one use-case would be in the full cluster restart
tests where want to test cluster settings before and after a full
cluster restart. If we wipe the cluster settings before the restart,
then it would not be possible to assert on them after the restart.
2018-09-10 14:25:30 -04:00
Tanguy Leroux 079d130d8c
[Test] Remove duplicate method in TestShardRouting (#32815) 2018-09-10 18:29:00 +02:00
Jason Tedor 6bb817004b
Add infrastructure to upgrade settings (#33536)
In some cases we want to deprecate a setting, and then automatically
upgrade uses of that setting to a replacement setting. This commit adds
infrastructure for this so that we can upgrade settings when recovering
the cluster state, as well as when such settings are dynamically applied
on cluster update settings requests. This commit only focuses on cluster
settings, index settings can build on this infrastructure in a
follow-up.
2018-09-09 20:49:19 -04:00
Jason Tedor 5a38c930fc
Add license checks for auto-follow implementation (#33496)
This commit adds license checks for the auto-follow implementation. We
check the license on put auto-follow patterns, and then for every
coordination round we check that the local and remote clusters are
licensed for CCR. In the case of non-compliance, we skip coordination
yet continue to schedule follow-ups.
2018-09-09 07:06:55 -04:00
Nhat Nguyen 94e4cb64c2
Bootstrap a new history_uuid when force allocating a stale primary (#33432)
This commit ensures that we bootstrap a new history_uuid when force
allocating a stale primary. A stale primary should never be the source
of an operation-based recovery to another shard which exists before the
forced-allocation.

Closes #26712
2018-09-08 19:29:31 -04:00
Nik Everett 190ea9a6de
Logging: Configure the node name when we have it (#32983)
Change the logging infrastructure to handle when the node name isn't
available in `elasticsearch.yml`. In that case the node name is not
available until long after logging is configured. The biggest change is
that the node name logging no longer fixed at pattern build time.
Instead it is read from a `SetOnce` on every print. If it is unset it is
printed as `unknown` so we have something that fits in the pattern.
On normal startup we don't log anything until the node name is available
so we never see the `unknown`s.
2018-09-07 14:31:23 -04:00
Simon Willnauer c12d232215
Pass Directory instead of DirectoryService to Store (#33466)
Instead of passing DirectoryService which causes yet another dependency
on Store we can just pass in a Directory since we will just call
`DirectoryService#newDirectory()` on it anyway.
2018-09-07 14:00:24 +02:00
Colin Goodheart-Smithe 017ffe5d12
Merge branch 'master' into index-lifecycle 2018-09-07 10:59:10 +01:00
Jim Ferenczi 79cd6385fe
Collapse package structure for metrics aggs (#33463)
This change collapses all metrics aggregations classes into a single package `org.elasticsearch.aggregations.metrics`.
It also restricts the visibility of some classes (aggregators and factories) that should not be used outside of the package.

Relates #22868
2018-09-07 10:58:06 +02:00
Nik Everett 0d45752e50
Fix IndexMetaData loads after rollover (#33394)
When we rollover and index we write the conditions of the rollover that
the old index met into the old index. Loading this index metadata
requires a working `NamedXContentRegistry` that has been populated with
parsers from the rollover infrastructure. We had a few loads that didn't
use a working `NamedXContentRegistry` and so would fail if they ever
encountered an index that had been rolled over. Here are the locations
of the loads and how I fixed them:

* IndexFolderUpgrader - removed entirely. It existed to support opening
indices made in Elasticsearch 2.x. Since we only need this change as far
back as 6.4.1 which will supports reading from indices created as far
back as 5.0.0 we should be good here.
* TransportNodesListGatewayStartedShards - wired the
`NamedXContentRegistry` into place.
* TransportNodesListShardStoreMetaData - wired the
`NamedXContentRegistry` into place.
* OldIndexUtils - removed entirely. It existed to support the zip based
index backwards compatibility tests which we've since replaced with code
that actually runs old versions of Elasticsearch.

In addition to fixing the actual problem I added full cluster restart
integration tests for rollover which would have caught this problem and
I added an extra assertion to IndexMetaData's deserialization code which
will trip if we try to deserialize and index's metadata without a fully
formed `NamedXContentRegistry`. It won't catch if use the *wrong*
`NamedXContentRegistry` but it is better than nothing.

Closes #33316
2018-09-06 17:55:24 -04:00
Nhat Nguyen 8afe09a749
Pass TranslogRecoveryRunner to engine from outside (#33449)
This commit allows us to use different TranslogRecoveryRunner when
recovering an engine from its local translog. This change is a
prerequisite for the commit-based rollback PR.

Relates #32867
2018-09-06 11:59:16 -04:00
Jim Ferenczi 7ad71f906a
Upgrade to a Lucene 8 snapshot (#33310)
The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However this optimization is not activated in this change, there is another issue opened to discuss how it should be integrated smoothly.
Some comments about the change:
* Tests that can produce negative scores have been adapted but we need to forbid them completely: #33309

Closes #32899
2018-09-06 14:42:06 +02:00
Colin Goodheart-Smithe b1257d873b
Merge branch 'master' into index-lifecycle 2018-09-06 08:17:40 +01:00
Lee Hinman 96d515e3f5
Replace PhaseAfterStep with PhaseCompleteStep (#33398)
This removes `PhaseAfterStep` in favor of a new `PhaseCompleteStep`. This step
in only a marker that the `LifecyclePolicyRunner` needs to halt until the time
indicated for entering the next phase.

This also fixes a bug where phase times were encapsulated into the policy
instead of dynamically adjusting to policy changes.

Supersedes #33140, which it replaces
Relates to #29823
2018-09-05 16:37:45 -06:00
Tim Brooks 88c178dca6
Add sni name to SSLEngine in netty transport (#33144)
This commit is related to #32517. It allows an "server_name"
attribute on a DiscoveryNode to be propagated to the server using
the TLS SNI extentsion. This functionality is only implemented for
the netty security transport.
2018-09-05 16:12:10 -06:00
Armin Braun 46774098d9
INGEST: Implement Drop Processor (#32278)
* INGEST: Implement Drop Processor
* Adjust Processor API
* Implement Drop Processor
* Closes #23726
2018-09-05 14:25:29 +02:00
Jim Ferenczi dbc7102c86
Fix inner hits retrieval when stored fields are disabled (_none_) (#33018)
Now that types are unique per mapping we can retrieve the document mapper
without referencing the type. This fixes an NPE when stored fields are disabled.
For 6x we'll need a different fix since mappings can still have multiple types.

Relates #32941
2018-09-04 16:25:52 +02:00
Jason Tedor 09bf4e5f00
Introduce private settings (#33327)
This commit introduces the formal notion of a private setting. This
enables us to register some settings that we had previously not
registered as fully-fledged settings to avoid them being exposed via
APIs such as the create index API. For example, we had hacks in the
codebase to allow index.version.created to be passed around inside of
settings objects, but was not registered as a setting so that if a user
tried to use the setting on any API then they would get an
exception. This prevented users from setting index.version.created on
index creation, or updating it via the index settings API. By
introducing private settings, we can continue to reject these attempts,
yet now we can represent these settings as actual settings. In this
change, we register index.version.created as an actual setting. We do
not cutover all settings that we had been treating as private in this
pull request, it is already quite large due to moving some tests around
to account for the fact that some tests need to be able to set the
index.version.created. This can be done in a follow-up change.
2018-09-03 19:17:57 -04:00
Nik Everett f8b7a4dbc8
Logging: Drop Settings from some logging ctors (#33332)
Drops `Settings` from some logging ctors now that they are no longer
needed. This should allow us to stop passing `Settings` around to quite
as many places.
2018-09-02 16:51:26 -04:00
Vladimir Dolzhenko 3d82a30fad
drop `index.shard.check_on_startup: fix` (#32279)
drop `index.shard.check_on_startup: fix`

Relates #31389
2018-08-31 21:29:06 +02:00
Nhat Nguyen ad4dd086d2 Integrates soft-deletes into Elasticsearch (#33222)
This PR integrates Lucene soft-deletes(LUCENE-8200) into Elasticsearch.
Highlight works in this PR include:

- Replace hard-deletes by soft-deletes in InternalEngine
- Use _recovery_source if _source is disabled or modified (#31106)
- Soft-deletes retention policy based on the global checkpoint (#30335)
- Read operation history from Lucene instead of translog (#30120)
- Use Lucene history in peer-recovery (#30522)

Relates #30086
Closes #29530

---
These works have been done by the whole team; however, these individuals
(lexical order) have significant contribution in coding and reviewing:

Co-authored-by: Adrien Grand <jpountz@gmail.com>
Co-authored-by: Boaz Leskes <b.leskes@gmail.com>
Co-authored-by: Jason Tedor <jason@tedor.me>
Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
Co-authored-by: Nhat Nguyen <nhat.nguyen@elastic.co>
Co-authored-by: Simon Willnauer <simonw@apache.org>
2018-08-30 23:46:07 -04:00
Nhat Nguyen 547de71d59 Revert "Integrates soft-deletes into Elasticsearch (#33222)"
Revert to correct co-author tags.
This reverts commit 6dd0aa54f6.
2018-08-30 23:44:57 -04:00
Nhat Nguyen 6dd0aa54f6
Integrates soft-deletes into Elasticsearch (#33222)
This PR integrates Lucene soft-deletes(LUCENE-8200) into Elasticsearch.
Highlight works in this PR include:

- Replace hard-deletes by soft-deletes in InternalEngine
- Use _recovery_source if _source is disabled or modified (#31106)
- Soft-deletes retention policy based on the global checkpoint (#30335)
- Read operation history from Lucene instead of translog (#30120)
- Use Lucene history in peer-recovery (#30522)

Relates #30086
Closes #29530

---
These works have been done by the whole team; however, these individuals
(lexical order) have significant contribution in coding and reviewing:

Co-authored-by: Adrien Grand jpountz@gmail.com
Co-authored-by: Boaz Leskes b.leskes@gmail.com
Co-authored-by: Jason Tedor jason@tedor.me
Co-authored-by: Martijn van Groningen martijn.v.groningen@gmail.com
Co-authored-by: Nhat Nguyen nhat.nguyen@elastic.co
Co-authored-by: Simon Willnauer simonw@apache.org
2018-08-30 22:11:23 -04:00
Nhat Nguyen 39839f97ef
TEST: Access cluster state directly in assertSeqNos (#33277)
Some AbstractDisruptionTestCase tests start failing since we enabled
assertSeqNos (in #33130). They fail because the assertSeqNos assertion
queries cluster stats while the cluster is disrupted or not formed yet.

This commit switches to use the cluster state and shard stats directly
from the test cluster.

Closes #33251
2018-08-30 19:37:20 -04:00
Armin Braun cc4d7059bf
Ingest: Add conditional per processor (#32398)
* Ingest: Add conditional per processor
* closes #21248
2018-08-30 03:46:39 +02:00
Matt Weber 92bd7242a3 Fix classpath security checks for external tests. (#33066)
This commit checks that when we manually add a class to
the codebase map, that it does in-fact not exist on the classpath
in a jar.  This will only be true if we are using the test framework
externally such as when a user develops a plugin.
2018-08-29 12:19:58 -07:00
Alpar Torok f29f0af7bc
Consider multi release jars when running third party audit (#33206)
Exclude classes meant for newer versions than what we are auditing against, those classes won't be found. There's no reason to exclude JDK classes from newer versions, with this PR, we will not extract them in the first place.
2018-08-29 09:53:04 +03:00
Jonathan Little 9d92a87ae6 Remove support for deprecated params._agg/_aggs for scripted metric aggregations (#32979) 2018-08-28 09:27:43 +01:00
Alpar Torok 2cc611604f
Run Third party audit with forbidden APIs CLI (part3/3) (#33052)
The new implementation is functional equivalent with the old, ant based one.
It parses task standard error to get the missing classes and violations in the same way.
I considered re-using ForbiddenApisCliTask but Gradle makes it hard to build inheritance with tasks that have task actions , since the order of the task actions can't be controlled.
This inheritance isn't dully desired either as the third party audit task is much more opinionated and we don't want to expose some of the configuration.
We could probably extract a common base class without any task actions, but probably more trouble than it's worth.

Closes #31715
2018-08-28 10:03:30 +03:00
Nhat Nguyen 014b3236dc
Ensure to generate identical NoOp for the same failure (#33141)
We generate slightly different NoOps in InternalEngine and
TransportShardBulkAction for the same failure.

1. InternalEngine uses Exception#getFailure to generate a message
without the class name: newOp [NoOp{seqNo=1, primaryTerm=1,
reason='Contexts are mandatory in context enabled completion field
[suggest_context]'}].

2. TransportShardBulkAction uses Exception#toString to generate a
message with the class name: NoOp{seqNo=1, primaryTerm=1,
reason='java.lang.IllegalArgumentException: Contexts are mandatory in
context enabled completion field [suggest_context]'}.

If a write operation fails while a replica is recovering, that replica
will possibly receive two different NoOps: one from recovery and one
from replication. These two different NoOps will trip
TranslogWriter#assertNoSeqNumberConflict assertion.

This commit ensures that we generate the same Noop for the same failure.

Closes #32986
2018-08-27 15:59:42 -04:00
Simon Willnauer 3376922e8b
Add proxy support to RemoteClusterConnection (#33062)
This adds support for connecting to a remote cluster through
a tcp proxy. A remote cluster can configured with an additional
`search.remote.$clustername.proxy` setting. This proxy will be used
to connect to remote nodes for every node connection established.
We still try to sniff the remote clsuter and connect to nodes directly
through the proxy which has to support some kind of routing to these nodes.
Yet, this routing mechanism requires the handshake request to include some
kind of information where to route to which is not yet implemented. The effort
to use the hostname and an optional node attribute for routing is tracked
in #32517

Closes #31840
2018-08-25 20:41:32 +02:00
Nhat Nguyen 9dad82ece8
TEST: Skip assertSeqNos for closed shards (#33130)
If a shard was closed, we return null for SeqNoStats. Therefore the
assertion assertSeqNos will hit NPE when it verifies a closed shard.

This commit skips closed shards in assertSeqNos and enables this
assertion in AbstractDisruptionTestCase.
2018-08-24 21:02:13 -04:00
Nhat Nguyen 739a8d3d44
TEST: resync operation on replica should acquire shard permit (#33103)
This change makes sure that resync operations on replicas in the test
framework are executed under shard permits as the production code.
2018-08-24 20:25:13 -04:00
Jason Tedor 619e0b28b9
Add hook to skip asserting x-content equivalence (#33114)
This commit adds a hook to AbstractSerializingTestCase to enable
skipping asserting that the x-content of the test instance and an
instance parsed from the x-content of the test instance are the
same. While we usually expect these to be the same, they will not be the
same when exceptions are involved because the x-content there is lossy.
2018-08-24 06:53:44 -04:00
Jim Ferenczi f4e9729d64
Remove unsupported Version.V_5_* (#32937)
This change removes the es 5x version constants and their usages.
2018-08-24 09:51:21 +02:00
Armin Braun 917e5a8c94
TESTS: Fix Random Fail in MockTcpTransportTests (#33061)
* `foobar.txGet()` appears to return before `serviceB.stop()` returns, causing `ServiceB.close()` to run concurrently with the `stop` call and running into a race codition
* Closes #32863
2018-08-23 13:19:21 +02:00
Luca Cavanna 393eec1482
Set maxScore for empty TopDocs to Nan rather than 0 (#32938)
We used to set `maxScore` to `0` within `TopDocs` in situations where there is really no score as the size was set to `0` and scores were not even tracked. In such scenarios, `Float.Nan` is more appropriate, which gets converted to `max_score: null` on the REST layer. That's also more consistent with lucene which set `maxScore` to `Float.Nan` when merging empty `TopDocs` (see `TopDocs#merge`).
2018-08-22 17:23:54 +02:00
Nhat Nguyen 262d3c0783
Allow engine to recover from translog upto a seqno (#33032)
This change allows an engine to recover from its local translog up to
the given seqno. The extended API can be used in these use cases:

When a replica starts following a new primary, it resets its index to
the safe commit, then replays its local translog up to the current
global checkpoint (see #32867).

When a replica starts a peer-recovery, it can initialize the
start_sequence_number to the persisted global checkpoint instead of the
local checkpoint of the safe commit. A replica will then replay its
local translog up to that global checkpoint before accepting remote
translog from the primary. This change will increase the chance of
operation-based recovery. I will make this in a follow-up.

Relates #32867
2018-08-22 07:57:44 -04:00
David Turner ab000323fa
Allow extension of CapturingTransport by subclasses (#33012)
Today, CapturingTransport#createCapturingTransportService creates a transport
service with a connection manager with reasonable default behaviours, but
overriding this behaviour in a consumer is a litle tricky. Additionally, the
default behaviour for opening a connection duplicates the content of the
CapturingTransport#openConnection() method.

This change removes this duplication by delegating to openConnection() and
introduces overridable nodeConnected() and onSendRequest() methods so that
consumers can alter this behaviour more easily.

Relates #32246 in which we test the mechanisms for opening connections to
unknown (and possibly unreachable) nodes.
2018-08-22 09:09:08 +01:00
Alpar Torok 82d10b484a
Run forbidden api checks with runtimeJavaVersion (#32947)
Run forbidden APIs checks with runtime hava version
2018-08-22 09:05:22 +03:00
Simon Willnauer 92076497e5
Use a dedicated ConnectionManger for RemoteClusterConnection (#32988)
This change introduces a dedicated ConnectionManager for every RemoteClusterConnection
such that there is not state shared with the TransportService internal ConnectionManager.
All connections to a remote cluster are isolated from the TransportService but still uses
the TransportService and it's internal properties like the Transport, tracing and internal
listener actions on disconnects etc.
This allows a remote cluster connection to have a different lifecycle than a local cluster connection,
also local discovery code doesn't get notified if there is a disconnect on from a remote cluster and
each connection can use it's own dedicated connection profile which allows to have a reduced set of
connections per cluster without conflicting with the local cluster.

Closes #31835
2018-08-21 12:43:25 +02:00
Tim Brooks cd83ddcecc
Fix assertion in AbstractSimpleTransportTestCase (#32991)
This is a follow-up to #32956. That commit incorrectly used assertBusy
which led to a possible race in the test. This commit fixes it.
2018-08-20 16:09:22 -06:00
Tim Brooks faa42de66d
Pass DiscoveryNode to initiateChannel (#32958)
This is related to #32517. This commit passes the DiscoveryNode to the
initiateChannel method for different Transport implementation. This
will allow additional attributes (besides just the socket address) to be
used when opening channels.
2018-08-20 08:54:55 -06:00
Alpar Torok 4b34b3f4aa
Set forbidden APIs target compatibility to compiler java version (#32935)
Set forbidden apis target compatibility to compiler version

Fix outstanding deprecation
2018-08-20 09:27:02 +03:00
Tim Brooks de92d2ef1f
Move connection listener to ConnectionManager (#32956)
This is a followup to #31886. After that commit the
TransportConnectionListener had to be propogated to both the
Transport and the ConnectionManager. This commit moves that listener
to completely live in the ConnectionManager. The request and response
related methods are moved to a TransportMessageListener. That listener
continues to live in the Transport class.
2018-08-18 10:09:24 -06:00
Tim Brooks 2464b68613
Move connection profile into connection manager (#32858)
This is related to #31835. It moves the default connection profile into
the ConnectionManager class. The will allow us to have different
connection managers with different profiles.
2018-08-15 09:08:33 -06:00
Lee Hinman 48281ac5bc
Use generic AcknowledgedResponse instead of extended classes (#32859)
This removes custom Response classes that extend `AcknowledgedResponse` and do nothing, these classes are not needed and we can directly use the non-abstract super-class instead.

While this appears to be a large PR, no code has actually changed, only class names have been changed and entire classes removed.
2018-08-15 08:06:14 -06:00
Ryan Ernst 0158b59a5a
Test: Fix forbidden uses in test framework (#32824)
This commit fixes existing uses of forbidden apis in the test framework
and re-enables the forbidden apis check. It was previously completely
disabled and had missed a rename of the forbidden apis signatures files.

closes #32772
2018-08-14 11:35:09 -07:00
Tim Brooks 10fddb62ee
Remove client connections from TcpTransport (#31886)
This is related to #31835. This commit adds a connection manager that
manages client connections to other nodes. This means that the
TcpTransport no longer maintains a map of nodes that it is connected
to.
2018-08-13 16:44:09 -06:00
Armin Braun d412230cda
SCRIPTING: Support BucketAggScript return null (#32811)
* As explained in #32790, `BucketAggregationScript` must support `null` as a return value
* Closes #32790
2018-08-13 20:08:26 +02:00
Nik Everett f5ba801c6b Test: Only sniff host metadata for node_selectors (#32750)
Our rest testing framework has support for sniffing the host metadata on
startup and, before this change, it'd sniff that metadata before running
the first test. This prevents running these tests against
elasticsearch installations that won't support sniffing like Elastic
Cloud. This change allows tests to only sniff for metadata when they
encounter a test with a `node_selector`. These selectors are the things
that need the metadata anyway and they are super rare. Tests that use
these won't be able to run against installations that don't support
sniffing but we can just skip them. In the case of Elastic Cloud, these
tests were never going to work against Elastic Cloud anyway.
2018-08-10 13:35:47 -04:00
Christoph Büscher 22f7b03430
Fix test reproducability in AbstractBuilderTestCase setup (#32403)
Currently AbstractBuilderTestCase generates certain random values in its
`beforeTest()` method annotated with @Before only the first time that a test
method in the suite is run while initializing the serviceHolder that we use for
the rest of the test. This changes the values of subsequent random values
and has the effect that when running single methods from a test suite with
"-Dtests.method=*", the random values it sees are different from when the same
test method is run as part of the whole test suite. This makes it hard to use
the reproduction lines logged on failure.

This change runs the inialization of the serviceHolder and the randomization 
connected to it using the test runners master seed, so reproduction by running
just one method is possible again.


Closes #32400
2018-08-10 15:13:44 +02:00
Boaz Leskes f58ed21720
Refactor TransportShardBulkAction to better support retries (#31821)
Processing bulk request goes item by item. Sometimes during processing, we need to stop execution and wait for a new mapping update to be processed by the node. This is currently achieved by throwing a `RetryOnPrimaryException`, which is caught higher up. When the exception is caught, we wait for the next cluster state to arrive and process the request again. Sadly this is a problem because all operations that were already done until the mapping change was required are applied again and get new sequence numbers. This in turn means that the previously issued sequence numbers are never replicated to the replicas. That causes the local checkpoint of those shards to be stuck and with it all the seq# based infrastructure.

This commit refactors how we deal with retries with the goal of removing  `RetryOnPrimaryException` and `RetryOnReplicaException` (not done yet). It achieves so by introducing a class `BulkPrimaryExecutionContext` that is used the capture the execution state and allows continuing from where the execution stopped. The class also formalizes the steps each item has to go through:
1) A translation phase for updates
2) Execution phase (always index/delete)
3) Waiting for a mapping update to come in, if needed
4) Requires a retry (for updates and cases where the mapping are still not available after the put mapping call returns)
5) A finalization phase which allows updates to the index/delete result to an update result.
2018-08-10 10:15:01 +02:00
Alpar Torok af8c23eb40
Java version reproduction (#32715)
Enhance reproduction line with info about jdks

Provide the ability to control compiler and hava versions just by
passing a property. The actual java home comes from the
`JAVA<major>_HOME` env vars that we allready require.
This works better with the Gradle daemon as well.

Output is also changed a bit.

for `-Druntime.java=8 -Dcompiler.java=9`:
```
=======================================
Elasticsearch Build Hamster says Hello!
  Gradle Version        : 4.9
  OS Info               : Linux 4.17.8-1-ARCH (amd64)
  Compiler JDK Version  : 11 (Oracle Corporation 11-ea [OpenJDK 64-Bit Server VM 11-ea+22])
  Runtime JDK Version   : 11 (Oracle Corporation 11-ea [OpenJDK 64-Bit Server VM 11-ea+22])
  Gradle JDK Version    : 10 (Oracle Corporation 10.0.1 [OpenJDK 64-Bit Server VM 10.0.1+10])
  Compiler java.home    : /home/alpar/opt/jdk-11-ea22/
  Runtime java.home     : /home/alpar/opt/jdk-11-ea22/
  Gradle java.home      : /usr/lib/jvm/java-10-openjdk
  Random Testing Seed   : EA858533191E8DFB
=======================================
```

Without configuration:
```
=======================================
Elasticsearch Build Hamster says Hello!
=======================================
  Gradle Version        : 4.9
  OS Info               : Linux 4.17.8-1-ARCH (amd64)
  JDK Version           : 10 (Oracle Corporation 10.0.1 [OpenJDK 64-Bit Server VM 10.0.1+10])
  JAVA_HOME             : /usr/lib/jvm/java-10-openjdk
  Random Testing Seed   : 4BD5B2A839C8FCA1
=======================================
```

Here's how a reproduction line will look like (test made to fail):
```
./gradlew :modules:lang-painless:test -Dtests.seed=2DA2379065A4EEAB -Dtests.class=org.elasticsearch.painless.AdditionTests -Dtests.method="testInt" -Dtests.security.manager=true -Dtests.locale=es-PE -Dtests.timezone=WET -Dcompiler.java=10 -Druntime.java=10
```
2018-08-10 08:07:43 +00:00
Armin Braun 79375d35bb
Scripting: Replace Update Context (#32096)
* SCRIPTING: Move Update Scripts to their own context
* Added system property for backwards compatibility of change to `ctx.params`
2018-08-09 14:32:36 +02:00
Jason Tedor dcc816427e
Expose whether or not the global checkpoint updated (#32659)
It will be useful for future efforts to know if the global checkpoint
was updated. To this end, we need to expose whether or not the global
checkpoint was updated when the state of the replication tracker
updates. For this, we add to the tracker a callback that is invoked
whenever the global checkpoint is updated. For primaries this will be
invoked when the computed global checkpoint is updated based on state
changes to the tracker. For replicas this will be invoked when the local
knowledge of the global checkpoint is advanced from the primary.
2018-08-07 15:10:09 -04:00
Tim Brooks 3d5e9114e3
Reduce connections used by MockNioTransport (#32620)
The MockNioTransport (similar to the MockTcpTransport) is used for integ
tests. The MockTcpTransport has always only opened a single for all of
its work. The MockNioTransport has awlays opened the default number of
connections (13). This means that every test where two transports
connect requires 26 connections. This is more than is necessary. This
commit modifies the MockNioTransport to only require 3 connections.
2018-08-07 12:52:28 -06:00
Lee Hinman b3e15851a2 [TEST] Comment out account breaker assertion while diagnosing
Relates to #30290
2018-08-07 09:36:37 -06:00
Armin Braun 0a67cb4133
LOGGING: Upgrade to Log4J 2.11.1 (#32616)
* LOGGING: Upgrade to Log4J 2.11.1
* Upgrade to `2.11.1` to fix memory leaks in slow logger when logging large requests
   * This was caused by a bug in Log4J https://issues.apache.org/jira/browse/LOG4J2-2269 and is fixed in `2.11.1` via https://git-wip-us.apache.org/repos/asf?p=logging-log4j2.git;h=9496c0c
* Fixes #32537
* Fixes #27300
2018-08-06 14:56:21 +02:00
Armin Braun 6fa7016bbf
SCRIPTING: Move Aggregation Scripts to their own context (#32068)
* SCRIPTING: Move Aggregation Scripts to their own context
2018-08-04 10:37:07 +02:00
Yannick Welsch 0d60e8a029
Fix race between replica reset and primary promotion (#32442)
We've recently seen a number of test failures that tripped an assertion in IndexShard (see issues
linked below), leading to the discovery of a race between resetting a replica when it learns about a
higher term and when the same replica is promoted to primary. This commit fixes the race by
distinguishing between a cluster state primary term (called pendingPrimaryTerm) and a shard-level
operation term. The former is set during the cluster state update or when a replica learns about a
new primary. The latter is only incremented under the operation block, which can happen in a
delayed fashion. It also solves the issue where a replica that's still adjusting to the new term
receives a cluster state update that promotes it to primary, which can happen in the situation of
multiple nodes being shut down in short succession. In that case, the cluster state update thread
would call `asyncBlockOperations` in `updateShardState`, which in turn would throw an exception
as blocking permits is not allowed while an ongoing block is in place, subsequently failing the shard.
This commit therefore extends the IndexShardOperationPermits to allow it to queue multiple blocks
(which will all take precedence over operations acquiring permits). Finally, it also moves the primary
activation of the replication tracker under the operation block, so that the actual transition to
primary only happens under the operation block.

Relates to #32431, #32304 and #32118
2018-08-03 09:33:08 +02:00
Yannick Welsch db6e8c736d
Remove cluster state initial customs (#32501)
This infrastructure was introduced in #26144 and made obsolete in #30743
2018-08-02 15:49:59 +02:00
Jay Modi f2f33f3149 Use hostname instead of IP with SPNEGO test (#32514)
This change updates KerberosAuthenticationIT to resolve the host used
to connect to the test cluster. This is needed because the host could
be an IP address but SPNEGO requires a hostname to work properly. This
is done by adding a hook in ESRestTestCase for building the HttpHost
from the host and port.

Additionally, the project now specifies the IPv4 loopback address as
the http host. This is done because we need to be able to resolve the
address used for the HTTP transport before the node starts up, but the
http.ports file is not written until the node is started.

Closes #32498
2018-08-01 12:57:33 +10:00
Nik Everett 22459576d7
Logging: Make node name consistent in logger (#31588)
First, some background: we have 15 different methods to get a logger in
Elasticsearch but they can be broken down into three broad categories
based on what information is provided when building the logger.

Just a class like:
```
private static final Logger logger = ESLoggerFactory.getLogger(ActionModule.class);
```
or:
```
protected final Logger logger = Loggers.getLogger(getClass());
```

The class and settings:
```
this.logger = Loggers.getLogger(getClass(), settings);
```

Or more information like:
```
Loggers.getLogger("index.store.deletes", settings, shardId)
```

The goal of the "class and settings" variant is to attach the node name
to the logger. Because we don't always have the settings available, we
often use the "just a class" variant and get loggers without node names
attached. There isn't any real consistency here. Some loggers get the
node name because it is convenient and some do not.

This change makes the node name available to all loggers all the time.
Almost. There are some caveats are testing that I'll get to. But in
*production* code the node name is node available to all loggers. This
means we can stop using the "class and settings" variants to fetch
loggers which was the real goal here, but a pleasant side effect is that
the ndoe name is now consitent on every log line and optional by editing
the logging pattern. This is all powered by setting the node name
statically on a logging formatter very early in initialization.

Now to tests: tests can't set the node name statically because
subclasses of `ESIntegTestCase` run many nodes in the same jvm, even in
the same class loader. Also, lots of tests don't run with a real node so
they don't *have* a node name at all. To support multiple nodes in the
same JVM tests suss out the node name from the thread name which works
surprisingly well and easy to test in a nice way. For those threads
that are not part of an `ESIntegTestCase` node we stick whatever useful
information we can get form the thread name in the place of the node
name. This allows us to keep the logger format consistent.
2018-07-31 10:54:24 -04:00
Luca Cavanna 9a4d0069f6
REST high-level client: parse back _ignored meta field (#32362)
`GetResult` and `SearchHit` have been adjusted to parse back the `_ignored` meta field whenever it gets printed out. Expanded the existing tests to make sure this is covered. Fixed also a small problem around highlighted fields in `SearchHitTests`.
2018-07-30 13:43:40 +02:00
Armin Braun 1628c833c7
TESTS: Move netty leak detection to paranoid level (#32354) 2018-07-26 21:36:49 +02:00
Jim Ferenczi 8e5f281b27
AbstractQueryTestCase should run without type less often (#28936)
This commit changes the randomization to always create an index with a type.
It also adds a way to create a query shard context that maps to an index with
no type registered in order to explicitely test cases where there is no type.
2018-07-26 20:29:05 +02:00
Jason Tedor eb675a1c4d
Introduce index store plugins (#32375)
Today we allow plugins to add index store implementations yet we are not
doing this in our new way of managing plugins as pull versus push. That
is, today we still allow plugins to push index store providers via an on
index module call where they can turn around and add an index
store. Aside from being inconsistent with how we manage plugins today
where we would look to pull such implementations from plugins at node
creation time, it also means that we do not know at a top-level (for
example, in the indices service) which index stores are available. This
commit addresses this by adding a dedicated plugin type for index store
plugins, removing the index module hook for adding index stores, and by
aggregating these into the top-level of the indices service.
2018-07-26 08:05:49 -04:00
Tim Vernum 387c3c7f1d Introduce Application Privileges with support for Kibana RBAC (#32309)
This commit introduces "Application Privileges" to the X-Pack security
model.

Application Privileges are managed within Elasticsearch, and can be
tested with the _has_privileges API, but do not grant access to any
actions or resources within Elasticsearch. Their purpose is to allow
applications outside of Elasticsearch to represent and store their own
privileges model within Elasticsearch roles.

Access to manage application privileges is handled in a new way that
grants permission to specific application names only. This lays the
foundation for more OLS on cluster privileges, which is implemented by
allowing a cluster permission to inspect not just the action being
executed, but also the request to which the action is applied.
To support this, a "conditional cluster privilege" is introduced, which
is like the existing cluster privilege, except that it has a Predicate
over the request as well as over the action name.

Specifically, this adds
- GET/PUT/DELETE actions for defining application level privileges
- application privileges in role definitions
- application privileges in the has_privileges API
- changes to the cluster permission class to support checking of request
  objects
- a new "global" element on role definition to provide cluster object
  level security (only for manage application privileges)
- changes to `kibana_user`, `kibana_dashboard_only_user` and
  `kibana_system` roles to use and manage application privileges

Closes #29820
Closes #31559
2018-07-24 10:34:46 -06:00
Yogesh Gaikwad a525c36c60 [Kerberos] Add Kerberos authentication support (#32263)
This commit adds support for Kerberos authentication with a platinum
license. Kerberos authentication support relies on SPNEGO, which is
triggered by challenging clients with a 401 response with the
`WWW-Authenticate: Negotiate` header. A SPNEGO client will then provide
a Kerberos ticket in the `Authorization` header. The tickets are
validated using Java's built-in GSS support. The JVM uses a vm wide
configuration for Kerberos, so there can be only one Kerberos realm.
This is enforced by a bootstrap check that also enforces the existence
of the keytab file.

In many cases a fallback authentication mechanism is needed when SPNEGO
authentication is not available. In order to support this, the
DefaultAuthenticationFailureHandler now takes a list of failure response
headers. For example, one realm can provide a
`WWW-Authenticate: Negotiate` header as its default and another could
provide `WWW-Authenticate: Basic` to indicate to the client that basic
authentication can be used in place of SPNEGO.

In order to test Kerberos, unit tests are run against an in-memory KDC
that is backed by an in-memory ldap server. A QA project has also been
added to test against an actual KDC, which is provided by the krb5kdc
fixture.

Closes #30243
2018-07-24 08:44:26 -06:00
Daniel Mitterdorfer 73a38895fd
Add Restore Snapshot High Level REST API
With this commit we add the restore snapshot API to the Java high level
REST client.

Relates #27205
Relates #32155
2018-07-24 16:17:09 +02:00
Ioannis Kakavas a2dbd83db1
Allow Integ Tests to run in a FIPS-140 JVM (#31989)
* Complete changes for running IT in a fips JVM

- Mute :x-pack:qa:sql:security:ssl:integTest as it
  cannot run in FIPS 140 JVM until the SQL CLI supports key/cert.
- Set default JVM keystore/truststore password in top level build
  script for all integTest tasks in a FIPS 140 JVM
- Changed top level x-pack build script to use keys and certificates
  for trust/key material when spinning up clusters for IT
2018-07-24 12:48:14 +03:00
Andrey Ershov 33f11e637d
Fail shard if IndexShard#storeStats runs into an IOException (#32241)
Fail shard if IndexShard#storeStats runs into an IOException. Closes #29008
2018-07-23 16:38:55 +02:00
Christoph Büscher ff87b7aba4
Remove unnecessary warning supressions (#32250) 2018-07-23 11:31:04 +02:00
Armin Braun 24068a773d
TESTS: Check for Netty resource leaks (#31861)
* Enabled advanced leak detection when loading `EsTestCase`
* Added custom `Appender` to collect leak logs and check for logged errors in a way similar to what is done for the `StatusLogger`
* Fixes #20398
2018-07-20 09:12:32 +02:00
Julie Tibshirani 15ff3da653
Add support for field aliases. (#32172)
* Add basic support for field aliases in index mappings. (#31287)
* Allow for aliases when fetching stored fields. (#31411)
* Add tests around accessing field aliases in scripts. (#31417)
* Add documentation around field aliases. (#31538)
* Add validation for field alias mappings. (#31518)
* Return both concrete fields and aliases in DocumentFieldMappers#getMapper. (#31671)
* Make sure that field-level security is enforced when using field aliases. (#31807)
* Add more comprehensive tests for field aliases in queries + aggregations. (#31565)
* Remove the deprecated method DocumentFieldMappers#getFieldMapper. (#32148)
2018-07-18 09:33:09 -07:00
Boaz Leskes 5856c396dd
A replica can be promoted and started in one cluster state update (#32042)
When a replica is fully recovered (i.e., in `POST_RECOVERY` state) we send a request to the master
to start the shard. The master changes the state of the replica and publishes a cluster state to that
effect. In certain cases, that cluster state can be processed on the node hosting the replica
*together* with a cluster state that promotes that, now started, replica to a primary. This can
happen due to cluster state batched processing or if the master died after having committed the
cluster state that starts the shard but before publishing it to the node with the replica. If the master
also held the primary shard, the new master node will remove the primary (as it failed) and will also
immediately promote the replica (thinking it is started). 

Sadly our code in IndexShard didn't allow for this which caused [assertions](13917162ad/server/src/main/java/org/elasticsearch/index/seqno/ReplicationTracker.java (L482)) to be tripped in some of our tests runs.
2018-07-18 11:30:44 +02:00
Boaz Leskes 93d7468f3a ESIndexLevelReplicationTestCase doesn't support replicated failures but it's good to know what they are
Sometimes we have a test failure that hits an `UnsupportedOperationException` in this infrastructure. When
debugging you want to know what caused this unexpected failure, but right now we're silent about it. This
commit adds some information to the `UnsupportedOperationException`

Relates to #32127
2018-07-18 08:49:16 +02:00
Nhat Nguyen df1380b8d3
Remove versionType from translog (#31945)
With the introduction of sequence number, we no longer use versionType to
resolve out of order collision in replication and recovery requests.

This PR removes removes the versionType from translog. We can only remove
it in 7.0 because it is still required in a mixed cluster between 6.x and 5.x.
2018-07-17 21:59:48 -04:00
Ioannis Kakavas 9e529d9d58
Enable testing in FIPS140 JVM (#31666)
Ensure our tests can run in a FIPS JVM

JKS keystores cannot be used in a FIPS JVM as attempting to use one
in order to init a KeyManagerFactory or a TrustManagerFactory is not
allowed.( JKS keystore algorithms for private key encryption are not
FIPS 140 approved)
This commit replaces JKS keystores in our tests with the
corresponding PEM encoded key and certificates both for key and trust
configurations.
Whenever it's not possible to refactor the test, i.e. when we are
testing that we can load a JKS keystore, etc. we attempt to
mute the test when we are running in FIPS 140 JVM. Testing for the
JVM is naive and is based on the name of the security provider as
we would control the testing infrastrtucture and so this would be
reliable enough.
Other cases of tests being muted are the ones that involve custom
TrustStoreManagers or KeyStoreManagers, null TLS Ciphers and the
SAMLAuthneticator class as we cannot sign XML documents in the
way we were doing. SAMLAuthenticator tests in a FIPS JVM can be
reenabled with precomputed and signed SAML messages at a later stage.

IT will be covered in a subsequent PR
2018-07-17 10:54:10 +03:00
Daniel Mitterdorfer 016e8760f0
Turn off real-mem breaker in single node tests
With this commit we disable the real-memory circuit breaker in tests
that inherit from `ESSingleNodeTestCase`. As this breaker is based on
real memory usage over which we have no (full) control in tests and
their purpose is also not to test the circuit breaker, we use the
deterministic circuit breaker implementation that only accounts for
explicitly reserved memory.

Closes #32047
Relates #32071
2018-07-16 10:40:36 +02:00
Armin Braun 3679d00a74
Replace Ingest ScriptContext with Custom Interface (#32003)
* Replace Ingest ScriptContext with Custom Interface
* Make org.elasticsearch.ingest.common.ScriptProcessorTests#testScripting more precise
* Don't mock script factory in ScriptProcessorTests
* Adjust mock script plugin in IT for new API
2018-07-13 23:26:10 +02:00
Vladimir Dolzhenko b1bf643e41
lazy snapshot repository initialization (#31606)
lazy snapshot repository initialization
2018-07-13 20:05:49 +02:00
Colin Goodheart-Smithe 0edb096eb4 Adds a new auto-interval date histogram (#28993)
* Adds a new auto-interval date histogram

This change adds a new type of histogram aggregation called `auto_date_histogram` where you can specify the target number of buckets you require and it will find an appropriate interval for the returned buckets. The aggregation works by first collecting documents in buckets at second interval, when it has created more than the target number of buckets it merges these buckets into minute interval bucket and continues collecting until it reaches the target number of buckets again. It will keep merging buckets when it exceeds the target until either collection is finished or the highest interval (currently years) is reached. A similar process happens at reduce time.

This aggregation intentionally does not support min_doc_count, offest and extended_bounds to keep the already complex logic from becoming more complex. The aggregation accepts sub-aggregations but will always operate in `breadth_first` mode deferring the computation of sub-aggregations until the final buckets from the shard are known. min_doc_count is effectively hard-coded to zero meaning that we will insert empty buckets where necessary.

Closes #9572

* Adds documentation

* Added sub aggregator test

* Fixes failing docs test

* Brings branch up to date with master changes

* trying to get tests to pass again

* Fixes multiBucketConsumer accounting

* Collects more buckets than needed on shards

This gives us more options at reduce time in terms of how we do the
final merge of the buckeets to produce the final result

* Revert "Collects more buckets than needed on shards"

This reverts commit 993c782d117892af9a3c86a51921cdee630a3ac5.

* Adds ability to merge within a rounding

* Fixes nonn-timezone doc test failure

* Fix time zone tests

* iterates on tests

* Adds test case and documentation changes

Added some notes in the documentation about the intervals that can bbe
returned.

Also added a test case that utilises the merging of conseecutive buckets

* Fixes performance bug

The bug meant that getAppropriate rounding look a huge amount of time
if the range of the data was large but also sparsely populated. In
these situations the rounding would be very low so iterating through
the rounding values from the min key to the max keey look a long time
(~120 seconds in one test).

The solution is to add a rough estimate first which chooses the
rounding based just on the long values of the min and max keeys alone
but selects the rounding one lower than the one it thinks is
appropriate so the accurate method can choose the final rounding taking
into account the fact that intervals are not always fixed length.

Thee commit also adds more tests

* Changes to only do complex reduction on final reduce

* merge latest with master

* correct tests and add a new test case for 10k buckets

* refactor to perform bucket number check in innerBuild

* correctly derive bucket setting, update tests to increase bucket threshold

* fix checkstyle

* address code review comments

* add documentation for default buckets

* fix typo
2018-07-13 13:08:35 -04:00
Daniel Mitterdorfer f174f72fee
Circuit-break based on real memory usage
With this commit we introduce a new circuit-breaking strategy to the parent
circuit breaker. Contrary to the current implementation which only accounts for
memory reserved via child circuit breakers, the new strategy measures real heap
memory usage at the time of reservation. This allows us to be much more
aggressive with the circuit breaker limit so we bump it to 95% by default. The
new strategy is turned on by default and can be controlled  with the new cluster
setting `indices.breaker.total.userealmemory`.

Note that we turn it off for all integration tests with an internal test cluster
because it leads to spurious test failures which are of no value (we cannot
fully control heap memory usage in tests). All REST tests, however, will make
use of the real memory circuit breaker.

Relates #31767
2018-07-13 10:08:28 +02:00
olcbean 334c255516 XContentTests : Insert random fields at random positions (#30867)
Currently AbstractXContentTestCase#testFromXContent appends random fields, but in
a fixed position. This PR shuffles all fields after the random fields have been appended, 
hence the random fields are actually added to random positions.
2018-07-12 19:10:51 +02:00
Nik Everett 38e09a1508
Switch test framework to new style requests (#31939)
In #29623 we added `Request` object flavored requests to the low level
REST client and in #30315 we deprecated the old `performRequest`s. This
changes all calls in the `test/framework` project to use the new
versions.
2018-07-11 10:04:17 -04:00
Armin Braun b4087d69d2
Fix assertIngestDocument wrongfully passing (#31913)
* Fix assertIngestDocument wrongfully passing

* Previously docA being subset of docB passed because iteration was over docA's keys only
* Scalars in nested fields were not compared in all cases
* Assertion errors were hard to interpret (message wasn't correct since it only mentioned the class type)
* In cases where two paths contained different types a ClassCastException was thrown instead of an AssertionError
* Fixes #28492
2018-07-11 10:24:21 +02:00
Alexander Reelsen 1c32497c44
Date: Add DateFormatters class that uses java.time (#31856)
A newly added class called DateFormatters now contains java.time based
builders for dates, which also intends to be fully backwards compatible,
when the name based date formatters are picked. Also a new class named 
CompoundDateTimeFormatter for being able to parse multiple different 
formats has been added.

A duelling test class has been added that ensures the same dates when
parsing java or joda time formatted dates for the name based dates.

Note, that java.time and joda time are not fully backwards compatible,
which also means that old formats will currently not work with this
setup.
2018-07-10 09:28:28 +02:00
Yannick Welsch cce7dc20ad
Smaller aesthetic fixes to InternalTestCluster (#31831)
Allows cluster to auto-reconfigure faster by starting up nodes in parallel.
2018-07-06 11:42:09 +02:00
Nik Everett 1099060735
Test: Do not remove xpack templates when cleaning (#31642)
At the end of every `ESRestTestCase` we clean the cluster which includes
deleting all of the templates. If xpack is installed it'll automatically
recreate a few templates every time they are removed. Which is slow.

This change stops the cleanup from removing the xpack templates. It cuts
the time to run the docs tests more than in half and it probably saves a
bit more time on other tests as well.
2018-07-05 09:43:43 -04:00
Christoph Büscher bd1c513422
Reduce more raw types warnings (#31780)
Similar to #31523.
2018-07-05 15:38:06 +02:00