Commit Graph

1681 Commits

Author SHA1 Message Date
Alexander Reelsen 1de2a925ce
Watcher: Ensure that execution triggers properly on initial setup (#33360)
This commit reverts most of #33157 as it introduces another race
condition and breaks a common case of watcher, when the first watch is
added to the system and the index does not exist yet.

This means, that the index will be created, which triggers a reload, but
during this time the put watch operation that triggered this is not yet
indexed, so that both processes finish roughly add the same time and
should not overwrite each other but act complementary.

This commit reverts the logic of cleaning out the ticker engine watches
on start up, as this is done already when the execution is paused -
which also gets paused on the cluster state listener again, as we can be
sure here, that the watches index has not yet been created.

This also adds a new test, that starts a one node cluster and emulates
the case of a non existing watches index and a watch being added, which
should result in proper execution.

Closes #33320
2018-09-21 14:22:34 +02:00
Armin Braun 3a5b8a71b4
NETWORKING: Fix Portability of SO_LINGER=0 in Tests (#33895)
* Setting SO_LINGER for open but not connected non-blocking sockets
throws on OSX
  * Fixed by only applying setting to connected sockets which will save
the same number of FDs as doing it on open sockets anyway
* closes #33879
2018-09-21 10:08:16 +02:00
Nhat Nguyen 5f7f793f43
Propagate max_auto_id_timestamp in peer recovery (#33693)
Today we don't store the auto-generated timestamp of append-only
operations in Lucene; and assign -1 to every index operations
constructed from LuceneChangesSnapshot. This looks innocent but it
generates duplicate documents on a replica if a retry append-only
arrives first via peer-recovery; then an original append-only arrives
via replication. Since the retry append-only (delivered via recovery)
does not have timestamp, the replica will happily optimizes the original
request while it should not.

This change transmits the max auto-generated timestamp from the primary
to replicas before translog phase in peer recovery. This timestamp will
prevent replicas from optimizing append-only requests if retry
counterparts have been processed.

Relates #33656 
Relates #33222
2018-09-20 19:53:30 -04:00
Nhat Nguyen 76a1a863e3
TEST: stop assertSeqNos if shards movement (#33875)
Currently, assertSeqNos assumes that the cluster is stable at the end of
the test (i.e., no more shard movement). However, this assumption does
not always hold. In these cases, we can stop the assertion instead of
failing a test.

Closes #33704
2018-09-20 13:44:26 -04:00
Tim Vernum ff934e3dcd Mute broken test on MacOS
Seems to be triggered by 0cf0d73
See: https://github.com/elastic/elasticsearch/issues/33879
2018-09-20 14:06:40 +10:00
Nik Everett 26c4f1fb6c
Core: Default node.name to the hostname (#33677)
Changes the default of the `node.name` setting to the hostname of the
machine on which Elasticsearch is running. Previously it was the first 8
characters of the node id. This had the advantage of producing a unique
name even when the node name isn't configured but the disadvantage of
being unrecognizable and not being available until fairly late in the
startup process. Of particular interest is that it isn't available until
after logging is configured. This forces us to use a volatile read
whenever we add the node name to the log.

Using the hostname is available immediately on startup and is generally
recognizable but has the disadvantage of not being unique when run on
machines that don't set their hostname or when multiple elasticsearch
processes are run on the same host. I believe that, taken together, it
is better to default to the hostname.

1. Running multiple copies of Elasticsearch on the same node is a fairly
advanced feature. We do it all the as part of the elasticsearch build
for testing but we make sure to set the node name then.
2. That the node.name defaults to some flavor of "localhost" on an
unconfigured box feels like it isn't going to come up too much in
production. I expect most production deployments to at least set the
hostname.

As a bonus, production deployments need no longer set the node name in
most cases. At least in my experience most folks set it to the hostname
anyway.
2018-09-19 15:21:29 -04:00
Nik Everett 3ede13a454
Test framework fall cleaning (#33423)
Wraps all lines in our test framework at 140 characters because that is
our standard line length and removes all of the checkstyle suppressions
for the test framework.

Drops most of `ModuleTestCase` because it isn't used and we're moving
away from using guice in the way that it wants to test anyway. Also
switches a few classes that extend it but don't use it to extend
`ESTestCase` instead.
2018-09-19 14:34:02 -04:00
Lee Hinman 81e9150c7a Merge remote-tracking branch 'origin/master' into index-lifecycle 2018-09-19 09:43:26 -06:00
Vladimir Dolzhenko a3e8b831ee
add elasticsearch-shard tool (#32281)
Relates #31389
2018-09-19 10:28:22 +02:00
Armin Braun 0cf0d73813
TESTS: Set SO_LINGER = 0 for MockNioTransport (#32560)
* TESTS: Set SO_LINGER = 0 for MockNioTransport

* Prevents lingering sockets in TIME_WAIT piling up during test runs and leading to port collisions that manifest as timeouts
* Fixes #32552
2018-09-19 06:05:36 +02:00
Lee Hinman c87cff22b4 Merge remote-tracking branch 'origin/master' into index-lifecycle 2018-09-18 13:57:41 -06:00
Or Bin a5bad4d92c Docs: Fixed a grammatical mistake: 'a HTTP ...' -> 'an HTTP ...' (#33744)
Fixed a grammatical mistake: 'a HTTP ...' -> 'an HTTP ...'

Closes #33728
2018-09-17 15:35:54 -04:00
Lee Hinman 7ff11b4ae1 Merge remote-tracking branch 'origin/master' into index-lifecycle 2018-09-17 10:41:10 -06:00
Alpar Torok 5ca6f31205
Move precommit task implementation to java (#33407)
Replace precommit tasks that execute with Java implementations
2018-09-17 14:09:28 +03:00
Lee Hinman e6cbaa5a78 Merge remote-tracking branch 'origin/master' into index-lifecycle 2018-09-14 16:27:37 -06:00
Armin Braun 0b4960ff6b
SCRIPTING: Move terms_set Context to its Own Class (#33602)
* SCRIPTING: Move terms_set Context to its Own Class

* Extracted TermsSetQueryScript
* Kept mechanics close to what they were with SearchScript
2018-09-14 06:21:18 +02:00
Colin Goodheart-Smithe 8e59de3eb2
Merge branch 'master' into index-lifecycle 2018-09-13 09:46:14 +01:00
Jim Ferenczi 6ca36bba15
Fix field mapping updates with similarity (#33634)
This change fixes a bug introduced in 6.3 that prevents fields with an explicit
similarity to be updated. It also adds a test that checks this case for similarities
but also for analyzers since they could suffer from the same problem.

Closes #33611
2018-09-13 09:21:27 +02:00
David Turner 5a3fd8e4e7
Use file-based discovery not MockUncasedHostsProvider (#33554)
Today we use a special unicast hosts provider, the `MockUncasedHostsProvider`,
in many integration tests, to deal with the dynamic nature of the allocation of
ports to nodes. However #33241 allows us to use file-based discovery to achieve
the same goal, so the special test-only `MockUncasedHostsProvider` is no longer
required.

This change removes `MockUncasedHostProvider` and replaces it with file-based
discovery in tests based on `EsIntegTestCase`.
2018-09-13 07:37:15 +02:00
Martijn van Groningen 5fa81310cc
[CCR] Added history uuid validation (#33546)
For correctness we need to verify whether the history uuid of the leader
index shards never changes while that index is being followed.

* The history UUIDs are recorded as custom index metadata in the follow index.
* The follow api validates whether the current history UUIDs of the leader
  index shards are the same as the recorded history UUIDs.
  If not the follow api fails.
* While a follow index is following a leader index; shard follow tasks
  on each shard changes api call verify whether their current history uuid
  is the same as the recorded history uuid.

Relates to #30086 

Co-authored-by: Nhat Nguyen <nhat.nguyen@elastic.co>
2018-09-12 19:42:00 +02:00
Simon Willnauer c783488e97
Add `_source`-only snapshot repository (#32844)
This change adds a `_source` only snapshot repository that allows to wrap
any existing repository as a _backend_ to snapshot only the `_source` part
including live docs markers. Snapshots taken with the `source` repository
won't include any indices,  doc-values or points. The snapshot will be reduced in size and
functionality such that it requires full re-indexing after it's successfully restored.

The restore process will copy the `_source` data locally starts a special shard and engine
to allow `match_all` scrolls and searches. Any other query, or get call will fail with and unsupported operation exception.  The restored index is also marked as read-only.

This feature aims mainly for disaster recovery use-cases where snapshot size is
a concern or where time to restore is less of an issue.

**NOTE**: The snapshot produced by this repository is still a valid lucene index. This change doesn't allow for any longer retention policies which is out of scope for this change.
2018-09-12 17:47:10 +02:00
Nhat Nguyen 743327efc2
Reset replica engine to global checkpoint on promotion (#33473)
When a replica starts following a newly promoted primary, it may have
some operations which don't exist on the new primary. Thus we need to
throw those operations to align a replica with the new primary. This can
be done by first resetting an engine from the safe commit, then replaying
the local translog up to the global checkpoint.

Relates #32867
2018-09-11 22:09:37 -04:00
Jason Tedor 73c75bef21
Preserve cluster settings on full restart tests (#33590)
Today the full cluster restart tests do not preserve cluster settings on
restart. This is a mistake because it is not an accurate reflection of
reality, we do not expect users to clear cluster settings when they
perform a full cluster restart. This commit makes it so that all full
cluster restart tests preserve settings on upgrade.
2018-09-11 08:40:22 -04:00
Jason Tedor ea3fdc90c6
Add full cluster restart base class (#33577)
This commit adds a base class for full cluster restart tests.
2018-09-10 20:06:42 -04:00
Colin Goodheart-Smithe cdc4f57a77
Merge branch 'master' into index-lifecycle 2018-09-10 21:30:44 +01:00
Alan Woodward 39c3234c2f
Upgrade to latest Lucene snapshot (#33505)
* LeafCollector.setScorer() now takes a Scorable
* Scorers may not have null Weights
* IndexWriter.getFlushingBytes() reports how much memory is being used by IW threads writing to disk
2018-09-10 20:51:55 +01:00
Jason Tedor 5f4244755e
Enable not wiping cluster settings after REST test (#33575)
In some cases we want to skip wiping cluster settings after a REST
test. For example, one use-case would be in the full cluster restart
tests where want to test cluster settings before and after a full
cluster restart. If we wipe the cluster settings before the restart,
then it would not be possible to assert on them after the restart.
2018-09-10 14:25:30 -04:00
Tanguy Leroux 079d130d8c
[Test] Remove duplicate method in TestShardRouting (#32815) 2018-09-10 18:29:00 +02:00
Jason Tedor 6bb817004b
Add infrastructure to upgrade settings (#33536)
In some cases we want to deprecate a setting, and then automatically
upgrade uses of that setting to a replacement setting. This commit adds
infrastructure for this so that we can upgrade settings when recovering
the cluster state, as well as when such settings are dynamically applied
on cluster update settings requests. This commit only focuses on cluster
settings, index settings can build on this infrastructure in a
follow-up.
2018-09-09 20:49:19 -04:00
Jason Tedor 5a38c930fc
Add license checks for auto-follow implementation (#33496)
This commit adds license checks for the auto-follow implementation. We
check the license on put auto-follow patterns, and then for every
coordination round we check that the local and remote clusters are
licensed for CCR. In the case of non-compliance, we skip coordination
yet continue to schedule follow-ups.
2018-09-09 07:06:55 -04:00
Nhat Nguyen 94e4cb64c2
Bootstrap a new history_uuid when force allocating a stale primary (#33432)
This commit ensures that we bootstrap a new history_uuid when force
allocating a stale primary. A stale primary should never be the source
of an operation-based recovery to another shard which exists before the
forced-allocation.

Closes #26712
2018-09-08 19:29:31 -04:00
Nik Everett 190ea9a6de
Logging: Configure the node name when we have it (#32983)
Change the logging infrastructure to handle when the node name isn't
available in `elasticsearch.yml`. In that case the node name is not
available until long after logging is configured. The biggest change is
that the node name logging no longer fixed at pattern build time.
Instead it is read from a `SetOnce` on every print. If it is unset it is
printed as `unknown` so we have something that fits in the pattern.
On normal startup we don't log anything until the node name is available
so we never see the `unknown`s.
2018-09-07 14:31:23 -04:00
Simon Willnauer c12d232215
Pass Directory instead of DirectoryService to Store (#33466)
Instead of passing DirectoryService which causes yet another dependency
on Store we can just pass in a Directory since we will just call
`DirectoryService#newDirectory()` on it anyway.
2018-09-07 14:00:24 +02:00
Colin Goodheart-Smithe 017ffe5d12
Merge branch 'master' into index-lifecycle 2018-09-07 10:59:10 +01:00
Jim Ferenczi 79cd6385fe
Collapse package structure for metrics aggs (#33463)
This change collapses all metrics aggregations classes into a single package `org.elasticsearch.aggregations.metrics`.
It also restricts the visibility of some classes (aggregators and factories) that should not be used outside of the package.

Relates #22868
2018-09-07 10:58:06 +02:00
Nik Everett 0d45752e50
Fix IndexMetaData loads after rollover (#33394)
When we rollover and index we write the conditions of the rollover that
the old index met into the old index. Loading this index metadata
requires a working `NamedXContentRegistry` that has been populated with
parsers from the rollover infrastructure. We had a few loads that didn't
use a working `NamedXContentRegistry` and so would fail if they ever
encountered an index that had been rolled over. Here are the locations
of the loads and how I fixed them:

* IndexFolderUpgrader - removed entirely. It existed to support opening
indices made in Elasticsearch 2.x. Since we only need this change as far
back as 6.4.1 which will supports reading from indices created as far
back as 5.0.0 we should be good here.
* TransportNodesListGatewayStartedShards - wired the
`NamedXContentRegistry` into place.
* TransportNodesListShardStoreMetaData - wired the
`NamedXContentRegistry` into place.
* OldIndexUtils - removed entirely. It existed to support the zip based
index backwards compatibility tests which we've since replaced with code
that actually runs old versions of Elasticsearch.

In addition to fixing the actual problem I added full cluster restart
integration tests for rollover which would have caught this problem and
I added an extra assertion to IndexMetaData's deserialization code which
will trip if we try to deserialize and index's metadata without a fully
formed `NamedXContentRegistry`. It won't catch if use the *wrong*
`NamedXContentRegistry` but it is better than nothing.

Closes #33316
2018-09-06 17:55:24 -04:00
Nhat Nguyen 8afe09a749
Pass TranslogRecoveryRunner to engine from outside (#33449)
This commit allows us to use different TranslogRecoveryRunner when
recovering an engine from its local translog. This change is a
prerequisite for the commit-based rollback PR.

Relates #32867
2018-09-06 11:59:16 -04:00
Jim Ferenczi 7ad71f906a
Upgrade to a Lucene 8 snapshot (#33310)
The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However this optimization is not activated in this change, there is another issue opened to discuss how it should be integrated smoothly.
Some comments about the change:
* Tests that can produce negative scores have been adapted but we need to forbid them completely: #33309

Closes #32899
2018-09-06 14:42:06 +02:00
Colin Goodheart-Smithe b1257d873b
Merge branch 'master' into index-lifecycle 2018-09-06 08:17:40 +01:00
Lee Hinman 96d515e3f5
Replace PhaseAfterStep with PhaseCompleteStep (#33398)
This removes `PhaseAfterStep` in favor of a new `PhaseCompleteStep`. This step
in only a marker that the `LifecyclePolicyRunner` needs to halt until the time
indicated for entering the next phase.

This also fixes a bug where phase times were encapsulated into the policy
instead of dynamically adjusting to policy changes.

Supersedes #33140, which it replaces
Relates to #29823
2018-09-05 16:37:45 -06:00
Tim Brooks 88c178dca6
Add sni name to SSLEngine in netty transport (#33144)
This commit is related to #32517. It allows an "server_name"
attribute on a DiscoveryNode to be propagated to the server using
the TLS SNI extentsion. This functionality is only implemented for
the netty security transport.
2018-09-05 16:12:10 -06:00
Armin Braun 46774098d9
INGEST: Implement Drop Processor (#32278)
* INGEST: Implement Drop Processor
* Adjust Processor API
* Implement Drop Processor
* Closes #23726
2018-09-05 14:25:29 +02:00
Jim Ferenczi dbc7102c86
Fix inner hits retrieval when stored fields are disabled (_none_) (#33018)
Now that types are unique per mapping we can retrieve the document mapper
without referencing the type. This fixes an NPE when stored fields are disabled.
For 6x we'll need a different fix since mappings can still have multiple types.

Relates #32941
2018-09-04 16:25:52 +02:00
Jason Tedor 09bf4e5f00
Introduce private settings (#33327)
This commit introduces the formal notion of a private setting. This
enables us to register some settings that we had previously not
registered as fully-fledged settings to avoid them being exposed via
APIs such as the create index API. For example, we had hacks in the
codebase to allow index.version.created to be passed around inside of
settings objects, but was not registered as a setting so that if a user
tried to use the setting on any API then they would get an
exception. This prevented users from setting index.version.created on
index creation, or updating it via the index settings API. By
introducing private settings, we can continue to reject these attempts,
yet now we can represent these settings as actual settings. In this
change, we register index.version.created as an actual setting. We do
not cutover all settings that we had been treating as private in this
pull request, it is already quite large due to moving some tests around
to account for the fact that some tests need to be able to set the
index.version.created. This can be done in a follow-up change.
2018-09-03 19:17:57 -04:00
Nik Everett f8b7a4dbc8
Logging: Drop Settings from some logging ctors (#33332)
Drops `Settings` from some logging ctors now that they are no longer
needed. This should allow us to stop passing `Settings` around to quite
as many places.
2018-09-02 16:51:26 -04:00
Vladimir Dolzhenko 3d82a30fad
drop `index.shard.check_on_startup: fix` (#32279)
drop `index.shard.check_on_startup: fix`

Relates #31389
2018-08-31 21:29:06 +02:00
Nhat Nguyen ad4dd086d2 Integrates soft-deletes into Elasticsearch (#33222)
This PR integrates Lucene soft-deletes(LUCENE-8200) into Elasticsearch.
Highlight works in this PR include:

- Replace hard-deletes by soft-deletes in InternalEngine
- Use _recovery_source if _source is disabled or modified (#31106)
- Soft-deletes retention policy based on the global checkpoint (#30335)
- Read operation history from Lucene instead of translog (#30120)
- Use Lucene history in peer-recovery (#30522)

Relates #30086
Closes #29530

---
These works have been done by the whole team; however, these individuals
(lexical order) have significant contribution in coding and reviewing:

Co-authored-by: Adrien Grand <jpountz@gmail.com>
Co-authored-by: Boaz Leskes <b.leskes@gmail.com>
Co-authored-by: Jason Tedor <jason@tedor.me>
Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
Co-authored-by: Nhat Nguyen <nhat.nguyen@elastic.co>
Co-authored-by: Simon Willnauer <simonw@apache.org>
2018-08-30 23:46:07 -04:00
Nhat Nguyen 547de71d59 Revert "Integrates soft-deletes into Elasticsearch (#33222)"
Revert to correct co-author tags.
This reverts commit 6dd0aa54f6.
2018-08-30 23:44:57 -04:00
Nhat Nguyen 6dd0aa54f6
Integrates soft-deletes into Elasticsearch (#33222)
This PR integrates Lucene soft-deletes(LUCENE-8200) into Elasticsearch.
Highlight works in this PR include:

- Replace hard-deletes by soft-deletes in InternalEngine
- Use _recovery_source if _source is disabled or modified (#31106)
- Soft-deletes retention policy based on the global checkpoint (#30335)
- Read operation history from Lucene instead of translog (#30120)
- Use Lucene history in peer-recovery (#30522)

Relates #30086
Closes #29530

---
These works have been done by the whole team; however, these individuals
(lexical order) have significant contribution in coding and reviewing:

Co-authored-by: Adrien Grand jpountz@gmail.com
Co-authored-by: Boaz Leskes b.leskes@gmail.com
Co-authored-by: Jason Tedor jason@tedor.me
Co-authored-by: Martijn van Groningen martijn.v.groningen@gmail.com
Co-authored-by: Nhat Nguyen nhat.nguyen@elastic.co
Co-authored-by: Simon Willnauer simonw@apache.org
2018-08-30 22:11:23 -04:00
Nhat Nguyen 39839f97ef
TEST: Access cluster state directly in assertSeqNos (#33277)
Some AbstractDisruptionTestCase tests start failing since we enabled
assertSeqNos (in #33130). They fail because the assertSeqNos assertion
queries cluster stats while the cluster is disrupted or not formed yet.

This commit switches to use the cluster state and shard stats directly
from the test cluster.

Closes #33251
2018-08-30 19:37:20 -04:00
Armin Braun cc4d7059bf
Ingest: Add conditional per processor (#32398)
* Ingest: Add conditional per processor
* closes #21248
2018-08-30 03:46:39 +02:00
Matt Weber 92bd7242a3 Fix classpath security checks for external tests. (#33066)
This commit checks that when we manually add a class to
the codebase map, that it does in-fact not exist on the classpath
in a jar.  This will only be true if we are using the test framework
externally such as when a user develops a plugin.
2018-08-29 12:19:58 -07:00
Jonathan Little 9d92a87ae6 Remove support for deprecated params._agg/_aggs for scripted metric aggregations (#32979) 2018-08-28 09:27:43 +01:00
Nhat Nguyen 014b3236dc
Ensure to generate identical NoOp for the same failure (#33141)
We generate slightly different NoOps in InternalEngine and
TransportShardBulkAction for the same failure.

1. InternalEngine uses Exception#getFailure to generate a message
without the class name: newOp [NoOp{seqNo=1, primaryTerm=1,
reason='Contexts are mandatory in context enabled completion field
[suggest_context]'}].

2. TransportShardBulkAction uses Exception#toString to generate a
message with the class name: NoOp{seqNo=1, primaryTerm=1,
reason='java.lang.IllegalArgumentException: Contexts are mandatory in
context enabled completion field [suggest_context]'}.

If a write operation fails while a replica is recovering, that replica
will possibly receive two different NoOps: one from recovery and one
from replication. These two different NoOps will trip
TranslogWriter#assertNoSeqNumberConflict assertion.

This commit ensures that we generate the same Noop for the same failure.

Closes #32986
2018-08-27 15:59:42 -04:00
Simon Willnauer 3376922e8b
Add proxy support to RemoteClusterConnection (#33062)
This adds support for connecting to a remote cluster through
a tcp proxy. A remote cluster can configured with an additional
`search.remote.$clustername.proxy` setting. This proxy will be used
to connect to remote nodes for every node connection established.
We still try to sniff the remote clsuter and connect to nodes directly
through the proxy which has to support some kind of routing to these nodes.
Yet, this routing mechanism requires the handshake request to include some
kind of information where to route to which is not yet implemented. The effort
to use the hostname and an optional node attribute for routing is tracked
in #32517

Closes #31840
2018-08-25 20:41:32 +02:00
Nhat Nguyen 9dad82ece8
TEST: Skip assertSeqNos for closed shards (#33130)
If a shard was closed, we return null for SeqNoStats. Therefore the
assertion assertSeqNos will hit NPE when it verifies a closed shard.

This commit skips closed shards in assertSeqNos and enables this
assertion in AbstractDisruptionTestCase.
2018-08-24 21:02:13 -04:00
Nhat Nguyen 739a8d3d44
TEST: resync operation on replica should acquire shard permit (#33103)
This change makes sure that resync operations on replicas in the test
framework are executed under shard permits as the production code.
2018-08-24 20:25:13 -04:00
Jason Tedor 619e0b28b9
Add hook to skip asserting x-content equivalence (#33114)
This commit adds a hook to AbstractSerializingTestCase to enable
skipping asserting that the x-content of the test instance and an
instance parsed from the x-content of the test instance are the
same. While we usually expect these to be the same, they will not be the
same when exceptions are involved because the x-content there is lossy.
2018-08-24 06:53:44 -04:00
Jim Ferenczi f4e9729d64
Remove unsupported Version.V_5_* (#32937)
This change removes the es 5x version constants and their usages.
2018-08-24 09:51:21 +02:00
Armin Braun 917e5a8c94
TESTS: Fix Random Fail in MockTcpTransportTests (#33061)
* `foobar.txGet()` appears to return before `serviceB.stop()` returns, causing `ServiceB.close()` to run concurrently with the `stop` call and running into a race codition
* Closes #32863
2018-08-23 13:19:21 +02:00
Luca Cavanna 393eec1482
Set maxScore for empty TopDocs to Nan rather than 0 (#32938)
We used to set `maxScore` to `0` within `TopDocs` in situations where there is really no score as the size was set to `0` and scores were not even tracked. In such scenarios, `Float.Nan` is more appropriate, which gets converted to `max_score: null` on the REST layer. That's also more consistent with lucene which set `maxScore` to `Float.Nan` when merging empty `TopDocs` (see `TopDocs#merge`).
2018-08-22 17:23:54 +02:00
Nhat Nguyen 262d3c0783
Allow engine to recover from translog upto a seqno (#33032)
This change allows an engine to recover from its local translog up to
the given seqno. The extended API can be used in these use cases:

When a replica starts following a new primary, it resets its index to
the safe commit, then replays its local translog up to the current
global checkpoint (see #32867).

When a replica starts a peer-recovery, it can initialize the
start_sequence_number to the persisted global checkpoint instead of the
local checkpoint of the safe commit. A replica will then replay its
local translog up to that global checkpoint before accepting remote
translog from the primary. This change will increase the chance of
operation-based recovery. I will make this in a follow-up.

Relates #32867
2018-08-22 07:57:44 -04:00
David Turner ab000323fa
Allow extension of CapturingTransport by subclasses (#33012)
Today, CapturingTransport#createCapturingTransportService creates a transport
service with a connection manager with reasonable default behaviours, but
overriding this behaviour in a consumer is a litle tricky. Additionally, the
default behaviour for opening a connection duplicates the content of the
CapturingTransport#openConnection() method.

This change removes this duplication by delegating to openConnection() and
introduces overridable nodeConnected() and onSendRequest() methods so that
consumers can alter this behaviour more easily.

Relates #32246 in which we test the mechanisms for opening connections to
unknown (and possibly unreachable) nodes.
2018-08-22 09:09:08 +01:00
Alpar Torok 82d10b484a
Run forbidden api checks with runtimeJavaVersion (#32947)
Run forbidden APIs checks with runtime hava version
2018-08-22 09:05:22 +03:00
Simon Willnauer 92076497e5
Use a dedicated ConnectionManger for RemoteClusterConnection (#32988)
This change introduces a dedicated ConnectionManager for every RemoteClusterConnection
such that there is not state shared with the TransportService internal ConnectionManager.
All connections to a remote cluster are isolated from the TransportService but still uses
the TransportService and it's internal properties like the Transport, tracing and internal
listener actions on disconnects etc.
This allows a remote cluster connection to have a different lifecycle than a local cluster connection,
also local discovery code doesn't get notified if there is a disconnect on from a remote cluster and
each connection can use it's own dedicated connection profile which allows to have a reduced set of
connections per cluster without conflicting with the local cluster.

Closes #31835
2018-08-21 12:43:25 +02:00
Tim Brooks cd83ddcecc
Fix assertion in AbstractSimpleTransportTestCase (#32991)
This is a follow-up to #32956. That commit incorrectly used assertBusy
which led to a possible race in the test. This commit fixes it.
2018-08-20 16:09:22 -06:00
Tim Brooks faa42de66d
Pass DiscoveryNode to initiateChannel (#32958)
This is related to #32517. This commit passes the DiscoveryNode to the
initiateChannel method for different Transport implementation. This
will allow additional attributes (besides just the socket address) to be
used when opening channels.
2018-08-20 08:54:55 -06:00
Alpar Torok 4b34b3f4aa
Set forbidden APIs target compatibility to compiler java version (#32935)
Set forbidden apis target compatibility to compiler version

Fix outstanding deprecation
2018-08-20 09:27:02 +03:00
Tim Brooks de92d2ef1f
Move connection listener to ConnectionManager (#32956)
This is a followup to #31886. After that commit the
TransportConnectionListener had to be propogated to both the
Transport and the ConnectionManager. This commit moves that listener
to completely live in the ConnectionManager. The request and response
related methods are moved to a TransportMessageListener. That listener
continues to live in the Transport class.
2018-08-18 10:09:24 -06:00
Tim Brooks 2464b68613
Move connection profile into connection manager (#32858)
This is related to #31835. It moves the default connection profile into
the ConnectionManager class. The will allow us to have different
connection managers with different profiles.
2018-08-15 09:08:33 -06:00
Lee Hinman 48281ac5bc
Use generic AcknowledgedResponse instead of extended classes (#32859)
This removes custom Response classes that extend `AcknowledgedResponse` and do nothing, these classes are not needed and we can directly use the non-abstract super-class instead.

While this appears to be a large PR, no code has actually changed, only class names have been changed and entire classes removed.
2018-08-15 08:06:14 -06:00
Ryan Ernst 0158b59a5a
Test: Fix forbidden uses in test framework (#32824)
This commit fixes existing uses of forbidden apis in the test framework
and re-enables the forbidden apis check. It was previously completely
disabled and had missed a rename of the forbidden apis signatures files.

closes #32772
2018-08-14 11:35:09 -07:00
Tim Brooks 10fddb62ee
Remove client connections from TcpTransport (#31886)
This is related to #31835. This commit adds a connection manager that
manages client connections to other nodes. This means that the
TcpTransport no longer maintains a map of nodes that it is connected
to.
2018-08-13 16:44:09 -06:00
Armin Braun d412230cda
SCRIPTING: Support BucketAggScript return null (#32811)
* As explained in #32790, `BucketAggregationScript` must support `null` as a return value
* Closes #32790
2018-08-13 20:08:26 +02:00
Nik Everett f5ba801c6b Test: Only sniff host metadata for node_selectors (#32750)
Our rest testing framework has support for sniffing the host metadata on
startup and, before this change, it'd sniff that metadata before running
the first test. This prevents running these tests against
elasticsearch installations that won't support sniffing like Elastic
Cloud. This change allows tests to only sniff for metadata when they
encounter a test with a `node_selector`. These selectors are the things
that need the metadata anyway and they are super rare. Tests that use
these won't be able to run against installations that don't support
sniffing but we can just skip them. In the case of Elastic Cloud, these
tests were never going to work against Elastic Cloud anyway.
2018-08-10 13:35:47 -04:00
Christoph Büscher 22f7b03430
Fix test reproducability in AbstractBuilderTestCase setup (#32403)
Currently AbstractBuilderTestCase generates certain random values in its
`beforeTest()` method annotated with @Before only the first time that a test
method in the suite is run while initializing the serviceHolder that we use for
the rest of the test. This changes the values of subsequent random values
and has the effect that when running single methods from a test suite with
"-Dtests.method=*", the random values it sees are different from when the same
test method is run as part of the whole test suite. This makes it hard to use
the reproduction lines logged on failure.

This change runs the inialization of the serviceHolder and the randomization 
connected to it using the test runners master seed, so reproduction by running
just one method is possible again.


Closes #32400
2018-08-10 15:13:44 +02:00
Boaz Leskes f58ed21720
Refactor TransportShardBulkAction to better support retries (#31821)
Processing bulk request goes item by item. Sometimes during processing, we need to stop execution and wait for a new mapping update to be processed by the node. This is currently achieved by throwing a `RetryOnPrimaryException`, which is caught higher up. When the exception is caught, we wait for the next cluster state to arrive and process the request again. Sadly this is a problem because all operations that were already done until the mapping change was required are applied again and get new sequence numbers. This in turn means that the previously issued sequence numbers are never replicated to the replicas. That causes the local checkpoint of those shards to be stuck and with it all the seq# based infrastructure.

This commit refactors how we deal with retries with the goal of removing  `RetryOnPrimaryException` and `RetryOnReplicaException` (not done yet). It achieves so by introducing a class `BulkPrimaryExecutionContext` that is used the capture the execution state and allows continuing from where the execution stopped. The class also formalizes the steps each item has to go through:
1) A translation phase for updates
2) Execution phase (always index/delete)
3) Waiting for a mapping update to come in, if needed
4) Requires a retry (for updates and cases where the mapping are still not available after the put mapping call returns)
5) A finalization phase which allows updates to the index/delete result to an update result.
2018-08-10 10:15:01 +02:00
Alpar Torok af8c23eb40
Java version reproduction (#32715)
Enhance reproduction line with info about jdks

Provide the ability to control compiler and hava versions just by
passing a property. The actual java home comes from the
`JAVA<major>_HOME` env vars that we allready require.
This works better with the Gradle daemon as well.

Output is also changed a bit.

for `-Druntime.java=8 -Dcompiler.java=9`:
```
=======================================
Elasticsearch Build Hamster says Hello!
  Gradle Version        : 4.9
  OS Info               : Linux 4.17.8-1-ARCH (amd64)
  Compiler JDK Version  : 11 (Oracle Corporation 11-ea [OpenJDK 64-Bit Server VM 11-ea+22])
  Runtime JDK Version   : 11 (Oracle Corporation 11-ea [OpenJDK 64-Bit Server VM 11-ea+22])
  Gradle JDK Version    : 10 (Oracle Corporation 10.0.1 [OpenJDK 64-Bit Server VM 10.0.1+10])
  Compiler java.home    : /home/alpar/opt/jdk-11-ea22/
  Runtime java.home     : /home/alpar/opt/jdk-11-ea22/
  Gradle java.home      : /usr/lib/jvm/java-10-openjdk
  Random Testing Seed   : EA858533191E8DFB
=======================================
```

Without configuration:
```
=======================================
Elasticsearch Build Hamster says Hello!
=======================================
  Gradle Version        : 4.9
  OS Info               : Linux 4.17.8-1-ARCH (amd64)
  JDK Version           : 10 (Oracle Corporation 10.0.1 [OpenJDK 64-Bit Server VM 10.0.1+10])
  JAVA_HOME             : /usr/lib/jvm/java-10-openjdk
  Random Testing Seed   : 4BD5B2A839C8FCA1
=======================================
```

Here's how a reproduction line will look like (test made to fail):
```
./gradlew :modules:lang-painless:test -Dtests.seed=2DA2379065A4EEAB -Dtests.class=org.elasticsearch.painless.AdditionTests -Dtests.method="testInt" -Dtests.security.manager=true -Dtests.locale=es-PE -Dtests.timezone=WET -Dcompiler.java=10 -Druntime.java=10
```
2018-08-10 08:07:43 +00:00
Armin Braun 79375d35bb
Scripting: Replace Update Context (#32096)
* SCRIPTING: Move Update Scripts to their own context
* Added system property for backwards compatibility of change to `ctx.params`
2018-08-09 14:32:36 +02:00
Jason Tedor dcc816427e
Expose whether or not the global checkpoint updated (#32659)
It will be useful for future efforts to know if the global checkpoint
was updated. To this end, we need to expose whether or not the global
checkpoint was updated when the state of the replication tracker
updates. For this, we add to the tracker a callback that is invoked
whenever the global checkpoint is updated. For primaries this will be
invoked when the computed global checkpoint is updated based on state
changes to the tracker. For replicas this will be invoked when the local
knowledge of the global checkpoint is advanced from the primary.
2018-08-07 15:10:09 -04:00
Tim Brooks 3d5e9114e3
Reduce connections used by MockNioTransport (#32620)
The MockNioTransport (similar to the MockTcpTransport) is used for integ
tests. The MockTcpTransport has always only opened a single for all of
its work. The MockNioTransport has awlays opened the default number of
connections (13). This means that every test where two transports
connect requires 26 connections. This is more than is necessary. This
commit modifies the MockNioTransport to only require 3 connections.
2018-08-07 12:52:28 -06:00
Lee Hinman b3e15851a2 [TEST] Comment out account breaker assertion while diagnosing
Relates to #30290
2018-08-07 09:36:37 -06:00
Armin Braun 6fa7016bbf
SCRIPTING: Move Aggregation Scripts to their own context (#32068)
* SCRIPTING: Move Aggregation Scripts to their own context
2018-08-04 10:37:07 +02:00
Yannick Welsch 0d60e8a029
Fix race between replica reset and primary promotion (#32442)
We've recently seen a number of test failures that tripped an assertion in IndexShard (see issues
linked below), leading to the discovery of a race between resetting a replica when it learns about a
higher term and when the same replica is promoted to primary. This commit fixes the race by
distinguishing between a cluster state primary term (called pendingPrimaryTerm) and a shard-level
operation term. The former is set during the cluster state update or when a replica learns about a
new primary. The latter is only incremented under the operation block, which can happen in a
delayed fashion. It also solves the issue where a replica that's still adjusting to the new term
receives a cluster state update that promotes it to primary, which can happen in the situation of
multiple nodes being shut down in short succession. In that case, the cluster state update thread
would call `asyncBlockOperations` in `updateShardState`, which in turn would throw an exception
as blocking permits is not allowed while an ongoing block is in place, subsequently failing the shard.
This commit therefore extends the IndexShardOperationPermits to allow it to queue multiple blocks
(which will all take precedence over operations acquiring permits). Finally, it also moves the primary
activation of the replication tracker under the operation block, so that the actual transition to
primary only happens under the operation block.

Relates to #32431, #32304 and #32118
2018-08-03 09:33:08 +02:00
Yannick Welsch db6e8c736d
Remove cluster state initial customs (#32501)
This infrastructure was introduced in #26144 and made obsolete in #30743
2018-08-02 15:49:59 +02:00
Jay Modi f2f33f3149 Use hostname instead of IP with SPNEGO test (#32514)
This change updates KerberosAuthenticationIT to resolve the host used
to connect to the test cluster. This is needed because the host could
be an IP address but SPNEGO requires a hostname to work properly. This
is done by adding a hook in ESRestTestCase for building the HttpHost
from the host and port.

Additionally, the project now specifies the IPv4 loopback address as
the http host. This is done because we need to be able to resolve the
address used for the HTTP transport before the node starts up, but the
http.ports file is not written until the node is started.

Closes #32498
2018-08-01 12:57:33 +10:00
Nik Everett 22459576d7
Logging: Make node name consistent in logger (#31588)
First, some background: we have 15 different methods to get a logger in
Elasticsearch but they can be broken down into three broad categories
based on what information is provided when building the logger.

Just a class like:
```
private static final Logger logger = ESLoggerFactory.getLogger(ActionModule.class);
```
or:
```
protected final Logger logger = Loggers.getLogger(getClass());
```

The class and settings:
```
this.logger = Loggers.getLogger(getClass(), settings);
```

Or more information like:
```
Loggers.getLogger("index.store.deletes", settings, shardId)
```

The goal of the "class and settings" variant is to attach the node name
to the logger. Because we don't always have the settings available, we
often use the "just a class" variant and get loggers without node names
attached. There isn't any real consistency here. Some loggers get the
node name because it is convenient and some do not.

This change makes the node name available to all loggers all the time.
Almost. There are some caveats are testing that I'll get to. But in
*production* code the node name is node available to all loggers. This
means we can stop using the "class and settings" variants to fetch
loggers which was the real goal here, but a pleasant side effect is that
the ndoe name is now consitent on every log line and optional by editing
the logging pattern. This is all powered by setting the node name
statically on a logging formatter very early in initialization.

Now to tests: tests can't set the node name statically because
subclasses of `ESIntegTestCase` run many nodes in the same jvm, even in
the same class loader. Also, lots of tests don't run with a real node so
they don't *have* a node name at all. To support multiple nodes in the
same JVM tests suss out the node name from the thread name which works
surprisingly well and easy to test in a nice way. For those threads
that are not part of an `ESIntegTestCase` node we stick whatever useful
information we can get form the thread name in the place of the node
name. This allows us to keep the logger format consistent.
2018-07-31 10:54:24 -04:00
Luca Cavanna 9a4d0069f6
REST high-level client: parse back _ignored meta field (#32362)
`GetResult` and `SearchHit` have been adjusted to parse back the `_ignored` meta field whenever it gets printed out. Expanded the existing tests to make sure this is covered. Fixed also a small problem around highlighted fields in `SearchHitTests`.
2018-07-30 13:43:40 +02:00
Armin Braun 1628c833c7
TESTS: Move netty leak detection to paranoid level (#32354) 2018-07-26 21:36:49 +02:00
Jim Ferenczi 8e5f281b27
AbstractQueryTestCase should run without type less often (#28936)
This commit changes the randomization to always create an index with a type.
It also adds a way to create a query shard context that maps to an index with
no type registered in order to explicitely test cases where there is no type.
2018-07-26 20:29:05 +02:00
Jason Tedor eb675a1c4d
Introduce index store plugins (#32375)
Today we allow plugins to add index store implementations yet we are not
doing this in our new way of managing plugins as pull versus push. That
is, today we still allow plugins to push index store providers via an on
index module call where they can turn around and add an index
store. Aside from being inconsistent with how we manage plugins today
where we would look to pull such implementations from plugins at node
creation time, it also means that we do not know at a top-level (for
example, in the indices service) which index stores are available. This
commit addresses this by adding a dedicated plugin type for index store
plugins, removing the index module hook for adding index stores, and by
aggregating these into the top-level of the indices service.
2018-07-26 08:05:49 -04:00
Tim Vernum 387c3c7f1d Introduce Application Privileges with support for Kibana RBAC (#32309)
This commit introduces "Application Privileges" to the X-Pack security
model.

Application Privileges are managed within Elasticsearch, and can be
tested with the _has_privileges API, but do not grant access to any
actions or resources within Elasticsearch. Their purpose is to allow
applications outside of Elasticsearch to represent and store their own
privileges model within Elasticsearch roles.

Access to manage application privileges is handled in a new way that
grants permission to specific application names only. This lays the
foundation for more OLS on cluster privileges, which is implemented by
allowing a cluster permission to inspect not just the action being
executed, but also the request to which the action is applied.
To support this, a "conditional cluster privilege" is introduced, which
is like the existing cluster privilege, except that it has a Predicate
over the request as well as over the action name.

Specifically, this adds
- GET/PUT/DELETE actions for defining application level privileges
- application privileges in role definitions
- application privileges in the has_privileges API
- changes to the cluster permission class to support checking of request
  objects
- a new "global" element on role definition to provide cluster object
  level security (only for manage application privileges)
- changes to `kibana_user`, `kibana_dashboard_only_user` and
  `kibana_system` roles to use and manage application privileges

Closes #29820
Closes #31559
2018-07-24 10:34:46 -06:00
Daniel Mitterdorfer 73a38895fd
Add Restore Snapshot High Level REST API
With this commit we add the restore snapshot API to the Java high level
REST client.

Relates #27205
Relates #32155
2018-07-24 16:17:09 +02:00
Ioannis Kakavas a2dbd83db1
Allow Integ Tests to run in a FIPS-140 JVM (#31989)
* Complete changes for running IT in a fips JVM

- Mute :x-pack:qa:sql:security:ssl:integTest as it
  cannot run in FIPS 140 JVM until the SQL CLI supports key/cert.
- Set default JVM keystore/truststore password in top level build
  script for all integTest tasks in a FIPS 140 JVM
- Changed top level x-pack build script to use keys and certificates
  for trust/key material when spinning up clusters for IT
2018-07-24 12:48:14 +03:00
Andrey Ershov 33f11e637d
Fail shard if IndexShard#storeStats runs into an IOException (#32241)
Fail shard if IndexShard#storeStats runs into an IOException. Closes #29008
2018-07-23 16:38:55 +02:00
Christoph Büscher ff87b7aba4
Remove unnecessary warning supressions (#32250) 2018-07-23 11:31:04 +02:00
Armin Braun 24068a773d
TESTS: Check for Netty resource leaks (#31861)
* Enabled advanced leak detection when loading `EsTestCase`
* Added custom `Appender` to collect leak logs and check for logged errors in a way similar to what is done for the `StatusLogger`
* Fixes #20398
2018-07-20 09:12:32 +02:00
Julie Tibshirani 15ff3da653
Add support for field aliases. (#32172)
* Add basic support for field aliases in index mappings. (#31287)
* Allow for aliases when fetching stored fields. (#31411)
* Add tests around accessing field aliases in scripts. (#31417)
* Add documentation around field aliases. (#31538)
* Add validation for field alias mappings. (#31518)
* Return both concrete fields and aliases in DocumentFieldMappers#getMapper. (#31671)
* Make sure that field-level security is enforced when using field aliases. (#31807)
* Add more comprehensive tests for field aliases in queries + aggregations. (#31565)
* Remove the deprecated method DocumentFieldMappers#getFieldMapper. (#32148)
2018-07-18 09:33:09 -07:00
Boaz Leskes 5856c396dd
A replica can be promoted and started in one cluster state update (#32042)
When a replica is fully recovered (i.e., in `POST_RECOVERY` state) we send a request to the master
to start the shard. The master changes the state of the replica and publishes a cluster state to that
effect. In certain cases, that cluster state can be processed on the node hosting the replica
*together* with a cluster state that promotes that, now started, replica to a primary. This can
happen due to cluster state batched processing or if the master died after having committed the
cluster state that starts the shard but before publishing it to the node with the replica. If the master
also held the primary shard, the new master node will remove the primary (as it failed) and will also
immediately promote the replica (thinking it is started). 

Sadly our code in IndexShard didn't allow for this which caused [assertions](13917162ad/server/src/main/java/org/elasticsearch/index/seqno/ReplicationTracker.java (L482)) to be tripped in some of our tests runs.
2018-07-18 11:30:44 +02:00
Boaz Leskes 93d7468f3a ESIndexLevelReplicationTestCase doesn't support replicated failures but it's good to know what they are
Sometimes we have a test failure that hits an `UnsupportedOperationException` in this infrastructure. When
debugging you want to know what caused this unexpected failure, but right now we're silent about it. This
commit adds some information to the `UnsupportedOperationException`

Relates to #32127
2018-07-18 08:49:16 +02:00
Nhat Nguyen df1380b8d3
Remove versionType from translog (#31945)
With the introduction of sequence number, we no longer use versionType to
resolve out of order collision in replication and recovery requests.

This PR removes removes the versionType from translog. We can only remove
it in 7.0 because it is still required in a mixed cluster between 6.x and 5.x.
2018-07-17 21:59:48 -04:00
Ioannis Kakavas 9e529d9d58
Enable testing in FIPS140 JVM (#31666)
Ensure our tests can run in a FIPS JVM

JKS keystores cannot be used in a FIPS JVM as attempting to use one
in order to init a KeyManagerFactory or a TrustManagerFactory is not
allowed.( JKS keystore algorithms for private key encryption are not
FIPS 140 approved)
This commit replaces JKS keystores in our tests with the
corresponding PEM encoded key and certificates both for key and trust
configurations.
Whenever it's not possible to refactor the test, i.e. when we are
testing that we can load a JKS keystore, etc. we attempt to
mute the test when we are running in FIPS 140 JVM. Testing for the
JVM is naive and is based on the name of the security provider as
we would control the testing infrastrtucture and so this would be
reliable enough.
Other cases of tests being muted are the ones that involve custom
TrustStoreManagers or KeyStoreManagers, null TLS Ciphers and the
SAMLAuthneticator class as we cannot sign XML documents in the
way we were doing. SAMLAuthenticator tests in a FIPS JVM can be
reenabled with precomputed and signed SAML messages at a later stage.

IT will be covered in a subsequent PR
2018-07-17 10:54:10 +03:00
Daniel Mitterdorfer 016e8760f0
Turn off real-mem breaker in single node tests
With this commit we disable the real-memory circuit breaker in tests
that inherit from `ESSingleNodeTestCase`. As this breaker is based on
real memory usage over which we have no (full) control in tests and
their purpose is also not to test the circuit breaker, we use the
deterministic circuit breaker implementation that only accounts for
explicitly reserved memory.

Closes #32047
Relates #32071
2018-07-16 10:40:36 +02:00
Armin Braun 3679d00a74
Replace Ingest ScriptContext with Custom Interface (#32003)
* Replace Ingest ScriptContext with Custom Interface
* Make org.elasticsearch.ingest.common.ScriptProcessorTests#testScripting more precise
* Don't mock script factory in ScriptProcessorTests
* Adjust mock script plugin in IT for new API
2018-07-13 23:26:10 +02:00
Vladimir Dolzhenko b1bf643e41
lazy snapshot repository initialization (#31606)
lazy snapshot repository initialization
2018-07-13 20:05:49 +02:00
Colin Goodheart-Smithe 0edb096eb4 Adds a new auto-interval date histogram (#28993)
* Adds a new auto-interval date histogram

This change adds a new type of histogram aggregation called `auto_date_histogram` where you can specify the target number of buckets you require and it will find an appropriate interval for the returned buckets. The aggregation works by first collecting documents in buckets at second interval, when it has created more than the target number of buckets it merges these buckets into minute interval bucket and continues collecting until it reaches the target number of buckets again. It will keep merging buckets when it exceeds the target until either collection is finished or the highest interval (currently years) is reached. A similar process happens at reduce time.

This aggregation intentionally does not support min_doc_count, offest and extended_bounds to keep the already complex logic from becoming more complex. The aggregation accepts sub-aggregations but will always operate in `breadth_first` mode deferring the computation of sub-aggregations until the final buckets from the shard are known. min_doc_count is effectively hard-coded to zero meaning that we will insert empty buckets where necessary.

Closes #9572

* Adds documentation

* Added sub aggregator test

* Fixes failing docs test

* Brings branch up to date with master changes

* trying to get tests to pass again

* Fixes multiBucketConsumer accounting

* Collects more buckets than needed on shards

This gives us more options at reduce time in terms of how we do the
final merge of the buckeets to produce the final result

* Revert "Collects more buckets than needed on shards"

This reverts commit 993c782d117892af9a3c86a51921cdee630a3ac5.

* Adds ability to merge within a rounding

* Fixes nonn-timezone doc test failure

* Fix time zone tests

* iterates on tests

* Adds test case and documentation changes

Added some notes in the documentation about the intervals that can bbe
returned.

Also added a test case that utilises the merging of conseecutive buckets

* Fixes performance bug

The bug meant that getAppropriate rounding look a huge amount of time
if the range of the data was large but also sparsely populated. In
these situations the rounding would be very low so iterating through
the rounding values from the min key to the max keey look a long time
(~120 seconds in one test).

The solution is to add a rough estimate first which chooses the
rounding based just on the long values of the min and max keeys alone
but selects the rounding one lower than the one it thinks is
appropriate so the accurate method can choose the final rounding taking
into account the fact that intervals are not always fixed length.

Thee commit also adds more tests

* Changes to only do complex reduction on final reduce

* merge latest with master

* correct tests and add a new test case for 10k buckets

* refactor to perform bucket number check in innerBuild

* correctly derive bucket setting, update tests to increase bucket threshold

* fix checkstyle

* address code review comments

* add documentation for default buckets

* fix typo
2018-07-13 13:08:35 -04:00
Daniel Mitterdorfer f174f72fee
Circuit-break based on real memory usage
With this commit we introduce a new circuit-breaking strategy to the parent
circuit breaker. Contrary to the current implementation which only accounts for
memory reserved via child circuit breakers, the new strategy measures real heap
memory usage at the time of reservation. This allows us to be much more
aggressive with the circuit breaker limit so we bump it to 95% by default. The
new strategy is turned on by default and can be controlled  with the new cluster
setting `indices.breaker.total.userealmemory`.

Note that we turn it off for all integration tests with an internal test cluster
because it leads to spurious test failures which are of no value (we cannot
fully control heap memory usage in tests). All REST tests, however, will make
use of the real memory circuit breaker.

Relates #31767
2018-07-13 10:08:28 +02:00
olcbean 334c255516 XContentTests : Insert random fields at random positions (#30867)
Currently AbstractXContentTestCase#testFromXContent appends random fields, but in
a fixed position. This PR shuffles all fields after the random fields have been appended, 
hence the random fields are actually added to random positions.
2018-07-12 19:10:51 +02:00
Nik Everett 38e09a1508
Switch test framework to new style requests (#31939)
In #29623 we added `Request` object flavored requests to the low level
REST client and in #30315 we deprecated the old `performRequest`s. This
changes all calls in the `test/framework` project to use the new
versions.
2018-07-11 10:04:17 -04:00
Armin Braun b4087d69d2
Fix assertIngestDocument wrongfully passing (#31913)
* Fix assertIngestDocument wrongfully passing

* Previously docA being subset of docB passed because iteration was over docA's keys only
* Scalars in nested fields were not compared in all cases
* Assertion errors were hard to interpret (message wasn't correct since it only mentioned the class type)
* In cases where two paths contained different types a ClassCastException was thrown instead of an AssertionError
* Fixes #28492
2018-07-11 10:24:21 +02:00
Alexander Reelsen 1c32497c44
Date: Add DateFormatters class that uses java.time (#31856)
A newly added class called DateFormatters now contains java.time based
builders for dates, which also intends to be fully backwards compatible,
when the name based date formatters are picked. Also a new class named 
CompoundDateTimeFormatter for being able to parse multiple different 
formats has been added.

A duelling test class has been added that ensures the same dates when
parsing java or joda time formatted dates for the name based dates.

Note, that java.time and joda time are not fully backwards compatible,
which also means that old formats will currently not work with this
setup.
2018-07-10 09:28:28 +02:00
Yannick Welsch cce7dc20ad
Smaller aesthetic fixes to InternalTestCluster (#31831)
Allows cluster to auto-reconfigure faster by starting up nodes in parallel.
2018-07-06 11:42:09 +02:00
Nik Everett 1099060735
Test: Do not remove xpack templates when cleaning (#31642)
At the end of every `ESRestTestCase` we clean the cluster which includes
deleting all of the templates. If xpack is installed it'll automatically
recreate a few templates every time they are removed. Which is slow.

This change stops the cleanup from removing the xpack templates. It cuts
the time to run the docs tests more than in half and it probably saves a
bit more time on other tests as well.
2018-07-05 09:43:43 -04:00
Christoph Büscher bd1c513422
Reduce more raw types warnings (#31780)
Similar to #31523.
2018-07-05 15:38:06 +02:00
Simon Willnauer 3f2a241b7f
Detach Transport from TransportService (#31727)
Today TransportService is tightly coupled with Transport since it
requires an instance of TransportService in order to receive responses
and send requests. This is mainly due to the Request and Response handlers
being maintained in TransportService but also because of the lack of a proper 
callback interface.

This change moves request handler registry and response handler registration into
Transport and adds all necessary methods to `TransportConnectionListener` in order
to remove the `TransportService` dependency from `Transport`
Transport now accepts one or more `TransportConnectionListener` instances that are
executed sequentially in a blocking fashion.
2018-07-04 11:32:35 +02:00
Yannick Welsch 2bb4f38371
Add write*Blob option to replace existing blob (#31729)
Adds a new parameter to the BlobContainer#write*Blob methods to specify whether the existing file
should be overridden or not. For some metadata files in the repository, we actually want to replace
the current file. This is currently implemented through an explicit blob delete and then a fresh write.
In case of using a cloud provider (S3, GCS, Azure), this results in 2 API requests instead of just 1.
This change will therefore allow us to achieve the same functionality using less API requests.
2018-07-03 09:13:50 +02:00
Christoph Büscher 31aabe4bf9
Clean up double semicolon code typos (#31687) 2018-07-02 15:14:44 +02:00
Konrad Beiske 2971dd56ca Enable setting client path prefix to / (#30119)
Some proxies require all requests to have paths starting with / since
there are no relative paths at the HTTP connection level. Elasticsearch
assumes paths are absolute. In order to run rest tests against a cluster
behind such a proxy, set the system property
tests.rest.client_path_prefix to /.
2018-07-01 13:42:03 -04:00
Tanguy Leroux d8b3f332ef
Remove extra check for object existence in repository-gcs read object (#31661) 2018-06-29 13:52:31 +02:00
Tanguy Leroux 0ef22db844
[Test] Clean up some repository-s3 tests (#31601)
This commit removes some tests in the repository-s3 plugin that 
have not been executed for 2+ years but have been maintained 
for nothing. Most of the tests in AbstractAwsTestCase were 
obsolete or superseded by fixture based integration tests.
2018-06-29 13:21:29 +02:00
Ryan Ernst f924835265
Core: Require all actions have a Task (#31627)
The TaskManager and TaskAwareRequest could return null when registering
a task according to their javadocs, but no implementations ever actually
did that. This commit removes that wording from the javadocs and ensures
null is no longer allowed.
2018-06-28 08:24:03 -07:00
Alpar Torok 8557bbab28
Upgrade gradle wrapper to 4.8 (#31525)
* Move to Gradle 4.8 RC1

* Use latest version of plugin

The current does not work with Gradle 4.8 RC1

* Switch to Gradle GA

* Add and configure build compare plugin

* add work-around for https://github.com/gradle/gradle/issues/5692

* work around https://github.com/gradle/gradle/issues/5696

* Make use of Gradle build compare with reference project

* Make the manifest more compare friendly

* Clear the manifest in compare friendly mode

* Remove animalsniffer from buildscript classpath

* Fix javadoc errors

* Fix doc issues

* reference Gradle issues in comments

* Conditionally configure build compare

* Fix some more doclint issues

* fix typo in build script

* Add sanity check to make sure the test task was replaced

Relates to #31324. It seems like Gradle has an inconsistent behavior and
the taks is not always replaced.

* Include number of non conforming tasks in the exception.

* No longer replace test task, create implicit instead

Closes #31324. The issue has full context in comments.

With this change the `test` task becomes nothing more than an alias for `utest`.
Some of the stand alone tests that had a `test` task now have `integTest`, and a
few of them that used to have `integTest` to run multiple tests now only
have `check`.
This will also help separarate unit/micro tests from integration tests.

* Revert "No longer replace test task, create implicit instead"

This reverts commit f1ebaf7d93e4a0a19e751109bf620477dc35023c.

* Fix replacement of the test task

Based on information from gradle/gradle#5730 replace the task taking
into account the task providres.
Closes #31324.

* Only apply build comapare plugin if needed

* Make sure test runs before integTest

* Fix doclint aftter merge

* PR review comments

* Switch to Gradle 4.8.1 and remove workaround

* PR review comments

* Consolidate task ordering
2018-06-28 08:13:21 +03:00
Luca Cavanna a35b5341c4
[TEST] call yaml client close method from test suite (#31591)
We added a way to close the yaml test client with #31575.
Such close method also needs to be called from the test suite though for
the additional clients to be closed.
2018-06-27 08:23:53 +02:00
Luca Cavanna 823a9d34da
[TEST] Close additional clients created while running yaml tests (#31575)
We recently introduced a mechanism that allows to specify a node
selector as part of do sections (see #31471). When a node selector that
is not the default one is configured, a new client will be initialized
with the same properties as the default one, but with the specified
node selector. This commit improves such mechanism but also closing
the additional clients being created and adding equals/hashcode impl to
the custom node selector as they are cached into a map.
2018-06-26 16:56:35 +02:00
Alpar Torok 08b8d11e30
Add support for switching distribution for all integration tests (#30874)
* remove left-over comment

* make sure of the property for plugins

* skip installing modules if these exist in the distribution

* Log the distrbution being ran

* Don't allow running with integ-tests-zip passed externally

* top level x-pack/qa can't run with oss distro

* Add support for matching objects in lists

Makes it possible to have a key that points to a list and assert that a
certain object is present in the list. All keys have to be present and
values have to match. The objects in the source list may have additional
fields.

example:
```
  match:  { 'nodes.$master.plugins': { name: ingest-attachment }  }
```

* Update plugin and module tests to work with other distributions

Some of the tests expected that the integration tests will always be ran
with  the `integ-test-zip` distribution so that there will be no other
plugins loaded.

With this change, we check for the presence of the plugin without
assuming exclusivity.

* Allow modules to run on other distros as well

To match the behavior of tets.distributions

* Add and use a new `contains` assertion

Replaces the  previus changes that caused `match` to do a partial match.

* Implement PR review comments
2018-06-26 06:49:03 -07:00
Nik Everett 232c71b6bf
QA: Create xpack yaml features (#31403)
This creates a YAML test "features" that indices if the cluster being
tested has xpack installed (`xpack`) or if it does *not* have xpack
installed (`no_xpack`). It uses those features to centralize skipping
a few tests that fail if xpack is installed.

The plan is to use this in a followup to skip docs tests that require
xpack when xpack is not installed. We *plan* to use the declaration
of required license level on the docs page to generate the required
`skip`.

Closes #30933.
2018-06-26 09:26:48 -04:00
Sohaib Iftikhar ca4c857a90 Improve test times for tests using `RandomObjects::addFields` (#31556)
Currently RandomObjects::addFields can potentially generate a large number of fields This commit decreases the chances that a new object or array is added as a new branch of an object, which lowers the probability of ending up with very big documents generated. It also reduces the number of documents generated for the SimulatePipelineResponseTests from 10 to 5 to reduce the testing time required for parsing.
2018-06-26 12:39:53 +02:00
Christoph Büscher 86ab3a2d1a
Reduce number of raw types warnings (#31523)
A first attempt to reduce the number of raw type warnings, 
most of the time by using the unbounded wildcard.
2018-06-25 15:59:03 +02:00
Jonathan Little 8e4768890a Migrate scripted metric aggregation scripts to ScriptContext design (#30111)
* Migrate scripted metric aggregation scripts to ScriptContext design #29328

* Rename new script context container class and add clarifying comments to remaining references to params._agg(s)

* Misc cleanup: make mock metric agg script inner classes static

* Move _score to an accessor rather than an arg for scripted metric agg scripts

This causes the score to be evaluated only when it's used.

* Documentation changes for params._agg -> agg

* Migration doc addition for scripted metric aggs _agg object change

* Rename "agg" Scripted Metric Aggregation script context variable to "state"

* Rename a private base class from ...Agg to ...State that I missed in my last commit

* Clean up imports after merge
2018-06-25 12:01:33 +01:00
Vladimir Dolzhenko f04c579203
IndexShard should not return null stats (#31528)
IndexShard should not return null stats - empty stats or AlreadyCloseException if it's closed is better
2018-06-22 21:08:11 +02:00
Luca Cavanna 16e4e7a7cf
Node selector per client rather than per request (#31471)
We have made node selectors configurable per request, but all 
of other language clients don't allow for that.
A good reason not to do so, is that having a different node selector 
per request breaks round-robin. This commit makes NodeSelector 
configurable only at client initialization. It also improves the docs 
on this matter, important given that a single node selector can still 
affect round-robin.
2018-06-22 17:15:29 +02:00
Ryan Ernst 59e7c6411a
Core: Combine messageRecieved methods in TransportRequestHandler (#31519)
TransportRequestHandler currently contains 2 messageReceived methods,
one which takes a Task, and one that does not. The first just delegates
to the second. This commit changes all existing implementors of
TransportRequestHandler to implement the version which takes Task, thus
allowing the class to be a functional interface, and eliminating the
need to throw exceptions when a task needs to be ensured.
2018-06-22 07:36:03 -07:00
Yannick Welsch f22f91c57a
Allow multiple unicast host providers (#31509)
Introduces support for multiple host providers, which allows the settings based hosts resolver to be
treated just as any other UnicastHostsProvider. Also introduces the notion of a HostsResolver so
that plugins such as FileBasedDiscovery do not need to create their own thread pool for resolving
hosts, making it easier to add new similar kind of plugins.
2018-06-22 15:31:23 +02:00
Yannick Welsch da69ab28c7
Return transport addresses from UnicastHostsProvider (#31426)
With #20695 we removed local transport and there is just TransportAddress now. The
UnicastHostsProvider currently returns DiscoveryNode instances, where, during pinging, we're
actually only making use of the TransportAddress to establish a first connection to the possible new 
node. To simplify the interface, we can just return a list of transport addresses instead, which
means that it's not necessary anymore to create fake node objects in each plugin just to return the
address information.
2018-06-21 16:00:26 +02:00
Tim Brooks 86423f9563
Ensure local addresses aren't null (#31440)
Currently we set local addresses on the creation time of a NioChannel.
However, this may return null as the local address may not have been
set yet. An example is the local address has not been set on a client
channel as the connection process is not yet complete.

This PR modifies the getter to set the local field if it is currently null.
2018-06-20 19:50:14 -06:00
Ryan Ernst 00283a61e1
Remove unused generic type for client execute method (#31444)
This commit removes the request builder generic type for AbstractClient
as it was unused.
2018-06-20 16:26:26 -07:00
Tim Brooks 9ab1325953
Introduce http and tcp server channels (#31446)
Historically in TcpTransport server channels were represented by the
same channel interface as socket channels. This was necessary as
TcpTransport was parameterized by the channel type. This commit
introduces TcpServerChannel and HttpServerChannel classes. Additionally,
it adds the implementations for the various transports. This allows
server channels to have unique functionality and not implement the
methods they do not support (such as send and getRemoteAddress).

Additionally, with the introduction of HttpServerChannel this commit
extracts some of the storing and closing channel work to the abstract
http server transport.
2018-06-20 16:34:56 -06:00
Nhat Nguyen db1b97fd85
Remove QueryCachingPolicy#ALWAYS_CACHE (#31451)
The QueryCachingPolicy#ALWAYS_CACHE was deprecated in Lucene-7.4 and
will be removed in Lucene-8.0. This change replaces it with QueryCachingPolicy.
This also makes INDEX_QUERY_CACHE_EVERYTHING_SETTING visible in testing only.
2018-06-20 10:34:08 -04:00
Tim Brooks 529e704b11
Unify http channels and exception handling (#31379)
This is a general cleanup of channels and exception handling in http.
This commit introduces a CloseableChannel that is a superclass of
TcpChannel and HttpChannel. This allows us to unify the closing logic
between tcp and http transports. Additionally, the normal http channels
are extracted to the abstract server transport.

Finally, this commit (mostly) unifies the exception handling between nio
and netty4 http server transports.
2018-06-19 11:50:03 -06:00
Ryan Ernst e67aa96c81
Core: Combine Action and GenericAction (#31405)
Since #30966, Action no longer has anything but a call to the
GenericAction super constructor. This commit renames GenericAction
into Action, thus eliminating the Action class. Additionally, this
commit removes the Request generic parameter of the class, since
it was unused.
2018-06-18 23:53:04 +02:00
Simon Willnauer 3d5f113ada
Ensure we don't use a remote profile if cluster name matches (#31331)
If we are running into a race condition between a node being configured
to be a remote node for cross cluster search etc. and that node joining
the cluster we might connect to that node with a remote profile. If that
node now joins the cluster it connected to it as a CCS remote node we use
the wrong profile and can't use bulk connections etc. anymore. This change
uses the remote profile only if we connect to a node that has a different cluster
name than the local cluster. This is not a perfect fix for this situation but
is the safe option while potentially only loose a small optimization of using
less connections per node which is small anyways since we only connect to a
small set of nodes.

Closes #29321
2018-06-17 13:32:53 +02:00
Tal Levy 3b70e943eb
add is-write-index flag to aliases (#30942)
This commit adds the is-write-index flag for aliases.
It allows requests to set the flag, and responses to display the flag.
It does not validate and/or affect any indexing/getting/updating behavior
of Elasticsearch -- this will be done in a follow-up PR.
2018-06-15 08:45:29 -07:00
Nhat Nguyen 8453ca638d
Upgrade to Lucene-7.4.0-snapshot-518d303506 (#31360) 2018-06-15 10:58:21 -04:00
Nik Everett 856936c286
REST Client: NodeSelector for node attributes (#31296)
Add a `NodeSelector` so that users can filter the nodes that receive
requests based on node attributes.

I believe we'll need this to backport #30523 and we want it anyway.

I also added a bash script to help with rebuilding the sniffer parsing
test documents.
2018-06-15 08:04:54 -04:00
Nhat Nguyen e5b7137508
TEST: getCapturedRequestsAndClear should be atomic (#31312)
We might lose messages between getCapturedRequestsAndClear calls.
This commit makes sure that both getCapturedRequestsAndClear and
getCapturedRequestsByTargetNodeAndClear are atomic.
2018-06-14 21:32:07 -04:00
Tim Brooks fcf1e41e42
Extract common http logic to server (#31311)
This is related to #28898. With the addition of the http nio transport,
we now have two different modules that provide http transports.
Currently most of the http logic lives at the module level. However,
some of this logic can live in server. In particular, some of the
setting of headers, cors, and pipelining. This commit begins this moving
in that direction by introducing lower level abstraction (HttpChannel,
HttpRequest, and HttpResonse) that is implemented by the modules. The
higher level rest request and rest channel work can live entirely in
server.
2018-06-14 15:10:02 -06:00
Tanguy Leroux bbfe1eccc7
[Tests] Mutualize fixtures code in BaseHttpFixture (#31210)
Many fixtures have similar code for writing the pid & ports files or
for handling HTTP requests. This commit adds an AbstractHttpFixture 
class in the test framework that can be extended for specific testing purposes.
2018-06-14 14:09:56 +02:00
Tanguy Leroux 4d7447cb5e
Reenable Checkstyle's unused import rule (#31270) 2018-06-14 09:52:46 +02:00
Nik Everett 77bb93557e
Test: Remove broken yml test feature (#31255)
The `requires_replica` yaml test feature hasn't worked for years. This
is what happens if you try to use it:
```
   > Throwable #1: java.lang.NullPointerException
   >    at __randomizedtesting.SeedInfo.seed([E6602FB306244B12:6E341069A8D826EA]:0)
   >    at org.elasticsearch.test.rest.yaml.Features.areAllSupported(Features.java:58)
   >    at org.elasticsearch.test.rest.yaml.section.SkipSection.skip(SkipSection.java:144)
   >    at org.elasticsearch.test.rest.yaml.ESClientYamlSuiteTestCase.test(ESClientYamlSuiteTestCase.java:321)
```

None of our tests use it.
2018-06-13 09:33:06 -04:00
Ryan Ernst a65b18f19d Core: Remove plain execute method on TransportAction (#30998)
TransportAction has many variants of execute. One of those variants
executes by returning a future, which is then often blocked on by
calling get(). This commit removes this variant of execute, instead
using a helper method for tests that want to block, or having tests
pass in a PlainActionFuture directly as a listener.

Co-authored-by: Simon Willnauer <simonw@apache.org>
2018-06-13 09:58:13 +02:00