2090 Commits

Author SHA1 Message Date
Luca Cavanna fd7cde88db Mute failing RolloverIT#testRolloverWithDateMath
Relates to #37037
2018-12-31 11:51:50 +01:00
Armin Braun 85be9d6a89
SNAPSHOT: Deterministic ClusterState Tests (#36644)
* Use `DeterministicTaskQueue` infrastructure to test `SnapshotsService`
2018-12-31 11:17:21 +01:00
Luca Cavanna adb957b5aa Mute failing DateMathExpressionResolverTests tests
Relates to #37037
2018-12-31 10:37:10 +01:00
Luca Cavanna d3f1fe46d3 Increase await timeouts in RemoteClusterServiceTests
Closes #33852
2018-12-28 17:03:40 +01:00
Armin Braun 4ac8fc6906
Force Refresh Listeners when Acquiring all Operation Permits (#36835)
* Fixes the issue reproduced in the added tests:
   * When having open index requests on a shard that are waiting for a refresh, relocating that shard
becomes blocked until that refresh happens (which could be never as in the test scenario).
2018-12-28 16:42:51 +01:00
Luca Cavanna cb6bac3f88
Skip final reduction if SearchRequest holds a cluster alias (#37000)
With #36997 we added the ability to provide a cluster alias with a
SearchRequest.

The next step is to disable the final reduction whenever a cluster alias
is provided with the SearchRequest. A cluster alias will be provided
when executing a cross-cluster search request with alternate execution
mode, where each cluster does its own reduction locally. In order for
the CCS node to be able to later perform an additional reduction of the
results, we need to make sure that all the needed info stays available.
This means that terms aggregations can be reduced but not pruned, and
pipeline aggs should not be executed. The final reduction will happen
later in the CCS coordinating node.

Relates to #36997 & #32125
2018-12-28 14:58:20 +01:00
Armin Braun 34d22f378d TESTS: Mute testSnapshotCanceledOnRemovedShard
* relates #37005
2018-12-28 14:20:45 +01:00
Luca Cavanna 51fe20e0c3
Add support for local cluster alias to SearchRequest (#36997)
With the upcoming cross-cluster search alternate execution mode, the CCS
node will be able to split a CCS request into multiple search requests,
one per remote cluster involved. In order to do that, the CCS node has
to be able to signal to each remote cluster that such sub-requests are
part of a CCS request. Each cluster does not know about the other
clusters involved, and does not know either what alias it is given in
the CCS node, hence the CCS coordinating node needs to be able to provide
the alias as part of the search request so that it is used as index prefix
in the returned search hits.

The cluster alias is a notion that's already supported in the search shards
iterator and search shard target, but it is currently used in CCS as both
index prefix and connection lookup key when fanning out to all the shards.
With CCS alternate execution mode the provided cluster alias needs to be
used only as index prefix, as shards are local to each cluster hence no
cluster alias should be used for connection lookups.

The local cluster alias can be set to the SearchRequest at the transport layer
only, and its constructor/getter methods are package private.

Relates to #32125
2018-12-28 12:43:25 +01:00
Yannick Welsch 935c2e98b0
Zen2: Turn to follower on follower check when no state accepted yet from new leader (#37003)
Improves on #36449 which did not cover the situation where a node had bumped its term during
the election, and not when receiving the first follower check. This was uncovered while refactoring
NodeJoinTests so that they no longer need access to an internal field of Coordinator (which
can now be made private).
2018-12-28 08:37:04 +01:00
Andrey Ershov 76da29ba84
Switch testClusterJoinDespiteOfPublishingIssues to Zen2 (#36979) 2018-12-27 22:19:20 +01:00
Andrey Ershov 6259ccb1cf
Move ensureAtomicMoveSupported to NodeEnvironment (#36975)
Atomic move support is needed not only for GatewayMetaState to work correctly
2018-12-27 22:18:47 +01:00
Armin Braun 6aae7c8516
SNAPSHOT+TESTS: Speed up Snapshot IT (#36990)
* This speeds up the test from an average 25s down to 7s runtime
* There is no need for artificially slowing down the snapshot to reproduce the issue of an out of sync routing table in practice.
Over hundreds of test runs, the test's snapshot shard service still runs into the index-not-found exception every time, reproducing this issue.
* Relates #36294
2018-12-27 12:18:17 +01:00
Jim Ferenczi b0d2957e10 [TEST] Fix possible identical mutation in SearchHitTests#mutate 2018-12-27 12:00:22 +01:00
Itamar Benjamin 48b0908fc6 Make InternalComposite key comparable
Keys are compared in BucketSortPipelineAggregation, so the key type (ArrayMap) now implements Comparable. Maps are compared using the entry set's iterator, so the order of ordered maps is maintained. For each entry, the key is compared first, then the value; all keys are assumed to be strings. When comparing entries' values, an exception is thrown if the types are not identical and/or the type does not implement Comparable. equals() and hashCode() are not implemented, as the parent's implementations are sufficient. Tests included.
2018-12-27 08:51:24 +01:00
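The comparison rules described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the actual `InternalComposite` code: the class name `KeyCompareSketch` is hypothetical, values are assumed to be non-null, and treating the shorter map as smaller when all shared entries are equal is an assumption the commit message does not state.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

final class KeyCompareSketch {
    // Compares two ordered maps entry by entry: key first, then value.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static int compare(Map<String, Object> a, Map<String, Object> b) {
        Iterator<Map.Entry<String, Object>> ia = a.entrySet().iterator();
        Iterator<Map.Entry<String, Object>> ib = b.entrySet().iterator();
        while (ia.hasNext() && ib.hasNext()) {
            Map.Entry<String, Object> ea = ia.next();
            Map.Entry<String, Object> eb = ib.next();
            int byKey = ea.getKey().compareTo(eb.getKey());
            if (byKey != 0) {
                return byKey;
            }
            Object va = ea.getValue();
            Object vb = eb.getValue();
            // Values must share a type and that type must be Comparable.
            if (va.getClass() != vb.getClass() || !(va instanceof Comparable)) {
                throw new ClassCastException("values are not mutually comparable");
            }
            int byValue = ((Comparable) va).compareTo(vb);
            if (byValue != 0) {
                return byValue;
            }
        }
        // Assumption: if one map has entries left over, it sorts after the other.
        return Boolean.compare(ia.hasNext(), ib.hasNext());
    }

    public static void main(String[] args) {
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("term", "a");
        first.put("count", 1);
        Map<String, Object> second = new LinkedHashMap<>();
        second.put("term", "a");
        second.put("count", 2);
        System.out.println(compare(first, second) < 0); // true
    }
}
```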
Nhat Nguyen 7580d9d925
Make SourceToParse immutable (#36971)
Today the routing of a SourceToParse is assigned in a separate step
after the object is created. We can easily forget to set the routing.
With this commit, the routing must be provided in the constructor of
SourceToParse.

Relates #36921
2018-12-24 14:06:50 -05:00
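The idea behind the SourceToParse change above, requiring routing at construction time instead of in a later setter call, might look roughly like this. The class `ImmutableSourceSketch` and its fields are hypothetical stand-ins, and allowing a null routing is an assumption made for illustration.

```java
import java.util.Objects;

final class ImmutableSourceSketch {
    private final String id;
    private final String routing; // assumption: null means "no custom routing"

    // Routing is a constructor argument, so callers cannot forget to supply it.
    ImmutableSourceSketch(String id, String routing) {
        this.id = Objects.requireNonNull(id, "id");
        this.routing = routing;
    }

    String id() {
        return id;
    }

    String routing() {
        return routing;
    }

    public static void main(String[] args) {
        ImmutableSourceSketch source = new ImmutableSourceSketch("doc-1", "user-42");
        System.out.println(source.id() + " routed by " + source.routing());
    }
}
```

Because every field is final and there are no setters, an instance cannot be observed in a half-initialized state.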
Andrey Ershov 531ae8b3ab
Remove ensureNoPre019State check (#36974)
This check is no longer needed because it's already ES v7
2018-12-24 16:26:08 +01:00
Nhat Nguyen 40c7ae6181
Rewrite SourceToParse with resolved docType (#36921)
We introduced a typeless API in #35790 where we translate the default
docType "_doc" to the user-defined docType. However, we do not rewrite
the SourceToParse with the resolved docType. This leads to a situation
where we have two translog operations for the same document with
different types:

- prvOp [Index{id='9LCpwGcBkJN7eZxaB54L', type='_doc', seqNo=1,
  primaryTerm=1, version=1, autoGeneratedIdTimestamp=1545125562123}]

- newOp [Index{id='9LCpwGcBkJN7eZxaB54L', type='not_doc', seqNo=1,
  primaryTerm=1, version=1, autoGeneratedIdTimestamp=-1}]

Closes #36769
2018-12-23 15:14:21 -05:00
Nhat Nguyen d238b2934c TEST: Fix Engine#testRebuildLocalCheckpointTracker
In this test, we verify that the LocalCheckpointTracker is initialized
with the operations of the safe commit. And the test fails because
Engine#Index does not implement the equals method (should not
implement as it consists of a mutable ParsedDocument).

Closes #36470
2018-12-22 18:00:42 -05:00
Tim Brooks c8a8391dfa
Only compress responses if request was compressed (#36867)
This is a follow-up to some discussions around #36399. Currently we have
relatively confusing compression behavior where compression can be
configured for requests based on transport.compress or a specific
setting for a remote cluster. However, we can only compress responses
based on transport.compress as we do not know where a request is
coming from (currently).

This commit modifies the behavior to NEVER compress responses based on
settings. Instead, a response will only be compressed if the request was
compressed. This commit also updates the documentation to more clearly
describe transport level compression.
2018-12-21 10:14:00 -07:00
David Turner cfea2fd68c
RecoveryMonitor#lastSeenAccessTime should be volatile (#36781)
This local field is accessed on multiple threads and is nonvolatile so
theoretically could yield stale values. Not sure it does in practice.
2018-12-21 11:10:48 +00:00
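As a rough illustration of why the field in the commit above is marked volatile, the sketch below has one thread write a timestamp while another spins until it sees the write. The class `AccessTimeSketch` is hypothetical and only borrows the field name; without `volatile`, the Java memory model would not guarantee that the spinning reader ever observes the new value.

```java
final class AccessTimeSketch {
    // volatile makes a write by one thread visible to later reads by other threads.
    private volatile long lastSeenAccessTime;

    void touch(long now) {
        lastSeenAccessTime = now;
    }

    long lastSeen() {
        return lastSeenAccessTime;
    }

    public static void main(String[] args) {
        AccessTimeSketch monitor = new AccessTimeSketch();
        Thread writer = new Thread(() -> monitor.touch(42L));
        writer.start();
        // Spin until the writer's update becomes visible to this thread.
        while (monitor.lastSeen() == 0L) {
            Thread.yield();
        }
        System.out.println(monitor.lastSeen()); // 42
    }
}
```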
Michael Basnight 7cbf03c001
Scripting: Remove deprecated params.ctx (#36848)
When the script contexts were created in 6, the use of params.ctx was
deprecated. This commit cleans up that code and ensures that params.ctx
is null in both watcher script contexts.

Relates: #34059
2018-12-20 21:30:24 -06:00
Julie Tibshirani fba710469a
Refactor the REST actions to clarify what endpoints are deprecated. (#36869) 2018-12-20 18:06:41 -08:00
Tim Brooks d9b2ed6135
Send clear session as routable remote request (#36805)
This commit adds a RemoteClusterAwareRequest interface that allows a
request to specify which remote node it should be routed to. The remote
cluster aware client will attempt to route the request directly to this
node. Otherwise it will send it as a proxy action to eventually end up
on the requested node.

It implements the ccr clean_session action with this client.
2018-12-20 17:43:12 -07:00
David Turner d6d5134890
Fix name of SettingsBasedHostsProviderIT (#36778)
It is not a _host_ provider because it provides more than one host.
2018-12-20 17:11:12 +00:00
David Turner 52d34e45e7
[Zen2] Minor logging improvements (#36818)
* Adds term number and greppable phrase 'coordinator becoming' to Coordinator
  mode changes
* Adds term and version to messages from the ClusterApplier about master
  changes
* Reduces some LeaderChecker messages to TRACE level
2018-12-20 15:31:52 +00:00
Andrey Ershov ca92d74e7e
[Zen2] Change unsafe bootstrap nodes count to nodes list in tests (#36559)
This commit modifies ESSingleNodeTestCase and ESIntegTestCase and
several concrete test classes to use node names when bootstrapping the
cluster.

Today ClusterBootstrapService.INITIAL_MASTER_NODE_COUNT_SETTING
setting is used to bootstrap clusters in tests. Instead, we want to use
ClusterBootstrapService.INITIAL_MASTER_NODES_SETTING and get rid of
the former setting eventually.

There were two main problems when refactoring InternalTestCluster:

1. Nodes are created one-by-one in buildNode method. And node.name
is created in this method as well. It's not suitable for bootstrapping,
because we need to have the names of all master eligible nodes in
advance, before creating the node with bootstrapping configuration set.
We address this issue by separating buildNode into two methods:
getNodeSettings and buildNode. We first iterate over all nodes to
get nodes settings, then change the setting for the bootstrapping node
and then proceed with building the node.
2. If autoManageMinMasterNodes = false, there is no way for the test to
set the list of bootstrapping nodes because node names are not known in
advance. This problem is solved by adding updateNodesSettings method
to NodeConfigurationSource and ESIntegTestCase (which could be
overridden by concrete integration test class). Once we have the list
of settings for all nodes, the integration test class is allowed to
update it. In our case, we update the
ClusterBootstrapService.INITIAL_MASTER_NODES_SETTING setting.
2018-12-20 15:20:33 +01:00
Dimitris Athanasiou 08bcd83757
[ML] Reduce persistent tasks periodic reassignment interval in ... (#36845)
... MlDistributedFailureIT.testLoseDedicatedMasterNode.

An intermittent failure has been observed in
`MlDistributedFailureIT.testLoseDedicatedMasterNode`.
The test launches a cluster composed of a dedicated master node
and a data and ML node. It creates a job and datafeed and starts them.
It then shuts down and restarts the master node. Finally, the test asserts
that the two tasks have been reassigned within 10s.

The intermittent failure is due to the assertions that the tasks have been
reassigned failing. Investigating the failure revealed that the `assertBusy`
that performs that assertion times out. Furthermore, it appears that the
job task is not reassigned because the memory tracking info is stale.

Memory tracking info is refreshed asynchronously when a job is attempted
to be reassigned. Tasks are attempted to be reassigned either due to a relevant
cluster state change or periodically. The periodic interval is controlled by a cluster
setting called `cluster.persistent_tasks.allocation.recheck_interval` and defaults to 30s.

What seems to be happening in this test is that if all cluster state changes after the
master node is restarted come through before the async memory info refresh completes,
then the job might take up to 30s until it is attempted to be reassigned. Thus the `assertBusy`
times out.

This commit changes the test to reduce the periodic check that reassigns persistent
tasks to `200ms`. If the above theory is correct, this should eradicate those failures.

Closes #36760
2018-12-20 14:53:36 +02:00
Ryan Ernst cfc0a47232
Core: Deprecate negative epoch timestamps (#36793)
Negative timestamps are currently supported in joda time. These are
dates before epoch. However, it doesn't really make sense to have a
negative timestamp, since this is a modern format. Any dates before
epoch can be represented with normal date formats, like ISO8601.
Additionally, implementing negative epoch timestamp parsing in java time
has an edge case which would more than double the code required. This
commit deprecates use of negative epoch timestamps.
2018-12-20 00:17:06 -08:00
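To illustrate the point made in the commit above, that a date before the epoch can be written in a normal date format such as ISO8601 instead of as a negative timestamp, here is a small sketch using the standard `java.time` API. The class and method names are hypothetical; this does not show how Elasticsearch itself parses dates.

```java
import java.time.Instant;

final class EpochSketch {
    // Renders an epoch-millisecond value as an ISO8601 instant string.
    static String toIso8601(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).toString();
    }

    public static void main(String[] args) {
        // One second before the epoch, expressed without a negative number.
        System.out.println(toIso8601(-1000L)); // 1969-12-31T23:59:59Z
    }
}
```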
Tim Brooks a7f344cc7f
Use ByteBuffer#slice in BytesReference wrapper (#36862)
Currently the ByteBufferReference does not duplicate the buffer.
This means that any changes to the buffer's limit or position will
impact the reference. This can lead to unexpected behavior. This commit
uses the ByteBuffer#slice method to ensure that the reference retains
its own ByteBuffer.
2018-12-19 16:58:43 -07:00
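The behavior that the `ByteBuffer#slice` commit above relies on can be seen in a few lines. This is a standalone sketch with a hypothetical helper, not the `ByteBufferReference` code: a slice shares the underlying bytes but keeps its own position and limit, so later changes to the source buffer's cursor do not affect it.

```java
import java.nio.ByteBuffer;

final class SliceSketch {
    // Hypothetical helper: returns a view with its own position and limit.
    static ByteBuffer independentView(ByteBuffer source) {
        return source.slice();
    }

    public static void main(String[] args) {
        ByteBuffer original = ByteBuffer.wrap(new byte[] {1, 2, 3, 4});
        ByteBuffer view = independentView(original);

        // Move the original's cursor; the slice is unaffected.
        original.position(2).limit(3);

        System.out.println(view.remaining());     // 4
        System.out.println(original.remaining()); // 1
    }
}
```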
Julie Tibshirani ecb822c666
Deprecate the document create endpoint. (#36863) 2018-12-19 15:20:20 -08:00
Mayya Sharipova 9c1e47d434 Add 6.5.5 version constants 2018-12-19 17:02:14 -05:00
Julie Tibshirani 32ef80f3d4
Avoid duplicate types deprecation messages in search-related APIs. (#36802) 2018-12-19 12:59:25 -08:00
David Causse 3412627efe QueryRescorer should keep the window size when rewriting (#36836)
Because this attribute is controlled by the parent class, it is easy to miss
during rewrites.
2018-12-19 21:42:18 +01:00
Ryan Ernst c85c16bd94
Core: Revert back to joda's multi date formatters (#36814)
This commit partially reverts #36447 by using the ability of Joda time's
DateTimeFormatterBuilder to append multiple parsers instead of using the
MergedDateFormatter. The MergedDateFormatter will be removed in a future
change, as it is not as performant due to creating potentially many
exceptions during heavy date parsing. This change is a stop-gap until
that followup is ready.

closes #36602
2018-12-19 11:20:35 -08:00
Christoph Büscher 9058698d9d
Removing some deprecated methods (#36829)
Changes:
* Removed deprecated method in InnerHitBuilder
* Removed fields() from SearchRequestBuilder
* Removed deprecated GeoDistanceSortBuilder#geohashes
2018-12-19 18:45:43 +01:00
Christoph Büscher de052ae533
Bump latest 6.x version in VersionTests (#36851) 2018-12-19 18:19:16 +01:00
Nhat Nguyen b63f9b967c
Fix translog bwc serialization (#36676)
Serializing of a Translog#Index from v7.0.0 to 6.x is broken since
#29224 where we removed the _parent field.

Relates #29224
2018-12-19 11:19:25 -05:00
Ignacio Vera a34a3532ce
[Javadoc]: Remove lucene tags (#36834)
Remove lucene tags as they break gradle javadoc task

Relates #36794
2018-12-19 15:29:09 +01:00
Jason Tedor f2a5373495
Add the 6.7.0 constant to the master branch
Now that the 6.x branch has been bumped to the 6.7.0 version, this
commit adds knowledge of the 6.7.0 version to the master branch.
2018-12-19 08:24:10 -05:00
Yannick Welsch 8f141b8a41
Fix ClusterInfoServiceIT timeouts (#36758)
The test testClusterInfoServiceInformationClearOnError relies on timing behavior. It sets
InternalClusterInfoService.INTERNAL_CLUSTER_INFO_TIMEOUT_SETTING to 1s and relies on the
fact that the stats request completes within that timeframe (which our ever-so-slow CI seems to
violate at times). Unfortunately the logging has been misimplemented in InternalClusterInfoService,
so the corresponding log messages showing that the requests have timed out are missing for this.
The issue can be locally reproduced by reducing the timeout to something lower.

Closes #36554
2018-12-19 13:59:58 +01:00
David Turner 3f8f907606
Extend time allowed to detect disconnection (#36827)
If the master sends both its follower checks just before disconnection then
neither will receive a response, meaning it must wait for the checks to time
out and send another in order to detect the disconnection and stand down.

Closes #36788
2018-12-19 12:51:14 +00:00
Yannick Welsch 487a1c4f71
Fix cluster state persistence for single-node discovery (#36825)
Single-node discovery is not persisting cluster states, which was caused by a recent 7.0-only
refactoring. This commit ensures that the cluster state is properly persisted when using single-node
discovery and adds a corresponding test.
2018-12-19 13:26:04 +01:00
Boaz Leskes 216b154107
Replace 0L with an UNASSIGNED_PRIMARY_TERM constant (#36819)
* Replace 0L with an UNASSIGNED_PRIMARY_TERM constant

0 is an illegal value for a primary term that is often used to indicate
the primary term isn't assigned or is irrelevant. This PR replaces the
usage of 0 with a constant, to improve readability and so it can be
tracked and if needed, replaced.

* feedback
2018-12-19 13:15:05 +01:00
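The readability gain described in the commit above can be sketched as follows. The constant name and its value of 0 come from the commit message; the surrounding class and the `isAssigned` helper are hypothetical.

```java
final class PrimaryTermSketch {
    // 0 is never a legal primary term, so it doubles as the "unassigned" marker.
    static final long UNASSIGNED_PRIMARY_TERM = 0L;

    // Hypothetical helper showing a call site that reads better than "term != 0L".
    static boolean isAssigned(long primaryTerm) {
        return primaryTerm != UNASSIGNED_PRIMARY_TERM;
    }

    public static void main(String[] args) {
        System.out.println(isAssigned(UNASSIGNED_PRIMARY_TERM)); // false
        System.out.println(isAssigned(3L));                      // true
    }
}
```

A named constant also makes every use of the sentinel easy to find, which is what allows it to be tracked or replaced later.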
Armin Braun 978713a67c
SNAPSHOTS+TESTS: Correctly Wait for Clean State (#36801)
* Test must wait until there is no in-progress deletion as well here since the master failover leads to a snapshot deletion
* Closes #36779
2018-12-19 13:14:36 +01:00
Alan Woodward 344917efab
Add script filter to intervals (#36776)
This commit adds the ability to filter out intervals based on their start and end position, and internal
gaps:
```
POST _search
{
  "query": {
    "intervals" : {
      "my_text" : {
        "match" : {
          "query" : "hot porridge",
          "filter" : {
            "script" : {
              "source" : "interval.start > 10 && interval.end < 20 && interval.gaps == 0"
            }
          }
        }
      }
    }
  }
}
```
2018-12-19 11:12:18 +00:00
Przemyslaw Gomulka 1345dff507
Fix line length in org.elasticsearch.snapshots (#36646)
Remove the line length suppression for this package and fix offending
lines in both main and test

relates #34884
2018-12-19 11:29:21 +01:00
Luca Cavanna d2ce576c8c
Use SearchRequest copy constructor in ExpandSearchPhase (#36772)
Relates to #36641
2018-12-19 10:47:17 +01:00
Alan Woodward dd540ef618
Use index-prefix fields for terms of length min_chars - 1 (#36703)
The default index_prefix settings will index prefixes of between 2 and 5 characters in length. 
Currently, if a prefix search falls outside of this range at either end we fall back to a standard prefix 
expansion, which is still very expensive for single character prefixes. However, we have an option 
here to use a wildcard expansion rather than a prefix expansion, so that a query of a* gets remapped 
to a? against the _index_prefix field - likely to be a very small set of terms, and certain to be much
smaller than a* against the whole index.

This commit adds this extra level of mapping for any prefix term whose length is one less than
the min_chars parameter of the index_prefixes field.
2018-12-19 08:55:05 +00:00
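The remapping rule from the commit above can be sketched like this. The class and method are hypothetical and only model the decision itself: a prefix whose length is exactly `min_chars - 1` becomes a single-character wildcard pattern for the prefix sub-field, and anything else is left to the existing code paths.

```java
import java.util.Optional;

final class PrefixRemapSketch {
    // Returns the wildcard pattern to run against the prefix sub-field,
    // or empty when this extra mapping does not apply.
    static Optional<String> remap(String prefix, int minChars) {
        if (prefix.length() == minChars - 1) {
            return Optional.of(prefix + "?");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // With the default min_chars of 2, the prefix query a* becomes a?.
        System.out.println(remap("a", 2).orElse("(no remap)"));   // a?
        System.out.println(remap("abc", 2).orElse("(no remap)")); // (no remap)
    }
}
```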
Nhat Nguyen 1e9d0bb01e AwaitsFix testRebuildLocalCheckpointTracker
Tracked at #36470
2018-12-19 03:12:34 -05:00
Simon Willnauer 8e5db90eec
Never corrupt fully deleted segments in tests (#36741)
Today we might corrupt a fully deleted segment which is then pruned
once a snapshot is taken. This causes random test failures in CorruptedFileIT.
This change hardens the selection of files to corrupt and removes some fragile
code that prevented fully deleted segments from being taken into account.

Closes #36526
2018-12-19 07:39:10 +01:00