OpenSearch

Commit Graph

Author	SHA1	Message	Date
Alexander Reelsen	1de2a925ce	Watcher: Ensure that execution triggers properly on initial setup (#33360 ) This commit reverts most of #33157 as it introduces another race condition and breaks a common case of watcher, when the first watch is added to the system and the index does not exist yet. This means, that the index will be created, which triggers a reload, but during this time the put watch operation that triggered this is not yet indexed, so that both processes finish roughly at the same time and should not overwrite each other but act complementary. This commit reverts the logic of cleaning out the ticker engine watches on start up, as this is done already when the execution is paused - which also gets paused on the cluster state listener again, as we can be sure here, that the watches index has not yet been created. This also adds a new test, that starts a one node cluster and emulates the case of a non existing watches index and a watch being added, which should result in proper execution. Closes #33320	2018-09-21 14:22:34 +02:00
Armin Braun	3a5b8a71b4	NETWORKING: Fix Portability of SO_LINGER=0 in Tests (#33895 ) * Setting SO_LINGER for open but not connected non-blocking sockets throws on OSX * Fixed by only applying setting to connected sockets which will save the same number of FDs as doing it on open sockets anyway * closes #33879	2018-09-21 10:08:16 +02:00
Nhat Nguyen	5f7f793f43	Propagate max_auto_id_timestamp in peer recovery (#33693 ) Today we don't store the auto-generated timestamp of append-only operations in Lucene; and assign -1 to every index operation constructed from LuceneChangesSnapshot. This looks innocent but it generates duplicate documents on a replica if a retry append-only arrives first via peer-recovery; then an original append-only arrives via replication. Since the retry append-only (delivered via recovery) does not have timestamp, the replica will happily optimize the original request while it should not. This change transmits the max auto-generated timestamp from the primary to replicas before translog phase in peer recovery. This timestamp will prevent replicas from optimizing append-only requests if retry counterparts have been processed. Relates #33656 Relates #33222	2018-09-20 19:53:30 -04:00
Nhat Nguyen	76a1a863e3	TEST: stop assertSeqNos if shards movement (#33875 ) Currently, assertSeqNos assumes that the cluster is stable at the end of the test (i.e., no more shard movement). However, this assumption does not always hold. In these cases, we can stop the assertion instead of failing a test. Closes #33704	2018-09-20 13:44:26 -04:00
Tim Vernum	ff934e3dcd	Mute broken test on MacOS Seems to be triggered by `0cf0d73` See: https://github.com/elastic/elasticsearch/issues/33879	2018-09-20 14:06:40 +10:00
Nik Everett	26c4f1fb6c	Core: Default node.name to the hostname (#33677 ) Changes the default of the `node.name` setting to the hostname of the machine on which Elasticsearch is running. Previously it was the first 8 characters of the node id. This had the advantage of producing a unique name even when the node name isn't configured but the disadvantage of being unrecognizable and not being available until fairly late in the startup process. Of particular interest is that it isn't available until after logging is configured. This forces us to use a volatile read whenever we add the node name to the log. Using the hostname is available immediately on startup and is generally recognizable but has the disadvantage of not being unique when run on machines that don't set their hostname or when multiple elasticsearch processes are run on the same host. I believe that, taken together, it is better to default to the hostname. 1. Running multiple copies of Elasticsearch on the same node is a fairly advanced feature. We do it all the time as part of the elasticsearch build for testing but we make sure to set the node name then. 2. That the node.name defaults to some flavor of "localhost" on an unconfigured box feels like it isn't going to come up too much in production. I expect most production deployments to at least set the hostname. As a bonus, production deployments need no longer set the node name in most cases. At least in my experience most folks set it to the hostname anyway.	2018-09-19 15:21:29 -04:00
Nik Everett	3ede13a454	Test framework fall cleaning (#33423 ) Wraps all lines in our test framework at 140 characters because that is our standard line length and removes all of the checkstyle suppressions for the test framework. Drops most of `ModuleTestCase` because it isn't used and we're moving away from using guice in the way that it wants to test anyway. Also switches a few classes that extend it but don't use it to extend `ESTestCase` instead.	2018-09-19 14:34:02 -04:00
Lee Hinman	81e9150c7a	Merge remote-tracking branch 'origin/master' into index-lifecycle	2018-09-19 09:43:26 -06:00
Vladimir Dolzhenko	a3e8b831ee	add elasticsearch-shard tool (#32281 ) Relates #31389	2018-09-19 10:28:22 +02:00
Armin Braun	0cf0d73813	TESTS: Set SO_LINGER = 0 for MockNioTransport (#32560 ) * TESTS: Set SO_LINGER = 0 for MockNioTransport * Prevents lingering sockets in TIME_WAIT piling up during test runs and leading to port collisions that manifest as timeouts * Fixes #32552	2018-09-19 06:05:36 +02:00
Lee Hinman	c87cff22b4	Merge remote-tracking branch 'origin/master' into index-lifecycle	2018-09-18 13:57:41 -06:00
Or Bin	a5bad4d92c	Docs: Fixed a grammatical mistake: 'a HTTP ...' -> 'an HTTP ...' (#33744 ) Fixed a grammatical mistake: 'a HTTP ...' -> 'an HTTP ...' Closes #33728	2018-09-17 15:35:54 -04:00
Lee Hinman	7ff11b4ae1	Merge remote-tracking branch 'origin/master' into index-lifecycle	2018-09-17 10:41:10 -06:00
Alpar Torok	5ca6f31205	Move precommit task implementation to java (#33407 ) Replace precommit tasks that execute with Java implementations	2018-09-17 14:09:28 +03:00
Lee Hinman	e6cbaa5a78	Merge remote-tracking branch 'origin/master' into index-lifecycle	2018-09-14 16:27:37 -06:00
Armin Braun	0b4960ff6b	SCRIPTING: Move terms_set Context to its Own Class (#33602 ) * SCRIPTING: Move terms_set Context to its Own Class * Extracted TermsSetQueryScript * Kept mechanics close to what they were with SearchScript	2018-09-14 06:21:18 +02:00
Colin Goodheart-Smithe	8e59de3eb2	Merge branch 'master' into index-lifecycle	2018-09-13 09:46:14 +01:00
Jim Ferenczi	6ca36bba15	Fix field mapping updates with similarity (#33634 ) This change fixes a bug introduced in 6.3 that prevents fields with an explicit similarity from being updated. It also adds a test that checks this case for similarities but also for analyzers since they could suffer from the same problem. Closes #33611	2018-09-13 09:21:27 +02:00
David Turner	5a3fd8e4e7	Use file-based discovery not MockUncasedHostsProvider (#33554 ) Today we use a special unicast hosts provider, the `MockUncasedHostsProvider`, in many integration tests, to deal with the dynamic nature of the allocation of ports to nodes. However #33241 allows us to use file-based discovery to achieve the same goal, so the special test-only `MockUncasedHostsProvider` is no longer required. This change removes `MockUncasedHostProvider` and replaces it with file-based discovery in tests based on `EsIntegTestCase`.	2018-09-13 07:37:15 +02:00
Martijn van Groningen	5fa81310cc	[CCR] Added history uuid validation (#33546 ) For correctness we need to verify whether the history uuid of the leader index shards never changes while that index is being followed. * The history UUIDs are recorded as custom index metadata in the follow index. * The follow api validates whether the current history UUIDs of the leader index shards are the same as the recorded history UUIDs. If not the follow api fails. * While a follow index is following a leader index; shard follow tasks on each shard changes api call verify whether their current history uuid is the same as the recorded history uuid. Relates to #30086 Co-authored-by: Nhat Nguyen <nhat.nguyen@elastic.co>	2018-09-12 19:42:00 +02:00
Simon Willnauer	c783488e97	Add `_source`-only snapshot repository (#32844 ) This change adds a `_source` only snapshot repository that allows to wrap any existing repository as a _backend_ to snapshot only the `_source` part including live docs markers. Snapshots taken with the `source` repository won't include any indices, doc-values or points. The snapshot will be reduced in size and functionality such that it requires full re-indexing after it's successfully restored. The restore process will copy the `_source` data locally and start a special shard and engine to allow `match_all` scrolls and searches. Any other query, or get call will fail with an unsupported operation exception. The restored index is also marked as read-only. This feature aims mainly for disaster recovery use-cases where snapshot size is a concern or where time to restore is less of an issue. NOTE: The snapshot produced by this repository is still a valid lucene index. This change doesn't allow for any longer retention policies which is out of scope for this change.	2018-09-12 17:47:10 +02:00
Nhat Nguyen	743327efc2	Reset replica engine to global checkpoint on promotion (#33473 ) When a replica starts following a newly promoted primary, it may have some operations which don't exist on the new primary. Thus we need to throw away those operations to align a replica with the new primary. This can be done by first resetting an engine from the safe commit, then replaying the local translog up to the global checkpoint. Relates #32867	2018-09-11 22:09:37 -04:00
Jason Tedor	73c75bef21	Preserve cluster settings on full restart tests (#33590 ) Today the full cluster restart tests do not preserve cluster settings on restart. This is a mistake because it is not an accurate reflection of reality, we do not expect users to clear cluster settings when they perform a full cluster restart. This commit makes it so that all full cluster restart tests preserve settings on upgrade.	2018-09-11 08:40:22 -04:00
Jason Tedor	ea3fdc90c6	Add full cluster restart base class (#33577 ) This commit adds a base class for full cluster restart tests.	2018-09-10 20:06:42 -04:00
Colin Goodheart-Smithe	cdc4f57a77	Merge branch 'master' into index-lifecycle	2018-09-10 21:30:44 +01:00
Alan Woodward	39c3234c2f	Upgrade to latest Lucene snapshot (#33505 ) * LeafCollector.setScorer() now takes a Scorable * Scorers may not have null Weights * IndexWriter.getFlushingBytes() reports how much memory is being used by IW threads writing to disk	2018-09-10 20:51:55 +01:00
Jason Tedor	5f4244755e	Enable not wiping cluster settings after REST test (#33575 ) In some cases we want to skip wiping cluster settings after a REST test. For example, one use-case would be in the full cluster restart tests where we want to test cluster settings before and after a full cluster restart. If we wipe the cluster settings before the restart, then it would not be possible to assert on them after the restart.	2018-09-10 14:25:30 -04:00
Tanguy Leroux	079d130d8c	[Test] Remove duplicate method in TestShardRouting (#32815 )	2018-09-10 18:29:00 +02:00
Jason Tedor	6bb817004b	Add infrastructure to upgrade settings (#33536 ) In some cases we want to deprecate a setting, and then automatically upgrade uses of that setting to a replacement setting. This commit adds infrastructure for this so that we can upgrade settings when recovering the cluster state, as well as when such settings are dynamically applied on cluster update settings requests. This commit only focuses on cluster settings, index settings can build on this infrastructure in a follow-up.	2018-09-09 20:49:19 -04:00
Jason Tedor	5a38c930fc	Add license checks for auto-follow implementation (#33496 ) This commit adds license checks for the auto-follow implementation. We check the license on put auto-follow patterns, and then for every coordination round we check that the local and remote clusters are licensed for CCR. In the case of non-compliance, we skip coordination yet continue to schedule follow-ups.	2018-09-09 07:06:55 -04:00
Nhat Nguyen	94e4cb64c2	Bootstrap a new history_uuid when force allocating a stale primary (#33432 ) This commit ensures that we bootstrap a new history_uuid when force allocating a stale primary. A stale primary should never be the source of an operation-based recovery to another shard which exists before the forced-allocation. Closes #26712	2018-09-08 19:29:31 -04:00
Nik Everett	190ea9a6de	Logging: Configure the node name when we have it (#32983 ) Change the logging infrastructure to handle when the node name isn't available in `elasticsearch.yml`. In that case the node name is not available until long after logging is configured. The biggest change is that the node name logging is no longer fixed at pattern build time. Instead it is read from a `SetOnce` on every print. If it is unset it is printed as `unknown` so we have something that fits in the pattern. On normal startup we don't log anything until the node name is available so we never see the `unknown`s.	2018-09-07 14:31:23 -04:00
Simon Willnauer	c12d232215	Pass Directory instead of DirectoryService to Store (#33466 ) Instead of passing DirectoryService which causes yet another dependency on Store we can just pass in a Directory since we will just call `DirectoryService#newDirectory()` on it anyway.	2018-09-07 14:00:24 +02:00
Colin Goodheart-Smithe	017ffe5d12	Merge branch 'master' into index-lifecycle	2018-09-07 10:59:10 +01:00
Jim Ferenczi	79cd6385fe	Collapse package structure for metrics aggs (#33463 ) This change collapses all metrics aggregations classes into a single package `org.elasticsearch.aggregations.metrics`. It also restricts the visibility of some classes (aggregators and factories) that should not be used outside of the package. Relates #22868	2018-09-07 10:58:06 +02:00
Nik Everett	0d45752e50	Fix IndexMetaData loads after rollover (#33394 ) When we rollover an index we write the conditions of the rollover that the old index met into the old index. Loading this index metadata requires a working `NamedXContentRegistry` that has been populated with parsers from the rollover infrastructure. We had a few loads that didn't use a working `NamedXContentRegistry` and so would fail if they ever encountered an index that had been rolled over. Here are the locations of the loads and how I fixed them: * IndexFolderUpgrader - removed entirely. It existed to support opening indices made in Elasticsearch 2.x. Since we only need this change as far back as 6.4.1 which supports reading from indices created as far back as 5.0.0 we should be good here. * TransportNodesListGatewayStartedShards - wired the `NamedXContentRegistry` into place. * TransportNodesListShardStoreMetaData - wired the `NamedXContentRegistry` into place. * OldIndexUtils - removed entirely. It existed to support the zip based index backwards compatibility tests which we've since replaced with code that actually runs old versions of Elasticsearch. In addition to fixing the actual problem I added full cluster restart integration tests for rollover which would have caught this problem and I added an extra assertion to IndexMetaData's deserialization code which will trip if we try to deserialize an index's metadata without a fully formed `NamedXContentRegistry`. It won't catch if we use the wrong `NamedXContentRegistry` but it is better than nothing. Closes #33316	2018-09-06 17:55:24 -04:00
Nhat Nguyen	8afe09a749	Pass TranslogRecoveryRunner to engine from outside (#33449 ) This commit allows us to use different TranslogRecoveryRunner when recovering an engine from its local translog. This change is a prerequisite for the commit-based rollback PR. Relates #32867	2018-09-06 11:59:16 -04:00
Jim Ferenczi	7ad71f906a	Upgrade to a Lucene 8 snapshot (#33310 ) The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However this optimization is not activated in this change, there is another issue opened to discuss how it should be integrated smoothly. Some comments about the change: * Tests that can produce negative scores have been adapted but we need to forbid them completely: #33309 Closes #32899	2018-09-06 14:42:06 +02:00
Colin Goodheart-Smithe	b1257d873b	Merge branch 'master' into index-lifecycle	2018-09-06 08:17:40 +01:00
Lee Hinman	96d515e3f5	Replace PhaseAfterStep with PhaseCompleteStep (#33398 ) This removes `PhaseAfterStep` in favor of a new `PhaseCompleteStep`. This step is only a marker that the `LifecyclePolicyRunner` needs to halt until the time indicated for entering the next phase. This also fixes a bug where phase times were encapsulated into the policy instead of dynamically adjusting to policy changes. Supersedes #33140, which it replaces Relates to #29823	2018-09-05 16:37:45 -06:00
Tim Brooks	88c178dca6	Add sni name to SSLEngine in netty transport (#33144 ) This commit is related to #32517. It allows a "server_name" attribute on a DiscoveryNode to be propagated to the server using the TLS SNI extension. This functionality is only implemented for the netty security transport.	2018-09-05 16:12:10 -06:00
Armin Braun	46774098d9	INGEST: Implement Drop Processor (#32278 ) * INGEST: Implement Drop Processor * Adjust Processor API * Implement Drop Processor * Closes #23726	2018-09-05 14:25:29 +02:00
Jim Ferenczi	dbc7102c86	Fix inner hits retrieval when stored fields are disabled (_none_) (#33018 ) Now that types are unique per mapping we can retrieve the document mapper without referencing the type. This fixes an NPE when stored fields are disabled. For 6x we'll need a different fix since mappings can still have multiple types. Relates #32941	2018-09-04 16:25:52 +02:00
Jason Tedor	09bf4e5f00	Introduce private settings (#33327 ) This commit introduces the formal notion of a private setting. This enables us to register some settings that we had previously not registered as fully-fledged settings to avoid them being exposed via APIs such as the create index API. For example, we had hacks in the codebase to allow index.version.created to be passed around inside of settings objects, but it was not registered as a setting so that if a user tried to use the setting on any API then they would get an exception. This prevented users from setting index.version.created on index creation, or updating it via the index settings API. By introducing private settings, we can continue to reject these attempts, yet now we can represent these settings as actual settings. In this change, we register index.version.created as an actual setting. We do not cutover all settings that we had been treating as private in this pull request, it is already quite large due to moving some tests around to account for the fact that some tests need to be able to set the index.version.created. This can be done in a follow-up change.	2018-09-03 19:17:57 -04:00
Nik Everett	f8b7a4dbc8	Logging: Drop Settings from some logging ctors (#33332 ) Drops `Settings` from some logging ctors now that they are no longer needed. This should allow us to stop passing `Settings` around to quite as many places.	2018-09-02 16:51:26 -04:00
Vladimir Dolzhenko	3d82a30fad	drop `index.shard.check_on_startup: fix` (#32279 ) drop `index.shard.check_on_startup: fix` Relates #31389	2018-08-31 21:29:06 +02:00
Nhat Nguyen	ad4dd086d2	Integrates soft-deletes into Elasticsearch (#33222 ) This PR integrates Lucene soft-deletes(LUCENE-8200) into Elasticsearch. Highlight works in this PR include: - Replace hard-deletes by soft-deletes in InternalEngine - Use _recovery_source if _source is disabled or modified (#31106) - Soft-deletes retention policy based on the global checkpoint (#30335) - Read operation history from Lucene instead of translog (#30120) - Use Lucene history in peer-recovery (#30522) Relates #30086 Closes #29530 --- These works have been done by the whole team; however, these individuals (lexical order) have significant contribution in coding and reviewing: Co-authored-by: Adrien Grand <jpountz@gmail.com> Co-authored-by: Boaz Leskes <b.leskes@gmail.com> Co-authored-by: Jason Tedor <jason@tedor.me> Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com> Co-authored-by: Nhat Nguyen <nhat.nguyen@elastic.co> Co-authored-by: Simon Willnauer <simonw@apache.org>	2018-08-30 23:46:07 -04:00
Nhat Nguyen	547de71d59	Revert "Integrates soft-deletes into Elasticsearch (#33222 )" Revert to correct co-author tags. This reverts commit `6dd0aa54f6`.	2018-08-30 23:44:57 -04:00
Nhat Nguyen	6dd0aa54f6	Integrates soft-deletes into Elasticsearch (#33222 ) This PR integrates Lucene soft-deletes(LUCENE-8200) into Elasticsearch. Highlight works in this PR include: - Replace hard-deletes by soft-deletes in InternalEngine - Use _recovery_source if _source is disabled or modified (#31106) - Soft-deletes retention policy based on the global checkpoint (#30335) - Read operation history from Lucene instead of translog (#30120) - Use Lucene history in peer-recovery (#30522) Relates #30086 Closes #29530 --- These works have been done by the whole team; however, these individuals (lexical order) have significant contribution in coding and reviewing: Co-authored-by: Adrien Grand jpountz@gmail.com Co-authored-by: Boaz Leskes b.leskes@gmail.com Co-authored-by: Jason Tedor jason@tedor.me Co-authored-by: Martijn van Groningen martijn.v.groningen@gmail.com Co-authored-by: Nhat Nguyen nhat.nguyen@elastic.co Co-authored-by: Simon Willnauer simonw@apache.org	2018-08-30 22:11:23 -04:00
Nhat Nguyen	39839f97ef	TEST: Access cluster state directly in assertSeqNos (#33277 ) Some AbstractDisruptionTestCase tests start failing since we enabled assertSeqNos (in #33130). They fail because the assertSeqNos assertion queries cluster stats while the cluster is disrupted or not formed yet. This commit switches to use the cluster state and shard stats directly from the test cluster. Closes #33251	2018-08-30 19:37:20 -04:00
Armin Braun	cc4d7059bf	Ingest: Add conditional per processor (#32398 ) * Ingest: Add conditional per processor * closes #21248	2018-08-30 03:46:39 +02:00
Matt Weber	92bd7242a3	Fix classpath security checks for external tests. (#33066 ) This commit checks that when we manually add a class to the codebase map, that it does in-fact not exist on the classpath in a jar. This will only be true if we are using the test framework externally such as when a user develops a plugin.	2018-08-29 12:19:58 -07:00
Jonathan Little	9d92a87ae6	Remove support for deprecated params._agg/_aggs for scripted metric aggregations (#32979 )	2018-08-28 09:27:43 +01:00
Nhat Nguyen	014b3236dc	Ensure to generate identical NoOp for the same failure (#33141 ) We generate slightly different NoOps in InternalEngine and TransportShardBulkAction for the same failure. 1. InternalEngine uses Exception#getFailure to generate a message without the class name: newOp [NoOp{seqNo=1, primaryTerm=1, reason='Contexts are mandatory in context enabled completion field [suggest_context]'}]. 2. TransportShardBulkAction uses Exception#toString to generate a message with the class name: NoOp{seqNo=1, primaryTerm=1, reason='java.lang.IllegalArgumentException: Contexts are mandatory in context enabled completion field [suggest_context]'}. If a write operation fails while a replica is recovering, that replica will possibly receive two different NoOps: one from recovery and one from replication. These two different NoOps will trip TranslogWriter#assertNoSeqNumberConflict assertion. This commit ensures that we generate the same Noop for the same failure. Closes #32986	2018-08-27 15:59:42 -04:00
Simon Willnauer	3376922e8b	Add proxy support to RemoteClusterConnection (#33062 ) This adds support for connecting to a remote cluster through a tcp proxy. A remote cluster can be configured with an additional `search.remote.$clustername.proxy` setting. This proxy will be used to connect to remote nodes for every node connection established. We still try to sniff the remote cluster and connect to nodes directly through the proxy which has to support some kind of routing to these nodes. Yet, this routing mechanism requires the handshake request to include some kind of information where to route to which is not yet implemented. The effort to use the hostname and an optional node attribute for routing is tracked in #32517 Closes #31840	2018-08-25 20:41:32 +02:00
Nhat Nguyen	9dad82ece8	TEST: Skip assertSeqNos for closed shards (#33130 ) If a shard was closed, we return null for SeqNoStats. Therefore the assertion assertSeqNos will hit NPE when it verifies a closed shard. This commit skips closed shards in assertSeqNos and enables this assertion in AbstractDisruptionTestCase.	2018-08-24 21:02:13 -04:00
Nhat Nguyen	739a8d3d44	TEST: resync operation on replica should acquire shard permit (#33103 ) This change makes sure that resync operations on replicas in the test framework are executed under shard permits as the production code.	2018-08-24 20:25:13 -04:00
Jason Tedor	619e0b28b9	Add hook to skip asserting x-content equivalence (#33114 ) This commit adds a hook to AbstractSerializingTestCase to enable skipping asserting that the x-content of the test instance and an instance parsed from the x-content of the test instance are the same. While we usually expect these to be the same, they will not be the same when exceptions are involved because the x-content there is lossy.	2018-08-24 06:53:44 -04:00
Jim Ferenczi	f4e9729d64	Remove unsupported Version.V_5_* (#32937 ) This change removes the es 5x version constants and their usages.	2018-08-24 09:51:21 +02:00
Armin Braun	917e5a8c94	TESTS: Fix Random Fail in MockTcpTransportTests (#33061 ) * `foobar.txGet()` appears to return before `serviceB.stop()` returns, causing `ServiceB.close()` to run concurrently with the `stop` call and running into a race condition * Closes #32863	2018-08-23 13:19:21 +02:00
Luca Cavanna	393eec1482	Set maxScore for empty TopDocs to NaN rather than 0 (#32938 ) We used to set `maxScore` to `0` within `TopDocs` in situations where there is really no score as the size was set to `0` and scores were not even tracked. In such scenarios, `Float.NaN` is more appropriate, which gets converted to `max_score: null` on the REST layer. That's also more consistent with Lucene, which sets `maxScore` to `Float.NaN` when merging empty `TopDocs` (see `TopDocs#merge`).	2018-08-22 17:23:54 +02:00
Nhat Nguyen	262d3c0783	Allow engine to recover from translog upto a seqno (#33032 ) This change allows an engine to recover from its local translog up to the given seqno. The extended API can be used in these use cases: When a replica starts following a new primary, it resets its index to the safe commit, then replays its local translog up to the current global checkpoint (see #32867). When a replica starts a peer-recovery, it can initialize the start_sequence_number to the persisted global checkpoint instead of the local checkpoint of the safe commit. A replica will then replay its local translog up to that global checkpoint before accepting remote translog from the primary. This change will increase the chance of operation-based recovery. I will make this in a follow-up. Relates #32867	2018-08-22 07:57:44 -04:00
David Turner	ab000323fa	Allow extension of CapturingTransport by subclasses (#33012 ) Today, CapturingTransport#createCapturingTransportService creates a transport service with a connection manager with reasonable default behaviours, but overriding this behaviour in a consumer is a little tricky. Additionally, the default behaviour for opening a connection duplicates the content of the CapturingTransport#openConnection() method. This change removes this duplication by delegating to openConnection() and introduces overridable nodeConnected() and onSendRequest() methods so that consumers can alter this behaviour more easily. Relates #32246 in which we test the mechanisms for opening connections to unknown (and possibly unreachable) nodes.	2018-08-22 09:09:08 +01:00
Alpar Torok	82d10b484a	Run forbidden api checks with runtimeJavaVersion (#32947 ) Run forbidden APIs checks with runtime Java version	2018-08-22 09:05:22 +03:00
Simon Willnauer	92076497e5	Use a dedicated ConnectionManager for RemoteClusterConnection (#32988 ) This change introduces a dedicated ConnectionManager for every RemoteClusterConnection such that there is no state shared with the TransportService internal ConnectionManager. All connections to a remote cluster are isolated from the TransportService but still use the TransportService and its internal properties like the Transport, tracing and internal listener actions on disconnects etc. This allows a remote cluster connection to have a different lifecycle than a local cluster connection, also local discovery code doesn't get notified if there is a disconnect from a remote cluster and each connection can use its own dedicated connection profile which allows to have a reduced set of connections per cluster without conflicting with the local cluster. Closes #31835	2018-08-21 12:43:25 +02:00
Tim Brooks	cd83ddcecc	Fix assertion in AbstractSimpleTransportTestCase (#32991 ) This is a follow-up to #32956. That commit incorrectly used assertBusy which led to a possible race in the test. This commit fixes it.	2018-08-20 16:09:22 -06:00
Tim Brooks	faa42de66d	Pass DiscoveryNode to initiateChannel (#32958 ) This is related to #32517. This commit passes the DiscoveryNode to the initiateChannel method for different Transport implementation. This will allow additional attributes (besides just the socket address) to be used when opening channels.	2018-08-20 08:54:55 -06:00
Alpar Torok	4b34b3f4aa	Set forbidden APIs target compatibility to compiler java version (#32935 ) Set forbidden apis target compatibility to compiler version Fix outstanding deprecation	2018-08-20 09:27:02 +03:00
Tim Brooks	de92d2ef1f	Move connection listener to ConnectionManager (#32956 ) This is a followup to #31886. After that commit the TransportConnectionListener had to be propagated to both the Transport and the ConnectionManager. This commit moves that listener to completely live in the ConnectionManager. The request and response related methods are moved to a TransportMessageListener. That listener continues to live in the Transport class.	2018-08-18 10:09:24 -06:00
Tim Brooks	2464b68613	Move connection profile into connection manager (#32858 ) This is related to #31835. It moves the default connection profile into the ConnectionManager class. This will allow us to have different connection managers with different profiles.	2018-08-15 09:08:33 -06:00
Lee Hinman	48281ac5bc	Use generic AcknowledgedResponse instead of extended classes (#32859 ) This removes custom Response classes that extend `AcknowledgedResponse` and do nothing, these classes are not needed and we can directly use the non-abstract super-class instead. While this appears to be a large PR, no code has actually changed, only class names have been changed and entire classes removed.	2018-08-15 08:06:14 -06:00
Ryan Ernst	0158b59a5a	Test: Fix forbidden uses in test framework (#32824 ) This commit fixes existing uses of forbidden apis in the test framework and re-enables the forbidden apis check. It was previously completely disabled and had missed a rename of the forbidden apis signatures files. closes #32772	2018-08-14 11:35:09 -07:00
Tim Brooks	10fddb62ee	Remove client connections from TcpTransport (#31886 ) This is related to #31835. This commit adds a connection manager that manages client connections to other nodes. This means that the TcpTransport no longer maintains a map of nodes that it is connected to.	2018-08-13 16:44:09 -06:00
Armin Braun	d412230cda	SCRIPTING: Support BucketAggScript return null (#32811 ) * As explained in #32790, `BucketAggregationScript` must support `null` as a return value * Closes #32790	2018-08-13 20:08:26 +02:00
Nik Everett	f5ba801c6b	Test: Only sniff host metadata for node_selectors (#32750 ) Our rest testing framework has support for sniffing the host metadata on startup and, before this change, it'd sniff that metadata before running the first test. This prevents running these tests against elasticsearch installations that won't support sniffing like Elastic Cloud. This change allows tests to only sniff for metadata when they encounter a test with a `node_selector`. These selectors are the things that need the metadata anyway and they are super rare. Tests that use these won't be able to run against installations that don't support sniffing but we can just skip them. In the case of Elastic Cloud, these tests were never going to work against Elastic Cloud anyway.	2018-08-10 13:35:47 -04:00
Christoph Büscher	22f7b03430	Fix test reproducibility in AbstractBuilderTestCase setup (#32403 ) Currently AbstractBuilderTestCase generates certain random values in its `beforeTest()` method annotated with @Before only the first time that a test method in the suite is run while initializing the serviceHolder that we use for the rest of the test. This changes the values of subsequent random values and has the effect that when running single methods from a test suite with "-Dtests.method=*", the random values it sees are different from when the same test method is run as part of the whole test suite. This makes it hard to use the reproduction lines logged on failure. This change runs the initialization of the serviceHolder and the randomization connected to it using the test runner's master seed, so reproduction by running just one method is possible again. Closes #32400	2018-08-10 15:13:44 +02:00
Boaz Leskes	f58ed21720	Refactor TransportShardBulkAction to better support retries (#31821 ) Processing a bulk request goes item by item. Sometimes during processing, we need to stop execution and wait for a new mapping update to be processed by the node. This is currently achieved by throwing a `RetryOnPrimaryException`, which is caught higher up. When the exception is caught, we wait for the next cluster state to arrive and process the request again. Sadly this is a problem because all operations that were already done until the mapping change was required are applied again and get new sequence numbers. This in turn means that the previously issued sequence numbers are never replicated to the replicas. That causes the local checkpoint of those shards to be stuck and with it all the seq# based infrastructure. This commit refactors how we deal with retries with the goal of removing `RetryOnPrimaryException` and `RetryOnReplicaException` (not done yet). It does so by introducing a class `BulkPrimaryExecutionContext` that is used to capture the execution state and allows continuing from where the execution stopped. The class also formalizes the steps each item has to go through: 1) A translation phase for updates 2) Execution phase (always index/delete) 3) Waiting for a mapping update to come in, if needed 4) Requires a retry (for updates and cases where the mappings are still not available after the put mapping call returns) 5) A finalization phase which allows updates to the index/delete result to an update result.	2018-08-10 10:15:01 +02:00
Alpar Torok	af8c23eb40	Java version reproduction (#32715 ) Enhance reproduction line with info about jdks Provide the ability to control compiler and Java versions just by passing a property. The actual java home comes from the `JAVA<major>_HOME` env vars that we already require. This works better with the Gradle daemon as well. Output is also changed a bit. for `-Druntime.java=8 -Dcompiler.java=9`: ``` ======================================= Elasticsearch Build Hamster says Hello! Gradle Version : 4.9 OS Info : Linux 4.17.8-1-ARCH (amd64) Compiler JDK Version : 11 (Oracle Corporation 11-ea [OpenJDK 64-Bit Server VM 11-ea+22]) Runtime JDK Version : 11 (Oracle Corporation 11-ea [OpenJDK 64-Bit Server VM 11-ea+22]) Gradle JDK Version : 10 (Oracle Corporation 10.0.1 [OpenJDK 64-Bit Server VM 10.0.1+10]) Compiler java.home : /home/alpar/opt/jdk-11-ea22/ Runtime java.home : /home/alpar/opt/jdk-11-ea22/ Gradle java.home : /usr/lib/jvm/java-10-openjdk Random Testing Seed : EA858533191E8DFB ======================================= ``` Without configuration: ``` ======================================= Elasticsearch Build Hamster says Hello! ======================================= Gradle Version : 4.9 OS Info : Linux 4.17.8-1-ARCH (amd64) JDK Version : 10 (Oracle Corporation 10.0.1 [OpenJDK 64-Bit Server VM 10.0.1+10]) JAVA_HOME : /usr/lib/jvm/java-10-openjdk Random Testing Seed : 4BD5B2A839C8FCA1 ======================================= ``` Here's what a reproduction line will look like (test made to fail): ``` ./gradlew :modules:lang-painless:test -Dtests.seed=2DA2379065A4EEAB -Dtests.class=org.elasticsearch.painless.AdditionTests -Dtests.method="testInt" -Dtests.security.manager=true -Dtests.locale=es-PE -Dtests.timezone=WET -Dcompiler.java=10 -Druntime.java=10 ```	2018-08-10 08:07:43 +00:00
Armin Braun	79375d35bb	Scripting: Replace Update Context (#32096 ) * SCRIPTING: Move Update Scripts to their own context * Added system property for backwards compatibility of change to `ctx.params`	2018-08-09 14:32:36 +02:00
Jason Tedor	dcc816427e	Expose whether or not the global checkpoint updated (#32659 ) It will be useful for future efforts to know if the global checkpoint was updated. To this end, we need to expose whether or not the global checkpoint was updated when the state of the replication tracker updates. For this, we add to the tracker a callback that is invoked whenever the global checkpoint is updated. For primaries this will be invoked when the computed global checkpoint is updated based on state changes to the tracker. For replicas this will be invoked when the local knowledge of the global checkpoint is advanced from the primary.	2018-08-07 15:10:09 -04:00
Tim Brooks	3d5e9114e3	Reduce connections used by MockNioTransport (#32620 ) The MockNioTransport (similar to the MockTcpTransport) is used for integ tests. The MockTcpTransport has always only opened a single connection for all of its work. The MockNioTransport has always opened the default number of connections (13). This means that every test where two transports connect requires 26 connections. This is more than is necessary. This commit modifies the MockNioTransport to only require 3 connections.	2018-08-07 12:52:28 -06:00
Lee Hinman	b3e15851a2	[TEST] Comment out account breaker assertion while diagnosing Relates to #30290	2018-08-07 09:36:37 -06:00
Armin Braun	6fa7016bbf	SCRIPTING: Move Aggregation Scripts to their own context (#32068 ) * SCRIPTING: Move Aggregation Scripts to their own context	2018-08-04 10:37:07 +02:00
Yannick Welsch	0d60e8a029	Fix race between replica reset and primary promotion (#32442 ) We've recently seen a number of test failures that tripped an assertion in IndexShard (see issues linked below), leading to the discovery of a race between resetting a replica when it learns about a higher term and when the same replica is promoted to primary. This commit fixes the race by distinguishing between a cluster state primary term (called pendingPrimaryTerm) and a shard-level operation term. The former is set during the cluster state update or when a replica learns about a new primary. The latter is only incremented under the operation block, which can happen in a delayed fashion. It also solves the issue where a replica that's still adjusting to the new term receives a cluster state update that promotes it to primary, which can happen in the situation of multiple nodes being shut down in short succession. In that case, the cluster state update thread would call `asyncBlockOperations` in `updateShardState`, which in turn would throw an exception as blocking permits is not allowed while an ongoing block is in place, subsequently failing the shard. This commit therefore extends the IndexShardOperationPermits to allow it to queue multiple blocks (which will all take precedence over operations acquiring permits). Finally, it also moves the primary activation of the replication tracker under the operation block, so that the actual transition to primary only happens under the operation block. Relates to #32431, #32304 and #32118	2018-08-03 09:33:08 +02:00
Yannick Welsch	db6e8c736d	Remove cluster state initial customs (#32501 ) This infrastructure was introduced in #26144 and made obsolete in #30743	2018-08-02 15:49:59 +02:00
Jay Modi	f2f33f3149	Use hostname instead of IP with SPNEGO test (#32514 ) This change updates KerberosAuthenticationIT to resolve the host used to connect to the test cluster. This is needed because the host could be an IP address but SPNEGO requires a hostname to work properly. This is done by adding a hook in ESRestTestCase for building the HttpHost from the host and port. Additionally, the project now specifies the IPv4 loopback address as the http host. This is done because we need to be able to resolve the address used for the HTTP transport before the node starts up, but the http.ports file is not written until the node is started. Closes #32498	2018-08-01 12:57:33 +10:00
Nik Everett	22459576d7	Logging: Make node name consistent in logger (#31588 ) First, some background: we have 15 different methods to get a logger in Elasticsearch but they can be broken down into three broad categories based on what information is provided when building the logger. Just a class like: ``` private static final Logger logger = ESLoggerFactory.getLogger(ActionModule.class); ``` or: ``` protected final Logger logger = Loggers.getLogger(getClass()); ``` The class and settings: ``` this.logger = Loggers.getLogger(getClass(), settings); ``` Or more information like: ``` Loggers.getLogger("index.store.deletes", settings, shardId) ``` The goal of the "class and settings" variant is to attach the node name to the logger. Because we don't always have the settings available, we often use the "just a class" variant and get loggers without node names attached. There isn't any real consistency here. Some loggers get the node name because it is convenient and some do not. This change makes the node name available to all loggers all the time. Almost. There are some caveats around testing that I'll get to. But in production code the node name is now available to all loggers. This means we can stop using the "class and settings" variants to fetch loggers which was the real goal here, but a pleasant side effect is that the node name is now consistent on every log line and optional by editing the logging pattern. This is all powered by setting the node name statically on a logging formatter very early in initialization. Now to tests: tests can't set the node name statically because subclasses of `ESIntegTestCase` run many nodes in the same jvm, even in the same class loader. Also, lots of tests don't run with a real node so they don't have a node name at all. To support multiple nodes in the same JVM tests suss out the node name from the thread name which works surprisingly well and is easy to test in a nice way.
For those threads that are not part of an `ESIntegTestCase` node we stick whatever useful information we can get from the thread name in the place of the node name. This allows us to keep the logger format consistent.	2018-07-31 10:54:24 -04:00
Luca Cavanna	9a4d0069f6	REST high-level client: parse back _ignored meta field (#32362 ) `GetResult` and `SearchHit` have been adjusted to parse back the `_ignored` meta field whenever it gets printed out. Expanded the existing tests to make sure this is covered. Fixed also a small problem around highlighted fields in `SearchHitTests`.	2018-07-30 13:43:40 +02:00
Armin Braun	1628c833c7	TESTS: Move netty leak detection to paranoid level (#32354 )	2018-07-26 21:36:49 +02:00
Jim Ferenczi	8e5f281b27	AbstractQueryTestCase should run without type less often (#28936 ) This commit changes the randomization to always create an index with a type. It also adds a way to create a query shard context that maps to an index with no type registered in order to explicitly test cases where there is no type.	2018-07-26 20:29:05 +02:00
Jason Tedor	eb675a1c4d	Introduce index store plugins (#32375 ) Today we allow plugins to add index store implementations yet we are not doing this in our new way of managing plugins as pull versus push. That is, today we still allow plugins to push index store providers via an on index module call where they can turn around and add an index store. Aside from being inconsistent with how we manage plugins today where we would look to pull such implementations from plugins at node creation time, it also means that we do not know at a top-level (for example, in the indices service) which index stores are available. This commit addresses this by adding a dedicated plugin type for index store plugins, removing the index module hook for adding index stores, and by aggregating these into the top-level of the indices service.	2018-07-26 08:05:49 -04:00
Tim Vernum	387c3c7f1d	Introduce Application Privileges with support for Kibana RBAC (#32309 ) This commit introduces "Application Privileges" to the X-Pack security model. Application Privileges are managed within Elasticsearch, and can be tested with the _has_privileges API, but do not grant access to any actions or resources within Elasticsearch. Their purpose is to allow applications outside of Elasticsearch to represent and store their own privileges model within Elasticsearch roles. Access to manage application privileges is handled in a new way that grants permission to specific application names only. This lays the foundation for more OLS on cluster privileges, which is implemented by allowing a cluster permission to inspect not just the action being executed, but also the request to which the action is applied. To support this, a "conditional cluster privilege" is introduced, which is like the existing cluster privilege, except that it has a Predicate over the request as well as over the action name. Specifically, this adds - GET/PUT/DELETE actions for defining application level privileges - application privileges in role definitions - application privileges in the has_privileges API - changes to the cluster permission class to support checking of request objects - a new "global" element on role definition to provide cluster object level security (only for manage application privileges) - changes to `kibana_user`, `kibana_dashboard_only_user` and `kibana_system` roles to use and manage application privileges Closes #29820 Closes #31559	2018-07-24 10:34:46 -06:00
Daniel Mitterdorfer	73a38895fd	Add Restore Snapshot High Level REST API With this commit we add the restore snapshot API to the Java high level REST client. Relates #27205 Relates #32155	2018-07-24 16:17:09 +02:00
Ioannis Kakavas	a2dbd83db1	Allow Integ Tests to run in a FIPS-140 JVM (#31989 ) * Complete changes for running IT in a fips JVM - Mute :x-pack:qa:sql:security:ssl:integTest as it cannot run in FIPS 140 JVM until the SQL CLI supports key/cert. - Set default JVM keystore/truststore password in top level build script for all integTest tasks in a FIPS 140 JVM - Changed top level x-pack build script to use keys and certificates for trust/key material when spinning up clusters for IT	2018-07-24 12:48:14 +03:00
Andrey Ershov	33f11e637d	Fail shard if IndexShard#storeStats runs into an IOException (#32241 ) Fail shard if IndexShard#storeStats runs into an IOException. Closes #29008	2018-07-23 16:38:55 +02:00
Christoph Büscher	ff87b7aba4	Remove unnecessary warning suppressions (#32250 )	2018-07-23 11:31:04 +02:00
Armin Braun	24068a773d	TESTS: Check for Netty resource leaks (#31861 ) * Enabled advanced leak detection when loading `EsTestCase` * Added custom `Appender` to collect leak logs and check for logged errors in a way similar to what is done for the `StatusLogger` * Fixes #20398	2018-07-20 09:12:32 +02:00
Julie Tibshirani	15ff3da653	Add support for field aliases. (#32172 ) * Add basic support for field aliases in index mappings. (#31287) * Allow for aliases when fetching stored fields. (#31411) * Add tests around accessing field aliases in scripts. (#31417) * Add documentation around field aliases. (#31538) * Add validation for field alias mappings. (#31518) * Return both concrete fields and aliases in DocumentFieldMappers#getMapper. (#31671) * Make sure that field-level security is enforced when using field aliases. (#31807) * Add more comprehensive tests for field aliases in queries + aggregations. (#31565) * Remove the deprecated method DocumentFieldMappers#getFieldMapper. (#32148)	2018-07-18 09:33:09 -07:00
Boaz Leskes	5856c396dd	A replica can be promoted and started in one cluster state update (#32042 ) When a replica is fully recovered (i.e., in `POST_RECOVERY` state) we send a request to the master to start the shard. The master changes the state of the replica and publishes a cluster state to that effect. In certain cases, that cluster state can be processed on the node hosting the replica together with a cluster state that promotes that, now started, replica to a primary. This can happen due to cluster state batched processing or if the master died after having committed the cluster state that starts the shard but before publishing it to the node with the replica. If the master also held the primary shard, the new master node will remove the primary (as it failed) and will also immediately promote the replica (thinking it is started). Sadly our code in IndexShard didn't allow for this which caused [assertions](`13917162ad/server/src/main/java/org/elasticsearch/index/seqno/ReplicationTracker.java (L482)`) to be tripped in some of our test runs.	2018-07-18 11:30:44 +02:00
Boaz Leskes	93d7468f3a	ESIndexLevelReplicationTestCase doesn't support replicated failures but it's good to know what they are Sometimes we have a test failure that hits an `UnsupportedOperationException` in this infrastructure. When debugging you want to know what caused this unexpected failure, but right now we're silent about it. This commit adds some information to the `UnsupportedOperationException` Relates to #32127	2018-07-18 08:49:16 +02:00
Nhat Nguyen	df1380b8d3	Remove versionType from translog (#31945 ) With the introduction of sequence numbers, we no longer use versionType to resolve out of order collision in replication and recovery requests. This PR removes the versionType from translog. We can only remove it in 7.0 because it is still required in a mixed cluster between 6.x and 5.x.	2018-07-17 21:59:48 -04:00
Ioannis Kakavas	9e529d9d58	Enable testing in FIPS140 JVM (#31666 ) Ensure our tests can run in a FIPS JVM JKS keystores cannot be used in a FIPS JVM as attempting to use one in order to init a KeyManagerFactory or a TrustManagerFactory is not allowed (JKS keystore algorithms for private key encryption are not FIPS 140 approved). This commit replaces JKS keystores in our tests with the corresponding PEM encoded key and certificates both for key and trust configurations. Whenever it's not possible to refactor the test, i.e. when we are testing that we can load a JKS keystore, etc. we attempt to mute the test when we are running in FIPS 140 JVM. Testing for the JVM is naive and is based on the name of the security provider as we would control the testing infrastructure and so this would be reliable enough. Other cases of tests being muted are the ones that involve custom TrustStoreManagers or KeyStoreManagers, null TLS Ciphers and the SAMLAuthenticator class as we cannot sign XML documents in the way we were doing. SAMLAuthenticator tests in a FIPS JVM can be reenabled with precomputed and signed SAML messages at a later stage. IT will be covered in a subsequent PR	2018-07-17 10:54:10 +03:00
Daniel Mitterdorfer	016e8760f0	Turn off real-mem breaker in single node tests With this commit we disable the real-memory circuit breaker in tests that inherit from `ESSingleNodeTestCase`. As this breaker is based on real memory usage over which we have no (full) control in tests and their purpose is also not to test the circuit breaker, we use the deterministic circuit breaker implementation that only accounts for explicitly reserved memory. Closes #32047 Relates #32071	2018-07-16 10:40:36 +02:00
Armin Braun	3679d00a74	Replace Ingest ScriptContext with Custom Interface (#32003 ) * Replace Ingest ScriptContext with Custom Interface * Make org.elasticsearch.ingest.common.ScriptProcessorTests#testScripting more precise * Don't mock script factory in ScriptProcessorTests * Adjust mock script plugin in IT for new API	2018-07-13 23:26:10 +02:00
Vladimir Dolzhenko	b1bf643e41	lazy snapshot repository initialization (#31606 ) lazy snapshot repository initialization	2018-07-13 20:05:49 +02:00
Colin Goodheart-Smithe	0edb096eb4	Adds a new auto-interval date histogram (#28993 ) * Adds a new auto-interval date histogram This change adds a new type of histogram aggregation called `auto_date_histogram` where you can specify the target number of buckets you require and it will find an appropriate interval for the returned buckets. The aggregation works by first collecting documents in buckets at second intervals; when it has created more than the target number of buckets it merges these buckets into minute-interval buckets and continues collecting until it reaches the target number of buckets again. It will keep merging buckets when it exceeds the target until either collection is finished or the highest interval (currently years) is reached. A similar process happens at reduce time. This aggregation intentionally does not support min_doc_count, offset and extended_bounds to keep the already complex logic from becoming more complex. The aggregation accepts sub-aggregations but will always operate in `breadth_first` mode deferring the computation of sub-aggregations until the final buckets from the shard are known. min_doc_count is effectively hard-coded to zero meaning that we will insert empty buckets where necessary. Closes #9572 * Adds documentation * Added sub aggregator test * Fixes failing docs test * Brings branch up to date with master changes * trying to get tests to pass again * Fixes multiBucketConsumer accounting * Collects more buckets than needed on shards This gives us more options at reduce time in terms of how we do the final merge of the buckets to produce the final result * Revert "Collects more buckets than needed on shards" This reverts commit 993c782d117892af9a3c86a51921cdee630a3ac5. * Adds ability to merge within a rounding * Fixes non-timezone doc test failure * Fix time zone tests * iterates on tests * Adds test case and documentation changes Added some notes in the documentation about the intervals that can be returned.
Also added a test case that utilises the merging of consecutive buckets * Fixes performance bug The bug meant that getAppropriate rounding took a huge amount of time if the range of the data was large but also sparsely populated. In these situations the rounding would be very low so iterating through the rounding values from the min key to the max key took a long time (~120 seconds in one test). The solution is to add a rough estimate first which chooses the rounding based just on the long values of the min and max keys alone but selects the rounding one lower than the one it thinks is appropriate so the accurate method can choose the final rounding taking into account the fact that intervals are not always fixed length. The commit also adds more tests * Changes to only do complex reduction on final reduce * merge latest with master * correct tests and add a new test case for 10k buckets * refactor to perform bucket number check in innerBuild * correctly derive bucket setting, update tests to increase bucket threshold * fix checkstyle * address code review comments * add documentation for default buckets * fix typo	2018-07-13 13:08:35 -04:00
Daniel Mitterdorfer	f174f72fee	Circuit-break based on real memory usage With this commit we introduce a new circuit-breaking strategy to the parent circuit breaker. Contrary to the current implementation which only accounts for memory reserved via child circuit breakers, the new strategy measures real heap memory usage at the time of reservation. This allows us to be much more aggressive with the circuit breaker limit so we bump it to 95% by default. The new strategy is turned on by default and can be controlled with the new cluster setting `indices.breaker.total.userealmemory`. Note that we turn it off for all integration tests with an internal test cluster because it leads to spurious test failures which are of no value (we cannot fully control heap memory usage in tests). All REST tests, however, will make use of the real memory circuit breaker. Relates #31767	2018-07-13 10:08:28 +02:00
olcbean	334c255516	XContentTests : Insert random fields at random positions (#30867 ) Currently AbstractXContentTestCase#testFromXContent appends random fields, but in a fixed position. This PR shuffles all fields after the random fields have been appended, hence the random fields are actually added to random positions.	2018-07-12 19:10:51 +02:00
Nik Everett	38e09a1508	Switch test framework to new style requests (#31939 ) In #29623 we added `Request` object flavored requests to the low level REST client and in #30315 we deprecated the old `performRequest`s. This changes all calls in the `test/framework` project to use the new versions.	2018-07-11 10:04:17 -04:00
Armin Braun	b4087d69d2	Fix assertIngestDocument wrongfully passing (#31913 ) * Fix assertIngestDocument wrongfully passing * Previously docA being subset of docB passed because iteration was over docA's keys only * Scalars in nested fields were not compared in all cases * Assertion errors were hard to interpret (message wasn't correct since it only mentioned the class type) * In cases where two paths contained different types a ClassCastException was thrown instead of an AssertionError * Fixes #28492	2018-07-11 10:24:21 +02:00
Alexander Reelsen	1c32497c44	Date: Add DateFormatters class that uses java.time (#31856 ) A newly added class called DateFormatters now contains java.time based builders for dates, which also intends to be fully backwards compatible, when the name based date formatters are picked. Also a new class named CompoundDateTimeFormatter for being able to parse multiple different formats has been added. A duelling test class has been added that ensures the same dates when parsing java or joda time formatted dates for the name based dates. Note, that java.time and joda time are not fully backwards compatible, which also means that old formats will currently not work with this setup.	2018-07-10 09:28:28 +02:00
Yannick Welsch	cce7dc20ad	Smaller aesthetic fixes to InternalTestCluster (#31831 ) Allows cluster to auto-reconfigure faster by starting up nodes in parallel.	2018-07-06 11:42:09 +02:00
Nik Everett	1099060735	Test: Do not remove xpack templates when cleaning (#31642 ) At the end of every `ESRestTestCase` we clean the cluster which includes deleting all of the templates. If xpack is installed it'll automatically recreate a few templates every time they are removed. Which is slow. This change stops the cleanup from removing the xpack templates. It cuts the time to run the docs tests more than in half and it probably saves a bit more time on other tests as well.	2018-07-05 09:43:43 -04:00
Christoph Büscher	bd1c513422	Reduce more raw types warnings (#31780 ) Similar to #31523.	2018-07-05 15:38:06 +02:00
Simon Willnauer	3f2a241b7f	Detach Transport from TransportService (#31727 ) Today TransportService is tightly coupled with Transport since it requires an instance of TransportService in order to receive responses and send requests. This is mainly due to the Request and Response handlers being maintained in TransportService but also because of the lack of a proper callback interface. This change moves request handler registry and response handler registration into Transport and adds all necessary methods to `TransportConnectionListener` in order to remove the `TransportService` dependency from `Transport` Transport now accepts one or more `TransportConnectionListener` instances that are executed sequentially in a blocking fashion.	2018-07-04 11:32:35 +02:00
Yannick Welsch	2bb4f38371	Add writeBlob option to replace existing blob (#31729 ) Adds a new parameter to the BlobContainer#writeBlob methods to specify whether the existing file should be overridden or not. For some metadata files in the repository, we actually want to replace the current file. This is currently implemented through an explicit blob delete and then a fresh write. In case of using a cloud provider (S3, GCS, Azure), this results in 2 API requests instead of just 1. This change will therefore allow us to achieve the same functionality using fewer API requests.	2018-07-03 09:13:50 +02:00
Christoph Büscher	31aabe4bf9	Clean up double semicolon code typos (#31687 )	2018-07-02 15:14:44 +02:00
Konrad Beiske	2971dd56ca	Enable setting client path prefix to / (#30119 ) Some proxies require all requests to have paths starting with / since there are no relative paths at the HTTP connection level. Elasticsearch assumes paths are absolute. In order to run rest tests against a cluster behind such a proxy, set the system property tests.rest.client_path_prefix to /.	2018-07-01 13:42:03 -04:00
Tanguy Leroux	d8b3f332ef	Remove extra check for object existence in repository-gcs read object (#31661 )	2018-06-29 13:52:31 +02:00
Tanguy Leroux	0ef22db844	[Test] Clean up some repository-s3 tests (#31601 ) This commit removes some tests in the repository-s3 plugin that have not been executed for 2+ years but have been maintained for nothing. Most of the tests in AbstractAwsTestCase were obsolete or superseded by fixture based integration tests.	2018-06-29 13:21:29 +02:00
Ryan Ernst	f924835265	Core: Require all actions have a Task (#31627 ) The TaskManager and TaskAwareRequest could return null when registering a task according to their javadocs, but no implementations ever actually did that. This commit removes that wording from the javadocs and ensures null is no longer allowed.	2018-06-28 08:24:03 -07:00
Alpar Torok	8557bbab28	Upgrade gradle wrapper to 4.8 (#31525 ) * Move to Gradle 4.8 RC1 * Use latest version of plugin The current does not work with Gradle 4.8 RC1 * Switch to Gradle GA * Add and configure build compare plugin * add work-around for https://github.com/gradle/gradle/issues/5692 * work around https://github.com/gradle/gradle/issues/5696 * Make use of Gradle build compare with reference project * Make the manifest more compare friendly * Clear the manifest in compare friendly mode * Remove animalsniffer from buildscript classpath * Fix javadoc errors * Fix doc issues * reference Gradle issues in comments * Conditionally configure build compare * Fix some more doclint issues * fix typo in build script * Add sanity check to make sure the test task was replaced Relates to #31324. It seems like Gradle has an inconsistent behavior and the task is not always replaced. * Include number of non conforming tasks in the exception. * No longer replace test task, create implicit instead Closes #31324. The issue has full context in comments. With this change the `test` task becomes nothing more than an alias for `utest`. Some of the stand alone tests that had a `test` task now have `integTest`, and a few of them that used to have `integTest` to run multiple tests now only have `check`. This will also help separate unit/micro tests from integration tests. * Revert "No longer replace test task, create implicit instead" This reverts commit f1ebaf7d93e4a0a19e751109bf620477dc35023c. * Fix replacement of the test task Based on information from gradle/gradle#5730 replace the task taking into account the task providers. Closes #31324. * Only apply build compare plugin if needed * Make sure test runs before integTest * Fix doclint after merge * PR review comments * Switch to Gradle 4.8.1 and remove workaround * PR review comments * Consolidate task ordering	2018-06-28 08:13:21 +03:00
Luca Cavanna	a35b5341c4	[TEST] call yaml client close method from test suite (#31591 ) We added a way to close the yaml test client with #31575. Such close method also needs to be called from the test suite though for the additional clients to be closed.	2018-06-27 08:23:53 +02:00
Luca Cavanna	823a9d34da	[TEST] Close additional clients created while running yaml tests (#31575 ) We recently introduced a mechanism that allows specifying a node selector as part of do sections (see #31471). When a node selector that is not the default one is configured, a new client will be initialized with the same properties as the default one, but with the specified node selector. This commit improves this mechanism by also closing the additional clients being created and adding equals/hashcode impl to the custom node selector as they are cached into a map.	2018-06-26 16:56:35 +02:00
Alpar Torok	08b8d11e30	Add support for switching distribution for all integration tests (#30874 ) * remove left-over comment * make sure of the property for plugins * skip installing modules if these exist in the distribution * Log the distribution being run * Don't allow running with integ-tests-zip passed externally * top level x-pack/qa can't run with oss distro * Add support for matching objects in lists Makes it possible to have a key that points to a list and assert that a certain object is present in the list. All keys have to be present and values have to match. The objects in the source list may have additional fields. example: ``` match: { 'nodes.$master.plugins': { name: ingest-attachment } } ``` * Update plugin and module tests to work with other distributions Some of the tests expected that the integration tests will always be run with the `integ-test-zip` distribution so that there will be no other plugins loaded. With this change, we check for the presence of the plugin without assuming exclusivity. * Allow modules to run on other distros as well To match the behavior of tests.distributions * Add and use a new `contains` assertion Replaces the previous changes that caused `match` to do a partial match. * Implement PR review comments	2018-06-26 06:49:03 -07:00
Nik Everett	232c71b6bf	QA: Create xpack yaml features (#31403 ) This creates YAML test "features" that indicate if the cluster being tested has xpack installed (`xpack`) or if it does not have xpack installed (`no_xpack`). It uses those features to centralize skipping a few tests that fail if xpack is installed. The plan is to use this in a followup to skip docs tests that require xpack when xpack is not installed. We plan to use the declaration of required license level on the docs page to generate the required `skip`. Closes #30933.	2018-06-26 09:26:48 -04:00
Sohaib Iftikhar	ca4c857a90	Improve test times for tests using `RandomObjects::addFields` (#31556 ) Currently RandomObjects::addFields can potentially generate a large number of fields. This commit decreases the chances that a new object or array is added as a new branch of an object, which lowers the probability of ending up with very big generated documents. It also reduces the number of documents generated for the SimulatePipelineResponseTests from 10 to 5 to reduce the testing time required for parsing.	2018-06-26 12:39:53 +02:00
Christoph Büscher	86ab3a2d1a	Reduce number of raw types warnings (#31523 ) A first attempt to reduce the number of raw type warnings, most of the time by using the unbounded wildcard.	2018-06-25 15:59:03 +02:00
Jonathan Little	8e4768890a	Migrate scripted metric aggregation scripts to ScriptContext design (#30111 ) * Migrate scripted metric aggregation scripts to ScriptContext design #29328 * Rename new script context container class and add clarifying comments to remaining references to params._agg(s) * Misc cleanup: make mock metric agg script inner classes static * Move _score to an accessor rather than an arg for scripted metric agg scripts This causes the score to be evaluated only when it's used. * Documentation changes for params._agg -> agg * Migration doc addition for scripted metric aggs _agg object change * Rename "agg" Scripted Metric Aggregation script context variable to "state" * Rename a private base class from ...Agg to ...State that I missed in my last commit * Clean up imports after merge	2018-06-25 12:01:33 +01:00
Vladimir Dolzhenko	f04c579203	IndexShard should not return null stats (#31528 ) IndexShard should not return null stats - empty stats or AlreadyClosedException if it's closed is better	2018-06-22 21:08:11 +02:00
Luca Cavanna	16e4e7a7cf	Node selector per client rather than per request (#31471 ) We have made node selectors configurable per request, but all of other language clients don't allow for that. A good reason not to do so, is that having a different node selector per request breaks round-robin. This commit makes NodeSelector configurable only at client initialization. It also improves the docs on this matter, important given that a single node selector can still affect round-robin.	2018-06-22 17:15:29 +02:00
Ryan Ernst	59e7c6411a	Core: Combine messageReceived methods in TransportRequestHandler (#31519 ) TransportRequestHandler currently contains 2 messageReceived methods, one which takes a Task, and one that does not. The first just delegates to the second. This commit changes all existing implementors of TransportRequestHandler to implement the version which takes Task, thus allowing the class to be a functional interface, and eliminating the need to throw exceptions when a task needs to be ensured.	2018-06-22 07:36:03 -07:00
Yannick Welsch	f22f91c57a	Allow multiple unicast host providers (#31509 ) Introduces support for multiple host providers, which allows the settings based hosts resolver to be treated just as any other UnicastHostsProvider. Also introduces the notion of a HostsResolver so that plugins such as FileBasedDiscovery do not need to create their own thread pool for resolving hosts, making it easier to add new similar kind of plugins.	2018-06-22 15:31:23 +02:00
Yannick Welsch	da69ab28c7	Return transport addresses from UnicastHostsProvider (#31426 ) With #20695 we removed local transport and there is just TransportAddress now. The UnicastHostsProvider currently returns DiscoveryNode instances, where, during pinging, we're actually only making use of the TransportAddress to establish a first connection to the possible new node. To simplify the interface, we can just return a list of transport addresses instead, which means that it's not necessary anymore to create fake node objects in each plugin just to return the address information.	2018-06-21 16:00:26 +02:00
Tim Brooks	86423f9563	Ensure local addresses aren't null (#31440 ) Currently we set local addresses on the creation time of a NioChannel. However, this may return null as the local address may not have been set yet. An example is the local address has not been set on a client channel as the connection process is not yet complete. This PR modifies the getter to set the local field if it is currently null.	2018-06-20 19:50:14 -06:00
Ryan Ernst	00283a61e1	Remove unused generic type for client execute method (#31444 ) This commit removes the request builder generic type for AbstractClient as it was unused.	2018-06-20 16:26:26 -07:00
Tim Brooks	9ab1325953	Introduce http and tcp server channels (#31446 ) Historically in TcpTransport server channels were represented by the same channel interface as socket channels. This was necessary as TcpTransport was parameterized by the channel type. This commit introduces TcpServerChannel and HttpServerChannel classes. Additionally, it adds the implementations for the various transports. This allows server channels to have unique functionality and not implement the methods they do not support (such as send and getRemoteAddress). Additionally, with the introduction of HttpServerChannel this commit extracts some of the storing and closing channel work to the abstract http server transport.	2018-06-20 16:34:56 -06:00
Nhat Nguyen	db1b97fd85	Remove QueryCachingPolicy#ALWAYS_CACHE (#31451 ) The QueryCachingPolicy#ALWAYS_CACHE was deprecated in Lucene-7.4 and will be removed in Lucene-8.0. This change replaces it with QueryCachingPolicy. This also makes INDEX_QUERY_CACHE_EVERYTHING_SETTING visible in testing only.	2018-06-20 10:34:08 -04:00
Tim Brooks	529e704b11	Unify http channels and exception handling (#31379 ) This is a general cleanup of channels and exception handling in http. This commit introduces a CloseableChannel that is a superclass of TcpChannel and HttpChannel. This allows us to unify the closing logic between tcp and http transports. Additionally, the normal http channels are extracted to the abstract server transport. Finally, this commit (mostly) unifies the exception handling between nio and netty4 http server transports.	2018-06-19 11:50:03 -06:00
Ryan Ernst	e67aa96c81	Core: Combine Action and GenericAction (#31405 ) Since #30966, Action no longer has anything but a call to the GenericAction super constructor. This commit renames GenericAction into Action, thus eliminating the Action class. Additionally, this commit removes the Request generic parameter of the class, since it was unused.	2018-06-18 23:53:04 +02:00
Simon Willnauer	3d5f113ada	Ensure we don't use a remote profile if cluster name matches (#31331 ) If we are running into a race condition between a node being configured to be a remote node for cross cluster search etc. and that node joining the cluster we might connect to that node with a remote profile. If that node now joins the cluster it connected to as a CCS remote node, we use the wrong profile and can't use bulk connections etc. anymore. This change uses the remote profile only if we connect to a node that has a different cluster name than the local cluster. This is not a perfect fix for this situation but is the safe option, while potentially only losing a small optimization of using fewer connections per node, which is small anyway since we only connect to a small set of nodes. Closes #29321	2018-06-17 13:32:53 +02:00
Tal Levy	3b70e943eb	add is-write-index flag to aliases (#30942 ) This commit adds the is-write-index flag for aliases. It allows requests to set the flag, and responses to display the flag. It does not validate and/or affect any indexing/getting/updating behavior of Elasticsearch -- this will be done in a follow-up PR.	2018-06-15 08:45:29 -07:00
Nhat Nguyen	8453ca638d	Upgrade to Lucene-7.4.0-snapshot-518d303506 (#31360 )	2018-06-15 10:58:21 -04:00
Nik Everett	856936c286	REST Client: NodeSelector for node attributes (#31296 ) Add a `NodeSelector` so that users can filter the nodes that receive requests based on node attributes. I believe we'll need this to backport #30523 and we want it anyway. I also added a bash script to help with rebuilding the sniffer parsing test documents.	2018-06-15 08:04:54 -04:00
Nhat Nguyen	e5b7137508	TEST: getCapturedRequestsAndClear should be atomic (#31312 ) We might lose messages between getCapturedRequestsAndClear calls. This commit makes sure that both getCapturedRequestsAndClear and getCapturedRequestsByTargetNodeAndClear are atomic.	2018-06-14 21:32:07 -04:00
Tim Brooks	fcf1e41e42	Extract common http logic to server (#31311 ) This is related to #28898. With the addition of the http nio transport, we now have two different modules that provide http transports. Currently most of the http logic lives at the module level. However, some of this logic can live in server. In particular, some of the setting of headers, cors, and pipelining. This commit begins moving in that direction by introducing lower level abstractions (HttpChannel, HttpRequest, and HttpResponse) that are implemented by the modules. The higher level rest request and rest channel work can live entirely in server.	2018-06-14 15:10:02 -06:00
Tanguy Leroux	bbfe1eccc7	[Tests] Mutualize fixtures code in BaseHttpFixture (#31210 ) Many fixtures have similar code for writing the pid & ports files or for handling HTTP requests. This commit adds an AbstractHttpFixture class in the test framework that can be extended for specific testing purposes.	2018-06-14 14:09:56 +02:00
Tanguy Leroux	4d7447cb5e	Reenable Checkstyle's unused import rule (#31270 )	2018-06-14 09:52:46 +02:00
Nik Everett	77bb93557e	Test: Remove broken yml test feature (#31255 ) The `requires_replica` yaml test feature hasn't worked for years. This is what happens if you try to use it: ``` > Throwable #1: java.lang.NullPointerException > at __randomizedtesting.SeedInfo.seed([E6602FB306244B12:6E341069A8D826EA]:0) > at org.elasticsearch.test.rest.yaml.Features.areAllSupported(Features.java:58) > at org.elasticsearch.test.rest.yaml.section.SkipSection.skip(SkipSection.java:144) > at org.elasticsearch.test.rest.yaml.ESClientYamlSuiteTestCase.test(ESClientYamlSuiteTestCase.java:321) ``` None of our tests use it.	2018-06-13 09:33:06 -04:00
Ryan Ernst	a65b18f19d	Core: Remove plain execute method on TransportAction (#30998 ) TransportAction has many variants of execute. One of those variants executes by returning a future, which is then often blocked on by calling get(). This commit removes this variant of execute, instead using a helper method for tests that want to block, or having tests pass in a PlainActionFuture directly as a listener. Co-authored-by: Simon Willnauer <simonw@apache.org>	2018-06-13 09:58:13 +02:00

1681 Commits