Commit Graph

179 Commits

Author SHA1 Message Date
Nik Everett c6c9075ca4
Switch rolling restart to new style Requests ()
In  we added `Request` object flavored requests to the low level
REST client and in  we deprecated the old `performRequest`s. This
changes all calls in the `qa/rolling-upgrade` project to use the new
versions.
2018-07-20 12:01:50 -04:00
Nhat Nguyen 51151027cd TEST: Add bwc recovery tests with synced-flush index
Although the master branch is not affected by , it's helpful to
have BWC tests that verify the peer recovery with a synced-flush index.
This commit adds the bwc tests from  to the master branch.

Relates 
Relates 
2018-06-22 20:14:06 -04:00
Tanguy Leroux bf58660482
Remove all unused imports and fix CRLF ()
The X-Pack opening and the recent other refactorings left a lot of 
unused imports in the codebase. This commit removes them all.
2018-06-11 15:12:12 +02:00
Nik Everett dfcc939ef8 QA: Better seed nodes for rolling restart
Use all running nodes as unicast seeds in the rolling restart tests to
avoid a race between pinging and the tests. Without this if the tests
are too fast then when a new node comes up and pings its single
configured seed node that node *might* not have a ping from the other
running node.
2018-06-07 13:30:37 -04:00
Nik Everett 56207ea43d QA: Set better node names on rolling restart tests
These should help with debugging failures.
2018-06-07 11:25:41 -04:00
Nik Everett 7c59e7690e
QA: Switch xpack rolling upgrades to three nodes ()
This is much more realistic and can find more issues. This causes the
"mixed cluster" tests to be run twice so I had to fix the tests to work
in that case. In most cases I did as little as possible to get them
working but in a few cases I went a little beyond that to make them
easier for me to debug while getting them to work. My test changes:

1. Remove the "basic indexing" tests and replace them with a copy of the
tests used in the OSS. We have no way of sharing code between these two
projects so for now I copy.
2. Skip a few tests in the "one third" upgraded scenario:
  * creating a scroll to be reused when the cluster is fully upgraded
  * creating some ml data to be used when the cluster is fully upgraded
3. Drop many "assert yellow and that the cluster has two nodes"
assertions. These assertions duplicate those made by the wait condition
and they fail now that we have three nodes.
4. Switch many "assert green and that the cluster has two nodes" to 3
nodes. These assertions are unique from the wait condition and, while
I imagine they aren't required in all cases, now is not the time to
find that out. Thus, I made them work.
5. Rework the index audit trail test so it is more obvious that it is
the same test expecting different numbers based on the shape of the
cluster. The conditions for which number is expected are fairly
complex because the index audit trail is shut down until the template
for it is upgraded and the template is upgraded when a master node is
elected that has the new version of the software.
6. Add some more information to debug the index audit trail test because
it helped me figure out what was going on.

I also dropped the `waitCondition` from the `rolling-upgrade-basic`
tests because it wasn't needed.

Closes 
2018-06-06 11:59:16 -04:00
Nik Everett a5d90e919f
QA: Add xpack tests to rolling upgrade ()
A rolling upgrade from oss Elasticsearch to the default distribution of
Elasticsearch is significantly different than a full cluster restart to
install a plugin and is again different from starting a new cluster with
xpack installed. So this adds some basic tests to make sure that the
rolling upgrade that enables xpack works at all.

This also removes some unused imports from the tests that I modified in
PR . I didn't mean to leave them.
2018-05-22 17:17:34 -04:00
Nik Everett be8c0f27be
QA: Switch rolling upgrade to 3 nodes ()
Switches the rolling upgrade tests from upgrading two nodes to upgrading
three nodes which is much more realistic and much better able to find
unexpected bugs. It upgrades the nodes one at a time and runs tests
between each upgrade. As such this now has four test runs:

1. Old
2. One third upgraded
3. Two thirds upgraded
4. Upgraded

It sets system properties so the tests can figure out which stage they
are in. It reuses the same yml tests for the "one third" and "two
thirds" cases because they are *almost* the same case.

This rewrites the yml-based indexing tests to be Java based because the
yml-based tests can't handle different expected values for the counts.
And the indexing tests need that when they are run twice.
2018-05-21 13:15:13 -04:00
Ryan Ernst 3a64e6d121
Test: Remove specifying zip distribution in qa tests ()
Applying the rest test gradle plugin already uses the zip distribution
by default, so specifying it explicitly is not necessary. These are
leftovers from before zip was the default for rest tests.
2018-02-23 13:45:38 -08:00
Igor Motov c90d0fdf6b Tests: don't wait for completion while trying to get completed task
Nodes are reusing task ids after restart. So in some rare circumstances
the same task id might be assigned to the reindexing task stored by the
old cluster and the new task that is trying to retrieve the task
results. As a result, the get task request can timeout waiting on
itself. Since we already waited for the task to finish before restarting
the cluster, waiting for the task here doesn't make any sense to start
with.

Fixes 
2018-02-20 14:14:47 -05:00
Michael Basnight e0bea70070
Generalize BWC logic ()
Generalizing BWC building so that there is less code to modify for a release. This ensures we do not
need to think about what major or minor version is in the gradle code. It follows the general rules of the
elastic release structure. For more information on the rules, see the VersionCollection's javadoc.

This also removes the additional bwc snapshots that will never be released, such as 6.0.2, which were
being built and tested against every time we ran bwc tests.

Additionally, it creates 4 new projects that correspond to the different types of snapshots that may exist
for a given version. It's now possible to run those individual tasks to work out bwc logic, whereas
previously it was impossible and the entire suite of bwc tests had to be run to work out any logic
changes in the build tools' bwc project. Please note that if a project does not make sense for the
current version, an error will be thrown from that individual project if an attempt is made to
run it.

This should allow for automating the version bumps as well, since it removes all the hardcoded version
logic from the configs.
2018-02-09 14:55:10 -06:00
Luca Cavanna d860971572
REST high-level client: add support for split and shrink index API ()
Relates to 
2018-02-01 16:37:01 +01:00
Igor Motov c75ac319a6
Add ability to associate an ID with tasks ()
Adds support for capturing the X-Opaque-Id header from a REST request and storing its value in the tasks that this request started. It works for all user-initiated tasks (not only search).

Closes 

Usage:
```
$ curl -H "X-Opaque-Id: imotov" -H "foo:bar" "localhost:9200/_tasks?pretty&group_by=parents"
{
  "tasks" : {
    "7qrTVbiDQKiZfubUP7DPkg:6998" : {
      "node" : "7qrTVbiDQKiZfubUP7DPkg",
      "id" : 6998,
      "type" : "transport",
      "action" : "cluster:monitor/tasks/lists",
      "start_time_in_millis" : 1513029940042,
      "running_time_in_nanos" : 266794,
      "cancellable" : false,
      "headers" : {
        "X-Opaque-Id" : "imotov"
      },
      "children" : [
        {
          "node" : "V-PuCjPhRp2ryuEsNw6V1g",
          "id" : 6088,
          "type" : "netty",
          "action" : "cluster:monitor/tasks/lists[n]",
          "start_time_in_millis" : 1513029940043,
          "running_time_in_nanos" : 67785,
          "cancellable" : false,
          "parent_task_id" : "7qrTVbiDQKiZfubUP7DPkg:6998",
          "headers" : {
            "X-Opaque-Id" : "imotov"
          }
        },
        {
          "node" : "7qrTVbiDQKiZfubUP7DPkg",
          "id" : 6999,
          "type" : "direct",
          "action" : "cluster:monitor/tasks/lists[n]",
          "start_time_in_millis" : 1513029940043,
          "running_time_in_nanos" : 98754,
          "cancellable" : false,
          "parent_task_id" : "7qrTVbiDQKiZfubUP7DPkg:6998",
          "headers" : {
            "X-Opaque-Id" : "imotov"
          }
        }
      ]
    }
  }
}
```
2018-01-12 15:34:17 -05:00
Jason Tedor ccaba016ac
Fix task ordering in rolling upgrade tests
The configuration of the upgraded cluster task was missing a dependency
on the stopping of the second old node in the cluster. In some cases
(e.g., --parallel) Gradle would then try to run the configuration of a
node in the upgraded cluster before it had even configured the old nodes
in the cluster.

Relates 
2018-01-09 17:20:55 -05:00
Boaz Leskes 0b50b313d2 RecoveryIT.testRecoveryWithConcurrentIndexing should check for 110 docs in an upgraded cluster
Closes 
2017-12-04 18:06:02 +01:00
Boaz Leskes f58a3d0b96 testRelocationWithConcurrentIndexing: wait for green (on relevant index) and shard initialization to settle down before starting relocation 2017-12-04 13:18:42 +01:00
Boaz Leskes 1a976ea7a4 Cherry pick tests and seqNo recovery hardening from 2017-12-04 13:15:40 +01:00
David Turner 89ba8996c6 Consolidate version numbering semantics ()
Fixes to the build system, particularly around BWC testing, and to make future
version bumps less painful.
2017-11-23 20:21:53 +00:00
Martijn van Groningen 4f43fe70cb
test: Sort hits by _id instead of _doc and
cleanup tests by removing unneeded parameter and settings.
2017-11-10 12:11:51 +01:00
Martijn van Groningen b4048b4e7f
Use CoveringQuery to select percolate candidate matches and
extract all clauses from a conjunction query.

When clauses from a conjunction are extracted the number of clauses is
also stored in an internal doc values field (minimum_should_match field).
This field is used by the CoveringQuery and allows the percolator to
reduce the number of false positives when selecting candidate matches and
in certain cases be absolutely sure that a conjunction candidate match
will match and then skip MemoryIndex validation. This can greatly improve
performance.

Before this change only a single clause was extracted from a conjunction
query. The percolator tried to extract the clause that was rarest (based
on term length) so that fewer candidate queries would be selected
in the first place. However, with this method there is still a very high
chance that candidate query matches are false positives.

This change also removes the influencing query extraction added via 
as this is no longer needed because now all conjunction clauses are extracted.

https://www.elastic.co/guide/en/elasticsearch/reference/6.x/percolator.html#_influencing_query_extraction

Closes 
2017-11-10 07:44:42 +01:00
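
To illustrate the candidate selection described in commit b4048b4e7f above, here is a minimal sketch in Python rather than the percolator's actual Java code. It assumes each stored conjunction query is indexed with all of its extracted terms plus a count of required clauses, and that a query is a candidate only when at least that many of its terms occur in the document. The names `StoredQuery`, `conjunction` and `candidates` are hypothetical and do not come from Lucene or Elasticsearch.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Set


@dataclass(frozen=True)
class StoredQuery:
    """A percolator query reduced to its extracted terms (hypothetical model)."""
    name: str
    terms: FrozenSet[str]
    minimum_should_match: int  # number of extracted clauses that must match


def conjunction(name: str, terms: Iterable[str]) -> StoredQuery:
    # For a conjunction every clause is required, so the stored
    # minimum_should_match value equals the number of extracted clauses.
    term_set = frozenset(terms)
    return StoredQuery(name, term_set, len(term_set))


def candidates(queries: Iterable[StoredQuery], document_terms: Set[str]) -> List[str]:
    # Covering-style selection: keep a query only if enough of its
    # extracted terms are present in the document being percolated.
    return [
        q.name
        for q in queries
        if len(q.terms & document_terms) >= q.minimum_should_match
    ]


queries = [
    conjunction("q1", ["quick", "fox"]),
    conjunction("q2", ["quick", "dog"]),
]
selected = candidates(queries, {"quick", "brown", "fox"})
```

In this sketch `q2` is not selected even though it shares the term `quick` with the document, which is the kind of false positive that extracting only a single clause cannot rule out.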
Simon Willnauer 8dda827ff4 Don't refresh on `_flush` `_force_merge` and `_upgrade` ()
Today all these API calls have a side effect of making documents visible
to search requests. While this is sometimes desired it's an unnecessary side effect
and now that we have an internal (engine-private) index reader () we artificially
add a refresh call for bwc. This change removes this side effect in 7.0.
2017-10-16 10:16:35 +02:00
Yannick Welsch a4436195f8 Set minimum_master_nodes on rolling-upgrade test ()
The rolling-upgrade test was only writing the "minimum_master_nodes" setting to the configuration file of the old nodes, but not the upgraded ones.

Also changes the value of "minimum_master_nodes" from "number_of_nodes" to "(number_of_nodes / 2) + 1".
2017-10-09 10:45:03 +02:00
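
As a worked example of the quorum formula quoted in commit a4436195f8 above, the following Python sketch evaluates "(number_of_nodes / 2) + 1" with integer division. The function name is hypothetical, and the assumption that the division truncates is mine; the commit message only gives the formula.

```python
def minimum_master_nodes(number_of_nodes: int) -> int:
    """Smallest strict majority of the given number of nodes."""
    # Integer division is assumed here, so 3 nodes give 3 // 2 + 1 == 2.
    return number_of_nodes // 2 + 1


# Quorum sizes for a few small cluster sizes.
quorums = {n: minimum_master_nodes(n) for n in (1, 2, 3, 5)}
```

With the previous value of "number_of_nodes", a three-node cluster would need all three nodes to elect a master; with the majority formula it needs two.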
Boaz Leskes 2a04118e88 Promote common rest test utility methods to ESRestTestCase
We have duplicates in some classes and I was about to create one more.
2017-10-05 10:08:10 +02:00
Boaz Leskes 4f8131026e RecoveryIT.testHistoryUUIDIsGenerated should reduce unassigned shards delay instead of ensure green.
The ensure green approach to avoid allocation delays caused problems with other indices created by other tests which didn't use ensure green in the various cluster stages. This aligns testHistoryUUIDIsGenerated to use the same approach used by the other test.
2017-09-30 16:48:23 +02:00
Boaz Leskes 5df77a8c91 enable debug logging for testHistoryUUIDIsGenerated (+1 squashed commit)
Squashed commits:
[1d4f268] enable debug logging for testHistoryUUIDIsGenerated
2017-09-26 14:49:47 +02:00
Jay Modi b8cd82e5c2 Increase time to wait for green in rolling upgrade tests ()
This commit increases the amount of time to wait for green to account for unassigned shards that
have been delayed. The default delay is 60s, so we need to wait longer than that. Previously, the
wait would timeout at 30s due to the rest client and the default for the cluster health api.

Closes 
2017-09-25 12:39:33 -06:00
Boaz Leskes cd2a4372b4 RecoveryIT should wait for green when in mixed cluster to avoid unassigned shards
The test starts with two old nodes and creates indices (without waiting for green, which is fixed here too). Then it restarts one of the nodes and waits for it to join the cluster. This wait condition only uses wait for yellow as our generic infra doesn't know how many nodes are there in total. Once the restarted node is part of the cluster (mixed mode) the second old node is restarted. If indices are not fully allocated when that happens, the shards will go into delayed unassigned mode. If the recovery of the replica never completed we may end up with corrupted / no secondary copy on the node. This will cause the shards to be delayed for 1m before being reassigned and the test will time out.
2017-09-24 22:38:20 +02:00
Boaz Leskes 2b6f75730e RecoveryIT up client time out to 40s to see response in a 30s time 2017-09-24 21:33:20 +02:00
Boaz Leskes 04385a9ce9 Restoring from snapshot should force generation of a new history uuid ()
Restoring a shard from snapshot throws the primary back in time violating assumptions and bringing the validity of global checkpoints in question. To avoid problems, we should make sure that a shard that was restored will never be the source of an ops based recovery to a shard that existed before the restore. To this end we have introduced the notion of `history_uuid` in  and required that both source and target will have the same history to allow ops based recoveries. This PR makes sure that a shard gets a new uuid after restore.

As suggested by @ywelsch , I derived the creation of a `history_uuid` from the `RecoverySource` of the shard. Store recovery will only generate a uuid if it doesn't already exist (we can make this stricter when we don't need to deal with 5.x indices). Peer recovery follows the same logic (note that this is different than the approach in , I went this way as it means that shards always have a history uuid after being recovered on a 6.x node and will also mean that a rolling restart is enough for old indices to step over to the new seq no model). Local shards and snapshot force the generation of a new translog uuid.

Relates 
Closes 
2017-09-19 15:58:36 +02:00
Ryan Ernst 072281d5aa Update version to 7.0.0-alpha1 ()
This commit updates the version for master to 7.0.0-alpha1. It also adds
the 6.1 version constant, and fixes many tests, as well as marking some
as awaits fix.

Closes 
Closes 
2017-08-01 15:47:48 -04:00
Jason Tedor 8eb4a3f6fa Remove busted rolling upgrade script test
This commit removes a rolling upgrade test for scripting that is totally
busted yet is preventing builds from succeeding. We elect to remove this
test as opposed to skipping the test as:
 - it has been being skipped for months with no apparent loss
 - it appears to need significant work to get to an unbusted state
2017-07-30 12:01:03 +09:00
Ryan Ernst 072402463b Scripting: Remove search template actions ()
The dedicated search template put/get/delete actions are deprecated in
5.6. This commit removes them from 6.0.
2017-07-14 23:12:05 -07:00
Martijn van Groningen a85b22b298
test: put template api is deprecated, so take warnings into account
Relates to 
2017-07-13 11:39:53 +02:00
Ali Beyad cc1f40ca18 Fix cluster health wait conditions in rolling restart tests
In the rolling upgrade tests, there is a test to create an index with
replica shards and ensure that in the mixed cluster environment, the
cluster health is green before any other tests are executed.  However,
there were two problems with this.  First, if the replica shard was
residing on the restarted node, then delayed allocation will kick in and
cause the cluster health request to timeout after 1m.  The fix to this
was to drastically lower the delayed allocation setting.  Second, if the
primary exists on the higher version node, then the replica cannot be
assigned to the lower version node because recovery cannot happen from
lower lucene versions.  The fix here was to wait for the cluster health
to be yellow instead of green in the mixed cluster environment.  In the
fully upgraded cluster, the cluster health check waits for a green
cluster as before.

Closes 
2017-07-06 14:35:07 -04:00
Simon Willnauer 7c637a0bfe Ensure `index.mapping.single_type` can only be set on 5.x indices ()
In 6.x we prevent multiple types and default to `index.mapping.single_type: false`
This change removes the registered setting and ensures that it's preserved for
5.x indices.

Relates to 
2017-07-05 15:16:40 +02:00
Ryan Ernst 106e373412 Build: Add master flag for disabling bwc tests ()
This commit adds a gradle project, set inside the root build.gradle,
which controls all our bwc tests. This allows for seamless (ie no errant
CI failures) backporting of behavior.
2017-06-14 22:01:49 -07:00
Jay Modi ed76b9a518 Test: allow setting socket timeout for rest client ()
In , a setting was added to allow setting the retry timeout for the rest client under the
impression that this would allow requests to go longer than 30s. However, there is also a socket
timeout that needs to be set to greater than 30s, which this change adds a setting for.
2017-06-14 08:21:56 -06:00
Jay Modi 190242fb1b Test: add setting to change request timeout for rest client ()
This commit adds a setting to change the request timeout for the rest client. This is useful as the
default timeout is 30s, which is also the same default for calls like cluster health. If both are
the same then the response from the cluster health api will not be received as the client usually
times out first making test failures harder to debug.

Relates 
2017-06-13 12:19:17 -06:00
Ryan Ernst a03b6c2fa5 Scripting: Change keys for inline/stored scripts to source/id ()
This commit adds back "id" as the key within a script to specify a
stored script (which with file scripts now gone is no longer ambiguous).
It also adds "source" as a replacement for "code". This is in an attempt
to normalize how scripts are specified across both put stored scripts and script usages, including search template requests. This also deprecates the old inline/stored keys.
2017-06-09 08:29:25 -07:00
Simon Willnauer 4d423bf2ba Add a dummy_index to upgrade tests to ensure we recover fine with replicas ()
We default to 0 replicas in the rolling restart scenario already to ensure
we test against worst case. Yet, this adds a dummy index to ensure we also
recover and index with replicas just fine.
2017-05-29 17:36:44 +02:00
Nik Everett e072cc7770 Begin replacing static index tests with full restart tests ()
These tests spin up two nodes of an older version of Elasticsearch,
create some stuff, shut down the nodes, start the current version,
and verify that the created stuff works.

You can run `gradle qa:full-cluster-restart:check` to run these
tests against the head of the previous branch of Elasticsearch
(5.x for master, 5.4 for 5.x, etc) or you can run
`gradle qa:full-cluster-restart:bwcTest` to run this test against
all "index compatible" versions, one after the other. For master
this is every released version in the 5.x.y version *and* the tip
of the 5.x branch.

I'd love to add more to these tests in the future but these
currently just cover the functionality of the `create_bwc_index.py`
script and start to cover the assertions in the
`OldIndexBackwardsCompatibilityIT` test.
2017-05-26 14:07:48 -04:00
Ryan Ernst 0353bd1fb6 Test: Convert rolling upgrade test to have task per wire compat version ()
This commit changes the rolling upgrade test to create a set of rest
test tasks per wire compat version. The most recent wire compat version
is always tested with the `integTest` task, and all versions can be
tested with `bwcTest`.
2017-05-18 01:14:24 -07:00
Ryan Ernst ff34434bba Build: Extract all ES versions into gradle properties ()
This commit expands the logic for version extraction from Version.java
to include a list of all versions for backcompat purposes. The tests
using bwcVersion are converted to use this list, but those tests
(rolling upgrade and backwards-5.0) are still not randomized; that will
happen in another followup.
2017-05-17 12:58:37 -07:00
Ryan Ernst 2a65bed243 Tests: Change rest test extension from .yaml to .yml ()
This commit renames all rest test files to use the .yml extension
instead of .yaml. This way the extension used within all of
elasticsearch for yaml is consistent.
2017-05-16 17:24:35 -07:00
Nik Everett 4423e1b78f Test search templates during rolling upgrade test ()
In  we fix an issue with stored search templates that
this test would have discovered: stored search templates cause
the node to refuse to start. Technically a "restart" test would
have caught this as well and would have caught it more quickly.
But we already *have* an upgrade test and we don't have restart tests.
And testing this on upgrade is a good thing too.
2017-04-22 13:37:13 -04:00
Ryan Ernst 212f24aa27 Tests: Clean up rest test file handling ()
This change simplifies how the rest test runner finds test files and
removes all leniency.  Previously multiple prefixes and suffixes would
be tried, and tests could exist inside or outside of the classpath,
although outside of the classpath never quite worked. Now only classpath
tests are supported, and only one resource prefix is supported,
`/rest-api-spec/tests`.

closes 
2017-04-18 15:07:08 -07:00
Ryan Ernst a8017ff020 Tests: Move cluster dependencies from runner to cluster ()
After splitting integ tests into cluster configuration and the test
runner task, we still have dependencies of the test runner added as deps
of the cluster. This commit adds dependencies directly to the cluster,
so that the runner can have other dependencies independent of what is
needed for the cluster.
2017-04-17 16:02:46 -07:00
Ryan Ernst cc1addeac2 Build: Find bwc version during build ()
We currently have the last minor version of the previous major hardcoded
in tests like rolling upgrade. This change programmatically finds this
during gradle initialization by parsing versions from Version.java.
2017-03-29 12:11:38 -07:00
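
The idea in commit cc1addeac2 above, finding the bwc version by parsing version constants, can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the Gradle code from the commit: the sample text in `SAMPLE`, the `V_<major>_<minor>_<revision>` constant naming and both function names are hypothetical.

```python
import re
from typing import List, Tuple

# Assumed shape of the version constants; the real file layout may differ.
SAMPLE = """
public static final Version V_5_2_0 = null;
public static final Version V_5_3_0 = null;
public static final Version V_6_0_0_alpha1 = null;
public static final Version CURRENT = V_6_0_0_alpha1;
"""

# Matches declarations such as "Version V_5_3_0" and captures the three numbers.
VERSION_CONSTANT = re.compile(r"Version\s+V_(\d+)_(\d+)_(\d+)")


def parse_versions(source: str) -> List[Tuple[int, int, int]]:
    """Return the sorted, de-duplicated (major, minor, revision) tuples found."""
    found = {
        (int(m.group(1)), int(m.group(2)), int(m.group(3)))
        for m in VERSION_CONSTANT.finditer(source)
    }
    return sorted(found)


def last_version_of_previous_major(versions: List[Tuple[int, int, int]]) -> Tuple[int, int, int]:
    """Highest version whose major is one less than the highest major seen."""
    current_major = max(v[0] for v in versions)
    previous = [v for v in versions if v[0] == current_major - 1]
    if not previous:
        raise ValueError("no version found for the previous major")
    return max(previous)


bwc_version = last_version_of_previous_major(parse_versions(SAMPLE))
```

For the sample text this selects 5.3.0, so nothing needs to be hardcoded in the test configuration when a new minor is added.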
Ryan Ernst 175bda64a0 Build: Rework integ test setup and shutdown to ensure stop runs when desired ()
Gradle's finalizedBy on tasks only ensures one task runs after another,
but not immediately after. This is problematic for our integration tests
since it allows multiple projects' integ test clusters to be running
simultaneously. While this has not been a problem thus far (gradle 2.13
happened to keep the finalizedBy tasks close enough that no clusters
were running in parallel), with gradle 3.3 the task graph generation has
changed, and numerous clusters may be running simultaneously, causing
memory pressure, and thus generally slower tests, or even failure if the
system has a limited amount of memory (eg in a vagrant host).

This commit reworks how integ tests are configured. It adds an
`integTestCluster` extension to gradle which is equivalent to the current
`integTest.cluster` and moves the rest test runner task to
`integTestRunner`.  The `integTest` task is then just a dummy task,
which depends on the cluster runner task, as well as the cluster stop
task. This means running `integTest` in one project will both run the
rest tests, and shut down the cluster, before running `integTest` in
another project.
2017-02-22 12:43:15 -08:00
Jay Modi b234644035 Enforce Content-Type requirement on the rest layer and remove deprecated methods ()
This commit enforces the requirement of Content-Type for the REST layer and removes the deprecated methods in transport
requests and their usages.

While doing this, it turns out that there are many places where *Entity classes are used from the apache http client
libraries and many of these usages did not specify the content type. The methods that do not specify a content type
explicitly have been added to forbidden apis to prevent more of these from entering our code base.

Relates 
2017-02-17 14:45:41 -05:00
Ali Beyad a6389a30f2 [TEST] bumps rolling upgrade test version to 5.4.0-snapshot 2017-02-15 10:59:24 -05:00
Ali Beyad 22e64bdf4f [TEST] remove stored scripts rolling upgrade test as it doesn't apply to 6.0 2017-02-03 12:30:42 -05:00
Ali Beyad 43aadef23a [TEST] upgrade backward compatibility version of rolling upgrade
tests to 5.3.0-SNAPSHOT
2017-02-02 09:56:38 -05:00
Ali Beyad 3c1637f413 Changes the rolling upgrade version test to test against
5.2.0 instead of 5.2.0-SNAPSHOT
2017-02-01 22:20:11 -05:00
Jack Conradson 3d2626c4c6 Change Namespace for Stored Script to Only Use Id ()
Currently, stored scripts use a namespace of (lang, id) to be put, get, deleted, and executed. This is not necessary since the lang is stored with the stored script. A user should only have to specify an id to use a stored script. This change makes that possible while keeping backwards compatibility with the previous namespace of (lang, id). Anywhere the previous namespace is used will log deprecation warnings.

The new behavior is the following:

When a user specifies a stored script, that script will be stored under both the new namespace and old namespace.

Take for example script 'A' with lang 'L0' and data 'D0'. If we add script 'A' to the empty set, the scripts map will be ["A" -- D0, "A#L0" -- D0]. If a script 'A' with lang 'L1' and data 'D1' is then added, the scripts map will be ["A" -- D1, "A#L1" -- D1, "A#L0" -- D0].

When a user deletes a stored script, that script will be deleted from both the new namespace (if it exists) and the old namespace.

Take for example a scripts map with {"A" -- D1, "A#L1" -- D1, "A#L0" -- D0}. If a script is removed specified by an id 'A' and lang null then the scripts map will be {"A#L0" -- D0}. To remove the final script, the deprecated namespace must be used, so an id 'A' and lang 'L0' would need to be specified.

When a user gets/executes a stored script, if the new namespace is used then the script will be retrieved/executed using only 'id', and if the old namespace is used then the script will be retrieved/executed using 'id' and 'lang'
2017-01-31 13:27:02 -08:00
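
The put and delete behavior described in commit 3d2626c4c6 above can be traced with a small Python sketch. It models the scripts map as a dictionary keyed by `id` (new namespace) and `id#lang` (old namespace), as in the commit's examples. The function names are hypothetical, and the handling of the new-namespace entry when deleting by `id` and `lang` is an assumption, since the commit message does not spell it out.

```python
from typing import Dict, Optional, Tuple

Script = Tuple[str, str]  # (lang, data); the lang is stored with the script


def put_script(scripts: Dict[str, Script], script_id: str, lang: str, data: str) -> None:
    # Store under both the new namespace (id) and the old one (id#lang).
    scripts[script_id] = (lang, data)
    scripts[f"{script_id}#{lang}"] = (lang, data)


def delete_script(scripts: Dict[str, Script], script_id: str, lang: Optional[str] = None) -> None:
    if lang is None:
        # New namespace: remove the id entry and its matching id#lang entry.
        stored = scripts.pop(script_id, None)
        if stored is not None:
            scripts.pop(f"{script_id}#{stored[0]}", None)
    else:
        # Deprecated namespace: remove the id#lang entry.
        scripts.pop(f"{script_id}#{lang}", None)
        # Assumption: also drop the id entry if it holds the same lang.
        stored = scripts.get(script_id)
        if stored is not None and stored[0] == lang:
            del scripts[script_id]


scripts: Dict[str, Script] = {}
put_script(scripts, "A", "L0", "D0")
put_script(scripts, "A", "L1", "D1")
after_puts = dict(scripts)

delete_script(scripts, "A")        # leaves only the old-namespace "A#L0"
after_first_delete = dict(scripts)

delete_script(scripts, "A", "L0")  # the deprecated namespace removes the rest
```

The intermediate states match the commit's walkthrough: three entries after the two puts, then only "A#L0", then an empty map.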
Tim Brooks 719e75bb3f Add repository-url module and move URLRepository ()
This is related to . URLRepository requires SocketPermission
connect. This commit introduces a new module called "repository-url"
where URLRepository will reside. With the new module, permissions can
be removed from core.
2017-01-25 17:09:25 -06:00
Lee Hinman fed2a1a822 Fix Translog.Delete serialization for sequence numbers ()
* Fix Translog.Delete serialization for sequence numbers

Translog.Delete used `.writeVLong` instead of `.writeLong` for the sequence
number and primary term (and their respective "read" variants). This could lead
to issues where a 5.x node sent a translog operation with a negative sequence
number (-2 for unassigned seq no) that tripped an assertion serializing a
negative number and causing ES to exit.

Adds a unit test for serialization and a mixed-cluster REST test, since that was
how this was originally caught.

* Use more realistic values for random seqNum and primary term

* Add comment with TODO for removal in 7.0

* Change comment into an assert
2017-01-11 10:08:04 -07:00
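
To show why a variable-length encoding is a poor fit for the -2 "unassigned" sequence number mentioned in commit fed2a1a822 above, here is a minimal Python sketch. It is not the Elasticsearch stream code: the byte layout (seven payload bits per byte with a continuation bit) and the function names are assumptions chosen for illustration, and the negative-value check stands in for the assertion the commit describes.

```python
import struct

UNASSIGNED_SEQ_NO = -2  # value taken from the commit message


def write_long(value: int) -> bytes:
    """Fixed-width signed 64-bit encoding; negative values are fine."""
    return struct.pack(">q", value)


def read_long(data: bytes) -> int:
    return struct.unpack(">q", data)[0]


def write_vlong(value: int) -> bytes:
    """Variable-length encoding for non-negative values only (assumed layout)."""
    if value < 0:
        raise ValueError("variable-length encoding cannot hold a negative value")
    out = bytearray()
    while value & ~0x7F:
        out.append((value & 0x7F) | 0x80)  # low 7 bits plus continuation bit
        value >>= 7
    out.append(value)
    return bytes(out)


def read_vlong(data: bytes) -> int:
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    return result


round_trip_small = read_vlong(write_vlong(300))
round_trip_unassigned = read_long(write_long(UNASSIGNED_SEQ_NO))
```

The fixed-width form always costs eight bytes but round-trips -2, which is the trade-off the fix accepts for sequence numbers and primary terms.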
Igor Motov ca90d9ea82 Remove PROTO-based custom cluster state components
Switches custom cluster state components from PROTO-based de-serialization to named objects based de-serialization
2016-12-28 13:32:35 -05:00
Nik Everett f5f2149ff2 Remove much ceremony from parsing client yaml test suites ()
* Remove a checked exception, replacing it with `ParsingException`.
* Remove all Parser classes for the yaml sections, replacing them with static methods.
* Remove `ClientYamlTestFragmentParser`. Isn't used any more.
* Remove `ClientYamlTestSuiteParseContext`, replacing it with some static utility methods.

I did not rewrite the parsers using `ObjectParser` because I don't think it is worth it right now.
2016-12-22 11:00:34 -05:00
Boaz Leskes b857b316b6 Add BWC layer to seq no infra and enable BWC tests ()
Sequence BWC logic consists of two elements:

1) Wire level BWC using stream versions.
2) A change to the global checkpoint maintenance semantics.

For the sequence number infra to work with mixed version clusters, we have to consider the situation where the primary is on an old node and replicas are on new ones (i.e., the replicas will receive operations without seq#) and also the reverse (i.e., the primary sends operations to a replica but the replica can't process the seq# and respond with local checkpoint). A new primary with an old replica is rare because we do not allow a replica to recover from a new primary. However, it can occur if the old primary failed and a new replica was promoted or during primary relocation where the source primary is treated as a replica until the master starts the target.

1) Old Primary & New Replica - this case is easy as it is taken care of by the wire level BWC. All incoming requests will have their seq# set to `UNASSIGNED_SEQ_NO`, which doesn't confuse the local checkpoint logic (keeping it at `NO_OPS_PERFORMED`)
2) New Primary & Old replica - this one is trickier as the global checkpoint service currently takes all in sync replicas into consideration for the global checkpoint calculation. In order to deal with old replicas, we change the semantics to say all *new node* in sync replicas. That means the replicas on old nodes don't count for the global checkpointing. In this state the seq# infra is not fully operational (you can't search on it, because copies may miss it) but it is maintained on shards that can support it. The old replicas will have to go through a file based recovery at some point and will get the seq# information at that point. There is still an edge case where a new primary fails and an old replica takes over. I'll discuss this one with @ywelsch as I prefer to avoid it completely.

This PR also re-enables the BWC tests which were disabled. As such it had to fix any BWC issue that had crept in. Most notably an issue with the removal of the `timestamp` field in .

The commit also includes a fix for the default value of the seq number field in replicated write requests (it was 0 but should be -2), which surfaced some other minor bugs that are fixed as well.

Last - I added some debugging tools like more sane node names and forcing replication request to implement a `toString`
2016-12-19 13:08:24 +01:00
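
The changed global checkpoint semantics in commit b857b316b6 above (only in-sync copies on *new* nodes count) can be sketched as follows. This is a simplified Python model, not the actual service: the `ShardCopy` type and `global_checkpoint` function are hypothetical, the value -1 for `NO_OPS_PERFORMED` is an assumption, and returning `UNASSIGNED_SEQ_NO` when no copy qualifies is also an assumption.

```python
from dataclasses import dataclass
from typing import Iterable

UNASSIGNED_SEQ_NO = -2   # value given in the commit message
NO_OPS_PERFORMED = -1    # assumed value; the commit only names the constant


@dataclass
class ShardCopy:
    name: str
    in_sync: bool
    on_new_node: bool
    local_checkpoint: int


def global_checkpoint(copies: Iterable[ShardCopy]) -> int:
    # Only in-sync copies on new nodes take part; copies on old nodes
    # cannot report a local checkpoint and are ignored.
    eligible = [c.local_checkpoint for c in copies if c.in_sync and c.on_new_node]
    # Assumption: with no eligible copy the checkpoint is left unassigned.
    return min(eligible) if eligible else UNASSIGNED_SEQ_NO


copies = [
    ShardCopy("primary", in_sync=True, on_new_node=True, local_checkpoint=10),
    ShardCopy("replica-new", in_sync=True, on_new_node=True, local_checkpoint=7),
    ShardCopy("replica-old", in_sync=True, on_new_node=False, local_checkpoint=NO_OPS_PERFORMED),
]
checkpoint = global_checkpoint(copies)
```

Under the earlier semantics the old replica, stuck at `NO_OPS_PERFORMED`, would hold the global checkpoint back; excluding it lets the checkpoint advance to 7 in this example.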
jaymode ced638bcda
test: use the correct number of bwc nodes for old cluster 2016-11-22 14:56:34 -05:00
Jason Tedor b2b7595fa7 Temporarily set BWC version to 6.0.0 for seq. no
There is not yet a BWC layer in sequence numbers. This commit sets the
BWC version to 6.0.0 for the BWC and rolling upgrade tests until this
BWC layer is built.
2016-11-16 09:09:38 -05:00
Simon Willnauer bdc942fa72 Enable 5.x to 6.x BWC tests
This commit enables real BWC testing against a 5.1 snapshot. All
REST tests plus rolling upgrade test now run against a mixed version
cross major version cluster.
2016-11-14 14:26:49 +01:00
Ryan Ernst 7a2c984bcc Test: Remove multi process support from rest test runner ()
At one point in the past when moving out the rest tests from core to
their own subproject, we had multiple test classes which evenly split up
the tests to run. However, we simplified this and went back to a single
test runner to have better reproducibility in tests. This change
removes the remnants of that multiplexing support.
2016-11-07 15:07:34 -08:00
Ali Beyad 8afc83047f Change the timeout of the rolling upgrades test from 40 mins to 5 mins
to still allow accounting for slow VMs
2016-09-19 15:45:41 -04:00
Ali Beyad 98230d035a Adds a preserveIndicesUponCompletion method to ESRestTestCase
that can be overridden by subclasses if the test must not
delete indices it created after exiting.
2016-09-16 19:21:26 -04:00
Ali Beyad 83adc87015 Removes stopNodeUponCompletion in favor of moving the stop
nodes task to the final part of the cluster test task execution
graph.
2016-09-16 16:32:43 -04:00
Ali Beyad 22e6bc8359 Disables unit tests for rolling upgrades, as there are only
rest integration tests.
2016-09-16 11:32:28 -04:00
Ali Beyad 56f97500c6 In the rolling upgrades tests, we do not want to stop nodes
automatically between tasks, as we want some of the nodes from
the previous task to continue running in the next task. This
commit enables a cluster configuration setting to not stop
nodes automatically after a task runs, but instead the creator
of the test task must stop the running nodes explicitly in a
cleanup phase.
2016-09-16 10:36:55 -04:00
Ali Beyad ba072ec18e When Elasticsearch nodes are started in gradle to form a
cluster, we wait for the cluster health to indicate the
necessary nodes have formed a cluster.  This check was an
exact value (equality) check.  However, if we are trying to
connect the nodes in the cluster to nodes from a previously
formed cluster (of the same name), then we will have more
nodes returned by the cluster health check than the current
task's configured number of nodes.  Hence, this check needs
to be a >= check.  This commit fixes it.
2016-09-16 09:41:26 -04:00
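
The readiness check change in commit ba072ec18e above amounts to replacing an equality test with a lower bound. A minimal Python sketch follows, with a hypothetical function name; the real check lives in the Gradle cluster setup, which is not shown in this log.

```python
def cluster_has_formed(reported_nodes: int, configured_nodes: int) -> bool:
    """True once at least the configured number of nodes has joined."""
    # An equality check would fail when nodes from a previously formed
    # cluster of the same name are still running and counted as well.
    return reported_nodes >= configured_nodes


# One newly configured node joining two nodes left over from the previous task.
joined_existing_cluster = cluster_has_formed(reported_nodes=3, configured_nodes=1)
still_waiting = cluster_has_formed(reported_nodes=1, configured_nodes=2)
```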
Ali Beyad ec7e383783 Better checks for the cluster being up in the rolling upgrades tests. 2016-09-15 15:08:11 -04:00
Ali Beyad 3f79874042 Prevent the rolling upgrades rest tests from cleaning up indices
after finishing if the tests.rest.preserve_indices system property
is set
2016-09-14 23:34:19 -04:00
Ali Beyad 513ed58d17 Added an indices flush to the rolling upgrades rest tests to ensure
they are on disk for the upgraded nodes to pick up.
2016-09-13 12:36:20 -04:00
Ali Beyad 0c7dd1865c Reworking yaml tests for rolling upgrades 2016-09-12 18:46:06 -04:00
Ali Beyad 0b3eb11712 Changed rest-api-spec tests folder structure 2016-09-12 16:05:36 -04:00
Ryan Ernst bd2de367cc Use 5.0 alpha5 for bwc version, so we have transport.ports
Also fixed bug to ensure unicast host is written in yaml quotes
2016-09-06 16:08:27 -07:00
Ryan Ernst a844b085f1 Made node config always have a unicast transport uri closure 2016-09-06 15:51:14 -07:00
Ali Beyad 5d8aa6b4fe Adds tests for rolling upgrades to execute 2016-09-03 01:34:39 -04:00
Ryan Ernst ecaf6ef001 Add rolling upgrade test project 2016-08-31 16:45:03 -07:00