Commit Graph

220 Commits

Author SHA1 Message Date
Christoph Büscher 5406a9f30d Add rank-eval module to transport client and HL client dependencies 2017-12-13 18:05:43 +01:00
David Turner 89ba8996c6 Consolidate version numbering semantics (#27397)
Fixes to the build system, particularly around BWC testing, and to make future
version bumps less painful.
2017-11-23 20:21:53 +00:00
Jim Ferenczi 6319424e4a
Move composite aggregation to core (#27474)
This change removes the module named aggs-composite and adds the `composite` aggs
as a core aggregation. This allows other plugins to use this new aggregation
and simplifies the integration in the HL rest client.
2017-11-21 13:31:01 +01:00
Michael Basnight cb3e8f4763
Move the CLI into its own subproject (#27114)
Projects the depend on the CLI currently depend on core. This should not
always be the case. The EnvironmentAwareCommand will remain in :core,
but the rest of the CLI components have been moved into their own
subproject of :core, :core:cli.
2017-11-18 21:42:57 -06:00
David Turner 9766b858d0
Prepare for bump to 6.0.1 on the master branch (#27391)
An assortment of fixes, particularly to version number calculations, in preparation for the bump to 6.0.1.
2017-11-16 18:38:54 +00:00
Jim Ferenczi 623367d793
Add composite aggregator (#26800)
* This change adds a module called `aggs-composite` that defines a new aggregation named `composite`.
The `composite` aggregation is a multi-buckets aggregation that creates composite buckets made of multiple sources.
The sources for each bucket can be defined as:
  * A `terms` source, values are extracted from a field or a script.
  * A `date_histogram` source, values are extracted from a date field and rounded to the provided interval.
This aggregation can be used to retrieve all buckets of a deeply nested aggregation by flattening the nested aggregation in composite buckets.
A composite buckets is composed of one value per source and is built for each document as the combinations of values in the provided sources.
For instance the following aggregation:

````
"test_agg": {
  "terms": {
    "field": "field1"
  },
  "aggs": {
    "nested_test_agg":
      "terms": {
        "field": "field2"
      }
  }
}
````
... which retrieves the top N terms for `field1` and for each top term in `field1` the top N terms for `field2`, can be replaced by a `composite` aggregation in order to retrieve **all** the combinations of `field1`, `field2` in the matching documents:

````
"composite_agg": {
  "composite": {
    "sources": [
      {
	"field1": {
          "terms": {
              "field": "field1"
            }
        }
      },
      {
	"field2": {
          "terms": {
            "field": "field2"
          }
        }
      },
    }
  }
````

The response of the aggregation looks like this:

````
"aggregations": {
  "composite_agg": {
    "buckets": [
      {
        "key": {
          "field1": "alabama",
          "field2": "almanach"
        },
        "doc_count": 100
      },
      {
        "key": {
          "field1": "alabama",
          "field2": "calendar"
        },
        "doc_count": 1
      },
      {
        "key": {
          "field1": "arizona",
          "field2": "calendar"
        },
        "doc_count": 1
      }
    ]
  }
}
````

By default this aggregation returns 10 buckets sorted in ascending order of the composite key.
Pagination can be achieved by providing `after` values, the values of the composite key to aggregate after.
For instance the following aggregation will aggregate all composite keys that sorts after `arizona, calendar`:

````
"composite_agg": {
  "composite": {
    "after": {"field1": "alabama", "field2": "calendar"},
    "size": 100,
    "sources": [
      {
	"field1": {
          "terms": {
            "field": "field1"
          }
        }
      },
      {
	"field2": {
          "terms": {
            "field": "field2"
          }
	}
      }
    }
  }
````

This aggregation is optimized for indices that set an index sorting that match the composite source definition.
For instance the aggregation above could run faster on indices that defines an index sorting like this:

````
"settings": {
  "index.sort.field": ["field1", "field2"]
}
````

In this case the `composite` aggregation can early terminate on each segment.
This aggregation also accepts multi-valued field but disables early termination for these fields even if index sorting matches the sources definition.
This is mandatory because index sorting picks only one value per document to perform the sort.
2017-11-16 15:13:36 +01:00
Boaz Leskes ace446f335 Update shrink's bwc version to 6.1.0 and enabled bwc tests 2017-11-07 15:35:46 +01:00
Boaz Leskes 2fc6c64c82 Disable bwc tests in preparation of backporting #26931 2017-11-07 10:58:45 +01:00
Yannick Welsch 31f43f5c34 Reenable BWC tests
Relates to #26682
2017-09-22 16:54:00 +02:00
Yannick Welsch df5c450e89 Add v6.1 BWC layer for adding wait_for_active_shards to index open command
This commit disables BWC tests while adding a v6.1 BWC layer for the PR #26682
2017-09-22 16:30:07 +02:00
Jason Tedor 954fb1c80d Reenable BWC tests after global checkpoint sync
This commit reenables the BWC tests after the introduction of the
post-operation and background global checkpoint sync.

Relates #26591
2017-09-21 15:44:36 -04:00
Jason Tedor f35d1de502 Introduce global checkpoint background sync
It is the exciting return of the global checkpoint background
sync. Long, long ago, in snapshot version far, far away we had and only
had a global checkpoint background sync. This sync would fire
periodically and send the global checkpoint from the primary shard to
the replicas so that they could update their local knowledge of the
global checkpoint. Later in time, as we sped ahead towards finalizing
the initial version of sequence IDs, we realized that we need the global
checkpoint updates to be inline. This means that on a replication
operation, the primary shard would piggy back the global checkpoint with
the replication operation to the replicas. The replicas would update
their local knowledge of the global checkpoint and reply with their
local checkpoint. However, this could allow the global checkpoint on the
primary to advance again and the replicas would fall behind in their
local knowledge of the global checkpoint. If another replication
operation never fired, then the replicas would be permanently behind. To
account for this, we added one more sync that would fire when the
primary shard fell idle. However, this has problems:
 - the shard idle timer defaults to five minutes, a long time to wait
   for the replicas to learn of the new global checkpoint
 - if a replica missed the sync, there was no follow-up sync to catch
   them up
 - there is an inherent race condition where the primary shard could
   fall idle mid-operation (after having sent the replication request to
   the replicas); in this case, there would never be a background sync
   after the operation completes
 - tying the global checkpoint sync to the idle timer was never natural

To fix this, we add two additional changes for the global checkpoint to
be synced to the replicas. The first is that we add a post-operation
sync that only fires if there are no operations in flight and there is a
lagging replica. This gives us a chance to sync the global checkpoint to
the replicas immediately after an operation so that they are always kept
up to date. The second is that we add back a global checkpoint
background sync that fires on a timer. This timer fires every thirty
seconds, and is not configurable (for simplicity). This background sync
is smarter than what we had previously in the sense that it only sends a
sync if the global checkpoint on at least one replica is lagging that of
the primary. When the timer fires, we can compare the global checkpoint
on the primary to its knowledge of the global checkpoint on the replicas
and only send a sync if there is a shard behind.

Relates #26591
2017-09-21 15:34:13 -04:00
Jason Tedor 52e80a9292 Reenable BWC tests after disabling for backport
This commit reenables the BWC tests after they were disabled for
backporting the change to track global checkpoints of shard copies on
the primary.

Relates #26666
2017-09-18 06:28:50 -04:00
Jason Tedor c238b79cf4 Add global checkpoint tracking on the primary
This commit adds local tracking of the global checkpoints on all shard
copies when a global checkpoint tracker is operating in primary
mode. With this, we relay the global checkpoint on a shard copy back to
the primary shard during replication operations. This serves as another
step towards adding a background sync of the global checkpoint to the
shard copies.

Relates #26666
2017-09-18 06:04:44 -04:00
Boaz Leskes a99803f89a enable bwc testing 2017-09-15 00:20:19 +03:00
Boaz Leskes 1ca0b5e9e4 Introduce a History UUID as a requirement for ops based recovery (#26577)
The new ops based recovery, introduce as part of  #10708, is based on the assumption that all operations below the global checkpoint known to the replica do not need to be synced with the primary. This is based on the guarantee that all ops below it are available on primary and they are equal. Under normal operations this guarantee holds. Sadly, it can be violated when a primary is restored from an old snapshot. At the point the restore primary can miss operations below the replica's global checkpoint, or even worse may have total different operations at the same spot. This PR introduces the notion of a history uuid to be able to capture the difference with the restored primary (in a follow up PR).

The History UUID is generated by a primary when it is first created and is synced to the replicas which are recovered via a file based recovery. The PR adds a requirement to ops based recovery to make sure that the history uuid of the source and the target are equal. Under normal operations, all shard copies will stay with that history uuid for the rest of the index lifetime and thus this is a noop. However, it gives us a place to guarantee we fall back to file base syncing in special events like a restore from snapshot (to be done as a follow up) and when someone calls the truncate translog command which can go wrong when combined with primary recovery (this is done in this PR).

We considered in the past to use the translog uuid for this function (i.e., sync it across copies) and thus avoid adding an extra identifier. This idea was rejected as it removes the ability to verify that a specific translog really belongs to a specific lucene index. We also feel that having a history uuid will serve us well in the future.
2017-09-14 21:25:02 +03:00
Ryan Ernst 8ba4ff3be0 Build: Move javadoc linking to root build.gradle (#26529)
Javadoc linking between projects currently relies on
projectSubstitutions. However, that is an extension variable that is not
part of BuildPlugin. This commit moves the javadoc linking into the root
build.gradle, alongside where projectSubstitutions are defined.
2017-09-11 15:43:34 -07:00
Jason Tedor 53848877c8 Reenable BWC tests after backport
This commit reenables the BWC tests after they were disabled for
backporting a change that had broad BWC implications.
2017-08-31 22:13:50 -04:00
Tim Vernum eb87df9ff9 Allow abort of bulk items before processing (#26434)
Adds support for bulk items to be aborted before they are processed by the TransportShardBulkAction.
This can be used by an ActionFilter to reject a subset of the items in a bulk action without rejecting the whole action (or all the items for a shard).
2017-08-31 21:23:14 +10:00
Ryan Ernst 35a2ee38e1 Build: Add git hashes used as build metadata (#26397)
This commit adds files to the build output called build_metadata which
contain key/value pairs of metadata associated with the build. The first
use of this metadata are the git hashes associated with bwc checkouts.
These metadata files will be picked up by CI intake jobs and stored
along with last-good-commit, and then passed back in throug the
BUILD_METADATA env var on periodic jobs.
2017-08-28 14:10:06 -07:00
Michael Basnight cfd14cd2b8 Revert shading for the low level rest client (#26367)
At current, we do not feel there is enough of a reason to shade the low
level rest client. It caused problems with commons logging and IDE's
during the brief time it was used. We did not know exactly how many
users will need this, and decided that leaving shading out until we
gather more information is best. Users can still shade the jar
themselves. For information and feeback, see issue #26366.

Closes #26328

This reverts commit 3a20922046.
This reverts commit 2c271f0f22.
This reverts commit 9d10dbea39.
This reverts commit e816ef89a2.
2017-08-25 14:13:12 -05:00
Ryan Ernst 661648b3aa Build: Allow build to configure which license/notice to embed in jars (#26373)
We currently add the apache license/notice for elasticsearch to any
plugin that uses our ES plugin gradle plugin. However, each plugin
should be able to use their own license. This commit adds a licenseFile
and noticeFile property to the root of project's using BuildPlugin,
which is added to jar files for that project.
2017-08-24 22:46:30 -07:00
Nik Everett 99ac7beb8e Teach the build about betas and rcs (#26066)
The build was ignoring suffixes like "beta1" and "rc1" on the version numbers which was causing the backwards compatibility packaging tests to fail because they expected to be upgrading from 6.0.0 even though they were actually upgrading from 6.0.0-beta1. This adds the suffixes to the information that the build scrapes from Version.java. It then uses those suffixes when it resolves artifacts build from the bwc branch and for testing.

Closes #26017
2017-08-10 14:30:00 -04:00
Ryan Ernst 072281d5aa Update version to 7.0.0-alpha1 (#25876)
This commit updates the version for master to 7.0.0-alpha1. It also adds
the 6.1 version constant, and fixes many tests, as well as marking some
as awaits fix.

Closes #25893
Closes #25870
2017-08-01 15:47:48 -04:00
Michael Basnight e816ef89a2 Shade external dependencies in the rest client jar
This commit removes all external dependencies from the rest client jar
and shades them in an 'org.elasticsearch.client' package within the jar
using shadowJar gradle plugin. All projects that depended on the
existing jar have been converted to using the 'org.elasticsearch.client'
package prefixes to interact with the rest client.

Closes #25208
2017-07-24 12:55:43 -05:00
Yannick Welsch 49279d26da Reenable BWC tests
after merging #25822 & #25824
2017-07-21 16:36:50 +02:00
Yannick Welsch a2624dfcef Move primary term from ReplicationRequest to ConcreteShardRequest (#25822)
Removes the primary term from the replication request and pushes it into the transport envelope. This makes it possible to remove the term from the ReplicationOperation universe. The primary term that is to be used for a replication operation is now determined in the reroute phase when the node decides to execute a primary action (and validated once the primary action gets to execute). This makes it possible to validate that the primary action was sent to the correct primary shard instance that it was meant to be sent to (currently we only validate primary actions using the allocation id, which can be reused for failed and reallocated primaries).
2017-07-21 15:57:42 +02:00
Luca Cavanna ec66d655b5 Rename client artifacts (#25693)
It was brought up that our current client artifacts have generic names like 'rest' that may cause conflicts with other artifacts.

This commit renames:

- rest -> elasticsearch-rest-client
- sniffer -> elasticsearch-rest-client-sniffer
- rest-high-level -> elasticsearch-rest-high-level-client

A couple of small changes are also preparing the high level client for its first release.

Closes #20248
2017-07-13 09:44:25 +02:00
Boaz Leskes 215bffb08b Enable bwc testing
#25512 & #25511 have been merged
2017-07-08 11:57:22 +02:00
Boaz Leskes b40b002c1d Disable bwc tests in preparation of merging #25512 & #25511 2017-07-08 09:25:43 +02:00
Nik Everett 21b1db2965 Remove assemble from build task when assemble removed
Removes the `assemble` task from the `build` task when we have
removed `assemble` from the project. We removed `assemble` from
projects that aren't published so our releases will be faster. But
That broke CI because CI builds with `gradle precommit build` and,
it turns out, that `build` includes `check` and `assemble`. With
this change CI will only run `check` for projects without an
`assemble`.
2017-06-16 17:19:14 -04:00
Nik Everett 7b358190d6 Remove assemble task when not used for publishing (#25228)
Removes the `assemble` task from projects that are not published.
This should speed up `gradle assemble` by skipping projects that
don't need to be built. Which is useful because `gradle assemble`
is how we cut releases.
2017-06-16 11:46:34 -04:00
Ryan Ernst 106e373412 Build: Add master flag for disabling bwc tests (#25230)
This commit adds a gradle project, set inside the root build.gradle,
which controls all our bwc tests. This allows for seamless (ie no errant
CI failures) backporting of behavior.
2017-06-14 22:01:49 -07:00
Ryan Ernst 160a049930 Build: Move verifyVersions to new branchConsistency task (#25009)
This commit adds a new `branchConsistency` task which will run in CI
once a day, instead of on every commit. This allows `verifyVersions` to
not break immediately once a new version is released in maven.
2017-06-01 10:29:51 -07:00
Nik Everett 6a167a7ba8 Build: improve verifyVersions error message (#25006)
The error message was confusing because it doesn't
include unreleased versions like CURRENT.
2017-06-01 11:55:23 -04:00
Nik Everett bc4df47a76 Rework bwc snapshot projects to build up to two bwc versions (#24870)
Removes the `distribution:bwc` project in favor of
`distribution:bwc-release-snapshot` and
`distribution:bwc-stable-snapshot`.
`distribution:bwc-release-snapshot` builds a snapshot of the
latest release branch (5.4 now) if needed for backwards
compatibility. `distribution:bwc-stable-snapshot` builds a
snapshot of the latest stable branch (5.x now) if needed for
backwards compatibility.
2017-05-29 10:22:32 -04:00
Nik Everett 5da8ce8318 Remove the need for _UNRELEASED suffix in versions (#24798)
Removes the need for the `_UNRELEASED` suffix on versions by detecting if a version should be unreleased or not based on the versions around it. This should make it simpler to automate the task of adding a new version label.
2017-05-26 18:36:32 -04:00
Jason Tedor 95ebdbaddb Add BWC packaging distributions
Some packaging tests depend on snapshot versions of packaging
distributions yet the build does not use a repository that includes such
distributions. While we could add such a repository, a better strategy
is to follow our approach for other BWC tests where we depend on a
locally-compiled archive distribution. This commit adds a local
compilation of packaging artifacts and substitutes these anywhere that
we would otherwise depend on a snapshot of these artifacts.

Relates #24861
2017-05-24 11:55:32 -04:00
Luca Cavanna 747fa721e4 Build: add client jar for aggs-matrix-stats (#24827)
This will be useful for the high level client to add support for the matrix stats aggregation, as we will ship with this jar by default like we do for parent-join-client which is aligned with distributing core with the modules already included.

Relates to #24796
2017-05-23 13:33:54 +02:00
Nik Everett 82d2c7a142 Remove vagrant testing versions (#24754)
Now that we generate the versions list from Versions.java we can
drop the list of versions maintained for vagrant testing. One nice
thing that the vagrant testing did was to check if the list of
versions was out of date. This moves that test to the core
project.
2017-05-18 09:33:13 -04:00
Ryan Ernst 0353bd1fb6 Test: Convert rolling upgrade test to have task per wire compat version (#24758)
This commit changes the rolling upgrade test to create a set of rest
test tasks per wire compat version. The most recent wire compat version
is always tested with the `integTest` task, and all versions can be
tested with `bwcTest`.
2017-05-18 01:14:24 -07:00
Ryan Ernst ff34434bba Build: Extract all ES versions into gradle properties (#24748)
This commit expands the logic for version extraction from Version.java
to include a list of all versions for backcompat purposes. The tests
using bwcVersion are converted to use this list, but those tests
(rolling upgrade and backwards-5.0) are still not randomized; that will
happen in another followup.
2017-05-17 12:58:37 -07:00
Jim Ferenczi 279a18a527 Add parent-join module (#24638)
* Add parent-join module

This change adds a new module named `parent-join`.
The goal of this module is to provide a replacement for the `_parent` field but as a first step this change only moves the `has_child`, `has_parent` queries and the `children` aggregation to this module.
These queries and aggregations are no longer in core but they are deployed by default as a module.

Relates #20257
2017-05-12 15:58:06 +02:00
Ryan Ernst b3466818ac Remove outdated comment 2017-03-29 14:37:32 -07:00
Ryan Ernst cc1addeac2 Build: Find bwc version during build (#23801)
We currently have the last minor version of the previous major hardcoded
in tests like rolling upgrade. This change programatically finds this
during gradle initialization by parsing versions from Version.java.
2017-03-29 12:11:38 -07:00
Ryan Ernst 8c53555b28 Tests: Use local clone build of 5.x with bwc tests (#22946)
The current rest backcompat tests, which run against a mixed cluster of
5.x and 6.0 nodes, depend on snapshot builds of 5.x. However, this has
the potential for inconsistency that results in CI failures, and happens
quite often, whenever some backcompat logic is added to 5.x, but the bwc
test on master fails because the 5.x code has not yet been published as
a snapshot.

This change creates a git clone of the 5.x branch,
builds the zip distribution, and ties that into gradle substitutions for
the 5.x version.
2017-03-23 22:32:13 -07:00
Ryan Ernst 621643a5c3 Build: Only add ASL license to pom for elasticsearch project (#22664)
The extra plugins that may be attached to the elasticsearch build
contain their own license. In the past, the ASL license elasticsearch
uses was avoided by specially checking for the gradle project prefix of
`:x-plugins`. However, since refactoring to the elasticsearch-extra dir
structure, this mechanism was broken. This change fixes the pom license
adding to only be applied to projects that fall under the root project
(ie elasticsearch).
2017-01-17 13:38:32 -08:00
Tim Brooks 864e3128e7 Add task to clean idea build directory. Make cleanIdea task invoke it. 2016-12-22 11:24:13 -06:00
Ryan Ernst a30f649013 Build: Apply license section in poms only to elasticsearch artifacts (#21757) 2016-11-24 00:03:43 -08:00
Tanguy Leroux b9bee8bca3 Remove transport-netty3-client mention (#21650)
Now netty3 is gone, this mention must also be removed.
2016-11-18 11:54:22 +01:00