OpenSearch

Commit Graph

Author	SHA1	Message	Date
Benjamin Trent	d912a49c6f	[7.x] Support geotile_grid aggregation in composite agg sources (#45810 ) (#46399 ) * Support geotile_grid aggregation in composite agg sources (#45810) Adds support for `geotile_grid` as a source in composite aggs. Part of this change includes adding a new docFormat of `GEOTILE` that formats a hashed `long` value into a geotile formatting string `zoom/x/y`.	2019-09-05 13:22:57 -05:00
Nhat Nguyen	3393f9599e	Ignore translog retention policy if soft-deletes enabled (#45473 ) Since #45136, we use soft-deletes instead of translog in peer recovery. There's no need to retain extra translog to increase a chance of operation-based recoveries. This commit ignores the translog retention policy if soft-deletes is enabled so we can discard translog more quickly. Backport of #45473 Relates #45136	2019-08-22 16:40:06 -04:00
Tal Levy	9b14b7298b	[7.x] Add is_write_index column to cat.aliases (#45798 ) * Add is_write_index column to cat.aliases (#44772) Aliases have had the option to set `is_write_index` since 6.4, but the cat.aliases action was never updated. * correct version bounds to 7.4	2019-08-21 14:15:49 -07:00
Armin Braun	6aaee8aa0a	Repository Cleanup Endpoint (#43900 ) (#45780 ) * Repository Cleanup Endpoint (#43900) * Snapshot cleanup functionality via transport/REST endpoint. * Added all the infrastructure for this with the HLRC and node client * Made use of it in tests and resolved relevant TODO * Added new `Custom` CS element that tracks the cleanup logic. Kept it similar to the delete and in progress classes and gave it some (for now) redundant way of handling multiple cleanups but only allow one * Use the exact same mechanism used by deletes to have the combination of CS entry and increment in repository state ID provide some concurrency safety (the initial approach of just an entry in the CS was not enough, we must increment the repository state ID to be safe against concurrent modifications, otherwise we run the risk of "cleaning up" blobs that just got created without noticing) * Isolated the logic to the transport action class as much as I could. It's not ideal, but we don't need to keep any state and do the same for other repository operations (like getting the detailed snapshot shard status)	2019-08-21 17:59:49 +02:00
Luca Cavanna	c31cddf27e	Update the schema for the REST API specification (#42346 ) * Update the REST API specification This patch updates the REST API spefication in JSON files to better encode deprecated entities, to improve specification of URL paths, and to open up the schema for future extensions. Notably, it changes the `paths` from a list of strings to a list of objects, where each particular object encodes all the information for this particular path: the `parts` and the `methods`. Among the benefits of this approach is eg. encoding the difference between using the `PUT` and `POST` methods in the Index API, to either use a specific document ID, or let Elasticsearch generate one. Also `documentation` becomes an object that supports an `url` and also a `description` which is a new field. * Adapt YAML runner to new REST API specification format The logic for choosing the path to use when running tests has been simplified, as a consequence of the path parts being listed under each path in the spec. The special case for create and index has been removed. Also the parsing code has been hardened so that errors are thrown earlier when the structure of the spec differs from what expected, and their error messages should be more helpful.	2019-08-16 14:40:00 +02:00
Jason Tedor	9a142ff25c	Introduce formal node ML role (#45174 ) This commit builds on the ability for plugins to introduce new roles to add a formal node ML role.	2019-08-06 13:00:05 -04:00
Tanguy Leroux	772ce1f599	Add deprecation warning for Force Merge API (#44903 ) This commit adds a deprecation warning in 7.x for the Force Merge API when both only_expunge_deletes and max_num_segments are set in a request. Relates #44761	2019-08-06 16:04:24 +02:00
Nhat Nguyen	d128188c28	Return seq_no and primary_term in noop update (#44603 ) With this change, we will return primary_term and seq_no of the current document if an update is detected as a noop. We already return the version; hence we should also return seq_no and primary_term. Relates #42497	2019-07-25 19:16:56 -04:00
Yannick Welsch	0ce841915c	Add Clone Index API (#44267 ) Adds an API to clone an index. This is similar to the index split and shrink APIs, just with the difference that the number of primary shards is kept the same. In case where the filesystem provides hard-linking capabilities, this is a very cheap operation. Indexing cloning can be done by running `POST my_source_index/_clone/my_target_index` and it supports the same options as the split and shrink APIs. Closes #44128	2019-07-25 22:02:28 +02:00
Enrico Zimuel	a12be619f6	Fix URL documentation in API specs (#44487 )	2019-07-24 10:33:15 -07:00
James Rodewig	d46545f729	[DOCS] Update anchors and links for Elasticsearch API relocation (#44500 )	2019-07-19 09:18:23 -04:00
Przemyslaw Gomulka	e23ecc5838	JSON logging refactoring and X-Opaque-ID support backport(#41354 ) (#44178 ) This is a refactor to current JSON logging to make it more open for extensions and support for custom ES log messages used inDeprecationLogger IndexingSlowLog , SearchSLowLog We want to include x-opaque-id in deprecation logs. The easiest way to have this as an additional JSON field instead of part of the message is to create a custom DeprecatedMessage (extends ESLogMEssage) These messages are regular log4j messages with a text, but also carry a map of fields which can then populate the log pattern. The logic for this lives in ESJsonLayout and ESMessageFieldConverter. Similar approach can be used to refactor IndexingSlowLog and SearchSlowLog JSON logs to contain fields previously only present as escaped JSON string in a message field. closes #41350 backport #41354	2019-07-12 16:53:27 +02:00
Tanguy Leroux	b977f019b8	Expose translog stats in ReadOnlyEngine (#43752 ) (#43823 ) Backport of #43752 for 7.x.	2019-07-02 13:39:00 +02:00
Zachary Tong	1e47ea5f18	Update rare_term version skips, fix SetBackedScalingCuckooFilter javadoc	2019-07-01 10:52:06 -04:00
Zachary Tong	ea1794832f	Add RareTerms aggregation (#35718 ) This adds a `rare_terms` aggregation. It is an aggregation designed to identify the long-tail of keywords, e.g. terms that are "rare" or have low doc counts. This aggregation is designed to be more memory efficient than the alternative, which is setting a terms aggregation to size: LONG_MAX (or worse, ordering a terms agg by count ascending, which has unbounded error). This aggregation works by maintaining a map of terms that have been seen. A counter associated with each value is incremented when we see the term again. If the counter surpasses a predefined threshold, the term is removed from the map and inserted into a cuckoo filter. If a future term is found in the cuckoo filter we assume it was previously removed from the map and is "common". The map keys are the "rare" terms after collection is done.	2019-07-01 10:30:02 -04:00
Alan Woodward	81dbcfb268	Wildcard intervals (#43691 ) This commit adds a wildcard intervals source, similar to the prefix. It also changes the term parameter in prefix to read prefix, to bring it in to line with the pattern parameter in wildcard. Closes #43198	2019-06-28 14:04:03 +01:00
Alan Woodward	76d0edd1a4	Add prefix intervals source (#43635 ) This commit adds a prefix intervals source, allowing you to search for intervals that contain terms starting with a given prefix. The source can make use of the index_prefixes mapping option. Relates to #43198	2019-06-26 16:22:12 +01:00
Yu	c88f2f23a5	Make Recovery API support `detailed` params (#29076 ) Properly forwards the `detailed` parameter to show the recovery stats details. Closes #28910	2019-06-21 09:05:33 +02:00
Martijn Laarman	8b1b9f8ab9	Introduce stability description to the REST API specification (#38413 ) (#43278 ) * introduce state to the REST API specification * change state over to stability * CCR is no GA updated to stable * SQL is now GA so marked as stable * Introduce `internal` as state for API's, marks stable in terms of lifetime but unstable in terms of guarantees on its output format since it exposes internal representations * make setting a wrong stability value, or not setting it at all an error that causes the YAML test suite to fail * update spec files to be explicit about their stability state * Document the fact that stability needs to be defined Otherwise the YAML test runner will fail (with a nice exception message) * address check style violations * update rest spec unit tests to include stability * found one more test spec file not declaring stability, made sure stability appears after documentation everywhere * cluster.state is stable, mark response in some way to denote its a key value format that can be changed during minors * mark data frame API's as beta * remove internal and private as states for an API * removed the wrong enum values in the Stability Enum in the previous commit (cherry picked from commit 61c34bbd92f8f7e5f22fa411c6b682b0ebd8a99d)	2019-06-17 16:57:13 +02:00
Ryan Ernst	5be0fb32f8	Move painless context api spec to test local (#43122 ) The painless context api is internal and currently meant only for use in generating docs. This commit moves the spec file for the api so that it is only used by the test for this api, and not externally by any clients building from the public rest spec.	2019-06-12 08:19:45 -07:00
Luca Cavanna	e538592652	Update max_concurrent_shard_request parameter docs (#42227 ) Some of the docs were outdated as they did not mention that the limit is not per node. Also, The default value changed. Relates to #31206	2019-06-12 11:25:03 +02:00
Martijn Laarman	2c9a6cbf69	Documents the new deprecations options on the rest-api-spec (#41444 ) (#43090 ) * Documents the new deprecations options on the rest-api-spec Relates #41439 #38613 #35262 * remove reference to path now that #41452 is merged, also fixed missing a comma rendering the example json invalid * removed one more instance of path * make sure json examples are self contained and not excerpts (cherry picked from commit 4430f99750a3bf98373d69d2be59d71475c7aaad)	2019-06-11 15:15:11 +02:00
Martijn Laarman	cb7ce865b7	remove path from rest-api-spec (#41452 ) (#43084 ) (cherry picked from commit f5fde1d0843d2f0f53d3b9a15b9cfc8b94471ab7)	2019-06-11 12:52:36 +02:00
Alan Woodward	8e23e4518a	Move construction of custom analyzers into AnalysisRegistry (#42940 ) Both TransportAnalyzeAction and CategorizationAnalyzer have logic to build custom analyzers for index-independent analysis. A lot of this code is duplicated, and it requires the AnalysisRegistry to expose a number of internal provider classes, as well as making some assumptions about when analysis components are constructed. This commit moves the build logic directly into AnalysisRegistry, reducing the registry's API surface considerably.	2019-06-10 14:33:25 +01:00
Henning Andersen	dea935ac31	Reindex max_docs parameter name (#42942 ) Previously, a reindex request had two different size specifications in the body: * Outer level, determining the maximum documents to process * Inside the source element, determining the scroll/batch size. The outer level size has now been renamed to max_docs to avoid confusion and clarify its semantics, with backwards compatibility and deprecation warnings for using size. Similarly, the size parameter has been renamed to max_docs for update/delete-by-query to keep the 3 interfaces consistent. Finally, all 3 endpoints now support max_docs in both body and URL. Relates #24344	2019-06-07 12:16:36 +02:00
Gordon Brown	6eb4600e93	Add custom metadata to snapshots (#41281 ) Adds a metadata field to snapshots which can be used to store arbitrary key-value information. This may be useful for attaching a description of why a snapshot was taken, tagging snapshots to make categorization easier, or identifying the source of automatically-created snapshots.	2019-06-05 17:30:31 -06:00
Przemyslaw Gomulka	0ce7e7a4d8	Enable tests failing due to java-joda warnings (#42693 ) Tests were failing in mixed cluster after more broad warnings were introduced in 6.x These tests were using `yyyy-MM-dd` pattern which is now warning about the change of `y` to `u`. However, using predefined pattern `strict_date` which uses the same format prevents the warning from being generate and allow smooth upgrade/work in mixed cluster. relates #42679	2019-05-31 09:42:32 +02:00
James Baiera	41208b7041	Muting prefilter shard tests as they are breaking in BWC testing See #42679	2019-05-29 14:12:38 -04:00
Tanguy Leroux	6bec876682	Improve Close Index Response (#39687 ) This changes the `CloseIndexResponse` so that it reports closing result for each index. Shard failures or exception are also reported per index, and the global acknowledgment flag is computed from the index results only. The response looks like: ``` { "acknowledged" : true, "shards_acknowledged" : true, "indices" : { "docs" : { "closed" : true } } } ``` The response reports shard failures like: ``` { "acknowledged" : false, "shards_acknowledged" : false, "indices" : { "docs-1" : { "closed" : true }, "docs-2" : { "closed" : false, "shards" : { "1" : { "failures" : [ { "shard" : 1, "index" : "docs-2", "status" : "BAD_REQUEST", "reason" : { "type" : "index_closed_exception", "reason" : "closed", "index_uuid" : "JFmQwr_aSPiZbkAH_KEF7A", "index" : "docs-2" } } ] } } }, "docs-3" : { "closed" : true } } } ``` Co-authored-by: Tanguy Leroux <tlrx.dev@gmail.com>	2019-05-24 21:57:55 -04:00
Zachary Tong	6ae6f57d39	[7.x Backport] Force selection of calendar or fixed intervals (#41906 ) The date_histogram accepts an interval which can be either a calendar interval (DST-aware, leap seconds, arbitrary length of months, etc) or fixed interval (strict multiples of SI units). Unfortunately this is inferred by first trying to parse as a calendar interval, then falling back to fixed if that fails. This leads to confusing arrangement where `1d` == calendar, but `2d` == fixed. And if you want a day of fixed time, you have to specify `24h` (e.g. the next smallest unit). This arrangement is very error-prone for users. This PR adds `calendar_interval` and `fixed_interval` parameters to any code that uses intervals (date_histogram, rollup, composite, datafeed, etc). Calendar only accepts calendar intervals, fixed accepts any combination of units (meaning `1d` can be used to specify `24h` in fixed time), and both are mutually exclusive. The old interval behavior is deprecated and will throw a deprecation warning. It is also mutually exclusive with the two new parameters. In the future the old dual-purpose interval will be removed. The change applies to both REST and java clients.	2019-05-20 12:07:29 -04:00
Zachary Tong	072a9bdf55	Fix FiltersAggregation NPE when `filters` is empty (#41459 ) If `keyedFilters` is null it assumes there are unkeyed filters...which will NPE if the unkeyed filters was actually empty. This refactors to simplify the filter assignment a bit, adds an empty check and tidies up some formatting.	2019-05-20 10:04:21 -04:00
Tomas Della Vedova	4e9bf3f18a	Remove deprecated _source_exclude and _source_include from get API spec (#42188 ) Support for these parameters was removed in #35097. The spec were left outdated.	2019-05-20 12:40:45 +02:00
Russ Cam	8f838198fa	Remove parent query string parameter (#41098 ) This commit removes the deprecated parent query string parameter. The routing parameter should be used instead.	2019-05-20 12:08:02 +02:00
jaymode	7c6d7997db	Fix skip version in indices open test	2019-05-01 15:18:26 -06:00
Jason Tedor	7f3ab4524f	Bump 7.x branch to version 7.2.0 This commit adds the 7.2.0 version constant to the 7.x branch, and bumps BWC logic accordingly.	2019-05-01 13:38:57 -04:00
Jim Ferenczi	6184efaff6	Handle unmapped fields in _field_caps API (#34071 ) (#41426 ) Today the `_field_caps` API returns the list of indices where a field is present only if this field has different types within the requested indices. However if the request is an index pattern (or an alias, or both...) there is no way to infer the indices if the response contains only fields that have the same type in all indices. This commit changes the response to always return the list of indices in the response. It also adds a way to retrieve unmapped field in a specific section per field called `unmapped`. This section is created for each field that is present in some indices but not all if the parameter `include_unmapped` is set to true in the request (defaults to false).	2019-04-25 18:13:48 +02:00
Zachary Tong	ec5dd0594f	Disallow null/empty or duplicate composite sources (#41359 ) Adds some validation to prevent duplicate source names from being used in the composite agg. Also refactored to use a ConstructingObjectParser and removed the private ctor and setter for sources, making it mandatory.	2019-04-24 13:23:31 -04:00
Martijn Laarman	85b9dc18a7	fix #35262 define deprecations of API's as a whole and urls (#39063 ) * fix #35262 define deprecations of API's as a whole and urls * document hot threads deprecated paths * deprecate scroll_id as part of the URL, documented only as part of the body which is a safer behaviour as well * use version numbers up to patch version * rest spec parser picks up deprecated paths as paths too (cherry picked from commit 7e06023e7603b7584bfd9ee4e8a1ccd82c208ce7)	2019-04-23 14:28:36 +02:00
Jim Ferenczi	8f73e1e883	Fix unmapped field handling in the composite aggregation (#41280 ) The `composite` aggregation maps unknown fields as numerics, this means that any `after` value that is set on a query with an unmapped field on some indices will fail if the provided value is not numeric. This commit changes the default value source to use keyword instead in order to be able to parse any type of after values.	2019-04-18 23:08:13 +02:00
Zachary Tong	f19b052e03	Better error messages when pipelines reference incompatible aggs (#40068 ) Pipelines require single-valued agg or a numeric to be returned. If they don't get that, they throw an exception. Unfortunately, this exception text is very confusing to users because it usually arises from pathing "through" multiple terms aggs. The final target is a numeric, but it's the intermediary aggs that cause the problem. This commit adds the current agg name to the exception message so the user knows which "level" is the issue.	2019-04-15 10:35:53 -04:00
Nik Everett	c379206c1e	Fix some documentation urls in rest-api-spec (#40618 ) (#41145 ) Fixes some documentation urls in the rest-api-spec. Some of these URLs pointed to 404s and a few others pointed to deprecated documentation when we have better documentation now. I'm not consistent about `master` vs `current` because we're not consistent in other places and I think we should solve all of those at once with something a little more automatic.	2019-04-12 10:11:14 -04:00
Nhat Nguyen	9c36ab4ab4	Adjust bwc version for flush parameter validation Relates to #40213	2019-04-11 18:02:48 -04:00
Jason Tedor	24446ceae0	Add packaging to cluster stats response (#41048 ) This commit adds a packaging_types field to the cluster stats response that outlines the build flavors and types present in a cluster.	2019-04-10 13:47:19 -04:00
Mark Vieira	1287c7d91f	[Backport] Replace usages RandomizedTestingTask with built-in Gradle Test (#40978 ) (#40993 ) * Replace usages RandomizedTestingTask with built-in Gradle Test (#40978) This commit replaces the existing RandomizedTestingTask and supporting code with Gradle's built-in JUnit support via the Test task type. Additionally, the previous workaround to disable all tasks named "test" and create new unit testing tasks named "unitTest" has been removed such that the "test" task now runs unit tests as per the normal Gradle Java plugin conventions. (cherry picked from commit 323f312bbc829a63056a79ebe45adced5099f6e6) * Fix forking JVM runner * Don't bump shadow plugin version	2019-04-09 11:52:50 -07:00
Adrien Grand	7c27e5f243	Revert "Mute failing test" This reverts commit `1af2b2bfe6`.	2019-04-04 15:55:09 +02:00
Nhat Nguyen	2756a3936b	Reject illegal flush parameters (#40213 ) This change rejects an illegal combination of flush parameters where force is true, but wait_if_ongoing is false. This combination is trappy and should be forbidden. Closes #36342	2019-04-04 09:02:31 -04:00
Alpar Torok	1af2b2bfe6	Mute failing test Tracked in #40838	2019-04-04 14:54:21 +03:00
Adrien Grand	670e76669c	Fix alias resolution runtime complexity. (#40263 ) (#40788 ) A user reported that the same query that takes ~900ms when querying an index pattern only takes ~50ms when only querying indices that have matches. The query is a date range query and we confirmed that the `can_match` phase works as expected. I was able to reproduce this issue locally with a single node: with 900 1-shard indices, a query to an index pattern that matches all indices runs in ~90ms while a query to the only index that has matches runs in 0-1ms. This ended up not being related to the `can_match` phase but to the cost of resolving aliases when querying an index pattern that matches lots of indices. In that case, we first resolve the index pattern to a list of concrete indices and then for each concrete index, we check whether it was matched through an alias, meaning we might have to apply alias filters. Unfortunately this second per-index operation runs in linear time with the number of matched concrete indices, which means that alias resolution runs in O(num_indices^2) overall. So queries get exponentially slower as an index pattern matches more indices. I reorganized alias resolution into a one-step operation that runs in linear time with the number of matches indices, and then a per-index operation that runs in linear time with the number of aliases of this index. This makes alias resolution run is O(num_indices * num_aliases_per_index) overall instead. When testing the scenario described above, the `took` went down from ~90ms to ~10ms. It is still more than the 0-1ms latency that one gets when only querying the single index that has data, but still much better than what we had before. Closes #40248	2019-04-04 11:40:42 +02:00
David Turner	5a2ba34174	Get node ID from nodes info in REST tests (#40052 ) (#40532 ) We discussed recently that the cluster state API should be considered "internal" and therefore our usual cast-iron stability guarantees do not hold for this API. However, there are a good number of REST tests that try to identify the master node. Today they call `GET /_cluster/state` API and extract the master node ID from the response. In fact many of these tests just want an arbitary node ID (or perhaps a data node ID) so an alternative is to call `GET _nodes` or `GET _nodes/data:true` and obtain a node ID from the keys of the `nodes` map in the response. This change adds the ability for YAML-based REST tests to extract an arbitrary key from a map so that they can obtain a node ID from the nodes info API instead of using the master node ID from the cluster state API. Relates #40047.	2019-03-27 23:08:10 +00:00
Andy Bristol	23395a9b9f	search as you type fieldmapper (#35600 ) Adds the search_as_you_type field type that acts like a text field optimized for as-you-type search completion. It creates a couple subfields that analyze the indexed terms as shingles, against which full terms are queried, and a prefix subfield that analyze terms as the largest shingle size used and edge-ngrams, against which partial terms are queried Adds a match_bool_prefix query type that creates a boolean clause of a term query for each term except the last, for which a boolean clause with a prefix query is created. The match_bool_prefix query is the recommended way of querying a search as you type field, which will boil down to term queries for each shingle of the input text on the appropriate shingle field, and the final (possibly partial) term as a term query on the prefix field. This field type also supports phrase and phrase prefix queries however	2019-03-27 13:29:13 -07:00
Nhat Nguyen	b9f96a8e1f	Expose external refreshes through the stats API (#38643 ) Right now, the stats API only provides refresh metrics regarding internal refreshes. This isn't very useful and somewhat misleading for cluster administrators since the internal refreshes are not indicative of documents being available for search. In this PR I added a new metric for collecting external refreshes as they occur and exposing them through the stats API. Now, calling an endpoint for stats will yield external refresh metrics as well. Relates #36712	2019-03-24 22:21:00 -04:00
Mayya Sharipova	49a7c6e0e8	Expose proximity boosting (#39385 ) (#40251 ) Expose DistanceFeatureQuery for geo, date and date_nanos types Closes #33382	2019-03-20 09:24:41 -04:00
Simon Willnauer	235f57989f	Return cached segments stats if `include_unloaded_segments` is true (#39698 ) Today we don't return segments stats for closed indices which makes it hard to tell how much memory such an index would require. With this change we return the statistics if requested by setting `include_unloaded_segments` to true on the rest request. Relates to #39512	2019-03-20 12:08:41 +01:00
Jason Tedor	86d1d03c37	Remove cluster state size (#40109 ) This commit removes the cluster state size field from the cluster state response, and drops the backwards compatibility layer added in 6.7.0 to continue to support this field. As calculation of this field was expensive and had dubious value, we have elected to remove this field.	2019-03-15 17:16:25 -04:00
Jack Conradson	b57af6c401	Add a Painless Context REST API (#39382 ) This PR adds an internal REST API for querying context information about Painless whitelists. Commands include the following: GET /_scripts/painless/_context -- retrieves a list of contexts GET /_scripts/painless/_context?context=%name% retrieves all available information about the API for this specific context	2019-03-14 12:42:12 -07:00
Jason Tedor	24973cf464	Adjust BWC version on cluster state size response This work has been backported all the way now, so this commit adjusts the BWC version. Relates #40016	2019-03-14 09:42:12 -04:00
Jason Tedor	9181668edf	Stop returning cluster state size by default (#40016 ) Computing the compressed size of the cluster state on every invocation of cluster:monitor/state action is expensive, and the value of this field is dubious anyway. Therefore we want to remove computing this field. As a first step, we stop computing and return this field by default. To avoid breaking users, we will give them a system property to use to tide them over until the next major release when we will actually remove this field. This comes with a deprecation warning too, and the backport to the appropriate minor will also include a note in the migration guide. There will be a follow-up to remove this field in the next major version.	2019-03-14 08:57:55 -04:00
Ioannis Kakavas	87ec511684	Mute locale dependent mapping tests (#39996 )	2019-03-13 17:07:20 +02:00
Tal Levy	6c52da54c8	fix index refresh in test within 20_mix_typeless_typeful (#39198 ) (#39804 ) the test "Implicitly create a typeless ... typed template" fails occasionally because the index operation hasn't propogated to update the index mapping in time for the following assertion about a dynamically mapped field "bar". error failed with: ``` field [test-1.mappings.my_type.properties.bar] doesn't have a true value Expected: not null but: was null ``` refreshing the index should resolve this timing issue.	2019-03-08 12:15:32 -08:00
Martijn Laarman	af4e740500	Document scroll param on reindex.json (#38615 ) The Reindex API also exposes `scroll` as a querystring parameter.	2019-03-07 18:14:29 +01:00
Jim Ferenczi	160dc29f0e	Handle total hits equal to track_total_hits (#37907 ) This change ensures that a total hits equal to the value set for track_total_hits is not considered as a lower bound.	2019-03-05 16:28:48 +01:00
Simon Willnauer	d112c89041	Allow inclusion of unloaded segments in stats (#39512 ) Today we have no chance to fetch actual segment stats for segments that are currently unloaded. This is relevant in the case of frozen indices. This allows to monitor how much memory a frozen index would use if it was unfrozen.	2019-03-05 14:02:20 +01:00
Tanguy Leroux	e005eeb0b3	Backport support for replicating closed indices to 7.x (#39506 )(#39499 ) Backport support for replicating closed indices (#39499) Before this change, closed indexes were simply not replicated. It was therefore possible to close an index and then decommission a data node without knowing that this data node contained shards of the closed index, potentially leading to data loss. Shards of closed indices were not completely taken into account when balancing the shards within the cluster, or automatically replicated through shard copies, and they were not easily movable from node A to node B using APIs like Cluster Reroute without being fully reopened and closed again. This commit changes the logic executed when closing an index, so that its shards are not just removed and forgotten but are instead reinitialized and reallocated on data nodes using an engine implementation which does not allow searching or indexing, which has a low memory overhead (compared with searchable/indexable opened shards) and which allows shards to be recovered from peer or promoted as primaries when needed. This new closing logic is built on top of the new Close Index API introduced in 6.7.0 (#37359). Some pre-closing sanity checks are executed on the shards before closing them, and closing an index on a 8.0 cluster will reinitialize the index shards and therefore impact the cluster health. Some APIs have been adapted to make them work with closed indices: - Cluster Health API - Cluster Reroute API - Cluster Allocation Explain API - Recovery API - Cat Indices - Cat Shards - Cat Health - Cat Recovery This commit contains all the following changes (most recent first): * c6c42a1 Adapt NoOpEngineTests after #39006 * 3f9993d Wait for shards to be active after closing indices (#38854) * 5e7a428 Adapt the Cluster Health API to closed indices (#39364) * 3e61939 Adapt CloseFollowerIndexIT for replicated closed indices (#38767) * 71f5c34 Recover closed indices after a full cluster restart (#39249) * 4db7fd9 Adapt the Recovery API for closed indices (#38421) * 4fd1bb2 Adapt more tests suites to closed indices (#39186) * 0519016 Add replica to primary promotion test for closed indices (#39110) * b756f6c Test the Cluster Shard Allocation Explain API with closed indices (#38631) * c484c66 Remove index routing table of closed indices in mixed versions clusters (#38955) * 00f1828 Mute CloseFollowerIndexIT.testCloseAndReopenFollowerIndex() * e845b0a Do not schedule Refresh/Translog/GlobalCheckpoint tasks for closed indices (#38329) * cf9a015 Adapt testIndexCanChangeCustomDataPath for replicated closed indices (#38327) * b9becdd Adapt testPendingTasks() for replicated closed indices (#38326) * 02cc730 Allow shards of closed indices to be replicated as regular shards (#38024) * e53a9be Fix compilation error in IndexShardIT after merge with master * cae4155 Relax NoOpEngine constraints (#37413) * 54d110b [RCI] Adapt NoOpEngine to latest FrozenEngine changes * c63fd69 [RCI] Add NoOpEngine for closed indices (#33903) Relates to #33888	2019-03-01 14:48:26 +01:00
Tal Levy	0b676f07f6	mute failing test in 20_mix_typeless_typeful awaits fix in #39198	2019-02-20 16:12:07 -08:00
Adrien Grand	c28b6fb9b6	Reenable test in `indices.put_mapping/20_mix_typeless_typeful.yml`. (#39056 ) (#39057 ) This test had been disabled because of test failures, but it only affected the 6.x branch. The fix for 6.x is at #39054. On master/7.x/7.0 we can reenable the test as-is.	2019-02-20 11:33:50 +01:00
Lee Hinman	41ac6f9c55	Revert "Mute failing test 20_mix_typless_typefull (#38781 )" (#38912 ) (#39141 ) Backport of #38912 This reverts commit b91e0589fe1efdaa5061a75a3674a5cc8706b703. This should be fixed by #38873 Resolves #38711	2019-02-19 16:14:31 -07:00
Alan Woodward	ab4d5f404f	Add overlapping, before, after filters to intervals query (#38999 ) Lucene recently added `overlapping`, `before` and `after` filters to the intervals package. This commit exposes them in elasticsearch.	2019-02-18 15:06:24 +00:00
Alpar Torok	12eac6ad4b	Mute failing test (#38781 ) Tracking #38711	2019-02-12 15:57:57 +02:00
Tanguy Leroux	510829f9f7	TransportVerifyShardBeforeCloseAction should force a flush (#38401 ) This commit changes the `TransportVerifyShardBeforeCloseAction` so that it always forces the flush of the shard. It seems that #37961 is not sufficient to ensure that the translog and the Lucene commit share the exact same max seq no and global checkpoint information in case of one or more noop operations have been made. The `BulkWithUpdatesIT.testThatMissingIndexDoesNotAbortFullBulkRequest` and `FrozenIndexTests.testFreezeEmptyIndexWithTranslogOps` test this trivial situation and they both fail 1 on 10 executions. Relates to #33888	2019-02-06 13:22:54 +01:00
Boaz Leskes	033ba725af	Remove support for internal versioning for concurrency control (#38254 ) Elasticsearch has long [supported](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-versioning) compare and set (a.k.a optimistic concurrency control) operations using internal document versioning. Sadly that approach is flawed and can sometime do the wrong thing. Here's the relevant excerpt from the resiliency status page: > When a primary has been partitioned away from the cluster there is a short period of time until it detects this. During that time it will continue indexing writes locally, thereby updating document versions. When it tries to replicate the operation, however, it will discover that it is partitioned away. It won’t acknowledge the write and will wait until the partition is resolved to negotiate with the master on how to proceed. The master will decide to either fail any replicas which failed to index the operations on the primary or tell the primary that it has to step down because a new primary has been chosen in the meantime. Since the old primary has already written documents, clients may already have read from the old primary before it shuts itself down. The version numbers of these reads may not be unique if the new primary has already accepted writes for the same document We recently [introduced](https://www.elastic.co/guide/en/elasticsearch/reference/6.x/optimistic-concurrency-control.html) a new sequence number based approach that doesn't suffer from this dirty reads problem. This commit removes support for internal versioning as a concurrency control mechanism in favor of the sequence number approach. Relates to #1078	2019-02-05 20:53:35 +01:00
Julie Tibshirani	3ce7d2c9b6	Make sure to reject mappings with type _doc when include_type_name is false. (#38270 ) `CreateIndexRequest#source(Map<String, Object>, ... )`, which is used when deserializing index creation requests, accidentally accepts mappings that are nested twice under the type key (as described in the bug report #38266). This in turn causes us to be too lenient in parsing typeless mappings. In particular, we accept the following index creation request, even though it should not contain the type key `_doc`: ``` PUT index?include_type_name=false { "mappings": { "_doc": { "properties": { ... } } } } ``` There is a similar issue for both 'put templates' and 'put mappings' requests as well. This PR makes the minimal changes to detect and reject these typed mappings in requests. It does not address #38266 generally, or attempt a larger refactor around types in these server-side requests, as I think this should be done at a later time.	2019-02-05 10:52:32 -08:00
Tal Levy	ae47c025e2	add basic REST test for geohash_grid (#37996 )	2019-02-05 09:44:47 -08:00
Boaz Leskes	12657fda44	`if_seq_no` and `if_primary_term` parameters aren't wired correctly in REST Client's CRUD API (#38411 )	2019-02-05 18:05:56 +01:00
Yogesh Gaikwad	fe36861ada	Add support for API keys to access Elasticsearch (#38291 ) X-Pack security supports built-in authentication service `token-service` that allows access tokens to be used to access Elasticsearch without using Basic authentication. The tokens are generated by `token-service` based on OAuth2 spec. The access token is a short-lived token (defaults to 20m) and refresh token with a lifetime of 24 hours, making them unsuitable for long-lived or recurring tasks where the system might go offline thereby failing refresh of tokens. This commit introduces a built-in authentication service `api-key-service` that adds support for long-lived tokens aka API keys to access Elasticsearch. The `api-key-service` is consulted after `token-service` in the authentication chain. By default, if TLS is enabled then `api-key-service` is also enabled. The service can be disabled using the configuration setting. The API keys:- - by default do not have an expiration but expiration can be configured where the API keys need to be expired after a certain amount of time. - when generated will keep authentication information of the user that generated them. - can be defined with a role describing the privileges for accessing Elasticsearch and will be limited by the role of the user that generated them - can be invalidated via invalidation API - information can be retrieved via a get API - that have been expired or invalidated will be retained for 1 week before being deleted. The expired API keys remover task handles this. Following are the API key management APIs:- 1. Create API Key - `PUT/POST /_security/api_key` 2. Get API key(s) - `GET /_security/api_key` 3. Invalidate API Key(s) `DELETE /_security/api_key` The API keys can be used to access Elasticsearch using `Authorization` header, where the auth scheme is `ApiKey` and the credentials, is the base64 encoding of API key Id and API key separated by a colon. Example:- ``` curl -H "Authorization: ApiKey YXBpLWtleS1pZDphcGkta2V5" http://localhost:9200/_cluster/health ``` Closes #34383	2019-02-05 14:21:57 +11:00
Mayya Sharipova	641704464d	Deprecate types in rollover index API (#38039 ) Relates to #35190	2019-02-04 16:07:45 -05:00
Alexander Reelsen	87f3579125	Add nanosecond field mapper (#37755 ) This adds a dedicated field mapper that supports nanosecond resolution - at the price of a reduced date range. When using the date field mapper, the time is stored as milliseconds since the epoch in a long in lucene. This field mapper stores the time in nanoseconds since the epoch - which means its range is much smaller, ranging roughly from 1970 to 2262. Note that aggregations will still be in milliseconds. However docvalue fields will have full nanosecond resolution Relates #27330	2019-02-04 11:31:16 +01:00
Boaz Leskes	f6e06a2b19	Adapt minimum versions for seq# powered operations in Watch related requests and UpdateRequest (#38231 ) After backporting #37977, #37857 and #37872	2019-02-01 20:37:16 -05:00
Julie Tibshirani	c2e9d13ebd	Default include_type_name to false in the yml test harness. (#38058 ) This PR removes the temporary change we made to the yml test harness in #37285 to automatically set `include_type_name` to `true` in index creation requests if it's not already specified. This is possible now that the vast majority of index creation requests were updated to be typeless in #37611. A few additional tests also needed updating here. Additionally, this PR updates the test harness to set `include_type_name` to `false` in index creation requests when communicating with 6.x nodes. This mirrors the logic added in #37611 to allow for typeless document write requests in test set-up code. With this update in place, we can remove many references to `include_type_name: false` from the yml tests.	2019-02-01 11:44:13 -08:00
Nhat Nguyen	f5f3cb8f4c	AwaitsFix PUT mapping with _doc on an index that has types (#38204 ) Tracked at #38202	2019-02-01 12:00:43 -05:00
Adrien Grand	2229e7231e	Enable bw tests for #37871 and #38032 . (#38167 ) Mixed-version clusters tests had been disabled initially since they wouldn't work until the functionality would be backported.	2019-02-01 13:55:51 +01:00
Jim Ferenczi	b7308aa03c	Don't load global ordinals with the `map` execution_hint (#37833 ) The terms aggregator loads the global ordinals to retrieve the cardinality of the field to aggregate on. This information is then used to select the strategy to use for the aggregation (breadth_first or depth_first). However this should be avoided if the execution_hint is explicitly set to map since this mode doesn't really need the global ordinals. Since we still need the cardinality of the field this change picks the maximum cardinality in the segments as an estimation of the total cardinality to select the strategy to use (breadth_first or depth_first). This estimation is only used if the execution hint is set to map, otherwise the global ordinals are still used to retrieve the accurate cardinality. Closes #37705	2019-02-01 09:35:46 +01:00
Yuri Astrakhan	f3cde06a1d	geotile_grid implementation (#37842 ) Implements `geotile_grid` aggregation This patch refactors previous implementation https://github.com/elastic/elasticsearch/pull/30240 This code uses the same base classes as `geohash_grid` agg, but uses a different hashing algorithm to allow zoom consistency. Each grid bucket is aligned to Web Mercator tiles.	2019-01-31 19:11:30 -05:00
Luca Cavanna	622fb7883b	Introduce ability to minimize round-trips in CCS (#37828 ) With #37566 we have introduced the ability to merge multiple search responses into one. That makes it possible to expose a new way of executing cross-cluster search requests, that makes CCS much faster whenever there is network latency between the CCS coordinating node and the remote clusters. The coordinating node can now send a single search request to each remote cluster, which gets reduced by each one of them. from + size results are requested to each cluster, and the reduce phase in each cluster is non final (meaning that buckets are not pruned and pipeline aggs are not executed). The CCS coordinating node performs an additional, final reduction, which produces one search response out of the multiple responses received from the different clusters. This new execution path will be activated by default for any CCS request unless a scroll is provided or inner hits are requested as part of field collapsing. The search API accepts now a new parameter called ccs_minimize_roundtrips that allows to opt-out of the default behaviour. Relates to #32125	2019-01-31 15:12:14 +01:00
Adrien Grand	a536fa7755	Treat put-mapping calls with `_doc` as a top-level key as typed calls. (#38032 ) Currently the put-mapping API assumes that because the type name is `_doc` then it is dealing with a typeless put-mapping call. Yet we still allow running the put-mapping API in a typed fashion with `_doc` as a type name. The current logic triggers surprising errors when doing a typed put-mapping call with `_doc` as a type name on an index that has a type already. This is a bit of a corner-case, but is more important on 6.x due to the fact that using the index API with `_doc` as a type name triggers typed calls to the put-mapping API with `_doc` as a type name.	2019-01-31 13:57:42 +01:00
Adrien Grand	3c439d3b92	Fix test bug when testing the merging of mappings and templates. (#38021 ) This test performs a typed index call when it actually means to run a typeless index call.	2019-01-31 13:52:09 +01:00
David Turner	81c443c9de	Deprecate minimum_master_nodes (#37868 ) Today we pass `discovery.zen.minimum_master_nodes` to nodes started up in tests, but for 7.x nodes this setting is not required as it has no effect. This commit removes this setting so that nodes are started with more realistic configurations, and deprecates it.	2019-01-30 20:09:15 +00:00
Colin Goodheart-Smithe	21e392e95e	Removes typed calls from YAML REST tests (#37611 ) This PR attempts to remove all typed calls from our YAML REST tests. The PR adds include_type_name: false to create index requests that use a mapping and also to put mapping requests. It also removes _type from index requests where they haven't already been removed. The PR ignores tests named *_with_types.yml since this are specifically testing typed API behaviour. The change also includes changing the test harness to add the type _doc to index, update, get and bulk requests that do not specify the document type when the test is running against a mixed 7.x/6.x cluster.	2019-01-30 16:32:58 +00:00
Adrien Grand	c8af0f4bfa	Use mappings to format doc-value fields by default. (#30831 ) Doc-value fields now return a value that is based on the mappings rather than the script implementation by default. This deprecates the special `use_field_mapping` docvalue format which was added in #29639 only to ease the transition to 7.x and it is not necessary anymore in 7.0.	2019-01-30 10:31:51 +01:00
Adrien Grand	b63b50b945	Give precedence to index creation when mixing typed templates with typeless index creation and vice-versa. (#37871 ) Currently if you mix typed templates and typeless index creation or typeless templates and typed index creation then you will end up with an error because Elasticsearch tries to create an index that has multiple types: `_doc` and the explicit type name that you used. This commit proposes to give precedence to the index creation call so that the type from the template will be ignored if the index creation call is typeless while the template is typed, and the type from the index creation call will be used if there is a typeless template. This is consistent with the fact that index creation already "wins" if a field is defined differently in the index creation call and in a template: the definition from the index creation call is used in such cases. Closes #37773	2019-01-30 10:28:24 +01:00
Albert Zaharovits	d05a4b9d14	Get Aliases with wildcard exclusion expression (#34230 ) This commit adds the code in the HTTP layer that will parse exclusion wildcard expressions. The existing code issues 404s for wildcards as well as explicit indices. But, in general, in an expression with exclude wildcards (-...*) following other include wildcards, there is no way to tell if the include wildcard produced no results or they were subsequently excluded. Therefore, the proposed change is breaking the behavior of 404s for wildcards. Specifically, no 404s will be returned for wildcards, even if they are not followed by exclude wildcards or the exclude wildcards could not possibly exclude what has previously been included. Only explicitly requested aliases will be called out as missing.	2019-01-29 18:56:20 +02:00
Boaz Leskes	65a9b61a91	Add Seq# based optimistic concurrency control to UpdateRequest (#37872 ) The update request has a lesser known support for a one off update of a known document version. This PR adds an a seq# based alternative to power these operations. Relates #36148 Relates #10708	2019-01-29 09:18:05 -05:00
Jim Ferenczi	cb451edb01	Allow nested fields in the composite aggregation (#37178 ) This changes adds the support to handle `nested` fields in the `composite` aggregation. A `nested` aggregation can be used as parent of a `composite` aggregation in order to target `nested` fields in the `sources`. Closes #28611	2019-01-25 14:00:39 +01:00
Boaz Leskes	9d87ca567a	300_sequence_numbers should not rely on 7.0 total hits structure	2019-01-24 23:41:10 +01:00
Boaz Leskes	af2f4c8f73	enable bwc tests and bump versions after backporting https://github.com/elastic/elasticsearch/pull/37639	2019-01-24 20:55:55 +01:00
Alexander Reelsen	daa2ec8a60	Switch mapping/aggregations over to java time (#36363 ) This commit moves the aggregation and mapping code from joda time to java time. This includes field mappers, root object mappers, aggregations with date histograms, query builders and a lot of changes within tests. The cut-over to java time is a requirement so that we can support nanoseconds properly in a future field mapper. Relates #27330	2019-01-23 10:40:05 +01:00
Boaz Leskes	52ba407931	Expose sequence number and primary terms in search responses (#37639 ) Users may require the sequence number and primary terms to perform optimistic concurrency control operations. Currently, you can get the sequence number via the `docvalues_fields` API but the primary term is not accessible because it is maintained by the `SeqNoFieldMapper` and the infrastructure can't find it. This commit adds a dedicated sub fetch phase to return both numbers that is connected to a new `seq_no_primary_term` parameter.	2019-01-23 09:01:58 +01:00
Tomas Della Vedova	257f3eff22	Add note about how the body is referenced (#33935 )	2019-01-22 09:07:48 +01:00
Julie Tibshirani	8da7a27f3b	Deprecate types in the put mapping API. (#37280 ) From #29453 and #37285, the `include_type_name` parameter was already present and defaulted to false. This PR makes the following updates: - Add deprecation warnings to `RestPutMappingAction`, plus tests in `RestPutMappingActionTests`. - Add a typeless 'put mappings' method to the Java HLRC, and deprecate the old typed version. To do this cleanly, I opted to create a new `PutMappingRequest` object that differs from the existing server one.	2019-01-18 12:28:31 -08:00
Christoph Büscher	2f0e0b2426	Allow indices.get_mapping response parsing without types (#37492 ) This change adds deprecation warning to the indices.get_mapping API in case the "inlcude_type_name" parameter is set to "true" and changes the parsing code in GetMappingsResponse to parse the type-less response instead of the one containing types. As a consequence the HLRC client doesn't need to force "include_type_name=true" any more and the GetMappingsResponseTests can be adapted to the new format as well. Also removing some "include_type_name" parameters in yaml test and docs where not necessary.	2019-01-18 09:33:36 +01:00
Julie Tibshirani	1a1dbf705f	Make sure to use the resolved type in DocumentMapperService#extractMappings. (#37451 ) * Pull out a shared method MapperService#resolveDocumentType. * Make sure to resolve the type when extracting the mappings. Addresses #36811.	2019-01-15 07:32:47 -08:00

1 2 3 4 5 ...

1878 Commits