OpenSearch

Commit Graph

Author	SHA1	Message	Date
Armin Braun	91e938ead8	Add Trace Logging of REST Requests (#51684 ) (#52015 ) Being able to trace log all REST requests to a node would make debugging a number of issues a lot easier.	2020-02-07 09:03:20 +01:00
Jim Ferenczi	0f333c89b9	Always rewrite search shard request outside of the search thread pool (#51708 ) (#51979 ) This change ensures that the rewrite of the shard request is executed in the network thread or in the refresh listener when waiting for an active shard. This allows queries that rewrite to match_no_docs to bypass the search thread pool entirely even if the can_match phase was skipped (pre_filter_shard_size > number of shards). Coordinating nodes don't have the ability to create empty responses so this change also ensures that at least one shard creates a full empty response while the others can return null ones. This is needed since creating true empty responses on shards requires creating concrete aggregators, which would be too costly to build on a network thread. We should move this functionality to aggregation builders in a follow up but that would be a much bigger change. This change is also important for #49601 since we want to add the ability to use the result of other shards to rewrite the request of subsequent ones. For instance if the first M shards have their top N computed, the worst top document in the global queue can be passed to subsequent shards that can then rewrite to match_no_docs if they can guarantee that they don't have any document better than the provided one.	2020-02-06 10:53:11 +01:00
Jim Ferenczi	fb710cc62b	Remove the query builder serialization from QueryShardException message (#51885 ) QueryBuilders that throw exceptions on shards when building the Lucene query return the full serialization of the query builder in the exception message. For large queries that fail to execute due to the max boolean clause limit, this means that we keep a reference to these big messages for every shard that participates in the request. In order to limit the memory needed to hold these query shard exceptions in the coordinating node, this change removes the query builder serialization from the shard exception. The query is known by the user so there should be no need to repeat it on every shard exception. We could also omit the entire stack trace for known bad request exceptions but it would deserve a separate issue/pr. Closes #51843 Closes #48910	2020-02-06 08:26:15 +01:00
Nik Everett	80e29a47d8	Fix a sneaky bug in rare_terms (#51868 ) (#51959 ) When the `rare_terms` aggregation contained another aggregation it'd break them. Most of the time. This happened because the process that it uses to remove buckets that turn out not to be rare was incorrectly merging results from multiple leaves. This'd cause array index out of bounds issues. We didn't catch it in the test because the issue doesn't happen on the very first bucket. And the tests generated data in such a way that the first bucket always contained the rare terms. Randomizing the order of the generated data fixed the test so it caught the issue. Closes #51020	2020-02-05 16:32:55 -05:00
Adrien Grand	ad9d2f1922	Move analysis/mappings stats to cluster-stats. (#51875 ) Closes #51138	2020-02-05 11:02:25 +01:00
Yannick Welsch	b4480bb8a4	Mute LoggingOutputStreamTests (#51917 ) Relates #51838	2020-02-05 10:46:45 +01:00
Julie Tibshirani	38ce428831	Create a class to hold field capabilities for one index. (#51844 ) Currently, the same class `FieldCapabilities` is used both to represent the capabilities for one index, and also the merged capabilities across indices. To help clarify the logic, this PR proposes to create a separate class `IndexFieldCapabilities` for the capabilities in one index. The refactor will also help when adding `source_path` information in #49264, since the merged source path field will have a different structure from the field for a single index. Individual changes: * Add a new class IndexFieldCapabilities. * Remove extra constructor from FieldCapabilities. * Combine the add and merge methods in FieldCapabilities.Builder.	2020-02-04 11:24:57 -08:00
Maria Ralli	8d3e73b3a0	Add host address to BindTransportException message (#51269 ) When bind fails, show the host address in addition to the port. This helps debugging cases with wrong "network.host" values. Closes #48001	2020-02-04 17:13:19 +00:00
feifeiiiiiiiiii	337153b29f	Throw better exception on wrong `dynamic_templates` syntax (#51783 ) Currently, a mappings update request where dynamic_templates is an object instead of an array results in an HTTP response with a 500 code. This PR checks for this condition and throws a MapperParsingException like we do for other malformed mapping cases. Closes #51486	2020-02-04 17:01:55 +01:00
Henning Andersen	41552359a2	Increase master disruption test assert timeouts (#51810 ) After #51803, the timeouts waiting for assertions around master change were too short.	2020-02-03 15:51:33 +01:00
Henning Andersen	1800b2730f	Fix completeWith exception handling (#51734 ) ActionListener.completeWith would catch exceptions from listener.onResponse and deliver them to listener.onFailure, essentially double notifying the listener. Instead we now assert that listeners do not throw when using ActionListener.completeWith. Relates #50886	2020-02-03 14:22:55 +01:00
Przemyslaw Gomulka	a6d24d6a46	Fix ingest timezone logic backport(#51215 ) (#51802 ) When a timezone is not provided, ingest logic should consider a time to be in the timezone provided as a parameter. When a timezone is provided, ingest should recalculate the time to the timezone provided as a parameter. Closes #51108 backport(#51215)	2020-02-03 14:17:43 +01:00
Henning Andersen	918dfaff1f	Increase disruption test publish timeout to 5s (#51803 ) With the new mechanism for storing cluster state in lucene, we store index metadata in multiple data paths too. This causes cluster state publish to timeout too frequently with a 1s timeout, so increasing it to 5s. Also increasing follower check timeout to 5s since it also sometimes has fsync in its timeout path and leader check for symmetry. Closes #51329	2020-02-03 13:57:57 +01:00
Ryan Ernst	21224caeaf	Remove comparison to true for booleans (#51723 ) While we use `== false` as a more visible form of boolean negation (instead of `!`), the true case is implied and the true value does not need to be explicitly checked. This commit converts cases that have slipped into the code checking for `== true`.	2020-01-31 16:35:43 -08:00
Ryan Ernst	61622c4f0c	Fix LoggingOutputStream to work on windows (#51779 ) LoggingOutputStream reads a stream and breaks on newlines. This commit fixes the behavior to account for windows newlines also containing `\r`. closes #51532	2020-01-31 16:30:10 -08:00
Noor	70bb7c862d	Fixed typo in comment (#51745 ) Comment said the supported highlighter type was fvj, but there's no such highlighter. It is supposed to say fvh for fast vector highlighting.	2020-01-31 10:30:21 -08:00
Mayya Sharipova	42b885f050	Upgrade to lucene-8.5.0-snapshot-3333ce7da6d (#51749 ) Backport for #51327	2020-01-31 11:20:15 -05:00
David Turner	39a3a950de	Simplify rebalancer's weight function (#51632 ) This commit inlines the `weightShardAdded` and `weightShardRemoved` methods from the `BalancedShardsAllocator#WeightFunction` that respectively add and subtract 1 (±ε) from the result of `weight`. It then follows up with a number of simplifications that this inlining enables. As a side-effect it also somewhat reduces the number of calls to canRebalance and canAllocate during rebalancing when there are multiple shards of the same index on a node that is heavier than average.	2020-01-31 14:40:23 +00:00
Christoph Büscher	86f3b47299	Make `date_range` query rounding consistent with `date` (#50237 ) (#51741 ) Currently the rounding used in range queries can behave differently for `date` and `date_range` as explained in #50009. The behaviour on `date` fields is the one we document in https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html#range-query-date-math-rounding. This change adapts the rounding behaviour for RangeType.DATE so it uses the same logic as the `date` for the `date_range` type. Backport of #50237	2020-01-31 15:35:05 +01:00
Dominic Page	d7e1215e42	Backport of #50737 to 7.x (#51662 ) * Refactor GeoShape tests to GeoShape and GeoPoint (#50737) Backport to 7.x	2020-01-31 11:55:10 +01:00
Adrien Grand	915a931e93	Bucket aggregation circuit breaker optimization. (#46751 ) (#51730 ) Co-authored-by: Howard <danielhuang@tencent.com>	2020-01-31 11:30:51 +01:00
Henning Andersen	282ae8fd8c	Increase log level for failing AbstractDisruptionIT tests (#51462 ) Increase log level for two failing tests to include trace logging for PersistedClusterStateService. Relates #51329	2020-01-31 08:30:09 +01:00
Ioannis Kakavas	46ffc57abe	Fix test compilation error	2020-01-31 09:23:02 +02:00
David Turner	72ae0ca73f	Log exceptions in TcpTransport at DEBUG level (#51612 ) When running Elasticsearch on a flaky network, we may see nodes leaving the cluster with reason `disconnected`. It may be useful to the cluster administrator to see the full exception that caused the disconnection, but this is only available with `TRACE` level logging which commingles the details of the problem with other messages that are not useful to end users. This commit promotes logging of exceptions in `TcpTransport` from `TRACE` to `DEBUG` to separate them from the truly `TRACE`-level messages.	2020-01-31 01:36:58 +00:00
Gordon Brown	10c8179351	Use exclusions list instead of fake system indices (#51586 ) This commit switches the strategy for managing dot-prefixed indices that should be hidden indices from using "fake" system indices to an explicit exclusions list that must be updated when those indices are converted to hidden indices.	2020-01-30 16:31:27 -07:00
Armin Braun	9c7a63214c	Fix InternalEngineTests.testSeqNoAndCheckpoints (#51630 ) (#51671 ) * Fix InternalEngineTests.testSeqNoAndCheckpoints If we force flush while possibly triggering a merge the local checkpoint may change from the expectation from the loop that just increments on every operation. Closes #51604	2020-01-30 15:42:48 +01:00
Nhat Nguyen	f0fad5b622	Deprecate translog retention settings (#51588 ) (#51638 ) This change deprecates the translog retention settings as they are effectively ignored since 7.4. Relates #50775 Relates #45473	2020-01-30 09:03:10 -05:00
Armin Braun	1064009e9d	Allow Parallel Snapshot Restore And Delete (#51608 ) (#51666 ) There is no reason not to allow deletes in parallel to restores if they're dealing with different snapshots. A delete will not remove any files related to the snapshot that is being restored if it is different from the deleted snapshot because those files will still be referenced by the restoring snapshot. Loading RepositoryData concurrently to modifying it is concurrency safe nowadays as well since the repo generation is tracked in the cluster state. Closes #41463	2020-01-30 14:27:05 +01:00
Henning Andersen	2e8a2c4baf	Fix ActionListener.map exception handling (#50886 ) (#51642 ) ActionListener.map would call listener.onFailure for exceptions from listener.onResponse, but this means we could double trigger some listeners which is generally unexpected. Instead, we should assume that a listener's onResponse (and onFailure) implementation is responsible for its own exception handling.	2020-01-30 12:54:55 +01:00
Ryan Ernst	cf5a2269a5	Fix stderr to also be captured by log4j (#51569 ) In #50259 we redirected stdout and stderr to log4j, to capture jdk and external library messages. However, a typo in the method name used to redirect the stream in java means stdout is currently being duplicated twice, and stderr not captured. This commit corrects that mistake. Unfortunately this is at a level that cannot really be tested, thus we are still missing tests for this behavior.	2020-01-29 16:37:56 -08:00
Julie Tibshirani	40f4f2d267	Avoid processing search profile results twice. (#51575 ) Just a small clean-up, not motivated by performance.	2020-01-29 14:37:39 -08:00
Nhat Nguyen	316fba0c67	Ensure warm up engine in testTranslogReplayWithFailure We need to warm up the engine (i.e., perform an external refresh) before accessing the external refresh. Note that we refresh externally before allowing reading from a shard. Relates #48605 Closes #51548	2020-01-28 21:49:21 -05:00
Jason Tedor	b080237837	Ignore virtual ethernet devices that disappear (#51581 ) When checking if a device is up, today we can run into virtual ethernet devices that disappear while we are in the middle of checking. This leads to "no such device". This commit addresses such devices by treating them as not being up, if they are virtual ethernet devices that disappeared while we were checking.	2020-01-28 18:44:36 -05:00
Jim Ferenczi	77f4aafaa2	Expose the logic to cancel task when the rest channel is closed (#51423 ) This commit moves the logic that cancels search requests when the rest channel is closed to a generic client that can be used by other APIs. This will be useful for any rest action that wants to cancel the execution of a task if the underlying rest channel is closed by the client before completion. Relates #49931 Relates #50990	2020-01-28 22:55:42 +01:00
Armin Braun	aae93a7578	Allow Repository Plugins to Filter Metadata on Create (#51472 ) (#51542 ) * Allow Repository Plugins to Filter Metadata on Create Add a hook that allows repository plugins to filter the repository metadata before it gets written to the cluster state.	2020-01-28 18:33:26 +01:00
David Roberts	b8adb59e4a	[TEST] Mute ReloadSecureSettingsIT.testReloadAllNodesWithPasswordWithoutTLSFails Due to https://github.com/elastic/elasticsearch/issues/51546	2020-01-28 17:14:36 +00:00
Gordon Brown	89c2834b24	Deprecate creation of dot-prefixed index names except for hidden and system indices (#49959 ) This commit deprecates the creation of dot-prefixed index names (e.g. .watches) unless they are either 1) a hidden index, or 2) registered by a plugin that extends SystemIndexPlugin. This is the first step towards more thorough protections for system indices. This commit also modifies several plugins which use dot-prefixed indices to register indices they own as system indices, and adds a plugin to register .tasks as a system index.	2020-01-28 10:01:16 -07:00
William Brafford	9efa5be60e	Password-protected Keystore Feature Branch PR (#51123 ) (#51510 ) * Reload secure settings with password (#43197) If a password is not set, we assume an empty string to be compatible with previous behavior. Only allow the reload to be broadcast to other nodes if TLS is enabled for the transport layer. * Add passphrase support to elasticsearch-keystore (#38498) This change adds support for keystore passphrases to all subcommands of the elasticsearch-keystore cli tool and adds a subcommand for changing the passphrase of an existing keystore. The work to read the passphrase in Elasticsearch when loading will be addressed in a different PR. Subcommands of elasticsearch-keystore can handle (open and create) passphrase protected keystores. When reading a keystore, a user is prompted for a passphrase only if the keystore is passphrase protected. When creating a keystore, a user is allowed (default behavior) to create one with an empty passphrase. Passphrase can be set to be empty when changing/setting it for an existing keystore. Relates to: #32691 Supersedes: #37472 * Restore behavior for force parameter (#44847) Turns out that the behavior of `-f` for the add and add-file sub commands, where it would also forcibly create the keystore if it didn't exist, was by design, although undocumented. This change restores that behavior, auto-creating a keystore that is not password protected if the force flag is used. The force OptionSpec is moved to the BaseKeyStoreCommand as we will presumably want to maintain the same behavior in any other command that takes a force option. * Handle pwd protected keystores in all CLI tools (#45289) This change ensures that `elasticsearch-setup-passwords` and `elasticsearch-saml-metadata` can handle a password protected elasticsearch.keystore. For setup passwords the user would be prompted to add the elasticsearch keystore password upon running the tool. 
There is no option to pass the password as a parameter as we assume the user is present in order to enter the desired passwords for the built-in users. For saml-metadata, we prompt for the keystore password at all times even though we'd only need to read something from the keystore when there is a signing or encryption configuration. * Modify docs for setup passwords and saml metadata cli (#45797) Adds a sentence in the documentation of `elasticsearch-setup-passwords` and `elasticsearch-saml-metadata` to describe that users would be prompted for the keystore's password when running these CLI tools, when the keystore is password protected. Co-Authored-By: Lisa Cawley <lcawley@elastic.co> * Elasticsearch keystore passphrase for startup scripts (#44775) This commit allows a user to provide a keystore password on Elasticsearch startup, but only prompts when the keystore exists and is encrypted. The entrypoint in Java code is standard input. When the Bootstrap class is checking for secure keystore settings, it checks whether or not the keystore is encrypted. If so, we read one line from standard input and use this as the password. For simplicity's sake, we allow a maximum passphrase length of 128 characters. (This is an arbitrary limit and could be increased or eliminated. It is also enforced in the keystore tools, so that a user can't create a password that's too long to enter at startup.) In order to provide a password on standard input, we have to account for four different ways of starting Elasticsearch: the bash startup script, the Windows batch startup script, systemd startup, and docker startup. We use wrapper scripts to reduce systemd and docker to the bash case: in both cases, a wrapper script can read a passphrase from the filesystem and pass it to the bash script. In order to simplify testing the need for a passphrase, I have added a has-passwd command to the keystore tool. 
This command can run silently, and exit with status 0 when the keystore has a password. It exits with status 1 if the keystore doesn't exist or exists and is unencrypted. A good deal of the code-change in this commit has to do with refactoring packaging tests to cleanly use the same tests for both the "archive" and the "package" cases. This required not only moving tests around, but also adding some convenience methods for an abstraction layer over distribution-specific commands. * Adjust docs for password protected keystore (#45054) This commit adds relevant parts in the elasticsearch-keystore sub-commands reference docs and in the reload secure settings API doc. * Fix failing Keystore Passphrase test for feature branch (#50154) One problem with the passphrase-from-file tests, as written, is that they would leave a SystemD environment variable set when they failed, and this setting would cause elasticsearch startup to fail for other tests as well. By using a try-finally, I hope that these tests will fail more gracefully. It appears that our Fedora and Ubuntu environments may be configured to store journald information under /var rather than under /run, so that it will persist between boots. Our destructive tests that read from the journal need to account for this in order to avoid trying to limit the output we check in tests. * Run keystore management tests on docker distros (#50610) * Add Docker handling to PackagingTestCase Keystore tests need to be able to run in the Docker case. We can do this by using a DockerShell instead of a plain Shell when Docker is running. * Improve ES startup check for docker Previously we were checking truncated output for the packaged JDK as an indication that Elasticsearch had started. With new preliminary password checks, we might get a false positive from ES keystore commands, so we have to check specifically that the Elasticsearch class from the Bootstrap package is what's running. 
* Test password-protected keystore with Docker (#50803) This commit adds two tests for the case where we mount a password-protected keystore into a Docker container and provide a password via a Docker environment variable. We also fix a logging bug where we were logging the identifier for an array of strings rather than the contents of that array. * Add documentation for keystore startup prompting (#50821) When a keystore is password-protected, Elasticsearch will prompt at startup. This commit adds documentation for this prompt for the archive, systemd, and Docker cases. Co-authored-by: Lisa Cawley <lcawley@elastic.co> * Warn when unable to upgrade keystore on debian (#51011) For Red Hat RPM upgrades, we warn if we can't upgrade the keystore. This commit brings the same logic to the code for Debian packages. See the posttrans file for what gets executed for RPMs. * Restore handling of string input Adds tests that were mistakenly removed. One of these tests proved we were not handling the stdin (-x) option correctly when no input was added. This commit restores the original approach of reading stdin one char at a time until there is no more (-1, \r, \n) instead of using readline() that might return null * Apply spotless reformatting * Use '--since' flag to get recent journal messages When we get Elasticsearch logs from journald, we want to fetch only log messages from the last run. There are two reasons for this. First, if there are many logs, we might get a string that's too large for our utility methods. Second, when we're looking for a specific message or error, we almost certainly want to look only at messages from the last execution. Previously, we've been trying to do this by clearing out the physical files under the journald process. But there seems to be some contention over these directories: if journald writes a log file in between when our deletion command deletes the file and when it deletes the log directory, the deletion will fail. 
It seems to me that we might be able to use journald's "--since" flag to retrieve only log messages from the last run, and that this might be less likely to fail due to race conditions in file deletion. Unfortunately, it looks as if the "--since" flag has a granularity of one-second. I've added a two-second sleep to make sure that there's a sufficient gap between the test that will read from journald and the test before it. * Use new journald wrapper pattern * Update version added in secure settings request Co-authored-by: Lisa Cawley <lcawley@elastic.co> Co-authored-by: Ioannis Kakavas <ikakavas@protonmail.com>	2020-01-28 05:32:32 -05:00
Tal Levy	fc2d875c9f	Fix geogrid with bounds test edge cases (#51118 ) This commit modifies the bounding box for geogrid unit tests to only consider bounding boxes that have significant longitudinal width and whose coordinates are normalized to quantized space Closes #51103.	2020-01-27 12:16:18 -08:00
Nik Everett	4ff314a9d5	Begin moving date_histogram to offset rounding (take two) (#51271 ) (#51495 ) We added a new rounding in #50609 that handles offsets to the start and end of the rounding so that we could support `offset` in the `composite` aggregation. This starts moving `date_histogram` to that new offset. This is a redo of #50873 with more integration tests. This reverts commit d114c9db3e1d1a766f9f48f846eed0466125ce83.	2020-01-27 13:40:54 -05:00
Nik Everett	faf6bb27be	Support time_zone on composite's date_histogram (#51172 ) (#51491 ) We've been parsing the `time_zone` parameter on `date_histogram` for a while but it hasn't done anything. This wires it up. Closes #45199 Inspired by #45200	2020-01-27 12:44:48 -05:00
Armin Braun	2eeea21d84	Use Consistent ClusterState throughout Snapshot API Calls (#51464 ) (#51471 ) We shouldn't be using potentially changing versions of the cluster state when answering a snapshot status API call by calling `SnapshotService#currentSnapshots` multiple times (each time using `ClusterService#state` under the hood) but instead pass down the state from the transport action. Having these APIs behave in a more deterministic way will make it easier to use them once parallel repository operations are introduced.	2020-01-27 13:28:17 +01:00
Ryan Ernst	fbfc5a327c	Fix checkstyle and test for logging output stream Commit `a156629b` didn't quite fix the botched backport. This commit does.	2020-01-25 14:13:47 -08:00
Ryan Ernst	a156629b4a	Fix backport compilation Java 8 doesn't have a PrintStream ctor which takes a Charset object, only a charset name. This fixes that.	2020-01-25 11:13:33 -08:00
Ryan Ernst	a564cac7ba	Capture stdout and stderr to log4j log (#50259 ) This commit overrides the stdout and stderr print streams to be redirected to the main elasticsearch.log file. While the Elasticsearch project ensures stdout and stderr are not written to, the jdk or 3rd party libs may do this, which can be unexpected for users used to looking at the elasticsearch log. closes #50156	2020-01-25 11:00:01 -08:00
Armin Braun	ef94a5863a	Fix RareClusterStateIT Cancelling Publication too Early (#51429 ) (#51434 ) Wait for the cluster to have settled down and have the same accepted version on all nodes before executing and cancelling request so that a slow CS accept on one node doesn't make it fall behind and then get sent the full CS because of the diff-version mismatch, breaking the mechanics of this test. Closes #51308	2020-01-24 20:33:45 +01:00
Armin Braun	af1ff52e70	Fix TransportMasterNodeAction not Retrying NodeClosedException (#51325 ) (#51437 ) Added node closed exception to the retryable remote exceptions as it's possible to run into this exception instead of a connect exception when the master node is just shutting down but still responding to requests.	2020-01-24 20:33:13 +01:00
Armin Braun	f0d8c785e3	Fix Inconsistent Shard Failure Count in Failed Snapshots (#51416 ) (#51426 ) * Fix Inconsistent Shard Failure Count in Failed Snapshots This fix was necessary to allow for the below test enhancement: We were not adding shard failure entries to a failed snapshot for those snapshot entries that were never attempted because the snapshot failed during the init stage and wasn't partial. This caused the never attempted snapshots to be counted towards the successful shard count which seems wrong and broke repository consistency tests. Also, this change adjusts snapshot resiliency tests to run another snapshot at the end of each test run to guarantee a correct `index.latest` blob exists after each run. Closes #47550	2020-01-24 18:20:47 +01:00
Ryan Ernst	144e037941	Ignore order of indices in hidden indices test (#51383 ) The order in which indices are returned in the metadata is not guaranteed. This commit accounts for any possible ordering in assertions about hidden indices. closes #51340	2020-01-23 15:49:32 -08:00
Nhat Nguyen	1ca5dd13de	Flush instead of synced-flush inactive shards (#51365 ) If all nodes are on 7.6, we prefer to perform a normal flush instead of synced flush when a shard becomes inactive. Backport of #49126	2020-01-23 17:20:57 -05:00
David Turner	0152c40724	Log when probe succeeds but full connection fails (#51304 ) (#51357 ) It is permitted for nodes to accept transport connections at addresses other than their publish address, which allows a good deal of flexibility when configuring discovery. However, it is not unusual for users to misconfigure nodes to pick a publish address which is inaccessible to other nodes. We see this happen a lot if the nodes are on different networks separated by a proxy, or if the nodes are running in Docker with the wrong kind of network config. In this case we offer no useful feedback to the user unless they enable TRACE-level logs. It's particularly tricky to diagnose because if we test connectivity between the nodes (using their discovery addresses) then all will appear well. This commit adds a WARN-level log if this kind of misconfiguration is detected: the probe connection has succeeded (to indicate that we are really talking to a healthy Elasticsearch node) but the followup connection attempt fails. It also tidies up some loose ends in `HandshakingTransportAddressConnector`, removing some TODOs that need not be completed, and registering its accidentally-unregistered timeout settings.	2020-01-23 17:27:59 +00:00
Nhat Nguyen	acf84b68cb	Do not wrap soft-deletes reader for segment stats (#51331 ) IndexWriter might not filter out fully deleted segments if retention leases exist or the number of the retaining operations is non-zero. SoftDeletesDirectoryReaderWrapper, however, always filters out fully deleted segments. This change uses the original directory reader when calculating segment stats instead. Relates #51192 Closes #51303	2020-01-23 08:43:06 -05:00
Armin Braun	4e8ab43a3e	Simplify Snapshot Initialization (#51256 ) (#51344 ) We were loading `RepositoryData` twice during snapshot initialization, redundantly checking if a snapshot existed already. The first snapshot existence check is somewhat redundant because a snapshot could be created between loading `RepositoryData` and updating the cluster state with the `INIT` state snapshot entry. Also, it is much safer to do the subsequent checks for index existence in the repo and the presence of old version snapshots once the `INIT` state entry prevents further snapshots from being created concurrently. While the current state of things will never lead to corruption on a concurrent snapshot creation, it could result in a situation (though unlikely) where all the snapshot's work is done on the data nodes, only to find out that the repository generation was off during snapshot finalization, failing there and leaving a bunch of dead data in the repository that won't be used in a subsequent snapshot (because the shard generation was never referenced due to the failed snapshot finalization). Note: This is a step on the way to parallel repository operations by making snapshot related CS and repo related CS more tightly correlated.	2020-01-23 14:29:35 +01:00
Henning Andersen	8c8d0dbacc	Revert "Workaround for JDK 14 EA FileChannel.map issue (#50523 )" (#51323 ) This reverts commit c7fd24ca1569a809b499caf34077599e463bb8d6. Now that JDK-8236582 is fixed in JDK 14 EA, we can revert the workaround. Relates #50523 and #50512	2020-01-23 11:27:02 +01:00
Nhat Nguyen	157b352b47	Exclude nested documents in LuceneChangesSnapshot (#51279 ) LuceneChangesSnapshot can be slow if nested documents are heavily used. Also, it estimates the number of operations to be recovered in peer recoveries inaccurately. With this change, we prefer excluding the nested non-root documents in a Lucene query instead.	2020-01-22 17:33:28 -05:00
Nirmal Chidambaram	f193e527a1	Short circuited to MatchNone for non-participating slice (#51207 ) In case of numSlices = numShards, use a MatchNone query instead of boolean with a MatchNone - MUST clause Backport for #51207	2020-01-22 17:04:53 -05:00
Igor Motov	08e9c673e5	Fix leftover mentions of method parameter in Percentile Aggs (#51272 ) The method parameter is not used in the percentile aggs, instead the method is determined by the presence of `hdr` or `tdigest` objects. Relates to #8324	2020-01-22 10:03:35 -05:00
Andrei Dan	123266714b	ILM wait for active shards on rolled index in a separate step (#50718 ) (#51296 ) After we rollover the index we wait for the configured number of shards for the rolled index to become active (based on the index.write.wait_for_active_shards setting which might be present in a template, or otherwise in the default case, for the primaries to become active). This wait might be long due to disk watermarks being tripped, replicas not being able to spring to life due to cluster nodes reconfiguration, and other causes, and the RolloverStep might not complete successfully due to this inherently transient situation, even though the rolled index has been created. (cherry picked from commit 457a92fb4c68c55976cc3c3e2f00a053dd2eac70) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-01-22 11:01:52 +00:00
Armin Braun	7b4c2bfdc4	Fix Overly Optimistic Request Deduplication (#51270 ) (#51291 ) On master failover we have to resend all the shard failed messages, but the transport requests remain the same in the eyes of `equals`. If the master failover is registered and the requests to the new master are sent before all the callbacks have executed and the request to the old master removed from the deduplicator, then the requests to the new master will incorrectly fail and the snapshot gets stuck. Closes #51253	2020-01-22 11:38:40 +01:00
Stuart Tettemer	41c15b438d	Scripting: Add char position of script errors (#51069 ) (#51266 ) Add the character position of a scripting error to error responses. The contents of the `position` field are experimental and subject to change. Currently, `offset` refers to the character location where the error was encountered, `start` and `end` define a range of characters that contain the error. eg. ``` { "error": { "root_cause": [ { "type": "script_exception", "reason": "runtime error", "script_stack": [ "y = x;", " ^---- HERE" ], "script": "def x = new ArrayList(); Map y = x;", "lang": "painless", "position": { "offset": 33, "start": 29, "end": 35 } } ``` Refs: #50993	2020-01-21 13:45:59 -07:00
Nik Everett	ca15a3f5a8	Add "did you mean" to unknown queries (#51177 ) (#51254 ) This replaces the message we return for unknown queries with the standard one that we use for unknown fields from `ObjectParser`. This is nice because it includes "did you mean". One day we might convert parsing queries to using object parser, but that looks complex. This change is much smaller and seems useful.	2020-01-21 12:45:52 -05:00
Paul Sanwald	d186e470ff	version bump for 7.5.3	2020-01-21 10:59:55 -05:00
Nik Everett	788836ea3f	Revert "Begin moving date_histogram to offset rounding (backport of #50873 ) (#50978 )" (#51239 ) This reverts commit `9a3d4db840`. It was subtly broken in ways we didn't have tests for.	2020-01-21 08:50:02 -05:00
Przemyslaw Gomulka	0513d8dca3	Add SPI jvm option to SystemJvmOptions (#50916 ) Adding back accidentally removed jvm option that is required to enforce start of the week = Monday in IsoCalendarDataProvider. Adding a `feature` to yml test in order to skip running it in JDK8. Commit that removed it: `398c802`. Commit that backports SystemJvmOptions: `c4fbda3`. Relates to the 7.x backport of code that enforces CalendarDataProvider use, #48349	2020-01-21 09:02:21 +01:00
Nhat Nguyen	43ed244a04	Account soft-deletes in FrozenEngine (#51192 ) (#51229 ) Currently, we do not exclude soft-deleted documents when opening index reader in the FrozenEngine. Backport of #51192	2020-01-20 17:07:29 -05:00
zacharymorn	dc02458dd6	Exclude unmapped fields from query max_clause limit (#49523 ) Take into account the number of unmapped fields when calculating against the limit. Closes #49002	2020-01-20 13:44:33 +01:00
Armin Braun	694b8ab95d	Fix CorruptedBlobStoreRepository Test (#51128 ) (#51186 ) The tests, when creating broken serialized blobs could randomly create a sequence of bytes that is partially readable by the deserializer and then not throw `IOException` but instead `ElasticsearchParseException`. We should just handle these unexpected exceptions downstream properly and pass them wrapped as `RepositoryException` to the listener to fix the test and keep the API consistent.	2020-01-18 14:12:55 +01:00
Jay Modi	107989df3e	Introduce hidden indices (#51164 ) This change introduces a new feature for indices so that they can be hidden from wildcard expansion. The feature is referred to as hidden indices. An index can be marked hidden through the use of an index setting, `index.hidden`, at creation time. One primary use case for this feature is to have a construct that fits indices that are created by the stack that contain data used for display to the user and/or intended for querying by the user. The desire to keep them hidden is to avoid confusing users when searching all of the data they have indexed and getting results returned from indices created by the system. Hidden indices have the following properties: * API calls for all indices (empty indices array, _all, or ) will not return hidden indices by default. Wildcard expansion will not return hidden indices by default unless the wildcard pattern begins with a `.`. This behavior is similar to shell expansion of wildcards. * REST API calls can enable the expansion of wildcards to hidden indices with the `expand_wildcards` parameter. To expand wildcards to hidden indices, use the value `hidden` in conjunction with `open` and/or `closed`. * Creation of a hidden index will ignore global index templates. A global index template is one with a match-all pattern. * Index templates can make an index hidden, with the exception of a global index template. * Accessing a hidden index directly requires no additional parameters. Backport of #50452	2020-01-17 10:09:01 -07:00
Nik Everett	5299664ae3	"did you mean" for ObjectParser with top named (#51018 ) (#51165 ) When you declare an ObjectParser with top level named objects like we do with `significant_terms` we didn't support "did you mean". This fixes that. Relates #50938	2020-01-17 12:00:03 -05:00
Armin Braun	e51b209dd3	Fix Infinite Retry Loop in loading RepositoryData (#50987 ) (#51093 ) * Fix Infinite Retry Loop in loading RepositoryData We were running into an infinite loop when trying to load corrupted (or otherwise un-loadable) repository data for a repo that uses best effort consistency (e.g. that was just freshly mounted as done in the test) because we kept resetting to `-1` on `IOException`, listing and finding the broken generation `N` and then interpreted the subsequent reset to `-1` as a concurrent change to the repository.	2020-01-16 21:08:35 +01:00
Nik Everett	f6c89b4599	Move test of custom sig heuristic to plugin (#50891 ) (#51067 ) This moves the testing of custom significance heuristic plugins from an `ESIntegTestCase` to an example plugin. This is much more "real" and can be used as an example for anyone that needs to actually build such a plugin. The old test had testing concerns and the example all jumbled together.	2020-01-16 14:49:12 -05:00
Zachary Tong	e2ca93bad3	Mute GeoGridAggregatorTestCase#testBounds() Tracking issue: https://github.com/elastic/elasticsearch/issues/51103	2020-01-16 10:28:10 -05:00
Marios Trivyzas	fda25ed04a	Fix caching for PreConfiguredTokenFilter (#50912 ) (#51091 ) The PreConfiguredTokenFilter#singletonWithVersion uses the version internally for the token filter factories but it registers only one instance in the cache and not one instance per version. This can lead to exceptions like the one described in #50734 since the singleton is created and cached using the version created of the first index that is processed. Remove the singletonWithVersion() methods and use the elasticsearchVersion() methods instead. Fixes: #50734 (cherry picked from commit 24e1858)	2020-01-16 13:58:02 +01:00
Martijn van Groningen	02dfd71efa	Backport: Add pipeline name to ingest metadata (#51050 ) Backport: #50467 This commit adds the name of the current pipeline to ingest metadata. This pipeline name is accessible under the following key: '_ingest.pipeline'. Example usage in pipeline: PUT /_ingest/pipeline/2 { "processors": [ { "set": { "field": "pipeline_name", "value": "{{_ingest.pipeline}}" } } ] } Closes #42106	2020-01-16 10:50:47 +01:00
Zachary Tong	9eab87062c	Bump version to 7.7.0	2020-01-15 10:12:14 -05:00
Yannick Welsch	dc47b380c8	Block too many concurrent mapping updates (#51038 ) Ensures that there are not too many concurrent dynamic mapping updates going out from the data nodes to the master. Closes #50670	2020-01-15 16:02:11 +01:00
Alan Woodward	0257de8c26	Emit warnings when index templates have multiple mappings (#50982 ) Index templates created in the 5x line can still be present in the cluster state through multiple upgrades, and may have more than one mapping defined. 8x will stop supporting templates with multiple mappings, and we should emit deprecation warnings in 7x clusters to give users a chance to update their templates before upgrading.	2020-01-15 11:59:27 +00:00
Nik Everett	fc5fde7950	Add "did you mean" to ObjectParser (#50938 ) (#50985 ) Check it out: ``` $ curl -u elastic:password -HContent-Type:application/json -XPOST localhost:9200/test/_update/foo?pretty -d'{ "dac": {} }' { "error" : { "root_cause" : [ { "type" : "x_content_parse_exception", "reason" : "[2:3] [UpdateRequest] unknown field [dac] did you mean [doc]?" } ], "type" : "x_content_parse_exception", "reason" : "[2:3] [UpdateRequest] unknown field [dac] did you mean [doc]?" }, "status" : 400 } ``` The tricky thing about implementing this is that x-content doesn't depend on Lucene. So this works by creating an extension point for the error message using SPI. Elasticsearch's server module provides the "spell checking" implementation.	2020-01-14 17:53:41 -05:00
Yannick Welsch	4b0581f182	Remove custom metadata tool (#50813 ) Adds a command-line tool to remove broken custom metadata from the cluster state. Relates to #48701	2020-01-14 23:08:33 +01:00
Nik Everett	a8aca6b2a0	Switch AggregationSpec to ContextParser (#50871 ) (#50980 ) We seem to have settled on the `ContextParser` interface for parsing stuff, mostly because `ObjectParser` implements it. We don't really need the old `Aggregator.Parser` interface any more because it duplicates `ContextParser` but with the arguments reversed. This adds support to `AggregationSpec` to declare aggregation parsers using `ContextParser`. This should integrate cleanly with `ObjectParser`. It doesn't drop support for `Aggregator.Parser` or change the plugin interface at all so it should be safe to backport to 7.x. And we can remove `Aggregator.Parser` in a follow up which is only targeted to 8.0.	2020-01-14 16:50:52 -05:00
Nik Everett	9a3d4db840	Begin moving date_histogram to offset rounding (backport of #50873 ) (#50978 ) We added a new rounding in #50609 that handles offsets to the start and end of the rounding so that we could support `offset` in the `composite` aggregation. This starts moving `date_histogram` to that new offset.	2020-01-14 16:50:27 -05:00
Tal Levy	9ee2e11181	[7.x] Adds support for geo-bounds filtering in geogrid aggregations (#50996 ) * Adds support for geo-bounds filtering in geogrid aggregations (#50002) It is fairly common to filter the geo point candidates in geohash_grid and geotile_grid aggregations according to some viewable bounding box. This change introduces the option of specifying this filter directly in the tiling aggregation. This is even more relevant to `geo_shape` where the bounds will restrict the shape to be within the bounds. This optional `bounds` parameter is parsed in an equivalent fashion to the bounds specified in the geo_bounding_box query.	2020-01-14 11:18:46 -08:00
Armin Braun	16c07472e5	Track Snapshot Version in RepositoryData (#50930 ) (#50989 ) * Track Snapshot Version in RepositoryData (#50930) Add tracking of snapshot versions to RepositoryData to make BwC logic more efficient. Follow up to #50853	2020-01-14 18:15:07 +01:00
Tim Brooks	6e7478b846	Allow proxy mode server name to be configured (#50951 ) Currently, proxy mode allows a remote cluster connection to be set up by expecting all open connections to be routed through an intermediate proxy. The proxy must use some logic to ensure that the connections end up on the correct remote cluster. One mechanism provided is that the default distribution TLS implementations will forward the host component of the configured address to the remote connection using the SNI extension. This is limiting as it requires that the proxy be configured in a way that always uses a valid hostname as the proxy address. Instead, this commit adds an additional setting to allow the server_name to be configured independently. This allows the proxy address to be specified as an IP literal, but the server_name specified as an arbitrary string which still must be a valid hostname. It also decouples the server_name from the requirement of being a DNS resolvable domain.	2020-01-14 10:57:44 -06:00
Tim Brooks	d8510be3d9	Revert "Send cluster name and discovery node in handshake (#48916 )" (#50944 ) This reverts commit `0645ee88e2`.	2020-01-14 09:53:13 -06:00
Yannick Welsch	f1c5031766	Fix queuing in AsyncLucenePersistedState (#50958 ) The logic in AsyncLucenePersistedState was flawed, unexpectedly queuing up two update tasks in parallel.	2020-01-14 15:04:28 +01:00
Yannick Welsch	91d7b446a0	Warn on slow metadata performance (#50956 ) Has the new cluster state storage layer emit warnings in case metadata performance is very slow. Relates #48701	2020-01-14 15:04:28 +01:00
Alan Woodward	8c16725a0d	Check for deprecations when analyzers are built (#50908 ) Generally speaking, deprecated analysis components in elasticsearch will issue deprecation warnings when they are first used. However, this means that no warnings are emitted when indexes are created with deprecated components, and users have to actually index a document to see warnings. This makes it much harder to see these warnings and act on them at appropriate times. This is worse in the case where components throw exceptions on upgrade. In this case, users will not be aware of a problem until a document is indexed, instead of at index creation time. This commit adds a new check that pushes an empty string through all user-defined analyzers and normalizers when an IndexAnalyzers object is built for each index; deprecation warnings and exceptions are now emitted when indexes are created or opened. Fixes #42349	2020-01-14 13:52:02 +00:00
Yannick Welsch	22ba759e1f	Move metadata storage to Lucene (#50928 ) * Move metadata storage to Lucene (#50907) Today we split the on-disk cluster metadata across many files: one file for the metadata of each index, plus one file for the global metadata and another for the manifest. Most metadata updates only touch a few of these files, but some must write them all. If a node holds a large number of indices then it's possible its disks are not fast enough to process a complete metadata update before timing out. In severe cases affecting master-eligible nodes this can prevent an election from succeeding. This commit uses Lucene as a metadata storage for the cluster state, and is a squashed version of the following PRs that were targeting a feature branch: * Introduce Lucene-based metadata persistence (#48733) This commit introduces `LucenePersistedState` which master-eligible nodes can use to persist the cluster metadata in a Lucene index rather than in many separate files. Relates #48701 * Remove per-index metadata without assigned shards (#49234) Today on master-eligible nodes we maintain per-index metadata files for every index. However, we also keep this metadata in the `LucenePersistedState`, and only use the per-index metadata files for importing dangling indices. However there is no point in importing a dangling index without any shard data, so we do not need to maintain these extra files any more. This commit removes per-index metadata files from nodes which do not hold any shards of those indices. Relates #48701 * Use Lucene exclusively for metadata storage (#50144) This moves metadata persistence to Lucene for all node types. It also reenables BWC and adds an interoperability layer for upgrades from prior versions. This commit disables a number of tests related to dangling indices and command-line tools. Those will be addressed in follow-ups. 
Relates #48701 * Add command-line tool support for Lucene-based metadata storage (#50179) Adds command-line tool support (unsafe-bootstrap, detach-cluster, repurpose, & shard commands) for the Lucene-based metadata storage. Relates #48701 * Use single directory for metadata (#50639) Earlier PRs for #48701 introduced a separate directory for the cluster state. This is not needed though, and introduces an additional unnecessary cognitive burden to the users. Co-Authored-By: David Turner <david.turner@elastic.co> * Add async dangling indices support (#50642) Adds support for writing out dangling indices in an asynchronous way. Also provides an option to avoid writing out dangling indices at all. Relates #48701 * Fold node metadata into new node storage (#50741) Moves node metadata to use the new storage mechanism (see #48701) as the authoritative source. * Write CS asynchronously on data-only nodes (#50782) Writes cluster states out asynchronously on data-only nodes. The main reason for writing out the cluster state at all is so that the data-only nodes can snap into a cluster, that they can do a bit of bootstrap validation and so that the shard recovery tools work. Cluster states that are written asynchronously have their voting configuration adapted to a non existing configuration so that these nodes cannot mistakenly become master even if their node role is changed back and forth. Relates #48701 * Remove persistent cluster settings tool (#50694) Adds the elasticsearch-node remove-settings tool to remove persistent settings from the on disk cluster state in cases where it contains incompatible settings that prevent the cluster from forming. Relates #48701 * Make cluster state writer resilient to disk issues (#50805) Adds handling to make the cluster state writer resilient to disk issues.
Relates to #48701 * Omit writing global metadata if no change (#50901) Uses the same optimization for the new cluster state storage layer as the old one, writing global metadata only when changed. Avoids writing out the global metadata if none of the persistent fields changed. Speeds up server:integTest by ~10%. Relates #48701 * DanglingIndicesIT should ensure node removed first (#50896) These tests occasionally failed because the deletion was submitted before the restarting node was removed from the cluster, causing the deletion not to be fully acked. This commit fixes this by checking the restarting node has been removed from the cluster. Co-authored-by: David Turner <david.turner@elastic.co> * fix tests Co-authored-by: David Turner <david.turner@elastic.co>	2020-01-14 09:35:43 +01:00
Tim Brooks	50cb770315	Use default profile for remote connections (#50947 ) Currently, the connection manager is configured with a default profile for both the sniff and proxy connection strategies. This profile correctly reflects the expected number of connections (6 for sniff, 18 for proxy). This commit removes the proxy strategy usages of the per connection attempt profile configuration. Additionally, it refactors other unnecessary code around the connection manager. The connection manager now can always be built inside the remote connection.	2020-01-13 21:46:23 -06:00
Tim Brooks	27c2eb744e	Fix open/close race in ConnectionManagerTests (#50621 ) Currently we reuse the same test connection for all connection attempts in the testConcurrentConnectsAndDisconnects test. This means that if the connection fails due to a pre-existing connection, the connection will be closed impacting the state of all connection attempts. This commit fixes the test, by returning a unique connection for each attempt. Fixes #49903.	2020-01-13 18:43:18 -07:00
Nhat Nguyen	fb32a55dd5	Deprecate synced flush (#50835 ) A normal flush has the same effect as a synced flush on Elasticsearch 7.6 or later. It's deprecated in 7.6 and will be removed in 8.0. Relates #50776	2020-01-13 19:54:38 -05:00
Nhat Nguyen	05f97d5e1b	Revert "Deprecate synced flush (#50835 )" This reverts commit `1a32d7142a`.	2020-01-13 11:41:03 -05:00
Nhat Nguyen	1a32d7142a	Deprecate synced flush (#50835 ) A normal flush has the same effect as a synced flush on Elasticsearch 7.6 or later. It's deprecated in 7.6 and will be removed in 8.0. Relates #50776	2020-01-13 10:58:29 -05:00
Armin Braun	609b015e3c	Prevent Old Version Clusters From Corrupting Snapshot Repositories (#50853 ) (#50913 ) Follow up to #50692 that starts writing a `min_version` field to the `RepositoryData` so that pre-7.6 ES versions can not read it (and potentially corrupt it if they attempt to modify the repo contents) after the repository moved to the new metadata format.	2020-01-13 15:02:53 +01:00
Christoph Büscher	c31a21c3d8	Fix time zone issue in Rounding serialization (#50845 ) When deserializing time zones in the Rounding classes we used to include a tiny normalization step via `DateUtils.of(in.readString())` that was lost in #50609. It's at least necessary for some tests, e.g. the cause of #50827 is that when sending the default time zone ZoneOffset.UTC on a stream pre 7.0 we convert it to a "UTC" string id via `DateUtils.zoneIdToDateTimeZone`. This gets then read back as a UTC ZoneRegion, which should behave the same but fails the equality tests in our serialization tests. Reverting to the previous behaviour with an additional normalization step on 7.x. Co-authored-by: Nik Everett <nik9000@gmail.com> Closes #50827	2020-01-13 10:10:15 +01:00
David Turner	456de59698	Fix non-corruption in testCurrentHeaderVersion (#50883 ) Today we make multiple attempts to corrupt the translog header in `TranslogHeaderTests#testCurrentHeaderVersion`, but if we are extraordinarily unlucky then this sequence of corruptions may restore the file to its original state. This change adjusts the test to only corrupt the file once, which is certain not to leave the file in its original state.	2020-01-12 12:38:37 +00:00
Henning Andersen	2e5e5fd483	Fix testSkipRefreshIfShardIsRefreshingAlready (#50856 ) The test checked queue size and active count, however, ThreadPoolExecutor pulls out the request from the queue before marking the worker active, risking that we think all tasks are done when they are not. Now check on completed-tasks metric instead, which is guaranteed to be monotonic. Relates #50769	2020-01-11 11:21:05 -05:00
Nhat Nguyen	f4aabdcd89	Do not force refresh when write indexing buffer (#50769 ) Today we periodically check the indexing buffer memory every 5 seconds or after we have used 1/30 of the configured memory. If the total used memory is over the threshold, then we refresh the "largest" shards. If refreshing takes longer than these intervals (i.e., 5s or 1/30 buffer), then we continue to enqueue refreshes to these shards. This leads to two issues: - The refresh thread pool can be exhausted and other shards can't refresh - Too many refreshes are executed for the "largest" shards With this change, we only refresh the largest shards if they are not refreshing. Here we rely on the periodic check to trigger another refresh if needed. We can harden this by making the ongoing refresh trigger the memory check when it's completed. I opted out of this option in this PR for simplicity. See: https://discuss.elastic.co/t/write-queue-continue-to-rise/213652/	2020-01-11 11:21:05 -05:00
Nik Everett	e6d0f7df01	Fix format problem in composite of unmapped (#50869 ) (#50875 ) When a composite aggregation is reduced using the results from an index that has one of the fields unmapped we were throwing away the formatter. This is mildly annoying, except in the case of IP addresses which were coming out as non-utf-8-characters. And tripping assertions. This carefully preserves the formatter from the working bucket. Closes #50600	2020-01-10 16:18:11 -05:00
Jim Ferenczi	60308cf0b3	Fix upgrade of custom similarity (#50851 ) This change fixes the upgrade of index metadata that contain a custom similarity with options that are not compatible with BM25. The upgrade doesn't need a real similarity service so we fake one that resolves all custom similarity to BM25 but this logic fails because the BM25 provider checks that all options are compatible. This commit removes the verification step as it is not needed during the upgrade (the verification is done when the index is restored/opened). Closes #50763	2020-01-10 18:43:13 +01:00
Armin Braun	7e68989dae	Fix Snapshot Shard Status Request Deduplication (#50788 ) (#50840 ) * Fix Snapshot Shard Status Request Deduplication The request deduplication didn't actually work for these requests since they had no `equals` and `hashCode` so the deduplicator wouldn't actually recognize equal requests.	2020-01-10 11:49:52 +01:00
Christoph Büscher	75cb4e0b69	Muting InternalAggregationsTests.testSerialization	2020-01-10 09:24:09 +01:00
Nik Everett	d021071ab9	Move scripted metric to ObjectParser (#50708 ) (#50811 ) This replaces the hand rolled parsing code for scripted metric with `ObjectParser` which is simpler to work with because it is declarative.	2020-01-09 16:09:21 -05:00
Nik Everett	ae40e22452	Drop "funny" functions building parsers (#50715 ) (#50814 ) Replaces the "funny" `Function<String, ConstructingObjectParser<T, Void>>` with a much simpler `ConstructingObjectParser<T, String>`. This makes pretty much all of our object parsers static.	2020-01-09 15:53:03 -05:00
Armin Braun	f70e8f6ab5	Fix Snapshot Repository Corruption in Downgrade Scenarios (#50692 ) (#50797 ) * Fix Snapshot Repository Corruption in Downgrade Scenarios (#50692) This PR introduces test infrastructure for downgrading a cluster while interacting with a given repository. It fixes the fact that repository metadata in the new format could be written while there's still older snapshots in the repository that require the old-format metadata to be restorable.	2020-01-09 21:21:13 +01:00
Jake Landis	de6f132887	[7.x] Foreach processor - fork recursive call (#50514 ) (#50773 ) A very large number of recursive calls can cause a stack overflow exception. This commit forks the recursive calls for non-async processors. Once forked, each thread will handle at most 10 recursive calls to help keep the stack size and thread count down to a reasonable size.	2020-01-09 13:21:18 -06:00
Nik Everett	1d8e51f89d	Support offset in composite aggs (#50609 ) (#50808 ) Adds support for the `offset` parameter to the `date_histogram` source of composite aggs. The `offset` parameter is supported by the normal `date_histogram` aggregation and is useful for folks that need to measure things from, say, 6am one day to 6am the next day. This is implemented by creating a new `Rounding` that knows how to handle offsets and delegates to other rounding implementations. That implementation doesn't fully implement the `Rounding` contract, namely `nextRoundingValue`. That method isn't used by composite aggs so I can't be sure that any implementation that I add will be correct. I propose to leave it throwing `UnsupportedOperationException` until I need it. Closes #48757	2020-01-09 14:11:24 -05:00
Julie Tibshirani	a299aba2f8	Ensure that field collapsing works with field aliases. (#50766 ) Previously, the following situation would throw an error: * A search contains a `collapse` on a particular field. * The search spans multiple indices, and in one index the field is mapped as a concrete field, but in another it is a field alias. The error occurs when we attempt to merge `CollapseTopFieldDocs` across shards. When merging, we validate that the name of the collapse field is the same across shards. But the name has already been resolved to the concrete field name, so it will be different on shards where the field was mapped as an alias vs. shards where it was a concrete field. This PR updates the collapse field name in `CollapseTopFieldDocs` to the original requested field, so that it will always be consistent across shards. Note that in #32648, we already made a fix around collapsing on field aliases. However, we didn't test this specific scenario where the field was mapped as an alias in only one of the indices being searched.	2020-01-08 14:51:15 -08:00
Christoph Büscher	b1b4282273	Make Multiplexer inherit filter chains analysis mode (#50662 ) Currently, if an updateable synonym filter is included in a multiplexer filter, it is not reloaded via the _reload_search_analyzers because the multiplexer itself doesn't pass on the analysis mode of the filters it contains, so it's not recognized as "updateable" in itself. Instead we can check and merge the AnalysisMode settings of all filters in the multiplexer and use the resulting mode (e.g. search-time only) for the multiplexer itself, thus making any synonym filters contained in it reloadable. This, of course, will also make the analyzers using the multiplexer be usable at search-time only. Closes #50554	2020-01-08 22:12:01 +01:00
Przemyslaw Gomulka	e95b0c447f	Allow parsing timezone without fully provided time backport(#50178 ) (#50740 ) strict_date_optional_time changes to have optional minute part. It already allowed optional second and fraction of second part. This allows parsing 2018-01-01T00+01 , 2018-01-01T00:00+01 , 2018-01-01T00:00:00+01 , 2018-01-01T00:00:00.000+01 It won't allow parsing a timezone without an hour part as this is not allowed by iso8601 spec closes #49351	2020-01-08 20:04:57 +01:00
Henning Andersen	125feecabc	Guess root cause support unwrap (#50525 ) (#50742 ) ElasticsearchException.guessRootCauses would return wrapper exception if inner exception was not an ElasticsearchException. Fixed to never return wrapper exceptions. At least following APIs change root_cause.0.type as a result: _update with bad script _index with bad pipeline Relates #50417	2020-01-08 19:09:14 +01:00
Adrien Grand	4f2299c714	Upgrade to Lucene 8.4.0. (#50518 ) (#50750 )	2020-01-08 18:53:59 +01:00
Adrien Grand	31158ab3d5	Add per-field metadata. (#50333 ) This PR adds per-field metadata that can be set in the mappings and is later returned by the field capabilities API. This metadata is completely opaque to Elasticsearch but may be used by tools that index data in Elasticsearch to communicate metadata about fields with tools that then search this data. A typical example that has been requested in the past is the ability to attach a unit to a numeric field. In order to not bloat the cluster state, Elasticsearch requires that this metadata be small: - keys can't be longer than 20 chars, - values can only be numbers or strings of no more than 50 chars - no inner arrays or objects, - the metadata can't have more than 5 keys in total. Given that metadata is opaque to Elasticsearch, field capabilities don't try to do anything smart when merging metadata about multiple indices, the union of all field metadatas is returned. Here is what the meta might look like in mappings: ```json { "properties": { "latency": { "type": "long", "meta": { "unit": "ms" } } } } ``` And then in the field capabilities response: ```json { "latency": { "long": { "searchable": true, "aggregatable": true, "meta": { "unit": [ "ms" ] } } } } ``` When there are no conflicts, values are arrays of size 1, but when there are conflicts, Elasticsearch includes all unique values in this array, without giving ways to know which index has which metadata value: ```json { "latency": { "long": { "searchable": true, "aggregatable": true, "meta": { "unit": [ "ms", "ns" ] } } } } ``` Closes #33267	2020-01-08 16:21:18 +01:00
Yannick Welsch	f203c2b39d	Import replicated closed dangling indices (#50649 ) Dangling replicated closed indices are not imported properly (they miss their routing table when imported).	2020-01-08 13:39:20 +01:00
Rory Hunter	b1ff74f652	New setting to prevent automatically importing dangling indices (#49174 ) Introduce a new static setting, `gateway.auto_import_dangling_indices`, which prevents dangling indices from being automatically imported. Part of #48366.	2020-01-08 13:39:20 +01:00
Tim Vernum	293661d62c	Security should not reload files that haven't changed (#50724 ) In security we currently monitor a set of files for changes: - config/role_mapping.yml (or alternative configured path) - config/roles.yml - config/users - config/users_roles This commit prevents unnecessary reloading when the file change actually doesn't change the internal structure. Backport of: #50207 Co-authored-by: Anton Shuvaev <anton.shuvaev91@gmail.com>	2020-01-08 15:13:47 +11:00
Nik Everett	deb0991667	Teach ObjectParser a happy pattern (#50691 ) (#50710 ) We very commonly have object with ctors like: ``` public Foo(String name) ``` And then declare a bunch of setters on the object. Every aggregation works like this, for example. This change teaches `ObjectParser` how to build these aggregations all on its own, without any help. This'll make it much cleaner to parse aggs, and, probably, a bunch of other things. It'll let us remove lots of wrapping. I've used this new power for the `avg` aggregation just to prove that it works outside of a unit test.	2020-01-07 11:57:41 -05:00
Nhat Nguyen	c3d207f437	Disable auto refresh in testSegmentsStats (#50689 ) If an auto-refresh happens, then version_map_memory is reset to 0. By default, the auto-refresh occurs every second for the first 30 seconds, until search becomes idle. Closes #50362	2020-01-07 10:44:30 -05:00
Hendrik Muhs	98ca9500e8	implement a workaround for remote cluster validation (#50460 ) In 7.x an internal API used for validating remote cluster does not throw, see #50420 for the details. This change implements a workaround for remote cluster validation, only for 7.x branches. fixes #50420	2020-01-07 13:51:51 +01:00
Yannick Welsch	a2ef0e8830	Check allocation id when failing shard on recovery (#50656 ) A failure of a recovering shard can race with a new allocation of the shard, and cause the new allocation to be failed as well. This can result in a shard being marked as initializing in the cluster state, but not exist on the node anymore. Closes #50508	2020-01-07 09:41:28 +01:00
Jay Modi	e5191e77e3	Remove unused IndicesOptions#fromByte method (#50683 ) This change removes a no longer used method, `fromByte`, in IndicesOptions. This method was necessary for backwards compatibility with versions prior to 6.4.0 and was used when talking to those versions. However, the minimum wire compatibility version has changed and we no longer use this code. Backport of #50665	2020-01-06 14:57:10 -07:00
Nik Everett	76bb661023	Replace AggParseContext with a String (backport of #50625 ) (#50679 ) We used to have a ton of stuff in the `AggParseContext` but now we parse aggs entirely with named xcontent. So we don't need the context any more.	2020-01-06 14:32:03 -05:00
Nhat Nguyen	926c0aa74c	Fix testRelocationEstablishedPeerRecoveryRetentionLeases (#50673 ) The redNodes are calculated incorrectly. Closes #50660	2020-01-06 13:32:04 -05:00
Nik Everett	f576aefd0f	Replace bespoke parser for significance heuristics (#50623 ) (#50659 ) This replaces the hand written xcontent parsers for significance heuristics with `ObjectParser` and parsing named xcontent. As a happy accident, this was the last user of `ParseFieldRegistry` so this PR entirely removes that class. Closes #25519	2020-01-06 12:57:43 -05:00
Tim Brooks	fa57813c6d	Remove races in ProxyConnectionStrategyTests (#50620 ) Currently, we use delayed address resolution in the proxy strategy tests to allow tests to connect to different addresses. Unfortunately, this has the potential to introduce races as the address is resolved on each connection attempt. The number of connection attempts can vary based on when connections are opening and closing. This commit modifies the tests by allowing them to specifically control which address is used. Related to #50618	2020-01-06 10:20:53 -07:00
Martijn van Groningen	7be43e9f6d	Fix ingest stats test bug. (#50653 ) This test code fixes a serialization test bug: https://gradle-enterprise.elastic.co/s/7x2ct6yywkw3o Rarely stats for the same processor are generated and the production code then sums up these stats. However the test code wasn't summing up in that case, which caused inconsistencies between the actual and expected results. Closes #50507	2020-01-06 15:37:47 +01:00
Nhat Nguyen	b71490b06b	Deprecate indices without soft-deletes (#50502 ) (#50634 ) Soft-deletes will be enabled for all indices in 8.0. Hence, we should deprecate new indices without soft-deletes in 7.x. Backport of #50502	2020-01-06 08:44:30 -05:00
Henning Andersen	ec0ec61881	Deleted docs disregarded for if_seq_no check (#50526 ) Previously, as long as a deleted version value was kept as a tombstone, another index or delete operation against the same id would leak that the doc had existed (through seq_no info) or would allow the operation if the client forged the seq_no. Fixed to disregard info on deleted docs when doing seq_no based optimistic concurrency check.	2020-01-06 13:54:36 +01:00
Nikita Glashenko	5533e1172c	Add tests for remaining IntervalsSourceProvider implementations (#50326 ) This PR adds unit tests for wire and xContent serialization of remaining IntervalsSourceProvider implementations. Closes #50150	2020-01-06 13:18:53 +01:00
David Turner	66c690922c	Collect shard sizes for closed indices (#50645 ) Today the `InternalClusterInfoService` collects information on the sizes of shards of open indices, but does not consider closed indices. This means that shards of closed indices are treated as having zero size when they are being allocated. This commit fixes this, obtaining the sizes of all shards. Relates #33888	2020-01-06 11:44:19 +00:00
Henning Andersen	312bf44601	Workaround for JDK 14 EA FileChannel.map issue (#50523 ) FileChannel.map provokes static initialization of ExtendedMapMode in JDK14 EA, which needs elevated privileges. Relates #50512	2020-01-06 12:18:49 +01:00
Nik Everett	2362c430cd	Clean up wire test case a bit (#50627 ) (#50632 ) * Adds JavaDoc to `AbstractWireTestCase` and `AbstractWireSerializingTestCase` so it is more obvious you should prefer the latter if you have a choice * Moves the `instanceReader` method out of `AbstractWireTestCase` because it is no longer used. * Marks a bunch of methods final so it is more obvious which classes are for what. * Cleans up the side effects of the above.	2020-01-05 16:20:38 -05:00
Nik Everett	4d58656065	Declare remaining parsers `final` (#50571 ) (#50615 ) We have about 800 `ObjectParsers` in Elasticsearch, about 700 of which are final. This is probably the right way to declare them because in practice we never mutate them after they are built. And we certainly don't change the static reference. Anyway, this adds `final` to these parsers. I found the non-final parsers with this: ``` diff \ <(find . -type f -name '*.java' -exec grep -iHe 'static.*PARSER\s*=' {} \+ | sort) \ <(find . -type f -name '*.java' -exec grep -iHe 'static.*final.*PARSER\s*=' {} \+ | sort) \ 2>&1 | grep '^<' ```	2020-01-03 11:48:11 -05:00
Andrei Dan	856607b5a6	Guard against null geoBoundingBox (#50506 ) (#50608 ) A geo box with a top value of Double.NEGATIVE_INFINITY will yield an empty xContent which translates to a null `geoBoundingBox`. This commit marks the field as `Nullable` and guards against null when retrieving the `topLeft` and `bottomRight` fields. Fixes https://github.com/elastic/elasticsearch/issues/50505 (cherry picked from commit 051718f9b1e1ca957229b01e80d7b79d7e727e14) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-01-03 18:04:26 +02:00
Nik Everett	1abecad21b	Mark some constants in decay functions final (#50569 ) (#50575 ) This marks a couple of constants in the `DecayFunctionBuilder` as final. They are written in CONSTANT_CASE and used as constants but not final which is a little confusing and might lead to sneaky bugs.	2020-01-03 10:58:15 -05:00
Henning Andersen	218bd19034	Improve FutureUtils.get exception handling (#50339 ) (#50417 ) FutureUtils.get() would unwrap ElasticsearchWrapperExceptions. This is trappy, since nearly all usages of FutureUtils.get() expected only to not have to deal with checked exceptions. In particular, StepListener builds upon ListenableFuture which uses FutureUtils.get to be informed about the exception passed to onFailure. This had the bad consequence of masking away any exception that was an ElasticsearchWrapperException like RemoteTransportException. Specifically for recovery, this made CircuitBreakerExceptions happening on the target node look like they originated from the source node. The only usage that expected that behaviour was AdapterActionFuture. The unwrap behaviour has been moved to that class.	2020-01-03 15:28:47 +01:00
kkewwei	5655d6a1c1	Log index name when updating index settings (#49969 ) Today we log changes to index settings like this: updating [index.setting.blah] from [A] to [B] The identity of the index whose settings were updated is conspicuously absent from this message. This commit addresses this by adding the index name to these messages. Fixes #49818.	2020-01-03 11:26:29 +00:00
Alan Woodward	8b362c657b	Add fuzzy intervals source (#49762 ) This intervals source will return terms that are similar to an input term, up to an edit distance defined by fuzziness, similar to FuzzyQuery. Closes #49595	2020-01-03 09:59:19 +00:00
Henning Andersen	e19585b47f	Enhance TransportReplicationAction assertions (#49081 ) Include failure into assertion error when replication action discovers that it has been double triggered.	2020-01-02 19:23:10 +01:00
Oleg	7539fbb30f	Deprecate the 'local' parameter of /_cat/nodes (#50499 ) The cat nodes API performs a `ClusterStateAction` then a `NodesInfoAction`. Today it accepts the `?local` parameter and passes this to the `ClusterStateAction` but this parameter has no effect on the `NodesInfoAction`. This is surprising, because `GET _cat/nodes?local` looks like it might be a completely local call but in fact it still depends on every node in the cluster. This commit deprecates the `?local` parameter on this API so that it can be removed in 8.0. Relates #50088	2020-01-02 14:53:56 +00:00
Nhat Nguyen	e7c15a5c6e	Ensure relocating shards establish peer recovery retention leases (#50486 ) We forgot to establish peer recovery retention leases for relocating primaries without soft-deletes. Relates #50351	2019-12-26 13:51:35 -05:00
Nhat Nguyen	7713221733	Fix testCancelRecoveryDuringPhase1 (#50449 ) testCancelRecoveryDuringPhase1 uses a mock of IndexShard, which can't create retention leases. We need to stub method createRetentionLease. Relates #50351 Closes #50424	2019-12-26 09:48:58 -05:00
Yannick Welsch	f57569bf5c	Mute RecoverySourceHandlerTests.testCancelRecoveryDuringPhase1 Relates #50424	2019-12-24 12:13:31 -05:00
Martijn van Groningen	10ed1ae1d2	Add remote info to the HLRC (#50483 ) The additional change to the original PR (#49657) is that `org.elasticsearch.client.cluster.RemoteConnectionInfo` now parses the initial_connect_timeout field as a string instead of a TimeValue instance. This is needed because the initial_connect_timeout field in the remote connection api is serialized for human consumption, but not for parsing purposes. Therefore the HLRC can't parse it correctly (which caused test failures in CI, but not in the PR CI :( ). The way this field is serialized needs to be changed in the remote connection api, but that is a breaking change. We should wait to make this change until rest api versioning is introduced. Co-authored-by: j-bean <anton.shuvaev91@gmail.com>	2019-12-24 15:11:58 +01:00
Nhat Nguyen	33204c2055	Use peer recovery retention leases for indices without soft-deletes (#50351 ) Today, the replica allocator uses peer recovery retention leases to select the best-matched copies when allocating replicas of indices with soft-deletes. We can employ this mechanism for indices without soft-deletes because the retaining sequence number of a PRRL is the persisted global checkpoint (plus one) of that copy. If the primary and replica have the same retaining sequence number, then we should be able to perform a noop recovery. The reason is that we must be retaining translog up to the local checkpoint of the safe commit, which is at most the global checkpoint of either copy. The only limitation is that we might not cancel ongoing file-based recoveries with PRRLs for noop recoveries. We can't make the translog retention policy comply with PRRLs. We also have this problem with soft-deletes if a PRRL is about to expire. Relates #45136 Relates #46959	2019-12-23 22:04:07 -05:00
Tal Levy	bed121efaf	[7.x-backport] Centralize BoundingBox logic to a dedicated class (#50469 ) Both geo_bounding_box query and geo_bounds aggregation have a very similar definition of a "bounding box". A lot of this logic (serialization, xcontent-parsing, etc) can be centralized instead of having separate efforts to do the same things.	2019-12-23 11:21:39 -08:00
Aleksandr Maus	d5cec7faa1	Improve SearchHit "equals" implementation for null fields cases (#50327 ) (#50448 ) * Improve SearchHit "equals" implementation for null fields cases	2019-12-23 09:59:07 -05:00
Igor Motov	339d10c16f	Geo: Switch generated GeoJson type names to camel case (#50400 ) Switches generated GeoJson type names to camel case to conform to the standard. Closes #49568	2019-12-20 15:37:22 -05:00
Andrei Dan	a3cdbda7c6	Make the TransportRolloverAction execute in one cluster state update (#50388 ) (#50442 ) This commit makes the TransportRolloverAction more resilient, by having it execute only one cluster state update that creates the new (rollover) index, rolls over the alias from the source to the target index and sets the RolloverInfo on the source index. Previously, these 3 steps were represented as 3 chained cluster state updates, which would've seen the user manually intervene if, say, the alias rollover cluster state update (second in the chain) failed but the creation of the rollover index (first in the chain) update succeeded. * Rename innerExecute to applyAliasActions (cherry picked from commit 1ba4339a0c73ef3354b8c8b44b628fc55f1dbc78) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2019-12-20 18:01:03 +00:00

4247 Commits