OpenSearch

Commit Graph

Author	SHA1	Message	Date
David Turner	00b9098250	Ignore timeouts with single-node discovery (#52159 ) Today we use `cluster.join.timeout` to prevent nodes from waiting indefinitely if joining a faulty master that is too slow to respond, and `cluster.publish.timeout` to allow a faulty master to detect that it is unable to publish its cluster state updates in a timely fashion. If these timeouts occur then the node restarts the discovery process in an attempt to find a healthier master. In the special case of `discovery.type: single-node` there is no point in looking for another healthier master since the single node in the cluster is all we've got. This commit suppresses these timeouts and instead lets the node wait for joins and publications to succeed no matter how long this might take.	2020-02-11 14:15:01 +00:00
Armin Braun	91e938ead8	Add Trace Logging of REST Requests (#51684 ) (#52015 ) Being able to trace log all REST requests to a node would make debugging a number of issues a lot easier.	2020-02-07 09:03:20 +01:00
István Zoltán Szabó	9600ab4f57	[DOCS] Adds recommendation on dedicated master-eligible nodes (#51674 ) Co-Authored-By: James Rodewig <james.rodewig@elastic.co>	2020-01-31 12:56:58 +01:00
István Zoltán Szabó	424b4ed4ea	[DOCS] Expands the documentation of Node Query Cache (#51105 ) Co-authored-by: debadair <debadair@elastic.co>	2020-01-20 11:13:29 +01:00
debadair	83d961391b	[DOCS] Move snapshot-restore out of modules. (#49618 ) (#50829 ) * [DOCS] Move snapshot-restore docs out of modules. * [DOCS] Incorporates comments from @jrodewig. * [DOCS] Fix snippet tests	2020-01-09 16:55:46 -08:00
Lisa Cawley	2106a7b02a	[7.x][DOCS] Updates ML links (#50387 ) (#50409 )	2019-12-20 10:01:19 -08:00
Stuart Tettemer	2e76865290	[DOCS] Deterministic scripted queries are cached (#50408 ) (#50411 ) Backport Refs: #49321	2019-12-19 16:30:34 -07:00
Patryk Krawaczyński	df558aa0ca	[DOCS] Document `index.queries.cache.enabled` as a static setting (#49886 )	2019-12-10 14:24:03 -05:00
James Rodewig	f1fd41cb53	[DOCS] Document CCR compatibility requirements (#49776 ) * Creates a prerequisites section in the cross-cluster replication (CCR) overview. * Adds concise definitions for local and remote cluster in a CCR context. * Documents that the ES version of the local cluster must be the same or a newer compatible version as the remote cluster.	2019-12-02 15:53:00 -05:00
David Turner	86a40f6d8b	Drop snapshot instructions for autobootstrap fix (#49755 ) The "Restore any snapshots as required" step is a trap: it's somewhere between tricky and impossible to restore multiple clusters into a single one. Also add a note about configuring discovery during a rolling upgrade to proscribe any rare cases where you might accidentally autobootstrap during the upgrade.	2019-12-02 14:33:42 +00:00
István Zoltán Szabó	35cc0e0948	[DOCS] Removes the default size definition of thread pool types (#49442 ) Co-Authored-By: James Rodewig <james.rodewig@elastic.co>	2019-11-22 11:20:11 +01:00
James Rodewig	0fa3b887b7	[DOCS] Document several missing thread pools (#48543 ) Adds documentation for the following thread pools: - fetch_shard_started - fetch_shard_store - flush - force_merge - management Closes #48524 Co-Authored-By: Jay Modi <jaymode@users.noreply.github.com>	2019-11-21 13:12:56 -05:00
James Rodewig	f264808a6a	[DOCS] Replace cross-cluster search PNG images with SVGs (#49395 )	2019-11-21 09:06:33 -05:00
weizijun	3eb577f6c8	Document all shard allocation filtering attributes (#46992 ) This commit adds coverage to the docs for some missing built-in shard allocation attributes.	2019-11-21 08:30:30 -05:00
SylvainJuge	e8f49cdee0	[DOCS] minor fix to documentation: http.host can't default to itself (#48135 ) fix minor typos on http.host and transport.host default values. 7.x backport of https://github.com/elastic/elasticsearch/pull/48135	2019-11-14 18:16:38 +01:00
glerb	baabc21a04	[DOCS] Correct typo in Discovery docs (#48494 )	2019-11-05 08:48:43 -05:00
Jason Tedor	13043219ac	Fix specification for cluster.remote.connect (#48690 ) The docs specify that cluster.remote.connect disables cross-cluster search. This is correct, but not fully accurate as it disables any functionality that relies on remote cluster connections: cross-cluster search, remote data feeds, and cross-cluster replication. This commit updates the docs to reflect this.	2019-10-30 11:26:15 -04:00
Ian Danforth	4a076f5e92	[Doc] Fix typo in indices module docs (#48598 )	2019-10-28 21:40:09 +01:00
James Rodewig	f4fa61b2f2	[DOCS] Add 'Selecting gateway and seed nodes' section to CCS docs (#48297 )	2019-10-21 11:14:23 -05:00
François-Clément Brossard	f501a4b2b5	Clarify low watermark documentation (#48112 ) Today the docs say that the low watermark has no effect on any shards that have never been allocated, but this is confusing. Here "shard" means "replication group" not "shard copy" but this conflicts with the "never been allocated" qualifier since one allocates shard copies and not replication groups. This commit removes the misleading words. A newly-created replication group remains newly-created until one of its copies is assigned, which might be quite some time later, but it seems better to leave this implicit.	2019-10-16 12:27:49 +01:00
David Turner	ecb20ebc6c	More bootstrap docs tweaks (#47809 ) Clarifies not to set `cluster.initial_master_nodes` on nodes that are joining an existing cluster. Co-Authored-By: James Rodewig <james.rodewig@elastic.co>	2019-10-10 09:55:30 +01:00
David Turner	11093197f1	Fix deprecation docs formatting (#47725 ) Relates #47443	2019-10-08 15:41:34 +02:00
David Turner	bb5f750ab4	Deprecate include_relocations setting (#47443 ) Setting `cluster.routing.allocation.disk.include_relocations` to `false` is a bad idea since it will lead to the kinds of overshoot that were otherwise fixed in #46079. This commit deprecates this setting so it can be removed in the next major release.	2019-10-08 08:19:04 +01:00
Lisa Cawley	39ef795085	[DOCS] Cleans up links to security content (#47610 ) (#47703 )	2019-10-07 15:23:19 -07:00
James Rodewig	079bf887c0	[DOCS] Reorder index APIs alphabetically (#46981 ) (#47402 )	2019-10-01 17:07:28 -04:00
David Turner	272b0ecbdd	Remove docs for proxy mode (#46677 ) We added docs for proxy mode in #40281 but on reflection we should not be documenting this setting since it does not play well with all proxies and we can't recommend its use. This commit removes those docs and expands its Javadoc instead.	2019-09-13 22:20:11 +01:00
James Rodewig	60db630abd	[DOCS] Add missing mention of current version to snapshot docs (#46516 ) (#46658 )	2019-09-12 08:47:22 -04:00
David Turner	5c85b0998b	Clarify that discovery ignores master-ineligibles (#44835 ) The changes in #32006 mean that the discovery process can no longer use master-ineligible nodes as a stepping-stone between master-eligible nodes. This was normally an indication of a strange and possibly-fragile configuration and was not recommended, but this commit adds a note to the breaking changes docs to note that this kind of configuration is more obviously broken in recent versions.	2019-09-12 11:07:34 +01:00
James Rodewig	e253ee6ba6	[DOCS] Change // CONSOLE comments to [source,console] (#46440 ) (#46494 )	2019-09-09 12:35:50 -04:00
James Rodewig	f04573f8e8	[DOCS] [5 of 5] Change // TESTRESPONSE comments to [source,console-results] (#46449 ) (#46459 )	2019-09-06 16:09:09 -04:00
James Rodewig	bb7bff5e30	[DOCS] Replace "// TESTRESPONSE" magic comments with "[source,console-result] (#46295 ) (#46418 )	2019-09-06 09:22:08 -04:00
Jim Ferenczi	f2a6c88f83	Add a system property to ignore awareness attributes (#46375 ) This is a follow up of #19191 for 7.x. This change adds a system property called "es.routing.search_ignore_awareness_attributes" that when set to true will effectively ignore allocation awareness attributes when routing search and get requests. This is now the default in 8.x so this commit adds a way to opt-in to this new behavior in a minor version of 7.x. Relates #45735	2019-09-06 09:29:27 +02:00
Armin Braun	6aaee8aa0a	Repository Cleanup Endpoint (#43900 ) (#45780 ) * Repository Cleanup Endpoint (#43900) * Snapshot cleanup functionality via transport/REST endpoint. * Added all the infrastructure for this with the HLRC and node client * Made use of it in tests and resolved relevant TODO * Added new `Custom` CS element that tracks the cleanup logic. Kept it similar to the delete and in progress classes and gave it some (for now) redundant way of handling multiple cleanups but only allow one * Use the exact same mechanism used by deletes to have the combination of CS entry and increment in repository state ID provide some concurrency safety (the initial approach of just an entry in the CS was not enough, we must increment the repository state ID to be safe against concurrent modifications, otherwise we run the risk of "cleaning up" blobs that just got created without noticing) * Isolated the logic to the transport action class as much as I could. It's not ideal, but we don't need to keep any state and do the same for other repository operations (like getting the detailed snapshot shard status)	2019-08-21 17:59:49 +02:00
James Rodewig	a635eca5f8	Retitle and relocate cross-cluster search docs (#45608 )	2019-08-15 16:28:04 -04:00
James Rodewig	d64c31e43d	[DOCS] Rewrite cross-cluster search docs (#45583 )	2019-08-15 13:23:40 -04:00
James Rodewig	c75fd40f2c	[DOCS] Add diagrams to cross-cluster search documentation (#45569 )	2019-08-15 11:00:25 -04:00
Chris Dean	deab736aad	[DOCS] - Updating chunk_size values to fix size value notation. Chunksize41591 (#45552 ) (#45579 ) * changes to chunk_size #41591 * update to chunk size to include ` ` * Update docs/plugins/repository-azure.asciidoc Co-Authored-By: James Rodewig <james.rodewig@elastic.co> * Update docs/reference/modules/snapshots.asciidoc Co-Authored-By: James Rodewig <james.rodewig@elastic.co> * Update docs/plugins/repository-azure.asciidoc Co-Authored-By: James Rodewig <james.rodewig@elastic.co> * Update docs/plugins/repository-s3.asciidoc Co-Authored-By: James Rodewig <james.rodewig@elastic.co> * edits to fix passive voice	2019-08-14 15:59:36 -05:00
Chris Dean	caa2a7738f	Revert "[DOCS] - Updating chunk_size values to fix size value notation. Chunksize41591 (#45552 )" This reverts commit `8fdbcd7395`.	2019-08-14 15:14:10 -05:00
Chris Dean	8fdbcd7395	[DOCS] - Updating chunk_size values to fix size value notation. Chunksize41591 (#45552 ) * changes to chunk_size #41591 * update to chunk size to include ` ` * Update docs/plugins/repository-azure.asciidoc Co-Authored-By: James Rodewig <james.rodewig@elastic.co> * Update docs/reference/modules/snapshots.asciidoc Co-Authored-By: James Rodewig <james.rodewig@elastic.co> * Update docs/plugins/repository-azure.asciidoc Co-Authored-By: James Rodewig <james.rodewig@elastic.co> * Update docs/plugins/repository-s3.asciidoc Co-Authored-By: James Rodewig <james.rodewig@elastic.co> * edits to fix passive voice	2019-08-14 14:15:22 -05:00
Chris Dean	82d48cfcc9	[DOCS] Added cross-link to snapshot lifecycle management. Closes #44588 . (#45408 ) (#45468 ) merging #44588 changes into 7.x	2019-08-12 15:13:11 -05:00
David Turner	ddcc38cf1c	More read-only-allow-delete docs (#45320 ) Adds to the `index.blocks.read_only_allow_delete` docs the information that this block may be added or removed automatically, and rewords the breaking-changes docs to mention the blocks explicitly and to recommend using a different block. Relates #42559	2019-08-08 09:58:23 +01:00
Bukhtawar	cd304c4def	Auto-release flood-stage write block (#42559 ) If a node exceeds the flood-stage disk watermark then we add a block to all of its indices to prevent further writes as a last-ditch attempt to prevent the node completely exhausting its disk space. However today this block remains in place until manually removed, and this block is a source of confusion for users who currently have ample disk space and did not even realise they nearly ran out at some point in the past. This commit changes our behaviour to automatically remove this block when a node drops below the high watermark again. The expectation is that the high watermark is some distance below the flood-stage watermark and therefore the disk space problem is truly resolved. Fixes #39334	2019-08-07 11:03:53 +01:00
Yannick Welsch	7aeb2fe73c	Add per-socket keepalive options (#44055 ) Uses JDK 11's per-socket configuration of TCP keepalive (supported on Linux and Mac), see https://bugs.openjdk.java.net/browse/JDK-8194298, and exposes these as transport settings. By default, these options are disabled for now (i.e. fall-back to OS behavior), but we would like to explore whether we can enable them by default, in particular to force keepalive configurations that are better tuned for running ES.	2019-08-06 10:45:44 +02:00
David Turner	532ade7816	More logging for slow cluster state application (#45007 ) Today the lag detector may remove nodes from the cluster if they fail to apply a cluster state within a reasonable timeframe, but it is rather unclear from the default logging that this has occurred and there is very little extra information beyond the fact that the removed node was lagging. Moreover the only forewarning that the lag detector might be invoked is a message indicating that cluster state publication took unreasonably long, which does not contain enough information to investigate the problem further. This commit adds a good deal more detail to make the issues of slow nodes more prominent: - after 10 seconds (by default) we log an INFO message indicating that a publication is still waiting for responses from some nodes, including the identities of the problematic nodes. - when the publication times out after 30 seconds (by default) we log a WARN message identifying the nodes that are still pending. - the lag detector logs a more detailed warning when a fatally-lagging node is detected. - if applying a cluster state takes too long then the cluster applier service logs a breakdown of all the tasks it ran as part of that process.	2019-08-01 13:20:46 +01:00
Daniel Mitterdorfer	5dd0e74e79	Clarify which circuit breaker settings are static (#44992 ) Most of the circuit breaker settings are dynamically configurable. However, `indices.breaker.total.use_real_memory` is not. With this commit we add a clarifying note that this specific setting is static. Closes #44974	2019-07-31 13:15:33 +02:00
James Rodewig	d46545f729	[DOCS] Update anchors and links for Elasticsearch API relocation (#44500 )	2019-07-19 09:18:23 -04:00
James Rodewig	76c7e3a05f	[DOCS] Replace `_meta` with `metadata` for snapshot APIs. (#44596 ) elastic/elasticsearch#41281 added custom metadata parameter to snapshots. During review, the parameter name was changed from '_meta' to 'metadata,' but the documentation wasn't updated. This corrects the documentation to use the 'metadata' name.	2019-07-19 08:40:57 -04:00
Lee Hinman	fb0461ac76	[7.x] Add Snapshot Lifecycle Management (#44382 ) * Add Snapshot Lifecycle Management (#43934) * Add SnapshotLifecycleService and related CRUD APIs This commit adds `SnapshotLifecycleService` as a new service under the ilm plugin. This service handles snapshot lifecycle policies by scheduling based on the policies defined schedule. This also includes the get, put, and delete APIs for these policies Relates to #38461 * Make scheduledJobIds return an immutable set * Use Object.equals for SnapshotLifecyclePolicy * Remove unneeded TODO * Implement ToXContentFragment on SnapshotLifecyclePolicyItem * Copy contents of the scheduledJobIds * Handle snapshot lifecycle policy updates and deletions (#40062) (Note this is a PR against the `snapshot-lifecycle-management` feature branch) This adds logic to `SnapshotLifecycleService` to handle updates and deletes for snapshot policies. Policies with incremented versions have the old policy cancelled and the new one scheduled. Deleted policies have their schedules cancelled when they are no longer present in the cluster state metadata. Relates to #38461 * Take a snapshot for the policy when the SLM policy is triggered (#40383) (This is a PR for the `snapshot-lifecycle-management` branch) This commit fills in `SnapshotLifecycleTask` to actually perform the snapshotting when the policy is triggered. Currently there is no handling of the results (other than logging) as that will be added in subsequent work. This also adds unit tests and an integration test that schedules a policy and ensures that a snapshot is correctly taken. Relates to #38461 * Record most recent snapshot policy success/failure (#40619) Keeping a record of the results of the successes and failures will aid troubleshooting of policies and make users more confident that their snapshots are being taken as expected. This is the first step toward writing history in a more permanent fashion. 
* Validate snapshot lifecycle policies (#40654) (This is a PR against the `snapshot-lifecycle-management` branch) With the commit, we now validate the content of snapshot lifecycle policies when the policy is being created or updated. This checks for the validity of the id, name, schedule, and repository. Additionally, cluster state is checked to ensure that the repository exists prior to the lifecycle being added to the cluster state. Part of #38461 * Hook SLM into ILM's start and stop APIs (#40871) (This pull request is for the `snapshot-lifecycle-management` branch) This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are cancelled. Relates to #38461 * Add tests for SnapshotLifecyclePolicyItem (#40912) Adds serialization tests for SnapshotLifecyclePolicyItem. * Fix improper import in build.gradle after master merge * Add human readable version of modified date for snapshot lifecycle policy (#41035) * Add human readable version of modified date for snapshot lifecycle policy This small change changes it from: ``` ... "modified_date": 1554843903242, ... ``` To ``` ... "modified_date" : "2019-04-09T21:05:03.242Z", "modified_date_millis" : 1554843903242, ... ``` Including the `"modified_date"` field when the `?human` field is used. Relates to #38461 * Fix test * Add API to execute SLM policy on demand (#41038) This commit adds the ability to perform a snapshot on demand for a policy. This can be useful to take a snapshot immediately prior to performing some sort of maintenance. ```json PUT /_ilm/snapshot/<policy>/_execute ``` And it returns the response with the generated snapshot name: ```json { "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug" } ``` Note that this does not allow waiting for the snapshot, and the snapshot could still fail. It does record this information into the cluster state similar to a regularly trigged SLM job. 
Relates to #38461 * Add next_execution to SLM policy metadata (#41221) * Add next_execution to SLM policy metadata This adds the next time a snapshot lifecycle policy will be executed when retriving a policy's metadata, for example: ```json GET /_ilm/snapshot?human { "production" : { "version" : 1, "modified_date" : "2019-04-15T21:16:21.865Z", "modified_date_millis" : 1555362981865, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "/30 * * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-", "important" ], "ignore_unavailable" : true, "include_global_state" : false } }, "next_execution" : "2019-04-15T21:16:30.000Z", "next_execution_millis" : 1555362990000 }, "other" : { "version" : 1, "modified_date" : "2019-04-15T21:12:19.959Z", "modified_date_millis" : 1555362739959, "policy" : { "name" : "<other-snap-{now/d}>", "schedule" : "0 30 2 * ?", "repository" : "repo", "config" : { "indices" : [ "other" ], "ignore_unavailable" : false, "include_global_state" : true } }, "next_execution" : "2019-04-16T02:30:00.000Z", "next_execution_millis" : 1555381800000 } } ``` Relates to #38461 * Fix and enhance tests * Figured out how to Cron * Change SLM endpoint from /_ilm/* to /_slm/* (#41320) This commit changes the endpoint for snapshot lifecycle management from: ``` GET /_ilm/snapshot/<policy> ``` to: ``` GET /_slm/policy/<policy> ``` It mimics the ILM path only using `slm` instead of `ilm`. Relates to #38461 * Add initial documentation for SLM (#41510) * Add initial documentation for SLM This adds the initial documentation for snapshot lifecycle management. It also includes the REST spec API json files since they're sort of documentation. Relates to #38461 * Add `manage_slm` and `read_slm` roles (#41607) * Add `manage_slm` and `read_slm` roles This adds two more built in roles - `manage_slm` which has permission to perform any of the SLM actions, as well as stopping, starting, and retrieving the operation status of ILM. 
`read_slm` which has permission to retrieve snapshot lifecycle policies as well as retrieving the operation status of ILM. Relates to #38461 * Add execute to the test * Fix ilm -> slm typo in test * Record SLM history into an index (#41707) It is useful to have a record of the actions that Snapshot Lifecycle Management takes, especially for the purposes of alerting when a snapshot fails or has not been taken successfully for a certain amount of time. This adds the infrastructure to record SLM actions into an index that can be queried at leisure, along with a lifecycle policy so that this history does not grow without bound. Additionally, SLM automatically setting up an index + lifecycle policy leads to `index_lifecycle` custom metadata in the cluster state, which some of the ML tests don't know how to deal with due to setting up custom `NamedXContentRegistry`s. Watcher would cause the same problem, but it is already disabled (for the same reason). * High Level Rest Client support for SLM (#41767) * High Level Rest Client support for SLM This commit add HLRC support for SLM. Relates to #38461 * Fill out documentation tests with tags * Add more callouts and asciidoc for HLRC * Update javadoc links to real locations * Add security test testing SLM cluster privileges (#42678) * Add security test testing SLM cluster privileges This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm` cluster privileges. Relates to #38461 * Don't redefine vars * Add Getting Started Guide for SLM (#42878) This commit adds a basic Getting Started Guide for SLM. * Include SLM policy name in Snapshot metadata (#43132) Keep track of which SLM policy in the metadata field of the Snapshots taken by SLM. This allows users to more easily understand where the snapshot came from, and will enable future SLM features such as retention policies. 
* Fix compilation after master merge * [TEST] Move exception wrapping for devious exception throwing Fixes an issue where an exception was created from one line and thrown in another. * Fix SLM for the change to AcknowledgedResponse * Add Snapshot Lifecycle Management Package Docs (#43535) * Fix compilation for transport actions now that task is required * Add a note mentioning the privileges needed for SLM (#43708) * Add a note mentioning the privileges needed for SLM This adds a note to the top of the "getting started with SLM" documentation mentioning that there are two built-in privileges to assist with creating roles for SLM users and administrators. Relates to #38461 * Mention that you can create snapshots for indices you can't read * Fix REST tests for new number of cluster privileges * Mute testThatNonExistingTemplatesAreAddedImmediately (#43951) * Fix SnapshotHistoryStoreTests after merge * Remove overridden newResponse functions that have been removed * Fix compilation for backport * Fix get snapshot output parsing in test * [DOCS] Add redirects for removed autogen anchors (#44380) * Switch <tt>...</tt> in javadocs for {@code ...}	2019-07-16 07:37:13 -06:00
Albert Zaharovits	018d946bba	[DOC] Backup & Restore Security Configuration (#42970 ) This commit documents the backup and restore of a cluster's security configuration. It is not possible to only backup (or only restore) security configuration, independent to the rest of the cluster's conf, so this describes how a full configuration backup&restore will include security as well. Moreover, it explains how part of the security conf data resides on the special .security index and how to backup that using regular data snapshot API. Co-Authored-By: Lisa Cawley <lcawley@elastic.co> Co-Authored-By: Tim Vernum <tim@adjective.org>	2019-07-10 14:53:56 +03:00
Akshesh Doshi	01b982fd31	Draw attention to transport layer in remote cluster docs (#43883 ) Closes #43858	2019-07-05 13:44:36 +02:00

709 Commits