60 Commits

Author SHA1 Message Date
Wylie Conlon
a86c8dce70 Clarify field data cache behavior in docs (#64375)
* Clarify that field data cache includes global ordinals
* Describe that the cache should be cleared once the limit is reached
* Clarify that the `_id` field does not supported aggregations anymore
* Fold the `fielddata` mapping parameter page into the `text field docs
* Improve cross-linking
2020-11-20 13:59:52 -08:00
James Rodewig
7e2903d888
[DOCS] Document dynamic index mgmt and buffer settings (#61753) (#61996) 2020-09-04 10:40:55 -04:00
James Rodewig
129b233156
[DOCS] Document dynamic cluster settings (#61760) (#61817)
Co-authored-by: Adam Locke <adam.locke@elastic.co>
2020-09-01 16:04:23 -04:00
James Rodewig
580ef8eb0c
[DOCS] Document static field cache settings (#61424) (#61606) 2020-08-26 17:29:15 -04:00
James Rodewig
dc9d613280
[DOCS] Document dynamic circuit breaker settings (#61334) (#61335) 2020-08-19 11:13:46 -04:00
James Rodewig
26d51089da
[DOCS] Replace twitter dataset in docs (#60604) (#60609) 2020-08-03 13:31:19 -04:00
James Rodewig
aba785cb6e
[DOCS] Update my-index examples (#60132) (#60248)
Changes the following example index names to `my-index-000001` for consistency:

* `my-index`
* `my_index`
* `myindex`
2020-07-27 15:58:26 -04:00
James Rodewig
988e8c8fc6
[DOCS] Swap [float] for [discrete] (#60134)
Changes instances of `[float]` in our docs for `[discrete]`.

Asciidoctor prefers the `[discrete]` tag for floating headings:
https://asciidoctor.org/docs/asciidoc-asciidoctor-diffs/#blocks
2020-07-23 12:42:33 -04:00
Nhat Nguyen
ef5c397c0f
Sending operations concurrently in peer recovery (#58018)
Today, we send operations in phase2 of peer recoveries batch by batch
sequentially. Normally that's okay as we should have a fairly small of
operations in phase 2 due to the file-based threshold. However, if
phase1 takes a lot of time and we are actively indexing, then phase2 can
have a lot of operations to replay.

With this change, we will send multiple batches concurrently (defaults
to 1) to reduce the recovery time.

Backport of #58018
2020-07-07 22:03:31 -04:00
Adam Locke
20d04081ec
[7.x] [DOCS] Add supported ESS settings to ES docs (#57953) (#58981)
* Adding ESS icons to supported ES settings.

* Adding new file for supported ESS settings.

* Adding supported ESS settings for HTTP and disk-based shard allocation.

* Adding more supported settings for ESS.

* Adding descriptions for each Cloud section, plus additional settings.

* Adding new warehouse file for Cloud, plus additional settings.

* Adding node settings for Cloud.

* Adding audit settings for Cloud.

* Resolving merge conflict.

* Adding SAML settings (part 1).

* Adding SAML realm encryption and signing settings.

* Adding SAML SSL settings.

* Adding Kerberos realm settings.

* Adding OpenID Connect Realm settings.

* Adding OpenID Connect SSL settings.

* Resolving leftover Git merge markers.

* Removing Cloud settings page and link to it.

* Add link to mapping source

* Update docs/reference/docs/reindex.asciidoc

* Incorporate edit of HTTP settings

* Remove "cloud" from tag and ID

* Remove "cloud" from tag and update description

* Remove "cloud" from tag and ID

* Change "whitelists" to "specifies"

* Remove "cloud" from end tag

* Removing cloud from IDs and tags.

* Changing link reference to fix build issue.

* Adding index management page for missing settings.

* Removing warehouse file for Cloud and moving settings elsewhere.

* Clarifying true/false usage of http.detailed_errors.enabled.

* Changing underscore to dash in link to fix ci build.
2020-07-02 19:40:45 -04:00
Yannick Welsch
15c85b29fd
Account for recovery throttling when restoring snapshot (#58658) (#58811)
Restoring from a snapshot (which is a particular form of recovery) does not currently take recovery throttling into account
(i.e. the `indices.recovery.max_bytes_per_sec` setting). While restores are subject to their own throttling (repository
setting `max_restore_bytes_per_sec`), this repository setting does not allow for values to be configured differently on a
per-node basis. As restores are very similar in nature to peer recoveries (streaming bytes to the node), it makes sense to
configure throttling in a single place.

The `max_restore_bytes_per_sec` setting is also changed to default to unlimited now, whereas previously it was set to
`40mb`, which is the current default of `indices.recovery.max_bytes_per_sec`). This means that no behavioral change
will be observed by clusters where the recovery and restore settings were not adapted.

Relates https://github.com/elastic/elasticsearch/issues/57023

Co-authored-by: James Rodewig <james.rodewig@elastic.co>
2020-07-01 12:19:29 +02:00
Stuart Tettemer
20abba8433
Scripting: Deprecate general cache settings (#55753) (#58283)
Backport: ef543b0
2020-06-18 11:54:23 -06:00
Stuart Tettemer
01795d1925
Revert "Scripting: Deprecate general cache settings (#55753)" (#58201)
This reverts commit 88e8b34fc2d672060a82979cb782b8cf491a3985.
2020-06-16 14:58:18 -06:00
Stuart Tettemer
88e8b34fc2
Scripting: Deprecate general cache settings (#55753)
Backport: ef543b0
2020-06-16 13:06:59 -06:00
James Rodewig
43ef469570
[DOCS] Relocate indices module content (#54903) (#57413)
Moves `indices` content from the [Modules][0] section to the [Configuring
Elasticsearch][1] section.

Also removes the [Indices][2] landing page and adds a related redirect.

[0]: https://www.elastic.co/guide/en/elasticsearch/reference/master/modules.html
[1]: https://www.elastic.co/guide/en/elasticsearch/reference/master/settings.html
[2]: https://www.elastic.co/guide/en/elasticsearch/reference/master/modules-indices.html
2020-06-01 09:44:32 -04:00
James Rodewig
a5154cc190 [DOCS] Correct setting type for indices.query.bool.max_clause_count (#56640)
#56449 incorrectly labelled this as a dynamic setting.

This corrects that error.
2020-05-12 16:26:18 -04:00
James Rodewig
ba67ab3b64
[DOCS] Add reference docs for search.max_buckets setting (#56449) (#56511)
Adds reference-style setting documentation for the `search.max_buckets`
setting.

This setting was previously only documented on the [bucket
aggregations][0] page.

[0]: https://www.elastic.co/guide/en/elasticsearch/reference/master/search-aggregations-bucket.html
2020-05-11 09:45:09 -04:00
David Turner
4e083cd97d indices.recovery.max_bytes_per_sec may be per-node (#54633)
The `indices.recovery.max_bytes_per_sec` recovery bandwidth limit can differ
between nodes if it is not set dynamically, but today this is not obvious. This
commit adds a paragraph to its documentation clarifying how to set different
bandwidth limits on each node.

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2020-04-02 18:15:41 +01:00
István Zoltán Szabó
424b4ed4ea [DOCS] Expands the documentation of Node Query Cache (#51105)
Co-authored-by: debadair <debadair@elastic.co>
2020-01-20 11:13:29 +01:00
Stuart Tettemer
2e76865290
[DOCS] Deterministic scripted queries are cached (#50408) (#50411)
**Backport**

Refs: #49321
2019-12-19 16:30:34 -07:00
Patryk Krawaczyński
df558aa0ca [DOCS] Document index.queries.cache.enabled as a static setting (#49886) 2019-12-10 14:24:03 -05:00
James Rodewig
079bf887c0
[DOCS] Reorder index APIs alphabetically (#46981) (#47402) 2019-10-01 17:07:28 -04:00
James Rodewig
e253ee6ba6
[DOCS] Change // CONSOLE comments to [source,console] (#46440) (#46494) 2019-09-09 12:35:50 -04:00
Daniel Mitterdorfer
5dd0e74e79 Clarify which circuit breaker settings are static (#44992)
Most of the circuit breaker settings are dynamically configurable.
However, `indices.breaker.total.use_real_memory` is not. With this
commit we add a clarifying note that this specific setting is static.

Closes #44974
2019-07-31 13:15:33 +02:00
Sam Mingo
12962ee0a7 Update search-settings.asciidoc (#43016)
Grammar and spelling fixes
2019-06-10 10:14:03 +01:00
James Rodewig
58f2e91684 [DOCS] Rewrite 'rewrite' parameter docs (#42018) 2019-05-13 08:43:12 -04:00
James Rodewig
6a7459ff11 [DOCS] Clarify Recovery Settings for Shard Relocation (#40329)
* Clarify that peer recovery settings apply to shard relocation

* Fix awkward wording of 1st sentence

* [DOCS] Remove snapshot recovery reference.
Call out link to [[cat-recovery]].
Separate expert settings.
2019-04-26 10:24:14 -04:00
Christoph Büscher
25aac4f77f
Remove include_type_name in asciidoc where possible (#37568)
The "include_type_name" parameter was temporarily introduced in #37285 to facilitate
moving the default parameter setting to "false" in many places in the documentation
code snippets. Most of the places can simply be reverted without causing errors.
In this change I looked for asciidoc files that contained the
"include_type_name=true" addition when creating new indices but didn't look
likey they made use of the "_doc" type for mappings. This is mostly the case
e.g. in the analysis docs where index creating often only contains settings. I
manually corrected the use of types in some places where the docs still used an
explicit type name and not the dummy "_doc" type.
2019-01-18 09:34:11 +01:00
Julie Tibshirani
36a3b84fc9
Update the default for include_type_name to false. (#37285)
* Default include_type_name to false for get and put mappings.

* Default include_type_name to false for get field mappings.

* Add a constant for the default include_type_name value.

* Default include_type_name to false for get and put index templates.

* Default include_type_name to false for create index.

* Update create index calls in REST documentation to use include_type_name=true.

* Some minor clean-ups around the get index API.

* In REST tests, use include_type_name=true by default for index creation.

* Make sure to use 'expression == false'.

* Clarify the different IndexTemplateMetaData toXContent methods.

* Fix FullClusterRestartIT#testSnapshotRestore.

* Fix the ml_anomalies_default_mappings test.

* Fix GetFieldMappingsResponseTests and GetIndexTemplateResponseTests.

We make sure to specify include_type_name=true during xContent parsing,
so we continue to test the legacy typed responses. XContent generation
for the typeless responses is currently only covered by REST tests,
but we will be adding unit test coverage for these as we implement
each typeless API in the Java HLRC.

This commit also refactors GetMappingsResponse to follow the same appraoch
as the other mappings-related responses, where we read include_type_name
out of the xContent params, instead of creating a second toXContent method.
This gives better consistency in the response parsing code.

* Fix more REST tests.

* Improve some wording in the create index documentation.

* Add a note about types removal in the create index docs.

* Fix SmokeTestMonitoringWithSecurityIT#testHTTPExporterWithSSL.

* Make sure to mention include_type_name in the REST docs for affected APIs.

* Make sure to use 'expression == false' in FullClusterRestartIT.

* Mention include_type_name in the REST templates docs.
2019-01-14 13:08:01 -08:00
Nhat Nguyen
15aa3764a4
Reduce recovery time with compress or secure transport (#36981)
Today file-chunks are sent sequentially one by one in peer-recovery. This is a
correct choice since the implementation is straightforward and recovery is
network bound in most of the time. However, if the connection is encrypted, we
might not be able to saturate the network pipe because encrypting/decrypting
are cpu bound rather than network-bound.

With this commit, a source node can send multiple (default to 2) file-chunks
without waiting for the acknowledgments from the target.

Below are the benchmark results for PMC and NYC_taxis.

- PMC (20.2 GB)

| Transport | Baseline | chunks=1 | chunks=2 | chunks=3 | chunks=4 |
| ----------| ---------| -------- | -------- | -------- | -------- |
| Plain     | 184s     | 137s     | 106s     | 105s     | 106s     |
| TLS       | 346s     | 294s     | 176s     | 153s     | 117s     |
| Compress  | 1556s    | 1407s    | 1193s    | 1183s    | 1211s    |

- NYC_Taxis (38.6GB)

| Transport | Baseline | chunks=1 | chunks=2 | chunks=3 | chunks=4 |
| ----------| ---------| ---------| ---------| ---------| -------- |
| Plain     | 321s     | 249s     | 191s     |  *       | *        |
| TLS       | 618s     | 539s     | 323s     | 290s     | 213s     |
| Compress  | 2622s    | 2421s    | 2018s    | 2029s    | n/a      |

Relates #33844
2019-01-14 15:14:46 -05:00
David Turner
d9e2ebca67
Add more detail to recovery bandwidth limit docs (#37156) 2019-01-09 08:18:25 +00:00
Yu
d01b30acba lower fielddata circuit breaker's default limit (#27162)
* Lower fielddata circuit breaker default limit

Lower fielddata circuit breaker default limit from 60% to 40% as we have
moved to doc_values for most of the cases.

* merge master in

* update tests

* update docs
2018-12-11 11:30:58 +01:00
Alexandru Rusanescu
f3e150b0ea [Docs] Update query_cache.asciidoc (#33340)
Add note about non-visibility of cache content.
2018-11-01 10:22:36 +01:00
Christoph Büscher
c0c6a28e86
[Docs] Add indices.query.bool.max_clause_count setting (#34779)
This change adds a section about the global search setting
`indices.query.bool.max_clause_count` that limits the number of boolean clauses
allowed in a Lucene BooleanQuery.

Closes #19858
2018-10-25 17:59:59 +02:00
Daniel Mitterdorfer
f174f72fee
Circuit-break based on real memory usage
With this commit we introduce a new circuit-breaking strategy to the parent
circuit breaker. Contrary to the current implementation which only accounts for
memory reserved via child circuit breakers, the new strategy measures real heap
memory usage at the time of reservation. This allows us to be much more
aggressive with the circuit breaker limit so we bump it to 95% by default. The
new strategy is turned on by default and can be controlled  with the new cluster
setting `indices.breaker.total.userealmemory`.

Note that we turn it off for all integration tests with an internal test cluster
because it leads to spurious test failures which are of no value (we cannot
fully control heap memory usage in tests). All REST tests, however, will make
use of the real memory circuit breaker.

Relates #31767
2018-07-13 10:08:28 +02:00
Daniel Mitterdorfer
3d53daeb2f
Account for XContent overhead in in-flight breaker
So far the in-flight request circuit breaker has only accounted for the
on-the-wire representation of a request. However, we convert the raw
request into XContent internally which increases the overhead.
Therefore, we increase the value of the corresponding setting
`network.breaker.inflight_requests.overhead` from one to two. While this
value is still rather conservative (we assume that the representation as
structured objects has no overhead compared to the byte[]), it is closer
to reality than the current value.

Relates #31613
2018-07-03 09:17:16 +02:00
Colin Goodheart-Smithe
360b09f148
[DOCS] Fixes accounting setting names (#30863)
The documentation for the account circuit breaker listed the settings for it's limit and overhead to be `network.breaker.accounting.limit` and `network.breaker.accounting.overhead` when in `HieratchyCircuitBreakerService` it seems the settings are actually `indices.breaker.accounting.limit` and `indices.breaker.accounting.overhead`.
2018-06-04 09:20:54 +01:00
Lee Jones
37f67d9e21 [Docs] Fix typo in circuit breaker docs (#29659)
The previous description had a part that didn't fit and was probably
from a copy/paste of the in flight requests description above.
2018-05-22 16:43:45 +02:00
Lee Hinman
623d3700f0
Add accounting circuit breaker and track segment memory usage (#27116)
* Add accounting circuit breaker and track segment memory usage

This commit adds a new circuit breaker "accounting" that is used for tracking
the memory usage of non-request-tied memory users. It also adds tracking for the
amount of Lucene segment memory used by a shard as a user of the new circuit
breaker.

The Lucene segment memory is updated when the shard refreshes, and removed when
the shard relocates away from a node or is deleted. It should also be noted that
all tracking for segment memory uses `addWithoutBreaking` so as not to fail the
shard if a limit is reached.

The `accounting` breaker has a default limit of 100% and will contribute to the
parent breaker limit.

Resolves #27044
2017-12-01 07:59:45 -07:00
Alexander Reelsen
80d0a32f8e ScriptService: Replace max compilation per minute setting with max compilation rate (#26399)
The current script service has a script compilation limit for a one
minute window. This is set to a small default value of 15. Instead of
increasing that default value, this commit introduces a new setting 
that allows to configure a rate per time unit, so that the script service can deal with bursts better.

The new setting is named `script.max_compilations_rate`,
requires a nonnegative number and a positive time value.

The default is `75/5m`, which is equivalent to the existing 15 per minute.
2017-09-01 10:15:27 +02:00
Robin Clarke
1900d9c447 Docs: Fix typo for request cache (#25444) 2017-06-28 14:31:03 +02:00
Stefan Gorgiovski
798c19dd7f Deprecate request_cache for clear-cache (#23638)
It is called `request` now.
2017-03-22 08:28:04 -04:00
alamzeeshan
a1cc683cff Updated document as per code change. (#22878)
Updated document as per this change : https://github.com/elastic/elasticsearch/pull/15235
2017-01-31 13:36:09 +01:00
Nik Everett
3ed3e5e660 Convert more docs to CONSOLE
* plugins/discovery-azure-class.asciidoc
* reference/cluster.asciidoc
* reference/modules/cluster/misc.asciidoc
* reference/modules/indices/request_cache.asciidoc

After this is merged there will be no unconvereted snippets outside
of `reference`.

Related to #18160
2016-09-21 09:36:21 -04:00
Colin Goodheart-Smithe
2904562b01 [DOCS] Fix shard request cache docs
Docs have been changed to reflect the fact that shard request cache is now enabled by default

Closes #19695
2016-08-11 14:25:34 +01:00
Lee Hinman
2be52eff09 Circuit break the number of inline scripts compiled per minute
When compiling many dynamically changing scripts, parameterized
scripts (<https://www.elastic.co/guide/en/elasticsearch/reference/master/modules-scripting-using.html#prefer-params>)
should be preferred. This enforces a limit to the number of scripts that
can be compiled within a minute. A new dynamic setting is added -
`script.max_compilations_per_minute`, which defaults to 15.

If more dynamic scripts are sent, a user will get the following
exception:

```json
{
  "error" : {
    "root_cause" : [
      {
        "type" : "circuit_breaking_exception",
        "reason" : "[script] Too many dynamic script compilations within one minute, max: [15/min]; please use on-disk, indexed, or scripts with parameters instead",
        "bytes_wanted" : 0,
        "bytes_limit" : 0
      }
    ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "query",
    "grouped" : true,
    "failed_shards" : [
      {
        "shard" : 0,
        "index" : "i",
        "node" : "a5V1eXcZRYiIk8lecjZ4Jw",
        "reason" : {
          "type" : "general_script_exception",
          "reason" : "Failed to compile inline script [\"aaaaaaaaaaaaaaaa\"] using lang [painless]",
          "caused_by" : {
            "type" : "circuit_breaking_exception",
            "reason" : "[script] Too many dynamic script compilations within one minute, max: [15/min]; please use on-disk, indexed, or scripts with parameters instead",
            "bytes_wanted" : 0,
            "bytes_limit" : 0
          }
        }
      }
    ],
    "caused_by" : {
      "type" : "general_script_exception",
      "reason" : "Failed to compile inline script [\"aaaaaaaaaaaaaaaa\"] using lang [painless]",
      "caused_by" : {
        "type" : "circuit_breaking_exception",
        "reason" : "[script] Too many dynamic script compilations within one minute, max: [15/min]; please use on-disk, indexed, or scripts with parameters instead",
        "bytes_wanted" : 0,
        "bytes_limit" : 0
      }
    }
  },
  "status" : 500
}
```

This also fixes a bug in `ScriptService` where requests being executed
concurrently on a single node could cause a script to be compiled
multiple times (many in the case of a powerful node with many shards)
due to no synchronization between checking the cache and compiling the
script. There is now synchronization so that a script being compiled
will only be compiled once regardless of the number of concurrent
searches on a node.

Relates to #19396
2016-08-09 10:26:27 -06:00
Lee Hinman
1623cff6c0 Merge remote-tracking branch 'dakrone/bucket-circuit-breaker' 2016-07-25 13:37:26 -06:00
Lee Hinman
124a9fabe3 Circuit break on aggregation bucket numbers with request breaker
This adds new circuit breaking with the "request" breaker, which adds
circuit breaks based on the number of buckets created during
aggregations. It consists of incrementing during AggregatorBase creation

This also bumps the REQUEST breaker to 60% of the JVM heap now.

The output when circuit breaking an aggregation looks like:

```json
{
  "shard" : 0,
  "index" : "i",
  "node" : "a5AvjUn_TKeTNYl0FyBW2g",
  "reason" : {
    "type" : "exception",
    "reason" : "java.util.concurrent.ExecutionException: QueryPhaseExecutionException[Query Failed [Failed to execute main query]]; nested: CircuitBreakingException[[request] Data too large, data for [<agg [otherthings]>] would be larger than limit of [104857600/100mb]];",
    "caused_by" : {
      "type" : "execution_exception",
      "reason" : "QueryPhaseExecutionException[Query Failed [Failed to execute main query]]; nested: CircuitBreakingException[[request] Data too large, data for [<agg [myagg]>] would be larger than limit of [104857600/100mb]];",
      "caused_by" : {
        "type" : "circuit_breaking_exception",
        "reason" : "[request] Data too large, data for [<agg [otherthings]>] would be larger than limit of [104857600/100mb]",
        "bytes_wanted" : 104860781,
        "bytes_limit" : 104857600
      }
    }
  }
}
```

Relates to #14046
2016-07-25 11:33:37 -06:00
Colin Goodheart-Smithe
b717ad8eb6 Enable option to use request cache for size > 0
Previously if the size of the search request was greater than zero we would not cache the request in the request cache.

This change retains the default behaviour of not caching requests with size > 0 but also allows the `request_cache=true` query parameter
to enable the cache for requests with size > 0
2016-07-18 13:33:59 +01:00
Adrien Grand
db9af54ec0 Remove _timestamp and _ttl on 5.x indices. #18980
This removes the ability to use `_timestamp` and `_ttl` on indices created on
or after 5.0.

Closes #18280
2016-06-22 08:35:54 +02:00