Commit Graph

43 Commits

Author SHA1 Message Date
David Turner 4e083cd97d indices.recovery.max_bytes_per_sec may be per-node (#54633)
The `indices.recovery.max_bytes_per_sec` recovery bandwidth limit can differ
between nodes if it is not set dynamically, but today this is not obvious. This
commit adds a paragraph to its documentation clarifying how to set different
bandwidth limits on each node.

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2020-04-02 18:15:41 +01:00
István Zoltán Szabó 424b4ed4ea [DOCS] Expands the documentation of Node Query Cache (#51105)
Co-authored-by: debadair <debadair@elastic.co>
2020-01-20 11:13:29 +01:00
Stuart Tettemer 2e76865290
[DOCS] Deterministic scripted queries are cached (#50408) (#50411)
**Backport**

Refs: #49321
2019-12-19 16:30:34 -07:00
Patryk Krawaczyński df558aa0ca [DOCS] Document `index.queries.cache.enabled` as a static setting (#49886) 2019-12-10 14:24:03 -05:00
James Rodewig 079bf887c0
[DOCS] Reorder index APIs alphabetically (#46981) (#47402) 2019-10-01 17:07:28 -04:00
James Rodewig e253ee6ba6
[DOCS] Change // CONSOLE comments to [source,console] (#46440) (#46494) 2019-09-09 12:35:50 -04:00
Daniel Mitterdorfer 5dd0e74e79 Clarify which circuit breaker settings are static (#44992)
Most of the circuit breaker settings are dynamically configurable.
However, `indices.breaker.total.use_real_memory` is not. With this
commit we add a clarifying note that this specific setting is static.

Closes #44974
2019-07-31 13:15:33 +02:00
Sam Mingo 12962ee0a7 Update search-settings.asciidoc (#43016)
Grammar and spelling fixes
2019-06-10 10:14:03 +01:00
James Rodewig 58f2e91684 [DOCS] Rewrite 'rewrite' parameter docs (#42018) 2019-05-13 08:43:12 -04:00
James Rodewig 6a7459ff11 [DOCS] Clarify Recovery Settings for Shard Relocation (#40329)
* Clarify that peer recovery settings apply to shard relocation

* Fix awkward wording of 1st sentence

* [DOCS] Remove snapshot recovery reference.
Call out link to [[cat-recovery]].
Separate expert settings.
2019-04-26 10:24:14 -04:00
Christoph Büscher 25aac4f77f
Remove `include_type_name` in asciidoc where possible (#37568)
The "include_type_name" parameter was temporarily introduced in #37285 to facilitate
moving the default parameter setting to "false" in many places in the documentation
code snippets. Most of the places can simply be reverted without causing errors.
In this change I looked for asciidoc files that contained the
"include_type_name=true" addition when creating new indices but didn't look
likey they made use of the "_doc" type for mappings. This is mostly the case
e.g. in the analysis docs where index creating often only contains settings. I
manually corrected the use of types in some places where the docs still used an
explicit type name and not the dummy "_doc" type.
2019-01-18 09:34:11 +01:00
Julie Tibshirani 36a3b84fc9
Update the default for include_type_name to false. (#37285)
* Default include_type_name to false for get and put mappings.

* Default include_type_name to false for get field mappings.

* Add a constant for the default include_type_name value.

* Default include_type_name to false for get and put index templates.

* Default include_type_name to false for create index.

* Update create index calls in REST documentation to use include_type_name=true.

* Some minor clean-ups around the get index API.

* In REST tests, use include_type_name=true by default for index creation.

* Make sure to use 'expression == false'.

* Clarify the different IndexTemplateMetaData toXContent methods.

* Fix FullClusterRestartIT#testSnapshotRestore.

* Fix the ml_anomalies_default_mappings test.

* Fix GetFieldMappingsResponseTests and GetIndexTemplateResponseTests.

We make sure to specify include_type_name=true during xContent parsing,
so we continue to test the legacy typed responses. XContent generation
for the typeless responses is currently only covered by REST tests,
but we will be adding unit test coverage for these as we implement
each typeless API in the Java HLRC.

This commit also refactors GetMappingsResponse to follow the same appraoch
as the other mappings-related responses, where we read include_type_name
out of the xContent params, instead of creating a second toXContent method.
This gives better consistency in the response parsing code.

* Fix more REST tests.

* Improve some wording in the create index documentation.

* Add a note about types removal in the create index docs.

* Fix SmokeTestMonitoringWithSecurityIT#testHTTPExporterWithSSL.

* Make sure to mention include_type_name in the REST docs for affected APIs.

* Make sure to use 'expression == false' in FullClusterRestartIT.

* Mention include_type_name in the REST templates docs.
2019-01-14 13:08:01 -08:00
Nhat Nguyen 15aa3764a4
Reduce recovery time with compress or secure transport (#36981)
Today file-chunks are sent sequentially one by one in peer-recovery. This is a
correct choice since the implementation is straightforward and recovery is
network bound in most of the time. However, if the connection is encrypted, we
might not be able to saturate the network pipe because encrypting/decrypting
are cpu bound rather than network-bound.

With this commit, a source node can send multiple (default to 2) file-chunks
without waiting for the acknowledgments from the target.

Below are the benchmark results for PMC and NYC_taxis.

- PMC (20.2 GB)

| Transport | Baseline | chunks=1 | chunks=2 | chunks=3 | chunks=4 |
| ----------| ---------| -------- | -------- | -------- | -------- |
| Plain     | 184s     | 137s     | 106s     | 105s     | 106s     |
| TLS       | 346s     | 294s     | 176s     | 153s     | 117s     |
| Compress  | 1556s    | 1407s    | 1193s    | 1183s    | 1211s    |

- NYC_Taxis (38.6GB)

| Transport | Baseline | chunks=1 | chunks=2 | chunks=3 | chunks=4 |
| ----------| ---------| ---------| ---------| ---------| -------- |
| Plain     | 321s     | 249s     | 191s     |  *       | *        |
| TLS       | 618s     | 539s     | 323s     | 290s     | 213s     |
| Compress  | 2622s    | 2421s    | 2018s    | 2029s    | n/a      |

Relates #33844
2019-01-14 15:14:46 -05:00
David Turner d9e2ebca67
Add more detail to recovery bandwidth limit docs (#37156) 2019-01-09 08:18:25 +00:00
Yu d01b30acba lower fielddata circuit breaker's default limit (#27162)
* Lower fielddata circuit breaker default limit

Lower fielddata circuit breaker default limit from 60% to 40% as we have
moved to doc_values for most of the cases.

* merge master in

* update tests

* update docs
2018-12-11 11:30:58 +01:00
Alexandru Rusanescu f3e150b0ea [Docs] Update query_cache.asciidoc (#33340)
Add note about non-visibility of cache content.
2018-11-01 10:22:36 +01:00
Christoph Büscher c0c6a28e86
[Docs] Add `indices.query.bool.max_clause_count` setting (#34779)
This change adds a section about the global search setting
`indices.query.bool.max_clause_count` that limits the number of boolean clauses
allowed in a Lucene BooleanQuery.

Closes #19858
2018-10-25 17:59:59 +02:00
Daniel Mitterdorfer f174f72fee
Circuit-break based on real memory usage
With this commit we introduce a new circuit-breaking strategy to the parent
circuit breaker. Contrary to the current implementation which only accounts for
memory reserved via child circuit breakers, the new strategy measures real heap
memory usage at the time of reservation. This allows us to be much more
aggressive with the circuit breaker limit so we bump it to 95% by default. The
new strategy is turned on by default and can be controlled  with the new cluster
setting `indices.breaker.total.userealmemory`.

Note that we turn it off for all integration tests with an internal test cluster
because it leads to spurious test failures which are of no value (we cannot
fully control heap memory usage in tests). All REST tests, however, will make
use of the real memory circuit breaker.

Relates #31767
2018-07-13 10:08:28 +02:00
Daniel Mitterdorfer 3d53daeb2f
Account for XContent overhead in in-flight breaker
So far the in-flight request circuit breaker has only accounted for the
on-the-wire representation of a request. However, we convert the raw
request into XContent internally which increases the overhead.
Therefore, we increase the value of the corresponding setting
`network.breaker.inflight_requests.overhead` from one to two. While this
value is still rather conservative (we assume that the representation as
structured objects has no overhead compared to the byte[]), it is closer
to reality than the current value.

Relates #31613
2018-07-03 09:17:16 +02:00
Colin Goodheart-Smithe 360b09f148
[DOCS] Fixes accounting setting names (#30863)
The documentation for the account circuit breaker listed the settings for it's limit and overhead to be `network.breaker.accounting.limit` and `network.breaker.accounting.overhead` when in `HieratchyCircuitBreakerService` it seems the settings are actually `indices.breaker.accounting.limit` and `indices.breaker.accounting.overhead`.
2018-06-04 09:20:54 +01:00
Lee Jones 37f67d9e21 [Docs] Fix typo in circuit breaker docs (#29659)
The previous description had a part that didn't fit and was probably
from a copy/paste of the in flight requests description above.
2018-05-22 16:43:45 +02:00
Lee Hinman 623d3700f0
Add accounting circuit breaker and track segment memory usage (#27116)
* Add accounting circuit breaker and track segment memory usage

This commit adds a new circuit breaker "accounting" that is used for tracking
the memory usage of non-request-tied memory users. It also adds tracking for the
amount of Lucene segment memory used by a shard as a user of the new circuit
breaker.

The Lucene segment memory is updated when the shard refreshes, and removed when
the shard relocates away from a node or is deleted. It should also be noted that
all tracking for segment memory uses `addWithoutBreaking` so as not to fail the
shard if a limit is reached.

The `accounting` breaker has a default limit of 100% and will contribute to the
parent breaker limit.

Resolves #27044
2017-12-01 07:59:45 -07:00
Alexander Reelsen 80d0a32f8e ScriptService: Replace max compilation per minute setting with max compilation rate (#26399)
The current script service has a script compilation limit for a one
minute window. This is set to a small default value of 15. Instead of
increasing that default value, this commit introduces a new setting 
that allows to configure a rate per time unit, so that the script service can deal with bursts better.

The new setting is named `script.max_compilations_rate`,
requires a nonnegative number and a positive time value.

The default is `75/5m`, which is equivalent to the existing 15 per minute.
2017-09-01 10:15:27 +02:00
Robin Clarke 1900d9c447 Docs: Fix typo for request cache (#25444) 2017-06-28 14:31:03 +02:00
Stefan Gorgiovski 798c19dd7f Deprecate request_cache for clear-cache (#23638)
It is called `request` now.
2017-03-22 08:28:04 -04:00
alamzeeshan a1cc683cff Updated document as per code change. (#22878)
Updated document as per this change : https://github.com/elastic/elasticsearch/pull/15235
2017-01-31 13:36:09 +01:00
Nik Everett 3ed3e5e660 Convert more docs to CONSOLE
* plugins/discovery-azure-class.asciidoc
* reference/cluster.asciidoc
* reference/modules/cluster/misc.asciidoc
* reference/modules/indices/request_cache.asciidoc

After this is merged there will be no unconvereted snippets outside
of `reference`.

Related to #18160
2016-09-21 09:36:21 -04:00
Colin Goodheart-Smithe 2904562b01 [DOCS] Fix shard request cache docs
Docs have been changed to reflect the fact that shard request cache is now enabled by default

Closes #19695
2016-08-11 14:25:34 +01:00
Lee Hinman 2be52eff09 Circuit break the number of inline scripts compiled per minute
When compiling many dynamically changing scripts, parameterized
scripts (<https://www.elastic.co/guide/en/elasticsearch/reference/master/modules-scripting-using.html#prefer-params>)
should be preferred. This enforces a limit to the number of scripts that
can be compiled within a minute. A new dynamic setting is added -
`script.max_compilations_per_minute`, which defaults to 15.

If more dynamic scripts are sent, a user will get the following
exception:

```json
{
  "error" : {
    "root_cause" : [
      {
        "type" : "circuit_breaking_exception",
        "reason" : "[script] Too many dynamic script compilations within one minute, max: [15/min]; please use on-disk, indexed, or scripts with parameters instead",
        "bytes_wanted" : 0,
        "bytes_limit" : 0
      }
    ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "query",
    "grouped" : true,
    "failed_shards" : [
      {
        "shard" : 0,
        "index" : "i",
        "node" : "a5V1eXcZRYiIk8lecjZ4Jw",
        "reason" : {
          "type" : "general_script_exception",
          "reason" : "Failed to compile inline script [\"aaaaaaaaaaaaaaaa\"] using lang [painless]",
          "caused_by" : {
            "type" : "circuit_breaking_exception",
            "reason" : "[script] Too many dynamic script compilations within one minute, max: [15/min]; please use on-disk, indexed, or scripts with parameters instead",
            "bytes_wanted" : 0,
            "bytes_limit" : 0
          }
        }
      }
    ],
    "caused_by" : {
      "type" : "general_script_exception",
      "reason" : "Failed to compile inline script [\"aaaaaaaaaaaaaaaa\"] using lang [painless]",
      "caused_by" : {
        "type" : "circuit_breaking_exception",
        "reason" : "[script] Too many dynamic script compilations within one minute, max: [15/min]; please use on-disk, indexed, or scripts with parameters instead",
        "bytes_wanted" : 0,
        "bytes_limit" : 0
      }
    }
  },
  "status" : 500
}
```

This also fixes a bug in `ScriptService` where requests being executed
concurrently on a single node could cause a script to be compiled
multiple times (many in the case of a powerful node with many shards)
due to no synchronization between checking the cache and compiling the
script. There is now synchronization so that a script being compiled
will only be compiled once regardless of the number of concurrent
searches on a node.

Relates to #19396
2016-08-09 10:26:27 -06:00
Lee Hinman 1623cff6c0 Merge remote-tracking branch 'dakrone/bucket-circuit-breaker' 2016-07-25 13:37:26 -06:00
Lee Hinman 124a9fabe3 Circuit break on aggregation bucket numbers with request breaker
This adds new circuit breaking with the "request" breaker, which adds
circuit breaks based on the number of buckets created during
aggregations. It consists of incrementing during AggregatorBase creation

This also bumps the REQUEST breaker to 60% of the JVM heap now.

The output when circuit breaking an aggregation looks like:

```json
{
  "shard" : 0,
  "index" : "i",
  "node" : "a5AvjUn_TKeTNYl0FyBW2g",
  "reason" : {
    "type" : "exception",
    "reason" : "java.util.concurrent.ExecutionException: QueryPhaseExecutionException[Query Failed [Failed to execute main query]]; nested: CircuitBreakingException[[request] Data too large, data for [<agg [otherthings]>] would be larger than limit of [104857600/100mb]];",
    "caused_by" : {
      "type" : "execution_exception",
      "reason" : "QueryPhaseExecutionException[Query Failed [Failed to execute main query]]; nested: CircuitBreakingException[[request] Data too large, data for [<agg [myagg]>] would be larger than limit of [104857600/100mb]];",
      "caused_by" : {
        "type" : "circuit_breaking_exception",
        "reason" : "[request] Data too large, data for [<agg [otherthings]>] would be larger than limit of [104857600/100mb]",
        "bytes_wanted" : 104860781,
        "bytes_limit" : 104857600
      }
    }
  }
}
```

Relates to #14046
2016-07-25 11:33:37 -06:00
Colin Goodheart-Smithe b717ad8eb6 Enable option to use request cache for size > 0
Previously if the size of the search request was greater than zero we would not cache the request in the request cache.

This change retains the default behaviour of not caching requests with size > 0 but also allows the `request_cache=true` query parameter
to enable the cache for requests with size > 0
2016-07-18 13:33:59 +01:00
Adrien Grand db9af54ec0 Remove `_timestamp` and `_ttl` on 5.x indices. #18980
This removes the ability to use `_timestamp` and `_ttl` on indices created on
or after 5.0.

Closes #18280
2016-06-22 08:35:54 +02:00
Daniel Mitterdorfer 52b2016447 Limit request size on transport level
With this commit we limit the size of all in-flight requests on
transport level. The size is guarded by a circuit breaker and is
based on the content size of each request.

By default we use 100% of available heap meaning that the parent
circuit breaker will limit the maximum available size. This value
can be changed by adjusting the setting

network.breaker.inflight_requests.limit

Relates #16011
2016-04-13 09:54:59 +02:00
Adrien Grand 0eb1a816c8 Allow the query cache to be disabled. #16268
This replaces the internal `index.queries.cache.type` setting with
a new `index.queries.cache.enabled` setting, which is documented.

Closes #15802
2016-04-11 18:06:16 +02:00
Michael McCandless 3744fb9dc0 merge master 2016-01-06 04:03:42 -05:00
Simon Willnauer f5e4cd4616 Remove recovery threadpools and throttle outgoing recoveries on the master
Today we throttle recoveries only for incoming recoveries. Nodes that have a lot
of primaries can get overloaded due to too many recoveries. To still keep that at bay
we limit the number of threads that are sending files to the target to overcome this problem.

The right solution here is to also throttle the outgoing recoveries that are today unbounded on
the master and don't start the recovery until we have enough resources on both source and target nodes.

The concurrency aspects of the recovery source also added a lot of complexity and additional threadpools
that are hard to configure. This commit removes the concurrent streamns notion completely and sends files
in the thread that drives the recovery simplifying the recovery code considerably.
Outgoing recoveries are not throttled on the master via a allocation decider.
2015-12-22 14:59:43 +01:00
Michael McCandless 319dc8c8ed remove dead code; get one test working again; fix docs; remove nocommits 2015-12-16 16:19:07 -05:00
xuzha f46e66e7d0 Remove the experimental indices.fielddata.cache.expire
closes #10781
2015-09-01 00:40:04 -07:00
Martijn van Groningen a14913f7b6 Left over from the `query_cache` to `request_cache` rename. 2015-07-27 13:28:15 +02:00
Clinton Gormley 2b512f1f29 Docs: Use "js" instead of "json" and "sh" instead of "shell" for source highlighting 2015-07-14 18:14:09 +02:00
Adrien Grand 38f5cc236a Rename caches.
In order to be more consistent with what they do, the query cache has been
renamed to request cache and the filter cache has been renamed to query
cache.

A known issue is that package/logger names do no longer match settings names,
please speak up if you think this is an issue.

Here are the settings for which I kept backward compatibility. Note that they
are a bit different from what was discussed on #11569 but putting `cache` before
the name of what is cached has the benefit of making these settings consistent
with the fielddata cache whose size is configured by
`indices.fielddata.cache.size`:
 * index.cache.query.enable -> index.requests.cache.enable
 * indices.cache.query.size -> indices.requests.cache.size
 * indices.cache.filter.size -> indices.queries.cache.size

Close #11569
2015-06-29 10:15:27 +02:00
Clinton Gormley f123a53d72 Docs: Refactored modules and index modules sections 2015-06-22 23:49:45 +02:00