Commit Graph

62 Commits

Author SHA1 Message Date
Tim Brooks 85fdf959ad
Add configured indexing memory limit to node stats (#60414)
This commit adds the configured memory limit to the node stats API.
2020-07-29 12:28:21 -06:00
David Turner 9450ea08b4 Log and track open/close of transport connections (#60297)
Transport connections between nodes remain in place until one or other
node shuts down or the connection is disrupted by a flaky network.
Today it is very difficult to demonstrate that transient failures and
cluster instability are caused by the network even though this is often
the case. In particular, transport connections open and close without
logging anything, even at `DEBUG` level, making it very hard to quantify
the scale of the problem or to correlate the networking problems with
external events.

This commit adds the missing `DEBUG`-level logging when transport
connections open and close, and also tracks the total number of
transport connections a node has opened as a measure of the stability of
the underlying network.
2020-07-28 17:08:04 +01:00
Tim Brooks ba01540d7e
Implement human readable indexing pressure stats (#60058)
The indexing pressure stats do not currently have human readable
variants. This commit add human readable variants and updates the
documentation.
2020-07-22 12:07:59 -06:00
Tim Brooks ceb54ed655
Add indexing pressure documentation (#59456)
This commit adds documentation about the new indexing pressure memory
limit setting and exposure of this metrics in node stats.
2020-07-22 10:09:18 -06:00
David Turner b75207a09f Remove sporadic min/max usage estimates from stats (#59755)
Today `GET _nodes/stats/fs` includes `{least,most}_usage_estimate`
fields for some nodes. These fields have rather strange semantics. They
are only reported on the elected master and on nodes that have been the
elected master since they were last restarted; when a node stops being
the elected master these stats remain in place but we stop updating them
so they may become arbitrarily stale.

This means that these statistics are pretty meaningless and impossible
to use correctly. Even if they were kept up to date they're never
reported for data-only nodes anyway, despite the fact that data nodes
are the ones where we care most about disk usage. The information needed
to compute the path with the least/most available space is already
provided in the rest the stats output, so we can treat the inclusion of
these stats as a bug and fix it by simply removing them in this commit.
Since these stats were always optional and mostly omitted (for opaque
reasons) this is not considered a breaking change.
2020-07-20 15:22:04 +01:00
David Turner 3a234d2669
Account for remaining recovery in disk allocator (#58800)
Today the disk-based shard allocator accounts for incoming shards by
subtracting the estimated size of the incoming shard from the free space on the
node. This is an overly conservative estimate if the incoming shard has almost
finished its recovery since in that case it is already consuming most of the
disk space it needs.

This change adds to the shard stats a measure of how much larger each store is
expected to grow, computed from the ongoing recovery, and uses this to account
for the disk usage of incoming shards more accurately.

Backport of #58029 to 7.x

* Picky picky

* Missing type
2020-07-01 10:12:44 +01:00
Lisa Cawley db5bf92acf
[7.x][DOCS] Replace docdir attribute with es-repo-dir (#57489) (#57494) 2020-06-01 16:42:53 -07:00
James Rodewig e9c3bfc8e5 [DOCS] Collapse nested objects in node stats API response (#54755)
Replaces dot notation with collapsed nested object formatting
per the [Elastic API reference template][0].

[0]:https://github.com/elastic/docs/blob/master/shared/api-ref-ex.asciidoc
2020-04-06 15:19:54 -04:00
James Rodewig 2fdf6b2f96
[DOCS] Document missing data types for node stats API's response parameters (#53475)
Documents missing data types for several response parameters returned
by the node stats API.

Also adds several missing human-readable parameters returned by the API.
2020-03-25 08:42:58 -04:00
James Rodewig 068181b0b6 [DOCS] Add missing `indices` parms returned by `_nodes/stats` (#52055)
Adds several human-readable `indices` parameters returned by the
`_nodes/stats` API.
2020-02-21 08:15:59 -05:00
James Rodewig 1545c2ab26 [DOCS] Document node stats response meta (#51263)
Documents several metadata-related parameters returned by the
`GET _nodes/stats` API.
2020-02-03 08:33:57 -05:00
James Rodewig b6bf64b969 [DOCS] Collapse node stats response sections (#51063)
elastic/docs#1687 added support for the `[%collapsible]` Asciidoc
attribute, which creates collapsible sections in the HTML output.

This PR makes two related changes to the nodes stats API documentation:

* Makes the response parameter sections collapsible. This allows users
  to more easily navigate the page without long walls of text.

* Reorders the response parameter sections to match the default order
  returned by the API.

Relates to #47524.
2020-01-16 13:19:29 -05:00
James Rodewig a290762df1 [DOCS] Document `breakers`, `script`, and `discovery` node stats (#50509)
Documents the `breakers`, `script`, and `discovery` parameters returned
by the `_nodes/stats` API.
2020-01-14 16:51:50 -05:00
James Rodewig 1cec87c0e6 [DOCS] Document `transport` and `http` node stats (#50473)
Documents the `transport` and `http` parameters returned by the
`_nodes/stats` API.
2019-12-26 07:43:14 -05:00
James Rodewig 73ca2e5826 [DOCS] Document `thread_pool` node stats (#50330) 2019-12-18 17:02:00 -05:00
James Rodewig 249334f38d [DOCS] Document JVM node stats (#49500)
* [DOCS] Document JVM node stats

Documents the `jvm` parameters returned by the `_nodes/stats` API.

Co-Authored-By: James Baiera <james.baiera@gmail.com>
2019-12-12 15:42:18 -05:00
James Rodewig 42e92616f6 [DOCS] Document indices response parameters for node stats API (#47525) 2019-11-12 08:35:35 -05:00
James Rodewig e45b0cd7e3 [DOCS] Sort cluster API docs alphabetically (#48198) 2019-10-22 12:28:39 -05:00
James Rodewig b4bf47cce4 [DOCS] Document ingest pipeline and processor node stats (#47884) 2019-10-17 09:19:45 -04:00
James Rodewig b59ecde041
[DOCS] [2 of 5] Change // CONSOLE comments to [source,console] (#46353) (#46502) 2019-09-09 13:38:14 -04:00
James Rodewig 327da31db4 [DOCS] Reformat index stats API docs (#46322) 2019-09-05 15:45:02 -04:00
James Rodewig ceb8b9bbee Change `{var}` convention to `<var>` (#45904) 2019-08-23 10:57:48 -04:00
István Zoltán Szabó 4ee7ac25ae [DOCS] Reformats cluster node stats API (#45441)
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-08-13 12:48:07 +02:00
James Rodewig a63f60b776 [DOCS] Remove heading offsets for REST APIs (#44568)
Several files in the REST APIs nav section are included using
:leveloffset: tags. This increments headings (h2 -> h3, h3 -> h4, etc.)
in those files and removes the :leveloffset: tags.

Other supporting changes:
* Alphabetizes top-level REST API nav items.
* Change 'indices APIs' heading to 'index APIs.'
* Changes 'Snapshot lifecycle management' heading to sentence case.
2019-07-19 14:36:06 -04:00
Jason Tedor cd5f1b53e8
Remove reference to fs.data.spins in docs
We long ago removed fs.data.spins from the nodes stats. This commit
removes reference to this in the docs.
2019-05-10 11:49:01 -04:00
Josh Soref edb48321ba [DOCS] Various spelling corrections (#37046) 2019-01-07 14:44:12 +01:00
Jason Tedor 0045111ce2 Deprecate the suggest metrics (#29627)
The suggest stats were folded into the search stats as part of the
indices stats API in 5.0.0. However, the suggest metric remained as a
synonym for the search metric for BWC reasons. This commit deprecates
usage of the suggest metric on the indices stats API.

Similarly, due to the changes to fold the suggest stats into the search
stats, requesting the suggest index metric on the indices metric on the
nodes stats API has produced an empty object as the response since
5.0.0. This commit deprecates this index metric on the indices metric on
the nodes stats API.
2018-04-20 09:47:38 -04:00
Jason Tedor 105dcb544c
Enable selecting adaptive selection stats
The node stats API enables filtlering the top-level stats for only
desired top-level stats. Yet, this was never enabled for adaptive
replica selection stats. This commit enables this. We also add setting
these stats on the request builder, and fix an inconsistent name in a
setter.

Relates #28721
2018-02-19 16:56:36 -05:00
David Roberts a292740b9e Add cgroup memory usage/limit to OS stats on Linux (#26166)
This change adds cgroup memory usage/limit to the OS stats section of
the node stats on Linux.  This information is useful because in Docker
containers the standard node stats report the host memory limit, not
taking account of extra restrictions that may have been applied to the
container.

The original idea was to store these values as Long, truncating any values
outside the range of long.  However, this meant that in the relatively common
case of no limit being applied, users would not see the same value in the OS
stats as they see by querying Linux directly.  So instead the values are stored
as String.  This change places a burden on consumers of the strings to
convert the strings to numbers and decide what to do about extremely large
values, but there will be very few consumers and they would need to have a
policy for dealing with "no limit" in any case.
2017-10-03 12:08:36 +01:00
Peter Dyson 1f9e0fd0dd [Docs] improved description for fs.total.available_in_bytes (#26657) 2017-09-18 16:56:19 +10:00
Clinton Gormley 1b0c93b07c Documented the level parameter to nodes stats
Closes #24999
2017-06-01 12:11:21 +02:00
Jason Tedor 41ffb008ad Fix doc bug for cgroup cpuacct usage metric
This commit fixes a silly doc bug where the field that represents the
total CPU time consumed by all tasks in the same cgroup was mistakenly
reported as "usage" instead of "usage_nanos".

Relates #21029
2016-12-15 23:22:54 -05:00
Clinton Gormley cfabc95f59 Fixed bad asciidoc ID in node stats 2016-11-15 17:39:15 +00:00
Jason Tedor f5ac0e5076 Remove lenient stats parsing
Today when parsing a stats request, Elasticsearch silently ignores
incorrect metrics. This commit removes lenient parsing of stats requests
for the nodes stats and indices stats APIs.

Relates #21417
2016-11-15 12:17:26 -05:00
Jason Tedor aec09a76d6 Clarify requesting all stats in node stats docs
This commit clarifies how to explicitly obtain all stats from the node
stats API.
2016-11-08 13:47:15 -05:00
Jason Tedor 900ee0536e Strengthen handling of unavailable cgroup stats
On some systems, cgroups will be available but not configured. And in
some cases, cgroups will be configured, but not for the subsystems that
we are expecting (e.g., cpu and cpuacct). This commit strengthens the
handling of cgroup stats on such systems.

Relates #21094
2016-10-24 16:36:51 -04:00
Jason Tedor 3d642ab0eb Add basic cgroup CPU metrics
This commit adds basic cgroup CPU metrics to the node stats API.

Relates #21029
2016-10-24 08:26:56 -04:00
Jason Tedor ecce53f0df Add I/O statistics on Linux
This commit adds a variety of real disk metrics for the block devices
that back Elasticsearch data paths. A collection of statistics are read
from /proc/diskstats and are used to report the raw metrics for
operations and read/write bytes.

Relates #15915
2016-05-17 16:16:39 -04:00
Martijn van Groningen 2fa33d5c47 Added ingest statistics to node stats API
The ingest stats include the following statistics:
* `ingest.total.count`- The total number of document ingested during the lifetime of this node
* `ingest.total.time_in_millis` - The total time spent on ingest preprocessing documents during the lifetime of this node
* `ingest.total.current` - The total number of documents currently being ingested.
* `ingest.total.failed` - The total number ingest preprocessing operations failed during the lifetime of this node

Also these stats are returned on a per pipeline basis.
2016-03-10 13:21:43 +01:00
Jason Tedor df598e8129 Modify load average formats
This commit modifies the load_average in the node stats API response
to be an object containing the one-minute, five-minute and
fifteen-minute load averages as fields (if those values are
available). Additionally, this commit modifies the cat nodes API
response to format the one-minute, five-minute and fifteen-minute load
averages as null if any of the respective values are not available.
2016-01-18 11:41:34 -05:00
Jason Tedor 1de2081ed3 Reintroduce five-minute and fifteen-minute load averages on Linux
This commit reintroduces the five-minute and fifteen-minute load stats
on Linux, and changes the format of the load_average field back to an
array.
2016-01-11 23:42:47 -05:00
Jason Tedor 6872d545ac Add system CPU percent to OS stats
This commit adds the system CPU percent reflecting the recent CPU usage
for the whole system.
2015-11-17 13:48:46 -05:00
xuzha 97ecd7bf5a Expose pending cluster state queue size in node stats
Add 3 stats about the queue: total queue size, number of committed cluster
states, and number of pending cluster states.
2015-10-28 10:59:15 -07:00
Tanguy Leroux db7aecab4d update list of available os stats
os cpu information is no longer exposed through the nodes stats api
2015-08-31 17:03:45 +02:00
Tanguy Leroux 8e052f0da2 Make platform specific assumptions in OS & Process probes tests 2015-08-17 14:47:23 +02:00
Tanguy Leroux 03c327ff12 Expose ClassloadingMXBean in Node Stats
Closes #12738
2015-08-12 14:29:13 +02:00
Tanguy Leroux 19e348a82c Update OS stats 2015-07-08 17:48:10 +02:00
Tanguy Leroux 1c5d8efd47 Process Stats: remove sigar specific stats from APIs and add JMX implementation 2015-07-08 15:12:45 +02:00
Tanguy Leroux 26fd4ba95b Docs: fix wrong title level 2015-07-08 09:29:21 +02:00
Tanguy Leroux fbcf4dbbf7 FS Stats: remove sigar specific stats from APIs:
- fs.*.disk_reads
- fs.*.disk_writes
- fs.*.disk_io_op
- fs.*.disk_read_size_in_bytes
- fs.*.disk_write_size_in_bytes
- fs.*.disk_io_size_in_bytes
- fs.*.disk_queue
- fs.*.disk_service_time
2015-07-07 22:16:39 +02:00