mirror of https://github.com/apache/lucene.git
SOLR-11779: Ref Guide: minor typos; capitalize section titles; remove monospace from section titles
This commit is contained in:
parent
f1ce5bb22a
commit
7773bf6764
|
@ -16,11 +16,10 @@
|
||||||
// specific language governing permissions and limitations
|
// specific language governing permissions and limitations
|
||||||
// under the License.
|
// under the License.
|
||||||
|
|
||||||
== Design
|
|
||||||
=== Round-robin databases
|
|
||||||
Solr collects long-term history of certain key metrics both in SolrCloud and in standalone mode.
|
Solr collects long-term history of certain key metrics both in SolrCloud and in standalone mode.
|
||||||
|
|
||||||
This information can be used for very simple monitoring and troubleshooting, but also some
|
This information can be used for very simple monitoring and troubleshooting, but also some
|
||||||
Solr Cloud components (e.g., autoscaling) can use this data for making informed decisions based on
|
SolrCloud components (e.g., autoscaling) can use this data for making informed decisions based on
|
||||||
long-term trends of selected metrics.
|
long-term trends of selected metrics.
|
||||||
|
|
||||||
[IMPORTANT]
|
[IMPORTANT]
|
||||||
|
@ -30,7 +29,13 @@ is absent then metrics history will still be collected and kept in memory but it
|
||||||
on node restart.
|
on node restart.
|
||||||
====
|
====
|
||||||
|
|
||||||
This data is maintained as multi-resolution time series, with a fixed total number of data points
|
== Design
|
||||||
|
|
||||||
|
Before discussing how to configure metrics storage, a bit of explanation about how it works may be helpful.
|
||||||
|
|
||||||
|
=== Round-Robin Databases
|
||||||
|
|
||||||
|
The metrics history data is maintained as multi-resolution time series, with a fixed total number of data points
|
||||||
per metric history (a fixed size window). Multi-resolution refers to the fact that data from the most detailed
|
per metric history (a fixed size window). Multi-resolution refers to the fact that data from the most detailed
|
||||||
time series is periodically resampled to create coarser-grained time series, which in turn
|
time series is periodically resampled to create coarser-grained time series, which in turn
|
||||||
are periodically resampled again to build even coarser-grained series.
|
are periodically resampled again to build even coarser-grained series.
|
||||||
|
@ -47,28 +52,28 @@ time series are built:
|
||||||
This means that the total number of samples in all data series is constant, and consequently
|
This means that the total number of samples in all data series is constant, and consequently
|
||||||
the size of this data structure is also constant (because the size of the moving window is fixed, and
|
the size of this data structure is also constant (because the size of the moving window is fixed, and
|
||||||
older samples are replaced by newer ones). This arrangement is referred to as a
|
older samples are replaced by newer ones). This arrangement is referred to as a
|
||||||
round-robin database, and Solr uses implementation of this concept provided by RRD4j library.
|
round-robin database, and Solr uses implementation of this concept provided by the https://github.com/rrd4j/rrd4j[RRD4j] library.
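
The fixed-size window and periodic resampling described above can be illustrated with a minimal sketch. This is not RRD4j's implementation: the class `RoundRobinSeries`, the helper `feed`, and the row counts are hypothetical and chosen small for readability. It only shows why the total number of stored samples stays constant.

```python
from collections import deque


class RoundRobinSeries:
    """Fixed-size series; every `steps` inputs are averaged into one stored row."""

    def __init__(self, rows, steps=1):
        self.rows = deque(maxlen=rows)  # the oldest row is dropped automatically
        self.steps = steps
        self._pending = []

    def add(self, value):
        """Buffer a value; return the consolidated row once `steps` values arrived."""
        self._pending.append(value)
        if len(self._pending) < self.steps:
            return None
        row = sum(self._pending) / len(self._pending)
        self._pending.clear()
        self.rows.append(row)
        return row


def feed(series_chain, value):
    """Push a value into the finest series and cascade new rows to coarser ones."""
    for series in series_chain:
        value = series.add(value)
        if value is None:
            break


# Hypothetical sizes, for illustration only.
fine = RoundRobinSeries(rows=5)             # one row per sample
coarse = RoundRobinSeries(rows=3, steps=2)  # one row per 2 fine rows
for sample in range(1, 11):
    feed([fine, coarse], sample)

print(list(fine.rows))
print(list(coarse.rows))
```

However many samples are fed in, the two series never hold more than 5 + 3 rows in total, which is the property that keeps the database size constant.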
|
||||||
|
|
||||||
=== Storage
|
=== Storage
|
||||||
Databases created with RRD4j are compact - for the time series specified above the total size
|
Databases created with RRD4j are compact - for the time series specified above the total size
|
||||||
of data is ca. 11kB for each of the primary time series, including its resampled data. Each database may contain
|
of data is around 11kB for each of the primary time series, including its resampled data. Each database may contain
|
||||||
several primary time series ("datasources" in RRD4j parlance) and their re-sampled versions (called
|
several primary time series ("datasources" in RRD4j parlance) and their re-sampled versions (called
|
||||||
"archives").
|
"archives").
|
||||||
|
|
||||||
This data is updated in memory and then periodically stored in the `.system`
|
This data is updated in memory and then periodically stored in the `.system`
|
||||||
collection in the form of Solr documents with a binary `data_bin` field, each document
|
collection in the form of Solr documents with a binary `data_bin` field, each document
|
||||||
containing data of one full database. This method of storage is much more compact and generates less
|
containing data of one full database. This method of storage is much more compact and generates less
|
||||||
update operations than storing each data point in a separate Solr document. Metrics history API allows retrieving
|
update operations than storing each data point in a separate Solr document. The Metrics History API allows retrieving
|
||||||
detailed data from each database, including retrieval of all individual datapoints.
|
detailed data from each database, including retrieval of all individual datapoints.
|
||||||
|
|
||||||
Databases are identified primarily by their corresponding metric registry name, so for databases that
|
Databases are identified primarily by their corresponding metric registry name, so for databases that
|
||||||
keep track of aggregated metrics this will be e.g., `solr.jvm`, `solr.node`, `solr.collection.gettingstarted`.
|
keep track of aggregated metrics this will be e.g., `solr.jvm`, `solr.node`, `solr.collection.gettingstarted`.
|
||||||
For databases with non-aggregated metrics the name consists of the registry name, optionally with a node name
|
For databases with non-aggregated metrics the name consists of the registry name, optionally with a node name
|
||||||
to identify databases with the same name coming from different nodes. For example, per-node databases are
|
to identify databases with the same name coming from different nodes. For example, per-node databases are
|
||||||
name like this: `solr.jvm.localhost:8983_solr`, `solr.node.localhost:7574_solr`, but per-replica names are
|
named like this: `solr.jvm.localhost:8983_solr`, `solr.node.localhost:7574_solr`, but per-replica names are
|
||||||
already unique across the cluster so they are named like this: `solr.core.gettingstarted.shard1.replica_n1`.
|
already unique across the cluster so they are named like this: `solr.core.gettingstarted.shard1.replica_n1`.
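
The naming scheme above can be summarized in a short sketch. The function `database_name` is hypothetical and not part of any Solr API; it only mirrors the examples given in the text, assuming the node name is appended after a dot.

```python
def database_name(registry, node_name=None):
    """Return the registry name, with the node name appended when one is given."""
    if node_name is None:
        return registry
    return f"{registry}.{node_name}"


print(database_name("solr.jvm"))
print(database_name("solr.jvm", "localhost:8983_solr"))
print(database_name("solr.core.gettingstarted.shard1.replica_n1"))
```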
|
||||||
|
|
||||||
=== Collected metrics
|
=== Collected Metrics
|
||||||
Currently the following selected metrics are tracked:
|
Currently the following selected metrics are tracked:
|
||||||
|
|
||||||
* Non-aggregated `solr.core` and aggregated `solr.collection` metrics:
|
* Non-aggregated `solr.core` and aggregated `solr.collection` metrics:
|
||||||
|
@ -114,10 +119,10 @@ the call from originating node to the current Overseer leader.
|
||||||
|
|
||||||
The handler assumes that a simple aggregation (sum of partial metric values from each resource) is
|
The handler assumes that a simple aggregation (sum of partial metric values from each resource) is
|
||||||
sufficient. This happens to make sense for the default built-in sets of metrics. Future extensions will
|
sufficient. This happens to make sense for the default built-in sets of metrics. Future extensions will
|
||||||
provide other aggregation strategies (average, max, min, ...).
|
provide other aggregation strategies (such as, average, max, min, etc.).
|
||||||
|
|
||||||
== Metrics History Configuration
|
== Metrics History Configuration
|
||||||
There are two mechanisms for configuring this subsystem:
|
There are two ways to configure this subsystem:
|
||||||
|
|
||||||
* `/clusterprops.json` - this is the primary mechanism. It uses the cluster properties JSON
|
* `/clusterprops.json` - this is the primary mechanism. It uses the cluster properties JSON
|
||||||
file in ZooKeeper. Configuration is stored in the `/metrics/history` element in a JSON map.
|
file in ZooKeeper. Configuration is stored in the `/metrics/history` element in a JSON map.
|
||||||
|
@ -128,43 +133,42 @@ with the existing metrics configuration section in this file. Configuration is s
|
||||||
|
|
||||||
Currently the following configuration options are supported:
|
Currently the following configuration options are supported:
|
||||||
|
|
||||||
`enable`:: boolean, default is true. If this if false then metrics history is not collected
|
`enable`:: boolean, default is `true`. If this is `false` then metrics history is not collected
|
||||||
but can still be retrieved from existing databases. When this is true then metrics are
|
but can still be retrieved from existing databases. When this is `true` then metrics are
|
||||||
periodically collected, aggregated and saved.
|
periodically collected, aggregated and saved.
|
||||||
|
|
||||||
`enableReplicas`:: boolean, default is false. When this is true non-aggregated history will be
|
`enableReplicas`:: boolean, default is `false`. When this is `true` non-aggregated history will be
|
||||||
collected for each replica in each collection. When this is false then only aggregated history
|
collected for each replica in each collection. When this is `false` then only aggregated history
|
||||||
is collected for each collection.
|
is collected for each collection.
|
||||||
|
|
||||||
`enableNodes`:: boolean, default is false. When this is true then non-aggregated history will be
|
`enableNodes`:: boolean, default is `false`. When this is `true` then non-aggregated history will be
|
||||||
collected separately for each node (for node and JVM metrics), with database names consisting of
|
collected separately for each node (for node and JVM metrics), with database names consisting of
|
||||||
base registry name with appended node name, e.g., `solr.jvm.localhost:8983_solr`. When this is false
|
base registry name with appended node name, e.g., `solr.jvm.localhost:8983_solr`. When this is `false`
|
||||||
then only aggregated history will be collected in a single `solr.jvm` and `solr.node` cluster-wide
|
then only aggregated history will be collected in a single `solr.jvm` and `solr.node` cluster-wide
|
||||||
databases.
|
databases.
|
||||||
|
|
||||||
`collectPeriod`:: integer, in seconds, default is 60. Metrics values will be collected and respective
|
`collectPeriod`:: integer, in seconds, default is `60`. Metrics values will be collected and respective
|
||||||
databases updated every `collectPeriod` seconds.
|
databases updated every `collectPeriod` seconds.
|
||||||
|
+
|
||||||
[IMPORTANT]
|
[IMPORTANT]
|
||||||
====
|
====
|
||||||
Value of `collectPeriod` must be at least 1, and if it's changed then all previously existing databases
|
Value of `collectPeriod` must be at least 1, and if it's changed then all previously existing databases
|
||||||
with their historic data must be manually removed (new databases will be created automatically).
|
with their historic data must be manually removed (new databases will be created automatically).
|
||||||
====
|
====
|
||||||
|
|
||||||
`syncPeriod`:: integer, in seconds, default is 60. Data from modified databases will be saved to Solr
|
`syncPeriod`:: integer, in seconds, default is `60`. Data from modified databases will be saved to Solr
|
||||||
every `syncPeriod` seconds. When accessing the databases via REST API in `index` mode the visibility of
|
every `syncPeriod` seconds. When accessing the databases via REST API in `index` mode the visibility of
|
||||||
most recent data depends on this period, because requests accessing the data from other nodes see only
|
most recent data depends on this period, because requests accessing the data from other nodes see only
|
||||||
the version of the data that is stored in the `.system` collection.
|
the version of the data that is stored in the `.system` collection.
|
||||||
|
|
||||||
=== Example configuration
|
=== Example Configuration
|
||||||
Example `/clusterprops.json` file with metrics history configuration that turns on the collection of
|
Example `/clusterprops.json` file with metrics history configuration that turns on the collection of
|
||||||
per-node metrics history for node and JVM metrics. Note: typically this file will also contain other
|
per-node metrics history for node and JVM metrics. Typically this file will also contain other
|
||||||
properties unrelated to metrics history API.
|
properties unrelated to Metrics History API.
|
||||||
|
|
||||||
[source,json]
|
[source,json]
|
||||||
----
|
----
|
||||||
{
|
{
|
||||||
...
|
|
||||||
"metrics" : {
|
"metrics" : {
|
||||||
"history" : {
|
"history" : {
|
||||||
"enable" : true,
|
"enable" : true,
|
||||||
|
@ -172,7 +176,6 @@ properties unrelated to metrics history API.
|
||||||
"syncPeriod" : 300
|
"syncPeriod" : 300
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
...
|
|
||||||
}
|
}
|
||||||
----
|
----
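
As a rough illustration of how the options and defaults listed earlier combine, the following sketch merges a `/clusterprops.json` document with the documented defaults and checks the `collectPeriod` constraint. The function `effective_history_config` is hypothetical, and the sample document is only modeled on the example above (its `enableNodes` value is an assumption based on the description); Solr's own parsing may differ.

```python
import json

# Defaults as documented in the option list above.
DEFAULTS = {
    "enable": True,
    "enableReplicas": False,
    "enableNodes": False,
    "collectPeriod": 60,
    "syncPeriod": 60,
}


def effective_history_config(clusterprops_text):
    """Merge the metrics/history element of clusterprops with the defaults."""
    props = json.loads(clusterprops_text)
    overrides = props.get("metrics", {}).get("history", {})
    config = {**DEFAULTS, **overrides}
    if config["collectPeriod"] < 1:
        raise ValueError("collectPeriod must be at least 1")
    return config


# Sample modeled on the example; "enableNodes" is assumed, not copied.
sample = '{"metrics": {"history": {"enable": true, "enableNodes": true, "syncPeriod": 300}}}'
print(effective_history_config(sample))
```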
|
||||||
|
|
||||||
|
@ -186,11 +189,14 @@ required parameter `action`.
|
||||||
All responses contain a section named `state`, which reports the current internal state of the API:
|
All responses contain a section named `state`, which reports the current internal state of the API:
|
||||||
|
|
||||||
`enableReplicas`:: boolean, corresponds to the `enableReplicas` configuration setting.
|
`enableReplicas`:: boolean, corresponds to the `enableReplicas` configuration setting.
|
||||||
|
|
||||||
`enableNodes`:: boolean, corresponds to the `enableNodes` configuration setting.
|
`enableNodes`:: boolean, corresponds to the `enableNodes` configuration setting.
|
||||||
|
|
||||||
`mode`:: one of the following values:
|
`mode`:: one of the following values:
|
||||||
|
|
||||||
* `inactive` - when metrics collection is disabled (but access to existing metrics history is still available).
|
* `inactive` - when metrics collection is disabled (but access to existing metrics history is still available).
|
||||||
* `memory` - when metrics history is kept only in memory because `.system` collection doesn't exist. In this mode
|
* `memory` - when metrics history is kept only in memory because `.system` collection doesn't exist. In this mode
|
||||||
clients can access metrics history available on the node that received the reuqest and on the Overseer leader.
|
clients can access metrics history available on the node that received the request and on the Overseer leader.
|
||||||
* `index` - when metrics history is periodically stored in the `.system` collection. Data available in memory on
|
* `index` - when metrics history is periodically stored in the `.system` collection. Data available in memory on
|
||||||
the node that accepted the request is retrieved from memory, any other data is retrieved from the
|
the node that accepted the request is retrieved from memory, any other data is retrieved from the
|
||||||
`.system` collection (so it's at least `syncPeriod` old).
|
`.system` collection (so it's at least `syncPeriod` old).
|
||||||
|
@ -198,14 +204,15 @@ the node that accepted the request is retrieved from memory, any other data is r
|
||||||
Also, the response header section (`responseHeader`) contains `zkConnected` boolean property that indicates
|
Also, the response header section (`responseHeader`) contains `zkConnected` boolean property that indicates
|
||||||
whether the current node is a part of SolrCloud cluster.
|
whether the current node is a part of SolrCloud cluster.
|
||||||
|
|
||||||
=== List databases (`action=list`)
|
=== List Databases
|
||||||
This call produces a list of available databases. It supports the following parameters:
|
The query parameter `action=list` produces a list of available databases. It supports the following parameters:
|
||||||
|
|
||||||
`rows`:: optional integer, default is 500. Maximum number of results to return.
|
`rows`:: optional integer, default is `500`. Maximum number of results to return.
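
A request URL for this API can be assembled programmatically. The sketch below uses only the Python standard library; the helper `history_url` is hypothetical, and the host and port are taken from the examples in this document.

```python
from urllib.parse import urlencode


def history_url(base_url, action, **params):
    """Build a Metrics History API URL with the required `action` parameter."""
    query = urlencode({"action": action, **params})
    return f"{base_url}/admin/metrics/history?{query}"


print(history_url("http://localhost:7574/solr", "list", rows=10))
```

When such a URL is pasted into a shell command, it should be quoted so that `&` is not interpreted by the shell.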
|
||||||
|
|
||||||
Example:
|
Example:
|
||||||
In this SolrCloud example the API is in `memory` mode, and the request was made to a node that is
|
In this SolrCloud example the API is in `memory` mode, and the request was made to a node that is
|
||||||
not Overseer leader. The API transparently forwarded the request to Overseer leader.
|
not Overseer leader. The API transparently forwarded the request to Overseer leader.
|
||||||
|
|
||||||
[source,bash]
|
[source,bash]
|
||||||
----
|
----
|
||||||
curl "http://localhost:7574/solr/admin/metrics/history?action=list&rows=10"
|
curl "http://localhost:7574/solr/admin/metrics/history?action=list&rows=10"
|
||||||
|
@ -252,12 +259,12 @@ received the request (because the data is retrieved from the `.system` collectio
|
||||||
Each section also contains a `lastModified` element, which contains the last modification time when the
|
Each section also contains a `lastModified` element, which contains the last modification time when the
|
||||||
database was updated. All timestamps returned from this API correspond to Unix epoch time in seconds.
|
database was updated. All timestamps returned from this API correspond to Unix epoch time in seconds.
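
Because timestamps are Unix epoch seconds, a client can convert them to readable dates with standard tools. A minimal sketch follows; the helper name `epoch_seconds_to_utc` is hypothetical, and the sample value is a `lastModified` value of the kind this API returns.

```python
from datetime import datetime, timezone


def epoch_seconds_to_utc(timestamp):
    """Convert Unix epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


print(epoch_seconds_to_utc(1528318606).isoformat())
```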
|
||||||
|
|
||||||
=== Database status (`action=status`)
|
=== Database Status
|
||||||
This call provides detailed status of the selected database.
|
The query parameter `action=status` provides detailed status of the selected database.
|
||||||
|
|
||||||
The following parameters are supported:
|
The following parameters are supported:
|
||||||
|
|
||||||
`name`:: string, required: database name
|
`name`:: string, required: database name.
|
||||||
|
|
||||||
Example:
|
Example:
|
||||||
[source,bash]
|
[source,bash]
|
||||||
|
@ -295,7 +302,7 @@ curl http://localhost:7574/solr/admin/metrics/history?action=status&name=solr.co
|
||||||
"datasource": "DS:numReplicas:GAUGE:120:U:U",
|
"datasource": "DS:numReplicas:GAUGE:120:U:U",
|
||||||
"lastValue": 4
|
"lastValue": 4
|
||||||
},
|
},
|
||||||
...
|
"..."
|
||||||
],
|
],
|
||||||
"archives": [
|
"archives": [
|
||||||
{
|
{
|
||||||
|
@ -316,7 +323,7 @@ curl http://localhost:7574/solr/admin/metrics/history?action=status&name=solr.co
|
||||||
"endTime": 1528318200,
|
"endTime": 1528318200,
|
||||||
"rows": 288
|
"rows": 288
|
||||||
},
|
},
|
||||||
...
|
"..."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
"node": "127.0.0.1:7574_solr"
|
"node": "127.0.0.1:7574_solr"
|
||||||
|
@ -330,12 +337,12 @@ curl http://localhost:7574/solr/admin/metrics/history?action=status&name=solr.co
|
||||||
}
|
}
|
||||||
----
|
----
|
||||||
|
|
||||||
=== Get database data (`action=get`)
|
=== Get Database Data
|
||||||
This call retrieves all data collected in the specified database.
|
The query parameter `action=get` retrieves all data collected in the specified database.
|
||||||
|
|
||||||
The following parameters are supported:
|
The following parameters are supported:
|
||||||
|
|
||||||
`name`:: string, required: database name
|
`name`:: string, required: database name.
|
||||||
`format`:: string, optional, default is `list`. Format of the data. Currently the
|
`format`:: string, optional, default is `list`. Format of the data. Currently the
|
||||||
following formats are supported:
|
following formats are supported:
|
||||||
|
|
||||||
|
@ -369,27 +376,26 @@ curl http://localhost:8983/solr/admin/metrics/history?action=get&name=solr.colle
|
||||||
"timestamps": [
|
"timestamps": [
|
||||||
1528304160,
|
1528304160,
|
||||||
1528304220,
|
1528304220,
|
||||||
...
|
"..."
|
||||||
],
|
],
|
||||||
"values": {
|
"values": {
|
||||||
"numShards": [
|
"numShards": [
|
||||||
"NaN",
|
"NaN",
|
||||||
2.0,
|
2.0,
|
||||||
...
|
"..."
|
||||||
],
|
],
|
||||||
"numReplicas": [
|
"numReplicas": [
|
||||||
"NaN",
|
"NaN",
|
||||||
4.0,
|
4.0,
|
||||||
...
|
"..."
|
||||||
],
|
],
|
||||||
...
|
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"RRA:AVERAGE:0.5:10:288": {
|
"RRA:AVERAGE:0.5:10:288": {
|
||||||
"timestamps": [
|
"timestamps": [
|
||||||
1528145400,
|
1528145400,
|
||||||
1528146000,
|
1528146000,
|
||||||
...
|
],
|
||||||
"lastModified": 1528318606,
|
"lastModified": 1528318606,
|
||||||
"node": "127.0.0.1:8983_solr"
|
"node": "127.0.0.1:8983_solr"
|
||||||
}
|
}
|
||||||
|
@ -398,8 +404,7 @@ curl http://localhost:8983/solr/admin/metrics/history?action=get&name=solr.colle
|
||||||
"enableReplicas": false,
|
"enableReplicas": false,
|
||||||
"enableNodes": false,
|
"enableNodes": false,
|
||||||
"mode": "index"
|
"mode": "index"
|
||||||
}
|
}}}}
|
||||||
}
|
|
||||||
----
|
----
|
||||||
|
|
||||||
This is the output when using the `string` format:
|
This is the output when using the `string` format:
|
||||||
|
@ -424,11 +429,12 @@ curl http://localhost:8983/solr/admin/metrics/history?action=get&name=solr.colle
|
||||||
"numShards": "NaN\n2.0\n2.0\n2.0\n2.0\n2.0\n2.0\n...",
|
"numShards": "NaN\n2.0\n2.0\n2.0\n2.0\n2.0\n2.0\n...",
|
||||||
"numReplicas": "NaN\n4.0\n4.0\n4.0\n4.0\n4.0\n4.0\n...",
|
"numReplicas": "NaN\n4.0\n4.0\n4.0\n4.0\n4.0\n4.0\n...",
|
||||||
"QUERY./select.requests": "NaN\n123\n456\n789\n...",
|
"QUERY./select.requests": "NaN\n123\n456\n789\n...",
|
||||||
...
|
"..."
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"RRA:AVERAGE:0.5:10:288": {
|
"RRA:AVERAGE:0.5:10:288": {
|
||||||
...
|
"..."
|
||||||
|
}}}}}
|
||||||
----
|
----
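
In the `string` output shown above, each series is a newline-separated list of values, with `NaN` marking missing samples. A client might turn such a string into numbers as sketched below; the helper `parse_string_series` is hypothetical, and the input is a shortened sample without the elided part.

```python
import math


def parse_string_series(text):
    """Split a newline-separated series into floats; "NaN" becomes float('nan')."""
    return [float(token) for token in text.split("\n") if token]


values = parse_string_series("NaN\n2.0\n2.0")
print(values)
print([v for v in values if not math.isnan(v)])
```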
|
||||||
|
|
||||||
This is the output when using the `graph` format:
|
This is the output when using the `graph` format:
|
||||||
|
@ -452,25 +458,23 @@ curl http://localhost:8983/solr/admin/metrics/history?action=get&name=solr.colle
|
||||||
"numShards": "iVBORw0KGgoAAAANSUhEUgAAAkQAAA...",
|
"numShards": "iVBORw0KGgoAAAANSUhEUgAAAkQAAA...",
|
||||||
"numReplicas": "iVBORw0KGgoAAAANSUhEUgAAAkQA...",
|
"numReplicas": "iVBORw0KGgoAAAANSUhEUgAAAkQA...",
|
||||||
"QUERY./select.requests": "iVBORw0KGgoAAAANS...",
|
"QUERY./select.requests": "iVBORw0KGgoAAAANS...",
|
||||||
...
|
"..."
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"RRA:AVERAGE:0.5:10:288": {
|
"RRA:AVERAGE:0.5:10:288": {
|
||||||
"values": {
|
"values": {
|
||||||
"numShards": "iVBORw0KGgoAAAANSUhEUgAAAkQAAA...",
|
"numShards": "iVBORw0KGgoAAAANSUhEUgAAAkQAAA...",
|
||||||
...
|
"..."
|
||||||
},
|
}
|
||||||
...
|
}}}}}
|
||||||
----
|
----
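
The sample values above begin with `iVBORw0KGgo`, which is the Base64 encoding of the PNG file signature. Assuming each value is a complete Base64-encoded PNG image, a client could sanity-check a value before writing it to a file. The helper `looks_like_png` is hypothetical, and the input below is only the encoded signature, not a full image.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def looks_like_png(encoded):
    """Decode a Base64 string and check that it starts with the PNG signature."""
    raw = base64.b64decode(encoded)
    return raw.startswith(PNG_SIGNATURE)


print(looks_like_png("iVBORw0KGgo="))
```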
|
||||||
|
|
||||||
.Example 60 sec resolution history graph for `QUERY./select.requests` metric
|
.Example 60 sec resolution history graph for `QUERY./select.requests` metric
|
||||||
image::images/metrics-history/query-graph-60s.png[image]
|
image::images/metrics-history/query-graph-60s.png[image]
|
||||||
|
|
||||||
|
|
||||||
.Example 10 min resolution history graph for `QUERY./select.requests` metric
|
.Example 10 min resolution history graph for `QUERY./select.requests` metric
|
||||||
image::images/metrics-history/query-graph-10min.png[image]
|
image::images/metrics-history/query-graph-10min.png[image]
|
||||||
|
|
||||||
|
|
||||||
.Example 60 sec resolution history graph for `UPDATE./update.requests` metric
|
.Example 60 sec resolution history graph for `UPDATE./update.requests` metric
|
||||||
image::images/metrics-history/update-graph-60s.png[image]
|
image::images/metrics-history/update-graph-60s.png[image]
|
||||||
|
|
||||||
|
@ -478,4 +482,4 @@ image::images/metrics-history/update-graph-60s.png[image]
|
||||||
image::images/metrics-history/memHeap-60s.png[image]
|
image::images/metrics-history/memHeap-60s.png[image]
|
||||||
|
|
||||||
.Example 60 sec resolution history graph for `os.systemLoadAverage` metric
|
.Example 60 sec resolution history graph for `os.systemLoadAverage` metric
|
||||||
image::images/metrics-history/loadAvg-60s.png[image]
|
image::images/metrics-history/loadAvg-60s.png[image]
|
||||||
|
|