mirror of https://github.com/honeymoose/OpenSearch.git
synced 2025-02-25 14:26:27 +00:00
[DOCS] Add ML Getting Started job analysis pages (elastic/x-pack-elasticsearch#1185)
* [DOCS] ML getting started file extraction
* [DOCS] ML Getting Started exploring job results

Original commit: elastic/x-pack-elasticsearch@7b46e7beb3
This commit is contained in:
parent 918f4fb962
commit ee612a3dd8
@@ -16,7 +16,6 @@ tutorial shows you how to:
 * Create a {ml} job
 * Use the results to identify possible anomalies in the data
 
-{nbsp}
 
 At the end of this tutorial, you should have a good idea of what {ml} is and
 will hopefully be inspired to use it to detect anomalies in your own data.
@@ -155,12 +154,13 @@ available publicly on https://github.com/elastic/examples
 //Download this data set by clicking here:
 //See https://download.elastic.co/demos/kibana/gettingstarted/shakespeare.json[shakespeare.json].
 
-////
 Use the following commands to extract the files:
 
 [source,shell]
-gzip -d transactions.ndjson.gz
-////
+----------------------------------
+tar xvf server_metrics.tar.gz
+----------------------------------
 
 Each document in the server-metrics data set has the following schema:
 
 [source,js]
@@ -191,7 +191,12 @@ and specify a field's characteristics, such as the field's searchability or
 whether or not it's _tokenized_, or broken up into separate words.
 
 The sample data includes an `upload_server-metrics.sh` script, which you can use
-to create the mappings and load the data set. The script runs a command similar
+to create the mappings and load the data set. Before you run it, however, you
+must edit the USERNAME and PASSWORD variables with your actual user ID and
+password. If you want to test adding data to an existing data feed, you must
+also comment out the final two commands related to `server-metrics_4.json`.
+
+The script runs a command similar
 to the following example, which sets up a mapping for the data set:
 
 [source,shell]
@@ -247,8 +252,7 @@ http://localhost:9200/server-metrics -d '{
 ----------------------------------
 
 NOTE: If you run this command, you must replace `elasticpassword` with your
-actual password. Likewise, if you use the `upload_server-metrics.sh` script,
-you must edit the USERNAME and PASSWORD variables before you run it.
+actual password.
 
 ////
 This mapping specifies the following qualities for the data set:
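The hunks above move the USERNAME and PASSWORD instructions next to the script itself. As a minimal sketch of that editing step (the variable layout inside `upload_server-metrics.sh` is an assumption, and `elastic`/`elasticpassword` are placeholders, not real credentials):

[source,shell]
----------------------------------
# Hypothetical variable block near the top of upload_server-metrics.sh;
# the script's actual layout may differ.
USERNAME=elastic          # replace with your actual user ID
PASSWORD=elasticpassword  # replace with your actual password
----------------------------------

After saving the edits, run the script from the directory that presumably contains the extracted JSON files.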
@@ -262,7 +266,7 @@ This mapping specifies the following qualities for the data set:
 
 You can then use the Elasticsearch `bulk` API to load the data set. The
 `upload_server-metrics.sh` script runs commands similar to the following
-example, which loads the four JSON files:
+example, which loads three of the JSON files:
 
 [source,shell]
 ----------------------------------
@@ -276,10 +280,10 @@ http://localhost:9200/server-metrics/_bulk --data-binary "@server-metrics_2.json
 curl -u elastic:elasticpassword -X POST -H "Content-Type: application/json"
 http://localhost:9200/server-metrics/_bulk --data-binary "@server-metrics_3.json"
 
-curl -u elastic:elasticpassword -X POST -H "Content-Type: application/json"
-http://localhost:9200/server-metrics/_bulk --data-binary "@server-metrics_4.json"
 ----------------------------------
 
+//curl -u elastic:elasticpassword -X POST -H "Content-Type: application/json"
+//http://localhost:9200/server-metrics/_bulk --data-binary "@server-metrics_4.json"
 These commands might take some time to run, depending on the computing resources
 available.
@@ -291,13 +295,13 @@ You can verify that the data was loaded successfully with the following command:
 curl 'http://localhost:9200/_cat/indices?v' -u elastic:elasticpassword
 ----------------------------------
 
-You should see output similar to the following:
+For three sample JSON files, you should see output similar to the following:
 
 [source,shell]
 ----------------------------------
 
 health status index ... pri rep docs.count docs.deleted store.size ...
-green open server-metrics ... 1 0 907200 0 134.9mb ...
+green open server-metrics ... 1 0 680400 0 101.7mb ...
 ----------------------------------
 
 Next, you must define an index pattern for this data set:
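If you only want the document count rather than the full `_cat/indices` listing, the `_count` API offers a quick cross-check; a sketch, assuming the same index name and credentials as above:

[source,shell]
----------------------------------
# Should report "count":680400 once the three sample files are loaded.
curl -u elastic:elasticpassword 'http://localhost:9200/server-metrics/_count'
----------------------------------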
@@ -423,12 +427,8 @@ at which alerting is required.
 //non-aggregating functions.
 --
 
-. Click **Use full transaction_counts data**.
-+
---
-A graph is generated, which represents the total number of requests over time.
-//TBD: What happens if you click the play button instead?
---
+. Click **Use full transaction_counts data**. A graph is generated,
+which represents the total number of requests over time.
 
 . Provide a name for the job, for example `total-requests`. The job name must
 be unique in your cluster. You can also optionally provide a description of the
@@ -442,10 +442,14 @@ the {ml} that occurs as the data is processed.
 //To explore the results, click **View Results**.
 //TBD: image::images/ml-gs-job1-results.jpg["The total-requests job is created"]
 
-[[ml-gs-job1-managa]]
+TIP: The `create_single_metic.sh` script creates a similar job and data feed by
+using the {ml} APIs. For API reference information, see <<ml-apis>>.
+
+[[ml-gs-job1-manage]]
 === Managing Jobs
 
 After you create a job, you can see its status in the **Job Management** tab:
 
 image::images/ml-gs-job1-manage.jpg["Status information for the total-requests job"]
 
 The following information is provided for each job:
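The TIP added above notes that the script drives the {ml} APIs rather than Kibana. A sketch of what a comparable job-creation request might look like; the detector function, field names, and job settings here are illustrative assumptions, not the script's actual contents:

[source,shell]
----------------------------------
# Hypothetical single-metric job definition; "total" and "@timestamp" are
# assumed field names from the server-metrics data set.
curl -u elastic:elasticpassword -X PUT -H "Content-Type: application/json" \
http://localhost:9200/_xpack/ml/anomaly_detectors/total-requests -d '{
  "description": "Sum of total requests",
  "analysis_config": {
    "detectors": [ { "function": "sum", "field_name": "total" } ]
  },
  "data_description": { "time_field": "@timestamp" }
}'
----------------------------------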
@@ -458,14 +462,11 @@ The optional description of the job.
 
 Processed records::
 The number of records that have been processed by the job.
-+
---
 NOTE: Depending on how you send data to the job, the number of processed
 records is not always equal to the number of input records. For more information,
 see the `processed_record_count` description in <<ml-datacounts,Data Counts Objects>>.
 
---
 
 Memory status::
 The status of the mathematical models. When you create jobs by using the APIs or
 by using the advanced options in Kibana, you can specify a `model_memory_limit`.
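To check the processed-records figure outside of Kibana, you can query the job statistics endpoint; a sketch, assuming the `total-requests` job from this tutorial:

[source,shell]
----------------------------------
# data_counts.processed_record_count in the response corresponds to the
# "Processed records" value shown in the Job Management tab.
curl -u elastic:elasticpassword \
'http://localhost:9200/_xpack/ml/anomaly_detectors/total-requests/_stats'
----------------------------------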
@@ -510,71 +511,137 @@ the job.
 You can also click one of the **Actions** buttons to start the data feed, edit
 the job or data feed, and clone or delete the job, for example.
 
-* TBD: Demonstrate how to re-open the data feed and add additional data
+[float]
+[[ml-gs-job1-datafeed]]
+==== Managing Data Feeds
+
+A data feed can be started and stopped multiple times throughout its lifecycle.
+If you want to retrieve more data from Elasticsearch and the data feed is
+stopped, you must restart it.
+
+For example, if you only loaded three of the sample JSON files, you can now load
+the fourth using the Elasticsearch `bulk` API as follows:
+
+[source,shell]
+----------------------------------
+curl -u elastic:elasticpassword -X POST -H "Content-Type: application/json"
+http://localhost:9200/server-metrics/_bulk --data-binary "@server-metrics_4.json"
+----------------------------------
+
+You can optionally verify that the data was loaded successfully with the
+following command:
+
+[source,shell]
+----------------------------------
+curl 'http://localhost:9200/_cat/indices?v' -u elastic:elasticpassword
+----------------------------------
+
+For the four sample JSON files, you should see output similar to the following:
+
+[source,shell]
+----------------------------------
+health status index ... pri rep docs.count docs.deleted store.size ...
+green open server-metrics ... 1 0 907200 0 136.2mb ...
+----------------------------------
+
+To use this new data in your job:
+
+. In the **Machine Learning** / **Job Management** tab, click the following
+button to start the data feed: image::images/ml-start-feed.jpg["Start data feed"].
+
+. Choose a start time and end time. For example,
+click **Continue from 2017-04-22** and **No end time**, then click **Start**.
+image::images/ml-gs-job1-datafeed.jpg["Restarting a data feed"]
+
+* TBD: Why do I not see increases in the job count stats after this occurs?
+How can I determine that it has been successfully processed?
+
 
 [[ml-gs-jobresults]]
 === Exploring Job Results
 
-After you create a job, you can use the **Anomaly Explorer** or the
+The {xpack} {ml} features analyze the input stream of data, model its behavior,
+and perform analysis based on the detectors you defined in your job. When an
+event occurs outside of the model, that event is identified as an anomaly.
+
+Result records for each anomaly are stored in `.ml-notifications` and
+`.ml-anomalies*` indices in Elasticsearch. By default, the name of the
+index where {ml} results are stored is `shared`, which corresponds to
+the `.ml-anomalies-shared` index.
+//For example, these results include the probability of detecting that anomaly.
+
+You can use the **Anomaly Explorer** or the
 **Single Metric Viewer** in Kibana to view the analysis results.
 
 Anomaly Explorer::
-TBD
+This view contains heatmap charts, where the color for each section of the
+timeline is determined by the maximum anomaly score in that period.
+//TBD: Do the time periods in the heat map correspond to buckets?
 
 Single Metric Viewer::
-TBD
+This view contains a time series chart that represents the analysis.
+As in the **Anomaly Explorer**, anomalous data points are shown in
+different colors depending on their probability.
 
 [float]
 [[ml-gs-job1-analyze]]
 ==== Exploring Single Metric Job Results
 
-TBD.
-* Walk through exploration of job results.
-** Based on this job configuration we analyze the input stream of data.
-We model the behavior of the data, perform analysis based upon the defined detectors
-and for the time interval. When we see an event occurring outside of our model,
-we identify this as an anomaly. For each anomaly detected, we store the
-result records of our analysis, which includes the probability of
-detecting that anomaly.
-** With high volumes of real-life data, many anomalies may be found.
-These vary in probability from very likely to highly unlikely i.e. from not
-particularly anomalous to highly anomalous. There can be none, one or two or
-tens, sometimes hundreds of anomalies found within each bucket.
-There can be many thousands found per job.
-In order to provide a sensible view of the results, we calculate an anomaly score
-for each time interval. An interval with a high anomaly score is significant
-and requires investigation.
-** The anomaly score is a sophisticated aggregation of the anomaly records.
-The calculation is optimized for high throughput, gracefully ages historical data,
-and reduces the signal to noise levels.
-It adjusts for variations in event rate, takes into account the frequency
-and the level of anomalous activity and is adjusted relative to past anomalous behavior.
-In addition, it is boosted if anomalous activity occurs for related entities,
-for example if disk IO and CPU are both behaving unusually for a given host.
-** Once an anomalous time interval has been identified, it can be expanded to
-view the detailed anomaly records which are the significant causal factors.
-* Provide brief overview of statistical models and/or link to more info.
-* Possibly discuss effect of altering bucket span.
-
-* Provide general overview of management of jobs (when/why to start or
-stop them).
-
-Integrate the following images:
+By default when you view the results for a single metric job,
+the **Single Metric Viewer** opens:
 
-. Single Metric Viewer: All
 image::images/ml-gs-job1-analysis.jpg["Single Metric Viewer for total-requests job"]
 
-. Single Metric Viewer: Anomalies
+The blue line in the chart represents the actual data values. The shaded blue area
+represents the expected behavior that was calculated by the model.
+//TBD: What is meant by "95% prediction bounds"?
+
+If you slide the time selector from the beginning of the data to the end of the
+data, you can see how the model improves as it processes more data. At the
+beginning, the expected range of values is pretty broad and the model is not
+capturing the periodicity in the data. But it quickly learns and begins to
+reflect the daily variation.
+
+Any data points outside the range that was predicted by the model are marked
+as anomalies. When you have high volumes of real-life data, many anomalies
+might be found. These vary in probability from very likely to highly unlikely,
+that is to say, from not particularly anomalous to highly anomalous. There
+can be none, one or two or tens, sometimes hundreds of anomalies found within
+each bucket. There can be many thousands found per job. In order to provide
+a sensible view of the results, an _anomaly score_ is calculated for each bucket
+time interval. The anomaly score is a value from 0 to 100, which indicates
+the significance of the observed anomaly compared to previously seen anomalies.
+The highly anomalous values are shown in red and the low scored values are
+indicated in blue. An interval with a high anomaly score is significant and
+requires investigation.
+
+Slide the time selector to a section of the time series that contains a red data
+point. If you hover over the point, you can see more information about that
+data point. You can also see details in the **Anomalies** section of the viewer.
+For example:
+
 image::images/ml-gs-job1-anomalies.jpg["Single Metric Viewer Anomalies for total-requests job"]
 
-. Anomaly Explorer: All
+For each anomaly you can see key details such as the time, the actual and
+expected ("typical") values, and their probability.
+
+You can see the same information in a different format by using the **Anomaly Explorer**:
+
 image::images/ml-gs-job1-explorer.jpg["Anomaly Explorer for total-requests job"]
 
-. Anomaly Explorer: Selected a red area from the heatmap
+Click one of the red areas in the heatmap to see details about that anomaly. For
+example:
+
 image::images/ml-gs-job1-explorer-anomaly.jpg["Anomaly Explorer details for total-requests job"]
 
+After you have identified anomalies, often the next step is to try to determine
+the context of those situations. For example, are there other factors that are
+contributing to the problem? Are the anomalies confined to particular
+applications or servers? You can begin to troubleshoot these situations by
+layering additional jobs or creating multi-metric jobs.
+
 ////
 [float]
 [[ml-gs-job2-create]]
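The **Managing Data Feeds** steps added above restart the feed from Kibana; the same operation is available through the {ml} APIs. A sketch, assuming the data feed for the `total-requests` job is named `datafeed-total-requests` (the ID is an assumption):

[source,shell]
----------------------------------
# Start the data feed from 2017-04-22 with no end time, mirroring the
# choices made in the Kibana dialog.
curl -u elastic:elasticpassword -X POST \
'http://localhost:9200/_xpack/ml/datafeeds/datafeed-total-requests/_start?start=2017-04-22T00:00:00Z'
----------------------------------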
@@ -614,6 +681,22 @@ TBD.
 * Walk through exploration of job results.
 * Describe how influencer detection accelerates root cause identification.
 
+////
+////
+* Provide brief overview of statistical models and/or link to more info.
+* Possibly discuss effect of altering bucket span.
+
+The anomaly score is a sophisticated aggregation of the anomaly records in the
+bucket. The calculation is optimized for high throughput, gracefully ages
+historical data, and reduces the signal to noise levels. It adjusts for
+variations in event rate, takes into account the frequency and the level of
+anomalous activity and is adjusted relative to past anomalous behavior.
+In addition, [the anomaly score] is boosted if anomalous activity occurs for related entities,
+for example if disk IO and CPU are both behaving unusually for a given host.
+** Once an anomalous time interval has been identified, it can be expanded to
+view the detailed anomaly records which are the significant causal factors.
+////
+////
 [[ml-gs-alerts]]
 === Creating Alerts for Job Results
 
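The commented-out draft above mentions expanding an anomalous interval into its detailed anomaly records. Those records can also be retrieved through the {ml} results API; a sketch, using the tutorial's job name:

[source,shell]
----------------------------------
# Fetch the anomaly records for the total-requests job.
curl -u elastic:elasticpassword \
'http://localhost:9200/_xpack/ml/anomaly_detectors/total-requests/results/records'
----------------------------------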
BIN docs/en/ml/images/ml-gs-job1-datafeed.jpg (new file, 124 KiB; binary file not shown)
BIN docs/en/ml/images/ml-start-feed.jpg (new file, 1.2 KiB; binary file not shown)
@@ -3,7 +3,7 @@
 ==== Start Data Feeds
 
 A data feed must be started in order to retrieve data from {es}.
-A data feed can be opened and closed multiple times throughout its lifecycle.
+A data feed can be started and stopped multiple times throughout its lifecycle.
 
 ===== Request
 
@@ -3,7 +3,7 @@
 ==== Stop Data Feeds
 
 A data feed that is stopped ceases to retrieve data from {es}.
-A data feed can be opened and closed multiple times throughout its lifecycle.
+A data feed can be started and stopped multiple times throughout its lifecycle.
 
 ===== Request
 
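For completeness, a minimal stop request matching the API page edited above; the data feed ID is assumed as before:

[source,shell]
----------------------------------
# Stop the data feed; it can be started again later without re-creating it.
curl -u elastic:elasticpassword -X POST \
'http://localhost:9200/_xpack/ml/datafeeds/datafeed-total-requests/_stop'
----------------------------------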