diff --git a/docs/en/ml/getting-started.asciidoc b/docs/en/ml/getting-started.asciidoc index 164834021b7..dc99c3ff79b 100644 --- a/docs/en/ml/getting-started.asciidoc +++ b/docs/en/ml/getting-started.asciidoc @@ -1,7 +1,6 @@ [[ml-getting-started]] == Getting Started -TBD. //// {xpack} {ml} features automatically detect: * Anomalies in single or multiple time series @@ -10,29 +9,39 @@ TBD. This tutorial is focuses on an anomaly detection scenario in single time series. //// +Ready to get some hands-on experience with the {xpack} {ml} features? This +tutorial shows you how to: + +* Load a sample data set into Elasticsearch +* Create a {ml} job +* Use the results to identify possible anomalies in the data + +{nbsp} + +At the end of this tutorial, you should have a good idea of what {ml} is and +will hopefully be inspired to use it to detect anomalies in your own data. + +You might also be interested in these video tutorials: + +* Getting started with machine learning (single metric) +* Getting started with machine learning (multiple metric) -In this tutorial, you will explore the {xpack} {ml} features by using sample -data. You will create two simple jobs and use the results to identify possible -anomalies in the data. You can also optionally create an alert. At the end of -this tutorial, you should have a good idea of what {ml} is and will hopefully -be inspired to use it to detect anomalies in your own data. [float] [[ml-gs-sysoverview]] === System Overview -TBD. - To follow the steps in this tutorial, you will need the following components of the Elastic Stack: * Elasticsearch {version}, which stores the data and the analysis results * {xpack} {version}, which provides the {ml} features -* Kibana {version}, which provides a helpful user interface for creating -and viewing jobs +* Kibana {version}, which provides a helpful user interface for creating and +viewing jobs + + See the https://www.elastic.co/support/matrix[Elastic Support Matrix] for -information about supported operating systems and product compatibility. +information about supported operating systems. See {stack-ref}/installing-elastic-stack.html[Installing the Elastic Stack] for information about installing each of the components. @@ -47,15 +56,26 @@ optionally dedicate nodes to specific purposes. If you want to control which nodes are _machine learning nodes_ or limit which nodes run resource-intensive activity related to jobs, see <>. -NOTE: This tutorial uses Kibana to create jobs and view results, but you can -alternatively use APIs to accomplish most tasks. -For API reference information, see <>. +[float] +[[ml-gs-users]] +==== Users, Roles, and Privileges + +The {xpack} {ml} features implement cluster privileges and built-in roles to +make it easier to control which users have authority to view and manage the jobs, +data feeds, and results. + +By default, you can perform all of the steps in this tutorial by using the +built-in `elastic` user. If you are performing these steps in a production +environment, take extra care because that user has the `superuser` role and you +could inadvertently make significant changes to the system. You can +alternatively assign the `machine_learning_admin` and `kibana_user` roles to a +user ID of your choice. + +For more information, see <> and <>. [[ml-gs-data]] === Identifying Data for Analysis -TBD. - For the purposes of this tutorial, we provide sample data that you can play with. 
When you consider your own data, however, it's important to take a moment
and consider where the {xpack} {ml} features will be most impactful.
@@ -69,11 +89,10 @@ very efficient and occur in real-time.
The second consideration, especially when you are first learning to use {ml},
is the importance of the data and how familiar you are with it. Ideally, it is
information that contains key performance indicators (KPIs) for the health or
-success of your business or system. It is information for which you want alarms
-to ring when anomalous behavior occurs. You might even have Kibana dashboards
-that you're already using to watch this data. The better you know the data,
-the quicker you will be able to create jobs that generate useful insights from
-{ml}.
+success of your business or system. It is information that you need to act on
+when anomalous behavior occurs. You might even have Kibana dashboards that
+you're already using to watch this data. The better you know the data,
+the quicker you will be able to create {ml} jobs that generate useful insights.

//TBD: Talk about layering additional jobs?
////
@@ -102,84 +121,413 @@ future more-advanced scenario.)
The final consideration is where the data is located. If the data that you
want to analyze is stored in Elasticsearch, you can define a _data feed_ that
-provides data to the job in real time. By having both the input data and the
-analytical results in Elasticsearch, you get performance benefits? (TBD)
-The alternative to data feeds is to upload batches of data to the job by
-using the <>.
+provides data to the job in real time. When you have both the input data and the
+analytical results in Elasticsearch, this data gravity provides performance
+benefits.
+
+IMPORTANT: If you want to create {ml} jobs in Kibana, you must use data feeds.
+That is to say, you must store your input data in Elasticsearch. When you create
+a job, you select an existing index pattern and Kibana configures the data feed
+for you under the covers.
+
+If your data is not stored in Elasticsearch, you can create jobs by using
+the <> and upload batches of data to the job by
+using the <>. That scenario is not covered in
+this tutorial, however.
+
//TBD: The data must be provided in JSON format?

[float]
[[ml-gs-sampledata]]
-==== Obtaining a sample dataset
+==== Obtaining a Sample Data Set

-TBD.
+The sample data for this tutorial contains information about the requests that
+are received by various applications and services in a system. A system
+administrator might use this type of information to track the total
+number of requests across all of the infrastructure. If the number of requests
+increases or decreases unexpectedly, for example, this might be an indication
+that there is a problem or that resources need to be redistributed. By using
+the {xpack} {ml} features to model the behavior of this data, it is easier to
+identify anomalies and take appropriate action.

-* Provide instructions for downloading the sample data from https://github.com/elastic/examples
-* Provide overview/context of the sample data
+* TBD: Provide instructions for downloading the sample data after it's made
+available publicly on https://github.com/elastic/examples
+//Download this data set by clicking here:
+//See https://download.elastic.co/demos/kibana/gettingstarted/shakespeare.json[shakespeare.json].
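+
+Until then, the following commands show roughly what the download and
+extraction steps will look like. The URL and archive name below are
+placeholders for illustration only, not the final location of the data set:
+
+[source,shell]
+----------------------------------
+
+# Placeholder URL -- substitute the published location of the sample data.
+curl -L -O https://github.com/elastic/examples/raw/master/server-metrics.tar.gz
+tar -xvzf server-metrics.tar.gz
+----------------------------------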
+
+////
+Use the following commands to extract the files:
+
+[source,shell]
+gzip -d transactions.ndjson.gz
+////
+Each document in the server-metrics data set has the following schema:
+
+[source,json]
+----------------------------------
+
+{
+  "index":
+  {
+    "_index":"server-metrics",
+    "_type":"metric",
+    "_id":"AVuQL1eekrHQ5a9V5qre"
+  }
+}
+{
+  "deny":1783,
+  "service":"app_0",
+  "@timestamp":"2017-03-26T06:47:28.684926",
+  "accept":24465,
+  "host":"server_1",
+  "total":26248,
+  "response":1.8242486553275024
+}
+----------------------------------
+
+Before you load the data set, you need to set up {ref}/mapping.html[_mappings_]
+for the fields. Mappings divide the documents in the index into logical groups
+and specify a field's characteristics, such as the field's searchability or
+whether or not it's _tokenized_, or broken up into separate words.
+
+The sample data includes an `upload_server-metrics.sh` script, which you can use
+to create the mappings and load the data set. The script runs a command similar
+to the following example, which sets up a mapping for the data set:
+
+[source,shell]
+----------------------------------
+
+curl -u elastic:elasticpassword -X PUT -H 'Content-Type: application/json' \
+http://localhost:9200/server-metrics -d '{
+  "settings": {
+    "number_of_shards": 1,
+    "number_of_replicas": 0
+  },
+  "mappings": {
+    "metric": {
+      "properties": {
+        "@timestamp": {
+          "type": "date"
+        },
+        "accept": {
+          "type": "long"
+        },
+        "deny": {
+          "type": "long"
+        },
+        "host": {
+          "type": "text",
+          "fields": {
+            "keyword": {
+              "type": "keyword",
+              "ignore_above": 256
+            }
+          }
+        },
+        "response": {
+          "type": "float"
+        },
+        "service": {
+          "type": "text",
+          "fields": {
+            "keyword": {
+              "type": "keyword",
+              "ignore_above": 256
+            }
+          }
+        },
+        "total": {
+          "type": "long"
+        }
+      }
+    }
+  }
+}'
+----------------------------------
+
+NOTE: If you run this command, you must replace `elasticpassword` with your
+actual password. Likewise, if you use the `upload_server-metrics.sh` script,
+you must edit the USERNAME and PASSWORD variables before you run it.
+
+////
+This mapping specifies the following qualities for the data set:
+
+* The _@timestamp_ field is a date.
+//that uses the ISO format `epoch_second`,
+//which is the number of seconds since the epoch.
+* The _accept_, _deny_, and _total_ fields are long numbers.
+* The _host
+////
+
+You can then use the Elasticsearch `bulk` API to load the data set. The
+`upload_server-metrics.sh` script runs commands similar to the following
+example, which loads the four JSON files:
+
+[source,shell]
+----------------------------------
+
+curl -u elastic:elasticpassword -X POST -H "Content-Type: application/json" \
+http://localhost:9200/server-metrics/_bulk --data-binary "@server-metrics_1.json"
+
+curl -u elastic:elasticpassword -X POST -H "Content-Type: application/json" \
+http://localhost:9200/server-metrics/_bulk --data-binary "@server-metrics_2.json"
+
+curl -u elastic:elasticpassword -X POST -H "Content-Type: application/json" \
+http://localhost:9200/server-metrics/_bulk --data-binary "@server-metrics_3.json"
+
+curl -u elastic:elasticpassword -X POST -H "Content-Type: application/json" \
+http://localhost:9200/server-metrics/_bulk --data-binary "@server-metrics_4.json"
+----------------------------------
+
+These commands might take some time to run, depending on the computing resources
+available.
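+
+Optionally, while the bulk requests run, you can monitor the progress of the
+load from another terminal session by using the
+{ref}/search-count.html[count API]:
+
+[source,shell]
+----------------------------------
+
+# Reports how many documents the server-metrics index contains so far.
+curl -u elastic:elasticpassword 'http://localhost:9200/server-metrics/_count?pretty'
+----------------------------------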
+
+You can verify that the data was loaded successfully with the following command:
+
+[source,shell]
+----------------------------------
+
+curl 'http://localhost:9200/_cat/indices?v' -u elastic:elasticpassword
+----------------------------------
+
+You should see output similar to the following:
+
+[source,shell]
+----------------------------------
+
+health status index ... pri rep docs.count docs.deleted store.size ...
+green open server-metrics ... 1 0 907200 0 134.9mb ...
+----------------------------------
+
+Next, you must define an index pattern for this data set:
+
+. Open Kibana in your web browser and log in. If you are running Kibana
+locally, go to `http://localhost:5601/`.
+
+. Click the **Management** tab, then **Index Patterns**.
+
+. Click the plus sign (+) to define a new index pattern.
+
+. For this tutorial, any pattern that matches the name of the index you've
+loaded will work. For example, enter `server-metrics*` as the index pattern.
+
+. Verify that the **Index contains time-based events** option is checked.
+
+. Select the `@timestamp` field from the **Time-field name** list.
+
+. Click **Create**.
+
+You can now analyze this data set in {ml} jobs in Kibana.
+//Content based on https://www.elastic.co/guide/en/kibana/current/tutorial-load-dataset.html

[[ml-gs-jobs]]
-=== Working with Jobs
-
-TBD.
+=== Creating Jobs

Machine learning jobs contain the configuration information and metadata
necessary to perform an analytical task. They also contain the results of the
-analytical task. Each job ID must be unique in your cluster.
+analytical task.
+
+NOTE: This tutorial uses Kibana to create jobs and view results, but you can
+alternatively use APIs to accomplish most tasks.
+For API reference information, see <>.

To work with jobs in Kibana:

. Open Kibana in your web browser and log in. If you are running Kibana
-locally, go to `http://localhost:5601/`. To use the {ml} features,
-you must log in as a user who has the `kibana_user`
-and `monitor_ml` roles (TBD).
+locally, go to `http://localhost:5601/`.

. Click **Machine Learning** in the side navigation:
-image::images/ml.jpg["Job Management"]
+image::images/ml-kibana.jpg["Job Management"]

-You can choose to create single-metric, multi-metric, or advanced jobs in Kibana.
+You can choose to create single metric, multi-metric, or advanced jobs in
+Kibana. In this tutorial, the goal is to detect anomalies in the total requests
+received by your applications and services. The sample data contains a single
+key performance indicator to track this, which is the total requests over time.
+It is therefore logical to start by creating a single metric job for this KPI.

[float]
[[ml-gs-job1-create]]
==== Creating a Single Metric Job

-TBD.
+A single metric job contains a single _detector_. A detector defines the type of
+analysis that will occur (for example, the `max`, `average`, or `rare`
+analytical function) and the fields that will be analyzed.

-* Walk through creation of a simple single metric job.
-* Provide overview of:
-** aggregations
-** detectors (define which fields to analyze)
-*** The detectors define what type of analysis needs to be done
-(e.g. max, average, rare) and upon which fields (e.g. IP address, Host name, Num bytes).
-** bucket spans (define time intervals to analyze across)
-*** Take into account the granularity at which you want to analyze,
-the frequency of the input data, and the frequency at which alerting is required.
-*** When we analyze data, we use the concept of a bucket to divide up a continuous -stream of data into batches for processing. For example, if you were monitoring the -average response time of a system and received a data point every 10 minutes, -using a bucket span of 1 hour means that at the end of each hour we would calculate -the average (mean) value of the data for the last hour and compute the -anomalousness of that average value compared to previous hours. -*** The bucket span has two purposes: it dictates over what time span to look for -anomalous features in data, and also determines how quickly anomalies can be detected. -Choosing a shorter bucket span allows anomalies to be detected more quickly but -at the risk of being too sensitive to natural variations or noise in the input data. -Choosing too long a bucket span however can mean that interesting anomalies are averaged away. -** analysis functions -*** Some of the analytical functions look for single anomalous data points, e.g. max, -which identifies the maximum value seen within a bucket. -Others perform some aggregation over the length of the bucket, e.g. mean, -which calculates the mean of all the data points seen within the bucket, -or count, which calculates the total number of data points within the bucket. -There is the possibility that the aggregation might smooth out some anomalies -based on when the bucket starts in time. -**** To avoid this, you can use overlapping buckets (how/where?). -We analyze the data points in two buckets simultaneously, one starting half a bucket -span later than the other. Overlapping buckets are only beneficial for -aggregating functions, and should not be used for non-aggregating functions. +To create a single metric job in Kibana: + +. Click **Machine Learning** in the side navigation, +then click **Create new job**. + +. Click **Create single metric job**. +image::images/ml-create-jobs.jpg["Create a new job"] + +. Click the `server-metrics` index. + ++ +-- +image::images/ml-gs-index.jpg["Select an index"] +-- + +. Configure the job by providing the following information: +image::images/ml-gs-single-job.jpg["Create a new job from the server-metrics index"] + +.. For the **Aggregation**, select `Sum`. This value specifies the analysis +function that is used. ++ +-- +Some of the analytical functions look for single anomalous data points. For +example, `max` identifies the maximum value that is seen within a bucket. +Others perform some aggregation over the length of the bucket. For example, +`mean` calculates the mean of all the data points seen within the bucket. +Similarly, `count` calculates the total number of data points within the bucket. +In this tutorial, you are using the `sum` function, which calculates the sum of +the specified field's values within the bucket. +-- + +.. For the **Field**, select `total`. This value specifies the field that +the detector uses in the function. ++ +-- +NOTE: Some functions such as `count` and `rare` do not require fields. +-- + +.. For the **Bucket span**, enter `600s`. This value specifies the size of the +interval that the analysis is aggregated into. ++ +-- +The {xpack} {ml} features use the concept of a bucket to divide up a continuous +stream of data into batches for processing. 
For example, if you are monitoring
+the total number of requests in the system,
+//and receive a data point every 10 minutes
+using a bucket span of 1 hour would mean that at the end of each hour, it
+calculates the sum of the requests for the last hour and computes the
+anomalousness of that value compared to previous hours.
+
+The bucket span has two purposes: it dictates over what time span to look for
+anomalous features in data, and also determines how quickly anomalies can be
+detected. Choosing a shorter bucket span allows anomalies to be detected more
+quickly. However, there is a risk of being too sensitive to natural variations
+or noise in the input data. Choosing too long a bucket span can mean that
+interesting anomalies are averaged away. There is also the possibility that the
+aggregation might smooth out some anomalies based on when the bucket starts
+in time.
+
+The bucket span has a significant impact on the analysis. When you're trying to
+determine what value to use, take into account the granularity at which you
+want to perform the analysis, the frequency of the input data, and the frequency
+at which alerting is required.
+//TBD: Talk about overlapping buckets? "To avoid this, you can use overlapping
+//buckets (how/where?). We analyze the data points in two buckets simultaneously,
+//one starting half a bucket span later than the other. Overlapping buckets are
+//only beneficial for aggregating functions, and should not be used for
+//non-aggregating functions.
+--
+
+. Click **Use full server-metrics data**.
++
+--
+A graph is generated, which represents the total number of requests over time.
+//TBD: What happens if you click the play button instead?
+--
+
+. Provide a name for the job, for example `total-requests`. The job name must
+be unique in your cluster. You can also optionally provide a description of the
+job.
+
+. Click **Create Job**.
+image::images/ml-gs-job1.jpg["A graph of the total number of requests over time"]
+
+As the job is created, the graph is updated to give a visual representation of
+the {ml} analysis that occurs as the data is processed.
+//To explore the results, click **View Results**.
+//TBD: image::images/ml-gs-job1-results.jpg["The total-requests job is created"]
+
+[[ml-gs-job1-manage]]
+=== Managing Jobs
+
+After you create a job, you can see its status in the **Job Management** tab:
+image::images/ml-gs-job1-manage.jpg["Status information for the total-requests job"]
+
+The following information is provided for each job:
+
+Job ID::
+The unique identifier for the job.
+
+Description::
+The optional description of the job.
+
+Processed records::
+The number of records that have been processed by the job.
++
+--
+NOTE: Depending on how you send data to the job, the number of processed
+records is not always equal to the number of input records. For more information,
+see the `processed_record_count` description in <>.
+
+--
+
+Memory status::
+The status of the mathematical models. When you create jobs by using the APIs or
+by using the advanced options in Kibana, you can specify a `model_memory_limit`.
+That value is the maximum amount of memory, in MiB, that the mathematical models
+can use. Once that limit is approached, data pruning becomes more aggressive.
+Upon exceeding that limit, new entities are not modeled.
+The default value is `4096`. The memory status field reflects whether you have
+reached or exceeded the model memory limit. It can have one of the following
+values:
+
+`ok`::: The models stayed below the configured value.
+`soft_limit`::: The models used more than 60% of the configured memory limit
+and older unused models will be pruned to free up space.
+`hard_limit`::: The models used more space than the configured memory limit.
+As a result, not all incoming data was processed.
+
+Job state::
+The status of the job, which can be one of the following values:
+
+`open`::: The job is available to receive and process data.
+`closed`::: The job finished successfully with its model state persisted.
+The job must be opened before it can accept further data.
+`closing`::: The job close action is in progress and has not yet completed.
+A closing job cannot accept further data.
+`failed`::: The job did not finish successfully due to an error.
+This situation can occur due to invalid input data.
+If the job has irrevocably failed, it must be force closed and then deleted.
+If the data feed can be corrected, the job can be closed and then re-opened.
+
+Datafeed state::
+The status of the data feed, which can be one of the following values:
+
+`started`::: The data feed is actively receiving data.
+`stopped`::: The data feed is stopped and will not receive data until it is re-started.
+//TBD: How to restart data feeds in Kibana?
+
+Latest timestamp::
+The timestamp of the last processed record.
+//TBD: Is that right?
+
+If you click the arrow beside the name of the job, you can show or hide
+additional information, such as the settings, configuration information, or
+messages for the job.
+
+You can also click one of the **Actions** buttons, for example, to start the
+data feed, edit the job or data feed, or clone or delete the job.
+
+* TBD: Demonstrate how to re-open the data feed and add additional data
+
+
+[[ml-gs-jobresults]]
+=== Exploring Job Results
+
+After you create a job, you can use the **Anomaly Explorer** or the
+**Single Metric Viewer** in Kibana to view the analysis results.
+
+Anomaly Explorer::
+TBD
+
+Single Metric Viewer::
+TBD

[float]
[[ml-gs-job1-analyze]]
-===== Viewing Single Metric Job Results
+==== Exploring Single Metric Job Results

TBD.

@@ -213,6 +561,21 @@ view the detailed anomaly records which are the significant causal factors.

* Provide general overview of management of jobs (when/why to start or stop
them).

+Integrate the following images:
+
+. Single Metric Viewer: All
+image::images/ml-gs-job1-analysis.jpg["Single Metric Viewer for total-requests job"]
+
+. Single Metric Viewer: Anomalies
+image::images/ml-gs-job1-anomalies.jpg["Single Metric Viewer Anomalies for total-requests job"]
+
+. Anomaly Explorer: All
+image::images/ml-gs-job1-explorer.jpg["Anomaly Explorer for total-requests job"]
+
+. Anomaly Explorer: Selected a red area from the heatmap
+image::images/ml-gs-job1-explorer-anomaly.jpg["Anomaly Explorer details for total-requests job"]
+
+////
[float]
[[ml-gs-job2-create]]
==== Creating a Multi-Metric Job
@@ -259,11 +622,3 @@ TBD.

* Walk through creation of simple alert for anomalous data?
////
-To start exploring anomalies in your data:
-
-. Open Kibana in your web browser and log in. If you are running Kibana
-locally, go to `http://localhost:5601/`.
-
-. Click **ML** in the side navigation ...
-//// -//image::images/graph-open.jpg["Accessing Graph"] diff --git a/docs/en/ml/images/ml-create-jobs.jpg b/docs/en/ml/images/ml-create-jobs.jpg new file mode 100644 index 00000000000..0c37ea2c9ff Binary files /dev/null and b/docs/en/ml/images/ml-create-jobs.jpg differ diff --git a/docs/en/ml/images/ml-edit-job.jpg b/docs/en/ml/images/ml-edit-job.jpg new file mode 100644 index 00000000000..e6a3e6b1106 Binary files /dev/null and b/docs/en/ml/images/ml-edit-job.jpg differ diff --git a/docs/en/ml/images/ml-gs-aggregations.jpg b/docs/en/ml/images/ml-gs-aggregations.jpg new file mode 100644 index 00000000000..446dce79727 Binary files /dev/null and b/docs/en/ml/images/ml-gs-aggregations.jpg differ diff --git a/docs/en/ml/images/ml-gs-index.jpg b/docs/en/ml/images/ml-gs-index.jpg new file mode 100644 index 00000000000..8932cd8a47c Binary files /dev/null and b/docs/en/ml/images/ml-gs-index.jpg differ diff --git a/docs/en/ml/images/ml-gs-job1-analysis.jpg b/docs/en/ml/images/ml-gs-job1-analysis.jpg new file mode 100644 index 00000000000..fcf73d0ef52 Binary files /dev/null and b/docs/en/ml/images/ml-gs-job1-analysis.jpg differ diff --git a/docs/en/ml/images/ml-gs-job1-anomalies.jpg b/docs/en/ml/images/ml-gs-job1-anomalies.jpg new file mode 100644 index 00000000000..0e7e2422424 Binary files /dev/null and b/docs/en/ml/images/ml-gs-job1-anomalies.jpg differ diff --git a/docs/en/ml/images/ml-gs-job1-explorer-anomaly.jpg b/docs/en/ml/images/ml-gs-job1-explorer-anomaly.jpg new file mode 100644 index 00000000000..4a886697025 Binary files /dev/null and b/docs/en/ml/images/ml-gs-job1-explorer-anomaly.jpg differ diff --git a/docs/en/ml/images/ml-gs-job1-explorer.jpg b/docs/en/ml/images/ml-gs-job1-explorer.jpg new file mode 100644 index 00000000000..92059ccfc52 Binary files /dev/null and b/docs/en/ml/images/ml-gs-job1-explorer.jpg differ diff --git a/docs/en/ml/images/ml-gs-job1-manage.jpg b/docs/en/ml/images/ml-gs-job1-manage.jpg new file mode 100644 index 00000000000..f7e0d2d0e3a Binary files /dev/null and b/docs/en/ml/images/ml-gs-job1-manage.jpg differ diff --git a/docs/en/ml/images/ml-gs-job1-results.jpg b/docs/en/ml/images/ml-gs-job1-results.jpg new file mode 100644 index 00000000000..0b04fec0e2d Binary files /dev/null and b/docs/en/ml/images/ml-gs-job1-results.jpg differ diff --git a/docs/en/ml/images/ml-gs-job1.jpg b/docs/en/ml/images/ml-gs-job1.jpg new file mode 100644 index 00000000000..f168c34221f Binary files /dev/null and b/docs/en/ml/images/ml-gs-job1.jpg differ diff --git a/docs/en/ml/images/ml-gs-single-job.jpg b/docs/en/ml/images/ml-gs-single-job.jpg new file mode 100644 index 00000000000..e56f23d0cde Binary files /dev/null and b/docs/en/ml/images/ml-gs-single-job.jpg differ diff --git a/docs/en/ml/images/ml-kibana.jpg b/docs/en/ml/images/ml-kibana.jpg new file mode 100644 index 00000000000..206d2fdef6c Binary files /dev/null and b/docs/en/ml/images/ml-kibana.jpg differ diff --git a/docs/en/ml/index.asciidoc b/docs/en/ml/index.asciidoc index 3a2abbf8ea4..96259f79add 100644 --- a/docs/en/ml/index.asciidoc +++ b/docs/en/ml/index.asciidoc @@ -16,7 +16,7 @@ easily answer these types of questions. include::introduction.asciidoc[] include::getting-started.asciidoc[] -include::ml-scenarios.asciidoc[] +// include::ml-scenarios.asciidoc[] include::api-quickref.asciidoc[] //include::troubleshooting.asciidoc[] Referenced from x-pack/docs/public/xpack-troubleshooting.asciidoc