100 lines
3.8 KiB
Plaintext
100 lines
3.8 KiB
Plaintext
[[ml-gs-wizards]]
|
|
=== Creating Jobs in {kib}
|
|
++++
|
|
<titleabbrev>Creating Jobs</titleabbrev>
|
|
++++
|
|
|
|
Machine learning jobs contain the configuration information and metadata
|
|
necessary to perform an analytical task. They also contain the results of the
|
|
analytical task.
|
|
|
|
[NOTE]
|
|
--
|
|
This tutorial uses {kib} to create jobs and view results, but you can
|
|
alternatively use APIs to accomplish most tasks.
|
|
For API reference information, see {ref}/ml-apis.html[Machine Learning APIs].
|
|
|
|
The {xpackml} features in {kib} use pop-ups. You must configure your
|
|
web browser so that it does not block pop-up windows or create an
|
|
exception for your {kib} URL.
|
|
--
|
|
|
|
{kib} provides wizards that help you create typical {ml} jobs. For example, you
|
|
can use wizards to create single metric, multi-metric, population, and advanced
|
|
jobs.
|
|
|
|
To see the job creation wizards:
|
|
|
|
. Open {kib} in your web browser and log in. If you are running {kib} locally,
|
|
go to `http://localhost:5601/`.
|
|
|
|
. Click **Machine Learning** in the side navigation.
|
|
|
|
. Click **Create new job**.
|
|
|
|
. Click the `server-metrics*` index pattern.
|
|
|
|
You can then choose from a list of job wizards. For example:
|
|
|
|
[role="screenshot"]
|
|
image::images/ml-create-job.jpg["Job creation wizards in {kib}"]
|
|
|
|
If you are not certain which wizard to use, there is also a **Data Visualizer**
|
|
that can help you explore the fields in your data.
|
|
|
|
To learn more about the sample data:
|
|
|
|
. Click **Data Visualizer**. +
|
|
+
|
|
--
|
|
[role="screenshot"]
|
|
image::images/ml-data-visualizer.jpg["Data Visualizer in {kib}"]
|
|
--
|
|
|
|
. Select a time period that you're interested in exploring by using the time
|
|
picker in the {kib} toolbar. Alternatively, click
|
|
**Use full server-metrics* data** to view data over the full time range. In this
|
|
sample data, the documents relate to March and April 2017.
|
|
|
|
. Optional: Change the number of documents per shard that are used in the
|
|
visualizations. There is a relatively small number of documents in the sample
|
|
data, so you can choose a value of `all`. For larger data sets, keep in mind
|
|
that using a large sample size increases query run times and increases the load
|
|
on the cluster.
|
|
|
|
[role="screenshot"]
|
|
image::images/ml-data-metrics.jpg["Data Visualizer output for metrics in {kib}"]
|
|
|
|
The fields in the indices are listed in two sections. The first section contains
|
|
the numeric ("metric") fields. The second section contains non-metric fields
|
|
(such as `keyword`, `text`, `date`, `boolean`, `ip`, and `geo_point` data types).
|
|
|
|
For metric fields, the **Data Visualizer** indicates how many documents contain
|
|
the field in the selected time period. It also provides information about the
|
|
minimum, median, and maximum values, the number of distinct values, and their
|
|
distribution. You can use the distribution chart to get a better idea of how
|
|
the values in the data are clustered. Alternatively, you can view the top values
|
|
for metric fields. For example:
|
|
|
|
[role="screenshot"]
|
|
image::images/ml-data-topmetrics.jpg["Data Visualizer output for top values in {kib}"]
|
|
|
|
For date fields, the **Data Visualizer** provides the earliest and latest field
|
|
values and the number and percentage of documents that contain the field
|
|
during the selected time period. For example:
|
|
|
|
[role="screenshot"]
|
|
image::images/ml-data-dates.jpg["Data Visualizer output for date fields in {kib}"]
|
|
|
|
For keyword fields, the **Data Visualizer** provides the number of distinct
|
|
values, a list of the top values, and the number and percentage of documents
|
|
that contain the field during the selected time period. For example:
|
|
|
|
[role="screenshot"]
|
|
image::images/ml-data-keywords.jpg["Data Visualizer output for date fields in {kib}"]
|
|
|
|
In this tutorial, you will create single and multi-metric jobs that use the
|
|
`total`, `response`, `service`, and `host` fields. Though there is an option to
|
|
create an advanced job directly from the **Data Visualizer**, we will use the
|
|
single and multi-metric job creation wizards instead.
|