OpenSearch/docs/en/ml/getting-started-wizards.asc...

100 lines
3.8 KiB
Plaintext
Raw Normal View History

[[ml-gs-wizards]]
=== Creating Jobs in {kib}
++++
<titleabbrev>Creating Jobs</titleabbrev>
++++
Machine learning jobs contain the configuration information and metadata
necessary to perform an analytical task. They also contain the results of the
analytical task.
[NOTE]
--
This tutorial uses {kib} to create jobs and view results, but you can
alternatively use APIs to accomplish most tasks.
For API reference information, see {ref}/ml-apis.html[Machine Learning APIs].
The {xpackml} features in {kib} use pop-ups. You must configure your
web browser so that it does not block pop-up windows or create an
exception for your {kib} URL.
--
{kib} provides wizards that help you create typical {ml} jobs. For example, you
can use wizards to create single metric, multi-metric, population, and advanced
jobs.
To see the job creation wizards:
. Open {kib} in your web browser and log in. If you are running {kib} locally,
go to `http://localhost:5601/`.
. Click **Machine Learning** in the side navigation.
. Click **Create new job**.
. Click the `server-metrics*` index pattern.
You can then choose from a list of job wizards. For example:
[role="screenshot"]
image::images/ml-create-job.jpg["Job creation wizards in {kib}"]
If you are not certain which wizard to use, there is also a **Data Visualizer**
that can help you explore the fields in your data.
To learn more about the sample data:
. Click **Data Visualizer**. +
+
--
[role="screenshot"]
image::images/ml-data-visualizer.jpg["Data Visualizer in {kib}"]
--
. Select a time period that you're interested in exploring by using the time
picker in the {kib} toolbar. Alternatively, click
**Use full server-metrics* data** to view data over the full time range. In this
sample data, the documents relate to March and April 2017.
. Optional: Change the number of documents per shard that are used in the
visualizations. There is a relatively small number of documents in the sample
data, so you can choose a value of `all`. For larger data sets, keep in mind
that using a large sample size increases query run times and increases the load
on the cluster.
[role="screenshot"]
image::images/ml-data-metrics.jpg["Data Visualizer output for metrics in {kib}"]
The fields in the indices are listed in two sections. The first section contains
the numeric ("metric") fields. The second section contains non-metric fields
(such as `keyword`, `text`, `date`, `boolean`, `ip`, and `geo_point` data types).
For metric fields, the **Data Visualizer** indicates how many documents contain
the field in the selected time period. It also provides information about the
minimum, median, and maximum values, the number of distinct values, and their
distribution. You can use the distribution chart to get a better idea of how
the values in the data are clustered. Alternatively, you can view the top values
for metric fields. For example:
[role="screenshot"]
image::images/ml-data-topmetrics.jpg["Data Visualizer output for top values in {kib}"]
For date fields, the **Data Visualizer** provides the earliest and latest field
values and the number and percentage of documents that contain the field
during the selected time period. For example:
[role="screenshot"]
image::images/ml-data-dates.jpg["Data Visualizer output for date fields in {kib}"]
For keyword fields, the **Data Visualizer** provides the number of distinct
values, a list of the top values, and the number and percentage of documents
that contain the field during the selected time period. For example:
[role="screenshot"]
image::images/ml-data-keywords.jpg["Data Visualizer output for date fields in {kib}"]
In this tutorial, you will create single and multi-metric jobs that use the
`total`, `response`, `service`, and `host` fields. Though there is an option to
create an advanced job directly from the **Data Visualizer**, we will use the
single and multi-metric job creation wizards instead.