mirror of
https://github.com/honeymoose/OpenSearch.git
synced 2025-03-01 16:39:11 +00:00
[DOCS] Add forecasting overview (elastic/x-pack-elasticsearch#3263)
* [DOCS] Restructure ML overview
* [DOCS] Added forecasting limitations
* [DOCS] Merged changes to ML overview
* [DOCS] Added forecasting screenshot
* [DOCS] Removed incorrect results info from forecast API
* [DOCS] Addressed feedback about forecasts
* [DOCS] Clarified default forecast duration

Original commit: elastic/x-pack-elasticsearch@1403f2cd2e
parent 01e3db3740
commit b35f1909cc
68
docs/en/ml/forecasting.asciidoc
Normal file
@@ -0,0 +1,68 @@
+[float]
+[[ml-forecasting]]
+=== Forecasting the Future
+
+After the {xpackml} features create baselines of normal behavior for your data,
+you can use that information to extrapolate future behavior.
+
+You can use a forecast to estimate a time series value at a specific future date.
+For example, you might want to determine how many users you can expect to visit
+your website next Sunday at 0900.
+
+You can also use it to estimate the probability of a time series value occurring
+at a future date. For example, you might want to determine how likely it is that
+your disk utilization will reach 100% before the end of next week.
+
+Each forecast has a unique ID, which you can use to distinguish between forecasts
+that you created at different times. You can create a forecast by using the
+{ref}/ml-forecast.html[Forecast Jobs API] or by using {kib}. For example:
+
+
+[role="screenshot"]
+image::images/ml-gs-job-forecast.jpg["Example screenshot from the Machine Learning Single Metric Viewer in Kibana"]
+
+//For a more detailed walk-through of {xpackml} features, see <<ml-getting-started>>.
+
+The yellow line in the chart represents the predicted data values. The
+shaded yellow area represents the bounds for the predicted values, which also
+gives an indication of the confidence of the predictions.
+
+When you create a forecast, you specify its _duration_, which indicates how far
+the forecast extends beyond the last record that was processed. By default, the
+duration is 1 day. Typically, the farther into the future that you forecast, the
+lower the confidence levels become (that is to say, the bounds increase).
+Eventually, if the confidence levels are too low, the forecast stops.
+
+You can also optionally specify when the forecast expires. By default, it
+expires in 14 days and is deleted automatically thereafter. You can specify a
+different expiration period by using the `expires_in` parameter in the {ref}/ml-forecast.html[Forecast Jobs API].
+
+//Add examples of forecast_request_stats and forecast documents?
+
+There are some limitations that affect your ability to create a forecast:
+
+* You can generate only three forecasts concurrently. There is no limit to the
+number of forecasts that you retain. Existing forecasts are not overwritten when
+you create new forecasts. Rather, they are automatically deleted when they expire.
+* If you use an `over_field_name` property in your job (that is to say, it's a
+_population job_), you cannot create a forecast.
+* If you use any of the following analytical functions in your job, you
+cannot create a forecast:
+** `lat_long`
+** `rare` and `freq_rare`
+** `time_of_day` and `time_of_week`
++
+--
+For more information about any of these functions, see <<ml-functions>>.
+--
+* Forecasts run concurrently with real-time {ml} analysis. That is to say, {ml}
+analysis does not stop while forecasts are generated. Forecasts can have an
+impact on {ml} jobs, however, especially in terms of memory usage. For this
+reason, forecasts run only if the model memory status is acceptable and the
+snapshot models for the forecast do not require more than 20 MB. If these memory
+limits are reached, consider splitting the job into multiple smaller jobs and
+creating forecasts for these.
+* The job must be open when you create a forecast. Otherwise, an error occurs.
+* If there is insufficient data to generate any meaningful predictions, an
+error occurs. In general, forecasts that are created early in the learning phase
+of the data analysis are less accurate.
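The configuration-related limitations listed in forecasting.asciidoc can be sketched as a pre-flight check. This is only an illustration: the helper name `forecast_blockers`, the job dictionary layout (`analysis_config.detectors`), and the `opened` state string are assumptions made for this sketch and do not come from the commit. The memory and concurrency limits are not modeled.

```python
# Illustrative sketch only: `forecast_blockers`, the job dictionary layout and
# the "opened" state string are assumptions, not part of this commit.

# Functions that the limitations list says cannot be forecast.
UNSUPPORTED_FUNCTIONS = {"lat_long", "rare", "freq_rare", "time_of_day", "time_of_week"}


def forecast_blockers(job_config, job_state):
    """Return the reasons a forecast cannot be created (empty list if none)."""
    reasons = []
    # The job must be open when the forecast is created.
    if job_state != "opened":
        reasons.append("job is not open")
    for detector in job_config.get("analysis_config", {}).get("detectors", []):
        # An over_field_name marks a population job, which cannot be forecast.
        if detector.get("over_field_name"):
            reasons.append("population job (over_field_name is set)")
        function = detector.get("function")
        if function in UNSUPPORTED_FUNCTIONS:
            reasons.append(f"unsupported function: {function}")
    return reasons


# Hypothetical job configurations used only to exercise the check.
ok_job = {"analysis_config": {"detectors": [{"function": "mean", "field_name": "responsetime"}]}}
bad_job = {"analysis_config": {"detectors": [{"function": "rare", "by_field_name": "status"}]}}

print(forecast_blockers(ok_job, "opened"))
print(forecast_blockers(bad_job, "closed"))
```

An empty list means that none of the checked limitations apply; the forecast can still fail for the memory or insufficient-data reasons described in the list.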
@@ -6,6 +6,8 @@ input data.
 
 The {xpackml} features include the following geographic function: `lat_long`.
 
+NOTE: You cannot create forecasts for jobs that contain geographic functions.
+
 [float]
 [[ml-lat-long]]
 ==== Lat_long
@@ -12,6 +12,8 @@ number of times (frequency) rare values occur.
 ====
 * The `rare` and `freq_rare` functions should not be used in conjunction with
 `exclude_frequent`.
+* You cannot create forecasts for jobs that contain `rare` or `freq_rare`
+functions.
 * Shorter bucket spans (less than 1 hour, for example) are recommended when
 looking for rare events. The functions model whether something happens in a
 bucket at least once. With longer bucket spans, it is more likely that
@@ -13,6 +13,7 @@ The {xpackml} features include the following time functions:
 
 [NOTE]
 ====
+* You cannot create forecasts for jobs that contain time functions.
 * The `time_of_day` function is not aware of the difference between days, for instance
 work days and weekends. When modeling different days, use the `time_of_week` function.
 In general, the `time_of_week` function is more suited to modeling the behavior of people
BIN
docs/en/ml/images/ml-gs-job-forecast.jpg
Normal file
Binary file not shown.
Size: 262 KiB
@@ -2,6 +2,7 @@
 == Overview
 
 include::analyzing.asciidoc[]
+include::forecasting.asciidoc[]
 
 [[ml-concepts]]
 === Basic Machine Learning Terms
@@ -15,17 +15,7 @@ a time series.
 
 ==== Description
 
-You can use the API to estimate a time series value at a specific future date.
-For example, you might want to determine how many users you can expect to visit
-your website next Sunday at 0900.
-
-You can also use it to estimate the probability of a time series value occurring
-at a future date. For example, you might want to determine how likely it is that
-your disk utilization will reach 100% before the end of next week.
-
-Each time you call the API, it generates a new forecast and returns a unique ID.
-Existing forecasts for the same job are not overwritten. You can use the forecast
-ID to distinguish between forecasts that you generated at different times.
+See {xpack-ref}/ml-forecasting.html[Forecasting the Future].
 
 [NOTE]
 ===============================
@@ -45,9 +35,9 @@ forecast. For more information about this property, see <<ml-job-resource>>.
 
 `duration`::
 (time units) A period of time that indicates how far into the future to
-forecast. For example, `30d` corresponds to 30 days. The forecast starts at the
-last record that was processed. For more information about time units, see
-<<time-units>>.
+forecast. For example, `30d` corresponds to 30 days. The default value is 1
+day. The forecast starts at the last record that was processed. For more
+information about time units, see <<time-units>>.
 
 `expires_in`::
 (time units) The period of time that forecast results are retained.
@@ -84,6 +74,6 @@ When the forecast is created, you receive the following results:
 }
 ----
 
-You can subsequently see the forecast in the *Single Metric Viewer* in {kib}
-and in the results that you retrieve by using {ml} APIs such as the
-<<ml-get-bucket,get bucket API>> and <<ml-get-record,get records API>>.
+You can subsequently see the forecast in the *Single Metric Viewer* in {kib}.
+//and in the results that you retrieve by using {ml} APIs such as the
+//<<ml-get-bucket,get bucket API>> and <<ml-get-record,get records API>>.
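As a rough illustration of the `duration` and `expires_in` parameters documented above, the following sketch assembles a forecast request without sending it. The defaults of `1d` and `14d` come from the text of this commit. The endpoint path, the helper name `build_forecast_request`, and the job ID `total-requests` are assumptions for this example and should be checked against the Forecast Jobs API reference for your version.

```python
import json


def build_forecast_request(job_id, duration="1d", expires_in="14d"):
    """Build the HTTP method, path and JSON body for a forecast request.

    The defaults mirror the documented ones: 1 day duration, 14 day expiry.
    The path below is an assumption; it is not shown in this commit.
    """
    path = f"_xpack/ml/anomaly_detectors/{job_id}/_forecast"
    body = {"duration": duration, "expires_in": expires_in}
    return "POST", path, json.dumps(body)


# Hypothetical job ID; forecast 10 days ahead and keep the default expiry.
method, path, body = build_forecast_request("total-requests", duration="10d")
print(method, path)
print(body)
```

Because each call creates a new forecast with its own ID and existing forecasts are not overwritten, a caller would typically record the returned forecast ID alongside the request it sent.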