[7.x][DOCS] Augments ML shared definitions (#50487)
This commit is contained in:
parent
f57569bf5c
commit
d479e0563a
@@ -1,3 +1,10 @@
tag::aggregations[]
If set, the {dfeed} performs aggregation searches. Support for aggregations is
limited and should only be used with low cardinality data. For more information,
see
{ml-docs}/ml-configuring-aggregation.html[Aggregating data for faster performance].
end::aggregations[]

tag::allow-lazy-open[]
Advanced configuration option. Specifies whether this job can open when there is
insufficient {ml} node capacity for it to be immediately assigned to a node. The
@@ -21,6 +28,21 @@ subject to the cluster-wide `xpack.ml.max_lazy_ml_nodes` setting - see
`starting` state until sufficient {ml} node capacity is available.
end::allow-lazy-start[]

tag::allow-no-datafeeds[]
Specifies what to do when the request:
+
--
* Contains wildcard expressions and there are no {dfeeds} that match.
* Contains the `_all` string or no identifiers and there are no matches.
* Contains wildcard expressions and there are only partial matches.

The default value is `true`, which returns an empty `datafeeds` array when
there are no matches and the subset of results when there are partial matches.
If this parameter is `false`, the request returns a `404` status code when there
are no matches or only partial matches.
--
end::allow-no-datafeeds[]

tag::allow-no-jobs[]
Specifies what to do when the request:
+
@@ -57,71 +79,16 @@ example: `outlier_detection`. See <<ml-dfa-analysis-objects>>.
end::analysis[]

tag::analysis-config[]
The analysis configuration, which specifies how to analyze the data.
After you create a job, you cannot change the analysis configuration; all
the properties are informational. An analysis configuration object has the
following properties:

`bucket_span`:::
(<<time-units,time units>>)
include::{docdir}/ml/ml-shared.asciidoc[tag=bucket-span]

`categorization_field_name`:::
(string)
include::{docdir}/ml/ml-shared.asciidoc[tag=categorization-field-name]

`categorization_filters`:::
(array of strings)
include::{docdir}/ml/ml-shared.asciidoc[tag=categorization-filters]

`categorization_analyzer`:::
(object or string)
include::{docdir}/ml/ml-shared.asciidoc[tag=categorization-analyzer]

`detectors`:::
(array) An array of detector configuration objects. Detector configuration
objects specify which data fields a job analyzes. They also specify which
analytical functions are used. You can specify multiple detectors for a job.
include::{docdir}/ml/ml-shared.asciidoc[tag=detector]
+
--
NOTE: If the `detectors` array does not contain at least one detector,
no analysis can occur and an error is returned.

--

`influencers`:::
(array of strings)
include::{docdir}/ml/ml-shared.asciidoc[tag=influencers]

`latency`:::
(time units)
include::{docdir}/ml/ml-shared.asciidoc[tag=latency]

`multivariate_by_fields`:::
(boolean)
include::{docdir}/ml/ml-shared.asciidoc[tag=multivariate-by-fields]

`summary_count_field_name`:::
(string)
include::{docdir}/ml/ml-shared.asciidoc[tag=summary-count-field-name]

The analysis configuration, which specifies how to analyze the data. After you
create a job, you cannot change the analysis configuration; all the properties
are informational.
end::analysis-config[]

tag::analysis-limits[]
Limits can be applied for the resources required to hold the mathematical models
in memory. These limits are approximate and can be set per job. They do not
control the memory used by other processes, for example the {es} Java
processes. If necessary, you can increase the limits after the job is created.
The `analysis_limits` object has the following properties:

`categorization_examples_limit`:::
(long)
include::{docdir}/ml/ml-shared.asciidoc[tag=categorization-examples-limit]

`model_memory_limit`:::
(long or string)
include::{docdir}/ml/ml-shared.asciidoc[tag=model-memory-limit]
control the memory used by other processes, for example the {es} Java processes.
If necessary, you can increase the limits after the job is created.
end::analysis-limits[]
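For orientation only (the values below are invented, not taken from this change), the `analysis_limits` object described above can be sketched as a Python dict:

```python
# Hypothetical analysis_limits object; both documented properties are shown.
analysis_limits = {
    "categorization_examples_limit": 4,  # (long) examples stored per category
    "model_memory_limit": "1024mb",      # (long or string) approximate memory cap
}
```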

tag::analyzed-fields[]
@@ -142,7 +109,6 @@ see the <<explain-dfanalytics>> which helps understand field selection.
automatically.
end::analyzed-fields[]


tag::background-persist-interval[]
Advanced configuration option. The time between each periodic persistence of the
model. The default value is a randomized value between 3 and 4 hours, which
@@ -162,6 +128,11 @@ The size of the interval that the analysis is aggregated into, typically between
see <<time-units>>.
end::bucket-span[]

tag::bucket-span-results[]
The length of the bucket in seconds. This value matches the `bucket_span`
that is specified in the job.
end::bucket-span-results[]

tag::by-field-name[]
The field used to split the data. In particular, this property is used for
analyzing the splits with respect to their own history. It is used for finding
@@ -184,15 +155,15 @@ object. If it is a string it must refer to a
is an object it has the following properties:
--

`char_filter`::::
`analysis_config`.`categorization_analyzer`.`char_filter`::::
(array of strings or objects)
include::{docdir}/ml/ml-shared.asciidoc[tag=char-filter]

`tokenizer`::::
`analysis_config`.`categorization_analyzer`.`tokenizer`::::
(string or object)
include::{docdir}/ml/ml-shared.asciidoc[tag=tokenizer]

`filter`::::
`analysis_config`.`categorization_analyzer`.`filter`::::
(array of strings or objects)
include::{docdir}/ml/ml-shared.asciidoc[tag=filter]
end::categorization-analyzer[]
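As a loose sketch (the analyzer contents are illustrative, not from this commit), a `categorization_analyzer` given as an object combines the three sub-properties above:

```python
# Hypothetical categorization_analyzer object with the documented sub-properties.
categorization_analyzer = {
    # array of strings or objects, e.g. a pattern replace character filter
    "char_filter": [{"type": "pattern_replace", "pattern": "\\d+", "replacement": ""}],
    # string or object: the tokenizer applied after character filters
    "tokenizer": "ml_classic",
    # array of strings or objects applied after the tokenizer
    "filter": ["lowercase"],
}
```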
@@ -246,6 +217,22 @@ add them here as
<<analysis-pattern-replace-charfilter,pattern replace character filters>>.
end::char-filter[]

tag::chunking-config[]
{dfeeds-cap} might be required to search over long time periods, for several months
or years. This search is split into time chunks in order to ensure the load
on {es} is managed. Chunking configuration controls how the size of these time
chunks is calculated and is an advanced configuration option.
A chunking configuration object has the following properties:

`chunking_config`.`mode`:::
(string)
include::{docdir}/ml/ml-shared.asciidoc[tag=mode]

`chunking_config`.`time_span`:::
(<<time-units,time units>>)
include::{docdir}/ml/ml-shared.asciidoc[tag=time-span]
end::chunking-config[]
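A minimal sketch, with invented values, of the chunking configuration object described above:

```python
# Hypothetical chunking configuration; `time_span` only applies in manual mode.
chunking_config = {
    "mode": "manual",
    "time_span": "3h",  # each chunked search covers three hours
}
```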

tag::compute-feature-influence[]
If `true`, the feature influence calculation is enabled. Defaults to `true`.
end::compute-feature-influence[]
@@ -255,11 +242,10 @@ An array of custom rule objects, which enable you to customize the way detectors
operate. For example, a rule may dictate to the detector conditions under which
results should be skipped. For more examples, see
{ml-docs}/ml-configuring-detector-custom-rules.html[Customizing detectors with custom rules].
A custom rule has the following properties:
+
--
`actions`::
(array) The set of actions to be triggered when the rule applies. If
end::custom-rules[]

tag::custom-rules-actions[]
The set of actions to be triggered when the rule applies. If
more than one action is specified, the effects of all actions are combined. The
available actions include:
@@ -271,49 +257,47 @@ model. Unless you also specify `skip_result`, the results will be created as
usual. This action is suitable when certain values are expected to be
consistently anomalous and they affect the model in a way that negatively
impacts the rest of the results.
end::custom-rules-actions[]

`scope`::
(object) An optional scope of series where the rule applies. A rule must either
tag::custom-rules-scope[]
An optional scope of series where the rule applies. A rule must either
have a non-empty scope or at least one condition. By default, the scope includes
all series. Scoping is allowed for any of the fields that are also specified in
`by_field_name`, `over_field_name`, or `partition_field_name`. To add a scope
for a field, add the field name as a key in the scope object and set its value
to an object with the following properties:
end::custom-rules-scope[]

`filter_id`:::
(string) The id of the filter to be used.
tag::custom-rules-scope-filter-id[]
The id of the filter to be used.
end::custom-rules-scope-filter-id[]

`filter_type`:::
(string) Either `include` (the rule applies for values in the filter) or
`exclude` (the rule applies for values not in the filter). Defaults to
`include`.
tag::custom-rules-scope-filter-type[]
Either `include` (the rule applies for values in the filter) or `exclude` (the
rule applies for values not in the filter). Defaults to `include`.
end::custom-rules-scope-filter-type[]

`conditions`::
(array) An optional array of numeric conditions when the rule applies. A rule
must either have a non-empty scope or at least one condition. Multiple
conditions are combined together with a logical `AND`. A condition has the
following properties:
tag::custom-rules-conditions[]
An optional array of numeric conditions when the rule applies. A rule must
either have a non-empty scope or at least one condition. Multiple conditions are
combined together with a logical `AND`. A condition has the following properties:
end::custom-rules-conditions[]

`applies_to`:::
(string) Specifies the result property to which the condition applies. The
available options are `actual`, `typical`, `diff_from_typical`, `time`.
tag::custom-rules-conditions-applies-to[]
Specifies the result property to which the condition applies. The available
options are `actual`, `typical`, `diff_from_typical`, `time`. If your detector
uses `lat_long`, `metric`, `rare`, or `freq_rare` functions, you can only
specify conditions that apply to `time`.
end::custom-rules-conditions-applies-to[]

`operator`:::
(string) Specifies the condition operator. The available options are `gt`
(greater than), `gte` (greater than or equals), `lt` (less than) and `lte` (less
than or equals).
tag::custom-rules-conditions-operator[]
Specifies the condition operator. The available options are `gt` (greater than),
`gte` (greater than or equals), `lt` (less than) and `lte` (less than or equals).
end::custom-rules-conditions-operator[]

`value`:::
(double) The value that is compared against the `applies_to` field using the
`operator`.
--
+
--
NOTE: If your detector uses `lat_long`, `metric`, `rare`, or `freq_rare`
functions, you can only specify `conditions` that apply to `time`.

--
end::custom-rules[]
tag::custom-rules-conditions-value[]
The value that is compared against the `applies_to` field using the `operator`.
end::custom-rules-conditions-value[]
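Pulling the pieces above together, a single custom rule might look like the following sketch. The field name and the filter id `safe_ips` are invented for illustration:

```python
# Hypothetical custom rule combining the documented pieces: an action, a
# scope keyed by a field name, and one numeric condition. Multiple
# conditions would be combined with a logical AND.
custom_rule = {
    "actions": ["skip_result"],
    "scope": {
        "client_ip": {"filter_id": "safe_ips", "filter_type": "include"},
    },
    "conditions": [
        {"applies_to": "actual", "operator": "lt", "value": 10.0},
    ],
}
```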

tag::custom-settings[]
Advanced configuration option. Contains custom metadata about the job. For
@@ -330,16 +314,14 @@ a {dfeed}, these properties are automatically set.
When data is received via the <<ml-post-data,post data>> API, it is not stored
in {es}. Only the results for {anomaly-detect} are retained.

A data description object has the following properties:

`format`:::
`data_description`.`format`:::
(string) Only `JSON` format is supported at this time.

`time_field`:::
`data_description`.`time_field`:::
(string) The name of the field that contains the timestamp.
The default value is `time`.

`time_format`:::
`data_description`.`time_format`:::
(string)
include::{docdir}/ml/ml-shared.asciidoc[tag=time-format]
--
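The three properties above can be sketched as a data description object; the time field name here is invented:

```python
# Hypothetical data_description object using the documented properties.
data_description = {
    "format": "JSON",           # only JSON is supported at this time
    "time_field": "timestamp",  # defaults to `time` when omitted
    "time_format": "epoch_ms",  # milliseconds since the Unix epoch
}
```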
@@ -444,8 +426,8 @@ expression.
end::datafeed-id-wildcard[]

tag::decompress-definition[]
Specifies whether the included model definition should be returned as a JSON map (`true`) or
in a custom compressed format (`false`). Defaults to `true`.
Specifies whether the included model definition should be returned as a JSON map
(`true`) or in a custom compressed format (`false`). Defaults to `true`.
end::decompress-definition[]

tag::delayed-data-check-config[]
@@ -462,13 +444,11 @@ moment in time. See

This check runs only on real-time {dfeeds}.

The configuration object has the following properties:

`enabled`::
`delayed_data_check_config`.`enabled`::
(boolean) Specifies whether the {dfeed} periodically checks for delayed data.
Defaults to `true`.

`check_window`::
`delayed_data_check_config`.`check_window`::
(<<time-units,time units>>) The window of time that is searched for late data.
This window of time ends with the latest finalized bucket. It defaults to
`null`, which causes an appropriate `check_window` to be calculated when the
@@ -485,6 +465,10 @@ that document will not be used for training, but a prediction with the trained
model will be generated for it. It is also known as a continuous target variable.
end::dependent-variable[]

tag::desc-results[]
If `true`, the results are sorted in descending order.
end::desc-results[]

tag::description-dfa[]
A description of the job.
end::description-dfa[]
@@ -502,26 +486,6 @@ optionally `results_field` (`ml` by default).
results of the analysis. Defaults to `ml`.
end::dest[]

tag::detector-description[]
A description of the detector. For example, `Low event rate`.
end::detector-description[]

tag::detector-field-name[]
The field that the detector uses in the function. If you use an event rate
function such as `count` or `rare`, do not specify this field.
+
--
NOTE: The `field_name` cannot contain double quotes or backslashes.

--
end::detector-field-name[]

tag::detector-index[]
A unique identifier for the detector. This identifier is based on the order of
the detectors in the `analysis_config`, starting at zero. You can use this
identifier when you want to update a specific detector.
end::detector-index[]

tag::detector[]
A detector has the following properties:
@@ -567,6 +531,26 @@ include::{docdir}/ml/ml-shared.asciidoc[tag=use-null]

end::detector[]

tag::detector-description[]
A description of the detector. For example, `Low event rate`.
end::detector-description[]

tag::detector-field-name[]
The field that the detector uses in the function. If you use an event rate
function such as `count` or `rare`, do not specify this field.
+
--
NOTE: The `field_name` cannot contain double quotes or backslashes.

--
end::detector-field-name[]

tag::detector-index[]
A unique identifier for the detector. This identifier is based on the order of
the detectors in the `analysis_config`, starting at zero. You can use this
identifier when you want to update a specific detector.
end::detector-index[]

tag::eta[]
The shrinkage applied to the weights. Smaller values result
in larger forests which have better generalization error. However, the smaller
@@ -583,6 +567,11 @@ working with both over and by fields, then you can set `exclude_frequent` to
`all` for both fields, or to `by` or `over` for those specific fields.
end::exclude-frequent[]

tag::exclude-interim-results[]
If `true`, the output excludes interim results. By default, interim results are
included.
end::exclude-interim-results[]

tag::feature-bag-fraction[]
Defines the fraction of features that will be used when
selecting a random bag for each candidate split.
@@ -624,6 +613,13 @@ optional. If it is not specified, no token filters are applied prior to
categorization.
end::filter[]

tag::frequency[]
The interval at which scheduled queries are made while the {dfeed} runs in real
time. The default value is either the bucket span for short bucket spans, or,
for longer bucket spans, a sensible fraction of the bucket span. For example:
`150s`.
end::frequency[]

tag::from[]
Skips the specified number of {dfanalytics-jobs}. The default value is `0`.
end::from[]
@@ -671,24 +667,26 @@ is available as part of the input data. When you use multiple detectors, the use
of influencers is recommended as it aggregates results for each influencer entity.
end::influencers[]

tag::is-interim[]
If `true`, this is an interim result. In other words, the results are calculated
based on partial input data.
end::is-interim[]

tag::job-id-anomaly-detection[]
Identifier for the {anomaly-job}.
end::job-id-anomaly-detection[]

tag::job-id-data-frame-analytics[]
Identifier for the {dfanalytics-job}.
end::job-id-data-frame-analytics[]

tag::job-id-anomaly-detection-default[]
Identifier for the {anomaly-job}. It can be a job identifier, a group name, or a
wildcard expression. If you do not specify one of these options, the API returns
information for all {anomaly-jobs}.
end::job-id-anomaly-detection-default[]

tag::job-id-data-frame-analytics-default[]
Identifier for the {dfanalytics-job}. If you do not specify this option, the API
returns information for the first hundred {dfanalytics-jobs}.
end::job-id-data-frame-analytics-default[]
tag::job-id-anomaly-detection-define[]
Identifier for the {anomaly-job}. This identifier can contain lowercase
alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start
and end with alphanumeric characters.
end::job-id-anomaly-detection-define[]

tag::job-id-anomaly-detection-list[]
An identifier for the {anomaly-jobs}. It can be a job
@@ -705,11 +703,14 @@ Identifier for the {anomaly-job}. It can be a job identifier, a group name, a
comma-separated list of jobs or groups, or a wildcard expression.
end::job-id-anomaly-detection-wildcard-list[]

tag::job-id-anomaly-detection-define[]
Identifier for the {anomaly-job}. This identifier can contain lowercase
alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start
and end with alphanumeric characters.
end::job-id-anomaly-detection-define[]
tag::job-id-data-frame-analytics[]
Identifier for the {dfanalytics-job}.
end::job-id-data-frame-analytics[]

tag::job-id-data-frame-analytics-default[]
Identifier for the {dfanalytics-job}. If you do not specify this option, the API
returns information for the first hundred {dfanalytics-jobs}.
end::job-id-data-frame-analytics-default[]

tag::job-id-data-frame-analytics-define[]
Identifier for the {dfanalytics-job}. This identifier can contain lowercase
@@ -717,6 +718,10 @@ alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start
and end with alphanumeric characters.
end::job-id-data-frame-analytics-define[]

tag::job-id-datafeed[]
The unique identifier for the job to which the {dfeed} sends data.
end::job-id-datafeed[]

tag::jobs-stats-anomaly-detection[]
An array of {anomaly-job} statistics objects.
For more information, see <<ml-jobstats>>.
@@ -745,6 +750,15 @@ the <<ml-post-data,post data>> API.
--
end::latency[]

tag::max-empty-searches[]
If a real-time {dfeed} has never seen any data (including during any initial
training period), it will automatically stop itself and close its associated
job after this many real-time searches that return no documents. In other words,
it will stop after `frequency` times `max_empty_searches` of real-time operation.
If not set, a {dfeed} with no end time that sees no data will remain started
until it is explicitly stopped. By default this setting is not set.
end::max-empty-searches[]
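The "`frequency` times `max_empty_searches`" bound can be sanity-checked with a little arithmetic; the numbers here are invented:

```python
# A datafeed searching every 150 s with max_empty_searches set to 10
# stops itself after 150 * 10 = 1500 s (25 minutes) of empty searches.
frequency_seconds = 150
max_empty_searches = 10
stop_after_seconds = frequency_seconds * max_empty_searches
```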

tag::maximum-number-trees[]
Defines the maximum number of trees the forest is allowed
to contain. The maximum value is 2000.
@@ -837,26 +851,24 @@ be seen in the model plot.

Model plot config can be configured when the job is created or updated later. It
must be disabled if performance issues are experienced.

The `model_plot_config` object has the following properties:

`enabled`:::
(boolean) If true, enables calculation and storage of the model bounds for
each entity that is being analyzed. By default, this is not enabled.

`terms`:::
experimental[] (string) Limits data collection to this comma separated list of
partition or by field values. If terms are not specified or it is an empty
string, no filtering is applied. For example, "CPU,NetworkIn,DiskWrites".
Wildcards are not supported. Only the specified `terms` can be viewed when
using the Single Metric Viewer.
--
end::model-plot-config[]

tag::model-plot-config-enabled[]
If `true`, enables calculation and storage of the model bounds for each entity
that is being analyzed. By default, this is not enabled.
end::model-plot-config-enabled[]

tag::model-plot-config-terms[]
Limits data collection to this comma-separated list of partition or by field
values. If terms are not specified or it is an empty string, no filtering is
applied. For example, "CPU,NetworkIn,DiskWrites". Wildcards are not supported.
Only the specified `terms` can be viewed when using the Single Metric Viewer.
end::model-plot-config-terms[]
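A quick sketch of the two properties above as a `model_plot_config` object; the `terms` value echoes the example in the text:

```python
# Hypothetical model_plot_config using the two documented properties.
model_plot_config = {
    "enabled": True,
    "terms": "CPU,NetworkIn,DiskWrites",  # comma-separated, no wildcards
}
```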

tag::model-snapshot-id[]
A numerical character string that uniquely identifies the model snapshot. For
example, `1491007364`. For more information about model snapshots, see
<<ml-snapshot-resource>>.
example, `1575402236000`.
end::model-snapshot-id[]

tag::model-snapshot-retention-days[]
@@ -925,6 +937,21 @@ Defines the name of the prediction field in the results.
Defaults to `<dependent_variable>_prediction`.
end::prediction-field-name[]

tag::query[]
The {es} query domain-specific language (DSL). This value corresponds to the
query object in an {es} search POST body. All the options that are supported by
{es} can be used, as this object is passed verbatim to {es}. By default, this
property has the following value: `{"match_all": {"boost": 1}}`.
end::query[]

tag::query-delay[]
The number of seconds behind real time that data is queried. For example, if
data from 10:04 a.m. might not be searchable in {es} until 10:06 a.m., set this
property to 120 seconds. The default value is randomly selected between `60s`
and `120s`. This randomness improves the query performance when there are
multiple jobs running on the same node.
end::query-delay[]
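A sketch (values invented, apart from the documented default query) of a datafeed fragment that sets both of the properties above:

```python
# Hypothetical datafeed fragment: the documented default query together
# with an explicit two-minute query delay.
datafeed_fragment = {
    "query": {"match_all": {"boost": 1}},  # the documented default
    "query_delay": "120s",
}
```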

tag::randomize-seed[]
Defines the seed for the random generator that is used to pick
which documents will be used for training. By default it is randomly generated.
@@ -951,11 +978,33 @@ are deleted from {es}. The default value is null, which means results are
retained.
end::results-retention-days[]

tag::retain[]
If `true`, this snapshot will not be deleted during automatic cleanup of
snapshots older than `model_snapshot_retention_days`. However, this snapshot
will be deleted when the job is deleted. The default value is `false`.
end::retain[]

tag::script-fields[]
Specifies scripts that evaluate custom expressions and return script fields to
the {dfeed}. The detector configuration objects in a job can contain functions
that use these script fields. For more information, see
{ml-docs}/ml-configuring-transform.html[Transforming data with script fields]
and <<request-body-search-script-fields,Script fields>>.
end::script-fields[]

tag::scroll-size[]
The `size` parameter that is used in {es} searches. The default value is `1000`.
end::scroll-size[]

tag::size[]
Specifies the maximum number of {dfanalytics-jobs} to obtain. The default value
is `100`.
end::size[]

tag::snapshot-id[]
Identifier for the model snapshot.
end::snapshot-id[]

tag::source-put-dfa[]
The configuration of how to source the analysis data. It requires an
`index`. Optionally, `query` and `_source` may be specified.
@@ -1006,16 +1055,6 @@ function.
--
end::summary-count-field-name[]

tag::timeout-start[]
Controls the amount of time to wait until the {dfanalytics-job} starts. Defaults
to 20 seconds.
end::timeout-start[]

tag::timeout-stop[]
Controls the amount of time to wait until the {dfanalytics-job} stops. Defaults
to 20 seconds.
end::timeout-stop[]

tag::time-format[]
The time format, which can be `epoch`, `epoch_ms`, or a custom pattern. The
default value is `epoch`, which refers to UNIX or Epoch time (the number of
@@ -1033,6 +1072,25 @@ timestamp, job creation fails.
--
end::time-format[]
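The difference between the two numeric formats above is just a factor of 1000; a small sketch with an arbitrarily chosen instant:

```python
from datetime import datetime, timezone

# `epoch` counts seconds and `epoch_ms` milliseconds since the Unix epoch;
# the same instant differs by a factor of 1000 between the two formats.
instant = datetime(2019, 12, 3, 20, 23, 56, tzinfo=timezone.utc)
epoch = int(instant.timestamp())  # value for time_format: epoch
epoch_ms = epoch * 1000           # value for time_format: epoch_ms
```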

tag::time-span[]
The time span that each search will be querying. This setting is only applicable
when the mode is set to `manual`. For example: `3h`.
end::time-span[]

tag::timeout-start[]
Controls the amount of time to wait until the {dfanalytics-job} starts. Defaults
to 20 seconds.
end::timeout-start[]

tag::timeout-stop[]
Controls the amount of time to wait until the {dfanalytics-job} stops. Defaults
to 20 seconds.
end::timeout-stop[]

tag::timestamp-results[]
The start time of the bucket for which these results were calculated.
end::timestamp-results[]

tag::tokenizer[]
The name or definition of the <<analysis-tokenizers,tokenizer>> to use after
character filters are applied. This property is compulsory if