[role="xpack"]
[testenv="basic"]
[[inference-processor]]
=== {infer-cap} processor
++++
<titleabbrev>{infer-cap}</titleabbrev>
++++
experimental::[]
Uses a pre-trained {dfanalytics} model to infer against the data that is being
ingested in the pipeline.
[[inference-options]]
.{infer-cap} Options
[options="header"]
|======
| Name | Required | Default | Description
| `model_id` | yes | - | (String) The ID of the model to load and infer against.
| `target_field` | no | `ml.inference.<processor_tag>` | (String) Field added to incoming documents to contain results objects.
| `field_map` | no | If defined, the model's default field map | (Object) Maps the document field names to the known field names of the model. This mapping takes precedence over any default mappings provided in the model configuration.
| `inference_config` | no | The default settings defined in the model | (Object) Contains the inference type and its options. There are two types: <<inference-processor-regression-opt,`regression`>> and <<inference-processor-classification-opt,`classification`>>.
include::common-options.asciidoc[]
|======
[source,js]
--------------------------------------------------
{
  "inference": {
    "model_id": "flight_delay_regression-1571767128603",
    "target_field": "FlightDelayMin_prediction_infer",
    "field_map": {
      "your_field": "my_field"
    },
    "inference_config": { "regression": {} }
  }
}
--------------------------------------------------
// NOTCONSOLE
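
The processor definition above belongs in the `processors` array of an ingest
pipeline. As a minimal sketch, a pipeline body might look like the following.
The `description` text is illustrative, and `field_map` is omitted on the
assumption that the document field names already match the names the model
expects:

[source,js]
--------------------------------------------------
{
  "description": "Adds a flight delay prediction to each incoming document",
  "processors": [
    {
      "inference": {
        "model_id": "flight_delay_regression-1571767128603",
        "target_field": "FlightDelayMin_prediction_infer",
        "inference_config": { "regression": {} }
      }
    }
  ]
}
--------------------------------------------------
// NOTCONSOLE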
[discrete]
[[inference-processor-regression-opt]]
==== {regression-cap} configuration options
Regression configuration for inference.

`results_field`::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-results-field-processor]

`num_top_feature_importance_values`::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-regression-num-top-feature-importance-values]

[discrete]
[[inference-processor-classification-opt]]
==== {classification-cap} configuration options
Classification configuration for inference.

`num_top_classes`::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-classification-num-top-classes]

`num_top_feature_importance_values`::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-classification-num-top-feature-importance-values]

`results_field`::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-results-field-processor]

`top_classes_results_field`::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-classification-top-classes-results-field]

`prediction_field_type`::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-classification-prediction-field-type]

[discrete]
[[inference-processor-config-example]]
==== `inference_config` examples
[source,js]
--------------------------------------------------
{
  "inference_config": {
    "regression": {
      "results_field": "my_regression"
    }
  }
}
--------------------------------------------------
// NOTCONSOLE
This configuration specifies a `regression` inference. The results are written
to the `my_regression` field contained in the `target_field` results object.
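
To make the nesting concrete, the following sketch shows roughly how the
results object might appear in an ingested document. It assumes the processor
also sets `target_field` to `FlightDelayMin_prediction_infer`, as in the first
example on this page. The numeric value is made up for illustration, and the
results object may contain additional keys that are not shown here:

[source,js]
--------------------------------------------------
{
  "FlightDelayMin_prediction_infer": {
    "my_regression": 42.5
  }
}
--------------------------------------------------
// NOTCONSOLE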
[source,js]
--------------------------------------------------
{
  "inference_config": {
    "classification": {
      "num_top_classes": 2,
      "results_field": "prediction",
      "top_classes_results_field": "probabilities"
    }
  }
}
--------------------------------------------------
// NOTCONSOLE
This configuration specifies a `classification` inference. The number of
categories for which the predicted probabilities are reported is 2
(`num_top_classes`). The result is written to the `prediction` field and the top
classes to the `probabilities` field. Both fields are contained in the
`target_field` results object.
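
The other classification options can be set in the same object. The following
sketch is illustrative only: it assumes that the model supports {feat-imp} and
that `string` is an accepted value for `prediction_field_type`. Check the
option descriptions above for the values that apply to your model.

[source,js]
--------------------------------------------------
{
  "inference_config": {
    "classification": {
      "num_top_classes": 2,
      "num_top_feature_importance_values": 3,
      "prediction_field_type": "string"
    }
  }
}
--------------------------------------------------
// NOTCONSOLE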
[discrete]
[[inference-processor-feature-importance]]
==== {feat-imp-cap} object mapping
To get the full benefit of aggregating and searching for
{ml-docs}/ml-feature-importance.html[{feat-imp}], update your index mapping of
the {feat-imp} result field as follows:
[source,js]
--------------------------------------------------
"ml.inference.feature_importance": {
  "type": "nested",
  "dynamic": true,
  "properties": {
    "feature_name": {
      "type": "keyword"
    },
    "importance": {
      "type": "double"
    }
  }
}
--------------------------------------------------
// NOTCONSOLE
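
With a nested mapping like this in place, {feat-imp} values can be aggregated
per feature. As a sketch, a search request body along the following lines could
average the importance of each feature. It assumes the default
`ml.inference.feature_importance` path used in the mapping above. The
aggregation names `feature_importance`, `by_feature`, and `avg_importance` are
arbitrary:

[source,js]
--------------------------------------------------
{
  "size": 0,
  "aggs": {
    "feature_importance": {
      "nested": { "path": "ml.inference.feature_importance" },
      "aggs": {
        "by_feature": {
          "terms": { "field": "ml.inference.feature_importance.feature_name" },
          "aggs": {
            "avg_importance": {
              "avg": { "field": "ml.inference.feature_importance.importance" }
            }
          }
        }
      }
    }
  }
}
--------------------------------------------------
// NOTCONSOLE
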
The mapping field name for {feat-imp} is composed as follows:

`<ml.inference.target_field>`.`<inference.tag>`.`feature_importance`

If `inference.tag` is not provided in the processor definition, it is not part
of the field path. The `<ml.inference.target_field>` defaults to `ml.inference`.

For example, if you provide the tag `foo` in the definition:
[source,js]
--------------------------------------------------
{
  "tag": "foo",
  ...
}
--------------------------------------------------
// NOTCONSOLE
The {feat-imp} value is written to the `ml.inference.foo.feature_importance`
field.

You can also specify a target field as follows:
[source,js]
--------------------------------------------------
{
  "tag": "foo",
  "target_field": "my_field"
}
--------------------------------------------------
// NOTCONSOLE
In this case, {feat-imp} is exposed in the
`my_field.foo.feature_importance` field.
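
For a processor with the tag `foo` and the target field `my_field`, the nested
mapping would be keyed by that full field path. A sketch of the corresponding
entry, written in the same fragment style as the mapping shown earlier in this
section:

[source,js]
--------------------------------------------------
"my_field.foo.feature_importance": {
  "type": "nested",
  "dynamic": true,
  "properties": {
    "feature_name": {
      "type": "keyword"
    },
    "importance": {
      "type": "double"
    }
  }
}
--------------------------------------------------
// NOTCONSOLE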