[role="xpack"]
[testenv="basic"]
[[inference-processor]]
=== {infer-cap} Processor

Uses a pre-trained {dfanalytics} model to infer against the data that is being
ingested in the pipeline.


[[inference-options]]
.{infer-cap} Options
[options="header"]
|======
| Name               | Required | Default                        | Description
| `model_id`         | yes      | -                              | (String) The ID of the model to load and infer against.
| `target_field`     | no       | `ml.inference.<processor_tag>` | (String) Field added to incoming documents to contain results objects.
| `field_map`        | yes      | -                              | (Object) Maps the document field names to the known field names of the model. This mapping takes precedence over any default mappings provided in the model configuration.
| `inference_config` | yes      | -                              | (Object) Contains the inference type and its options. There are two types: <<inference-processor-regression-opt,`regression`>> and <<inference-processor-classification-opt,`classification`>>.
include::common-options.asciidoc[]
|======


[source,js]
--------------------------------------------------
{
  "inference": {
    "model_id": "flight_delay_regression-1571767128603",
    "target_field": "FlightDelayMin_prediction_infer",
    "field_map": {},
    "inference_config": { "regression": {} }
  }
}
--------------------------------------------------
// NOTCONSOLE
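
For context, the processor runs inside an ingest pipeline. A minimal sketch of
a pipeline definition that embeds the configuration above might look like the
following (the pipeline name `flight_delays` and its description are
illustrative):

[source,js]
--------------------------------------------------
PUT _ingest/pipeline/flight_delays
{
  "description": "Adds a flight delay prediction to incoming documents",
  "processors": [
    {
      "inference": {
        "model_id": "flight_delay_regression-1571767128603",
        "target_field": "FlightDelayMin_prediction_infer",
        "field_map": {},
        "inference_config": { "regression": {} }
      }
    }
  ]
}
--------------------------------------------------
// NOTCONSOLE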


[discrete]
[[inference-processor-regression-opt]]
==== {regression-cap} configuration options

Regression configuration for inference.

`results_field`::
(Optional, string)
include::{docdir}/ml/ml-shared.asciidoc[tag=inference-config-results-field]

`num_top_feature_importance_values`::
(Optional, integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=inference-config-regression-num-top-feature-importance-values]
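
For instance, a `regression` configuration that writes its prediction to a
custom results field and requests {feat-imp} for the two most important
features might be sketched as follows (the field name and value are
illustrative):

[source,js]
--------------------------------------------------
{
  "inference_config": {
    "regression": {
      "results_field": "my_regression",
      "num_top_feature_importance_values": 2
    }
  }
}
--------------------------------------------------
// NOTCONSOLE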


[discrete]
[[inference-processor-classification-opt]]
==== {classification-cap} configuration options

Classification configuration for inference.

`num_top_classes`::
(Optional, integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=inference-config-classification-num-top-classes]

`num_top_feature_importance_values`::
(Optional, integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=inference-config-classification-num-top-feature-importance-values]

`results_field`::
(Optional, string)
include::{docdir}/ml/ml-shared.asciidoc[tag=inference-config-results-field]

`top_classes_results_field`::
(Optional, string)
include::{docdir}/ml/ml-shared.asciidoc[tag=inference-config-classification-top-classes-results-field]

`prediction_field_type`::
(Optional, string)
include::{docdir}/ml/ml-shared.asciidoc[tag=inference-config-classification-prediction-field-type]


[discrete]
[[inference-processor-config-example]]
==== `inference_config` examples

[source,js]
--------------------------------------------------
{
  "inference_config": {
    "regression": {
      "results_field": "my_regression"
    }
  }
}
--------------------------------------------------
// NOTCONSOLE

This configuration specifies a `regression` inference and the results are
written to the `my_regression` field contained in the `target_field` results
object.


[source,js]
--------------------------------------------------
{
  "inference_config": {
    "classification": {
      "num_top_classes": 2,
      "results_field": "prediction",
      "top_classes_results_field": "probabilities"
    }
  }
}
--------------------------------------------------
// NOTCONSOLE

This configuration specifies a `classification` inference. The number of
categories for which the predicted probabilities are reported is 2
(`num_top_classes`). The result is written to the `prediction` field and the top
classes to the `probabilities` field. Both fields are contained in the
`target_field` results object.
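
A `classification` configuration can also control the type of the predicted
value through `prediction_field_type`. As a sketch, the following requests that
the predicted class is written as a boolean rather than as the default string:

[source,js]
--------------------------------------------------
{
  "inference_config": {
    "classification": {
      "results_field": "prediction",
      "prediction_field_type": "boolean"
    }
  }
}
--------------------------------------------------
// NOTCONSOLE

Here the predicted value is written to the `prediction` field as `true` or
`false` instead of as a string.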


[discrete]
[[inference-processor-feature-importance]]
==== {feat-imp-cap} object mapping

Update your index mapping of the {feat-imp} result field as shown below to take
full advantage of aggregating and searching for
{ml-docs}/dfa-classification.html#dfa-classification-feature-importance[{feat-imp}].

[source,js]
--------------------------------------------------
"ml.inference.feature_importance": {
  "type": "nested",
  "dynamic": true,
  "properties": {
    "feature_name": {
      "type": "keyword"
    },
    "importance": {
      "type": "double"
    }
  }
}
--------------------------------------------------
// NOTCONSOLE
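
For example, one way to apply this mapping when creating an index is to spell
out the dotted path as nested object properties, as in the following sketch
(the index name `my-index` is illustrative):

[source,js]
--------------------------------------------------
PUT my-index
{
  "mappings": {
    "properties": {
      "ml": {
        "properties": {
          "inference": {
            "properties": {
              "feature_importance": {
                "type": "nested",
                "dynamic": true,
                "properties": {
                  "feature_name": { "type": "keyword" },
                  "importance": { "type": "double" }
                }
              }
            }
          }
        }
      }
    }
  }
}
--------------------------------------------------
// NOTCONSOLE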

The mapping field name for {feat-imp} is compounded as follows:

`<ml.inference.target_field>`.`<inference.tag>`.`feature_importance`

If `inference.tag` is not provided in the processor definition, it is not part
of the field path. The `<ml.inference.target_field>` defaults to `ml.inference`.

For example, suppose you provide the tag `foo` in the processor definition:

[source,js]
--------------------------------------------------
{
  "tag": "foo",
  ...
}
--------------------------------------------------
// NOTCONSOLE

The {feat-imp} value is then written to the `ml.inference.foo.feature_importance`
field.

You can also specify a target field as follows:

[source,js]
--------------------------------------------------
{
  "tag": "foo",
  "target_field": "my_field"
}
--------------------------------------------------
// NOTCONSOLE

In this case, {feat-imp} is exposed in the
`my_field.foo.feature_importance` field.