[DOCS] Adds classification type evaluation docs to the DFA evaluation API (#47657)

István Zoltán Szabó 2019-11-06 07:37:14 -05:00
parent f1396b6322
commit 70765dfb05
2 changed files with 119 additions and 2 deletions


@@ -17,12 +17,14 @@ experimental[]
`POST _ml/data_frame/_evaluate`
[[ml-evaluate-dfanalytics-prereq]]
==== {api-prereq-title}
* You must have the `monitor_ml` privilege to use this API. For more
information, see <<security-privileges>> and <<built-in-roles>>.
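For illustration, a minimal role that grants this privilege can be created
with the security API; the role name `ml_monitor` below is just a placeholder:

[source,console]
--------------------------------------------------
PUT _security/role/ml_monitor
{
  "cluster": [ "monitor_ml" ]
}
--------------------------------------------------
// TEST[skip:TBD]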
[[ml-evaluate-dfanalytics-desc]]
==== {api-description-title}
@@ -52,6 +54,7 @@ Available evaluation types:
* `binary_soft_classification`
* `regression`
* `classification`
--
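The evaluation type is supplied as the single top-level key of the
`evaluation` object. As a schematic sketch (the index and field names below
are placeholders):

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "my_dest_index",
  "evaluation": {
    "regression": {
      "actual_field": "my_ground_truth_field",
      "predicted_field": "ml.my_prediction_field"
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]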
@@ -246,4 +249,88 @@ only. It means that a testing error will be calculated.
<2> The field that contains the ground truth value for the actual student
performance. This is required in order to evaluate results.
<3> The field that contains the predicted value for student performance
calculated by the {reganalysis}.
===== {classification-cap}
[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "animal_classification",
  "evaluation": {
    "classification": { <1>
      "actual_field": "animal_class", <2>
      "predicted_field": "ml.animal_class_prediction.keyword", <3>
      "metrics": {
        "multiclass_confusion_matrix" : {} <4>
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]
<1> The evaluation type.
<2> The field that contains the ground truth value for the actual animal
classification. This is required in order to evaluate results.
<3> The field that contains the value predicted for the animal classification
by the {classanalysis}. Since the field that stores the predicted class is
dynamically mapped as both `text` and `keyword`, you need to add the
`.keyword` suffix to the name.
<4> Specifies the metric for the evaluation.
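Because the mapping of the predicted field is created dynamically, it can be
worth confirming that a `keyword` sub-field exists before running the
evaluation. A quick check with the field mapping API, reusing the field path
from the example above:

[source,console]
--------------------------------------------------
GET animal_classification/_mapping/field/ml.animal_class_prediction
--------------------------------------------------
// TEST[skip:TBD]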
The evaluation request above returns the following result:
[source,console-result]
--------------------------------------------------
{
  "classification" : {
    "multiclass_confusion_matrix" : {
      "confusion_matrix" : [
        {
          "actual_class" : "cat", <1>
          "actual_class_doc_count" : 12, <2>
          "predicted_classes" : [ <3>
            {
              "predicted_class" : "cat",
              "count" : 12 <4>
            },
            {
              "predicted_class" : "dog",
              "count" : 0 <5>
            }
          ],
          "other_predicted_class_doc_count" : 0 <6>
        },
        {
          "actual_class" : "dog",
          "actual_class_doc_count" : 11,
          "predicted_classes" : [
            {
              "predicted_class" : "dog",
              "count" : 7
            },
            {
              "predicted_class" : "cat",
              "count" : 4
            }
          ],
          "other_predicted_class_doc_count" : 0
        }
      ],
      "other_actual_class_count" : 0
    }
  }
}
--------------------------------------------------
<1> The name of the actual class that the analysis tried to predict.
<2> The number of documents in the index that belong to the `actual_class`.
<3> This object contains the list of predicted classes and the number of
predictions associated with each class.
<4> The number of cats in the dataset that are correctly identified as cats.
<5> The number of cats in the dataset that are incorrectly classified as dogs.
<6> The number of documents that were classified into a class that is not
listed as a `predicted_class`.
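As in the {reganalysis} example earlier, the optional `query` parameter can
restrict the evaluation to the testing data only. A sketch, assuming the
destination index marks documents with the `ml.is_training` field:

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "animal_classification",
  "query": {
    "term": { "ml.is_training": { "value": false } }
  },
  "evaluation": {
    "classification": {
      "actual_field": "animal_class",
      "predicted_field": "ml.animal_class_prediction.keyword",
      "metrics": {
        "multiclass_confusion_matrix": {}
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]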


@@ -18,6 +18,7 @@ Evaluation configuration objects relate to the <<evaluate-dfanalytics>>.
Available evaluation types:
* `binary_soft_classification`
* `regression`
* `classification`
--
`query`::
@@ -95,4 +96,33 @@ which outputs a prediction of values.
`metrics`::
(object) Specifies the metrics that are used for the evaluation. Available
metrics are `r_squared` and `mean_squared_error`.
[[classification-evaluation-resources]]
==== {classification-cap} evaluation objects
{classification-cap} evaluation assesses the results of a {classanalysis},
which outputs a prediction identifying the class to which each document
belongs.
[discrete]
[[classification-evaluation-resources-properties]]
===== {api-definitions-title}
`actual_field`::
(string) The field of the `index` that contains the ground truth. The data
type of this field must be `keyword`.
`metrics`::
(object) Specifies the metrics that are used for the evaluation. The only
available metric is `multiclass_confusion_matrix`.
`predicted_field`::
(string) The field in the `index` that contains the predicted value, in other
words the results of the {classanalysis}. The data type of this field is
string. You need to append `.keyword` to the predicted field name (the name
you specified in the {classanalysis} object as `prediction_field_name`, or its
default value if you didn't set it explicitly). For example,
`predicted_field` : `ml.animal_class_prediction.keyword`.
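For context, the predicted field name originates in the {classanalysis}
configuration. A minimal sketch of such a job (the job ID and index names are
hypothetical) that sets `prediction_field_name` explicitly:

[source,console]
--------------------------------------------------
PUT _ml/data_frame/analytics/animal_classification_job
{
  "source": { "index": "animals" },
  "dest": { "index": "animal_classification" },
  "analysis": {
    "classification": {
      "dependent_variable": "animal_class",
      "prediction_field_name": "animal_class_prediction"
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]

Because the results are written under the default `ml` results field, the
evaluation would then reference `ml.animal_class_prediction.keyword`.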