[DOCS] Adds examples to the PUT dfa and the evaluate dfa APIs (#46966)

* [DOCS] Adds examples to the PUT dfa and the evaluate dfa APIs.

* [DOCS] Removes extra lines from examples.

* Update docs/reference/ml/df-analytics/apis/evaluate-dfanalytics.asciidoc

Co-Authored-By: Lisa Cawley <lcawley@elastic.co>

* Update docs/reference/ml/df-analytics/apis/put-dfanalytics.asciidoc

Co-Authored-By: Lisa Cawley <lcawley@elastic.co>

* [DOCS] Explains examples.
István Zoltán Szabó 2019-10-02 10:26:20 +02:00
parent 7739938930
commit 033aa9cf9b
2 changed files with 105 additions and 2 deletions


@ -172,3 +172,79 @@ only.
<3> The ground truth value for the actual house price. This is required in order
to evaluate results.
<4> The predicted value for house price calculated by the {reganalysis}.
The following example calculates the training error:
[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "student_performance_mathematics_reg",
  "query": {
    "term": {
      "ml.is_training": {
        "value": true <1>
      }
    }
  },
  "evaluation": {
    "regression": {
      "actual_field": "G3", <2>
      "predicted_field": "ml.G3_prediction", <3>
      "metrics": {
        "r_squared": {},
        "mean_squared_error": {}
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]
<1> In this example, a test/train split (`training_percent`) was defined for the
{reganalysis}. This query limits the evaluation to the training split, so the
result is the training error.
<2> The field that contains the ground truth value for the actual student
performance. This is required in order to evaluate results.
<3> The field that contains the predicted value for student performance
calculated by the {reganalysis}.
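As a rough illustration of what the two requested metrics measure, the sketch below computes them using the standard textbook definitions of mean squared error and R-squared. It is not taken from the Elasticsearch implementation, and the sample grades and predictions are hypothetical values chosen for illustration.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """1 minus (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical G3 values and predictions (assumed data, not from the docs).
actual = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]

mse = mean_squared_error(actual, predicted)
r2 = r_squared(actual, predicted)
print(f"mean_squared_error={mse}, r_squared={r2}")
```

Computing the same metrics over training documents only gives the training error; computing them over the remaining documents gives the testing error.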
The next example calculates the testing error. The only difference compared with
the previous example is that `ml.is_training` is set to `false` this time, so
the query excludes the train split from the evaluation.
[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "student_performance_mathematics_reg",
  "query": {
    "term": {
      "ml.is_training": {
        "value": false <1>
      }
    }
  },
  "evaluation": {
    "regression": {
      "actual_field": "G3", <2>
      "predicted_field": "ml.G3_prediction", <3>
      "metrics": {
        "r_squared": {},
        "mean_squared_error": {}
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]
<1> In this example, a test/train split (`training_percent`) was defined for the
{reganalysis}. This query limits the evaluation to the test split, so the
result is the testing error.
<2> The field that contains the ground truth value for the actual student
performance. This is required in order to evaluate results.
<3> The field that contains the predicted value for student performance
calculated by the {reganalysis}.
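Because the two evaluate requests differ only in the `ml.is_training` value, a client could build both bodies from a single helper. The following is a minimal sketch; `build_evaluate_body` is a hypothetical helper name, not part of any Elasticsearch client, and the snippet only constructs the JSON body without sending it.

```python
import json


def build_evaluate_body(index, actual_field, predicted_field, is_training):
    """Build the evaluate request body for one side of the test/train split.

    Hypothetical helper for illustration; it mirrors the request bodies
    shown in the examples above.
    """
    return {
        "index": index,
        "query": {"term": {"ml.is_training": {"value": is_training}}},
        "evaluation": {
            "regression": {
                "actual_field": actual_field,
                "predicted_field": predicted_field,
                "metrics": {"r_squared": {}, "mean_squared_error": {}},
            }
        },
    }


# Same index and fields as in the examples; only the flag changes.
common = ("student_performance_mathematics_reg", "G3", "ml.G3_prediction")
training_body = build_evaluate_body(*common, is_training=True)
testing_body = build_evaluate_body(*common, is_training=False)

print(json.dumps(testing_body, indent=2))
```

Either body can then be sent as the payload of `POST _ml/data_frame/_evaluate` with whatever HTTP client is in use.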


@ -179,7 +179,7 @@ The API returns the following result:
[[ml-put-dfanalytics-example-r]]
===== {regression-cap} example
===== {regression-cap} examples
The following example creates the `house_price_regression_analysis`
{dfanalytics-job} with the analysis type `regression`:
@ -235,4 +235,31 @@ The API returns the following result:
}
----
// TESTRESPONSE[s/1567168659127/$body.$_path/]
// TESTRESPONSE[s/"version": "8.0.0"/"version": $body.version/]
// TESTRESPONSE[s/"version": "8.0.0"/"version": $body.version/]
The following example creates a job and specifies a training percent:
[source,console]
--------------------------------------------------
PUT _ml/data_frame/analytics/student_performance_mathematics_0.3
{
  "source": {
    "index": "student_performance_mathematics"
  },
  "dest": {
    "index": "student_performance_mathematics_reg"
  },
  "analysis": {
    "regression": {
      "dependent_variable": "G3",
      "training_percent": 70 <1>
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]
<1> The `training_percent` defines the percentage of the data set that is used
to train the model. In this example, 70% of the data set is used for training.
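To make the effect of `training_percent` concrete, the sketch below marks a share of documents as training data and leaves the rest for testing. This is only an illustration of the idea: the random, exact-count selection, the `assign_training_flags` helper, and the document count of 400 are all assumptions, and Elasticsearch may choose training documents differently.

```python
import random


def assign_training_flags(doc_count, training_percent, seed=0):
    """Return one boolean per document; True means the document is used for training.

    Illustrative only: selects an exact share of documents at random.
    This is an assumption, not the documented Elasticsearch behavior.
    """
    rng = random.Random(seed)
    training_count = round(doc_count * training_percent / 100)
    training_ids = set(rng.sample(range(doc_count), training_count))
    return [i in training_ids for i in range(doc_count)]


# Hypothetical data set size; 70 matches the training_percent in the example.
flags = assign_training_flags(doc_count=400, training_percent=70)
training_docs = sum(flags)
testing_docs = len(flags) - training_docs
print(f"training={training_docs}, testing={testing_docs}")
```

In the destination index, the corresponding flag is the `ml.is_training` field that the evaluate examples filter on.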