[DOCS] Adds examples to the PUT dfa and the evaluate dfa APIs (#46966)
* [DOCS] Adds examples to the PUT dfa and the evaluate dfa APIs.
* [DOCS] Removes extra lines from examples.
* Update docs/reference/ml/df-analytics/apis/evaluate-dfanalytics.asciidoc
  Co-Authored-By: Lisa Cawley <lcawley@elastic.co>
* Update docs/reference/ml/df-analytics/apis/put-dfanalytics.asciidoc
  Co-Authored-By: Lisa Cawley <lcawley@elastic.co>
* [DOCS] Explains examples.
parent 7739938930
commit 033aa9cf9b
@@ -172,3 +172,79 @@ only.
<3> The ground truth value for the actual house price. This is required in order
to evaluate results.
<4> The predicted value for house price calculated by the {reganalysis}.


The following example calculates the training error:

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "student_performance_mathematics_reg",
  "query": {
    "term": {
      "ml.is_training": {
        "value": true <1>
      }
    }
  },
  "evaluation": {
    "regression": {
      "actual_field": "G3", <2>
      "predicted_field": "ml.G3_prediction", <3>
      "metrics": {
        "r_squared": {},
        "mean_squared_error": {}
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]

<1> In this example, a test/train split (`training_percent`) was defined for the
{reganalysis}. This query limits evaluation to be performed on the train split
only. It means that a training error will be calculated.
<2> The field that contains the ground truth value for the actual student
performance. This is required in order to evaluate results.
<3> The field that contains the predicted value for student performance
calculated by the {reganalysis}.


The next example calculates the testing error. The only difference compared with
the previous example is that `ml.is_training` is set to `false` this time, so
the query excludes the train split from the evaluation.

[source,console]
--------------------------------------------------
POST _ml/data_frame/_evaluate
{
  "index": "student_performance_mathematics_reg",
  "query": {
    "term": {
      "ml.is_training": {
        "value": false <1>
      }
    }
  },
  "evaluation": {
    "regression": {
      "actual_field": "G3", <2>
      "predicted_field": "ml.G3_prediction", <3>
      "metrics": {
        "r_squared": {},
        "mean_squared_error": {}
      }
    }
  }
}
--------------------------------------------------
// TEST[skip:TBD]

<1> In this example, a test/train split (`training_percent`) was defined for the
{reganalysis}. This query limits evaluation to be performed on the test split
only. It means that a testing error will be calculated.
<2> The field that contains the ground truth value for the actual student
performance. This is required in order to evaluate results.
<3> The field that contains the predicted value for student performance
calculated by the {reganalysis}.
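
As a rough sketch of what the two evaluations above report, the following Python snippet computes `mean_squared_error` and `r_squared` separately for the training and test splits. It assumes the standard textbook definitions of both metrics, which this page does not spell out. The documents, their values, and the `evaluate` helper are hypothetical stand-ins for the contents of the destination index.

```python
# Hypothetical documents shaped like the destination index: the actual value
# is in "G3"; the prediction and the split flag live under "ml".
docs = [
    {"G3": 10, "ml": {"G3_prediction": 11.0, "is_training": True}},
    {"G3": 12, "ml": {"G3_prediction": 12.0, "is_training": True}},
    {"G3": 14, "ml": {"G3_prediction": 13.0, "is_training": True}},
    {"G3": 8, "ml": {"G3_prediction": 9.0, "is_training": False}},
    {"G3": 16, "ml": {"G3_prediction": 14.0, "is_training": False}},
]


def evaluate(docs, is_training):
    """Return MSE and R^2 for one split, using assumed textbook formulas."""
    # Mirrors the term query on ml.is_training in the requests above.
    subset = [d for d in docs if d["ml"]["is_training"] == is_training]
    if not subset:
        raise ValueError("no documents match the requested split")
    actual = [d["G3"] for d in subset]
    predicted = [d["ml"]["G3_prediction"] for d in subset]

    n = len(actual)
    # Residual sum of squares: squared gaps between actual and predicted.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: squared gaps between actual values and their mean.
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return {"mean_squared_error": ss_res / n, "r_squared": 1 - ss_res / ss_tot}


training_error = evaluate(docs, is_training=True)
testing_error = evaluate(docs, is_training=False)
print(training_error)
print(testing_error)
```

Comparing the two results is the usual reason for running both evaluations: a testing error that is much worse than the training error often suggests that the model overfits the training split.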
@@ -179,7 +179,7 @@ The API returns the following result:


[[ml-put-dfanalytics-example-r]]
===== {regression-cap} example
===== {regression-cap} examples

The following example creates the `house_price_regression_analysis`
{dfanalytics-job}, the analysis type is `regression`:
@@ -235,4 +235,31 @@ The API returns the following result:
}
----
// TESTRESPONSE[s/1567168659127/$body.$_path/]
// TESTRESPONSE[s/"version": "8.0.0"/"version": $body.version/]
// TESTRESPONSE[s/"version": "8.0.0"/"version": $body.version/]


The following example creates a job and specifies a training percent:

[source,console]
--------------------------------------------------
PUT _ml/data_frame/analytics/student_performance_mathematics_0.3
{
  "source": {
    "index": "student_performance_mathematics"
  },
  "dest": {
    "index":"student_performance_mathematics_reg"
  },
  "analysis":
    {
      "regression": {
        "dependent_variable": "G3",
        "training_percent": 70 <1>
      }
    }
}
--------------------------------------------------
// TEST[skip:TBD]

<1> The `training_percent` defines the percentage of the data set that will be used
for training the model.
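
To illustrate the effect of `training_percent`, here is a minimal Python sketch of a 70/30 split. The selection procedure (a seeded shuffle followed by a cut) is an assumption made for illustration only; this page does not describe how the job actually picks training documents. `split_by_training_percent` is a hypothetical helper, not part of any API.

```python
import random


def split_by_training_percent(doc_ids, training_percent, seed=0):
    """Flag each document as training or not (assumed shuffle-and-cut procedure)."""
    if not 0 <= training_percent <= 100:
        raise ValueError("training_percent must be between 0 and 100")
    doc_ids = list(doc_ids)
    shuffled = list(doc_ids)
    # A seeded generator keeps the illustration reproducible.
    random.Random(seed).shuffle(shuffled)
    # Number of documents that land in the training split.
    n_train = round(len(shuffled) * training_percent / 100)
    training = set(shuffled[:n_train])
    return {doc_id: doc_id in training for doc_id in doc_ids}


flags = split_by_training_percent(range(10), training_percent=70)
n_training = sum(flags.values())
print(n_training, len(flags) - n_training)
```

In this sketch the boolean flags play the role of the `ml.is_training` field that the evaluate examples query to separate the two splits.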