OpenSearch/x-pack
Dimitris Athanasiou 873ad3f942
[7.x][ML] Add option to regression to randomize training set (#45969) (#46017)
Adds a parameter `training_percent` to regression. The default
value is `100`. When the parameter is set to a value less than `100`,
from the rows that can be used for training (i.e. those that have a
value for the dependent variable) we randomly choose which ones to actually
use for training. This enables splitting the data into a training set and
the rest, usually called the testing, validation or holdout set, which allows
for validating the model on data that has not been used for training.

Technically, the analytics process considers as training the data that
have a value for the dependent variable. Thus, when we decide a training
row is not going to be used for training, we simply clear the row's
dependent variable.
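
The mechanism described above can be sketched roughly as follows. This is an illustrative Python sketch only, not the code from this change: the function name `select_training_rows`, the dict-based row representation, the `seed` argument and the per-row independent draw are all assumptions made for the example.

```python
import random


def select_training_rows(rows, dependent_variable, training_percent=100.0, seed=None):
    """Return copies of `rows` where some training rows have had their
    dependent variable cleared, so they are no longer treated as training data.

    Hypothetical helper for illustration; names and sampling scheme are assumed.
    """
    rng = random.Random(seed)
    result = []
    for row in rows:
        row = dict(row)  # copy, so the caller's rows are not mutated
        is_training_candidate = row.get(dependent_variable) is not None
        # Keep a candidate for training with probability training_percent / 100;
        # otherwise clear its dependent variable so it is held out.
        if is_training_candidate and rng.random() * 100 >= training_percent:
            row[dependent_variable] = None
        result.append(row)
    return result
```

With `training_percent=100` every candidate row is kept, since `rng.random()` is always below 1. Lower values clear the dependent variable on a random share of rows, which then form the holdout set.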
2019-08-27 17:53:11 +03:00
..
dev-tools Build: Merge xpack checkstyle config into core (#33399) 2018-09-05 09:17:02 -04:00
docs Add `manage_own_api_key` cluster privilege (#45897) (#46023) 2019-08-28 00:44:23 +10:00
license-tools [Backport] Remove dependency substitutions 7.x (#42866) 2019-06-04 13:50:23 -07:00
plugin [7.x][ML] Add option to regression to randomize training set (#45969) (#46017) 2019-08-27 17:53:11 +03:00
qa Partly revert globalInfo.ready check (#45960) 2019-08-27 13:01:56 +03:00
snapshot-tool CLI tools: write errors to stderr instead of stdout (#45586) 2019-08-21 14:46:07 -04:00
test [Backport] Remove dependency substitutions 7.x (#42866) 2019-06-04 13:50:23 -07:00
transport-client [Backport] Remove dependency substitutions 7.x (#42866) 2019-06-04 13:50:23 -07:00
NOTICE.txt
README.md
build.gradle [Backport] Remove dependency substitutions 7.x (#42866) 2019-06-04 13:50:23 -07:00

README.md

Elastic License Functionality

This directory tree contains files subject to the Elastic License. The files subject to the Elastic License are grouped in this directory to clearly separate them from files licensed under the Apache License 2.0.