OpenSearch/x-pack
Dimitris Athanasiou f2d4c94a9c
[7.x][ML] Deduplicate multi-fields for data frame analytics (#48799) (#48806)
When multi-fields exist in the source index, our extracted fields
detection for data frame analytics picks up all of their variants.
This means we may have multiple instances of the same feature.
The worst consequence of this is when the dependent variable
(for regression or classification) is also duplicated, which
means we train a model on the dependent variable itself.

Now that #48770 is merged, this commit adds logic to
select only one variant of each multi-field.
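
As a rough illustration of the idea (not the actual implementation in this commit), the sketch below groups candidate fields by their parent and keeps one variant per group. The function name `dedupe_multi_fields`, the `parent_of` mapping, and the selection rule (prefer the parent field, otherwise the alphabetically first variant) are all assumptions made for the example.

```python
def dedupe_multi_fields(fields, parent_of):
    """Keep exactly one field per multi-field group.

    fields:    iterable of candidate field names.
    parent_of: hypothetical mapping from a multi-field variant to its
               parent field, e.g. {"title.keyword": "title"}.
    """
    # Group every candidate under its parent; plain fields are their own parent.
    groups = {}
    for name in fields:
        root = parent_of.get(name, name)
        groups.setdefault(root, []).append(name)

    selected = []
    for root, variants in groups.items():
        # Illustrative rule only: prefer the parent itself if it is a
        # candidate, otherwise take the alphabetically first variant.
        selected.append(root if root in variants else sorted(variants)[0])
    return sorted(selected)


fields = ["price", "title", "title.keyword", "tag.raw", "tag.keyword"]
parent_of = {"title.keyword": "title", "tag.raw": "tag", "tag.keyword": "tag"}
print(dedupe_multi_fields(fields, parent_of))
```

With a rule like this, a dependent variable such as `title` can no longer appear a second time as `title.keyword` among the features.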

Closes #48756

Backport of #48799
2019-11-01 16:53:05 +02:00
..
dev-tools
docs Fix indentation of "except" in role mapping doc 2019-11-01 10:46:15 -04:00
license-tools
plugin [7.x][ML] Deduplicate multi-fields for data frame analytics (#48799) (#48806) 2019-11-01 16:53:05 +02:00
qa Copy http headers to ThreadContext strictly (#45945) (#48675) 2019-10-31 23:05:12 +02:00
snapshot-tool GCS snapshot cleanup tool backport to 7.x (#48750) 2019-10-31 18:21:36 +03:00
test Document SAML APIs (#45105) (#47909) 2019-10-11 16:34:11 +03:00
transport-client
NOTICE.txt
README.md
build.gradle

README.md

Elastic License Functionality

This directory tree contains files subject to the Elastic License. The files subject to the Elastic License are grouped in this directory to clearly separate them from files licensed under the Apache License 2.0.