Object fields cannot be used as features. At the moment _explain
API includes them and even worse it allows it does not error when
an object field is excluded. This creates the expectation to the
user that all children fields will also be excluded while it's not
the case.
This commit omits object fields from the _explain API and also
adds an error if an object field is included or excluded.
Backport of #51115
The version replacement for the code snippet should replace 7.6 with the current version,
but doesn't match because of a missing whitespace.
Closes#51052
Adds a new parameter to regression and classification that enables computation
of importance for the top most important features. The computation of the importance
is based on SHAP (SHapley Additive exPlanations) method.
Backport of #50914
Adds a `force` parameter to the delete data frame analytics
request. When `force` is `true`, the action force-stops the
jobs and then proceeds to the deletion. This can be used in
order to delete a non-stopped job with a single request.
Closes#48124
Backport of #50553
The docs/reference/redirects.asciidoc file stores a list of relocated or
deleted pages for the Elasticsearch Reference documentation.
This prunes several older redirects that are no longer needed and
don't require work to fix broken links in other repositories.
This adds a new `randomize_seed` for regression and classification.
When not explicitly set, the seed is randomly generated. One can
reuse the seed in a similar job in order to ensure the same docs
are picked for training.
Backport of #49990
This adds a `_source` setting under the `source` setting of a data
frame analytics config. The new `_source` is reusing the structure
of a `FetchSourceContext` like `analyzed_fields` does. Specifying
includes and excludes for source allows selecting which fields
will get reindexed and will be available in the destination index.
Closes#49531
Backport of #49690
The categorization job wizard in the ML UI will use this
information when showing the effect of the chosen categorization
analyzer on a sample of input.
This commit replaces the _estimate_memory_usage API with
a new API, the _explain API.
The API consolidates information that is useful before
creating a data frame analytics job.
It includes:
- memory estimation
- field selection explanation
Memory estimation is moved here from what was previously
calculated in the _estimate_memory_usage API.
Field selection is a new feature that explains to the user
whether each available field was selected to be included or
not in the analysis. In the case it was not included, it also
explains the reason why.
Backport of #49455
* [ML] Add new geo_results.(actual_point|typical_point) fields for `lat_long` results (#47050)
[ML] Add new geo_results.(actual_point|typical_point) fields for `lat_long` results (#47050)
Related PR: https://github.com/elastic/ml-cpp/pull/809
* adjusting bwc version