823 Commits

Author SHA1 Message Date
Julie Tibshirani
63062ec7bd Mute ClassificationIT.testDependentVariableCardinalityTooHighButWithQueryMakesItWithinRange. 2020-05-05 13:48:35 -07:00
Dan Hermann
6674f14fb3
[7.x] Get index includes parent data stream for backing indices (#56238) 2020-05-05 15:43:42 -05:00
Benjamin Trent
e1c5ca421e
[7.x] [ML] lay ground work for handling >1 result indices (#55892) (#56192)
* [ML] lay ground work for handling >1 result indices (#55892)

This commit removes all but one reference to `getInitialResultsIndexName`. 
This is to support more than one result index for a single job.
2020-05-05 15:54:08 -04:00
William Brafford
3499fa917c
Deprecated xpack "enable" settings should be no-ops (#55416) (#56167)
The following settings are now no-ops:

* xpack.flattened.enabled
* xpack.logstash.enabled
* xpack.rollup.enabled
* xpack.slm.enabled
* xpack.sql.enabled
* xpack.transform.enabled
* xpack.vectors.enabled

Since these settings no longer need to be checked, we can remove settings
parameters from a number of constructors and methods, and do so in this
commit.

We also update documentation to remove references to these settings.
2020-05-05 10:40:49 -04:00
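For readers following these no-op conversions, a minimal sketch of the pattern, assuming stand-in names (the real constants live in Elasticsearch's XPackSettings and may differ): each setting stays registered and deprecated while nothing reads its value anymore.

```java
import org.elasticsearch.common.settings.Setting;
import org.elasticsearch.common.settings.Setting.Property;

// Hypothetical sketch: the setting remains registered so existing configs
// still validate, and Property.Deprecated makes any use log a warning,
// but no code path consults the value anymore -- hence "no-op".
public final class NoOpSettingSketch {
    public static final Setting<Boolean> SQL_ENABLED_NOOP =
        Setting.boolSetting("xpack.sql.enabled", true, Property.NodeScope, Property.Deprecated);
}
```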
David Roberts
7aa0daaabd
[7.x][ML] More advanced model snapshot retention options (#56194)
This PR implements the following changes to make ML model snapshot
retention more flexible in advance of adding a UI for the feature in
an upcoming release.

- The default for `model_snapshot_retention_days` for new jobs is now
  10 instead of 1
- There is a new job setting, `daily_model_snapshot_retention_after_days`,
  that defaults to 1 for new jobs and `model_snapshot_retention_days`
  for pre-7.8 jobs
- For days that are older than `model_snapshot_retention_days`, all
  model snapshots are deleted as before
- For days that are in between `daily_model_snapshot_retention_after_days`
  and `model_snapshot_retention_days`, all but the first model snapshot
  for that day are deleted
- The `retain` setting of model snapshots is still respected to allow
  selected model snapshots to be retained indefinitely

Backport of #56125
2020-05-05 14:31:58 +01:00
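The retention rules listed in the commit above can be sketched in a few lines; the names below are illustrative, not the actual ML maintenance code, and "firstOfDay" stands for "this is the first snapshot taken on its calendar day".

```java
import java.time.Duration;
import java.time.Instant;

// Illustrative sketch of the documented retention rules, not the real code.
final class SnapshotRetentionSketch {
    static boolean shouldDelete(Instant snapshotTime, boolean firstOfDay, boolean retain,
                                long dailyRetentionAfterDays, long retentionDays, Instant now) {
        if (retain) {
            return false; // explicitly retained snapshots are kept indefinitely
        }
        long ageDays = Duration.between(snapshotTime, now).toDays();
        if (ageDays > retentionDays) {
            return true;  // older than model_snapshot_retention_days: always deleted
        }
        if (ageDays > dailyRetentionAfterDays) {
            return firstOfDay == false; // keep only the first snapshot of each day
        }
        return false;     // recent enough: all snapshots kept
    }
}
```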
Dimitris Athanasiou
75dadb7a6d
[7.x][ML] Add loss_function to regression (#56118) (#56187)
Adds parameters `loss_function` and `loss_function_parameter`
to regression.

Backport of #56118
2020-05-05 14:59:51 +03:00
Dimitris Athanasiou
6061aa3db4
[7.x][ML] Fix race condition updating reindexing progress (#56135) (#56146)
In #55763 I thought I could remove the flag that marks that
reindexing has finished on a data frame analytics task.
However, that exposed a race condition. It is possible that
between updating reindexing progress to 100 (because we
have called `DataFrameAnalyticsManager.startAnalytics()`) and
a call to the _stats API (which updates reindexing progress via the
method `DataFrameAnalyticsTask.updateReindexTaskProgress()`) we
end up overwriting the 100 with a lower progress value.

This commit fixes this issue by bringing back the help of
a `isReindexingFinished` flag as it was prior to #55763.

Closes #56128

Backport of #56135
2020-05-05 10:48:42 +03:00
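A minimal sketch of the guard this commit restores, with illustrative names rather than the actual `DataFrameAnalyticsTask` fields:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Once reindexing is marked finished (progress forced to 100), later
// task-progress callbacks must not overwrite it with a stale, lower value.
final class ReindexProgressSketch {
    private final AtomicBoolean isReindexingFinished = new AtomicBoolean(false);
    private volatile int reindexingProgressPercent = 0;

    void markReindexingFinished() {
        isReindexingFinished.set(true);
        reindexingProgressPercent = 100;
    }

    void updateFromTaskStats(int percent) {
        if (isReindexingFinished.get() == false) { // stale updates are dropped
            reindexingProgressPercent = percent;
        }
    }
}
```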
Martijn van Groningen
2ac32db607
Move includeDataStream flag from IndicesOptions to IndexNameExpressionResolver.Context (#56151)
Backport of #56034.

Move includeDataStream flag from an IndicesOptions to IndexNameExpressionResolver.Context
as a dedicated field that callers to IndexNameExpressionResolver can set.

Also alter indices stats api to support data streams.
The rollover api uses this api, and rolling over a data stream otherwise no longer works.

Relates to #53100
2020-05-04 22:38:33 +02:00
Benjamin Trent
6c26de444d
[ML] reduce InferenceProcessor.Factory log spam by not parsing pipelines (#56020) (#56126)
If there are ill-formed pipelines, or other pipelines that are not yet ready to be parsed, `InferenceProcessor.Factory::accept(ClusterState)` logs warnings. This can be confusing and cause log spam.

It might lead folks to think there is an issue with the inference processor. Also, they would see logs for the inference processor even though they might not be using it, leading to more confusion.

Additionally, pipelines might not be parseable in this method as some processors require the new cluster state metadata before construction (e.g. `enrich` requires cluster metadata to be set before creating the processor).

closes https://github.com/elastic/elasticsearch/issues/55985
2020-05-04 13:32:01 -04:00
Martijn van Groningen
6d03081560
Add auto create action (#56122)
Backport of #55858 to 7.x branch.

Currently the TransportBulkAction detects whether an index is missing and
then decides whether it should be auto created. The coordination of the
index creation also happens in the TransportBulkAction on the coordinating node.

This change adds a new transport action that the TransportBulkAction delegates to
if missing indices need to be created. The reasons for this change:

* Auto creation of data streams can't occur on the coordinating node.
Based on the index template (v2) either a regular index or a data stream should be created.
However if the coordinating node is slow in processing cluster state updates then it may be
unaware of the existence of certain index templates, which can then lead to the
TransportBulkAction creating an index instead of a data stream. Therefore the coordination of
creating an index or data stream should occur on the master node. See #55377

* From a security perspective it is useful to know whether index creation originates from the
create index api or from auto creating a new index via the bulk or index api. For example
a user would be allowed to auto create an index, but not to use the create index api. The
auto create action will allow security to distinguish these two different patterns of
index creation.
This change adds the following new transport action:

AutoCreateAction: the TransportBulkAction redirects to this action, which will actually create the index (instead of the TransportCreateIndexAction). Later, via #55377, the AutoCreateAction can be improved to also determine whether an index or data stream should be created.

The create_index index privilege is also modified, so that if this permission is granted then a user is also allowed to auto create indices. This change does not yet add an auto_create index privilege. A future change can introduce this new index privilege or modify an existing index / write index privilege.

Relates to #53100
2020-05-04 19:10:09 +02:00
Przemysław Witek
44f5a8ccd3
Use snapshot's latest result time rather than snapshot's creation time when creating an annotation (#56093) (#56103) 2020-05-04 12:36:12 +02:00
William Brafford
d53c941c41
Make xpack.monitoring.enabled setting a no-op (#55617) (#56061)
* Make xpack.monitoring.enabled setting a no-op

This commit turns xpack.monitoring.enabled into a no-op. Mostly, this involved
removing the setting from the setup for integration tests. Monitoring may
introduce some complexity for test setup and teardown, so we should keep an eye
out for turbulence and failures.

* Docs for making deprecated setting a no-op
2020-05-01 16:42:11 -04:00
Ryan Ernst
52b9d8d15e
Convert remaining license methods to isAllowed (#55908) (#55991)
This commit converts the remaining isXXXAllowed methods to instead use
isAllowed with a Feature value. There are a couple of other methods
that are static, as well as some licensed features that check the
license directly, but those will be dealt with in other followups.
2020-04-30 15:52:22 -07:00
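A self-contained sketch of the conversion, using stand-in types (the real class is `XPackLicenseState`, and its exact `Feature` constants may differ):

```java
// Collapsing per-feature isXXXAllowed() methods into one isAllowed(Feature).
final class LicenseStateSketch {
    enum Feature { MACHINE_LEARNING, WATCHER, GRAPH }

    boolean isAllowed(Feature feature) {
        return true; // placeholder: the real check consults the current license level
    }

    // Old style, now expressed through isAllowed(Feature):
    boolean isMachineLearningAllowed() {
        return isAllowed(Feature.MACHINE_LEARNING);
    }
}
```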
Benjamin Trent
c36bcb4dd0
[ML] fixing file structure finder multiline merge max for delimited formats (#56023) (#56035)
This commit correctly sets the maxLinesPerRow in the CsvPreference for delimited files given the file structure finder settings.

Previously, it was silently ignored.
2020-04-30 10:51:32 -04:00
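A sketch of the fix, assuming Super CSV's `CsvPreference.Builder` (the preference class named in the commit); the concrete values are illustrative:

```java
import org.supercsv.prefs.CsvPreference;

// The file structure finder's maxLinesPerRow hint must be passed into the
// Super CSV preference rather than silently dropped, as it was before.
public final class CsvPrefSketch {
    public static CsvPreference build(char quote, char delimiter, int maxLinesPerRow) {
        return new CsvPreference.Builder(quote, delimiter, "\n")
            .maxLinesPerRow(maxLinesPerRow) // previously this setting was ignored
            .build();
    }
}
```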
Benjamin Trent
04b1f6498b
[ML] using new fixed interval in ml tests (#56021) (#56031)
This commit removes deprecated references to DateHistogram.interval from ml tests
2020-04-30 10:26:39 -04:00
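A sketch of the kind of change involved, replacing the deprecated setter with the explicit `fixedInterval` variant; the aggregation name, field, and interval are illustrative:

```java
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.histogram.DateHistogramAggregationBuilder;
import org.elasticsearch.search.aggregations.bucket.histogram.DateHistogramInterval;

public final class FixedIntervalSketch {
    public static DateHistogramAggregationBuilder hourly() {
        return AggregationBuilders.dateHistogram("buckets")
            .field("timestamp")
            .fixedInterval(new DateHistogramInterval("1h")); // was: .interval(3600000L)
    }
}
```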
Dimitris Athanasiou
17b904def5
[7.x][ML] Decouple DFA progress testing from analyses phases (#55925) (#56024)
This refactors native integ tests to assert progress without
expecting explicit phases for analyses. We can test those with
yaml tests in a single place.

Backport of #55925
2020-04-30 17:05:47 +03:00
William Brafford
273ff6a105
Make xpack.ilm.enabled setting a no-op (#55592) (#55980)
* Make xpack.ilm.enabled setting a no-op

* Add watcher setting to not use ILM

* Update documentation for no-op setting

* Remove NO_ILM ml index templates

* Remove unneeded setting from test setup

* Inline variable definitions for ML templates

* Use identical parameter names in templates

* New ILM/watcher setting falls back to old setting

* Add fallback unit test for watcher/ilm setting
2020-04-30 09:50:18 -04:00
David Kyle
c204353249
[ML] Wait for model loaded and cached in ModelLoadingServiceTests (#56014)
Fixes the test by exposing the method ModelLoadingService::addModelLoadedListener()
so that the test class can be notified when a model is loaded, which happens in
a background thread.
2020-04-30 13:32:07 +01:00
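A test-side sketch of how such a listener removes the race; the listener signature below is a stand-in, not the actual `ModelLoadingService` API:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// The test registers a listener and blocks until the background load
// completes, instead of racing the loading thread.
final class ModelLoadedWaitSketch {
    interface ModelLoadingServiceLike { // hypothetical stand-in
        void addModelLoadedListener(String modelId, Consumer<String> listener);
    }

    static void waitForModel(ModelLoadingServiceLike service, String modelId) throws InterruptedException {
        CountDownLatch loaded = new CountDownLatch(1);
        service.addModelLoadedListener(modelId, ignored -> loaded.countDown());
        if (loaded.await(10, TimeUnit.SECONDS) == false) {
            throw new AssertionError("model [" + modelId + "] was not loaded in time");
        }
    }
}
```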
Dimitris Athanasiou
c5aa281171
[7.x][ML] Remove error on parsing progress for unknown phase in DFA (#55926) (#55954)
On second thought, this check does not seem to be adding value.
We can test that the phases are as we expect them for each analysis
by adding yaml tests. Those would fail if we introduce new phases
from C++ accidentally or without coordination. This would achieve
the same thing. At the same time we would not have to comment out
this code each time a new phase is introduced. Instead we can just
temporarily mute those yaml tests. Note I will add those tests
right after the imminent new phases are added to the C++ side.

Backport of #55926
2020-04-29 20:11:33 +03:00
Benjamin Trent
edd049f9cd
[ML] Allow a certain number of ill-formatted rows when delimited format is specified (#55735) (#55944)
While it is good not to be lenient when attempting to guess the file format, it is frustrating to users when they KNOW it is CSV but there are a few ill-formatted rows in the file (due to some data entry error, etc.).

This commit allows for up to 10% of sample rows to be considered "bad". These rows are effectively ignored while guessing the format.

This percentage of "allowed bad rows" is only applied when the user has specified delimited formatting options, as the structure finder needs some guidance on what a "bad row" actually means.

related to https://github.com/elastic/elasticsearch/issues/38890
2020-04-29 11:15:21 -04:00
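A self-contained sketch of the tolerance rule (the 10% figure comes from the commit text; the method shape is ours):

```java
// Up to 10% of sampled rows may be ill-formatted and skipped, but only when
// the user has explicitly supplied delimited format settings.
final class BadRowToleranceSketch {
    private static final double MAX_BAD_ROW_FRACTION = 0.10;

    static boolean withinTolerance(int badRows, int sampledRows) {
        return badRows <= Math.floor(sampledRows * MAX_BAD_ROW_FRACTION);
    }
}
```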
Dimitris Athanasiou
d9685a0f19
[7.x][ML] Validate at least one feature is available for DF analytics (#55876) (#55914)
We were previously checking that at least one supported field existed
when the _explain API was called. However, in the case of analyses
with required fields (e.g. regression) we were not accounting for the
fact that the dependent variable is not a feature, and thus if the source
index only contains the dependent variable field there are no features to
train a model on.

This commit adds a validation that at least one feature is available
for analysis. Note that we also move that validation away from
`ExtractedFieldsDetector` and the _explain API and straight into
the _start API. The reason for doing this is to allow the user to use
the _explain API in order to understand why they would be seeing an
error like this one.

For example, the user might be using an index that has fields but
they are of unsupported types. If they start the job and get
an error that there are no features, they will wonder why that is.
Calling the _explain API will show them that all their fields are
unsupported. If the _explain API was failing instead, there would
be no way for the user to understand why all those fields are
ignored.

Closes #55593

Backport of #55876
2020-04-29 11:39:58 +03:00
David Roberts
61ac09ae21
[ML] Add daily_model_snapshot_retention_after_days to job config (#55891)
This change adds a new setting, daily_model_snapshot_retention_after_days,
to the anomaly detection job config.

Initially this has no effect, the effect will be added in a followup PR.
This PR gets the complexities of making changes that interact with BWC
over well before feature freeze.

Backport of #55878
2020-04-29 09:12:53 +01:00
Dimitris Athanasiou
abab4c4d4f
[7.x][ML] Do not fail DFA task when it's stopped whilst reindexing (#55797) (#55800)
Adding to #55659, we missed another way we could set the task to
failed due to task cancellation. CI revealed that we might also
get a `SearchPhaseExecutionException` whose cause is a
`TaskCancelledException`. That exception is not wrapped, so
unwrapping it will not return the underlying `TaskCancelledException`.
Thus, to be complete in catching this, we also need to check the
error's cause.

Closes #55068

Backport of #55797
2020-04-27 16:03:57 +03:00
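Combining this commit with #55659, the complete check might look like the sketch below; `ExceptionsHelper.unwrapCause` and `TaskCancelledException` are the real Elasticsearch types, the wrapper class is ours:

```java
import org.elasticsearch.ExceptionsHelper;
import org.elasticsearch.tasks.TaskCancelledException;

// Unwrap transport-level wrappers first, then also look one level into the
// cause, since a SearchPhaseExecutionException carries the
// TaskCancelledException as its cause rather than as a wrapped exception.
final class CancellationCheckSketch {
    static boolean isTaskCancelled(Exception error) {
        Throwable unwrapped = ExceptionsHelper.unwrapCause(error);
        return unwrapped instanceof TaskCancelledException
            || unwrapped.getCause() instanceof TaskCancelledException;
    }
}
```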
Dimitris Athanasiou
7f100c1196
[7.x][ML] Allow analytics process define its own progress phases (#55763) (#55791)
This is a continuation from #55580.

Now that we're parsing phase progresses from the analytics process
we change `ProgressTracker` to allow for custom phases between
the `loading_data` and `writing_results` phases. Each `DataFrameAnalysis`
may declare its own phases.

This commit sets things in place for the analytics process to start
reporting different phases per analysis type. However, this is
still preserving existing behaviour as all analyses currently
declare a single `analyzing` phase.

Backport of #55763
2020-04-27 13:30:05 +03:00
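A self-contained sketch of the idea: fixed bracketing phases with analysis-declared phases in between. Names are illustrative, not the actual `ProgressTracker` API:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ProgressTrackerSketch {
    private final Map<String, Integer> progressByPhase = new LinkedHashMap<>();

    ProgressTrackerSketch(List<String> analysisPhases) {
        progressByPhase.put("reindexing", 0);
        progressByPhase.put("loading_data", 0);
        analysisPhases.forEach(p -> progressByPhase.put(p, 0)); // currently just "analyzing"
        progressByPhase.put("writing_results", 0);
    }

    void updatePhase(String phase, int percent) {
        progressByPhase.computeIfPresent(phase, (k, old) -> Math.max(old, percent));
    }
}
```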
David Roberts
3ba44a5af8
[ML] Adding failed_category_count to model_size_stats (#55761)
The failed_category_count statistic records the number of times
categorization wanted to create a new category but couldn't
because the job had reached its model_memory_limit.

Backport of #55716
2020-04-25 10:36:49 +01:00
Dimitris Athanasiou
210b7f1b76
[7.x][ML] Remove parsing of old progress format in DF Analytics (#55711) (#55720)
Since #55580 we've introduced a new format for parsing progress
from the data frame analytics process. As the process is now
writing out progress in this new way, we can remove the parsing
of the old format.

Backport of #55711
2020-04-24 16:50:56 +03:00
Przemysław Witek
c89917c799
Register DFA jobs on putAnalytics rather than via a separate method (#55458) (#55708) 2020-04-24 10:59:32 +02:00
Dimitris Athanasiou
b8379872a7
[7.x][ML] Logs error when DFA task is set to failed (#55545) (#55668)
Also unmutes the integ test that stops and restarts
an outlier detection job with the hope of learning more
of the failure in #55068.

Backport of #55545

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-04-24 11:06:07 +03:00
David Roberts
46be9959a0
[ML] Audit when unassigned datafeeds are stopped (#55667)
Previously audit messages were indexed when datafeeds that were
assigned to a node were stopped, but not datafeeds that were
unassigned at the time they were stopped.

This change adds auditing for the unassigned case.

Backport of #55656
2020-04-23 20:46:35 +01:00
Dimitris Athanasiou
4b11adf074
[7.x][ML] Do not fail DFA task that is stopped during reindexing (#55659) (#55663)
While we were catching `TaskCancelledException` while we wait for
reindexing to complete, we missed the fact that this exception
may be wrapped in a multi-node cluster. This is the reason
we may still fail the task when stop is called while reindexing.

Sometimes we're lucky and the exception is thrown by the same
node that runs the job. Then the exception is not wrapped and
things work fine. But when that is not the case, the exception is
wrapped, we fail to catch it, and we set the task to failed.

The fix is to simply unwrap the exception when we check whether
it is `TaskCancelledException`.

Closes #55068

Backport of #55659
2020-04-23 15:57:01 +03:00
Rory Hunter
d66af46724
Always use deprecateAndMaybeLog for deprecation warnings (#55319)
Backport of #55115.

Replace calls to deprecate(String,Object...) with deprecateAndMaybeLog(...),
with an appropriate key, so that all messages can potentially be deduplicated.
2020-04-23 09:20:54 +01:00
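A self-contained stand-in showing why keying warnings enables deduplication; this mirrors the idea only, not the internals of Elasticsearch's `DeprecationLogger`:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class DedupDeprecationLoggerSketch {
    private final Map<String, Boolean> seenKeys = new ConcurrentHashMap<>();

    void deprecateAndMaybeLog(String key, String message, Object... params) {
        if (seenKeys.putIfAbsent(key, Boolean.TRUE) == null) { // first time only
            System.err.println("[deprecation] " + String.format(message.replace("{}", "%s"), params));
        } // repeated warnings with the same key are dropped
    }
}
```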
David Roberts
87f4751eca [ML] Make find_file_structure recognize Kibana CSV report timestamps (#55609)
The Kibana CSV export feature uses a non-standard timestamp format.
This change adds it to the formats the find_file_structure endpoint
recognizes out-of-the-box, to make round-tripping data from Kibana
back to Kibana via CSV files easier.

Fixes #55586
2020-04-23 08:39:07 +01:00
Dimitris Athanasiou
50a5afed15
[7.x][ML] Prepare parsing phase_progress from DFA process (#55580) (#55587)
Data frame analytics process currently reports progress as
an integer `progress_percent`. We parse that and report it
from the _stats API as the progress of the `analyzing` phase.
However, we want to allow the DFA process to report progress
for more than one phase. This commit prepares for this by
parsing `phase_progress` from the process, an object that
contains the `phase` name plus the `progress_percent` for that
phase.

Backport of #55580
2020-04-22 16:38:32 +03:00
Benjamin Trent
7c81cd7833
[ML] explicitly disallow partial results in datafeed extractors (#55537) (#55585)
Instead of doing our own checks against REST status, shard counts, and shard failures, this commit changes all our extractor search requests to set `.setAllowPartialSearchResults(false)`.

- Scrolls are automatically cleared when a search failure occurs with `.setAllowPartialSearchResults(false)` set.
- Code error handling is simplified

closes https://github.com/elastic/elasticsearch/issues/40793
2020-04-22 09:07:44 -04:00
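A sketch of the request-level change; `SearchRequest.allowPartialSearchResults(boolean)` is the transport-level counterpart of the builder's `setAllowPartialSearchResults`:

```java
import org.elasticsearch.action.search.SearchRequest;

// Instead of inspecting shard counts and failures ourselves, the request
// refuses partial results, so any shard failure surfaces as a search failure
// (and open scrolls are cleared automatically).
final class ExtractorSearchSketch {
    static SearchRequest newExtractorSearch(String index) {
        SearchRequest request = new SearchRequest(index);
        request.allowPartialSearchResults(false);
        return request;
    }
}
```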
David Roberts
810caf5ffe
[ML] Test that audit message is written when closing unassigned job (#55582)
Issue #55521 suggested that audit messages were not written when
closing an unassigned job.  This is not the case, but we didn't
have a test to prove it.

Backport of #55571
2020-04-22 13:23:43 +01:00
David Roberts
2dc5586afe
[ML] Add effective max model memory limit to ML info (#55581)
The ML info endpoint returns the max_model_memory_limit setting
if one is configured.  However, it is still possible to create
a job that cannot run anywhere in the current cluster because
no node in the cluster has enough memory to accommodate it.

This change adds an extra piece of information,
limits.effective_max_model_memory_limit, to the ML info
response that returns the biggest model memory limit that could
be run in the current cluster assuming no other jobs were
running.

The idea is that the ML UI will be able to warn users who try to
create jobs with higher model memory limits that their jobs will
not be able to start unless they add a bigger ML node to their
cluster.

Backport of #55529
2020-04-22 12:28:50 +01:00
David Roberts
da5aeb8be7
[ML] Return assigned node in start/open job/datafeed response (#55570)
Adds a "node" field to the response from the following endpoints:

1. Open anomaly detection job
2. Start datafeed
3. Start data frame analytics job

If the job or datafeed is assigned to a node immediately then
this field will return the ID of that node.

In the case where a job or datafeed is opened or started lazily
the node field will contain an empty string.  Clients that want
to test whether a job or datafeed was opened or started lazily
can therefore check for this.

Backport of #55473
2020-04-22 12:06:53 +01:00
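A client-side sketch of the lazy-start check described above; the method and its input are illustrative, since the exact response class depends on the client in use:

```java
// "" in the node field signals a lazy open/start; a non-empty value is the
// ID of the node the job or datafeed was assigned to.
final class NodeFieldSketch {
    static boolean wasStartedLazily(String nodeField) {
        return nodeField != null && nodeField.isEmpty();
    }
}
```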
David Kyle
e99ef3542c Mute ModelLoadingServiceTests::testMaxCachedLimitReached 2020-04-22 11:53:07 +01:00
David Kyle
8e8c6b4aee
Fix accounting in ModelLoadingServiceTests (#55307) (#55547)
In the test, after the first load event it is not known which models are cached, as
loading a later one will evict an earlier one and the order is not known.
The models could have been loaded 1 or 2 times, not exactly twice.
2020-04-21 19:25:06 +01:00
Stuart Tettemer
93a2e9b0f9
Test: MockScoreScript can be cacheable. (#55499)
Backport: 0ed1eb5
2020-04-20 17:09:58 -06:00
Benjamin Trent
cabff65aec
[ML] Fixing inference stats race condition (#55163) (#55486)
`updateAndGet` could actually call the internal method more than once on contention.
If I read the JavaDocs, it says:
```* @param updateFunction a side-effect-free function```
So, it could be getting multiple updates on contention, thus having a race condition where stats are double counted.

To fix, I am going to use a `ReadWriteLock`. The `LongAdder` objects allow fast thread-safe writes in high-contention environments. These can be protected by the `ReadWriteLock::readLock`.

When stats are persisted, I need to call reset on all these adders. This is NOT thread safe if additions are taking place concurrently. So, I am going to protect it with `ReadWriteLock::writeLock`.

This should prevent race conditions while allowing high(ish) throughput in the highly contended paths in inference.

I did some simple throughput tests and this change is not significantly slower and is simpler to grok (IMO).

closes  https://github.com/elastic/elasticsearch/issues/54786
2020-04-20 16:21:18 -04:00
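A self-contained sketch of the locking scheme described above (field names are illustrative): concurrent increments share the read lock, while the persist-and-reset path takes the write lock so no increment can land between summing and resetting.

```java
import java.util.concurrent.atomic.LongAdder;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class InferenceStatsSketch {
    private final LongAdder inferenceCount = new LongAdder();
    private final LongAdder failureCount = new LongAdder();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    void recordInference(boolean failed) {
        lock.readLock().lock(); // many writers may hold the read lock at once
        try {
            inferenceCount.increment();
            if (failed) {
                failureCount.increment();
            }
        } finally {
            lock.readLock().unlock();
        }
    }

    long[] snapshotAndReset() {
        lock.writeLock().lock(); // exclusive: safe to sumThenReset both adders
        try {
            return new long[] { inferenceCount.sumThenReset(), failureCount.sumThenReset() };
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```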
Benjamin Trent
24d41eb695
[ML] partitions model definitions into chunks (#55260) (#55484)
This paves the way in the data layer for exceptionally large models to be partitioned across multiple documents.

This change means that nodes before 7.8.0 will not be able to use trained inference models created on nodes on or after 7.8.0.

I chose the definition document limit to be 100. This *SHOULD* be plenty for any large model. One of the largest models that I have created so far had the following stats:
~314MB of inflated JSON, ~66MB when compressed, ~177MB of heap.
With a chunk size of `16 * 1024 * 1024`, its compressed string could be partitioned into 5 documents.
Supporting models 20 times this size (compressed) seems adequate for now.
2020-04-20 16:08:54 -04:00
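A self-contained sketch of the partitioning arithmetic; the 16 MB chunk size and 100-document cap come from the commit text, the method shape is ours (a ~66MB compressed string yields 5 chunks, matching the example above):

```java
import java.util.ArrayList;
import java.util.List;

final class DefinitionChunkerSketch {
    private static final int CHUNK_SIZE = 16 * 1024 * 1024; // per the commit
    private static final int MAX_DOCS = 100;                // per the commit

    static List<String> chunk(String compressedDefinition) {
        List<String> chunks = new ArrayList<>();
        for (int start = 0; start < compressedDefinition.length(); start += CHUNK_SIZE) {
            int end = Math.min(start + CHUNK_SIZE, compressedDefinition.length());
            chunks.add(compressedDefinition.substring(start, end));
        }
        if (chunks.size() > MAX_DOCS) {
            throw new IllegalArgumentException("model definition too large: " + chunks.size() + " chunks");
        }
        return chunks;
    }
}
```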
Benjamin Trent
fa0373a19f
[7.x] [ML] Fix log spam and disable ILM/SLM history for native ML tests (#55475)
* [ML] fix native ML test log spam (#55459)

This adds a dependency on ingest-common. This removes the log spam resulting from basic plugins being enabled that require the common ingest processors.

* removing unnecessary changes

* removing unused imports

* removing unnecessary java setting
2020-04-20 15:41:30 -04:00
William Brafford
7817948926 Disable monitoring in ML multinode tests (#55461)
Removing the deprecated "xpack.monitoring.enabled" setting introduced
log spam and potentially some failures in ML tests. It's possible to use
a different, non-deprecated setting to disable monitoring, so we do that
here.
2020-04-20 10:51:16 -04:00
Przemysław Witek
7d5f74e964
Fix and unmute testSetUpgradeMode_ExistingTaskGetsUnassigned (#55368) (#55452) 2020-04-20 13:29:29 +02:00
William Brafford
49e30b15a2
Deprecate disabling basic-license features (#54816) (#55405)
We believe there's no longer a need to be able to disable basic-license
features completely using the "xpack.*.enabled" settings. If users don't
want to use those features, they simply don't need to use them. Having
such features always available lets us build more complex features that
assume basic-license features are present.

This commit deprecates settings of the form "xpack.*.enabled" for
basic-license features, excluding "security", which is a special case.
It also removes deprecated settings from integration tests and unit
tests where they're not directly relevant; e.g. monitoring and ILM are
no longer disabled in many integration tests.
2020-04-17 15:04:17 -04:00
Benjamin Trent
8c581c3388
[ML] fixing and unmuting testHRDSplit test (#55349) (#55393)
This fixes the long-muted testHRDSplit. Some minor adjustments for modern-day Elasticsearch changes :).

The cause of the failure is that a new `by` field entering the model with an exceptionally high count does not cause an anomaly. We have since stopped combining the `rare` and `by` in this manner. New entries in a `by` field are not anomalous because we have no history on them yet. 

closes https://github.com/elastic/elasticsearch/issues/32966
2020-04-17 09:55:52 -04:00
Benjamin Trent
65e0084120
[ML] do not start stopping tasks on reassignment (#55315) (#55388)
When anomaly jobs, datafeeds, and analytics tasks are stopped, they enter an ephemeral state called `STOPPING`.

If the node executing the task fails while this is occurring, they could be stuck in the limbo state of `STOPPING`. It is best to mark the tasks as completed if they get reassigned to a node.
2020-04-17 08:57:12 -04:00
Martijn van Groningen
417d5f2009
Make data streams in APIs resolvable. (#55337)
Backport from: #54726

The INCLUDE_DATA_STREAMS indices option controls whether data streams can be resolved in an api for both concrete names and wildcard expressions. If data streams cannot be resolved then a 400 error is returned indicating that data streams cannot be used.

In this PR, the INCLUDE_DATA_STREAMS indices option is enabled in the following APIs: search, msearch, refresh, index (op_type create only) and bulk (index requests with op_type create only). In a subsequent change, we will determine which other APIs need to be able to resolve data streams and enable the INCLUDE_DATA_STREAMS indices option for them.

Whether an api resolves to all backing indices of a data stream or only to its latest index (the write index) depends on IndexNameExpressionResolver.Context.isResolveToWriteIndex().
If isResolveToWriteIndex() returns true then a data stream resolves to its latest index (for example: index api); otherwise it resolves to all of its backing indices (for example: search api).

Relates to #53100
2020-04-17 08:33:37 +02:00
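A self-contained sketch of the resolution rule in the last paragraph, with stand-in types:

```java
import java.util.List;

// With resolveToWriteIndex a data stream resolves to its write index (the
// latest backing index); otherwise it resolves to all backing indices.
final class DataStreamResolutionSketch {
    static List<String> resolve(List<String> backingIndices, boolean resolveToWriteIndex) {
        return resolveToWriteIndex
            ? List.of(backingIndices.get(backingIndices.size() - 1)) // e.g. index api
            : List.copyOf(backingIndices);                           // e.g. search api
    }
}
```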
Mark Tozzi
22c55180c1
[7.x] Backport ValuesSourceRegistry and related work (#54922)
* Add ValuesSource Registry and associated logic (#54281)

* Remove ValuesSourceType argument to ValuesSourceAggregationBuilder (#48638)

* ValuesSourceRegistry Prototype (#48758)

* Remove generics from ValuesSource related classes (#49606)

* fix percentile aggregation tests (#50712)

* Basic thread safety for ValuesSourceRegistry (#50340)

* Remove target value type from ValuesSourceAggregationBuilder (#49943)

* Cleanup default values source type (#50992)

* CoreValuesSourceType no longer implements Writable (#51276)

* Remove genereics & hard coded ValuesSource references from Matrix Stats (#51131)

* Put values source types on fields (#51503)

* Remove VST Any (#51539)

* Rewire terms agg to use new VS registry (#51182)

Also adds some basic AggTestCases for untested code
paths (and boilerplate for future tests once the ITs are
converted over)

* Wire Cardinality aggregation to work with the ValuesSourceRegistry (#51337)

* Wire Percentiles aggregator into new VS framework (#51639)

This required a bit of a refactor to percentiles itself.  Before,
the Builder would switch on the chosen algo to generate an
algo-specific factory.  This doesn't work (or at least, would be
difficult) in the new VS framework.

This refactor consolidates both factories together and introduces
a PercentilesConfig object to act as a standardized way to pass
algo-specific parameters through the factory.  This object
is then used when deciding which kind of aggregator to create

Note: CoreValuesSourceType.HISTOGRAM still lives in core, and will
be moved in a subsequent PR.

* Remove generics and target value type from MultiVSAB (#51647)

* fix checkstyle after merge (#52008)

* Plumb ValuesSourceRegistry through to QuerySearchContext (#51710)

* Convert RareTerms to new VS registry (#52166)

* Wire up Value Count (#52225)

* Wire up Max & Min aggregations (#52219)

* ValuesSource refactoring: Wire up Sum aggregation (#52571)

* ValuesSource refactoring: Wire up SigTerms aggregation (#52590)

* Soft immutability for VSConfig (#52729)

* Unmute testSupportedFieldTypes, fix Percentiles/Ranks/Terms tests (#52734)

Also fixes Percentiles which was incorrectly specified to only accept
numeric, but in fact also accepts Boolean and Date (because those are
numeric on master - thanks `testSupportedFieldTypes` for catching it!)

* VS refactoring: Wire up stats aggregation (#52891)

* ValuesSource refactoring: Wire up string_stats aggregation (#52875)

* VS refactoring: Wire up median (MAD) aggregation (#52945)

* fix valuesourcetype issue with constant_keyword field (#53041)

this commit implements `getValuesSourceType` for
the ConstantKeyword field type.

master was merged into feature/extensible-values-source
introducing a new field type that was not implementing
`getValuesSourceType`.

* ValuesSource refactoring: Wire up Avg aggregation (#52752)

* Wire PercentileRanks aggregator into new VS framework  (#51693)

* Add a VSConfig resolver for aggregations not using the registry (#53038)

* Vs refactor wire up ranges and date ranges (#52918)

* Wire up geo_bounds aggregation to ValuesSourceRegistry (#53034)

This commit updates the geo_bounds aggregation to depend
on registering itself in the ValuesSourceRegistry

relates #42949.

* VS refactoring: convert Boxplot to new registry (#53132)

* Wire-up geotile_grid and geohash_grid to ValuesSourceRegistry (#53037)

This commit updates the geo*_grid aggregations to depend
on registering itself in the ValuesSourceRegistry

relates to the values-source refactoring meta issue #42949.

* Wire-up geo_centroid agg to ValuesSourceRegistry (#53040)

This commit updates the geo_centroid aggregation to depend
on registering itself in the ValuesSourceRegistry.

relates to the values-source refactoring meta issue #42949.

* Fix type tests for Missing aggregation (#53501)

* ValuesSource Refactor: move histo VSType into XPack module (#53298)

- Introduces a new API (`getBareAggregatorRegistrar()`) which allows plugins to register aggregations against existing agg definitions defined in Core.
- This moves the histogram VSType over to XPack where it belongs. `getHistogramValues()` still remains as a Core concept
- Moves the histo-specific bits over to xpack (e.g. the actual aggregator logic). This requires extra boilerplate since we need to create a new "Analytics" Percentile/Rank aggregators to deal with the histo field. Doubly-so since percentiles/ranks are extra boiler-plate'y... should be much lighter for other aggs

* Wire up DateHistogram to the ValuesSourceRegistry (#53484)

* Vs refactor parser cleanup (#53198)

Co-authored-by: Zachary Tong <polyfractal@elastic.co>
Co-authored-by: Zachary Tong <zach@elastic.co>
Co-authored-by: Christos Soulios <1561376+csoulios@users.noreply.github.com>
Co-authored-by: Tal Levy <JubBoy333@gmail.com>

* First batch of easy fixes

* Remove List.of from ValuesSourceRegistry

Note that we intend to have a follow up PR dealing with the mutability
of the registry, so I didn't even try to address that here.

* More compiler fixes

* More compiler fixes

* More compiler fixes

* Precommit is happy and so am I

* Add new Core VSTs to tests

* Disabled supported type test on SigTerms until we can backport its fix

* fix checkstyle

* Fix test failure from semantic merge issue

* Fix some metaData->metadata replacements that got lost

* Fix list of supported types for MinAggregator

* Fix list of supported types for Avg

* remove unused import

Co-authored-by: Zachary Tong <polyfractal@elastic.co>
Co-authored-by: Zachary Tong <zach@elastic.co>
Co-authored-by: Christos Soulios <1561376+csoulios@users.noreply.github.com>
Co-authored-by: Tal Levy <JubBoy333@gmail.com>
2020-04-16 16:54:46 -04:00
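The registry concept at the heart of this backport can be sketched in a few lines; everything below is a simplified stand-in, not the actual `ValuesSourceRegistry` API:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Each aggregation name maps ValuesSourceType keys to the supplier that can
// build an aggregator for that type; unsupported combinations fail fast.
final class ValuesSourceRegistrySketch {
    enum ValuesSourceType { NUMERIC, KEYWORD, GEOPOINT, HISTOGRAM }

    interface AggregatorSupplier {}

    private final Map<String, Map<ValuesSourceType, AggregatorSupplier>> registry = new HashMap<>();

    void register(String aggName, List<ValuesSourceType> types, AggregatorSupplier supplier) {
        types.forEach(t -> registry.computeIfAbsent(aggName, k -> new HashMap<>()).put(t, supplier));
    }

    AggregatorSupplier resolve(String aggName, ValuesSourceType type) {
        AggregatorSupplier supplier = registry.getOrDefault(aggName, Map.of()).get(type);
        if (supplier == null) {
            throw new IllegalArgumentException(aggName + " does not support " + type);
        }
        return supplier;
    }
}
```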