OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-03-03 17:39:15 +00:00

Author	SHA1	Message	Date
Dan Hermann	e54b4a729f	[7.x] Adds write_index_only option to put mapping API (#59539 )	2020-07-14 10:34:08 -05:00
Dimitris Athanasiou	e302c66847	[7.x][ML] Fix NPE when starting classification with missing dependent_variable (#59524 ) (#59540 ) Since we added a check on the cardinality of the dependent_variable for classification, an NPE is thrown if the dependent_variable is a missing field. This commit fixes the issue. Backport of #59524	2020-07-14 17:56:55 +03:00
Dan Hermann	59f639a279	Add auto_configure privilege	2020-07-14 08:23:49 -05:00
Andrei Dan	7dcdaeae49	Default to @timestamp in composable template datastream definition (#59317 ) (#59516 ) This makes the data_stream timestamp field specification optional when defining a composable template. When there isn't one specified it will default to `@timestamp`. (cherry picked from commit 5609353c5d164e15a636c22019c9c17fa98aac30) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-07-14 12:36:54 +01:00
Christos Soulios	3868bcc7b8	[7.x] Histogram integration on Histogram field type (#59431 ) Backports #58930 to 7.x Implements histogram aggregation over histogram fields as requested in #53285.	2020-07-13 19:36:33 +03:00
Dan Hermann	e01d73c737	[7.x] Data stream admin actions are now index-level actions	2020-07-10 14:36:18 -05:00
Dan Hermann	7fa9cf601b	Data stream support for rollup search	2020-07-10 11:13:34 -05:00
Dan Hermann	c7e977701a	Data stream support for async search	2020-07-09 13:12:04 -05:00
Dimitris Athanasiou	b2243337d8	[7.x][ML] Data frame analytics max_num_threads setting (#59254 ) (#59308 ) This adds a setting to data frame analytics jobs called `max_num_threads`. The setting expects a positive integer. When used, it specifies the maximum number of threads that may be used by the analysis. Note that the actual number of threads used is limited by the number of processors on the node where the job is assigned. Also, the process may use a couple more threads for operational functionality that is not the analysis itself. This setting may also be updated for a stopped job. More threads may reduce the time it takes to complete the job at the cost of using more CPU. Backport of #59254 and #57274	2020-07-09 19:15:46 +03:00
Dimitris Athanasiou	d07b11b86b	[7.x][ML] Perform test inference on java (#58877 ) (#59298 ) Since we are able to load the inference model and perform inference in java, we no longer need to rely on the analytics process to be performing test inference on the docs that were not used for training. The benefit is that we do not need to send test docs and fit them in memory of the c++ process. Backport of #58877 Co-authored-by: Dimitris Athanasiou <dimitris@elastic.co> Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com>	2020-07-09 16:30:49 +03:00
Dimitris Athanasiou	d323f8d698	[ML] Add REST spec for the update data frame analytics endpoint (#59253 ) (#59281 ) Closes #59148 Backport of #59253	2020-07-09 13:12:21 +03:00
Yannick Welsch	0b9eb210b8	Add basic searchable snapshots usage information (#58828 ) (#59160 ) Adds super basic usage information for searchable snapshots, to be extended later. Backport of #58828	2020-07-08 13:09:29 +02:00
David Kyle	c651135562	[ML] Make Inference processor field_map and inference_config optional (#59010 ) Relaxes the requirement that the inference ingest processor must have a field_map and inference_config defined even if they are empty.	2020-07-06 11:35:30 +01:00
Martijn van Groningen	f0dd9b4ace	Add data stream timestamp validation via metadata field mapper (#59002 ) Backport of #58582 to 7.x branch. This commit adds a new metadata field mapper that validates that a document has exactly one timestamp value in the data stream timestamp field and that the timestamp field mapping only has `type`, `meta` or `format` attributes configured. Other attributes can affect the guarantee that an index with this meta field mapper has a usable timestamp field. The MetadataCreateIndexService inserts a data stream timestamp field mapper whenever a new backing index of a data stream is created. Relates to #53100	2020-07-06 11:32:33 +02:00
Dan Hermann	5e7746d3bd	[7.x] Mirror privileges over data streams to their backing indices (#58991 )	2020-07-03 06:33:38 -05:00
David Kyle	f6a0c2c59d	[7.x] Pipeline Inference Aggregation (#58965 ) Adds a pipeline aggregation that loads a model and performs inference on the input aggregation results.	2020-07-03 09:29:04 +01:00
Benjamin Trent	bd9b3b6116	[ML] fix inference ml-stats-write alias creation (#58947 ) (#58959 ) The check for potentially creating the .ml-stats-write alias should verify that the indices actually exist. closes #58662	2020-07-02 16:16:42 -04:00
Dan Hermann	c988afdc15	Data stream support for migrations deprecations info API	2020-07-02 11:16:22 -05:00
Przemysław Witek	751e84e4c8	Rename regression evaluation metrics to make the names consistent with loss functions (#58887 ) (#58927 )	2020-07-02 17:35:55 +02:00
Dan Hermann	b78bfa01f6	[7.x] Data stream support for graph explore API	2020-07-02 08:19:03 -05:00
Przemysław Witek	8e074c4495	Rename "error" field to "value" for consistency between metrics (#58726 ) (#58870 )	2020-07-02 09:08:56 +02:00
Yang Wang	a5a8b4ae1d	Add cache for application privileges (#55836 ) (#58798 ) Add caching support for application privileges to reduce number of round-trips to security index when building application privilege descriptors. Privilege retrieving in NativePrivilegeStore is changed to always fetching all privilege documents for a given application. The caching is applied to all places including "get privilege", "has privileges" APIs and CompositeRolesStore (for authentication).	2020-07-02 11:50:03 +10:00
Benjamin Trent	c768467155	Muting flakey test (#58855 ) (#58856 )	2020-07-01 11:54:43 -04:00
Przemysław Witek	909649dd15	[7.x] Implement pseudo Huber loss (PseudoHuber) evaluation metric for regression analysis (#58734 ) (#58825 )	2020-07-01 14:52:06 +02:00
Dan Hermann	22806c943d	Data stream support for ILM remove policy API (#58595 ) (#58770 )	2020-06-30 14:03:19 -05:00
Przemysław Witek	9ea9b7bd3b	[7.x] Implement MSLE (MeanSquaredLogarithmicError) evaluation metric for regression analysis (#58684 ) (#58731 )	2020-06-30 14:09:11 +02:00
Benjamin Trent	def5550df3	[ML] fix ml inference stats tests (#58690 ) (#58729 )	2020-06-30 07:53:33 -04:00
Tim Vernum	dcc5a06dec	Display enterprise license as platinum in /_xpack (#58217 ) The GET /_license endpoint displays "enterprise" licenses as "platinum" by default so that old clients (including beats, kibana and logstash) know to interpret this new license type as if it were a platinum license. However, this compatibility layer was not applied to the GET /_xpack/ endpoint which also displays a license type & mode. This commit causes the _xpack API to mimic the _license API and treat enterprise as platinum by default, with a new accept_enterprise parameter that will cause the API to return the correct "enterprise" value. This BWC layer exists only for the 7.x branch. This is a breaking change because, since 7.6, the _xpack API has returned "enterprise" for enterprise licenses, but this has been found to break old versions of beats and logstash so needs to be corrected.	2020-06-30 16:42:28 +10:00
Dimitris Athanasiou	1817b896c9	[7.x][ML] Add status and increased estimate to memory usage (#58588 ) (#58606 ) Adds parsing of `status` and `memory_reestimate_bytes` to data frame analytics `memory_usage`. When the training surpasses the model memory limit, the status will be set to `hard_limit` and `memory_reestimate_bytes` can be used to update the job's limit in order to restart the job. Backport of #58588	2020-06-28 16:27:26 +03:00
Igor Motov	20af856abd	[7.x] EQL: Adds an ability to execute an asynchronous EQL search (#58192 ) Adds async support to EQL searches Closes #49638 Co-authored-by: James Rodewig <james.rodewig@elastic.co>	2020-06-25 14:11:57 -04:00
Martijn van Groningen	f4fad9c65a	Re-enable data streams yaml tests in bwc mode (#58500 ) Backport of #58403 to 7.x branch.	2020-06-24 16:59:51 +02:00
Benjamin Trent	0cc84d3caf	[ML] wait for yellow state for stats index in tests (#58436 ) (#58456 ) GET inference stats now reads from the .ml-stats index. Our tests should wait for yellow state before attempting to query the index for stat information.	2020-06-23 13:32:24 -04:00
David Roberts	0d6bfd0ac3	[7.x][ML] Fix wire serialization for flush acknowledgements (#58443 ) There was a discrepancy in the implementation of flush acknowledgements: most of the class was designed on the basis that the "last finalized bucket time" could be null but the wire serialization assumed that it was never null. This works because the C++ sends zero "last finalized bucket time" when it is not known or not relevant. But then the Java code will print that to XContent, as it assumes that null represents not known or not relevant. This change corrects the discrepancies. Internally within the class null represents not known or not relevant, but this is translated from/to 0 for communications from the C++ and old nodes that have the bug. Additionally I switched from Date to Instant for this class and made the member variables final to modernise it a bit. Backport of #58413	2020-06-23 16:42:06 +01:00
Lee Hinman	d272646a55	Fix name of template in allowed warning for DS YML test (#58273 ) The warning was present, but had the incorrect template name, leading to a test failure.	2020-06-17 11:23:04 -06:00
Benjamin Trent	081da09c72	Allow GET <pattern>/_rollup/data to expand data streams (#58173 ) (#58177 )	2020-06-16 14:01:54 -04:00
Dan Hermann	7079a3b09f	[7.x] Prohibit freezing the write index of a data stream (#58168 )	2020-06-16 09:37:32 -05:00
markharwood	03dd73dc0d	Fix for wildcard fields that returned BytesRefs not Strings to scripts. (#58060 ) (#58109 ) This needs some reorg of BinaryDV field data classes to allow specialisation of scripted doc values. Moved common logic to a new abstract base class and added a new subclass to return string-based representations to scripts. Closes #58044	2020-06-15 14:52:56 +01:00
Martijn van Groningen	01d8bb8cfa	Enforce valid field mapping exists for timestamp_field in templates. (#58036 ) Backport of #57741 to 7.x branch. Relates to #53100	2020-06-12 15:24:42 +02:00
David Kyle	2905a2f623	Use Search After job iterators (#57875 ) (#57923 ) Search after is a better choice for the delete expired data iterators where processing takes a long time because, unlike scroll, a context does not have to be kept alive. Also changes the delete expired data endpoint to return a 404 if the job is unknown.	2020-06-11 10:06:18 +01:00
Hendrik Muhs	95bd7b63b0	[Transform] fix page size return in cat transform, add dps (#57871 ) Fixes the page size reported after moving page size to settings (#56007) and adds documents per second (throttling) to the output. Fixes #56498	2020-06-10 08:10:25 +02:00
Dan Hermann	b501b282f8	Change default backing index naming scheme	2020-06-09 09:31:34 -05:00
David Kyle	08d1286de7	[7.x] Delete expired data by job (#57337 ) (#57796 ) Deleting expired data can take a long time leading to timeouts if there are many jobs. Often the problem is due to a few large jobs which prevent the regular maintenance of the remaining jobs. This change adds a job_id parameter to the delete expired data endpoint to help clean up those problematic jobs.	2020-06-08 13:00:23 +01:00
David Roberts	1d64d55a86	[7.x][ML] Add per-partition categorization option (#57723 ) This PR adds the initial Java side changes to enable use of the per-partition categorization functionality added in elastic/ml-cpp#1293. There will be a followup change to complete the work, as there cannot be any end-to-end integration tests until elastic/ml-cpp#1293 is merged, and also elastic/ml-cpp#1293 does not implement some of the more peripheral functionality, like stop_on_warn and per-partition stats documents. The changes so far cover REST APIs, results object formats, HLRC and docs. Backport of #57683	2020-06-06 08:15:17 +01:00
Dimitris Athanasiou	f49a14ce6f	[7.x][ML] Fix race condition when force stopping DF analytics job (#57680 ) (#57717 ) When we force delete a DF analytics job, we currently first force stop it and then we proceed with deleting the job config. This may result in logging errors if the job config is deleted before it is retrieved while the job is starting. Instead of force stopping the job, it would make more sense to try to stop the job gracefully first. So we now try that out first. If normal stop fails, then we resort to force stopping the job to ensure we can go through with the delete. In addition, this commit introduces `timeout` for the delete action and makes use of it in the child requests. Backport of #57680	2020-06-05 17:50:01 +03:00
Przemysław Witek	6b5f49d097	[7.x] Introduce ModelPlotConfig. annotations_enabled setting (#57539 ) (#57641 )	2020-06-04 15:15:35 +02:00
Benjamin Trent	35d5126cea	[7.x] [ML] adds new for_export flag to GET _ml/inference API (#57351 ) (#57368 ) * [ML] adds new for_export flag to GET _ml/inference API (#57351) Adds a new boolean flag, `for_export` to the `GET _ml/inference/<model_id>` API. This flag is useful for moving models between clusters.	2020-05-29 14:01:08 -04:00
Benjamin Trent	c8374dc9f3	[ML] add max_model_memory parameter to forecast request (#57254 ) (#57355 ) This adds a max_model_memory setting to forecast requests. This setting can take a string value that is formatted according to byte sizes (e.g. "50mb", "150mb"). The default value is `20mb`. There is a HARD limit at `500mb` which will throw an error if used. If the limit is larger than 40% of the anomaly job's configured model limit, the forecast limit is reduced to be strictly lower than that value. This reduction is logged and audited. related native change: https://github.com/elastic/ml-cpp/pull/1238 closes: https://github.com/elastic/elasticsearch/issues/56420	2020-05-29 11:16:08 -04:00
Martijn van Groningen	04ef39da77	Change cluster info actions to be able to resolve data streams. (#57343 ) Backport of #56878 to 7.x branch. With this change the following APIs will be able to resolve data streams: get index, get mappings and ilm explain APIs. Relates to #53100	2020-05-29 12:17:53 +02:00
Benjamin Trent	24d605e41e	[ML] fixing GET _ml/inference so size param is respected (#57303 ) (#57308 ) `size` was previously ignored when grabbing full trained model configs. closes https://github.com/elastic/elasticsearch/issues/57298	2020-05-28 15:45:26 -04:00
Martijn van Groningen	225ccd1cfa	Ensure template exists when creating data stream (#57275 ) Backporting #56888 to 7.x branch. Limit the creation of data streams only for namespaces that have a composable template with a data stream definition. This way we ensure that mappings/settings have been specified and will be used at data stream creation and data stream rollover. Also remove `timestamp_field` parameter from create data stream request and let the create data stream api resolve the timestamp field from the data stream definition snippet inside a composable template. Relates to #53100	2020-05-28 15:08:25 +02:00
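
The `max_model_memory` rules described in commit c8374dc9f3 above (default `20mb`, hard limit at `500mb`, reduction when the value exceeds 40% of the job's model limit) can be sketched roughly as follows. This is an illustrative Python sketch, not the project's Java implementation: the function names `parse_mb` and `effective_forecast_limit` are hypothetical, the hard limit is assumed to reject `500mb` itself, and "strictly lower" is assumed to mean one byte below the 40% threshold.

```python
MB = 1024 * 1024
DEFAULT_FORECAST_LIMIT = 20 * MB  # default stated in the commit message
HARD_FORECAST_LIMIT = 500 * MB    # assumed: 500mb and above are rejected


def parse_mb(value: str) -> int:
    """Parse a size such as "50mb" into bytes (this sketch handles only the mb suffix)."""
    text = value.strip().lower()
    if not text.endswith("mb"):
        raise ValueError(f"unsupported size format: {value!r}")
    return int(text[:-2]) * MB


def effective_forecast_limit(requested, job_model_limit_bytes: int) -> int:
    """Return the forecast memory limit in bytes after applying the described rules."""
    limit = DEFAULT_FORECAST_LIMIT if requested is None else parse_mb(requested)
    if limit >= HARD_FORECAST_LIMIT:
        raise ValueError("max_model_memory must be below 500mb")
    # 40% of the anomaly job's configured model memory limit.
    threshold = job_model_limit_bytes * 2 // 5
    if limit > threshold:
        # Assumed reduction: one byte below the threshold, so strictly lower.
        limit = threshold - 1
    return limit
```

For a job with a 100mb model limit, a request for "150mb" would come out just under 40mb in this sketch, while a request with no value against a 1024mb job keeps the 20mb default.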


721 Commits