OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-02-18 19:05:06 +00:00

Author	SHA1	Message	Date
Benjamin Trent	a9b868b7a9	[7.x] [ML] allow data streams to be expanded for analytics and transforms (#58280 ) (#58455 ) This commit allows data streams to be a valid source for analytics and transforms. Data streams are fairly transparent and our `_search` and `_reindex` actions work without error. For `_transforms` the check-pointing works as desired as well. Data streams are effectively treated as an `alias` and the backing index values are stored within checkpointing information.	2020-06-23 14:40:35 -04:00
Benjamin Trent	0cc84d3caf	[ML] wait for yellow state for stats index in tests (#58436 ) (#58456 ) GET inference stats now reads from the .ml-stats index. Our tests should wait for yellow state before attempting to query the index for stat information.	2020-06-23 13:32:24 -04:00
James Rodewig	affc3954e6	[DOCS] Fix typo in RoutingNode comment (#58079 ) (#58454 ) Co-authored-by: Howard <danielhuang@tencent.com>	2020-06-23 13:07:08 -04:00
Dimitris Athanasiou	f67fee387b	[7.x][ML] Make regression training set predictable in size (#58331 ) (#58453 ) Unlike `classification`, which is using a cross validation splitter that produces training sets whose size is predictable and equal to `training_percent * class_cardinality`, for regression we have been using a random splitter that takes an independent decision for each document. This means we cannot predict the exact size of the training set. This poses a problem as we move towards performing test inference on the java side as we need to be able to provide an accurate upper bound of the training set size to the c++ process. This commit replaces the random splitter we use for regression with the same streaming-reservoir approach we do for `classification`. Backport of #58331	2020-06-23 19:49:03 +03:00
Marios Trivyzas	e7c40d973e	SQL: Relax parsing of date/time escaped literals (#58336 ) (#58450 ) Improve the usability of the MS-SQL server/ODBC escaped date/time/timestamp literals, by allowing timezone/offset ids in the parsed string, e.g.: ``` {ts '2000-01-01T11:11:11Z'} ``` Closes: #58262 (cherry picked from commit 0af1f2fef805324e802d97d2fd9b4660abb403f0)	2020-06-23 18:05:54 +02:00
Christoph Büscher	642b05a511	Fix test failure in RangeQueryBuilderTests.testToQuery (#58449 ) Very rarely this test can fail if we draw a random TimeZone id that we cannot parse with the legacy joda DateMathParser and get an IllegalArgumentException. In addition to a "SystemV/*" time zone we also need an index "versionCreated" before V_7_0_0 and no "format" setting in the query builder. Given how unlikely this combination is, we should simply disallow those time zone ids when generating the random query builder for RangeQueryBuilderTests. Closes #58431	2020-06-23 17:44:18 +02:00
David Roberts	0d6bfd0ac3	[7.x][ML] Fix wire serialization for flush acknowledgements (#58443 ) There was a discrepancy in the implementation of flush acknowledgements: most of the class was designed on the basis that the "last finalized bucket time" could be null but the wire serialization assumed that it was never null. This works because the C++ sends zero "last finalized bucket time" when it is not known or not relevant. But then the Java code will print that to XContent as it is assuming null represents not known or not relevant. This change corrects the discrepancies. Internally within the class null represents not known or not relevant, but this is translated from/to 0 for communications from the C++ and old nodes that have the bug. Additionally I switched from Date to Instant for this class and made the member variables final to modernise it a bit. Backport of #58413	2020-06-23 16:42:06 +01:00
Mark Tozzi	52806a8f89	Small VS config cleanup (#58294 ) (#58442 )	2020-06-23 10:53:06 -04:00
Benjamin Trent	61142a3005	[ML] only log if forecasts are set to failed (#58421 ) (#58437 ) This adjusts the logging level for setting forecasts to failed to WARN. And it will only log if 1 or more forecasts were adjusted to failed.	2020-06-23 10:24:03 -04:00
James Rodewig	afbf3bd33b	[DOCS] Add data streams to bulk, delete, and index API docs (#58340 ) (#58434 ) Updates existing docs for the bulk, delete and index APIs to make them aware of data streams.	2020-06-23 09:40:25 -04:00
Alan Woodward	8ebd341710	Add text search information to MappedFieldType (#58230 ) (#58432 ) Now that MappedFieldType no longer extends lucene's FieldType, we need to have a way of getting the index information about a field necessary for building text queries, building term vectors, highlighting, etc. This commit introduces a new TextSearchInfo abstraction that holds this information, and a getTextSearchInfo() method to MappedFieldType to make it available. Field types that do not support text search can just return null here. This allows us to remove the MapperService.getLuceneFieldType() shim method.	2020-06-23 14:37:26 +01:00
Nik Everett	519f41950a	Save memory when significant_text is not on top (#58145 ) (#58364 ) This merges the aggregator for `significant_text` into `significant_terms`, applying the optimization built in #55873 to save memory when the aggregation is not on top. The `significant_text` aggregation is pretty memory intensive all on its own and this doesn't particularly help with that, but it'll help with the memory usage of any sub-aggregations.	2020-06-23 09:19:05 -04:00
James Rodewig	9d03204308	[DOCS] Prohibit deletion of composable template in use by data stream (#58347 ) (#58430 ) Notes that you cannot delete a composable template currently in use by a data stream. Relates to #57957.	2020-06-23 09:01:17 -04:00
James Rodewig	b213f0222c	[DOCS] Reword tip in data streams overview	2020-06-23 08:57:59 -04:00
Dan Hermann	41e8f584c1	[7.x] Minimum node version check before creating data stream (#58424 )	2020-06-23 07:45:27 -05:00
Armin Braun	943efb78fd	Save Shard ID Serializations in Bulk Requests (#56209 ) (#58414 ) Just like #56094 but for the request side. Removes a lot of redundant `ShardId` instances from bulk shard requests as well as stops serializing index names when they're not needed because they're not different from what is in the shard id. Even ignoring the index name serialization savings here, this change saves one `ShardId` instance per bulk shard request at least. This means it saves approximately: * 8 bytes for the `ShardId` object (itself + one field) * + another 4 bytes for the `int` in the `ShardId` * 16 bytes (two fields + the instance itself + the padding) for the `Index` object * + 30 bytes for the `Index` uuid string * + all the bytes in the index name string => 60+ bytes per bulk request item saved on heap and over the wire	2020-06-23 12:35:52 +02:00
David Turner	256b660f0a	Remove anonymous PublicationContext implementation (#58412 ) Today the `PublicationContext` interface has a single anonymous implementation, and `PublicationTransportHandler` has various methods that take the variables that this anonymous class captures. This commit refactors this into a proper class with proper fields and moves the relevant methods onto this class. Backport of #58405 to 7.x.	2020-06-23 11:13:23 +01:00
Alan Woodward	519d1278e2	Make FieldTypeLookup immutable (#58162 ) (#58411 ) FieldTypeLookup maps field names to their MappedFieldTypes. In the past, due to the presence of multiple mapping types within a single index, this had to be updated in-place because a mapping update might only affect one type. However, now that we only have a single type per index, we can completely rebuild the FieldTypeLookup on each update, removing lots of concurrency worries.	2020-06-23 10:51:32 +01:00
David Roberts	f97b37190b	[ML] Add a new annotation type for categorization status changes (#58394 ) Adds a new value to the "event" enum of ML annotations, namely "categorization_status_change". This will allow users to see when categorization was found to be performing poorly. Once per-partition categorization is available, it will allow users to see when categorization is performing poorly for a specific partition. It does not make sense to reuse the "model_change" event that annotations already have, because categorizer state is separate to model state ("model" state is really anomaly detector state), and is not reverted by the revert model snapshot API. Therefore annotations related to categorization need to be treated differently to annotations related to anomaly detection.	2020-06-23 09:16:27 +01:00
Rene Groeschke	fc60cf6179	Introduce EnforceDeprecationFailuresPlugin (#58263 ) (#58309 ) - extract fail on deprecated usage into its own plugin - apply on all projects - ensures we don't miss any project (missed xpack/plugin/eql/qa/security before)	2020-06-23 09:14:12 +02:00
Rene Groeschke	bd2dd81bc6	Fix deprecated property usage in archive tasks (#58269 ) (#58308 )	2020-06-23 09:11:46 +02:00
István Zoltán Szabó	3169e4c70e	[DOCS] Updates screenshots in ML population analysis (#58318 )	2020-06-23 09:05:08 +02:00
Martijn van Groningen	7dda9934f9	Keep track of timestamp_field mapping as part of a data stream (#58400 ) Backporting #58096 to 7.x branch. Relates to #53100 * use mapping source directly instead of using mapper service to extract the relevant mapping details * moved assertion to TimestampField class and added helper method for tests * Improved logic that inserts timestamp field mapping into a mapping. If the timestamp field path consisted of object fields and if the final mapping did not contain the parent field then an error occurred, because the prior logic assumed that the object field existed.	2020-06-22 17:46:38 +02:00
Costin Leau	765f1b5775	SQL: Fix bug in resolving aliases against filters (#58399 ) When doing aliasing with the same name over non existing fields, the analyzer gets stuck in a loop trying to resolve the alias over and over leading to SO. This PR breaks the cycle by checking the relationship between the alias and the child it tries to replace as an alias should never replace its child. Fix #57270 Close #57417 Co-authored-by: Hailei <zhh5919@163.com> (cherry picked from commit 46786ff2e1ed5951006ff4bdd2b6ac6a1ebcf17b)	2020-06-22 16:05:42 +03:00
Dan Hermann	c5f5cc4cf8	[DOCS] Prohibit cloning, splitting, and shrinking a data stream's write index (#58105 ) (#58401 )	2020-06-22 07:29:26 -05:00
Przemko Robakowski	a44dad9fbb	[7.x] Add support for snapshot and restore to data streams (#57675 ) (#58371 ) * Add support for snapshot and restore to data streams (#57675) This change adds support for including data streams in snapshots. Names are provided in indices field (the same way as in other APIs), wildcards are supported. If rename pattern is specified it renames both data streams and backing indices. It also adds test to make sure SLM works correctly. Closes #57127 Relates to #53100 * version fix * compilation fix * compilation fix * remove unused changes * compilation fix * test fix	2020-06-19 22:41:51 +02:00
Benjamin Trent	bf8641aa15	[7.x] [ML] calculate cache misses for inference and return in stats (#58252 ) (#58363 ) When a local model is constructed, the cache hit miss count is incremented. When a user calls _stats, we will include the sum cache hit miss count across ALL nodes. This statistic is important in comparing against the inference_count. If the cache hit miss count is near the inference_count it indicates that the cache is overburdened, or inappropriately configured.	2020-06-19 09:46:51 -04:00
James Rodewig	d8dc638a67	[DOCS] Document get data stream API response body (#58344 ) (#58360 )	2020-06-18 16:42:05 -04:00
James Rodewig	b8fa90198b	[DOCS] Prohibit deletion of a data stream's write index (#58341 ) (#58358 )	2020-06-18 16:00:10 -04:00
Lisa Cawley	6680271691	[DOCS] Updates pull and issue release attributes (#58348 )	2020-06-18 12:55:02 -07:00
Nik Everett	49684463dd	Mute ESTestCaseTests#testBasePortGradle Tracked by #58279. Failed a few times a day since June 13th.	2020-06-18 15:41:25 -04:00
Tal Levy	11086d5c7d	add geo_shape documentation for supported aggregations (#58284 ) (#58354 ) This commit adds documentation for geo_shape fields in aggregations Closes #55495.	2020-06-18 12:36:24 -07:00
William Brafford	b3c99f06d6	Mute flaky test (#58356 )	2020-06-18 15:30:11 -04:00
Andrei Dan	30e777856f	[7.x] Validate alias operations don't target data streams (#58327 ) (#58337 ) This adds validation to make sure alias operations (add, remove, remove index) don't target data streams or the backing indices. (cherry picked from commit 816448990e464a02f3960f12f6f6644a8cce36a4) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-06-18 20:23:07 +01:00
William Brafford	4836236446	Mute flaky multicluster tests (#58350 )	2020-06-18 14:39:59 -04:00
Ryan Ernst	d702cb0ad9	Consolidate temp dir handling in packaging tests (#58292 ) The packaging tests currently have a couple different ways of deciding where temp files should be placed, and then sometimes used fixed file or directory names within that dir. This commit consolidates some of that temp dir handling by making it more compatible with the handling that exists within the bats tests, where /tmp is not always appropriate due to how systemd interacts with it. This commit also adds a utility method for creating temp dirs, so as to ensure the new directory is created as if a umask of 022 were used, which is not the case when using Files.createTempDirectory without a set of permissions (it assumes 077).	2020-06-18 11:35:11 -07:00
Stuart Tettemer	20abba8433	Scripting: Deprecate general cache settings (#55753 ) (#58283 ) Backport: ef543b0	2020-06-18 11:54:23 -06:00
Jim Ferenczi	1c1a6d4ec8	Handle failures with no explicit cause in async search (#58319 ) This commit fixes an AOOBE in the handling of fatal failures in _async_search. If the underlying cause is not found, this change uses the root failure. Closes #58311	2020-06-18 18:57:58 +02:00
Przemysław Witek	9dd3d5aa48	[7.x] Delete auto-generated annotations when model snapshot is reverted (#58240 ) (#58335 )	2020-06-18 17:59:52 +02:00
Jason Tedor	be08268562	Allow follower indices to override leader settings (#58103 ) Today when creating a follower index via the put follow API, or via an auto-follow pattern, it is not possible to specify settings overrides for the follower index. Instead, we copy all of the leader index settings to the follower. Yet, there are cases where a user would want some different settings on the follower index such as the number of replicas, or allocation settings. This commit addresses this by allowing the user to specify settings overrides when creating a follower index via manual put follower calls, or via auto-follow patterns. Note that not all settings can be overridden (e.g., index.number_of_shards) so we also have detection that prevents attempting to override settings that must be equal between the leader and follower index. Note that we do not even allow specifying such settings in the overrides, even if they are specified to be equal between the leader and the follower index. Instead, they must be implicitly copied from the leader index, not explicitly set by the user.	2020-06-18 11:56:06 -04:00
James Rodewig	9ba1b1d067	[DOCS] Reformat data stream API docs (#58322 ) (#58334 )	2020-06-18 10:59:12 -04:00
Alan Woodward	4b8cf2af6a	Add serialization test for FieldMappers when include_defaults=true (#58235 ) (#58328 ) Fixes a bug in TextFieldMapper serialization when index is false, and adds a base-class test to ensure that all field mappers are tested against all variations with defaults both included and excluded. Fixes #58188	2020-06-18 15:46:04 +01:00
Marios Trivyzas	50b391e91b	SQL: [Docs] Fix TIME_PARSE documentation (#58182 ) (#58317 ) TIME_PARSE works correctly if both date and time parts are specified, and a TIME object (that contains only time is returned). Adjust docs and add a unit test that validates the behavior. Follows: #55223 (cherry picked from commit 9d6b679a5da88f3c131b9bdba49aa92c6c272abe)	2020-06-18 16:09:13 +02:00
Dan Hermann	3b511fd829	[DOCS] Add data stream APIs to main API page (#58204 ) (#58325 )	2020-06-18 08:41:43 -05:00
Dan Hermann	a2837097ff	[DOCS] Move some docs about data streams from the create page to the intro page	2020-06-18 08:24:06 -05:00
James Rodewig	64fb326637	[DOCS] Add data streams to search docs (#58278 ) (#58320 ) Changes: * Adds additional examples to the `Search a data stream` section of `Use a data stream` * Updates existing search docs to make them aware of data streams	2020-06-18 08:59:00 -04:00
Alan Woodward	ca2d12d039	Remove Settings parameter from FieldMapper base class (#58237 ) This is currently used to set the indexVersionCreated parameter on FieldMapper. However, this parameter is only actually used by two implementations, and clutters the API considerably. We should just remove it, and use it directly in the implementations that require it.	2020-06-18 12:53:54 +01:00
Rory Hunter	4da767bb3e	Fix version	2020-06-18 12:29:47 +01:00
Rory Hunter	a71f0cabdc	Version bump for 7.8.0 release	2020-06-18 11:04:56 +01:00
Christoph Büscher	ba0b046909	Fix test compilation issue	2020-06-18 11:36:11 +02:00

52541 Commits