OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-02-06 04:58:50 +00:00

Author	SHA1	Message	Date
Armin Braun	ed4984a32e	Remove Redundant Stream Wrapping from Compression (#62017 ) (#62132 ) In many cases we don't need a `StreamInput` or `StreamOutput` wrapper around these streams, so this commit adjusts the API to use plain streams and adds the wrapping where necessary.	2020-09-09 03:27:38 +02:00
Julie Tibshirani	2ca5f98e05	Small fixes to breaking changes docs. * Move ngram and shingle changes to the analysis section. * Add missing heading for field caps change.	2020-09-08 17:19:36 -07:00
Lisa Cawley	78b955eb86	[DOCS] Fix from and size descriptions for model APIs (#62128 )	2020-09-08 12:56:36 -07:00
Nik Everett	b8e9a7125f	Speed up empty highlighting many fields (backport of #61860 ) (#62122 ) Kibana often highlights everything like this: ``` POST /_search { "query": ..., "size": 500, "highlight": { "fields": { "*": { ... } } } } ``` This can get slow when there are hundreds of mapped fields. I tested this locally and unscientifically and it took a request from 20ms to 150ms when there are 100 fields. I've seen clusters with 2000 fields where simple searches go from 500ms to 1500ms just by turning on this sort of highlighting, even when the query is just a `range` and the fields are all numbers, so it won't highlight anything. This speeds up the `unified` highlighter in this case in a few ways: 1. Build the highlighting infrastructure once per field rather than once per document per field. This cuts out a ton of work analyzing the query over and over and over again. 2. Bail out of the highlighter before loading values if we can't produce any results. Combined, these take that local 150ms case down to 65ms. This is unlikely to be really useful when there are only a few fetched docs and only a few fields, but we often end up having many fields with many fetched docs.	2020-09-08 15:49:50 -04:00
Alan Woodward	28fd4a2ae8	Convert RangeFieldMapper to parametrized form (#62058 ) This also adds the ability to define a serialization check on Parameters, used in this case to only serialize format and locale parameters if the mapper is a date range.	2020-09-08 18:44:13 +01:00
Alan Woodward	5f05eef7e3	Convert some more mapping tests to MapperServiceTestCase (#62089 ) We don't need to extend ESSingleNodeTestCase for all these tests.	2020-09-08 17:51:40 +01:00
James Rodewig	5bca671f57	[DOCS] Fix ILM read only link (#62113 ) (#62119 )	2020-09-08 12:19:24 -04:00
James Rodewig	cc5e01a242	[DOCS] Fix field caps API docs (#62110 ) (#62116 )	2020-09-08 12:19:04 -04:00
Costin Leau	0f9532689f	EQL: Propagate key constraints through the query (#62073 ) Since join keys are common across all queries in a Join/Sequence, any constraint applied on one query needs to be obeyed by all the other queries. This PR enhances the optimizer to propagate such constraints across all queries so they get pushed down to the actual generated ES queries. Fix #58937 (cherry picked from commit 4afa5debc199c132c07015bfae17952c40a21e5d)	2020-09-08 18:40:47 +03:00
Benjamin Trent	057bf3f7d5	[ML] setting require_alias to previous value on bulk index retry (#62103 ) (#62108 ) Previous work has been done to prevent automatically creating a concrete index when an alias is desired. This commit addresses a path where this check was not being done. relates: #62064	2020-09-08 11:38:32 -04:00
Lisa Cawley	f0e7d88699	[DOCS] Fix allow_no_match description for model APIs (#62008 )	2020-09-08 08:15:16 -07:00
Tim Brooks	075271758e	Keep checkpoint file channel open across fsyncs (#61744 ) Currently we open and close the checkpoint file channel for every fsync. This file channel can be kept open for the lifecycle of a translog writer. This avoids the overhead of opening the file, checking file permissions, and closing the file on every fsync.	2020-09-08 08:54:53 -06:00
Adam Locke	8f27e9fa28	[DOCS] [7.8] Clarify HTTPS usage for create key API (#60858 ) (#62098 ) * Update create-api-keys.asciidoc * Adding note to create API keys for https * Adding note for enabling TLS * Add specific setting for ssl.enabled * Incorporating review feedback	2020-09-08 10:23:43 -04:00
James Rodewig	97bba08ea6	[DOCS] Fix typo in Java API docs (#62095 ) (#62097 )	2020-09-08 09:49:03 -04:00
Dimitris Athanasiou	41507cff48	[7.x][ML] Update mappings of ml stats index (#61980 ) (#62091 ) - Adds missing mappings for `alpha`, `gamma`, and `lambda`. - Corrects name of `soft_tree_depth_limit` and `soft_tree_depth_tolerance`. - Removes unused `regularization_depth_penalty_multiplier`, `regularization_leaf_weight_penalty_multiplier` and `regularization_tree_size_penalty_multiplier`. Backport of #61980	2020-09-08 16:41:57 +03:00
David Roberts	b2636678b2	[ML] Add support for date_nanos fields in find_file_structure (#62048 ) Now that #61324 is merged it is possible for the find_file_structure endpoint to suggest using date_nanos fields for timestamps where the timestamp format provides greater than millisecond accuracy.	2020-09-08 13:05:09 +01:00
Francisco Fernández Castaño	2bb5716b3d	Add repositories metering API (#62088 ) This pull request adds a new set of APIs that allows tracking the number of requests performed by the different registered repositories. In order to avoid losing data, the repository statistics are archived after the repository is closed for a configurable retention period `repositories.stats.archive.retention_period`. The API exposes the statistics for the active repositories as well as the modified/closed repositories. Backport of #60371	2020-09-08 14:01:04 +02:00
David Kyle	fb6ee5b36d	[7.x] [ML] Assert mappings match templates in Upgrade tests (#61905 ) At the end of the rolling upgrade tests check the mappings of the concrete .ml and .transform-internal indices match the mappings in the templates. When the templates change, the tests should prove that the mappings have been updated in the new cluster.	2020-09-08 12:21:19 +01:00
Przemko Robakowski	bb357f6aae	[7.x] Move internal index templates to composable templates (#61457 ) (#61661 ) This change moves watcher, ILM history and SLM history templates to composable templates. Versions are updated to reflect the switch. The only change to the templates themselves is the addition of `_meta` to mark them as managed.	2020-09-08 11:26:06 +02:00
Armin Braun	ebd1569028	Fix testMasterFailOverWithQueuedDeletes (#62062 ) (#62078 ) Fixing very rare corner case where the delete retry is slow. Closes #62031	2020-09-08 10:35:06 +02:00
Andrei Stefan	7d5791b6bd	EQL: create the search request with a list of indices (#62005 ) (#62076 ) * The query client uses an array of indices instead of the comma separated version of the indices names (cherry picked from commit 8ec4a768f4892a4a2faed25836cb333a9deb2ace)	2020-09-08 10:26:59 +03:00
Nhat Nguyen	6574d81c59	Fix testEnableSoftDeletesOnRestore Relates #62018	2020-09-07 15:10:55 -04:00
David Kyle	a5b24bf44c	Mute ClassificationIT (#62063 ) testWithOnlyTrainingRowsAndTrainingPercentIsFifty_DependentVariableIsBoolean For #60759	2020-09-07 16:10:48 +01:00
Nhat Nguyen	bb0a583990	Allow enabling soft-deletes on restore from snapshot (#62018 ) Closes #61969	2020-09-07 09:45:36 -04:00
Luca Cavanna	168b448a0f	Rename runtime_script field type to runtime (#62034 ) We've had some discussions around the user experience when using runtime fields. Although we do plan on having multiple runtime field implementations (e.g. grok, lookup etc.) which could be exposed as different field types, we decided to expose all runtime fields under the same `runtime` type. At the moment, the only implementation will be through scripts, hence a `script` must be specified. In the future, there will be other ways to generate values for runtime fields besides scripts. This also means renaming the RuntimeScriptFieldMapper class to RuntimeFieldMapper. Relates to #59332	2020-09-07 15:07:23 +02:00
Alan Woodward	cbc9578cbd	Remove SearchPhase interface (#62050 ) The interface is never used as an abstraction - implementations are called directly, and most of them don't need to implement the preProcess method.	2020-09-07 13:45:43 +01:00
David Turner	3389d5ccb2	Introduce integ tests for high disk watermark (#60460 ) An important goal of the disk threshold decider is to ensure that nodes use less disk space than the high watermark, and to take action if a node ever exceeds this watermark. Today we do not have any integration-style tests of this high-level behaviour. This commit introduces a small test harness that can adjust the apparent size of the disk and verify that the disk threshold decider moves shards around in response. Co-authored-by: Yannick Welsch <yannick@welsch.lu>	2020-09-07 14:39:39 +02:00
David Kyle	dfd196cb01	Mute Docs rollover index test snippet (#62045 ) (#62047 ) For #62043	2020-09-07 12:47:02 +01:00
Armin Braun	395538f508	Improve Snapshot State Machine Performance (#62000 ) (#62049 ) Just a few random things to optimize motivated by somewhat sub-standard performance for large snapshot cluster states with many concurrent snapshots observed in production.	2020-09-07 13:25:40 +02:00
Jim Ferenczi	fa8e76abb1	Improve reduction of terms aggregations (#61779 ) (#62028 ) Today, the terms aggregation reduces multiple aggregations at once using a map to group same buckets together. This operation can be costly since it requires looking up every bucket in a global map with no particular order. This commit changes how term buckets are sorted by shards and partial reduces in order to be able to reduce results using a merge-sort strategy. For bwc, results are merged with the legacy code if any of the aggregations use a different sort (if it was returned by a node in prior versions). Relates #51857	2020-09-07 13:13:20 +02:00
István Zoltán Szabó	b07b75ce14	[DOCS] Removes inference from the names of trained model APIs. (#62036 ) (#62041 ) # Conflicts: # docs/reference/ml/df-analytics/apis/get-inference-trained-model.asciidoc	2020-09-07 12:14:13 +02:00
Alan Woodward	a295b0aa86	Fix null_value parsing for date_nanos field mapper (#61994 ) The null_value parameter for date fields is always parsed using DateFormatter.parseMillis, which is incorrect for nanosecond resolution fields. This commit changes the parsing logic to always use DateFieldType.parse() to parse the null value.	2020-09-07 10:58:54 +01:00
Alan Woodward	1799c0c583	Convert completion, binary, boolean tests to MapperTestCase (#62004 ) Also fixes a metadata serialization bug in CompletionFieldMapper.	2020-09-07 10:48:20 +01:00
Martijn van Groningen	7e566ddd06	Move data stream yaml tests to xpack plugin module. (#62032 ) Backport of #61998 to 7.x branch. Moving the data stream yaml tests to xpack plugin module has the following benefits: * The tests are run both with security enabled (as part of xpack/plugin integTest) and disabled (as part of xpack/plugin/data-stream/qa/rest integTest). * The tests are also run in the mixed cluster qa environment.	2020-09-07 11:03:32 +02:00
Tanguy Leroux	ebbf4df9fd	Adapt SearchableSnapshotsBlobStoreCacheIntegTests to Lucene 8.7.0 (#61989 ) (#62030 ) Elasticsearch now uses #61957 which includes https://issues.apache.org/jira/browse/LUCENE-9456. We can remove the corresponding //TODO in SearchableSnapshotsBlobStoreCacheIntegTests.	2020-09-07 10:25:44 +02:00
Luca Cavanna	0c8b438577	Add support for runtime fields (#61776 ) This commit includes the work that has been done on the runtime fields feature branch until now. The high level tasks are listed in #59332. The tasks that have not yet been completed can be worked on after merging the feature branch. We are adding a new x-pack plugin called runtime-fields that plugs in a custom mapper which allows defining runtime fields based on a script. The changes included in this commit that were made outside of the x-pack/plugin/runtime-fields directory are minimal and revolve around 1) making the ScriptService available while parsing index mappings so that the scripts associated with runtime fields can be compiled 2) sharing code to manipulate ranges etc. as it can be reused in runtime fields. Co-authored-by: Nik Everett <nik9000@gmail.com>	2020-09-07 09:14:53 +02:00
Howard	b26584dff8	Remove unused deciders in BalancedShardsAllocator (#62026 )	2020-09-07 00:04:16 -04:00
Armin Braun	1e3edbbe74	Simplify BytesReference StreamInput (#61681 ) (#62014 ) Flattening both streams into a single stream here saves a few objects and some indirection. Also, removed the redundant `offset` field which added nothing but complexity by forcing the incrementation of two counters on every read.	2020-09-05 10:45:52 +02:00
Luca Cavanna	ab8f65a099	[TEST] Don't specify a type unless needed (#62011 ) We have a couple of yaml tests that index documents under a 'test' type, while they could omit it. We do still want to test that specifying the type is allowed in 7.x, but we already have specific tests for that, and other tests should use the endpoints that don't require specifying a type.	2020-09-05 09:27:00 +02:00
Lisa Cawley	bc5eec8205	[DOCS] Fix capitalization in HLRC ML APIs (#62010 ) (#62012 )	2020-09-04 16:57:15 -07:00
Ryan Ernst	6d3b691048	Add snapshot only test modules (#61954 ) This commit adds external test modules. These are modules meant for external systems to test edge cases in elasticsearch, but only within snapshots. They are not meant to be used in production, so protections are also added against their accidental inclusion in release builds. Note that this commit does not actually add any new modules, it only adds the infrastructure for the new modules, under `test/external-modules`.	2020-09-04 16:35:18 -07:00
Rene Groeschke	0b6d187932	Fix resolveAllDependencies broken by ArtifactTransforms (#61972 ) (#61979 ) - ignore es extracted configuration for resolveAllDeps - fixes #61945	2020-09-04 20:21:46 +02:00
Lisa Cawley	2789b8e6c4	[DOCS] Refresh machine learning custom URL example (#61826 ) (#61950 )	2020-09-04 09:44:55 -07:00
James Rodewig	9f1f468cef	[DOCS] Document dynamic discovery settings (#61420 ) (#62002 )	2020-09-04 11:36:34 -04:00
James Rodewig	7e2903d888	[DOCS] Document dynamic index mgmt and buffer settings (#61753 ) (#61996 )	2020-09-04 10:40:55 -04:00
Dimitris Athanasiou	d37f197efd	[7.x][ML] Allow training_percent to be any positive double up to hundred (#61977 ) (#61990 ) This changes the valid range of `training_percent` for regression and classification from [1, 100] to (0, 100]. Backport of #61977	2020-09-04 17:34:14 +03:00
James Rodewig	3396184ff3	[DOCS] Use correct get document API (#61804 ) (#61992 ) The documentation refers to a deprecated get document API call (it uses document `type`). Co-authored-by: Thiago Souza <thiago@elastic.co>	2020-09-04 10:04:33 -04:00
Yannick Welsch	6d08b55d4e	Simplify searchable snapshot shard allocation (#61911 ) Simplifies allocation for snapshot-backed shards by always making the recovery source "from snapshot" for those snapshot-backed shards (instead of "recover from local or from empty store"). Also lets the balancer pick the node to which the snapshot-backed shard is allocated (which takes the number of shards on each node into account, unlike the current implementation, which just picks whatever node we are allowed to allocate to, with no notion of "balancing" at all).	2020-09-04 15:45:00 +02:00
James Rodewig	7863df88e3	[DOCS] Fix typo in URL-based access control docs (#61896 ) (#61986 ) Co-authored-by: George Tseres <george.tseres@gmail.com>	2020-09-04 09:24:48 -04:00
Alan Woodward	66bb1eea98	Improve error messages on bad [format] and [null_value] params for date mapper (#61932 ) Currently, if an incorrectly formatted date is passed as a null_value for a date field mapper configuration, you get a vague error: Failed to parse mapping [_doc]: cannot parse empty date Similarly, if you pass an incorrect format, you get the error: Failed to parse mapping [_doc]: Invalid format [...] This commit improves both these errors by including the mapper name and parameter that are misconfigured. Fixes #61712	2020-09-04 14:13:28 +01:00


53495 Commits