OpenSearch

Commit Graph

Author	SHA1	Message	Date
Nik Everett	9789e6d154	Migrate some field mapper tests to ESTestCase (#61301 ) (#61346 ) This switches a few tests for field mappers from `ESSingleNodeTestCase` to `ESTestCase` because, in general, we prefer to avoid `ESSingleNodeTestCase` when we can because it is slow and "big". "Big" here means that it pulls in an entire node, making it difficult to reason about what you are testing.	2020-08-19 15:43:49 -04:00
Mark Tozzi	db1df6cc30	[7.x] Remove a bunch of type boilerplate from Aggs (#60852 ) (#61031 )	2020-08-17 12:13:05 -04:00
Alan Woodward	c81dc2b8b7	Convert KeywordFieldMapper to parametrized form (#60645 ) This makes KeywordFieldMapper extend ParametrizedFieldMapper, with explicitly defined parameters. In addition, we add a new option to Parameter, restrictedStringParam, which accepts a restricted set of string options.	2020-08-12 11:41:11 +01:00
Nik Everett	664ba0a80a	Fix the parent join aggregator test case (#60991 ) The test was putting parent and child documents into different segments which is unrealistic and was causing errors. Closes #60980	2020-08-11 17:53:15 -04:00
Nhat Nguyen	4bdf283619	Mute ChildrenToParentAggregatorTests Tracked at #60980	2020-08-11 12:56:29 -04:00
Alan Woodward	54279212cf	Make MetadataFieldMapper extend ParametrizedFieldMapper (#59847 ) (#60924 ) This commit cuts over all metadata field mappers to parametrized format.	2020-08-11 09:02:28 +01:00
Jim Ferenczi	f30f1f04e2	Replace AggregatorTestCase#search with AggregatorTestCase#searchAndReduce (#60816 ) This commit removes the ability to test the top level result of an aggregator before it runs the final reduce. All aggregator tests that use AggregatorTestCase#search are rewritten with AggregatorTestCase#searchAndReduce in order to ensure that we test the final output (the one sent to the end user) rather than an intermediary result that could be different. This change also removes spurious commits triggered on top of a random index writer. These commits slow down the tests and are redundant with the commits that the random index writer performs.	2020-08-10 17:23:00 +02:00
Jack Conradson	d4d58e70f5	Unmute the test failure: painless/71_context_api (#60758 ) I was unable to reproduce this locally on either 7.6 (first introduced) and 7.x. This is already not muted on master and doesn't appear to have failures. There were some API changes at the time that could have affected this test, and I'm wondering with backports if this is now stable again. If this has more failures, I will continue to investigate further. Relates to #51939	2020-08-05 10:27:19 -07:00
Jake Landis	f3752ba1d5	7.x suport new path for re-index java-api doc (#60319 ) This commit uses the new location for the reindex java-api documentation. Temporary files have been left behind to pacify the docs build. related #60339	2020-08-05 09:05:07 -05:00
Alan Woodward	b3ae5d26bd	Move mapper validation to the mappers themselves (#60072 ) (#60649 ) Currently, validation of mappers (checking that cross-references are correct, limits on field name lengths and object depths, multiple definitions, etc) is performed by the MapperService. This means that any mapper-specific validation, for example that done on the CompletionFieldMapper, needs to be called specifically from core server code, and so we can't add validation to mappers that live in plugins. This commit reworks the validation framework so that mapper-specific validation is done on the Mapper itself. Mapper gets a new `validate(MappingLookup)` method (already present on `MetadataFieldMapper` and now pulled up to the parent interface), which is called from a new `DocumentMapper.validate()` method. All the validation code currently living on `MapperService` moves either to individual mapper implementations (FieldAliasMapper, CompletionFieldMapper) or into `MappingLookup`, an altered `DocumentFieldMappers` which now knows about object fields and can check for duplicate definitions, or into DocumentMapper which handles soft limit checks.	2020-08-04 14:39:20 +01:00
Rene Groeschke	bdd7347bbf	Merge test runner task into RestIntegTest (7.x backport) (#60600 ) * Merge test runner task into RestIntegTest (#60261) * Merge test runner task into RestIntegTest * Reorganizing Standalone runner and RestIntegTest task * Rework general test task configuration and extension * Fix merge issues * use former 7.x common test configuration	2020-08-04 14:46:32 +02:00
Julie Tibshirani	f99584c6f3	Avoid reloading _source for every inner hit. (#60632 ) Previously if an inner_hits block required _ source, we would reload and parse the root document's source for every hit. This PR adds a shared SourceLookup to the inner hits context that allows inner hits to reuse parsed source if it's already available. This matches our approach for sharing the root document ID. Relates to #32818.	2020-08-03 17:12:27 -07:00
Rene Groeschke	ed4b70190b	Replace immediate task creations by using task avoidance api (#60071 ) (#60504 ) - Replace immediate task creations by using task avoidance api - One step closer to #56610 - Still many tasks are created during configuration phase. Tackled in separate steps	2020-07-31 13:09:04 +02:00
Julie Tibshirani	8ac81a3447	Remove IndexFieldData#clear since it is unused. (#60475 ) This method was never called. It also seemed tricky that calling a method on `IndexFieldData` could clear the contents of a shared cache.	2020-07-30 14:07:55 -07:00
Julie Tibshirani	dfd7f226f0	Clarify SourceLookup sharing across fetch subphases. (#60484 ) The `SourceLookup` class provides access to the _source for a particular document, specified through `SourceLookup#setSegmentAndDocument`. Previously the search context contained a single `SourceLookup` that was shared between different fetch subphases. It was hard to reason about its state: is `SourceLookup` set to the expected document? Is the _source already loaded and available? Instead of using a global source lookup, the fetch hit context now provides access to a lookup that is set to load from the hit document. This refactor closes #31000, since the same `SourceLookup` is no longer shared between the 'fetch _source phase' and script execution.	2020-07-30 13:22:31 -07:00
Julie Tibshirani	5359417ec3	Minor clean-up around search highlight context. (#60422 ) * Rename SearchContextHighlight -> SearchHighlightContext. * Rename HighlighterContext to FieldHighlightContext. * Make the search highlight context immutable. * Avoid storing SearchHighlightContext on HighlighterContext.	2020-07-29 11:39:17 -07:00
David Turner	bbacad648a	Fix network logging test failures (#60334 ) In #60297 we added some tests related to logging from the transport layer, but these tests failed occasionally since the cluster was kept alive between test invocations but the logging framework expected it only to be used for a single test. With this commit we reduce the scope of the internal test cluster to `TEST` to solve this problem. Closes #60321.	2020-07-29 08:29:09 +01:00
Julie Tibshirani	c7bfb5de41	Add search `fields` parameter to support high-level field retrieval. (#60258 ) This feature adds a new `fields` parameter to the search request, which consults both the document `_source` and the mappings to fetch fields in a consistent way. The PR merges the `field-retrieval` feature branch. Addresses #49028 and #55363.	2020-07-28 10:58:20 -07:00
David Turner	9c62b5cb96	Mute tests for #60321	2020-07-28 18:12:54 +01:00
David Turner	9450ea08b4	Log and track open/close of transport connections (#60297 ) Transport connections between nodes remain in place until one or other node shuts down or the connection is disrupted by a flaky network. Today it is very difficult to demonstrate that transient failures and cluster instability are caused by the network even though this is often the case. In particular, transport connections open and close without logging anything, even at `DEBUG` level, making it very hard to quantify the scale of the problem or to correlate the networking problems with external events. This commit adds the missing `DEBUG`-level logging when transport connections open and close, and also tracks the total number of transport connections a node has opened as a measure of the stability of the underlying network.	2020-07-28 17:08:04 +01:00
Jake Landis	92ce41cfaf	[7.x] Introduce javaRestTest source set/task and convert modules (#59939 ) (#60026 ) Introduce a javaRestTest source set and task to compliment the yamlRestTest. javaRestTest differs such that the code is sourced from Java and may have different dependencies and setup requirements for the test clusters. This also allows the tests to run in parallel in different cluster instances to prevent any cross test contamination between the two types of tests. Included in this PR is all :modules no longer use the integTest task. The tests are now driven by test, yamlRestTest, javaRestTest, and internalClusterTest. Since only :modules (and :rest-api-spec) have been converted to yamlRestTest we can now disable the integTest task if either yamlRestTest or javaRestTest have been applied. Once all projects are converted, we can delete the integTest task. related: #56841 related: #59444	2020-07-28 08:39:11 -05:00
Yannick Welsch	ffe114b890	Set specific keepalive options by default on supported platforms (#59278 ) keepalives tell any intermediate devices that the connection remains alive, which helps with overzealous firewalls that are killing idle connections. keepalives are enabled by default in Elasticsearch, but use system defaults for their configuration, which often times do not have reasonable defaults (e.g. 7200s for TCP_KEEP_IDLE) in the context of distributed systems such as Elasticsearch. This PR sets the socket-level keep_alive options for network.tcp.{keep_idle,keep_interval} to 5 minutes on configurations that support it (>= Java 11 & (MacOS \|\| Linux)) and where the system defaults are set to something higher than 5 minutes. This helps keep the connections alive while not interfering with system defaults or user-specified settings unless they are deemed to be set too high by providing better out-of-the-box defaults.	2020-07-28 11:10:04 +02:00
Jake Landis	55216dabb4	[7.x] Per processor description for verbose simulate (#58207 ) (#60008 ) For ingest node processors a per processor description was recently added. This commit displays that description in the verbose output of the pipeline simulation. related #57906	2020-07-21 17:32:45 -05:00
malpani	0555fef799	Support ignore_keywords flag for word delimiter graph token filter (#59563 ) This commit allows customizing the word delimiter token filters to skip processing tokens tagged as keyword through the `ignore_keywords` flag Lucene's WordDelimiterGraphFilter already exposes. Fix for #59491	2020-07-21 16:11:55 +01:00
Nik Everett	6f6076e208	Drop some params from IndexFieldData.Builder (backport of #59934 ) (#59972 ) We never used the `IndexSettings` parameter and we only used the `MappedFieldType` parameter to get the name of the field which we already know everywhere where we build the `IFD.Builder`. This allows us to drop a fair bit of ceremony from a couple of tests.	2020-07-21 10:28:59 -04:00
Armin Braun	5b92596fad	Cleanup and Optimize Multiple Serialization Spots (#59626 ) (#59936 ) Follow up to #59606 using some of the new infrastructure and making similar cleanups (and due to at times better handling of size hints and empty collections also optimizations in the stream utility methods this also means speedups) in various spots in the core codebase.	2020-07-21 10:06:56 +02:00
Nik Everett	95e6e4a452	Small cleanup for IndexFieldData (#59724 ) (#59800 ) This drops `IndexComponent` from `IndexFieldData` because it wasn't doing anything other than forcing us to perform a bunch of ceremony to build them.	2020-07-17 13:38:15 -04:00
Dan Hermann	48df9b1a0e	Update regex file for es user agent node processor (#59697 ) (#59794 )	2020-07-17 11:04:01 -05:00
Benjamin Trent	b7f30fc929	[7.x] Adding new `require_alias` option to indexing requests (#58917 ) (#59769 ) * Adding new `require_alias` option to indexing requests (#58917) This commit adds the `require_alias` flag to requests that create new documents. This flag, when `true` prevents the request from automatically creating an index. Instead, the destination of the request MUST be an alias. When the flag is not set, or `false`, the behavior defaults to the `action.auto_create_index` settings. This is useful when an alias is required instead of a concrete index. closes https://github.com/elastic/elasticsearch/issues/55267	2020-07-17 10:24:58 -04:00
Stuart Tettemer	8fdaed0642	Scripting: Augment String with sha1 and sha256 (#59671 ) (#59723 ) Only available in the ingest context for use in ingest pipelines. Digests are computed on the UTF-8 encoding of the String and are returned as hex strings. sha1() return hex strings of length 40, sha256() returns length 64 Fixes: #59647 Backport: 3c85272	2020-07-16 15:17:32 -05:00
Stuart Tettemer	c491212dc1	Scripting: fix generateContextDoc path and url #59676 (#59722 ) * Add doc runtime class path * Use getAllHttpSocketURI.get(0) instead of getAllHttpSocketURI to get a single test cluster URL rather than a list Backport: 3057e0f	2020-07-16 15:03:36 -05:00
Ignacio Vera	f8037abf47	upgrade to lucene-8.6.0 release (#59596 ) (#59599 )	2020-07-15 12:40:57 +02:00
Luca Cavanna	af2f85be15	Consolidate script parsing from object (7.x) (#59509 ) The update by query action parses a script from an object (map or string). We will need to do the same for runtime fields as they are parsed as part of mappings (#59391). This commit moves the existing parsing of a script from an object from RestUpdateByQueryAction to the Script class. It also adds tests and adjusts some error messages that are incorrect. Also, options were not parsed before and they are now. And unsupported fields trigger now a deprecation warning.	2020-07-14 17:08:29 +02:00
Jake Landis	665b7b7bd8	Convert modules to use yamlRestTest (#59089 ) (#59446 ) This commit moves the modules REST tests to the newly introduced yamlRestTest source set. A few tests have also been re-named to include the correct IT suffix. Without changing the names, the testing conventions task would fail since now that the YAML tests are no longer present pacify the convention. These tests have moved to the internalClusterTest source set. related: #56841	2020-07-13 13:53:05 -05:00
Martijn van Groningen	b1b7bf3912	Make data streams a basic licensed feature. (#59392 ) Backport of #59293 to 7.x branch. * Create new data-stream xpack module. * Move TimestampFieldMapper to the new module, this results in storing a composable index template with data stream definition only to work with default distribution. This way data streams can only be used with default distribution, since a data stream can currently only be created if a matching composable index template exists with a data stream definition. * Renamed `_timestamp` meta field mapper to `_data_stream_timestamp` meta field mapper. * Add logic to put composable index template api to fail if `_data_stream_timestamp` meta field mapper isn't registered. So that a more understandable error is returned when attempting to store a template with data stream definition via the oss distribution. In a follow up the data stream transport and rest actions can be moved to the xpack data-stream module.	2020-07-13 17:26:46 +02:00
Alan Woodward	f4caadd239	MappedFieldType no longer requires equals/hashCode/clone (#59212 ) With the removal of mapping types and the immutability of FieldTypeLookup in #58162, we no longer have any cause to compare MappedFieldType instances. This means that we can remove all equals and hashCode implementations, and in addition we no longer need the clone implementations which were required for equals/hashcode testing. This greatly simplifies implementing new MappedFieldTypes, which will be particularly useful for the runtime fields project.	2020-07-09 21:05:10 +01:00
Dan Hermann	34c50c045c	Data stream support for rank eval API	2020-07-09 13:11:29 -05:00
Przemko Robakowski	c870d6e570	[7.x] Restart tests with data streams (#58330 ) (#59303 ) * Restart tests with data streams (#58330)	2020-07-09 17:52:20 +02:00
Alan Woodward	67a27e2b9d	Add declarative parameters to FieldMappers (#58663 ) The FieldMapper infrastructure currently has a bunch of shared parameters, many of which are only applicable to a subset of the 41 mapper implementations we ship with. Merging, parsing and serialization of these parameters are spread around the class hierarchy, with much repetitive boilerplate code required. It would be much easier to reason about these things if we could declare the parameter set of each FieldMapper directly in the implementing class, and share the parsing, merging and serialization logic instead. This commit is a first effort at introducing a declarative parameter style. It adds a new FieldMapper subclass, ParametrizedFieldMapper, and refactors two mappers, Boolean and Binary, to use it. Parameters are declared on Builder classes, with the declaration including the parameter name, whether or not it is updateable, a default value, how to parse it from mappings, and how to extract it from another mapper at merge time. Builders have a getParameters method, which returns a list of the declared parameters; this is then used for parsing, merging and serialization. Merging is achieved by constructing a new Builder from the existing Mapper, and merging in values from the merging Mapper; conflicts are all caught at this point, and if none exist then a new, merged, Mapper can be built from the Builder. This allows all values on the Mapper to be final. Other mappers can be gradually migrated to this new style, and once they have all been refactored we can merge ParametrizedFieldMapper and FieldMapper entirely.	2020-07-09 11:43:21 +01:00
Martijn van Groningen	17bd559253	Fix the timestamp field of a data stream to @timestamp (#59210 ) Backport of #59076 to 7.x branch. The commit makes the following changes: * The timestamp field of a data stream definition in a composable index template can only be set to '@timestamp'. * Removed custom data stream timestamp field validation and reuse the validation from `TimestampFieldMapper` and instead only check that the _timestamp field mapping has been defined on a backing index of a data stream. * Moved code that injects _timestamp meta field mapping from `MetadataCreateIndexService#applyCreateIndexRequestWithV2Template58956(...)` method to `MetadataIndexTemplateService#collectMappings(...)` method. * Fixed a bug (#58956) that cases timestamp field validation to be performed for each template and instead of the final mappings that is created. * only apply _timestamp meta field if index is created as part of a data stream or data stream rollover, this fixes a docs test, where a regular index creation matches (logs-*) with a template with a data stream definition. Relates to #58642 Relates to #53100 Closes #58956 Closes #58583	2020-07-08 17:30:46 +02:00
Nik Everett	a29d3515a2	Improve cardinality measure used to build aggs (#56533 ) (#59107 ) This makes a `parentCardinality` available to every `Aggregator`'s ctor so it can make intelligent choices about how it collects bucket values. This replaces `collectsFromSingleBucket` and is similar to it but: 1. It supports `NONE`, `ONE`, and `MANY` values and is generally extensible if we decide we can use more precise counts. 2. It is more accurate. `collectsFromSingleBucket` assumed that all sub-aggregations live under multi-bucket aggregations. This is normally true but `parentCardinality` is properly carried forward for single bucket aggregations like `filter` and for multi-bucket aggregations configured in single-bucket for like `range` with a single range. While I was touching every aggregation I renamed `doCreateInternal` to `createMapped` because that seemed like a much better name and it was right there, next to the change I was already making. Relates to #56487 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-07-08 08:42:23 -04:00
Armin Braun	9268b25789	Add Check for Metadata Existence in BlobStoreRepository (#59141 ) (#59216 ) In order to ensure that we do not write a broken piece of `RepositoryData` because the phyiscal repository generation was moved ahead more than one step by erroneous concurrent writing to a repository we must check whether or not the current assumed repository generation exists in the repository physically. Without this check we run the risk of writing on top of stale cached repository data. Relates #56911	2020-07-08 14:25:01 +02:00
Rene Groeschke	a896df53ac	Remove misc dependency related deprecation warnings (7.x backport) (#59122 ) * Fix dependency related deprecations (#58892) * Fix classpath setup for forbiddenapi usage	2020-07-07 17:10:31 +02:00
Ignacio Vera	5cc6457ed8	upgrade to lucene-8.6.0-snapshot-6a715e2ecc3 (#59091 ) (#59120 )	2020-07-07 12:07:41 +02:00
Jake Landis	604c6dd528	7.x - Create plugin for yamlTest task (#56841 ) (#59090 ) This commit creates a new Gradle plugin to provide a separate task name and source set for running YAML based REST tests. The only project converted to use the new plugin in this PR is distribution/archives/integ-test-zip. For which the testing has been moved to :rest-api-spec since it makes the most sense and it avoids a small but awkward change to the distribution plugin. The remaining cases in modules, plugins, and x-pack will be handled in followups. This plugin is distinctly different from the plugin introduced in #55896 since the YAML REST tests are intended to be black box tests over HTTP. As such they should not (by default) have access to the classpath for that which they are testing. The YAML based REST tests will be moved to separate source sets (yamlRestTest). The which source is the target for the test resources is dependent on if this new plugin is applied. If it is not applied, it will default to the test source set. Further, this introduces a breaking change for plugin developers that use the YAML testing framework. They will now need to either use the new source set and matching task, or configure the rest resources to use the old "test" source set that matches the old integTest task. (The former should be preferred). As part of this change (which is also breaking for plugin developers) the rest resources plugin has been removed from the build plugin and now requires either explicit application or application via the new YAML REST test plugin. Plugin developers should be able to fix the breaking changes to the YAML tests by adding apply plugin: 'elasticsearch.yaml-rest-test' and moving the YAML tests under a yamlRestTest folder (instead of test)	2020-07-06 14:16:26 -05:00
Nik Everett	2965c7fe12	Fix bug in parent and child aggregators when parent field not defined (#57089 ) (#59074 ) Adding null check for ParentJoinFieldMapper in ChildrenAggregationBuilder.joinFieldResolveConfig Closes #42997 Co-authored-by: ParthPunkster <parthjain.pj1994@gmail.com>	2020-07-06 10:59:47 -04:00
Martijn van Groningen	f0dd9b4ace	Add data stream timestamp validation via metadata field mapper (#59002 ) Backport of #58582 to 7.x branch. This commit adds a new metadata field mapper that validates, that a document has exactly a single timestamp value in the data stream timestamp field and that the timestamp field mapping only has `type`, `meta` or `format` attributes configured. Other attributes can affect the guarantee that an index with this meta field mapper has a useable timestamp field. The MetadataCreateIndexService inserts a data stream timestamp field mapper whenever a new backing index of a data stream is created. Relates to #53100	2020-07-06 11:32:33 +02:00
Dan Hermann	c1781bc7e7	[7.x] Add include_data_streams flag for authorization (#59008 )	2020-07-03 12:58:39 -05:00
Tim Brooks	605e24ed7c	Use `getPortRange` in http server tests (#58794 ) Currently we are leaving the settings to default port range in the nio and netty4 http server test. This has recently led to tests failing due to what appears to be a port conflict with other processes. This commit modifies these tests to use the test case helper method to generate port ranges. Fixes #58433 and #58296.	2020-07-02 13:21:45 -06:00
Dan Hermann	40655069e2	Data stream support for delete-by-query	2020-07-02 08:17:24 -05:00
Dan Hermann	fba1047ad9	Data stream support for update by query API	2020-07-02 08:16:05 -05:00
Alan Woodward	0cd1dc3143	Percolator keyword fields should not store norms (#58899 ) The refactoring in #57666 inadvertently enabled norms on two of the percolator subfields, leading to an increase in memory usage. This commit disables norms on these fields again.	2020-07-02 13:59:28 +01:00
Rene Groeschke	70713a0a19	Remove deprecated AbstractArchiveTask Gradle API usages (#58657 ) (#58894 ) * Fix deprecated ArchiveTask configurations	2020-07-02 13:08:34 +02:00
Alan Woodward	3ba16e0f39	Move MappedFieldType#getSearchAnalyzer and #getSearchQuoteAnalyzer to TextSearchInfo (#58830 ) Analyzers are specific to text searching, and so should be in TextSearchInfo rather than on the generic MappedFieldType. Backport of #58639	2020-07-01 14:52:14 +01:00
Przemyslaw Gomulka	2c275913b9	[7.x] Week based parsing for ingest date processor (#58597 ) (#58802 ) Date processor was incorrectly parsing week based dates because when a weekbased year was provided ingest module was thinking year was not on a date and was trying to applying the logic for dd/MM type of dates. Date Processor is also allowing users to specify locale parameter. It should be taken into account when parsing dates - currently only used for formatting. If someone specifies 'en-us' locale, then calendar data rules for that locale should be used. The exception is iso8601 format. If someone is using that format, then locale should not override calendar data rules. closes #58479	2020-07-01 15:15:56 +02:00
Yannick Welsch	15c85b29fd	Account for recovery throttling when restoring snapshot (#58658 ) (#58811 ) Restoring from a snapshot (which is a particular form of recovery) does not currently take recovery throttling into account (i.e. the `indices.recovery.max_bytes_per_sec` setting). While restores are subject to their own throttling (repository setting `max_restore_bytes_per_sec`), this repository setting does not allow for values to be configured differently on a per-node basis. As restores are very similar in nature to peer recoveries (streaming bytes to the node), it makes sense to configure throttling in a single place. The `max_restore_bytes_per_sec` setting is also changed to default to unlimited now, whereas previously it was set to `40mb`, which is the current default of `indices.recovery.max_bytes_per_sec`). This means that no behavioral change will be observed by clusters where the recovery and restore settings were not adapted. Relates https://github.com/elastic/elasticsearch/issues/57023 Co-authored-by: James Rodewig <james.rodewig@elastic.co>	2020-07-01 12:19:29 +02:00
Rene Groeschke	d952b101e6	Replace compile configuration usage with api (7.x backport) (#58721 ) * Replace compile configuration usage with api (#58451) - Use java-library instead of plugin to allow api configuration usage - Remove explicit references to runtime configurations in dependency declarations - Make test runtime classpath input for testing convention - required as java library will by default not have build jar file - jar file is now explicit input of the task and gradle will ensure its properly build * Fix compile usages in 7.x branch	2020-06-30 15:57:41 +02:00
Henning Andersen	38be2812b1	Enhance extensible plugin (#58542 ) Rather than let ExtensiblePlugins know extending plugins' classloaders, we now pass along an explicit ExtensionLoader that loads the extensions asked for. Extensions constructed that way can optionally receive their own Plugin instance in the constructor.	2020-06-25 20:37:56 +02:00
Jason Tedor	52ad5842a9	Introduce node.roles setting (#58512 ) Today we have individual settings for configuring node roles such as node.data and node.master. Additionally, roles are pluggable and we have used this to introduce roles such as node.ml and node.voting_only. As the number of roles is growing, managing these becomes harder for the user. For example, to create a master-only node, today a user has to configure: - node.data: false - node.ingest: false - node.remote_cluster_client: false - node.ml: false at a minimum if they are relying on defaults, but also add: - node.master: true - node.transform: false - node.voting_only: false If they want to be explicit. This is also challenging in cases where a user wants to have configure a coordinating-only node which requires disabling all roles, a list which we are adding to, requiring the user to keep checking whether a node has acquired any of these roles. This commit addresses this by adding a list setting node.roles for which a user has explicit control over the list of roles that a node has. If the setting is configured, the node has exactly the roles in the list, and not any additional roles. This means to configure a master-only node, the setting is merely 'node.roles: [master]', and to configure a coordinating-only node, the setting is merely: 'node.roles: []'. With this change we deprecate the existing 'node.*' settings such as 'node.data'.	2020-06-25 14:14:51 -04:00
Tim Brooks	5efec3a517	Add error logging when http test fails (#58505 ) Netty4HttpServerTransportTests has started to fail intermittently. It seems like unexpected successful responses are being received when the test is simulating errors. This commit adds logging to the test to provide additional information when there is an unexpected success. It also adds the logging to the nio http test.	2020-06-24 11:02:20 -06:00
Luca Cavanna	7e2bb8d6a2	Mute Netty4HttpServerTransportTests#testCorsRequest (#58480 ) Relates to #58433	2020-06-24 14:31:38 +02:00
Alan Woodward	d251a482e9	Move MappedFieldType.similarity() to TextSearchInfo (#58439 ) Similarities only apply to a few text-based field types, but are currently set directly on the base MappedFieldType class. This commit moves similarity information into TextSearchInfo, and removes any mentions of it from MappedFieldType or FieldMapper. It was previously possible to include a similarity parameter on a number of field types that would then ignore this information. To make it obvious that this has no effect, setting this parameter on non-text field types now issues a deprecation warning.	2020-06-24 10:00:32 +01:00
Alan Woodward	8ebd341710	Add text search information to MappedFieldType (#58230 ) (#58432 ) Now that MappedFieldType no longer extends lucene's FieldType, we need to have a way of getting the index information about a field necessary for building text queries, building term vectors, highlighting, etc. This commit introduces a new TextSearchInfo abstraction that holds this information, and a getTextSearchInfo() method to MappedFieldType to make it available. Field types that do not support text search can just return null here. This allows us to remove the MapperService.getLuceneFieldType() shim method.	2020-06-23 14:37:26 +01:00
Alan Woodward	4b8cf2af6a	Add serialization test for FieldMappers when include_defaults=true (#58235 ) (#58328 ) Fixes a bug in TextFieldMapper serialization when index is false, and adds a base-class test to ensure that all field mappers are tested against all variations with defaults both included and excluded. Fixes #58188	2020-06-18 15:46:04 +01:00
Alan Woodward	ca2d12d039	Remove Settings parameter from FieldMapper base class (#58237 ) This is currently used to set the indexVersionCreated parameter on FieldMapper. However, this parameter is only actually used by two implementations, and clutters the API considerably. We should just remove it, and use it directly in the implementations that require it.	2020-06-18 12:53:54 +01:00
Rene Groeschke	abc72c1a27	Unify dependency licenses task configuration (#58116 ) (#58274 ) - Remove duplicate dependency configuration - Use task avoidance api accross the build - Remove redundant licensesCheck config	2020-06-18 08:15:50 +02:00
jimczi	a7488ee16f	Fix PercolatorMatchedSlotSubFetchPhaseTests#testHitsExecute	2020-06-17 23:04:17 +02:00
Jim Ferenczi	a19213dcca	Fix nested document support in percolator query (#58149 ) This commit ensures that we filter out nested documents when retrieving the document slots of a matching query. Closes #52850	2020-06-17 22:32:54 +02:00
Alan Woodward	12a3f6dfca	MappedFieldType should not extend FieldType (#58160 ) MappedFieldType is a combination of two concerns: * an extension of lucene's FieldType, defining how a field should be indexed * a set of query factory methods, defining how a field should be searched We want to break these two concerns apart. This commit is a first step to doing this, breaking the inheritance relationship between MappedFieldType and FieldType. MappedFieldType instead has a series of boolean flags defining whether or not the field is searchable or aggregatable, and FieldMapper has a separate FieldType passed to its constructor defining how indexing should be done. Relates to #56814	2020-06-16 16:56:43 +01:00
Tal Levy	69d5e044af	Add optional description parameter to ingest processors. (#57906 ) (#58152 ) This commit adds an optional field, `description`, to all ingest processors so that users can explain the purpose of the specific processor instance. Closes #56000.	2020-06-15 19:27:57 -07:00
Tal Levy	499ad6fcc4	Pre-compile inline scripts in Ingest Script processors (#57960 ) (#58130 ) This commit introduces an optimization for inline scripts. It keeps the compiled ingest script that the ScriptProcessor.Factory has been creating for validation purposes. Previously, the Script Service's cache was leveraged because it was the best way to handle caching of both stored and inline scripts. Since inline scripts are so widely used in Ingest Node, it is probably best to ensure we are using the pre-compiled version from the beginning.	2020-06-15 15:22:56 -07:00
Dan Hermann	8a910443c4	Add ignore_empty_value parameter in set ingest processor (#57030 ) (#58108 )	2020-06-15 08:35:08 -05:00
Rene Groeschke	01e9126588	Remove deprecated usage of testCompile configuration (#57921 ) (#58083 ) * Remove usage of deprecated testCompile configuration * Replace testCompile usage by testImplementation * Make testImplementation non transitive by default (as we did for testCompile) * Update CONTRIBUTING about using testImplementation for test dependencies * Fail on testCompile configuration usage	2020-06-14 22:30:44 +02:00
Martijn van Groningen	c8031c6f99	Add data stream support to the reindex api. (#57970 ) Backport of #57870 to 7.x branch. This change now also copies the op_type from the reindex request's destination index request to the actual index request being used in the bulk request. For ensuring no document exists, the op_type create doesn't need to be copied, since Versions.MATCH_DELETED will copied from the 'mainRequest.getDestination().version()'. The `version()` method on IndexRequest only returns Versions.MATCH_DELETED if op_type=create and no specific version has been specified. However in order to be able to index into a data stream, the op_type must be create. So in order to support that the op_type must be copied from the reindex request's destination index request to the actual index request being used in the bulk request. Relates to #53100 and #57788	2020-06-12 09:54:37 +02:00
Mark Tozzi	36f551bdb4	Make ValuesSourceConfig behave like a config object (#57762 ) (#58012 )	2020-06-11 17:23:55 -04:00
Alan Woodward	16e230dcb8	Update to lucene snapshot e7c625430ed (#57981 ) Includes LUCENE-9148 and LUCENE-9398, which splits the BKD metadata, index and data into separate files and keeps the index off-heap.	2020-06-11 14:51:53 +01:00
Nik Everett	0a2bd10758	Save memory when parent and child are not on top (#57892 ) (#57944 ) Reworks the `parent` and `child` aggregation are not at the top level using the optimization from #55873. Instead of wrapping all non-top-level `parent` and `child` aggregators we now handle being a child aggregator in the aggregator, specifically by adding recording which global ordinals show up in the parent and then checking if they match the child.	2020-06-10 16:25:10 -04:00
Yannick Welsch	80f221e920	Use clean thread context for transport and applier service (#57792 ) (#57914 ) Adds assertions to Netty to make sure that its threads are not polluted by thread contexts (and also that thread contexts are not leaked). Moves the ClusterApplierService to use the system context (same as we do for MasterService), which allows to remove a hack from TemplateUgradeService and makes it clearer that applying CS updates is fully executing under system context.	2020-06-10 10:30:28 +02:00
Jake Landis	a370d5eead	[7.x] Ensure Joni warning are logged at debug (#57302 ) (#57897 ) When Joni, the regex engine that powers grok emits a warning it does so by default to System.err. System.err logs are all bucketed together in the server log at WARN level. When Joni emits a warning, it can be extremely verbose, logging a message for each execution again that pattern. For ingest node that means for every document that is run that through Grok. Fortunately, Joni provides a call back hook to push these warnings to a custom location. This commit implements Joni's callback hook to push the Joni warning to the Elasticsearch server logger (logger.org.elasticsearch.ingest.common.GrokProcessor) at debug level. Generally these warning indicate a possible issue with the regular expression and upon creation of the Grok processor will do a "test run" of the expression and log the result (if any) at WARN level. This WARN level log should only occur on pipeline creation which is a much lower frequency then every document. Additionally, the documentation is updated with instructions for how to set the logger to debug level.	2020-06-09 17:06:29 -05:00
Yannick Welsch	9eec819c5b	Revert "Use clean thread context for transport and applier service (#57792 )" This reverts commit `259be236cf`.	2020-06-09 22:24:54 +02:00
Jake Landis	fff0a106c9	[7.x] Support `if_seq_no` and `if_primary_term` for ingest (#55430 ) (#57768 ) Allow for optimistic concurrency control during ingest by checking the sequence number and primary term. This is accomplished by defining _if_seq_no and _if_primary_term in the pipeline, similarly to _version and _version_type. Closes #41255 Co-authored-by: Maria Ralli <mariai.ralli@gmail.com>	2020-06-09 14:20:26 -05:00
Yannick Welsch	259be236cf	Use clean thread context for transport and applier service (#57792 ) Adds assertions to Netty to make sure that its threads are not polluted by thread contexts (and also that thread contexts are not leaked). Moves the ClusterApplierService to use the system context (same as we do for MasterService), which allows to remove a hack from TemplateUgradeService and makes it clearer that applying CS updates is fully executing under system context.	2020-06-09 12:32:28 +02:00
Mayya Sharipova	70e63a365a	Refactor how to determine if a field is metafield (#57378 ) (#57771 ) Before to determine if a field is meta-field, a static method of MapperService isMetadataField was used. This method was using an outdated static list of meta-fields. This PR instead changes this method to the instance method that is also aware of meta-fields in all registered plugins. Related #38373, #41656 Closes #24422	2020-06-08 09:16:18 -04:00
Tanguy Leroux	0e57528d5d	Remove more //NORELEASE (#57517 ) We agreed on removing the following //NORELEASE tags.	2020-06-05 15:34:06 +02:00
Armin Braun	24779c80f9	Serialize Outbound Message on Flush (#57084 ) (#57682 ) Follow up to #56961: We can be a little more efficient than just serializing at the IO loop by serializing only when we flush to a channel. This has the advantage that we don't serialize a long queue of messages for a channel that isn't writable for a longer period of time (unstable network, actually writing large volumes of data, etc.). Also, this further reduces the time for which we hold on to the write buffer for a message, making allocations because of an empty page cache recycler pool less likely.	2020-06-04 18:06:13 +02:00
Nik Everett	928794cd61	Make parent and child aggregator more obvious (#57490 ) (#57553 ) Pulls the way that the `ParentJoinAggregator` collects global ordinals into a strategy object so it is a little simpler to reason about and it'll be simpler to save memory by removing `asMultiBucketAggregator` in the future. Relates to #56487	2020-06-02 16:22:38 -04:00
Mark Tozzi	e50f514092	IndexFieldData should hold the ValuesSourceType (#57373 ) (#57532 )	2020-06-02 12:16:53 -04:00
Armin Braun	ba2d70d8eb	Serialize Outbound Messages on IO Threads (#56961 ) (#57080 ) Almost every outbound message is serialized to buffers of 16k pagesize. We were serializing these messages off the IO loop (and retaining the concrete message instance as well) and would then enqueue it on the IO loop to be dealt with as soon as the channel is ready. 1. This would cause buffers to be held onto for longer than necessary, causing less reuse on average. 2. If a channel was slow for some reason, not only would concrete message instances queue up for it, but also 16k of buffers would be reserved for each message until it would be written+flushed physically. With this change, the serialization happens on the event loop which effectively limits the number of buffers that `N` IO-threads will ever use so long as messages are small and channels writable. Also, this change dereferences the reference to the concrete outbound message as soon as it has been serialized to save some more on GC. This reduces the GC time for a default PMC run by about 50% in experiments (3 nodes, 2G heap each, loopback ... obvious caveat is that GC isn't that heavy in the first place with recent changes but still a measurable gain). I also expect it to be helpful for master node stability by causing less of a spike if master is e.g. hit by a large number of requests that are processed batched (e.g. shard snapshot status updates) and responded to in a short time frame all at once. Obviously, the downside to this change is that it introduces more latency on the IO loop for the serialization. But since we read all of these messages on the IO loop as well I don't see it as much of a qualitative change really and the more predictable buffer use seems much more valuable relatively.	2020-06-02 16:15:18 +02:00
Nik Everett	f52e779806	Fix casting of scaled_float in sorts (#57207 ) (#57385 ) Previously we'd get a `ClassCastException` when you tried to use `numeric_type` on `scaled_float`. Oops! This cleans up the CCE and moves some code around so the casting actually works.	2020-05-29 18:06:04 -04:00
Tomasz Elendt	a7c36c8af5	Support multiple tokens on LHS in stemmer_override rules (#56113 ) (#56484 ) This commit adds support for rules with multiple tokens on LHS, also known as "contraction rules", into stemmer override token filter. Contraction rules are handy into translating multiple inflected words into the same root form. One side effect of this change is that it brings stemmer override rules format closer to synonym rules format so that it makes it easier to translate one into another. This change also makes stemmer override rules parser more strict so that it should catch more errors which were previously accepted. Closes #56113	2020-05-29 22:34:31 +02:00
Henning Andersen	8427d677e9	Reindex and friends fail nicely when max_docs < slices (#54901 ) (#57348 ) When the parameter `max_docs` is less than `slices` in update_by_query, delete_by_query or reindex API, `max_docs ` is set to 0 and we throw an action_request_validation_exception with confused error message: "maxDocs should be greater than 0...". This change checks that whether `max_docs` is less than `slices` and throw an illegal_argument_exception with clear message. Relates to #52786. Co-authored-by: bellengao <gbl_long@163.com>	2020-05-29 14:30:14 +02:00
Lee Hinman	c0f732b9f6	[7.x] Rename template V2 classes to ComposableTemplate (#57183 ) (#57232 ) Backports the following commits to 7.x: Rename template V2 classes to ComposableTemplate (#57183)	2020-05-27 11:01:59 -06:00
Alan Woodward	d6b79bcd95	Remove Mapper.updateFieldType() (#57151 ) When we had multiple mapping types, an update to a field in one type had to be propagated to the same field in all other types. This was done using the Mapper.updateFieldType() method, called at the end of a merge. However, now that we only have a single type per index, this method is unnecessary and can be removed. Relates to #41059 Backport of #56986	2020-05-27 09:21:24 +01:00
Armin Braun	56401d3f66	Release HTTP Request Body Earlier (#57094 ) (#57110 ) We don't need to hold on to the request body past the beginning of sending the response. There is no need to keep a reference to it until after the response has been sent fully and we can eagerly release it here. Note, this can be optimized further to release the contents even earlier but for now this is an easy increment to saving some memory on the IO pool.	2020-05-25 13:00:19 +02:00
Jack Conradson	35c546b388	Backports for _source bug fix in scripting (#57068 ) * Update DeprecationMap to DynamicMap (#56149) This renames DeprecationMap to DynamicMap, and changes the deprecation messages Map to accept a Map of String (keys) to Functions (updated values) instead. This creates more flexibility in either logging or updating values from params within a script. This change is required to fix (#52103) in a future PR. * Fix Source Return Bug in Scripting (#56831) This change ensures that when a user returns _source directly no matter where accessed within scripting, the value is a Map of the converted source as opposed to a SourceLookup.	2020-05-21 17:07:38 -07:00
markharwood	eb8cb31d46	Update Lucene version to 8.6.0-snapshot-9d6c738ffce (#57024 ) Same version as master	2020-05-21 11:28:16 +01:00
Andrei Balici	19a336e8d3	Add `max_token_length` setting to the CharGroupTokenizer (#56860 ) Adds `max_token_length` option to the CharGroupTokenizer. Updates documentation as well to reflect the changes. Closes #56676	2020-05-20 14:28:40 +02:00
Alan Woodward	18bfbeda29	Move merge compatibility logic from MappedFieldType to FieldMapper (#56915 ) Merging logic is currently split between FieldMapper, with its merge() method, and MappedFieldType, which checks for merging compatibility. The compatibility checks are called from a third class, MappingMergeValidator. This makes it difficult to reason about what is or is not compatible in updates, and even what is in fact updateable - we have a number of tests that check compatibility on changes in mapping configuration that are not in fact possible. This commit refactors the compatibility logic so that it all sits on FieldMapper, and makes it called at merge time. It adds a new FieldMapperTestCase base class that FieldMapper tests can extend, and moves the compatibility testing machinery from FieldTypeTestCase to here. Relates to #56814	2020-05-20 09:43:13 +01:00
Tim Brooks	57c3a61535	Create HttpRequest earlier in pipeline (#56393 ) Elasticsearch requires that a HttpRequest abstraction be implemented by http modules before server processing. This abstraction controls when underlying resources are released. This commit moves this abstraction to be created immediately after content aggregation. This change will enable follow-up work including moving Cors logic into the server package and tracking bytes as they are aggregated from the network level.	2020-05-18 14:54:01 -06:00
Armin Braun	cac85a6f18	Shorter Path in Netty ByteBuf Unwrap (#56740 ) (#56857 ) In most cases we are seeing a `PooledHeapByteBuf` here now. No need to redundantly create an new `ByteBuffer` and single element array for it here when we can just directly unwrap its internal `byte[]`.	2020-05-16 11:54:36 +02:00

1 2 3 4 5 ...

5661 Commits