OpenSearch

Commit Graph

Author	SHA1	Message	Date
Alan Woodward	f89fa421e2	Remove unnecessary IndexSearcher field on HitContext (#62378 ) FastVectorHighlighter uses the top-level reader to rewrite queries against, which it gets via an IndexSearcher field on HitContext. However, we can already access this top-level reader via HitContext's existing LeafReaderContext field. This commit removes the unnecessary field and constructor parameter, and changes the implementation of topLevelReader to go via ReaderUtils and the leaf reader context.	2020-09-15 15:46:14 +01:00
Julie Tibshirani	4a19bdb2ea	Support the 'fields' option in inner_hits and top_hits. (#62337 ) This PR adds support for the 'fields' option in the following places: * Anytime `inner_hits` is used, for both fetching nested/ child docs and field collapsing * The `top_hits` aggregation Addresses #61949.	2020-09-14 11:51:45 -07:00
Nhat Nguyen	035f0638f4	Support point in time in async_search (#61560 ) This commit integrates point in time into async search and ensures that it works correctly with security enabled. Relates #61062	2020-09-10 19:25:48 -04:00
Jim Ferenczi	3fc35aa76e	Shard Search Scroll failures consistency (#62061) Today some uncaught shard failures such as RejectedExecutionException skip the release of the shard context and let subsequent scroll requests access the same shard context again. Depending on how the other shards advanced, this behavior can lead to missing data since scrolls always move forward. In order to avoid hidden data loss, this commit ensures that we always release the context of shard search scroll requests whenever a failure occurs locally. The shard search context will no longer exist in subsequent scroll requests which will lead to consistent shard failures in the responses. This change also modifies the retry tests of the reindex feature. Reindex retries scroll search requests that contain a shard failure and moves on whenever the failure disappears. That is not compatible with how scrolls work and can lead to missing data as explained above. That means that reindex will now report scroll failures when search rejections happen during the operation instead of skipping documents silently. Finally this change removes an old TODO that was fulfilled with #61062.	2020-09-10 19:25:48 -04:00
Nhat Nguyen	3d69b5c41e	Introduce point in time APIs in x-pack basic (#61062) This commit introduces a new API that manages point-in-times in x-pack basic. Elasticsearch pit (point in time) is a lightweight view into the state of the data as it existed when initiated. A search request by default executes against the most recent point in time. In some cases, it is preferred to perform multiple search requests using the same point in time. For example, if refreshes happen between search_after requests, then the results of those requests might not be consistent as changes happening between searches are only visible to the more recent point in time. A point in time must be opened before being used in search requests. The `keep_alive` parameter tells Elasticsearch how long it should keep a point in time around. ``` POST /my_index/_pit?keep_alive=1m ``` The response from the above request includes an `id`, which should be passed to the `id` of the `pit` parameter of search requests. ``` POST /_search { "query": { "match" : { "title" : "elasticsearch" } }, "pit": { "id": "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWICBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==", "keep_alive": "1m" } } ``` Point-in-times are automatically closed when the `keep_alive` has elapsed. However, keeping point-in-times has a cost; hence, point-in-times should be closed as soon as they are no longer used in search requests. ``` DELETE /_pit { "id" : "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWIBBXV1aWQyAAA=" } ``` #### Notable works in this change: - Move the search state to the coordinating node: #52741 - Allow searches with a specific reader context: #53989 - Add the ability to acquire readers in IndexShard: #54966 Relates #46523 Relates #26472 Co-authored-by: Jim Ferenczi <jimczi@apache.org>	2020-09-10 19:25:47 -04:00
Ignacio Vera	c8981ea93d	upgrade to lucene-8.7.0-snapshot-b313618cc1d (#62213 ) (#62222 )	2020-09-10 16:23:18 +02:00
Jake Landis	d8dad9ab2c	[7.x] Remove integTest task from PluginBuildPlugin (#61879 ) (#62135 ) This commit removes `integTest` task from all es-plugins. Most relevant projects have been converted to use yamlRestTest, javaRestTest, or internalClusterTest in prior PRs. A few projects needed to be adjusted to allow complete removal of this task * x-pack/plugin - converted to use yamlRestTest and javaRestTest * plugins/repository-hdfs - kept the integTest task, but use `rest-test` plugin to define the task * qa/die-with-dignity - convert to javaRestTest * x-pack/qa/security-example-spi-extension - convert to javaRestTest * multiple projects - remove the integTest.enabled = false (yay!) related: #61802 related: #60630 related: #59444 related: #59089 related: #56841 related: #59939 related: #55896	2020-09-09 14:25:41 -05:00
Dan Hermann	eeeb355adf	Configurable output format for date processor (#61324 ) (#62175 )	2020-09-09 11:11:02 -05:00
Dan Hermann	0b1e2172e1	[7.x] Preserve grok pattern ordering and add sort option (#61671 ) (#62162 )	2020-09-09 08:53:11 -05:00
Alan Woodward	28fd4a2ae8	Convert RangeFieldMapper to parametrized form (#62058 ) This also adds the ability to define a serialization check on Parameters, used in this case to only serialize format and locale parameters if the mapper is a date range.	2020-09-08 18:44:13 +01:00
Ignacio Vera	31c026f25c	upgrade to Lucene-8.7.0-snapshot-61ea26a (#61957 ) (#61974 )	2020-09-04 13:46:20 +02:00
Ryan Ernst	d6e17170c3	Simplify adding plugins and modules to testclusters (#61886) There are currently half a dozen ways to add plugins and modules for test clusters to use. All of them require the calling project to peek into the plugin or module they want to use to grab its bundlePlugin task, and then both depend on that task, as well as extract the archive path the task will produce. This creates cross project dependencies that are difficult to detect, and if the dependent plugin/module has not yet been configured, the build will fail because the task does not yet exist. This commit makes the plugin and module methods for testclusters symmetric: each now simply takes a file provider directly, or a project path that will produce the plugin/module zip. Internally this new variant uses normal configuration/dependencies across projects to get the zip artifact. It also has the added benefit of no longer needing the caller to add to the test task a dependsOn for bundlePlugin task.	2020-09-03 19:37:46 -07:00
Alan Woodward	e2f006eeb4	Merge FetchSubPhase hitsExecute and hitExecute methods (#60907 ) (#61893 ) FetchSubPhase has two 'execute' methods, one which takes all hits to be examined, and one which takes a single HitContext. It's not obvious which one should be implemented by a given sub-phase, or if implementing both is a possibility; nor is it obvious that we first run the hitExecute methods of all subphases, and then subsequently call all the hitsExecute methods. This commit reworks FetchSubPhase to replace these two variants with a processor class, `FetchSubPhaseProcessor`, that is returned from a single `getProcessor` method. This processor class has two methods, `setNextReader()` and `process`. FetchPhase collects processors from all its subphases (if a subphase does not need to execute on the current search context, it can return `null` from `getProcessor`). It then sorts its hits by docid, and groups them by lucene leaf reader. For each reader group, it calls `setNextReader()` on all non-null processors, and then passes each doc id to `process()`. Implementations of fetch sub phases can divide their concerns into per-request, per-reader and per-document sections, and no longer need to worry about sorting docs or dealing with reader slices. FetchSubPhase now provides a FetchSubPhaseExecutor that exposes two methods, setNextReader(LeafReaderContext) and execute(HitContext). The parent FetchPhase collects all these executors together (if a phase should not be executed, then it returns null here); then it sorts hits, and groups them by reader; for each reader it calls setNextReader, and then execute for each hit in turn. Individual sub phases no longer need to concern themselves with sorting docs or keeping track of readers; global structures can be built in getExecutor(SearchContext), per-reader structures in setNextReader and per-doc in execute.	2020-09-03 12:20:55 +01:00
Alan Woodward	af01ccee93	Add specific test for serializing all mapping parameter values (#61844 ) (#61877 ) This commit adds a test to MapperTestCase that explicitly checks that a mapper can serialize all its default values, and that this serialization can then be re-parsed. Note that the test is disabled for non-parametrized mappers as their serialization may in some cases output parameters that are not accepted. Gradually moving all mappers to parametrized form will address this. The commit also contains a fix to keyword mappers, which were not correctly serializing the similarity parameter; this partially addresses #61563. It also enables `null` as a value for `null_value` on `scaled_float`, as a follow-up to #61798	2020-09-03 09:20:26 +01:00
Nik Everett	c19f67ce30	Support longs in BitArray (backport of #61867 ) (#61871 ) We frequently use `long`s with `BitArray` in aggs and right now we have to assert that the `long` fits in an `int`. This adds support for `long` to `BitArray` so we don't need those assertions.	2020-09-02 17:24:31 -04:00
Jason Tedor	64cd229b35	Upgrade to Lucene 8.6.2 (#61688 ) This commit upgrades the Lucene dependencies to 8.6.2.	2020-08-31 09:54:07 -04:00
Jake Landis	d2e5f2f532	[7.x] Enhance the ingest node simulate verbose output (#60433) (#60678) This commit enhances the verbose output for the `_ingest/pipeline/_simulate?verbose` api. Specifically this adds the following: * the pipeline processor is now included in the output * the conditional (if) and result is now included in the output iff it was defined * a status field is always displayed. the possible values of status are * `success` - if the processor ran without errors * `error` - if the processor ran but threw an error that was not ignored * `error_ignored` - if the processor ran but threw an error that was ignored * `skipped` - if the processor did not run (currently only possible if the if condition evaluates to false) * `dropped` - if the `drop` processor ran and dropped the document * a `processor_type` field for the type of processor (e.g. set, rename, etc.) * throw a better error if trying to simulate with a pipeline that does not exist closes #56004	2020-08-27 16:53:09 -05:00
Luca Cavanna	f769821bc8	Pass SearchLookup supplier through to fielddataBuilder (#61430 ) (#61638 ) Runtime fields need to have a SearchLookup available, when building their fielddata implementations, so that they can look up other fields, runtime or not. To achieve that, we add a Supplier<SearchLookup> argument to the existing MappedFieldType#fielddataBuilder method. As we introduce the ability to look up other fields while building fielddata for mapped fields, we implicitly add the ability for a field to require other fields. This requires some protection mechanism that detects dependency cycles to prevent stack overflow errors. With this commit we also introduce detection for cycles, as well as a limit on the depth of the references for a runtime field. Note that we also plan on introducing cycles detection at compile time, so the runtime cycles detection is a last resort to prevent stack overflow errors but we hope that we can reject runtime fields from being registered in the mappings when they create a cycle in their definition. Note that this commit does not introduce any production implementation of runtime fields, but is rather a pre-requisite to merge the runtime fields feature branch. This is a breaking change for MapperPlugins that plug in a mapper, as the signature of MappedFieldType#fielddataBuilder changes from taking a single argument (the index name), to also accept a Supplier<SearchLookup>. Relates to #59332 Co-authored-by: Nik Everett <nik9000@gmail.com>	2020-08-27 18:09:56 +02:00
Jay Modi	34c4fc3b91	Remove tasks module to define tasks system index (#61588 ) This commit removes the tasks module that only existed to define the tasks result index, `.tasks`, as a system index. The definition for the tasks results system index descriptor is moved to the `SystemIndices` class with a check that no other plugin or module attempts to define an entry with the same source. Additionally, this change also makes the pattern for the tasks result index a wildcard pattern since we will need this when the index is upgraded (reindex to new name and then alias that to .tasks). Backport of #61540	2020-08-26 09:48:23 -06:00
Przemyslaw Gomulka	9f566644af	Do not create two loggers for DeprecationLogger backport(#58435) (#61530) DeprecationLogger's constructor should not create two loggers. It was taking the parent logger instance, changing its name with a .deprecation prefix and creating a new logger. Most of the time the parent logger was not needed. It was causing Log4j to unnecessarily cache the unused parent logger instance. depends on #61515 backports #58435	2020-08-26 16:04:02 +02:00
Nik Everett	87cf81e179	Migrate some more mapper test cases (#61507 ) (#61552 ) Migrate some more mapper test cases from `ESSingleNodeTestCase` to `MapperTestCase`.	2020-08-25 15:27:26 -04:00
Przemyslaw Gomulka	f3f7d25316	Header warning logging refactoring backport(#55941) (#61515) Splitting DeprecationLogger into two. HeaderWarningLogger - responsible for adding response warning headers and ThrottlingLogger - responsible for limiting the duplicated log entries for the same key (previously deprecateAndMaybeLog). Introducing a ThrottlingAndHeaderWarningLogger which is a base for other common logging usages where both response warning headers and logging throttling were needed. relates #55699 relates #52369 backports #55941	2020-08-25 16:35:54 +02:00
Julie Tibshirani	997c73ec17	Correct how field retrieval handles multifields and copy_to. (#61391 ) Before when a value was copied to a field through a parent field or `copy_to`, we parsed it using the `FieldMapper` from the source field. Instead we should parse it using the target `FieldMapper`. This ensures that we apply the appropriate mapping type and options to the copied value. To implement the fix cleanly, this PR refactors the value parsing strategy. Now instead of looking up values directly, field mappers produce a helper object `ValueFetcher`. The value fetchers are responsible for almost all aspects of fetching, including looking up the right paths in the _source. The PR is fairly big but each commit can be reviewed individually. Fixes #61033.	2020-08-20 15:53:35 -07:00
Alan Woodward	a3a0c63ccf	Convert NumberFieldMapper to parametrized form (#61092 ) (#61376 ) In addition, this commit converts ScaledFloatFieldMapper as it was relying on a number of static values taken from NumberFieldMapper that had changed or been removed.	2020-08-20 16:43:26 +01:00
Nik Everett	9789e6d154	Migrate some field mapper tests to ESTestCase (#61301 ) (#61346 ) This switches a few tests for field mappers from `ESSingleNodeTestCase` to `ESTestCase` because, in general, we prefer to avoid `ESSingleNodeTestCase` when we can because it is slow and "big". "Big" here means that it pulls in an entire node, making it difficult to reason about what you are testing.	2020-08-19 15:43:49 -04:00
Mark Tozzi	db1df6cc30	[7.x] Remove a bunch of type boilerplate from Aggs (#60852 ) (#61031 )	2020-08-17 12:13:05 -04:00
Alan Woodward	c81dc2b8b7	Convert KeywordFieldMapper to parametrized form (#60645 ) This makes KeywordFieldMapper extend ParametrizedFieldMapper, with explicitly defined parameters. In addition, we add a new option to Parameter, restrictedStringParam, which accepts a restricted set of string options.	2020-08-12 11:41:11 +01:00
Nik Everett	664ba0a80a	Fix the parent join aggregator test case (#60991 ) The test was putting parent and child documents into different segments which is unrealistic and was causing errors. Closes #60980	2020-08-11 17:53:15 -04:00
Nhat Nguyen	4bdf283619	Mute ChildrenToParentAggregatorTests Tracked at #60980	2020-08-11 12:56:29 -04:00
Alan Woodward	54279212cf	Make MetadataFieldMapper extend ParametrizedFieldMapper (#59847 ) (#60924 ) This commit cuts over all metadata field mappers to parametrized format.	2020-08-11 09:02:28 +01:00
Jim Ferenczi	f30f1f04e2	Replace AggregatorTestCase#search with AggregatorTestCase#searchAndReduce (#60816 ) This commit removes the ability to test the top level result of an aggregator before it runs the final reduce. All aggregator tests that use AggregatorTestCase#search are rewritten with AggregatorTestCase#searchAndReduce in order to ensure that we test the final output (the one sent to the end user) rather than an intermediary result that could be different. This change also removes spurious commits triggered on top of a random index writer. These commits slow down the tests and are redundant with the commits that the random index writer performs.	2020-08-10 17:23:00 +02:00
Jack Conradson	d4d58e70f5	Unmute the test failure: painless/71_context_api (#60758) I was unable to reproduce this locally on either 7.6 (first introduced) or 7.x. This is already not muted on master and doesn't appear to have failures. There were some API changes at the time that could have affected this test, and I'm wondering with backports if this is now stable again. If this has more failures, I will continue to investigate further. Relates to #51939	2020-08-05 10:27:19 -07:00
Jake Landis	f3752ba1d5	7.x support new path for re-index java-api doc (#60319) This commit uses the new location for the reindex java-api documentation. Temporary files have been left behind to pacify the docs build. related #60339	2020-08-05 09:05:07 -05:00
Alan Woodward	b3ae5d26bd	Move mapper validation to the mappers themselves (#60072 ) (#60649 ) Currently, validation of mappers (checking that cross-references are correct, limits on field name lengths and object depths, multiple definitions, etc) is performed by the MapperService. This means that any mapper-specific validation, for example that done on the CompletionFieldMapper, needs to be called specifically from core server code, and so we can't add validation to mappers that live in plugins. This commit reworks the validation framework so that mapper-specific validation is done on the Mapper itself. Mapper gets a new `validate(MappingLookup)` method (already present on `MetadataFieldMapper` and now pulled up to the parent interface), which is called from a new `DocumentMapper.validate()` method. All the validation code currently living on `MapperService` moves either to individual mapper implementations (FieldAliasMapper, CompletionFieldMapper) or into `MappingLookup`, an altered `DocumentFieldMappers` which now knows about object fields and can check for duplicate definitions, or into DocumentMapper which handles soft limit checks.	2020-08-04 14:39:20 +01:00
Rene Groeschke	bdd7347bbf	Merge test runner task into RestIntegTest (7.x backport) (#60600 ) * Merge test runner task into RestIntegTest (#60261) * Merge test runner task into RestIntegTest * Reorganizing Standalone runner and RestIntegTest task * Rework general test task configuration and extension * Fix merge issues * use former 7.x common test configuration	2020-08-04 14:46:32 +02:00
Julie Tibshirani	f99584c6f3	Avoid reloading _source for every inner hit. (#60632) Previously if an inner_hits block required _source, we would reload and parse the root document's source for every hit. This PR adds a shared SourceLookup to the inner hits context that allows inner hits to reuse parsed source if it's already available. This matches our approach for sharing the root document ID. Relates to #32818.	2020-08-03 17:12:27 -07:00
Rene Groeschke	ed4b70190b	Replace immediate task creations by using task avoidance api (#60071 ) (#60504 ) - Replace immediate task creations by using task avoidance api - One step closer to #56610 - Still many tasks are created during configuration phase. Tackled in separate steps	2020-07-31 13:09:04 +02:00
Julie Tibshirani	8ac81a3447	Remove IndexFieldData#clear since it is unused. (#60475 ) This method was never called. It also seemed tricky that calling a method on `IndexFieldData` could clear the contents of a shared cache.	2020-07-30 14:07:55 -07:00
Julie Tibshirani	dfd7f226f0	Clarify SourceLookup sharing across fetch subphases. (#60484 ) The `SourceLookup` class provides access to the _source for a particular document, specified through `SourceLookup#setSegmentAndDocument`. Previously the search context contained a single `SourceLookup` that was shared between different fetch subphases. It was hard to reason about its state: is `SourceLookup` set to the expected document? Is the _source already loaded and available? Instead of using a global source lookup, the fetch hit context now provides access to a lookup that is set to load from the hit document. This refactor closes #31000, since the same `SourceLookup` is no longer shared between the 'fetch _source phase' and script execution.	2020-07-30 13:22:31 -07:00
Julie Tibshirani	5359417ec3	Minor clean-up around search highlight context. (#60422 ) * Rename SearchContextHighlight -> SearchHighlightContext. * Rename HighlighterContext to FieldHighlightContext. * Make the search highlight context immutable. * Avoid storing SearchHighlightContext on HighlighterContext.	2020-07-29 11:39:17 -07:00
David Turner	bbacad648a	Fix network logging test failures (#60334 ) In #60297 we added some tests related to logging from the transport layer, but these tests failed occasionally since the cluster was kept alive between test invocations but the logging framework expected it only to be used for a single test. With this commit we reduce the scope of the internal test cluster to `TEST` to solve this problem. Closes #60321.	2020-07-29 08:29:09 +01:00
Julie Tibshirani	c7bfb5de41	Add search `fields` parameter to support high-level field retrieval. (#60258 ) This feature adds a new `fields` parameter to the search request, which consults both the document `_source` and the mappings to fetch fields in a consistent way. The PR merges the `field-retrieval` feature branch. Addresses #49028 and #55363.	2020-07-28 10:58:20 -07:00
David Turner	9c62b5cb96	Mute tests for #60321	2020-07-28 18:12:54 +01:00
David Turner	9450ea08b4	Log and track open/close of transport connections (#60297 ) Transport connections between nodes remain in place until one or other node shuts down or the connection is disrupted by a flaky network. Today it is very difficult to demonstrate that transient failures and cluster instability are caused by the network even though this is often the case. In particular, transport connections open and close without logging anything, even at `DEBUG` level, making it very hard to quantify the scale of the problem or to correlate the networking problems with external events. This commit adds the missing `DEBUG`-level logging when transport connections open and close, and also tracks the total number of transport connections a node has opened as a measure of the stability of the underlying network.	2020-07-28 17:08:04 +01:00
Jake Landis	92ce41cfaf	[7.x] Introduce javaRestTest source set/task and convert modules (#59939) (#60026) Introduce a javaRestTest source set and task to complement the yamlRestTest. javaRestTest differs in that the code is sourced from Java and may have different dependencies and setup requirements for the test clusters. This also allows the tests to run in parallel in different cluster instances to prevent any cross test contamination between the two types of tests. Also in this PR, all :modules no longer use the integTest task. The tests are now driven by test, yamlRestTest, javaRestTest, and internalClusterTest. Since only :modules (and :rest-api-spec) have been converted to yamlRestTest we can now disable the integTest task if either yamlRestTest or javaRestTest have been applied. Once all projects are converted, we can delete the integTest task. related: #56841 related: #59444	2020-07-28 08:39:11 -05:00
Yannick Welsch	ffe114b890	Set specific keepalive options by default on supported platforms (#59278) keepalives tell any intermediate devices that the connection remains alive, which helps with overzealous firewalls that are killing idle connections. keepalives are enabled by default in Elasticsearch, but use system defaults for their configuration, which often do not have reasonable defaults (e.g. 7200s for TCP_KEEP_IDLE) in the context of distributed systems such as Elasticsearch. This PR sets the socket-level keep_alive options for network.tcp.{keep_idle,keep_interval} to 5 minutes on configurations that support it (>= Java 11 & (MacOS || Linux)) and where the system defaults are set to something higher than 5 minutes. This helps keep the connections alive while not interfering with system defaults or user-specified settings unless they are deemed to be set too high by providing better out-of-the-box defaults.	2020-07-28 11:10:04 +02:00
Jake Landis	55216dabb4	[7.x] Per processor description for verbose simulate (#58207 ) (#60008 ) For ingest node processors a per processor description was recently added. This commit displays that description in the verbose output of the pipeline simulation. related #57906	2020-07-21 17:32:45 -05:00
malpani	0555fef799	Support ignore_keywords flag for word delimiter graph token filter (#59563 ) This commit allows customizing the word delimiter token filters to skip processing tokens tagged as keyword through the `ignore_keywords` flag Lucene's WordDelimiterGraphFilter already exposes. Fix for #59491	2020-07-21 16:11:55 +01:00
Nik Everett	6f6076e208	Drop some params from IndexFieldData.Builder (backport of #59934 ) (#59972 ) We never used the `IndexSettings` parameter and we only used the `MappedFieldType` parameter to get the name of the field which we already know everywhere where we build the `IFD.Builder`. This allows us to drop a fair bit of ceremony from a couple of tests.	2020-07-21 10:28:59 -04:00
Armin Braun	5b92596fad	Cleanup and Optimize Multiple Serialization Spots (#59626 ) (#59936 ) Follow up to #59606 using some of the new infrastructure and making similar cleanups (and due to at times better handling of size hints and empty collections also optimizations in the stream utility methods this also means speedups) in various spots in the core codebase.	2020-07-21 10:06:56 +02:00
Nik Everett	95e6e4a452	Small cleanup for IndexFieldData (#59724 ) (#59800 ) This drops `IndexComponent` from `IndexFieldData` because it wasn't doing anything other than forcing us to perform a bunch of ceremony to build them.	2020-07-17 13:38:15 -04:00
Dan Hermann	48df9b1a0e	Update regex file for es user agent node processor (#59697 ) (#59794 )	2020-07-17 11:04:01 -05:00
Benjamin Trent	b7f30fc929	[7.x] Adding new `require_alias` option to indexing requests (#58917 ) (#59769 ) * Adding new `require_alias` option to indexing requests (#58917) This commit adds the `require_alias` flag to requests that create new documents. This flag, when `true` prevents the request from automatically creating an index. Instead, the destination of the request MUST be an alias. When the flag is not set, or `false`, the behavior defaults to the `action.auto_create_index` settings. This is useful when an alias is required instead of a concrete index. closes https://github.com/elastic/elasticsearch/issues/55267	2020-07-17 10:24:58 -04:00
Stuart Tettemer	8fdaed0642	Scripting: Augment String with sha1 and sha256 (#59671) (#59723) Only available in the ingest context for use in ingest pipelines. Digests are computed on the UTF-8 encoding of the String and are returned as hex strings. sha1() returns hex strings of length 40; sha256() returns hex strings of length 64. Fixes: #59647 Backport: 3c85272	2020-07-16 15:17:32 -05:00
Stuart Tettemer	c491212dc1	Scripting: fix generateContextDoc path and url #59676 (#59722 ) * Add doc runtime class path * Use getAllHttpSocketURI.get(0) instead of getAllHttpSocketURI to get a single test cluster URL rather than a list Backport: 3057e0f	2020-07-16 15:03:36 -05:00
Ignacio Vera	f8037abf47	upgrade to lucene-8.6.0 release (#59596 ) (#59599 )	2020-07-15 12:40:57 +02:00
Luca Cavanna	af2f85be15	Consolidate script parsing from object (7.x) (#59509) The update by query action parses a script from an object (map or string). We will need to do the same for runtime fields as they are parsed as part of mappings (#59391). This commit moves the existing parsing of a script from an object from RestUpdateByQueryAction to the Script class. It also adds tests and adjusts some error messages that are incorrect. Also, options were not parsed before and they are now. And unsupported fields now trigger a deprecation warning.	2020-07-14 17:08:29 +02:00
Jake Landis	665b7b7bd8	Convert modules to use yamlRestTest (#59089) (#59446) This commit moves the modules REST tests to the newly introduced yamlRestTest source set. A few tests have also been re-named to include the correct IT suffix. Without changing the names, the testing conventions task would fail now that the YAML tests are no longer present to pacify the convention. These tests have moved to the internalClusterTest source set. related: #56841	2020-07-13 13:53:05 -05:00
Martijn van Groningen	b1b7bf3912	Make data streams a basic licensed feature. (#59392 ) Backport of #59293 to 7.x branch. * Create new data-stream xpack module. * Move TimestampFieldMapper to the new module, this results in storing a composable index template with data stream definition only to work with default distribution. This way data streams can only be used with default distribution, since a data stream can currently only be created if a matching composable index template exists with a data stream definition. * Renamed `_timestamp` meta field mapper to `_data_stream_timestamp` meta field mapper. * Add logic to put composable index template api to fail if `_data_stream_timestamp` meta field mapper isn't registered. So that a more understandable error is returned when attempting to store a template with data stream definition via the oss distribution. In a follow up the data stream transport and rest actions can be moved to the xpack data-stream module.	2020-07-13 17:26:46 +02:00
Alan Woodward	f4caadd239	MappedFieldType no longer requires equals/hashCode/clone (#59212 ) With the removal of mapping types and the immutability of FieldTypeLookup in #58162, we no longer have any cause to compare MappedFieldType instances. This means that we can remove all equals and hashCode implementations, and in addition we no longer need the clone implementations which were required for equals/hashcode testing. This greatly simplifies implementing new MappedFieldTypes, which will be particularly useful for the runtime fields project.	2020-07-09 21:05:10 +01:00
Dan Hermann	34c50c045c	Data stream support for rank eval API	2020-07-09 13:11:29 -05:00
Przemko Robakowski	c870d6e570	[7.x] Restart tests with data streams (#58330 ) (#59303 ) * Restart tests with data streams (#58330)	2020-07-09 17:52:20 +02:00
Alan Woodward	67a27e2b9d	Add declarative parameters to FieldMappers (#58663 ) The FieldMapper infrastructure currently has a bunch of shared parameters, many of which are only applicable to a subset of the 41 mapper implementations we ship with. Merging, parsing and serialization of these parameters are spread around the class hierarchy, with much repetitive boilerplate code required. It would be much easier to reason about these things if we could declare the parameter set of each FieldMapper directly in the implementing class, and share the parsing, merging and serialization logic instead. This commit is a first effort at introducing a declarative parameter style. It adds a new FieldMapper subclass, ParametrizedFieldMapper, and refactors two mappers, Boolean and Binary, to use it. Parameters are declared on Builder classes, with the declaration including the parameter name, whether or not it is updateable, a default value, how to parse it from mappings, and how to extract it from another mapper at merge time. Builders have a getParameters method, which returns a list of the declared parameters; this is then used for parsing, merging and serialization. Merging is achieved by constructing a new Builder from the existing Mapper, and merging in values from the merging Mapper; conflicts are all caught at this point, and if none exist then a new, merged, Mapper can be built from the Builder. This allows all values on the Mapper to be final. Other mappers can be gradually migrated to this new style, and once they have all been refactored we can merge ParametrizedFieldMapper and FieldMapper entirely.	2020-07-09 11:43:21 +01:00
Martijn van Groningen	17bd559253	Fix the timestamp field of a data stream to @timestamp (#59210 ) Backport of #59076 to 7.x branch. The commit makes the following changes: * The timestamp field of a data stream definition in a composable index template can only be set to '@timestamp'. * Removed custom data stream timestamp field validation and reuse the validation from `TimestampFieldMapper` and instead only check that the _timestamp field mapping has been defined on a backing index of a data stream. * Moved code that injects _timestamp meta field mapping from `MetadataCreateIndexService#applyCreateIndexRequestWithV2Template58956(...)` method to `MetadataIndexTemplateService#collectMappings(...)` method. * Fixed a bug (#58956) that caused timestamp field validation to be performed for each template instead of for the final mappings that are created. * Only apply _timestamp meta field if index is created as part of a data stream or data stream rollover; this fixes a docs test, where a regular index creation matches (logs-*) with a template with a data stream definition. Relates to #58642 Relates to #53100 Closes #58956 Closes #58583	2020-07-08 17:30:46 +02:00
Nik Everett	a29d3515a2	Improve cardinality measure used to build aggs (#56533 ) (#59107 ) This makes a `parentCardinality` available to every `Aggregator`'s ctor so it can make intelligent choices about how it collects bucket values. This replaces `collectsFromSingleBucket` and is similar to it but: 1. It supports `NONE`, `ONE`, and `MANY` values and is generally extensible if we decide we can use more precise counts. 2. It is more accurate. `collectsFromSingleBucket` assumed that all sub-aggregations live under multi-bucket aggregations. This is normally true but `parentCardinality` is properly carried forward for single bucket aggregations like `filter` and for multi-bucket aggregations configured in single-bucket form, like `range` with a single range. While I was touching every aggregation I renamed `doCreateInternal` to `createMapped` because that seemed like a much better name and it was right there, next to the change I was already making. Relates to #56487 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-07-08 08:42:23 -04:00
Armin Braun	9268b25789	Add Check for Metadata Existence in BlobStoreRepository (#59141 ) (#59216 ) In order to ensure that we do not write a broken piece of `RepositoryData` because the physical repository generation was moved ahead more than one step by erroneous concurrent writing to a repository, we must check whether or not the current assumed repository generation exists in the repository physically. Without this check we run the risk of writing on top of stale cached repository data. Relates #56911	2020-07-08 14:25:01 +02:00
Rene Groeschke	a896df53ac	Remove misc dependency related deprecation warnings (7.x backport) (#59122 ) * Fix dependency related deprecations (#58892) * Fix classpath setup for forbiddenapi usage	2020-07-07 17:10:31 +02:00
Ignacio Vera	5cc6457ed8	upgrade to lucene-8.6.0-snapshot-6a715e2ecc3 (#59091 ) (#59120 )	2020-07-07 12:07:41 +02:00
Jake Landis	604c6dd528	7.x - Create plugin for yamlTest task (#56841 ) (#59090 ) This commit creates a new Gradle plugin to provide a separate task name and source set for running YAML based REST tests. The only project converted to use the new plugin in this PR is distribution/archives/integ-test-zip, for which the testing has been moved to :rest-api-spec since it makes the most sense and it avoids a small but awkward change to the distribution plugin. The remaining cases in modules, plugins, and x-pack will be handled in followups. This plugin is distinctly different from the plugin introduced in #55896 since the YAML REST tests are intended to be black box tests over HTTP. As such they should not (by default) have access to the classpath for that which they are testing. The YAML based REST tests will be moved to separate source sets (yamlRestTest). Which source set is the target for the test resources depends on whether this new plugin is applied. If it is not applied, it will default to the test source set. Further, this introduces a breaking change for plugin developers that use the YAML testing framework. They will now need to either use the new source set and matching task, or configure the rest resources to use the old "test" source set that matches the old integTest task. (The former should be preferred). As part of this change (which is also breaking for plugin developers) the rest resources plugin has been removed from the build plugin and now requires either explicit application or application via the new YAML REST test plugin. Plugin developers should be able to fix the breaking changes to the YAML tests by adding apply plugin: 'elasticsearch.yaml-rest-test' and moving the YAML tests under a yamlRestTest folder (instead of test)	2020-07-06 14:16:26 -05:00
Nik Everett	2965c7fe12	Fix bug in parent and child aggregators when parent field not defined (#57089 ) (#59074 ) Adding null check for ParentJoinFieldMapper in ChildrenAggregationBuilder.joinFieldResolveConfig Closes #42997 Co-authored-by: ParthPunkster <parthjain.pj1994@gmail.com>	2020-07-06 10:59:47 -04:00
Martijn van Groningen	f0dd9b4ace	Add data stream timestamp validation via metadata field mapper (#59002 ) Backport of #58582 to 7.x branch. This commit adds a new metadata field mapper that validates that a document has exactly a single timestamp value in the data stream timestamp field and that the timestamp field mapping only has `type`, `meta` or `format` attributes configured. Other attributes can affect the guarantee that an index with this meta field mapper has a useable timestamp field. The MetadataCreateIndexService inserts a data stream timestamp field mapper whenever a new backing index of a data stream is created. Relates to #53100	2020-07-06 11:32:33 +02:00
Dan Hermann	c1781bc7e7	[7.x] Add include_data_streams flag for authorization (#59008 )	2020-07-03 12:58:39 -05:00
Tim Brooks	605e24ed7c	Use `getPortRange` in http server tests (#58794 ) Currently we are leaving the settings to default port range in the nio and netty4 http server test. This has recently led to tests failing due to what appears to be a port conflict with other processes. This commit modifies these tests to use the test case helper method to generate port ranges. Fixes #58433 and #58296.	2020-07-02 13:21:45 -06:00
Dan Hermann	40655069e2	Data stream support for delete-by-query	2020-07-02 08:17:24 -05:00
Dan Hermann	fba1047ad9	Data stream support for update by query API	2020-07-02 08:16:05 -05:00
Alan Woodward	0cd1dc3143	Percolator keyword fields should not store norms (#58899 ) The refactoring in #57666 inadvertently enabled norms on two of the percolator subfields, leading to an increase in memory usage. This commit disables norms on these fields again.	2020-07-02 13:59:28 +01:00
Rene Groeschke	70713a0a19	Remove deprecated AbstractArchiveTask Gradle API usages (#58657 ) (#58894 ) * Fix deprecated ArchiveTask configurations	2020-07-02 13:08:34 +02:00
Alan Woodward	3ba16e0f39	Move MappedFieldType#getSearchAnalyzer and #getSearchQuoteAnalyzer to TextSearchInfo (#58830 ) Analyzers are specific to text searching, and so should be in TextSearchInfo rather than on the generic MappedFieldType. Backport of #58639	2020-07-01 14:52:14 +01:00
Przemyslaw Gomulka	2c275913b9	[7.x] Week based parsing for ingest date processor (#58597 ) (#58802 ) Date processor was incorrectly parsing week-based dates because, when a week-based year was provided, the ingest module assumed the year was missing from the date and tried to apply the logic for dd/MM type of dates. Date Processor also allows users to specify a locale parameter. It should be taken into account when parsing dates - currently it is only used for formatting. If someone specifies the 'en-us' locale, then calendar data rules for that locale should be used. The exception is the iso8601 format. If someone is using that format, then locale should not override calendar data rules. closes #58479	2020-07-01 15:15:56 +02:00
Yannick Welsch	15c85b29fd	Account for recovery throttling when restoring snapshot (#58658 ) (#58811 ) Restoring from a snapshot (which is a particular form of recovery) does not currently take recovery throttling into account (i.e. the `indices.recovery.max_bytes_per_sec` setting). While restores are subject to their own throttling (repository setting `max_restore_bytes_per_sec`), this repository setting does not allow for values to be configured differently on a per-node basis. As restores are very similar in nature to peer recoveries (streaming bytes to the node), it makes sense to configure throttling in a single place. The `max_restore_bytes_per_sec` setting is also changed to default to unlimited now, whereas previously it was set to `40mb`, which is the current default of `indices.recovery.max_bytes_per_sec`. This means that no behavioral change will be observed by clusters where the recovery and restore settings were not adapted. Relates https://github.com/elastic/elasticsearch/issues/57023 Co-authored-by: James Rodewig <james.rodewig@elastic.co>	2020-07-01 12:19:29 +02:00
Rene Groeschke	d952b101e6	Replace compile configuration usage with api (7.x backport) (#58721 ) * Replace compile configuration usage with api (#58451) - Use java-library instead of plugin to allow api configuration usage - Remove explicit references to runtime configurations in dependency declarations - Make test runtime classpath input for testing convention - required as java library will by default not have build jar file - jar file is now explicit input of the task and gradle will ensure its properly build * Fix compile usages in 7.x branch	2020-06-30 15:57:41 +02:00
Henning Andersen	38be2812b1	Enhance extensible plugin (#58542 ) Rather than let ExtensiblePlugins know extending plugins' classloaders, we now pass along an explicit ExtensionLoader that loads the extensions asked for. Extensions constructed that way can optionally receive their own Plugin instance in the constructor.	2020-06-25 20:37:56 +02:00
Jason Tedor	52ad5842a9	Introduce node.roles setting (#58512 ) Today we have individual settings for configuring node roles such as node.data and node.master. Additionally, roles are pluggable and we have used this to introduce roles such as node.ml and node.voting_only. As the number of roles is growing, managing these becomes harder for the user. For example, to create a master-only node, today a user has to configure: - node.data: false - node.ingest: false - node.remote_cluster_client: false - node.ml: false at a minimum if they are relying on defaults, but also add: - node.master: true - node.transform: false - node.voting_only: false if they want to be explicit. This is also challenging in cases where a user wants to configure a coordinating-only node, which requires disabling all roles, a list which we are adding to, requiring the user to keep checking whether a node has acquired any of these roles. This commit addresses this by adding a list setting node.roles with which a user has explicit control over the list of roles that a node has. If the setting is configured, the node has exactly the roles in the list, and not any additional roles. This means to configure a master-only node, the setting is merely 'node.roles: [master]', and to configure a coordinating-only node, the setting is merely: 'node.roles: []'. With this change we deprecate the existing 'node.*' settings such as 'node.data'.	2020-06-25 14:14:51 -04:00
Tim Brooks	5efec3a517	Add error logging when http test fails (#58505 ) Netty4HttpServerTransportTests has started to fail intermittently. It seems like unexpected successful responses are being received when the test is simulating errors. This commit adds logging to the test to provide additional information when there is an unexpected success. It also adds the logging to the nio http test.	2020-06-24 11:02:20 -06:00
Luca Cavanna	7e2bb8d6a2	Mute Netty4HttpServerTransportTests#testCorsRequest (#58480 ) Relates to #58433	2020-06-24 14:31:38 +02:00
Alan Woodward	d251a482e9	Move MappedFieldType.similarity() to TextSearchInfo (#58439 ) Similarities only apply to a few text-based field types, but are currently set directly on the base MappedFieldType class. This commit moves similarity information into TextSearchInfo, and removes any mentions of it from MappedFieldType or FieldMapper. It was previously possible to include a similarity parameter on a number of field types that would then ignore this information. To make it obvious that this has no effect, setting this parameter on non-text field types now issues a deprecation warning.	2020-06-24 10:00:32 +01:00
Alan Woodward	8ebd341710	Add text search information to MappedFieldType (#58230 ) (#58432 ) Now that MappedFieldType no longer extends lucene's FieldType, we need to have a way of getting the index information about a field necessary for building text queries, building term vectors, highlighting, etc. This commit introduces a new TextSearchInfo abstraction that holds this information, and a getTextSearchInfo() method to MappedFieldType to make it available. Field types that do not support text search can just return null here. This allows us to remove the MapperService.getLuceneFieldType() shim method.	2020-06-23 14:37:26 +01:00
Alan Woodward	4b8cf2af6a	Add serialization test for FieldMappers when include_defaults=true (#58235 ) (#58328 ) Fixes a bug in TextFieldMapper serialization when index is false, and adds a base-class test to ensure that all field mappers are tested against all variations with defaults both included and excluded. Fixes #58188	2020-06-18 15:46:04 +01:00
Alan Woodward	ca2d12d039	Remove Settings parameter from FieldMapper base class (#58237 ) This is currently used to set the indexVersionCreated parameter on FieldMapper. However, this parameter is only actually used by two implementations, and clutters the API considerably. We should just remove it, and use it directly in the implementations that require it.	2020-06-18 12:53:54 +01:00
Rene Groeschke	abc72c1a27	Unify dependency licenses task configuration (#58116 ) (#58274 ) - Remove duplicate dependency configuration - Use task avoidance api across the build - Remove redundant licensesCheck config	2020-06-18 08:15:50 +02:00
jimczi	a7488ee16f	Fix PercolatorMatchedSlotSubFetchPhaseTests#testHitsExecute	2020-06-17 23:04:17 +02:00
Jim Ferenczi	a19213dcca	Fix nested document support in percolator query (#58149 ) This commit ensures that we filter out nested documents when retrieving the document slots of a matching query. Closes #52850	2020-06-17 22:32:54 +02:00
Alan Woodward	12a3f6dfca	MappedFieldType should not extend FieldType (#58160 ) MappedFieldType is a combination of two concerns: * an extension of lucene's FieldType, defining how a field should be indexed * a set of query factory methods, defining how a field should be searched We want to break these two concerns apart. This commit is a first step to doing this, breaking the inheritance relationship between MappedFieldType and FieldType. MappedFieldType instead has a series of boolean flags defining whether or not the field is searchable or aggregatable, and FieldMapper has a separate FieldType passed to its constructor defining how indexing should be done. Relates to #56814	2020-06-16 16:56:43 +01:00
Tal Levy	69d5e044af	Add optional description parameter to ingest processors. (#57906 ) (#58152 ) This commit adds an optional field, `description`, to all ingest processors so that users can explain the purpose of the specific processor instance. Closes #56000.	2020-06-15 19:27:57 -07:00
Tal Levy	499ad6fcc4	Pre-compile inline scripts in Ingest Script processors (#57960 ) (#58130 ) This commit introduces an optimization for inline scripts. It keeps the compiled ingest script that the ScriptProcessor.Factory has been creating for validation purposes. Previously, the Script Service's cache was leveraged because it was the best way to handle caching of both stored and inline scripts. Since inline scripts are so widely used in Ingest Node, it is probably best to ensure we are using the pre-compiled version from the beginning.	2020-06-15 15:22:56 -07:00
Dan Hermann	8a910443c4	Add ignore_empty_value parameter in set ingest processor (#57030 ) (#58108 )	2020-06-15 08:35:08 -05:00
Rene Groeschke	01e9126588	Remove deprecated usage of testCompile configuration (#57921 ) (#58083 ) * Remove usage of deprecated testCompile configuration * Replace testCompile usage by testImplementation * Make testImplementation non transitive by default (as we did for testCompile) * Update CONTRIBUTING about using testImplementation for test dependencies * Fail on testCompile configuration usage	2020-06-14 22:30:44 +02:00
Martijn van Groningen	c8031c6f99	Add data stream support to the reindex api. (#57970 ) Backport of #57870 to 7.x branch. This change now also copies the op_type from the reindex request's destination index request to the actual index request being used in the bulk request. For ensuring no document exists, the op_type create doesn't need to be copied, since Versions.MATCH_DELETED will be copied from the 'mainRequest.getDestination().version()'. The `version()` method on IndexRequest only returns Versions.MATCH_DELETED if op_type=create and no specific version has been specified. However, in order to be able to index into a data stream, the op_type must be create. So in order to support that, the op_type must be copied from the reindex request's destination index request to the actual index request being used in the bulk request. Relates to #53100 and #57788	2020-06-12 09:54:37 +02:00
Mark Tozzi	36f551bdb4	Make ValuesSourceConfig behave like a config object (#57762 ) (#58012 )	2020-06-11 17:23:55 -04:00
Alan Woodward	16e230dcb8	Update to lucene snapshot e7c625430ed (#57981 ) Includes LUCENE-9148 and LUCENE-9398, which splits the BKD metadata, index and data into separate files and keeps the index off-heap.	2020-06-11 14:51:53 +01:00
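
The "Introduce node.roles setting" commit (52ad5842a9) in the list above describes a simple rule: when `node.roles` is configured, the node has exactly the listed roles; otherwise the individual `node.*` flags decide. A minimal Python sketch of that rule follows. It is an illustration only, not the project's implementation: `resolve_roles` and `LEGACY_ROLE_DEFAULTS` are hypothetical names, and the default values in the table are placeholder assumptions, since the commit message does not state them.

```python
from typing import Any, Mapping, Set

# ASSUMPTION: placeholder defaults for the legacy per-role flags. The commit
# message does not list the real defaults, so these are illustrative only.
LEGACY_ROLE_DEFAULTS = {
    "master": True,
    "data": True,
    "ingest": True,
    "remote_cluster_client": True,
    "ml": True,
    "transform": True,
    "voting_only": False,
}


def resolve_roles(settings: Mapping[str, Any]) -> Set[str]:
    """Hypothetical helper: return the set of roles implied by `settings`."""
    # If node.roles is configured, the node has exactly those roles.
    if "node.roles" in settings:
        return set(settings["node.roles"])
    # Otherwise fall back to the individual (now deprecated) node.* flags.
    return {
        role
        for role, default in LEGACY_ROLE_DEFAULTS.items()
        if settings.get(f"node.{role}", default)
    }


# The fully explicit legacy master-only configuration from the commit message.
legacy_master_only = {
    "node.data": False,
    "node.ingest": False,
    "node.remote_cluster_client": False,
    "node.ml": False,
    "node.master": True,
    "node.transform": False,
    "node.voting_only": False,
}

print(sorted(resolve_roles(legacy_master_only)))          # ['master']
print(sorted(resolve_roles({"node.roles": ["master"]})))  # ['master']
print(sorted(resolve_roles({"node.roles": []})))          # []
```

The last two calls mirror the two configurations quoted in the commit message: `node.roles: [master]` for a master-only node and `node.roles: []` for a coordinating-only node.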

5685 Commits