OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-02-21 12:27:37 +00:00

Author	SHA1	Message	Date
Jim Ferenczi	a0e4331c49	Cleanup usages of QueryPhaseResultConsumer (#61713 ) This commit generalizes how QueryPhaseResultConsumer is initialized. The query phase always uses this consumer so it doesn't need to be hidden behind an abstract class.	2020-09-02 14:41:02 +02:00
Alan Woodward	d59343b4ba	Allow [null] values in [null_value] (#61798 ) (#61807 ) Several field mappers have a null_value parameter, that allows you to specify a placeholder value to insert into a document if the incoming value for that field is null. The default value for this is always null, meaning "add no placeholder". However, we explicitly bar users from setting this parameter directly to null (done in #7978, in order to fix an NPE). This exclusion means that if a mapper is serialized with include_defaults, then we either need to special-case null_value to ensure that it is not output when it holds the default value, or we find that the resulting serialized form cannot be used to create a mapping. This stops us doing some useful generic testing of mappers. This commit permits null as a parameter value for null_value, and changes the tests to check that it is a) permissible and b) applied without throwing errors. As part of the testing changes, a new base class MapperServiceTestCase is refactored from MapperTestCase, holding the various helper methods related to building mappings but not the single-mapper specific abstract methods. Closes #58823	2020-09-02 10:42:19 +01:00
Igor Motov	48e53cca94	Fix wrong NaN comparison (#61795 ) (#61811 ) Fixes wrong NaN comparison in error message generator in GeoPolygonDecomposer and PolygonBuilder. Supersedes #48207 Co-authored-by: Pedro Luiz Cabral Salomon Prado <pedroprado010@users.noreply.github.com>	2020-09-01 15:50:38 -04:00
Tim Brooks	e573fa9abc	Add data.path fast path for FilePermission (#61302 ) The recursive data.path FilePermission check is an extremely hot codepath in Elasticsearch. Unfortunately the FilePermission check in Java is extremely allocation heavy. As it iterates through different file permissions, it allocates byte arrays for each Path component that must be compared. This PR improves the situation by adding the recursive data.path FilePermission it its own PermissionsCollection object which is checked first.	2020-09-01 12:03:22 -06:00
Armin Braun	28710c985d	Dry up Settings from Map Construction (#61778 ) (#61803 ) We used the same hack all over the place. At least drying it up to a single place. Co-authored-by: Jay Modi <jaymode@users.noreply.github.com>	2020-09-01 19:46:10 +02:00
Tanguy Leroux	6e944d9e21	Throws IndexNotFoundException in TransportGetAction for unknown System indices (#61785 ) (#61791 ) The change #57936 introduced a dedicated thread pool for reads in system indices. It also introduced a potential NPE in the case the index to read in not yet present in the cluster state. This commit fixes that bug by using the getIndexSafe() instead of just index() method when retrieving the index's metadata so that an INFE is thrown if the index does not exist.	2020-09-01 17:41:57 +02:00
Dan Hermann	88a448f1cd	Fix wrong result when executing bulk requests with and without pipeline (#60818 ) (#61777 )	2020-09-01 07:05:25 -05:00
Armin Braun	3fd25bfa87	Fix Concurrent Snapshot Create+Delete + Delete Index (#61770 ) (#61773 ) We had a bug here were we put a `null` value into the shard assignment mapping when reassigning work after a snapshot delete had gone through. This only affects partial snaphots but essentially dead-locks the snapshot process. Closes #61762	2020-09-01 13:20:25 +02:00
Tanguy Leroux	787dfda4c1	Prevent snapshots to be mounted as system indices (#61517 ) (#61727 ) System indices can be snapshotted and are therefore potential candidates to be mounted as searchable snapshot indices. As of today nothing prevents a snapshot to be mounted under an index name starting with . and this can lead to conflicting situations because searchable snapshot indices are read-only and Elasticsearch expects some system indices to be writable; because searchable snapshot indices will soon use an internal system index (#60522) to speed up recoveries and we should prevent the system index to be itself a searchable snapshot index (leading to some deadlock situation for recovery). This commit introduces a changes to prevent snapshots to be mounted as a system index.	2020-09-01 11:13:28 +02:00
Boice Huang	8fdd3d158b	Remove redundant symbol in msearch tests (#61353 )	2020-09-01 10:58:22 +02:00
Nik Everett	fb84c1f73e	Calculate precise cardinality upper bounds (#61529 ) (#61754 ) This reworks `CardinalityUpperBound` to support precise estimates while maintaining most of the public API. This will allow us to make more informed choices about the data structures that we use in aggregations. None of those interesting choices come as part of this change, but they are more possible with it.	2020-08-31 15:10:02 -04:00
Dan Hermann	2858e1efc4	Document new stats in _cat/nodes (#60445 ) (#61742 )	2020-08-31 12:40:21 -05:00
Adam Locke	5723b928d7	Remove Outdated Snapshot Docs (#61684 ) (#61728 ) Removing some now outdated statements that refer to a time when snapshot operations could not run concurrently. Closes #61680	2020-08-31 12:04:27 -04:00
Jason Tedor	43cb7c48bd	Adjust Lucene versions for 7.9.1 This commit adjusts the Lucene versions for 7.9.1 after the backporting of upgrading the 7.9 branch to Lucene 8.6.2.	2020-08-31 10:30:39 -04:00
Jason Tedor	64cd229b35	Upgrade to Lucene 8.6.2 (#61688 ) This commit upgrades the Lucene dependencies to 8.6.2.	2020-08-31 09:54:07 -04:00
Rory Hunter	ff6c071275	Implement deprecation logging using log4j (#61629 ) Backport of #61474. Part of #46106. Simplify the implementation of deprecation logging by relying of log4j more completely, and implementing additional behaviour through custom appenders and filters.	2020-08-31 12:42:04 +01:00
Armin Braun	5c86b216e8	Fix Race in testGetSnapshotsRequest (#61694 ) (#61700 ) The fact that the data node is already blocked on writing data files did not guarantee that the cluster state that made the data node start snapshotting is already applied on master. This could lead to races where the get snapshots action still runs based on a state without the snapshot in it, tripping the assertion. Much safer to handle this by waiting on the non-blocking snapshot create to return, which guarantees that the CS has been applied on master. Closes #61541	2020-08-31 11:06:51 +02:00
Armin Braun	22e4d759c3	Speed up Reading Enum Set from Stream (#61678 ) (#61687 ) No need in adding enum values to a normal set and then copying, the `EnumSet` is directly mutable just fine.	2020-08-30 20:49:51 +02:00
Jake Landis	d2e5f2f532	[7.x] Enhance the ingest node simulate verbose output (#60433 ) (#60678 ) This commit enhances the verbose output for the `_ingest/pipeline/_simulate?verbose` api. Specifically this adds the following: * the pipeline processor is now included in the output * the conditional (if) and result is now included in the output iff it was defined * a status field is always displayed. the possible values of status are * `success` - if the processor ran with out errors * `error` - if the processor ran but threw an error that was not ingored * `error_ignored` - if the processor ran but threw an error that was ingored * `skipped` - if the process did not run (currently only possible if the if condition evaluates to false) * `dropped` - if the the `drop` processor ran and dropped the document * a `processor_type` field for the type of processor (e.g. set, rename, etc.) * throw a better error if trying to simulate with a pipeline that does not exist closes #56004	2020-08-27 16:53:09 -05:00
Lee Hinman	1bfebd54ea	[7.x] Allocate newly created indices on data_hot tier nodes (#61342 ) (#61650 ) This commit adds the functionality to allocate newly created indices on nodes in the "hot" tier by default when they are created. This does not break existing behavior, as nodes with the `data` role are considered to be part of the hot tier. Users that separate their deployments by using the `data_hot` (and `data_warm`, `data_cold`, `data_frozen`) roles will have their data allocated on the hot tier nodes now by default. This change is a little more complicated than changing the default value for `index.routing.allocation.include._tier` from null to "data_hot". Instead, this adds the ability to have a plugin inject a setting into the builder for a newly created index. This has the benefit of allowing this setting to be visible as part of the settings when retrieving the index, for example: ``` // Create an index PUT /eggplant // Get an index GET /eggplant?flat_settings ``` Returns the default settings now of: ```json { "eggplant" : { "aliases" : { }, "mappings" : { }, "settings" : { "index.creation_date" : "1597855465598", "index.number_of_replicas" : "1", "index.number_of_shards" : "1", "index.provided_name" : "eggplant", "index.routing.allocation.include._tier" : "data_hot", "index.uuid" : "6ySG78s9RWGystRipoBFCA", "index.version.created" : "8000099" } } } ``` After the initial setting of this setting, it can be treated like any other index level setting. This new setting is not set on a new index if any of the following is true: - The index is created with an `index.routing.allocation.include.<anything>` setting - The index is created with an `index.routing.allocation.exclude.<anything>` setting - The index is created with an `index.routing.allocation.require.<anything>` setting - The index is created with a null `index.routing.allocation.include._tier` value - The index was created from an existing source metadata (shrink, clone, split, etc) Relates to #60848	2020-08-27 13:41:12 -06:00
Luca Cavanna	f769821bc8	Pass SearchLookup supplier through to fielddataBuilder (#61430 ) (#61638 ) Runtime fields need to have a SearchLookup available, when building their fielddata implementations, so that they can look up other fields, runtime or not. To achieve that, we add a Supplier<SearchLookup> argument to the existing MappedFieldType#fielddataBuilder method. As we introduce the ability to look up other fields while building fielddata for mapped fields, we implicitly add the ability for a field to require other fields. This requires some protection mechanism that detects dependency cycles to prevent stack overflow errors. With this commit we also introduce detection for cycles, as well as a limit on the depth of the references for a runtime field. Note that we also plan on introducing cycles detection at compile time, so the runtime cycles detection is a last resort to prevent stack overflow errors but we hope that we can reject runtime fields from being registered in the mappings when they create a cycle in their definition. Note that this commit does not introduce any production implementation of runtime fields, but is rather a pre-requisite to merge the runtime fields feature branch. This is a breaking change for MapperPlugins that plug in a mapper, as the signature of MappedFieldType#fielddataBuilder changes from taking a single argument (the index name), to also accept a Supplier<SearchLookup>. Relates to #59332 Co-authored-by: Nik Everett <nik9000@gmail.com>	2020-08-27 18:09:56 +02:00
Alan Woodward	b6cb590685	Log more information when mappings fail on index creation (#61577 ) Errors from bad mappings at index creation are currently logged at DEBUG level, which can make it difficult to work out what's going on if the index is being auto-created. This commit ups the log level to INFO for auto-created indices, and includes some more information in the log message.	2020-08-27 15:08:51 +01:00
David Turner	411965d392	Allow background cluster state update in tests (#61455 ) Today the `CoordinatorTests` run the publication process as a single atomic action; however in production it appears possible that another master may be elected, publish its state, then fail, then we win another election, all in between the time we sampled our previous cluster state and started to publish the one we first thought of. This violates the `assertClusterStateConsistency()` assertion that verifies the cluster state update event matches the states we actually published and applied. This commit adjusts the tests to run the publication process more asynchronously so as to allow time for this behaviour to occur. This should eventually result in a reproduction of the failure in #61437 that will let us analyse what's really going on there and help us fix it.	2020-08-27 11:22:58 +01:00
David Turner	b866aaf81c	Use int for number of parts in blob store (#61618 ) Today we use `long` to represent the number of parts of a blob. There's no need for this extra range, it forces us to do some casting elsewhere, and indeed when snapshotting we iterate over the parts using an `int` which would be an infinite loop in case of overflow anyway: for (int i = 0; i < fileInfo.numberOfParts(); i++) { This commit changes the representation of the number of parts of a blob to an `int`.	2020-08-27 10:54:03 +01:00
David Turner	5df74cc888	Replace Math.toIntExact with toIntBytes (#61604 ) We convert longs to ints using `Math.toIntExact` in places where we're sure there will be no overflow, but this doesn't explain the intent of these conversions very well. This commit introduces a dedicated method for these conversions, and adds an assertion that we never overflow.	2020-08-27 08:28:54 +01:00
Jay Modi	34c4fc3b91	Remove tasks module to define tasks system index (#61588 ) This commit removes the tasks module that only existed to define the tasks result index, `.tasks`, as a system index. The definition for the tasks results system index descriptor is moved to the `SystemIndices` class with a check that no other plugin or module attempts to define an entry with the same source. Additionally, this change also makes the pattern for the tasks result index a wildcard pattern since we will need this when the index is upgraded (reindex to new name and then alias that to .tasks). Backport of #61540	2020-08-26 09:48:23 -06:00
David Turner	f2dc664228	Remove dead code in EsExecutors (#61574 ) Removes a couple of unused methods.	2020-08-26 16:08:36 +01:00
Przemyslaw Gomulka	9f566644af	Do not create two loggers for DeprecationLogger backport(#58435 ) (#61530 ) DeprecationLogger's constructor should not create two loggers. It was taking parent logger instance, changing its name with a .deprecation prefix and creating a new logger. Most of the time parent logger was not needed. It was causing Log4j to unnecessarily cache the unused parent logger instance. depends on #61515 backports #58435	2020-08-26 16:04:02 +02:00
Igor Motov	f70a59971a	[7.x] Add rate aggregation (#61369 ) (#61554 ) Adds a new rate aggregation that can calculate a document rate for buckets of a date_histogram. Closes #60674	2020-08-25 17:39:00 -04:00
Nik Everett	87cf81e179	Migrate some more mapper test cases (#61507 ) (#61552 ) Migrate some more mapper test cases from `ESSingleNodeTestCase` to `MapperTestCase`.	2020-08-25 15:27:26 -04:00
markharwood	8b56441d2b	Search - add case insensitive support for regex queries. (#59441 ) (#61532 ) Backport to add case insensitive support for regex queries. Forks a copy of Lucene’s RegexpQuery and RegExp from Lucene master. This can be removed when 8.7 Lucene is released. Closes #59235	2020-08-25 17:18:59 +01:00
Przemyslaw Gomulka	f3f7d25316	Header warning logging refactoring backport(#55941 ) (#61515 ) Splitting DeprecationLogger into two. HeaderWarningLogger - responsible for adding a response warning headers and ThrottlingLogger - responsible for limiting the duplicated log entries for the same key (previously deprecateAndMaybeLog). Introducing A ThrottlingAndHeaderWarningLogger which is a base for other common logging usages where both response warning header and logging throttling was needed. relates #55699 relates #52369 backports #55941	2020-08-25 16:35:54 +02:00
Armin Braun	f22ddf822e	Some Optimizations around BytesArray (#61183 ) (#61511 ) * Faster `equals` for `BytesArray` which is nice since with this change we use it for the search cache * Lighter `StreamInput` for `BytesArray` that should save memory and some indirection relative to the one on the abstract bytes reference * Lighter `writeTo` implementation * Build a `BytesArray` instead of a PagedBytesReference whenever possible to save indirection and memory	2020-08-25 07:13:39 +02:00
Armin Braun	806dfcfcf7	Speed up Compression Logic by Pooling Resources (#61358 ) (#61495 ) This is mostly motivated by the performance issues we are seeing around the GET mappings REST API which (in case of a large number of indices) will create decompressing streams in a hot loop which takes a significant amount of time for the system calls involved in instantiating deflaters and inflaters. Also, this fixes a leaked deflater when deserializing cached repository data.	2020-08-25 04:01:55 +02:00
Armin Braun	16b932c1dc	Remove Potentially Expensive Use of BytesReference.toBytesRef (#61415 ) (#61503 ) This method might have materialize all the bytes in a reference into a fresh `byte[]`. Using the stream is much safer and only trivially more expensive + in most cases we now run the fast path via `BytesArray` anyway.	2020-08-24 23:58:21 +02:00
Nhat Nguyen	d47bbbafe0	Cancel multisearch when http connection closed (#61399 ) Relates #61337	2020-08-24 15:12:54 -04:00
Nhat Nguyen	23a0f8b617	Detect and optimize noop of update index settings (#61348 ) This optimization is more relevant in the context of CCR. When a node in the follower cluster leaves, we reallocate the shard-follow tasks on that node to other nodes. The new tasks will overwhelm the follower cluster with many put-mapping, update-settings requests, although most of them are noop. This change detects and optimizes the noop update-settings requests.	2020-08-24 15:08:53 -04:00
Nik Everett	f3b6d49ae1	Migrate server mapper tests to new MapperTestCase (#61378 ) (#61490 ) This continues #61301, migrating all of the mappers in `server` to the new `MapperTestCase` which is nicer than `FieldMapperTestCase` because it doesn't depend on all of Elasticsearch.	2020-08-24 13:33:35 -04:00
Armin Braun	bb4d97073c	Remove Favicon Special Path in RestController (#61460 ) (#61487 ) It's unnecessary (and adds one string comparison to every request) to special case the favicon so I added it as a normal REST handler to simplify the code.	2020-08-24 18:36:23 +02:00
Armin Braun	af2e2782eb	Stop Needlessly Copying Bytes in XContent Parsing (#61447 ) (#61469 ) Wrapping a `BytesArray` in a `StreamInput` for deserialization is inefficient. This forces Jackson to internally buffer (i.e. copy) all bytes from the `BytesArray` before deserializing, adding overhead for copying the bytes and managing the buffers. This commit fixes a number of spots where `BytesArray` is the most common type of `BytesReference` to special case this type and parse it more efficiently. Also improves parsing `String`s to use the more efficient direct `String` parsing APIs.	2020-08-24 15:49:15 +02:00
Dan Hermann	c53731a0cd	[7.x] Fix wrong pipeline name in debug log (#58817 ) (#61233 )	2020-08-21 11:14:01 -05:00
David Turner	078e8717ee	Stop opening PING conns to remote clusters (#61408 ) Today a remote cluster connection comprises a `PING` and a `REG` channel. The `PING` channel is only used for health checks between the elected master and the members of its own cluster, so is unused in a remote cluster connection. This commit removes this unused connection.	2020-08-21 12:21:57 +01:00
Armin Braun	e09058df1a	Serialize Get Mappings Response on Generic ThreadPool (#57937 ) (#61401 ) For large responses to the get mappings request, the serialization to XContent can be extremely slow (serializing mappings is expensive since we have to decompress and deserialize the mapping source). To not introduce instability on the IO thread handling the get mappings response we should move the serialization to the management pool. The trade-off of introducing one or two new context switches for responses that are small enough to not cause trouble on the transport thread to prevent instability in case of a large number of mappings in the cluster seems worth it.	2020-08-21 08:06:30 +02:00
Armin Braun	22509c95f8	Fix Blackholed Connection Behavior in DisruptableMockTransport (#61310 ) (#61381 ) It is not realistic to drop messages without eventually failing. To retain the coverage of long pauses this PR adjusts the blackholed behavior to fail a send after 24h (which is assumed to be longer than any timeout in the system) instead of never. Closes #61034	2020-08-21 07:54:56 +02:00
Julie Tibshirani	997c73ec17	Correct how field retrieval handles multifields and copy_to. (#61391 ) Before when a value was copied to a field through a parent field or `copy_to`, we parsed it using the `FieldMapper` from the source field. Instead we should parse it using the target `FieldMapper`. This ensures that we apply the appropriate mapping type and options to the copied value. To implement the fix cleanly, this PR refactors the value parsing strategy. Now instead of looking up values directly, field mappers produce a helper object `ValueFetcher`. The value fetchers are responsible for almost all aspects of fetching, including looking up the right paths in the _source. The PR is fairly big but each commit can be reviewed individually. Fixes #61033.	2020-08-20 15:53:35 -07:00
Julie Tibshirani	85ad328df7	Ensure fetch fields aren't dropped when rewriting search. (#61390 ) Previously we didn't retain the requested fields when performing a shallow copy of the search source. This meant that when a search was rewritten, we could drop the requested fields and fail to return them in the response.	2020-08-20 14:58:58 -07:00
Armin Braun	08dbd6d989	Optimize a few Spots on IO Loop (#60865 ) (#61380 ) Saving some cycles here and there on the IO loop: * Don't instantiate new `Runnable` to execute on `SAME` in a few spots * Don't instantiate complicated wrapped stream for empty messages * Stop instantiating almost never used `ClusterStateObserver` in two spots * Some minor cleanup and preventing pointless `Predicate<>` instantiation in transport master node action	2020-08-20 20:22:49 +02:00
Alan Woodward	a3a0c63ccf	Convert NumberFieldMapper to parametrized form (#61092 ) (#61376 ) In addition, this commit converts ScaledFloatFieldMapper as it was relying on a number of static values taken from NumberFieldMapper that had changed or been removed.	2020-08-20 16:43:26 +01:00
Nhat Nguyen	a3906dcef3	Enable cancellation for msearch requests (#61337 ) Today multi-search requests are not cancellable because we create regular tasks instead of cancellable ones for them.	2020-08-19 16:59:17 -04:00
Nik Everett	9789e6d154	Migrate some field mapper tests to ESTestCase (#61301 ) (#61346 ) This switches a few tests for field mappers from `ESSingleNodeTestCase` to `ESTestCase` because, in general, we prefer to avoid `ESSingleNodeTestCase` when we can because it is slow and "big". "Big" here means that it pulls in an entire node, making it difficult to reason about what you are testing.	2020-08-19 15:43:49 -04:00

1 2 3 4 5 ...

5265 Commits