OpenSearch

Commit Graph

Author	SHA1	Message	Date
Ryan Ernst	742213d710	Improve error message when index settings are not a map (#45588 ) This commit adds an explicit error message when a create index request contains a settings key that is not a json object. Prior to this change the user would be given a ClassCastException with no explanation of what went wrong. closes #45126	2019-08-16 11:39:26 -07:00
Zachary Tong	50c65d05ba	Move bucket reduction from Bucket to the InternalAgg (#45566 ) The current idiom is to have the InternalAggregator find all the buckets sharing the same key, put them in a list, get the first bucket and ask that bucket to reduce all the buckets (including itself). This is a somewhat confusing workflow, and feels like the aggregator should be reducing the buckets (since the aggregator owns the buckets), rather than asking one bucket to do all the reductions. This commit basically moves the `Bucket.reduce()` method to the InternalAgg and renames it `reduceBucket()`. It also moves the `createBucket()` (or equivalent) method from the bucket to the InternalAgg as well.	2019-08-16 13:59:00 -04:00
Andrey Ershov	dbc90653dc	transport.publish_address should contain CNAME (#45626 ) This commit adds CNAME reporting for transport.publish_address same way it's done for http.publish_address. Relates #32806 Relates #39970 (cherry picked from commit e0a2558a4c3a6b6fbfc6cd17ed34a6f6ef7b15a9)	2019-08-16 17:42:00 +02:00
Armin Braun	d6a9edea16	Lower Limit for Maximum Message Size in TcpTransport (#44496 ) (#45635 ) * Since we're buffering network reads to the heap and then deserializing them it makes no sense to buffer a message that is 90% of the heap size since we couldn't deserialize it anyway * I think `30%` is a more reasonable guess here given that we can reasonably assume that the deserialized message will be larger than the serialized message itself and processing it will take additional heap as well	2019-08-16 12:27:54 +02:00
Armin Braun	a48242c371	Cleanup Redundant TransportLogger Instantiation (#43265 ) (#45629 ) * This class' methods are all effectively `static` => make them `static` and stop instantiating it needlessly	2019-08-15 21:16:56 +02:00
Zachary Tong	cd441f6906	Catch AllocatedTask registration failures (#45300 ) When a persistent task attempts to register an allocated task locally, this creates the Task object and starts tracking it locally. If there is a failure while initializing the task, this is handled by a catch and subsequent error handling (canceling, unregistering, etc). But if the task fails to be created because an exception is thrown in the task's constructor, this is uncaught and fails the cluster update thread. The ramification is that a persistent task remains in the cluster state, but is unable to create the allocated task, and the exception prevents other tasks "after" the poisoned task from starting too. Because the allocated task is never created, the cancellation tools are not able to remove the persistent task and it is stuck as a zombie in the CS. This commit adds exception handling around the task creation, and attempts to notify the master if there is a failure (so the persistent task can be removed). Even if this notification fails, the exception handling means the rest of the uninitialized tasks can proceed as normal.	2019-08-15 15:14:19 -04:00
Armin Braun	de58353722	Lower Painless Static Memory Footprint (#45487 ) (#45619 ) * Painless generates a ton of duplicate strings and empty `HashMap` instances wrapped as unmodifiable * This change brings down the static footprint of Painless on an idle node by 20MB (after running the PMC benchmark against said node) * Since we were looking into ways of optimizing for smaller node sizes I think this is a worthwhile optimization	2019-08-15 19:41:45 +02:00
Alpar Torok	03a1645bc6	Use dynamic port ranges for ExternalTestCluster (#45601 ) Moves methods added in #44213 and uses them to configure the port range for `ExternalTestCluster` too. These were still using `9300-9400` (the default) and running into races.	2019-08-15 16:40:12 +03:00
Armin Braun	1beea3588b	Make BlobStoreRepository Validation Read master.dat (#45546 ) (#45578 ) * Fixing this for two reasons: 1. Why not verify that the seed we wrote is actually there when we can 2. The AWS S3 SDK started to log a bunch of WARN messages about not fully reading the stream now that we started to abuse the read blob as an `exists` check after removing that method from the blob container	2019-08-15 07:07:52 +02:00
Nick Knize	647a8308c3	[SPATIAL] Backport new ShapeFieldMapper and ShapeQueryBuilder to 7x (#45363 ) * Introduce Spatial Plugin (#44389) Introduce a skeleton Spatial plugin that holds new licensed features coming to Geo/Spatial land! * [GEO] Refactor DeprecatedParameters in AbstractGeometryFieldMapper (#44923) Refactor DeprecatedParameters specific to legacy geo_shape out of AbstractGeometryFieldMapper.TypeParser#parse. * [SPATIAL] New ShapeFieldMapper for indexing cartesian geometries (#44980) Add a new ShapeFieldMapper to the xpack spatial module for indexing arbitrary cartesian geometries using a new field type called shape. The indexing approach leverages lucene's new XYShape field type which is backed by BKD in the same manner as LatLonShape but without the WGS84 latitude longitude restrictions. The new field mapper builds on and extends the refactoring effort in AbstractGeometryFieldMapper and accepts shapes in either GeoJSON or WKT format (both of which support non geospatial geometries). Tests are provided in the ShapeFieldMapperTest class in the same manner as GeoShapeFieldMapperTests and LegacyGeoShapeFieldMapperTests. Documentation for how to use the new field type and what parameters are accepted is included. The QueryBuilder for searching indexed shapes is provided in a separate commit. * [SPATIAL] New ShapeQueryBuilder for querying indexed cartesian geometry (#45108) Add a new ShapeQueryBuilder to the xpack spatial module for querying arbitrary Cartesian geometries indexed using the new shape field type. The query builder extends AbstractGeometryQueryBuilder and leverages the ShapeQueryProcessor added in the previous field mapper commit. Tests are provided in ShapeQueryTests in the same manner as GeoShapeQueryTests and docs are updated to explain how the query works.	2019-08-14 16:35:10 -05:00
Armin Braun	e0d84e7178	Clean up Callback Chains and Duplicate in SnapshotResiliencyTests (#45398 ) (#45563 ) * It's in the title, follow up to #45233 * Flatten more listeners into `StepListener` * Remove duplication from repo and index bootstrap and asserting that the steps execute successfully	2019-08-14 21:53:07 +02:00
Armin Braun	5f6bc6fc2d	Prevent Leaking Search Tasks on Exceptions in FetchSearchPhase and DfsQueryPhase (#45500 ) (#45540 ) * If `counter.onResult` throws an exception we might leak a transport task because the failure is not handled as a phase failure (instead it bubbles up in the transport service eventually hitting the `onFailure` callback again and counting down the `counter` twice). Co-authored-by: Jim Ferenczi <jim.ferenczi@elastic.co>	2019-08-14 14:49:38 +02:00
Armin Braun	00e4fba2fb	Simplify and Optimize RestController Slightly (#45419 ) (#45485 ) * Simplify the path iterator to generate less garbage * `dispatchRequest` always terminates, adjust code accordingly	2019-08-13 10:43:30 +02:00
Julie Tibshirani	dc1856ca53	Make sure to validate the type before attempting to merge a new mapping. (#45157 ) Currently, when adding a new mapping, we attempt to parse + merge it before checking whether its top-level document type matches the existing type. So when a user attempts to introduce a new mapping type, we may give a confusing error message around merging instead of complaining that it's not possible to add more than one type ("Rejecting mapping update to [my-index] as the final mapping would have more than 1 type..."). This PR moves the type validation to the start of `MetaDataMappingService#applyRequest` so that we make sure the type matches before performing any mapper merging. We already partially addressed this issue in #29316, but the tests there focused on `MapperService` and did not catch this problem with end-to-end mapping updates. Addresses #43012.	2019-08-12 14:28:03 -07:00
Zachary Tong	4d97d2c50f	Revert "Only execute one final reduction in InternalAutoDateHistogram (#45359 )" This reverts commit `c0ea8a867e`.	2019-08-12 17:17:17 -04:00
Julie Tibshirani	8c4394d5d7	Fix a bug where mappings are dropped from rollover requests. (#45411 ) We accidentally introduced this bug when adding a typeless version of the rollover request. The bug is not present if include_type_name is set to true.	2019-08-12 12:46:27 -07:00
Michael Basnight	a521e4c86f	Retrieve processors instead of checking existence (#45354 ) The previous hasProcessors method would validate if a processor was present within a pipeline, but would not return the contents of the processors. This does not allow a consumer to inspect the processor for specific metadata. The method now returns the list of processors based on the class of the processor passed in.	2019-08-12 13:48:17 -05:00
Zachary Tong	472f6ef41a	Mute InternalAutoDateHistogramTests#testReduceRandom()	2019-08-12 14:45:08 -04:00
Zachary Tong	c0ea8a867e	Only execute one final reduction in InternalAutoDateHistogram (#45359 ) Because auto-date-histo can perform multiple reductions while merging buckets, we need to ensure that the intermediate reductions are done with a `finalReduce` set to false to prevent Pipeline aggs from generating their output. Once all the buckets have been merged and the output is stable, a mostly-noop reduction can be performed which will allow pipelines to generate their output.	2019-08-12 14:07:38 -04:00
Albert Zaharovits	2cb172f079	CreateIndex and PutIndexTemplate with typeless mapping (#45120 ) This commit makes sure that mapping parameters to `CreateIndex` and `PutIndexTemplate` are keyed by the type name. `IndexCreationTask` expects mappings to be keyed by the type name. It asserts this for template mappings but not for the mappings in the request. The `CreateIndexRequest` and `RestCreateIndexAction` mostly make sure that the mapping is keyed by a type name, but not always. When building the create-index request outside of the REST handler, there are a few methods to set the mapping for the request. Some of them add the type name, some of them do not. For example, `CreateIndexRequest#mapping(String type, Map<String, ?> source)` adds the type name, but `CreateIndexRequest#mapping(String type, XContentBuilder source)` does not. This PR asserts the type name in the request mapping inside `IndexCreationTask` and makes all `CreateIndexRequest#mapping` methods add the type name.	2019-08-12 08:05:07 +03:00
Armin Braun	a9e1402189	Remove Settings from BaseRestRequest Constructor (#45418 ) (#45429 ) * Resolving the todo, cleaning up the unused `settings` parameter * Cleaning up some other minor dead code in affected classes	2019-08-12 05:14:45 +02:00
Nhat Nguyen	cf9a73b5ac	Call afterWriteOperation after trim translog in peer recovery (#45182 ) testShouldFlushAfterPeerRecovery was added in #28350 to make sure the flushing loop triggered by afterWriteOperation eventually terminates. This test relies on the fact that we call afterWriteOperation after making changes in translog. In #44756, we roll a new generation in RecoveryTarget#finalizeRecovery but do not call afterWriteOperation. Relates #28350 Relates #45073	2019-08-10 22:59:02 -04:00
Nhat Nguyen	25c6102101	Trim local translog in peer recovery (#44756 ) Today, if an operation-based peer recovery occurs, we won't trim translog but leave it as is. Some unacknowledged operations existing in translog of that replica might suddenly reappear when it gets promoted. With this change, we ensure trimming translog above the starting sequence number of phase 2. This change can allow us to read translog forward.	2019-08-10 22:59:02 -04:00
Armin Braun	1cd464d675	Isolate Request in Call-Chain for REST Request Handling (#45130 ) (#45417 ) * Follow up to #44949 * Stop using a special code path for multi-line JSON and instead handle its detection like that of other XContent types when creating the request * Only leave a single path that holds a reference to the full REST request * In the next step we can move the copying of request content to happen before the actual request handling and make it conditional on the handler in question to stop copying bulk requests as suggested in #44564	2019-08-10 10:21:01 +02:00
Armin Braun	d1ed9bdbfd	Use StepListener to Simplify SnapshotResiliencyTests (#45233 ) (#45386 ) * Reduces complicated callback relations in `testSuccessfulSnapshotAndRestore` to flat steps of sequential actions * Will refactor the other tests in this suite as a follow up * This format certainly makes it easier to create more complicated tests that involve multiple subsequent snapshots as it would allow adding loops	2019-08-09 18:19:48 +02:00
Yannick Welsch	9e6d874a41	Show BWC version in ClusterFormationFailureHelper (#45352 ) When having a cluster state from 6.x, display the metadata version as the cluster state version. Avoids confusion where a cluster state from 6.x is displayed as version 0 even if it has some actual content.	2019-08-09 16:23:38 +02:00
Yannick Welsch	5ddeb488a6	Allow _update on write alias (#45318 ) Using the document update API on aliases with a write index does not work. Follow-up to #31520	2019-08-09 11:44:24 +02:00
Tal Levy	2a99eaa7c2	Revert "removes the CellIdSource abstraction from geo-grid aggs (#45307 ) (#45353 )" This reverts commit `7b0a8040de`.	2019-08-08 17:40:03 -07:00
Armin Braun	12ed6dc999	Only retain reasonable history for peer recoveries (#45208 ) (#45355 ) Today if a shard is not fully allocated we maintain a retention lease for a lost peer for up to 12 hours, retaining all operations that occur in that time period so that we can recover this replica using an operations-based recovery if it returns. However it is not always reasonable to perform an operations-based recovery on such a replica: if the replica is a very long way behind the rest of the replication group then it can be much quicker to perform a file-based recovery instead. This commit introduces a notion of "reasonable" recoveries. If an operations-based recovery would involve copying only a small number of operations, but the index is large, then an operations-based recovery is reasonable; on the other hand if there are many operations to copy across and the index itself is relatively small then it makes more sense to perform a file-based recovery. We measure the size of the index by computing its number of documents (including deleted documents) in all segments belonging to the current safe commit, and compare this to the number of operations a lease is retaining below the local checkpoint of the safe commit. We consider an operations-based recovery to be reasonable iff it would involve replaying at most 10% of the documents in the index. The mechanism for this feature is to expire peer-recovery retention leases early if they are retaining so much history that an operations-based recovery using that lease would be unreasonable. Relates #41536	2019-08-09 01:56:32 +02:00
Tal Levy	7b0a8040de	removes the CellIdSource abstraction from geo-grid aggs (#45307 ) (#45353 ) CellIdSource is a helper ValuesSource that encodes GeoPoint into a long-encoded representation of the grid bucket the point is associated with. This complicates things as usage evolves to support shapes that are associated with more than one bucket ordinal.	2019-08-08 16:33:16 -07:00
Armin Braun	b19de55095	Add missing wait to testAutomaticReleaseOfIndexBlock (#45342 ) (#45351 ) Today the test waits for one of the shards to be blocked, but this does not mean that the block has been applied on all nodes, so a subsequent indexing operation may still go through. Fixes #45338	2019-08-08 22:39:22 +02:00
Henning Andersen	d139896b66	Reindex share retry between hit sources (#44203 ) (#45348 ) The client and remote hit sources each had their own retry mechanism, which did the same thing. To support resiliency we would have to expand on the retry mechanisms, and as a preparation for that the retry mechanism is now shared, such that each subclass is only responsible for sending requests and converting responses/failures to a common format. Part of #42612	2019-08-08 22:01:29 +02:00
Christoph Büscher	a552b33276	Fix occasional SuggestSearchIT failure (#45330 ) Refreshes happening during indexing can result in different segment counts and slightly skewed term statistics, which in turn has the potential to change suggestion output slightly. In order to prevent this, disable refresh for the affected tests. Closes #43261	2019-08-08 21:06:32 +02:00
Dimitris Athanasiou	e53bb050db	Mute testAutomaticReleaseOfIndexBlock Relates #45338	2019-08-08 17:56:41 +03:00
Andrey Ershov	07c656fba9	Mute testCustomDataPaths on Windows See #45333 (cherry picked from commit 671e1ad1068aee4b593ad0c8ab13ff60b4f125b8)	2019-08-08 16:26:56 +02:00
Zachary Tong	86d6597890	Use newIndexSearcher() instead of newSearcher() (#45248 ) `newSearcher()` from lucene can randomly choose index readers which are not compatible with our tests, like ParallelCompositeReader. The `newIndexSearcher()` method on AggregatorTestCase is a wrapper similar to newSearcher but compatible with our tests	2019-08-08 09:34:38 -04:00
Martijn van Groningen	e066133016	Change the ingest simulate api to not include dropped documents (#44161 ) If documents are dropped by the `drop` processor then these documents are returned as a `null` value in the response. === Example Create pipeline: ``` PUT _ingest/pipeline/droppipeline { "processors": [ { "set": { "field": "bla", "value": "val" } }, { "drop": {} } ] } ``` Simulate request: POST _ingest/pipeline/droppipeline/_simulate { "docs": [ { "_source": { "message": "text" } } ] } Response: ``` { "docs": [ null ] } ``` Response if verbose is enabled: ``` { "docs": [ { "processor_results": [ { "doc": { "_index": "_index", "_type": "_doc", "_id": "_id", "_source": { "message": "text", "bla": "val" }, "_ingest": { "timestamp": "2019-07-10T11:07:10.758315Z" } } }, null ] } ] } ``` Closes #36150 * Abort pipeline simulation in verbose mode when document has been dropped by drop processor	2019-08-08 13:04:33 +02:00
Martijn van Groningen	fb959d188c	Backport: Add description to force-merge tasks (#41365 ) (#45191 ) * Add description to force-merge tasks (#41365) This is static information that is part of the force merge request. Relates to #15975	2019-08-08 08:15:09 +02:00
Michael Basnight	89861d0884	Add ingest processor existence helper method (#45156 ) This commit adds a helper method to the ingest service allowing it to inspect a pipeline by id and verify the existence of a processor in the pipeline. This work exposed a potential bug in that some processors contain inner processors that are passed in at instantiation. These processors needed a common way to expose their inner processors, so the WrappingProcessor was created in order to expose the inner processor.	2019-08-07 11:19:04 -05:00
Bukhtawar	cd304c4def	Auto-release flood-stage write block (#42559 ) If a node exceeds the flood-stage disk watermark then we add a block to all of its indices to prevent further writes as a last-ditch attempt to prevent the node completely exhausting its disk space. However today this block remains in place until manually removed, and this block is a source of confusion for users who currently have ample disk space and did not even realise they nearly ran out at some point in the past. This commit changes our behaviour to automatically remove this block when a node drops below the high watermark again. The expectation is that the high watermark is some distance below the flood-stage watermark and therefore the disk space problem is truly resolved. Fixes #39334	2019-08-07 11:03:53 +01:00
Tanguy Leroux	a869342910	Restore DefaultShardOperationFailedException's reason after deserialization (#45203 ) The reason field of DefaultShardOperationFailedException is lost during serialization. This is sad because this field is checked for nullity during xcontent generation and it means that the cause won't be included in the generated xcontent and won't be printed in two REST API responses (Close Index API and Indices Shard Stores API). This commit simply restores the reason from the cause during deserialization.	2019-08-07 10:37:15 +02:00
Jason Tedor	bd59ee6c72	Fix clock used in update requests (#45262 ) We accidentally switched to using the relative time provider here. This commit fixes this by switching to the appropriate absolute clock.	2019-08-06 21:15:21 -04:00
David Turner	f5d1381e01	Remove always-true param from IndicesService#stats (#45231 ) Parameter `includePrevious` is always true, so this commit inlines it.	2019-08-06 17:22:11 +01:00
David Turner	355713b9ca	Improve slow logging in MasterService (#45241 ) Adds a tighter threshold for logging a warning about slowness in the `MasterService` instead of relying on the cluster service's 30-second warning threshold. This new threshold applies to the computation of the cluster state update in isolation, so we get a warning if computing a new cluster state update takes longer than 10 seconds even if it is subsequently applied quickly. It also applies independently to the length of time it takes to notify the cluster state tasks on completion of publication, in case any of these notifications holds up the master thread for too long. Relates #45007 Backport of #45086	2019-08-06 17:01:49 +01:00
Tanguy Leroux	772ce1f599	Add deprecation warning for Force Merge API (#44903 ) This commit adds a deprecation warning in 7.x for the Force Merge API when both only_expunge_deletes and max_num_segments are set in a request. Relates #44761	2019-08-06 16:04:24 +02:00
Jason Tedor	5b1b146099	Normalize environment paths (#45179 ) This commit applies a normalization process to environment paths, both in how they are stored internally and in their settings values. This normalization is done via two means: - we make the paths absolute - we remove redundant name elements from the path (what Java calls "normalization") This change ensures that when we compare and refer to these paths within the system, we are using a common ground. For example, prior to the change if the data path was relative, we would not compare it correctly to paths from disk usage. This is because the paths in disk usage were being made absolute.	2019-08-06 06:04:30 -04:00
Yannick Welsch	7aeb2fe73c	Add per-socket keepalive options (#44055 ) Uses JDK 11's per-socket configuration of TCP keepalive (supported on Linux and Mac), see https://bugs.openjdk.java.net/browse/JDK-8194298, and exposes these as transport settings. By default, these options are disabled for now (i.e. fall-back to OS behavior), but we would like to explore whether we can enable them by default, in particular to force keepalive configurations that are better tuned for running ES.	2019-08-06 10:45:44 +02:00
Igor Motov	b5f88120b5	Geo: add Geometry-based query builders to QueryBuilders (#45058 ) Add Geometry-based methods for creating query builders to QueryBuilders. Relates to #44715	2019-08-05 13:34:48 -04:00
Zachary Tong	3df1c76f9b	Allow pipeline aggs to select specific buckets from multi-bucket aggs (#44179 ) This adjusts the `buckets_path` parser so that pipeline aggs can select specific buckets (via their bucket keys) instead of fetching the entire set of buckets. This is useful for bucket_script in particular, which might want specific buckets for calculations. It's possible to work around this with `filter` aggs, but the workaround is hacky and probably less performant. - Adjusts documentation - Adds a barebones AggregatorTestCase for bucket_script - Tweaks AggTestCase to use getMockScriptService() for reductions and pipelines. Previously pipelines could just pass in a script service for testing, but this didn't work for regular aggs. The new getMockScriptService() method fixes that issue, but needs to be used for pipelines too. This had a knock-on effect of touching MovFn, AvgBucket and ScriptedMetric	2019-08-05 12:18:40 -04:00
Zachary Tong	e5079ac288	[7.x backport] Add more flexibility to MovingFunction window alignment (#45159 ) Introduce shift field to MovingFunction aggregation. By default, shift = 0. Behavior, in this case, is the same as before. Increasing shift by 1 moves starting window position by 1 to the right. To include the current bucket in the window, use shift = 1. For center alignment (n/2 values before and after the current bucket), use shift = window / 2. For right alignment (n values after the current bucket), use shift = window.	2019-08-05 11:56:52 -04:00
Nhat Nguyen	56083ba1ff	Remove assertion after locally recover replica (#45181 ) If the disk becomes broken after we have locally recovered the shard up to the global checkpoint, then the assertion won't hold.	2019-08-05 10:48:02 -04:00
David Turner	13a167051f	Remove fileBasedRecovery flag (#45146 ) Today `RecoveryTarget#prepareForTranslogOperations` takes a boolean flag indicating whether the recovery is file-based or not. This was used in 6.x to bootstrap some commit data that were missing in indices created in 5.x: `b506955f8d/server/src/main/java/org/elasticsearch/indices/recovery/RecoveryTarget.java (L298-L300)` This flag no longer has any effect, so this commit removes it. Backport of #45131 to 7.x.	2019-08-05 08:17:40 +01:00
Armin Braun	41815ed614	Optimize StreamInput#readString (#44930 ) (#45180 ) * Resolve TODO in `readString` by moving to reading chunks of `byte[]` instead of going byte by byte * Motivated by `readString` showing up as a significant user of CPU time on the IO thread in Rally PMC benchmark * Benchmarking this: * Could not reproduce a slowdown in the potential worst case (one or two non-ascii chars) since in this case the cost of creating the string itself exceeds the read times anyway * Speedup of 50%+ for reading 200 char ascii strings from `ByteBuf` or pages bytes backed streams * Longer strings obviously get bigger speedups * More ascii chars -> more speedup	2019-08-05 07:22:42 +02:00
Jason Tedor	d78ecd9c09	Use the full hash in build info (#45163 ) This commit switches to using the full hash to build into the JAR manifest, which is used in node startup and the REST main action to display the build hash.	2019-08-03 11:27:53 -04:00
Tim Brooks	984ba82251	Move nio channel initialization to event loop (#45155 ) Currently in the transport-nio work we connect and bind channels on a thread before the channel is registered with a selector. Additionally, it is at this point that we set all the socket options. This commit moves these operations onto the event-loop after the channel has been registered with a selector. It attempts to set the socket options for a non-server channel at registration time. If that fails, it will attempt to set the options after the channel is connected. This should fix #41071.	2019-08-02 17:31:31 -04:00
Zachary Tong	ffbe047c32	Revert "Add more flexibility to MovingFunction window alignment (#44360 )" This reverts commit `1a58a487f0`.	2019-08-02 15:16:04 -04:00
Nikita Glashenko	1a58a487f0	Add more flexibility to MovingFunction window alignment (#44360 ) Introduce shift field to MovingFunction aggregation. By default, shift = 0. Behavior, in this case, is the same as before. Increasing shift by 1 moves starting window position by 1 to the right. To include the current bucket in the window, use shift = 1. For center alignment (n/2 values before and after the current bucket), use shift = window / 2. For right alignment (n values after the current bucket), use shift = window.	2019-08-02 15:10:21 -04:00
David Turner	9ff320d967	Use index for peer recovery instead of translog (#45137 ) Today we recover a replica by copying operations from the primary's translog. However we also retain some historical operations in the index itself, as long as soft-deletes are enabled. This commit adjusts peer recovery to use the operations in the index for recovery rather than those in the translog, and ensures that the replication group retains enough history for use in peer recovery by means of retention leases. Reverts #38904 and #42211 Relates #41536 Backport of #45136 to 7.x.	2019-08-02 15:00:43 +01:00
Armin Braun	9450505d5b	Stop Passing Around REST Request in Multiple Spots (#44949 ) (#45109 ) * Stop Passing Around REST Request in Multiple Spots * Motivated by #44564 * We are currently passing the REST request object around to a large number of places. This works fine since we simply copy the full request content before we handle the rest itself which is needlessly hard on GC and heap. * This PR removes a number of spots where the request is passed around needlessly. There are many more spots to optimize in follow-ups to this, but this one would already enable bypassing the request copying for some error paths in a follow up.	2019-08-02 07:31:38 +02:00
Jim Ferenczi	3f94e2ea43	Sparse role queries can throw an NPE (#45053 ) Sparse role queries are executed differently than other queries in order to account for the fact that most of the documents are filtered from search. However this special execution does not set the scorer for the query so any collector that needs to access the score of a document fails with an NPE. This change fixes this bug by setting the scorer before collecting any hits when intersecting the main query and the sparse role.	2019-08-01 20:21:53 +02:00
William Brafford	5f50da947a	Fix bug in the Settings#processSetting method (#45095 ) The Settings#processSetting method is intended to take a setting map and add a setting to it, adjusting the keys as it goes in case of "conflicts" where the new setting implies an object where there is currently a string, or vice versa. processSetting was failing in two cases: adding a setting two levels under a string, and adding a setting two levels under a string and four levels under a map. This commit fixes the bug and adds test coverage for the previously faulty edge cases. * fix issue #43791 about settings * add unit test in testProcessSetting()	2019-08-01 13:27:08 -04:00
Yannick Welsch	917510d3e4	Always use primary term of operation in InternalEngine (#45083 ) We keep adding the current primary term to operations for which we do not assign a sequence number. This does not make sense anymore as all operations which we care about have sequence numbers now. The goal of this commit is to clean things up in InternalEngine and reduce the complexity.	2019-08-01 17:30:00 +02:00
Armin Braun	48dc53f8d2	Make PathTrieIterator a Little more Memory Efficient (#44951 ) (#45070 ) * There's no need to have the trie iterator hold another reference to the request object (which could be huge, see #44564) * Also removed unused boolean field from trie node	2019-08-01 17:26:08 +02:00
Nhat Nguyen	3a487379c3	Tighten no pending scheduled refresh check (#45025 ) Previously, we use ThreadPoolStats to ensure that the scheduledRefresh triggered by the internal refresh setting update is executed before we index a new document. With that change (#40387), this test did not fail for the last 3 months. However, using ThreadPoolStats is not entirely watertight as both "active" and "queue" count can be 0 in a very small interval when ThreadPoolExecutor pulls a task from the queue but before marking the corresponding worker as active (i.e., lock it). Closes #39565	2019-08-01 09:06:22 -04:00
David Turner	c088bafbbc	Wait for events in waitForRelocation (#45074 ) Adds a `waitForEvents(Priority.LANGUID)` to the cluster health request in `ESIntegTestCase#waitForRelocation()` to deal with the case that this health request returns successfully despite the fact that there is a pending reroute task which will relocate another shard. Relates #44433 Fixes #45003	2019-08-01 13:47:39 +01:00
David Turner	532ade7816	More logging for slow cluster state application (#45007 ) Today the lag detector may remove nodes from the cluster if they fail to apply a cluster state within a reasonable timeframe, but it is rather unclear from the default logging that this has occurred and there is very little extra information beyond the fact that the removed node was lagging. Moreover the only forewarning that the lag detector might be invoked is a message indicating that cluster state publication took unreasonably long, which does not contain enough information to investigate the problem further. This commit adds a good deal more detail to make the issues of slow nodes more prominent: - after 10 seconds (by default) we log an INFO message indicating that a publication is still waiting for responses from some nodes, including the identities of the problematic nodes. - when the publication times out after 30 seconds (by default) we log a WARN message identifying the nodes that are still pending. - the lag detector logs a more detailed warning when a fatally-lagging node is detected. - if applying a cluster state takes too long then the cluster applier service logs a breakdown of all the tasks it ran as part of that process.	2019-08-01 13:20:46 +01:00
Hendrik Muhs	b3be8f75f0	Fix version logic after 7.3 release (BWC) (#45077 ) removes unreleased version 7.2.2 after release of 7.3.0 as it breaks the version verifier, add documentation that explains the logic	2019-08-01 12:43:23 +02:00
Christoph Büscher	a669efd2a4	Remove left-over AwaitsFix in RateClusterStateIT (#45043 ) Issues are closed and fixes in #42580 and #42430 seem to be merged to 7.x at least.	2019-08-01 12:03:29 +02:00
Tim Brooks	aff66e3ac5	Add Cors integration tests (#44361 ) This commit adds integration tests to ensure that the basic cors functionality works for the netty and nio transports.	2019-07-31 14:24:23 -06:00
Armin Braun	8d63bd1d1e	Cleanup Various Action- Listener and Runnable Usages (#42273 ) (#45052 ) * Dry up code for creating simple `ActionRunnable` a little * Shorten some other code around `ActionListener` usage, in particular when wrapping it in a `TransportResponseListener`	2019-07-31 18:55:31 +02:00
Armin Braun	ee663dc9ac	Reenable Parallel Restore Test on Windows (#45037 ) (#45050 ) * As a result of #44096 this test shouldn't fail anymore on `master` and `7.4`+ so we should reenable it there * For older versions we won't backport that change so the tests should stay disabled there * Closes #44671	2019-07-31 18:35:34 +02:00
Christoph Büscher	35291ae175	Remove muted AckIT and AckClusterUpdateSettingsIT (#45044 ) Reading up on #33673 it looks like parts of these tests have been reworked and there is no intention to fix the remains on 7.x, so I think we can remove the entire test.	2019-07-31 17:17:21 +02:00
Luca Cavanna	8cc3c0dd93	Remove task null check in TransportAction (#45014 ) The task that TaskManager#register returns cannot be null. The method enforces that it is not null after calling request#createTask. It is then needless to check for null in the listener later. Also, added the call to the delegate listener in a finally block, just to make sure.	2019-07-31 17:16:41 +02:00
Christoph Büscher	e85b53a955	Remove left-over AwaitsFix in DedicatedClusterSnapshotRestoreIT (#45042 ) The issue mentioned (#38845) seems to have been closed with #38891 so the test can be re-activated.	2019-07-31 17:15:41 +02:00
Armin Braun	c7d7230524	Stop Recreating Wrapped Handlers in RestController (#44964 ) (#45040 ) * We shouldn't be recreating wrapped REST handlers over and over for every request. We only use this hook in x-pack and the wrapper there does not have any per request state. This is inefficient and could lead to some very unexpected memory behavior => I made the logic create the wrapper on handler registration and adjusted the x-pack wrapper implementation to correctly forward the circuit breaker and content stream flags	2019-07-31 17:11:34 +02:00
Zachary Tong	c25f3dd5d0	Introduce 7.3.1 version (#45046 )	2019-07-31 10:53:55 -04:00
Andrey Ershov	c27ac3d24c	Unmute testClusterJoinDespiteOfPublishingIssues and testElectMasterWithLatestVersion (#38555 ) See my comments for #37539 and #37685 (cherry picked from commit 038d4ab2940340eca942e32b54044f183b7804d9)	2019-07-31 14:55:02 +02:00
David Roberts	5e3010a606	Use system context for looking up connected nodes (#43991 ) When finding nodes in a connected cluster for cross cluster search the requests to get cluster state on the connected cluster should be made in the system context because logically they are equivalent to checking a single detail in the local cluster state and should not require that the user who made the request that is using this method in its implementation is authorized to view the entire cluster state. Fixes #43974	2019-07-31 09:09:56 +01:00
Igor Motov	1a1bb4707d	Geo: move indexShape to AbstractGeometryFieldMapper.Indexer (#44979 ) Move indexShape functionality into AbstractGeometryFieldMapper to make it more unit testable. Relates to #43644	2019-07-30 14:50:23 -04:00
Mayya Sharipova	a154b73d99	Assure index ops are successful for SimpleNestedIT (#44815 ) relates to #44486	2019-07-30 14:24:28 -04:00
Nhat Nguyen	979d0a71c7	Remove leniency during replay translog in peer recovery (#44989 ) This change removes leniency in InternalEngine during replaying translog in peer recovery.	2019-07-30 13:25:15 -04:00
Jake Landis	41a99c9e4a	introduce 7.2.2 as a version (#44371 ) * introduce 7.2.2 as a version	2019-07-30 18:52:34 +02:00
Jake Landis	03fea1c503	introduce 6.8.3 as a version (#44708 )	2019-07-30 18:48:41 +02:00
David Kyle	78aa6143a6	Mute FilteringAllocationIT testTransientSettingsStillApplied Relates to https://github.com/elastic/elasticsearch/issues/45003	2019-07-30 14:10:50 +01:00
Yannick Welsch	c1b569ed4b	Revert "Mute Zen1IT#testMixedClusterDisruption" This reverts commit `cf78ca58e3`.	2019-07-30 13:10:14 +02:00
David Turner	55f1dd8da6	Close nodes properly in Coordinator tests (#44967 ) Today closing a `ClusterNode` in an `AbstractCoordinatorTestCase` uses `onNode()` so has no effect if the node is not in the current list of nodes. It also discards the `Runnable` it creates without having run it, so has no effect anyway. This commit makes these tests much stricter about properly closing the nodes started during `Coordinator` tests, by tracking the persisted states that are opened, and adds an assertion to catch the trappy requirement that the closing node still belongs to the cluster.	2019-07-30 11:47:36 +01:00
David Kyle	cf78ca58e3	Mute Zen1IT#testMixedClusterDisruption	2019-07-30 11:33:39 +01:00
Jim Ferenczi	43bd8f2ba0	Fix aggregators early termination with breadth-first mode (#44963 ) This commit fixes a bug when a deferred aggregator tries to early terminate the collection. In such case the CollectionTerminatedException is not caught and the search fails on the shard. This change makes sure that we catch the exception in order to continue the deferred collection on the next leaf. Fixes #44909	2019-07-30 11:26:40 +02:00
Andrey Ershov	5a0bd696fc	Snapshot tool S3 cleanup 7.x backport (#44575 ) Backport of #44551	2019-07-30 11:02:08 +02:00
Nhat Nguyen	4813728783	Remove leniency in reset engine from translog (#44711 ) Replaying operations from the local translog must never fail as those operations were processed successfully on the primary before and the mapping is up to date already. This change removes leniency during resetting engine from translog in IndexShard and InternalEngine.	2019-07-29 16:31:45 -04:00
Jack Conradson	1a21682ed0	Fix JodaCompatibleZonedDateTime casts in Painless (#44874 ) This is a temporary fix during the Joda to Java datetime transition. This will implicitly cast a JodaCompatibleZonedDateTime to a ZonedDateTime for both def and static types. This is necessary to insulate users from needing to know about JodaCompatibleZonedDateTime explicitly.	2019-07-29 12:05:26 -07:00
Igor Motov	b6cef227a5	Geo: fix geo query decomposition (#44924 ) The recent refactoring introduced an issue where queries were not going through the decomposition processing. Fixes #44891	2019-07-29 11:48:24 -04:00
Luca Cavanna	a3cc32da64	TaskListener#onFailure to accept Exception instead of Throwable (#44946 ) TaskListener accepts today Throwable in its onFailure method. Though looking at where it is called (TransportAction), it can never be notified of a Throwable. This commit changes the signature of TaskListener#onFailure so that it accepts an `Exception` rather than a `Throwable` as second argument.	2019-07-29 16:47:19 +02:00
Michał Perlak	245c9b7914	Optimize Min and Max BKD optimizations (#44315 ) MinAggregator - skip BKD optimization when no result found after 1024 lookups. MaxAggregator - skip unnecessary conversions.	2019-07-29 10:04:39 -04:00
Yannick Welsch	24873dd3e3	Do not block transport thread on startup (#44939 ) We currently block the transport thread on startup, which has caused test failures. I think this is some kind of deadlock situation. I don't think we should even block a transport thread, and there's also no need to do so. We can just reject requests as long we're not fully set up. Note that the HTTP layer is only started much later (after we've completed full start up of the transport layer), so that one should be completely unaffected by this. Closes #41745	2019-07-29 11:35:17 +02:00
Armin Braun	f5efafd4d6	Cleanup Deadcode o.e.indices (#44931 ) (#44938 ) * none of this is used anywhere	2019-07-29 10:38:35 +02:00
Igor Motov	cfc8d17bb4	Geo: refactor geo mapper and query builder (#44884 ) Refactors out the indexing and query generation logic out of the mapper and query builder into a separate unit-testable classes.	2019-07-26 16:48:31 -04:00
Yannick Welsch	1561ab5420	Guard open connection call in RemoteClusterConnection (#44921 ) Fixes an issue where a call to openConnection was not properly guarded, allowing an exception to bubble up to the uncaught exception handler, causing test failures. Closes #44912	2019-07-26 22:27:45 +02:00
Tanguy Leroux	e1b626b947	Ensure index is green in SimpleClusterStateIT.testIndicesOptions() (#44893 ) SimpleClusterStateIT testIndicesOptions failed in #44817 because it tries to close an index at the beginning of the test. With random index settings, it is possible that the index has a high number of shards (10) and replicas (1), which means that on CI this index can take time to be fully allocated. The close index request can fail in the case where replicas are still recovering operations. This commit adds a simple ensureGreen() at the beginning of the test to be sure that all replicas are started before trying to close the index. closes #44817	2019-07-26 17:07:53 +02:00
Armin Braun	1340ff19bc	Fix Test Failure in ScalingThreadPoolTests (#44898 ) (#44901 ) * Due to #44894 some constellations log a deprecation warning here now * Fixed by checking for that	2019-07-26 17:05:50 +02:00

3553 Commits