OpenSearch

Commit Graph

Author	SHA1	Message	Date
David Kyle	531efb3fe5	Remove unreleased 7.1.2 version constant (#43629 ) This was breaking BWC tests as the presence of the constant implied 7.1.2 was released	2019-06-26 13:53:05 +01:00
David Kyle	58d0d5c51b	Mute DiskDisruptionIT#testGlobalCheckpointIsSafe Relates to #43626	2019-06-26 10:13:41 +01:00
Yannick Welsch	2049f715b3	Add voting-only master node (#43410 ) A voting-only master-eligible node is a node that can participate in master elections but will not act as a master in the cluster. In particular, a voting-only node can help elect another master-eligible node as master, and can serve as a tiebreaker in elections. High availability (HA) clusters require at least three master-eligible nodes, so that if one of the three nodes is down, then the remaining two can still elect a master amongst themselves. This only requires one of the two remaining nodes to have the capability to act as master, but both need to have voting powers. This means that one of the three master-eligible nodes can be made voting-only. If this voting-only node is a dedicated master, a less powerful machine or a smaller heap-size can be chosen for this node. Alternatively, a voting-only non-dedicated master node can play the role of the third master-eligible node, which allows running an HA cluster with only two dedicated master nodes. Closes #14340 Co-authored-by: David Turner <david.turner@elastic.co>	2019-06-26 08:07:56 +02:00
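The election arithmetic described in the voting-only commit above can be sketched as follows. This is only an illustration under simplifying assumptions: the class `VotingOnlyQuorumSketch` and the method `canElectMaster` are hypothetical names, not part of the Elasticsearch code base, and the model ignores cluster-state freshness and voting configurations.

```java
final class VotingOnlyQuorumSketch {

    // Hypothetical helper: returns true if an election can succeed.
    // masterEligible  = total number of master-eligible nodes (voters)
    // liveRegular     = live master-eligible nodes that may become master
    // liveVotingOnly  = live master-eligible nodes that can only vote
    static boolean canElectMaster(int masterEligible, int liveRegular, int liveVotingOnly) {
        int liveVoters = liveRegular + liveVotingOnly;
        // A strict majority of voters must be reachable...
        boolean hasMajority = liveVoters * 2 > masterEligible;
        // ...and at least one of them must be allowed to act as master.
        boolean hasCandidate = liveRegular >= 1;
        return hasMajority && hasCandidate;
    }

    public static void main(String[] args) {
        // Three master-eligible nodes, one of them voting-only.
        // One regular node is down: the remaining regular node plus the
        // voting-only node still form a majority with a valid candidate.
        System.out.println(canElectMaster(3, 1, 1));
        // Both regular nodes are down: the voting-only node alone cannot
        // form a majority and could not become master anyway.
        System.out.println(canElectMaster(3, 0, 1));
    }
}
```

The sketch separates "can vote" from "can be elected", which is the distinction the commit message draws: both remaining nodes need voting power, but only one needs to be able to act as master.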
David Turner	11f41c4e7d	Omit non-masters in ClusterFormationFailureHelper (#41344 ) Today the `ClusterFormationFailureHelper` says `... discovery will continue using ... from last-known cluster state` and lists all the nodes in the last-known cluster state. In fact we ignore the master-ineligible nodes in the last-known cluster state during discovery. This commit fixes this by listing only the master-eligible nodes from the cluster state in this message.	2019-06-26 08:07:56 +02:00
Nhat Nguyen	05e1f55a88	Ensure relocation target still tracked when start handoff (#42201 ) If the master removes the relocating shard, but recovery isn't aware of it, then we can enter an invalid state where ReplicationTracker does not include the local shard.	2019-06-25 23:19:59 -04:00
Jake Landis	9a3c86d422	include 7.2.1 as a version (#43584 )	2019-06-25 16:02:48 -05:00
David Turner	e738f0e6d2	Allow extra time for a warning to be logged (#43597 ) Today we assert that a warning is logged after no more than `discovery.cluster_formation_warning_timeout`, but the deterministic scheduler adds a small amount of extra randomness to the timing of future events, causing the following build to fail: ./gradlew :server:test --tests "org.elasticsearch.cluster.coordination.CoordinatorTests.testLogsWarningPeriodicallyIfClusterNotFormed" -Dtests.seed=DF35C28D4FA9EE2D This commit adds an allowance for this extra time.	2019-06-25 20:04:56 +01:00
Tanguy Leroux	0dc1c12f13	Fix indices shown in _cat/indices (#43286 ) After two recent changes (#38824 and #33888), the _cat/indices API no longer reports information for active recovering indices and non-replicated closed indices. It also misreports replicated closed indices that are potentially not authorized for the user. This commit changes how the cat action works by first using the Get Settings API in order to resolve authorized indices. It then uses the Cluster State, Cluster Health and Indices Stats APIs to retrieve information about the indices. Closes #39933	2019-06-25 20:02:34 +02:00
Zachary Tong	63fef5a31e	Add scripting support to AggregatorTestCase (#43494 ) This refactors AggregatorTestCase to allow testing mock scripts. The main change is to QueryShardContext. This was previously mocked, but to get the ScriptService you have to invoke a final method which can't be mocked. Instead, we just create a mostly-empty QueryShardContext and populate the fields that are needed for testing. It also introduces a few new helper methods that can be overridden to change the default behavior a bit. Most tests should be able to override getMockScriptService() to supply a ScriptService to the context, which is later used by the aggs. More complicated tests can override queryShardContextMock() as before. Adds a test to MaxAggregatorTests to test out the new functionality.	2019-06-25 11:52:12 -04:00
Przemysław Witek	c702cd7415	[7.x] Implement XContentParser.genericMap and XContentParser.genericMapOrdered methods (#42059 ) (#43575 )	2019-06-25 16:04:54 +02:00
Armin Braun	62a28921e8	Cleanup IndicesService#CacheCleaner Scheduling (#42060 ) (#43528 ) * Follow up to #42016	2019-06-25 13:04:04 +02:00
Yannick Welsch	3d5e4577aa	Fix testPostOperationGlobalCheckpointSync The conditions in this test do not hold true anymore after #43205. Relates to #43205	2019-06-25 12:49:29 +02:00
Nhat Nguyen	01205432fe	Unmute testOpenCloseApiWildcards Relates #39578	2019-06-24 17:12:57 -04:00
Jim Ferenczi	ae31ca5f7e	Fix score mode of the MinimumScoreCollector (#43527 ) This change fixes the score mode of the minimum score collector to be set based on the score mode of the child collector (top docs). Closes #43497	2019-06-24 21:32:33 +02:00
Yannick Welsch	d45f12799c	Sync global checkpoint on pending in-sync shards (#43526 ) At the end of a peer recovery the primary wants to mark the replica as in-sync. For that the persisted local checkpoint of the replica needs to have caught up with the global checkpoint on the primary. If translog durability is set to ASYNC, this means that information about the persisted local checkpoint can lag on the primary and might need to be explicitly fetched through a global checkpoint sync action. Unfortunately, that action will only be triggered after 30 seconds, and, even worse, will only run based on what the in-sync shard copies say (see IndexShard.maybeSyncGlobalCheckpoint). As the replica has not been marked as in-sync yet, it is not taken into consideration, and the primary might have its global checkpoint equal to the max seq no, so it thinks nothing needs to be done. Closes #43486	2019-06-24 18:35:57 +02:00
Zachary Tong	eaa9ee1f16	Set document on script when using Bytes.WithScript (#43390 ) Long and Double ValuesSource set the current document on the script before executing, but Bytes was missing this method call. That meant it was possible to generate an OutOfBoundsException when using a "value" script (field + script) on keyword or other bytes fields. This adds in the method call, and a few yaml tests to verify correct behavior.	2019-06-24 12:20:28 -04:00
Andrey Ershov	98d7d231bb	Fix testNoMasterActions (#43471 ) This commit performs the proper restore of network disruption. Previously disruptionScheme.stopDisrupting() was called, which does not ensure that connectivity between cluster nodes is restored. The test was checking that the cluster has green status, but it was not checking that connectivity between nodes is restored. Here we switch to internalCluster().clearDisruptionScheme(true), which performs both checks before returning. Similar to #42798 Closes #42051 (cherry picked from commit cd1ed662f847a0055ede7dfbd325e214ec4d1490)	2019-06-24 18:53:58 +03:00
Martijn van Groningen	101cf384ba	Replace Streamable w/ Writeable in AcknowledgedResponse and subclasses (backport 7.x) (#43525 ) This commit replaces usages of Streamable with Writeable for the AcknowledgedResponse and its subclasses, plus associated actions. Note that where possible response fields were made final and default constructors were removed. This is a large PR, but the change is mostly mechanical. Relates to #34389 Backport of #43414	2019-06-24 13:47:37 +02:00
Tanguy Leroux	41ebaf57b5	Do not hang on unsupported HTTP methods (#43362 ) Unsupported HTTP methods are detected during requests dispatching which generates an appropriate error response. Sadly, this error is never sent back to the client because the method of the original request is checked again in DefaultRestChannel which throws again an IllegalArgumentException that is never handled. This pull request changes the DefaultRestChannel so that the latest exception is swallowed, allowing the error message to be sent back to the client. It also eagerly adds the objects to close to the toClose list so that resources are more likely to be released if something goes wrong during the response creation and sending.	2019-06-24 13:16:29 +02:00
Yannick Welsch	19520d4640	Add additional logging for #43034 It's unclear why sometimes the shard is not flushed on closing	2019-06-24 12:30:22 +02:00
Yannick Welsch	127a608147	Assert that NOOPs must succeed (#43483 ) We currently assert that adding deletion tombstones to Lucene must always succeed if it's not a tragic exception, and the same should also hold true for NOOP tombstones. We rely on this assumption, as without this, we have the risk of creating gaps in the history, which will break operation-based recoveries and CCR.	2019-06-24 11:38:34 +02:00
Nhat Nguyen	04bc754d8d	Cleanup legacy logic in CombinedDeletionPolicy (#43484 ) This change removes the support for pre-v6 index commits which do not have sequence numbers.	2019-06-23 11:30:04 -04:00
Luca Cavanna	186c3122be	[TEST] Embed msearch samples in MultiSearchRequestTests (#43482 ) Depending on git configuration, line feed on checked out files may be platform dependent, which causes problems for some msearch tests as the line separator must always be `\n`. With this change we move two files to the test code so that we control exactly what line separator is used, given that the corresponding tests fail on Windows. Closes #43464	2019-06-21 19:05:53 +02:00
David Turner	e4fd0ce730	Reduce TestLogging usage in DisruptionIT tests (#43411 ) Removes `@TestLogging` annotations in `*DisruptionIT` tests, so that the only tests with annotations are those with open issues. Also adds links to the open issues in the remaining cases. Relates #43403	2019-06-21 15:01:03 +01:00
Christoph Büscher	4fe650c9e5	Fix DefaultShardOperationFailedException subclass xcontent serialization (#43435 ) The current toXContent implementation can fail when the superclasses toXContent is called (see #43423). This change makes sure that DefaultShardOperationFailedException#toXContent is final and implementations need to add special fields in #innerToXContent. All implementations should write to self-contained xContent objects. Also adding a test for xContent deserialization to CloseIndexResponseTests. Closes #43423	2019-06-21 14:31:19 +02:00
Yu	c88f2f23a5	Make Recovery API support `detailed` params (#29076 ) Properly forwards the `detailed` parameter to show the recovery stats details. Closes #28910	2019-06-21 09:05:33 +02:00
Andrei Stefan	90e151edeb	Mute MultiSearchRequestTests.java tests (#43467 )	2019-06-21 08:38:21 +03:00
Jim Ferenczi	cc6c114cb8	Fix round up of date range without rounding (#43303 ) Today when searching for an exclusive range the java date math parser rounds up the value with the granularity of the operation. So when searching for values that are greater than "now-2M" the parser rounds up the operation to "now-1M". This behavior was introduced when we migrated to java date but it looks like a bug since the joda math parser rounds up values but only when a rounding is used. So "now/M" is rounded to "now-1ms" (minus 1ms to get the largest inclusive value) in the joda parser if the result should be exclusive but no rounding is applied if the input is a simple operation like "now-1M". This change restores the joda behavior in order to have a consistent parsing in all versions. Closes #43277	2019-06-20 23:59:08 +02:00
Tim Brooks	827f8fcbd5	Move reindex request parsing into request (#43450 ) Currently the fromXContent logic for reindex requests is implemented in the rest action. This is inconsistent with other requests where the logic is implemented in the request. Additionally, it requires access to the rest action in order to parse the request. This commit moves the logic and tests into the ReindexRequest.	2019-06-20 17:49:11 -04:00
sandmannn	cf610b5e81	Added parsing of erroneous field value (#42321 )	2019-06-20 15:24:04 -04:00
Jake Landis	2f2d0a198f	add version 6.8.2	2019-06-20 12:07:55 -05:00
Zachary Tong	a8a81200d0	Better support for unmapped fields in AggregatorTestCase (#43405 ) AggregatorTestCase will NPE if only a single, null MappedFieldType is provided (which is required to simulate an unmapped field). While it's possible to test unmapped fields by supplying other, non-related field types... that's clunky and unnecessary. AggregatorTestCase just needs to filter out null field types when setting up.	2019-06-20 11:31:49 -04:00
Yannick Welsch	8c856d6d91	Adapt local checkpoint assertion With async durability, it does not hold true anymore after #43205. This is fine.	2019-06-20 17:29:53 +02:00
Armin Braun	99a44a04f7	Fix Infinite Loops in ExceptionsHelper#unwrap (#42716 ) (#43421 ) * Fix Infinite Loops in ExceptionsHelper#unwrap * Keep track of all seen exceptions and break out on loops * Closes #42340	2019-06-20 16:38:28 +02:00
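The loop protection described in the `ExceptionsHelper#unwrap` commit above (remember every exception already visited and stop when one repeats) can be sketched as follows. This is a minimal illustration, not the actual Elasticsearch implementation: the class name `UnwrapSketch` and the single-type signature are assumptions made for brevity.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

final class UnwrapSketch {

    // Walks the cause chain looking for an instance of `type`.
    // Returns null if no such throwable exists or if the chain is cyclic.
    static Throwable unwrap(Throwable t, Class<? extends Throwable> type) {
        // Identity-based set: two distinct exceptions that happen to be
        // equal() must not be mistaken for a loop.
        Set<Throwable> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        while (t != null && seen.add(t)) {
            if (type.isInstance(t)) {
                return t;
            }
            t = t.getCause();
        }
        return null;
    }

    public static void main(String[] args) {
        // Build a cyclic cause chain: a -> b -> a.
        RuntimeException a = new RuntimeException("a");
        RuntimeException b = new RuntimeException("b", a);
        a.initCause(b);
        // Terminates instead of spinning forever.
        System.out.println(unwrap(a, java.io.IOException.class));
    }
}
```

`Set.add` returns `false` for an element that is already present, so the loop condition doubles as the cycle check.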
Armin Braun	39fef8379b	Fix FsRepositoryTests.testSnapshotAndRestore (#42925 ) (#43420 ) * The commit generation can be 3 or 2 here -> fixed by checking the actual generation on the second commit instead of hard coding 2 * Closes #42905	2019-06-20 16:36:40 +02:00
synical	b4c4018d00	Remove Confusing Comment (#43400 )	2019-06-20 15:02:37 +01:00
David Turner	c8eb09f158	Fail connection attempts earlier in tests (#43320 ) Today the `DisruptibleMockTransport` always allows a connection to a node to be established, and then fails requests sent to that node such as the subsequent handshake. Since #42342, we log handshake failures on an open connection as a warning, and this makes the test logs rather noisy. This change fails the connection attempt first, avoiding these unrealistic warnings.	2019-06-20 14:45:24 +01:00
Yannick Welsch	e04a2258fc	Fix testGlobalCheckpointSync The test needed adaption after #43205, as the ReplicationTracker now distinguishes between the knowledge of the persisted global checkpoint and the computed global checkpoint on the primary Follow-up to #43205	2019-06-20 14:00:00 +02:00
Yannick Welsch	a76c034866	Reduce shard started failure logging (#43330 ) If the master is stepping down or shutting down, the error-level logging can cause quite a bit of noise.	2019-06-20 13:23:05 +02:00
Yannick Welsch	7f8e1454ab	Advance checkpoints only after persisting ops (#43205 ) Local and global checkpoints currently do not correctly reflect what's persisted to disk. The issue is that the local checkpoint is adapted as soon as an operation is processed (but not fsynced yet). This leaves room for the history below the global checkpoint to still change in case of a crash. As we rely on global checkpoints for CCR as well as operation-based recoveries, this has the risk of shard copies / follower clusters going out of sync. This commit required changing some core classes in the system: - The LocalCheckpointTracker keeps track now not only of the information whether an operation has been processed, but also whether that operation has been persisted to disk. - TranslogWriter now keeps track of the sequence numbers that have not been fsynced yet. Once they are fsynced, TranslogWriter notifies LocalCheckpointTracker of this. - ReplicationTracker now keeps track of the persisted local and persisted global checkpoints of all shard copies when in primary mode. The computed global checkpoint (which represents the minimum of all persisted local checkpoints of all in-sync shard copies), which was previously stored in the checkpoint entry for the local shard copy, has been moved to an extra field. - The periodic global checkpoint sync now also takes async durability into account, where the local checkpoints on shards only advance when the translog is asynchronously fsynced. This means that the previous condition to detect inactivity (max sequence number is equal to global checkpoint) is not sufficient anymore. - The new index closing API does not work when combined with async durability. The shard verification step now requires an additional pre-flight step to fsync the translog, so that the main verify shard step has the most up-to-date global checkpoint at disposition.	2019-06-20 11:12:38 +02:00
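The checkpoint bookkeeping described in the commit above (#43205) can be sketched as follows: a local checkpoint is the highest sequence number below which every operation has been marked, processed and persisted operations are tracked separately, and the global checkpoint is the minimum of the persisted local checkpoints of the in-sync copies. The names `CheckpointSketch`, `Tracker`, and `globalCheckpoint` are hypothetical; this is not the actual `LocalCheckpointTracker` or `ReplicationTracker` code.

```java
import java.util.HashSet;
import java.util.Set;

final class CheckpointSketch {

    // Tracks the highest sequence number N such that every operation
    // with a sequence number <= N has been marked.
    static final class Tracker {
        private final Set<Long> pending = new HashSet<>();
        private long checkpoint = -1; // -1 means "nothing marked yet"

        void mark(long seqNo) {
            pending.add(seqNo);
            // Advance across any contiguous run that is now complete.
            while (pending.remove(checkpoint + 1)) {
                checkpoint++;
            }
        }

        long checkpoint() {
            return checkpoint;
        }
    }

    // Minimum over the persisted local checkpoints of all in-sync copies.
    // Assumes at least one value is supplied.
    static long globalCheckpoint(long... persistedLocalCheckpoints) {
        long min = Long.MAX_VALUE;
        for (long c : persistedLocalCheckpoints) {
            min = Math.min(min, c);
        }
        return min;
    }

    public static void main(String[] args) {
        Tracker processed = new Tracker();
        Tracker persisted = new Tracker();
        // Operations 0..2 are processed, but only operation 0 is fsynced.
        processed.mark(0);
        processed.mark(1);
        processed.mark(2);
        persisted.mark(0);
        System.out.println("processed checkpoint: " + processed.checkpoint());
        System.out.println("persisted checkpoint: " + persisted.checkpoint());
        // A second in-sync copy has persisted up to 2; the global
        // checkpoint is bounded by the slower copy.
        System.out.println("global checkpoint: " + globalCheckpoint(persisted.checkpoint(), 2));
    }
}
```

Keeping two trackers makes the gap visible: with async durability the processed checkpoint can run ahead of the persisted one, and only the latter is safe to feed into the global checkpoint.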
Tanguy Leroux	24cfca53fa	Reconnect remote cluster when seeds are changed (#43379 ) The RemoteClusterService should close the current RemoteClusterConnection and should build it again if the seeds are changed, similarly to what is done when the ping interval or the compression settings are changed. Closes #37799	2019-06-20 10:30:02 +02:00
Luca Cavanna	94a4bc9933	SearchPhaseContext to not extend ActionListener (#43269 ) The fact that SearchPhaseContext extends ActionListener makes it hard to reason about when the original listener is notified and to trace those calls. Also, the corresponding onFailure and onResponse were only needed in two places, one each, where they can be replaced by a more intuitive call, like sendSearchResponse for onResponse.	2019-06-20 10:21:24 +02:00
Jim Ferenczi	c33d62adbc	Reduce the number of docvalues iterator created in the global ordinals fielddata (#43091 ) Today the fielddata for global ordinals re-creates docvalues readers of each segment when building the iterator of a single segment. This is required because the lookup of global ordinals needs to access the docvalues's TermsEnum of each segment to retrieve the original terms. This also means that we need to create NxN (where N is the number of segments in the index) docvalues iterators each time we want to collect global ordinal values. This wasn't an issue in previous versions since docvalues readers were stateless before 6.0 so they were reused on each segment but now that docvalues are iterators we need to create a new instance each time we want to access the values. In order to avoid creating too many iterators this change splits the global ordinals fielddata in two classes, one that is used to cache a single instance per directory reader and one that is created from the cached instance that can be used by a single consumer. The latter creates the TermsEnum of each segment once and reuses them to create the segment's iterator. This prevents the creation of all TermsEnums each time we want to access the value of a single segment, hence reducing the number of docvalues iterators to create to Nx2 (one iterator and one lookup per segment).	2019-06-20 08:44:07 +02:00
Jason Tedor	1f1a035def	Remove stale test logging annotations (#43403 ) This commit removes some very old test logging annotations that appeared to be added to investigate test failures that are long since closed. If these are needed, they can be added back on a case-by-case basis with a comment associating them to a test failure.	2019-06-19 22:58:22 -04:00
Lee Hinman	6b084e55c5	[7.x] Prevent NullPointerException in TransportRolloverAction (#43353 ) (#43397 ) It's possible for the passed in `IndexMetaData` to be null (for instance, cluster state passed in does not have the index in its metadata) which in turn can cause a `NullPointerException` when evaluating the conditions for an index. This commit adds null protection and unit tests for this case. Resolves #43296	2019-06-19 16:07:28 -06:00
Jim Ferenczi	b957aa46ce	Allocate memory lazily in BestBucketsDeferringCollector (#43339 ) While investigating memory consumption of deeply nested aggregations for #43091 the memory used to keep track of the doc ids and buckets in the BestBucketsDeferringCollector showed up as one of the main contributors. In my tests half of the memory held in the BestBucketsDeferringCollector is associated with segments that don't have matching docs in the selected buckets. This is expected on fields that have a big cardinality since each bucket can appear in very few segments. By allocating the builders lazily this change reduces the memory consumption by a factor of 2 (from 1GB to 512MB), hence reducing the impact on GCs for these volatile allocations. This commit also switches the PackedLongValues.Builder with a RoaringDocIdSet in order to handle very sparse buckets more efficiently. I ran all my tests on the `geoname` rally track with the following query: ```` { "size": 0, "aggs": { "country_population": { "terms": { "size": 100, "field": "country_code.raw" }, "aggs": { "admin1_code": { "terms": { "size": 100, "field": "admin1_code.raw" }, "aggs": { "admin2_code": { "terms": { "size": 100, "field": "admin2_code.raw" }, "aggs": { "sum_population": { "sum": { "field": "population" } } } } } } } } } } ````	2019-06-19 22:10:59 +02:00
Christos Soulios	d1637ca476	Backport: Refactor aggregation base classes to remove doEquals() and doHashCode() (#43363 ) This PR is a backport of #43214 from v8.0.0 A number of the aggregation base classes have an abstract doEquals() and doHashCode() (e.g. InternalAggregation.java, AbstractPipelineAggregationBuilder.java). Theoretically this is so the sub-classes can add to the equals/hashCode and don't need to worry about calling super.equals(). In practice, it's mostly just confusing/inconsistent. And if there are more than two levels, we end up with situations like InternalMappedSignificantTerms which has to call super.doEquals() which defeats the point of having these overridable methods. This PR removes the do versions and just uses equals/hashCode, calling super when necessary.	2019-06-19 22:31:06 +03:00
Armin Braun	be42b2c70c	Fix NetworkUtilsTests (#43295 ) (#43378 ) * Follow up to #42109: * Adjust test to only check that interface lookup by name works not actually lookup IPs which is brittle since virtual interfaces can be destroyed/created by Docker while the tests are running Co-authored-by: Jason Tedor <jason@tedor.me>	2019-06-19 21:23:09 +02:00
Lee Hinman	d81ce9a647	Return 0 for negative "free" and "total" memory reported by the OS (#42725 ) * Return 0 for negative "free" and "total" memory reported by the OS We've had a situation where the MX bean reported negative values for the free memory of the OS, in those rare cases we want to return a value of 0 rather than blowing up later down the pipeline. In the event that there is a serialization or creation error with regard to memory use, this adds asserts so the failure will occur as soon as possible and give us a better location for investigation. Resolves #42157 * Fix test passing in invalid memory value * Fix another test passing in invalid memory value * Also change mem check in MachineLearning.machineMemoryFromStats * Add background documentation for why we prevent negative return values * Clarify comment a bit more	2019-06-19 10:35:48 -06:00
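The guard described in the commit above (report 0 instead of a negative memory value) amounts to a clamp at the point where the OS value is read. The sketch below is illustrative only: `MemorySketch` and `sanitize` are hypothetical names, and the real change also adds assertions that are not shown here.

```java
final class MemorySketch {

    // Hypothetical helper: maps a negative byte count reported by the OS
    // to 0 and leaves non-negative values unchanged.
    static long sanitize(long reportedBytes) {
        return Math.max(0L, reportedBytes);
    }

    public static void main(String[] args) {
        // A bogus negative reading is replaced by 0.
        System.out.println(sanitize(-4096L));
        // A plausible reading passes through untouched.
        System.out.println(sanitize(8L * 1024 * 1024 * 1024));
    }
}
```

Clamping at the source keeps the invalid value from reaching serialization code, where it would fail far from the place that produced it.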
Nhat Nguyen	b5c8b32cab	Do not use soft-deletes to resolve indexing strategy (#43336 ) This PR reverts #35230. Previously, we relied on soft-deletes to fill the mismatch between the version map and the Lucene index. This is no longer needed after #43202 where we rebuild the version map when opening an engine. Moreover, PrunePostingsMergePolicy can prune _id of soft-deleted documents out of order; thus the lookup result including soft-deletes sometimes does not return the latest version (although it's okay as we only use a valid result in an engine). With this change, we use only live documents in Lucene to resolve the indexing strategy. This is perfectly safe since we keep all deleted documents after the local checkpoint in the version map. Closes #42979	2019-06-19 10:40:24 -04:00
Martijn van Groningen	a4c45b5d70	Replace Streamable w/ Writeable in SingleShardRequest and subclasses (#43222 ) (#43364 ) Backport of: https://github.com/elastic/elasticsearch/pull/43222 This commit replaces usages of Streamable with Writeable for the SingleShardRequest / TransportSingleShardAction classes and subclasses of these classes. Note that where possible response fields were made final and default constructors were removed. Relates to #34389	2019-06-19 16:15:09 +02:00
Paul Sanwald	8578aba654	[backport] Adds a minimum interval to `auto_date_histogram`. (#42814 ) (#43285 ) Backports minimum interval to date histogram	2019-06-19 07:06:45 -04:00
Igor Motov	9f7d1ff2de	Geo: Add coerce support to libs/geo WKT parser (#43273 ) Adds support for coercing not closed polygons and ignoring Z value to libs/geo WKT parser. Closes #43173	2019-06-18 14:41:01 -04:00
Jim Ferenczi	de1a685cce	Fix sporadic failures in QueryStringQueryTests#testToQueryFuzzyQueryAutoFuziness (#43322 ) This commit ensures that the test does not use reserved keywords (OR, AND, NOT) when generating the random query strings. Closes #43318	2019-06-18 20:18:09 +02:00
David Turner	90a8589294	Local node is discovered when cluster fails (#43316 ) Today the `ClusterFormationFailureHelper` does not include the local node in the list of nodes it claims to have discovered. This means that it sometimes reports that it has not discovered a quorum when in fact it has. This commit adds the local node to the set of discovered nodes.	2019-06-18 12:23:23 +01:00
David Turner	2e064e0d13	Allow election of nodes outside voting config (#43243 ) Today we suppress election attempts on master-eligible nodes that are not in the voting configuration. In fact this restriction is not necessary: any master-eligible node can safely become master as long as it has a fresh enough cluster state and can gather a quorum of votes. Moreover, this restriction is sometimes undesirable: there may be a reason why we do not want any of the nodes in the voting configuration to become master. The reason for this restriction is as follows. If you want to shut the master down then you might first exclude it from the voting configuration. When this exclusion succeeds you might reasonably expect that a new master has been elected, since the voting config exclusion is almost always a step towards shutting the node down. If we allow nodes outside the voting configuration to be the master then the excluded node will continue to be master, which is confusing. This commit adjusts the logic to allow master-eligible nodes to attempt an election even if they are not in the voting configuration. If such a master is successfully elected then it adds itself to the voting configuration. This commit also adjusts the logic that causes master nodes to abdicate when they are excluded from the voting configuration, to avoid the confusion described above. Relates #37712, #37802.	2019-06-18 12:10:48 +01:00
Nhat Nguyen	0c5086d2f3	Rebuild version map when opening internal engine (#43202 ) With this change, we will rebuild the live version map and local checkpoint using documents (including soft-deleted) from the safe commit when opening an internal engine. This allows us to safely prune away _id of all soft-deleted documents as the version map is always in-sync with the Lucene index. Relates #40741 Supersedes #42979	2019-06-17 18:08:09 -04:00
David Turner	2d9b3a69e8	Relocation targets are assigned shards too (#43276 ) Adds relocation targets to the output of `IndexShardRoutingTable#assignedShards`.	2019-06-17 17:14:09 +01:00
Henning Andersen	ba15d08e14	Allow cluster access during node restart (#42946 ) (#43272 ) This commit modifies InternalTestCluster to allow using client() and other operations inside a RestartCallback (onStoppedNode typically). Restarting nodes are now removed from the map and thus all methods now return the state as if the restarting node does not exist. This avoids various exceptions stemming from accessing the stopped node(s).	2019-06-17 15:04:17 +02:00
David Turner	4b58827beb	Make DiscoveryNodeRole into a value object (#43257 ) Adds `equals()` and `hashcode()` methods to `DiscoveryNodeRole` to compare these objects' values for equality, and adds a field to allow us to distinguish unknown roles from known ones with the same name and abbreviation, for clearer test failures. Relates #43175	2019-06-17 10:23:29 +01:00
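The value-object pattern mentioned in the commit above can be sketched as follows: equality and hash code are derived from the object's fields, and an extra flag keeps an unknown role from comparing equal to a known one with the same name and abbreviation. The class `RoleSketch` and its field names are assumptions for illustration and do not reproduce `DiscoveryNodeRole`.

```java
import java.util.Objects;

final class RoleSketch {
    private final String name;
    private final String abbreviation;
    // Hypothetical flag: false for roles this node does not recognise.
    private final boolean known;

    RoleSketch(String name, String abbreviation, boolean known) {
        this.name = Objects.requireNonNull(name);
        this.abbreviation = Objects.requireNonNull(abbreviation);
        this.known = known;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof RoleSketch)) {
            return false;
        }
        RoleSketch other = (RoleSketch) o;
        return known == other.known
            && name.equals(other.name)
            && abbreviation.equals(other.abbreviation);
    }

    @Override
    public int hashCode() {
        // Must use the same fields as equals().
        return Objects.hash(name, abbreviation, known);
    }

    public static void main(String[] args) {
        RoleSketch a = new RoleSketch("data", "d", true);
        RoleSketch b = new RoleSketch("data", "d", true);
        RoleSketch unknown = new RoleSketch("data", "d", false);
        System.out.println(a.equals(b));
        System.out.println(a.equals(unknown));
    }
}
```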
Alpar Torok	a8bf18184a	Refactor Version class to make version bumps easier (#42668 ) (#43215 ) With this change we only have to add one line to add a new version. The intent is to make it less error prone and easier to write a script to automate the process.	2019-06-17 10:49:20 +03:00
Nhat Nguyen	4b643c50fa	Account soft deletes in committed segments (#43126 ) This change fixes the delete count issue in segment stats where we don't account soft-deleted documents from committed segments. Relates #43103	2019-06-16 22:56:24 -04:00
Jay Modi	c3f1e6a542	Ensure threads running before closing node (#43240 ) There are a few tests within NodeTests that submit items to the threadpool and then close the node. The tests are designed to check how running tasks are affected during node close. These tests can cause CI failures since the submitted tasks may not be running when the node is closed and then execute after the thread context is closed, which triggers an unexpected exception. This change ensures the threads are running so we avoid the unexpected exception and can test these cases. The test of task submittal while a node is closing is also important so an additional but muted test has been added that tests the case where a task may be getting submitted while the node is closing and ensuring we do not trigger anything unexpected in these cases. Relates #42774 Relates #42577	2019-06-14 12:35:43 -06:00
Julie Tibshirani	4b1d8e4433	Allow big integers and decimals to be mapped dynamically. (#42827 ) This PR proposes to model big integers as longs (and big decimals as doubles) in the context of dynamic mappings. Previously, the dynamic mapping logic did not recognize big integers or decimals, and would fail with an error of the form "No matching token for number_type [BIG_INTEGER]" when a dynamic big integer was encountered. It now accepts these numeric types and interprets them as 'long' and 'double' respectively. This allows `dynamic_templates` to accept and remap them as another type such as `keyword` or `scaled_float`. Addresses #37846.	2019-06-14 10:05:11 -07:00
Yannick Welsch	be9f27bb16	Properly use cancellable threads to stop UnicastZenPing (#42844 ) Fixes a backport issue with #42884 where Zen1 was not properly taken into account.	2019-06-14 13:32:44 +02:00
David Turner	221d23de9f	Fix DiscoveryNodeRoleIT (#43225 ) The test fails if querying the roles via a transport client, since the transport client does not have the plugin necessary to interpret the additional role correctly. This commit adds this plugin to the transport client used. Relates #43175 Fixes #43223	2019-06-14 12:27:01 +01:00
Christoph Büscher	7af23324e3	SimpleQ.S.B and QueryStringQ.S.B tests should avoid `now` in query (#43199 ) Currently the randomization of the q.b. in these tests can create query strings that can cause caching to be disabled for this query if we query all fields and there is a date field present. This is pretty much an anomaly that we shouldn't generally test for in the "testToQuery" tests where cache policies are checked. This change makes sure we don't create offending query strings so the cache checks never hit these cases and adds a special test method to check this edge case. Closes #43112	2019-06-14 11:21:48 +02:00
Przemyslaw Gomulka	4c8e77e092	Disable DiscoveryNodeRoleIT test due to failures (#43224 ) relates #43223	2019-06-14 10:57:22 +02:00
Przemysław Witek	65a584b6fb	[7.x] Report timing stats as part of the Job stats response (#42709 ) (#43193 )	2019-06-14 09:03:14 +02:00
Przemyslaw Gomulka	d27c0fd50d	Fix roundUp parsing with composite patterns backport(#43080 ) (#43191 ) roundUp parsers were losing the composite pattern information when new JavaDateFormatter was created from methods withLocale or withZone. The roundUp parser should be preserved when calling these methods. This is the same approach in withLocale/Zone methods as in `daa2ec8a60/server/src/main/java/org/elasticsearch/common/time/JavaDateFormatter.java` closes #42835	2019-06-14 08:56:26 +02:00
Jason Tedor	2bcc49424d	Register possible node roles in transport client The transport client needs to be told about the possible node roles. This commit does that.	2019-06-13 16:46:38 -04:00
Jason Tedor	55dba6ffad	Fix JDK-version dependent exception message parsing This commit fixes some JDK-version dependent exception message checking in the discovery node role tests.	2019-06-13 15:46:53 -04:00
Jason Tedor	5bc3b7f741	Enable node roles to be pluggable (#43175 ) This commit introduces the possibility for a plugin to introduce additional node roles.	2019-06-13 15:15:48 -04:00
Simon Willnauer	f70141c862	Only load FST off heap if we are actually using mmaps for the term dictionary (#43158 ) Given the significant performance impact that NIOFS has when term dicts are loaded off-heap this change enforces FstLoadMode#AUTO that loads term dicts off heap only if the underlying index input indicates a memory map. Relates to #43150	2019-06-13 07:54:02 +02:00
Tal Levy	20031fb13f	Introduce unit tests for ValuesSourceType (#43174 ) (#43176 ) As the ValuesSourceType evolves, it is important to be confident that new enum constants do not break backwards-compatibility on the stream. Having dedicated unit tests for this class will help be sure of that.	2019-06-12 18:17:23 -07:00
Jim Ferenczi	6cfed7ec72	Also mmap terms index (`.tip`) files for hybridfs (#43150 ) This change adds the terms index (`.tip`) to the list of extensions that are memory-mapped by hybridfs. These files used to be accessed only once to load the terms index on-heap but since #42838 they can now be used to read the binary FST directly so it is beneficial to memory-map them instead of accessing them via NIO.	2019-06-12 20:54:09 +02:00
Yannick Welsch	8711a092bf	Stop SeedHostsResolver on shutdown (#42844 ) Fixes an issue where tests would sometimes hang for 5 seconds when restarting a node. The reason is that the SeedHostsResolver is blockingly waiting on a result for the full 5 seconds when the corresponding threadpool is shut down.	2019-06-12 19:36:10 +02:00
Simon Willnauer	9d2adfb41e	Remove usage of FileSwitchDirectory (#42937 ) We are still using `FileSwitchDirectory` in the case a user configures file based pre-load of mmaps. This is trappy for multiple reasons if both directories used by `FileSwitchDirectory` point to the same filesystem directory. One issue is LUCENE-8835 that causes issues like #37111 - until LUCENE-8835 is fixed we should not use it in elasticsearch. Instead we use a similar trick as we use for HybridFS and subclass mmap directory directly.	2019-06-12 19:35:27 +02:00
Alan Woodward	9de1c69c28	IndexAnalyzers doesn't need to extend AbstractIndexComponent (#43149 ) AIC doesn't add anything here, and it removes the need to pass index settings to the constructor.	2019-06-12 17:48:31 +01:00
Jim Ferenczi	79614aeb2d	SearchRequest#allowPartialSearchResults does not handle successful retries (#43095 ) When set to false, allowPartialSearchResults option does not check if the shard failures have been reset to null. The atomic array, that is used to record shard failures, is filled with a null value if a successful request on a shard happens after a failure on a shard of another replica. In this case the atomic array is not empty but contains only null values so this shouldn't be considered as a failure since all shards are successful (some replicas have failed but the retries on another replica succeeded). This change fixes this bug by checking the content of the atomic array and fails the request only if allowPartialSearchResults is set to false and at least one shard failure is not null. Closes #40743	2019-06-12 16:27:10 +02:00
Christoph Büscher	7f690e8606	Fix suggestions for empty indices (#42927 ) Currently suggesters return null values on empty shards. Usually this gets replaced by results from other non-empty shards, but if the index is completely empty (e.g. after creation) the search response's "suggest" is also "null" and we don't render a corresponding output in the REST response. This is an irritating edge case that requires special handling on the user side (see #42473) and should be fixed. This change makes sure every suggester type (completion, terms, phrase) returns at least an empty skeleton suggestion output, even for empty shards. This way, even if we don't find any suggestions anywhere, we still return and output the empty suggestion. Closes #42473	2019-06-12 15:42:23 +02:00
Alexander Reelsen	6f95038001	Upgrade HPPC to version 0.8.1 (#43025 )	2019-06-12 13:14:16 +02:00
Luca Cavanna	afeda1a7b9	Split search in two when made against throttled and non throttled searches (#42510 ) When a search on some indices takes a long time, it may cause problems to other indices that are being searched as part of the same search request and being written to as well, because their search context needs to stay open for a long time. This is especially a problem when searching against throttled and non-throttled indices as part of the same request. The problem can be generalized though: this may happen whenever read-only indices are searched together with indices that are being written to. Search contexts staying open for a long time is only an issue for indices that are being written to, in practice. This commit splits the search in two sub-searches: one for read-only indices, and one for ordinary indices. This way the two don't interfere with each other. The split is done only when size is greater than 0, no scroll is provided and query_then_fetch is used as search type. Otherwise, the search executes like before. Note that the returned num_reduce_phases reflects the number of reduction phases that were run. If the search is split in two, there are three reductions: one non-final for each search, and a final one that merges the results of the previous two. Closes #40900	2019-06-12 11:25:03 +02:00
Luca Cavanna	31e8bff2ac	Rename SearchRequest#crossClusterSearch (#42363 ) The SearchRequest#crossClusterSearch method is currently used only as part of cross cluster search request, when minimizing roundtrips. It will soon be used also when splitting a search into two: one for throttled and one for non throttled indices. It will probably be used for other usecases as well in the future, hence it makes sense to generalize its name to subSearchRequest.	2019-06-12 11:25:03 +02:00
Henning Andersen	30d8085d96	scheduleAtFixedRate would hang (#42993 ) Though not in use in elasticsearch currently, it seems surprising that ThreadPool.scheduler().scheduleAtFixedRate would hang. A recurring scheduled task is never completed (except on failure) and we test for exceptions using RunnableFuture.get(), which hangs for periodic tasks. Fixed by checking that task is done before calling .get().	2019-06-11 19:46:37 +02:00
David Turner	04cde1d6e2	Defer reroute when nodes join (#42855 ) Today the master eagerly reroutes the cluster as part of processing node joins. However, it is not necessary to do this reroute straight away, and it is sometimes preferable to defer it until later. For instance, when the master wins its election it processes joins and performs a reroute, but it would be better to defer the reroute until after the master has become properly established. This change defers this reroute into a separate task, and batches multiple such tasks together.	2019-06-11 14:00:18 +01:00
Henning Andersen	1c7cd09375	Enable TRACE for testRecoverBrokenIndexMetadata (#43081 ) Relates to #43034	2019-06-11 12:38:48 +02:00
Jim Ferenczi	900eb4f882	Handle empty terms index in TermsSliceQuery (#43078 ) #40741 introduced a merge policy that can drop the postings for the `_id` field on soft deleted documents. The TermsSliceQuery assumes that every document has an entry in the postings for that field so it doesn't check if the terms index exists or not. This change fixes this bug by checking if the terms index for the `_id` field is null and ignoring the segment entirely if that is the case. This should be harmless since segments without an `_id` terms index should only contain soft deleted documents. Closes #42996	2019-06-11 12:01:53 +02:00
Henning Andersen	6a77dde5ea	Better test diag output on OOM (#42989 ) If linearizability checking fails with OOM (or other exception), we did not get the serialized history written into the log, making it difficult to debug in cases where the problem is hard to reproduce. Fixed to always attempt dumping the serialized history. Related to #42244	2019-06-11 09:48:52 +02:00
Alan Woodward	8e23e4518a	Move construction of custom analyzers into AnalysisRegistry (#42940 ) Both TransportAnalyzeAction and CategorizationAnalyzer have logic to build custom analyzers for index-independent analysis. A lot of this code is duplicated, and it requires the AnalysisRegistry to expose a number of internal provider classes, as well as making some assumptions about when analysis components are constructed. This commit moves the build logic directly into AnalysisRegistry, reducing the registry's API surface considerably.	2019-06-10 14:33:25 +01:00
Jim Ferenczi	39cb1abc9d	Fix auto fuzziness in query_string query (#42897 ) Setting `auto` after the fuzzy operator (e.g. `"query": "foo~auto"`) in the `query_string` does not take the length of the term into account when computing the distance and always uses a max distance of 1. This change fixes this discrepancy by ensuring that the term is passed when the fuzziness is computed.	2019-06-10 10:13:16 +02:00
Vigya Sharma	25218733e6	Allow routing commands with ?retry_failed=true (#42658 ) We respect allocation deciders, including the `MaxRetryAllocationDecider`, when executing reroute commands. If you specify `?retry_failed=true` then the retry counter is reset, but today this does not happen until after trying to execute the reroute commands. This means that if an allocation has repeatedly failed, but you want to take control and assign a shard to a particular node to work around the repeated failures, you cannot execute the routing command in the same call to `POST /_cluster/reroute` as the one that resets the failure counter. This commit fixes this by resetting the failure counter first, meaning that you can now explicitly allocate a repeatedly-failed shard like this: ``` POST /_cluster/reroute?retry_failed=true { "commands": [ { "allocate_replica": { "index": "blahblah", "shard": 2, "node": "node-4" } } ] } ``` Fixes #39546	2019-06-10 08:31:05 +01:00
Jason Tedor	63bad28005	Do not allow modify aliases on followers (#43017 ) Now that aliases are replicated by a follower from its leader, this commit prevents directly modifying aliases on follower indices.	2019-06-09 22:53:54 -04:00
Nhat Nguyen	0ebcb21d2c	Unmuted testRecoverBrokenIndexMetadata These tests should be okay as we flush at the end of peer recovery. Closes #40867	2019-06-09 10:26:57 -04:00
Nhat Nguyen	afe65b5988	Fix assertion in ReadOnlyEngine (#43010 ) We should execute the assertion before throwing an exception; otherwise, it's a noop.	2019-06-09 10:26:56 -04:00
Jason Tedor	915d2f2daa	Refactor put mapping request validation for reuse (#43005 ) This commit refactors put mapping request validation for reuse. The concrete case that we are after here is the ability to apply effectively the same framework to indices aliases requests. This commit refactors the put mapping request validation framework to allow for that.	2019-06-09 10:19:04 -04:00
Nhat Nguyen	0a982fc57f	Mute testLookupSeqNoByIdInLucene Tracked at #42979	2019-06-08 00:30:12 -04:00
Jason Tedor	b580677412	Fix put mapping request validators random test This commit fixes a test bug in the request validators random test. In particular, an assertion was not properly nested in a guard that would ensure that there was at least one failure. Relates #43000	2019-06-07 17:47:51 -04:00
Jason Tedor	d6fe4b648d	Fix possible NPE in put mapping validators (#43000 ) When applying put mapping validators, we apply all the validators in the collection. If a failure occurs, we collect that as a top-level exception, and suppress any additional failures into the top-level exception. However, if a request passes the validator after a top-level exception has been collected, we would try to suppress a null exception into the top-level exception. This is a violation of the Throwable#addSuppressed API. This commit addresses this, and adds a test to cover the logic of collecting the failures when validating a put mapping request.	2019-06-07 16:24:12 -04:00
David Turner	5bc0dfce94	Improve translog corruption detection (#42980 ) Today we test for translog corruption by incrementing a byte by 1 somewhere in a file, and verify that this leads to a `TranslogCorruptionException`. However, we rely on _all_ corruptions leading to this exception in the `RemoveCorruptedShardDataCommand`: this command fails if a translog file corruption leads to a different kind of exception, and `EOFException` and `NegativeArraySizeException` are both possible. This commit strengthens the translog corruption detection tests by simulating the following: - a random value is written - the file is truncated It also makes sure that we return a `TranslogCorruptionException` in all such cases. Fixes #42661 Backport of #42744	2019-06-07 20:28:02 +01:00
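
Note on d6fe4b648d (#43000): the failure-collection pattern described in that commit message can be sketched roughly as follows. This is a minimal illustration, not the actual Elasticsearch code; the class name `ValidatorFailures`, the method `collect`, and the use of `Function<String, Exception>` to stand in for a validator are all hypothetical. The only API behaviour relied on is that `Throwable#addSuppressed` throws `NullPointerException` when passed `null`, which is why a passing validator must be skipped rather than suppressed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: names and types are illustrative assumptions,
// not the real put mapping validation framework.
class ValidatorFailures {

    // Applies every validator. Returns the first failure with all later
    // failures attached as suppressed exceptions, or null if all pass.
    static Exception collect(String request, List<Function<String, Exception>> validators) {
        Exception first = null;
        for (Function<String, Exception> validator : validators) {
            Exception failure = validator.apply(request);
            if (failure == null) {
                // The validator passed. Calling first.addSuppressed(null)
                // here would throw a NullPointerException.
                continue;
            }
            if (first == null) {
                first = failure;               // top-level exception
            } else {
                first.addSuppressed(failure);  // additional failures
            }
        }
        return first;
    }

    public static void main(String[] args) {
        List<Function<String, Exception>> validators = new ArrayList<>();
        validators.add(r -> new IllegalArgumentException("first failure"));
        validators.add(r -> null); // passes after a failure was collected
        validators.add(r -> new IllegalStateException("second failure"));

        Exception result = collect("put-mapping-request", validators);
        System.out.println(result.getMessage());
        System.out.println(result.getSuppressed().length);
    }
}
```

The null check before `addSuppressed` is the whole point of the sketch: the order of failing and passing validators no longer matters.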

3226 Commits