The number of user data attributes of an index commit has increased
from 6 to 8, but we forgot to adjust the map that holds them. This
change increases the initial size of that map to avoid resizing.
Reducing the scope of some methods and marking them as static where possible.
Removing "alias" support from AnalysisRegistry#produceAnalyzer and changing that
method to return a NamedAnalyzer instead of having a side effect on the analyzer
map passed in. Also, CustomAnalyzerProvider doesn't seem to need the
`environment` field.
Today if `cluster.routing.rebalance.enable: none` then rebalancing is disabled,
but we still execute `balanceByWeights()` and perform some rather expensive
calculations before discovering that we cannot rebalance any shards. In a large
cluster this can make cluster state updates occur rather slowly. With this
change we check earlier whether rebalancing is globally disabled and, if so,
avoid the rebalancing process entirely.
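A minimal sketch of the early exit, using hypothetical names for the balancer entry point (the actual BalancedShardsAllocator internals differ):
```
// Hypothetical sketch: bail out before the expensive weight calculations
// when rebalancing is disabled cluster-wide.
void balance(RoutingAllocation allocation) {
    if (allocation.deciders().canRebalance(allocation).type() == Decision.Type.NO) {
        return; // rebalancing is globally disabled; nothing to do
    }
    balanceByWeights(allocation);
}
```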
The phrase suggesters have an option to remove terms that have
a frequency lower than a provided min_doc_freq. However this value is
overwritten by the frequency of the original term in the popular mode.
This change ensures that we keep the maximum value between the provided
min_doc_freq and the original term frequency as the threshold to select
candidates.
Fixes #16764
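A sketch of the intended threshold logic, with illustrative variable names rather than the actual suggester code:
```
// In "popular" mode a candidate must be more frequent than the original
// term; with min_doc_freq set it must also clear that floor. Keep the
// stricter of the two constraints instead of overwriting one with the other.
long threshold = Math.max(minDocFreqThreshold, originalTermFrequency);
```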
Today we are strict when parsing build flavor and types off the
wire. This means that if a later version introduces a new build flavor
or type, an older version would not be able to parse what that new
version is sending. For a practical example of this, we recently added
the build type "docker", and this means that in a rolling upgrade
scenario older nodes would not be able to understand the build type that
the newer node is sending. This breaks clusters and is bad. We do not
normally think of adding a new enumeration value as being a
serialization-breaking change; it is just not a lesson that we have
learned before. We should be lenient here though, so that we can add
future values without running the risk of breaking ourselves
horribly. The alternative is super-strict testing infrastructure,
and even then I fear the possibility of mistakes. This
commit changes the parsing of build flavor and build type so that we are
still strict at startup, yet we are lenient with values coming across
the wire. This will help avoid us breaking rolling upgrades, or clients
that are on an older version.
If the transport service is stopped, likely because we are shutting
down, and a retention lease background sync fires, the logs will display
a warning message and stacktrace. Yet this situation is harmless and can
happen as a normal course of business when shutting down. This commit
suppresses the log messages in this case.
The `getIndexShard()` and `sendReplicaRequest()` methods in
TransportReplicationAction are effectively only used to customise some
behaviour in tests. However there are other ways to do this that do not cause
such an obstacle to separating the TransportReplicationAction into its two
halves (see #40706).
This commit removes these customisation points and injects the test-only
behaviour using other techniques.
Many gradle projects specifically use the -try exclude flag because
there are many cases where an auto-closeable resource is never
referenced in the body of the corresponding try statement. Suppressing
this warning specifically in each place where it happens using
`@SuppressWarnings("try")` would be very verbose.
This change removes `-try` from every gradle project and adds it to the
build plugin. It also removes exclude flags from gradle projects
that are already specified in the build plugin (for example -deprecation).
Relates to #40366
Primary-replica resync in a mixed cluster between 6.x and 5.6 can send
operations without sequence numbers to a replica which has already processed
operations with sequence numbers. This leads to the failure of that
replica because we trip the sequence number assertion when writing resync
operations without sequence numbers to the translog.
This change rejects an illegal combination of flush parameters where
force is true, but wait_if_ongoing is false. This combination is trappy
and should be forbidden.
Closes #36342
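A minimal sketch of the validation this implies, assuming it lives in the flush request's validate() method (the actual location and message may differ):
```
// Reject force=true together with wait_if_ongoing=false: forcing a flush
// that silently bails out when another flush is running is trappy.
@Override
public ActionRequestValidationException validate() {
    ActionRequestValidationException validationException = null;
    if (force && waitIfOngoing == false) {
        validationException = ValidateActions.addValidationError(
            "wait_if_ongoing must be true for a force flush", validationException);
    }
    return validationException;
}
```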
If there's an ongoing flush triggered by the translog flush threshold,
we may fail to execute a flush because waitIfOngoing is false by
default.
Relates to #36342
If a refresh, which is scheduled by the setting change, executes after
the index-2 operation and wins the refresh race (i.e., maybeRefresh) against
the scheduledRefresh that we are going to check, then the latter will
return false.
Closes #39565
Relates #39462
PR #40387
A user reported that the same query that takes ~900ms when querying an index
pattern only takes ~50ms when only querying indices that have matches. The
query is a date range query and we confirmed that the `can_match` phase works
as expected. I was able to reproduce this issue locally with a single node: with
900 1-shard indices, a query to an index pattern that matches all indices runs
in ~90ms while a query to the only index that has matches runs in 0-1ms.
This ended up not being related to the `can_match` phase but to the cost of
resolving aliases when querying an index pattern that matches lots of indices.
In that case, we first resolve the index pattern to a list of concrete indices
and then for each concrete index, we check whether it was matched through an
alias, meaning we might have to apply alias filters. Unfortunately this second
per-index operation runs in linear time with the number of matched concrete
indices, which means that alias resolution runs in O(num_indices^2) overall.
So queries get quadratically slower as an index pattern matches more indices.
I reorganized alias resolution into a one-step operation that runs in linear
time with the number of matched indices, and then a per-index operation that
runs in linear time with the number of aliases of this index. This makes alias
resolution run in O(num_indices * num_aliases_per_index) overall instead. When
testing the scenario described above, the `took` went down from ~90ms to ~10ms.
It is still more than the 0-1ms latency that one gets when only querying the
single index that has data, but still much better than what we had before.
Closes #40248
If a field `field_name` was missing in a document,
doc['field_name'].get(0) incorrectly retrieved
a value of the previously accessed document.
This happened because the `get(int index)` function
was just accessing `values[index]` without
checking the number of values, `count`.
This PR fixes this.
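A sketch of the shape of the fix, modeled on the script doc-values classes (field names and messages here are illustrative):
```
// Illustrative sketch: guard element access with the per-document value
// count instead of trusting the reused backing array, which may still
// hold values from the previously accessed document.
public long get(int index) {
    if (count == 0) {
        throw new IllegalStateException("A document doesn't have a value for this field; " +
            "use doc[<field>].size()==0 to check whether it is missing");
    }
    if (index >= count) {
        throw new IndexOutOfBoundsException(
            "attempted to fetch index [" + index + "] but there are only [" + count + "] values");
    }
    return values[index];
}
```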
There were some test failures caused by the background retention lease sync running on a relocated
primary. This commit fixes the situation that triggered the assertion and reactivates the failing test.
Closes #40731
The Eclipse compiler (4.10, Photon) cannot build this test because it cannot
correctly infer the type arguments of the functions. Explicitly adding them
helps in this case.
This change adds the following internal refactorings:
* wraps input analyzers into an unmodifiable map in IndexAnalyzers ctor
* removes duplicated indexSetting in IndexAnalyzers
* removes references to IndexAnalyzers from DocumentMapperParser and TypeParser.ParserContext.
It can always be retrieved from MapperService directly in those cases.
It is important that resync actions are not rejected on the primary even if its
`write` threadpool is overloaded. Today we do this by exposing
`registerRequestHandlers` to subclasses and overriding it in
`TransportResyncReplicationAction`. This isn't ideal because it obscures the
difference between this action and other replication actions, and also might
allow subclasses to try and use some state before they are properly
initialised. This change replaces this override with a constructor parameter to
solve these issues.
Relates #40706
This commit deprecates versions of Java prior to Java 11. This commit
will cause a warning to be printed to standard error when any command
line tool is invoked, or when Elasticsearch is started. Additionally, we
log a deprecation message when Elasticsearch is started.
`TransportReplicationAction` is a rather complex beast, and some of its
concrete implementations do not need all of its features. More specifically, it
(a) chases a primary around the cluster until it manages to pin it down and
then (b) executes an action on that primary and all its replicas. There are
some actions that are coordinated by the primary itself, meaning that there is
no need for the chase-the-primary phases, and in the case of peer recovery
retention leases and primary/replica resync it is important to bypass these
first phases.
This commit is a step towards separating the `TransportReplicationAction` into
these two parts. It is a mostly mechanical sequence of steps to remove some
abstractions that are no longer in use.
Completion and DocStats are pulled from internal readers
instead of external ones since #33835 and #33847, which means we don't
need to refresh after a stats call: refreshes will happen internally
anyhow, and those will cause updated stats during ongoing indexing.
In 6.7.0 (#39378) we added a build type of DOCKER for the docker images, but
unfortunately earlier versions do not understand this and will reject any
transport messages that mention this build type.
This commit fixes this by reporting TAR instead of DOCKER when talking to older
nodes.
Relates (but does not fix) #40511
Relates #39378
In order to remain compatible with the existing joda-based
implementation, the parsing of milliseconds should support single
digits instead of requiring three, even with strict formats.
This adds a few tests to duel against the existing joda based
implementation in order to ensure the parsing behaviour is the same.
Closes #40403
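A sketch of how java.time can be told to accept one to three fraction digits, using `DateTimeFormatterBuilder#appendFraction` (the formatters in the codebase are more involved):
```
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

// Accept 1 to 3 millisecond digits rather than requiring exactly three,
// matching the lenient behaviour of the joda-based implementation.
DateTimeFormatter formatter = new DateTimeFormatterBuilder()
    .appendPattern("HH:mm:ss")
    .appendFraction(ChronoField.MILLI_OF_SECOND, 1, 3, true)
    .toFormatter();

LocalTime.parse("12:34:56.7", formatter);   // a single digit parses
LocalTime.parse("12:34:56.789", formatter); // three digits parse too
```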
Currently, if a Manifest write is unsuccessful (i.e. a WriteStateException
is thrown) we perform cleanup of newly created metadata files.
However, this is wrong.
Consider the following sequence (caught by CI here
https://github.com/elastic/elasticsearch/issues/39077):
- cluster global data is written **successfully**
- the associated manifest write **fails** (during the fsync, i.e. the files
have been written)
- deleting (reverting) the manifest file **fails**, so the metadata is
persisted
- deleting (reverting) the cluster global data is **successful**
In this case, when trying to load metadata (after node restart
because of dirty WriteStateException), the following exception will
happen
```
java.io.IOException: failed to find global metadata [generation: 0]
```
because the manifest file is referencing missing global metadata file.
This commit checks whether the thrown WriteStateException is dirty and,
if it is, we don't perform any cleanup, because the new Manifest file
might have been created but its deletion has failed.
In the future, we might add a more fine-grained check: perform the
cleanup if the WriteStateException is dirty but the Manifest deletion is
successful.
Closes https://github.com/elastic/elasticsearch/issues/39077
(cherry picked from commit 1fac56916bb3c4f3333c639e59188dbe743e385b)
On mapping updates the `text` field mapper does not update
the field types for the underlying prefix and phrase fields.
In practice this shouldn't be considered a bug, but we have
an assert in the code that checks that the field types in the mapper service
are identical to the ones present in the field mappers.
When geo point parsing threw a parse exception, it did not consume
remaining tokens from the parser. This in turn meant that
indexing documents with malformed geo points into mappings with
ignore_malformed=true would fail in some cases, since DocumentParser
expects geo_point parsing to end on the END_OBJECT token.
Related to #17617
As part of #40177 we have added top-level pipeline aggs to
`InternalAggregations`. Given that `QuerySearchResult` holds an
`InternalAggregations` instance, there is no need to keep on setting
top-level pipeline aggs separately. Top-level pipeline aggs can then
always be transported through `InternalAggregations`. This change is
made in a backwards-compatible manner.
IOException is never thrown in any of the existing pipeline aggregation
builders. Removing the throws IOException from the create method allows
removing it also from a couple of other methods, which ends up simplifying
AggregationPhase (one less catch).
The cat recovery API is incredibly useful. Yet it is missing the start
and stop times as options in the output. This commit adds these as
options to the cat recovery API. We elect to make them not visible by
default to avoid breaking output that users might rely on.
To make the script_score query have the same features
as the function_score query, we need to add a randomScore
function.
This function produces different
random scores on different index shards.
It is also able to produce random scores
based on the internal Lucene Document Ids.
Today if you try and insert a very large number like `1e9999999` into a long
field we first construct this number as a `BigDecimal`, convert this to a
`BigInteger` and then reject it because it is out of range. Unfortunately
making such a large `BigInteger` is rather expensive.
We can avoid this expense by performing a (weaker) range check on the
`BigDecimal` representation of incoming `long`s too.
Relates #26137
Closes #40323
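A sketch of the idea, with illustrative method and message names (the actual coercion code is written differently):
```
import java.math.BigDecimal;

// Illustrative sketch: reject values that can't possibly fit in a long
// while they are still a cheap BigDecimal, before materialising a
// potentially enormous BigInteger such as 1e9999999.
static long toLongExact(BigDecimal value) {
    if (value.compareTo(BigDecimal.valueOf(Long.MAX_VALUE)) > 0
            || value.compareTo(BigDecimal.valueOf(Long.MIN_VALUE)) < 0) {
        throw new IllegalArgumentException("out of range for a long: " + value);
    }
    return value.toBigInteger().longValueExact(); // now cheap: the value is small
}
```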
In #33062 we introduced the `cluster.remote.*.proxy` setting for proxied
connections to remote clusters, but left it deliberately undocumented since it
needed followup work so that it could work with SNI. However, since #32517 is
now closed we can add this documentation and remove the comment about its lack
of documentation.
This commit fixes an edge case in tests where search hits are empty
after the merge but some shards returned hits. This can happen if
the total number of merged hits is less than the provided `from`.
Closes #40553
It initially mentioned the type in the exception because the type used to be
required to uniquely identify a document. This is not necessary anymore given
that indices have at most one type.
`Index` interns its name and uuid. My guess is that the main goal is to avoid
having duplicate strings in the representation of the cluster state. However
I doubt it helps much given that we have many other objects in the cluster state
that we don't try to reuse, and interning has some cost. When looking into
#40263 my profiler pointed to string interning because of the `Index` object
that is created in `QueryShardContext` as one of the bottlenecks of the
`can_match` phase.
Adds the search_as_you_type field type that acts like a text field optimized
for as-you-type search completion. It creates a couple of subfields that analyze
the indexed terms as shingles, against which full terms are queried, and a
prefix subfield that analyzes terms as the largest shingle size used plus
edge-ngrams, against which partial terms are queried.
Adds a match_bool_prefix query type that creates a boolean clause of a term
query for each term except the last, for which a boolean clause with a prefix
query is created.
The match_bool_prefix query is the recommended way of querying a search as you
type field, which will boil down to term queries for each shingle of the input
text on the appropriate shingle field, and the final (possibly partial) term
as a term query on the prefix field. This field type also supports phrase and
phrase prefix queries, however.
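A sketch of the boolean query shape this describes, using Lucene query classes directly (the real query builder also handles analysis, operators, and fuzziness):
```
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.PrefixQuery;
import org.apache.lucene.search.TermQuery;

// match_bool_prefix over the analyzed terms ["quick", "brown", "f"]:
// every term but the last becomes a term clause; the last, a prefix clause.
BooleanQuery.Builder builder = new BooleanQuery.Builder();
builder.add(new TermQuery(new Term("body", "quick")), BooleanClause.Occur.SHOULD);
builder.add(new TermQuery(new Term("body", "brown")), BooleanClause.Occur.SHOULD);
builder.add(new PrefixQuery(new Term("body", "f")), BooleanClause.Occur.SHOULD);
BooleanQuery query = builder.build();
```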
This commit adds an InboundHandler to handle inbound message processing.
With this commit, this code is moved out of the TcpTransport.
Additionally, finer-grained unit tests are added to ensure that the
inbound processing works as expected.
Replicated closed indices can't be indexed into or searched, and therefore don't need a shard with
full indexing and search capabilities allocated. We can save on a lot of heap memory for those
indices by not allocating a mapper service and caching infrastructure (which preallocates a constant
amount per instance). Before this change, a 1GB ES instance could host 250 replicated closed
metricbeat indices (each index with one shard). After this change, the same instance can host 7300
replicated closed metricbeat indices (not that this would be a recommended configuration). Most
of the remaining memory is in the cluster state and the IndexSettings object.
Switches "discovery.type: single-node" from using a separate implementation for single-node discovery to using the existing standard discovery implementation, with two small adaptions:
- auto-bootstrapping, but requiring initial_master_nodes not to be set.
- not actively pinging other nodes using the Peerfinder
- not allowing other nodes to join its single-node cluster (if they have e.g. been set up using regular discovery and connect to the single-disco node).
Currently there are some components of message serialization and sending
that still occur in TcpTransport. This commit makes it possible to
send a message without the TcpTransport by moving all of the remaining
application logic to the OutboundHandler. Additionally, it adds unit
tests to ensure that this logic works as expected.
This test inadvertently asserts that the election that occurs after a master
failure is clean. However, messy elections are a fact of life so we should not
fail on a messy election.
This change moves this test away from an `AbstractDisruptionTestCase` since it
does not need the fault detector to be so enthusiastic, and weakens the
assertions to merely say that we ignore states published by the old master
without saying anything about the cleanliness of the election.
Closes #36556
Currently the TransportMessageListener is applied and used in the
Transport class. However, local requests and responses never make it to
this class. This PR moves the listener add/remove methods to the
TransportService. After this change the Transport can only have one
listener set with it. This one listener is the TransportService, which
will then propagate the events to the external listeners.
Additionally, this commit backports #40237:
Remove Tracer from MockTransportService
Java-time fails to parse composite patterns when the first pattern matches only a prefix of the input. It expects patterns in longest-to-shortest order. Because of this, constructing just one DateTimeFormatter with appendOptional is not sufficient. The parsers have to be iterated, and if parsing fails, the next one in order should be used. In order not to degrade performance, parsing should not throw exceptions on failure. Format.parseObject was used as it only returns null when parsing fails and allows checking whether the full input was read.
Closes #39916
Backport of #40100
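A sketch of the iteration strategy described above, using `DateTimeFormatter#toFormat` so that failures return null instead of throwing (the formatter list and error message are illustrative):
```
import java.text.Format;
import java.text.ParsePosition;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAccessor;
import java.util.List;

// Try each formatter in turn; Format.parseObject returns null on failure
// instead of throwing, and ParsePosition lets us require that the whole
// input was consumed, not just a prefix.
static TemporalAccessor parse(String input, List<DateTimeFormatter> formatters) {
    for (DateTimeFormatter formatter : formatters) {
        ParsePosition pos = new ParsePosition(0);
        Format format = formatter.toFormat();
        Object result = format.parseObject(input, pos);
        if (result != null && pos.getIndex() == input.length()) {
            return (TemporalAccessor) result;
        }
    }
    throw new IllegalArgumentException("failed to parse date [" + input + "]");
}
```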
The implementation of TransportIndexAction and TransportDeleteAction as
TransportReplicationAction existed for interoperability with older 5.x nodes, as these older nodes
coordinated single index / delete operations as replication requests. This BWC layer is no longer needed in 7.x,
where these single actions are now mapped to bulk requests. Completely removing the deprecated
transport actions is not possible yet if we want to keep BWC with a 6.x transport client. The best
way here is to wait for the transport client to go away and then just remove the actions.
Each cluster state publication schedules a cancellation task with the provided publication timeout
(30s by default). This scheduled cancellation keeps a reference to the publication, and therefore the
full cluster state that was published. In case of frequently updating a large cluster state, this results
in a large number of cancellation tasks keeping references to all previously published cluster states.
FilterDirectory.getPendingDeletions does not delegate; this is fixed
temporarily by overriding it in StoreDirectory.
This in turn caused duplicate file name use after a trimUnsafeCommits
had been done, since a new IndexWriter would not consider the pending
deletes in IndexFileDeleter. This should only happen on windows (AFAIK).
Reenabled doing index updates for all tests using
IndexShardTests.indexOnReplicaWithGaps (which could fail due to above
when using mocked WindowsFS).
Added getPendingDeletions delegation to all elasticsearch
FilterDirectory subclasses that were not trivial test-only overrides to
minimize the risk of hitting this issue in another case.
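A sketch of the delegation added to the FilterDirectory subclasses (Lucene's `Directory#getPendingDeletions` returns the set of file names that are pending deletion; the class here is illustrative):
```
import java.io.IOException;
import java.util.Set;

import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FilterDirectory;

class DelegatingDirectory extends FilterDirectory {
    DelegatingDirectory(Directory in) {
        super(in);
    }

    // Without this override a FilterDirectory reports no pending deletions,
    // so a new IndexWriter may reuse a file name that the OS (notably
    // Windows) has not actually released yet.
    @Override
    public Set<String> getPendingDeletions() throws IOException {
        return in.getPendingDeletions();
    }
}
```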
This test checks that interval queries constructed against a field with no indexed
positions will throw exceptions. It uses a randomly-built IntervalsSourceProvider
against a fixed set of fields; however, the random source builder can occasionally
provide a source with a fixed field, meaning that even if the top-level query asks
for a set of intervals over a non-indexed field, the source will delegate to another
field, and no exception will be thrown.
This commit changes the test to always use a simple Match provider.
Fixes #40436
* Log Warning on Failed Blob Deletes in BlobStoreRepository
* We should not just debug-log these spots; they all can and will lead to leaked files when snapshot deletion fails.
Right now, the stats API only provides refresh metrics regarding
internal refreshes. This isn't very useful and is somewhat misleading for
cluster administrators, since the internal refreshes are not indicative
of documents being available for search.
In this PR I added a new metric for collecting external refreshes as
they occur and exposing them through the stats API. Now, calling an
endpoint for stats will yield external refresh metrics as well.
Relates #36712
In some cases, a request to perform a retention lease action can arrive
on a primary shard before it is active. In this case, the primary shard
would not yet be in primary mode, tripping an assertion in the
replication tracker. Instead, we should not attempt to perform such
actions on an initializing shard. This commit addresses this by not
returning the primary shard in the single shard iterator if the primary
shard is not yet active.
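A sketch of the guard in the single-shard iterator (the actual routing-table method differs; names here are illustrative):
```
import java.util.Collections;

// Only hand out the primary if it is active; an initializing primary is
// not yet in primary mode and cannot serve retention lease actions.
ShardRouting primary = shardRoutingTable.primaryShard();
if (primary != null && primary.active()) {
    return new PlainShardIterator(shardId, Collections.singletonList(primary));
}
return new PlainShardIterator(shardId, Collections.emptyList());
```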
We were accidentally not mapping the index, which meant dynamic mapping
was choosing floats for the values. This led to enough loss of precision
for the aggregated values to differ slightly from the test doubles,
which accumulated into large differences in the holt output.
This test fix adds an explicit mapping.
This commit adjusts the frequency with which CCR renews retention leases
and with which primaries sync retention leases to replicas. This helps
Lucene reclaim soft-deleted documents more aggressively, which we have
found in some use-cases can help improve performance, and either way
will help keep disk space under more control.
This is the equivalent of the `field_masking_span` query, allowing users to
merge intervals from multiple fields - for example, to search for stemmed tokens
near unstemmed tokens.
Currently, we cannot update the index setting index.translog.sync_interval if the index is open, because it's
not dynamic and can only be updated for a closed index.
Closes #32763
A recent refactoring (#37130) where imports got mixed up (changing Lucene's
IndexNotFoundException to Elasticsearch's IndexNotFoundException) led to many warnings being
logged in case of restoring a fresh snapshot.
This change adds an option to convert a `date` field to nanoseconds resolution
and a `date_nanos` field to millisecond resolution when sorting.
The resolution of the sort can be set using the `numeric_type` option of the
field sort builder. The conversion is done at the shard level and is restricted
to dates from 1970 to 2262 for the nanoseconds resolution in order to avoid
numeric overflow.
If a replica were first reset due to one primary failover and then
promoted (before the resync completes), its MSU would not include changes
since the global checkpoint, leading to errors during translog replay.
Fixed by re-initializing the MSU before restoring local history.
Today we don't return segments stats for closed indices which makes it
hard to tell how much memory such an index would require. With this change
we return the statistics if requested by setting `include_unloaded_segments` to
true on the rest request.
Relates to #39512
Today RareClusterStateIT#testAssignmentWithJustAddedNodes fails on my Mac
because it waits for the default connection timeout of 30 seconds to connect to
a fake node with IP address 0.0.0.0. This connection attempt fails much more
quickly on Linux so the test passes.
This commit fixes this by reducing the connection timeout for this test.
Unlike index operations, which can fail at the document level due to
analyzing errors, delete operations should never fail at the document
level whether soft-deletes is enabled or not. With this change, we will
always fail the engine if we fail to apply a delete operation to Lucene.
Closes #33256
We currently convert pipeline aggregators to their corresponding
InternalAggregation instance as part of the final reduction phase.
They arrive at the coordinating node as part of QuerySearchResult
objects from the shards and, although we may incrementally reduce
aggs (hence we may have some non-final reduces and the final
one later), all the reduction phases happen on the same node.
With CCS minimizing roundtrips though, each cluster performs its
own non-final reduction, and then serializes the results back to
the CCS coordinating node which will perform the final reduction.
This breaks the assumptions made up until now around reductions
happening all on the same node.
With #40101 we have made sure that top-level pipeline aggs are not
reduced as part of the non-final reduction. The next step is to make
sure that they don't get lost, meaning that each coordinating node
needs to send them back to the CCS coordinating node as part of
the top-level `InternalAggregations` object.
Closes #40059
Today a coordinating node forces a final reduction of sibling pipeline aggregators whenever reducing aggs, unless it is reducing aggs incrementally. This works well for incremental reduction of aggs, but breaks CCS when minimizing roundtrips as each cluster ends up reducing its own pipeline aggregators locally while that should only be done by the CCS coordinating node later. This causes issues as after their reduction, pipeline aggs cannot be further reduced, which is what happens with CCS causing errors like "java.lang.UnsupportedOperationException: Not supported" being returned.
Each coordinating node should rather honour the reduce context flag that
indicates whether we are executing a final reduce or not. If not, it should leave the sibling pipeline aggregations alone.
Note that this bug affects only pipeline aggs that don't have a parent in
the aggs tree, while all the others work well.
Relates to #40059 but does not fix it yet, as the CCS coordinating node also needs to be adapted to recreate sibling pipeline aggregators from the request.
When minimizing round-trips, each cluster returns its own independent
search response. In case sort by field and/or field collapsing were
requested, when one cluster has no results to return, the information
about the field that sorting was based on (SortField array) as well as
the field (and the values) that collapsing was performed on are missing
in the search response. That causes problems as we can't build the
proper `TopDocs` instance which would need to be either `TopFieldDocs`
or `CollapseTopFieldDocs`. The merge routine expects that all the top
docs are of the same exact type which can't be guaranteed. Given that
the problematic results are empty, hence have no impact on the final
results, we can simply skip them.
Relates to #32125
Closes #40067
When using DFS_QUERY_THEN_FETCH search type, the dfs phase is run and
its results are used in the query phase to make scoring accurate.
When using CCS, depending on whether the DFS phase runs in the CCS
coordinating node (like if all shards were local) or in each remote
cluster (when minimizing round-trips), scoring will differ.
This commit disables minimizing round-trips whenever DFS is requested,
as it is not currently possible to ensure that scoring is accurate in
that case.
Relates to #32125
When a node is repurposed to master/no-data or no-master/no-data, v7.x
will not start (see #37748 and #37347). The `elasticsearch repurpose`
tool can fix this by cleaning up the problematic data.
The cache used in the linearizability checker now uses approximately 6x less
memory by changing the cache from a set of (bits, state) tuples into a
map from bits -> { state }.
Each combination of states is kept once only, building on the
assumption that the number of state permutations is small compared to
the number of bits permutations. For those histories that are difficult
to check we will have many bits combinations that use the same state
permutations.
We now end up using approximately 15 bytes per entry compared to 101
bytes before, i.e. a 6x improvement, allowing us to linearizability-check
significantly longer histories.
Re-enabled the linearizability checker in CoordinatorTests, hoping the
above ensures we no longer run out of memory.
Resolves #39437
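A sketch of the data-structure change, with a hypothetical class standing in for the checker's actual internals:
```
import java.util.BitSet;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class LinearizabilityCache {
    // Before: a Set of (bits, state) tuples, one tuple object per pair.
    // After: one map entry per distinct bits value; the few distinct state
    // permutations are shared across the many bits permutations.
    private final Map<BitSet, Set<Object>> cache = new HashMap<>();

    /** Returns true if this (bits, state) combination was not seen before. */
    boolean add(BitSet bits, Object state) {
        return cache.computeIfAbsent(bits, k -> new HashSet<>()).add(state);
    }
}
```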
We introduced WAIT_CLUSTERSTATE action in #19287 (5.0), but then stopped
using it since #25692 (6.0). This change removes that action and related
code in 7.x and 8.0.
Relates #19287
Relates #25692
This PR introduces AsyncRecoveryTarget which executes remote calls of
peer recovery asynchronously. In this change, we also add a new
assertion to ensure that method sendBatch, which sends a batch of
history operations in phase2, is never called recursively on the same
thread. This new assertion will also be used in method sendFileChunks.
This change adds a wrapper for IndexSearcher that makes IndexSearcher#search(List, Weight, Collector) visible to
sub-classes. The wrapper is used by the ContextIndexSearcher to call this protected method on a searcher created by a plugin.
This ensures that an override of the protected method in an IndexSearcherWrapper plugin is called when a search is executed.
Closes#30758
This change adds an option to the `FieldSortBuilder` that allows transforming the type
of a numeric field into another. Possible values for this option are `long`, which transforms
the source field into an integer, and `double`, which transforms the source field into a floating point.
This new option is useful for cross-index search when the sort field is mapped differently on some
indices. For instance if a field is mapped as a floating point in one index and as an integer in another
it is possible to align the type for both indices using the `numeric_type` option:
```
{
  "sort": {
    "field": "my_field",
    "numeric_type": "double" <1>
  }
}
```
<1> Ensure that values for this field are transformed to a floating point if needed.
This commit removes the cluster state size field from the cluster state
response, and drops the backwards compatibility layer added in 6.7.0 to
continue to support this field. As calculation of this field was
expensive and had dubious value, we have elected to remove this field.
Currently, we maintain a transport name ("mock-nio", "nio", "netty")
that is passed to a `TcpTransportChannel` when a request is received.
The value of this name is to associate with the task when we register a
task with the task manager. However, it is only possible to run ES with
one transport, so having an implementation specific name is unnecessary.
This commit removes the name and replaces it with the generic
"transport".
The `sampler` agg creates a BestDocsDeferringCollector, which internally
initializes a priority queue of size `shardSize`. This queue is
populated with empty `Object` sentinels, at roughly 16b per
object.
Similarly, the diversified samplers create a DiversifiedTopDocsCollector
which internally tracks PQ slots with ScoreDocKeys, each weighing in at
around 28b.
If the user sets a very abusive `shard_size`, this could easily OOM
a node or cluster, since these PQs are allocated up-front without
any checks.
This commit makes sure that when we create the collector, its size
cannot be greater than the overall index maxDoc, so that we don't
accidentally blow up the node. A similar treatment is done for the
`maxDocsPerValue` parameter of the diversified samplers.
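A sketch of the clamping, assuming access to the shard's IndexReader at collector-creation time (names are illustrative):
```
// Illustrative sketch: never allocate a priority queue larger than the
// number of documents that could possibly fill it.
int maxDoc = searchContext.searcher().getIndexReader().maxDoc();
int effectiveShardSize = Math.min(shardSize, maxDoc);
// pass effectiveShardSize to the deferring collector instead of the raw,
// potentially abusive shard_size
```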
For good measure, this also adds in some CB accounting to try and track
memory usage.
Finally, a redundant array creation is removed to reduce a bit of
temporary memory.