OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-02-16 01:46:25 +00:00

Author	SHA1	Message	Date
David Turner	cc3364e4f8	Stats to record how often the ClusterState diff mechanism is used successfully (#26973 ) It's believed that using diffs obsoletes the other mechanism for reusing the bits of the ClusterState that didn't change between updates, but in fact we don't know for sure how often the diff mechanism works successfully. The stats collected here will tell us.	2017-10-25 07:35:25 +01:00
Lee Hinman	6bc7024f26	Tie-break shard path decision based on total number of shards on path (#27039 ) Right now if the number of shards for a particular index is equal across the data paths, we tie-break on space. This changes to tie-break first on the total number of shards for each path, and then, if that is the same, on the usable bytes. Relates to #26654 (it's a follow-up)	2017-10-24 16:12:47 -06:00
Jason Tedor	7a792d2c1f	Timed runnable should delegate to abstract runnable If timed runnable wraps an abstract runnable, then it should delegate to the abstract runnable otherwise force execution and handling rejections is dropped on the floor. Thus, timed runnable should itself be an abstract runnable delegating all methods to the wrapped runnable in cases when it is an abstract runnable. This commit causes this to be the case. Relates #27095	2017-10-24 11:36:50 -04:00
Lee Hinman	fcfbdf1f37	Expose adaptive replica selection stats in /_nodes/stats API This exposes the collected metrics we store for ARS in the nodes stats, as well as the computed rank of nodes. Each node exposes its perspective about the cluster. Here's an example output (with `?human`): ```json ... "adaptive_selection" : { "_k6v1-wERxyUd5ke6s-D0g" : { "outgoing_searches" : 0, "avg_queue_size" : 0, "avg_service_time" : "7.8ms", "avg_service_time_ns" : 7896963, "avg_response_time" : "9ms", "avg_response_time_ns" : 9095598, "rank" : "9.1" }, "VJiCUFoiTpySGmO00eWmtQ" : { "outgoing_searches" : 0, "avg_queue_size" : 0, "avg_service_time" : "1.3ms", "avg_service_time_ns" : 1330240, "avg_response_time" : "4.5ms", "avg_response_time_ns" : 4524154, "rank" : "4.5" }, "DHNGTdzyT9iiaCpEUsIAKA" : { "outgoing_searches" : 0, "avg_queue_size" : 0, "avg_service_time" : "2.1ms", "avg_service_time_ns" : 2113164, "avg_response_time" : "6.3ms", "avg_response_time_ns" : 6375810, "rank" : "6.4" } } ... ```	2017-10-24 08:58:42 -06:00
David Turner	cf2d0834f5	Remove duplicated test (#27091 )	2017-10-24 11:52:01 +01:00
Nhat	bf557fd886	test: avoid generating duplicate multiple fields (#27080 ) Multifields parser does not allow duplicate values, however the MultiFieldTests may produce duplicate field values. See https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+release-tests/132/console.	2017-10-23 09:59:40 -04:00
Adrien Grand	d0104c22a5	Reduce the default number of cached queries. (#26949 ) Memory usage of queries can't be properly accounted, which can be an issue when large queries are cached since the actual memory usage will be much higher than what the cache thinks. This problem is very hard if not impossible to fix so as a workaround I would like to decrease the maximum number of cached queries so that this problem is less likely to cause trouble in practice. For the record, this problem is more likely to occur in environments that have small shards or don't give much memory to the JVM. Closes #26938	2017-10-23 14:11:35 +02:00
Jason Tedor	35984a616e	Keep cumulative elapsed scroll time in microseconds Today we internally accumulate elapsed scroll time in nanoseconds. The problem here is that this can reasonably overflow. For example, on a system with scrolls that are open for ten minutes on average, after sixteen million scrolls the largest value that can be represented by a long will be exceeded. To address this, we switch to internally representing scrolls using microseconds as this enables with the same number of scrolls scrolls that are open for seven days on average, or with the same average elapsed time sixteen billion scrolls which will never happen (executing one scroll a second until sixteen billion have executed would not occur until more than five-hundred years had elapsed). Relates #27068	2017-10-21 13:18:28 +02:00
Tanguy Leroux	463e7e6fa3	Revert "Upgrade to Jackson 2.9.2 (#27032 )" This reverts commit 0b9acc5acea90887cfab666a05cb6d3cd8aa1e02.	2017-10-20 08:25:41 +02:00
Tanguy Leroux	0b9acc5ace	Upgrade to Jackson 2.9.2 (#27032 ) Upgrade to Jackson 2.9.2 and also use a boolean `closed` flag to indicate that a FastStringReader instance is closed, so that length is still correctly reported after the reader is closed.	2017-10-19 15:15:02 +02:00
Martijn van Groningen	87c9b79b10	Return the _source of inner hit nested as is without wrapping it into its full path context Due to a change that happened via #26102 to make the nested source consistent with or without source filtering, the _source of a nested inner hit was always wrapped in the parent path. This turned out to be not ideal for users relying on the nested source, as it would require additional parsing on the client side. This change fixes this, the _source of nested inner hits is now no longer wrapped by parent json objects, regardless of whether the _source is included as is or source filtering is used. Internally source filtering and highlighting rely on the fact that the _source of nested inner hits is accessible by its full field path, so in order to not break this, the conversion of the _source into its binary form is performed in FetchSourceSubPhase, after any potential source filtering is performed to make sure the structure of _source of the nested inner hit is consistent regardless of whether source filtering is performed. PR for #26944 Closes #26944	2017-10-19 12:04:56 +02:00
Alexander Kazakov	9a3a1cd1b7	Handle leniency for cross_fields type in multi_match query (#27045 )	2017-10-19 10:29:28 +02:00
Stephen Yeargin	8a05e5b92c	Fix typo in thrown exception in IndicesAliasesRequest (#27025 ) There is a typo in the exception thrown in `IndicesAliasesRequest`. This PR corrects the spelling and removes extraneous word.	2017-10-18 13:54:16 +00:00
Lee Hinman	78c54c4560	Balance shards for an index more evenly across multiple data paths (#26654 ) * Balance shards for an index more evenly across multiple data paths When a node has multiple data paths configured, and is assigned all of the shards for a particular index, it's possible now that all shards will be assigned to the same path (see #16763). This change keeps the same behavior around determining the "best" path for a shard based on space, however, it enforces limits for the number of shards on a path for an index from the single-node perspective. For example: Assume you had a node with 4 data paths, where `/path1` has a tremendously high amount of disk space available compared to the other paths. If you create an index with 5 primary shards, the previous behavior would be to assign all 5 shards to `/path1`. This change would enforce a limit of 2 shards to each data path for that particular node, so you would end up with the following distribution: - `/path1` - 2 shards (because it has the most usable space) - `/path2` - 1 shard - `/path3` - 1 shard - `/path4` - 1 shard Note, however, that this limit is only enforced at the local node level for simplicity in implementation, so if you had multiple nodes, the "limit" for the node is still 2, so assuming you had enough nodes that there was only 2 shards for this index assigned to this node, they would still both be assigned to `/path1`. * Switch from ObjectLongHashMap to regular HashMap * Remove unneeded Files.isDirectory check * Skip iterating directories when not necessary * Add message to assert * Implement different (better) ranking for node paths This is the method we discussed * Remove unused pathHasEnoughSpace method * Use findFirst instead of .get(0); * Update for master merge to fix compilation Settings.putArray -> Settings.putList	2017-10-17 05:49:24 -06:00
Jason Tedor	62bf3c11a9	Stop invoking non-existent syscall Today when getting ready to enter seccomp, we do some probes to ensure that we are really talking to seccomp, etc. One of these probes is pure paranoia. The paranoia was driven by a kernel bug (https://lkml.org/lkml/2014/7/20/222) that only impacted 32-bit x86 kernels wherein invoking a non-existent syscall was not returning ENOSYS (as it should). This probe causes problems though, for example in containers with syscall filters, invoking a non-existent syscall will lead to the process being sent SIGSYS and terminated. We do not need this paranoia, we do not support 32-bit, and our other probes give us enough of a defense to ensure that we are talking to seccomp (and we hardcode the seccomp syscall number for platforms that we support). Given that this probe offers us little value, but does cause problems in valid use-cases, this commit removes this paranoia. Relates #27016	2017-10-17 11:34:44 +02:00
Jason Tedor	3664ede9b5	Remove unnecessary exception for engine constructor The internal engine constructor declares a checked engine exception yet this constructor does not actually throw this exception. This commit removes this declaration from the internal engine constructor. Relates #27022	2017-10-16 10:17:37 -04:00
Simon Willnauer	8dda827ff4	Don't refresh on `_flush` `_force_merge` and `_upgrade` (#27000 ) Today all these API calls have a side effect of making documents visible to search requests. While this is sometimes desired it's an unnecessary side effect and now that we have an internal (engine-private) index reader (#26972) we artificially add a refresh call for bwc. This change removes this side effect in 7.0.	2017-10-16 10:16:35 +02:00
Tim Brooks	277637f42f	Do not set SO_LINGER on server channels (#26997 ) Right now we are attempting to set SO_LINGER to 0 on server channels when we are stopping the tcp transport. This is not a supported socket option and throws an exception. This also prevents the channels from being closed. This commit 1. doesn't set SO_LINGER for server channels, 2. checks that it is a supported option in nio, and 3. changes the log message to warn for server channel close exceptions.	2017-10-13 13:06:38 -06:00
olcbean	bb013c60b5	Fix inconsistencies in the rest api specs for *_script (#26971 )	2017-10-13 11:20:34 -07:00
Simon Willnauer	cae1790492	[TEST] Add test that replicates versioned updates with random flushes	2017-10-13 11:54:51 +02:00
Simon Willnauer	a517758432	Use internal searcher for all indexing related operations in the engine The changes introduced in #26972 missed two places where an internal searcher should be used.	2017-10-13 11:41:00 +02:00
Tim Brooks	e40c597b67	Fix reference to TcpTransport in documentation	2017-10-12 13:27:24 -06:00
Simon Willnauer	047a916169	Allow Uid#decodeId to decode from a byte array slice (#26987 ) Today we only allow to decode byte arrays where the data has a 0 offset and the same length as the array. Allowing to decode stuff from a slice will make decoding IDs cheaper if the ID is for instance coming from a term dictionary or BytesRef. Relates to #26931	2017-10-12 20:19:14 +02:00
Simon Willnauer	21eb9bdf6a	Use separate searchers for "search visibility" vs "move indexing buffer to disk" (#26972 ) Today, when ES detects it's using too much heap vs the configured indexing buffer (default 10% of JVM heap) it opens a new searcher to force Lucene to move the bytes to disk, clear version map, etc. But this has the unexpected side effect of making newly indexed/deleted documents visible to future searches, which is not nice for users who are trying to prevent that, e.g. #3593. This is also an indirect spinoff from #26802 where we potentially pay a big price on rebuilding caches etc. when updates / realtime-get is used. We are refreshing the internal reader for realtime gets which causes for instance global ords to be rebuilt. I think we can gain quite a bit if we'd use a reader that is only used for GETs and not for searches etc. that way we can also solve problems of searchers being refreshed unexpectedly aside of replica recovery / relocation. Closes #15768 Closes #26912	2017-10-12 17:19:43 +02:00
Colin Goodheart-Smithe	e1679bfe5e	Create weights lazily in filter and filters aggregation (#26983 ) Previous to this change the weights for the filter and filters aggregation were created in the `Filter(s)AggregatorFactory` which meant that they were created regardless of whether the aggregator actually collects any documents. This meant that for filters that are expensive to initialise, requests would not be quick when the query of the request was (or effectively was) a `match_none` query. This change maintains a single Weight instance for each filter across parent buckets but passes a weight supplier to the aggregator instances which will create the weight on first call and then return that instance for subsequent calls.	2017-10-12 14:56:32 +01:00
Jason Tedor	f81ee225ff	Fire global checkpoint sync under system context The global checkpoint sync action should fire under the system context since it is not a user-facing management action. Relates #26984	2017-10-12 09:18:58 -04:00
kel	2e36f19051	Add support for parsing inline script (#23824 ) (#26846 ) * Add support for parsing inline script (#23824) * Fix test	2017-10-11 09:15:37 -07:00
Alexander Kazakov	592ab043dd	Change default value to true for transpositions parameter of fuzzy query (#26901 )	2017-10-11 15:31:48 +02:00
Yannick Welsch	d97b21d1da	Adding unreleased 5.6.4 version number to Version.java	2017-10-11 08:40:30 +02:00
Tim Brooks	4a870d8c63	Rename TCPTransportTests to TcpTransportTests (#26954 ) Our convention is to use lower case when naming things "Tcp". For example, `TcpTransport`. This commit renames the outlier (TcpTransportTests) to use lower case.	2017-10-10 20:15:14 -06:00
Nhat	b63f718a0b	Fix NPE for /_cat/indices when no primary shard (#26953 ) When a node which contains the primary shard is unavailable, the primary stats (and the total stats) of an `IndexStats` will be empty for a short moment (while the primary shard is being relocated). However, we assume that these stats are always non-empty when handling `_cat/indices` in RestIndicesAction. This commit checks the content of these stats before accessing. Closes #26942	2017-10-10 17:04:55 -04:00
Jason Tedor	4c06b8f1d2	Check for closed connection while opening While opening a connection to a node, a channel can subsequently close. If this happens, a future callback whose purpose is to close all other channels and disconnect from the node will fire. However, this future will not be ready to close all the channels because the connection will not be exposed to the future callback yet. Since this callback is run once, we will never try to disconnect from this node again and we will be left with a closed channel. This commit adds a check that all channels are open before exposing the channel and throws a general connection exception. In this case, the usual connection retry logic will take over. Relates #26932	2017-10-10 13:34:51 -04:00
Tanguy Leroux	6658ff0fd6	Don't detect source's XContentType in DocumentParser.parseDocument() (#26880 ) DocumentParser.parseDocument() auto detects the XContentType of the document to parse, but this information is already provided by SourceToParse.	2017-10-10 15:31:56 +02:00
hanbj	3ab27d16ad	Fix thread context handling of headers overriding (#26068 ) Previously collisions in headers between old and new contexts could be silently ignored, allowing the original context's headers to "win". This commit fixes the headers to require they are disjoint.	2017-10-09 14:41:09 -07:00
Boaz Leskes	84742690cd	SearchWhileCreatingIndexIT: remove usage of _only_nodes The `_only_nodes` preference was used as a replacement of `_primary` which was removed. Sadly, it's not the same as we also check that it makes sense - i.e., that the given node has a shard copy. Since the test uses indices with >1 shards, the primaries may be spread to multiple nodes. Using one (like it currently does) will fail for some primaries. Using all will probably end up hitting all nodes. This commit removes the `_only_nodes` usage in favor of a simple search. Relates to #26791	2017-10-09 19:37:19 +02:00
Martijn van Groningen	96823b0480	update Lucene version for 6.0-RC2 version	2017-10-09 15:27:06 +02:00
kel	1d4f70210f	Calculate and cache result when advanceExact is called (#26920 ) Cache final result instead of result of advanceExact. Fix SortedNumericDoubleValues does not test MEDIAN mode Replace deprecated random string generation method	2017-10-09 14:02:38 +02:00
Simon Willnauer	cdd7c1e6c2	Return List instead of an array from settings (#26903 ) Today we return a `String[]` that requires copying values for every access. Yet, we already store the setting as a list so we can also directly return the unmodifiable list directly. This makes list / array access in settings a much cheaper operation especially if lists are large.	2017-10-09 09:52:08 +02:00
Nhat	bf4c3642b2	remove _primary and _replica shard preferences (#26791 ) The shard preference _primary, _replica and its variants were useful for the asynchronous replication. However, with the current impl, they are no longer useful and should be removed. Closes #26335	2017-10-08 11:03:06 -04:00
kel	100e3c9a8a	Remove UnsortedNumericDoubleValues (#26817 ) Closes #24086	2017-10-06 16:31:50 +02:00
Thomas Kappler	16431a6601	Fix IndexOutOfBoundsException in histograms for NaN doubles (#26787 ) (#26856 )	2017-10-06 16:27:01 +02:00
Jim Ferenczi	e8f72353d8	Fix search_after with geo distance sorting (#26891 ) Support for search_after and geo distance sorting is broken when the optimized LatLonDocValuesField.distanceSort is used. This commit fixes the parsing of the search_after value for this case.	2017-10-06 11:34:33 +02:00
Jason Tedor	470e5e7cfc	Add additional low-level logging handler () * Add additional low-level logging handler We have the trace handler which is useful for recording sent messages but there are times where it would be useful to have more low-level logging about the events occurring on a channel. This commit adds a logging handler that can be enabled by setting a certain log level (org.elasticsearch.transport.netty4.ESLoggingHandler) to trace that provides trace logging on low-level channel events and includes some information about the request/response read/write events on the channel as well. * Remove imports * License header * Remove redundant * Add test * More assertions	2017-10-05 12:10:58 -04:00
Simon Willnauer	8583727590	[TEST] add test to ensure legacy list syntax in yml works fine	2017-10-05 14:41:51 +02:00
Simon Willnauer	41925e1171	Bump BWC version for settings serialization to 6.1.0	2017-10-05 14:07:05 +02:00
Md. Abdulla-Al-Sun	a40c474e10	Added Bengali Analyzer to Elasticsearch with respect to the lucene update(PR#238)	2017-10-05 13:25:05 +02:00
kel	a978ddf37b	Fix toString() in SnapshotStatus (#26852 ) Closes #26851	2017-10-05 12:57:46 +02:00
Jim Ferenczi	24359c1a75	#26870 change bwc version for fuzzy_transpositions to 6.1 after backport	2017-10-05 11:28:59 +02:00
Martijn van Groningen	b863eaff4d	update lucene version after upgrading Lucene deps in 6.x branch too	2017-10-05 09:49:30 +02:00
Simon Willnauer	00dfdf50cf	Represent lists as actual lists inside Settings (#26878 ) Today we represent each value of a list setting with its own dedicated key that ends with the index of the value in the list. Aside from the obvious weirdness this has several issues, especially if lists are massive, since it causes massive runtime penalties when validating settings. For example, a list of 100k words will literally cause a create index call to time out and in turn cause a massive slowdown on all subsequent validation runs. With this change we use a simple string list to represent the list. This change also forbids adding a setting that ends with a .0, which was internally used to detect a list setting. Once this has been rolled out for an entire major version all the internal .0 handling can be removed since all settings will be converted. Relates to #26723	2017-10-05 09:27:08 +02:00
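
The overflow arithmetic in commit 35984a616e ("Keep cumulative elapsed scroll time in microseconds") can be checked with a few lines of Java. This is only a sketch of the calculation described in that commit message, not code from the repository; the class name `ScrollTimeOverflow` and both method names are hypothetical and chosen for illustration.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the code base: it only reproduces the
// arithmetic from the commit message for a signed 64-bit accumulator.
public class ScrollTimeOverflow {

    // Number of scrolls with the given average duration (in minutes) that fit
    // into a long accumulator kept in the given unit before it overflows.
    static long scrollsUntilOverflow(long averageMinutes, TimeUnit accumulatorUnit) {
        long perScroll = accumulatorUnit.convert(averageMinutes, TimeUnit.MINUTES);
        return Long.MAX_VALUE / perScroll;
    }

    // Largest average duration per scroll, in whole hours, that the accumulator
    // can absorb for the given number of scrolls.
    static long maxAverageHours(long scrolls, TimeUnit accumulatorUnit) {
        return accumulatorUnit.toHours(Long.MAX_VALUE / scrolls);
    }

    public static void main(String[] args) {
        // Ten-minute scrolls accumulated in nanoseconds vs. microseconds.
        System.out.println(scrollsUntilOverflow(10, TimeUnit.NANOSECONDS));
        System.out.println(scrollsUntilOverflow(10, TimeUnit.MICROSECONDS));
        // Sixteen million scrolls accumulated in microseconds.
        System.out.println(maxAverageHours(16_000_000L, TimeUnit.MICROSECONDS));
    }
}
```

With ten-minute scrolls, the nanosecond accumulator holds roughly 15.4 million scrolls and the microsecond accumulator roughly 15.4 billion, which the commit message rounds to sixteen million and sixteen billion. For sixteen million scrolls in microseconds, the average may reach about 160 hours, or close to seven days.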

8951 Commits