OpenSearch

Commit Graph

Author	SHA1	Message	Date
Yannick Welsch	1a20760d79	Simplify IndexShard indexing and deletion methods (#25249 ) Indexing or deleting documents through the IndexShard interface is quite complex and error-prone. It requires multiple calls, e.g. first prepareIndexOnPrimary, then do some checks if mapping updates have occurred, then do the actual indexing using index(...) etc. Currently each consumer of the interface (local recovery, peer recovery, replication) has additional custom checks built around it to deal with mapping updates, some of which are even inconsistent. This commit aims at reducing the complexity by exposing a simpler interface on IndexShard. There are no more prepare*** methods and the mapping complexity is also hidden, but still giving callers a possibility to implement custom logic to deal with mapping updates.	2017-06-19 20:11:54 +02:00
David Kyle	d1be2ecfdb	Initialise empty lists in BaseTaskResponse constructor (#25290 ) * Initialise empty lists in BaseTaskResponse constructor * Remove little used default constructor which leaves uninitialised members	2017-06-19 16:37:21 +01:00
Luca Cavanna	d9ec2a23c5	Remove (deprecated) support for '+' in index expressions (#25274 ) Relates to #24515	2017-06-19 15:19:17 +02:00
Tanguy Leroux	e4f4886d40	[Test] Extend parsing checks for DocWriteResponses (#25257 ) This commit changes the parsing logic of DocWriteResponse, ReplicationResponse and GetResult so that it skips any unknown additional fields (for forward compatibility reasons). This affects the IndexResponse, UpdateResponse,DeleteResponse and GetResponse objects.	2017-06-19 13:19:09 +02:00
Martijn van Groningen	bcaa413b0b	test: Port the remaining old indices search tests to full cluster restart qa module Also tweaked the qa module's gradle file to actually run bwc tests against all index compat versions. Relates to #24939	2017-06-19 12:27:24 +02:00
Simon Willnauer	dc02b32650	Simplify connection closing and cleanups in TcpTransport (#25250) Today we maintain a map of open connections in order to close them when a low level channel gets closed or handles a failure. We also spawn a thread due to some tricky concurrency issues especially with respect to netty since the listener might be called on a transport / boss thread. Executions on those threads must not be blocking since otherwise we will likely deadlock the event processing which adds to the complexity of the concurrency model in this class. This change associates the connection with the close callback that every channel invokes once it's closed which allows us to remove the connections map. A relaxed non-blocking concurrency model in the connection close listener allows cleaning up connected nodes without blocking on any lock.	2017-06-19 09:19:45 +02:00
Boaz Leskes	7291aba8ae	enable debug logging for testMasterFailoverDuringIndexingWithMappingChanges	2017-06-18 22:40:13 +02:00
Jason Tedor	4c28e781dd	Fix failing delete index test This test is failing because delete /{index} requests no longer support index matching an alias. This commit removes testing such requests against aliases. Closes #25284	2017-06-18 15:32:43 -04:00
Christoph Büscher	3f9f713b44	Add AwaitsFix on IndicesRequestIT due to #25284	2017-06-18 18:56:41 +02:00
Christoph Büscher	e99ced06cc	[Tests] Check that parsing aggregations works in a forward compatible way (#25219 ) This change adds tests for the aggregation parsing that try to simulate that we can parse existing aggregations in a forward compatible way in the future, ignoring potential newly added fields or substructures to the xContent response.	2017-06-17 13:06:31 +02:00
Ali Beyad	0c697348f4	Adds AwaitsFix on snapshot test failing due to #25281	2017-06-16 16:57:01 -04:00
Simon Willnauer	f18b0d293c	Move TransportStats accounting into TcpTransport (#25251 ) Today TcpTransport is the de-facto base-class for transport implementations. The need for all the callbacks we have in TransportServiceAdaptor are not necessary anymore since we can simply have the logic inside the base class itself. This change moves the stats metrics directly into TcpTransport removing the need for low level bytes send / received callbacks.	2017-06-16 22:34:11 +02:00
Nik Everett	ecc87f613f	Move pre-configured "keyword" tokenizer to the analysis-common module (#24863 ) Moves the keyword tokenizer to the analysis-common module. The keyword tokenizer is special because it is used by CustomNormalizerProvider so I pulled it out into its own PR. To get the move to work I've reworked the lookup from static to one using the AnalysisRegistry. This seems safe enough. Part of #23658.	2017-06-16 11:48:15 -04:00
Luca Cavanna	b5cea6980b	Delete index API to work only against concrete indices (#25268) With #23997 we have introduced a new internal index option that allows to resolve index expressions only against concrete indices while ignoring aliases. Such index option was applied to IndicesAliasesRequest, so that the index part of alias actions would only be resolved against concrete indices. Same is done in this commit with delete index request. Deleting aliases has always been confusing as some users expect it to only remove the alias from the index (which has its own specific API). Even worse, in case of filtered aliases, deleting an alias may leave users with the expectation that only the documents that match the filter are deleted, which was never the case. To address all this confusion, delete index api works now only against concrete indices. Wildcard expressions will be only resolved against concrete indices, as if aliases didn't exist. If one tries to delete against an alias, an IndexNotFoundException will be thrown regardless of whether the alias exists or not, as a concrete index with such a name doesn't exist. Closes #2318	2017-06-16 17:46:01 +02:00
Boaz Leskes	9ddea539f5	Introduce translog size and age based retention policies (#25147) This PR extends the TranslogDeletionPolicy to allow keeping the translog files longer than what is needed for recovery from lucene. Specifically, we allow specifying the total size of the files and their maximum age (i.e., keep up to 512MB but no longer than 12 hours). This will allow making ops based recoveries more common. Note that the default size and age are still set to 0, maintaining current behavior. This is needed as the other components in the system are not yet ready for a longer translog retention. I will adapt those in follow up PRs. Relates to #10708	2017-06-16 09:09:51 +02:00
Ali Beyad	350125ed2a	Improves snapshot logging and snapshot deletion error handling (#25264) This commit does two things: 1. Adds logging at the DEBUG level for when the index-N blob is updated. 2. When attempting to delete a snapshot, if the snapshot was not found in the repository data, an exception is now thrown instead of silently ignoring the lack of presence of the snapshot in the repository data.	2017-06-15 19:43:19 -04:00
Christoph Büscher	d3442f7d0c	Add unit test for PathHierarchyTokenizerFactory (#24984 )	2017-06-15 19:18:33 +02:00
Guillaume Le Floch	a9014dfcc5	Deprecate tribe service This commit deprecates the tribe service so that deprecation log messages are delivered if a tribe node is configured. Relates #24598	2017-06-15 12:41:05 -04:00
Martijn van Groningen	428e70758a	Moved more token filters to analysis-common module. The following token filters were moved: `edge_ngram`, `ngram`, `uppercase`, `lowercase`, `length`, `flatten_graph` and `unique`. Relates to #23658	2017-06-15 18:28:31 +02:00
Jim Ferenczi	2a78b0a19f	[Test] Make sure that SearchAfterSortedDocQueryTests uses a single threaded searcher	2017-06-15 18:13:38 +02:00
markharwood	7a3155368c	Test fix - removed superfluous assertion (#25247 ) Closes #25245	2017-06-15 16:29:25 +01:00
Martijn van Groningen	fe02829aac	test: Ported more OldIndexBackwardsCompatibilityIT tests to full cluster restart qa tests. (#25173 ) Relates to #24939	2017-06-15 14:48:06 +02:00
Adrien Grand	1b90c46a53	Allow reader wrappers to have different live docs but the same cache key. Relates to #19856	2017-06-15 13:51:46 +02:00
Boaz Leskes	648b4717a4	move assertBusy to use CheckException (#25246) We use assertBusy in many places where the underlying code throws exceptions. Currently we need to wrap those exceptions in a RuntimeException which is ugly.	2017-06-15 13:24:07 +02:00
Tanguy Leroux	27f1206999	Use SPI in High Level Rest Client to load XContent parsers (#25098 ) This commit adds a NamedXContentProvider interface that can be implemented by plugins or modules using Java's SPI feature in order to provide additional NamedXContent parsers to external applications like the Java High Level Rest Client.	2017-06-15 12:50:02 +02:00
Adrien Grand	5a6fa62844	Speed up PK lookups at index time. (#19856 ) At index time Elasticsearch needs to look up the version associated with the `_id` of the document that is being indexed, which is often the bottleneck for indexing. While reviewing the output of the `jfr` telemetry from a Rally benchmark, I saw that significant time was spent in `ConcurrentHashMap#get` and `ThreadLocal#get`. The reason is that we cache lookup objects per thread and segment, and for every indexed document, we first need to look up the cache associated with this segment (`ConcurrentHashMap#get`) and then get a state that is local to the current thread (`ThreadLocal#get`). So if you are indexing N documents per second and have S segments, both these methods will be called N*S times per second. This commit changes version lookup to use a cache per index reader rather than per segment. While this makes cache entries live for less long, we now only need to do one call to `ConcurrentHashMap#get` and `ThreadLocal#get` per indexed document.	2017-06-15 10:17:42 +02:00
Adrien Grand	0c117145f6	Upgrade to lucene-7.0.0-snapshot-92b1783. (#25222 ) This snapshot has faster range queries on range fields (LUCENE-7828), more accurate norms (LUCENE-7730) and the ability to use fake term frequencies (LUCENE-7854).	2017-06-15 09:52:07 +02:00
Ryan Ernst	caf7792db1	Scripting: Rename SearchScript.needsScores to needs_score (#25235 ) This commit renames the needsScores method so as to make it automatically generatable, based on the name of the `_score` variable which is available in search scripts. It also adds documentation to ScriptContext to explain the naming and signature of such methods.	2017-06-14 22:01:19 -07:00
Jim Ferenczi	68deda6d03	FastVectorHighlighter should not cache the field query globally (#25197 ) This commit removes the global caching of the field query and replaces it with a caching per field. Each field can use a different `highlight_query` and the rewriting of some queries (prefix, automaton, ...) depends on the targeted field so the query used for highlighting must be unique per field. There might be a small performance penalty when highlighting multiple fields since the query needs to be rewritten once per highlighted field with this change. Fixes #25171	2017-06-15 00:33:01 +02:00
Lee Hinman	4a30e23365	Remove QUERY_AND_FETCH BWC for pre-5.3.0 nodes (#25223) * Remove QUERY_AND_FETCH BWC for pre-5.3.0 nodes This was a BWC layer where we explicitly set the `search_type` to "query_and_fetch" when a single node is queried on pre-5.3 nodes. Since 6.0 no longer needs to be compatible with 5.3 nodes, this can be removed. * Fix indentation * Remove unused QUERY_FETCH_ACTION_NAME constant	2017-06-14 15:42:29 -06:00
Zachary Tong	52719b2118	Add more missing AggregationBuilder getters (#25198 ) * Add more missing AggregationBuilder getters - getMetadata for all aggs - various getters on TermsAggBuilder (without "get" prefix to maintain convention) - Also makes InternalSum's ctor public, to follow suit of other metrics (min/max/avg/etc)	2017-06-14 14:31:01 -04:00
Lee Hinman	aa3134c093	Refactor TransportShardBulkAction.executeUpdateRequest and add tests This splits `executeUpdateRequest` into separate parts and adds some unit tests for the behavior in it. The actual behavior has not been changed.	2017-06-14 09:27:58 -06:00
Adrien Grand	cadd31b3a8	Make sure range queries are correctly profiled. (#25108 ) We introduced a new API for ranges in order to be able to decide whether points or doc values would be more appropriate to execute a query, but since `ProfileWeight` does not implement this API, the optimization is disabled when profiling is enabled.	2017-06-14 16:31:16 +02:00
Martijn van Groningen	e333955557	Remove PrefixAnalyzer, because it is no longer used.	2017-06-14 08:59:10 +02:00
Ryan Ernst	9ec1fc7b02	Internal: Remove Strings.cleanPath (#25209 ) This commit removes the cleanPath method, in favor of using java's Path.normalize().	2017-06-13 21:09:45 -07:00
Simon Willnauer	bc7ec68e76	Add Cross Cluster Search support for scroll searches (#25094 ) To complete the cross cluster search capabilities for all search types and function this change adds cross cluster search support for scroll searches.	2017-06-13 17:22:49 +02:00
Sergey Galkin	1c95cbc4e8	Rollover max docs should only count primaries (#24977 ) max_doc condition for index rollover should use document count only from primary shards Fixes #24217	2017-06-13 14:30:46 +02:00
Simon Willnauer	01d7c217f6	Add remote cluster infrastructure to fetch discovery nodes. (#25123 ) In order to add scroll support for cross cluster search we need to resolve the nodes encoded in the scroll ID to send requests to the corresponding nodes. This change adds the low level connection infrastructure that also ensures that connections are re-established if the cluster is disconnected due to a network failure or restarts. Relates to #25094	2017-06-13 14:23:56 +02:00
Simon Willnauer	186c16ea41	Ensure pending transport handlers are invoked for all channel failures (#25150) Today if a channel gets closed due to a disconnect we notify the response handler that the connection is closed and the node is disconnected. Unfortunately this is not a complete solution since it only works for published connections. Connections that are unpublished, i.e. for discovery, can indefinitely hang since we never invoke their handlers when we get a failure while a user is waiting for the response. This change adds connection tracking to TcpTransport that ensures we are notifying the corresponding connection if there is a failure on a channel.	2017-06-13 09:37:05 +02:00
Lee Hinman	ee1113c902	Tweak AggregatorBase.addRequestCircuitBreakerBytes This modifies a method Mark added to the AggregatorBase that allows aggregations to add additional memory tracking for datastructures used during execution. If an aggregation would like to reclaim circuit breaker reserved bytes by adding a negative number, `addWithoutBreaking` should be used instead of `addEstimateBytesAndMaybeBreak`. Resolves #24511	2017-06-12 12:55:50 -06:00
Jason Tedor	bb66f3b76b	Explicitly reject duplicate data paths Duplicate data paths already fail to work because we would attempt to take out a node lock on the directory a second time which will fail after the first lock attempt succeeds. However, how this failure manifests is not apparent at all and is quite difficult to debug. Instead, we should explicitly reject duplicate data paths to make the failure cause more obvious. Relates #25178	2017-06-12 12:55:19 -04:00
Jason Tedor	982900eabf	Do not swallow node lock failed exception When attempting to obtain the node lock, if an exception is thrown it is not logged. This makes debugging difficult. This commit causes such an exception to be logged. Relates #25176	2017-06-12 11:42:45 -04:00
markharwood	518cda6637	Aggregations bug: Significant_text fails on arrays of text. (#25030 ) * Aggregations bug: Significant_text fails on arrays of text. The set of previously-seen tokens in a doc was allocated per-JSON-field string value rather than once per JSON document meaning the number of docs containing a term could be over-counted leading to exceptions from the checks in significance heuristics. Added unit test for this scenario Closes #25029	2017-06-12 14:02:54 +01:00
Jim Ferenczi	7ab3d5d04a	Speed up sorted scroll when the index sort matches the search sort (#25138) Sorted scroll search can use early termination when the index sort matches the scroll search sort. The optimization can be done after the first query (which still needs to collect all documents) by applying a query that only matches documents that are greater than the last doc retrieved in the previous request. Since the index is sorted, retrieving the list of documents that are greater than the last doc only requires a binary search on each segment. This change introduces this new query called `SortedSearchAfterDocQuery` and applies it when possible. Scrolls with this optimization will search all documents on the first request and then will early terminate each segment after $size doc for any subsequent requests. Relates #6720	2017-06-12 09:33:30 +02:00
Boaz Leskes	f34136eda4	TranslogTests.testWithRandomException ignored a possible simulated OOM when trimming files	2017-06-12 08:32:55 +02:00
Boaz Leskes	cfb5f6a5a6	Adapt TranslogTests.testWithRandomException to checkpoint syncing on trim #25005 changed the translog dynamic to fsync the checkpoint before trimming a file. This changed the dynamics of potential failure modes which requires a change to testWithRandomException - it's now possible that we had an exception but the translog was trimmed. Closes #25133	2017-06-11 23:17:10 +02:00
Jason Tedor	dcf57f296e	Fix get mappings HEAD requests Get mappings HEAD requests incorrectly return a content-length header of 0. This commit addresses this by removing the special handling for get mappings HEAD requests, and just relying on the general mechanism that exists for handling HEAD requests in the REST layer. Relates #23192	2017-06-11 14:58:56 -04:00
Boaz Leskes	9b8754e4c2	TranslogTests#commit didn't allow for a concurrent closing of a view The view closing will trim unneeded files but there is a small window where they may still be around.	2017-06-11 19:09:01 +02:00
Jason Tedor	7182577904	Fix handling of exceptions thrown on HEAD requests Today when an exception is thrown handling a HEAD request, the body is swallowed before the channel has a chance to see it. Yet, the channel is where we compute the content length that would be returned as a header in the response. This is a violation of the HTTP specification. This commit addresses the issue. To address this issue, we remove the special handling in bytes rest response for HEAD requests when an exception is thrown. Instead, we let the upstream channel handle the special case, as we already do today for the non-exceptional case. Relates #25172	2017-06-10 23:44:18 -04:00
Jason Tedor	5108fa7529	Remove unneeded weak reference from prefix logger We have a custom logger implementation known as a prefix logger that is used to write every message by the logger with a given prefix. This is useful for node-level, index-level, and shard-level messages where we want to log the node name, index name, and shard ID, respectively, if possible. The mechanism that we employ is that of a marker. Log4j has a built-in facility for managing these markers, but it's effectively a memory leak because these markers are held in a map and can never be released. This is problematic for us since indices and shards do not necessarily have infinite life spans and so on a node where there are many indices being created and destroyed, this infinite lifespan can be a problem indeed. To solve this, we use our own cache of markers. This is necessary to prevent too many instances of the marker for the same prefix from being created (just think of all the shard-level components that exist in the system), and to work around the effective leak in Log4j. These markers are stored as weak references in a weak hash map. It is these weak references that are unneeded. When a key is removed from a weak hash map, the corresponding entry is placed on a reference queue that is eventually cleared. This commit simplifies prefix logger by removing this unnecessary weak reference wrapper. Relates #22460	2017-06-10 13:20:45 -04:00
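
Commit 9ec1fc7b02 in the table above removes Strings.cleanPath in favor of Java's Path.normalize(). The following is a minimal sketch of what that standard-library call does; the class name NormalizeSketch and the helper normalize are hypothetical and are not taken from the repository. Path.normalize() works purely on the path's syntax and does not touch the file system, so it may change what a path refers to when symbolic links are involved.

```java
import java.nio.file.Paths;

public class NormalizeSketch {

    // Hypothetical helper (not from the repository): collapses redundant
    // "." and "name/.." elements using only the JDK's Path.normalize().
    static String normalize(String raw) {
        return Paths.get(raw).normalize().toString();
    }

    public static void main(String[] args) {
        // "." is dropped and "nodes/.." cancels out.
        System.out.println(normalize("data/./nodes/../indices"));
        // Both ".." elements cancel a preceding name, leaving only "c".
        System.out.println(normalize("a/b/../../c"));
    }
}
```

The printed separator depends on the platform's default file system, which is one reason to compare Path objects rather than raw strings when checking results.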


8393 Commits