OpenSearch

Commit Graph

Author	SHA1	Message	Date
Simon Willnauer	6f7caed148	Add federated cross cluster search capabilities (#22502) Today Elasticsearch can act as a tribe node that fully joins another cluster to support operations across all different clusters. While tribe is a popular feature it also has its downsides like: * non-trivial testing * special configuration * receives a non-trivial amount of cluster-state updates * maintains connections to all nodes in all clusters and vice versa * has to be restarted to join another cluster * needs special strategies to disambiguate indices with identical names That said, from a functionality standpoint the only feature in Elasticsearch that needs cross cluster communication is in fact the search layer. Everything else can be done on the client side by communicating with each cluster individually. For `_search` the merge of aggregations etc. is non-trivial and can't be done on the client side. The feature added in this PR is called `cross cluster search` and allows any node to act as a federated search node without joining any of the other clusters. There are 2 basic modes of operation: * globally configured remote clusters via cluster state settings * locally configured remote clusters via elasticsearch.yml A node with such a configuration will `connect` to the remote cluster and discover a set of nodes that it can communicate with for federated search (default number of nodes is 3). It won't connect to all nodes but only eligible nodes in the remote cluster (depending on their version and optionally on a node attribute (`node.addr`)). Remote clusters can be configured and updated at any time via cluster settings without the need for a node restart. Each remote cluster is specified via a `name` -> `seed node IP list`, i.e.: * `search.remote.my_cluster.seeds: 127.0.0.1:9300, 127.0.0.1:9301` Indices of this cluster can then be addressed via the cluster's alias: `GET index_1,my_cluster:index_1, my_cluster:other*/_search` Closes #21473	2017-01-18 09:28:04 +01:00
Simon Willnauer	19f9cb307a	Merge branch 'master' into feature/multi_cluster_search	2017-01-18 09:24:35 +01:00
Scott Somerville	372812da98	Allow an index to be partitioned with custom routing (#22274) This change makes it possible for custom routing values to go to a subset of shards rather than just a single shard. This enables the ability to utilize the spatial locality that custom routing can provide while mitigating the likelihood of ending up with an imbalanced cluster or suffering from a hot shard. This is ideal for large multi-tenant indices with custom routing that suffer from one or both of the following: - The big tenants cannot fit into a single shard or there are so many of them that they will likely end up on the same shard - Tenants often have a surge in write traffic and a single shard cannot process it fast enough Beyond that, this should also be useful for use cases where most queries are done under the context of a specific field (e.g. a category) since it gives a hint at how the data can be stored to minimize the number of shards to check per query. While a similar solution can be achieved with multiple concrete indices or aliases per value today, those approaches break down for high cardinality fields. A partitioned index enforces that mappings have routing required, that the partition size does not change when shrinking an index (the partitions will shrink proportionally), and rejects mappings that have parent/child relationships. Closes #21585	2017-01-18 08:51:23 +01:00
Elijah	3b92179e09	Improve wording in recipes docs This commit improves some of the wording in the recipes docs. Relates #22661	2017-01-17 21:00:36 -05:00
Elijah	b56605d68c	Correct grammar in list in how-to docs This commit corrects the grammar in a list in the how-to docs; namely, the word "and" was missing preceding the final element in a list. Relates #22663	2017-01-17 20:57:22 -05:00
Igor Motov	500548fcda	Remove taskManager.registerChildTask Instead of forcing each task to register all nodes where its children are running, this commit runs cancellation on all nodes. The task cancellation operation doesn't run too frequently, so this optimization doesn't seem to be worth the additional complexity of the interface.	2017-01-17 18:07:31 -05:00
Ali Beyad	099d229138	[TEST] fix documentation checking tests to account for possible pending tasks in the cluster state	2017-01-17 17:29:14 -05:00
Ryan Ernst	621643a5c3	Build: Only add ASL license to pom for elasticsearch project (#22664) The extra plugins that may be attached to the elasticsearch build contain their own license. In the past, the ASL license elasticsearch uses was avoided by specially checking for the gradle project prefix of `:x-plugins`. However, since refactoring to the elasticsearch-extra dir structure, this mechanism was broken. This change fixes the pom license adding to only be applied to projects that fall under the root project (ie elasticsearch).	2017-01-17 13:38:32 -08:00
Ali Beyad	ce811feba7	[TEST] testAckedIndexing waits for the cluster state to have propagated to all nodes in the cluster before checking the existence of documents on each node	2017-01-17 15:36:31 -05:00
Nik Everett	1169cd936e	Fix compilation in eclipse Eclipse needs a bit of extra special help with type parameters in `TransportReplicationActionTests` now.	2017-01-17 14:53:54 -05:00
Elijah	297b1b7d9a	Capitalize "Elasticsearch" in indexing speed docs This commit fixes the capitalization of "Elasticsearch" in the indexing speed docs. Relates #22659	2017-01-17 12:33:01 -05:00
Ali Beyad	554a5e3039	[TEST] add retries to MockRepository getRepositoryData to try to diagnose a NotXContentException being thrown	2017-01-17 12:17:29 -05:00
Simon Willnauer	69f1ffb1f8	fix exception message	2017-01-17 17:29:43 +01:00
Simon Willnauer	292e3a60d1	apply review comments	2017-01-17 17:20:52 +01:00
Alex	a0c83c4511	Minor doc changes to clarify mapping index param for string type (#22652) * Grammatical correction * Add note for legacy string mapping type * Update truncate token filter to not mention the keyword tokenizer The advice predates the existence of the keyword field Closes #22650	2017-01-17 16:43:11 +01:00
Luca Cavanna	bc5b604cbd	[TEST] parse global parameters from _common.json (#22655) Replace the hardcoded global parameters in the yaml test suite with parameters parsed from the newly added _common.json file. Relates to #22569	2017-01-17 16:13:09 +01:00
Ali Beyad	e2977889b8	Allow comma delimited array settings to have a space after each entry (#22591) Previously, certain settings that could take multiple comma delimited values would pick up incorrect values for all entries but the first if each comma separated value was followed by a whitespace character. For example, the multi-value "A,B,C" would be correctly parsed as ["A", "B", "C"] but the multi-value "A, B, C" would be incorrectly parsed as ["A", " B", " C"]. This commit allows a comma separated list to have whitespace characters after each entry. The specific settings that were affected by this are: cluster.routing.allocation.awareness.attributes, index.routing.allocation.require.*, index.routing.allocation.include.*, index.routing.allocation.exclude.*, cluster.routing.allocation.require.*, cluster.routing.allocation.include.*, cluster.routing.allocation.exclude.*, http.cors.allow-methods, http.cors.allow-headers. For the allocation filtering related settings, this commit also provides validation of each specified entry if the filtering is done by _ip, _host_ip, or _publish_ip, to ensure that each entry is a valid IP address. Closes #22297	2017-01-17 08:51:04 -06:00
Tanguy Leroux	f5542ed47f	Simplify ElasticsearchException rendering as XContent (#22611) This commit tries to simplify the way ElasticsearchExceptions are rendered to xcontent. It adds some documentation and renames and merges some methods. Current behavior is preserved, the goal is to be more readable and centralize everything in the ElasticsearchException class.	2017-01-17 15:44:49 +01:00
Simon Willnauer	197cd7d7a9	Add test for the grouping error message if indices and cluster can't be disambiguated	2017-01-17 14:13:09 +01:00
Simon Willnauer	88f6ae55f5	Improve remote / local indices filtering by not modifying external state	2017-01-17 14:05:36 +01:00
Greg Marzouka	e0f8d88d5c	Include global query string parameters in the REST spec Closes #11638	2017-01-17 07:35:14 -05:00
Simon Willnauer	709cb9a39e	Merge branch 'master' into feature/multi_cluster_search	2017-01-17 12:34:36 +01:00
Clinton Gormley	401438819e	Docs: Fix the first highlighting example to work Closes #22642	2017-01-17 12:20:03 +01:00
Clinton Gormley	519a9c469d	Update truncate token filter to not mention the keyword tokenizer The advice predates the existence of the keyword field Closes #22650	2017-01-17 12:15:22 +01:00
Simon Willnauer	d7eee637d9	fix some docs issues	2017-01-17 11:47:29 +01:00
Simon Willnauer	1c5cc58373	apply review comments	2017-01-17 11:46:55 +01:00
Tim Brooks	16a76d9bc0	Remove blocking TCP clients and servers (#22639) This commit removes the option to use the blocking variants of the TCP transport server, TCP transport client, or http server.	2017-01-16 18:38:51 -06:00
Michael McCandless	ebd38e2a6a	Expose FlattenGraphTokenFilter (#22643) FlattenGraphTokenFilter is necessary for using graph-based token streams (e.g. the new SynonymGraphFilter) during indexing.	2017-01-16 16:53:32 -05:00
Boaz Leskes	d80e3eea6c	Replace EngineClosedException with AlreadyClosedException (#22631) `EngineClosedException` is an ES-level exception that is used to indicate that the engine is closed when an operation starts. It doesn't really add much value and we can use `AlreadyClosedException` from Lucene (which may already bubble up if things go wrong during operations). Having two exceptions can just add confusion and lead to bugs, like wrong handling of `EngineClosedException` when dealing with document level failures. The latter was exposed by `IndexWithShadowReplicasIT`. This PR also removes the AwaitsFix from the `IndexWithShadowReplicasIT` tests (which was what caused this to be discovered). While debugging the source of the issue I found some mismatches in document uid management in the tests. The term that was passed to the engine didn't correspond to the uid in the parsed doc - those are fixed as well.	2017-01-16 21:14:41 +01:00
Simon Willnauer	f30b1f82ee	Remove HttpServer and HttpServerAdapter in favor of a simple dispatch method (#22636) Today we have quite a few abstractions that essentially provide a simple dispatch method to the plugins defining a `HttpServerTransport`. This commit removes `HttpServer` and `HttpServerAdapter` and introduces a simple `Dispatcher` functional interface that delegates to `RestController` by default. Relates to #18482	2017-01-16 21:06:08 +01:00
Luca Cavanna	193111919c	move ignore parameter support from yaml test client to low level rest client (#22637) All the language clients support a special ignore parameter that doesn't get passed to elasticsearch with the request, but is used to indicate which error codes should not lead to an exception if returned for a specific request. Moving this to the low level REST client will allow the high level REST client to make use of it too, for instance so that it doesn't have to intercept ResponseExceptions when the get api returns a 404.	2017-01-16 18:54:44 +01:00
Michael McCandless	eea4db5512	Fix thread safety of Stempel's token filter factory (#22610) Closes #21911	2017-01-16 10:36:36 -05:00
Christoph Büscher	2791c69960	Update profile.asciidoc Making the "Human readable output" section a note instead of its own section.	2017-01-16 16:19:07 +01:00
Tim Brooks	7a8884d9fa	Wrap rest httpclient with doPrivileged blocks (#22603) This is related to #22116. A number of modules (reindex, etc) use the rest client. The rest client opens connections using the apache http client. To avoid throwing SecurityException when using the SecurityManager these operations must be privileged. This is tricky because connections are opened within the httpclient code on its reactor thread. The way I confronted this was to wrap the creation of the client (and creation of reactor thread) in a doPrivileged block. The new thread inherits the existing security context.	2017-01-16 09:17:44 -06:00
Boaz Leskes	f88ab76067	Revert "Add a deprecation notice to shadow replicas (#22025)" This reverts commit `0da190234c`.	2017-01-16 16:15:41 +01:00
Boaz Leskes	b887681550	Revert "Don't use `INDEX_SHARED_FS_ALLOW_RECOVERY_ON_ANY_NODE_SETTING` directly as it triggers (many) deprecation logging" This reverts commit `e976aa09bb`.	2017-01-16 16:15:32 +01:00
Boaz Leskes	e976aa09bb	Don't use `INDEX_SHARED_FS_ALLOW_RECOVERY_ON_ANY_NODE_SETTING` directly as it triggers (many) deprecation logging #22025 deprecated this setting (pending its removal) but its frequent usage will spam the deprecation logs and also fail tests. As a temporary workaround we should not use the setting object directly.	2017-01-16 16:11:59 +01:00
Boaz Leskes	0da190234c	Add a deprecation notice to shadow replicas (#22025) Also adds deprecation logging. See #22024	2017-01-16 15:40:05 +01:00
Christoph Büscher	49a49da3f5	[Docs] Fix section title in profile.asciidoc	2017-01-16 14:53:06 +01:00
Christoph Büscher	59a48ffc41	ProfileResult and CollectorResult should print machine readable timing information (#22561) Currently both ProfileResult and CollectorResult print the time field in a human readable string format (e.g. "time": "55.20315000ms"). When trying to parse this back to a long value, for example to use in the planned high level java rest client, we can lose precision because of conversion and rounding issues. This change adds a new additional field (`time_in_nanos`) to the profile response to be able to get the original time value in nanoseconds back. The old `time` field is only printed when the `?human=true` flag in the URL is set. This follows the behaviour of all other stats-related APIs. Also the format of the `time` field is slightly changed. Instead of always formatting the output as a 10-digit ms value, by using the `XContentBuilder#timeValueField()` method we now print the value using the largest time unit present (e.g. "s", "ms", "micros").	2017-01-16 14:27:55 +01:00
David Pilato	ee0c4c1776	Merge pull request #22612 from dadoonet/doc/remote-debug gradle run --debug-jvm is explained twice	2017-01-16 14:19:11 +01:00
Jason Tedor	e6dc74f2bf	Add replica ops with version conflict to translog An operation that completed successfully on a primary can result in a version conflict on a replica due to the asynchronous nature of operations. When a replica operation results in a version conflict, the operation is not added to the translog. This leads to gaps in the translog, which is problematic as it can lead to situations where a replica shard can never advance its local checkpoint. As such operations are just the normal course of business for a replica shard, these operations should be treated as if they completed successfully. This commit adds these operations to the translog. Relates #22626	2017-01-16 08:08:52 -05:00
javanna	8e3f1dd689	Replace custom Functional interface in ElasticsearchException with CheckedFunction	2017-01-16 13:57:58 +01:00
javanna	9a910d3c9d	Make RestChannelConsumer extend CheckedConsumer<RestChannel, Exception>	2017-01-16 13:57:58 +01:00
javanna	ab144c418e	replace ShardSearchRequest.FilterParser functional interface with CheckedFunction	2017-01-16 13:57:58 +01:00
javanna	a8a13bb46f	replace custom functional interface with CheckedFunction in percolate module	2017-01-16 13:57:58 +01:00
javanna	bc22afcb2f	[TEST] replace SizeFunction with Function<Integer, Integer>	2017-01-16 13:57:58 +01:00
javanna	884302dcaa	Expose CheckedFunction	2017-01-16 13:57:58 +01:00
Jason Tedor	fc3280b3cf	Expose logs base path For certain situations, end-users need the base path for Elasticsearch logs. Exposing this as a property is better than hard-coding the path into the logging configuration file as otherwise the logging configuration file could easily diverge from the Elasticsearch configuration file. Additionally, Elasticsearch will only have permissions to write to the log directory configured in the Elasticsearch configuration file. This commit adds a property that exposes this base path. One use-case for this is configuring a rollover strategy to retain logs for a certain period of time. As such, we add an example of this to the documentation. Additionally, we expose the property es.logs.cluster_name as this is used as the name of the log files in the default configuration. Finally, we expose es.logs.node_name in cases where node.name is explicitly set in case users want to include the node name as part of the name of the log files. Relates #22625	2017-01-16 07:39:37 -05:00
Jason Tedor	9ae5410ea6	Do not configure a logger named level When logger.level is set, we end up configuring a logger named "level" because we look for all settings of the form "logger\..+" as configuring a logger. Yet, logger.level is special and is meant to only configure the default logging level. This commit causes us to avoid configuring a logger named level. Relates #22624	2017-01-16 07:30:21 -05:00
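
Commit 9ae5410ea6 above describes treating `logger.level` as the default level rather than as a logger named "level". A minimal Python sketch of that filtering idea follows; it is not the Elasticsearch implementation, and the function name `configure_loggers` and the sample settings are hypothetical.

```python
import re

# Settings of the form "logger.<name>" configure a named logger.
LOGGER_SETTING = re.compile(r"logger\.(.+)")


def configure_loggers(settings):
    """Return (default_level, {logger_name: level}) from a flat settings dict."""
    # "logger.level" only sets the default level; it must not create a logger.
    default_level = settings.get("logger.level")
    loggers = {}
    for key, level in settings.items():
        match = LOGGER_SETTING.fullmatch(key)
        if match is None or key == "logger.level":
            continue
        loggers[match.group(1)] = level
    return default_level, loggers


# Hypothetical sample settings, for illustration only.
default_level, loggers = configure_loggers({
    "logger.level": "debug",
    "logger.org.elasticsearch.transport": "trace",
    "cluster.name": "demo",
})
```

Without the explicit `key == "logger.level"` check, the pattern alone would also match `logger.level` and produce a logger named `level`, which is the behaviour the commit removes.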
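
Commit e2977889b8 above fixes list settings such as "A, B, C" being parsed with leading spaces. The difference can be illustrated with a short Python sketch; the helper name `parse_list_setting` is hypothetical and the real fix lives in Elasticsearch's Java settings code.

```python
def parse_list_setting(value):
    """Split a comma-delimited setting and trim whitespace around each entry."""
    return [entry.strip() for entry in value.split(",") if entry.strip()]


# Naive splitting keeps the space that follows each comma.
naive = "A, B, C".split(",")

# Trimming each entry yields the values the user intended.
fixed = parse_list_setting("A, B, C")
```

Here `naive` keeps the stray spaces described in the commit message, while `fixed` matches the result for "A,B,C".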
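
Commit 6f7caed148 above addresses remote indices as `alias:index` inside an ordinary index expression. One way to picture the grouping step is the Python sketch below. It assumes that a prefix counts as remote only when it is a configured cluster alias; the function name `group_indices` and the empty-string key for local indices are hypothetical.

```python
from collections import defaultdict

LOCAL = ""  # hypothetical key for indices that live on the local cluster


def group_indices(expression, remote_aliases):
    """Group a comma-separated index expression by cluster alias."""
    groups = defaultdict(list)
    for part in expression.split(","):
        part = part.strip()
        if not part:
            continue
        alias, sep, index = part.partition(":")
        if sep and alias in remote_aliases:
            groups[alias].append(index)
        else:
            # No known alias prefix: treat the whole token as a local index.
            groups[LOCAL].append(part)
    return dict(groups)


grouped = group_indices(
    "index_1,my_cluster:index_1, my_cluster:other*", {"my_cluster"}
)
```

For the expression from the commit message, `grouped` maps the local key to `index_1` and `my_cluster` to `index_1` and `other*`.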
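
Commit 372812da98 above lets one routing value map to a subset of shards instead of a single shard. The sketch below illustrates the idea in Python, using the shard formula as I understand it from the Elasticsearch `_routing` documentation. `zlib.crc32` is only a stand-in for the real hash function, and all names are hypothetical.

```python
import zlib


def _hash(value):
    # Stand-in hash for illustration; Elasticsearch uses its own hash function.
    return zlib.crc32(value.encode("utf-8"))


def shard_for(routing, doc_id, num_shards, partition_size):
    """Pick a shard from the partition selected by the routing value."""
    # The routing value picks the start of the partition; the document id
    # picks an offset inside it.
    offset = _hash(doc_id) % partition_size
    return (_hash(routing) + offset) % num_shards


# All documents of one tenant land in at most `partition_size` shards.
shards = {
    shard_for("tenant-a", f"doc-{i}", num_shards=10, partition_size=3)
    for i in range(1000)
}
```

With `partition_size=1` the offset is always zero, so this reduces to routing every document with the same routing value to a single shard.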
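
Commit 193111919c above moves handling of the client-side `ignore` parameter into the low level REST client. A rough Python sketch of the behaviour described there follows: the parameter is stripped before the request is sent, and the listed status codes do not raise. The transport callable, `ResponseError`, and `perform_request` are hypothetical names, not the client's API.

```python
class ResponseError(Exception):
    """Raised for error responses that were not explicitly ignored."""

    def __init__(self, status):
        super().__init__(f"request failed with status {status}")
        self.status = status


def perform_request(transport, method, path, params=None):
    params = dict(params or {})
    # "ignore" is consumed by the client and never sent to the server.
    raw = params.pop("ignore", "")
    ignored = {int(code) for code in str(raw).split(",") if code.strip()}
    status, body = transport(method, path, params)
    if status >= 400 and status not in ignored:
        raise ResponseError(status)
    return status, body


def fake_transport(method, path, params):
    # Stand-in for the HTTP layer: always answers 404.
    assert "ignore" not in params
    return 404, {"found": False}


status, body = perform_request(
    fake_transport, "GET", "/index/type/1", {"ignore": "404"}
)
```

A caller that expects a missing document can pass `ignore=404` and inspect the returned status instead of catching an exception.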
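
Commit 59a48ffc41 above always emits `time_in_nanos` and prints the human readable `time` only on request. The Python sketch below shows that response shape. Rounding to one decimal place is an assumption made for illustration, not the documented output of `XContentBuilder#timeValueField()`, and `render_time` is a hypothetical name.

```python
def render_time(nanos, human=False):
    """Build the timing fields of a profile entry from a nanosecond value."""
    out = {"time_in_nanos": nanos}  # exact value, always present
    if human:
        # Use the largest unit that fits; one-decimal rounding is an assumption.
        for unit, size in (("s", 1_000_000_000), ("ms", 1_000_000), ("micros", 1_000)):
            if nanos >= size:
                out["time"] = f"{nanos / size:.1f}{unit}"
                break
        else:
            out["time"] = f"{nanos}nanos"
    return out


machine = render_time(55_203_150)
readable = render_time(55_203_150, human=True)
```

Consumers that need the exact value read `time_in_nanos`, so the rounded string never has to be parsed back into a number.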


26112 Commits
