OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-03-09 14:34:43 +00:00

Author	SHA1	Message	Date
Simon Willnauer	5c8164a561	Clean up BytesReference (#19196 ) BytesReference should be a really simple interface, yet it has a gazillion ways to achieve the same thing. Methods like `#hasArray`, `#toBytesArray`, `#copyBytesArray`, `#toBytesRef`, and `#bytes` are all really duplicates. This change simplifies the interface dramatically and makes implementations of it much simpler. All array access has been removed and is streamlined through a single `#toBytesRef` method. Utility methods to materialize a compact byte array have been added too for convenience.	2016-07-01 16:09:31 +02:00
Nik Everett	27e320d5ce	Migrate sum, min, and max aggs to NamedWriteable	2016-07-01 09:23:26 -04:00
Nik Everett	91b66e3cf4	Migration stats and extended stats to NamedWriteable Migrates the `stats` and `extended_stats` aggregations and pipeline aggregations from the special purpose aggregations streams to `NamedWriteable`. These are the first pipeline aggregations so this adds the infrastructure to support both streams and `NamedWriteable`s for pipeline aggregations.	2016-07-01 09:13:15 -04:00
javanna	598c36128e	Revert "Raised IOException on deleteBlob (#18815 )" This reverts commit d24cc65cad2f3152237df8b6c457a2d0a603f13a as it seems to be causing test failures.	2016-07-01 11:00:32 +02:00
gfyoung	d24cc65cad	Raised IOException on deleteBlob (#18815 ) Raise IOException on deleteBlob if the blob doesn't exist This commit raises an IOException on BlobContainer#deleteBlob if the blob does not exist, in conformance with the BlobContainer interface contract. Each implementation of BlobContainer now conforms to this contract (file system, S3, Azure, HDFS). This commit also contains blob container tests for each of the repository implementations. Closes #18530	2016-06-30 23:00:10 -04:00
Ryan Ernst	8275ab497b	Merge pull request #19170 from rjernst/rest_handler_client Changed rest handler interface to take NodeClient	2016-06-30 11:00:09 -07:00
Nik Everett	f5a269b029	Start migration away from aggregation streams We'll migrate to NamedWriteable so we can share code with the rest of the system. So we can work on this in multiple pull requests without breaking Elasticsearch in between the commits, this change supports both old style `InternalAggregations.stream` serialization and `NamedWriteable` style serialization. As such it creates about a half dozen `// NORELEASE` comments that will have to be removed once the migration is complete. This also introduces a boolean `transportClient` flag to `SearchModule` which is used to skip inappropriate registrations for the transport client while still registering the things it needs. In this case that means that the `InternalAggregation` subclasses are registered with the `NamedWriteableRegistry` but the `AggregationBuilder` subclasses are not. Finally, this moves aggregation registration from guice configuration time to `SearchModule` construction time. This will make it simpler to work with in the future as we further clean up Elasticsearch's extension points.	2016-06-30 12:57:34 -04:00
Boaz Leskes	09ca6d6ed2	Add a BridgePartition to be used by testAckedIndexing (#19172 ) We have long worked to capture different partitioning scenarios in our testing infra. This PR adds a new variant, inspired by the Jepsen blogs, which was forgotten so far - namely a partition where one node can still see and be seen by all other nodes. It also updates the resiliency page to better reflect all the work that was done in this area.	2016-06-30 17:58:12 +02:00
Ryan Ernst	04a4bcdca0	Add comment explaining bytes reference edge case	2016-06-30 08:47:55 -07:00
Ryan Ernst	e079c83020	Fix test edge case for bytes reference	2016-06-30 08:45:54 -07:00
Ryan Ernst	c762e7aa15	Merge branch 'master' into rest_handler_client	2016-06-30 08:16:25 -07:00
Ryan Ernst	0732004ae8	Merge pull request #19177 from rjernst/ingest_factory_generic Remove generics from ingest Processor.Factory	2016-06-30 08:08:26 -07:00
Christoph Büscher	afb5e6332b	Make sure TimeIntervalRounding is monotonic for increasing dates (#19020 ) Currently there are cases when using TimeIntervalRounding#round() and date1 < date2 that round(date2) < round(date1). These errors can happen when using a non-fixed time zone and the values to be rounded are slightly after a time zone offset change (e.g. DST transition). Here is an example for the "CET" time zone with a 45 minute rounding interval. The dates to be rounded are on the left (with utc time stamp), the rounded values on the right. The error case is marked: 2011-10-30T01:40:00.000+02:00 1319931600000 \| 2011-10-30T01:30:00.000+02:00 1319931000000 2011-10-30T02:02:30.000+02:00 1319932950000 \| 2011-10-30T01:30:00.000+02:00 1319931000000 2011-10-30T02:25:00.000+02:00 1319934300000 \| 2011-10-30T02:15:00.000+02:00 1319933700000 2011-10-30T02:47:30.000+02:00 1319935650000 \| 2011-10-30T02:15:00.000+02:00 1319933700000 2011-10-30T02:10:00.000+01:00 1319937000000 \| 2011-10-30T01:30:00.000+02:00 1319931000000 * 2011-10-30T02:32:30.000+01:00 1319938350000 \| 2011-10-30T02:15:00.000+01:00 1319937300000 2011-10-30T02:55:00.000+01:00 1319939700000 \| 2011-10-30T02:15:00.000+01:00 1319937300000 2011-10-30T03:17:30.000+01:00 1319941050000 \| 2011-10-30T03:00:00.000+01:00 1319940000000 We should correct this by detecting that we are crossing a transition when rounding, and in that case pick the largest valid rounded value before the transition. This change adds this correction logic to the rounding function and adds this invariant to the randomized TimeIntervalRounding tests. Also adding the example test case from above (with corrected behaviour) for illustrative purposes.	2016-06-30 17:05:54 +02:00
Simon Willnauer	40ec639c89	Factor out abstract TCPTransport* classes to reduce the netty footprint (#19096 ) Today we have a ton of logic inside the NettyTransport* codebase. The footprint of the code that has a direct netty dependency is large and alternative implementations are pretty hard today since they need to know all about our protocol etc. This change moves most of the code into TCPTransport* baseclasses and moves all the protocol send code together. The base classes now contain the majority of the logic while NettyTransport* classes remain to implement the glue code, configuration and optimization.	2016-06-30 13:41:53 +02:00
Ryan Ernst	e4f265eb3a	Ingest: Remove generics from Processor.Factory The factory for ingest processor is generic, but that is only for the return type of the create method. However, the actual consumer of the factories only cares about Processor, so generics are not needed. This change removes the generic type from the factory. It also removes AbstractProcessorFactory which only existed in order to pull the optional tag from config. This functionality is moved to the caller of the factories in ConfigurationUtil, and the create method now takes the tag. This allows the covariant return of the implementation to work with tests not needing casts.	2016-06-30 02:33:54 -07:00
Martijn van Groningen	299c6fcc63	test: use the reader from the searcher (newSearcher(...) method may change the reader) instead of the reader we create in the test Closes #19151	2016-06-30 11:10:38 +02:00
Ryan Ernst	c77dc4a82c	Merge pull request #19136 from rjernst/script_service_deps Scripts: Remove ClusterState from compile api	2016-06-29 22:34:40 -07:00
Ryan Ernst	865b951b7d	Internal: Changed rest handler interface to take NodeClient Previously all rest handlers would take Client in their injected ctor. However, it was only to hold the client around for runtime. Instead, this can be done just once in the HttpService which handles rest requests, and passed along through the handleRequest method. It also should always be a NodeClient, and other types of Clients (eg a TransportClient) would not work anyways (and some handlers can be simplified in follow ups like reindex by taking NodeClient).	2016-06-29 18:02:18 -07:00
Ryan Ernst	7c50de182e	Remove test for closing ingest processors, this is now handled at the plugin level	2016-06-29 16:23:16 -07:00
Ryan Ernst	172ced3e2d	Fix test bug in plugin cli progress tests	2016-06-29 15:56:36 -07:00
Nik Everett	8db43c0107	Move RestHandler registration to ActionModule and ActionPlugin `RestHandler`s are highly tied to actions so registering them in the same place makes sense. Removes the need for plugins to check if they are in transport client mode before registering a RestHandler - `getRestHandlers` isn't called at all in transport client mode. This caused guice to throw a massive fit about the circular dependency between NodeClient and the allocation deciders. I broke the circular dependency by registering the actions map with the node client after instantiation.	2016-06-29 18:31:44 -04:00
Ryan Ernst	4dcb2b8024	Merge pull request #19137 from rjernst/closeable_plugins Make plugins closeable	2016-06-29 13:54:20 -07:00
Ryan Ernst	b3daf7d683	Remove unnecessary variant of detailedMessage	2016-06-29 11:25:23 -07:00
Ryan Ernst	8b533b7ca9	Internal: Deprecate ExceptionsHelper.detailedMessage This is a trappy "helper" and only hurts. See #19069	2016-06-29 11:09:35 -07:00
Jason Tedor	fc38e503e0	Clearer error when handling fractional time values In 2f638b5a23597967a98b1ced1deac91d64af5a44, support for fractional time values was removed. While this change is documented, the error message presented does not give an indication that fractional inputs are not supported. This commit fixes this by detecting when the input is a time value that would successfully parse as a double but will not parse as a long and presenting a clear error message that fractional time values are not supported. Relates #19158	2016-06-29 13:36:11 -04:00
Christoph Büscher	0d81dee013	Fix key_as_string for date histogram and epoch_millis/epoch_second format When doing a `date_histogram` aggregation with `"format":"epoch_millis"` or `"format" : "epoch_second"` and using a time zone other than UTC, the `key_as_string` output in the response does not reflect the UTC timestamp that is used as the key. This happens because when applying the `time_zone` in DocValueFormat.DateTime to an epoch-based formatter, this adds the time zone offset to the value being formatted. Instead we should adjust the added display offset to get back the utc instance in EpochTimePrinter. Closes #19038	2016-06-29 19:18:12 +02:00
Alexander Reelsen	56fa751928	Plugins: Add status bar on download (#18695 ) As some plugins are becoming big now, it is hard for the user to know, if the plugin is being downloaded or just nothing happens. This commit adds a progress bar during download, which can be disabled by using the `-q` parameter. In addition this updates to jimfs 1.1, which allows us to test the batch mode, as adding security policies are now supported due to having jimfs:// protocol support in URL stream handlers.	2016-06-29 16:44:12 +02:00
Britta Weber	6d5666553c	[TEST] mute test because it fails about 1/100 runs	2016-06-29 15:53:57 +02:00
Simon Willnauer	819fe40d61	Extract AbstractBytesReferenceTestCase (#19141 ) We have a ton of tests for PagedBytesReference but not really many for the other implementations of BytesReference. This change factors out a basic AbstractBytesReferenceTestCase that simplifies testing other implementations. It also caught a couple of bugs here and there like a missing mask when reading bytes as ints in PagedBytesReference.	2016-06-29 14:45:54 +02:00
Simon Willnauer	872cdffc27	Factor out ChannelBuffer from BytesReference (#19129 ) The ChannelBuffer interface today leaks into the BytesReference abstraction which causes a hard dependency on Netty across the board. This change moves this dependency and all BytesReference -> ChannelBuffer conversion into NettyUtils and removes the abstraction leak on BytesReference. This change also removes unused methods on the BytesReference interface and simplifies access to internal pages.	2016-06-29 10:45:05 +02:00
Ryan Ernst	6590e77c1a	Plugins: Make plugins closeable This change allows Plugin implementations to implement Closeable when they have resources that should be released. As a first example of how this can be used, I switched over ingest plugins, which just had the geoip processor. The ingest framework had chains of closeable to support this, which is now removed.	2016-06-28 16:16:26 -07:00
Ryan Ernst	ecf6101798	Scripts: Remove ClusterState from compile api Stored scripts are pulled from the cluster state, and the current api requires passing the ClusterState on each call to compile. However, this means every user of the ScriptService needs to depend on the ClusterService. Instead, this change makes the ScriptService a ClusterStateListener. It also simplifies tests a lot, as they no longer need to create fake cluster states (except when testing stored scripts).	2016-06-28 13:20:00 -07:00
Simon Willnauer	9b9e17abf7	Cleanup Compressor interface (#19125 ) Today we have several deprecated methods, leaking netty interfaces, support for multiple compressors on the compressor interface. The netty interface can simply be replaced by BytesReference which we already have an implementation for, all the others are not used and are removed in this commit.	2016-06-28 17:51:33 +02:00
Yannick Welsch	0515791846	Fix logger usages	2016-06-28 16:51:06 +02:00
Boaz Leskes	2512594d9e	Testing infra - stabilize data folder usage and clean up (#19111 ) The plan for persistent node ids ( #17811 ) is to tie the node identity to a file stored in its data folders. As such it becomes important that nodes in our testing infra have better affinity with their data folders and that their data folders are not cleaned underneath them. The first is important because we fix the random seed used for node id generation (for reproducibility) and allowing the same node to use two different data folders causes two separate nodes to have the same id, which prevents the cluster from forming. The second is important, for example, where a full cluster restart / single node restart needs to maintain node identity and wiping the data folders at the wrong moment prevents this. Concretely this commit does the following: 1) Remove previous attempts to have data folder per role using a prefix. This wasn't effective as it was using the data paths settings which are only used for part of the runs. An attempt to completely separate the paths via the home dir failed due to assumptions made by index custom path about node data folder ordinal uniqueness (see #19076) 2) Change full cluster restarts to start up nodes in the same order they were first created in, only randomly swapping nodes with the same roles. 3) Change test cluster reset methods to first shutdown the unneeded nodes and then re-start the shared nodes that were shut down, so they'll reclaim their data folders. 4) Improve data folder wiping logic and make sure it wipes only folders of "offline" nodes. 5) Add some very basic tests	2016-06-28 16:38:56 +02:00
Jim Ferenczi	6d069078d3	Fixed tests that assumed that broken settings can be updated	2016-06-28 16:14:57 +02:00
Jim Ferenczi	ef0e3db0de	Validates new dynamic settings from the current state Thanks to https://github.com/elastic/elasticsearch/pull/19088 the settings are now validated against dynamic updaters on the master. However, only the new settings are applied to the IndexService created for the validation. Because of this we cannot check the transition from one value to another in a dynamic updater. This change creates the IndexService from the current settings and validates that the new dynamic settings can replace the current settings. This change also removes the validation of dynamic settings when an index is opened. The validation should have occurred when the settings were updated.	2016-06-28 15:35:04 +02:00
Nik Everett	fa4844c3f4	Pull actions from plugins Instead of implementing onModule(ActionModule) to register actions, this has plugins implement ActionPlugin to declare actions. This is yet another step in cleaning up the plugin infrastructure. While I was in there I switched AutoCreateIndex and DestructiveOperations to be eagerly constructed which makes them easier to use when de-guice-ing the code base.	2016-06-28 08:36:24 -04:00
Jason Tedor	2f638b5a23	Keep input time unit when parsing TimeValues This commit modifies TimeValue parsing to keep the input time unit. This enables round-trip parsing from instances of String to instances of TimeValue and vice-versa. With this, this commit removes support for the unit "w" representing weeks, and also removes support for fractional values of units (e.g., 0.5s). Relates #19102	2016-06-27 18:41:18 -04:00
Ryan Ernst	3f2946ce6d	Fix line length in new indices module tests.	2016-06-27 11:33:22 -07:00
Ryan Ernst	33ccc5aead	Merge branch 'master' into mapper_plugin_api	2016-06-27 11:19:59 -07:00
Ryan Ernst	f17fcce3ed	Add duplicate mapper detection and tests	2016-06-27 11:17:58 -07:00
Jim Ferenczi	eb1e231a63	Revert "Rename `fields` to `stored_fields` and add `docvalue_fields`" This reverts commit 2f46f53dc8feb78412e6d648751ffe97b1e35119.	2016-06-27 17:20:32 +02:00
Simon Willnauer	4fb1c4fe5a	Validate settings against dynamic updaters on the master (#19088 ) Today all settings are only validated against their validators that are available when settings are registered. Yet, some settings updaters have validators that are dynamic ie. their validation depends on other variables that are only available at runtime. We do not run those validators when settings are updated causing index updates to fail on the data nodes instead of on the master. Relates to #19046	2016-06-27 17:18:26 +02:00
Colin Goodheart-Smithe	108ba23073	Pass resolved extended bounds to unmapped histogram aggregator Previous to this change the unresolved extended bounds were passed into the histogram aggregator which meant extendedbounds.min and extendedbounds.max were passed through as null. This had two effects on the histogram aggregator: 1. If the histogram aggregator was unmapped across all shards, the reduce phase would not add buckets for the extended bounds and the response would contain zero buckets 2. If the histogram aggregator was not unmapped in some shards, the reduce phase might sometimes choose to reduce based on the unmapped shard response and therefore the extended bounds would be ignored. This change resolves the extended bounds in the unmapped case and solves the above two issues. Closes #19009	2016-06-27 14:07:37 +01:00
Boaz Leskes	cb0824e957	Make shard store fetch less dependent on the current cluster state, both on master and non data nodes (#19044 ) #18938 has changed the timing in which we send out to nodes to fetch their shard stores. Instead of doing this after the cluster state resulting from the node's join was published, #18938 made it be sent concurrently to the publishing processes. This revealed a couple of points where the shard store fetching is dependent on the current state of affairs of the cluster state, both on the master and the data nodes. The problems discovered were already present without #18938 but required a failure or extreme situation to make them happen. This PR tries to remove as many of these dependencies as possible, making shard store fetching simpler and paving the way to re-introduce #18938 which was reverted. These are the notable changes: 1) Allow TransportNodesAction (of which shard store fetching is derived) callers to supply concrete disco nodes, so it won't need the cluster state to resolve them. This was a problem because the cluster state containing the needed nodes was not yet made available through ClusterService. Note that long term we can expect the rest layer to resolve node ids to concrete nodes, making this mode the only one needed. 2) The data node relied on the cluster state to have the relevant index meta data so it can find data when custom paths are used. We now fall back to read the meta data from disk if needed. 3) The data node was relying on its own IndexService state to indicate whether the data it has corresponds to an existing allocation. This is of course something it can not know until it got (and processed) the new cluster state from the master. This flag in the response is now removed. This is not a problem because we used that flag to protect against double assigning of a shard to the same node, but we are already protected from it by the allocation deciders. 4) I removed the redundant filterNodeIds method in TransportNodesAction - if people want to filter they can override resolveRequest.	2016-06-27 15:05:06 +02:00
Martijn van Groningen	d3cd58eb2f	Merges PR #18957 This commit fixes several NPEs caused by implicitly performing a get request for a document that exists with its _source disabled and then trying to access the source. Instead of causing an NPE the following queries will throw an exception with a "source disabled" message (similar behavior as if the document does not exist): - GeoShape query for pre-indexed shape (throws IllegalArgumentException) - Percolate query for an existing document (throws IllegalArgumentException) A Terms query with a lookup will ignore the document if the source does not exist (same as if the document does not exist). GET and HEAD requests for the document _source will return a 404 if the source is disabled (even if the document exists).	2016-06-27 09:37:28 +02:00
Martijn van Groningen	ba90508b91	fix checkstyle issue	2016-06-27 09:00:13 +02:00
Nik Everett	71b95fb63c	Switch analysis from push to pull Instead of plugins calling `registerTokenizer` to extend the analyzer they now instead have to implement `AnalysisPlugin` and override `getTokenizer`. This lines up extending plugins with extending scripts. This allows `AnalysisModule` to construct the `AnalysisRegistry` immediately as part of its constructor which makes testing analysis much simpler. This also moves the default analysis configuration into `AnalysisModule` which is how search is set up. Like `ScriptModule`, `AnalysisModule` no longer extends `AbstractModule`. Instead it is only responsible for building `AnalysisRegistry`. We still bind `AnalysisRegistry` but we only do so in `Node`. This means it is available at module construction time so we slowly remove the need to bind it in guice.	2016-06-26 07:15:42 -04:00
Jason Tedor	c79e27180e	Require timeout units when parsing query body Today when parsing the timeout field in a query body, if time units are supplied the parser throws a NumberFormatException. Additionally, the parsing allows the timeout field to not specify units (it assumes milliseconds). This commit fixes this behavior by not only allowing time units to be specified but requiring time units to be specified. This is consistent with the documented behavior and the behavior in 2.x. Relates #19077	2016-06-25 16:18:25 -04:00


5556 Commits