OpenSearch

Commit Graph

Author	SHA1	Message	Date
Christoph Büscher	16a7cbe463	Add `count` value to rest output of `geo_centroid` (#24387 ) Currently we don't write the count value to the geo_centroid aggregation rest response, but it is provided via the java api and the count() method in the GeoCentroid interface. We should add this parameter to the rest output and also provide it via the getProperty() method.	2017-04-28 16:25:22 +02:00
Nik Everett	e3b7b88756	Fix compilation in Eclipse (#24391 ) Eclipse doesn't allow extra semicolons after an import statement: ``` import foo.Bar;; // <-- syntax error! ``` Here is the Eclipse bug: https://bugs.eclipse.org/bugs/show_bug.cgi?id=425140 which the Eclipse folks closed as "the spec doesn't allow these semicolons so why should we?" Which is fair. Here is the bug against javac for allowing them: https://bugs.openjdk.java.net/browse/JDK-8027682 which hasn't been touched since 2013 without explanation. There is, however, a rather educational mailing list thread: http://mail.openjdk.java.net/pipermail/compiler-dev/2013-August/006956.html which contains gems like, "In general, it is better/simpler to change javac to conform to the spec. (Except when it is not.)" I suspect the reason this hasn't been fixed is: ``` FWIW, if we change javac such that the set of programs accepted by javac is changed, we have an process (currently Oracle internal) to get approval for such a change. So, we would not simply change javac on a whim to meet the spec; we would at least have other eyes looking at the behavioral change to determine if it is "acceptable". ``` from http://mail.openjdk.java.net/pipermail/compiler-dev/2013-August/006973.html	2017-04-28 09:52:14 -04:00
Guillaume Le Floch	382a617d34	Handle multiple aliases in _cat/aliases api (#23698 ) The alias parameter was documented as a list in our rest-spec, yet only the first value out of a list was getting read and processed. This commit adds support for multiple aliases to _cat/aliases Closes #23661	2017-04-28 15:21:44 +02:00
Kunal Kapoor	a5bd2012b6	Added validation for upsert request (#24282 ) The version on an update request is syntactic sugar for get of a specific version, doc merge and a version index. This changes it to reject requests with both upsert and a version. If the upsert index request is versioned, we also reject the op.	2017-04-28 08:02:09 -04:00
Yannick Welsch	a72db191f2	Weaken assertion in ZenDiscovery.publish The previous commit (`35f78d098a`) introduced an assertion in ZenDiscovery that was overly restrictive - it could trip when a cluster state that was successfully published would not be applied locally because a master with a better cluster state came along in the meantime.	2017-04-28 11:33:53 +02:00
Yannick Welsch	35f78d098a	Separate publishing from applying cluster states (#24236 ) Separates cluster state publishing from applying cluster states: - ClusterService is split into two classes MasterService and ClusterApplierService. MasterService has the responsibility to calculate cluster state updates for actions that want to change the cluster state (create index, update shard routing table, etc.). ClusterApplierService has the responsibility to apply cluster states that have been successfully published and invokes the cluster state appliers and listeners. - ClusterApplierService keeps track of the last applied state, but MasterService is stateless and uses the last cluster state that is provided by the discovery module to calculate the next prospective state. The ClusterService class is still kept around, which now just delegates actions to ClusterApplierService and MasterService. - The discovery implementation is now responsible for managing the last cluster state that is used by the consensus layer and the master service. It also exposes the initial cluster state which is used by the ClusterApplierService. The discovery implementation is also responsible for adding the right cluster-level blocks to the initial state. - NoneDiscovery has been renamed to TribeDiscovery as it is exclusively used by TribeService. It adds the tribe blocks to the initial state. - ZenDiscovery is synchronized on state changes to the last cluster state that is used by the consensus layer and the master service, and does not submit cluster state update tasks anymore to make changes to the disco state (except when becoming master). Control flow for cluster state updates is now as follows: - State updates are sent to MasterService - MasterService gets the latest committed cluster state from the discovery implementation and calculates the next cluster state to publish - MasterService submits the new prospective cluster state to the discovery implementation for publishing - Discovery implementation publishes cluster states to all nodes and, once the state is committed, asks the ClusterApplierService to apply the newly committed state. - ClusterApplierService applies state to local node.	2017-04-28 09:34:31 +02:00
Toby McLaughlin	e4bb360ae0	Fix typo in node environment exception message This commit fixes a typo in an exception message when trying to create a node environment. Relates #24381	2017-04-28 00:48:52 -04:00
Zachary Tong	350573290f	Agg builder accessibility fixes (#24323 ) - Getters for DateHisto `interval` and `offset` should return a long, not double - Add getter for the filter in a FilterAgg - Add getters for subaggs / pipelines in base AggregationBuilder	2017-04-27 16:50:59 -04:00
Ryan Ernst	cdcc75dd2a	Plugins: Add support for platform specific plugins (#24265 ) This commit adds support for plugins having a platform specific variant. It also adds unit tests for all official and maven urls.	2017-04-27 11:27:29 -07:00
Ali Beyad	2facc42a55	Change snapshot status error to use generic SnapshotException (#24355 ) Changes the snapshot status read exception from the (misleading) IndexShardRestoreFailedException to the generic SnapshotException Closes #24225	2017-04-27 09:36:26 -04:00
Yannick Welsch	2fa1c9fff1	Provide target allocation id as part of start recovery request (#24333 ) This makes it possible for the recovery source to verify that it is talking to the shard it thinks it is talking to. Closes #24167	2017-04-27 14:45:44 +02:00
Tim Vernum	65f90b25e0	Pass Context to ConstructingObjectParser's function (#24230 ) Allow the `Context` to be used in the builder function used within ConstructingObjectParser. This facilitates scenarios where a constructor argument comes from a URL parameter, or from document id.	2017-04-27 20:26:10 +10:00
Clinton Gormley	8a8410b5ce	Added bwc indices for v2.4.5	2017-04-27 10:30:53 +02:00
Ryan Ernst	4a5c3c5a4a	Test: Write node ports file before starting tribe service (#24351 ) The tribe service can take a while to initialize, depending on how many clusters it needs to connect to. This change moves writing the ports file used by tests to before the tribe service is started.	2017-04-27 09:59:54 +02:00
Adrien Grand	1be2800120	Only allow one type on 7.0 indices (#24317 ) This adds the `index.mapping.single_type` setting, which enforces that indices have at most one type when it is true. The default value is true for 6.0+ indices and false for old indices. Relates #15613	2017-04-27 08:43:20 +02:00
Koen De Groote	7f9d84cb1a	The parseObject method in DocumentParser can be void. There is no point in the code that actually expects the return value, plus the variable created for it was never actually used. (#24350 )	2017-04-26 16:54:23 -06:00
Ali Beyad	d387dcfd6c	[TEST] fixes NPE in RoutingTableTests	2017-04-26 18:39:26 -04:00
Ali Beyad	0e74f5ddb1	[TEST] fixes shard count of source shard index in a restore shrink index test	2017-04-26 16:34:53 -04:00
Nik Everett	bc45d10e82	Remove most usages of 1-arg Script ctor (#24325 ) The one argument ctor for `Script` creates a script with the default language but most usages of it are for testing and either don't care about the language or are for use with `MockScriptEngine`. This replaces most usages of the one argument ctor on `Script` with calls to `ESTestCase#mockScript` to make it clear that the tests don't need the default scripting language. I've also factored out some copy and pasted script generation code into a single place. I would have had to change that code to use `mockScript` anyway, so it was easier to perform the refactor. Relates to #16314	2017-04-26 16:04:38 -04:00
Luca Cavanna	149629fec6	Cross Cluster Search: propagate original indices per cluster (#24328 ) In case of a Cross Cluster Search, the coordinating node should split the original indices per cluster, and send over to each cluster only its own set of original indices, rather than the set taken from the original search request which contains all the indices. In fact, each remote cluster should not be aware of the indices belonging to other remote clusters.	2017-04-26 21:45:49 +02:00
Yannick Welsch	b7bf651738	[TEST] Fix cluster forming in testDynamicUpdateMinimumMasterNodes This test can run into a split-brain situation as minimum_master_nodes is not properly set. To prevent this, make sure that at least one of the two master nodes that are initially started has minimum_master_nodes correctly set.	2017-04-26 21:13:27 +02:00
Martijn van Groningen	ebe98f9d62	test: don't randomly wrap index reader	2017-04-26 21:07:56 +02:00
Ali Beyad	0e52e3420e	Fixes restore of a shrunken index when initial recovery node is gone (#24322 ) When an index is shrunk using the shrink APIs, the shrink operation adds some internal index settings to the shrink index, for example `index.shrink.source.name|uuid` to denote the source index, as well as `index.routing.allocation.initial_recovery._id` to denote the node on which all shards for the source index resided when the shrunken index was created. However, this presents a problem when taking a snapshot of the shrunken index and restoring it to a cluster where the initial recovery node is not present, or restoring to the same cluster where the initial recovery node is offline or decommissioned. The restore operation fails to allocate the shard in the shrunken index to a node when the initial recovery node is not present, and a restore type of recovery will not go through the PrimaryShardAllocator, meaning that it will not have the chance to force allocate the primary to a node in the cluster. Rather, restore initiated shard allocation goes through the BalancedShardAllocator which does not attempt to force allocate a primary. This commit fixes the aforementioned problem by not requiring allocation to occur on the initial recovery node when the recovery type is a restore of a snapshot. This commit also ensures that the internal shrink index settings are recognized and not archived (which can trip an assertion in the restore scenario). Closes #24257	2017-04-26 14:48:10 -04:00
Koen De Groote	3187ed73fc	Removal of dead code in ScriptedMetricAggregationBuilder (#24346 ) This code removes a few lines of dead code from ScriptedMetricAggregationBuilder. Just completely dead code, it adds things to a Set that is then not used in any way.	2017-04-26 14:44:03 -04:00
Koen De Groote	4c0eb35c22	Removal of dead code from SnapshotsService (#24347 ) This code removes a few lines of dead code from SnapshotsService. Looks like a forgotten remnant of a past implementation.	2017-04-26 14:32:35 -04:00
Nik Everett	7c3efb829b	Move char filters into analysis-common (#24261 ) Another step down the road to dropping the lucene-analyzers-common dependency from core. Note that this removes some tests that no longer compile from core. I played around with adding them to the analysis-common module where they would compile but we already test these in the tests generated from the example usage in the documentation. I'm not super happy with the way that `requriesAnalysisSettings` works with regards to plugins. I think it'd be fairly bug-prone for plugin authors to use. But I'm making it visible as is for now and I'll rethink later. A part of #23658	2017-04-26 13:25:34 -04:00
Christoph Büscher	db1b243343	InternalPercentilesBucket should not rely on ordered percents array (#24336 ) Currently InternalPercentilesBucket#percentile() relies on the percent array passed in to be in sorted order. This changes the aggregation to store an internal lookup table that is constructed from the percent/percentiles arrays passed in that can be used to look up the percentile values. Closes #24331	2017-04-26 19:15:48 +02:00
Yannick Welsch	91b61ce569	[TEST] Do a reroute with retry_failed after a bridge partition on testAckedIndexing In case of a bridge partition, shard allocation can fail "index.allocation.max_retries" times if the master is the super-connected node and recovery source and target are on opposite sides of the bridge. This commit adds a reroute with retry_failed after healing the network partition so that the ensureGreen check succeeds.	2017-04-26 16:08:16 +02:00
Jay Modi	7f8fe8b81d	StreamInput throws exceptions instead of using assertions (#24294 ) StreamInput has methods such as readVInt that perform sanity checks on the data using assertions, which will catch bad data in tests but provide no safety when running as a node without assertions enabled. The use of assertions also makes testing with invalid data difficult since we would need to handle assertion errors in the code using the stream input and errors like this should not be something we try to catch. This commit introduces a flag that will throw an IOException instead of using an assertion.	2017-04-26 07:23:07 -04:00
Martijn van Groningen	c17de49a6d	[percolator] Fix memory leak when percolator uses bitset or field data cache. The percolator doesn't close the IndexReader of the memory index any more. Prior to 2.x the percolator had its own SearchContext (PercolatorContext) that did this, but that was removed when the percolator was refactored as part of the 5.0 release. I think an alternative way to fix this is to let percolator not use the bitset and fielddata caches, that way we prevent the memory leak. Closes #24108	2017-04-26 11:08:15 +02:00
Koen De Groote	3c845727f8	Replace alternating regex with character classes This commit replaces two alternating regular expressions (that is, regular expressions that consist of the form a|b where a and b are characters) with the equivalent regular expression rewritten as a character class (that is, [ab]). The reason this is an improvement is that a|b involves backtracking while [ab] does not. Relates #24316	2017-04-25 22:15:00 -04:00
Guillaume Le Floch	739cb35d1b	Allow passing single scrollID in clear scroll API body (#24242 ) * Allow single scrollId in string format Closes #24233	2017-04-25 13:43:21 +02:00
Koen De Groote	88de33d43d	Minor changes to collection creation from enums (#24274 ) These changes are mainly cosmetic with minor perf advantages drawn from checkstyle.	2017-04-25 13:13:55 +02:00
Ryan Ernst	6ebf08759b	Templates: Add compileTemplate method to ScriptService for template consumers (#24280 ) This commit adds a compileTemplate method to the ScriptService. Eventually this will be used to easily cutover all consumers to a new TemplateService. relates #16314	2017-04-24 15:45:20 -07:00
Christoph Büscher	026bf2e3ee	Remove getCountAsString() from InternalStats and Stats interface (#24291 ) The `count` value in the stats aggregation represents a simple doc count that doesn't require a formatted version. We didn't render an "as_string" version for count in the rest response, so the method should also be removed in favour of just using String.valueOf(getCount()) if a string version of the count is needed. Closes #24287	2017-04-24 18:40:57 +02:00
Ali Beyad	c5b6f52ecc	Fixes maintaining the shards a snapshot is waiting on (#24289 ) There was a bug in the calculation of the shards that a snapshot must wait on, due to their relocating or initializing, before the snapshot can proceed safely to snapshot the shard data. In this bug, an incorrect key was used to look up the index of the waiting shards, resulting in the fact that each index would have at most one shard in the waiting state causing the snapshot to pause. This could be problematic if there is more than one shard in the relocating or initializing state, which would result in a snapshot prematurely starting because it thinks it's only waiting on one relocating or initializing shard (when in fact there could be more than one). While not a common case and likely rare in practice, it is still problematic. This commit fixes the issue by ensuring the correct key is used to look up the waiting indices map as it is being built up, so the list of waiting shards for each index (those shards that are relocating or initializing) is aggregated for a given index instead of overwritten.	2017-04-24 10:59:08 -04:00
Martijn van Groningen	dabbf5d4f4	[TEST] Added unittests for InternalGeoCentroid Relates to #22278	2017-04-24 16:57:25 +02:00
Nilabh Sagar	373edee29a	Provide informative error message in case of unknown suggestion context. (#24241 ) Provide a list of available contexts when you send an unknown context to the completion suggester.	2017-04-24 10:35:14 -04:00
Jason Tedor	1500beafc7	Check for default.path.data included in path.data If the user explicitly configured path.data to include default.path.data, then we should not fail the node if we find indices in default.path.data. This commit addresses this. Relates #24285	2017-04-24 09:31:54 -04:00
Jason Tedor	a7947b404b	Fix hash code for AliasFilter This commit fixes the hash code for AliasFilter as the previous implementation was neglecting to take into consideration the fact that the aliases field is an array and thus a deep hash code of it should be computed rather than a shallow hash code on the reference. Relates #24286	2017-04-24 09:06:36 -04:00
Yannick Welsch	7c395070e2	[TEST] Wait for tribe node to be fully connected before shutting it down The tribe node was being shut down by the test while a publishing round (that adds the tribe node to a cluster) was not yet completed (i.e. the node itself knows that it became part of the cluster, and the test shuts the tribe node down, but another node has not applied the cluster state yet). This makes that node hang while trying to connect to the node that is shutting down (due to connect_timeout being 30 seconds), delaying publishing for 30 seconds, and subsequently tripping an assertion when another tribe instance wants to join. Relates to #23695	2017-04-24 12:27:41 +02:00
Colin Goodheart-Smithe	6d6a230f70	Makes StoredScriptSource implement ToXContentObject	2017-04-24 10:20:15 +01:00
Colin Goodheart-Smithe	d4a6ba8ec9	No longer add illegal content type option to stored search templates (#24251 ) When parsing StoredSearchScript we were adding a Content type option that was forbidden (by a check that threw an exception) by the parser that's used to parse the template when we read it from the cluster state. This was stopping Elasticsearch from starting after stored search templates had been added. This change no longer adds the content type option to the StoredScriptSource object when parsing from the put search template request. This is safe because the StoredScriptSource content is always JSON when it's stored in the cluster state since we do a conversion to JSON before this point. Also removes the check for the content type in the options when parsing StoredScriptSource so users who already have stored scripts can start Elasticsearch. Closes #24227	2017-04-22 13:37:04 -04:00
Ryan Ernst	473e98981b	Scripts: Remove unnecessary executable shortcut (#24264 ) ScriptService has two executable methods, one which takes a CompiledScript, which is similar to search, and one that takes a raw Script and both compiles and returns an ExecutableScript for it. The latter is not needed, and the call sites which used one or the other were mixed. This commit removes the extra executable method in favor of callers first calling compile, then executable.	2017-04-21 17:53:03 -07:00
Ryan Ernst	aadc33d260	Scripts: Remove unwrap method from executable scripts (#24263 ) The unwrap method was left over from supporting JavaScript and Python. Since those languages are removed in 6.0, this commit removes the unwrap feature from scripts.	2017-04-21 17:50:22 -07:00
Nik Everett	447f307ebb	Fix _bulk response when it can't create an index (#24048 ) Before #22488 when an index couldn't be created during a `_bulk` operation we'd do all the other actions and return the index creation error on each failing action. In #22488 we accidentally changed it so that we now reject the entire bulk request if a single action cannot create an index that it must create to run. This reverts to the old behavior while still keeping the nicer error messages. Instead of failing the entire request we now only fail the portions of the request that can't work because the index doesn't exist. Closes #24028	2017-04-21 18:56:04 -04:00
Jason Tedor	fe91c72151	Use a marker file when removing a plugin Today when removing a plugin, we attempt to move the plugin directory to a temporary directory and then delete that directory from the filesystem. We do this to avoid a plugin being in a half-removed state. We previously tried an atomic move, and fell back to a non-atomic move if that failed. Atomic moves can fail on union filesystems when the plugin directory is not in the top layer of the filesystem. Interestingly, the regular move can fail as well. This is because when the JDK is executing such a move, it first tries to rename the source directory to the target directory and if this fails with EXDEV (as in the case of an atomic move failing), it falls back to copying the source to the target, and then attempts to rmdir the source. The bug here is that the JDK never deleted the contents of the source so the rmdir will always fail (except in the case of an empty directory). Given all this silliness, we were inspired to find a different strategy. The strategy is simple. We will add a marker file to the plugin directory that indicates the plugin is in a state of removal. This file will be the last file out the door during removal. If this file exists during startup, we fail startup. Relates #24252	2017-04-21 15:50:44 -04:00
Simon Willnauer	2ca7072b24	Fill missing sequence IDs up to max sequence ID when recovering from store (#24238 ) Today we might promote a primary and recover from store where after translog recovery the local checkpoint is still behind the maximum sequence ID seen. To fill the holes in the sequence ID history this PR adds a utility method that fills up all missing sequence IDs up to the maximum seen sequence ID with no-ops. Relates to #10708	2017-04-21 20:28:00 +02:00
Ryan Ernst	ba48674695	Build: Move plugin cli and tests to distribution tool (#24220 ) The plugin cli currently resides inside the elasticsearch jar. This commit moves it into a plugin-cli jar. This change alone is a no-op; it does not change anything about what is loaded at runtime. But it will allow easier testing (with fixtures in the future to test ES or maven installation), as well as eventually not loading these classes when starting elasticsearch.	2017-04-21 09:25:58 -07:00
Boaz Leskes	badb2be066	Peer Recovery: remove maxUnsafeAutoIdTimestamp hand off (#24243 ) With #24149 , it is now stored in the Lucene commit and is implicitly transferred in the file phase of the recovery.	2017-04-21 17:31:50 +02:00
Ali Beyad	63e5aff5d6	Adds version 5.3.2 and backwards compatibility indices for 5.3.1	2017-04-21 10:48:41 -04:00
Tanguy Leroux	480bf0996d	Add utility method to parse named XContent objects with typed prefix (#24240 ) This commit adds a XContentParserUtils.parseTypedKeysObject() method that can be used to parse named XContent objects identified by a field name containing a type identifier, a delimiter and the name of the object to parse.	2017-04-21 15:41:27 +02:00
Tanguy Leroux	251b6d452b	MultiBucketsAggregation.Bucket should not extend Writeable (#24216 ) The MultiBucketsAggregation.Bucket interface extends Writeable, forcing all implementation classes to implement writeTo(). This commit removes the Writeable from the interface and moves it down to the InternalBucket implementation.	2017-04-21 15:29:53 +02:00
Yannick Welsch	c2deb1c81d	Don't expose cleaned-up tasks as pending in PrioritizedEsThreadPoolExecutor (#24237 ) Changes in #24102 exposed the following oddity: PrioritizedEsThreadPoolExecutor.getPending() can return Pending entries where pending.task == null. This can happen for example when tasks are added to the pending list while they are in the clean up phase, i.e. TieBreakingPrioritizedRunnable#runAndClean has run already, but afterExecute has not removed the task yet. Instead of safeguarding consumers of the API (as was done before #24102) this changes the executor to not count these tasks as pending at all.	2017-04-21 15:25:19 +02:00
Colin Goodheart-Smithe	3c7c4bc824	Adds declareNamedObjects methods to ConstructingObjectParser (#24219 ) * Adds declareNamedObjects methods to ConstructingObjectParser * Addresses review comments	2017-04-21 09:50:30 +01:00
Christoph Büscher	c8ad26edc9	Tests: Extend InternalStatsTests (#24212 ) Currently we don't test for count = 0 which will make a difference when adding tests for parsing for the high level rest client. Also min/max/sum should also be tested with negative values and on a larger range.	2017-04-21 10:38:09 +02:00
Adrien Grand	81b64ed587	IndicesQueryCache should delegate the scorerSupplier method. (#24209 ) Otherwise the range improvements that we did on range queries would not work. This is similar to https://issues.apache.org/jira/browse/LUCENE-7749.	2017-04-21 10:33:02 +02:00
Adrien Grand	f322f537e4	Speed up parsing of large `terms` queries. (#24210 ) The addition of the normalization feature on keywords slowed down the parsing of large `terms` queries since all terms now have to go through normalization. However this can be avoided in the default case that the analyzer is a `keyword` analyzer since all that normalization will do is a UTF8 conversion. Using `Analyzer.normalize` for that is a bit overkill and could be skipped.	2017-04-21 10:32:33 +02:00
Jim Ferenczi	a4365971a0	[TEST] make sure that the random query_string query generator defines a default_field or a list of fields	2017-04-21 02:56:26 +02:00
Fabien Baligand	4a45579506	token_count type : add an option to count tokens (fix #23227 ) (#24175 ) Add option "enable_position_increments" with default value true. If option is set to false, indexed value is the number of tokens (not position increments count)	2017-04-21 00:53:28 +02:00
Jim Ferenczi	525101b64d	Query string default field (#24214 ) Currently any `query_string` query that uses a wildcard field with no matching field is rewritten with the `_all` field. For instance: ```` #creating test doc PUT testing/t/1 { "test": { "field_one": "hello", "field_two": "world" } } #searching abc.* (does not exist) -> hit GET testing/t/_search { "query": { "query_string": { "fields": [ "abc.*" ], "query": "hello" } } } ```` This bug first appeared in 5.0 after the query refactoring and impacts only users that use `_all` as default field. Indices created in 6.x will not have this problem since `_all` is deactivated in this version. This change fixes this bug by returning a MatchNoDocsQuery for any term that expands to an empty list of fields.	2017-04-20 22:12:20 +02:00
Luca Cavanna	82c678b5c7	Make Aggregations an abstract class rather than an interface (#24184 ) Some of the base methods that don't have to do with reduce phase and serialization can be moved to the base class which is no longer an interface. This will be reusable by the high level REST client further on the road. Also it simplifies things as having an interface with a single implementor is not that helpful.	2017-04-20 21:31:34 +02:00
Areek Zillur	077a6c3ee7	[TEST] ensure expected sequence no and version are set when index/delete engine operation has a document failure	2017-04-20 13:38:52 -04:00
Yannick Welsch	22e0795990	Extract batch executor out of cluster service (#24102 ) Refactoring that extracts the task batching functionality from ClusterService and makes it a reusable component that can be tested in isolation.	2017-04-20 17:28:43 +02:00
Tanguy Leroux	55a879ee8d	Align behavior of HDR percentiles iterator with percentile() method (#24206 )	2017-04-20 12:37:33 +02:00
Nik Everett	caf376c8af	Start building analysis-common module (#23614 ) Start moving built in analysis components into the new analysis-common module. The goal of this project is: 1. Remove core's dependency on lucene-analyzers-common.jar which should shrink the dependencies for transport client and high level rest client. 2. Prove that analysis plugins can do all the "built in" things by moving all "built in" behavior to a plugin. 3. Force tests not to depend on any oddball analyzer behavior. If tests need anything more than the standard analyzer they can use the mock analyzer provided by Lucene's test infrastructure.	2017-04-19 18:51:34 -04:00
Jason Tedor	4796557a30	Add primary term to doc write response This commit adds the primary term to the doc write response. Relates #24171	2017-04-19 14:44:22 -04:00
Ryan Ernst	c7e9231a86	Plugins: Remove leniency for missing plugins dir (#24173 ) This leniency was left in after plugin installer refactoring for 2.0 because some tests still relied on it. However, the need for this leniency no longer exists.	2017-04-19 09:09:34 -07:00
Christoph Büscher	a9657a5a09	Add BucketMetricValue interface (#24188 ) Unlike other implementations of InternalNumericMetricsAggregation.SingleValue, the InternalBucketMetricValue aggregation currently doesn't implement a specialized interface that exposes the `keys()` method. This change adds this so that clients can access the keys via the interface.	2017-04-19 16:27:33 +02:00
Jim Ferenczi	f05af0a382	Enable index-time sorting (#24055 ) This change adds an index setting to define how the documents should be sorted inside each Segment. It allows any numeric, date, boolean or keyword field inside a mapping to be used to sort the index on disk. It is not allowed to use `nested` fields inside an index that defines an index sorting since `nested` fields rely on the original sort of the index. This change does not add early termination capabilities in the search layer. This will be added in a follow up. Relates #6720	2017-04-19 14:36:11 +02:00
Boaz Leskes	8758c541b3	ElectMasterService.hasEnoughMasterNodes should return false if no masters were found This is a regression introduced in #20063	2017-04-19 09:52:06 +02:00
Tanguy Leroux	741c031384	[Test] Add unit tests for InternalHDRPercentilesTests (#24157 ) Related to #22278	2017-04-19 09:37:01 +02:00
Areek Zillur	4f773e2dbb	Replicate write failures (#23314 ) * Replicate write failures Currently, when a primary write operation fails after generating a sequence number, the failure is not communicated to the replicas. Ideally, every operation which generates a sequence number on primary should be recorded in all replicas. In this change, a sequence number is associated with write operation failure. When a failure with an assigned sequence number arrives at a replica, the failure cause and sequence number are recorded in the translog and the sequence number is marked as completed via executing `Engine.noOp` on the replica engine. * use zlong to serialize seq_no * Incorporate feedback * track write failures in translog as a noop in primary * Add tests for replicating write failures. Test that document failure (w/ seq no generated) are recorded as no-op in the translog for primary and replica shards * Update to master * update shouldExecuteOnReplica comment * rename indexshard noop to markSeqNoAsNoOp * remove redundant conditional * Consolidate possible replica action for bulk item request depending on its primary execution * remove bulk shard result abstraction * fix failure handling logic for bwc * add more tests * minor fix * cleanup * incorporate feedback * incorporate feedback * add assert to remove handling noop primary response when 5.0 nodes are not supported	2017-04-19 01:23:54 -04:00
Jason Tedor	9e0ebc5965	Rename variable in translog simple commit test This commit renames a variable for clarity in the translog simple commit test.	2017-04-18 23:43:25 -04:00
Jason Tedor	20181dd0ad	Strengthen translog commit with open view test This commit strengthens an assertion in the translog commit with open view test.	2017-04-18 23:41:55 -04:00
Jason Tedor	180d1f2219	Stronger check in translog prepare and commit test This commit strengthens an assertion in the translog prepare commit and commit test.	2017-04-18 23:37:54 -04:00
Jason Tedor	23b224a5a9	Fix translog prepare commit and commit test This test was terribly, horribly, no goodly, and badly broken it's amazing it ever passed so this commit fixes it.	2017-04-18 23:32:47 -04:00
Boaz Leskes	edff30f82a	Engine: store maxUnsafeAutoIdTimestamp in commit (#24149 ) The `maxUnsafeAutoIdTimestamp` timestamp is a safety marker guaranteeing that no retried-indexing operation with a higher auto gen id timestamp was processed by the engine. This allows us to safely process documents without checking if they were seen before. Currently this property is maintained in memory and is handed off from the primary to any replica during the recovery process. This commit takes a more natural approach and stores it in the lucene commit, using the same semantics (no retry op with a higher time stamp is part of this commit). This means that the knowledge is transferred during the file copy and also means that we don't need to worry about crazy situations where an original append only request arrives at the engine after a retry was processed and the engine was restarted.	2017-04-18 20:11:32 +02:00
Simon Willnauer	ab9884b2e9	Remove leniency when merging fetched hits in a search response phase (#24158 ) Today when we merge hits we have a hard check to prevent AIOOB exceptions that simply skips an expected search hit. This can only happen if there is a bug in the code which should be turned into a hard exception or an assertion triggered. This change adds an assertion and removes the lenient check for the fetched hits.	2017-04-18 17:19:57 +02:00
Tanguy Leroux	829dd068d6	[Test] Use appropriate DocValueFormats in Aggregations tests (#24155 ) Some aggregations (like Min, Max etc) use a wrong DocValueFormat in tests (like IP or GeoHash). We should not test aggregations that expect a numeric value with a DocValueFormat like IP. Such a wrong DocValueFormat can also prevent the aggregation from being rendered as ToXContent, and this will be an issue for the High Level Rest Client tests which expect to be able to parse back aggregations.	2017-04-18 17:03:32 +02:00
Christoph Büscher	8f540346a9	Tests: Fixing typo in class name of InternalGlobalTests Renaming from InternalGlogbalTests -> InternalGlobalTests	2017-04-18 16:27:15 +02:00
Adrien Grand	4632661bc7	Upgrade to a Lucene 7 snapshot (#24089 ) We want to upgrade to Lucene 7 ahead of time in order to be able to check whether it causes any trouble to Elasticsearch before Lucene 7.0 gets released. From a user perspective, the main benefit of this upgrade is the enhanced support for sparse fields, whose resource consumption is now a function of the number of docs that have a value rather than the total number of docs in the index. Some notes about the change: - it includes the deprecation of the `disable_coord` parameter of the `bool` and `common_terms` queries: Lucene has removed support for coord factors - it includes the deprecation of the `index.similarity.base` expert setting, since it was only useful to configure coords and query norms, which have both been removed - two tests have been marked with `@AwaitsFix` because of #23966, which we intend to address after the merge	2017-04-18 15:17:21 +02:00
Tanguy Leroux	f217eb8ad8	Merge Percentile class with interface (#24154 ) This commit merges the Percentile interface with the InternalPercentile class, as we don't need to maintain both.	2017-04-18 14:47:18 +02:00
Martijn van Groningen	edada2581e	[TEST] Added unittests for InternalSampler	2017-04-18 14:31:58 +02:00
Yannick Welsch	0b2cb68f6f	[TEST] Randomly add and remove no_master blocks in IndicesClusterStateServiceRandomUpdatesTests Checks that IndicesClusterStateService stays consistent with incoming cluster states that contain no_master blocks (especially discovery.zen.no_master_block=all which disables state persistence). In particular this checks that active shards which have no in-memory data structures on a node are failed.	2017-04-18 14:27:54 +02:00
Martijn van Groningen	ac41fb2c4a	[TEST] Added test for GeoCentroidAggregator and made constructors of GeoCentroidAggregator, GeoCentroidAggregatorFactory and InternalGeoCentroid package protected.	2017-04-18 13:54:31 +02:00
Tanguy Leroux	81dbdb239f	[Test] Add unit tests for InternalTDigestPercentilesTests (#24090 )	2017-04-18 09:48:35 +02:00
Chris Earle	12c8423ec9	Warn on not enough masters during election (#20063 ) This changes the trace level logging to warn, and adds the needed number to the message as well. My fear is that it may get noisy, but this is an issue that you want to be noisy.	2017-04-17 22:18:28 -04:00
Jason Tedor	34eda1a1a8	Do not set path.data in environment if not set When preparing the final settings in the environment, we unconditionally set path.data even if path.data was not explicitly set. This confounds detection for whether or not path.data was explicitly set, and this is trappy. This commit adds logic to only set path.data in the final settings if path.data was explicitly set, and provides a test case that fails without this logic. Relates #24132	2017-04-17 10:43:13 -04:00
Jason Tedor	f7ebe9d18f	Preserve multiple translog generations Today when a flush is performed, the translog is committed and if there are no outstanding views, only the current translog generation is preserved. Yet for the purpose of sequence numbers, we need stronger guarantees than this. This commit migrates the preservation of translog generations to keep the minimum generation that would be needed to recover after the local checkpoint. Relates #24015	2017-04-17 08:51:54 -04:00
Jason Tedor	8033c576b7	Detect remnants of path.data/default.path.data bug In Elasticsearch 5.3.0 a bug was introduced in the merging of default settings when the target setting existed as an array. When this bug concerns path.data and default.path.data, we ended up in a situation where the paths specified in both settings would be used to write index data. Since our packaging sets default.path.data, users that configure multiple data paths via an array and use the packaging are subject to having shards land in paths in default.path.data when that is very likely not what they intended. This commit is an attempt to rectify this situation. If path.data and default.path.data are configured, we check for the presence of indices there. If we find any, we log messages explaining the situation and fail the node. Relates #24099	2017-04-17 07:03:46 -04:00
jaymode	a8be0a5836	Cat APIs should not close the stream obtained from the channel The cat APIs and rest tables would obtain a stream from the RestChannel, which happened to be a ReleasableBytesStreamOutput. These APIs used the stream to write content to, closed the stream, and then tried to send a response. After #23941 was merged, closing the stream meant that the bytes were released for use elsewhere. This caused occasional corruption of the response when the bytes were used prior to the response being sent. This commit changes these two usages to wrap the stream obtained from the channel in a flush on close stream so that the bytes are still reserved until the message is sent.	2017-04-15 14:57:00 -04:00
Jason Tedor	cd8e059885	Do not produce empty IDs in simple versioning test Empty IDs are rejected during indexing, so we should not randomly produce them during tests. This commit modifies the simple versioning tests to no longer produce empty IDs.	2017-04-15 12:15:45 -04:00
Jason Tedor	972bdc09ee	Reject empty IDs When indexing a document via the bulk API where IDs can be explicitly specified, we currently accept an empty ID. This is problematic because such a document can not be obtained via the get API. Instead, we should reject these requests as accepting them could be a dangerous form of leniency. Additionally, we already have a way of specifying auto-generated IDs and that is to not explicitly specify an ID so we do not need a second way. This commit rejects the individual requests where ID is specified but empty. Relates #24118	2017-04-15 10:36:03 -04:00
Boaz Leskes	ecf81688fb	Use sequence numbers to identify out of order delivery in replicas & recovery (#24060 ) Internal indexing requests in Elasticsearch may be processed out of order and repeatedly. This is important during recovery and due to concurrency in replicating requests between primary and replicas. As such, a replica/recovering shard needs to be able to identify that an incoming request contains information that is old and thus need not be processed. The current logic is based on external version. This is sadly not sufficient. This PR moves the logic to rely on sequence numbers and primary terms which give the semantics we need. Relates to #10708	2017-04-14 21:46:17 +02:00
Jason Tedor	09efdc3151	Improve performance of extracting warning value When building headers for a REST response, we de-duplicate the warning headers based on the actual warning value. The current implementation of this uses a capturing regular expression that is prone to excessive backtracking. In cases where a request involves a large number of warnings, this extraction can be a severe performance penalty. An example where this can arise is a bulk indexing request that utilizes a deprecated feature (e.g., using deprecated forms of boolean values). This commit is an attempt to address this performance regression. We already know the format of the warning header, so we do not need to use a regular expression to parse it but rather can parse it by hand to extract the warning value. This gains back the vast majority of the performance lost due to the usage of a deprecated feature. There is still a performance loss due to logging the deprecation message but we do not address that concern in this commit. Relates #24114	2017-04-14 12:18:00 -04:00
Jay Modi	30ab8739a6	Closing a ReleasableBytesStreamOutput closes the underlying BigArray (#23941 ) This commit makes closing a ReleasableBytesStreamOutput release the underlying BigArray so that we can use try-with-resources with these streams and avoid leaking memory by not returning the BigArray. As part of this change, the ReleasableBytesStreamOutput adds protection to only release the BigArray once. In order to make some of the changes cleaner, the ReleasableBytesStream interface has been removed. The BytesStream interface is changed to an abstract class so that we can use it as a usable return type for a new method, Streams#flushOnCloseStream. This new method wraps a given stream and overrides the close method so that the stream is simply flushed and not closed. This behavior is used in the TcpTransport when compression is used with a ReleasableBytesStreamOutput as we need to close the compressed stream to ensure all of the data is written from this stream. Closing the compressed stream will try to close the underlying stream but we only want to flush so that all of the written bytes are available. Additionally, an error message method added in the BytesRestResponse did not use a builder provided by the channel and instead created its own JSON builder. This changes that method to use the channel builder and in turn the bytes stream output that is managed by the channel. Note, this commit differs from `6bfecdf921` in that it updates ReleasableBytesStreamOutput to handle the case of the BigArray decreasing in size, which changes the reference to the BigArray. When the reference is changed, the releasable needs to be updated otherwise there could be a leak of bytes and corruption of data in unrelated streams. This reverts commit `afd45c1432`, which reverted #23572.	2017-04-14 10:50:31 -04:00
Yannick Welsch	e3aa2a89f9	[TEST] Wait in OldIndexBackwardsCompatibilityIT for cluster to be fully initialized There are test failures that suggest that the import of dangling indices is happening too early, before the dangling indices are ready to be consumed. This commit adds an ensureGreen() at the end of cluster initialization to make sure that no cluster state updates are happening while the dangling indices are prepared on-disk.	2017-04-14 11:02:55 +02:00
Ali Beyad	5e54c0261a	[TEST] fixes InternalTopHitsTests test to initialize the SearchHits maxScore to Float.NaN if there is no max score, as that is what Lucene's TopDocs does	2017-04-13 18:27:42 -04:00
Igor Motov	cce321a560	Task Management: Make TaskInfo parsing forwards compatible (#24073 ) TaskInfo is stored as a part of TaskResult and therefore can be read by nodes with an older version. If we add any additional information to TaskInfo (for #23250, for example), nodes with an older version should be able to ignore it, otherwise they will not be able to read TaskResults stored by newer nodes.	2017-04-13 16:16:01 -04:00
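
The hand-parsing approach described in commit 09efdc3151 ("Improve performance of extracting warning value") can be sketched roughly as below. This is an illustration only, not the code from that commit: the class and method names are hypothetical, and the sketch assumes a warning header of the shape `299 <agent> "<warning value>" "<date>"` with no quote characters inside the date.

```java
// Hypothetical sketch: extract the warning value from a header of the assumed
// shape  299 <agent> "<warning value>" "<date>"  using index lookups only.
final class WarningHeaderSketch {

    private WarningHeaderSketch() {}

    static String extractWarningValue(String header) {
        // The warning value starts right after the first quote in the header.
        int valueOpen = header.indexOf('"');
        // The last quote closes the date; the quote before it opens the date.
        int dateClose = header.lastIndexOf('"');
        int dateOpen = header.lastIndexOf('"', dateClose - 1);
        // The quote before the date's opening quote closes the warning value.
        int valueClose = header.lastIndexOf('"', dateOpen - 1);
        if (valueOpen < 0 || valueClose <= valueOpen) {
            throw new IllegalArgumentException("unexpected warning header: " + header);
        }
        // Everything between the two is the value, including any quotes
        // that happen to appear inside the value itself.
        return header.substring(valueOpen + 1, valueClose);
    }
}
```

Because the positions are found by scanning from both ends of the string, the work is linear in the header length and there is no backtracking, which is the property the commit message is after.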


7977 Commits