OpenSearch

Commit Graph

Author	SHA1	Message	Date
Simon Willnauer	c7db8843b3	[TEST] Stabilize BenchmarkIntegrationTest#testAbortBenchmark	2014-05-17 23:33:49 +02:00
Alex Ksikes	db991dc3a4	More Like This Query: Added searching for multiple items. The syntax to specify one or more items is the same as for the Multi GET API. If only one document is specified, the results returned are the same as when using the More Like This API. Relates #4075 Closes #5857	2014-05-17 19:14:56 +02:00
Igor Motov	a3581959d7	[TESTS] Ignore SnapshotMissingException in snapshotWithStuckNodeTest The retry mechanism in the transport layer might cause the delete snapshot request to be executed twice if the cluster master is closed while the request is executed. First time delete snapshot request is getting successfully executed on the old master and then it is retried on the newly elected master. When the new master tries to delete the snapshot - the snapshot no longer exists (since it was successfully deleted by the old master) and SnapshotMissingException is returned.	2014-05-17 11:18:11 -04:00
Igor Motov	c20713530d	Switch to shared thread pool for all snapshot repositories Closes #6181	2014-05-16 19:03:15 -04:00
Igor Motov	7f5befd95e	Add Partial snapshot state Currently even if some shards of the snapshot are not snapshotted successfully, the snapshot is still marked as "SUCCESS". Users may miss the fact that there are shard failures present in the snapshot and think that snapshot was completed. This change adds a new snapshot state "PARTIAL" that provides a quick indication that the snapshot was only partially successful. Closes #5792	2014-05-16 18:26:56 -04:00
Boaz Leskes	9f10547f4b	Allow 0 as a valid external version Until now all version types have officially required the version to be a positive long number. Despite this being documented, ES versions <=1.0 did not enforce it when using the `external` version type. As a result people have successfully indexed documents with 0 as a version. In 1.1 we introduced validation checks on incoming version values, causing indexing requests to fail if the version was set to 0. While this is strictly speaking OK, we effectively have a situation where data already indexed does not match the version invariant. To be lenient and adhere to the spirit of our data backward compatibility policy, we have decided to allow 0 as a valid external version type. This is somewhat complicated as 0 is also the internal value of `MATCH_ANY`, which indicates requests should succeed regardless of the current doc version. To keep things simple, this commit changes the internal value of `MATCH_ANY` to `-3` for all version types. Since we're doing this in a minor release (and because versions are stored in the transaction log), the default `internal` version type still accepts 0 as a `MATCH_ANY` value. This is not a problem for other version types as `MATCH_ANY` doesn't make sense in that context. Closes #5662	2014-05-16 22:10:16 +02:00
Simon Willnauer	bf22df7fd0	Remove SoftReferences from StreamInput/StreamOutput We try to reuse character arrays and UTF8 writers with softreferences. SoftReferences have negative impact on GC and should be avoided in general. Yet in this case it can simply be replaced with a per-stream Bytes/CharsRef that is thread local and has the same lifetime as the stream.	2014-05-16 20:58:42 +02:00
Simon Willnauer	11a3201a09	Use EnumSet rather than static mutable arrays ClusterBlockLevel uses arrays but should use EnumSets instead	2014-05-16 20:54:01 +02:00
Simon Willnauer	d65e9e9bea	Add some finals where appropriate	2014-05-16 20:54:01 +02:00
Simon Willnauer	c561900512	Use UTF-8 as string encoding	2014-05-16 20:54:01 +02:00
David Pilato	0dbc83e7b0	[TEST] Do not filter gz files	2014-05-16 15:23:09 +02:00
Simon Willnauer	d806b567e4	Remove dead code	2014-05-16 15:08:56 +02:00
Simon Willnauer	eef505ed51	RecoveryID should not be a per JVM but per Node Today the RecoveryID is taken from a static atomic long which is essentially a per JVM ID. We run the tests within the same JVM and that means we don't really simulate what happens in production environments. Instead we should use a per node generated ID.	2014-05-16 14:59:32 +02:00
Simon Willnauer	9a9cc0b8e4	Add simple example to XContentParser how to obtain an instance of it	2014-05-16 14:55:22 +02:00
David Pilato	bd871f96c2	Check that a plugin is Lucene compatible with the current running node using `lucene` property in `es-plugin.properties` file. * If plugin does not provide `lucene` property, we consider that the plugin is compatible. * If plugin provides `lucene` property, we try to load related Enum org.apache.lucene.util.Version. If this fails, it means that the node is too "old" compared to the Lucene version the plugin was built for. * We then compare the first two digits of current node lucene version against the first two digits of plugin Lucene version. If not equal, it means that the plugin is too "old" for the current node. Plugin developers who want to enable the plugin check only have to add a `lucene` property in `es-plugin.properties` file. If you are using maven to build your plugin, you can do it like this: In `pom.xml`: ```xml <properties> <lucene.version>4.6.0</lucene.version> </properties> <build> <resources> <resource> <directory>src/main/resources</directory> <filtering>true</filtering> </resource> </resources> </build> ``` In `es-plugin.properties`, add: ```properties lucene=${lucene.version} ``` BTW, if you don't already have it, you can add the plugin version as well: ```properties version=${project.version} ``` You can disable that check using `plugins.check_lucene: false`.	2014-05-16 13:41:20 +02:00
Simon Willnauer	094908ac7f	Randomize CMS settings in index template This commit adds randomization for: * `index.merge.scheduler.max_thread_count` * `index.merge.scheduler.max_merge_count` This commit also moves to use EsExecutors#boundedNumberOfProcessors(Settings) to default configure the default `max_thread_count` for better reproducibility Closes #6194	2014-05-15 23:16:45 +02:00
javanna	7548b2edb7	Unified MetaData#concreteIndices methods into a single method that accepts indices (or aliases) and indices options Added new internal flag to IndicesOptions that tells whether aliases can be resolved to multiple indices or not. Cut over to new metaData#concreteIndices(IndicesOptions, String...) for all the api previously using MetaData#concreteIndices(String[], IndicesOptions) and removed old method, deprecation is not needed as it doesn't break client code. Introduced constants for flags in IndicesOptions for more readability Renamed MetaData#concreteIndex to concreteSingleIndex, left method as a shortcut although it calls the common concreteIndices that accepts IndicesOptions and multipleIndices	2014-05-15 20:53:05 +02:00
Boaz Leskes	1f28cd0ba8	When sending shard start/failed message due to a cluster state change, use the master indicated in the new state rather than current This commit also adds extra protection in other cases against a master node being de-elected and thus being null. Closes #6189	2014-05-15 18:42:26 +02:00
Boaz Leskes	84593f0d7c	Added meta data and routing version to cluster state's pretty print	2014-05-15 15:55:11 +02:00
Boaz Leskes	dc07ece790	Added some debug logs to the recovery process	2014-05-15 15:37:30 +02:00
Simon Willnauer	e47de1f809	[TEST] Randomize number of available processors We configure the threadpools according to the number of processors which is different on every machine. Yet, we had some test failures related to this and #6174 that only happened reproducibly on a node with 1 available processor. This commit does: * sometimes randomize the number of available processors * if we don't randomize we should set the actual number of available processors in the settings on the test node * always print out the num of processors when a test fails to make sure we can reproduce the thread pool settings with the reproduce info line Closes #6176	2014-05-15 12:24:53 +02:00
Simon Willnauer	53bfe44e19	Fix debug logging message for put template action	2014-05-15 11:13:30 +02:00
Andrew Selden	fc0bed5236	Fix bug for BENCH thread pool size == 1 On small hardware, the BENCH thread pool can be set to size 1. This is problematic as it means that while a benchmark is active, there are no threads available to service administrative tasks such as listing and aborting. This change fixes that by executing list and abort operations on the GENERIC thread pool. Closes #6174	2014-05-14 10:40:39 -07:00
Simon Willnauer	2c1c5c163f	[TEST] Ensure all benchmarks are aborted on failure and latches are counted down	2014-05-14 16:40:34 +02:00
Simon Willnauer	fc2ab0909e	[TEST] Remove busy waiting from BenchmarkIntegrationTest I think Chuck Norris is required to fix this at this point until we have an API that can for instance pause a Benchmark. We basically wait for a query to be executed and that query syncs on a latch with the test in a script :) This commit also adds some more testing for benchmarks that run into errors.	2014-05-14 14:40:27 +02:00
David Pilato	e0a95d9c19	Allow sorting on nested sub generated field When you have a nested document and want to sort on its fields, it's perfectly doable on regular fields but not on "generated" sub fields. Here is a SENSE recreation: ``` DELETE /tmp PUT /tmp PUT /tmp/doc/_mapping { "properties": { "flat": { "type": "string", "index": "analyzed", "fields": { "sub": { "type": "string", "index": "not_analyzed" } } }, "nested": { "type": "nested", "properties": { "foo": { "type": "string", "index": "analyzed", "fields": { "sub": { "type": "string", "index": "not_analyzed" } } } } } } } PUT /tmp/doc/1 { "flat":"bar", "nested":{ "foo":"bar" } } ``` When sorting on `flat.sub` sub field, everything is fine: ``` GET /tmp/doc/_search { "sort": [ { "flat.sub": { "order": "desc" } } ] } ``` When sorting on `nested` field, everything is fine: ``` GET /tmp/doc/_search { "sort": [ { "nested.foo": { "order": "desc" } } ] } ``` But when sorting on `nested.foo.sub` field, sorting is incorrect: ``` GET /tmp/doc/_search { "sort": [ { "nested.foo.sub": { "order": "desc" } } ] } ``` Closes #6150.	2014-05-14 14:13:44 +02:00
Britta Weber	08e57890f8	use shard_min_doc_count also in TermsAggregation This was discussed in issue #6041 and #5998 . closes #6143	2014-05-14 14:10:04 +02:00
Britta Weber	d4a0eb818e	refactor: make requiredSize, shardSize, minDocCount and shardMinDocCount a single parameter Every class using these parameters has their own member where these four are stored. This clutters the code. Because they are mostly needed together it might make sense to group them.	2014-05-14 14:10:02 +02:00
Britta Weber	8e3bcb5e2f	refactor: unify terms and significant_terms parsing Both need the requiredSize, shardSize, minDocCount and shardMinDocCount. Parsing should not be duplicated.	2014-05-14 14:09:59 +02:00
Adrien Grand	bfcebbb957	[TESTS] Fix test bug in PagedBytesReferenceTest.	2014-05-14 10:09:11 +02:00
Adrien Grand	265b386fa7	[TESTS] Fix test bugs for parent/child queries. If you got a bad seed and tests.nightly=true, these tests would either call Random#nextInt on `0` or trigger infinite loops.	2014-05-14 09:35:45 +02:00
Boaz Leskes	9daa72941a	[Test] increase ping timeout to 400ms in MinimumMasterNodesTests.dynamicUpdateMinimumMasterNodes	2014-05-14 09:28:44 +02:00
Boaz Leskes	fb501b22e1	[Test] SimpleNodesInfoTests.testNodesInfos didn't wait for cluster to form properly	2014-05-14 08:59:48 +02:00
Lee Hinman	588ae1ba9e	Track the number of times the CircuitBreaker has been tripped Fixes #6130	2014-05-13 21:08:48 +02:00
javanna	ffe97f004e	[TEST] improved MetaDataTests coverage for different index options Relates to #6068	2014-05-13 20:17:46 +02:00
David Pilato	2971a102f6	[Javadoc] Add full link to TDigest class (cherry picked from commit ed72484)	2014-05-13 20:04:45 +02:00
Andrew Selden	8713a090c2	Fix recovery percentage > 100% The recovery API was sometimes misreporting the recovered byte percentages of index files. This was caused by summing up total file lengths on each file chunk transfer. It should have been summing the lengths of each transfer request. Closes #6113	2014-05-13 09:38:02 -07:00
Simon Willnauer	0457b0b765	[TEST] Raise request timeout; Windows is sometimes extraordinarily slow	2014-05-13 18:05:34 +02:00
Martijn van Groningen	c6c9bbdd72	Removed useless and illegal json object in the response. Relates to #5865	2014-05-13 14:32:03 +02:00
Adrien Grand	3ad321fcb2	Fix NPE when initializing an accepted socket in NettyTransport. NettyTransport's ChannelPipelineFactory uses the instance variable serverOpenChannels in order to create sockets. However, this instance variable is set to null when stopping the netty transport, so if the transport tries to stop and to initialize a socket at the same time you might hit the following NullPointerException: [2014-05-13 07:33:47,616][WARN ][netty.channel.socket.nio.AbstractNioSelector] Failed to initialize an accepted socket. java.lang.NullPointerException: handler at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.<init>(DefaultChannelPipeline.java:725) at org.jboss.netty.channel.DefaultChannelPipeline.init(DefaultChannelPipeline.java:667) at org.jboss.netty.channel.DefaultChannelPipeline.addLast(DefaultChannelPipeline.java:96) at org.elasticsearch.transport.netty.NettyTransport$2.getPipeline(NettyTransport.java:327) at org.jboss.netty.channel.socket.nio.NioServerBoss.registerAcceptedChannel(NioServerBoss.java:134) at org.jboss.netty.channel.socket.nio.NioServerBoss.process(NioServerBoss.java:104) at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318) at org.jboss.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42) at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) This fix ensures that the ChannelPipelineFactory always uses the channels that have been used upon start, even if a stop request is issued concurrently. Close #6144	2014-05-13 13:39:56 +02:00
Simon Willnauer	d8c02c2599	Read full message on free context Since #5730 we write a boolean in the FreeContextResponse which should be deserialized Closes #6147	2014-05-13 12:36:13 +02:00
Simon Willnauer	1feddac315	Log cache recycler clear call as debug	2014-05-13 12:26:31 +02:00
Simon Willnauer	53db387698	Report all errors if benchmark fails and mark as failed	2014-05-13 12:26:31 +02:00
Simon Willnauer	65d27bff9d	[TEST] Ensure no benchmarks are running after test	2014-05-13 12:26:30 +02:00
Benjamin Devèze	240a2a8abf	[JAVADOCS] Fix wrong javadoc in IdentityHashSet. Close #6121	2014-05-13 11:57:11 +02:00
Adrien Grand	cc530b9037	Use t-digest as a dependency. Our improvements to t-digest have been pushed upstream and t-digest also got some additional nice improvements around memory usage and speedups of quantile estimation. So it makes sense to use it as a dependency now. This also allows to remove the test dependency on Apache Mahout. Close #6142	2014-05-13 10:38:08 +02:00
markharwood	1e560b0d92	Significant_terms agg: added option for a background_filter to define background context for analysis of term frequencies Closes #5944	2014-05-13 09:10:30 +01:00
Lee Hinman	3484ca3737	Log script change/add and removal at INFO level Closes #6104	2014-05-13 09:33:22 +02:00
Igor Motov	dfdc183ba6	Fix for hanging aborted snapshot during node shutdown If a node is shutdown while a snapshot that runs on this node is aborted, it might cause the snapshot process to hang. Closes #5958	2014-05-12 18:02:07 -04:00
Andrew Selden	fdbefa0cd1	Fix for benchmark test timeout Lower number of random requests generated for each test so as not to timeout on heavy tests. Addresses #6094	2014-05-12 14:45:43 -07:00
javanna	154688bba1	improved IndicesOptions javadocs	2014-05-12 23:26:29 +02:00
javanna	c69c66bb7a	fixed MetaData#concreteIndices to throw exception with a single index argument in case allowNoIndices == false and ignoreUnavailable == true Closes #6137	2014-05-12 23:26:29 +02:00
mikemccand	00fcf4d560	#6081 : set IO throttling back to 20 MB/sec now that #6018 is fixed	2014-05-12 14:42:26 -04:00
mikemccand	254ebc2f88	#6120 Remove SerialMergeScheduler (master only) It's dangerous to expose SerialMergeScheduler as an option: since it only allows one merge at a time, it can easily cause merging to fall behind. Closes #6120	2014-05-12 14:06:20 -04:00
mikemccand	eae304aa39	5882: put back Elasticsearch's 1.1 defaults for ConcurrentMergeScheduler	2014-05-12 13:22:33 -04:00
Britta Weber	4b2e4becc7	Check if root mapping is actually valid When a mapping is declared and the type is known from the uri then the type can be skipped in the body (see #4483). However, there was no check if the given keys actually make a valid mapping. closes #5864 closes #6093	2014-05-12 18:36:14 +02:00
Adrien Grand	caacce9429	[TESTS] Improve BenchmarkIntegrationTest's check that percentiles are increasing. Percentiles are supposed to be monotonically increasing but floating-point rounding issues can come into play and make the test fail if checks are too strict.	2014-05-12 17:21:11 +02:00
Shay Banon	78e39882ee	Allow to change concurrent merge scheduling setting dynamically Allow to change the concurrent merge scheduler settings dynamically using the update settings API closes #6098	2014-05-12 07:33:31 -07:00
Adrien Grand	6d9da390ed	[TESTS] Fix MinDocCountTests. The new include/exclude support for global ordinals didn't exclude terms in `buildAggregation` (which is required if minDocCount is 0).	2014-05-12 15:45:29 +02:00
javanna	9361305177	[TEST] made catch request more accurate in REST tests runner Excluded 404, 403 and 409 status codes from the catch request as they have their own specific catch codes	2014-05-12 14:29:56 +02:00
Martijn van Groningen	64c43c6dc0	Made the include and exclude support for terms and significant terms aggregations based on global ordinals. Closes #6000	2014-05-12 13:14:13 +02:00
javanna	7980911d96	restored @Test annotation in SimpleValidateQueryTests	2014-05-12 12:52:48 +02:00
Alex Marandon	d1ddbd2c51	Detect unsupported fields after query in validate query api The validate API was failing to reject JSON input that had unsupported fields placed after a supported field. This was causing invalid requests to be reported as valid. Fixes #5685	2014-05-12 12:49:25 +02:00
javanna	3d63bac51d	Fixed validate query parsing issues Made sure that a match_all query is used when no query is specified and ensure no NPE is thrown either. Also used the same code path as the search api to ensure that alias filters are taken into account, same for type filters. Closes #6111 Closes #6112 Closes #6116	2014-05-12 12:49:25 +02:00
Alex Ksikes	513f25ae97	More Like This: Fix correct use of size and from parameters More Like This API would not take into account 'size' and 'from' in request body parameters. Instead these values would always be overridden by the default values of REST parameters 'search_size' and 'search_from'. Closes #5981	2014-05-12 12:30:04 +02:00
Martijn van Groningen	d4d6c3459e	[TEST] Make sure all shards are allocated before the delete type is being executed.	2014-05-12 11:59:09 +02:00
Adrien Grand	ebfab19400	[TESTS] Disable BenchmarkIntegrationTest#testSubmitBenchmark until it is fixed.	2014-05-12 11:30:01 +02:00
Martijn van Groningen	145efbf6ea	Return missing (404) if a scroll_id that no longer exists is cleared. Closes #5730	2014-05-12 09:43:56 +02:00
Adrien Grand	51de01bae5	[TESTS] Tentative fix of BigArrays byte-accounting checks.	2014-05-12 09:25:49 +02:00
cccabot	58ebcf1252	Fixed typos in FieldSortBuilder	2014-05-10 02:57:51 +02:00
Andrew Selden	48879752a2	[TEST] Fix for benchmark tests - Fix bug where repeatedly calling computeSummaryStatistics() could accumulate some values incorrectly - Fix check for number of responsive nodes on list is <= number of candidate benchmark nodes - Add public getters for summary statistics - Add javadoc for new getters - Add javadoc comments about API use - Improve abort and status tests by calling awaitBusy() to wait for jobs to be completely submitted before testing them	2014-05-09 16:01:57 -07:00
mikemccand	5e40a4b95a	don't call isFinite from XAnalyzingSuggester; re-enable test on Java 8	2014-05-09 18:24:13 -04:00
javanna	6678da8c28	[TEST] randomly added node.bench=true to client node in test cluster and re-enabled REST benchmark tests based on number of bench nodes available In our REST tests we already have support for features and skip sections that allow to skip tests if a feature is not supported. We can then add a skip section based on the benchmark feature to the benchmark tests and execute them only when they are supported, knowing that they need at least a node with node.bench settings within the cluster. We can check that this requirement is met by calling the nodes info api. This way we can dynamically decide whether to execute those tests or not and we don't need to have a node.bench around all the time. In fact, given that the REST tests use the GLOBAL cluster, we want to be able to randomize settings as much as possible and run tests against default settings as well. Also, this mechanism can be easily supported by the external cluster implementation that is used during the release process. Introduced ability to disable benchmark nodes which is needed by BenchmarkNegativeTest.	2014-05-09 23:36:00 +02:00
Alex Ksikes	d8bb7c157a	[TEST] Removed the restriction on the number of bool clauses that must match. The test failed because 'percent_terms_to_match' defaults to 0.3, which results in requiring that some terms only found in the queried document must match, when all the documents are on the same shard.	2014-05-09 19:14:32 +02:00
Lee Hinman	e7e4ef859a	Add /_cat/fielddata to display fielddata usage Closes #4593	2014-05-09 13:18:02 +02:00
Alex Ksikes	dae48d9fe8	Added the ability to include the queried document for More Like This API. By default More Like This API excludes the queried document from the response. However, when debugging or when comparing scores across different queries, it could be useful to have the best possible matched hit. So this option lets users explicitly specify the desired behavior. Closes #6067	2014-05-09 12:59:39 +02:00
mikemccand	aa31c71776	mute this test until we fix isFinite	2014-05-09 05:24:22 -04:00
Martijn van Groningen	67fe88c63c	[TEST] Enforce that only one shard per node is allocated. This prevents a second shard from being assigned to another node during node shutdown.	2014-05-09 10:43:08 +02:00
Martijn van Groningen	d7c05e5924	Temporarily disabling benchmark tests. Relates #6094	2014-05-08 13:18:12 +02:00
Martijn van Groningen	d5b95e3e8a	A number of changes to fix reduce failures if shard failures have occurred: * The shardTopDocs array should get created with the size equal to the total number of shard level requests and not the total number of requests that have a shard level result. * Make sure no null TopDocs entries are passed down to TopDocs#merge * Added dedicated scroll tests that test scrolling on an index that has missing shards due to node failure. * Made sure that the sort fields in SimpleNestedTests exists by adding the fields in the mapping during index creation. Closes #6022	2014-05-08 10:17:00 +02:00
Martijn van Groningen	0efeeff49a	The percolator needs to take deleted percolator documents into account when running in near realtime mode. This bug only occurs in non-realtime mode when query, filter, facet or aggs is used. Closes #5843 Closes #5840	2014-05-08 09:52:27 +02:00
Andrew Selden	c00120b818	Fix for benchmark test - Fix bug where repeatedly calling computeSummaryStatistics() could accumulate some values incorrectly. - Fix check for number of responsive nodes on list is <= number of candidate benchmark nodes. - Add public getters for summary statistics - Add javadoc for new getters - Add javadoc comments about API use	2014-05-07 18:42:39 -07:00
mikemccand	82aad78ff2	it's safe to use OneMerge.getTotalBytesSize (fixed in LUCENE-4775)	2014-05-07 17:25:06 -04:00
Andrew Selden	f23274523a	Integration tests for benchmark API. - Randomized integration tests for the benchmark API. - Negative tests for cases where the cluster cannot run benchmarks. - Return 404 on missing benchmark name. - Allow to specify 'types' as an array in the JSON syntax when describing a benchmark competition. - Don't record slowest for single-request competitions. Closes #6003, #5906, #5903, #5904	2014-05-07 14:14:54 -07:00
uboness	fc52db1209	Changed the response structure of the percentiles aggregation where now all the percentiles are placed under a `values` object (or `values` array in case the `keyed` flag is set to `false`) Closes #5870	2014-05-07 18:35:24 +02:00
Shay Banon	743dc19acb	Node version sometimes empty in _cat/nodes closes #5480	2014-05-07 18:08:11 +02:00
Britta Weber	7944369fd1	Add `shard_min_doc_count` parameter for significant terms similar to `shard_size` Significant terms internally maintain a priority queue per shard with a size potentially lower than the number of terms. This queue uses the score as criterion to determine if a bucket is kept or not. If many terms with low subsetDF score very high but the `min_doc_count` is set high, this might result in no terms being returned because the pq is filled with low frequent terms which are all sorted out in the end. This can be avoided by increasing the `shard_size` parameter to a higher value. However, it is not immediately clear to which value this parameter must be set because we can not know how many terms with low frequency are scored higher than the high frequent terms that we are actually interested in. On the other hand, if there is no routing of docs to shards involved, we can maybe assume that the documents of classes and also the terms therein are distributed evenly across shards. In that case it might be easier to not add documents to the pq that have subsetDF <= `shard_min_doc_count` which can be set to something like `min_doc_count`/number of shards because we would assume that even when summing up the subsetDF across shards `min_doc_count` will not be reached. closes #5998 closes #6041	2014-05-07 18:02:56 +02:00
javanna	f554178fc7	Renamed IndicesOptions#strict and IndicesOptions#lenient to make it clearer what they actually return, reused methods and introduced new one Relates to #6059, where two new constants were introduced in IndicesOptions. There were already two constants there though, one of which we could have reused. This commit tries to unify them.	2014-05-07 17:40:57 +02:00
Alexander Reelsen	0c0f717aba	Removed Index Status API The functionality of the index status API has been replaced by the recovery API. Relates #4854	2014-05-07 16:57:19 +02:00
Adrien Grand	c49276cda7	Add a dedicated field data type for the _index field mapper. This makes aggregations work on the _index field, and also allows to remove the special facet aggregator for the _index field. Close #5848	2014-05-07 14:06:13 +02:00
Adrien Grand	c4f127fb6f	Limit the number of bytes that can be allocated to process requests. This should prevent costly requests from killing the whole cluster. Close #6050	2014-05-07 12:55:48 +02:00
Adrien Grand	8cd7811955	Lower initial sizing of sub aggregations. We currently compute initial sizings based on the cardinality of our fields. This can be highly exaggerated for sub aggregations, for example if there is a parent terms aggregation that is executed over a field that has a very long tail: most buckets will only collect a couple of documents. Close #5994	2014-05-06 17:23:34 +02:00
Adrien Grand	c306d8c5f5	Don't assume fixed earth diameter in the geo-distance bounding box optimization. We switched to Lucene's SloppyMath way of computing an approximate value of the earth diameter given a latitude in order to compute distances, yet the bounding box optimization of the geo distance filter still assumed a constant earth diameter, equal to the average. Close #6008	2014-05-06 16:20:31 +02:00
Shay Banon	44fd962a9f	Improve 404 on missing scroll id This relates to #6040, the fix is twofold, first, not handling missing context specifically in the search code, but behave the same as we do in non scroll search, where if all the shards failed, raise an exception. The second is to apply this logic in both scroll cases.	2014-05-06 15:55:42 +02:00
Shay Banon	66296de38d	Remove unused dump infra Way back when, when ES started, there was an idea for a dump infrastructure, but it ended up supporting its serviceability aspects through APIs, remove the unused code	2014-05-06 14:02:24 +02:00
javanna	a8b6f81525	Made it mandatory to specify IndicesOptions when calling MetaData#concreteIndices Removed MetaData#concreteIndices variations that didn't require an IndicesOptions argument. Every caller should specify how indices should be resolved to concrete indices based on the indices options argument. Closes #6059	2014-05-06 12:45:16 +02:00
Adrien Grand	90b547cf2c	Remove RootMapper.validate and validate the routing key up-front. RootMapper.validate was only used by the routing field mapper, which makes buggy assumptions about how fields are indexed. For example, it assumes that the index representation of a field is the same as its external representation. Close #5844	2014-05-06 11:55:31 +02:00
Adrien Grand	589360c8b1	[TESTS] Don't randomize mappings in SimpleValidateQueryTests. This test relies on the fact that the _id field is not indexed.	2014-05-06 11:46:31 +02:00
Adrien Grand	17a32fca03	[TEST] Random dynamic templates. This change randomly indexes the _id field and randomizes field data formats and loading. Close #5834	2014-05-06 11:07:43 +02:00
Alexander Reelsen	d356881664	[REST] Missing scroll id now returns 404 A bad/non-existing scroll ID used to return a 200, however a 404 might be more useful. Also, this PR returns the right Exception (SearchContextMissingException) in the Java API. Additionally: Added StatusToXContent interface and RestStatusToXContentListener listener, so the appropriate RestStatus can be returned Closes #5729	2014-05-05 17:37:26 +02:00
Shay Banon	fad5e2d0e1	Remove operation threading from broadcast actions Similar to search removal, the operation threading options are not really used, and the default should always be used. This also considerably simplifies the code. A side effect is that we can now remove the ShardIterator#firstOrNull method, which can cause sneaky bugs to occur. closes #6044	2014-05-05 17:09:36 +02:00
Alexander Reelsen	799bb2491c	Analyze API: Default analyzer accidentally removed stopwords The analyze API used the standard analyzer from lucene and therefore removed stopwords instead of using the elasticsearch default analyzer. Closes #5974	2014-05-05 15:55:33 +02:00
Alexander Reelsen	d4fcf23057	Cluster State API: Remove index template filtering The possibility of filtering for index templates in the cluster state API had been introduced before there was a dedicated index templates API. This commit removes this support from the cluster state API, as it was not really clean, requiring you to specify the metadata and the index templates. Closes #4954	2014-05-05 14:54:14 +02:00
Shay Banon	7ce8306bc5	Remove search operation threading option Search operation threading is an option that is not really used, and current non default implementations are flawed. Handling it also creates quite the complexity in the search handling codebase... This is a breaking change, but one that is actually a good one, since I haven't seen/heard anybody use it, and if it's used, it's problematic... closes #6042	2014-05-05 11:39:16 +02:00
Benjamin Devèze	cea2d21c50	Fix bug in PropertyPlaceholder and add unit tests. Close #6034	2014-05-05 10:21:18 +02:00
Adrien Grand	727e6172e3	Restore read/write visibility in PlainShardsIterator. Change #5561 introduced a potential bug in that iterations that are performed on a thread might not be visible to other threads due to the removal of the `volatile` keyword. Close #6039	2014-05-05 10:05:44 +02:00
Shay Banon	342a32fb16	Search might not return on thread pool rejection When a thread pool rejects the execution on the local node, the search might not return. This happens due to the fact that we move to the next shard only within the execution on the thread pool in the start method. If it fails to submit the task to the thread pool, it will go through the fail shard logic, but without "counting" the current shard itself. When this happens, the relevant shard will then execute more times than intended, causing the total ops counter to skew, and for example, if on another shard the search is successful, the total ops will be incremented beyond the expectedTotalOps, causing the check on == as the exit condition to never happen. The fix here makes sure that the shard iterator properly progresses even in the case of rejections, and also includes improvement to when cleaning a context is sent in case of failures (which were exposed by the test). Though the change fixes the problem, we should work on simplifying the code path considerably, the first suggestion as a followup is to remove the support for operation threading (also in broadcast), and move the local optimization execution to SearchService, this will simplify the code in different search action considerably, and will allow to remove the problematic #firstOrNull method on the shard iterator. The second suggestion is to move the optimization of local execution to the TransportService, so all actions will not have to explicitly do the mentioned optimization. fixes #4887	2014-05-05 09:24:53 +02:00
javanna	e96e634d10	[TEST] fixed _cat/thread_pool REST tests with local transport, in case the transport port is not available and gets returned as '-' Re-enabled REST tests suite Closes #6033	2014-05-04 22:10:03 +02:00
mikemccand	6bc3a744a1	Fix StackOverflowException for long suggestion strings Changed getFiniteStrings to use an iterative implementation instead of recursive, so we don't use a Java stack-frame per character for each suggestion at build & query time.	2014-05-04 13:35:05 -04:00
Shay Banon	c9f1792c81	Change default filter cache to 10% and circuit breaker to 60% The defaults we have today in our data intensive memory structures don't add up to proper protection from potential OOMs. The circuit breaker, today at 80%, aims at protecting from extensive field data loading. The default threshold today is too permissive and can still cause OOMs. The filter cache today is at 20%, and it's too high when adding it to other limits we have, reduce it to 10%, which is still a big enough portion of the heap, yet provides improved safety measure. closes #5990	2014-05-04 15:38:16 +02:00
Adrien Grand	01eb01cb70	[TEST] Disable REST tests until #6033 is fixed.	2014-05-04 11:58:30 +02:00
Boaz Leskes	694bf287d6	Do not start a recovery process if the primary shard is currently allocated on a node which is not part of the cluster state If a source node disconnects during recovery, the target node will respond by canceling the recovery. Typically the master will respond by removing the disconnected node from the cluster state, promoting another shard to become primary. This is sent to all nodes and the target node will start recovering from the new primary. However, if the drop of a node caused the node count to go below min_master_node, the master will step down and will not promote a shard immediately. When a new master is elected we may publish a new cluster state (whose point is to notify of a new master) which is not yet updated. This caused the node to start a recovery to a non existent node. Before we aborted the recovery without cleaning up the shard, causing subsequent correct cluster states to be ignored. We should not start the recovery process but wait for another cluster state to come in. Closes #6024	2014-05-02 23:30:24 +02:00
Alex Ksikes	b55d8ed2e3	Fix behavior on default boost factor for More Like This. A boost terms factor of 1.0 is not the same as no boosting of terms. The desired behavior is to deactivate boosting by default. If the user specifies any value other than 0, then boosting is activated. Closes #6021	2014-05-02 16:59:09 +02:00
Holger Hoffstätte	f5c9bf6f0f	Update JNA to latest version Updating to this version allows to configure a special JNA directory, in case the /tmp directory is mounted with the noexec option, as JNA extracts some data and tries to execute parts of it. Also updated documentation to clarify mlockall and memory settings as well as pointing to the new jna.tmpdir system property. Closes #5493	2014-05-02 11:52:57 +02:00
Britta Weber	2e44040388	function_score parser throws exception if both functions:[] and single function given In addition, add a special warning if the misplaced function is a "boost_factor" function to avoid confusion of "boost" and "boost_function". closes #5995	2014-05-02 10:53:33 +02:00
Shay Banon	a557ee8daf	Support empty properties array in mappings closes #5887	2014-05-01 12:18:39 -04:00
Boaz Leskes	42a112f50b	debug log of receiving a cluster state from another master could be erroneously logged Added trace logging to MinimumMasterNodesTests.multipleNodesShutdownNonMasterNodes	2014-05-01 13:15:08 +02:00
Martijn van Groningen	9493824a0e	[TEST] (RecoveryPercolatorTests) Don't stop the master node and always use the client of the master node	2014-05-01 14:06:34 +07:00
Martijn van Groningen	61093f1bd1	[TEST] Replace execute().actionGet() with get()	2014-05-01 14:06:34 +07:00
Shay Banon	23f200bc0e	Use non analyzed token stream optimization everywhere In the string type, we have an optimization to reuse the StringTokenStream on a thread local when a non analyzed field is used (instead of creating it each time). We should use this across the board on all places where we create a field with a String. Also, move to a specific XStringField, that we can reuse StringTokenStream instead of copying it. closes #6001	2014-04-30 17:18:15 -04:00
Martijn van Groningen	12f43fbbc0	Fixed license headers.	2014-05-01 00:33:17 +07:00
Martijn van Groningen	013b319415	Added `reverse_nested` aggregation. The `reverse_nested` aggregation allows to aggregate on properties outside of the nested scope of a `nested` aggregation. Closes #5507	2014-05-01 00:23:05 +07:00
Martijn van Groningen	5a0070071a	Use collectExistingBucket in GlobalOrdinalsSignificantTermsAggregator.WithHash. Relates to #5955.	2014-04-30 23:24:33 +07:00
Matt Weber	2663d04a96	Run tests through forbidden-apis.	2014-04-30 17:48:33 +02:00
Adrien Grand	34fb5e48e2	Use collectExistingBucket in GlobalOrdinalsStringTermsAggregator.WithHash. Relates to #5955.	2014-04-30 15:34:01 +02:00
Boaz Leskes	870bd90f54	ThreadPool.EstimatedTimeThread should be set on initialization Some tests run before the thread is started and thus use 0 as the current time, which later on leads to big time jumps and thus failures. Ex. InternalEngineTests.testVersioningReplicaConflict2	2014-04-30 11:47:47 +02:00
Adrien Grand	b2db7c8222	Improve the way sub-aggregations are collected. Sub-aggregations are currently collected directly, by just forwarding the doc ID and bucket ordinal to them. This change adds the new BucketCollector abstract class that Aggregator extends, so that we have more flexibility to add implicit filters or buffering between an aggregator and its sub aggregators. Close #5975	2014-04-30 08:47:25 +02:00
Adrien Grand	2eeaa56d95	Fix setting of readerGen in BytesRefOrdValComparator on nested documents. Sorting was broken on nested documents because the `missing(slot)` method didn't correctly set the segment ordinal (readerGen), causing term ordinals to be compared across segments. Close #5986	2014-04-30 08:21:26 +02:00
Shay Banon	2076194d8f	Upgrade to Jackson 2.3.3 fixes the long value bug as well...	2014-04-29 20:13:43 -04:00
Shay Banon	34302a7cc5	disable using CBOR in randomized test infra due to a bug in CBOR handling long values (test case to verify it is included), disable using CBOR in our tests till it gets fixed	2014-04-29 19:11:12 -04:00
Martijn van Groningen	dce127bcdf	Added global ordinals based implementations for significant terms aggregator. Closes #5970	2014-04-30 01:36:02 +07:00
Shay Banon	a4ef418e6e	Range/Term query/filter on dates fail to handle numbers properly When providing a number (milliseconds since epoch, UTC), range and term query/filter don't handle it correctly and convert it to a string, that is then first tried to parse as a date closes #5969	2014-04-29 14:25:05 -04:00
mikemccand	fb53784e3b	add thread name to logger message from IndexWriter's infoStream	2014-04-29 10:50:36 -04:00
Adrien Grand	6ec01c13e5	Fix computation of the missing ord (leftover of the ordinals change).	2014-04-29 16:29:01 +02:00
Britta Weber	9d214d14fe	Provide meaningful error message if field has no fielddata type closes #5930	2014-04-29 15:19:01 +02:00
mikemccand	a8d4c04fc2	include thread name when logging IndexWriter's infoStream messages	2014-04-29 05:50:13 -04:00
Adrien Grand	d07c5a5c32	Aggregations parsing is too lenient. Close #5827	2014-04-29 11:07:06 +02:00
Martijn van Groningen	8817281a70	Added AwaitsFix	2014-04-29 13:58:39 +07:00
Martijn van Groningen	0f23485a3c	Cut p/c queries (has_child and has_parent queries) over to use global ordinals instead of being bytes values based. Closes #5846	2014-04-29 12:41:04 +07:00
Martijn van Groningen	fc3efda6af	Cut other aggregations over to use collectExistingBucket() if a bucket ord has been hit, that already exists. Closes #5955	2014-04-29 11:07:12 +07:00
Martijn van Groningen	f3219f7098	Added global ordinals terms aggregator impl that is optimized for low cardinality fields. Instead of resolving the global ordinal for each hit on the fly, resolve the global ordinals during post collect. On fields with not so many unique values, that can reduce the number of global ordinals significantly. Closes #5895 Closes #5854	2014-04-29 11:04:03 +07:00
Matt Weber	4df4506875	Use URI vs URL accessing File from classpath. URL escapes special characters such as spaces which causes the resource to not be found when used to create a File object. Use URI. Closes #5915	2014-04-28 18:49:55 +02:00
javanna	51ba3ca220	[TEST] made sure nodeSettings method gets called for every node type, not only data nodes in case numDataNodes is specified. This fixes a test ZenUnicastDiscoveryTests when running in network mode	2014-04-28 18:31:47 +02:00
javanna	a414e4f2f3	[TEST] randomly introduced a client node within test cluster The default number of clients nodes is randomized between 0 and 1, applied to all cluster scopes (global, suite and test). Can be changed through the newly added `@ClusterScope#numClientNodes`. In our tests we currently refer to nodes in a generic way. All the tests that either stop or start nodes rely on the fact that those nodes hold data though. Made that clearer as that becomes more important when introducing other types of nodes within the test cluster. Reflected this by adapting and renaming the following methods in `TestCluster`: - ensureAtLeastNumNodes to ensureAtLeastNumDataNodes - ensureAtMostNumNodes to ensureAtMostNumDataNodes - stopRandomNode to stopRandomDataNode and the following ones in `ElasticsearchIntegrationTest`: - allowNodes to allowDataNodes - dataNodes to numDataNodes. - @ClusterScope#numNodes to numDataNodes - @ClusterScope#minNumNodes to minNumDataNodes - @ClusterScope#maxNumNodes to maxNumDataNodes Added facilities to be able to deal with data nodes specifically, like for instance retrieve a client to a data node, or retrieve an instance of a class through guice only from data nodes. Adapted existing tests to successfully run although there's a node client around. Fixed _cat/allocation REST tests to make disk.total, disk.avail and disk.percent optional as client nodes won't return that info. Closes #5949	2014-04-28 16:31:36 +02:00
Martijn van Groningen	17a5575757	Disabled parent/child queries in the delete by query api. It wasn't properly implemented and could lead to a shard being failed and not able to recover. Closes #5828 #5916	2014-04-28 20:12:54 +07:00
Adrien Grand	22cbdd930c	[TEST] Fix test bug in MultiOrdinalsTests.	2014-04-28 13:56:01 +02:00
Robert Muir	8e0a479316	Upgrade to Lucene 4.8 Closes #5932	2014-04-28 06:45:50 -04:00
Chris Earle	5528370e24	Added type, max, min, queueSize & keepAlive to _cat/thread_pool Closes #5366	2014-04-28 12:00:27 +02:00
Simon Willnauer	f285ffc610	Multi value handling in decay functions Decay functions currently only use the first value in a field that contains multiple values to compute the distance to the origin. Instead, it should consider all distances if more values are in the field and then use one of min/max/sum/avg which is defined by the user. Relates to #3960 closes #5940	2014-04-28 11:55:32 +02:00
Britta Weber	f993945e5c	Move SortMode to org.elasticsearch.search and rename to MultiValueMode	2014-04-28 11:55:32 +02:00
Shay Banon	6b2c1d0f62	spelling	2014-04-28 11:07:54 +02:00
Shay Banon	dedddf3908	Raise node disconnected even if the transport is stopped during the stop process, we raise network disconnect, so it is valid to raise then while we are in stop mode, and actually, we should not miss any events in such a case. Typically, this is not a problem, since its during the normal shutdown process on the JVM, but when running a reused cluster within the JVM (like in our test infra with the shared cluster), we should properly raise those node disconnects closes #5918	2014-04-28 10:56:43 +02:00
Adrien Grand	fc32875ae9	Make ordinals start at 0. Our ordinals currently start at 1, like FieldCache did in older Lucene versions. However, Lucene 4.2 changed it in order to make ordinals start at 0, using -1 as the ordinal for the missing value. We should switch to the same numbering as Lucene for consistency. This also allows to remove some abstraction on top of Lucene doc values. Close #5871	2014-04-28 10:21:50 +02:00
javanna	aa4dc092da	_cat/allocation to return no value for `disk.total` when not available (e.g. non data nodes) instead of `-1b` Closes #5948	2014-04-26 16:46:34 +02:00
Lee Hinman	81e83cca74	Disable dynamic scripting by default Closes #5853	2014-04-25 15:08:26 -06:00
Boaz Leskes	051beb51a3	Version types `EXTERNAL` & `EXTERNAL_GTE` test for version equality in read operation & disallow them in the Update API Separate version check logic for reads and writes for all version types, which allows different behavior in these cases. Change `VersionType.EXTERNAL` & `VersionType.EXTERNAL_GTE` to behave the same as `VersionType.INTERNAL` for read operations. The previous behavior was fit for writes but is useless in reads. This commit also makes the usage of `EXTERNAL` & `EXTERNAL_GTE` in the update api raise a validation error as it may cause data to be lost. Closes #5663 , Closes #5661, Closes #5929	2014-04-25 23:06:12 +02:00
Martijn van Groningen	a2aa167e6e	Don't create docCounts equal to maxOrd for the GlobalOrdinalsStringTermsAggregator.WithHash impl. Relates #5873	2014-04-26 00:01:52 +07:00
Martijn van Groningen	eb9805389a	Use segment ordinals as global ordinals if a segment contains all values for a field on a shard level. Relates to #5854 Closes #5873	2014-04-25 23:05:07 +07:00
Shay Banon	65bc017271	Don't lookup version for auto generated id and create When a create document is executed, and its an auto generated id (based on UUID), we know that the document will not exists in the index, so there is no need to try and lookup the version from the index. For many cases, like logging, where ids are auto generated, this can improve the indexing performance, specifically for lightweight documents where analysis is not a big part of the execution. closes #5917	2014-04-25 14:31:20 +02:00
Simon Willnauer	0b3605f4f2	[TEST] Logger names differ based on the classpath, inside the IDE the package name is used as a prefix	2014-04-25 12:36:32 +02:00
Simon Willnauer	b7325d005b	Make Create/Update/Delete classes less mutable Today we use a builder pattern / setters to set relevant information to Engine#Delete\|Create\|Index. Yet almost all the values are required but they are not passed via ctor arguments but via an error prone builder pattern. If we add a required argument we should see compile errors on that level to make sure we don't miss any place to set them. Prerequisite for #5917	2014-04-25 11:42:05 +02:00
mikemccand	908c0d4165	temporarily mute this test on Java 8 until we fix getFiniteStrings	2014-04-25 05:41:18 -04:00
Simon Willnauer	d0f8742f8d	[TEST] Prevent deletion of the second document by using different ids	2014-04-25 11:07:31 +02:00
Simon Willnauer	ec5dbbaf51	[TEST] Expect all shards failed in SearchWithRandomExceptionsTests	2014-04-25 11:03:22 +02:00
Britta Weber	8076a31ac1	Throw exception if an additional field was placed inside the "query" body Currently the parser accepts queries like ``` "query" : { "any_query": { ... }, "any_field_name":... } ``` The "any_field_name" is silently ignored. However, this also causes the parser not to move to the next closing bracket which in turn can lead to additional query parameters being ignored such as "fields", "highlight",... This was the case in issue #4895 closes issue #4895	2014-04-25 08:57:06 +02:00
Britta Weber	c7bb784b08	Fix TemplateQueryParser swallows additional parameters Request parameters such as "size" and "fields" were ignored when placed after the template query in the request. closes #5933	2014-04-25 08:51:08 +02:00
Adrien Grand	d8f0f7077f	[TEST] Use assertAllSuccessful instead of assertNoFailures in CompletionSuggestSearchTests.	2014-04-25 00:07:10 +02:00
Adrien Grand	f1916d16dc	[TEST] Fix typo in DateHistogramTests that fails the test since it expects dates to be rounded by day.	2014-04-24 23:25:23 +02:00
Adrien Grand	f109802960	Remove java6ism in FSTBytesAtomicFieldData.	2014-04-24 22:41:12 +02:00
mikemccand	ba73877580	Send Lucene's IndexWriter infoStream messages to Logger lucene.iw, level=TRACE Lucene's IndexWriter logs many low-level details to its infoStream which can be helpful for diagnosing; with this change, if you enable TRACE logging for the "lucene.iw" logger name then IndexWriter's infoStream output will be captured and can later be scrutinized e.g. using https://code.google.com/a/apache-extras.org/p/luceneutil/source/browse/src/python/iwLogToGraphs.py to generate graphs like http://people.apache.org/~mikemccand/lucenebench/iw.html Closes #5891	2014-04-24 16:31:23 -04:00
javanna	9a68e60142	[TEST] Allow to disable randomization of shards and replicas via system property Needed for REST backwards compatibility tests, since we need to run older tests with the latest runner, which randomizes shards and replicas, but the tests rely on defaults (5,1). Done in a generic way based on compatibility versions e.g. `-Dtests.compatibility=1.0.0` allows to run tests in a special manner that is compatible with 1.0.0 version. Also moved back randomIndexTemplate to ElasticsearchIntegrationTest (from ImmutableCluster) where all the randomized aspects should be. Closes #5897	2014-04-24 22:18:31 +02:00
Isabel Drost-Fromm	dcfc7cead0	Add some more documentation to TemplateQueryParser Relates to #4879	2014-04-24 22:11:17 +02:00
mikemccand	84af7d9f9a	test was missing tie-break for the two suggestions	2014-04-24 15:42:08 -04:00
Britta Weber	e84d3111a3	Revert "Throw exception if decay is requested for a field with multiple values" This reverts commit `95d781510f`. see https://github.com/elasticsearch/elasticsearch/issues/3960#issuecomment-41279373	2014-04-24 15:46:48 +02:00
Britta Weber	95d781510f	Throw exception if decay is requested for a field with multiple values closes #3960	2014-04-24 15:18:39 +02:00
Adrien Grand	0631b6a042	[TEST] DisabledFieldDataFormatTests assumes a single replica.	2014-04-24 14:16:07 +02:00
Adrien Grand	d792d14926	Instantiate facets/aggregations during the QUERY phase. In case of a DFS_QUERY_THEN_FETCH request, facets and aggregations are currently instantiated during the DFS phase while they only become useful during the QUERY phase. By instantiating during the QUERY phase instead, we can make better use of recycling since objects will have a shorter life out of the recyclers. Close #5821	2014-04-24 11:48:36 +02:00
Adrien Grand	d8880f2906	Fail a DFS_QUERY_THEN_FETCH request if all shards failed the QUERY phase. Today, if some shards pass the DFS phase but all of them fail the QUERY phase, the response will only consist of failed shards. We should throw an exception instead in order to be consistent with the QUERY_THEN_FETCH type.	2014-04-24 11:48:24 +02:00
Adrien Grand	cb8139a583	Remove abstraction in the percentiles aggregation. We initially added abstraction in the percentiles aggregation in order to be able to plug in different percentiles estimators. However, only one of the 3 options that we looked into proved useful and I don't see us adding new estimators in the future. Moreover, because of this, we let the parser put unknown parameters into a hash table in case these parameters would have meaning for a specific percentiles estimator impl. But this makes parsing error-prone: for example a user reported that his percentiles aggregation reported extremely high values (on the order of several millions while the maximum field value was `5`), and the reason was that he had a typo and had written `fields` instead of `field`. As a consequence, the percentiles aggregation used the parent value source which was a timestamp, hence the large values. Parsing would now barf in case of an unknown parameter. Close #5859	2014-04-24 09:44:36 +02:00
Adrien Grand	b3e0e58094	Field data diet. We have lots of unused, or almost unused methods in our field data impls, especially when dealing with ordinals. Let's nuke them. Close #5874	2014-04-24 09:14:09 +02:00
Shay Banon	0a84253045	[TEST] add a test that explicitly verifies no duplicates are created we do this test in other places in ES, but no dedicated test for it. This test was born out of the auto generate id work, but we should have this test regardless if it gets in or not	2014-04-23 21:13:12 +02:00
Lee Hinman	b5adc877ca	Include name of the field that caused a circuit break in the log and exception message Fixes #5718 Closes #5841	2014-04-23 09:54:00 -06:00
javanna	6eb655380c	[TEST] Randomized number of replicas between 0 and the number of data nodes - 1 (rather than just between 0 and 1) Closes #5896	2014-04-23 17:46:35 +02:00
mikemccand	3e63d530f8	Closes #5882	2014-04-23 10:42:41 -04:00
Simon Willnauer	b36ef995bb	Change default recovery throttling to 50MB / sec The current setting of 20MB/sec seems to be too conservative given the capabilities of modern hardware / network throughput. A 50MB default should provide better out of the box performance.	2014-04-23 15:40:21 +02:00
Robert Muir	8568c18e6f	Change default numeric precision_step Change the default numeric precision_step to 16 for 64-bit types, 8 for 32-bit and 16-bit types. Disable precision_step for the 8-bit byte type. Closes #5905	2014-04-23 09:01:25 -04:00
Martijn van Groningen	f8d35d81d8	Re-order log statements to be correct for segment and top level warming.	2014-04-23 17:13:44 +07:00
Simon Willnauer	b4f0603169	Change default merge throttling to 50MB / sec The current setting of 20MB/sec seems to be too conservative given the capabilities of modern hardware. Even on cloud infrastructure this seems to be too lowish. A 50MB default should provide better out of the box performance	2014-04-22 21:08:40 +02:00
Lee Hinman	029b13cf68	Parse has_child query/filter after child type has been parsed Fixes #5783 Fixes #5838	2014-04-22 09:29:48 -06:00
Shay Banon	8136a38b3f	Improved bloom filter hashing Make improvements to how bloom filter hashing works based on guava 17 upcoming changes, see more here (https://code.google.com/p/guava-libraries/issues/detail?id=1119) In order to do it, introduce a hashing enum, and use the (unused until now) hash type serialization to choose the correct hashing used based on serialized version. Also, move to use our own optimized murmur hash for the new hashing logic.	2014-04-22 17:17:25 +02:00
Lee Hinman	57bee03193	[DOCS] Add /_search_shards documentation	2014-04-22 08:54:32 -06:00
Simon Willnauer	cb9f7c1da5	[TEST] Randomize translog setting per index	2014-04-22 16:41:00 +02:00
Simon Willnauer	1cf62e7782	Use unlimited flush_threshold_ops for translog Currently we use 5k operations as a flush threshold. Indexing 5k documents per second is rather common which would cause the index to be committed on the lucene level each time the flush logic runs which is 5 seconds by default. We should rather use a size based threshold similar to the lucene index writer that doesn't cause such aggressive commits which can slow down indexing significantly especially since they cause the underlying devices to fsync their data.	2014-04-22 16:37:07 +02:00
Boaz Leskes	1434f6bcbb	A new ClusterStateStatus to indicate cluster state life cycles When the ClusterService applies a new cluster state, it is first assigned as the new active one and then all listeners are called. Some of ES's features sample the current state and try to take action on it (for example index a document). If that fails, they will wait for change in the cluster state and try again (for example, wait for a shard to start and try indexing again). If you're unlucky you sample the state after it has been assigned as the "active" state but before all listeners have done their work. In this case the action taken (i.e., indexing a doc) will still fail (as the shard is not yet started) but waiting for a new state may take a long time or fail. This commit adds a new ClusterStateStatus that allows to better track the stages a cluster state goes through (currently `RECEIVED`, `BEING_APPLIED` & `APPLIED`). This allows detecting that a cluster state is not yet fully applied and retry without waiting for a new state to arrive. This commit also adds a utility class, ClusterStateObserver, to make this pattern slightly simpler and avoid common pitfalls. Closes #5741	2014-04-22 10:14:41 +02:00
Simon Willnauer	41cc1f5bcb	[TEST] Ensure that iteration order of TestSection is consistent	2014-04-22 10:06:58 +02:00
Simon Willnauer	ae911f6e75	[TEST] Remove ambiguous 4th suggestion - order differs slightly on Java 8	2014-04-22 10:00:02 +02:00
javanna	918da65d35	[TEST] Added blacklist to be able to skip specific REST tests The blacklist can be provided through -Dtests.rest.blacklist and supports a comma separated list of globs e.g. -Dtests.rest.blacklist=get/10_basic/,index//* Also added some missing docs and made it clearer that the suite/test descriptions effectively contains their (relative) path (api/yaml_file/test section) Closes #5881	2014-04-22 09:52:48 +02:00
Andrew Selden	3121ad20dd	Return valid empty JSON response when no recovery information This is a fix to send back to the client a valid empty JSON response in the case when we have no recovery information. Closes #5743	2014-04-21 16:52:25 -07:00
Andrew Selden	1f7f72135a	Bug fix for hung clients on cluster without benchmark nodes This is a fix for a bug whereby a cluster that has no nodes started with -Des.node.bench=true will cause clients to hang if they attempt to submit a benchmark. Also adds REST tests to validate fix Closes #5754	2014-04-21 15:08:50 -07:00
Shay Banon	2f8fc98012	[TEST] make fetch time in millis test more resilient beef up the fetch work, and increase the number of iterations (since we count in nanos, but reports in rounded millis)	2014-04-22 00:00:08 +02:00
Shay Banon	aa86a51070	Use loopback when localhost is not resolved we use the "local host" address in several places in our networking layer, if local host is not resolved for some reason, still continue and operate but using the loopback interface	2014-04-21 20:55:03 +02:00
Simon Willnauer	f26e9e784f	Searcher might not be closed if store handle can't be obtained Today we first get a reference to the IndexSearcher in #acquireSearcher and then further down we try to run Store#incRef() which might throw an exception if the store is already closed. There is a small window that allows this to happen during InternalEngine#close() when we try to acquire the searcher at the same time and the engine is the last resource that holds a reference to the store. This commit only affects unreleased code since the Store's ref counting has not yet been released.	2014-04-21 20:45:38 +02:00
Boaz Leskes	baea1827d1	[Tests] SimpleRecoveryLocalGatewayTests.testSingleNodeNoFlush could fail if shards were not started The test starts a single node, indexes into, restarts the node and checks that no data was lost. It only indexed into 2 shards and didn't wait for green meaning that the node could be restarted with non-started primary. In that case the node will not re-assign the primary as it was not started. This commit makes sure that we either wait for primaries to start or index into all shards which has the same net effect. Also extending some logging in InternalIndexShard.	2014-04-21 11:44:16 +02:00
Boaz Leskes	2580099cf2	[Test] Let SuggestStatsTests.testSimpleStats do more work The test verifies that stats are measured by checking timeInMillis>0. On fast machines the suggestions are done in < 1 millis time. The test now indexes documents (to power suggestions) and does multiple suggestions per iteration to slow things down.	2014-04-19 17:46:52 +02:00
Boaz Leskes	12bbe28649	Fail replica shards locally upon failures When a replication operation (index/delete/update) fails to be executed properly, we fail the replica and allow master to allocate a new copy of it. At the moment, the node hosting the primary shard is responsible for notifying the master of a failed replica. However, if the replica shard is initializing (`POST_RECOVERY` state), we have a race condition between the failed shard message and moving the shard into the `STARTED` state. If the latter happens first, master will fail to resolve the fail shard message. This commit builds on #5800 and fails the engine of the replica shard if a replication operation fails. This protects us against the above as the shard will reject the `STARTED` command from master. It also makes us more resilient to other race conditions in this area. Closes #5847	2014-04-18 18:56:08 +02:00
Simon Willnauer	b6515e2979	[TEST] Make InternalEngineMergeTests more stable	2014-04-18 18:20:44 +02:00
javanna	442dda2ac8	[TEST] _id is not indexed by default, sort on score,_uid in MultiMatchQueryTests	2014-04-18 15:09:00 +02:00
Martijn van Groningen	a808fe9d46	Moved the updateMappingOnMaster logic into a single place. Closes #5798	2014-04-18 19:27:13 +07:00
javanna	d6a676724a	[TEST] added sort by "_id" when score is the same to MultiMatchQueryTests#testEquivalence A merge (and refresh) might rarely happen in the background between the two queries whose output is compared. It might then happen that two docs with same scores get returned by the two queries in a different order due to different lucene document id (which has changed in the meantime). To fix this we need to order by id when the score is the same, so that we can safely compare the output of the two queries (multimatch and dismax).	2014-04-18 12:15:44 +02:00
Martijn van Groningen	a73286bcc4	[TEST] Use startNodesAsync in unicast discovery tests.	2014-04-17 11:51:11 +07:00
Simon Willnauer	0948260ada	[TEST] make testTimeoutSendExceptionWithDelayedResponse more reliable on slow systems	2014-04-16 22:59:31 +02:00
Simon Willnauer	1755ae7470	Added version constants for 1.1.2 and 1.0.4	2014-04-16 17:21:19 +02:00
Boaz Leskes	0887e68d4b	[Test] InternalEngineTests: increased gc deletes interval & turn it off randomly	2014-04-16 15:59:56 +02:00
Simon Willnauer	26adb37f09	[TEST] Ignore bogus system properties. LuceneTestCase might reset some solr properties that cause our tests to fail if they run before in the same JVM We just ignore solr properties.	2014-04-16 15:19:17 +02:00
Simon Willnauer	3530c8be7e	[TEST] catch exceptions if TTL already expired when indexing TTLPercolatorTests indexes docs with small TTLs which can trigger AlreadyExpiredException exception. This is expected while rare and we should just catch them.	2014-04-16 15:10:28 +02:00
Simon Willnauer	be14968c44	Ensure close is called under lock in the case of an engine failure Until today we closed the engine without acquiring the write lock since most calls were still holding a read lock. This commit removes the code that holds on to the read lock when failing the engine, which means we can simply call #close()	2014-04-16 14:50:40 +02:00
Boaz Leskes	099b9c6b06	add debug logs if failed shards can not be resolved.	2014-04-16 14:45:54 +02:00
Martijn van Groningen	840d1b4b8e	[TEST] Reduce the amount of docs being indexed.	2014-04-16 15:49:24 +07:00
Martijn van Groningen	98deb5537f	Better deal with invalid scroll ids. Closes #5738	2014-04-16 14:13:29 +07:00
Simon Willnauer	8df5d4c37e	[TEST] Fix PercolatorTests#testSimple2 This test requires a mapping since otherwise, if there is no mapping added, the percolator query might not be parsed as a query on a numeric field since the query might arrive on a node before the dynamic mapping reached that node. This commit also moves the `indexService.readAllowed()` call up before the number of percolation queries is checked to make sure we fail if reads are not allowed - there might be a query in-flight which means we need to check another node rather than return an empty result.	2014-04-15 23:01:35 +02:00
Lee Hinman	65e72a5be5	[TEST] Wait for green, and refresh after indexing in percolator test	2014-04-15 11:19:41 -06:00
Simon Willnauer	c5c87c4a48	[TEST] Don't delete data dirs after test - only delete their content. Closes #5815	2014-04-15 17:03:31 +02:00
Simon Willnauer	320a206352	Switch back to ConcurrentMergeScheduler Load tests showed that SerialMS has problems to keep up with the merges under high load. We should switch back to CMS until we have a better story to balance merge threads / efforts across shards on a single node. Closes #5817	2014-04-15 16:42:23 +02:00
Adrien Grand	9920084ba2	[TEST] Wait for shards to be allocated before running testUpdateMappingDynamicallyWhilePercolating. If the percolate request is executed soon enough, all shards fail and the mapping is not actually updated.	2014-04-15 16:20:16 +02:00
Martijn van Groningen	202b1e2306	Update clusterstate if mapping service has local changes If a new field was introduced in the local mapping service during percolating, then those changes should be updated in the cluster state of the master as well. Closes #5776	2014-04-15 13:41:01 +02:00
Simon Willnauer	7c6d745523	Cleanup FileSystemUtils#mkdirs(File) This method had some workarounds for bugs that seem to be fixed in Java 7 [1]. There seem to be other problems on shared file-systems which are not really supported by lucene anyway or rather not recommended. Yet the current solution that interrupts a static thread reference is too dangerous given all the usage of NIO across elasticsearch. [1] http://bugs.java.com/bugdatabase/view_bug.do?bug_id=4742723	2014-04-15 13:22:51 +02:00
Simon Willnauer	8dd5dd409e	Remove FileSystemUtils#maxOpenFiles This method basically forcefully creates as many files as possible to find out the process limit in a brute-force manner. The number of possible problems with this approach would exceed the number of lines left on this commit message. This commit uses a JMX based alternative to print the process limit.	2014-04-15 13:22:51 +02:00
Shay Banon	bc5bdbc5de	Remove jsr166y now that we are on Java 7, cleanup jsr166e to classes we use	2014-04-15 13:17:28 +02:00
Simon Willnauer	8bede7024f	Use TransportBulkAction for internal request from IndicesTTLService This prevents executing the bulk's internal auto-create indices logic and ensures that this internal request never creates an index automatically. This fixes a bug where the TTL purger thread ran after the actual index it was purging was already closed / deleted and that re-created that index. Closes #5766	2014-04-15 12:40:25 +02:00
Igor Motov	3d23a71fa7	Fix snapshot status with empty repository The snapshot status command with empty repository should return current status of currently running snapshots in all repositories. Fixes #5790	2014-04-14 19:02:41 -04:00
Igor Motov	2ed8c632be	Separate persistent and global metadata serialization settings	2014-04-14 16:25:33 -04:00
Simon Willnauer	0564c883be	Remove unused FileSystemUtils#copyFile	2014-04-14 21:48:27 +02:00
Simon Willnauer	a215dd3ae8	Prevent fsync from creating 0-byte files This is related to LUCENE-5570 where fsync creates a 0-byte file if the file does not exist. This commit adds the patched lucene version using Java 7 APIs as well as a note to replace this method with the upcoming IOUtils#fsync in Lucene 4.8. This commit cleans up FsImmutableBlobContainer#writeBlob to make use of Java7 Auto-Closing features and ensures that the directory the blob was written to is fsynced as well if possible.	2014-04-14 21:48:23 +02:00
Adrien Grand	e458d4fd93	Improved SearchContext.addReleasable. For resources that have their life time effectively defined by the search context they are attached to, it is convenient to use the search context to schedule the release of such resources. This commit changes aggregations to use this mechanism and also introduces a `Lifetime` object that can be used to define how long the object should live: - COLLECTION: if the object only needs to live during collection time and is what SearchContext.addReleasable would have chosen before this change (used for p/c queries), - SEARCH_PHASE for resources that only need to live during the current search phase (DFS, QUERY or FETCH), - SEARCH_CONTEXT for resources that need to live until the context is destroyed. Aggregators are currently registered with SEARCH_CONTEXT. The reason is that when using the DFS_QUERY_THEN_FETCH search type, they are allocated during the DFS phase but only used during the QUERY phase. However we should fix it in order to only allocate them during the QUERY phase and use SEARCH_PHASE as a life time. Close #5703	2014-04-14 17:42:41 +02:00
Adrien Grand	e589301806	Make Releasable extend AutoCloseable. Java7's AutoCloseable allows to manage resources more nicely using try-with-resources statements. Since the semantics of our Releasable interface are very close to a Closeable, let's switch to it. Close #5689	2014-04-14 17:21:42 +02:00
Adrien Grand	e688f445ad	[TEST] Use indexRandom in ShardSizeTests.	2014-04-14 12:31:34 +02:00
Simon Willnauer	1ce56ff969	Revert "Don't lookup version for auto generated id and create" This reverts commit `dc73498454`.	2014-04-14 12:15:02 +02:00
Shay Banon	dc73498454	Don't lookup version for auto generated id and create When a create document is executed, and it's an auto generated id (based on UUID), we know that the document will not exist in the index, so there is no need to try and lookup the version from the index. For many cases, like logging, where ids are auto generated, this can improve the indexing performance, specifically for lightweight documents where analysis is not a big part of the execution.	2014-04-14 10:06:53 +02:00
Simon Willnauer	ad143e16cf	[TEST] Fix ClusterStatsTests#testValuesSmokeScreen to wait for yellow to get reliable FS stats.	2014-04-12 23:02:31 +02:00
Simon Willnauer	ec3c635696	[TEST] use a real upperbound for the check on the time spent during suggestions	2014-04-12 21:54:46 +02:00
Shay Banon	e9c0dd9ae4	[Test] should be abstract	2014-04-12 16:14:58 +02:00
Simon Willnauer	efb749936b	[TEST] Improve performance of MockBigArray MockPageRecycler	2014-04-11 23:02:59 +02:00
Simon Willnauer	5d611a9098	Ensure pending merges are updated on segment flushes Due to the default of `async_merge` to `true` we never run the merge policy on a segment flush which prevented the pending merges from being updated and that caused actual pending merges not to contribute to the merge decision. This commit also removes the `index.async.merge` setting, which is actually misleading since we take care of merges not being executed on the indexing threads on a different level (the merge scheduler) since 1.1. This commit also adds an additional check when to run a refresh since solely relying on the dirty flag might leave merges un-refreshed which can cause search slowdowns and higher memory consumption. Closes #5779	2014-04-11 23:02:59 +02:00
Boaz Leskes	e0fbd5df52	PR #5706 introduced a bug in the sparse array-backed field data When we load sparse single valued data, we automatically assign a missing value to represent a document that has none. We try to find a value that will increase the number of bits needed to represent the data. If that missing value happens to be 0, we do not properly initialize the value array. This commit solves this problem but also cleans up the code even more to make spotting such issues easier in the future.	2014-04-11 21:34:36 +02:00
Boaz Leskes	63d1fa45ab	Added awaitFix for SimpleNestedTests.testSortNestedWithNestedFilter While investigating failures	2014-04-11 18:12:35 +02:00
Boaz Leskes	f549472fea	Fixed- PackedArrayIndexFieldData.chooseStorageFormat compared to Long.MAX_VALUE instead of Long.MIN_VALUE Also made the LongFieldDataTests.SINGLE_VALUED_SPARSE_RANDOM & LongFieldDataTests.MULTI_VALUED_SPARSE_RANDOM more sparse	2014-04-11 16:40:47 +02:00
Boaz Leskes	1d1ca3befc	Added a AppendingDeltaPackedLongBuffer-based storage format to single value field data The AppendingDeltaPackedLongBuffer uses delta compression in paged fashion. For data which is roughly monotonic this results in reduced memory signature. By default we use the storage format expected to use the least memory. You can force a choice using a new field data setting `memory_storage_hint` which can be set to `ORDINALS`, `PACKED` or `PAGED` Closes #5706	2014-04-11 15:50:34 +02:00
Chris Earle	e8ea9d7585	Strengthening pseudo random number generator and adding tests to verify its behavior. Closes #5454 and #5578	2014-04-11 14:01:40 +02:00
Martijn van Groningen	45a1b44759	Each search request should use a new InternalSearchResponse instance even in case when all shards return no hits. The InternalSearchResponse may get modified afterwards, so a new instance required at all times.	2014-04-11 17:44:21 +07:00
Simon Willnauer	862611b792	[TEST] Prevent TTLPurger from recreating deleted index Related to #5766	2014-04-11 09:03:28 +02:00

4263 Commits