As a CliTool command could potentially also delete files, the
CheckFileCommand needs to check that those files exist before
trying to get permissions/owners/groups from that path.
When using the CLI tool infrastructure, a command can potentially write
a new file. In case it overwrites an existing one, you may want to ensure
that the permissions, the owner and the group are kept the same and do not
accidentally change when overwriting those files.
This PR introduces a command that allows you to execute this check per path.
It also adds a new testing dependency, namely jimfs, which allows you to create
in-memory filesystems with certain properties (like supporting POSIX permissions
or not), so that you can test these features without having to run the
tests on a particular operating system.
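As a rough sketch of what such a test setup could look like, assuming jimfs 1.x's `Configuration` builder API (class and file names here are hypothetical):

```java
import com.google.common.jimfs.Configuration;
import com.google.common.jimfs.Jimfs;

import java.nio.file.FileSystem;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.util.Set;

public class JimfsPosixExample {
    public static void main(String[] args) throws Exception {
        // in-memory unix-like filesystem with POSIX attribute support enabled
        FileSystem fs = Jimfs.newFileSystem(Configuration.unix().toBuilder()
                .setAttributeViews("basic", "owner", "posix", "unix")
                .build());
        Path config = fs.getPath("/config/elasticsearch.yml");
        Files.createDirectories(config.getParent());
        Files.createFile(config);
        // permissions can be read just like on a real POSIX filesystem
        Set<PosixFilePermission> perms = Files.getPosixFilePermissions(config);
        System.out.println(perms);
    }
}
```

Building the filesystem from `Configuration.windows()` instead yields one without POSIX support, so both branches of the permission-handling code can be exercised regardless of the OS the tests run on.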
The FileSystemUtils class has a helper method to create files with
a .new suffix in case the file that should be created already exists.
If you install plugins that ship configuration files, you will end up
with tons of .new files even when nothing changed.
This commit checks the file size and SHA-256 sum, and only if those
differ is a .new file actually created.
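A minimal sketch of that check, using only standard `java.nio.file` and `java.security` APIs (the helper names are hypothetical, not the actual FileSystemUtils code):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

final class FileContentComparison {

    /** Returns true if both files have the same size and the same SHA-256 digest. */
    static boolean sameContent(Path a, Path b) throws IOException {
        if (Files.size(a) != Files.size(b)) {
            return false; // cheap size check first, avoids hashing entirely
        }
        return Arrays.equals(sha256(a), sha256(b));
    }

    private static byte[] sha256(Path path) throws IOException {
        try {
            return MessageDigest.getInstance("SHA-256").digest(Files.readAllBytes(path));
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError("every JVM is required to support SHA-256", e);
        }
    }
}
```

Only if `sameContent` returns false does a .new file need to be written at all.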
On CI machines node recovery sometimes takes up to 2 seconds. When this happens, an update cluster state task gets stuck behind the recovery and tests fail with a 1 second timeout. This commit makes sure that we wait for the recovery to complete before starting the clock.
This has been very trappy. Rather than continue to allow the buggy behavior
of having upgrade/optimize requests sidestep the single-shard-per-node
limits that optimize is supposed to be subject to, this removes
the ability to run upgrade/optimize asynchronously.
closes #9638
Unfortunately the lock order is important in the current flush code. We have to acquire the read lock first, otherwise
if we are flushing at the end of the recovery while holding the write lock we can deadlock if:
* Thread 1: flushes via API, gets the flush lock, but blocks on the read lock since Thread 2 has the write lock
* Thread 2: flushes at the end of the recovery holding the write lock and blocks on the flush lock owned by Thread 1
This commit acquires the read lock first, which would be acquired further down anyway for the duration of the flush.
As a side effect we can now safely flush on calling close() while holding the write lock.
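Put differently, the fix imposes a single global acquisition order (read lock before flush lock) on every flush path. A minimal sketch of the pattern with plain `java.util.concurrent` locks (hypothetical names, not the actual engine code):

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class FlushLockOrdering {
    private final ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
    private final Lock flushLock = new ReentrantLock();

    /** Every flush path acquires the read lock first, then the flush lock. */
    void flush() {
        rwl.readLock().lock();       // 1st: read lock
        try {
            flushLock.lock();        // 2nd: flush lock
            try {
                // ... write segments, commit ...
            } finally {
                flushLock.unlock();
            }
        } finally {
            rwl.readLock().unlock();
        }
    }
}
```

With this order a thread can never hold the flush lock while waiting for the read lock, so the cycle from the scenario above cannot form.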
Today the logic related to deleting an index is spread across several
classes, which makes changes to this rather delicate part of the codebase
very difficult. This commit consolidates this logic into the IndicesService
and moves the handling of ack-ing the delete to the master entirely into
`IndicesClusterStateService`.
The percolate api doesn't parse the encoded body provided as the `source` query string parameter when percolating an existing document. Fixed, and added a REST test that would have caught this, since we randomly use GET + encoded `source` param instead of GET + request body in our java runner (the perl runner does the same too).
Closes #9628
_id and _routing now no longer support the 'path' setting on indexes
created with 2.0. Indexes created before 2.0 still support this
setting for backcompat.
closes #6730
Improve cleanup of updateTask timeout handlers. The timeout handlers should be removed as soon as a corresponding update task is processed. Otherwise, timeout handlers might keep old updateTasks, and all objects that they point to, in memory for the duration of the timeout (15 minutes by default).
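A minimal sketch of the pattern with plain `java.util.concurrent` (hypothetical names, not the actual ClusterService code): cancel the scheduled timeout handler as soon as the task runs, so the handler and everything it references become collectible immediately:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

final class TimeoutCleanupExample {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    void submit(Runnable updateTask, Runnable onTimeout, long timeoutMinutes) {
        // schedule the timeout handler up front
        ScheduledFuture<?> timeoutFuture =
                scheduler.schedule(onTimeout, timeoutMinutes, TimeUnit.MINUTES);
        // when the task is processed, cancel the handler right away so it
        // does not pin the old updateTask (and its object graph) in memory
        scheduler.execute(() -> {
            try {
                updateTask.run();
            } finally {
                timeoutFuture.cancel(false);
            }
        });
    }
}
```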
Fixes #9621
The engine is already pretty complex, yet it's still conflated with
code that doesn't necessarily belong there. Updating the settings from
the settings service can be done on the level above. This commit cleans up
the settings code in the engine and moves it to the IndexShard.
"The OpenGIS Abstract Specification: An Object Model for Interoperable Geoprocessing" published by the OGC defines "The boundary of a geometric object is a set of geometric objects of the next lower dimension." The bounding box of a GeometryCollection is therefore the set of bounding rectangles derived from the geometric objects of the next lower dimension. This commit updates the computeBoundingBox and relate methods for the ShapeCollection base class to correctly determine the prefixTree detail level used in Lucene's FilterCellIterator.
closes #9360
Until recently we couldn't close the engine in a tragic event due to
the lock order and all its complications. Now that the engine
is much more simplified in terms of having a single IndexWriter etc.,
we don't necessarily need the write lock on close anymore and can
easily just close and continue.
InternalEngine contains a number of inner classes that it uses; however,
this makes the class overly large and hard to extend. In order to be
able to easily add other Engines (such as the ShadowEngine), these
helpers have been extracted into an AbstractEngine class. The
classes that were previously in `InternalEngine` have been moved to
separate classes, which will allow for better unit testing as well.
None of the functionality of InternalEngine has been changed; this is
only refactoring.
Note that this is a change I originally made on my shadow-replica
branch, however it is easier to review piecemeal so I extracted it into
a separate PR.
Sometimes, by the time update settings is called, the second node is not yet in the cluster. As a result, the change of the minimum master nodes setting to 2 is ignored, causing this test to fail.
Add offset option to 'date_histogram', replacing and simplifying the previous 'pre_offset' and 'post_offset' options.
This change is part of a larger clean up task for `date_histogram` from issue #9062.
Due to the possibility of a randomly chosen port being already in use,
it makes sense to simply repeat a unit test upon a bind exception.
This commit adds a JUnit rule which does exactly this, and does not
require you to change the test code and add loops.
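A rough sketch of what such a rule could look like in JUnit 4 (hypothetical, not the actual implementation added by this commit):

```java
import java.net.BindException;

import org.junit.rules.TestRule;
import org.junit.runner.Description;
import org.junit.runners.model.Statement;

/** Re-runs a test when it fails with a BindException, up to maxAttempts times. */
public class RetryOnBindExceptionRule implements TestRule {
    private final int maxAttempts;

    public RetryOnBindExceptionRule(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    @Override
    public Statement apply(final Statement base, Description description) {
        return new Statement() {
            @Override
            public void evaluate() throws Throwable {
                BindException last = null;
                for (int i = 0; i < maxAttempts; i++) {
                    try {
                        base.evaluate();
                        return; // test passed
                    } catch (BindException e) {
                        last = e; // port was taken, retry with a new random port
                    }
                }
                throw last;
            }
        };
    }
}
```

In a test class the rule would be used as `@Rule public RetryOnBindExceptionRule retry = new RetryOnBindExceptionRule(3);` with no other changes to the test methods.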
Closes #9010
In the ShardRecoveryHandler we issue cluster update tasks to update the
mapping. The anonymous inner class backreferences the ShardRecoveryHandler,
which holds a potentially large IndexShard object (which references buffers & caches etc).
If the queue of update tasks piles up and recoveries get cancelled and/or shards are closed,
the ShardRecoveryHandler can't be GCed. This commit moves the update task into a static
inner class to allow the GC to do its job.
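The underlying Java mechanics: an anonymous inner class keeps an implicit reference to its enclosing instance, while a static nested class only references what it is explicitly given. A minimal illustration (hypothetical names):

```java
public class EnclosingInstanceCapture {
    private final byte[] bigBuffer = new byte[64 * 1024 * 1024]; // stands in for IndexShard state

    // BAD: the anonymous class implicitly captures EnclosingInstanceCapture.this,
    // so a queued task keeps bigBuffer reachable until the task is processed.
    Runnable leakyTask(final String index) {
        return new Runnable() {
            @Override
            public void run() {
                System.out.println("update mapping for " + index);
            }
        };
    }

    // GOOD: a static nested class holds only the fields it actually needs.
    static final class MappingUpdateTask implements Runnable {
        private final String index;

        MappingUpdateTask(String index) {
            this.index = index;
        }

        @Override
        public void run() {
            System.out.println("update mapping for " + index);
        }
    }
}
```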
Closes #9587
Squashed commit of the following:
commit 23ac91dca4b949638ca1d3842fd6db2e00ee1d36
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Feb 5 18:42:28 2015 +0100
Do not compute scores if aggregations do not need it (like top_hits) or use a script (which might compute scores).
commit 51262fe2681c067337ca41ab88096ef80a2e8ebb
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Feb 5 15:58:38 2015 +0100
Fix more compile errors.
commit a074895d55b8b3c898d23f7f5334e564d5271a56
Author: Robert Muir <rmuir@apache.org>
Date: Thu Feb 5 09:31:22 2015 -0500
fix a few more obvious ones
commit 399c41186cb3c9be70107f6c25b51fc4844f8fde
Author: Robert Muir <rmuir@apache.org>
Date: Thu Feb 5 09:28:32 2015 -0500
fix some collectors and queries
commit 5f46c2f846c5020d5749233b71cbe66ae534ba51
Author: Robert Muir <rmuir@apache.org>
Date: Thu Feb 5 09:24:24 2015 -0500
upgrade to lucene r1657571
After phase1 of recovery is completed, we check that all pending mapping changes have been sent to the master and processed by the other nodes. This is needed in order to make sure that the target node has the latest mapping (we just copied over the corresponding lucene files). To make sure we do not miss updates, we do so under a local cluster state update task. At the moment we don't have a timeout when waiting on the task to be completed. If the local node update thread is very busy, this may stall the recovery for too long. This commit adds a timeout (equal to `indices.recovery.internal_action_timeout`) and upgrades the task urgency to `IMMEDIATE`. If we fail to perform the check, we fail the recovery.
Closes #9575
This commit adds more logs around the gateway shard allocation. Any errors while reaching out to nodes to list the local shards are logged at `WARN`. Shard info loading time is logged at `DEBUG`. Also, we log a `WARN` message if an exception forces a full checksum check while reading the store metadata.
Closes #9562
That method checks that files were released properly, but it also clears a static map holding references to mock directories. Since we iterate over many indexes this created memory pressure.
This commit removes the FlushType entirely and replaces it in most places with
a simple `Engine#flush()` call. Flushing without committing the translog is now
entirely private to the engine and is only called in one place.
The `full` option and `FlushType.NEW_WRITER` only exist to allow
realtime changes to two settings (`index.codec` and `index.concurrency`).
Those settings are very expert and don't really need to be updateable
in realtime.
When the master publishes a new cluster state it waits (by default) for up to 30s for all nodes to respond. If they don't, it continues to process other pending tasks. At the moment, this timeout is logged at DEBUG, but it typically represents a serious issue with one or more of the nodes. We should log it at WARN and name the nodes that failed to respond in a timely fashion.
Closes #9551
This regression was introduced in #6908: the conversion from RandomAccessOrds
to SortedBinaryDocValues goes through Strings while both impls actually work
on BytesRef, so the SortedBinaryDocValues instance could directly return the
BytesRefs returned by the RandomAccessOrds.
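A sketch of the direct wrapping, assuming the Lucene 5 era `RandomAccessOrds` API and the ES 1.x `SortedBinaryDocValues` contract of `setDocument`/`count`/`valueAt` (method names from memory, not verified against the patch):

```java
import org.apache.lucene.index.RandomAccessOrds;
import org.apache.lucene.util.BytesRef;

// Exposes per-document ordinals as binary values without a String round-trip.
final class OrdsBackedBinaryValues /* extends SortedBinaryDocValues */ {
    private final RandomAccessOrds ords;

    OrdsBackedBinaryValues(RandomAccessOrds ords) {
        this.ords = ords;
    }

    public void setDocument(int docId) {
        ords.setDocument(docId);
    }

    public int count() {
        return ords.cardinality();
    }

    public BytesRef valueAt(int index) {
        // return the BytesRef directly instead of converting through Strings
        return ords.lookupOrd(ords.ordAt(index));
    }
}
```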
Close #9306
We currently have the IndicesRequest interface to mark indices-related requests and be able to retrieve the indices they relate to in a generic way. This commit introduces a similar abstraction for requests that manage aliases, to be able to retrieve/replace the aliases they relate to.
Also, IndicesAliasesRequest becomes a CompositeIndicesRequest, as it allows performing multiple operations (e.g. add/remove multiple aliases). Each single operation (AliasActions) now implements the newly introduced AliasesRequest.
AliasesRequest is also implemented by GetAliasesRequest, which allows retrieving aliases information.
Closes #9460
In big deployments the ClusterState can be large. To make sure we keep reusing objects that were promoted to the Old Gen, ZenDiscovery has an optimization where it tries to reuse existing IndexMetaData objects (containing among other things the mappings) from the current cluster state if they didn't change. The comparison currently uses the index name and the metadata version. This is however not enough, and we should also check the index uuid. In extreme cases, where cluster state processing is slow and the index in question is deleted and recreated, and these operations are batch-processed together, we can use the wrong metadata if the version is also identical. This can happen if people create the index with all metadata predefined and no settings are changed.
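A sketch of the strengthened reuse check (self-contained with hypothetical field names, not the actual ZenDiscovery code):

```java
final class IndexMetaDataReuse {
    static final class Meta {
        final String index;  // index name
        final long version;  // metadata version
        final String uuid;   // unique per incarnation of the index

        Meta(String index, long version, String uuid) {
            this.index = index;
            this.version = version;
            this.uuid = uuid;
        }
    }

    // Name and version alone are not enough: a deleted-and-recreated index
    // can match both, so the UUID must be identical as well before the old
    // IndexMetaData object may be reused.
    static boolean canReuse(Meta current, Meta incoming) {
        return current.index.equals(incoming.index)
                && current.version == incoming.version
                && current.uuid.equals(incoming.uuid);
    }
}
```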
Closes #9489, Closes #9541
When closing an instance of RestClient, the connection manager gets shut down, which makes it not usable anymore. If that manager is static, as it is now, no RestClient will work anymore from that moment on. Each instance of RestClient should have its own instance of the connection manager.
This method is heavy as it builds a bitset out of a DocIdSet in order to be
able to provide random access. Now that Lucene has removed out-of-order scoring,
true random access is very rarely needed, and we could instead return a Bits
instance that wraps the iterator. Ideally we would use the DISI API directly,
but I have to admit that the Bits API is more friendly.
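A rough sketch of such a wrapper, valid only for ascending doc id access, which is what in-order scoring guarantees (assuming Lucene's `Bits` and `DocIdSetIterator` APIs):

```java
import java.io.IOException;

import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.util.Bits;

// Presents a DocIdSetIterator through the Bits API without materializing
// a bitset. Callers must ask for doc ids in non-decreasing order.
final class IteratorBackedBits implements Bits {
    private final DocIdSetIterator iterator;
    private final int maxDoc;

    IteratorBackedBits(DocIdSetIterator iterator, int maxDoc) {
        this.iterator = iterator;
        this.maxDoc = maxDoc;
    }

    @Override
    public boolean get(int index) {
        try {
            if (iterator.docID() < index) {
                iterator.advance(index); // moves to the first doc >= index
            }
            return iterator.docID() == index;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    @Override
    public int length() {
        return maxDoc;
    }
}
```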
Close #9546
We had a REST test that relied on matching a json response against a regex. It worked, but the match wasn't done against the actual json object; it was done against its java map representation converted into a string by calling `toString`. Since all the other clients' test runners don't work in this case, as they try to match a json object against a regex, we should do the same and prevent it from working.
The current "checkindex" on startup is very very expensive. This is
like running one of the old school hard drive diagnostic checkers and
usually not a good idea.
But we can do a CRC32 verification of files. We don't even need to
open an indexreader to do this, its much more lightweight.
This option (as well as the existing true/false) are randomized in
tests to find problems.
Also fix bug where use of the current option would always leak
an indexwriter lock.
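For reference, a CRC32 pass over a file needs nothing more than streaming its bytes through `java.util.zip.CRC32`; a standalone sketch (the Lucene-integrated code instead verifies the checksum stored in each file's footer):

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

final class Crc32Verifier {

    /** Streams the file through CRC32; the result is compared to the expected checksum. */
    static long crc32(Path file) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                crc.update(buffer, 0, read);
            }
        }
        return crc.getValue();
    }
}
```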
Closes #9183
Histogram aggregation supports an 'offset' option to move bucket boundaries.
In a histogram with buckets of size X, the boundaries can be moved from
0, X, 2X, 3X, ... by an offset value Y to Y, X+Y, 2X+Y, 3X+Y, ... using the 'offset' option.
The previous 'pre_offset' and 'post_offset' options are removed in favour of
the simplified 'offset' option.
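The bucket key arithmetic this implies, as a sketch (`Math.floorDiv` keeps negative values in the correct bucket):

```java
// For bucket size X (interval) and offset Y, the bucket containing `value`
// starts at floor((value - Y) / X) * X + Y, i.e. at ..., Y-X, Y, X+Y, 2X+Y, ...
static long bucketKey(long value, long interval, long offset) {
    return Math.floorDiv(value - offset, interval) * interval + offset;
}
```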
Closes #9417, Closes #9505
Until now, there was no way to expose information about configured
transport profiles. This commit adds the ability to expose this
information in the TransportInfo class.
The channel as well as the netty pipeline handler now also contain
the profile they were configured for, as this information cannot be
extracted elsewhere.
In addition, each profile can now set its own publish host and port,
which might be needed in case of port forwarding or when using docker.
Closes #9134
This is similar to refresh: if we fail to commit the data we have to fail the
engine, since the in-RAM data is likely discarded. The data is still in the translog
and might be recoverable when the node is restarted, but we have to treat the engine as failed.
This callback is executed only once, on the master node during an
index's creation. An exception thrown from this listener will cancel
the index creation.
This also adds checks in `IndicesClusterStateService` for the
indexService being null, as well as for `indicesService.createIndex`
throwing an exception on data nodes after an index has already been
created.
#8720 introduced a timeout mechanism for ongoing recoveries, based on a last access time variable. In the many iterations on that PR the update of the access time was lost. This adds it back, including a test that should have been there in the first place.
Closes #9506
The query cache has an optimization to not deserialize the bytes at the shard
level. However this is a bit fragile, since it assumes that serialized streams
can be concatenated (which is not the case with shared strings) and it also does
not update the QueryResult object that is held by the SearchContext, so you
need to make sure to use the right one.
With this change, the query cache just deserializes bytes into the QueryResult
object from the context.
Close #9500
Due to some unreleased refactorings we lost the persistence of
previously set values in MergePolicyProvider. This commit adds this
back and adds a simple unit test.
Closes #8890
Currently, doing a field lookup within a terms agg will restrict the
fields available to those within the types passed into the search
request. However, when doing sub-aggs within a children agg, the
fields available should not be restricted to those of the search.
This change makes the field lookup use the index-level mapper service.
The optimization we do in HandlesStreamInput / Output
adds a lot of complexity for a rather unknown benefit. It tries
to compress commonly used strings by writing ids instead. This
should rather be done on a lower level, if at all necessary, for
the small messages we send over the network.
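For context, the technique being removed is essentially a per-stream string table, roughly like this simplified sketch (hypothetical, not the actual HandlesStreamOutput code):

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Writes each distinct string once; repeats are replaced by a small id.
// A matching reader keeps the mirror table and resolves ids back to strings.
final class StringTableWriter {
    private final Map<String, Integer> seen = new HashMap<>();
    private final DataOutputStream out;

    StringTableWriter(DataOutputStream out) {
        this.out = out;
    }

    void writeString(String s) throws IOException {
        Integer handle = seen.get(s);
        if (handle != null) {
            out.writeByte(1);     // marker: previously seen string
            out.writeInt(handle); // write only the id
        } else {
            seen.put(s, seen.size());
            out.writeByte(0);     // marker: new string
            out.writeUTF(s);      // write the full string once
        }
    }
}
```

The mirrored state on both sides of the connection is exactly what makes the approach complex, which is the motivation for removing it.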
Today we have a dirty flag indicating that a refresh must
be executed. We also allow users to bypass this by setting
a force=true boolean on the refresh request / command. All
these flags are unneeded, since the SearcherManager has all
the information to do the right thing depending on whether it's dirty or not.
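Lucene's `SearcherManager` already exposes this: `maybeRefresh()` is cheap when nothing has changed. A minimal sketch:

```java
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.SearcherManager;

final class RefreshExample {
    // SearcherManager tracks whether the underlying reader is current, so
    // no external dirty flag (or force=true override) is needed; calling
    // maybeRefresh() is a near no-op when there is nothing to refresh.
    static IndexSearcher refreshAndAcquire(SearcherManager manager) throws Exception {
        manager.maybeRefresh();
        return manager.acquire(); // caller must call manager.release(searcher) when done
    }
}
```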
BytesStreamOutput allows passing the expected size, but by default it uses
BigArrays.PAGE_SIZE_IN_BYTES, which is 16k. A common cached result, e.g.
a date histogram with 3 buckets, is ~100 bytes, so 16k might be very wasteful,
since we don't shrink to the actual size once we are done serializing.
By passing 512 as the expected size we will resize the byte array in the stream
slowly until we hit the page size, and we don't waste too much memory for small query
results.
This PR removes the optimization for auto-generated ids.
Previously, when ids were auto-generated by elasticsearch, there was no
check to see if a document with the same id already existed; instead the new
document was simply appended. However, due to lucene improvements this
optimization does not add much value. In addition, under rare circumstances it might
cause duplicate documents:
When an indexing request is retried (due to connection lost, node closed, etc.),
a flag 'canHaveDuplicates' is set to true for the indexing request
that is sent the second time. This was to make sure that even
when an indexing request for a document with an autogenerated id comes in,
we do not have to update unless this flag is set, and instead only append.
However, it might happen that, for a retry or for the replication, the
indexing request that has canHaveDuplicates set to true (the retried request) arrives
at the destination before the original request that has it set to false.
In this case both requests add a document and we end up with a duplicated document.
This commit adds a workaround: remove the optimization for auto-generated
ids and always update the document.
The assumption is that this will not slow down indexing by more than 10 percent,
see: http://benchmarks.elasticsearch.org/
closes #8788, closes #9468
Today we use the length of the BytesReference, which is misleading since
the reference is paged, such that length != ramBytesUsed. This can lead
to a much higher memory consumption than expected if query results are tiny,
since each query result requires at least 16kb. We should still rethink this
strategy for query results that are very small, i.e. less than 20% of the ramBytesUsed,
but this commit first makes the accounting correct.
Additionally, this setting can be specified in elasticsearch.yml if
desired, to pre-populate the list of methods to be added to the default
blacklist.
When making a change to this setting dynamically, the entire blacklist
is logged as well.
The `analyzer` setting is now the base setting, and `search_analyzer`
is simply an override of the search time analyzer. When setting
`search_analyzer`, `analyzer` must be set.
closes #9371
Using the `script.groovy.sandbox.method_blacklist_patch` setting, the
blacklist can be dynamically *added* to by specifying a comma-separated
list of methods (for example, "toString,size" would add .toString and
.size to the blacklist).
When the `script.groovy.sandbox.method_blacklist_patch` setting is
changed, the script cache is cleared to force new scripts to be
recompiled. Additionally the on-disk cache is cleared so that scripts in
the `config/scripts` directory are re-compiled as well.
This also fixes an issue where script engines were injected more than
once, which can cause multiple instances of the script engine per node.
Extended_stats now displays the upper and lower bounds on standard deviations (e.g. avg +/- sigma * std).
The default is to show 2 std above/below, but this can be changed using the `sigma` parameter,
which accepts non-negative doubles.
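The bounds arithmetic, as a sketch:

```java
// extended_stats reports:  upper = avg + sigma * std,  lower = avg - sigma * std
// sigma defaults to 2 and must be non-negative.
static double[] stdDevBounds(double avg, double std, double sigma) {
    if (sigma < 0) {
        throw new IllegalArgumentException("sigma must be non-negative");
    }
    return new double[] { avg + sigma * std, avg - sigma * std };
}
```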
Closes #9356
This change makes InternalHistogram the only InternalAggregation used by the Histogram Aggregator. There is still a separate Bucket implementation and Factory implementation. All buckets are created through the factory passed into the InternalHistogram, and the correct factory implementation is serialised as part of the aggregation to make sure the correct bucket types are always generated.
This is needed by the Transformers (namely the derivative transformer) to allow it to generate buckets of the right type without having to know what the underlying bucket implementation is.
To properly replicate, we currently stop flushing during recovery so we can replay the translog once copying files is done. Once recovery is done, the translog will be flushed by a background thread that, by default, kicks in every 5s. In case of a recovery failure and a quick re-assignment of a new shard copy, we may fail to flush before starting a new recovery, causing it to deal with a potentially even longer translog. This commit makes sure we flush immediately when the ongoing recovery count goes to 0.
I also added a simple recovery benchmark.
Closes #9439
If an index is deleted during the initial state of the snapshot operation, the entire snapshot can fail with an NPE. This commit improves the handling of this situation and allows the snapshot to continue if partial snapshots are allowed.
Closes #9024
PR #8672 addresses ambiguous polygons - those that either cross the dateline or span the map - by complying with the OGC standard right-hand rule. Since ```GeoPolygonFilter``` is self-contained logic, the fix in #8672 did not address the issue for the ```GeoPolygonFilter```. This was identified in issue #5968.
This fixes the ambiguous polygon issue in ```GeoPolygonFilter``` by moving the dateline crossing code from ```ShapeBuilder``` to ```GeoUtils``` and reusing the logic inside the ```pointInPolygon``` method. Unit tests are added to ensure support for coordinates specified in either standard lat/lon or great-circle coordinate systems.
closes #5968, closes #9304