OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-02-19 19:35:02 +00:00

Author	SHA1	Message	Date
Simon Willnauer	162ad1251c	Fsync documents in an async fashion (#20145) Today we fsync in a blocking fashion where all threads block while another syncs. Yet, we can improve this and make use of the async infrastructure added for `wait_for_refresh` and make fsyncing single threaded while all other threads can continue indexing. The syncing thread then notifies a listener once the request's location is synced. This also allows docs to be sent to replicas before they are actually fsynced, allowing for concurrent replica processing. This patch has a significant impact on performance on slower disks. An initial single node benchmark shows that on very fast SSDs there is no noticeable impact but on slow spinning disk this patch shows a ~32% performance improvement. ``` NVME SSD: 336ec0ac9a12b967163a4a21f75beb41c8582cde (master): Total docs/sec: 47200.9 Total docs/sec: 46440.4 23543a97e3e7f72a31e26b50e00931919784426c (async wait for translog): Total docs/sec: 47461.6 Total docs/sec: 46188.3 ------------------------------------------------------------------- Spinning disk: 336ec0ac9a12b967163a4a21f75beb41c8582cde (master): Total docs/sec: 22733.0 Total docs/sec: 24129.8 23543a97e3e7f72a31e26b50e00931919784426c (async wait for translog): Total docs/sec: 32724.1 Total docs/sec: 32845.4 -------------------------------------------------------------------- ```	2016-08-27 21:42:38 +02:00
Igor Motov	3d6270b5cd	Don't rebuild pipeline on every cluster state update Currently, after at least one pipeline is registered it is getting rebuilt on every single cluster state update, even when this update is not related to ingest metadata. This change adds a check that the ingest metadata changed before trying to rebuild all pipelines.	2016-08-27 10:11:51 -04:00
Yannick Welsch	1b75cb63a2	Add recovery source to ShardRouting (#19516) Adds an explicit recoverySource field to ShardRouting that characterizes the type of recovery to perform: - fresh empty shard copy - existing local shard copy - recover from peer (primary) - recover from snapshot - recover from other local shards on same node (shrink index action)	2016-08-27 16:11:10 +02:00
qwerty4030	9172653211	Fix NPE during search with source filtering if the source is disabled. (#20093) * Fix NPE during search with source filtering if the source is disabled. Instead of throwing an NPE, a search response with source filtering will not contain the source if it is disabled in the mapping. Closes #7758 * Created unit tests for FetchSourceSubPhase. Tests similar to SourceFetchingIT. Removed SourceFetchingIT#testSourceDisabled (now covered via unit test FetchSourceSubPhaseTests#testSourceDisabled). * Updated FetchSourceSubPhase unit tests per comments. Renamed main unit test method. Use assertEquals and assertNull instead of assertThat (less code).	2016-08-27 07:24:45 -04:00
Ali Beyad	230f0b514f	Fixes test to use admin client to check the cluster state instead of a random node's cluster service.	2016-08-27 01:29:29 -04:00
Ali Beyad	5fac32e699	Removed an unnecessary TODO for snapshot file restoration and instead added comments explaining what happens during the restore process.	2016-08-26 17:13:14 -04:00
Nik Everett	3fe42beb64	Cleanup some docs Mark one `// NOTCONSOLE`, mark another `[source,painless]`, and another `// TESTRESPONSE` and fix a bug in it.	2016-08-26 16:01:07 -04:00
Lee Hinman	abdd1b6f86	Merge remote-tracking branch 'dakrone/prop-script-settings'	2016-08-26 13:53:48 -06:00
Nik Everett	52f23918c2	Use `painless` as language for painless snippets (#20185) The syntax highlighter does a decent job when you do this. This lets us `grep` for painless snippets in the docs. Closes #20025	2016-08-26 15:39:44 -04:00
Lee Hinman	3fbfb3e7e7	Fix propagating the default value for script settings Fixes an issue where the value for the `script.engine.<lang>.inline` settings would be _set_ properly, but would not accurately be reflected in the `include_defaults` output. Adds a test to ensure the default raw setting is now correct. Resolves #20159	2016-08-26 13:03:32 -06:00
Nicolas Ruflin	4ab1093564	Add reindex example on how to reindex daily indices (#18654) This can be a common case with beats in case the template changes between two versions and the old data should be reindexed with the new templates.	2016-08-26 13:08:52 -04:00
Josh Becker	3c24ea43fd	Docs: Remove extra word from phrase-suggester	2016-08-26 13:02:54 -04:00
Martijn van Groningen	48926b4d66	ingest: don't render template twice for append processor	2016-08-26 18:07:32 +02:00
Xiang Chen	22242ec881	Fix request cache key for search * Make sure indexBoost is serialized in a consistent order * remove hasIndexBoost by using indexBoost size * Make sure phrase suggester's collateParams is serialized in consistent order * Make StreamOutput writer to serialize maps in consistent order	2016-08-26 12:03:24 -04:00
Chris Earle	bd0b06440e	Add "Async" to the end of each Async RestClient method This makes it much harder to accidentally miss the Response.	2016-08-26 10:51:33 -04:00
Jason Tedor	287cb00474	Avoid prematurely triggering logger initialization The class Setting holds a static reference to a deprecation logger instance. When the class initializer for Setting runs, it starts triggering log4j initialization. There is a chain of initializations from InternalSettingsPreparer to Environment to Setting that triggers this initialization before log4j configuration has occurred. This commit modifies this initialization so that initialization is not done eagerly. Relates #20170	2016-08-26 05:07:05 -04:00
Martijn van Groningen	7c9af98a3c	docs: add sort workaround	2016-08-26 10:55:42 +02:00
Adrien Grand	3ed0da5a58	GET operations should not extract fields from `_source`. #20158 This makes GET operations more consistent with `_search` operations which expect `(stored_)fields` to work on stored fields and source filtering to work on the `_source` field. This is now possible thanks to the fact that GET operations do not read from the translog anymore (#20102) and also allows to get rid of `FieldMapper#isGenerated`. The `_termvectors` API (and thus more_like_this too) was relying on the fact that GET operations would extract fields from either stored fields or the source so the logic to do this that used to exist in `ShardGetService` has been moved to `TermVectorsService`. It would be nice that term vectors do not rely on this, but this does not seem to be a low hanging fruit.	2016-08-26 10:35:23 +02:00
Yannick Welsch	6fe9ae29ea	Mark shard as stale on non-replicated write, not on node shutdown (#20023) Non-stale shard copies are currently tracked using their allocation ids in the cluster state. When a node leaves the cluster, shard copies of that node are marked as stale by removing their allocation ids from the active set in the cluster. For full cluster restarts, this can have the unwanted effect that only the last node holding a copy of the shard will be seen as non-stale. The other shard copies are not really stale though as long as no writes have happened on this shard copy. Shard copies should thus only be marked as stale (by the master in the cluster state) if other active shards have received writes. This commit implements the above logic and also renames the persistent structure used to track non-stale shard copies from "active_allocations" to "in_sync_allocations" as we now also support tracking non-stale shard copies that have no active routing entries in the cluster state.	2016-08-26 10:09:57 +02:00
Adrien Grand	c5f8e1b64d	Do not parse numbers as both strings and numbers when not included in `_all`. #20167 We need to get the string representation of numbers in order to include in `_all`. However this has a cost and disabling `_all` is rather common so we should look into skipping it.	2016-08-26 10:00:36 +02:00
Daniel Mitterdorfer	4460998ff8	Remove obsolete NoopSearchRequestBuilder#setNoStoredFields()	2016-08-26 09:58:53 +02:00
Daniel Mitterdorfer	7b81c4ca59	Add client-benchmark-noop-api-plugin to stress clients even more in benchmarks (#20103)	2016-08-26 09:05:47 +02:00
Jason Tedor	bc136a90d5	Add network types to cluster stats The network types in use on a cluster can be useful information to have, so this commit adds aggregate metrics for the network types in use in a cluster to the cluster stats. Relates #20144	2016-08-25 21:08:05 -04:00
Sarwar Bhuiyan	b0ceecc3eb	Refactored to use Settings object	2016-08-25 17:27:22 -04:00
Chris Earle	1cf694b63e	Use StringBuilder in favor of StringBuffer This removes all instances of StringBuffer that are removable. Uncontended synchronization in Java is pretty cheap, but it's unnecessary.	2016-08-25 16:20:03 -04:00
Chris Earle	b41508a344	Make MapOfLists Generic This moves the Writer interface from StreamOutput into Writeable, as a peer of its inner Reader interface. This should hopefully help to avoid random functional interfaces being created for the same purpose. It also makes use of the moved class by updating writeMapOfLists and readMapOfLists.	2016-08-25 16:10:48 -04:00
Chris Earle	e171d0e0a8	Un-final Core REST Client classes This removes final from the RestClient, Response, and Sniffer classes so that outside code can mock them. Their constructors are already package private, so there's not much that can go wrong.	2016-08-25 16:02:04 -04:00
Tanguy Leroux	68b943dc53	Fix MoreLikeThisQueryBuilderTests.testUnknownObjectException() Objects hierarchy must be tracked when entering/leaving an object so that it better knows if the "newField" has been inserted into an arbitrary holding object. Can be reproduced with gradle :core:test -Dtests.seed=760F8BD0F7E46D45 -Dtests.class=org.elasticsearch.index.query.MoreLikeThisQueryBuilderTests -Dtests.method="testUnknownObjectException" -Dtests.security.manager=true -Dtests.locale=ko -Dtests.timezone=Etc/Zulu	2016-08-25 20:54:06 +02:00
Jason Tedor	5a48ad661d	Address race condition in HTTP pipeline tests The Netty 4 HTTP server pipeline tests contain two different test cases. The general idea behind these tests is to submit some requests to a Netty 4 HTTP server, one test with pipelining enabled and another test with pipelining disabled. These requests are submitted to two endpoints, one with a path like /{id} and another with a path like /slow with a query string parameter sleep. This parameter tells the request handler how long to sleep for before replying. The idea is that in the case of the pipelining enabled tests, the requests should come back exactly in the order submitted, even with some of the requests hitting the slow endpoint with random sleep durations; this is the guarantee that pipelining provides. And in the case of the pipelining disabled tests, requests were randomly submitted to /{id} and /slow with sleep parameters starting at 600ms and increasing by 100ms for each slow request constructed. We would expect the requests to come back with all the responses to the /{id} requests first because these requests will execute instantaneously, and then the responses to the /slow requests. Further, it was expected that the slow requests would come back ordered by the length of the sleep, the thinking being that 100ms should be enough of a difference between each request that we would avoid any race conditions. Sadly, this is not the case; the threads do sometimes hit race conditions. This commit modifies the HTTP server pipelining tests to address this race condition. The modification is that the query string parameter on the /slow endpoint is removed in favor of just submitting requests to the path /slow/{id}, where id is just used as a marker to distinguish each request. The server chooses a random sleep of at least 500ms for each request on the slow path. The assertion here then is that the /{id} responses arrive first, then the /slow responses. We can not make an assertion on the order of the responses, but we can assert that we did see every expected response. Relates #19845	2016-08-25 14:34:11 -04:00
Jack Conradson	139d3f957f	Merge pull request #20146 from jdconrad/break Fix break bug in for/foreach loops.	2016-08-25 09:30:26 -07:00
Jack Conradson	0fdadf4737	Merge branch 'master' into break	2016-08-25 09:26:04 -07:00
Tanguy Leroux	fbcfddbb77	Fix AbstractQueryTestCase.testUnknownObjectException() We need to check the whole hierarchy of objects to know if the newly inserted "newField" object is part of an arbitrary holding object or not. Reproduced with `gradle :modules:percolator:test -Dtests.seed=736B0B67DA7A3632 -Dtests.class=org.elasticsearch.percolator.PercolateQueryBuilderTests -Dtests.method="testUnknownObjectException" -Dtests.security.manager=true -Dtests.locale=es-ES -Dtests.timezone=ART`	2016-08-25 16:24:22 +02:00
Colin Goodheart-Smithe	3f350f33f1	#20156 Fix agg profiling when using breadth_first collect mode Fix agg profiling when using breadth_first collect mode	2016-08-25 14:58:55 +01:00
Colin Goodheart-Smithe	f5fbb3eb8b	Fix agg profiling when using breadth_first collect mode Previous to this change the nesting of aggregation profiling results would be incorrect when the request contains a terms aggregation and the collect mode is (implicitly or explicitly) set to `breadth_first`. This was because the aggregation profiling has to make the assumption that the `preCollection()` method of children aggregations is always called in the `preCollection()` method of their parent aggregation. When the collect mode is `breadth_first` the `preCollection` of the children aggregations was delayed until the documents were replayed. This change moves the `preCollection()` of deferred aggregations to run during the `preCollection()` of the parent aggregation. This should have no adverse impact on the breadth_first mode as there is no allocation of memory in any of the aggregations. We also apply the same logic to the diversified sampler aggregation as we did to the terms aggregation to move the `preCollection()` of the child aggregations method to be called during the `preCollection()` of the parent aggregation. This commit also includes a fix so that the `ProfilingLeafBucketCollector` propagates the scorer to its delegate so the diversified sampler agg works when profiling is enabled.	2016-08-25 14:57:52 +01:00
Adrien Grand	b521638f52	Revert "Revert "Save one utf8 conversion in KeywordFieldMapper. #19867"" This reverts commit d805266d94adcf3643b77194a7895de6200f2914.	2016-08-25 13:37:14 +02:00
Dominik Stadler	f0db4d9942	Add an example call of how to stop a snapshot or restore operation (#20153)	2016-08-25 13:01:04 +02:00
Adrien Grand	f93ce94afe	The root object mapper should support updating `numeric_detection`, `date_detection` and `dynamic_date_formats`. #20119 If they are specified by a mapping update, these properties are currently ignored. This commit also fixes the handling of `dynamic_templates` so that it is possible to remove templates (and so that it works more similarly to all other mapping properties). Closes #20111	2016-08-25 12:39:38 +02:00
Michael McCandless	1fe3e36934	Merge pull request #20147 from mikemccand/lucene_620_upgrade Upgrade to Lucene 6.2.0	2016-08-25 06:03:34 -04:00
Tanguy Leroux	20719f9b2f	Improve AbstractQueryTestCase#unknownObjectExceptionTest() This method fails when a randomized string value contains a double-quote. This commit changes the method so that it is not based on string concatenation anymore. It now uses XContentGenerator & XContentParser to mutate the valid queries. Related #19864	2016-08-25 10:57:30 +02:00
Mike McCandless	7a14cd4b1d	Pass baseSimilarity to super (PerFieldSimilarityWrapper)	2016-08-25 04:43:56 -04:00
Mike McCandless	5eb66e3378	Mark Scandinavian analysis components as multi term aware	2016-08-24 19:50:25 -04:00
Mike McCandless	7492300544	Remove now unused Store.renameFile, and obsolete commented out code	2016-08-24 18:20:30 -04:00
Jack Conradson	3deea3dbde	Made for/each break tests more robust in Painless.	2016-08-24 15:17:18 -07:00
Mike McCandless	0ccfe69789	Upgrade to Lucene 6.2.0	2016-08-24 17:26:28 -04:00
Jack Conradson	c60885b5d4	Fix break bug in for/foreach loops.	2016-08-24 14:25:54 -07:00
Igor Motov	b36fbc4452	Add support for parameters to the script ingest processor The script processor should support `params` to be consistent with all other script consumers.	2016-08-24 16:49:48 -04:00
Jim Ferenczi	9bedbbaa6a	Fixed doc links	2016-08-24 22:37:59 +02:00
Nicholas Knize	9eb63fb885	Refactor GeoPointFieldMapperLegacy and Legacy BBox query helpers This is a house cleaning commit that refactors GeoPointFieldMapperLegacy to LegacyGeoPointFieldMapper for consistency with Legacy Numerics and IP field mappers. IndexedGeoBoundingBoxQuery and InMemoryGeoBoundingBoxQuery are also deprecated and refactored as Legacy classes.	2016-08-24 14:40:25 -05:00
Jim Ferenczi	50b47aa930	Merge pull request #20026 from jimferenczi/disable_stored_fields Add the ability to disable the retrieval of the stored fields entirely	2016-08-24 21:30:47 +02:00
Ryan Ernst	acbece5b55	Merge pull request #20134 from rjernst/plugin_run_config Build: Allow plugin to set run configuration distro to zip	2016-08-24 08:49:13 -07:00
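
Commit 22242ec881 above ("Fix request cache key for search") relies on serializing maps in a consistent order so that equal requests yield equal cache keys. The idea can be sketched as follows; this is a minimal Python illustration, not the project's Java code, and the names `write_map` and `cache_key` as well as the byte layout and the use of SHA-256 are assumptions made for the example.

```python
import hashlib
import struct


def write_map(mapping):
    """Serialize a str -> float map with keys written in sorted order.

    Hypothetical helper: the byte layout (big-endian lengths and doubles)
    is chosen for illustration only.
    """
    out = bytearray()
    out += struct.pack(">I", len(mapping))        # number of entries
    for key in sorted(mapping):                   # fixed order, independent of insertion order
        encoded = key.encode("utf-8")
        out += struct.pack(">I", len(encoded))    # key length prefix
        out += encoded                            # key bytes
        out += struct.pack(">d", mapping[key])    # value as a 64-bit float
    return bytes(out)


def cache_key(index_boost):
    """Derive a cache key from the serialized bytes (hypothetical helper)."""
    return hashlib.sha256(write_map(index_boost)).hexdigest()
```

With this sketch, two maps that hold the same entries but were built in a different order serialize to the same bytes and therefore produce the same key, while maps with different contents do not.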


24040 Commits