OpenSearch

Commit Graph

Author	SHA1	Message	Date
James Baiera	c357f81aa7	Add soft limit for max concurrent policy executions (#43117 ) Adds a global soft limit on the number of concurrently executing enrich policies. Since an enrich policy is run on the generic thread pool, this is meant to limit policy runs separately from the generic thread pool capacity.	2019-07-23 16:03:14 -04:00
James Baiera	fc20264b99	Add Enrich index background task to cleanup old indices (#43746 ) This PR adds a background maintenance task that is scheduled on the master node only. The deletion of an index is based on if it is not linked to a policy or if the enrich alias is not currently pointing at it. Synchronization has been added to make sure that no policy executions are running at the time of cleanup, and if any executions do occur, the marking process delays cleanup until next run.	2019-07-22 14:41:22 -04:00
James Baiera	7ad9beb087	Set auto expand replicas on enrich index after force merge is done. (#43600 )	2019-07-12 11:56:56 -04:00
Michael Basnight	b4b2ad3593	Ensure enrich policy is immutable (#43604 ) This commit ensures the policy cannot be overwritten. An error is thrown if the policy exists. All tests have been updated accordingly.	2019-07-11 13:23:12 -05:00
Michael Basnight	d2c3f4bae9	Validate read priv of enrich source indices (#43595 ) This commit adds permissions validation on the indices provided in the enrich policy. These indices should be validated at store time so as not to have cryptic error messages in the event the user does not have permissions to access said indices.	2019-07-10 13:09:10 -05:00
Martijn van Groningen	adc06ffd89	take builtin role into account in docs tests	2019-07-05 08:06:18 +02:00
Martijn van Groningen	9528c59fb3	added a basic test that enriching data works	2019-07-04 17:42:45 +02:00
Martijn van Groningen	1dd3d14f09	take into account `manage_enrich` builtin role	2019-07-04 16:51:48 +02:00
Martijn van Groningen	ac119b07e7	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-07-04 15:50:11 +02:00
Benjamin Trent	36f7259737	[ML] Fix datafeed checks when a concrete remote index is present (#43923 ) A bug was introduced in 6.6.0 when we added support for rollup indices. Rollup caps does NOT support looking at remote indices, consequently, since we always look up rollup caps, the datafeed fails with an error if its config includes a concrete remote index. (When all remote indices in a datafeed config are wildcards the problem did not occur.) The rollups feature does not support remote indices, so if there is any remote index in a datafeed config (wildcarded or not), we can skip the rollup cap checks. This PR implements that change.	2019-07-04 13:31:45 +01:00
Dimitris Athanasiou	2a70df424d	[TEST][ML] Fix assertion after starting df-analytics job (#43957 ) (#43967 ) In MachineLearningIT.testStopDataFrameAnalytics we call start and then assert the state is `started`. However, if things go fast enough, the state could have already changed to `reindexing` or `analyzing`. The test has been failing occasionally due to the state being `reindexing`. We fix this by simply asserting the state is either of `started`, `reindexing` or `analyzing`. Closes #43924	2019-07-04 15:17:36 +03:00
Martijn van Groningen	7ba6e1752a	required changes after merge	2019-07-04 13:17:22 +02:00
Martijn van Groningen	653f1436a0	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-07-04 13:05:10 +02:00
Alan Woodward	4b99255fed	Add name() method to TokenizerFactory (#43909 ) This brings TokenizerFactory into line with CharFilterFactory and TokenFilterFactory, and removes the need to pass around tokenizer names when building custom analyzers. As this means that TokenizerFactory is no longer a functional interface, the commit also adds a factory method to TokenizerFactory to make construction simpler.	2019-07-04 11:28:55 +01:00
Alpar Torok	1b6109517a	Mute failing test Tracking in #43960	2019-07-04 12:13:02 +03:00
Jim Ferenczi	2cc0a56fe6	Fix wrong logic in `match_phrase` query with multi-word synonyms (#43941 ) Disjunction over two individual terms in a phrase query with multi-word synonyms wrongly applies a prefix query to each of these terms. This change fixes this bug by inversing the logic to use prefixes on `phrase_prefix` queries only. Closes #43308	2019-07-04 09:39:39 +02:00
Henning Andersen	cacc3f7ff8	Async IO Processor release before notify (#43682 ) This commit changes async IO processor to release the promiseSemaphore before notifying consumers. This ensures that a bad consumer that sometimes does blocking (or otherwise slow) operations does not halt the processor. This should slightly increase the concurrency for shard fsync, but primarily improves safety so that one bad piece of code has less effect on overall system performance.	2019-07-04 06:33:38 +02:00
Igor Motov	c593085104	Geo: Refactors libs/geo parser to provide serialization logic as well (#43717 ) Enables libs/geo parser to return a geometry format object that can perform both serialization and deserialization functions. This can be useful for ingest nodes that are trying to modify an existing geometry in the source. Relates to #43554	2019-07-03 19:31:44 -04:00
Benjamin Trent	7063a40411	[7.x] [ML][Data Frame] Adding bwc tests for pivot transform (#43506 ) (#43929 ) * [ML][Data Frame] Adding bwc tests for pivot transform (#43506) * [ML][Data Frame] Adding bwc tests for pivot transform * adding continuous transforms * adding continuous dataframes to bwc * adding continuous data frame tests * Adding rolling upgrade tests for continuous df * Fixing test * Adjusting indices used in BWC, and handling NPE for seq_no_stats * updating and muting specific bwc test * Adjusting bwc tests for backport	2019-07-03 16:39:38 -05:00
Przemyslaw Gomulka	553f783e73	Fix DieWithDignity test when waiting on jps backport(#43861 ) (#43871 ) the test often hangs on executing jps command we don't need to wait for this command to finish. closes #43413	2019-07-03 20:39:48 +02:00
Adrien Grand	680edbe3f1	Bump current version to 7.4. (#43927 )	2019-07-03 20:32:04 +02:00
Lisa Cawley	50e96f9f0e	[DOCS] Updates documentation version (#43937 )	2019-07-03 11:09:34 -07:00
Armin Braun	be20fb80e4	Recursive Delete on BlobContainer (#43281 ) (#43920 ) This is a prerequisite of #42189: * Add directory delete method to blob container specific to each implementation: * Some notes on the implementations: * AWS + GCS: We can simply exploit the fact that both AWS and GCS return blobs lexicographically ordered which allows us to simply delete in the same order that we receive the blobs from the listing request. For AWS this simply required listing without the delimiter setting (so we get a deep listing) and for GCS the same behavior is achieved by not using the directory mode on the listing invocation. The nice thing about this is, that even for very large numbers of blobs the memory requirements are now capped nicely since we go page by page when deleting. * For Azure I extended the parallelization to the listing calls as well and made it work recursively. I verified that this works with thread count `1` since we only block once in the initial thread and then fan out to a "graph" of child listeners that never block. * HDFS and FS are trivial since we have directory delete methods available for them * Enhances third party tests to ensure the new functionality works (I manually ran them for all cloud providers)	2019-07-03 17:14:57 +02:00
Alan Woodward	49d69bf987	Actually close IndexAnalyzers contents (#43914 ) IndexAnalyzers has a close() method that should iterate through all its wrapped analyzers and close each one in turn. However, instead of delegating to the analyzers' close() methods, it instead wraps them in a Closeable interface, which just returns a list of the analyzers. In addition, whitespace normalizers are ignored entirely.	2019-07-03 16:06:58 +01:00
Alpar Torok	3250cc53f0	Mute failing test Tracked in #43924	2019-07-03 17:43:40 +03:00
Martijn van Groningen	397150fa1e	Add enrich coordinator proxy action (#43801 ) Introduced a proxy API to handle the search request load that originates from the enrich processor. The enrich processor can execute many search requests that execute asynchronously in parallel and that can easily overwhelm the search thread pool on nodes. In order to protect against this, the Coordinator queues the search requests and only executes a fixed number of search requests in parallel. Besides this, the Coordinator tries to include as many search requests as possible (up to a defined maximum) inside a multi search request in order to reduce the number of remote API calls to be made from the node that performs ingestion.	2019-07-03 15:50:40 +02:00
Zachary Tong	f8fd4321f8	Link rare_terms docs from index page (#43882 ) Docs for rare_terms were added in #35718, but neglected to link it from the bucket index page	2019-07-03 09:32:01 -04:00
David Turner	9cecc31cdc	Shortcut simple patterns ending in `*` (#43904 ) When profiling a call to `AllocationService#reroute()` in a large cluster containing allocation filters of the form `node-name-*` I observed a nontrivial amount of time spent in `Regex#simpleMatch` due to these allocation filters. Patterns ending in a wildcard are not uncommon, and this change treats them as a special case in `Regex#simpleMatch` in order to shave a bit of time off this calculation. It also uses `String#regionMatches()` to avoid an allocation in the case that the pattern's only wildcard is at the start. Microbenchmark results before this change: Result "org.elasticsearch.common.regex.RegexStartsWithBenchmark.performSimpleMatch": 1113.839 ±(99.9%) 6.338 ns/op [Average] (min, avg, max) = (1102.388, 1113.839, 1135.783), stdev = 9.486 CI (99.9%): [1107.502, 1120.177] (assumes normal distribution) Microbenchmark results with this change applied: Result "org.elasticsearch.common.regex.RegexStartsWithBenchmark.performSimpleMatch": 433.190 ±(99.9%) 0.644 ns/op [Average] (min, avg, max) = (431.518, 433.190, 435.456), stdev = 0.964 CI (99.9%): [432.546, 433.833] (assumes normal distribution) The microbenchmark in question was: @Fork(3) @Warmup(iterations = 10) @Measurement(iterations = 10) @BenchmarkMode(Mode.AverageTime) @OutputTimeUnit(TimeUnit.NANOSECONDS) @State(Scope.Benchmark) @SuppressWarnings("unused") //invoked by benchmarking framework public class RegexStartsWithBenchmark { private static final String testString = "abcdefghijklmnopqrstuvwxyz"; private static final String[] patterns; static { patterns = new String[testString.length() + 1]; for (int i = 0; i <= testString.length(); i++) { patterns[i] = testString.substring(0, i) + "*"; } } @Benchmark public void performSimpleMatch() { for (int i = 0; i < patterns.length; i++) { Regex.simpleMatch(patterns[i], testString); } } }	2019-07-03 14:15:27 +01:00
Armin Braun	3317169c4f	Fix GCS Blob Repository 3rd Party Tests (#43030 ) (#43913 ) * We have to strip the trailing slash from child names here like we do for AWS * closes #43029	2019-07-03 15:09:28 +02:00
James Rodewig	e2a9a787fc	[DOCS] Rewrite dis max query (#43586 )	2019-07-03 08:56:18 -04:00
paulward24	cff027499a	Ensure to access RecoveryState#fileDetails under lock Closes #43840	2019-07-03 07:39:58 -04:00
Armin Braun	7059224668	Optimize Snapshot Finalization (#42723 ) (#43908 ) * Optimize Snapshot Finalization * Delete index-N blobs and segment blobs in one single bulk delete instead of in separate ones to save RPC calls on implementations that have bulk deletes implemented * Don't fail snapshot because deleting old index-N failed, this results in needlessly logging finalization failures and makes analysis of failures harder going forward as well as incorrect index.latest blobs	2019-07-03 13:26:35 +02:00
Christoph Büscher	662f517f4e	Add _reload_search_analyzers endpoint to HLRC (#43733 ) This change adds the new endpoint that allows reloading of search analyzers to the high-level java rest client. Relates to #43313	2019-07-03 12:05:59 +02:00
Dimitris Athanasiou	96b0b27f18	[7.x][ML] Set df-analytics task state to failed when appropriate (#43880 ) (#43906 ) This introduces a `failed` state to which the data frame analytics persistent task is set to when something unexpected fails. It could be the process crashing, the results processor hitting some error, etc. The failure message is then captured and set on the task state. From there, it becomes available via the _stats API as `failure_reason`. The df-analytics stop API now has a `force` boolean parameter. This allows the user to call it for a failed task in order to reset it to `stopped` after we have ensured the failure has been communicated to the user. This commit also adds the analytics version in the persistent task params as this allows us to prevent tasks to run on unsuitable nodes in the future.	2019-07-03 12:41:56 +03:00
Jay Modi	1e0f67fb38	Deprecate transport profile security type setting (#43237 ) This commit deprecates the `transport.profiles.*.xpack.security.type` setting. This setting is used to configure a profile that would only allow client actions. With the upcoming removal of the transport client the setting should also be deprecated so that it may be removed in a future version.	2019-07-03 19:31:55 +10:00
Armin Braun	455b12a4fb	Add Ability to List Child Containers to BlobContainer (#42653 ) (#43903 ) * Add Ability to List Child Containers to BlobContainer (#42653) * Add Ability to List Child Containers to BlobContainer * This is a prerequisite of #42189	2019-07-03 11:30:49 +02:00
Alexander Reelsen	9077c4402f	Watcher: Allow to execute actions for each element in array (#41997 ) This adds the ability to execute an action for each element that occurs in an array, for example you could send a dedicated slack action for each search hit returned from a search. There is also a limit for the number of actions executed, which is hardcoded to 100 right now, to prevent having watches run forever. The watch history logs each action result and the total number of actions that were executed. Relates #34546	2019-07-03 11:28:50 +02:00
Tim Vernum	2a8f30eb9a	Support builtin privileges in get privileges API (#43901 ) Adds a new "/_security/privilege/_builtin" endpoint so that builtin index and cluster privileges can be retrieved via the Rest API Backport of: #42134	2019-07-03 19:08:28 +10:00
Tim Vernum	deacc2038e	Always attach system user to internal actions (#43902 ) All valid licenses permit security, and the only license state where we don't support security is when there is a missing license. However, for safety we should attach the system (or xpack/security) user to internally originated actions even if the license is missing (or, more strictly, doesn't support security). This allows all nodes to communicate and send internal actions (shard state, handshake/pings, etc) even if a license is transitioning between a broken state and a valid state. Relates: #42215 Backport of: #43468	2019-07-03 19:07:16 +10:00
Henning Andersen	cd2972239c	AsyncIOProcessor preserve thread context (#43729 ) AsyncIOProcessor now preserves thread context, ensuring that deprecation warnings are not duplicated to other concurrent operations on the same shard.	2019-07-03 10:22:20 +02:00
Tim Vernum	31b19bd022	Use separate BitSet cache in Doc Level Security (#43899 ) Document level security was depending on the shared "BitsetFilterCache" which (by design) never expires its entries. However, when using DLS queries - particularly templated ones - the number (and memory usage) of generated bitsets can be significant. This change introduces a new cache specifically for BitSets used in DLS queries, that has memory usage constraints and access time expiry. The whole cache is automatically cleared if the role cache is cleared. Individual bitsets are cleared when the corresponding lucene index reader is closed. The cache defaults to 50MB, and entries expire if unused for 7 days. Backport of: #43669	2019-07-03 18:04:06 +10:00
Jim Ferenczi	05c0cff1b6	Fix index_prefix sub field name on nested text fields (#43862 ) This change fixes the name of the index_prefix sub field when the `index_prefix` option is set on a text field that is nested under an object or a multi-field. We don't use the full path of the parent field to set the index_prefix field name so the field is registered under the wrong name. This doesn't break queries since we always retrieve the prefix field through its parent field but this breaks other APIs like _field_caps which tries to find the parent of the `index_prefix` field in the mapping but fails. Closes #43741	2019-07-03 09:50:52 +02:00
Armin Braun	826f38cd70	Enable Parallel Deletes in Azure Repository (#42783 ) (#43886 ) * Parallel deletes via private thread pool	2019-07-03 09:28:39 +02:00
Tanguy Leroux	365dfe88ca	Refresh translog stats after translog trimming in NoOpEngine (#43825 ) This commit changes NoOpEngine so that it refreshes its translog stats once translog is trimmed. Relates #43156	2019-07-03 08:49:14 +02:00
Tim Vernum	461aa39daf	Switch WriteActionsTests.testBulk to use hamcrest (#43897 ) If an item in the bulk request fails, that could be for a variety of reasons - it may be that the underlying behaviour of security has changed, or it may just be a transient failure during testing. Simply asserting a `true`/`false` value produces failure messages that are difficult to diagnose and debug. Using hamcrest (`assertThat`) will make it easier to understand the causes of failures in this test. Backport of: #43725	2019-07-03 16:29:28 +10:00
Tim Vernum	14884c871f	Document API-Key APIs require manage_api_key priv (#43869 ) Add the "Authorization" section to the API key API docs. These APIs require the new manage_api_key cluster privilege. Relates: #43865 Backport of: #43811	2019-07-03 13:51:44 +10:00
Jake Landis	6e9ccda2c5	ilm test - allow more time for policy completion (#43844 )	2019-07-02 22:05:18 -05:00
Jake Landis	0a79f4ca70	Extend timeout for TimeSeriesLifecycleActionsIT> testFullPolicy (#43891 )	2019-07-02 22:05:04 -05:00
Jake Landis	2dc056b0a0	Read the default pipeline for bulk upsert through an alias (#41963 ) (#42802 ) This commit allows bulk upserts to correctly read the default pipeline for the concrete index that belongs to an alias. Bulk upserts are modeled differently from normal index requests such that the index request is a request inside of the update request. The update request (outer) contains the index or alias name, which is not part of the (inner) index request. This commit adds a secondary check against the update request (outer) if the index request (inner) does not find an alias.	2019-07-02 20:44:33 -05:00
Deb Adair	a4e518b640	[DOCS] Revise GS intro and remove redundant conceptual content. Closes #43846 .	2019-07-02 18:28:13 -07:00


46631 Commits
