OpenSearch

Commit Graph

Author	SHA1	Message	Date
David Roberts	d2461643cd	[ML] Move open job failure explanation out of root cause (#31925 ) When an ML job cannot be allocated to a node the exception contained an explanation of why the job couldn't be allocated to each node in the cluster. For large clusters this was not particularly easy to read and made the error displayed in the UI look very scary. This commit changes the structure of the error to an outer ElasticsearchException with a high level message and an inner IllegalStateException containing the detailed explanation. Because the definition of root cause is the innermost ElasticsearchException the detailed explanation will not be the root cause (which is what Kibana displays). Fixes #29950	2018-07-13 08:57:33 +01:00
Nik Everett	dcbb1154bf	HLRest: Move xPackInfo() to xPack().info() (#31905 ) Originally I put the X-Pack info object into the top level rest client object. I did that because we thought we'd like to squash `xpack` from the name of the X-Pack APIs now that it is part of the default distribution. We still kind of want to do that, but at least for now we feel like it is better to keep the high level rest client aligned with the other language clients like C# and Python. This shifts the X-Pack info API to align with its json spec file. Relates to #31870	2018-07-10 13:01:28 -04:00
Nik Everett	fb27f3e7f0	HLREST: Add x-pack-info API (#31870 ) This is the first x-pack API we're adding to the high level REST client so there is a lot to talk about here! = Open source The client for these APIs is open source. We're taking the previously Elastic licensed files used for the `Request` and `Response` objects and relicensing them under the Apache 2 license. The implementation of these features is staying under the Elastic license. This lines up with how the rest of the Elasticsearch language clients work. = Location of the new files We're moving all of the `Request` and `Response` objects that we're relicensing to the `x-pack/protocol` directory. We're adding a copy of the Apache 2 license to the root of the `x-pack/protocol` directory to line up with the language in the root `LICENSE.txt` file. All files in this directory will have the Apache 2 license header as well. We don't want there to be any confusion. Even though the files are under the `x-pack` directory, they are Apache 2 licensed. We chose this particular directory layout because it keeps the X-Pack stuff together and easier to think about. = Location of the API in the REST client We've been following the layout of the rest-api-spec files for other APIs and we plan to do this for the X-Pack APIs with one exception: we're dropping the `xpack` from the name of most of the APIs. So `xpack.graph.explore` will become `graph().explore()` and `xpack.license.get` will become `license().get()`. `xpack.info` and `xpack.usage` are special here though because they don't belong to any proper category. For now I'm just calling `xpack.info` `xPackInfo()` and intend to call usage `xPackUsage` though I'm not convinced that this is the final name for them. But it does get us started. = Jars, jars everywhere! This change makes the `xpack:protocol` project a `compile` scoped dependency of the `x-pack:plugin:core` and `client:rest-high-level` projects. I intend to keep it a compile scoped dependency of `x-pack:plugin:core` but I intend to bundle the contents of the protocol jar into the `client:rest-high-level` jar in a follow up. This change has grown large enough at this point. In that followup I'll address javadoc issues as well. = Breaking-Java This breaks the transport client by moving a few classes around. We've traditionally been ok with doing this to the transport client.	2018-07-08 11:03:56 -04:00
Dimitris Athanasiou	49ba271bd8	[ML] Fix master node deadlock during ML daily maintenance (#31836 ) This is the implementation for master and 6.x of #31691. Native tests are changed to use multi-node clusters in #31757. Relates #31683	2018-07-07 09:43:28 +01:00
Christoph Büscher	bd1c513422	Reduce more raw types warnings (#31780 ) Similar to #31523.	2018-07-05 15:38:06 +02:00
David Roberts	92de94c237	[ML] Don't treat stale FAILED jobs as OPENING in job allocation (#31800 ) Job persistent tasks with stale allocation IDs used to always be considered as OPENING jobs in the ML job node allocation decision. However, FAILED jobs are not relocated to other nodes, which leads to them blocking up the nodes they failed on after node restarts. FAILED jobs should not restrict how many other jobs can open on a node, regardless of whether they are stale or not. Closes #31794	2018-07-05 13:26:17 +01:00
Dimitris Athanasiou	9c11bf1e12	[ML] Fix calendar and filter updates from non-master nodes (#31804 ) Job updates or changes to calendars or filters may result in updating the job process if it has been running. To preserve the order of updates, process updates are queued through the UpdateJobProcessNotifier which is only running on the master node. All actions performing such updates must run on the master node. However, the CRUD actions for calendars and filters are not master node actions. They have been submitting the updates to the UpdateJobProcessNotifier even though it might not have been running (given the action was run on a non-master node). When that happens, the update never reaches the process. This commit fixes this problem by ensuring the notifier runs on all nodes and by ensuring the process update action gets the resources again before updating the process (instead of having those resources passed in the request). This ensures that even if the order of the updates gets messed up, the latest update will read the latest state of those resources and the process will get back in sync. This leaves us with 2 types of updates: 1. updates to the job config should happen on the master node. This is because we cannot refetch the entire job and update it. We need to know the parts that have been changed. 2. updates to resources the job uses. Those can be handled on non-master nodes but they should be re-fetched by the update process action. Closes #31803	2018-07-05 13:14:12 +01:00
David Roberts	308e37f80e	[ML] Rate limit established model memory updates (#31768 ) There is at most one model size stats document per bucket, but during lookback a job can churn through many buckets very quickly. This can lead to many cluster state updates if established model memory needs to be updated for a given model size stats document. This change rate limits established model memory updates to one per job per 5 seconds. This is done by scheduling the updates 5 seconds in the future, but replacing the value to be written if another model size stats document is received during the waiting period. Updating the values in arrears like this means that the last value received will be the one associated with the job in the long term, whereas alternative approaches such as not updating the value if a new value was close to the old value would not.	2018-07-04 13:56:32 +01:00
Hendrik Muhs	e9f8442bee	[ML] Return statistics about forecasts as part of the jobsstats and usage API (#31647 ) This change adds stats about forecasts, to the jobstats api as well as xpack/_usage. The following information is collected: _xpack/ml/anomaly_detectors/{jobid|_all}/_stats: - total number of forecasts - memory statistics (mean/min/max) - runtime statistics - record statistics - counts by status _xpack/usage - collected by job status as well as overall (_all): - total number of forecasts - number of jobs that have at least 1 forecast - memory, runtime, record statistics - counts by status Fixes #31395	2018-07-04 08:15:45 +02:00
Alpar Torok	0afec8f31c	Remove deprecation warnings to prepare for Gradle 5 (sourceSets.main.output.classesDirs) (#30389 ) * Remove deprecation warnings to prepare for Gradle 5 Gradle replaced `project.sourceSets.main.output.classesDir` of type `File` with `project.sourceSets.main.output.classesDirs` of type `FileCollection` (see [SourceSetOutput](https://github.com/gradle/gradle/blob/master/subprojects/plugins/src/main/java/org/gradle/api/tasks/SourceSetOutput.java)) Build output is now stored in a per-language folder. There are a few places where we use that; here they are and how each is fixed: - Randomized Test execution - look in all test folders ( pass the multi dir configuration to the ant runner ) - DRY the task configuration by introducing `basedOn` for `RandomizedTestingTask` DSL - Extend the naming convention test to support passing in multiple directories - Fix the standalone test plugin, the dirs were not passed through, checked with a debugger and the statement had no effect due to a missing `=`. Closes #30354 * Only check Java tests, PR feedback - Name checker was run for Groovy tests that don't adhere to the same conventions causing the check to fail - implement PR feedback * Replace `add` with `addAll` This worked because the list is passed to `project.files` that does the right thing. * Revert "Only check Java tests, PR feedback" This reverts commit 9bd9389875d8b88aadb50df57a45cd0d2b073241. * Remove `basedOn` helper * Bring some changes back Previous revert accidentally reverted too much * Fix negation * add back public * revert name check changes * Revert "revert name check changes" This reverts commit a2800c0b363168339ea65e2a79ec8256e5883e6d. * Pass all dirs to name check Only run on Java for build-tools, this is safe because it's a self test. It needs more work before we could pass in the Groovy classes as well as these inherit from `GroovyTestCase` * remove self tests from name check The self test complicates the task setup and disables real checks on build-tools. With this change there are no more self tests, and the build-tools tests adhere to the conventions. The self test will be replaced by gradle test kit, thus the addition of the Gradle plugin builder plugin. * First test to run a Gradle build * Add tests that replace the name check self test * Clean up integ test base class * Always run tests * Align with test naming conventions * Make integ. test case inherit from unit test case The check requires this * Remove `import static org.junit.Assert.*`	2018-06-28 15:14:34 +03:00
Christoph Büscher	86ab3a2d1a	Reduce number of raw types warnings (#31523 ) A first attempt to reduce the number of raw type warnings, most of the time by using the unbounded wildcard.	2018-06-25 15:59:03 +02:00
Ryan Ernst	7a150ec06d	Core: Combine doExecute methods in TransportAction (#31517 ) TransportAction currently contains 2 doExecute methods, one which takes the task, and one that does not. The latter is what some subclasses implement, while the first one just calls the latter, dropping the given task. This commit combines these methods, in favor of just always assuming a task is present.	2018-06-22 15:03:01 -07:00
Dimitris Athanasiou	c6cbc99f9c	[ML] Add ML filter update API (#31437 ) This adds an api to allow updating a filter: POST _xpack/ml/filters/{filter_id}/_update The request body may have: - description: setting a new description - add_items: a list of the items to add - remove_items: a list of the items to remove This commit also changes the PUT filter api to error when the filter_id is already used. As there is now an api for updating filters, the put api should only be used to create new ones. Also, updating a filter results in a notification message auditing the change for every job that is using that filter.	2018-06-22 15:13:31 +01:00
Adrien Grand	8ae2049889	Avoid deprecation warning when running the ML datafeed extractor. (#31463 ) In #29639 we added a `format` option to doc-value fields and deprecated usage of doc-value fields without a format so that we could migrate doc-value fields to use the format that comes with the mappings by default. However, I missed fixing the machine-learning datafeed extractor.	2018-06-22 13:46:48 +02:00
Ryan Ernst	4f9332ee16	Core: Remove ThreadPool from base TransportAction (#31492 ) Most transport actions don't need the node ThreadPool. This commit removes the ThreadPool as a super constructor parameter for TransportAction. The actions that do need the thread pool then have a member added to keep it from their own constructor.	2018-06-21 11:25:26 -07:00
Ryan Ernst	401800d958	Core: Remove index name resolver from base TransportAction (#31002 ) Most transport actions don't need to resolve index names. This commit removes the index name resolver as a super constructor parameter for TransportAction. The actions that do need the resolver then have a member added to keep the resolver from their own constructor.	2018-06-19 17:06:09 -07:00
Yannick Welsch	02a4ef38a7	Use system context for cluster state update tasks (#31241 ) This commit makes it so that cluster state update tasks always run under the system context, only restoring the original context when the listener that was provided with the task is called. A notable exception is the clusterStatePublished(...) callback which will still run under system context, because it's defined on the executor-level, and not the task level, and only called once for the combined batch of tasks and can therefore not be uniquely identified with a task / thread context. Relates #30603	2018-06-18 16:46:04 +02:00
Dimitris Athanasiou	c6a5a6d924	[ML] Put ML filter API response should contain the filter (#31362 )	2018-06-15 21:15:35 +01:00
Tanguy Leroux	992c7889ee	Uncouple persistent task state and status (#31031 ) This pull request removes the relationship between the state of persistent task (as stored in the cluster state) and the status of the task (as reported by the Task APIs and used in various places) that has been confusing for some time (#29608). In order to do that, a new PersistentTaskState interface is added. This interface represents the persisted state of a persistent task. The methods used to update the state of persistent tasks are renamed: updatePersistentStatus() becomes updatePersistentTaskState() and now takes a PersistentTaskState as a parameter. The Task.Status type has been changed to PersistentTaskState in all places where it makes sense (in persistent task customs in cluster state and all other methods that deal with the state of an allocated persistent task).	2018-06-15 09:26:47 +02:00
Dimitris Athanasiou	9b293275af	[ML] Add description to ML filters (#31330 ) This adds a `description` to ML filters in order to allow users to describe their filters in a human readable form which is also editable (filter updates to be added shortly).	2018-06-14 16:52:32 +01:00
Tanguy Leroux	2d4c9ce08c	Remove remaining unused imports before merging #31270	2018-06-14 09:52:03 +02:00
David Kyle	88f44a9f66	[ML] Check licence when datafeeds use cross cluster search (#31247 ) This change prevents a datafeed using cross cluster search from starting if the remote cluster does not have x-pack installed and a sufficient license. The check is made only when starting a datafeed.	2018-06-13 15:42:18 +01:00
Dimitris Athanasiou	5c77ebe89d	[ML] Implement new rules design (#31110 ) Rules allow users to supply a detector with domain knowledge that can improve the quality of the results. The model detects statistically anomalous results but it has no knowledge of the meaning of the values being modelled. For example, a detector that performs a population analysis over IP addresses could benefit from a list of IP addresses that the user knows to be safe. Then anomalous results for those IP addresses will not be created and will not affect the quantiles either. Another example would be a detector looking for anomalies in the median value of CPU utilization. A user might want to inform the detector that any results where the actual value is less than 5 are not interesting. This commit introduces a `custom_rules` field to the `Detector`. A detector may have multiple rules which are combined with `or`. A rule has 3 fields: `actions`, `scope` and `conditions`. Actions is a list of what should happen when the rule applies. The current options include `skip_result` and `skip_model_update`. The default value for `actions` is the `skip_result` action. Scope is optional and allows for applying filters on any of the partition/over/by fields. When not defined the rule applies to all series. The `filter_id` needs to be specified to match the id of the filter to be used. Optionally, the `filter_type` can be specified as either `include` (default) or `exclude`. When set to `include` the rule applies to entities that are in the filter. When set to `exclude` the rule only applies to entities not in the filter. There may be zero or more conditions. A condition requires `applies_to`, `operator` and `value` to be specified. The `applies_to` value can be either `actual`, `typical` or `diff_from_typical` and it specifies the numerical value to which the condition applies. The `operator` (`lt`, `lte`, `gt`, `gte`) and `value` complete the definition. Conditions are combined with `and` and allow specifying numerical conditions for when a rule applies. A rule must either have a scope or one or more conditions. Finally, a rule with scope and conditions applies when all of them apply.	2018-06-13 11:20:38 +01:00
Jason Tedor	0bfd18cc8b	Revert upgrade to Netty 4.1.25.Final (#31282 ) This reverts upgrading to Netty 4.1.25.Final until we have a cleaner solution to dealing with the object cleaner thread.	2018-06-12 19:26:18 -04:00
Jason Tedor	563141c6c9	Upgrade to Netty 4.1.25.Final (#31232 ) This commit upgrades us to Netty 4.1.25. This upgrade is more challenging than past upgrades, all because of a new object cleaner thread that they have added. This thread requires an additional security permission (set context class loader, needed to avoid leaks in certain scenarios). Additionally, there is not a clean way to shutdown this thread which means that the thread can fail thread leak control during tests. As such, we have to filter this thread from thread leak control.	2018-06-11 16:55:07 -04:00
Tanguy Leroux	bf58660482	Remove all unused imports and fix CRLF (#31207 ) The X-Pack opening and the recent other refactorings left a lot of unused imports in the codebase. This commit removes them all.	2018-06-11 15:12:12 +02:00
Christoph Büscher	3f87c79500	Change ObjectParser exception (#31030 ) ObjectParser should throw XContentParseExceptions, not IAE. A dedicated parsing exception can include the place where the error occurred. Closes #30605	2018-06-04 20:20:37 +02:00
David Kyle	16d1f05045	[ML] Add secondary sort to ML events (#31063 )	2018-06-04 16:31:35 +01:00
Ryan Ernst	46e8d97813	Core: Remove RequestBuilder from Action (#30966 ) This commit removes the RequestBuilder generic type from Action. It was needed to be used by the newRequest method, which in turn was used by client.prepareExecute. Both of these methods are now removed, along with the existing users of prepareExecute constructing the appropriate builder directly.	2018-05-31 16:15:00 +02:00
Tanguy Leroux	a0af0e7f1e	Rename methods in PersistentTasksService (#30837 ) This commit renames methods in the PersistentTasksService, to make obvious that the methods send requests in order to change the state of persistent tasks. Relates to #29608.	2018-05-30 09:20:14 +02:00
Jason Tedor	bcfdccaf3f	Use dedicated ML APIs in tests (#30941 ) ML has dedicated APIs for datafeeds and jobs yet base test classes and some tests were relying on the cluster state for this state. This commit removes this usage in favor of using the dedicated endpoints.	2018-05-29 21:17:47 -04:00
Adrien Grand	a19df4ab3b	Add a `format` option to `docvalue_fields`. (#29639 ) This commit adds the ability to configure how a docvalue field should be formatted, so that it would be possible eg. to return a date field formatted as the number of milliseconds since Epoch. Closes #27740	2018-05-23 14:39:04 +02:00
Yannick Welsch	03607f646b	Revert "Mutes MachineLearningTests.testNoAttributes_givenSameAndMlEnabled" This reverts commit `ca999ad569`.	2018-05-23 11:49:52 +02:00
Yannick Welsch	8145a820c2	Only allow x-pack metadata if all nodes are ready (#30743 ) Enables a rolling restart from the OSS distribution to the x-pack based distribution by preventing x-pack code from installing custom metadata into the cluster state until all nodes are capable of deserializing this metadata.	2018-05-23 11:41:23 +02:00
Colin Goodheart-Smithe	ca999ad569	Mutes MachineLearningTests.testNoAttributes_givenSameAndMlEnabled This is awaiting fix on https://github.com/elastic/elasticsearch/issues/30804	2018-05-23 10:39:00 +01:00
Yannick Welsch	30b004f582	Use original settings on full-cluster restart (#30780 ) When doing a node restart using the test framework, the restarted node does not only use the settings provided to the original node, but also additional settings provided by plugin extensions, which does not correspond to the settings that a node would have on a true restart.	2018-05-23 09:02:01 +02:00
David Kyle	f76f95b813	[ML] Filter undefined job groups from update calendar actions (#30757 ) The UI creates job groups in calendars ad hoc to ease calendar creation; these must be filtered from the jobs list before applying updates.	2018-05-22 09:25:14 +01:00
David Roberts	eaf672f612	[ML] Don't install empty ML metadata on startup (#30751 ) This change is to support rolling upgrade from a pre-6.3 default distribution (i.e. without X-Pack) to a 6.3+ default distribution (i.e. with X-Pack). The ML metadata is no longer eagerly added to the cluster state as soon as the master node has X-Pack available. Instead, it is added when the first ML job is created. As a result all methods that get the ML metadata need to be able to handle the situation where there is no ML metadata in the current cluster state. They do this by behaving as though an empty ML metadata was present. This logic is encapsulated by always asking for the current ML metadata using a static method on the MlMetadata class. Relates #30731	2018-05-21 14:29:45 +01:00
Hendrik Muhs	6c313a9871	This implementation lazily (on 1st forecast request) checks for available diskspace and creates a subfolder for storing data outside of Lucene indexes, but as part of the ES data paths. Details: - tmp storage is managed and does not allow allocation if disk space is below a threshold (5GB at the moment) - tmp storage is supposed to be managed by the native component but in case this fails cleanup is provided: - on job close - on process crash - after node crash, on restart - available space is re-checked for every forecast call (the native component has to check again before writing) Note: The 1st path that has enough space is chosen on job open (job close/reopen triggers a new search)	2018-05-18 14:04:09 +02:00
Hendrik Muhs	d893041634	[ML] add version information in case of crash of native ML process (#30674 ) This change adds version information in case a native ML process crashes, the version is important for choosing the right symbol files when analyzing the crash. Adding the version combines all necessary information on one line. relates elastic/ml-cpp#94	2018-05-18 07:46:52 +02:00
Dimitris Athanasiou	75665a2d3e	[ML] Clean left behind model state docs (#30659 ) It is possible for state documents to be left behind in the state index. This may be because of bugs or uncontrollable scenarios. In any case, those documents may take up quite some disk space when they add up. This commit adds a step in the expired data deletion that is part of the daily maintenance service. The new step searches for state documents that do not belong to any of the current jobs and deletes them. Closes #30551	2018-05-17 17:51:26 +03:00
Dimitris Athanasiou	01bdfcde6f	[ML] DeleteExpiredDataAction should use client with origin (#30646 ) This is an admin action that should be allowed to operate on ML indices with full permissions.	2018-05-16 23:35:23 +03:00
Colin Goodheart-Smithe	a75b8adce5	Refactors ClientHelper to combine header logic (#30620 ) * Refactors ClientHelper to combine header logic This change removes all the `ClientHelper` classes which were repeating logic between plugins and instead adds `ClientHelper.executeWithHeaders()` and `ClientHelper.executeWithHeadersAsync()` methods to centralise the logic for executing requests with stored security headers. Removes Watcher headers constant	2018-05-16 11:38:24 +01:00
David Roberts	50c34b2a9b	[ML] Reverse engineer Grok patterns from categorization results (#30125 ) This change adds a grok_pattern field to the GET categories API output in ML. It's calculated using the regex and examples in the categorization result, and applying a list of candidate Grok patterns to the bits in between the tokens that are considered to define the category. This can currently be considered a prototype, as the Grok patterns it produces are not optimal. However, enough people have said it would be useful for it to be worthwhile exposing it as experimental functionality for interested parties to try out.	2018-05-15 09:02:38 +01:00
David Kyle	9dd629648d	[ML] Improve state persistence log message	2018-05-12 09:20:08 +01:00
Dimitris Athanasiou	3b260dcfc1	[ML] Account for gaps in data counts after job is reopened (#30294 ) This commit fixes an issue with the data diagnostics where empty buckets are not reported even though they should be. Once a job is reopened, the diagnostics do not get initialized from the current data counts (especially the latest record timestamp). The result is that if the data that is sent has a time gap compared to the previous data, that gap is not accounted for in the empty bucket count. This commit fixes that by initializing the diagnostics with the current data counts. Closes #30080	2018-05-03 15:08:24 +01:00
Ryan Ernst	fb0aa562a5	Network: Remove http.enabled setting (#29601 ) This commit removes the http.enabled setting. While all real nodes (started with bin/elasticsearch) will always have an http binding, there are many tests that rely on the quickness of not actually needing to bind to 2 ports. For this case, the MockHttpTransport.TestPlugin provides a dummy http transport implementation which is used by default in ESIntegTestCase. closes #12792	2018-05-02 11:42:05 -07:00
Dimitris Athanasiou	057cdffed5	[ML] Refactor DataStreamDiagnostics to use array (#30129 ) This commit refactors the DataStreamDiagnostics class achieving the following advantages: - simpler code; by encapsulating the moving bucket histogram into its own class - better performance; by using an array to store the buckets instead of a map - explicit handling of gap buckets; in preparation of fixing #30080	2018-05-01 09:50:32 +01:00
David Roberts	225f7093a9	[ML] Include 3rd party C++ component notices (#30132 ) The overall NOTICE file for the ML X-Pack module should include the notices from the 3rd party C++ components as well as the 3rd party Java components.	2018-04-30 20:05:27 +01:00
David Kyle	cfc66a1fd5	[ML] Wait for updates to established memory usage Tests need to wait for changes to the job's established memory usage to propagate, and an over-enthusiastic optimisation meant jobs were updated from stale state, causing recent changes to be lost.	2018-04-24 13:46:58 -04:00
Ryan Ernst	2efd22454a	Migrate x-pack-elasticsearch source to elasticsearch	2018-04-20 15:29:54 -07:00
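
The rule semantics described in commit 5c77ebe89d ([ML] Implement new rules design) can be illustrated with a small sketch. This is Python written only for illustration, not the Elasticsearch implementation. The shape of `scope` (a mapping from field name to `filter_id`/`filter_type`), the `filters` lookup table and all function names are assumptions; only the field names, operators and the and/or combination come from the commit message.

```python
import operator

# Comparison operators named in the commit message.
OPERATORS = {
    "lt": operator.lt,
    "lte": operator.le,
    "gt": operator.gt,
    "gte": operator.ge,
}


def scope_matches(scope, entity, filters):
    """Every scoped field must pass its filter; an empty scope matches all series.

    `scope` maps a field name to {"filter_id": ..., "filter_type": ...}
    (hypothetical shape, assumed for this sketch).
    """
    for field, spec in scope.items():
        in_filter = entity.get(field) in filters[spec["filter_id"]]
        wants_member = spec.get("filter_type", "include") == "include"
        if in_filter != wants_member:
            return False
    return True


def conditions_match(conditions, values):
    """Conditions are combined with `and`; zero conditions always match."""
    return all(
        OPERATORS[c["operator"]](values[c["applies_to"]], c["value"])
        for c in conditions
    )


def rule_applies(rule, entity, values, filters):
    """A rule with scope and conditions applies when all of them apply."""
    return scope_matches(rule.get("scope", {}), entity, filters) and conditions_match(
        rule.get("conditions", []), values
    )


def actions_for(rules, entity, values, filters):
    """Rules are combined with `or`: collect the actions of every rule that applies."""
    actions = set()
    for rule in rules:
        if rule_applies(rule, entity, values, filters):
            # `skip_result` is the default action per the commit message.
            actions.update(rule.get("actions", ["skip_result"]))
    return actions
```

For example, with a filter of safe IP addresses and a second rule that skips model updates when `actual` is below 5, a result for a safe IP yields only `skip_result`, while a low-valued result for any other IP yields only `skip_model_update`.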


801 Commits
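
The rate limiting described in commit 308e37f80e ([ML] Rate limit established model memory updates) schedules a write 5 seconds ahead and replaces the pending value if a newer one arrives in the meantime. A minimal sketch of that idea follows, in Python and with an explicit clock so it stays deterministic. The class and method names are hypothetical and this is not the Elasticsearch code, which schedules the write on a thread pool rather than polling.

```python
class DelayedLatestWriter:
    """Keeps at most one pending write per job and always writes the latest value."""

    def __init__(self, delay_seconds=5.0):
        self.delay = delay_seconds
        self.pending = {}  # job_id -> (due_time, latest_value)

    def submit(self, job_id, value, now):
        """Record a new value; start the delay only if nothing is pending for the job."""
        if job_id in self.pending:
            due, _ = self.pending[job_id]
            # Keep the original deadline, replace the value to be written.
            self.pending[job_id] = (due, value)
        else:
            self.pending[job_id] = (now + self.delay, value)

    def flush(self, now):
        """Return {job_id: value} for every write whose delay has elapsed."""
        due_jobs = [job for job, (due, _) in self.pending.items() if due <= now]
        return {job: self.pending.pop(job)[1] for job in due_jobs}
```

Because the value is written in arrears, the last value received is the one that ends up stored, which is the property the commit message contrasts with skipping updates that are close to the old value.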