OpenSearch

Commit Graph

Author	SHA1	Message	Date
Przemysław Witek	df574e5168	[7.x] Implement ml/data_frame/analytics/_estimate_memory_usage API endpoint (#45188) (#45510)	2019-08-14 08:26:03 +02:00
István Zoltán Szabó	276e9c6697	[DOCS] Adds supported time units ref to the ML and DF API params. (#45322)	2019-08-08 14:26:19 +02:00
Lisa Cawley	7c9c9a9cc4	[DOCS] Reformats ML update APIs (#45253)	2019-08-06 11:16:29 -07:00
István Zoltán Szabó	dae648eb32	[DOCS] Makes clearer the note under freq_rare. (#45193)	2019-08-05 13:29:43 +02:00
James Rodewig	8dd74dfe0b	Rename "indices APIs" to "index APIs" (#44863)	2019-08-02 14:10:09 -04:00
Lisa Cawley	09bd6c4692	[DOCS] Clarifies bucket span in overall buckets API (#45110)	2019-08-02 08:42:39 -07:00
Lisa Cawley	e4b7ae211b	[DOCS] Updates terms in machine learning get APIs (#44986)	2019-07-30 11:29:25 -07:00
István Zoltán Szabó	19426f9cdf	[DOCS] Adds allow no jobs param to the GET, GET stats and Close APIs (#44503)	2019-07-30 14:27:27 +02:00
Lisa Cawley	a041d1eacf	[DOCS] Updates anomaly detection terminology (#44888)	2019-07-26 11:10:49 -07:00
Lisa Cawley	cef375f883	[DOCS] Updates terms in machine learning datafeed APIs (#44883)	2019-07-26 10:48:28 -07:00
István Zoltán Szabó	cd7ba9f302	[DOCS] Amends data frame analytics resources, GET, and PUT API docs (#44806) This PR addresses the feedback in https://github.com/elastic/ml-team/issues/175#issuecomment-512215731. * Adds an example to `analyzed_fields` * Includes `source` and `dest` objects inline in the resource page * Lists `model_memory_limit` in the PUT API page * Amends the `analysis` section in the resource page * Removes Properties headings in subsections	2019-07-26 11:52:43 +02:00
Lisa Cawley	21971feae8	[DOCS] Updates terms in machine learning calendar APIs (#44866)	2019-07-25 11:50:34 -07:00
Lisa Cawley	a79adca7e3	[DOCS] Updates terms in anomaly detection job APIs (#44839)	2019-07-25 09:06:52 -07:00
István Zoltán Szabó	4a31c426e6	[DOCS] Adds allow no datafeeds query param to the GET, GET stats and STOP datafeed APIs (#44499)	2019-07-25 17:08:01 +02:00
James Rodewig	d46545f729	[DOCS] Update anchors and links for Elasticsearch API relocation (#44500)	2019-07-19 09:18:23 -04:00
Lisa Cawley	8445c41004	[DOCS] Moves content to ML anomaly-detection folder (#44520) (#44530)	2019-07-18 08:44:52 -07:00
Lisa Cawley	213af8411f	[DOCS] Fixes query default value (#44572)	2019-07-18 08:18:58 -07:00
Lisa Cawley	53514b0477	[DOCS] Separates data frame analytics APIs (#44451)	2019-07-16 13:33:23 -07:00
James Rodewig	ac07eef86c	[DOCS] Remove :edit_url: overrides. (#44445) These overrides do not work in Asciidoctor and are no longer needed.	2019-07-16 15:04:44 -04:00
Lisa Cawley	e7ea49e32f	[DOCS] Removes unnecessary resource definition pages (#44289)	2019-07-15 10:03:53 -07:00
David Kyle	2382701547	Wait for pending tasks in docs tests cleanup (#44123) ML and Data Frame tests should wait for pending tasks	2019-07-15 12:04:27 +01:00
Lisa Cawley	8fdcf28fac	[DOCS] Reformats API parameter details (#44194)	2019-07-12 08:28:49 -07:00
Lisa Cawley	4d8bf1c3e3	[DOCS] Removes links to ML tutorial (#44251)	2019-07-12 08:28:36 -07:00
István Zoltán Szabó	2171b6b47f	[DOCS] Adds data frame analytics API and evaluate API resource documentation (#43972) This PR adds the resource documentation of the data frame analytics APIs and the evaluate API to the ML API doc pool.	2019-07-11 18:12:48 +02:00
lcawl	4e6cbc2890	[DOCS] Fixes formatting in data frame analytics API	2019-07-10 18:01:47 -07:00
Przemysław Witek	44781e415e	[7.x] [ML] Add DatafeedTimingStats to datafeed GetDatafeedStatsAction.Response (#43045) (#44118)	2019-07-10 11:51:44 +02:00
David Kyle	23d7e309da	Mute put job docs test Relates to #43271	2019-07-09 13:23:31 +01:00
lcawl	cd4021274a	[DOCS] Enables testing for create job ML API (#44022)	2019-07-08 11:43:18 -07:00
Lisa Cawley	117f14e0ed	[DOCS] Updates 7.x version in data frame analytics API (#44026)	2019-07-08 11:20:57 -07:00
Lisa Cawley	efddbcc1d1	[DOCS] Fixes earliest_record_timestamp data type (#44030)	2019-07-08 10:16:07 -07:00
lcawl	a831d4707c	[DOCS] Temporarily disables data frame API testing	2019-07-05 10:56:09 -07:00
István Zoltán Szabó	7242267f5d	[DOCS] Adds data frame analytics APIs to the ML APIs (#43875) This PR adds the reference documentation pages of the data frame analytics APIs (PUT, START, STOP, GET, GET stats, DELETE, Evaluate) to the ML APIs pool.	2019-07-05 14:25:54 +02:00
Lisa Cawley	1b7bcdc3a0	[DOCS] Adds data frame API response codes for allow_no_match (#43666)	2019-06-27 15:17:58 -07:00
Lisa Cawley	42cb59f7b4	[DOCS] Updates ML APIs to use new API template (#43711)	2019-06-27 15:05:51 -07:00
lcawl	d46e2bb26a	[DOCS] Adds anchors and attributes to ML APIs	2019-06-27 09:44:56 -07:00
Matthew Adams	0bcadbf846	Clarify storage location of ML Snapshots (#43437) The existing language was misleading about the model snapshots and where they are located. Saying "to disk" sounds like files external to Elasticsearch IMO. It raises the obvious question, where on disk? which node? Is it in the Elasticsearch snapshot repo? The model snapshots are held in an internal index.	2019-06-24 09:14:12 +01:00
Przemysław Witek	b2613a123d	[7.x] Report exponential_avg_bucket_processing_time which gives more weight to recent buckets (#43189) (#43263)	2019-06-17 08:58:26 +02:00
lcawl	8a341a3ea5	[DOCS] Fix link to ML node description	2019-06-13 13:56:06 -07:00
Ryan Ernst	c3ce3f6891	Add native code info to ML info api (#43172) The machine learning feature of xpack has native binaries with a different commit id than the rest of code. It is currently exposed in the xpack info api. This commit adds that commit information to the ML info api, so that it may be removed from the info api.	2019-06-13 11:38:58 -07:00
Benjamin Trent	79052050bf	[ML] Adding support for geo_shape, geo_centroid, geo_point in datafeeds (#42969) (#43069) * [ML] Adding support for geo_shape, geo_centroid, geo_point in datafeeds * only supporting doc_values for geo_point fields * moving validation into GeoPointField ctor	2019-06-10 21:52:53 -05:00
David Roberts	b202a59f88	[ML] Add earliest and latest timestamps to field stats (#42890) This change adds the earliest and latest timestamps into the field stats for fields of type "date" in the output of the ML find_file_structure endpoint. This will enable the cards for date fields in the file data visualizer in the UI to be made to look more similar to the cards for date fields in the index data visualizer in the UI.	2019-06-06 08:58:35 +01:00
David Roberts	b61202b0a8	[ML] Add a limit on line merging in find_file_structure (#42501) When analysing a semi-structured text file the find_file_structure endpoint merges lines to form multi-line messages using the assumption that the first line in each message contains the timestamp. However, if the timestamp is misdetected then this can lead to excessive numbers of lines being merged to form massive messages. This commit adds a line_merge_size_limit setting (default 10000 characters) that halts the analysis if a message bigger than this is created. This prevents significant CPU time being spent subsequently trying to determine the internal structure of the huge bogus messages.	2019-06-03 13:45:51 +01:00
Benjamin Trent	d06618a70d	[ML] adding delayed_data_check_config to datafeed update docs (#42095) (#42626) * [ML] adding delayed_data_check_config to datafeed update docs * [DOCS] Edits delayed data configuration details	2019-05-28 11:36:30 -04:00
David Roberts	f472186b9f	[ML] Improve file structure finder timestamp format determination (#41948) This change contains a major refactoring of the timestamp format determination code used by the ML find file structure endpoint. Previously timestamp format determination was done separately for each piece of text supplied to the timestamp format finder. This had the drawback that it was not possible to distinguish dd/MM and MM/dd in the case where both numbers were 12 or less. In order to do this sensibly it is best to look across all the available timestamps and see if one of the numbers is greater than 12 in any of them. This necessitates making the timestamp format finder an instantiable class that can accumulate evidence over time. Another problem with the previous approach was that it was only possible to override the timestamp format to one of a limited set of timestamp formats. There was no way out if a file to be analysed had a timestamp that was sane yet not in the supported set. This is now changed to allow any timestamp format that can be parsed by a combination of these Java date/time formats: yy, yyyy, M, MM, MMM, MMMM, d, dd, EEE, EEEE, H, HH, h, mm, ss, a, XX, XXX, zzz. Additionally S letter groups (fractional seconds) are supported providing they occur after ss and separated from the ss by a dot, comma or colon. Spacing and punctuation is also permitted with the exception of the question mark, newline and carriage return characters, together with literal text enclosed in single quotes. The full list of changes/improvements in this refactor is: - Make TimestampFormatFinder an instantiable class - Overrides must be specified in Java date/time format - Joda format is no longer accepted - Joda timestamp formats in outputs are now derived from the determined or overridden Java timestamp formats, not stored separately - Functionality for determining the "best" timestamp format in a set of lines has been moved from TextLogFileStructureFinder to TimestampFormatFinder, taking advantage of the fact that TimestampFormatFinder is now an instantiable class with state - The functionality to quickly rule out some possible Grok patterns when looking for timestamp formats has been changed from using simple regular expressions to the much faster approach of using the Shift-And method of sub-string search, but using an "alphabet" consisting of just 1 (representing any digit) and 0 (representing non-digits) - Timestamp format overrides are now much more flexible - Timestamp format overrides that do not correspond to a built-in Grok pattern are mapped to a %{CUSTOM_TIMESTAMP} Grok pattern whose definition is included within the date processor in the ingest pipeline - Grok patterns that correspond to multiple Java date/time patterns are now handled better - the Grok pattern is accepted as matching broadly, and the required set of Java date/time patterns is built up considering all observed samples - As a result of the more flexible acceptance of Grok patterns, when looking for the "best" timestamp in a set of lines timestamps are considered different if they are preceded by a different sequence of punctuation characters (to prevent timestamps far into some lines being considered similar to timestamps near the beginning of other lines) - Out-of-the-box Grok patterns that are considered now include %{DATE} and %{DATESTAMP}, which have indeterminate day/month ordering - The order of day/month in formats with indeterminate day/month order is determined by considering all observed samples (plus the server locale if the observed samples still do not suggest an ordering) Relates #38086 Closes #35137 Closes #35132	2019-05-24 09:10:08 +01:00
Zachary Tong	6ae6f57d39	[7.x Backport] Force selection of calendar or fixed intervals (#41906) The date_histogram accepts an interval which can be either a calendar interval (DST-aware, leap seconds, arbitrary length of months, etc) or fixed interval (strict multiples of SI units). Unfortunately this is inferred by first trying to parse as a calendar interval, then falling back to fixed if that fails. This leads to confusing arrangement where `1d` == calendar, but `2d` == fixed. And if you want a day of fixed time, you have to specify `24h` (e.g. the next smallest unit). This arrangement is very error-prone for users. This PR adds `calendar_interval` and `fixed_interval` parameters to any code that uses intervals (date_histogram, rollup, composite, datafeed, etc). Calendar only accepts calendar intervals, fixed accepts any combination of units (meaning `1d` can be used to specify `24h` in fixed time), and both are mutually exclusive. The old interval behavior is deprecated and will throw a deprecation warning. It is also mutually exclusive with the two new parameters. In the future the old dual-purpose interval will be removed. The change applies to both REST and java clients.	2019-05-20 12:07:29 -04:00
James Rodewig	005296dac6	[DOCS] Allow attribute substitution in titleabbrevs for Asciidoctor migration (#41574) * [DOCS] Replace attributes in titleabbrevs for Asciidoctor migration * [DOCS] Add [subs="attributes"] so attributes render in Asciidoctor * Revert "[DOCS] Replace attributes in titleabbrevs for Asciidoctor migration" This reverts commit 98f130257a7c71e9f6cddf5157af7886418338d8. * [DOCS] Fix merge conflict	2019-04-30 13:46:45 -04:00
David Roberts	cbe7d335ff	[DOCS] Use "source" instead of "inline" in ML docs (#40635) Specifying an inline script in an "inline" field was deprecated in 5.x. The new field name is "source". (Since 6.x still accepts "inline" I will only backport this docs change as far as 7.0.)	2019-03-29 17:30:28 +00:00
David Kyle	48788269b0	[ML] Correct small inconsistencies in ml APIs spec and docs (#39907)	2019-03-11 14:02:50 +00:00
David Roberts	5f8f91c03b	[ML] Use scaling thread pool and xpack.ml.max_open_jobs cluster-wide dynamic (#39736) This change does the following: 1. Makes the per-node setting xpack.ml.max_open_jobs into a cluster-wide dynamic setting 2. Changes the job node selection to continue to use the per-node attributes storing the maximum number of open jobs if any node in the cluster is older than 7.1, and use the dynamic cluster-wide setting if all nodes are on 7.1 or later 3. Changes the docs to reflect this 4. Changes the thread pools for native process communication from fixed size to scaling, to support the dynamic nature of xpack.ml.max_open_jobs 5. Renames the autodetect thread pool to the job comms thread pool to make clear that it will be used for other types of ML jobs (data frame analytics in particular) Backport of #39320	2019-03-06 12:29:34 +00:00
Tal Levy	92756288b4	relax ML Info Docs expected response (#38993) the get-ml-info API documentation tested that the response show that ML's `upgrade_mode` was false. For reasons that may be true due to other tests running in parallel or not cleaning themselves up, this may not be guaranteed. Since the actual value here is not of importance, this commit relaxes the requirement that upgrade_mode be static.	2019-02-15 16:31:01 -08:00

93 Commits