OpenSearch/docs/reference/ml
David Roberts b61202b0a8 [ML] Add a limit on line merging in find_file_structure (#42501)
When analysing a semi-structured text file, the
find_file_structure endpoint merges lines to form
multi-line messages, assuming that the first line
of each message contains the timestamp. However,
if the timestamp is misdetected, excessive numbers
of lines can be merged to form massive messages.

This commit adds a line_merge_size_limit setting
(default 10000 characters) that halts the analysis
if a message bigger than this is created. This
avoids spending significant CPU time subsequently
trying to determine the internal structure of the
huge bogus messages.
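
The merging behaviour described above can be illustrated with a
minimal sketch. It is not the code from this commit: the names
merge_lines, LineMergeLimitExceeded and TIMESTAMP_AT_START are
hypothetical, the timestamp pattern is a simplified stand-in for
real timestamp detection, and dropping lines that precede the
first timestamp is an assumption made to keep the example short.

```python
import re

# Hypothetical, simplified timestamp detector (assumption, not the real one).
TIMESTAMP_AT_START = re.compile(r"^\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}")

# Default taken from the commit message: 10000 characters.
LINE_MERGE_SIZE_LIMIT = 10000


class LineMergeLimitExceeded(Exception):
    """Raised when a merged message grows beyond the configured limit."""


def merge_lines(lines, timestamp_pattern=TIMESTAMP_AT_START,
                limit=LINE_MERGE_SIZE_LIMIT):
    """Merge lines into messages; a line starting with a timestamp begins a new message."""
    messages = []
    current = None
    for line in lines:
        if timestamp_pattern.match(line):
            # A timestamp marks the first line of a new message.
            if current is not None:
                messages.append(current)
            current = line
        elif current is None:
            # Assumption: lines before the first timestamp are ignored.
            continue
        else:
            # Continuation line: append it to the message being built.
            current = current + "\n" + line
            if len(current) > limit:
                # Halt rather than keep building a huge bogus message.
                raise LineMergeLimitExceeded(
                    "merged message exceeds %d characters" % limit
                )
    if current is not None:
        messages.append(current)
    return messages
```

If the timestamp pattern is wrong and matches only rarely, almost
every line is treated as a continuation, so the limit check is
what stops the merged message from growing without bound.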
2019-06-03 13:45:51 +01:00
apis [ML] Add a limit on line merging in find_file_structure (#42501) 2019-06-03 13:45:51 +01:00
functions [DOCS] Cleans up xpackml attributes 2019-01-07 14:33:10 -08:00
images [DOCS] Remove unused screenshots 2019-01-10 11:10:25 -08:00
aggregations.asciidoc [7.x Backport] Force selection of calendar or fixed intervals (#41906) 2019-05-20 12:07:29 -04:00
categories.asciidoc [ML] Deprecate X-Pack centric ML endpoints (#36315) 2018-12-07 20:34:11 +00:00
configuring.asciidoc [DOCS] Cleans up xpackml attributes 2019-01-07 14:33:10 -08:00
customurl.asciidoc [ML] Deprecate X-Pack centric ML endpoints (#36315) 2018-12-07 20:34:11 +00:00
delayed-data-detection.asciidoc [DOCS] Delayed data annotations (#37939) 2019-01-28 13:04:38 -08:00
detector-custom-rules.asciidoc [ML] Deprecate X-Pack centric ML endpoints (#36315) 2018-12-07 20:34:11 +00:00
functions.asciidoc [DOCS] Cleans up xpackml attributes 2019-01-07 14:33:10 -08:00
populations.asciidoc [ML] Deprecate X-Pack centric ML endpoints (#36315) 2018-12-07 20:34:11 +00:00
stopping-ml.asciidoc [ML] Deprecate X-Pack centric ML endpoints (#36315) 2018-12-07 20:34:11 +00:00
transforms.asciidoc [DOCS] Use "source" instead of "inline" in ML docs (#40635) 2019-03-29 17:30:28 +00:00