[[ml-dfeeds]]
=== {dfeeds-cap}
Machine learning jobs can analyze data that is stored in {es} or data that is
sent from some other source via an API. _{dfeeds-cap}_ retrieve data from {es}
for analysis, which is the simpler and more common scenario.

If you create jobs in {kib}, you must use {dfeeds}. When you create a job, you
select an index pattern and {kib} configures the {dfeed} for you under the
covers. If you use {ml} APIs instead, you can create a {dfeed} by using the
{ref}/ml-put-datafeed.html[create {dfeeds} API] after you create a job. You can
associate only one {dfeed} with each job.

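
As a rough illustration of the API route, the following Python sketch builds (but does not send) a create request for a {dfeed}. The cluster address, the `_xpack/ml` path prefix, the identifiers `datafeed-example` and `example-job`, the index pattern, and the property names in the body are assumptions made for this example; check them against the API reference for your version.

```python
import json
from urllib.request import Request

# Assumption: a cluster reachable at this address. Nothing is sent here;
# pass the request to urllib.request.urlopen() to actually submit it.
BASE_URL = "http://localhost:9200"


def build_create_datafeed_request(datafeed_id, job_id, indices):
    """Build a PUT request that creates a datafeed for exactly one job."""
    body = {
        "job_id": job_id,            # the single job this datafeed supplies
        "indices": indices,          # assumed property name for the source indices
        "query": {"match_all": {}},  # retrieve every document in those indices
    }
    return Request(
        url=f"{BASE_URL}/_xpack/ml/datafeeds/{datafeed_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical identifiers, used only for illustration.
request = build_create_datafeed_request(
    "datafeed-example", "example-job", ["metrics-*"]
)
print(request.get_method(), request.full_url)
```

The job must already exist before this request is submitted, because the body refers to it by `job_id`.
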
For a description of all the {dfeed} properties, see
{ref}/ml-datafeed-resource.html[Datafeed Resources].

To start retrieving data from {es}, you must start the {dfeed}. When you start
it, you can optionally specify start and end times. If you do not specify an
end time, the {dfeed} runs continuously. You can start and stop {dfeeds} in
{kib} or use the {ref}/ml-start-datafeed.html[start {dfeeds}] and
{ref}/ml-stop-datafeed.html[stop {dfeeds}] APIs. A {dfeed} can be started and
stopped multiple times throughout its lifecycle.

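
The optional start and end times can be sketched as follows. This Python example only constructs the start and stop requests; the cluster address, the `_xpack/ml` path prefix, the {dfeed} identifier, and the timestamps are assumptions for illustration.

```python
import json
from urllib.request import Request

# Assumption: a cluster reachable at this address. The requests are only built.
BASE_URL = "http://localhost:9200"


def build_start_request(datafeed_id, start=None, end=None):
    """Build a start request. Omitting `end` leaves the datafeed running continuously."""
    body = {}
    if start is not None:
        body["start"] = start
    if end is not None:
        body["end"] = end
    return Request(
        url=f"{BASE_URL}/_xpack/ml/datafeeds/{datafeed_id}/_start",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_stop_request(datafeed_id):
    """Build a stop request for a running datafeed."""
    return Request(
        url=f"{BASE_URL}/_xpack/ml/datafeeds/{datafeed_id}/_stop",
        method="POST",
    )


# Bounded run: analysis stops once the end time is reached.
bounded = build_start_request(
    "datafeed-example", start="2017-04-01T00:00:00Z", end="2017-05-01T00:00:00Z"
)
# Continuous run: no end time, so the datafeed keeps retrieving new data.
continuous = build_start_request("datafeed-example", start="2017-04-01T00:00:00Z")
stop = build_stop_request("datafeed-example")
```

Because a {dfeed} can be started and stopped repeatedly, the same helper functions can be reused across its lifecycle.
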
[IMPORTANT]
--
When {security} is enabled, a {dfeed} stores the roles that the user who created
or last updated the {dfeed} held at that time. If the definitions of those roles
are later changed, the {dfeed} subsequently runs with the new permissions that
are associated with the roles. However, if the set of roles assigned to the user
changes after the {dfeed} is created or updated, the {dfeed} continues to run
with the permissions that are associated with the originally stored roles.

One way to update the roles that are stored within the {dfeed} without changing
any other settings is to submit an empty JSON document ({}) to the
{ref}/ml-update-datafeed.html[update {dfeed} API].
--

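
The empty-document update described in the note above might look like the following Python sketch. It only builds the request; the cluster address, the `_xpack/ml` path prefix, and the {dfeed} identifier are assumptions for illustration. The request would need to be submitted by the user whose current roles the {dfeed} should store.

```python
from urllib.request import Request

# Assumption: a cluster reachable at this address. The request is only built.
BASE_URL = "http://localhost:9200"


def build_refresh_roles_request(datafeed_id):
    """Build an update request whose body is an empty JSON document.

    No datafeed settings are changed by the empty body.
    """
    return Request(
        url=f"{BASE_URL}/_xpack/ml/datafeeds/{datafeed_id}/_update",
        data=b"{}",
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical identifier, used only for illustration.
refresh = build_refresh_roles_request("datafeed-example")
print(refresh.get_method(), refresh.full_url)
```
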
If the data that you want to analyze is not stored in {es}, you cannot use
{dfeeds}. You can, however, send batches of data directly to the job by using
the {ref}/ml-post-data.html[post data to jobs API].
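
Sending a batch directly to a job could be sketched like this in Python. The request is built but not sent; the cluster address, the `_xpack/ml` path prefix, the job identifier, and the record fields `time` and `responsetime` are hypothetical, and the newline-delimited JSON encoding is one plausible input format rather than a requirement stated here.

```python
import json
from urllib.request import Request

# Assumption: a cluster reachable at this address. The request is only built.
BASE_URL = "http://localhost:9200"


def build_post_data_request(job_id, records):
    """Build a request that sends one batch of records straight to a job."""
    # Encode the batch as newline-delimited JSON, one record per line.
    payload = "\n".join(json.dumps(record) for record in records).encode("utf-8")
    return Request(
        url=f"{BASE_URL}/_xpack/ml/anomaly_detectors/{job_id}/_data",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical batch: the field names must match what the job is configured to read.
batch = [
    {"time": 1491004800, "responsetime": 132.2},
    {"time": 1491004860, "responsetime": 141.7},
]
post_data = build_post_data_request("example-job", batch)
print(post_data.get_method(), post_data.full_url)
```
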