120 lines
4.1 KiB
Plaintext
120 lines
4.1 KiB
Plaintext
|
||
[[ml-sum-functions]]
|
||
=== Sum Functions
|
||
|
||
The sum functions detect anomalies when the sum of a field in a bucket is anomalous.
|
||
|
||
If you want to monitor unusually high totals, use high-sided functions.
|
||
|
||
If want to look at drops in totals, use low-sided functions.
|
||
|
||
If your data is sparse, use `non_null_sum` functions. Buckets without values are
|
||
ignored; buckets with a zero value are analyzed.
|
||
|
||
The {xpackml} features include the following sum functions:
|
||
|
||
* xref:ml-sum[`sum`, `high_sum`, `low_sum`]
|
||
* xref:ml-nonnull-sum[`non_null_sum`, `high_non_null_sum`, `low_non_null_sum`]
|
||
|
||
////
|
||
TBD: Incorporate from prelert docs?:
|
||
Input data may contain pre-calculated fields giving the total count of some value e.g. transactions per minute.
|
||
Ensure you are familiar with our advice on Summarization of Input Data, as this is likely to provide
|
||
a more appropriate method to using the sum function.
|
||
////
|
||
|
||
[float]
|
||
[[ml-sum]]
|
||
==== Sum, High_sum, Low_sum
|
||
|
||
The `sum` function detects anomalies where the sum of a field in a bucket is
|
||
anomalous.
|
||
|
||
If you want to monitor unusually high sum values, use the `high_sum` function.
|
||
|
||
If you want to monitor unusually low sum values, use the `low_sum` function.
|
||
|
||
These functions support the following properties:
|
||
|
||
* `field_name` (required)
|
||
* `by_field_name` (optional)
|
||
* `over_field_name` (optional)
|
||
* `partition_field_name` (optional)
|
||
|
||
For more information about those properties, see
|
||
{ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects].
|
||
|
||
.Example 1: Analyzing total expenses with the sum function
|
||
[source,js]
|
||
--------------------------------------------------
|
||
{
|
||
"function" : "sum",
|
||
"field_name" : "expenses",
|
||
"by_field_name" : "costcenter",
|
||
"over_field_name" : "employee"
|
||
}
|
||
--------------------------------------------------
|
||
|
||
If you use this `sum` function in a detector in your job, it
|
||
models total expenses per employees for each cost center. For each time bucket,
|
||
it detects when an employee’s expenses are unusual for a cost center compared
|
||
to other employees.
|
||
|
||
.Example 2: Analyzing total bytes with the high_sum function
|
||
[source,js]
|
||
--------------------------------------------------
|
||
{
|
||
"function" : "high_sum",
|
||
"field_name" : "cs_bytes",
|
||
"over_field_name" : "cs_host"
|
||
}
|
||
--------------------------------------------------
|
||
|
||
If you use this `high_sum` function in a detector in your job, it
|
||
models total `cs_bytes`. It detects `cs_hosts` that transfer unusually high
|
||
volumes compared to other `cs_hosts`. This example looks for volumes of data
|
||
transferred from a client to a server on the internet that are unusual compared
|
||
to other clients. This scenario could be useful to detect data exfiltration or
|
||
to find users that are abusing internet privileges.
|
||
|
||
[float]
|
||
[[ml-nonnull-sum]]
|
||
==== Non_null_sum, High_non_null_sum, Low_non_null_sum
|
||
|
||
The `non_null_sum` function is useful if your data is sparse. Buckets without
|
||
values are ignored and buckets with a zero value are analyzed.
|
||
|
||
If you want to monitor unusually high totals, use the `high_non_null_sum`
|
||
function.
|
||
|
||
If you want to look at drops in totals, use the `low_non_null_sum` function.
|
||
|
||
These functions support the following properties:
|
||
|
||
* `field_name` (required)
|
||
* `by_field_name` (optional)
|
||
* `partition_field_name` (optional)
|
||
|
||
For more information about those properties, see
|
||
{ref}/ml-job-resource.html#ml-detectorconfig[Detector Configuration Objects].
|
||
|
||
NOTE: Population analysis (that is to say, use of the `over_field_name` property)
|
||
is not applicable for this function.
|
||
|
||
.Example 3: Analyzing employee approvals with the high_non_null_sum function
|
||
[source,js]
|
||
--------------------------------------------------
|
||
{
|
||
"function" : "high_non_null_sum",
|
||
"fieldName" : "amount_approved",
|
||
"byFieldName" : "employee"
|
||
}
|
||
--------------------------------------------------
|
||
|
||
If you use this `high_non_null_sum` function in a detector in your job, it
|
||
models the total `amount_approved` for each employee. It ignores any buckets
|
||
where the amount is null. It detects employees who approve unusually high
|
||
amounts compared to their past behavior.
|
||
//For this credit control system analysis, using non_null_sum will ignore
|
||
//periods where the employees are not active on the system.
|