The `abs` function will return the absolute value of the provided single parameter. The `abs` function will fail to execute if the value is non-numeric. If a null value is found then null will be returned as the result.
=== abs Parameters
* `Field Name | Raw Number | Number Evaluator`
=== abs Syntax
The expressions below show the various ways in which you can use the `abs` evaluator. Only one parameter is accepted. Returns a numeric value.
[source,text]
----
abs(1) // 1, not really a good use case for it
abs(-1) // 1, not really a good use case for it
abs(add(fieldA,fieldB)) // absolute value of fieldA + fieldB
----
The `add` function will take 2 or more numeric values and add them together. The `add` function will fail to execute if any of the values are non-numeric. If a null value is found then null will be returned as the result.
=== add Parameters
* `Field Name | Raw Number | Number Evaluator`
* `Field Name | Raw Number | Number Evaluator`
* `......`
* `Field Name | Raw Number | Number Evaluator`
=== add Syntax
The expressions below show the various ways in which you can use the `add` evaluator. The order of the parameters does not matter, and any number of parameters may be supplied as long as there are at least two. Returns a numeric value.
[source,text]
----
add(1,2,3,4) // 1 + 2 + 3 + 4 == 10
add(1,fieldA) // 1 + value of fieldA
add(fieldA,1.4) // value of fieldA + 1.4
add(fieldA,fieldB,fieldC) // value of fieldA + value of fieldB + value of fieldC
add(fieldA,div(fieldA,fieldB)) // value of fieldA + (value of fieldA / value of fieldB)
add(fieldA,if(gt(fieldA,fieldB),fieldA,fieldB)) // if fieldA > fieldB then fieldA + fieldA, else fieldA + fieldB
----
* Analyze the raw text: `analyze("hello world", analyzerField)`
* Analyze a text field within a `select` expression. This will annotate the tuples with the output of the analyzer: `select(expr, analyze(textField, analyzerField) as outField)`
* Analyze a text field with a `cartesianProduct` expression. This will stream each token emitted by the analyzer in its own tuple: `cartesianProduct(expr, analyze(textField, analyzer) as outField)`
The `and` function will return the logical AND of at least 2 boolean parameters. The function will fail to execute if any parameters are non-boolean or null. Returns a boolean value.
The expressions below show the various ways in which you can use the `and` evaluator. At least two parameters are required, but there is no limit to how many you can use.
* `length`: The length of the copied array. The returned array will be right padded with zeros if the length parameter exceeds the size of the original array.
The `cumulativeProbability` function returns the cumulative probability of a random variable within a
probability distribution. The cumulative probability is the probability that
the random variable takes a value less than or equal to the given value.
=== cumulativeProbability Parameters
* `probability distribution`
* `number` : Value to compute the probability for.
=== cumulativeProbability Returns
double : the cumulative probability
=== cumulativeProbability Syntax
[source,text]
cumulativeProbability(normalDistribution(500, 25), 502) // Returns the cumulative probability of the random sample 502 in a normal distribution with a mean of 500 and standard deviation of 25.
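The value produced by the example above can be cross-checked outside Solr. The following is a minimal Python sketch, not part of the streaming expression language; the helper name `cumulative_probability` is hypothetical and only the normal distribution case is covered.

[source,python]
----
from statistics import NormalDist


def cumulative_probability(mean: float, std_dev: float, value: float) -> float:
    """Return P(X <= value) for a normal distribution (illustrative helper)."""
    return NormalDist(mu=mean, sigma=std_dev).cdf(value)


# Same inputs as the expression above: mean 500, standard deviation 25, value 502.
result = cumulative_probability(500, 25, 502)
print(round(result, 4))
----

Because 502 lies slightly above the mean, the result is slightly above 0.5.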
The `div` function will take two numeric values and divide them. The function will fail to execute if any of the values are non-numeric or null, or the 2nd value is 0. Returns a numeric value.
The expressions below show the various ways in which you can use the `div` evaluator. The first value will be divided by the second and as such the second cannot be 0.
The `dotProduct` function returns the dot product (https://en.wikipedia.org/wiki/Dot_product) of two numeric arrays.
=== dotProduct Parameters
* `numeric array`
* `numeric array`
=== dotProduct Syntax
[source,text]
dotProduct(numericArray, numericArray)
=== dotProduct Returns
number
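As a rough illustration of the arithmetic, a dot product pairs up the elements of two equal-length arrays, multiplies each pair, and sums the products. The Python sketch below uses a hypothetical helper name and says nothing about how Solr handles arrays of different lengths; rejecting them is an assumption of the sketch.

[source,python]
----
def dot_product(a, b):
    """Sum of pairwise products of two equal-length numeric sequences."""
    if len(a) != len(b):
        # Assumption of this sketch: mismatched lengths are an error.
        raise ValueError("arrays must have the same length")
    return sum(x * y for x, y in zip(a, b))


print(dot_product([1, 2, 3], [4, 5, 6]))  # 1*4 + 2*5 + 3*6
----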
== earthMoversDistance
The `earthMoversDistance` function calculates the Earth Mover's distance (https://en.wikipedia.org/wiki/Earth_mover%27s_distance) of two numeric arrays.
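For intuition, the one-dimensional form of this distance can be computed by carrying the running surplus from one position to the next and summing its absolute value. The Python sketch below assumes that formulation and equal-length inputs; the helper name is hypothetical, and the sketch is not taken from the Solr implementation.

[source,python]
----
def earth_movers_distance(a, b):
    """One-dimensional Earth Mover's distance between two equal-length arrays."""
    if len(a) != len(b):
        raise ValueError("arrays must have the same length")
    carried = 0.0  # amount still to be moved to later positions
    total = 0.0    # accumulated work
    for x, y in zip(a, b):
        carried = carried + x - y
        total += abs(carried)
    return total


# Moving one unit of mass two positions to the right costs 2.
print(earth_movers_distance([1, 0, 0], [0, 0, 1]))
----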
The `ebeAdd` function performs an element-by-element addition of two numeric arrays.
=== ebeAdd Parameters
* `numeric array`
* `numeric array`
=== ebeAdd Syntax
[source,text]
ebeAdd(numericArray, numericArray)
=== ebeAdd Returns
numeric array
== ebeDivide
The `ebeDivide` function performs an element-by-element division of two numeric arrays.
=== ebeDivide Parameters
* `numeric array`
* `numeric array`
=== ebeDivide Syntax
[source,text]
ebeDivide(numericArray, numericArray)
=== ebeDivide Returns
numeric array
== ebeMultiply
The `ebeMultiply` function performs an element-by-element multiplication of two numeric arrays.
=== ebeMultiply Parameters
* `numeric array`
* `numeric array`
=== ebeMultiply Syntax
[source,text]
ebeMultiply(numericArray, numericArray)
=== ebeMultiply Returns
numeric array
== ebeSubtract
The `ebeSubtract` function performs an element-by-element subtraction of two numeric arrays.
=== ebeSubtract Parameters
* `numeric array`
* `numeric array`
=== ebeSubtract Syntax
[source,text]
ebeSubtract(numericArray, numericArray)
=== ebeSubtract Returns
numeric array
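The four element-by-element functions above share one pattern: the operation is applied to the values at the same index of each array, and the results are collected into a new array. A minimal Python sketch of that pattern follows; the helper `ebe` is hypothetical, and requiring equal lengths is an assumption of the sketch.

[source,python]
----
from operator import add, mul, sub, truediv


def ebe(op, a, b):
    """Apply a two-argument operation to each pair of same-index elements."""
    if len(a) != len(b):
        # Assumption of this sketch: mismatched lengths are an error.
        raise ValueError("arrays must have the same length")
    return [op(x, y) for x, y in zip(a, b)]


left, right = [4, 10, 18], [1, 2, 3]
print(ebe(add, left, right))      # like ebeAdd
print(ebe(sub, left, right))      # like ebeSubtract
print(ebe(mul, left, right))      # like ebeMultiply
print(ebe(truediv, left, right))  # like ebeDivide
----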
== empiricalDistribution
The `empiricalDistribution` function returns a continuous probability distribution function based
on an actual data set (https://en.wikipedia.org/wiki/Empirical_distribution_function). This function is part of the probability distribution framework and is designed to
work with the `sample`, `kolmogorovSmirnov` and `cumulativeProbability` functions.
This function is designed to work with continuous data. To build a distribution from
a discrete data set use the `enumeratedDistribution`.
=== empiricalDistribution Parameters
* `numeric array` : empirical observations
=== empiricalDistribution Returns
probability distribution function
=== empiricalDistribution Syntax
[source,text]
empiricalDistribution(numericArray)
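To illustrate the idea behind an empirical distribution function, the Python sketch below builds the plain step-function version described in the linked article: the cumulative probability at `x` is the fraction of observations less than or equal to `x`. The helper name is hypothetical, and any binning or smoothing that the Solr function may apply is not modeled here.

[source,python]
----
from bisect import bisect_right


def empirical_cdf(observations):
    """Return a function giving the fraction of observations <= x."""
    data = sorted(observations)
    n = len(data)

    def cdf(x):
        # bisect_right counts how many sorted values are <= x.
        return bisect_right(data, x) / n

    return cdf


cdf = empirical_cdf([3, 1, 2, 4])
print(cdf(2))
----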
== enumeratedDistribution
The `enumeratedDistribution` function returns a discrete probability distribution function based
on an actual data set or a pre-defined set of data and probabilities.
This function is part of the probability distribution framework and is designed to
work with the `sample`, `probability` and `cumulativeProbability` functions.
The enumeratedDistribution can be called in two different scenarios:
1) Single array of discrete values. This works like an empirical distribution for
discrete data.
2) An array of singleton discrete values and an array of double values representing
the probabilities of the discrete values.
This function is designed to work with discrete data. To build a distribution from
a continuous data set use the `empiricalDistribution`.
=== enumeratedDistribution Parameters
* `integer array` : discrete observations or singleton discrete values.
* `double array` : (Optional) values representing the probabilities of the singleton discrete values.
=== enumeratedDistribution Returns
probability distribution function
=== enumeratedDistribution Syntax
[source,text]
enumeratedDistribution(integerArray) // This creates an enumerated distribution from the observations in the numeric array.
enumeratedDistribution(array(1,2,3,4), array(.25,.25,.25,.25)) // This creates an enumerated distribution with four discrete values (1,2,3,4) each with a probability of .25.
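The two calling scenarios can be pictured as two ways of arriving at the same kind of table of values and probabilities. The following Python sketch is illustrative only; the helper names are hypothetical, and it assumes that probabilities in the single-array case are the relative frequencies of the observed values.

[source,python]
----
from collections import Counter


def from_observations(observations):
    """Scenario 1: derive probabilities from how often each value occurs."""
    counts = Counter(observations)
    total = len(observations)
    return {value: count / total for value, count in sorted(counts.items())}


def from_values(values, probabilities):
    """Scenario 2: pair each singleton value with its stated probability."""
    if len(values) != len(probabilities):
        raise ValueError("values and probabilities must have the same length")
    return dict(zip(values, probabilities))


def cumulative_probability(distribution, x):
    """Total probability of all values less than or equal to x."""
    return sum(p for value, p in distribution.items() if value <= x)


observed = from_observations([1, 1, 2, 3])
stated = from_values([1, 2, 3, 4], [.25, .25, .25, .25])
print(observed)
print(cumulative_probability(stated, 2))
----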
The `eor` function will return the logical exclusive or of at least two boolean parameters. The function will fail to execute if any parameters are non-boolean or null. Returns a boolean value.
The expressions below show the various ways in which you can use the `eor` evaluator. At least two parameters are required, but there is no limit to how many you can use.
The `eq` function will return whether all the parameters are equal, as per Java's standard `equals(...)` function. The function accepts parameters of any type, but will fail to execute unless all the parameters are of the same type. That is, all are Boolean, all are String, or all are Numeric. If any parameter is null and at least one parameter is not null, then false will be returned. Returns a boolean value.
=== eq Parameters
* `Field Name | Raw Value | Evaluator`
* `Field Name | Raw Value | Evaluator`
* `......`
* `Field Name | Raw Value | Evaluator`
=== eq Syntax
The expressions below show the various ways in which you can use the `eq` evaluator.
The `expMovingAvg` function computes an exponential moving average (https://en.wikipedia.org/wiki/Moving_average#Exponential_moving_average) for a numeric array.
=== expMovingAvg Parameters
* `numeric array` : The array to compute the exponential moving average from.
* `integer`: window size
=== expMovingAvg Returns
numeric array : (The first element of the returned array will start from the windowSize-1 index of the original array)
=== expMovingAvg Syntax
[source,text]
----
expMovingAvg(numericArray, 5) //Computes an exponential moving average with a window size of 5.
----
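The relationship between the window size and the length of the returned array can be sketched in Python. The sketch assumes a common formulation that this page does not confirm: the first output is the simple average of the first `window` values, and each later value is blended in with a smoothing factor of `2 / (window + 1)`. The helper name is hypothetical.

[source,python]
----
def exp_moving_avg(values, window):
    """Exponential moving average seeded with a simple average (assumed scheme)."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    alpha = 2.0 / (window + 1)           # assumed smoothing factor
    ema = sum(values[:window]) / window  # seed aligned with index window-1
    result = [ema]
    for value in values[window:]:
        ema = alpha * value + (1 - alpha) * ema
        result.append(ema)
    return result


averages = exp_moving_avg([1, 2, 3, 4, 5, 6], 3)
print(averages)
----

With six inputs and a window of 3, the output has four elements, the first of which lines up with index 2 (`windowSize-1`) of the original array.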
== factorial
The `factorial` function returns the factorial (https://en.wikipedia.org/wiki/Factorial) of its parameter.
=== factorial Parameters
* `integer` : The value to compute the factorial for. The largest supported value of this parameter is 170.
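A plausible reason for the limit of 170, assuming the result is held in a 64-bit double (which this page does not state), is that 171! exceeds the largest finite double while 170! does not. The Python sketch below checks that arithmetic; the helper name is hypothetical.

[source,python]
----
import math
import sys


def factorial_as_double(n):
    """Factorial returned as a float, limited to inputs that fit in a double."""
    if n < 0 or n > 170:
        raise ValueError("n must be between 0 and 170")
    return float(math.factorial(n))


# Python integers are unbounded, so the comparison with the double limit is exact.
print(math.factorial(170) < sys.float_info.max)
print(math.factorial(171) > sys.float_info.max)
----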
The `gt` function will return whether the first parameter is greater than the second parameter. The function accepts numeric or string parameters, but will fail to execute unless both parameters are of the same type. That is, both are String or both are Numeric. If any parameter is null then an error will be raised. Returns a boolean value.
=== gt Parameters
* `Field Name | Raw Value | Evaluator`
* `Field Name | Raw Value | Evaluator`
=== gt Syntax
The expressions below show the various ways in which you can use the `gt` evaluator.
The `gteq` function will return whether the first parameter is greater than or equal to the second parameter. The function accepts numeric or string parameters, but will fail to execute unless both parameters are of the same type. That is, both are String or both are Numeric. If any parameter is null then an error will be raised. Returns a boolean value.
=== gteq Parameters
* `Field Name | Raw Value | Evaluator`
* `Field Name | Raw Value | Evaluator`
=== gteq Syntax
The expressions below show the various ways in which you can use the `gteq` evaluator.
The `if` function works like a standard conditional if/then statement. If the first parameter is true, then the second parameter will be returned, else the third parameter will be returned. The function accepts a boolean as the first parameter and anything as the second and third parameters. An error will occur if the first parameter is not a boolean or is null.
=== if Parameters
* `Field Name | Raw Value | Boolean Evaluator`
* `Field Name | Raw Value | Evaluator`
* `Field Name | Raw Value | Evaluator`
=== if Syntax
The expressions below show the various ways in which you can use the `if` evaluator.
[source,text]
----
if(fieldA,fieldB,fieldC) // if fieldA is true then fieldB else fieldC
if(gt(fieldA,5), fieldA, 5) // if fieldA > 5 then fieldA else 5
if(eq(fieldB,null), null, div(fieldA,fieldB)) // if fieldB is null then null else fieldA / fieldB
----
The `kendallsCorr` function returns the Kendall's Tau-b Rank Correlation (https://en.wikipedia.org/wiki/Kendall_rank_correlation_coefficient) of two numeric arrays.
The `log` function will return the natural log of the provided single parameter. The `log` function will fail to execute if the value is non-numeric. If a null value is found, then null will be returned as the result.
=== log Parameters
* `Field Name | Raw Number | Number Evaluator`
=== log Syntax
The expressions below show the various ways in which you can use the `log` evaluator. Only one parameter is accepted. Returns a numeric value.
The `lt` function will return whether the first parameter is less than the second parameter. The function accepts numeric or string parameters, but will fail to execute unless both parameters are of the same type. That is, both are String or both are Numeric. If any parameter is null then an error will be raised. Returns a boolean value.
=== lt Parameters
* `Field Name | Raw Value | Evaluator`
* `Field Name | Raw Value | Evaluator`
=== lt Syntax
The expressions below show the various ways in which you can use the `lt` evaluator.
The `lteq` function will return whether the first parameter is less than or equal to the second parameter. The function accepts numeric or string parameters, but will fail to execute unless both parameters are of the same type. That is, both are String or both are Numeric. If any parameter is null then an error will be raised. Returns a boolean value.
mod(100,fieldA) // returns the remainder of 100 divided by the value of fieldA.
mod(fieldA,1.4) // returns the remainder of fieldA divided by 1.4.
if(gt(fieldA,fieldB),mod(fieldA,fieldB),mod(fieldB,fieldA)) // if fieldA > fieldB then return the remainder of fieldA/fieldB, else return the remainder of fieldB/fieldA.
The `mult` function will take two or more numeric values and multiply them together. The `mult` function will fail to execute if any of the values are non-numeric. If a null value is found then null will be returned as the result.
The expressions below show the various ways in which you can use the `mult` evaluator. The order of the parameters does not matter, and any number of parameters may be supplied as long as there are at least two. Returns a numeric value.
The `not` function will return the logical NOT of a single boolean parameter. The function will fail to execute if the parameter is non-boolean or null. Returns a boolean value.
The `or` function will return the logical OR of at least 2 boolean parameters. The function will fail to execute if any parameters are non-boolean or null. Returns a boolean value.
The expressions below show the various ways in which you can use the `or` evaluator. At least two parameters are required, but there is no limit to how many you can use.
The `polyFit` function performs polynomial curve fitting (https://en.wikipedia.org/wiki/Curve_fitting#Fitting_lines_and_polynomial_functions_to_data_points).
=== polyFit Parameters
* `numeric array` : (Optional) x values. If omitted, a sequence will be created for the x values.
* `numeric array` : y values
* `integer` : (Optional) polynomial degree. Defaults to 3.
=== polyFit Returns
numeric array : curve that was fit to the data points.
=== polyFit Syntax
[source,text]
polyFit(yValues) // This creates the xValues automatically and fits a curve through the data points using the default 3-degree polynomial.
polyFit(xValues, yValues, 5) // This will fit a curve through the data points using a 5 degree polynomial.
== polyfitDerivative
The `polyfitDerivative` function returns the derivative of the curve created by the polynomial curve fitter.
=== polyfitDerivative Parameters
* `numeric array` : (Optional) x values. If omitted, a sequence will be created for the x values.
* `numeric array` : y values
* `integer` : (Optional) polynomial degree. Defaults to 3.
=== polyfitDerivative Returns
numeric array : The curve for the derivative created by the polynomial curve fitter.
=== polyfitDerivative Syntax
[source,text]
polyfitDerivative(yValues) // This creates the xValues automatically and returns the polyfit derivative.
polyfitDerivative(xValues, yValues, 5) // This will fit a curve through the data points using a 5 degree polynomial and returns the polyfit derivative.
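To show how a fitted curve and its derivative relate, the Python sketch below performs an ordinary least-squares polynomial fit by solving the normal equations, evaluates the fitted curve at the x values, and then differentiates the fitted polynomial. All helper names are hypothetical, the x values are passed explicitly because this page does not say how the automatic sequence is generated, and the numerical method is an assumption rather than a description of the Solr implementation.

[source,python]
----
def poly_fit(x, y, degree=3):
    """Least-squares polynomial coefficients, lowest order first."""
    n = degree + 1
    # Normal equations (V^T V) c = V^T y, where V is the Vandermonde matrix of x.
    a = [[sum(xi ** (r + c) for xi in x) for c in range(n)] for r in range(n)]
    b = [sum(yi * xi ** r for xi, yi in zip(x, y)) for r in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(a[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (b[r] - tail) / a[r][r]
    return coeffs


def evaluate(coeffs, x):
    """Value of the polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def derivative(coeffs):
    """Coefficients of the derivative polynomial, lowest order first."""
    return [i * c for i, c in enumerate(coeffs)][1:]


x_values = [0, 1, 2, 3, 4, 5]
y_values = [v * v for v in x_values]
coefficients = poly_fit(x_values, y_values, 2)
fitted_curve = [evaluate(coefficients, v) for v in x_values]
derivative_curve = [evaluate(derivative(coefficients), v) for v in x_values]
----

In this sketch `fitted_curve` plays the role of the array returned by `polyFit`, and `derivative_curve` plays the role of the array returned by `polyfitDerivative`.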
The `raw` function will return whatever raw value is the parameter. This is useful for cases where you want to use a string as part of another evaluator.
The expressions below show the various ways in which you can use the `raw` evaluator. Whatever is inside will be returned as-is. Internal evaluators are considered strings and are not evaluated.
The `spearmansCorr` function returns the Spearman's Rank Correlation (https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient) of two numeric arrays.
The `sub` function will take 2 or more numeric values and subtract them, from left to right. The `sub` function will fail to execute if any of the values are non-numeric. If a null value is found then null will be returned as the result.
The expressions below show the various ways in which you can use the `sub` evaluator. Any number of parameters may be supplied as long as there are at least two. Returns a numeric value.
The `sumDifference` function calculates the sum of the differences following an element-by-element subtraction between two numeric arrays.
=== sumDifference Parameters
* `numeric array`
* `numeric array`
=== sumDifference Returns
numeric
=== sumDifference Syntax
[source,text]
----
sumDifference(numericArray, numericArray)
----
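As a small illustration of the calculation, the Python sketch below subtracts the second array from the first element by element and sums the results. The helper name is hypothetical, and requiring equal lengths is an assumption of the sketch.

[source,python]
----
def sum_difference(a, b):
    """Sum of the element-by-element differences a[i] - b[i]."""
    if len(a) != len(b):
        raise ValueError("arrays must have the same length")
    return sum(x - y for x, y in zip(a, b))


print(sum_difference([5, 7, 9], [1, 2, 3]))  # (5-1) + (7-2) + (9-3)
----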
== uniformDistribution
The `uniformDistribution` function returns a continuous uniform probability distribution (https://en.wikipedia.org/wiki/Uniform_distribution_(continuous))
based on its parameters. See the `uniformIntegerDistribution` to work with discrete uniform distributions. This function is part of the
probability distribution framework and is designed to work with the `sample` and `cumulativeProbability` functions.
=== uniformDistribution Parameters
* `double` : start
* `double` : end
=== uniformDistribution Returns
probability distribution function
=== uniformDistribution Syntax
[source,text]
uniformDistribution(0.0, 100.0)
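For a continuous uniform distribution, the cumulative probability grows linearly between the start and end values. A minimal Python sketch of that formula, using a hypothetical helper name, follows.

[source,python]
----
def uniform_cdf(start, end, x):
    """P(X <= x) for a continuous uniform distribution on [start, end]."""
    if x <= start:
        return 0.0
    if x >= end:
        return 1.0
    return (x - start) / (end - start)


# Matches the parameters used in the expression above.
print(uniform_cdf(0.0, 100.0, 25.0))
----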
== uniformIntegerDistribution
The `uniformIntegerDistribution` function returns a discrete uniform probability distribution (https://en.wikipedia.org/wiki/Discrete_uniform_distribution)
based on its parameters. See the `uniformDistribution` to work with continuous uniform distributions. This function is part of the
probability distribution framework and is designed to work with the `sample`, `probability` and `cumulativeProbability` functions.
=== uniformIntegerDistribution Parameters
* `integer` : start
* `integer` : end
=== uniformIntegerDistribution Returns
probability distribution function
=== uniformIntegerDistribution Syntax
[source,text]
uniformIntegerDistribution(1, 6)
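A discrete uniform distribution over the integers 1 to 6 behaves like a fair die. The Python sketch below assumes that both the start and end values are inclusive, which this page does not state explicitly; the helper names are hypothetical.

[source,python]
----
import math


def uniform_integer_pmf(start, end, k):
    """P(X == k), assuming both start and end are inclusive."""
    if k != int(k) or k < start or k > end:
        return 0.0
    return 1.0 / (end - start + 1)


def uniform_integer_cdf(start, end, x):
    """P(X <= x), assuming both start and end are inclusive."""
    if x < start:
        return 0.0
    if x >= end:
        return 1.0
    return (math.floor(x) - start + 1) / (end - start + 1)


print(uniform_integer_pmf(1, 6, 3))
print(uniform_integer_cdf(1, 6, 3))
----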
== weibullDistribution
The `weibullDistribution` function returns a Weibull probability distribution (https://en.wikipedia.org/wiki/Weibull_distribution)
based on its parameters. This function is part of the
probability distribution framework and is designed to work with the `sample`, `kolmogorovSmirnov` and `cumulativeProbability` functions.
=== weibullDistribution Parameters
* `double` : shape
* `double` : scale
=== weibullDistribution Returns
probability distribution function
=== weibullDistribution Syntax
[source,text]
weibullDistribution(.5, 10)
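The effect of the shape and scale parameters can be seen in the standard Weibull cumulative distribution function, `1 - exp(-(x / scale) ^ shape)` for non-negative `x`. The Python sketch below evaluates that formula with a hypothetical helper name; it is a cross-check of the mathematics, not a description of the Solr implementation.

[source,python]
----
import math


def weibull_cdf(shape, scale, x):
    """P(X <= x) for a Weibull distribution with the given shape and scale."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))


# Same shape and scale as the expression above, evaluated at x equal to the scale.
print(weibull_cdf(0.5, 10, 10))
----

When `x` equals the scale, the result is `1 - 1/e` regardless of the shape.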
== zipFDistribution
The `zipFDistribution` function returns a ZipF distribution (https://en.wikipedia.org/wiki/Zeta_distribution)
based on its parameters. This function is part of the
probability distribution framework and is designed to work with the `sample`,
`probability` and `cumulativeProbability` functions.