= Stream Evaluator Reference :page-shortname: stream-evaluator-reference :page-permalink: stream-evaluator-reference.html :page-tocclass: right :page-toclevels: 1 // Licensed to the Apache Software Foundation (ASF) under one // or more contributor license agreements. See the NOTICE file // distributed with this work for additional information // regarding copyright ownership. The ASF licenses this file // to you under the Apache License, Version 2.0 (the // "License"); you may not use this file except in compliance // with the License. You may obtain a copy of the License at // // http://www.apache.org/licenses/LICENSE-2.0 // // Unless required by applicable law or agreed to in writing, // software distributed under the License is distributed on an // "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY // KIND, either express or implied. See the License for the // specific language governing permissions and limitations // under the License. Stream evaluators are different then stream sources or stream decorators. Both stream sources and stream decorators return streams of tuples. Stream evaluators are more like a traditional function that evaluates its parameters and returns an result. That result can be a single value, array, map or other structure. Stream evaluators can be nested so that the output of an evaluator becomes the input for another evaluator. Stream evaluators can be called in different contexts. For example a stream evaluator can be called on its own or it can be called within the context of a streaming expression. == abs The `abs` function will return the absolute value of the provided single parameter. The `abs` function will fail to execute if the value is non-numeric. If a null value is found then null will be returned as the result. === abs Parameters * `Field Name | Raw Number | Number Evaluator` === abs Syntax The expressions below show the various ways in which you can use the `abs` evaluator. Only one parameter is accepted. Returns a numeric value. [source,text] ---- abs(1) // 1, not really a good use case for it abs(-1) // 1, not really a good use case for it abs(add(fieldA,fieldB)) // absolute value of fieldA + fieldB abs(fieldA) // absolute value of fieldA ---- == acos The `acos` function returns the trigonometric arccosine of a number. === acos Parameters * `Field Name | Raw Number | Number Evaluator`: The value to return the arccosine of. === acos Syntax [source,text] ---- acos(100.4) // returns the arccosine of 100.4 acos(fieldA) // returns the arccosine for fieldA. if(gt(fieldA,fieldB),sin(fieldA),sin(fieldB)) // if fieldA > fieldB then return the arccosine of fieldA, else return the arccosine of fieldB ---- == add The `add` function will take 2 or more numeric values and add them together. The `add` function will fail to execute if any of the values are non-numeric. If a null value is found then null will be returned as the result. === add Parameters * `Field Name | Raw Number | Number Evaluator` * `Field Name | Raw Number | Number Evaluator` * `......` * `Field Name | Raw Number | Number Evaluator` === add Syntax The expressions below show the various ways in which you can use the `add` evaluator. The number and order of these parameters do not matter and is not limited except that at least two parameters are required. Returns a numeric value. [source,text] ---- add(1,2,3,4) // 1 + 2 + 3 + 4 == 10 add(1,fieldA) // 1 + value of fieldA add(fieldA,1.4) // value of fieldA + 1.4 add(fieldA,fieldB,fieldC) // value of fieldA + value of fieldB + value of fieldC add(fieldA,div(fieldA,fieldB)) // value of fieldA + (value of fieldA / value of fieldB) add(fieldA,if(gt(fieldA,fieldB),fieldA,fieldB)) // if fieldA > fieldB then fieldA + fieldA, else fieldA + fieldB ---- == analyze The `analyze` function analyzes text using a Lucene/Solr analyzer and returns a list of tokens emitted by the analyzer. The `analyze` function can be called on its own or within the `select` and `cartasianProduct` Streaming Expressions. === analyze Parameters * `Field Name` | `Raw Text`: Either the field in a tuple or the raw text to be analyzed. * `Analyzer Field Name`: The field name of the analyzer to use to analyze the text. === analyze Syntax The expressions below show the various ways in which you can use the `analyze` evaluator. * Analyze the raw text: `analyze("hello world", analyzerField)` * Analyze a text field within a `select` expression. This will annotate the tuples with output of the analyzer: `select(expr, analyze(textField, analyzerField) as outField)` * Analyze a text field with a `cartesianProduct` expression. This will stream each token emitted by the analyzer in its own tuple: `cartesianProduct(expr, analyze(textField, analyzer) as outField)` == and The `and` function will return the logical AND of at least 2 boolean parameters. The function will fail to execute if any parameters are non-boolean or null. Returns a boolean value. === and Parameters * `Field Name | Raw Boolean | Boolean Evaluator` * `Field Name | Raw Boolean | Boolean Evaluator` * `......` * `Field Name | Raw Boolean | Boolean Evaluator` === and Syntax The expressions below show the various ways in which you can use the `and` evaluator. At least two parameters are required, but there is no limit to how many you can use. [source,text] ---- and(true,fieldA) // true && fieldA and(fieldA,fieldB) // fieldA && fieldB and(or(fieldA,fieldB),fieldC) // (fieldA || fieldB) && fieldC and(fieldA,fieldB,fieldC,or(fieldD,fieldE),fieldF) ---- == anova The `anova` function calculates the https://en.wikipedia.org/wiki/Analysis_of_variance[analysis of variance] for two or more numeric arrays. === anova Parameters //TODO 7.1 - fill in details of Parameters * `numeric array` ... (two or more) === anova Syntax [source,text] anova(numericArray1, numericArray2) // calculates ANOVA for two numeric arrays anova(numericArray1, numericArray2, numericArray2) // calculates ANOVA for three numeric arrays == array The `array` function returns an array of numerics or other objects including other arrays. === array Parameters //TODO 7.1 - fill in details of Parameters * `numeric` | `array` ... === array Syntax [source,text] array(1, 2, 3) // Array of numerics array(array(1,2,3), array(4,5,6)) // Array of arrays == asin The `asin` function returns the trigonometric arcsine of a number. === asin Parameters * `Field Name | Raw Number | Number Evaluator`: The value to return the arcsine of. === asin Syntax [source,text] ---- asin(100.4) // returns the sine of 100.4 asine(fieldA) // returns the sine for fieldA. if(gt(fieldA,fieldB),asin(fieldA),asin(fieldB)) // if fieldA > fieldB then return the asine of fieldA, else return the asine of fieldB ---- == atan The `atan` function returns the trigonometric arctangent of a number. === atan Parameters * `Field Name | Raw Number | Number Evaluator`: The value to return the arctangent of. === atan Syntax [source,text] ---- atan(100.4) // returns the arctangent of 100.4 atan(fieldA) // returns the arctangent for fieldA. if(gt(fieldA,fieldB),atan(fieldA),atan(fieldB)) // if fieldA > fieldB then return the arctanget of fieldA, else return the arctangent of fieldB ---- == betaDistribution The `betaDistribution` function returns a beta probability distribution (https://en.wikipedia.org/wiki/Beta_distribution) based on its parameters. This function is part of the probability distribution framework and is designed to work with the `sample`, `kolmogorovSmirnov` and `cumulativeProbability` functions. === betaDistribution Parameters * `double` : shape1 * `double` : shape2 === betaDistribution Returns probability distribution function === betaDistribution Syntax [source,text] betaDistribution(1, 5) == binomialCoefficient The `binomialCoefficient` function returns the number of k-element subsets that can be selected from an n-element set (https://en.wikipedia.org/wiki/Binomial_coefficient). === binomialCoefficient Parameters * `integer` : [n] set * `integer` : [k] subset === binomialCoefficient Returns long value : The number of k-element subsets that can be selected from an n-element set. === binomialCoefficient Syntax [source,text] binomialCoefficient(8, 3) // Returns the number of 3 element subsets from an 8 element set. == binomialDistribution The `binomialDistribution` function returns a binomial probability distribution (https://en.wikipedia.org/wiki/Binomial_distribution) based on its parameters. This function is part of the probability distribution framework and is designed to work with the `sample`, `probability` and `cumulativeProbability` functions. === binomialDistribution Parameters * `integer` : number of trials * `double` : probability of success === binomialDistribution Returns probability distribution function === binomialDistribution Syntax [source,text] binomialDistribution(1000, .5) == canberraDistance The `canberraDistance` function calculates the Canberra distance (https://en.wikipedia.org/wiki/Canberra_distance) of two numeric arrays. === canberraDistance Parameters * `numeric array` * `numeric array` === canberraDistance Syntax [source,text] canberraDistance(numericArray1, numuericArray2)) === canberraDistance Returns numeric == cbrt The `cbrt` function returns the trigonometric cube root of a number. === cbrt Parameters * `Field Name | Raw Number | Number Evaluator`: The value to return the cube root of. === cbrt Syntax [source,text] ---- cbrt(100.4) // returns the square root of 100.4 cbrt(fieldA) // returns the square root for fieldA. if(gt(fieldA,fieldB),cbrt(fieldA),cbrt(fieldB)) // if fieldA > fieldB then return the cbrt of fieldA, else return the cbrt of fieldB ---- == ceil The `ceil` function rounds a decimal value to the next highest whole number. === ceil Parameters * `Field Name | Raw Number | Number Evaluator`: The decimal to round up. === ceil Syntax The expressions below show the various ways in which you can use the `ceil` evaluator. [source,text] ---- ceil(100.4) // returns 101. ceil(fieldA) // returns the next highest whole number for fieldA. if(gt(fieldA,fieldB),ceil(fieldA),ceil(fieldB)) // if fieldA > fieldB then return the ceil of fieldA, else return the ceil of fieldB. ---- == chebyshevDistance The `chebyshevDistance` function calculates the Chebyshev distance (https://en.wikipedia.org/wiki/Chebyshev_distance) of two numeric arrays. === chebyshevDistance Parameters * `numeric array` * `numeric array` === chebyshevDistance Syntax [source,text] chebyshevDistance(numericArray1, numuericArray2)) === chebyshevDistance Returns numeric == col The `col` function returns a numeric array from a list of Tuples. The `col` function is used to create numeric arrays from stream sources. === col Parameters //TODO 7.1 - fill in details of Parameters * `list of Tuples` * `field name`: The field to create the array from. === col Syntax [source,text] col(tupleList, fieldName) == constantDistribution The `constantDistribution` function returns a constant probability distribution based on its parameter. This function is part of the probability distribution framework and is designed to work with the `sample` and `cumulativeProbability` functions. When sampled the constant distribution always returns its constant value. === constantDistribution Parameters * `double` : constant value === constantDistribution Returns probability distribution function === constantDistribution Syntax [source,text] constantDistribution(constantValue) == conv The `conv` function returns the https://en.wikipedia.org/wiki/Convolution[convolution] of two numeric arrays. === conv Parameters * `numeric array` * `numeric array` === conv Syntax [source,text] conv(numericArray1, numericArray2) == copyOf The `copyOf` function creates a copy of a numeric array. === copyOf Parameters //TODO 7.1 - fill in details of Parameters * `numeric array` * `length`: The length of the copied array. The returned array will be right padded with zeros if the length parameter exceeds the size of the original array. === copyOf Syntax [source,text] copyOf(numericArray, length) == copyOfRange The `copyOfRange` function creates a copy of a range of a numeric array. === copyOfRange Parameters //TODO 7.1 - fill in details of Parameters * `numeric array` * `start index` * `end index` === copyOfRange Syntax [source,text] copyOfRange(numericArray, startIndex, endIndex) == corr The `corr` function returns the Pearson Product Moment Correlation of two numeric arrays. === corr Parameters //TODO 7.1 - fill in details of Parameters * `numeric array` * `numeric array` === corr Returns double between -1 and 1 === corr Synax [source,text] corr(numericArray1, numericArray2) == cos The `cos` function returns the trigonometric cosine of a number. === cos Parameters * `Field Name | Raw Number | Number Evaluator`: The value to return the hyperbolic cosine of. === cos Syntax [source,text] ---- cos(100.4) // returns the arccosine of 100.4 cos(fieldA) // returns the arccosine for fieldA. if(gt(fieldA,fieldB),cos(fieldA),cos(fieldB)) // if fieldA > fieldB then return the arccosine of fieldA, else return the cosine of fieldB ---- == cosineSimilarity The `cosineSimilarity` function returns the cosine similarity (https://en.wikipedia.org/wiki/Cosine_similarity) of two numeric arrays. === cosineSimilarity Parameters * `numeric array` * `numeric array` === cosineSimilarity Syntax [source,text] ---- cosineSimilarity(numericArray, numericArray) ---- === cosineSimilarity Returns numeric == cov The `cov` function returns the covariance of two numeric arrays. === cov Parameters //TODO 7.1 - fill in details of Parameters * `numeric array` * `numeric array` === cov Syntax [source,text] cov(numericArray, numericArray) == cumulativeProbability The `cumulativeProbability` function returns the cumulative probability of a random variable within a probability distribution. The cumulative probability is the total probability of all random variables less then or equal to a random variable. === cumulativeProbability Parameters * `probability distribution` * `number` : Value to compute the probability for. === cumulativeProbability Returns double : the cumulative probability === cumulativeProbability Syntax [source,text] cumulativeProbability(normalDistribution(500, 25), 502) // Returns the cumulative probability of the random sample 502 in a normal distribution with a mean of 500 and standard deviation of 25. == describe The `describe` function returns a tuple containing the descriptive statistics for an array. === describe Parameters * `numeric array` === describe Syntax [source,text] describe(numericArray) == distance The `distance` function calculates the Euclidian distance of two numeric arrays. === distance Parameters * `numeric array` * `numeric array` === distance Syntax [source,text] distance(numericArray1, numuericArray2)) == div The `div` function will take two numeric values and divide them. The function will fail to execute if any of the values are non-numeric or null, or the 2nd value is 0. Returns a numeric value. === div Parameters * `Field Name | Raw Number | Number Evaluator` * `Field Name | Raw Number | Number Evaluator` === div Syntax The expressions below show the various ways in which you can use the `div` evaluator. The first value will be divided by the second and as such the second cannot be 0. [source,text] ---- div(1,2) // 1 / 2 div(1,fieldA) // 1 / fieldA div(fieldA,1.4) // fieldA / 1.4 div(fieldA,add(fieldA,fieldB)) // fieldA / (fieldA + fieldB) ---- == dotProduct The `dotProduct` function returns the dotproduct (https://en.wikipedia.org/wiki/Dot_product) of a numeric array. === dotProduct Parameters * `numeric array` === dotProduct Syntax [source,text] dotProduct(numericArray) === dotProduct Returns number == earthMoversDistance The `earthMoversDistance` function calculates the Earth Movers distance (https://en.wikipedia.org/wiki/Earth_mover%27s_distance) of two numeric arrays. === earthMoversDistance Parameters * `numeric array` * `numeric array` === earthMoversDistance Syntax [source,text] earthMoversDistance(numericArray1, numuericArray2)) === earthMoversDistance Returns numeric == ebeAdd The `ebeAdd` function performs an element-by-element addition of two numeric arrays. === ebeAdd Parameters * `numeric array` * `numeric array` === ebeAdd Syntax [source,text] ebeAdd(numericArray, numericArray) === ebeAdd Returns numeric array == ebeDivide The `ebeDivide` function performs an element-by-element division of two numeric arrays. === ebeDivide Parameters * `numeric array` * `numeric array` === ebeDivide Syntax [source,text] ebeDivide(numericArray, numericArray) === ebeDivide Returns numeric array == ebeMultiple The `ebeMultiply` function performs an element-by-element multiplication of two numeric arrays. === ebeMultiply Parameters * `numeric array` * `numeric array` === ebeMultiply Syntax [source,text] ebeMultiply(numericArray, numericArray) === ebeMultiply Returns numeric array == ebeSubtract The `ebeSubtract` function performs an element-by-element subtraction of two numeric arrays. === ebeSubtract Parameters * `numeric array` * `numeric array` === ebeSubtract Syntax [source,text] ebeSubtract(numericArray, numericArray) === ebeSubtract Returns numeric array == empiricalDistribution The `empiricalDistribution` function returns a continuous probability distribution function based on an actual data set (https://en.wikipedia.org/wiki/Empirical_distribution_function). This function is part of the probability distribution framework and is designed to work with the `sample`, `kolmogorovSmirnov` and `cumulativeProbability` functions. This function is designed to work with continuous data. To build a distribution from a discrete data set use the `enumeratedDistribution`. === empiricalDistribution Parameters * `numeric array` : empirical observations === empiricalDistribution Returns probability distribution function === empiricalDistribution Syntax empiricalDistribution(numericArray) == enumeratedDistribution The `enumeratedDistribution` function returns a discrete probability distribution function based on an actual data set or a pre-defined set of data and probabilities. This function is part of the probability distribution framework and is designed to work with the `sample`, `probability` and `cumulativeProbability` functions. The enumeratedDistribution can be called in two different scenarios: 1) Single array of discrete values. This works like an empirical distribution for discrete data. 2) An array of singleton discrete values and an array of double values representing the probabilities of the discrete values. This function is designed to work with discrete data. To build a distribution from a continuous data set use the `empiricalDistribution`. === enumeratedDistribution Parameters * `integer array` : discrete observations or singleton discrete values. * `double array` : (Optional) values representing the probabilities of the singleton discrete values. === enumeratedDistribution Returns probability distribution function === enumeratedDistribution Syntax [source,text] enumeratedDistribution(integerArray) // This creates an enumerated distribution from the observations in the numeric array. enumeratedDistribution(array(1,2,3,4), array(.25,.25,.25,.25)) // This creates an enumerated distribution with four discrete values (1,2,3,4) each with a probability of .25. == eor The `eor` function will return the logical exclusive or of at least two boolean parameters. The function will fail to execute if any parameters are non-boolean or null. Returns a boolean value. === eor Parameters * `Field Name | Raw Boolean | Boolean Evaluator` * `Field Name | Raw Boolean | Boolean Evaluator` * `......` * `Field Name | Raw Boolean | Boolean Evaluator` === eor Syntax The expressions below show the various ways in which you can use the `eor` evaluator. At least two parameters are required, but there is no limit to how many you can use. [source,text] ---- eor(true,fieldA) // true iff fieldA is false eor(fieldA,fieldB) // true iff either fieldA or fieldB is true but not both eor(eq(fieldA,fieldB),eq(fieldC,fieldD)) // true iff either fieldA == fieldB or fieldC == fieldD but not both ---- == eq The `eq` function will return whether all the parameters are equal, as per Java's standard `equals(...)` function. The function accepts parameters of any type, but will fail to execute if all the parameters are not of the same type. That is, all are Boolean, all are String, all are Numeric. If any any parameters are null and there is at least one parameter that is not null then false will be returned. Returns a boolean value. === eq Parameters * `Field Name | Raw Value | Evaluator` * `Field Name | Raw Value | Evaluator` * `......` * `Field Name | Raw Value | Evaluator` === eq Syntax The expressions below show the various ways in which you can use the `eq` evaluator. [source,text] ---- eq(1,2) // 1 == 2 eq(1,fieldA) // 1 == fieldA eq(fieldA,val(foo)) fieldA == "foo" eq(add(fieldA,fieldB),6) // fieldA + fieldB == 6 ---- == expMovingAge The `expMovingAverage` function computes an exponential moving average (https://en.wikipedia.org/wiki/Moving_average#Exponential_moving_average) for a numeric array. === expMovingAge Parameters * `numeric array` : The array to compute the exponential moving average from. * `integer`: window size === expMovingAvg Returns numeric array : (The first element of the returned array will start from the windowSize-1 index of the original array) === expMovingAvg Syntax [source,text] ---- expMovingAvg(numericArray, 5) //Computes an exponential moving average with a window size of 5. ---- == factorial The `factorial` function returns the factorial (https://en.wikipedia.org/wiki/Factorial) of its parameter. === factorial Parameters * `integer` : The value to compute the factorial for. The largest supported value of this parameter is 170. === factorial Returns double === factorial Syntax [source,text] ---- factorial(100) //Computes the factorial of 100 ---- == finddelay The `finddelay` function performs a cross-correlation between two numeric arrays and returns the delay. === finddelay Parameters * `numeric array` * `numeric array` === finddelay Syntax [source,text] finddelay(numericArray1, numericArray2) == floor The `floor` function rounds a decimal value to the next lowest whole number. === floor Parameters * `Field Name | Raw Number | Number Evaluator`: The decimal to round down. === floor Syntax The expressions below show the various ways in which you can use the `floor` evaluator. [source,text] ---- floor(100.4) // returns 100. ceil(fieldA) // returns the next lowestt whole number for fieldA. if(gt(fieldA,fieldB),floor(fieldA),floor(fieldB)) // if fieldA > fieldB then return the floor of fieldA, else return the floor of fieldB. ---- == freqTable The `freqTable` function returns a frequency distribution (https://en.wikipedia.org/wiki/Frequency_distribution) from an array of discrete values. This function is designed to work with discrete values. To work with continuous data use the `hist` function. === freqTable Parameters * `integer array` : The values to build the frequency distribution from. === freqTable Returns A list of tuples containing the frequency information for each discrete value. === freqTable Syntax [source,text] ---- freqTable(integerArray) ---- == gammaDistribution The `gammaDistribution` function returns a gamma probability distribution (https://en.wikipedia.org/wiki/Gamma_distribution) based on its parameters. This function is part of the probability distribution framework and is designed to work with the `sample`, `kolmogorovSmirnov` and `cumulativeProbability` functions. === gammaDistribution Parameters * `double` : shape * `double` : scale === gammaDistribution Returns probability distribution function === gammaDistribution Syntax [source,text] gammaDistribution(1, 10) == gt The `gt` function will return whether the first parameter is greater than the second parameter. The function accepts numeric or string parameters, but will fail to execute if all the parameters are not of the same type. That is, all are String or all are Numeric. If any any parameters are null then an error will be raised. Returns a boolean value. === gt Parameters * `Field Name | Raw Value | Evaluator` * `Field Name | Raw Value | Evaluator` === gt Syntax The expressions below show the various ways in which you can use the `gt` evaluator. [source,text] ---- gt(1,2) // 1 > 2 gt(1,fieldA) // 1 > fieldA gt(fieldA,val(foo)) // fieldA > "foo" gt(add(fieldA,fieldB),6) // fieldA + fieldB > 6 ---- == gteq The `gteq` function will return whether the first parameter is greater than or equal to the second parameter. The function accepts numeric and string parameters, but will fail to execute if all the parameters are not of the same type. That is, all are String or all are Numeric. If any any parameters are null then an error will be raised. Returns a boolean value. === gteq Parameters * `Field Name | Raw Value | Evaluator` * `Field Name | Raw Value | Evaluator` === gteq Syntax The expressions below show the various ways in which you can use the `gteq` evaluator. [source,text] ---- gteq(1,2) // 1 >= 2 gteq(1,fieldA) // 1 >= fieldA gteq(fieldA,val(foo)) fieldA >= "foo" gteq(add(fieldA,fieldB),6) // fieldA + fieldB >= 6 ---- == hist The `hist` function creates a histogram from a numeric array. The hist function is designed to work with continuous variables. === hist Parameters //TODO 7.1 - fill in details of Parameters * `numeric array` * `bins`: The number of bins in the histogram. Each returned tuple contains summary statistics for the observations that were within the bin. === hist Sytnax [source,text] hist(numericArray, bins) == hsin The `hsin` function returns the trigonometric hyperbolic sine of a number. === hsin Parameters * `Field Name | Raw Number | Number Evaluator`: The value to return the hyperbolic sine of. === hsin Syntax [source,text] ---- hsin(100.4) // returns the hsine of 100.4 hsin(fieldA) // returns the hsine for fieldA. if(gt(fieldA,fieldB),sin(fieldA),sin(fieldB)) // if fieldA > fieldB then return the hsine of fieldA, else return the hsine of fieldB ---- == if The `if` function works like a standard conditional if/then statement. If the first parameter is true, then the second parameter will be returned, else the third parameter will be returned. The function accepts a boolean as the first parameter and anything as the second and third parameters. An error will occur if the first parameter is not a boolean or is null. === if Parameters * `Field Name | Raw Value | Boolean Evaluator` * `Field Name | Raw Value | Evaluator` * `Field Name | Raw Value | Evaluator` === if Syntax The expressions below show the various ways in which you can use the `if` evaluator. [source,text] ---- if(fieldA,fieldB,fieldC) // if fieldA is true then fieldB else fieldC if(gt(fieldA,5), fieldA, 5) // if fieldA > 5 then fieldA else 5 if(eq(fieldB,null), null, div(fieldA,fieldB)) // if fieldB is null then null else fieldA / fieldB ---- == kendallsCorr The `kendallsCorr` function returns the Kendall's Tau-b Rank Correlation (https://en.wikipedia.org/wiki/Kendall_rank_correlation_coefficient) of two numeric arrays. === kendallsCorr Parameters * `numeric array` * `numeric array` === kendalsCorr Returns double between -1 and 1 === kendalsCorr Synax [source,text] kendalsCorr(numericArray1, numericArray2) == length The `length` function returns the length of a numeric array. === length Parameters //TODO 7.1 - fill in details of Parameters * `numeric array` === length Syntax [source,text] length(numericArray) == log The `log` function will return the natural log of the provided single parameter. The `log` function will fail to execute if the value is non-numeric. If a null value is found, then null will be returned as the result. === log Parameters * `Field Name | Raw Number | Number Evaluator` === log Syntax The expressions below show the various ways in which you can use the `log` evaluator. Only one parameter is accepted. Returns a numeric value. [source,text] ---- log(100) log(add(fieldA,fieldB)) log(fieldA) ---- == logNormalDistribution The `logNormalDistribution` function returns a log normal probability distribution (https://en.wikipedia.org/wiki/Log-normal_distribution) based on its parameters. This function is part of the probability distribution framework and is designed to work with the `sample`, `kolmogorovSmirnov` and `cumulativeProbability` functions. === logNormalDistribution Parameters * `double` : shape * `double` : scale === logNormalDistribution Returns probability distribution function === logNormalDistribution Syntax [source,text] logNormalDistribution(.3, .0) == kolmogorovSmirnov The `kolmogorovSmirnov` function performs a Kolmogorov Smirnov test (https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test), between a reference continuous probability distribution and a sample set. The supported distribution functions are: (empiricalDistribution, normalDistribution, logNormalDistribution, weibullDistribution, gammaDistribution, betaDistribution) === kolmogorovSmirnov Parameters * `continuous probability distribution` : Reference distribution * `numeric array` : sample set === kolmogorovSmirnov Returns result tuple : A tuple containing the p-value and d-statistic for test result. === kolmogorovSmirnov Syntax [source,text] kolmogorovSmirnov(normalDistribution(10, 2), sampleSet) == lt The `lt` function will return whether the first parameter is less than the second parameter. The function accepts numeric or string parameters, but will fail to execute if all the parameters are not of the same type. That is, all are String or all are Numeric. If any any parameters are null then an error will be raised. Returns a boolean value. === lt Parameters * `Field Name | Raw Value | Evaluator` * `Field Name | Raw Value | Evaluator` === lt Syntax The expressions below show the various ways in which you can use the `lt` evaluator. [source,text] ---- lt(1,2) // 1 < 2 lt(1,fieldA) // 1 < fieldA lt(fieldA,val(foo)) fieldA < "foo" lt(add(fieldA,fieldB),6) // fieldA + fieldB < 6 ---- == lteq The `lteq` function will return whether the first parameter is less than or equal to the second parameter. The function accepts numeric and string parameters, but will fail to execute if all the parameters are not of the same type. That is, all are String or all are Numeric. If any any parameters are null then an error will be raised. Returns a boolean value. === lteq Parameters * `Field Name | Raw Value | Evaluator` * `Field Name | Raw Value | Evaluator` === lteq Syntax The expressions below show the various ways in which you can use the `lteq` evaluator. [source,text] ---- lteq(1,2) // 1 <= 2 lteq(1,fieldA) // 1 <= fieldA lteq(fieldA,val(foo)) fieldA <= "foo" lteq(add(fieldA,fieldB),6) // fieldA + fieldB <= 6 ---- == manhattanDistance The `manhattanDistance` function calculates the Manhattan distance (https://en.wiktionary.org/wiki/Manhattan_distance) of two numeric arrays. === manhattanDistance Parameters * `numeric array` * `numeric array` === manhattanDistance Syntax [source,text] manhattanDistance(numericArray1, numuericArray2)) === manhattanDistance Returns numeric == meanDifference The `meanDifference` function calculates the mean of the differences following the element-by-element subtraction between two numeric arrays. === meanDifference Parameters * `numeric array` * `numeric array` === meanDifference Returns numeric === meanDifference Syntax [source,text] ---- meanDifference(numericArray, numericArray) ---- == mod The `mod` function returns the remainder (modulo) of the first parameter divided by the second parameter. === mod Parameters * `Field Name | Raw Number | Number Evaluator`: Parameter 1 * `Field Name | Raw Number | Number Evaluator`: Parameter 2 === mod Syntax The expressions below show the various ways in which you can use the `mod` evaluator. [source,text] ---- mod(100,3) // returns the remainder of 100 / 3 . mod(100,fieldA) // returns the remainder of 100 divided by the value of fieldA. mod(fieldA,1.4) // returns the remainder of fieldA divided by 1.4. if(gt(fieldA,fieldB),mod(fieldA,fieldB),mod(fieldB,fieldA)) // if fieldA > fieldB then return the remainder of fieldA/fieldB, else return the remainder of fieldB/fieldA. ---- == monteCarlo The `monteCarlo` function performs a Monte Carlo simulation (https://en.wikipedia.org/wiki/Monte_Carlo_method) based on its parameters. The monteCarlo function runs another function a set number of times and returns the results. The function being run typically has one or more variables that are drawn from probability distributions on each run. The `sample` function is used in the function to draw the samples. The simulation's result array can then be treated as an empirical distribution to understand the probabilities of the simulation results. === monteCarlo Parameters * `numeric function` : The function being run by the simulation, which must return a numeric value. * `integer` : The number of times to run the function. === monteCarlo Returns numeric array: The results of simulation runs. === monteCarlo Syntax [source,text] let(a=uniformIntegerDistribution(1, 6), b=uniformIntegerDistribution(1, 6), c=monteCarlo(add(sample(a), sample(b)), 1000)) In the expression above the monteCarlo function is running the function `add(sample(a), sample(b))` 1000 times and returning the result. Each time the function is run samples are drawn from the probability distributions stored in variables `a` and `b`. == movingAvg The `movingAvg` function calculates a https://en.wikipedia.org/wiki/Moving_average[moving average] over an array of numbers. === movingAvg Parameters * `numeric array` * `window size` === movingAvg Returns numeric array (The first element of the returned array will start from the windowSize-1 index of the original array) === movingAvg Syntax [source,text] movingAverage(numericArray, 30) == movingMedian The `movingMedian` function calculates a moving median over an array of numbers. === movingMedian Parameters * `numeric array` * `window size` === movingMedian Syntax [source,text] movingMedian(numericArray, 30) === movingMedian Returns numeric array (The first element of the returned array will start from the windowSize-1 index of the original array) == mult The `mult` function will take two or more numeric values and multiply them together. The `mult` function will fail to execute if any of the values are non-numeric. If a null value is found then null will be returned as the result. === mult Parameters * `Field Name | Raw Number | Number Evaluator` * `Field Name | Raw Number | Number Evaluator` * `......` * `Field Name | Raw Number | Number Evaluator` === mult Syntax The expressions below show the various ways in which you can use the `mult` evaluator. The number and order of these parameters do not matter and is not limited except that at least two parameters are required. Returns a numeric value. [source,text] ---- mult(1,2,3,4) // 1 * 2 * 3 * 4 mult(1,fieldA) // 1 * value of fieldA mult(fieldA,1.4) // value of fieldA * 1.4 mult(fieldA,fieldB,fieldC) // value of fieldA * value of fieldB * value of fieldC mult(fieldA,div(fieldA,fieldB)) // value of fieldA * (value of fieldA / value of fieldB) mult(fieldA,if(gt(fieldA,fieldB),fieldA,fieldB)) // if fieldA > fieldB then fieldA * fieldA, else fieldA * fieldB ---- == normalDistribution The `normalDistribution` function returns a normal probability distribution (https://en.wikipedia.org/wiki/Normal_distribution) based on its parameters. This function is part of the probability distribution framework and is designed to work with the `sample`, `kolmogorovSmirnov` and `cumulativeProbability` functions. === normalDistribution Parameters * `double` : mean * `double` : standard deviation === normalDistribution Returns probability distribution function === normalDistribution Syntax [source,text] normalDistribution(mean, stddev) == normalize The `normalize` function normalizes a numeric array so that values within the array have a mean of 0 and standard deviation of 1. === normalize Parameters * `numeric array` === normalize Syntax [source,text] normalize(numericArray) == not The `not` function will return the logical NOT of a single boolean parameter. The function will fail to execute if the parameter is non-boolean or null. Returns a boolean value. === not Parameters * `Field Name | Raw Boolean | Boolean Evaluator` === not Syntax The expressions below show the various ways in which you can use the `not` evaluator. Only one parameter is allowed. [source,text] ---- not(true) // false not(fieldA) // true if fieldA is false else false not(eq(fieldA,fieldB)) // true if fieldA != fieldB ---- == or The `or` function will return the logical OR of at least 2 boolean parameters. The function will fail to execute if any parameters are non-boolean or null. Returns a boolean value. === or Parameters * `Field Name | Raw Boolean | Boolean Evaluator` * `Field Name | Raw Boolean | Boolean Evaluator` * `......` * `Field Name | Raw Boolean | Boolean Evaluator` === or Syntax The expressions below show the various ways in which you can use the `or` evaluator. At least two parameters are required, but there is no limit to how many you can use. [source,text] ---- or(true,fieldA) // true || fieldA or(fieldA,fieldB) // fieldA || fieldB or(and(fieldA,fieldB),fieldC) // (fieldA && fieldB) || fieldC or(fieldA,fieldB,fieldC,and(fieldD,fieldE),fieldF) ---- == poissonDistribution The `poissonDistribution` function returns a poisson probability distribution (https://en.wikipedia.org/wiki/Poisson_distribution) based on its parameters. This function is part of the probability distribution framework and is designed to work with the `sample`, `probability` and `cumulativeProbability` functions. === poissonDistribution Parameters * `double` : mean === poissonDistribution Returns probability distribution function === poissonDistribution Syntax [source,text] poissonDistribution(mean) == polyFit The `polyFit` function performs polynomial curve fitting (https://en.wikipedia.org/wiki/Curve_fitting#Fitting_lines_and_polynomial_functions_to_data_points). === polyFit Parameters * `numeric array` : (Optional) x values. If omitted an sequence will be created for the x values. * `numeric array` : y values * `integer` : (Optional) polynomial degree. Defaults to 3. === polyFit Returns numeric array : curve that was fit to the data points. === polyFit Syntax [source,text] polyFit(yValues) // This creates the xValues automatically and fits a curve through the data points using a the default 3 degree polynomial. polyFit(xValues, yValues, 5) // This will fit a curve through the data points using a 5 degree polynomial. == polyfitDerivative The `polyfitDerivative` function returns the derivative of the curve created by the polynomial curve fitter. === polyfitDerivative Parameters * `numeric array` : (Optional) x values. If omitted an sequence will be created for the x values. * `numeric array` : y values * `integer` : (Optional) polynomial degree. Defaults to 3. === polyfitDerivative Returns numeric array : The curve for the derivative created by the polynomial curve fitter. === polyfitDerivative Syntax [source,text] polyfitDerivative(yValues) // This creates the xValues automatically and returns the polyfit derivative polyfitDerivative(xValues, yValues, 5) // This will fit a curve through the data points using a 5 degree polynomial and returns the polyfit derivative. == pow The `pow` function returns the value of its first parameter raised to the power of its second parameter. === pow Parameters * `Field Name | Raw Number | Number Evaluator`: Parameter 1 * `Field Name | Raw Number | Number Evaluator`: Parameter 2 === pow Syntax The expressions below show the various ways in which you can use the `pow` evaluator. [source,text] ---- pow(2,3) // returns 2 raised to the 3rd power. pow(4,fieldA) // returns 4 raised by the value of fieldA. pow(fieldA,1.4) // returns the value of fieldA raised by 1.4. if(gt(fieldA,fieldB),pow(fieldA,fieldB),pow(fieldB,fieldA)) // if fieldA > fieldB then raise fieldA by fieldB, else raise fieldB by fieldA. ---- == predict The `predict` function predicts the value of an dependent variable based on the output of the regress function. === predict Parameters //TODO 7.1 - fill in details of Parameters * `regress output` * `numeric predictor` === predict Syntax [source,text] predict(regressOutput, predictor) == primes The `primes` function returns an array of prime numbers starting from a specified number. === primes Parameters * `integer`: The number of primes to return in the list * `integer`: The starting point for returning the primes === primes Syntax [source,text] ---- primes(100, 2000) // returns 100 primes starting from 2000 ---- === primes Returns numeric array == probability The `probability` function returns the probability of encountering a random variable within a discrete probability distribution. === probability Parameters * `discrete probability distribution` : poissonDistribution | binomialDistribution | uniformDistribution | enumeratedDistribution * `integer` : Value to compute the probability for. === probability Returns double : the probability === probability Syntax [source,text] probability(poissonDistribution(10), 7) // Returns the probability of encountering a random sample if 7 in a poisson distribution with a mean of 10. == rank The `rank` performs a rank transformation on a numeric array. === rank Parameters //TODO 7.1 - fill in details of Parameters * `numeric array` === rank Syntax [source,text] rank(numericArray) == raw The `raw` function will return whatever raw value is the parameter. This is useful for cases where you want to use a string as part of another evaluator. === raw Parameters * `Raw Value` === raw Syntax The expressions below show the various ways in which you can use the `raw` evaluator. Whatever is inside will be returned as-is. Internal evaluators are considered strings and are not evaluated. [source,text] ---- raw(foo) // "foo" raw(count(*)) // "count(*)" raw(45) // 45 raw(true) // "true" (note: this returns the string "true" and not the boolean true) eq(raw(fieldA), fieldA) // true if the value of fieldA equals the string "fieldA" ---- == regress The `regress` function performs a simple regression on two numeric arrays. The result of this expression is also used by the `predict` and `residuals` functions. === regress Parameters //TODO 7.1 - fill in details of Parameters * `numeric array` * `numeric array` === regress Syntax [source,text] regress(numericArray1, numericArray2) == residuals The `residuals` function takes three parameters: a simple regression model, an array of predictor values and an array of actual values. The residuals function applies the simple regression model to the array of predictor values and computes a predictions array. The actual values array is then subtracted from the predictions array to compute the residuals array. === residuals Parameters * `regress output` * `numeric array`: The array of predictor values * `numeric array`: The array of actual values === residuals Syntax [source,text] residuals(regressOutput, numericArray, numericArray) === residuals Returns numeric array of residuals == rev The `rev` function reverses the order of a numeric array. === rev Parameters * `numeric array` === rev Syntax [source,text] rev(numericArray) == round The `round` function returns the closest whole number to the argument === round Parameters * `Field Name | Raw Number | Number Evaluator`: The value to return the square root of. === round Syntax [source,text] ---- round(100.4) round(fieldA) if(gt(fieldA,fieldB),sqrt(fieldA),sqrt(fieldB)) // if fieldA > fieldB then return the round of fieldA, else return the round of fieldB ---- == sample The `sample` function can be used to draw random samples from a probability distribution. === sample Parameters * `probability distribution`: The distribution to sample. * `integer`: (Optional) Sample size. Defaults to 1. === sample Returns Either a single numeric random sample, or a numeric array depending on the sample size parameter. === sample function Syntax [source,text] sample(normalDistribution(50, 5)) // Return a single random sample from a normalDistribution with mean of 50 and standard deviation of 5. sample(poissonDistribution(5), 1000) // Return 1000 random samples from poissonDistribution with a mean of 5. == scale The `scale` function multiplies all the elements of an array by a number. === scale Parameters //TODO 7.1 - fill in details of Parameters * `number` * `numeric array` === scale Syntax [source,text] scale(number, numericArray) == sequence The `sequence` function returns an array of numbers based on its parameters. === sequence Parameters //TODO 7.1 - fill in details of Parameters * `length` * `start` * `stride` === sequence Syntax [source,text] sequence(100, 0, 1) // Returns a sequence of length 100, starting from 0 with a stride of 1. == sin The `sin` function returns the trigonometric sine of a number. === sin Parameters * `Field Name | Raw Number | Number Evaluator`: The value to return the sine of. === sin Syntax [source,text] ---- sin(100.4) // returns the sine of 100.4 sine(fieldA) // returns the sine for fieldA. if(gt(fieldA,fieldB),sin(fieldA),sin(fieldB)) // if fieldA > fieldB then return the sine of fieldA, else return the sine of fieldB ---- == spearmansCorr The `spearmansCorr` function returns the Spearmans Rank Correlation (https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient) of two numeric arrays. === spearmansCorr Parameters * `numeric array` * `numeric array` === spearmansCorr Returns double between -1 and 1 === spearmansCorr Synax [source,text] spearmansCorr(numericArray1, numericArray2) == sqrt The `sqrt` function returns the trigonometric square root of a number. === sqrt Parameters * `Field Name | Raw Number | Number Evaluator`: The value to return the square root of. === sqrt Syntax [source,text] ---- sqrt(100.4) // returns the square root of 100.4 sqrt(fieldA) // returns the square root for fieldA. if(gt(fieldA,fieldB),sqrt(fieldA),sqrt(fieldB)) // if fieldA > fieldB then return the sqrt of fieldA, else return the sqrt of fieldB ---- == sub The `sub` function will take 2 or more numeric values and subtract them, from left to right. The sub function will fail to execute if any of the values are non-numeric. If a null value is found then null will be returned as the result. === sub Parameters * `Field Name | Raw Number | Number Evaluator` * `Field Name | Raw Number | Number Evaluator` * `......` * `Field Name | Raw Number | Number Evaluator` === sub Syntax The expressions below show the various ways in which you can use the `sub` evaluator. The number of these parameters does not matter and is not limited except that at least two parameters are required. Returns a numeric value. [source,text] ---- sub(1,2,3,4) // 1 - 2 - 3 - 4 sub(1,fieldA) // 1 - value of fieldA sub(fieldA,1.4) // value of fieldA - 1.4 sub(fieldA,fieldB,fieldC) // value of fieldA - value of fieldB - value of fieldC sub(fieldA,div(fieldA,fieldB)) // value of fieldA - (value of fieldA / value of fieldB) if(gt(fieldA,fieldB),sub(fieldA,fieldB),sub(fieldB,fieldA)) // if fieldA > fieldB then fieldA - fieldB, else fieldB - field ---- == sumDifference The `sumDifference` function calculates the sum of the differences following an element-by-element subtraction between two numeric arrays. === sumDifference Parameters * `numeric array` * `numeric array` === sumDifference Returns numeric === sumDifference Syntax [source,text] ---- sumDifference(numericArray, numericArray) ---- == uniformDistribution The `uniformDistribution` function returns a continuous uniform probability distribution (https://en.wikipedia.org/wiki/Uniform_distribution_(continuous)) based on its parameters. See the `uniformIntegerDistribution` to work with discrete uniform distributions. This function is part of the probability distribution framework and is designed to work with the `sample` and `cumulativeProbability` functions. === uniforDistribution Parameters * `double` : start * `double` : end === uniformDistribution Returns probability distribution function === uniformDistribution Syntax [source,text] uniformDistribution(0.0, 100.0) == uniformIntegerDistribution The `uniformIntegerDistribution` function returns a discrete uniform probability distribution (https://en.wikipedia.org/wiki/Discrete_uniform_distribution) based on its parameters. See the `uniformDistribution` to work with continuous uniform distributions. This function is part of the probability distribution framework and is designed to work with the `sample`, `probability` and `cumulativeProbability` functions. === uniformIntegerDistribution Parameters * `integer` : start * `integer` : end === uniformIntegerDistribution Returns probability distribution function === uniformIntegerDistribution Syntax [source,text] uniformDistribution(1, 6) == weibullDistribution The `weibullDistribution` function returns a Weibull probability distribution (https://en.wikipedia.org/wiki/Weibull_distribution) based on its parameters. This function is part of the probability distribution framework and is designed to work with the `sample`, `kolmogorovSmirnov` and `cumulativeProbability` functions. === weibullDistribution Parameters * `double` : shape * `double` : scale === weibullDistribution Returns probability distribution function === weibullDistribution Syntax [source,text] weibullDistribution(.5, 10) == zipFDistribution The `zipFDistribution` function returns a ZipF distribution (https://en.wikipedia.org/wiki/Zeta_distribution) based on its parameters. This function is part of the probability distribution framework and is designed to work with the `sample`, `probability` and `cumulativeProbability` functions. === zipFDistribution Parameters * `integer` : size * `double` : exponent === zipFDistribution Returns probability distribution function === zipFDistribution Syntax [source,text] zipFDistribution(5000, 1.0)