mirror of https://github.com/apache/lucene.git
Solr Ref Guide: update 7.1 statistical function docs
parent
2da777cdb8
commit
7a5733d107
|
@ -455,10 +455,81 @@ Returns the following response:
|
||||||
|
|
||||||
== Setting Variables with let
|
== Setting Variables with let
|
||||||
|
|
||||||
The `let` function sets variables and runs a streaming expression that references the variables. The `let` function can be used to
|
The `let` function sets variables and returns the last variable. The output of any statistical function can be set to a variable.
|
||||||
write small statistical programs.
|
|
||||||
|
|
||||||
A variable can be set to the output of any streaming expression. Here is a very simple example:
|
Below is a simple example setting three variables `a`, `b` and `correlation`.
|
||||||
|
|
||||||
|
[source,text]
|
||||||
|
----
|
||||||
|
let(a=array(1,2,3),
|
||||||
|
b=array(10, 20, 30),
|
||||||
|
correlation=corr(a, b))
|
||||||
|
----
|
||||||
|
|
||||||
|
Here is the output:
|
||||||
|
|
||||||
|
[source,json]
|
||||||
|
----
|
||||||
|
{
|
||||||
|
"result-set": {
|
||||||
|
"docs": [
|
||||||
|
{
|
||||||
|
"correlation": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"EOF": true,
|
||||||
|
"RESPONSE_TIME": 0
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
}
|
||||||
|
----
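As a rough cross-check of the `correlation` value in the output above, the following Python sketch computes a Pearson correlation for the same two arrays. It assumes that `corr` returns a Pearson correlation coefficient; the helper name `pearson` is hypothetical and is not part of Solr.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences (illustrative helper)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Same data as the `let` example: a=array(1,2,3), b=array(10, 20, 30)
a = [1, 2, 3]
b = [10, 20, 30]
correlation = pearson(a, b)
```

Because `b` is an exact multiple of `a`, the two arrays are perfectly linearly related, which is consistent with the `"correlation": 1` shown in the response.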
|
||||||
|
|
||||||
|
All variables can be output by setting the `echo` variable to `true`.
|
||||||
|
|
||||||
|
[source,text]
|
||||||
|
----
|
||||||
|
let(echo=true,
|
||||||
|
a=array(1,2,3),
|
||||||
|
b=array(10, 20, 30),
|
||||||
|
correlation=corr(a, b))
|
||||||
|
----
|
||||||
|
|
||||||
|
Here is the output:
|
||||||
|
|
||||||
|
[source,json]
|
||||||
|
----
|
||||||
|
{
|
||||||
|
"result-set": {
|
||||||
|
"docs": [
|
||||||
|
{
|
||||||
|
"a": [
|
||||||
|
1,
|
||||||
|
2,
|
||||||
|
3
|
||||||
|
],
|
||||||
|
"b": [
|
||||||
|
10,
|
||||||
|
20,
|
||||||
|
30
|
||||||
|
],
|
||||||
|
"correlation": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"EOF": true,
|
||||||
|
"RESPONSE_TIME": 0
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
}
|
||||||
|
----
|
||||||
|
|
||||||
|
Streaming expressions can also be used inside of a `let` expression in the following ways:
|
||||||
|
|
||||||
|
* A variable can be set to the output of any streaming expression.
|
||||||
|
* A streaming expression can be executed after all variables have been set. The variables can then be referenced by the streaming expression that is executed. The `let` expression will stream the tuples that are emitted by the final streaming expression.
|
||||||
|
|
||||||
|
Here is a very simple example:
|
||||||
|
|
||||||
[source,text]
|
[source,text]
|
||||||
----
|
----
|
||||||
|
|
|
@ -660,8 +660,8 @@ numeric array
|
||||||
== empiricalDistribution
|
== empiricalDistribution
|
||||||
|
|
||||||
The `empiricalDistribution` function returns a continuous probability distribution function based
|
The `empiricalDistribution` function returns a continuous probability distribution function based
|
||||||
on an actual data set (https://en.wikipedia.org/wiki/Empirical_distribution_function). This function is part of the probability distribution framework and is designed to
|
on an actual data set (https://en.wikipedia.org/wiki/Empirical_distribution_function). This function is part of the probability distribution framework and is
|
||||||
work with the `sample`, `kolmogorovSmirnov` and `cumulativeProbability` functions.
|
designed to work with the `sample`, `kolmogorovSmirnov` and `cumulativeProbability` functions.
|
||||||
|
|
||||||
This function is designed to work with continuous data. To build a distribution from
|
This function is designed to work with continuous data. To build a distribution from
|
||||||
a discrete data set use the `enumeratedDistribution`.
|
a discrete data set use the `enumeratedDistribution`.
|
||||||
|
@ -1053,7 +1053,7 @@ The supported distribution functions are:
|
||||||
|
|
||||||
=== kolmogorovSmirnov Returns
|
=== kolmogorovSmirnov Returns
|
||||||
|
|
||||||
result tuple : A tuple containing the p-value and d-statistic for test result.
|
result tuple : A tuple containing the p-value and d-statistic for the test result.
|
||||||
|
|
||||||
=== kolmogorovSmirnov Syntax
|
=== kolmogorovSmirnov Syntax
|
||||||
|
|
||||||
|
@ -1163,7 +1163,7 @@ if(gt(fieldA,fieldB),mod(fieldA,fieldB),mod(fieldB,fieldA)) // if fieldA > field
|
||||||
== monteCarlo
|
== monteCarlo
|
||||||
|
|
||||||
The `monteCarlo` function performs a Monte Carlo simulation (https://en.wikipedia.org/wiki/Monte_Carlo_method)
|
The `monteCarlo` function performs a Monte Carlo simulation (https://en.wikipedia.org/wiki/Monte_Carlo_method)
|
||||||
based on its parameters. The monteCarlo function runs another function a set number of times and returns the results.
|
based on its parameters. The monteCarlo function runs another function a specified number of times and returns the results.
|
||||||
The function being run typically has one or more variables that are drawn from probability
|
The function being run typically has one or more variables that are drawn from probability
|
||||||
distributions on each run. The `sample` function is used in the function to draw the samples.
|
distributions on each run. The `sample` function is used in the function to draw the samples.
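The idea described above can be sketched in Python. This is not Solr's implementation: the names `monte_carlo` and `simulated_sum`, the choice of distributions, and the fixed seed are all assumptions made for illustration.

```python
import random


def monte_carlo(fn, runs, rng):
    """Run `fn` a specified number of times and return all results (illustrative)."""
    return [fn(rng) for _ in range(runs)]


def simulated_sum(rng):
    # On each run, draw fresh samples from two probability distributions.
    # The distributions and their parameters are hypothetical examples.
    x = rng.gauss(50, 5)   # normal distribution, mean 50, standard deviation 5
    y = rng.gauss(10, 1)   # normal distribution, mean 10, standard deviation 1
    return x + y


rng = random.Random(42)  # fixed seed so the sketch is repeatable
results = monte_carlo(simulated_sum, 1000, rng)
mean_result = sum(results) / len(results)
```

The returned list holds one value per run, so it can be summarized afterwards, for example by taking its mean or building a histogram.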
|
||||||
|
|
||||||
|
@ -1330,7 +1330,7 @@ or(fieldA,fieldB,fieldC,and(fieldD,fieldE),fieldF)
|
||||||
== poissonDistribution
|
== poissonDistribution
|
||||||
|
|
||||||
The `poissonDistribution` function returns a poisson probability distribution (https://en.wikipedia.org/wiki/Poisson_distribution)
|
The `poissonDistribution` function returns a poisson probability distribution (https://en.wikipedia.org/wiki/Poisson_distribution)
|
||||||
based on its parameters. This function is part of the probability distribution framework and is designed to
|
based on its parameter. This function is part of the probability distribution framework and is designed to
|
||||||
work with the `sample`, `probability` and `cumulativeProbability` functions.
|
work with the `sample`, `probability` and `cumulativeProbability` functions.
|
||||||
|
|
||||||
=== poissonDistribution Parameters
|
=== poissonDistribution Parameters
|
||||||
|
@ -1352,7 +1352,7 @@ The `polyFit` function performs polynomial curve fitting (https://en.wikipedia.o
|
||||||
|
|
||||||
=== polyFit Parameters
|
=== polyFit Parameters
|
||||||
|
|
||||||
* `numeric array` : (Optional) x values. If omitted an sequence will be created for the x values.
|
* `numeric array` : (Optional) x values. If omitted a sequence will be created for the x values.
|
||||||
* `numeric array` : y values
|
* `numeric array` : y values
|
||||||
* `integer` : (Optional) polynomial degree. Defaults to 3.
|
* `integer` : (Optional) polynomial degree. Defaults to 3.
|
||||||
|
|
||||||
|
@ -1363,7 +1363,8 @@ numeric array : curve that was fit to the data points.
|
||||||
=== polyFit Syntax
|
=== polyFit Syntax
|
||||||
|
|
||||||
[source,text]
|
[source,text]
|
||||||
polyFit(yValues) // This creates the xValues automatically and fits a curve through the data points using a the default 3 degree polynomial.
|
polyFit(yValues) // This creates the xValues automatically and fits a curve through the data points using the default 3 degree polynomial.
|
||||||
|
polyFit(yValues, 5) // This creates the xValues automatically and fits a curve through the data points using a 5 degree polynomial.
|
||||||
polyFit(xValues, yValues, 5) // This will fit a curve through the data points using a 5 degree polynomial.
|
polyFit(xValues, yValues, 5) // This will fit a curve through the data points using a 5 degree polynomial.
|
||||||
|
|
||||||
== polyfitDerivative
|
== polyfitDerivative
|
||||||
|
@ -1372,7 +1373,7 @@ The `polyfitDerivative` function returns the derivative of the curve created by
|
||||||
|
|
||||||
=== polyfitDerivative Parameters
|
=== polyfitDerivative Parameters
|
||||||
|
|
||||||
* `numeric array` : (Optional) x values. If omitted an sequence will be created for the x values.
|
* `numeric array` : (Optional) x values. If omitted a sequence will be created for the x values.
|
||||||
* `numeric array` : y values
|
* `numeric array` : y values
|
||||||
* `integer` : (Optional) polynomial degree. Defaults to 3.
|
* `integer` : (Optional) polynomial degree. Defaults to 3.
|
||||||
|
|
||||||
|
@ -1384,6 +1385,7 @@ numeric array : The curve for the derivative created by the polynomial curve fit
|
||||||
|
|
||||||
[source,text]
|
[source,text]
|
||||||
polyfitDerivative(yValues) // This creates the xValues automatically and returns the polyfit derivative
|
polyfitDerivative(yValues) // This creates the xValues automatically and returns the polyfit derivative
|
||||||
|
polyfitDerivative(yValues, 5) // This creates the xValues automatically and fits a curve through the data points using a 5 degree polynomial and returns the polyfit derivative.
|
||||||
polyfitDerivative(xValues, yValues, 5) // This will fit a curve through the data points using a 5 degree polynomial and returns the polyfit derivative.
|
polyfitDerivative(xValues, yValues, 5) // This will fit a curve through the data points using a 5 degree polynomial and returns the polyfit derivative.
|
||||||
|
|
||||||
== pow
|
== pow
|
||||||
|
@ -1443,13 +1445,12 @@ numeric array
|
||||||
|
|
||||||
== probability
|
== probability
|
||||||
|
|
||||||
The `probability` function returns the probability of encountering a random variable within a discrete
|
The `probability` function returns the probability of a random variable within a discrete probability distribution.
|
||||||
probability distribution.
|
|
||||||
|
|
||||||
=== probability Parameters
|
=== probability Parameters
|
||||||
|
|
||||||
* `discrete probability distribution` : poissonDistribution | binomialDistribution | uniformDistribution | enumeratedDistribution
|
* `discrete probability distribution` : poissonDistribution | binomialDistribution | uniformDistribution | enumeratedDistribution
|
||||||
* `integer` : Value to compute the probability for.
|
* `integer` : Value of the random variable to compute the probability for.
|
||||||
|
|
||||||
=== probability Returns
|
=== probability Returns
|
||||||
|
|
||||||
|
@ -1458,7 +1459,7 @@ double : the probability
|
||||||
=== probability Syntax
|
=== probability Syntax
|
||||||
|
|
||||||
[source,text]
|
[source,text]
|
||||||
probability(poissonDistribution(10), 7) // Returns the probability of encountering a random sample if 7 in a poisson distribution with a mean of 10.
|
probability(poissonDistribution(10), 7) // Returns the probability of a random sample of 7 in a poisson distribution with a mean of 10.
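For reference, the value in the syntax example can be approximated outside Solr with the standard Poisson probability mass function. The sketch below is a minimal Python version; the function name `poisson_probability` is hypothetical, and it assumes the single parameter of `poissonDistribution` is the mean.

```python
import math


def poisson_probability(mean, k):
    """P(X = k) for a Poisson distribution with the given mean (illustrative helper)."""
    if mean <= 0 or k < 0:
        raise ValueError("mean must be positive and k must be non-negative")
    # Poisson PMF: e^(-mean) * mean^k / k!
    return math.exp(-mean) * mean ** k / math.factorial(k)


# Mirrors probability(poissonDistribution(10), 7)
p = poisson_probability(10, 7)
```

With a mean of 10, the probability of observing exactly 7 is about 0.09.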
|
||||||
|
|
||||||
== rank
|
== rank
|
||||||
|
|
||||||
|
@ -1497,7 +1498,7 @@ eq(raw(fieldA), fieldA) // true if the value of fieldA equals the string "fieldA
|
||||||
|
|
||||||
== regress
|
== regress
|
||||||
|
|
||||||
The `regress` function performs a simple regression on two numeric arrays.
|
The `regress` function performs a simple regression of two numeric arrays.
|
||||||
|
|
||||||
The result of this expression is also used by the `predict` and `residuals` functions.
|
The result of this expression is also used by the `predict` and `residuals` functions.
|
||||||
|
|
||||||
|
@ -1516,8 +1517,8 @@ regress(numericArray1, numericArray2)
|
||||||
|
|
||||||
The `residuals` function takes three parameters: a simple regression model, an array of predictor values
|
The `residuals` function takes three parameters: a simple regression model, an array of predictor values
|
||||||
and an array of actual values. The residuals function applies the simple regression model to the
|
and an array of actual values. The residuals function applies the simple regression model to the
|
||||||
array of predictor values and computes a predictions array. The actual values array is then
|
array of predictor values and computes a predictions array. The predicted values array is then
|
||||||
subtracted from the predictions array to compute the residuals array.
|
subtracted from the actual value array to compute the residuals array.
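The corrected order of subtraction (actual minus predicted) can be illustrated with a short Python sketch. It assumes the simple regression model is fully described by a slope and an intercept; the function name `residuals` and its signature are hypothetical and do not mirror Solr's API.

```python
def residuals(slope, intercept, predictors, actuals):
    """Residuals of a simple linear model: actual value minus predicted value."""
    if len(predictors) != len(actuals):
        raise ValueError("predictors and actuals must have the same length")
    # Apply the model to each predictor value to get the predictions array.
    predictions = [slope * x + intercept for x in predictors]
    # Subtract each prediction from the corresponding actual value.
    return [a - p for a, p in zip(actuals, predictions)]


# Hypothetical model y = 2x + 1 applied to three predictor values.
res = residuals(2.0, 1.0, [1, 2, 3], [3.5, 5.0, 6.0])
```

A positive residual means the actual value was above the model's prediction, and a negative residual means it was below.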
|
||||||
|
|
||||||
=== residuals Parameters
|
=== residuals Parameters
|
||||||
|
|
||||||
|
@ -1580,8 +1581,8 @@ Either a single numeric random sample, or a numeric array depending on the sampl
|
||||||
=== sample Syntax
|
=== sample Syntax
|
||||||
|
|
||||||
[source,text]
|
[source,text]
|
||||||
sample(normalDistribution(50, 5)) // Return a single random sample from a normalDistribution with mean of 50 and standard deviation of 5.
|
sample(poissonDistribution(5)) // Returns a single random sample from a poissonDistribution with mean of 5.
|
||||||
sample(poissonDistribution(5), 1000) // Return 1000 random samples from poissonDistribution with a mean of 5.
|
sample(poissonDistribution(5), 1000) // Returns 1000 random samples from poissonDistribution with a mean of 5.
|
||||||
|
|
||||||
== scale
|
== scale
|
||||||
|
|
||||||
|
|