mirror of https://github.com/apache/lucene.git
Solr Ref Guide: update 7.1 statistical function docs
parent
2da777cdb8
commit
7a5733d107
|
@ -455,10 +455,81 @@ Returns the following response:
|
|||
|
||||
== Setting Variables with let
|
||||
|
||||
The `let` function sets variables and runs a streaming expression that references the variables. The `let` function can be used to
|
||||
write small statistical programs.
|
||||
The `let` function sets variables and returns the last variable. The output of any statistical function can be set to a variable.
|
||||
|
||||
A variable can be set to the output of any streaming expression. Here is a very simple example:
|
||||
Below is a simple example setting three variables `a`, `b` and `correlation`.
|
||||
|
||||
[source,text]
|
||||
----
|
||||
let(a=array(1,2,3),
|
||||
b=array(10, 20, 30),
|
||||
correlation=corr(a, b))
|
||||
----
|
||||
|
||||
Here is the output:
|
||||
|
||||
[source,json]
|
||||
----
|
||||
{
|
||||
"result-set": {
|
||||
"docs": [
|
||||
{
|
||||
"correlation": 1
|
||||
},
|
||||
{
|
||||
"EOF": true,
|
||||
"RESPONSE_TIME": 0
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
----
|
||||
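The result of `1` in the output above can be checked by hand. The following is a minimal Python sketch, assuming `corr` computes Pearson's product-moment correlation; the helper name `pearson_corr` is hypothetical and is not part of Solr.

```python
import math

def pearson_corr(xs, ys):
    """Pearson correlation of two equal-length numeric lists (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Mirrors the variables set in the let expression.
a = [1, 2, 3]
b = [10, 20, 30]
correlation = pearson_corr(a, b)
```

Because `b` is an exact linear multiple of `a`, the two arrays are perfectly correlated, which matches the value returned by the `let` expression.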
|
||||
All variables can be output by setting the `echo` variable to `true`.
|
||||
|
||||
[source,text]
|
||||
----
|
||||
let(echo=true,
|
||||
a=array(1,2,3),
|
||||
b=array(10, 20, 30),
|
||||
correlation=corr(a, b))
|
||||
----
|
||||
|
||||
Here is the output:
|
||||
|
||||
[source,json]
|
||||
----
|
||||
{
|
||||
"result-set": {
|
||||
"docs": [
|
||||
{
|
||||
"a": [
|
||||
1,
|
||||
2,
|
||||
3
|
||||
],
|
||||
"b": [
|
||||
10,
|
||||
20,
|
||||
30
|
||||
],
|
||||
"correlation": 1
|
||||
},
|
||||
{
|
||||
"EOF": true,
|
||||
"RESPONSE_TIME": 0
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
----
|
||||
|
||||
Streaming expressions can also be used inside of a `let` expression in the following ways:
|
||||
|
||||
* A variable can be set to the output of any streaming expression.
|
||||
* A streaming expression can be executed after all variables have been set. The variables can then be referenced by the streaming expression that is executed. The `let` expression will stream the tuples that are emitted by the final streaming expression.
|
||||
|
||||
Here is a very simple example:
|
||||
|
||||
[source,text]
|
||||
----
|
||||
|
|
|
@ -660,8 +660,8 @@ numeric array
|
|||
== empiricalDistribution
|
||||
|
||||
The `empiricalDistribution` function returns a continuous probability distribution function based
|
||||
on an actual data set (https://en.wikipedia.org/wiki/Empirical_distribution_function). This function is part of the probability distribution framework and is designed to
|
||||
work with the `sample`, `kolmogorovSmirnov` and `cumulativeProbability` functions.
|
||||
on an actual data set (https://en.wikipedia.org/wiki/Empirical_distribution_function). This function is part of the probability distribution framework and is
|
||||
designed to work with the `sample`, `kolmogorovSmirnov` and `cumulativeProbability` functions.
|
||||
|
||||
This function is designed to work with continuous data. To build a distribution from
|
||||
a discrete data set use the `enumeratedDistribution`.
|
||||
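As a rough illustration of what an empirical distribution function is, the sketch below builds a plain step-function CDF from a data set in Python. It is not a description of how `empiricalDistribution` is implemented in Solr (which may bin the data); the name `empirical_cdf` and the sample values are assumptions made for the example.

```python
from bisect import bisect_right

def empirical_cdf(data):
    """Return a function giving the fraction of observations <= x (illustrative)."""
    ordered = sorted(data)
    n = len(ordered)

    def cdf(x):
        # bisect_right counts how many observations are <= x.
        return bisect_right(ordered, x) / n

    return cdf

# Hypothetical continuous observations.
cdf = empirical_cdf([2.0, 3.5, 3.5, 7.0, 9.0])
p = cdf(3.5)  # three of the five observations are <= 3.5
```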
|
@ -1053,7 +1053,7 @@ The supported distribution functions are:
|
|||
|
||||
=== kolmogorovSmirnov Returns
|
||||
|
||||
result tuple : A tuple containing the p-value and d-statistic for test result.
|
||||
result tuple : A tuple containing the p-value and d-statistic for the test result.
|
||||
|
||||
=== kolmogorovSmirnov Syntax
|
||||
|
||||
|
@ -1163,7 +1163,7 @@ if(gt(fieldA,fieldB),mod(fieldA,fieldB),mod(fieldB,fieldA)) // if fieldA > field
|
|||
== monteCarlo
|
||||
|
||||
The `monteCarlo` function performs a Monte Carlo simulation (https://en.wikipedia.org/wiki/Monte_Carlo_method)
|
||||
based on its parameters. The monteCarlo function runs another function a set number of times and returns the results.
|
||||
based on its parameters. The monteCarlo function runs another function a specified number of times and returns the results.
|
||||
The function being run typically has one or more variables that are drawn from probability
|
||||
distributions on each run. The `sample` function is used in the function to draw the samples.
|
||||
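The idea of running a function a specified number of times, with fresh random draws on each run, can be sketched in Python as follows. This is only an analogy under stated assumptions: `monte_carlo` and `sum_of_two_draws` are hypothetical names, and `rng.gauss` stands in for the role that `sample` plays inside the Solr function.

```python
import random

def monte_carlo(fn, runs, seed=None):
    """Run fn the given number of times and collect the results (illustrative)."""
    rng = random.Random(seed)
    return [fn(rng) for _ in range(runs)]

def sum_of_two_draws(rng):
    # Each run draws two fresh samples from normal distributions.
    return rng.gauss(50, 5) + rng.gauss(20, 2)

results = monte_carlo(sum_of_two_draws, 1000, seed=42)
```

The returned list holds one value per run, so summary statistics such as the mean or percentiles can be computed over it afterwards.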
|
||||
|
@ -1330,7 +1330,7 @@ or(fieldA,fieldB,fieldC,and(fieldD,fieldE),fieldF)
|
|||
== poissonDistribution
|
||||
|
||||
The `poissonDistribution` function returns a Poisson probability distribution (https://en.wikipedia.org/wiki/Poisson_distribution)
|
||||
based on its parameters. This function is part of the probability distribution framework and is designed to
|
||||
based on its parameter. This function is part of the probability distribution framework and is designed to
|
||||
work with the `sample`, `probability` and `cumulativeProbability` functions.
|
||||
|
||||
=== poissonDistribution Parameters
|
||||
|
@ -1352,7 +1352,7 @@ The `polyFit` function performs polynomial curve fitting (https://en.wikipedia.o
|
|||
|
||||
=== polyFit Parameters
|
||||
|
||||
* `numeric array` : (Optional) x values. If omitted an sequence will be created for the x values.
|
||||
* `numeric array` : (Optional) x values. If omitted a sequence will be created for the x values.
|
||||
* `numeric array` : y values
|
||||
* `integer` : (Optional) polynomial degree. Defaults to 3.
|
||||
|
||||
|
@ -1363,7 +1363,8 @@ numeric array : curve that was fit to the data points.
|
|||
=== polyFit Syntax
|
||||
|
||||
[source,text]
|
||||
polyFit(yValues) // This creates the xValues automatically and fits a curve through the data points using a the default 3 degree polynomial.
|
||||
polyFit(yValues) // This creates the xValues automatically and fits a curve through the data points using the default 3 degree polynomial.
|
||||
polyFit(yValues, 5) // This creates the xValues automatically and fits a curve through the data points using a 5 degree polynomial.
|
||||
polyFit(xValues, yValues, 5) // This will fit a curve through the data points using a 5 degree polynomial.
|
||||
|
||||
== polyfitDerivative
|
||||
|
@ -1372,7 +1373,7 @@ The `polyfitDerivative` function returns the derivative of the curve created by
|
|||
|
||||
=== polyfitDerivative Parameters
|
||||
|
||||
* `numeric array` : (Optional) x values. If omitted an sequence will be created for the x values.
|
||||
* `numeric array` : (Optional) x values. If omitted a sequence will be created for the x values.
|
||||
* `numeric array` : y values
|
||||
* `integer` : (Optional) polynomial degree. Defaults to 3.
|
||||
|
||||
|
@ -1384,6 +1385,7 @@ numeric array : The curve for the derivative created by the polynomial curve fit
|
|||
|
||||
[source,text]
|
||||
polyfitDerivative(yValues) // This creates the xValues automatically and returns the polyfit derivative.
|
||||
polyfitDerivative(yValues, 5) // This creates the xValues automatically and fits a curve through the data points using a 5 degree polynomial and returns the polyfit derivative.
|
||||
polyfitDerivative(xValues, yValues, 5) // This fits a curve through the data points using a 5 degree polynomial and returns the polyfit derivative.
|
||||
|
||||
== pow
|
||||
|
@ -1443,13 +1445,12 @@ numeric array
|
|||
|
||||
== probability
|
||||
|
||||
The `probability` function returns the probability of encountering a random variable within a discrete
|
||||
probability distribution.
|
||||
The `probability` function returns the probability of a random variable within a discrete probability distribution.
|
||||
|
||||
=== probability Parameters
|
||||
|
||||
* `discrete probability distribution` : poissonDistribution | binomialDistribution | uniformDistribution | enumeratedDistribution
|
||||
* `integer` : Value to compute the probability for.
|
||||
* `integer` : Value of the random variable to compute the probability for.
|
||||
|
||||
=== probability Returns
|
||||
|
||||
|
@ -1458,7 +1459,7 @@ double : the probability
|
|||
=== probability Syntax
|
||||
|
||||
[source,text]
|
||||
probability(poissonDistribution(10), 7) // Returns the probability of encountering a random sample if 7 in a poisson distribution with a mean of 10.
|
||||
probability(poissonDistribution(10), 7) // Returns the probability of a random sample of 7 in a Poisson distribution with a mean of 10.
|
||||
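For the Poisson case shown in the syntax example, the quantity being returned is the probability mass function evaluated at the given integer. A minimal Python sketch of that formula follows; `poisson_pmf` is a hypothetical helper, not a Solr function.

```python
import math

def poisson_pmf(mean, k):
    """P(X = k) for a Poisson distribution with the given mean (illustrative)."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

# Probability of observing exactly 7 when the mean is 10.
p = poisson_pmf(10, 7)
```

The value is approximately 0.09, which is what one would expect `probability(poissonDistribution(10), 7)` to return, assuming the function evaluates the standard Poisson probability mass function.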
|
||||
== rank
|
||||
|
||||
|
@ -1497,7 +1498,7 @@ eq(raw(fieldA), fieldA) // true if the value of fieldA equals the string "fieldA
|
|||
|
||||
== regress
|
||||
|
||||
The `regress` function performs a simple regression on two numeric arrays.
|
||||
The `regress` function performs a simple regression of two numeric arrays.
|
||||
|
||||
The result of this expression is also used by the `predict` and `residuals` functions.
|
||||
|
||||
|
@ -1516,8 +1517,8 @@ regress(numericArray1, numericArray2)
|
|||
|
||||
The `residuals` function takes three parameters: a simple regression model, an array of predictor values
|
||||
and an array of actual values. The residuals function applies the simple regression model to the
|
||||
array of predictor values and computes a predictions array. The actual values array is then
|
||||
subtracted from the predictions array to compute the residuals array.
|
||||
array of predictor values and computes a predictions array. The predicted values array is then
|
||||
subtracted from the actual values array to compute the residuals array.
|
||||
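The direction of the subtraction (actual minus predicted) can be made concrete with a small Python sketch. It assumes an ordinary least-squares fit for the simple regression model; `fit_simple_regression` and `residuals` here are hypothetical Python helpers that only mirror the description above, and the data is made up for the example.

```python
def fit_simple_regression(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x (illustrative)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

def residuals(model, predictors, actuals):
    """Apply the model to the predictors, then subtract predictions from actuals."""
    slope, intercept = model
    predictions = [intercept + slope * x for x in predictors]
    return [actual - predicted for actual, predicted in zip(actuals, predictions)]

model = fit_simple_regression([1, 2, 3, 4], [2, 4, 6, 8])  # fits y = 2x
res = residuals(model, [5, 6], [11, 12])  # predictions are 10 and 12
```

With this convention a positive residual means the actual value was higher than the model predicted.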
|
||||
=== residuals Parameters
|
||||
|
||||
|
@ -1580,8 +1581,8 @@ Either a single numeric random sample, or a numeric array depending on the sampl
|
|||
=== sample Syntax
|
||||
|
||||
[source,text]
|
||||
sample(normalDistribution(50, 5)) // Return a single random sample from a normalDistribution with mean of 50 and standard deviation of 5.
|
||||
sample(poissonDistribution(5), 1000) // Return 1000 random samples from poissonDistribution with a mean of 5.
|
||||
sample(poissonDistribution(5)) // Returns a single random sample from a poissonDistribution with mean of 5.
|
||||
sample(poissonDistribution(5), 1000) // Returns 1000 random samples from a poissonDistribution with a mean of 5.
|
||||
|
||||
== scale
|
||||
|
||||
|
|