From 5887cc0faafdc23f912e49ee4d42278de765d92a Mon Sep 17 00:00:00 2001
From: Phil Steitz
The statistics package provides frameworks and implementations for
basic Descriptive statistics, frequency distributions, bivariate regression,
and t-, chi-square and ANOVA test statistics.
- Descriptive statistics
The stat package includes a framework and default implementations for
the following Descriptive statistics:
@@ -217,7 +219,7 @@ DescriptiveStatistics stats = DescriptiveStatistics.newInstance(SynchronizedDesc
org.apache.commons.math.stat.descriptive.Frequency
@@ -281,7 +283,7 @@ System.out.println(f.getCumPct("z")); // displays 1
org.apache.commons.math.stat.regression.SimpleRegression
@@ -398,7 +400,7 @@ System.out.println(regression.getSlopeStdErr());
org.apache.commons.math.stat.regression.MultipleLinearRegression
@@ -492,7 +494,121 @@ regression.addData(y, x, omega); // we do need covariance
+ The
+ org.apache.commons.math.stat.correlation package computes covariances
+ and correlations for pairs of arrays or columns of a matrix.
+
+ Covariance computes covariances and
+ PearsonsCorrelation provides Pearson's product-moment correlation coefficients.
+
+ Implementation Notes
+
- Frequency distributions
- Simple Regression
- Statistical Tests
+ Descriptive statistics
+ Frequency distributions
+ Simple Regression
+ Multiple Regression
+ Covariance and correlation
+ Statistical Tests
+
+
+ Unbiased covariances are given by the formula
+ cov(X, Y) = sum [(x_i - E(X))(y_i - E(Y))] / (n - 1)
+ where E(X) is the mean of the X values and E(Y) is the mean of the Y
+ values. Non-bias-corrected estimates use n in place of n - 1.
+ Whether or not covariances are bias-corrected is determined by the
+ optional constructor parameter, "biasCorrected," which defaults to true.
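The covariance formula can be sketched in plain Java as follows. This is an illustration only, not the Commons Math implementation: the class name `CovarianceSketch` and its methods are hypothetical, and the `biasCorrected` argument simply mirrors the parameter described in the text.

```java
// Illustrative sketch of cov(X, Y); hypothetical names, not the library's code.
class CovarianceSketch {

    /** Arithmetic mean of the values. */
    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /**
     * Covariance of x and y. Divides the sum of cross-products by n - 1
     * when biasCorrected is true, and by n otherwise.
     */
    static double covariance(double[] x, double[] y, boolean biasCorrected) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException(
                "arrays must have the same length, at least 2");
        }
        double meanX = mean(x);
        double meanY = mean(y);
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += (x[i] - meanX) * (y[i] - meanY);
        }
        int n = x.length;
        return sum / (biasCorrected ? n - 1 : n);
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};
        double[] y = {2, 4, 6, 8};
        System.out.println(covariance(x, y, true));   // divisor n - 1
        System.out.println(covariance(x, y, false));  // divisor n
    }
}
```

For the sample arrays the sum of cross-products is 10, so the two calls differ only in the divisor (3 versus 4).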
+
+ PearsonsCorrelation computes correlations defined by the formula
+ cor(X, Y) = sum [(x_i - E(X))(y_i - E(Y))] / [(n - 1) s(X) s(Y)]
+ where E(X) and E(Y) are the means of X and Y,
+ and s(X), s(Y) are their sample standard deviations.
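A direct transcription of the correlation formula into plain Java might look like the sketch below. The class and method names are hypothetical and the code is not taken from PearsonsCorrelation; it assumes s(X) and s(Y) are computed with the n - 1 divisor, which is what makes the formula yield a value in [-1, 1].

```java
// Illustrative sketch of cor(X, Y); hypothetical names, not the library's code.
class PearsonSketch {

    /** Pearson's product-moment correlation of x and y. */
    static double correlation(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException(
                "arrays must have the same length, at least 2");
        }
        int n = x.length;

        // E(X) and E(Y)
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        // Sums of cross-products and of squared deviations
        double sxy = 0.0;
        double sxx = 0.0;
        double syy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }

        // s(X), s(Y): standard deviations with the n - 1 divisor (assumption)
        double sX = Math.sqrt(sxx / (n - 1));
        double sY = Math.sqrt(syy / (n - 1));

        // Returns NaN if either array is constant (zero standard deviation)
        return sxy / ((n - 1) * sX * sY);
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3};
        double[] y = {1, 3, 2};
        System.out.println(correlation(x, y));
    }
}
```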
+
+ Examples:
+
+ To compute the unbiased covariance between two double arrays, x and y, use:
+
+ For non-bias-corrected covariances, use
+
+ A covariance matrix over the columns of a source matrix data
+ can be computed using
+
+ The (i, j)th entry of the returned matrix is the unbiased covariance of the ith and jth
+ columns of data. As above, to get non-bias-corrected covariances,
+ use
+
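The column-wise computation described above can be sketched in plain Java. The names below are hypothetical and this is not the library's implementation; the sketch assumes `data` is rectangular, with rows as observations and columns as variables.

```java
// Illustrative sketch of a covariance matrix over columns; hypothetical names.
class CovarianceMatrixSketch {

    /**
     * Returns the k x k matrix whose (i, j)th entry is the covariance of
     * columns i and j of data. Uses divisor n - 1 when biasCorrected is
     * true, and n otherwise.
     */
    static double[][] covarianceMatrix(double[][] data, boolean biasCorrected) {
        int n = data.length;      // number of observations (rows)
        if (n < 2) {
            throw new IllegalArgumentException("need at least 2 rows");
        }
        int k = data[0].length;   // number of variables (columns)

        // Column means
        double[] means = new double[k];
        for (double[] row : data) {
            for (int j = 0; j < k; j++) {
                means[j] += row[j];
            }
        }
        for (int j = 0; j < k; j++) {
            means[j] /= n;
        }

        double divisor = biasCorrected ? n - 1 : n;
        double[][] cov = new double[k][k];
        for (int i = 0; i < k; i++) {
            for (int j = 0; j <= i; j++) {
                double sum = 0.0;
                for (double[] row : data) {
                    sum += (row[i] - means[i]) * (row[j] - means[j]);
                }
                cov[i][j] = sum / divisor;
                cov[j][i] = cov[i][j];  // the matrix is symmetric
            }
        }
        return cov;
    }

    public static void main(String[] args) {
        double[][] data = {{1, 2}, {2, 4}, {3, 6}, {4, 8}};
        double[][] cov = covarianceMatrix(data, true);
        System.out.println(cov[0][0] + " " + cov[0][1] + " " + cov[1][1]);
    }
}
```

The diagonal entries are the column variances, so the same routine also yields the variances needed for correlations.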
+ To compute the Pearson's product-moment correlation between two double arrays,
+ x and y, use:
+
+ A (Pearson's) correlation matrix over the columns of a source matrix data
+ can be computed using
+
+ The i-jth entry of the returned matrix is the Pearson's product-moment correlation between the
+ ith and jth columns of data.
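Since cor(X, Y) = cov(X, Y) / (s(X) s(Y)) and s(X)^2 = cov(X, X), a correlation matrix can also be derived from a covariance matrix. The sketch below illustrates that relationship in plain Java; the names are hypothetical, and this is not necessarily how PearsonsCorrelation computes its result.

```java
// Illustrative sketch: correlation matrix from a covariance matrix.
class CorrelationFromCovarianceSketch {

    /**
     * Converts a square covariance matrix into a correlation matrix using
     * cor[i][j] = cov[i][j] / sqrt(cov[i][i] * cov[j][j]).
     * Entries are NaN where a variance on the diagonal is zero.
     */
    static double[][] correlationMatrix(double[][] cov) {
        int k = cov.length;
        double[][] cor = new double[k][k];
        for (int i = 0; i < k; i++) {
            for (int j = 0; j < k; j++) {
                cor[i][j] = cov[i][j] / Math.sqrt(cov[i][i] * cov[j][j]);
            }
        }
        return cor;
    }

    public static void main(String[] args) {
        double[][] cov = {{4, 2}, {2, 9}};
        double[][] cor = correlationMatrix(cov);
        System.out.println(cor[0][1]);
    }
}
```

The divisor (n - 1 or n) cancels in the ratio, so the result is the same whether or not the input covariances are bias-corrected.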
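+ To obtain standard errors and p-values for the correlation coefficients, first create
+ a PearsonsCorrelation instance from data using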
+
+ where data is either a rectangular array or a RealMatrix.
+ Then the matrix of standard errors is
+
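+ The formula used to compute the standard error is SE_r = ((1 - r^2) / (n - 2))^{1/2}, where r
+ is the estimated correlation coefficient and n is the number of observations in the
+ source dataset.
+ p-values for the (two-sided) null hypotheses that elements of the correlation matrix are zero
+ populate the RealMatrix returned by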
+
+ getCorrelationPValues().getEntry(i,j) is the probability
+ that a random variable distributed as t_{n-2} takes
+ a value with absolute value greater than or equal to |r| ((n - 2) / (1 - r^2))^{1/2}, where r
+ is the estimated correlation coefficient.
+
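The standard error and the t statistic quoted above are simple functions of r and n, sketched here in plain Java. The class and method names are hypothetical. The p-value itself additionally requires the cumulative distribution function of the t distribution with n - 2 degrees of freedom, which the Java standard library does not provide, so it is not shown.

```java
// Illustrative sketch of SE_r and the t statistic; hypothetical names.
class CorrelationSignificanceSketch {

    /** SE_r = sqrt((1 - r^2) / (n - 2)); requires n > 2. */
    static double standardError(double r, int n) {
        if (n <= 2) {
            throw new IllegalArgumentException("need more than 2 observations");
        }
        return Math.sqrt((1 - r * r) / (n - 2));
    }

    /** t = |r| * sqrt((n - 2) / (1 - r^2)); requires n > 2. */
    static double tStatistic(double r, int n) {
        if (n <= 2) {
            throw new IllegalArgumentException("need more than 2 observations");
        }
        return Math.abs(r) * Math.sqrt((n - 2) / (1 - r * r));
    }

    public static void main(String[] args) {
        double r = 0.6;
        int n = 18;
        System.out.println(standardError(r, n));
        System.out.println(tStatistic(r, n));
    }
}
```

The two quantities are related by t = |r| / SE_r, which is a quick consistency check on any implementation.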