The Commons Math User Guide - Statistics Phil Steitz

This is yet to be written. Any contributions will be gratefully accepted!

The distribution framework provides the means to compute probability density function (PDF) probabilities and cumulative distribution function (CDF) probabilities for common probability distributions. Along with the direct computation of PDF and CDF probabilities, the framework also allows for the computation of inverse PDF and inverse CDF values.

To use the distribution framework, first create a distribution object. Creating all distribution objects through the org.apache.commons.math.stat.distribution.DistributionFactory class is encouraged. DistributionFactory is a simple factory used to create all of the distribution objects supported by Commons-Math. A typical use of DistributionFactory to create a distribution object is:

    DistributionFactory factory = DistributionFactory.newInstance();
    BinomialDistribution binomial = factory.createBinomialDistribution(10, 0.75);

The distributions that can be instantiated via the DistributionFactory are detailed below:

Distribution    Factory Method                    Parameters
Binomial        createBinomialDistribution        Number of trials; probability of success
Chi-Squared     createChiSquaredDistribution      Degrees of freedom
Exponential     createExponentialDistribution     Mean
F               createFDistribution               Numerator degrees of freedom; denominator degrees of freedom
Gamma           createGammaDistribution           Alpha; beta
Hypergeometric  createHypergeometricDistribution  Population size; number of successes in population; sample size
t               createTDistribution               Degrees of freedom

Using a distribution object, cumulative (CDF) probabilities are easily computed using the cummulativeProbability methods. For a distribution X and a domain value x, cummulativeProbability computes P(X <= x) (i.e., the lower tail probability of X).

    DistributionFactory factory = DistributionFactory.newInstance();
    // A t distribution with 29 degrees of freedom.
    TDistribution t = factory.createTDistribution(29);
    double lowerTail = t.cummulativeProbability(-2.656);      // P(T <= -2.656)
    double upperTail = 1.0 - t.cummulativeProbability(2.75);  // P(T >= 2.75)
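
The relationship between lower tail, upper tail, and interval probabilities can be illustrated without the library. The following is a minimal, self-contained sketch that assumes an exponential distribution with a known mean and uses its closed-form CDF. The class and method names (ExponentialTailSketch, lowerTail, intervalProbability) are hypothetical and are not part of Commons-Math; the code is not the library's implementation.

```java
// Illustrative sketch only: closed-form exponential CDF, not Commons-Math code.
class ExponentialTailSketch {

    // P(X <= x) for an exponential distribution with the given mean.
    static double lowerTail(double mean, double x) {
        if (mean <= 0.0) {
            throw new IllegalArgumentException("mean must be positive");
        }
        return x <= 0.0 ? 0.0 : 1.0 - Math.exp(-x / mean);
    }

    // P(a < X <= b), computed as the difference of two lower tail probabilities.
    static double intervalProbability(double mean, double a, double b) {
        return lowerTail(mean, b) - lowerTail(mean, a);
    }

    public static void main(String[] args) {
        double mean = 2.0;
        double lower = lowerTail(mean, 1.0);                  // P(X <= 1)
        double between = intervalProbability(mean, 1.0, 3.0); // P(1 < X <= 3)
        double upper = 1.0 - lowerTail(mean, 3.0);            // P(X > 3)
        System.out.println("lower=" + lower + " between=" + between + " upper=" + upper);
    }
}
```

The three printed probabilities partition the real line, so they sum to 1 up to floating-point rounding. The same subtraction pattern applies to any object that exposes a lower tail probability.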

Inverse CDF values are just as easily computed using the inverseCummulativeProbability methods. For a distribution X and a probability p, inverseCummulativeProbability computes the domain value x such that:

  • P(X <= x) = p, for continuous distributions
  • P(X <= x) <= p, for discrete distributions

Notice the different cases for continuous and discrete distributions. This is because the CDF of a discrete distribution is a step function and therefore not invertible. As such, for discrete distributions, an exact domain value cannot always be returned, only the "best" domain value. For Commons-Math, the "best" domain value is the largest domain value whose cumulative probability is less than or equal to the given probability.
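
The two cases can be contrasted in a short, self-contained sketch. It assumes an exponential distribution for the continuous case (where the CDF has a closed-form inverse) and a binomial distribution for the discrete case (where the largest qualifying domain value is found by a linear scan). All names below (InverseCumulativeSketch and its methods) are hypothetical, and the code illustrates the rule described above rather than how Commons-Math computes it.

```java
// Illustrative sketch only: not the Commons-Math implementation.
class InverseCumulativeSketch {

    // Continuous case: P(X <= x) for an exponential distribution.
    static double exponentialCumulative(double mean, double x) {
        return x <= 0.0 ? 0.0 : 1.0 - Math.exp(-x / mean);
    }

    // Continuous case: the x with P(X <= x) = p exactly, for 0 <= p < 1.
    static double exponentialInverse(double mean, double p) {
        if (p < 0.0 || p >= 1.0) {
            throw new IllegalArgumentException("p must be in [0, 1)");
        }
        return -mean * Math.log1p(-p);
    }

    // Discrete case: P(X = k) for a binomial distribution with n trials.
    static double binomialProbability(int n, double success, int k) {
        double coefficient = 1.0;
        for (int i = 1; i <= k; i++) {
            coefficient = coefficient * (n - k + i) / i; // builds C(n, k) incrementally
        }
        return coefficient * Math.pow(success, k) * Math.pow(1.0 - success, n - k);
    }

    // Discrete case: P(X <= x), summed term by term.
    static double binomialCumulative(int n, double success, int x) {
        double sum = 0.0;
        for (int k = 0; k <= x; k++) {
            sum += binomialProbability(n, success, k);
        }
        return sum;
    }

    // Discrete case: the largest x in [0, n] with P(X <= x) <= p.
    // Returns -1 when even P(X <= 0) exceeds p.
    static int binomialInverse(int n, double success, double p) {
        int best = -1;
        double cumulative = 0.0;
        for (int x = 0; x <= n; x++) {
            cumulative += binomialProbability(n, success, x);
            if (cumulative > p) {
                break; // the CDF is non-decreasing, so no larger x can qualify
            }
            best = x;
        }
        return best;
    }

    public static void main(String[] args) {
        double x = exponentialInverse(2.0, 0.5);
        System.out.println("continuous: x=" + x + " CDF(x)=" + exponentialCumulative(2.0, x));
        int k = binomialInverse(10, 0.75, 0.5);
        System.out.println("discrete: x=" + k + " CDF(x)=" + binomialCumulative(10, 0.75, k));
    }
}
```

In the continuous case the round trip recovers p up to rounding. In the discrete case the cumulative probability at the returned value generally falls short of p, because the CDF jumps past p at the next domain value.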