<?xml version="1.0"?>
<!--
Copyright 2003-2004 The Apache Software Foundation

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<?xml-stylesheet type="text/xsl" href="./xdoc.xsl"?>
<!-- $Revision: 1.3 $ $Date: 2004/11/09 12:41:37 $ -->
<document url="stat.html">
<properties>
<title>The Commons Math User Guide - Statistics</title>
</properties>
<body>
<section name="8 Probability Distributions">
<subsection name="8.1 Overview" href="overview">
<p>
The distributions package provides a framework for some commonly used
probability distributions.
</p>
</subsection>
<subsection name="8.2 Distribution Framework" href="distributions">
<p>
The distribution framework provides the means to compute probability density
function (PDF) probabilities and cumulative distribution function (CDF)
probabilities for common probability distributions. Along with the direct
computation of PDF and CDF probabilities, the framework also allows for the
computation of inverse CDF values.
</p>
<p>
To use the distribution framework, first create a distribution object. It is
best to create all distribution objects via the
<code>org.apache.commons.math.distribution.DistributionFactory</code>
class. <code>DistributionFactory</code> is a simple factory used to create all
of the distribution objects supported by Commons-Math. Typical usage of
<code>DistributionFactory</code> to create a distribution object is:
</p>
<source>DistributionFactory factory = DistributionFactory.newInstance();
BinomialDistribution binomial = factory.createBinomialDistribution(10, .75);</source>
<p>
The distributions that can be instantiated via the <code>DistributionFactory</code>
are detailed below:
<table>
<tr><th>Distribution</th><th>Factory Method</th><th>Parameters</th></tr>
<tr><td>Binomial</td><td>createBinomialDistribution</td><td><div>Number of trials</div><div>Probability of success</div></td></tr>
<tr><td>Chi-Squared</td><td>createChiSquaredDistribution</td><td><div>Degrees of freedom</div></td></tr>
<tr><td>Exponential</td><td>createExponentialDistribution</td><td><div>Mean</div></td></tr>
<tr><td>F</td><td>createFDistribution</td><td><div>Numerator degrees of freedom</div><div>Denominator degrees of freedom</div></td></tr>
<tr><td>Gamma</td><td>createGammaDistribution</td><td><div>Alpha</div><div>Beta</div></td></tr>
<tr><td>Hypergeometric</td><td>createHypergeometricDistribution</td><td><div>Population size</div><div>Number of successes in population</div><div>Sample size</div></td></tr>
<tr><td>Normal (Gaussian)</td><td>createNormalDistribution</td><td><div>Mean</div><div>Standard Deviation</div></td></tr>
<tr><td>Poisson</td><td>createPoissonDistribution</td><td><div>Mean</div></td></tr>
<tr><td>t</td><td>createTDistribution</td><td><div>Degrees of freedom</div></td></tr>
</table>
</p>
<p>
Using a distribution object, CDF probabilities are easily computed
using the <code>cumulativeProbability</code> methods. For a distribution <code>X</code>
and a domain value <code>x</code>, <code>cumulativeProbability</code> computes
<code>P(X &lt;= x)</code> (i.e. the lower tail probability of <code>X</code>).
</p>
<source>DistributionFactory factory = DistributionFactory.newInstance();
TDistribution t = factory.createTDistribution(29);
double lowerTail = t.cumulativeProbability(-2.656); // P(T &lt;= -2.656)
double upperTail = 1.0 - t.cumulativeProbability(2.75); // P(T >= 2.75)</source>
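<p>
For a discrete distribution the upper tail needs a little more care, because
<code>1 - P(X &lt;= x)</code> is <code>P(X > x)</code>, not <code>P(X >= x)</code>.
The following is a minimal, library-free sketch of that relationship for a binomial
distribution. The class <code>TailSketch</code> and its helper methods are hypothetical
names used for illustration only; they are not part of Commons-Math.
</p>
<source><![CDATA[
```java
class TailSketch {
    // P(X = k) for a binomial with n trials and success probability p.
    // The binomial coefficient is built up iteratively to avoid factorials.
    static double pmf(int n, double p, int k) {
        double coeff = 1.0;
        for (int i = 1; i <= k; i++) {
            coeff = coeff * (n - k + i) / i;
        }
        return coeff * Math.pow(p, k) * Math.pow(1.0 - p, n - k);
    }

    // Lower tail P(X <= x): the sum of the point probabilities from 0 to x.
    static double cumulativeProbability(int n, double p, int x) {
        double sum = 0.0;
        for (int k = 0; k <= Math.min(x, n); k++) {
            sum += pmf(n, p, k);
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 10;
        double p = 0.75;
        double lowerTail = cumulativeProbability(n, p, 6); // P(X <= 6)
        double upperTail = 1.0 - lowerTail;                // P(X > 6), i.e. P(X >= 7)
        System.out.println("P(X <= 6) = " + lowerTail);
        System.out.println("P(X >= 7) = " + upperTail);
    }
}
```
]]></source>
<p>
To obtain <code>P(X >= x)</code> for an integer-valued distribution, subtract the
cumulative probability at <code>x - 1</code> rather than at <code>x</code>.
</p>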
<p>
The inverse CDF values are just as easily computed using the
<code>inverseCumulativeProbability</code> methods. For a distribution <code>X</code>
and a probability <code>p</code>, <code>inverseCumulativeProbability</code>
computes the domain value <code>x</code> such that:
<ul>
<li><code>P(X &lt;= x) = p</code>, for continuous distributions</li>
<li><code>P(X &lt;= x) &lt;= p</code>, for discrete distributions</li>
</ul>
Notice the different cases for continuous and discrete distributions. This is the result
of the CDF of a discrete distribution being a step function, which is not invertible. As such,
for discrete distributions, an exact domain value cannot always be returned, only the "best"
domain value. For Commons-Math, the "best" domain value is the largest domain value whose
cumulative probability is less than or equal to the given probability.
</p>
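<p>
The discrete rule above can be sketched without the library as a simple scan over the
domain. This is only an illustration of the stated convention, not the Commons-Math
implementation. The class <code>DiscreteInverseSketch</code>, its methods, and the choice
to return <code>-1</code> when no domain value qualifies are all assumptions made for this example.
</p>
<source><![CDATA[
```java
class DiscreteInverseSketch {
    // P(X = k) for a binomial with n trials and success probability p.
    static double pmf(int n, double p, int k) {
        double coeff = 1.0;
        for (int i = 1; i <= k; i++) {
            coeff = coeff * (n - k + i) / i;
        }
        return coeff * Math.pow(p, k) * Math.pow(1.0 - p, n - k);
    }

    // Largest x in [0, n] with P(X <= x) <= prob.
    // Returns -1 (an assumption of this sketch) when even P(X <= 0) exceeds prob.
    static int inverseCumulativeProbability(int n, double p, double prob) {
        int best = -1;
        double cumulative = 0.0;
        for (int k = 0; k <= n; k++) {
            cumulative += pmf(n, p, k);
            if (cumulative > prob) {
                break; // the CDF has stepped past prob, so stop at the previous value
            }
            best = k;
        }
        return best;
    }

    public static void main(String[] args) {
        // For Binomial(10, 0.75): P(X <= 7) is about 0.474 and P(X <= 8) is about 0.756,
        // so no x satisfies P(X <= x) = 0.5 exactly and the scan settles on 7.
        System.out.println(inverseCumulativeProbability(10, 0.75, 0.5)); // prints 7
    }
}
```
]]></source>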
</subsection>
<subsection name="8.3 User Defined Distributions" href="userdefined">
<p>
Since there are numerous distributions and Commons-Math directly supports only a handful,
it may be necessary to extend the distribution framework to satisfy individual needs. It
is recommended that the <code>Distribution</code>, <code>ContinuousDistribution</code>,
<code>DiscreteDistribution</code>, and <code>IntegerDistribution</code> interfaces serve as
base types for any extension. These interfaces are the basis for all the distributions directly
supported by Commons-Math, and implementing them ensures that any extension is compatible
with the remainder of Commons-Math. To aid in
implementing a distribution extension, the <code>AbstractDistribution</code>,
<code>AbstractContinuousDistribution</code>, and <code>AbstractIntegerDistribution</code> classes
provide implementation building blocks and offer much of the default distribution
functionality. When these abstract classes are extended directly, a good portion of the
repetitive distribution implementation is already developed, which should save time and effort
in developing user-defined distributions.
</p>
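<p>
The idea of an abstract base class supplying default functionality can be sketched as follows.
To stay self-contained, the example does not use the Commons-Math types. The abstract class
<code>SketchContinuousDistribution</code> is a hypothetical stand-in, and its method set, the
bisection search, and the <code>LogisticSketch</code> subclass are assumptions made for
illustration. A real extension would instead extend the abstract classes named above and
follow their documented contracts.
</p>
<source><![CDATA[
```java
// Hypothetical stand-in for an abstract base class; not a Commons-Math type.
abstract class SketchContinuousDistribution {
    // The only probability a concrete distribution must supply: P(X <= x).
    abstract double cumulativeProbability(double x);

    // A bracket assumed to contain every quantile of practical interest.
    abstract double lowerBound();
    abstract double upperBound();

    // Default built on the single-argument version: P(x0 <= X <= x1).
    double cumulativeProbability(double x0, double x1) {
        return cumulativeProbability(x1) - cumulativeProbability(x0);
    }

    // Default inverse CDF by bisection; relies on the CDF being non-decreasing.
    double inverseCumulativeProbability(double p) {
        if (p < 0.0 || p > 1.0) {
            throw new IllegalArgumentException("p must be in [0, 1]");
        }
        double lo = lowerBound();
        double hi = upperBound();
        for (int i = 0; i < 200; i++) {
            double mid = 0.5 * (lo + hi);
            if (cumulativeProbability(mid) < p) {
                lo = mid;
            } else {
                hi = mid;
            }
        }
        return 0.5 * (lo + hi);
    }
}

// A user-defined logistic distribution: only the CDF and the bracket are written by hand.
class LogisticSketch extends SketchContinuousDistribution {
    private final double location;
    private final double scale;

    LogisticSketch(double location, double scale) {
        this.location = location;
        this.scale = scale;
    }

    double cumulativeProbability(double x) {
        return 1.0 / (1.0 + Math.exp(-(x - location) / scale));
    }

    double lowerBound() { return location - 50.0 * scale; }
    double upperBound() { return location + 50.0 * scale; }

    public static void main(String[] args) {
        LogisticSketch d = new LogisticSketch(2.0, 1.5);
        System.out.println(d.inverseCumulativeProbability(0.5)); // close to the location, 2.0
        System.out.println(d.cumulativeProbability(0.5, 3.5));   // P(0.5 <= X <= 3.5)
    }
}
```
]]></source>
<p>
The subclass inherits the interval probability and the inverse CDF from the base class,
which is the kind of saving the abstract classes are intended to provide.
</p>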
</subsection>
</section>
</body>
</document>