Recovering

git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@141012 13f79535-47bb-0310-9956-ffa450edef68
Mark R. Diggory 2003-11-14 21:50:39 +00:00
parent a58c503cf5
commit dd899e75fc
14 changed files with 1151 additions and 0 deletions

7
src/conf/MANIFEST.MF Normal file

@ -0,0 +1,7 @@
Extension-Name: org.apache.commons.math
Specification-Title: Jakarta Commons Math
Specification-Vendor: Apache Software Foundation
Specification-Version: 0.1
Implementation-Title: org.apache.commons.math
Implementation-Vendor: Apache Software Foundation
Implementation-Version: 0.1

181
xdocs/developers.xml Normal file

@ -0,0 +1,181 @@
<?xml version="1.0"?>
<document>
<properties>
<title>Developers Guide</title>
<author email="rdonkin@apache.org">Robert Burrell Donkin</author>
</properties>
<body>
<section name="Aims">
<p>
Creating and maintaining a mathematical and statistical library that is
accurate requires a greater degree of communication than might be the
case for other components. It is important that developers follow
guidelines laid down by the community to ensure that the code they create
can be successfully maintained by others.
</p>
</section>
<section name='Guidelines'>
<p>
Developers are asked to comply with the following development guidelines.
Code that does not comply with the guidelines including the word <i>must</i>
will not be committed. Our aim will be to fix all of the exceptions to the
"<i>should</i>" guidelines prior to a release.
</p>
<subsection name='Coding Style'>
<p>
Commons-math follows <a href="http://java.sun.com/docs/codeconv/">Code
Conventions for the Java Programming Language</a>. As part of the maven
build process, style checking is performed using the checkStyle plugin,
using the properties specified in <code>checkStyle.properties</code>.
Committed code <i>should</i> generate no checkStyle errors.
</p>
</subsection>
<subsection name='Documentation'>
<ul>
<li>
Committed code <i>must</i> include full javadoc.</li>
<li>
All component contracts <i>must</i> be fully specified in the javadoc class,
interface or method comments, including specification of acceptable ranges
of values, exceptions or special return values.</li>
<li>
References to definitions for all mathematical
terms used in component documentation <i>must</i> be provided, preferably
as HTML links.</li>
<li>
Implementations <i>should</i> use standard algorithms and
references to algorithm descriptions <i>should</i> be provided,
preferably as HTML links.</li>
</ul>
</subsection>
<subsection name='Unit Tests'>
<ul>
<li>
Committed code <i>must</i> include unit tests.</li>
<li>
Unit tests <i>should</i> provide full path coverage. </li>
<li>
Unit tests <i>should</i> verify all boundary conditions specified in
interface contracts, including verification that exceptions are thrown or
special values (e.g. Double.NaN, Double.POSITIVE_INFINITY) are returned as
expected. </li>
</ul>
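<p>
The boundary-condition guideline above can be illustrated with a minimal sketch. The
<code>MeanBoundarySketch</code> class, its <code>mean</code> method and the contract it
states are hypothetical, invented for this example; they are not part of Commons Math.
A real test would normally be written with JUnit rather than hand-rolled checks.
</p>
<source><![CDATA[
```java
/** Hypothetical component plus hand-rolled boundary checks (illustration only). */
final class MeanBoundarySketch {

    /**
     * Returns the arithmetic mean of the values.
     * Assumed contract for this sketch: Double.NaN for an empty array,
     * IllegalArgumentException for a null array, and infinite inputs
     * propagate to the result as in IEEE 754 arithmetic.
     */
    static double mean(double[] values) {
        if (values == null) {
            throw new IllegalArgumentException("values must not be null");
        }
        if (values.length == 0) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /** Returns true only if every boundary condition in the contract is honoured. */
    static boolean boundaryConditionsHold() {
        // NaN never compares equal to itself, so test with Double.isNaN.
        boolean emptyGivesNaN = Double.isNaN(mean(new double[0]));

        // An infinite input must propagate to the result.
        boolean infinityPropagates =
            mean(new double[] {Double.POSITIVE_INFINITY, 1.0}) == Double.POSITIVE_INFINITY;

        // A null array must be rejected with the documented exception.
        boolean nullRejected;
        try {
            mean(null);
            nullRejected = false;
        } catch (IllegalArgumentException expected) {
            nullRejected = true;
        }
        return emptyGivesNaN && infinityPropagates && nullRejected;
    }

    public static void main(String[] args) {
        System.out.println("boundary conditions hold: " + boundaryConditionsHold());
    }
}
```
]]></source>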
</subsection>
<subsection name='Licensing and copyright'>
<ul>
<li>
All new source file submissions <i>must</i> include the Apache Software
License in a comment that begins the file </li>
<li>
All contributions must comply with the terms of the
<a href="http://www.apache.org/foundation/ASF_Contributor_License_1_form.pdf">
Apache Contributor License Agreement (CLA)</a></li>
<li>
Patches <i>must</i> be accompanied by a clear reference to a "source"
- if code has been "ported" from another language, clearly state the
source of the original implementation. If the "expression" of a given
algorithm is derivative, please note the original source (textbook,
paper, etc.).</li>
<li>
References to source materials covered by restrictive proprietary
licenses should be avoided.</li>
</ul>
</subsection>
</section>
<section name='Recommended Readings'>
<p>
Here is a list of relevant materials. Much of the discussion surrounding
the development of this component will refer to the various sources
listed below, and frequently the Javadoc for a particular class or
interface will link to a definition contained in these documents.
</p>
<subsection name='Recommended Readings'>
<dl>
<dt>Concerning floating point arithmetic.</dt>
<dd>
<a href="http://www.validlab.com/goldberg/paper.ps">
http://www.validlab.com/goldberg/paper.ps
</a><br/>
<a href="http://www.cs.berkeley.edu/~wkahan/ieee754status/ieee754.ps">
http://www.cs.berkeley.edu/~wkahan/ieee754status/ieee754.ps
</a><br/>
<a href="http://www.cs.berkeley.edu/~wkahan/JAVAhurt.pdf">
http://www.cs.berkeley.edu/~wkahan/JAVAhurt.pdf
</a><br/>
</dd>
<dt>Numerical analysis</dt>
<dd>
<a href="http://www.nr.com/">
Numerical Recipes (NR)
</a><br/>
<a href="http://www.mathcom.com/corpdir/techinfo.mdir/scifaq/index.html">
Scientific Computing FAQ @ Mathcom
</a><br/>
<a href="http://www.ma.man.ac.uk/~higham/asna/asna2.pdf">
Bibliography of accuracy and stability of numerical algorithms
</a><br/>
<a href="http://tonic.physics.sunysb.edu/docs/num_meth.html">
SUNY Stony Brook numerical methods page
</a><br/>
<a href="http://epubs.siam.org/sam-bin/dbq/toclist/SINUM">
SIAM Journal on Numerical Analysis Online
</a><br/>
</dd>
<dt>Probability and statistics</dt>
<dd>
<a href="http://lib.stat.cmu.edu/">
Statlib at CMU
</a><br/>
<a href="http://www.itl.nist.gov/div898/handbook/">
NIST Engineering Statistics Handbook
</a><br/>
<a href="http://www.psychstat.smsu.edu/sbk00.htm">
Online Introductory Statistics (David W. Stockburger)
</a><br/>
<a href="http://www.ubmail.ubalt.edu/~harsham/statistics/REFSTAT.HTM">
Probability and Statistics Resources
</a><br/>
<a href="http://www.jstatsoft.org/">
Online Journal of Statistical Software
</a><br/>
</dd>
</dl>
</subsection>
<subsection name='Javadoc comment resources'>
<dl>
<dt>References for mathematical definitions.</dt>
<dd>
<a href="http://rd11.web.cern.ch/RD11/rkb/titleA.html">
http://rd11.web.cern.ch/RD11/rkb/titleA.html
</a><br/>
<a href="http://mathworld.wolfram.com/">
http://mathworld.wolfram.com/
</a><br/>
<a href="http://www.itl.nist.gov/div898/handbook">
http://www.itl.nist.gov/div898/handbook
</a><br/>
<a href="http://doi.acm.org/10.1145/359146.359152">
Chan, T. F. and J. G. Lewis 1979, <i>Communications of the ACM</i>,
vol. 22 no. 9, pp. 526-531.
</a><br/>
<a href="http://www.wikipedia.org/wiki/">
http://www.wikipedia.org/wiki/
</a><br/>
</dd>
</dl>
</subsection>
<subsection name='XML'>
<dl>
<dt>XML related resources.</dt>
<dd>
<a href="http://www.openmath.org">
http://www.openmath.org
</a><br/>
</dd>
</dl>
</subsection>
</section>
</body>
</document>

90
xdocs/index.xml Normal file

@ -0,0 +1,90 @@
<?xml version="1.0"?>
<document>
<properties>
<title>Commons-Math: The Jakarta Mathematics Library</title>
<author email="rdonkin@apache.org">Robert Burrell Donkin</author>
<author email="tobrien@apache.org">Tim O'Brien</author>
</properties>
<body>
<section name="Commons-Math: The Jakarta Mathematics Library" href="summary">
<p>
The Java programming language and the math extensions in
Commons Lang provide implementations for only the most basic
mathematical algorithms. Routine development tasks such as
computing basic statistics or solving a system of linear equations
require components not available in Java or Commons Lang.
</p>
<p>
Most basic mathematical or statistical algorithms are available in
open source implementations, but to assemble a simple set of
capabilities one has to use multiple libraries, many of which have
more restrictive licensing terms than the ASF. In addition, many
of the best open source implementations (e.g. the R statistical
package) are either not available in Java or require large support
libraries and/or external dependencies to work.
</p>
<p>
Commons Math is a library of lightweight, self-contained
mathematics and statistics components addressing the most common
problems not available in the Java programming language or Commons
Lang.
</p>
<p>
Guiding principles:
<ol>
<li>
Real-world application use cases will determine development
priority.
</li>
<li>
This package will emphasize small, easily integrated components
rather than large libraries with complex dependencies and
configurations.
</li>
<li>
All algorithms will be fully documented and follow generally
accepted best practices.
</li>
<li>
In situations where multiple standard algorithms exist, a
Strategy pattern will be used to support multiple
implementations.
</li>
<li>
Limited dependencies. No external dependencies beyond Commons
components and the core Java 2 platform.
</li>
</ol>
</p>
<subsection name='An Apology To British Users And Developers'>
<p>
Yes - I know that it should be commons-maths. But think of all the
bandwidth saved by losing that 's' ;)
</p>
</subsection>
</section>
<section name="Download Math">
<subsection name="Releases">
<p>
There haven't been any yet! The more people who contribute, the
quicker this will happen.
</p>
</subsection>
<subsection name="Nightly Builds">
<p>
Nightly builds are built once a day from the current CVS HEAD.
This is (nearly) the latest code and so should be treated with
caution!
</p>
<p>
You can get the nightly builds from <a
href="http://jakarta.apache.org/builds/jakarta-commons/nightly/commons-math/">here</a>
</p>
</subsection>
</section>
</body>
</document>

25
xdocs/navigation.xml Normal file

@ -0,0 +1,25 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- $Revision: 1.5 $ $Date: 2003/11/14 21:49:41 $ -->
<project name="Math">
<title>Math</title>
<organizationLogo href="/images/jakarta-logo-blue.gif">Jakarta</organizationLogo>
<body>
<menu name="Math">
<item name="Overview" href="/index.html"/>
<item name="Proposal" href="/proposal.html"/>
<item name="Developers Guide" href="/developers.html"/>
<item name="Tasks: Done And To Do" href="/tasks.html"/>
</menu>
<menu name="User Guide">
<item name="Contents" href="/userguide/index.html"/>
<item name="Overview" href="/userguide/overview.html"/>
<item name="Statistics" href="/userguide/stat.html"/>
<item name="Data generation" href="/userguide/random.html"/>
<item name="Linear Algebra" href="/userguide/linear.html"/>
<item name="Special Functions" href="/userguide/special.html"/>
<item name="Utilities" href="/userguide/utilities.html"/>
</menu>
</body>
</project>

114
xdocs/proposal.xml Normal file

@ -0,0 +1,114 @@
<?xml version="1.0"?>
<document>
<properties>
<title>Proposal for math Package</title>
<author email="martin@mvdb.net">Robert Burrell Donkin</author>
</properties>
<body>
<section name='Proposal for math Package'>
<subsection name='(0) Rationale'>
<p>The Java programming language and the math extensions in commons-lang provide implementations
for only the most basic mathematical algorithms. Routine development tasks such as computing
basic statistics or solving a system of linear equations require components not available in Java
or commons-lang.</p>
<p>Most basic mathematical or statistical algorithms are available in open source implementations,
but to assemble a simple set of capabilities one has to use multiple libraries, many of which have
more restrictive licensing terms than the ASF. In addition, many of the best open source
implementations (e.g. the R statistical package) are either not available in Java or require large
support libraries and/or external dependencies to work.</p>
<p>A commons-math community will provide a productive environment for aggregation, testing and
support of efficient Java implementations of commonly used mathematical and statistical algorithms.</p>
</subsection>
<subsection name='(1) Scope of the Package'>
<p>The Math project shall create and maintain a library of lightweight, self-contained mathematics
and statistics components addressing the most common practical problems not immediately available in
the Java programming language or commons-lang. The guiding principles for commons-math will be:
<ol>
<li>Real-world application use cases determine priority</li>
<li>Emphasis on small, easily integrated components rather than large libraries with complex
dependencies</li>
<li>All algorithms are fully documented and follow generally accepted best practices</li>
<li>In situations where multiple standard algorithms exist, use the Strategy pattern to support
multiple implementations</li>
<li>Limited dependencies. No external dependencies beyond Commons components and the JDK</li>
</ol>
</p>
</subsection>
<subsection name='(1.5) Interaction With Other Packages'>
<p><em>math</em> relies only on standard JDK 1.2 (or later) APIs for
production deployment. It utilizes the JUnit unit testing framework for
developing and executing unit tests, but this is of interest only to
developers of the component.</p>
<p>No external configuration files are utilized.</p>
</subsection>
<subsection name='(2) Initial Source of the Package'>
<p>The initial codebase will consist of implementations of basic statistical algorithms such
as the following:
<ul>
<li>Simple univariate statistics (mean, standard deviation, n, confidence intervals)</li>
<li>Frequency distributions</li>
<li>t-test, chi-square test</li>
<li>Random numbers from Gaussian, Exponential, Poisson distributions</li>
<li>Random sampling/resampling</li>
<li>Bivariate regression, correlation</li>
</ul>
and mathematical algorithms such as the following:
<ul>
<li>Basic Complex Number representation with algebraic operations</li>
<li>Newton's method for finding roots</li>
<li>Binomial coefficients</li>
<li>Exponential growth and decay (set up for financial applications)</li>
<li>Polynomial Interpolation (curve fitting)</li>
<li>Basic Matrix representation with algebraic operations</li>
</ul>
</p>
<p>The proposed package name for the new component is
<code>org.apache.commons.math</code>.</p>
</subsection>
<subsection name='(3) Required Jakarta-Commons Resources'>
<ul>
<li>CVS Repository - New directory <code>math</code> in the
<code>jakarta-commons</code> CVS repository.</li>
<li>Mailing List - Discussions will take place on the general
<em>commons-dev@jakarta.apache.org</em> mailing list. To help
list subscribers identify messages of interest, it is suggested that
the message subject of messages about this component be prefixed with
[math].</li>
<li>Bugzilla - New component "math" under the "Commons" product
category, with appropriate version identifiers as needed.</li>
<li>Jyve FAQ - New category "commons-math" (when available).</li>
</ul>
</subsection>
<subsection name='(4) Initial Committers'>
<p>The initial committers on the math component shall be:
<ul>
<li><a href="mailto:rdonkin@apache.org">Robert Burrell Donkin</a></li>
<li><a href="mailto:tobrien@apache.org">Tim O'Brien</a></li>
</ul>
</p>
</subsection>
</section>
</body>
</document>

93
xdocs/tasks.xml Normal file

@ -0,0 +1,93 @@
<?xml version="1.0"?>
<!-- $Revision: 1.14 $ $Date: 2003/11/14 21:49:41 $ -->
<document>
<properties>
<title>Tasks: Done And To Do</title>
</properties>
<body>
<section name="Aim">
<p>This page aims to be a handy reference not only of the work done but also of work pending for the next planned release. Users who want new features should submit patches to this page in the unclassified section of this document. Developers who want to lend a hand can grab tasks from this page. Everyone can see the progress which is being made.</p>
</section>
<section name="TODO list">
<p>The following is a list of items still <code>TODO</code> for Math. Contributions are welcome!</p>
<subsection name="Documentation and Code Conformance Tasks">
<p>Many of these will always be required. Please focus on applying format standards and provide as many test cases as possible for your code.</p>
<dl>
<dt>Develop user's guide following the package structure.</dt>
<dd>Provide any comments on this task here.</dd>
<dt>Performance and accuracy testing.</dt>
<dd>If anyone is interested in helping out here, what we could really use is a wider selection of test cases for the core numerical functions and validation against either other packages (e.g. R for the statistical stuff), verified datasets, or experiments comparing implementations using floats to doubles.</dd>
<dt>Test Coverage.</dt>
<dd>Clover tests show gaps in test path coverage. Get all tests to 100% coverage. Also improve test data and boundary conditions coverage.</dd>
<dt>Code review.</dt>
<dd>
<p>Code review is a continuous process that all Contributors and Developers should practice while working on the code base.</p>
<ul>
<li>Javadoc generation is still throwing warnings. Bring the Javadoc into compliance (i.e. reach zero warnings).</li>
<li>Verify that the code matches the documentation and identify obvious inefficiencies or numerical problems. All feedback/suggestions for improvement/patches are welcome.</li>
<li>CheckStyle with modified properties still shows many errors. Try to clean these up.</li>
</ul>
</dd>
</dl>
</subsection>
<subsection name="Algorithm Development Tasks">
<p>These tasks are planned and need to be completed for the initial release.</p>
<dl>
<dt>Add confidence intervals to Univariate implementations.</dt>
<dd>Provide any comments on this task here.</dd>
<dt>Distributions.</dt>
<dd>Extend distribution framework to support discrete distributions and implement binomial and hypergeometric distributions.</dd>
<dt>Analysis.</dt>
<dd>
<ul>
<li>Rework unit tests for root finding and spline interpolation.</li>
</ul>
</dd>
<dt>Utilities.</dt>
<dd>Finalize the contents of MathUtils and StatUtils. Suggest any additions -- ideally with patches -- to these utility classes.</dd>
<dt>Complex Number Library.</dt>
<dd>
An initial <a href="http://nagoya.apache.org/bugzilla/show_bug.cgi?id=24241">submission</a>
of a complex number library has been donated by Brent Worden. It has been added
and is open to be reviewed/tested by others for feedback.
The implementation is based on the following source:
<a href="http://myweb.lmu.edu/dmsmith/ZMLIB.pdf">http://myweb.lmu.edu/dmsmith/ZMLIB.pdf</a>.
<ul>
<li><a href="http://nagoya.apache.org/eyebrowse/ReadMsg?listId=15&amp;msgNo=28132">
Thread Subject: [math] Complex dilemmas
</a></li>
<li><a href="http://nagoya.apache.org/eyebrowse/ReadMsg?listName=commons-dev@jakarta.apache.org&amp;msgNo=36293">
Thread Subject: [math] Complex implementation
</a></li>
</ul>
</dd>
</dl>
</subsection>
</section>
<!--section name="Future Goals">
<subsection name="Delayed Tasks slated for the next release of the Math library">
</subsection>
</section-->
<section name="Completed">
<subsection name="Since Conception">
<ul>
<li>Framework and implementation strategies for finding roots of real-valued functions of one (real) variable. Implemented algorithms: Brent-Dekker, secant, simple bisection.</li>
<li>Cubic spline interpolation.</li>
<li>Bivariate Regression, correlation. </li>
<li>Sampling from Collections</li>
<li>Add higher order moments to Univariate implementations.</li>
<li>Binomial coefficients -- incorporate an "exact" implementation that is limited to what can be stored in a long. Also provided double-value implementation of binomial coefficients and their logs.</li>
<li>Add percentiles to stored Univariate implementations</li>
<li>Improve numerical accuracy of Univariate and BivariateRegression statistical computations</li>
<li>t-test statistic needs to be added and we should probably add the capability of actually performing t- and chi-square tests at fixed significance levels (.1, .05, .01, .001).</li>
<li>numerical approximation of the t- and chi-square distributions to enable user-supplied significance levels.</li>
<li>The RealMatrixImpl class is missing some key method implementations. The critical thing is solution of linear systems. We need to implement a numerically sound solution algorithm. This will enable inverse() and also support general linear regression.</li>
<li>Added double[] |-> double methods in StatUtils to take start indexes and length as parameters and delegate the current "full array" versions to these.</li>
</ul>
</subsection>
</section>
</body>
</document>


@ -0,0 +1,85 @@
<?xml version="1.0"?>
<!-- $Revision: 1.4 $ $Date: 2003/11/14 21:50:39 $ -->
<document url="analysis.html">
<properties>
<title>The Commons Math User Guide - Numerical Analysis</title>
<author email="phil@steitz.com">Phil Steitz</author>
</properties>
<body>
<section name="4 Numerical Analysis">
<subsection name="4.1 Overview" href="overview">
<p>This is yet to be written. Any contributions will be gratefully
accepted!</p>
</subsection>
<subsection name="4.2 Root-finding" href="rootfinding">
<p>
<code>org.apache.commons.math.analysis.UnivariateRealSolver</code> provides the means to
find roots of univariate, real-valued functions. Commons-Math supports various
implementations of <code>UnivariateRealSolver</code> to solve functions with differing
characteristics.
</p>
<p>
To use the root-finding features, first create a solver object. We encourage creating
all solver objects via the <code>org.apache.commons.math.analysis.UnivariateRealSolverFactory</code>
class. <code>UnivariateRealSolverFactory</code> is a simple factory used to create all
of the solver objects supported by Commons-Math. The typical usage of <code>UnivariateRealSolverFactory</code>
to create a solver object would be:</p>
<source>UnivariateRealFunction function = // some user defined function object
UnivariateRealSolverFactory factory = UnivariateRealSolverFactory.newInstance();
UnivariateRealSolver solver = factory.newDefaultSolver(function);</source>
<p>
The solvers that can be instantiated via the <code>UnivariateRealSolverFactory</code> are detailed below:
<table>
<tr><th>Solver</th><th>Factory Method</th><th>Notes on Use</th></tr>
<tr><td>Bisection</td><td>newBisectionSolver</td><td><div>Root must be bracketed.</div><div>Linear, guaranteed convergence</div></td></tr>
<tr><td>Brent</td><td>newBrentSolver</td><td><div>Root must be bracketed.</div><div>Super-linear, guaranteed convergence</div></td></tr>
<tr><td>Secant</td><td>newSecantSolver</td><td><div>Root must be bracketed.</div><div>Super-linear, non-guaranteed convergence</div></td></tr>
</table>
</p>
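<p>
To make the "non-guaranteed convergence" note in the table concrete, here is a textbook
secant iteration written as a stand-alone sketch. It is not the Commons Math
<code>newSecantSolver</code> implementation: the class name <code>SecantSketch</code>, the
use of <code>java.util.function.DoubleUnaryOperator</code> (Java 8 or later) and the
stopping rule are all assumptions made for illustration. Unlike the solvers in the table,
this sketch does not check that the root is bracketed. It can fail when the secant line
becomes horizontal or when the iteration budget runs out, which is why both cases throw.
</p>
<source><![CDATA[
```java
import java.util.function.DoubleUnaryOperator;

/** Textbook secant iteration (illustration only, not the library code). */
final class SecantSketch {

    /**
     * Iterates from the two starting points x0 and x1 until successive
     * estimates differ by at most absoluteAccuracy.
     */
    static double solve(DoubleUnaryOperator f, double x0, double x1,
                        double absoluteAccuracy, int maxIterations) {
        double f0 = f.applyAsDouble(x0);
        double f1 = f.applyAsDouble(x1);
        for (int i = 0; i < maxIterations; i++) {
            if (f1 == 0.0) {
                return x1; // landed exactly on a root
            }
            double denominator = f1 - f0;
            if (denominator == 0.0) {
                // The secant through the two points is horizontal: no next estimate.
                throw new ArithmeticException("secant line is horizontal");
            }
            // Root of the straight line through (x0, f0) and (x1, f1).
            double x2 = x1 - f1 * (x1 - x0) / denominator;
            if (Math.abs(x2 - x1) <= absoluteAccuracy) {
                return x2;
            }
            x0 = x1;
            f0 = f1;
            x1 = x2;
            f1 = f.applyAsDouble(x2);
        }
        throw new IllegalStateException(
            "no convergence within " + maxIterations + " iterations");
    }

    public static void main(String[] args) {
        System.out.println(solve(x -> x * x - 4.0, 1.0, 5.0, 1e-12, 50));
    }
}
```
]]></source>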
<p>
Using a solver object, roots of functions are easily found using the <code>solve</code>
methods. For a function <code>f</code>, and two domain values, <code>min</code> and
<code>max</code>, <code>solve</code> computes the value <code>c</code> such that:
<ul>
<li><code>f(c) = 0.0</code></li>
<li><code>min &lt;= c &lt;= max</code></li>
</ul>
</p>
<source>UnivariateRealFunction function = // some user defined function object
UnivariateRealSolverFactory factory = UnivariateRealSolverFactory.newInstance();
UnivariateRealSolver solver = factory.newBisectionSolver(function);
double c = solver.solve(1.0, 5.0);</source>
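<p>
For readers who want to see what a bracketing solve amounts to, the following stand-alone
sketch implements plain bisection with the same <code>min</code>/<code>max</code> contract
as described above. It is an illustration under stated assumptions, not the library's
bisection solver: <code>BisectionSketch</code>, the explicit accuracy and iteration
parameters, and the use of <code>java.util.function.DoubleUnaryOperator</code> (Java 8 or
later) in place of <code>UnivariateRealFunction</code> are all hypothetical.
</p>
<source><![CDATA[
```java
import java.util.function.DoubleUnaryOperator;

/** Plain bisection on a bracketing interval (illustration only). */
final class BisectionSketch {

    /**
     * Returns c with min <= c <= max and f(c) approximately 0.
     * Precondition: min < max and f changes sign (or is zero) on the endpoints.
     */
    static double solve(DoubleUnaryOperator f, double min, double max,
                        double absoluteAccuracy, int maxIterations) {
        if (!(min < max)) {
            throw new IllegalArgumentException("min must be less than max");
        }
        double fLo = f.applyAsDouble(min);
        double fHi = f.applyAsDouble(max);
        if (fLo == 0.0) {
            return min;
        }
        if (fHi == 0.0) {
            return max;
        }
        if (Math.signum(fLo) == Math.signum(fHi)) {
            throw new IllegalArgumentException("root is not bracketed by [min, max]");
        }
        double lo = min;
        double hi = max;
        for (int i = 0; i < maxIterations; i++) {
            double mid = lo + 0.5 * (hi - lo);
            double fMid = f.applyAsDouble(mid);
            // Stop on an exact root or once the half-width is within tolerance.
            if (fMid == 0.0 || 0.5 * (hi - lo) <= absoluteAccuracy) {
                return mid;
            }
            // Keep the half-interval on which the sign change survives.
            if (Math.signum(fMid) == Math.signum(fLo)) {
                lo = mid;
                fLo = fMid;
            } else {
                hi = mid;
            }
        }
        throw new IllegalStateException("maximum iteration count exceeded");
    }

    public static void main(String[] args) {
        System.out.println(solve(x -> x * x - 4.0, 1.0, 5.0, 1e-9, 100));
    }
}
```
]]></source>
<p>
Each pass halves the bracketing interval, which is the linear, guaranteed convergence
listed for bisection in the solver table.
</p>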
<p>
Along with the <code>solve</code> methods, the <code>UnivariateRealSolver</code>
interface provides many properties to control the convergence of a solver. For the most
part, these properties should not have to change from their default values to produce
quality results. In the circumstances where changing these property values is needed, it
is easily done through getter and setter methods on <code>UnivariateRealSolver</code>:
<table>
<tr><th>Property</th><th>Methods</th><th>Purpose</th></tr>
<tr><td>Absolute accuracy</td><td>
<div>getAbsoluteAccuracy</div>
<div>resetAbsoluteAccuracy</div>
<div>setAbsoluteAccuracy</div></td><td>This is yet to be written. Any contributions will be gratefully accepted!</td></tr>
<tr><td>Function value accuracy</td><td>
<div>getFunctionValueAccuracy</div>
<div>resetFunctionValueAccuracy</div>
<div>setFunctionValueAccuracy</div></td><td>This is yet to be written. Any contributions will be gratefully accepted!</td></tr>
<tr><td>Maximum iteration count</td><td>
<div>getMaximumIterationCount</div>
<div>resetMaximumIterationCount</div>
<div>setMaximumIterationCount</div></td><td>This is yet to be written. Any contributions will be gratefully accepted!</td></tr>
<tr><td>Relative accuracy</td><td>
<div>getRelativeAccuracy</div>
<div>resetRelativeAccuracy</div>
<div>setRelativeAccuracy</div></td><td>This is yet to be written. Any contributions will be gratefully accepted!</td></tr>
</table>
</p>
</subsection>
<subsection name="4.3 Interpolation" href="interpolation">
<p>This is yet to be written. Any contributions will be gratefully
accepted!</p>
</subsection>
</section>
</body>
</document>

70
xdocs/userguide/index.xml Normal file
View File

@ -0,0 +1,70 @@
<?xml version="1.0"?>
<document url="index.html">
<properties>
<author email="phil@steitz.com">Phil Steitz</author>
<title>The Commons Math User Guide - Table of Contents</title>
</properties>
<body>
<section name="Table of Contents" href="toc">
<ul>
<li><a href="overview.html">0. Overview</a>
<ul>
<li><a href="overview.html#about">0.1 About the User Guide</a></li>
<li><a href="overview.html#summary">0.2 What's in commons-math</a></li>
<li><a href="overview.html#organization">0.3 How commons-math is organized</a></li>
<li><a href="overview.html#contracts">0.4 How interface contracts are specified in commons-math javadoc</a></li>
<li><a href="overview.html#dependencies">0.5 Dependencies</a></li>
</ul></li>
<li><a href="stat.html">1. Statistics</a>
<ul>
<li><a href="stat.html#overview">1.1 Overview</a></li>
<li><a href="stat.html#univariate">1.2 Univariate statistics</a></li>
<li><a href="stat.html#frequency">1.3 Frequency distributions</a></li>
<li><a href="stat.html#regression">1.4 Bivariate regression</a></li>
<li><a href="stat.html#tests">1.5 Statistical tests</a></li>
<li><a href="stat.html#distributions">1.6 Distribution framework</a></li>
</ul></li>
<li><a href="random.html">2. Data Generation</a>
<ul>
<li><a href="random.html#overview">2.1 Overview</a></li>
<li><a href="random.html#deviates">2.2 Random numbers</a></li>
<li><a href="random.html#strings">2.3 Random Strings</a></li>
<li><a href="random.html#combinatorics">2.4 Random permutations, combinations, sampling</a></li>
<li><a href="random.html#empirical">2.5 Generating data "like" an input file</a></li>
</ul></li>
<li><a href="linear.html">3. Linear Algebra</a>
<ul>
<li><a href="linear.html#overview">3.1 Overview</a></li>
<li><a href="linear.html#real_matrices">3.2 Real matrices</a></li>
<li><a href="linear.html#solve">3.3 Solving linear systems</a></li>
</ul></li>
<li><a href="analysis.html">4. Numerical Analysis</a>
<ul>
<li><a href="analysis.html#overview">4.1 Overview</a></li>
<li><a href="analysis.html#rootfinding">4.2 Root-finding</a></li>
<li><a href="analysis.html#interpolation">4.3 Interpolation</a></li>
</ul></li>
<li><a href="special.html">5. Special Functions</a>
<ul>
<li><a href="special.html#overview">5.1 Overview</a></li>
<li><a href="special.html#gamma">5.2 Gamma functions</a></li>
<li><a href="special.html#beta">5.3 Beta functions</a></li>
</ul></li>
<li><a href="utilities.html">6. Utilities</a>
<ul>
<li><a href="utilities.html#overview">6.1 Overview</a></li>
<li><a href="utilities.html#arrays">6.2 Double array utilities</a></li>
<li><a href="utilities.html#continued_fractions">6.3 Continued Fractions</a></li>
<li><a href="utilities.html#math_utils">6.4 binomial coefficients, factorials and other common math functions</a></li>
<li><a href="utilities.html#stat_utils">6.5 statistical computation utilities</a></li>
</ul></li>
</ul>
</section>
</body>
</document>


@ -0,0 +1,34 @@
<?xml version="1.0"?>
<document url="linear.html">
<properties>
<title>The Commons Math User Guide - Linear Algebra</title>
<author email="phil@steitz.com">Phil Steitz</author>
</properties>
<body>
<section name="3 Linear Algebra">
<subsection name="3.1 Overview" href="overview">
<p>
This is yet to be written. Any contributions will be gratefully accepted!
</p>
</subsection>
<subsection name="3.2 Real matrices" href="real_matrices">
<p>
This is yet to be written. Any contributions will be gratefully accepted!
</p>
</subsection>
<subsection name="3.3 Solving linear systems" href="solve">
<p>
This is yet to be written. Any contributions will be gratefully accepted!
</p>
</subsection>
</section>
</body>
</document>


@ -0,0 +1,103 @@
<?xml version="1.0"?>
<document>
<properties>
<title>User Guide - Overview</title>
<author email="rdonkin@apache.org">Robert Burrell Donkin</author>
<author email="phil@steitz.com">Phil Steitz</author>
</properties>
<body>
<section name="Overview">
<subsection name="0.1 About The User Guide" href="about">
<p>
This guide is intended to help programmers quickly find what they need to develop
solutions using Commons Math. It also provides a supplement to the javadoc API documentation,
providing a little more explanation of the mathematical objects and functions included
in the package.
</p>
</subsection>
<subsection name="0.2 What's in commons-math" href="summary">
<p>
Commons Math is made up of a small set of math/stat utilities addressing
programming problems like the ones in the list below. This list is not exhaustive;
it is just meant to give a feel for the kinds of things that Commons Math provides.
<ul>
<li>Computing means, variances and other summary statistics for a list of numbers</li>
<li>Fitting a line to a set of data points using linear regression</li>
<li>Solving equations involving real-valued functions (i.e. root-finding)</li>
<li>Performing statistical significance tests</li>
<li>Solving systems of linear equations</li>
<li>Generating random numbers with more restrictions (e.g. distribution, range) than what
is possible using the JDK</li>
<li>Generating random samples and/or datasets that are "like" the data in an input file</li>
<li>Finding a smooth curve that passes through a collection of points (interpolation)</li>
<li>Miscellaneous mathematical functions such as factorials and binomial
coefficients</li>
</ul></p>
<p>
Commons Math is a new project and we are actively seeking ideas for additional components that
fit into the <a href="../index.html#summary">Commons Math vision</a> of a set of lightweight,
self-contained math/stat components useful for solving common programming problems.
Suggestions for new components or enhancements to existing functionality are always welcome!
All feedback/suggestions for improvement should be sent to the
<a href="http://jakarta.apache.org/site/mail.html">commons-dev mailing list</a> with
[math] at the beginning of the subject line.
</p>
</subsection>
<subsection name="0.3 How commons-math is organized" href="organization">
<p>
Commons Math is divided into 6 subpackages, based on functionality provided.
<ol><li><a href="stat.html">org.apache.commons.math.stat</a> - statistics, statistical tests, probability distributions</li>
<li><a href="analysis.html">org.apache.commons.math.analysis</a> - rootfinding and interpolation</li>
<li><a href="random.html">org.apache.commons.math.random</a> - random numbers, strings and data generation</li>
<li><a href="special.html">org.apache.commons.math.special</a> - special functions (Gamma, Beta) </li>
<li><a href="linear.html">org.apache.commons.math.linear</a> - matrices, solving linear systems </li>
<li><a href="utilities.html">org.apache.commons.math.util</a> - common math/stat functions extending java.lang.Math </li>
</ol>
Package javadocs are <a href="../apidocs/index.html">here</a>
</p>
</subsection>
<subsection name="0.4 How interface contracts are specified in commons-math javadoc" href="contracts">
<p>
You should always read the javadoc class and method comments carefully when using
Commons Math components in your programs. The javadoc provides references to the algorithms
that are used, usage notes about limitations, performance, etc. as well as interface contracts.
Interface contracts are specified in terms of preconditions (what has to be true in order
for the method to return valid results), special values returned (e.g. Double.NaN)
or exceptions that may be thrown if the preconditions are not met, and definitions for returned
values/objects or state changes.</p>
<p>
When the actual parameters provided to a method or the internal state of an object
make a computation meaningless, an IllegalArgumentException or IllegalStateException may
be thrown. Exact conditions under which runtime exceptions (and any other exceptions) are
thrown are specified in the javadoc method comments. In some cases, to be consistent with
the <a href="http://grouper.ieee.org/groups/754/">IEEE 754 standard</a> for floating point
arithmetic and with java.lang.Math, Commons Math methods return Double.NaN values.
Conditions under which Double.NaN or other special values are returned are fully specified
in the javadoc method comments.
</p>
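<p>
The two contract styles described above (a special return value versus a runtime exception)
can be sketched as follows. This is only an illustration built on <code>java.lang</code>; the class
<code>ContractSketch</code> and its <code>mean</code> method are hypothetical and are not part of Commons Math.
</p>

```java
// Hypothetical helper, not part of Commons Math: it only illustrates the two contract styles.
final class ContractSketch {
    private ContractSketch() {}

    /**
     * Returns the arithmetic mean of the values.
     * Precondition: values is not null (otherwise IllegalArgumentException).
     * Special value: Double.NaN is returned for an empty array.
     */
    static double mean(double[] values) {
        if (values == null) {
            throw new IllegalArgumentException("values must not be null");
        }
        if (values.length == 0) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    public static void main(String[] args) {
        double result = mean(new double[0]);
        // NaN never compares equal to itself, so test for it with Double.isNaN
        System.out.println(Double.isNaN(result));
        // java.lang.Math follows the same convention for undefined results
        System.out.println(Double.isNaN(Math.sqrt(-1.0)));
    }
}
```

Callers should test such results with <code>Double.isNaN</code> rather than with <code>==</code>, since NaN is not equal to any value, including itself.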
</subsection>
<subsection name="0.5 Dependencies" href="dependencies">
<p>
Commons Math requires JDK 1.2+ and has no dependencies other than the following Jakarta Commons
components:
<ul>
<li>commons-beanutils 1.5 </li>
<li>commons-collections 2.1 </li>
<li>commons-logging 1.0.3 </li>
</ul>
</p>
</subsection>
</section>
</body>
</document>

172
xdocs/userguide/random.xml Normal file
View File

@ -0,0 +1,172 @@
<?xml version="1.0"?>
<document url="random.html">
<properties>
<title>The Commons Math User Guide - Data Generation</title>
<author email="phil@steitz.com">Phil Steitz</author>
</properties>
<body>
<section name="2 Data Generation">
<subsection name="2.1 Overview" href="overview">
<p>
The Commons Math random package includes utilities for
<ul>
<li>generating random numbers</li>
<li>generating random strings</li>
<li>generating cryptographically secure sequences of random numbers or strings</li>
<li>generating random samples and permutations</li>
<li>analyzing distributions of values in an input file and generating values "like"
the values in the file</li>
<li>generating data for grouped frequency distributions or histograms</li>
</ul></p>
</subsection>
<subsection name="2.2 Random numbers" href="deviates">
<p>
The <a href="../apidocs/org/apache/commons/math/random/RandomData.html">
org.apache.commons.math.random.RandomData</a> interface defines methods for generating
random sequences of numbers. The API contracts of these methods use the following concepts:
<dl>
<dt>Random sequence of numbers from a probability distribution</dt>
<dd>There is no such thing as a single "random number." What can be generated
are <i>sequences</i> of numbers that appear to be random. When using the
built-in JDK function <code>Math.random()</code>, sequences of values generated
follow the <a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda3662.htm">
Uniform Distribution</a>, which means that the values are evenly spread over the interval
between 0 and 1, with no sub-interval having a greater probability of containing generated
values than any other interval of the same length. The mathematical concept of a <a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda36.htm">
probability distribution</a> basically amounts to asserting that different ranges in the set
of possible values of a random variable have different probabilities of containing the value.
Commons Math supports generating random sequences from the following probability distributions. The
javadoc for the <code>nextXxx</code> methods in <code>RandomDataImpl</code> describes the algorithms used
to generate random deviates from each of these distributions.
<ul>
<li><a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda3662.htm">uniform distribution</a></li>
<li><a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda3667.htm">exponential distribution</a></li>
<li><a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda366j.htm">Poisson distribution</a></li>
<li><a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda3661.htm">Gaussian distribution</a></li>
</ul>
</dd>
<dt>Cryptographically secure random sequences</dt>
<dd>It is possible for a sequence of numbers to appear random, but nonetheless to be
predictable based on the algorithm used to generate the sequence. If in addition to
randomness, strong unpredictability is required, it is best to use a
<a href="http://www.wikipedia.org/wiki/Cryptographically_secure_pseudo-random_number_generator">
secure random number generator</a> to generate values (or strings). The nextSecureXxx methods
in the <code>RandomDataImpl</code> implementation of the <code>RandomData</code> interface use the
JDK <code>SecureRandom</code> pseudo-random number generator (PRNG)
to generate cryptographically secure sequences. The <code>setSecureAlgorithm</code> method
allows you to change the underlying PRNG. These methods are <strong>much slower</strong> than
the corresponding "non-secure" versions, so they should only be used when cryptographic security
is required.</dd>
<dt>Seeding pseudo-random number generators</dt>
<dd>By default, the implementation provided in <code>RandomDataImpl</code> uses the JDK-provided
PRNG. Like other PRNGs, the JDK generator generates sequences of random numbers based on an initial
"seed value". For the non-secure methods, starting with the same seed always produces the same
sequence of values. Secure sequences started with the same seeds will diverge. When a new
<code>RandomDataImpl</code> is created, the underlying random number generators are
<strong>not</strong> initialized. The first call to a data generation method, or to a
<code>reSeed()</code> method initializes the appropriate generator. If you do not explicitly
seed the generator, it is by default seeded with the current time in milliseconds. Therefore,
to generate sequences of random data values, you should always instantiate <strong>one</strong>
<code>RandomDataImpl</code> and use it repeatedly instead of creating new instances for
subsequent values in the sequence. For example, the following will generate a random sequence
of 1000 long integers between 1 and 1,000,000, using the current time in milliseconds as the seed
for the JDK PRNG:
<pre><code>
RandomDataImpl randomData = new RandomDataImpl();
for (int i = 0; i &lt; 1000; i++) {
    long value = randomData.nextLong(1, 1000000);
}
</code></pre>
The following will not in general produce a good random sequence, since the PRNG is reseeded
each time through the loop with the current time in milliseconds:
<pre><code>
for (int i = 0; i &lt; 1000; i++) {
    // A new instance on each pass is reseeded from the clock every time
    RandomDataImpl randomData = new RandomDataImpl();
    long value = randomData.nextLong(1, 1000000);
}
</code></pre>
The following will produce the same random sequence each time it is executed:
<pre><code>
RandomDataImpl randomData = new RandomDataImpl();
randomData.reSeed(1000);
for (int i = 0; i &lt; 1000; i++) {
    long value = randomData.nextLong(1, 1000000);
}
</code></pre>
The following will produce a different random sequence each time it is executed:
<pre><code>
RandomDataImpl randomData = new RandomDataImpl();
randomData.reSeedSecure(1000);
for (int i = 0; i &lt; 1000; i++) {
    long value = randomData.nextSecureLong(1, 1000000);
}
</code></pre>
</dd></dl>
</p>
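<p>
The effect of seeding can be seen without Commons Math at all. The sketch below uses
<code>java.util.Random</code> directly as a stand-in for the non-secure generator; the class
<code>SeedingSketch</code> and its <code>draw</code> method are hypothetical names chosen for this example.
</p>

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration: java.util.Random stands in for the non-secure generator.
final class SeedingSketch {
    private SeedingSketch() {}

    /** Draws count integers in [lower, upper] from a generator created with the given seed. */
    static int[] draw(long seed, int count, int lower, int upper) {
        Random random = new Random(seed);
        int[] values = new int[count];
        for (int i = 0; i != count; i++) {
            // nextInt(bound) returns a value in [0, bound), so the result stays in [lower, upper]
            values[i] = lower + random.nextInt(upper - lower + 1);
        }
        return values;
    }

    public static void main(String[] args) {
        int[] first = draw(1000L, 5, 1, 1000000);
        int[] second = draw(1000L, 5, 1, 1000000);
        // Two generators started from the same seed produce the same sequence
        System.out.println(Arrays.equals(first, second));
    }
}
```

This reproducibility is useful for repeatable tests, and it is also the reason a generator that is recreated in a tight loop with a clock-based seed can repeat values.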
</subsection>
<subsection name="2.3 Random Strings" href="strings">
<p>
The methods <code>nextHexString</code> and <code>nextSecureHexString</code>
can be used to generate random strings of hexadecimal characters. Both of these
methods produce sequences of strings with good dispersion properties.
The difference between the two methods is that the second is cryptographically secure.
Specifically, the implementation of <code>nextHexString(n)</code> in <code>RandomDataImpl</code>
uses the following simple algorithm to generate a string of <code>n</code> hex digits:
<ol>
<li>n/2+1 binary bytes are generated using the underlying Random</li>
<li>Each binary byte is translated into 2 hex digits</li></ol>
The <code>RandomDataImpl</code> implementation of the "secure" version,
<code>nextSecureHexString</code>, generates hex characters in 40-character "chunks"
using a 3-step process:
<ol>
<li>20 random bytes are generated using the underlying <code>SecureRandom</code>.</li>
<li>A SHA-1 hash is applied to yield a 20-byte binary digest.</li>
<li>Each byte of the binary digest is converted to 2 hex digits</li></ol>
Similarly to the secure random number generation methods, <code>nextSecureHexString</code>
is <strong>much slower</strong> than the non-secure version. It should be used only for
applications such as generating unique session or transaction ids where predictability of
subsequent ids based on observation of previous values is a security concern. If all
that is needed is an even distribution of hex characters in the generated strings, the
non-secure method should be used.
</p>
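<p>
The two algorithms described above can be sketched roughly as follows. This is a reading of the
steps listed in this section, not the <code>RandomDataImpl</code> source; the class
<code>HexStringSketch</code> and its helper <code>toHex</code> are hypothetical, and trimming the
result to <code>n</code> characters is an assumption about how the surplus digits are handled.
</p>

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.Random;

// Hypothetical sketch of the steps described in the text; not the library implementation.
final class HexStringSketch {
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    private HexStringSketch() {}

    /** Translates each byte into two hex digits. */
    private static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            int unsigned = Byte.toUnsignedInt(b);
            hex.append(HEX[unsigned / 16]).append(HEX[unsigned % 16]);
        }
        return hex.toString();
    }

    /** Non-secure variant: n/2+1 random bytes, two hex digits per byte, trimmed to n. */
    static String nextHexString(Random random, int n) {
        if (Integer.signum(n) != 1) {
            throw new IllegalArgumentException("length must be positive");
        }
        byte[] bytes = new byte[n / 2 + 1];
        random.nextBytes(bytes);
        return toHex(bytes).substring(0, n);
    }

    /** Secure variant: 20 random bytes are hashed with SHA-1 into one 40-character chunk. */
    static String nextSecureHexString(SecureRandom secureRandom, int n) {
        if (Integer.signum(n) != 1) {
            throw new IllegalArgumentException("length must be positive");
        }
        MessageDigest sha1;
        try {
            sha1 = MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
        StringBuilder out = new StringBuilder();
        while (n > out.length()) {
            byte[] randomBytes = new byte[20];
            secureRandom.nextBytes(randomBytes);
            // A 20-byte digest yields 40 hex digits
            out.append(toHex(sha1.digest(randomBytes)));
        }
        return out.substring(0, n);
    }

    public static void main(String[] args) {
        System.out.println(nextHexString(new Random(), 16));
        System.out.println(nextSecureHexString(new SecureRandom(), 16));
    }
}
```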
</subsection>
<subsection name="2.4 Random permutations, combinations, sampling" href="combinatorics">
<p>
To select a random sample of objects in a collection, you can use the
<code>nextSample</code> method in the <code>RandomData</code> interface. Specifically,
if <code>c</code> is a collection containing at least <code>k</code> objects, and
<code>randomData</code> is a <code>RandomDataImpl</code> instance, then
<code>randomData.nextSample(c, k)</code>
will return an <code>Object[]</code> array of length <code>k</code> consisting of
elements randomly selected from the collection. If <code>c</code> contains
duplicate references, there may be duplicate references in the returned array;
otherwise returned elements will be unique -- i.e., the sampling is without
replacement among the object references in the collection. </p>
<p>
If <code>randomData</code> is a <code>RandomDataImpl</code> instance, and
<code>n</code> and <code>k</code> are integers with <code> k &lt;= n</code>,
then <code>randomData.nextPermutation(n, k)</code> returns an <code>int[]</code>
array of length <code>k</code> whose entries are selected randomly,
without repetition, from the integers <code>0</code> through <code>n-1</code> (inclusive), i.e.,
<code>randomData.nextPermutation(n, k)</code> returns a random permutation of
<code>n</code> taken <code>k</code> at a time.
</p>
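<p>
One common way to draw <code>k</code> of the integers <code>0</code> through <code>n-1</code> without
repetition is a partial Fisher-Yates shuffle. The sketch below shows that technique; it is an
assumption for illustration and may differ from the algorithm <code>RandomDataImpl</code> actually
uses. The class <code>PermutationSketch</code> is a hypothetical name.
</p>

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch using a partial Fisher-Yates shuffle; not the library implementation.
final class PermutationSketch {
    private PermutationSketch() {}

    /** Returns k distinct integers chosen at random from 0 through n-1. */
    static int[] nextPermutation(Random random, int n, int k) {
        if (k > n || 0 > k) {
            throw new IllegalArgumentException("k must be between 0 and n");
        }
        int[] pool = new int[n];
        for (int i = 0; i != n; i++) {
            pool[i] = i;
        }
        int[] result = new int[k];
        for (int i = 0; i != k; i++) {
            // Pick one of the entries that has not been chosen yet and move it to position i
            int j = i + random.nextInt(n - i);
            int tmp = pool[i];
            pool[i] = pool[j];
            pool[j] = tmp;
            result[i] = pool[i];
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(nextPermutation(new Random(), 10, 3)));
    }
}
```

Sampling objects from a collection can be built on the same idea by using the selected integers as indices into an array copy of the collection.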
</subsection>
<subsection name='2.5 Generating data "like" an input file' href="empirical">
<p>
This is yet to be written. Any contributions will be gratefully accepted!
</p>
</subsection>
</section>
</body>
</document>

View File

@ -0,0 +1,40 @@
<?xml version="1.0"?>
<!-- $Revision: 1.4 $ $Date: 2003/11/14 21:50:39 $ -->
<document url="special.html">
<properties>
<title>The Commons Math User Guide - Special Functions</title>
<author email="phil@steitz.com">Phil Steitz</author>
</properties>
<body>
<section name="5 Special Functions">
<subsection name="5.1 Overview" href="overview">
<p>
The special functions portion of Commons-Math contains several useful functions not
provided by <code>java.lang.Math</code>. These functions mostly serve as building blocks
for other portions of Commons-Math but, as others may find them useful as stand-alone
methods, these special functions were included as part of the Commons-Math public API.
</p>
</subsection>
<subsection name="5.2 Gamma functions" href="gamma">
<p>
<code>org.apache.commons.math.special.Gamma</code> contains several useful functions involving the Gamma Function.
<table>
<tr><th>Function</th><th>Method</th><th>Reference</th></tr>
<tr><td>Log Gamma</td><td>logGamma</td><td>See <a href="http://mathworld.wolfram.com/GammaFunction.html">Gamma Function</a> from MathWorld</td></tr>
<tr><td>Regularized Gamma</td><td>regularizedGammaP</td><td>See <a href="http://mathworld.wolfram.com/RegularizedGammaFunction.html">Regularized Gamma Function</a> from MathWorld</td></tr>
</table>
</p>
</subsection>
<subsection name="5.3 Beta functions" href="beta">
<p>
<code>org.apache.commons.math.special.Beta</code> contains several useful functions involving the Beta Function.
<table>
<tr><th>Function</th><th>Method</th><th>Reference</th></tr>
<tr><td>Log Beta</td><td>logBeta</td><td>See <a href="http://mathworld.wolfram.com/BetaFunction.html">Beta Function</a> from MathWorld</td></tr>
<tr><td>Regularized Beta</td><td>regularizedBeta</td><td>See <a href="http://mathworld.wolfram.com/RegularizedBetaFunction.html">Regularized Beta Function</a> from MathWorld</td></tr>
</table>
</p>
</subsection>
</section>
</body>
</document>

91
xdocs/userguide/stat.xml Normal file
View File

@ -0,0 +1,91 @@
<?xml version="1.0"?>
<!-- $Revision: 1.4 $ $Date: 2003/11/14 21:50:39 $ -->
<document url="stat.html">
<properties>
<title>The Commons Math User Guide - Statistics</title>
<author email="phil@steitz.com">Phil Steitz</author>
</properties>
<body>
<section name="1 Statistics">
<subsection name="1.1 Overview" href="overview">
<p>This is yet to be written. Any contributions will be gratefully
accepted!</p>
</subsection>
<subsection name="1.2 Univariate statistics" href="univariate">
<p>This is yet to be written. Any contributions will be gratefully
accepted!</p>
</subsection>
<subsection name="1.3 Frequency distributions" href="frequency">
<p>This is yet to be written. Any contributions will be gratefully
accepted!</p>
</subsection>
<subsection name="1.4 Bivariate regression" href="regression">
<p>This is yet to be written. Any contributions will be gratefully
accepted!</p>
</subsection>
<subsection name="1.5 Statistical tests" href="tests">
<p>This is yet to be written. Any contributions will be gratefully
accepted!</p>
</subsection>
<subsection name="1.6 Distribution framework" href="distributions">
<p>
The distribution framework provides the means to compute probability density
function (PDF) probabilities and cumulative distribution function (CDF)
probabilities for common probability distributions. Along with the direct
computation of PDF and CDF probabilities, the framework also allows for the
computation of inverse PDF and inverse CDF values.
</p>
<p>
In order to use the distribution framework, first a distribution object must
be created. It is encouraged that all distribution object creation occurs via
the <code>org.apache.commons.math.stat.distribution.DistributionFactory</code>
class. <code>DistributionFactory</code> is a simple factory used to create all
of the distribution objects supported by Commons-Math. The typical usage of
<code>DistributionFactory</code> to create a distribution object would be:
</p>
<source>DistributionFactory factory = DistributionFactory.newInstance();
BinomialDistribution binomial = factory.createBinomialDistribution(10, .75);</source>
<p>
The distributions that can be instantiated via the <code>DistributionFactory</code>
are detailed below:
<table>
<tr><th>Distribution</th><th>Factory Method</th><th>Parameters</th></tr>
<tr><td>Binomial</td><td>createBinomialDistribution</td><td><div>Number of trials</div><div>Probability of success</div></td></tr>
<tr><td>Chi-Squared</td><td>createChiSquaredDistribution</td><td><div>Degrees of freedom</div></td></tr>
<tr><td>Exponential</td><td>createExponentialDistribution</td><td><div>Mean</div></td></tr>
<tr><td>F</td><td>createFDistribution</td><td><div>Numerator degrees of freedom</div><div>Denominator degrees of freedom</div></td></tr>
<tr><td>Gamma</td><td>createGammaDistribution</td><td><div>Alpha</div><div>Beta</div></td></tr>
<tr><td>Hypergeometric</td><td>createHypergeometricDistribution</td><td><div>Population size</div><div>Number of successes in population</div><div>Sample size</div></td></tr>
<tr><td>t</td><td>createTDistribution</td><td><div>Degrees of freedom</div></td></tr>
</table>
</p>
<p>
Using a distribution object, PDF and CDF probabilities are easily computed
using the <code>cummulativeProbability</code> methods. For a distribution <code>X</code>,
and a domain value, <code>x</code>, <code>cummulativeProbability</code> computes
<code>P(X &lt;= x)</code> (i.e. the lower tail probability of <code>X</code>).
</p>
<source>DistributionFactory factory = DistributionFactory.newInstance();
TDistribution t = factory.createTDistribution(29);
double lowerTail = t.cummulativeProbability(-2.656); // P(T &lt;= -2.656)
double upperTail = 1.0 - t.cummulativeProbability(2.75); // P(T &gt;= 2.75)</source>
<p>
The inverse PDF and CDF values are just as easily computed using the
<code>inverseCummulativeProbability</code> methods. For a distribution <code>X</code>,
and a probability, <code>p</code>, <code>inverseCummulativeProbability</code>
computes the domain value <code>x</code>, such that:
<ul>
<li><code>P(X &lt;= x) = p</code>, for continuous distributions</li>
<li><code>P(X &lt;= x) &lt;= p</code>, for discrete distributions</li>
</ul>
Notice the different cases for continuous and discrete distributions. The CDF of a discrete
distribution is a step function and so is not invertible. As such, for discrete distributions, an exact
domain value can not always be returned, only the "best" domain value. For Commons-Math, the "best"
domain value is the largest domain value whose cumulative probability is
less than or equal to the given probability.
</p>
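<p>
The "best" domain value rule for discrete distributions can be made concrete with a small,
self-contained sketch. It computes binomial probabilities directly rather than through the
distribution framework; the class <code>DiscreteInverseSketch</code> is hypothetical, and returning
<code>-1</code> when no domain value qualifies is an assumption made for this example.
</p>

```java
// Hypothetical sketch of the discrete "best value" rule; it does not use the Commons Math API.
final class DiscreteInverseSketch {
    private DiscreteInverseSketch() {}

    /** P(X = x) for a binomial distribution with the given trials and success probability. */
    static double binomialProbability(int trials, double p, int x) {
        double coefficient = 1.0;
        for (int i = 1; x >= i; i++) {
            // Builds the binomial coefficient C(trials, x) one factor at a time
            coefficient = coefficient * (trials - x + i) / i;
        }
        return coefficient * Math.pow(p, x) * Math.pow(1.0 - p, trials - x);
    }

    /**
     * Returns the largest x whose cumulative probability does not exceed prob,
     * or -1 if even the cumulative probability at 0 exceeds prob.
     */
    static int inverseCumulativeProbability(int trials, double p, double prob) {
        int best = -1;
        double cumulative = 0.0;
        for (int x = 0; trials >= x; x++) {
            cumulative += binomialProbability(trials, p, x);
            if (cumulative > prob) {
                break;
            }
            best = x;
        }
        return best;
    }

    public static void main(String[] args) {
        // For 10 trials with success probability 0.5 the cumulative probability is
        // 386/1024 at x = 4 and 638/1024 at x = 5, so 4 is the best value for p = 0.5
        System.out.println(inverseCumulativeProbability(10, 0.5, 0.5));
    }
}
```

No domain value has a cumulative probability of exactly 0.5 here, which is why the rule settles for the largest value that stays at or below the requested probability.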
</subsection>
</section>
</body>
</document>

View File

@ -0,0 +1,46 @@
<?xml version="1.0"?>
<document url="utilities.html">
<properties>
<title>The Commons Math User Guide - Utilities</title>
<author email="phil@steitz.com">Phil Steitz</author>
</properties>
<body>
<section name="6 Utilities">
<subsection name="6.1 Overview" href="overview">
<p>
This is yet to be written. Any contributions will be gratefully accepted!
</p>
</subsection>
<subsection name="6.2 Double array utilities" href="arrays">
<p>
This is yet to be written. Any contributions will be gratefully accepted!
</p>
</subsection>
<subsection name="6.3 Continued Fractions" href="continued_fractions">
<p>
This is yet to be written. Any contributions will be gratefully accepted!
</p>
</subsection>
<subsection name="6.4 Binomial coefficients, factorials and other common math functions" href="math_utils">
<p>
This is yet to be written. Any contributions will be gratefully accepted!
</p>
</subsection>
<subsection name="6.5 Statistical computation utilities" href="stat_utils">
<p>
This is yet to be written. Any contributions will be gratefully accepted!
</p>
</subsection>
</section>
</body>
</document>