Commit Graph

274 Commits

Author SHA1 Message Date
Tim O'Brien 43c787eb35 Adds the one-sample t-test statistic to TestStatistic and its implementations.
Also adds unit tests. - BW

PR: Issue #20231
Obtained from: Bugzilla
Submitted by: Brent Worden
Reviewed by: Tim O'Brien


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140859 13f79535-47bb-0310-9956-ffa450edef68
2003-05-26 17:29:36 +00:00
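For readers unfamiliar with the statistic named in commit 43c787eb35, the textbook one-sample t statistic is t = (mean - mu) / (s / sqrt(n)), where s is the sample standard deviation. A minimal sketch follows; the class and method names are hypothetical and are not the TestStatistic API from the commit.

```java
/** Hypothetical helper illustrating the textbook formula; not the committed API. */
final class OneSampleTSketch {
    private OneSampleTSketch() {}

    /**
     * Computes t = (mean - mu) / (s / sqrt(n)), where s is the sample
     * standard deviation (n - 1 in the denominator).
     */
    static double t(double mu, double[] sample) {
        int n = sample.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two observations");
        }
        double sum = 0.0;
        for (double x : sample) {
            sum += x;
        }
        double mean = sum / n;
        // Sum of squared deviations from the mean.
        double ss = 0.0;
        for (double x : sample) {
            double d = x - mean;
            ss += d * d;
        }
        double s = Math.sqrt(ss / (n - 1));
        return (mean - mu) / (s / Math.sqrt(n));
    }
}
```

For example, OneSampleTSketch.t(3.0, new double[] {1, 2, 3, 4, 5}) is zero because the sample mean equals the hypothesized mean.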
Tim O'Brien 57b9151881 An implementation of ordinary least squares regression with one independent
variable. The implementation uses running sums and does not require the data
to be stored in memory.  Since I could not conceive of any significantly
different implementation strategies that did not amount to just improving
efficiency or numerical accuracy of what I am submitting, I did not abstract
the interface.

The test cases validate the computations against NIST reference data and
verified computations. The slope, intercept, their standard errors and
r-square estimates are accurate to within 10E-12 against the reference data
set.  MSE and other ANOVA stats are good at least to within 10E-8. -- Phil S.

PR: Issue #20224
Obtained from: Bugzilla
Submitted by: Phil Steitz
Reviewed by: Tim O'Brien


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140858 13f79535-47bb-0310-9956-ffa450edef68
2003-05-26 02:11:50 +00:00
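The running-sums idea described in commit 57b9151881 can be sketched as follows. This is an illustration only: the class name, method names, and the raw-sums update are assumptions, and the committed implementation may use different, more numerically careful formulas.

```java
/** Hypothetical sketch of simple linear regression from running sums. */
final class RunningRegressionSketch {
    private long n;
    private double sumX;
    private double sumY;
    private double sumXX;
    private double sumXY;

    /** Folds one observation into the sums; the data point itself is not stored. */
    void addData(double x, double y) {
        n++;
        sumX += x;
        sumY += y;
        sumXX += x * x;
        sumXY += x * y;
    }

    /** Least squares slope, or NaN when fewer than two points were added. */
    double slope() {
        if (n < 2) {
            return Double.NaN;
        }
        double sxx = sumXX - sumX * sumX / n; // corrected sum of squares of x
        double sxy = sumXY - sumX * sumY / n; // corrected sum of cross products
        return sxy / sxx;
    }

    /** Least squares intercept: mean(y) - slope * mean(x). */
    double intercept() {
        return (sumY - slope() * sumX) / n;
    }
}
```

Feeding the points (0, 1), (1, 3), (2, 5), (3, 7) yields a slope of 2 and an intercept of 1. Memory use stays constant regardless of how many observations are added, which is the property the commit message highlights.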
Tim O'Brien b84e61ffcf From Phil Steitz patch submission for Issue #20175
The attached patch includes the following improvements to Univariate and
UnivariateImpl:

* Improved efficiency of min, max and product maintenance when windowSize is
  limited by incorporating a suggestion posted to commons-dev by Brent Worden
  (added author credit).  Thanks, Brent!

* Added javadoc specifying NaN contracts for all statistics, definitions for
  geometric and arithmetic means.

* Made some slight modifications to UnivariateImpl to make it consistent with
  NaN contracts

* All interface documentation moved to Univariate. The interface specification
  includes the NaN semantics and a first attempt at clearly defining exactly
  what "rolling" means and how this affects what statistics are defined when.

* Added test cases to verify that min, max, product are correctly maintained
  when "rolling" and to verify that NaN contracts are satisfied.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140857 13f79535-47bb-0310-9956-ffa450edef68
2003-05-23 17:33:18 +00:00
Tim O'Brien 8216dd89b5 RandomDataTest.testNextGaussian() was using Univariate.getN(). Added
an explicit cast from int to double.  Test now succeeds.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140854 13f79535-47bb-0310-9956-ffa450edef68
2003-05-22 15:31:38 +00:00
Tim O'Brien ca304541d8 The EmpiricalDistributionImpl that was submitted yesterday assumed the "old"
Univariate interface, in which getN() returned a double.  The attached patch
inserts the necessary casts to avoid the rounding/truncation errors that were
causing the EmpiricalDistribution and ValueServer unit tests to fail.

The patch also adds a RandomData member variable so that getNext() does not
instantiate a new RandomData instance for each activation.

PR: Bugzilla #20149
Obtained from: Issue Patch
Submitted by: Phil Steitz
Reviewed by: Tim O'Brien


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140853 13f79535-47bb-0310-9956-ffa450edef68
2003-05-22 15:19:32 +00:00
Tim O'Brien 8ed6e84dc6 Added Product and GeometricMean to Univariate and all implementations; this patch
contains contributions from Mark Diggory.

* This patch introduces Product and GeometricMean into the Univariate
implementation.

* Removing the contribution of a discarded element in a rolling
UnivariateImpl requires that the product be recalculated explicitly each
time a value is discarded.  This is necessary because some values may be
zero, so the discarded value cannot simply be divided out of the product.

* Errors in rolling logic for ListUimpl and UnivariateImpl were corrected,
and more test cases were added to the JUnit tests for the Univariate
implementations.  More rigorous test cases are needed for the entire
suite of Univariate implementations.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140851 13f79535-47bb-0310-9956-ffa450edef68
2003-05-21 17:59:20 +00:00
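The rolling-product point in commit 8ed6e84dc6 can be illustrated with a small sketch. All names here are hypothetical, and returning NaN for an empty window is an assumption made for illustration. Dividing the running product by a discarded zero would give 0/0 = NaN, so the sketch recomputes the product from the values still in the window.

```java
/** Hypothetical fixed-size window that recomputes its product on demand. */
final class RollingProductSketch {
    private final double[] window;
    private int count; // number of valid values, at most window.length
    private int next;  // slot that the next value will overwrite

    RollingProductSketch(int windowSize) {
        if (windowSize < 1) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        window = new double[windowSize];
    }

    /** Adds a value, overwriting the oldest one once the window is full. */
    void addValue(double v) {
        window[next] = v;
        next = (next + 1) % window.length;
        if (count < window.length) {
            count++;
        }
    }

    /** Product of the values currently in the window; NaN if empty (assumed contract). */
    double product() {
        if (count == 0) {
            return Double.NaN;
        }
        double p = 1.0;
        for (int i = 0; i < count; i++) {
            p *= window[i];
        }
        return p;
    }
}
```

With a window of size 2, adding 0, 3, 4 leaves 3 and 4 in the window and a product of 12, even though a zero passed through earlier.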
Tim O'Brien 5de3587b86 Committed patch for issue 20112 from Phil Steitz.
EmpiricalDistribution -- represents an empirical probability distribution and
supports generation of data values that are "like" values in an input file
without making any assumptions about the functional form of the probability
distribution that the data come from.   This is useful in simulation
applications where historical data about component performance are
available but do not follow standard distributions (or any application that
requires random data generation from an empirical distribution). Also
generates data for grouped frequency histograms based on the input file.

ValueServer -- a wrapper for RandomData and EmpiricalDistribution that
generates values in each of the following modes:
  * DIGEST_MODE -- uses an empirical distribution
  * REPLAY_MODE -- replays data from an input file
  * UNIFORM_MODE -- generates uniformly distributed random values
  * EXPONENTIAL_MODE -- generates exponentially distributed random
                        values
  * GAUSSIAN_MODE -- generates Gaussian distributed random values
  * CONSTANT_MODE -- returns the same value every time.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140848 13f79535-47bb-0310-9956-ffa450edef68
2003-05-21 14:21:15 +00:00
Tim O'Brien a99cbac0bb Updated class javadoc for ContractableDoubleArray
git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140847 13f79535-47bb-0310-9956-ffa450edef68
2003-05-21 06:46:26 +00:00
Tim O'Brien 246d50aa44 Added more substantial class javadoc to ExpandableDoubleArray
git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140846 13f79535-47bb-0310-9956-ffa450edef68
2003-05-21 06:24:30 +00:00
Tim O'Brien 5d9efdbcd4 Added javadoc to FixedDA and altered exceptions in DoubleArray
* One should be able to use a DoubleArray in a similar way to a
regular double[]. To this end, methods for accessing element
values will no longer throw NoSuchElementException when an
index is outside of the element set.  These methods all throw
ArrayIndexOutOfBoundsException if a bad index is supplied.

* Filled out javadoc in FixedDoubleArray.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140845 13f79535-47bb-0310-9956-ffa450edef68
2003-05-21 05:48:25 +00:00
Tim O'Brien 437e275a50 Updated source code to reflect Jakarta source code guidelines, specifically
regarding tab characters.  Changes were driven by the Checkstyle report on
http://jakarta.apache.org/commons/sandbox/math


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140843 13f79535-47bb-0310-9956-ffa450edef68
2003-05-20 18:15:29 +00:00
Tim O'Brien 60f4205f1f Phil Steitz wrote:
This commit contains the suite of random data generation utilities that
I originally proposed as extensions to lang.math.  There is some functional
overlap with lang.math, but the contract and intention of this implementation
are different in several significant ways.

* the lang implementation maintains "immutability" of the underlying
   random number generator (emulating Math). The RandomData
   implementation allows users to reseed the random number generator(s)
   (this is in effect possible in the recent extensions to lang.math by
   passing in a user-supplied random as an actual parameter to the
   next() methods) Users can also reset the PRNG algorithm and provider
   used by the "secure" methods.

* RandomData includes "secure" methods (delegating to SecureRandom)

* RandomData will generate random deviates from exponential and poisson,
   as well as Gaussian and Uniform distributions.  These are useful in
   simulation applications.

* Overlapping somewhat with lang.StringUtils, RandomData will generate
   random hex strings.  There is a nextSecureHexString method that will
   (I claim :-) generate cryptographically secure string identifiers. I
   would appreciate feedback on this algorithm, which I have seen used
   elsewhere (similar to what tomcat does to generate session ids); but
   not documented as a standard.

PR: Bugzilla 20013
Obtained from: Phil S.
Submitted by: Phil S.
Reviewed by: Tim O.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140838 13f79535-47bb-0310-9956-ffa450edef68
2003-05-18 00:58:52 +00:00
Tim O'Brien d6f7028269 Added a FixedDoubleArray. FixedDoubleArray supports a rolling mechanism
that reuses an array of fixed length.  This class was added to provide an
efficient rolling mechanism.

FixedDoubleArray was influenced by discussions on the commons-dev list and
patches submitted by Mark Diggory.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140836 13f79535-47bb-0310-9956-ffa450edef68
2003-05-17 23:24:21 +00:00
Tim O'Brien 71dfdabde1 The Univariate interface now contains getWindow and setWindow in
addition to a constant which signifies an "infinite" window.  Windowing
has been added to all three Univariate implementations:

* UnivariateImpl - If the window is not infinite, we keep track of
0..n elements and discount the contribution of the discarded element when
our "window" is moved.  If the window is infinite no extra storage is used
beyond an empty ContractableDoubleArray.

- In the following two implementations, the window size can be changed at any time.

* ListStoreUnivariateImpl - If the window is not infinite, this
implementation only takes into account the last n elements of the List.

* StoreUnivariateImpl - Uses an internal ContractableDoubleArray, window size
can be changed at any time.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140835 13f79535-47bb-0310-9956-ffa450edef68
2003-05-16 05:23:29 +00:00
Tim O'Brien 0700b0f482 * DoubleArray is now an interface which is implemented by
ExpandableDoubleArray.  The interface provides a public interface
which does not hint at any of the storage parameters of
Expandable or Contractable.

* DoubleArrayTest now operates on the DoubleArray interface, casting
to Expandable when we need to call the package-scoped getInternalLength
method.

* While we should not provide access to the internal storage array, it
should be possible to obtain a double[] of elements stored in this
DoubleArray.  double[] getElements() was added to the DoubleArray interface;
it will return the element array.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140834 13f79535-47bb-0310-9956-ffa450edef68
2003-05-16 04:23:06 +00:00
Tim O'Brien 7651a6b14d * ContractableDoubleArray extends ExpandableDoubleArray - I sense the
need for a DoubleArray interface.

* ExpandableDoubleArray and the extension ContractableDoubleArray should
aim towards presenting a public interface that does not expose any
details of the internal storage.  To this end, one is no longer able to get
the internal storage array via public double[] getValues(), and the startIndex
(which was relative to the internal storage array) is no longer available.

* [Expandable|Contractable]DoubleArray now allow one to discard
elements from the front of the array.  Before this commit, one could
accomplish the same goal by changing the starting index of the element
array within the internal storage array.  This solution allowed one to
discard elements from the front of the array as well as reclaim
elements by decreasing the startIndex.

There were two problems with this approach (especially in
ContractableDoubleArray).  First, the ContractableDoubleArray can be
"compacted" at any time, thereby resetting the startIndex to zero and the
size of the internal storage array to the number of elements plus one.  Second,
"reclaiming" elements from the internal storage array by finagling
internal "pointers" to the start and end index seems to violate the
principles of encapsulation.  If you "discard" an element from the
front of the array, consider it unavailable.

It should be noted that calling setNumElements allows one to move the end
index of the internal element array at will.  Assume one has a 100-element
array, and one calls setNumElements(10), thereby decreasing the ending index
of the element array by 90.  The 90 "dumped" elements are not currently
reinitialized to the default double primitive value.  This is an open
question.

* Tests for ExpandableDoubleArray and ContractableDoubleArray were
refactored.  Both test classes now extend a DoubleArrayAbstractTest
JUnit class which contains shared unit tests for both "implementations".
An approach like this should be adopted to test the Univariate implementations.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140833 13f79535-47bb-0310-9956-ffa450edef68
2003-05-16 03:55:34 +00:00
Tim O'Brien 065a88e241 Patches from Phil S. applied:
* A TestStatistic interface with corresponding implementation and testcase

PR: Bugzilla 19971
Obtained from: Patches attached to issue
Submitted by: Phil S.
Reviewed by: Tim O.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140832 13f79535-47bb-0310-9956-ffa450edef68
2003-05-15 21:58:23 +00:00
Tim O'Brien 5f2a1cbbc5 Made a number of changes to the ExpandableDoubleArray.
* This class now supports the ability to move the starting index of the
internal element array.  This allows one to move the beginning of the element
array and form a sort of "window"; this will come into play when we want to
provide moving averages, or "rolling".

* Added an addElementRolling(double v) - this will increment the startIndex
and add the element to the end of the internal element array

* Brought the Clover test cases up to 100% for this class

Added a class ContractableDoubleArray:

* This is an extension of ExpandableDoubleArray - it adds a configuration
parameter contractionCriteria.  Essentially, if the contractionCriteria is
2.0f we commit to never having the internal storage array provide more
than 2.0 times the storage capacity needed.  Once the internal
storage array exceeds this measurement, the internal storage array is
pruned to the size of the internal element array.

Also, my IDE scolded me for some unused imports in ListUnivariateImpl; they
have been removed.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140831 13f79535-47bb-0310-9956-ffa450edef68
2003-05-15 15:38:48 +00:00
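The contractionCriteria rule from commit 5f2a1cbbc5 can be sketched as a simple capacity check. The class and method names below are hypothetical, and the strict greater-than comparison is an assumption; the commit message does not specify how the boundary case is handled.

```java
/** Hypothetical sketch of the contraction rule; not the committed class. */
final class ContractionSketch {
    private ContractionSketch() {}

    /**
     * True when the internal storage is more than contractionCriteria times
     * larger than the number of stored elements.
     */
    static boolean shouldContract(int internalLength, int numElements,
                                  float contractionCriteria) {
        return internalLength > contractionCriteria * numElements;
    }

    /** Copies the live elements into a new array sized exactly to fit them. */
    static double[] contract(double[] internal, int startIndex, int numElements) {
        double[] pruned = new double[numElements];
        System.arraycopy(internal, startIndex, pruned, 0, numElements);
        return pruned;
    }
}
```

With a contractionCriteria of 2.0f and 10 stored elements, an internal array of length 21 would be pruned while one of length 20 would be left alone.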
Tim O'Brien 5ae92f12c7 Another change to the stored Univariates. The calculations are now abstracted
into an AbstractStoreUnivariate class which takes responsibility for
all statistical calculations.  AbstractStoreUnivariate is extended by
two classes:

* StoreUnivariateImpl - This class uses an ExpandableDoubleArray for
internal storage.  It is the more efficient class in terms
of storage and cycles for users who are interested in gathering statistics
not available in the UnivariateImpl implementation.

* ListUnivariateImpl - This class is for a situation where a user might
wish to maintain a List of numeric objects outside of a StoreUnivariate
instance.  We still need to add serious error checking in the absence of
1.5's generics, but this implementation will work with any list that
contains Number objects - (BigDecimal, BigInteger, Byte, Double, Float,
Integer, Long, Short).  This implementation ultimately transforms all
numeric objects into double primitives via Number.doubleValue().

Because AbstractStoreUnivariate does not hold on to any state, a user
can add values through the Univariate.addValue() function OR
manipulate the contents of the List directly.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140830 13f79535-47bb-0310-9956-ffa450edef68
2003-05-15 06:33:19 +00:00
Tim O'Brien 97568dc06f The following changes were made to the Univariate implementation. The public
interface of Univariate was extracted into an interface of the same name.
Univariate, now an interface, is implemented by UnivariateImpl, which contains
all code present in the original Univariate implementation.

* StoredUnivariate is an interface which extends Univariate and adds
measures not available in the superinterface such as mode, kurtosis, and skew

* StoredUnivariateImpl provides an implementation which uses the
ExpandableDoubleArray for internal storage.  Calculations are performed
on demand *each* time a particular measure is required; no state is
maintained by this implementation.

* Univariate provided methods addValue(int), addValue(float), addValue(long).
These functions were removed because all of these assignments are widening
conversions to double, so no cast is required.

* Removed the name property from Univariate - property not relevant to
univariate statistics


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140828 13f79535-47bb-0310-9956-ffa450edef68
2003-05-15 05:39:01 +00:00
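The widening-conversion reasoning in commit 97568dc06f can be shown with a short sketch. The class below is a hypothetical stand-in, not the Univariate interface: a single addValue(double) method accepts int, float and long arguments without a cast because Java widens them to double automatically.

```java
/** Hypothetical accumulator with a single double-typed addValue method. */
final class WideningSketch {
    private double sum;
    private int n;

    /** Accepts any primitive that widens to double: int, long, float, and so on. */
    void addValue(double v) {
        sum += v;
        n++;
    }

    double getSum() {
        return sum;
    }

    int getN() {
        return n;
    }
}
```

Calling addValue(3), addValue(2.5f) and addValue(4L) on one instance compiles without casts and gives a sum of 9.5. Note that widening a long to a double can lose precision for values beyond 2^53, although it never requires a cast.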
Tim O'Brien f039b677b8 Added an expandable double array. This class simply contains a double[] and takes care of automatic expansion of the internal array when necessary. Class added with accompanying unit test.
git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140827 13f79535-47bb-0310-9956-ffa450edef68
2003-05-15 05:19:57 +00:00
Tim O'Brien ccf6befd5f 1. Make RealMatrixImpl implement Serializable
2. Make all currently unimplemented methods throw UnsupportedOperationException
3. Add solve() method to RealMatrix interface, representing vector
   solution to AX = B, where B is the parameter and A is *this.

Phil

Obtained from: Phil S.
Submitted by: Phil S.
Reviewed by: Tim O.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140826 13f79535-47bb-0310-9956-ffa450edef68
2003-05-13 19:08:14 +00:00
Robert Burrell Donkin e4694325bc Starting source code - basic matrix operations and univariate stats plus test code. Submitted by Phil Steitz.
git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140823 13f79535-47bb-0310-9956-ffa450edef68
2003-05-12 19:04:38 +00:00
Robert Burrell Donkin 925847780e added new commons math component
git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140822 13f79535-47bb-0310-9956-ffa450edef68
2003-05-12 15:07:54 +00:00