Commit Graph

4938 Commits

Author SHA1 Message Date
Tim O'Brien 57b9151881 An implementation of ordinary least squares regression with one independent
variable. The implementation uses running sums and does not require the data
to be stored in memory.  Since I could not conceive of any significantly
different implementation strategies that did not amount to just improving
efficiency or numerical accuracy of what I am submitting, I did not abstract
the interface.

The test cases validate the computations against NIST reference data and
verified computations. The slope, intercept, their standard errors and
r-square estimates are accurate to within 10E-12 against the reference data
set.  MSE and other ANOVA stats are good at least to within 10E-8. -- Phil S.

PR: Issue #20224
Obtained from: Bugzilla
Submitted by: Phil Steitz
Reviewed by: Tim O'Brien


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140858 13f79535-47bb-0310-9956-ffa450edef68
2003-05-26 02:11:50 +00:00
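Note on the commit above: the running-sums idea it describes can be sketched as follows. This is a minimal illustration, not the committed code; the class name `RunningRegression` and its method names are hypothetical. It uses the textbook raw-sum formulas, which are simpler but less numerically robust than an implementation tuned against NIST reference data would need to be.

```java
/** Hypothetical sketch: one-variable least squares from running sums (not the committed class). */
class RunningRegression {
    private long n;
    private double sumX, sumY, sumXX, sumXY;

    /** Adds one observation; only the count and four running sums are kept, never the data. */
    void addData(double x, double y) {
        n++;
        sumX += x;
        sumY += y;
        sumXX += x * x;
        sumXY += x * y;
    }

    /** Least-squares slope; NaN with fewer than two points or when all x values are equal. */
    double getSlope() {
        double sxx = sumXX - sumX * sumX / n;
        double sxy = sumXY - sumX * sumY / n;
        return (n < 2 || sxx == 0.0) ? Double.NaN : sxy / sxx;
    }

    /** Least-squares intercept, derived from the slope and the means. */
    double getIntercept() {
        return (sumY - getSlope() * sumX) / n;
    }

    public static void main(String[] args) {
        RunningRegression r = new RunningRegression();
        for (int i = 0; i < 5; i++) {
            r.addData(i, 2.0 * i + 1.0); // points on the line y = 2x + 1
        }
        System.out.println(r.getSlope() + " " + r.getIntercept());
    }
}
```

Memory use is constant regardless of how many observations are added, which is the property the message highlights.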
Tim O'Brien b84e61ffcf From Phil Steitz patch submission for Issue #20175
The attached patch includes the following improvements to Univariate and
UnivariateImpl:

* Improved efficiency of min, max and product maintenance when windowSize is
  limited by incorporating a suggestion posted to commons-dev by Brent Worden
  (added author credit).  Thanks, Brent!

* Added javadoc specifying NaN contracts for all statistics, definitions for
  geometric and arithmetic means.

* Made some slight modifications to UnivariateImpl to make it consistent with
  NaN contracts

* All interface documentation moved to Univariate. The interface specification
  includes the NaN semantics and a first attempt at clearly defining exactly
  what "rolling" means and how this affects what statistics are defined when.

* Added test cases to verify that min, max, product are correctly maintained
  when "rolling" and to verify that NaN contracts are satisfied.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140857 13f79535-47bb-0310-9956-ffa450edef68
2003-05-23 17:33:18 +00:00
Tim O'Brien 5918a1fe1e Readded the linkcheck and tasklist reports
git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140856 13f79535-47bb-0310-9956-ffa450edef68
2003-05-23 16:28:16 +00:00
Tim O'Brien e0161bb4cf Added mdiggory patch for developer resources
git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140855 13f79535-47bb-0310-9956-ffa450edef68
2003-05-22 16:55:22 +00:00
Tim O'Brien 8216dd89b5 RandomDataTest.testNextGaussian() was using Univariate.getN(). Added
an explicit cast from int to double.  Test now succeeds.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140854 13f79535-47bb-0310-9956-ffa450edef68
2003-05-22 15:31:38 +00:00
Tim O'Brien ca304541d8 The EmpiricalDistributionImpl that was submitted yesterday assumed the "old"
Univariate interface, in which getN() returned a double.  The attached patch
inserts the necessary casts to avoid the rounding/truncation errors that were
causing the EmpiricalDistribution and ValueServer unit tests to fail.

The patch also adds a RandomData member variable so that getNext() does not
instantiate a new RandomData instance for each activation.

PR: Bugzilla #20149
Obtained from: Issue Patch
Submitted by: Phil Steitz
Reviewed by: Tim O'Brien


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140853 13f79535-47bb-0310-9956-ffa450edef68
2003-05-22 15:19:32 +00:00
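Note on the commit above: the message does not say which expressions were affected, but one common way a count changing from double to int causes truncation is integer division. The sketch below illustrates that general pitfall only; `CountCastDemo` and its method names are hypothetical and are not taken from EmpiricalDistributionImpl.

```java
/** Hypothetical illustration of truncation when a count is an int. */
class CountCastDemo {
    /** Integer division: the fractional part is lost before the result widens to double. */
    static double fractionTruncated(int hits, int n) {
        return hits / n;
    }

    /** Casting one operand first forces floating-point division. */
    static double fractionWithCast(int hits, int n) {
        return (double) hits / n;
    }

    public static void main(String[] args) {
        System.out.println(fractionTruncated(1, 4)); // prints 0.0
        System.out.println(fractionWithCast(1, 4));  // prints 0.25
    }
}
```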
Tim O'Brien 1376d960b5 Added task to bring javadoc into compliance with standard. Javadoc
generation currently produces 35 errors.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140852 13f79535-47bb-0310-9956-ffa450edef68
2003-05-21 18:24:46 +00:00
Tim O'Brien 8ed6e84dc6 Added Product and Mean to Univariate and all implementations, this patch
contains contributions from Mark Diggory.

* This patch introduces Product and GeometricMean into the Univariate
implementation.

* Removing the contribution of a discarded element in a rolling
UnivariateImpl requires that the product be calculated explicitly each
time a value is discarded.  This is necessary because some values may be
zero, and a zero cannot be divided back out of the product.

* Errors in rolling logic for ListUimpl and UnivariateImpl were corrected,
and more test cases were added to the JUnit tests for the Univariate
implementations.  More rigorous test cases are needed for the entire
suite of Univariate implementations.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140851 13f79535-47bb-0310-9956-ffa450edef68
2003-05-21 17:59:20 +00:00
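Note on the commit above: the reason the product has to be recomputed while the sum does not can be sketched as below. This is an assumption-laden illustration, not the UnivariateImpl code; the class `RollingProduct` and its members are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of a rolling window that tracks a sum and a product. */
class RollingProduct {
    private final int windowSize;
    private final Deque<Double> window = new ArrayDeque<>();
    private double sum = 0.0;
    private double product = 1.0;

    RollingProduct(int windowSize) {
        this.windowSize = windowSize;
    }

    void addValue(double v) {
        window.addLast(v);
        sum += v;
        if (window.size() > windowSize) {
            double discarded = window.removeFirst();
            // The sum can be corrected by subtracting the discarded value...
            sum -= discarded;
            // ...but dividing the product by a discarded 0.0 would yield NaN,
            // so the product is recomputed from the values still in the window.
            product = 1.0;
            for (double d : window) {
                product *= d;
            }
        } else {
            product *= v;
        }
    }

    double getSum() {
        return sum;
    }

    double getProduct() {
        return product;
    }
}
```

With a window of 2 and the inputs 0, 3, 4, the zero leaves the window on the third add and the product becomes 3 * 4 rather than staying stuck at zero.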
Tim O'Brien a3e8ae46e0 xdoc tasks.xml was updated to reflect tasks pertaining to the next planned
release.  These tasks were lifted from Phil S.'s earlier message to
commons-dev@


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140850 13f79535-47bb-0310-9956-ffa450edef68
2003-05-21 15:11:54 +00:00
Tim O'Brien 60bf133582 project.properties changes. 1. The build will not fail on a failed
unit test, 2. The date/time and version are included in the xdoc
transformation, 3. JUnit is now forked, and 4. We use checkstyle.properties
to customize the behavior of Checkstyle.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140849 13f79535-47bb-0310-9956-ffa450edef68
2003-05-21 15:10:51 +00:00
Tim O'Brien 5de3587b86 Committed patch for issue 20112 from Phil Steitz.
EmpiricalDistribution -- represents an empirical probability distribution and
supports generation of data values that are "like" values in an input file
without making any assumptions about the functional form of the probability
distribution that the data come from.   This is useful in simulation
applications where historical data about component performance are
available but do not follow standard distributions (or any application that
requires random data generation from an empirical distribution). Also
generates data for grouped frequency histograms based on the input file.

ValueServer -- a wrapper for RandomData and EmpiricalDistribution that
generates values in each of the following modes:
  * DIGEST_MODE -- uses an empirical distribution
  * REPLAY_MODE -- replays data from an input file
  * UNIFORM_MODE -- generates uniformly distributed random values
  * EXPONENTIAL_MODE -- generates exponentially distributed random
                        values
  * GAUSSIAN_MODE -- generates Gaussian distributed random values
  * CONSTANT_MODE -- returns the same value every time.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140848 13f79535-47bb-0310-9956-ffa450edef68
2003-05-21 14:21:15 +00:00
Tim O'Brien a99cbac0bb Updated class javadoc for ContractableDoubleArray
git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140847 13f79535-47bb-0310-9956-ffa450edef68
2003-05-21 06:46:26 +00:00
Tim O'Brien 246d50aa44 Added more substantial class javadoc to ExpandableDoubleArray
git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140846 13f79535-47bb-0310-9956-ffa450edef68
2003-05-21 06:24:30 +00:00
Tim O'Brien 5d9efdbcd4 Added javadoc to FixedDA and altered exceptions in DoubleArray
* One should be able to use a DoubleArray in a similar way to a
regular double[].  To this end, methods for accessing element
values will no longer throw NoSuchElementException when an
index is outside of the element set.  These methods all throw
ArrayIndexOutOfBoundsException if a bad index is supplied.

* Filled out javadoc in FixedDoubleArray.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140845 13f79535-47bb-0310-9956-ffa450edef68
2003-05-21 05:48:25 +00:00
Robert Burrell Donkin 7d540cfa99 Corrected upload directory.
git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140844 13f79535-47bb-0310-9956-ffa450edef68
2003-05-20 19:54:58 +00:00
Tim O'Brien 437e275a50 Updated source code to reflect Jakarta source code guidelines, specifically
regarding tab characters.  Changes were driven by the Checkstyle report on
http://jakarta.apache.org/commons/sandbox/math


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140843 13f79535-47bb-0310-9956-ffa450edef68
2003-05-20 18:15:29 +00:00
Tim O'Brien a008ed1316 Added Mark Diggory as a contributor
git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140842 13f79535-47bb-0310-9956-ffa450edef68
2003-05-20 13:24:50 +00:00
Robert Burrell Donkin 118ae33102 Created outline user guide. This will be built using maven. Lots more content needed.
git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140841 13f79535-47bb-0310-9956-ffa450edef68
2003-05-20 10:27:57 +00:00
Robert Burrell Donkin 526e1c4171 Created outline documentation. This will be built using maven. Lots more content needed.
git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140840 13f79535-47bb-0310-9956-ffa450edef68
2003-05-20 10:27:26 +00:00
Robert Burrell Donkin 2ebf818b42 Mavenization phase one - moving documentation build to maven, will complete by generating build.xml etc later.
git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140839 13f79535-47bb-0310-9956-ffa450edef68
2003-05-20 10:24:32 +00:00
Tim O'Brien 60f4205f1f Phil Steitz wrote:
This commit contains the suite of random data generation utilities that
I originally proposed as extensions to lang.math.  There is some functional
overlap with lang.math, but the contract and intention of this implementation
are different in several significant ways.

* the lang implementation maintains "immutability" of the underlying
   random number generator (emulating Math). The RandomData
   implementation allows users to reseed the random number generator(s)
   (this is in effect possible in the recent extensions to lang.math by
   passing in a user-supplied random as an actual parameter to the
   next() methods).  Users can also reset the PRNG algorithm and provider
   used by the "secure" methods.

* RandomData includes "secure" methods (delegating to SecureRandom)

* RandomData will generate random deviates from exponential and Poisson,
   as well as Gaussian and Uniform distributions.  These are useful in
   simulation applications.

* Overlapping somewhat with lang.StringUtils, RandomData will generate
   random hex strings.  There is a nextSecureHexString method that will
   (I claim :-) generate cryptographically secure string identifiers. I
   would appreciate feedback on this algorithm, which I have seen used
   elsewhere (similar to what tomcat does to generate session ids); but
   not documented as a standard.

PR: Bugzilla 20013
Obtained from: Phil S.
Submitted by: Phil S.
Reviewed by: Tim O.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140838 13f79535-47bb-0310-9956-ffa450edef68
2003-05-18 00:58:52 +00:00
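Note on the commit above: the message does not spell out the nextSecureHexString algorithm, so the sketch below is not that algorithm. It only shows one straightforward way to build a random hex identifier from `java.security.SecureRandom`; the class name `SecureHex` is hypothetical, and the method name is reused from the message purely for readability.

```java
import java.security.SecureRandom;

/** Hypothetical sketch: random hex identifiers drawn from SecureRandom. */
class SecureHex {
    private static final SecureRandom RNG = new SecureRandom();

    /** Returns a string of exactly len lowercase hex digits. */
    static String nextSecureHexString(int len) {
        if (len <= 0) {
            throw new IllegalArgumentException("length must be positive");
        }
        // Each byte yields two hex digits, so round the byte count up.
        byte[] bytes = new byte[(len + 1) / 2];
        RNG.nextBytes(bytes);
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16));
            sb.append(Character.forDigit(b & 0xF, 16));
        }
        // Trim the extra digit when len is odd.
        return sb.substring(0, len);
    }

    public static void main(String[] args) {
        System.out.println(nextSecureHexString(32));
    }
}
```

Whether identifiers produced this way are adequate for a given security purpose depends on the platform's SecureRandom provider and on the identifier length chosen.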
Tim O'Brien 5879d3be89 Altered the build script to use batch test instead of running
a suite.  The advantage of this is that when new tests are added to the
project, a TestSuite class does not need to be updated.

All classes *Test.java are included, and *AbstractTest.java are excluded.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140837 13f79535-47bb-0310-9956-ffa450edef68
2003-05-18 00:55:18 +00:00
Tim O'Brien d6f7028269 Added a FixedDoubleArray. FixedDoubleArray supports a rolling mechanism
that reuses an array of fixed length.  This class was added to provide an
efficient rolling mechanism.

FixedDoubleArray was influenced by discussions on the commons-dev list and
patches submitted by Mark Diggory.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140836 13f79535-47bb-0310-9956-ffa450edef68
2003-05-17 23:24:21 +00:00
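Note on the commit above: a rolling mechanism that reuses a fixed-length array is commonly built as a circular buffer. The sketch below illustrates that technique under that assumption; it is not the FixedDoubleArray source, and the class `RollingBuffer` with its members is hypothetical.

```java
/** Hypothetical sketch of a fixed-length rolling buffer (circular array). */
class RollingBuffer {
    private final double[] data;
    private int start = 0; // index of the oldest element
    private int size = 0;  // number of elements currently stored

    RollingBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        data = new double[capacity];
    }

    /**
     * Adds a value. Once the buffer is full, the oldest element is
     * overwritten and returned; before that, NaN is returned.
     */
    double addElementRolling(double v) {
        if (size < data.length) {
            data[(start + size) % data.length] = v;
            size++;
            return Double.NaN;
        }
        double discarded = data[start];
        data[start] = v;
        start = (start + 1) % data.length;
        return discarded;
    }

    /** Returns the element at a logical index, where 0 is the oldest element. */
    double getElement(int index) {
        if (index < 0 || index >= size) {
            throw new ArrayIndexOutOfBoundsException(index);
        }
        return data[(start + index) % data.length];
    }

    int getNumElements() {
        return size;
    }
}
```

No allocation or copying happens after construction; only the start index moves as values roll through.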
Tim O'Brien 71dfdabde1 The Univariate interface now contains getWindow and setWindow in
addition to a constant which signifies an "infinite" window.  Windowing
has been added to all three Univariate implementations:

* UnivariateImpl - If the window is not infinite, we keep track of
0..n elements and discount the contribution of the discarded element when
our "window" is moved.  If the window is infinite no extra storage is used
beyond an empty ContractableDoubleArray.

- In the following two implementations, the window size can be changed at any time.

* ListStoreUnivariateImpl - If the window is not infinite, this
implementation only takes into account the last n elements of the List.

* StoreUnivariateImpl - Uses an internal ContractableDoubleArray, window size
can be changed at any time.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140835 13f79535-47bb-0310-9956-ffa450edef68
2003-05-16 05:23:29 +00:00
Tim O'Brien 0700b0f482 * DoubleArray is now an interface which is implemented by
ExpandableDoubleArray.  The interface provides a public interface
which does not hint at any of the storage parameters of
Expandable or Contractable.

* DoubleArrayTest now operates on the DoubleArray interface, casting
to Expandable when we need to call the package-scoped getInternalLength
method.

* While we should not provide access to the internal storage array, it
should be possible to obtain a double[] of elements stored in this
DoubleArray.  double[] getElements() was added to the DoubleArray interface;
it returns the element array.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140834 13f79535-47bb-0310-9956-ffa450edef68
2003-05-16 04:23:06 +00:00
Tim O'Brien 7651a6b14d * ContractableDoubleArray extends ExpandableDoubleArray - I sense the
need for a DoubleArray interface.

* ExpandableDoubleArray and the extension ContractableDoubleArray should
aim towards presenting a public interface that does not expose any
details of the internal storage.  To this end, one is no longer able to get the
internal storage array via public double[] getValues(), and the startIndex
(which was relative to the internal storage array) is no longer available.

* [Expandable|Contractable]DoubleArray now allow one to discard
elements from the front of the array.  Before this commit, one could
accomplish the same goal by changing the starting index of the element
array within the internal storage array.  This solution allowed one to
discard elements from the front of the array, as well as to reclaim
elements by decreasing the startIndex.

There were two problems with this approach (especially in
ContractableDoubleArray).  First, the ContractableDoubleArray can be
"compacted" at any time, thereby resetting the startIndex to zero and the
size of the internal storage array to the number of elements plus one.  Second,
"reclaiming" elements from the internal storage array by finagling
internal "pointers" to the start and end index seems to violate the
principles of encapsulation.  If you "discard" an element from the
front of the array, consider it unavailable.

It should be noted that calling setNumElements allows one to move the end
index of the internal element array at will.  Assume one has a 100 element
array, and one calls setNumElements(10), thereby decreasing the ending index
of the element array by 90.  The 90 "dumped" elements are not currently
reinitialized to the default double primitive value.  This is an open
question.

* Tests for ExpandableDoubleArray and ContractableDoubleArray were
refactored.  Both test classes now extend a DoubleArrayAbstractTest
JUnit class which contains shared unit tests for both "implementations".
An approach like this should be adopted to test the Univariate implementations.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140833 13f79535-47bb-0310-9956-ffa450edef68
2003-05-16 03:55:34 +00:00
Tim O'Brien 065a88e241 Patches from Phil S. applied:
* A TestStatistic interface with corresponding implementation and testcase

PR: Bugzilla 19971
Obtained from: Patches attached to issue
Submitted by: Phil S.
Reviewed by: Tim O.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140832 13f79535-47bb-0310-9956-ffa450edef68
2003-05-15 21:58:23 +00:00
Tim O'Brien 5f2a1cbbc5 Made a number of changes to ExpandableDoubleArray.
* This class now supports the ability to move the starting index of the
internal element array.  This allows one to move the beginning of the
element array and form a sort of "window", which will come into play when
we want to provide moving averages, or "rolling".

* Added an addElementRolling(double v) - this will increment the startIndex
and add the element to the end of the internal element array

* Brought the Clover test coverage up to 100% for this class

Added a class ContractableDoubleArray:

* This is an extension of ExpandableDoubleArray - it adds a configuration
parameter contractionCriteria.  Essentially, if the contractionCriteria is
2.0f we commit to never having the internal storage array provide more
than 2.0 times the storage capacity needed.  Once the internal
storage array exceeds this ratio, the internal storage array is
pruned to the size of the internal element array.

Also, my IDE scolded me for some unused imports in ListUnivariateImpl; they
have been removed.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140831 13f79535-47bb-0310-9956-ffa450edef68
2003-05-15 15:38:48 +00:00
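Note on the commit above: the contractionCriteria rule can be sketched as below. This is a simplified illustration, not the ContractableDoubleArray source; the class `ContractingArray`, its initial capacity of 16, its doubling growth policy, and its method names are assumptions. Pruning to the element count plus one is also an assumption, borrowed from the compaction behaviour described in another message in this log.

```java
import java.util.Arrays;

/** Hypothetical sketch of an array that shrinks when it holds too much spare capacity. */
class ContractingArray {
    private double[] internal = new double[16]; // assumed initial capacity
    private int start = 0;        // index of the first live element
    private int numElements = 0;  // number of live elements
    private final float contractionCriteria;

    ContractingArray(float contractionCriteria) {
        if (contractionCriteria <= 1.0f) {
            throw new IllegalArgumentException("contractionCriteria must exceed 1");
        }
        this.contractionCriteria = contractionCriteria;
    }

    void addElement(double v) {
        if (start + numElements == internal.length) {
            // Out of room at the end: copy live elements to the front of a larger array.
            double[] bigger = new double[Math.max(1, numElements * 2)];
            System.arraycopy(internal, start, bigger, 0, numElements);
            internal = bigger;
            start = 0;
        }
        internal[start + numElements] = v;
        numElements++;
    }

    /** Drops i elements from the front, then contracts if capacity exceeds the criteria. */
    void discardFrontElements(int i) {
        if (i < 0 || i > numElements) {
            throw new IllegalArgumentException("cannot discard " + i + " elements");
        }
        start += i;
        numElements -= i;
        if (internal.length > contractionCriteria * numElements) {
            // Prune to the live elements plus one spare slot.
            internal = Arrays.copyOfRange(internal, start, start + numElements + 1);
            start = 0;
        }
    }

    double getElement(int index) {
        if (index < 0 || index >= numElements) {
            throw new ArrayIndexOutOfBoundsException(index);
        }
        return internal[start + index];
    }

    int getNumElements() {
        return numElements;
    }

    int getInternalLength() {
        return internal.length;
    }
}
```

With a criteria of 2.0f, 16 stored values, and 12 discarded from the front, capacity 16 exceeds 2.0 * 4, so the storage is pruned to 5 slots.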
Tim O'Brien 5ae92f12c7 Another change to the stored Univariates. The calculations are now abstracted
into an AbstractStoreUnivariate class which takes responsibility for
all statistical calculations.  AbstractStoreUnivariate is extended by
two classes:

* StoreUnivariateImpl - This class uses an ExpandableDoubleArray for
internal storage.  This class is a more efficient class in terms
of storage and cycles for users who are interested in gathering statistics
not available in the UnivariateImpl implementation.

* ListUnivariateImpl - This class is for a situation where a user might
wish to maintain a List of numeric objects outside of a StoreUnivariate
instance.  We still need to add serious error checking in the absence of
1.5's generics, but this implementation will work with any list that
contains Number objects - (BigDecimal, BigInteger, Byte, Double, Float,
Integer, Long, Short).  This implementation ultimately transforms all
numeric objects into double primitives via Number.doubleValue().

Because AbstractStoreUnivariate does not hold on to any state, a user
can add values through the Univariate.addValue() function OR
manipulate the contents of the List directly.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140830 13f79535-47bb-0310-9956-ffa450edef68
2003-05-15 06:33:19 +00:00
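Note on the commit above: the stateless, list-backed approach can be sketched as below. This is an illustration only, not ListUnivariateImpl; the class `ListStats` is hypothetical, and it uses generics, which the message notes were not yet available at the time.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: statistics computed on demand over an externally owned List. */
class ListStats {
    private final List<? extends Number> list;

    ListStats(List<? extends Number> list) {
        this.list = list; // the list is referenced, not copied
    }

    int getN() {
        return list.size();
    }

    /** Recomputed on every call, so outside changes to the list are always reflected. */
    double getMean() {
        if (list.isEmpty()) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (Number x : list) {
            sum += x.doubleValue(); // every Number subtype is reduced to a double
        }
        return sum / list.size();
    }

    public static void main(String[] args) {
        List<Number> values = new ArrayList<>();
        values.add(1);    // Integer
        values.add(2.0);  // Double
        ListStats stats = new ListStats(values);
        System.out.println(stats.getMean());
        values.add(6L);   // Long, added directly to the list
        System.out.println(stats.getMean());
    }
}
```

Because nothing is cached, the trade-off is that each statistic costs a full pass over the list every time it is requested.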
Tim O'Brien 52590a7d00 Submitted by: Phil S.
Reviewed by: Tim O.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140829 13f79535-47bb-0310-9956-ffa450edef68
2003-05-15 05:47:51 +00:00
Tim O'Brien 97568dc06f The following changes were made to the Univariate implementation. The public
interface of Univariate was extracted into an interface of the same name.
Univariate, an interface, is now implemented by UnivariateImpl which contains
all code present in the original Univariate implementation.

* StoredUnivariate is an interface which extends Univariate and adds
measures not available in the superinterface such as mode, kurtosis, and skew

* StoredUnivariateImpl provides an implementation which uses the
ExpandableDoubleArray for internal storage.  Calculations are performed
on demand *each* time a particular measure is required; no state is
maintained by this implementation.

* Univariate provided methods addValue(int), addValue(float), addValue(long).
These functions were removed because all of these assignments are
widening conversions, so no cast is required.

* Removed the name property from Univariate - property not relevant to
univariate statistics


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140828 13f79535-47bb-0310-9956-ffa450edef68
2003-05-15 05:39:01 +00:00
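Note on the commit above: the point about widening conversions can be shown with a single double overload. The class `WideningDemo` below is a hypothetical stand-in, not the Univariate interface.

```java
/** Hypothetical sketch: one addValue(double) accepts int, long and float arguments. */
class WideningDemo {
    private double sum;
    private int n;

    void addValue(double v) {
        sum += v;
        n++;
    }

    double getSum() {
        return sum;
    }

    int getN() {
        return n;
    }

    public static void main(String[] args) {
        WideningDemo u = new WideningDemo();
        u.addValue(1);     // int widens to double
        u.addValue(2L);    // long widens to double
        u.addValue(3.5f);  // float widens to double
        System.out.println(u.getSum()); // prints 6.5
    }
}
```

One caveat worth keeping in mind: long to double is a widening conversion in Java, but very large long values can still lose precision when converted.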
Tim O'Brien f039b677b8 Added an expandable double array. This class simply contains a double[] and takes care of automatic expansion of the internal array when necessary. Class added with accompanying unit test.
git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140827 13f79535-47bb-0310-9956-ffa450edef68
2003-05-15 05:19:57 +00:00
Tim O'Brien ccf6befd5f 1. Make RealMatrixImpl implement Serializable
2. Make all currently unimplemented methods throw UnsupportedOperationException
3. Add solve() method to RealMatrix interface, representing vector
   solution to AX = B, where B is the parameter and A is *this.

Phil

Obtained from: Phil S.
Submitted by: Phil S.
Reviewed by: Tim O.


git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140826 13f79535-47bb-0310-9956-ffa450edef68
2003-05-13 19:08:14 +00:00
Tim O'Brien 9b7cfb86b2 Added myself to STATUS and PROPOSAL
git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140825 13f79535-47bb-0310-9956-ffa450edef68
2003-05-12 20:32:26 +00:00
Tim O'Brien 352f134f0e A maven project.xml was added for conv
git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140824 13f79535-47bb-0310-9956-ffa450edef68
2003-05-12 19:28:54 +00:00
Robert Burrell Donkin e4694325bc Starting source code - basic matrix operations and univariate stats plus test code. Submitted by Phil Steitz.
git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140823 13f79535-47bb-0310-9956-ffa450edef68
2003-05-12 19:04:38 +00:00
Robert Burrell Donkin 925847780e added new commons math component
git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140822 13f79535-47bb-0310-9956-ffa450edef68
2003-05-12 15:07:54 +00:00
No Author 4a8cbc2867 New repository initialized by cvs2svn.
git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@140821 13f79535-47bb-0310-9956-ffa450edef68
2003-05-12 15:07:54 +00:00