diff --git a/xdocs/userguide/index.xml b/xdocs/userguide/index.xml
index 7e953c113..859fb6b69 100644
--- a/xdocs/userguide/index.xml
+++ b/xdocs/userguide/index.xml
@@ -17,7 +17,7 @@
 -->
-
+
 The Commons Math User Guide - Table of Contents
@@ -36,7 +36,7 @@
  • 0.5 Dependencies
  • -
  • 1. Statistics +
  • 1. Statistics and Distributions
    • 1.1 Overview
    • 1.2 Univariate statistics
    •
diff --git a/xdocs/userguide/stat.xml b/xdocs/userguide/stat.xml
index de995ff57..098db56f1 100644
--- a/xdocs/userguide/stat.xml
+++ b/xdocs/userguide/stat.xml
@@ -17,20 +17,76 @@
 -->
-
+
 The Commons Math User Guide - Statistics
-
      +
      -

      This is yet to be written. Any contributions will be greatfully - accepted!

      +

      + The statistics and distributions packages provide frameworks and implementations for + basic univariate statistics, frequency distributions, bivariate regression, t- and chi-square test + statistics and some commonly used probability distributions. +

      -

      This is yet to be written. Any contributions will be gratefully - accepted!

      +

      + The stat package includes a framework and default implementations for the following univariate + statistics: +

        +
      • arithmetic and geometric means
      • +
      • variance and standard deviation
      • +
      • sum, product, log sum, sum of squared values
      • +
      • minimum, maximum, median, and percentiles
      • +
      • skewness and kurtosis
      • +
      • first, second, third and fourth moments
      • +
      +

      +

      + With the exception of percentiles and the median, all of these statistics can be computed without + maintaining the full list of input data values in memory. The stat package provides interfaces and + implementations that do not require value storage as well as implementations that operate on arrays + of stored values. +

      +

      + The top level interface is + + org.apache.commons.math.stat.univariate.UnivariateStatistic. This interface, implemented by + all statistics, consists of evaluate() methods that take double[] arrays as arguments and return + the value of the statistic. This interface is extended by + + org.apache.commons.math.stat.univariate.StorelessUnivariateStatistic, which adds increment(), + getResult() and associated methods to support "storageless" implementations that + maintain counters, sums or other state information as values are added using the increment() + method. +
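
The split described above, between an array-based evaluate() and an incremental increment()/getResult() pair, can be sketched in a few lines. This is a minimal, self-contained illustration and not the Commons Math source: the names StorelessSketch, ArrayStatistic, IncrementalStatistic and RunningMean are hypothetical, and only the method names evaluate, increment and getResult come from the text.

```java
// Hypothetical illustration only: these types are NOT the Commons Math classes.
class StorelessSketch {

    /** Stands in for the top-level idea: compute a statistic from a double[] array. */
    interface ArrayStatistic {
        double evaluate(double[] values);
    }

    /** Stands in for the storeless extension: values arrive one at a time. */
    interface IncrementalStatistic extends ArrayStatistic {
        void increment(double value);
        double getResult();
    }

    /** Arithmetic mean that keeps only a counter and a sum, never the values. */
    static final class RunningMean implements IncrementalStatistic {
        private long n = 0;
        private double sum = 0.0;

        public void increment(double value) {
            n++;
            sum += value;
        }

        public double getResult() {
            // Returning NaN for an empty sample is a choice made for this sketch.
            return n == 0 ? Double.NaN : sum / n;
        }

        public double evaluate(double[] values) {
            // Uses a fresh instance so the running state of this object is untouched.
            RunningMean m = new RunningMean();
            for (double v : values) {
                m.increment(v);
            }
            return m.getResult();
        }
    }

    public static void main(String[] args) {
        RunningMean mean = new RunningMean();
        for (double v : new double[] {1.0, 2.0, 3.0, 4.0}) {
            mean.increment(v);
        }
        System.out.println(mean.getResult());                          // prints 2.5
        System.out.println(mean.evaluate(new double[] {10.0, 20.0}));  // prints 15.0
    }
}
```

In this sketch, evaluate() is written so that it leaves the incrementally accumulated state alone; whether the library's own implementations behave the same way is not stated in the text.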

      +

      + Abstract implementations of the top level interfaces are provided in + + org.apache.commons.math.stat.univariate.AbstractUnivariateStatistic and + + org.apache.commons.math.stat.univariate.AbstractStorelessUnivariateStatistic respectively. +

      +

      + Each statistic is implemented as a separate class, in one of the subpackages (moment, rank, summary) and + each extends one of the abstract classes above (depending on whether or not value storage is required to + compute the statistic). + There are several ways to instantiate and use statistics. Statistics can be instantiated and used directly, but it is + generally more convenient to access them using the provided aggregates: + + + + +
      Aggregate | Statistics Included | Values stored?
      + org.apache.commons.math.stat.DescriptiveStatistics | All | Yes
      + org.apache.commons.math.stat.SummaryStatistics | min, max, mean, geometric mean, n, sum, sum of squares, standard deviation, variance | No
      + TODO: add code sample + There is also a utility class, + org.apache.commons.math.stat.StatUtils, that provides static methods for computing statistics + from double[] arrays. +
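
Until the sample marked TODO above is written, the difference between an aggregate that stores no values and a statistic that needs the full array can be sketched as follows. This is a standalone illustration and does not call the library: SummarySketch and all of its methods are hypothetical names, and the one-pass variance update (Welford's method) is an assumption chosen for the sketch, not necessarily what SummaryStatistics does internally.

```java
import java.util.Arrays;

// Hypothetical illustration only: SummarySketch is NOT a Commons Math class.
class SummarySketch {
    private long n = 0;
    private double min = Double.NaN;
    private double max = Double.NaN;
    private double mean = 0.0;
    private double m2 = 0.0; // running sum of squared deviations from the mean

    /** Updates every tracked statistic; the value itself is not retained. */
    void addValue(double x) {
        n++;
        if (n == 1 || x < min) min = x;
        if (n == 1 || x > max) max = x;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean); // Welford's one-pass update
    }

    long getN() { return n; }
    double getMin() { return min; }
    double getMax() { return max; }
    double getMean() { return n == 0 ? Double.NaN : mean; }

    /** Sample variance (n - 1 denominator); NaN below two values is this sketch's choice. */
    double getVariance() { return n < 2 ? Double.NaN : m2 / (n - 1); }

    /** Array-based helper: the median needs all values at once, so it sorts a copy. */
    static double median(double[] values) {
        if (values.length == 0) return Double.NaN;
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    public static void main(String[] args) {
        double[] data = {2, 4, 4, 4, 5, 5, 7, 9};
        SummarySketch stats = new SummarySketch();
        for (double v : data) {
            stats.addValue(v);
        }
        System.out.println("n=" + stats.getN()
                + " min=" + stats.getMin() + " max=" + stats.getMax());
        System.out.println("mean=" + stats.getMean()
                + " variance=" + stats.getVariance());
        System.out.println("median=" + median(data));
    }
}
```

The one-pass accumulator corresponds to the "Values stored? No" row of the table, while the median helper shows why percentile-style statistics fall on the "Yes" side: they cannot be updated from counters and sums alone.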

      This is yet to be written. Any contributions will be gratefully
@@ -73,6 +129,7 @@
      F | createFDistribution

      Numerator degrees of freedom
      Denominator degrees of freedom
      Gamma | createGammaDistribution
      Alpha
      Beta
      Hypergeometric | createHypogeometricDistribution
      Population size
      Number of successes in population
      Sample size
      + Normal (Gaussian) | createNormalDistribution
      Mean
      Standard Deviation
      t | createTDistribution
      Degrees of freedom