Filled in missing content in univariate statistics section.

git-svn-id: https://svn.apache.org/repos/asf/jakarta/commons/proper/math/trunk@141115 13f79535-47bb-0310-9956-ffa450edef68
2004-03-03 02:32:25 +00:00 · 2004-03-03 02:32:25 +00:00 · e6c5757f99
parent be15008b64
commit e6c5757f99
1 changed files with 97 additions and 16 deletions
--- a/xdocs/userguide/stat.xml
+++ b/xdocs/userguide/stat.xml
@@ -17,7 +17,7 @@
  -->
  
 <?xml-stylesheet type="text/xsl" href="./xdoc.xsl"?>
-<!-- $Revision: 1.9 $ $Date: 2004/02/29 21:25:08 $ -->
+<!-- $Revision: 1.10 $ $Date: 2004/03/03 02:32:25 $ -->
 <document url="stat.html">
  <properties>
    <title>The Commons Math User Guide - Statistics</title>
@@ -57,7 +57,7 @@
          all statistics, consists of <code>evaluate()</code> methods that take double[] arrays as arguments and return 
          the value of the statistic.   This interface is extended by 
          <a href="../apidocs/org/apache/commons/math/stat/univariate/StorelessUnivariateStatistic.html">
-          org.apache.commons.math.stat.univariate.StorelessUnivariateStatistic,</a> which adds <code>increment(),</code>
+          StorelessUnivariateStatistic,</a> which adds <code>increment(),</code>
          <code>getResult()</code> and associated methods to support "storageless" implementations that
          maintain counters, sums or other state information as values are added using the <code>increment()</code>
          method.  
@@ -65,29 +65,110 @@
        <p>
          Abstract implementations of the top level interfaces are provided in 
          <a href="../apidocs/org/apache/commons/math/stat/univariate/AbstractUnivariateStatistic.html">
-          org.apache.commons.math.stat.univariate.AbstractUnivariateStatistic</a> and
+          AbstractUnivariateStatistic</a> and
          <a href="../apidocs/org/apache/commons/math/stat/univariate/AbstractStorelessUnivariateStatistic.html">
-          org.apache.commons.math.stat.univariate.AbstractStorelessUnivariateStatistic</a> respectively.
+          AbstractStorelessUnivariateStatistic</a> respectively.
        </p>
        <p>
          Each statistic is implemented as a separate class, in one of the subpackages (moment, rank, summary) and
          each extends one of the abstract classes above (depending on whether or not value storage is required to 
          compute the statistic).
          There are several ways to instantiate and use statistics.  Statistics can be instantiated and used directly,  but it is
-          generally more convenient to access them using the provided aggregates: 
-          <table>
-            <tr><th>Aggregate</th><th>Statistics Included</th><th>Values stored?</th></tr>
-            <tr><td><a href="../apidocs/org/apache/commons/math/stat/DescriptiveStatistics.html">
-            org.apache.commons.math.stat.DescriptiveStatistics</a></td><td>All</td><td>Yes</td></tr>
-            <tr><td><a href="../apidocs/org/apache/commons/math/stat/SummaryStatistics.html">
-            org.apache.commons.math.stat.SummaryStatistics</a></td><td>min, max, mean, geometric mean, n, sum, sum of squares, standard deviation, variance</td><td>No</td></tr>
-          </table>
-          TODO: add code sample
-          There is also a utility class, <a href="../apidocs/org/apache/commons/math/stat/StatUtils.html">
-           org.apache.commons.math.stat.StatUtils,</a> that provides static methods for computing statistics
-           from double[] arrays. 
+          generally more convenient (and efficient) to access them using the provided aggregates, <a href="../apidocs/org/apache/commons/math/stat/DescriptiveStatistics.html">
+            DescriptiveStatistics</a> and <a href="../apidocs/org/apache/commons/math/stat/SummaryStatistics.html">
+            SummaryStatistics.</a>  <code>DescriptiveStatistics</code> maintains the input data in memory and has the capability
+            of producing "rolling" statistics computed from a "window" consisting of the most recently added values.  <code>SummaryStatistics</code>
+            does not store the input data values in memory, so the statistics included in this aggregate are limited to those that can be
+            computed in one pass through the data without access to the full array of values.  
        </p>
+        <p>
+          <table>
+            <tr><th>Aggregate</th><th>Statistics Included</th><th>Values stored?</th><th>"Rolling" capability?</th></tr>
+            <tr><td><a href="../apidocs/org/apache/commons/math/stat/DescriptiveStatistics.html">
+            DescriptiveStatistics</a></td><td>min, max, mean, geometric mean, n, sum, sum of squares, standard deviation, variance, percentiles, skewness, kurtosis, median</td><td>Yes</td><td>Yes</td></tr>
+            <tr><td><a href="../apidocs/org/apache/commons/math/stat/SummaryStatistics.html">
+            SummaryStatistics</a></td><td>min, max, mean, geometric mean, n, sum, sum of squares, standard deviation, variance</td><td>No</td><td>No</td></tr>
+          </table>
+        </p>
+        <p>
+          There is also a utility class, <a href="../apidocs/org/apache/commons/math/stat/StatUtils.html">
+           StatUtils,</a> that provides static methods for computing statistics
+           directly from double[] arrays. 
+        </p>
+        <p>
+          Here are some examples showing how to compute univariate statistics.
+          <dl>
+          <dt>Compute summary statistics for a list of double values</dt>
+          <br></br>
+          <dd>Using the <code>DescriptiveStatistics</code> aggregate (values are stored in memory):
+        <source>
+// Get a DescriptiveStatistics instance using factory method
+DescriptiveStatistics stats = DescriptiveStatistics.newInstance(); 
+
+// Add the data from the array
+for( int i = 0; i &lt; inputArray.length; i++) {
+        stats.addValue(inputArray[i]);
+}
+
+// Compute some statistics 
+double mean = stats.getMean();
+double std = stats.getStandardDeviation();
+double median = stats.getMedian();
+  	  	</source>
+  	    </dd>
+  	    <dd>Using the <code>SummaryStatistics</code> aggregate (values are <strong>not</strong> stored in memory):
+       <source>
+// Get a SummaryStatistics instance using factory method
+SummaryStatistics stats = SummaryStatistics.newInstance(); 
+
+// Read data from an input stream (in is assumed to be a BufferedReader), updating sums, counters, etc.
+String line;
+while ((line = in.readLine()) != null) {
+        stats.addValue(Double.parseDouble(line.trim()));
+}
+in.close();
+
+// Compute the statistics 
+double mean = stats.getMean();
+double std = stats.getStandardDeviation();
+//double median = stats.getMedian(); &lt;-- NOT AVAILABLE in SummaryStatistics
+  	  	</source>
+  	    </dd>	
+  	     <dd>Using the <code>StatUtils</code> utility class:
+       <source>
+// Compute statistics directly from the array -- assume values is a double[] array
+double mean = StatUtils.mean(values);
+double std = Math.sqrt(StatUtils.variance(values));
+double median = StatUtils.percentile(values, 50);
+// Compute the mean of the first three values in the array 
+mean = StatUtils.mean(values, 0, 3); 
+  	  	</source>
+  	    </dd>  
+  	    <dt>Maintain a "rolling mean" of the most recent 100 values from an input stream</dt>
+  	    <br></br>
+  	    <dd>Use a <code>DescriptiveStatistics</code> instance with window size set to 100
+  	    <source>
+// Create a DescriptiveStatistics instance and set the window size to 100
+DescriptiveStatistics stats = DescriptiveStatistics.newInstance();
+stats.setWindowSize(100);
+// Read data from an input stream, displaying the mean of the most recent 100 observations
+// after every 100 observations
+long nLines = 0;
+String line;
+while ((line = in.readLine()) != null) {
+        stats.addValue(Double.parseDouble(line.trim()));
+        if (++nLines == 100) {
+                nLines = 0;
+                System.out.println(stats.getMean());  // "rolling" mean of most recent 100 values
+        }
+}
+in.close();
+  	    </source>
+  	    </dd>  	    
+  	    </dl>
+  	   </p>
      </subsection>
+      
      <subsection name="1.3 Frequency distributions" href="frequency">
        <p>This is yet to be written. Any contributions will be gratefully
          accepted!</p>
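
The guide text in this change contrasts statistics that are updated one value at a time without storing the data with a windowed aggregate that keeps only the most recent values. The following is a minimal, library-free sketch of both ideas, not the Commons Math implementation. The class names RunningMoments, RollingMean and StatSketchDemo are hypothetical, and returning NaN for undefined results is a choice made for this sketch only.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** One-pass mean and variance: keeps counters only, never the data (hypothetical class). */
class RunningMoments {
    private long n = 0;
    private double mean = 0.0;
    private double m2 = 0.0; // sum of squared deviations from the current mean

    /** Welford's update; plays the role the guide describes for increment(). */
    void increment(double x) {
        n++;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }

    long getN() { return n; }

    /** Mean of the values seen so far; NaN if none (sketch-only convention). */
    double getMean() { return n == 0 ? Double.NaN : mean; }

    /** Sample variance (divides by n - 1); NaN for fewer than two values. */
    double getVariance() { return n < 2 ? Double.NaN : m2 / (n - 1); }

    double getStandardDeviation() { return Math.sqrt(getVariance()); }
}

/** Mean over the most recent windowSize values (hypothetical class). */
class RollingMean {
    private final int windowSize;
    private final Deque<Double> window = new ArrayDeque<>();
    private double sum = 0.0;

    RollingMean(int windowSize) {
        if (windowSize < 1) {
            throw new IllegalArgumentException("windowSize must be >= 1");
        }
        this.windowSize = windowSize;
    }

    /** Adds a value and evicts the oldest one once the window is full. */
    void addValue(double x) {
        window.addLast(x);
        sum += x;
        if (window.size() > windowSize) {
            sum -= window.removeFirst();
        }
    }

    double getMean() { return window.isEmpty() ? Double.NaN : sum / window.size(); }
}

class StatSketchDemo {
    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4, 5};
        RunningMoments moments = new RunningMoments();
        RollingMean rolling = new RollingMean(3);
        for (double x : data) {
            moments.increment(x);
            rolling.addValue(x);
        }
        System.out.println("mean = " + moments.getMean());
        System.out.println("variance = " + moments.getVariance());
        System.out.println("mean of last 3 values = " + rolling.getMean());
    }
}
```

The one-pass class can report mean and variance but not a median or percentile, because those need the full set of values. That is the same limitation the guide attributes to SummaryStatistics.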