From 77a6b785d20681a25b701bef174d5acb90c31724 Mon Sep 17 00:00:00 2001 From: Luc Maisonobe Date: Sun, 15 Mar 2009 21:34:47 +0000 Subject: [PATCH] updated documentation after the redesign of the optimization package git-svn-id: https://svn.apache.org/repos/asf/commons/proper/math/trunk@754764 13f79535-47bb-0310-9956-ffa450edef68 --- src/site/site.xml | 1 - src/site/xdoc/changes.xml | 3 + src/site/xdoc/userguide/estimation.xml | 364 ----------------------- src/site/xdoc/userguide/index.xml | 30 +- src/site/xdoc/userguide/ode.xml | 10 +- src/site/xdoc/userguide/optimization.xml | 139 +++++++-- 6 files changed, 141 insertions(+), 406 deletions(-) delete mode 100644 src/site/xdoc/userguide/estimation.xml diff --git a/src/site/site.xml b/src/site/site.xml index bea25d574..7d8ecbbcf 100644 --- a/src/site/site.xml +++ b/src/site/site.xml @@ -55,7 +55,6 @@ - diff --git a/src/site/xdoc/changes.xml b/src/site/xdoc/changes.xml index ba3f36f04..0ea729c53 100644 --- a/src/site/xdoc/changes.xml +++ b/src/site/xdoc/changes.xml @@ -39,6 +39,9 @@ The type attribute can be add,update,fix,remove. + + Redesigned the optimization framework for a simpler yet more powerful API. + Fixed an error in computing gcd and lcm for some extreme values at integer range boundaries. diff --git a/src/site/xdoc/userguide/estimation.xml b/src/site/xdoc/userguide/estimation.xml deleted file mode 100644 index 9dadf7368..000000000 --- a/src/site/xdoc/userguide/estimation.xml +++ /dev/null @@ -1,364 +0,0 @@ - - - - - - - - - - The Commons Math User Guide - Parametric Estimation - - - -
- -

- The estimation package provides classes to fit some non-linear - model to available observations depending on it. These - problems are commonly called estimation problems. -

-

- The estimation problems considered here are parametric - problems where a user-provided model depends on initially - unknown scalar parameters and several measurements made on - values that depend on the model are available. As examples, - one can consider the center and radius of a circle given - points approximately lying on a ring, or a satellite orbit - given range, range-rate and angular measurements from various - ground stations. -

-

- One important class of estimation problems is weighted least - squares problems. They basically consist in finding the values - for some parameters p_k such that a cost function - J = sum(w_i r_i^2) is minimized. - The various r_i terms represent the deviation - r_i = mes_i - mod_i - between the measurements and the parameterized models. The - w_i factors are the measurement weights; they are often - chosen either all equal to 1.0 or proportional to the inverse of the - variance of the measurement type. The solver adjusts the values of - the estimated parameters p_k which are not bound (i.e. the - free parameters). It does not touch the parameters which have been - put in a bound state by the user.

-

- The aim of this package is similar to the aim of the - optimization package, but the algorithms are entirely - different as: -

    -
  • - they need the partial derivatives of the measurements with - respect to the free parameters -
  • -
  • - they are residuals based instead of generic cost functions - based -
  • -
-

-
- - -

- The problem modeling is the most important part for the - user. Understanding it is the key to proper use of the - package. One interface and two classes are provided for this - purpose: - EstimationProblem, - EstimatedParameter and - WeightedMeasurement. -

-

- Consider the following example problem: we want to determine the - linear trajectory of a sailing ship by performing angular and - distance measurements from an observing spot on the shore. The - problem model is represented by two equations: -

-

- x(t) = x0+(t-t0)vx0
- y(t) = y0+(t-t0)vy0 -

-

- These two equations depend on four parameters (x0, y0, - vx0 and vy0). We want to determine these four parameters. -

-

- Assuming the observing spot is located at the origin of the coordinate - system and that the angular measurements correspond to the angle between - the x axis and the line of sight, the theoretical values of the angular - measurements at t_i and of the distance measurements at - t_j are modeled as follows:

-

- angle_{i,theo} = atan2(y(t_i), x(t_i))
- distance_{j,theo} = sqrt(x(t_j)^2 + y(t_j)^2) -

-

- The real observations generate a set of measurement values angle_{i,meas} - and distance_{j,meas}.

-

- The following class diagram shows one way to solve this problem using the - estimation package. The grey elements are already provided by the package - whereas the purple elements are developed by the user. -

- -

- The TrajectoryDeterminationProblem class holds the linear model - equations x(t) and y(t). It delegates storage of the four parameters x0, - y0, vx0 and vy0 and of the various measurements - angle_{i,meas} and distance_{j,meas} to its base class - SimpleEstimationProblem. Since the theoretical values of the measurements - angle_{i,theo} and distance_{j,theo} depend on the linear model, - the two classes AngularMeasurement and DistanceMeasurement - are implemented as internal classes, thus having access to the equations of the - linear model and to the parameters.

-

- Here are the various parts of the TrajectoryDeterminationProblem.java - source file. This example, with an additional main method is - available here. -

-
First, the general setup of the class: declarations, fields, constructor, setters and getters: - -public class TrajectoryDeterminationProblem extends SimpleEstimationProblem { - public TrajectoryDeterminationProblem(double t0, - double x0Guess, double y0Guess, - double vx0Guess, double vy0Guess) { - this.t0 = t0; - x0 = new EstimatedParameter( "x0", x0Guess); - y0 = new EstimatedParameter( "y0", y0Guess); - vx0 = new EstimatedParameter("vx0", vx0Guess); - vy0 = new EstimatedParameter("vy0", vy0Guess); - - // inform the base class about the parameters - addParameter(x0); - addParameter(y0); - addParameter(vx0); - addParameter(vy0); - - } - - public double getX0() { - return x0.getEstimate(); - } - - public double getY0() { - return y0.getEstimate(); - } - - public double getVx0() { - return vx0.getEstimate(); - } - - public double getVy0() { - return vy0.getEstimate(); - } - - public void addAngularMeasurement(double wi, double ti, double ai) { - // let the base class handle the measurement - addMeasurement(new AngularMeasurement(wi, ti, ai)); - } - - public void addDistanceMeasurement(double wi, double ti, double di) { - // let the base class handle the measurement - addMeasurement(new DistanceMeasurement(wi, ti, di)); - } - - public double x(double t) { - return x0.getEstimate() + (t - t0) * vx0.getEstimate(); - } - - public double y(double t) { - return y0.getEstimate() + (t - t0) * vy0.getEstimate(); - } - - // measurements internal classes go here - - private double t0; - private EstimatedParameter x0; - private EstimatedParameter y0; - private EstimatedParameter vx0; - private EstimatedParameter vy0; - -} - -
-
The two specialized measurements class are simple internal classes that - implement the equation for their respective measurement type, using the - enclosing class to get the parameters references and the linear models x(t) - and y(t). The serialVersionUID static fields are present because - the WeightedMeasurement class implements the - Serializable interface. - - private class AngularMeasurement extends WeightedMeasurement { - - public AngularMeasurement(double weight, double t, double angle) { - super(weight, angle); - this.t = t; - } - - public double getTheoreticalValue() { - return Math.atan2(y(t), x(t)); - } - - public double getPartial(EstimatedParameter parameter) { - double xt = x(t); - double yt = y(t); - double r = Math.sqrt(xt * xt + yt * yt); - double u = yt / (r + xt); - double c = 2 * u / (1 + u * u); - if (parameter == x0) { - return -c; - } else if (parameter == vx0) { - return -c * t; - } else if (parameter == y0) { - return c * xt / yt; - } else { - return c * t * xt / yt; - } - } - - private final double t; - private static final long serialVersionUID = -5990040582592763282L; - - } - - - private class DistanceMeasurement extends WeightedMeasurement { - - public DistanceMeasurement(double weight, double t, double angle) { - super(weight, angle); - this.t = t; - } - - public double getTheoreticalValue() { - double xt = x(t); - double yt = y(t); - return Math.sqrt(xt * xt + yt * yt); - } - - public double getPartial(EstimatedParameter parameter) { - double xt = x(t); - double yt = y(t); - double r = Math.sqrt(xt * xt + yt * yt); - if (parameter == x0) { - return xt / r; - } else if (parameter == vx0) { - return xt * t / r; - } else if (parameter == y0) { - return yt / r; - } else { - return yt * t / r; - } - } - - private final double t; - private static final long serialVersionUID = 3257286197740459503L; - - } - -
-
- -

- Solving the problem is simply a matter of choosing an implementation - of the - Estimator interface and passing the problem instance to its estimate - method. Two implementations are already provided by the library: - GaussNewtonEstimator and - LevenbergMarquardtEstimator. The first one implements a simple Gauss-Newton - algorithm, which is sufficient when the starting point (initial guess) is close - enough to the solution. The second one implements a more complex Levenberg-Marquardt - algorithm which is more robust when the initial guess is far from the solution.

-

- The following sequence diagram explains roughly what occurs under the hood - in the estimate method. -

- -

- Basically, the estimator first retrieves the parameters and the measurements. - The estimation loop is based on the gradient of the sum of the squares of the - residuals; hence, the estimators get the various partial derivatives of all - measurements with respect to all parameters. A new state hopefully globally - reducing the residuals is estimated, and the parameter values are updated. - This estimation loop stops when either the convergence conditions are met - or the maximal number of iterations is exceeded.

-
- -

- One important tuning parameter for weighted least-squares solving is the - weight attributed to each measurement. These weights have two purposes:

-
    -
  • fixing unit problems when combining different types of measurements
  • -
  • adjusting the influence of good or bad measurements on the solution
  • -
-

- The weight is a multiplicative factor for the square of the residuals. - A common choice is to use the inverse of the variance of the measurement error - as the weighting factor for all measurements of one type. In our sailing ship - example, we may have a range measurement accuracy of about 1 meter and an angular - measurement accuracy of about 0.01 degree, or 1.7 × 10^-4 radians. So we - would use w = 1.0 for the distance measurement weight and w = 3 × 10^7 for - the angular measurement weight. If we knew that the measurement quality is bad - at tracking start because of measurement system warm-up delay for example, then - we would reduce the weight for the first measurements and use for example - w = 0.1 and w = 3 × 10^6 respectively, depending on the type.
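The inverse-variance rule described above can be checked with a few lines of Java. This is only a sketch: the class and method names are hypothetical and not part of the estimation package, and the accuracies are the ones quoted for the sailing ship example.

```java
// Hypothetical helper, not part of Commons Math: derives least-squares
// weights from measurement accuracies using w = 1 / sigma^2.
class MeasurementWeights {

    /** Inverse-variance weight for a measurement with standard deviation sigma. */
    static double inverseVariance(double sigma) {
        return 1.0 / (sigma * sigma);
    }

    public static void main(String[] args) {
        double rangeSigma = 1.0;                   // meters
        double angleSigma = Math.toRadians(0.01);  // about 1.7e-4 radians
        System.out.println("range weight = " + inverseVariance(rangeSigma));
        System.out.println("angle weight = " + inverseVariance(angleSigma));
    }
}
```

The angular weight comes out at a few times 10^7, which matches the order of magnitude used in the text.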

-

- After a problem has been set up, it is possible to fine tune the - way it will be solved. For example, it may appear the measurements are not - sufficient to get some parameters with sufficient confidence due to observability - problems. It is possible to fix some parameters in order to prevent the solver - from changing them. This is realized by passing true to the - setBound method of the parameter. -

-

- It is also possible to ignore some measurements by passing true to the - setIgnored method of the measurement. A typical use is to -

    -
  1. - perform a first determination with all parameters, to check each measurement - residual after convergence (i.e. to compute the difference between the - measurement and its theoretical value as computed from the estimated parameters), -
  2. -
  3. - compute standard deviation for the measurements samples (one sample for each - measurements type) -
  4. -
  5. - ignore measurements whose residuals are above some threshold (for example three - times the standard deviation of the residuals) assuming they correspond to - bad measurements, -
  6. -
  7. - perform another determination on the reduced measurements set. -
  8. -
-
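The screening step of the procedure above (compute the standard deviation of the residuals, then flag those above a multiple of it) could be sketched as follows. The class is hypothetical and works on a plain array of residuals; in the estimation package the flagged entries would then be passed to setIgnored on the corresponding measurements.

```java
import java.util.Arrays;

// Hypothetical sketch of residual screening; not part of Commons Math.
class ResidualScreening {

    /** Sample standard deviation of the residuals (needs at least two values). */
    static double standardDeviation(double[] residuals) {
        double mean = Arrays.stream(residuals).average().orElse(0.0);
        double sumSq = 0.0;
        for (double r : residuals) {
            sumSq += (r - mean) * (r - mean);
        }
        return Math.sqrt(sumSq / (residuals.length - 1));
    }

    /** Flags residuals whose magnitude exceeds factor times the standard deviation. */
    static boolean[] flagOutliers(double[] residuals, double factor) {
        double threshold = factor * standardDeviation(residuals);
        boolean[] ignored = new boolean[residuals.length];
        for (int i = 0; i < residuals.length; i++) {
            ignored[i] = Math.abs(residuals[i]) > threshold;
        }
        return ignored;
    }

    public static void main(String[] args) {
        double[] residuals = new double[21];
        residuals[20] = 10.0; // one clearly bad measurement among good ones
        System.out.println(Arrays.toString(flagOutliers(residuals, 3.0)));
    }
}
```

With very few samples a single outlier inflates the standard deviation so much that a three-sigma threshold may never trigger, so the screening is only meaningful on reasonably large measurement sets.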

-
-
- -
diff --git a/src/site/xdoc/userguide/index.xml b/src/site/xdoc/userguide/index.xml index dbdd36477..742111d4b 100644 --- a/src/site/xdoc/userguide/index.xml +++ b/src/site/xdoc/userguide/index.xml @@ -112,28 +112,20 @@
  • 11.2 Vectors
  • 11.3 Rotations
  • -
  • 12. Parametric Estimation +
  • 12. Optimization
  • -
  • 13. Optimization +
  • 13. Ordinary Differential Equations Integration
  • -
  • 14. Ordinary Differential Equations Integration -
  • diff --git a/src/site/xdoc/userguide/ode.xml b/src/site/xdoc/userguide/ode.xml index e9c42f113..c5c4fc5f0 100644 --- a/src/site/xdoc/userguide/ode.xml +++ b/src/site/xdoc/userguide/ode.xml @@ -26,8 +26,8 @@ -
    - +
    +

    The ode package provides classes to solve Ordinary Differential Equations problems.

    @@ -105,7 +105,7 @@ automatic guess is wrong.

    - +

    Discrete events detection is based on switching functions. The user provides a simple g(t, y) @@ -176,7 +176,7 @@ public int eventOccurred(double t, double[] y) { } - +

    First order ODE problems are defined by implementing the FirstOrderDifferentialEquations @@ -196,7 +196,7 @@ public int eventOccurred(double t, double[] y) { that implement it are allowed to handle them as they want.

    - +

    The tables below show the various integrators available for non-stiff problems.

    diff --git a/src/site/xdoc/userguide/optimization.xml b/src/site/xdoc/userguide/optimization.xml index e56cd200e..69a83ab58 100644 --- a/src/site/xdoc/userguide/optimization.xml +++ b/src/site/xdoc/userguide/optimization.xml @@ -26,28 +26,93 @@ -
    - +
    +

    - The optimization package provides algorithms to optimize (i.e. minimize) some - objective or cost function. The package is split in several sub-packages - dedicated to different kind of functions or algorithms. + The optimization package provides algorithms to optimize (i.e. either minimize + or maximize) some objective or cost function. The package is split into several + sub-packages dedicated to different kinds of functions or algorithms.

      -
    • the univariate package handles univariate real functions,
    • +
    • the univariate package handles univariate scalar functions,
    • the linear package handles multivariate vector linear functions with linear constraints,
    • -
    • the direct package handles multivariate real functions using direct - search methods (i.e. not using derivatives),
    • -
    • the general package handles multivariate real or vector functions +
    • the direct package handles multivariate scalar functions + using direct search methods (i.e. not using derivatives),
    • +
    • the general package handles multivariate scalar or vector functions using derivatives.

    -
    -

    - A - org.apache.commons.math.optimization.univariate.UnivariateRealMinimizer - is used to find the minimal values of a univariate real-valued function f. + The top-level optimization package provides common interfaces for the optimization + algorithms provided in sub-packages. The main interfaces define objective functions + and optimizers.

    +

    + Objective function interfaces are intended to be implemented by + user code to represent the problem to minimize or maximize. When the goal is to + minimize, the objective function is often called a cost function. Objective + functions can be either scalar or vectorial and can be either differentiable or + not. There are four different interfaces, one for each case:

    +

    + +

    + Optimizers are the algorithms that will either minimize or maximize the objective function + by changing its set of input variables until an optimal set is found. There are only three + interfaces defining the common behavior of optimizers, one for each type of objective + function except + VectorialObjectiveFunction:

    +

    + +

    + Although there are only three types of supported optimizers, it is possible to optimize a + non-differentiable + VectorialObjectiveFunction by transforming it into a + ScalarObjectiveFunction thanks to the + LeastSquaresConverter helper class. The transformed function can be optimized using any + implementation of the + ScalarOptimizer interface.
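The idea behind such a conversion can be sketched in plain Java. This is not the LeastSquaresConverter API: the class below is hypothetical, it uses java.util.function types in place of the package's objective function interfaces, and it assumes the scalar value is the unweighted sum of squared residuals against a target array.

```java
import java.util.function.Function;
import java.util.function.ToDoubleFunction;

// Hypothetical sketch; the real helper class may differ in names and signature.
class LeastSquaresSketch {

    /** Wraps a vector-valued function into the scalar sum of squared residuals. */
    static ToDoubleFunction<double[]> toScalar(Function<double[], double[]> vectorial,
                                               double[] target) {
        return point -> {
            double[] values = vectorial.apply(point);
            double sum = 0.0;
            for (int i = 0; i < target.length; i++) {
                double residual = target[i] - values[i];
                sum += residual * residual;
            }
            return sum;
        };
    }

    public static void main(String[] args) {
        // Vector function p -> (p, 2p) compared against the target (1, 2).
        ToDoubleFunction<double[]> scalar =
            toScalar(p -> new double[] {p[0], 2.0 * p[0]}, new double[] {1.0, 2.0});
        System.out.println(scalar.applyAsDouble(new double[] {0.0}));
    }
}
```

Once the problem is expressed as a single scalar value, any optimizer that only needs function values can be applied to it.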

    + +

    + There are also three special implementations which wrap classical optimizers in order to + add a multi-start feature to them. This feature calls the underlying optimizer several times + in sequence with different starting points and returns the best optimum found or all optima + if desired. This is a classical way to avoid being trapped in a local extremum when + looking for a global one. The multi-start wrappers are + MultiStartScalarOptimizer, + MultiStartScalarDifferentiableOptimizer and + MultiStartVectorialDifferentiableOptimizer.
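A minimal sketch of the multi-start idea, assuming a one-dimensional problem and a fixed list of starting points, might look like this. Both the wrapper and the naive local search are hypothetical illustrations and do not reproduce the MultiStart classes named above.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical sketch of a multi-start wrapper; not the Commons Math classes.
class MultiStartSketch {

    /** Runs the local minimizer from every start point and keeps the best result. */
    static double minimize(DoubleUnaryOperator objective,
                           DoubleUnaryOperator localMinimizer,
                           double[] starts) {
        double best = Double.NaN;
        double bestValue = Double.POSITIVE_INFINITY;
        for (double start : starts) {
            double candidate = localMinimizer.applyAsDouble(start);
            double value = objective.applyAsDouble(candidate);
            if (value < bestValue) {
                bestValue = value;
                best = candidate;
            }
        }
        return best;
    }

    /** Naive local search: moves downhill, halving the step when stuck. */
    static DoubleUnaryOperator localSearch(DoubleUnaryOperator f) {
        return start -> {
            double x = start;
            double step = 0.5;
            while (step > 1e-9) {
                if (f.applyAsDouble(x + step) < f.applyAsDouble(x)) {
                    x += step;
                } else if (f.applyAsDouble(x - step) < f.applyAsDouble(x)) {
                    x -= step;
                } else {
                    step /= 2;
                }
            }
            return x;
        };
    }

    public static void main(String[] args) {
        // Two local minima near x = 1 and x = -1; the one near -1 is lower.
        DoubleUnaryOperator f = x -> (x * x - 1) * (x * x - 1) + 0.3 * x;
        System.out.println(minimize(f, localSearch(f), new double[] {2.0, -2.0}));
    }
}
```

Started only from x = 2, the local search settles in the shallower minimum near x = 1; adding the second start at x = -2 lets the wrapper return the lower minimum near x = -1.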

    +
    + +

    + A + UnivariateRealMinimizer is used to find the minimal values of a univariate scalar-valued function + f.

    Minimization algorithms usage is very similar to root-finding algorithms usage explained @@ -55,11 +120,14 @@ finding algorithms is replaced by minimize methods.

    - +

    + This package provides an implementation of George Dantzig's simplex algorithm + for solving linear optimization problems with linear equality and inequality + constraints.

    - +

    Direct search methods only use cost function values, they don't need derivatives and don't either try to compute approximation of @@ -97,8 +165,45 @@ multi-directional method.

    - +

    + The general package deals with non-linear vectorial optimization problems when + the partial derivatives of the objective function are available. +

    +

    + One important class of estimation problems is weighted least + squares problems. They basically consist in finding the values + for some parameters p_k such that a cost function + J = sum(w_i (target_i - model_i(p_k))^2) is + minimized. The various (target_i - model_i(p_k)) + terms are called residuals. They represent the deviation between a set of + target values target_i and theoretical values computed from + models model_i depending on free parameters p_k. + The w_i factors are weights. One classical use case is when the + target values are experimental observations or measurements.

    +

    + Solving a least-squares problem means finding the free parameters p_k + of the theoretical models such that the model values are close to the target values, i.e. + such that the residuals are small.

    +

    + Two optimizers are available in the general package, both devoted to least-squares + problems. The first one is based on the + Gauss-Newton method. The second one is the + Levenberg-Marquardt method. +
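To make the Gauss-Newton idea concrete, here is a minimal sketch for a model with a single free parameter, model_i(p) = exp(p * x_i). The class, the model and the fixed iteration count are assumptions chosen for illustration; this is not the optimizer shipped in the general package, which handles several parameters and convergence checking.

```java
// Hypothetical one-parameter Gauss-Newton sketch; not the Commons Math optimizer.
class GaussNewtonSketch {

    /** Fits p in model_i(p) = exp(p * x_i) to the weighted targets. */
    static double fit(double[] x, double[] target, double[] weights,
                      double start, int iterations) {
        double p = start;
        for (int k = 0; k < iterations; k++) {
            double jtr = 0.0; // sum of w_i * J_i * r_i
            double jtj = 0.0; // sum of w_i * J_i^2
            for (int i = 0; i < x.length; i++) {
                double model = Math.exp(p * x[i]);
                double residual = target[i] - model;
                double jacobian = x[i] * model; // derivative of model_i with respect to p
                jtr += weights[i] * jacobian * residual;
                jtj += weights[i] * jacobian * jacobian;
            }
            p += jtr / jtj; // step minimizing the linearized weighted residuals
        }
        return p;
    }

    public static void main(String[] args) {
        double[] x = {0.0, 1.0, 2.0};
        double[] target = {1.0, Math.exp(0.5), Math.exp(1.0)}; // exact data for p = 0.5
        double[] weights = {1.0, 1.0, 1.0};
        System.out.println(fit(x, target, weights, 0.3, 20));
    }
}
```

Each iteration linearizes the model around the current parameter and solves the resulting linear least-squares problem, which is why the method needs the partial derivatives and works best when the starting point is already close to the solution.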

    +

    + In order to solve a vectorial optimization problem, the user must provide it as + an object implementing the + VectorialDifferentiableObjectiveFunction interface. The object will be provided to + the optimize method of the optimizer, along with the target and weight arrays, + thus allowing the optimizer to compute the residuals at will. The last parameter to the + optimize method is the point from which the optimizer will start its + search for the optimal point.