## Class RobustFit

- java.lang.Object
- edu.rit.numeric.RobustFit

public class RobustFit extends Object

Class RobustFit uses a robust estimation procedure to fit a series of (*x*, *y*) data points to a model. The data series is an instance of class XYSeries. The model is represented by a ParameterizedFunction that computes the *y* value, given an *x* value. The model also has *parameters*.

Given a data series, a model function, and an initial guess for the parameter values, class RobustFit's `fit()` method finds parameter values that minimize the following *metric*:

$\sum_i \rho(y_i - f(x_i, \text{parameters}))$

where $f$ is the model function and $\rho$ is one of these metric functions:

- Normal: $\rho(z) = z^2/2$
- Exponential: $\rho(z) = |z|$
- Cauchy (default): $\rho(z) = \log(1 + z^2/2)$
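The three metric functions and the summed metric can be written out directly. The following is a minimal sketch in plain Java, not the library's code: the class name `MetricSketch`, the use of `DoubleUnaryOperator` in place of the library's `Function` type, and the `metric()` helper are all assumptions made for illustration.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical stand-ins for the three metric functions; not the library's code. */
final class MetricSketch {
    // rho(z) = z^2 / 2
    static final DoubleUnaryOperator NORMAL = z -> z * z / 2.0;
    // rho(z) = |z|
    static final DoubleUnaryOperator EXPONENTIAL = z -> Math.abs(z);
    // rho(z) = log(1 + z^2 / 2)
    static final DoubleUnaryOperator CAUCHY = z -> Math.log(1.0 + z * z / 2.0);

    /**
     * Sum of rho(y[i] - yModel[i]) over all points, where yModel[i] is
     * assumed to hold the model value f(x_i, parameters) for point i.
     */
    static double metric(DoubleUnaryOperator rho, double[] y, double[] yModel) {
        double sum = 0.0;
        for (int i = 0; i < y.length; i++) {
            sum += rho.applyAsDouble(y[i] - yModel[i]);
        }
        return sum;
    }
}
```

For residuals 0, 0, and 7, this gives 24.5 for the normal metric, 7 for the exponential metric, and log(25.5) (about 3.24) for the Cauchy metric, which shows how much less a single large deviation contributes under the Cauchy metric.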

In other words, the `fit()` method *fits* the model to the data by adjusting the parameters to minimize the metric.

The metric function is the negative logarithm of the probability distribution of the errors in the *y* values. The above metric functions correspond to normal, two-sided exponential, and Cauchy error distributions.

The metric functions differ in how they treat *outliers*, i.e., data points that deviate from the model. The normal metric function gives increasing weights to points with increasing deviations. However, because of the increasing weights, outlier points may skew the fit (hence, this is not really a "robust" metric function). The exponential metric function gives equal weights to all points, regardless of deviation. This reduces the influence of outliers on the fit, yielding a more robust fit. With the Cauchy metric function, the weights first increase, then decrease as the deviations increase. This reduces the influence of outliers even further.

The `fit()` method uses class MDMinimizationDownhillSimplex to find the parameter values that minimize the metric. The inputs to and outputs from the `fit()` method are stored in fields of an instance of class RobustFit.

The `fitWithDistribution()` method uses the *bootstrapping* technique to determine the distribution of the model parameters, which depends on the error distribution of the data points. Bootstrapping performs multiple iterations of the model fitting procedure. On each iteration, a trial data set the same size as the original data set is created by sampling the original data points with replacement, and model parameters for the trial data set are computed. The `fitWithDistribution()` method outputs a series of the parameter values found at each iteration; the confidence region for the parameters; and the goodness-of-fit *p*-value.
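The resampling step of bootstrapping can be illustrated on its own. The sketch below is an assumption-laden illustration, not the library's implementation: `BootstrapSketch` and its methods are hypothetical names, the data are held in plain arrays rather than an XYSeries, and `java.util.Random` stands in for whatever `Random` class the library actually expects.

```java
import java.util.Random;

/** Hypothetical illustration of sampling data points with replacement. */
final class BootstrapSketch {
    /** Draws n indices in [0, n) with replacement; each index selects one original point. */
    static int[] resampleIndices(int n, Random prng) {
        int[] idx = new int[n];
        for (int i = 0; i < n; i++) {
            idx[i] = prng.nextInt(n);
        }
        return idx;
    }

    /** Builds one trial data set {x[], y[]} with the same size as the original. */
    static double[][] trialDataSet(double[] x, double[] y, Random prng) {
        int[] idx = resampleIndices(x.length, prng);
        double[] tx = new double[x.length];
        double[] ty = new double[x.length];
        for (int i = 0; i < idx.length; i++) {
            // Keep each (x, y) pair together; only the choice of points is random.
            tx[i] = x[idx[i]];
            ty[i] = y[idx[i]];
        }
        return new double[][] { tx, ty };
    }
}
```

Repeating `trialDataSet()` once per trial and fitting the model to each trial data set would yield one parameter vector per trial, which is the kind of series the bootstrapping description refers to.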

### Field Summary

| Modifier and Type | Field | Description |
| --- | --- | --- |
| `static Function` | **CAUCHY** | The Cauchy metric function. |
| `double[]` | **confidenceRegionLowerBound** | The lower bound of the confidence region for the model parameters. |
| `double[]` | **confidenceRegionUpperBound** | The upper bound of the confidence region for the model parameters. |
| `XYSeries` | **data** | The data series. |
| `static Function` | **EXPONENTIAL** | The exponential metric function. |
| `int` | **M** | The number of parameters in the model, *M*. |
| `Function` | **metric** | The metric function. |
| `double[]` | **metricSeries** | The metric values for the model parameter distribution. |
| `double` | **metricValue** | The metric value. |
| `ParameterizedFunction` | **model** | The model function. |
| `static Function` | **NORMAL** | The normal metric function. |
| `double[]` | **param** | The model parameters. |
| `double[][]` | **paramSeries** | The model parameter distribution. |
| `double` | **pValue** | The goodness-of-fit *p*-value. |

### Constructor Summary

| Constructor | Description |
| --- | --- |
| **RobustFit**(ParameterizedFunction model) | Construct a new robust fitting object for the given model. |

### Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| `void` | **fit**(XYSeries data) | Fit the given data series to the model. |
| `void` | **fitWithDistribution**(XYSeries data, int T, Random prng, double conf) | Fit the given data series to the model and compute the distribution of the model parameters. |

### Field Detail

#### model

public final ParameterizedFunction model

The model function. When `model.f()` is called, the `x` argument is $x_i$, the *x* value of a data point; the `p` argument contains the model parameters; and the return value is $f(x_i, \text{parameters})$.
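As an illustration of the calling convention, here is a sketch of a straight-line model with two parameters. It does not use the library: `ModelFunctionSketch` is a hypothetical stand-in that mirrors only the two methods named on this page (`f()` and `parameterLength()`), and the exact argument types (`double x`, `double[] p`) are an assumption.

```java
/** Hypothetical stand-in for the model interface; mirrors only f() and parameterLength(). */
interface ModelFunctionSketch {
    /** Number of parameters in the model, M. */
    int parameterLength();

    /** Model value f(x, parameters); argument types are assumed. */
    double f(double x, double[] p);
}

/** Straight-line model y = p[0] + p[1] * x, so M = 2. */
final class LineModelSketch implements ModelFunctionSketch {
    @Override
    public int parameterLength() {
        return 2;
    }

    @Override
    public double f(double x, double[] p) {
        return p[0] + p[1] * x;
    }
}
```

With parameters {1.0, 2.0}, calling `f(3.0, p)` on this sketch returns 7.0.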

#### M

public final int M

The number of parameters in the model,*M*.

#### metric

public Function metric

The metric function. By default, this is `CAUCHY`. It can instead be set to `NORMAL`, `EXPONENTIAL`, or some other metric function.

#### param

public final double[] param

The model parameters. On input to the `fit()` and `fitWithDistribution()` methods, `param` contains the initial guess for the model parameters. On output from the `fit()` and `fitWithDistribution()` methods, `param` contains the fitted parameter values.

#### data

public XYSeries data

The data series. It contains the (*x*, *y*) data points to be fitted to the model. It is specified as an argument of the `fit()` and `fitWithDistribution()` methods.

#### metricValue

public double metricValue

The metric value. An output of the `fit()` and `fitWithDistribution()` methods. It is set to the value of the metric for the model with the fitted parameters stored in `param`.

#### paramSeries

public double[][] paramSeries

The model parameter distribution. An output of the `fitWithDistribution()` method. `paramSeries` is a *T*-element array, where *T* is the number of trials. Each element of `paramSeries` is an *M*-element array giving the fitted parameter values for the corresponding trial.

#### metricSeries

public double[] metricSeries

The metric values for the model parameter distribution. An output of the `fitWithDistribution()` method. `metricSeries` is a *T*-element array, where *T* is the number of trials. Each element of `metricSeries` gives the value of the metric for the model with the parameters stored in the corresponding element of `paramSeries`.

#### confidenceRegionLowerBound

public double[] confidenceRegionLowerBound

The lower bound of the confidence region for the model parameters. An output of the `fitWithDistribution()` method. The confidence level is specified as an argument of the `fitWithDistribution()` method; for example, 0.90 specifies a 90% confidence level. The confidence region is an *M*-dimensional rectangular hyperprism centered on the fitted parameters stored in `param`, such that the given fraction of the model parameter distribution stored in `paramSeries` falls within the hyperprism. `confidenceRegionLowerBound` gives the lower bound of each dimension of the confidence region hyperprism.

#### confidenceRegionUpperBound

public double[] confidenceRegionUpperBound

The upper bound of the confidence region for the model parameters. An output of the `fitWithDistribution()` method. `confidenceRegionUpperBound` gives the upper bound of each dimension of the confidence region hyperprism.
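The defining property of the confidence region (a given fraction of the parameter distribution falls within the hyperprism) can be checked with a short helper. This is a sketch under assumptions, not the procedure the library uses to find the bounds: `ConfidenceRegionSketch` and `fractionInside()` are hypothetical names, and treating the bounds as inclusive is an assumption.

```java
/** Hypothetical check of how much of a parameter distribution lies inside a hyperprism. */
final class ConfidenceRegionSketch {
    /**
     * Fraction of trials whose parameter vector lies within [lower[j], upper[j]]
     * in every dimension j. Bounds are treated as inclusive (an assumption).
     */
    static double fractionInside(double[][] paramSeries, double[] lower, double[] upper) {
        int inside = 0;
        for (double[] p : paramSeries) {
            boolean ok = true;
            for (int j = 0; j < p.length; j++) {
                if (p[j] < lower[j] || p[j] > upper[j]) {
                    ok = false;
                    break;
                }
            }
            if (ok) {
                inside++;
            }
        }
        return (double) inside / paramSeries.length;
    }
}
```

Applied to the `paramSeries`, `confidenceRegionLowerBound`, and `confidenceRegionUpperBound` arrays, a helper like this would be expected to return a value close to the requested confidence level.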

#### pValue

public double pValue

The goodness-of-fit *p*-value. An output of the `fitWithDistribution()` method. This gives the probability that a metric value greater than or equal to `metricValue` would occur by chance, even if the model with parameters `param` is correct.
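One common way to estimate such a tail probability from a set of trial metric values is to count the fraction of trials at or above the observed value. The sketch below shows that idea only; whether class RobustFit computes `pValue` this way is not stated on this page, and `PValueSketch` and `empiricalPValue()` are hypothetical names.

```java
/** Hypothetical empirical tail-probability estimate; not necessarily the library's method. */
final class PValueSketch {
    /** Fraction of trial metric values that are greater than or equal to metricValue. */
    static double empiricalPValue(double[] metricSeries, double metricValue) {
        int count = 0;
        for (double m : metricSeries) {
            if (m >= metricValue) {
                count++;
            }
        }
        return (double) count / metricSeries.length;
    }
}
```

For trial metric values {1, 2, 3, 4} and an observed value of 3, this estimate is 0.5.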

#### NORMAL

public static final Function NORMAL

The normal metric function.

#### EXPONENTIAL

public static final Function EXPONENTIAL

The exponential metric function.

#### CAUCHY

public static final Function CAUCHY

The Cauchy metric function.

### Constructor Detail

#### RobustFit

public RobustFit(ParameterizedFunction model)

Construct a new robust fitting object for the given model. The `model` field is set to the corresponding argument. The `M` field is set by calling the model function's `parameterLength()` method. The `param` field is allocated with *M* elements; initially, the elements are 0.

- Parameters:
  - `model` - Model function.
- Throws:
  - `NullPointerException` - (unchecked exception) Thrown if `model` is null.

### Method Detail

#### fit

public void fit(XYSeries data)

Fit the given data series to the model. The data series is stored in the `data` field. The model function was specified to the constructor, and is also stored in the `model` field. On input to the `fit()` method, `param` contains the initial guess for the model parameters. On output from the `fit()` method, `param` contains the fitted parameter values and `metricValue` contains the value of the metric for the fitted parameters.

The `fit()` method uses the downhill simplex technique to find the model parameters that minimize the metric. This involves initializing the *simplex* in an MDMinimizationDownhillSimplex object. The `initializeSimplex()` method is called to initialize the simplex.

- Parameters:
  - `data` - Data series.
- Throws:
  - `TooManyIterationsException` - (unchecked exception) Thrown if too many iterations occurred without finding parameters that minimize the metric function.
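The effect of minimizing different metrics can be tried out without the library. The sketch below does not use the downhill simplex technique; it swaps in a brute-force grid scan over a single slope parameter for a hypothetical model $y = a x$. The class `GridFitSketch`, the method `fitSlope()`, and the use of `DoubleUnaryOperator` for the metric function are all assumptions made for illustration.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical one-parameter fit by grid scan; a stand-in for a real minimizer. */
final class GridFitSketch {
    /**
     * Returns the slope a on the grid lo, lo + step, ..., hi that minimizes
     * the sum over i of rho(y[i] - a * x[i]).
     */
    static double fitSlope(DoubleUnaryOperator rho, double[] x, double[] y,
                           double lo, double hi, double step) {
        double bestA = lo;
        double bestMetric = Double.POSITIVE_INFINITY;
        int n = (int) Math.round((hi - lo) / step);
        for (int k = 0; k <= n; k++) {
            double a = lo + k * step;
            double metric = 0.0;
            for (int i = 0; i < x.length; i++) {
                metric += rho.applyAsDouble(y[i] - a * x[i]);
            }
            if (metric < bestMetric) {
                bestMetric = metric;
                bestA = a;
            }
        }
        return bestA;
    }
}
```

With x = {1, 2, 3, 4, 5} and y = {2, 4, 6, 8, 40}, where the last point is an outlier from the line of slope 2, scanning slopes from 0 to 10 gives a slope of about 4.73 for the normal metric `z -> z * z / 2` and a slope close to 2 for the Cauchy metric `z -> Math.log(1 + z * z / 2)`.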

#### fitWithDistribution

public void fitWithDistribution(XYSeries data, int T, Random prng, double conf)

Fit the given data series to the model and compute the distribution of the model parameters. The data series is stored in the `data` field. The bootstrapping technique with *T* trials using the given pseudorandom number generator is used to compute the distribution. The given confidence level is used to compute the confidence region; for example, 0.90 specifies a 90% confidence level. The model function was specified to the constructor, and is also stored in the `model` field. On input to the `fitWithDistribution()` method, `param` contains the initial guess for the model parameters. On output from the `fitWithDistribution()` method, `param` contains the fitted parameter values, `metricValue` contains the value of the metric for the fitted parameters, `paramSeries` contains the series of fitted parameter values from all the trials, `metricSeries` contains the metric values from all the trials, `confidenceRegionLowerBound` and `confidenceRegionUpperBound` contain the lower and upper bounds of the confidence region hyperprism, and `pValue` contains the goodness-of-fit *p*-value.

The `fitWithDistribution()` method uses the downhill simplex technique to find the model parameters that minimize the metric. This involves initializing the *simplex* in an MDMinimizationDownhillSimplex object. The `initializeSimplex()` method is called to initialize the simplex.

- Parameters:
  - `data` - Data series.
  - `T` - Number of trials.
  - `prng` - Pseudorandom number generator.
  - `conf` - Confidence level, in the range 0.0 .. 1.0.
- Throws:
  - `IllegalArgumentException` - (unchecked exception) Thrown if `conf` is out of bounds.
  - `TooManyIterationsException` - (unchecked exception) Thrown if too many iterations occurred without finding parameters that minimize the metric function.

**SCaVis 2.0 © jWork.ORG**