Fit

Fit[data,{f₁,…,f_n},{x,y,…}]

finds a fit a₁ f₁+…+a_n f_n to a list of data for functions f₁,…,f_n of variables {x,y,…}.

Fit[{m,v}]

finds a fit vector a that minimizes $\|m.a-v\|$ for a design matrix m.

Fit[…,"prop"]

specifies what fit property prop should be returned.

Details and Options

Fit is also known as linear regression or least squares fit. With regularization, it is also known as LASSO and ridge regression.
Fit is typically used for fitting combinations of functions to data, including polynomials and exponentials. It provides one of the simplest ways to get a model from data.

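The best fit minimizes the sum of squares $\sum_{i=1}^{k}\left|a_1 f_1(x_i,y_i,\ldots)+\cdots+a_n f_n(x_i,y_i,\ldots)-v_i\right|^2$ over the $k$ data points.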
The data can have the following forms:

	{v₁,…,v_n}	equivalent to {{1,v₁},…,{n,v_n}}
	{{x₁,v₁},…,{x_n,v_n}}	univariate data with values v_i at coordinates x_i
	{{x₁,y₁,v₁},…}	bivariate data with values v_i and coordinates {x_i,y_i}
	{{x₁,y₁,…,v₁},…}	multivariate data with values v_i at coordinates {x_i,y_i,…}

The design matrix m has elements that come from evaluating the functions at the coordinates, $m_{ij}=f_j(x_i,y_i,\ldots)$. In matrix notation, the best fit minimizes the norm $\|m.a-v\|$ where $a=\{a_1,\ldots,a_n\}$ and $v=\{v_1,v_2,\ldots\}$.
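
As a minimal sketch of this construction, the design matrix can be assembled by evaluating each basis function at each coordinate and then passed to the matrix form of Fit. The data values and the names funs, fromMatrix, and fromData are illustrative assumptions, not taken from this page.

```wolfram
(* Illustrative univariate data: {coordinate, value} pairs *)
data = {{0, 1.1}, {1, 2.9}, {2, 5.2}, {3, 6.8}};
funs = {1, x, x^2};

(* Row i holds the basis functions evaluated at the i-th coordinate *)
m = Table[funs /. x -> xi, {xi, data[[All, 1]]}];
v = data[[All, 2]];

(* Both calls are expected to give the same coefficient vector *)
fromMatrix = Fit[{m, v}];
fromData = Fit[data, funs, x, "BestFitParameters"];
```

Comparing fromMatrix with fromData is a quick way to check that a hand-built design matrix matches the basis functions.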
The functions f_i should depend only on the variables {x,y,…}.
The possible fit properties "prop" include:

"BasisFunctions"	funs	the basis functions
"BestFit"	a₁ f₁+…+a_n f_n	the best fit linear combination of basis functions
"BestFitParameters"	{a₁,…,a_n}	the vector that gives the best fit
"Coordinates"	{{x₁,y₁,…},…}	the coordinates of vars in data
"Data"	data	the data
"DesignMatrix"	m	the design matrix
"FitResiduals"		the differences between the response values and the fitted values at the coordinates
"Function"	Function[{x,y,…},a₁ f₁+…+a_n f_n]	best fit pure function
"PredictedResponse"		fitted values for the data coordinates
"Response"	{v₁,v₂,…}	the response vector from the input data
{"prop₁","prop₂",…}		several fit properties
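
A short sketch of requesting several properties in one call, assuming a small illustrative data set (the values and the names params, residuals, and f are not from this page):

```wolfram
data = {{0, 1.1}, {1, 2.9}, {2, 5.2}, {3, 6.8}};

(* A list of property names returns the corresponding list of results *)
{params, residuals, f} =
  Fit[data, {1, x}, x, {"BestFitParameters", "FitResiduals", "Function"}];

(* The pure function can be applied directly to a new coordinate *)
f[1.5]
```

For a least squares fit whose basis includes the constant 1, the residuals should sum to zero up to rounding, which makes a convenient sanity check.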

Fit takes the following options:

NormFunction	Norm	measure of the deviations to minimize
FitRegularization	None	regularization for the fit parameters
WorkingPrecision	Automatic	the precision to use

With NormFunction->normf and FitRegularization->rfun, Fit finds the coefficient vector a that minimizes normf[{a.f(x₁,y₁,…)-v₁,…,a.f(x_k,y_k,…)-v_k}] + rfun[a].
The setting for NormFunction can be given in the following forms:

	normf	a function normf that is applied to the deviations
	{"Penalty", pf}	sum of the penalty function pf applied to each component of the deviations
	{"HuberPenalty",α}	sum of Huber penalty function for each component
	{"DeadzoneLinearPenalty",α}	sum of deadzone linear penalty function for each component
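
The effect of the norm setting can be sketched with a data set containing one outlier. The data below is an illustrative assumption: the first four points lie on the line y = x and the last one does not.

```wolfram
(* Four collinear points plus one outlier *)
data = {{0, 0.}, {1, 1.}, {2, 2.}, {3, 3.}, {4, 20.}};

(* Default: least squares, pulled toward the outlier *)
lsq = Fit[data, {1, x}, x];

(* 1-norm of the deviations: much less sensitive to the outlier *)
l1 = Fit[data, {1, x}, x, NormFunction -> (Norm[#, 1] &)];

(* Huber penalty with parameter 1: a compromise between the two *)
huber = Fit[data, {1, x}, x, NormFunction -> {"HuberPenalty", 1}];
```

With this data the 1-norm fit should stay close to the line through the first four points, while the least squares line is tilted noticeably by the outlier.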

The setting for FitRegularization may be given in the following forms:

	None	no regularization
	rfun	regularize with rfun[a]
	{"Tikhonov",λ}	regularize with $\lambda\|a\|^2$
	{"LASSO",λ}	regularize with $\lambda\|a\|_1$
	{"Variation",λ}	regularize with $\lambda\|\mathrm{Differences}[a]\|^2$
	{"TotalVariation",λ}	regularize with $\lambda\|\mathrm{Differences}[a]\|_1$
	{"Curvature",λ}	regularize with $\lambda\|\mathrm{Differences}[a,2]\|^2$
	{r₁,r₂,…}	regularize with the sum of terms from r₁,…
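
A minimal sketch of the regularization settings, using the design matrix form so that the parameter vector is returned directly. The nearly collinear matrix and the value 0.1 for λ are illustrative assumptions.

```wolfram
(* The second column is almost constant, so the columns are nearly collinear *)
m = {{1, 1.}, {1, 1.001}, {1, 0.999}, {1, 1.002}};
v = {2., 2.1, 1.9, 2.05};

plain = Fit[{m, v}];
ridge = Fit[{m, v}, FitRegularization -> {"Tikhonov", 0.1}];
lasso = Fit[{m, v}, FitRegularization -> {"LASSO", 0.1}];

(* Compare the sizes of the three parameter vectors *)
{Norm[plain], Norm[ridge], Norm[lasso, 1]}
```

The regularized parameter vectors are expected to be much smaller than the unregularized one, at the cost of somewhat larger residuals.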

With WorkingPrecision->Automatic, exact numbers given as input to Fit are converted to approximate numbers with machine precision.
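
A small sketch of the precision settings with exact integer data (the data values are an illustrative assumption):

```wolfram
data = {{1, 2}, {2, 3}, {3, 5}};  (* exact integers *)

(* Default: exact input is converted to machine precision *)
machine = Fit[data, {1, x}, x];

(* 24-digit arithmetic *)
high = Fit[data, {1, x}, x, WorkingPrecision -> 24];

(* Exact arithmetic; for this data the least squares line is 1/3 + 3 x/2 *)
exact = Fit[data, {1, x}, x, WorkingPrecision -> Infinity];
```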

Examples


Basic Examples (2)

Here is some data:

Find the line that best fits the data:

Find the quadratic that best fits the data:

Show the data with the two curves:
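
A minimal sketch of these steps, using a small illustrative data set (an assumption, not necessarily the data used on this page):

```wolfram
data = {{0, 1}, {1, 0}, {3, 2}, {5, 4}};

(* Best-fit line; for this data approximately 0.186441 + 0.694915 x *)
line = Fit[data, {1, x}, x];

(* Best-fit quadratic *)
parabola = Fit[data, {1, x, x^2}, x];

(* Data points together with both curves *)
Show[ListPlot[data, PlotStyle -> Red],
 Plot[Evaluate[{line, parabola}], {x, 0, 5}]]
```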

Find the best fit parameters given a design matrix and response vector:

Scope (2)

Here is some data defined with exact values:

Fit the data to a linear combination of sine functions using machine arithmetic:

Fit the data using 24-digit precision arithmetic:

Show the data with the curve:

Here is some data in three dimensions:

Find the plane that best fits the data:

Show the plane with the data points:

Find the quadratic that best fits the data:

The quadratic actually interpolates the data:

Generalizations & Extensions (1)

Here is a list of values:

Fit to a quadratic. When coordinates are not given, the values are assumed to be paired up with 1, 2, …:

Fit to a quartic:

Show the data with the curve:
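
The implicit pairing with 1, 2, … can be checked with a short sketch. The list of squares is an illustrative assumption, chosen so that the quadratic fit is easy to recognize.

```wolfram
values = {1, 4, 9, 16, 25};

(* Coordinates 1, 2, ... are supplied implicitly *)
implicit = Fit[values, {1, x, x^2}, x, "BestFitParameters"];

(* The same fit with the coordinates written out *)
explicit = Fit[Transpose[{Range[5], values}], {1, x, x^2}, x,
   "BestFitParameters"];
```

Both parameter vectors should agree and, up to rounding, correspond to the polynomial x^2.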

Options (6)

FitRegularization (2)

Tikhonov regularization controls the size of the best fit parameters:

Without the regularization, the coefficients are quite large:

This can also be referred to by "L2" or "RidgeRegression":

LASSO (least absolute shrinkage and selection operator) regularization selects the most significant basis functions and gives some control of the size of the best fit parameters:

Without the regularization, the coefficients are quite large:

This can also be referred to by "L1":

NormFunction (3)

Measure the "best" fit using the 1-norm:

Compare with the least squares (gold) and max norm (green) fits:

Manipulate the norm p interactively:

Use the Huber penalty function to balance the influence of outliers with least squares:

Manipulate the parameter interactively:

Compute a weighted least squares fit:

WorkingPrecision (1)

Compute the fit using 24-digit precision:

Use exact computations:

Applications (6)

Compute a high-degree polynomial fit to data using a Chebyshev basis:

Use regularization to stabilize the numerical solution of a perturbed linear system:

The solution found by LinearSolve has very large entries:

Regularized solutions stay much closer to the solution of the unperturbed problem:
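
A hedged sketch of this kind of comparison, assuming an ill-conditioned Hilbert matrix, a right-hand side perturbed by noise of size 10^-6, and λ = 10^-8. These choices and the variable names are illustrative and may differ from the page's own example.

```wolfram
SeedRandom[1];
n = 10;
m = N[HilbertMatrix[n]];          (* very ill-conditioned matrix *)
xTrue = ConstantArray[1., n];     (* solution of the unperturbed problem *)
b = m . xTrue;
bPerturbed = b + RandomReal[{-1, 1}*10^-6, n];

(* Direct solve; may warn about ill-conditioning and amplifies the noise *)
direct = LinearSolve[m, bPerturbed];

(* Tikhonov-regularized least squares solution *)
regularized = Fit[{m, bPerturbed}, FitRegularization -> {"Tikhonov", 10.^-8}];

(* Distances from the unperturbed solution *)
{Norm[direct - xTrue], Norm[regularized - xTrue]}
```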

Use variation regularization to find a smooth approximation for an input signal:

The output target is a step function:

The input is determined from the output via convolution with a given kernel:

Without regularization, the predicted response matches the signal very closely, but the computed input has a lot of oscillation:

With regularization of the variation, a much smoother approximation can be found:

Regularization of the input size may also be included:

Smooth a corrupted signal using variation regularization:

Plot the tradeoff between the norm of the residuals and the norm of the variation:

Choose a value of λ near where the curve bends sharply:

Use total variation regularization to smooth a corrupted signal with jumps:

Regularize with a chosen value of the parameter λ:

A smaller value of λ gives less smoothing, but the residual norm is smaller:

Use LASSO (L1) regularization to find a sparse fit (basis pursuit):

Here is a signal:

The goal is to approximate the signal with just a few of the thousands of Gabor basis functions:

With a suitable value of the regularization parameter λ, a fit is found using only 41 of the basis elements:

The error is quite small:

Once the important elements of the basis have been found, error can be reduced by finding the least squares fit to these elements:

With a smaller value of λ, a fit is found using more of the basis elements:

The error is even smaller:

Properties & Relations (5)

Fit gives the best fit function:

LinearModelFit allows for extraction of additional information about the fitting:

Extract the fitted function:

Extract additional results and diagnostics:

Here is some data:

This is the sum of squares error for a line a+b x:

Find the minimum symbolically:

These are the coefficients given by Fit:

The exact coefficients may be found by using WorkingPrecision->Infinity:
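
The relation between the symbolic minimum and the exact fit can be sketched as follows, assuming a small illustrative data set (not the page's own data):

```wolfram
data = {{0, 0}, {1, 1}, {2, 1}};

(* Sum of squares error for the line a + b x *)
sse = Total[(a + b #1 - #2)^2 & @@@ data];

(* Symbolic minimum: {1/6, {a -> 1/6, b -> 1/2}} *)
min = Minimize[sse, {a, b}];

(* Exact fit: the line 1/6 + x/2 *)
exactFit = Fit[data, {1, x}, x, WorkingPrecision -> Infinity];
```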

This is the sum of squares error for a quadratic a+b x+c x^2:

Find the minimum symbolically:

These are the coefficients given by Fit:

When a polynomial fit is done to a high enough degree, Fit returns the interpolating polynomial:

The result is consistent with that given by InterpolatingPolynomial:
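
A minimal sketch of this consistency check with three illustrative data points, for which a quadratic has exactly enough coefficients to interpolate:

```wolfram
data = {{0, 1}, {1, 3}, {2, 11}};

(* Quadratic fit in exact arithmetic *)
fit = Fit[data, {1, x, x^2}, x, WorkingPrecision -> Infinity];

(* Interpolating polynomial through the same points: 1 - x + 3 x^2 *)
interp = Expand[InterpolatingPolynomial[data, x]];

(* The difference simplifies to zero *)
Simplify[fit - interp]
```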

Fit will use the time stamps of a TimeSeries as variables:

Rescale the time stamps and fit again:

Find the fit for the values:

Fit acts pathwise on a multipath TemporalData:

Possible Issues (2)

Here is some data from a random perturbation of a Gaussian:

This is a function that gives the standard basis for polynomials:

Show the fits computed for successively higher-degree polynomials:

The problem is that the coefficients get very small for higher powers:

Giving the basis in terms of scaled and shifted values helps with this problem:
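
One way to set up such a basis is sketched below, assuming coordinates far from the origin and a shift and scale taken from their mean and standard deviation. The data, the degree 8, and the names mu, sigma, and basis are illustrative assumptions rather than the page's own example.

```wolfram
SeedRandom[2];
xs = Range[1000., 1010., 0.5];

(* Noisy Gaussian centered at 1005 *)
data = Table[{t, Exp[-(t - 1005)^2/4] + RandomReal[{-0.01, 0.01}]}, {t, xs}];

(* Shift and scale so the basis arguments are of order 1 *)
mu = Mean[xs];
sigma = StandardDeviation[xs];
basis[n_] := Table[((x - mu)/sigma)^k, {k, 0, n}];

fit = Fit[data, basis[8], x];
```

Because the powers are taken of a quantity of order 1 rather than of x itself, the columns of the design matrix stay comparable in size.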

Reconstruct a sparse signal from compressed sensing data:

Compressed sensing consists of multiplying a signal of length $n$ by a $k\times n$ matrix $A$ with $k<n$, so only data of length $k$ has to be transmitted. If $k$ is large enough, a random matrix with independent, identically distributed normal entries will satisfy the restricted isometry property, and the original signal can be recovered with very high probability:

The samples to send are constructed by multiplication of the signal with the matrix $A$:

Reconstruction can be done by minimizing $\|x\|_1$ over all possible solutions $x$ of $A.x=b$, where $b$ is the vector of samples:

Even though the minimization is solved by linear optimization, it is relatively slow because all the constraints are equality constraints. The solution can be found much faster using basis pursuit (L1 regularization):

This gives the basis elements. To find the best solution, solve the linear equations corresponding to those components:
