# Regression (curve fitting)

## Regression

Regression, or curve fitting, takes a series of measured data and determines the parameters of a given function so that the function approximates the data as closely as possible.

## Least Squares Fit

The least squares method determines the parameters of a curve such that the root mean square deviation from the measured values is minimized. In statistics, the method is used in regression analysis.

## Fourier Series

The expansion of a periodic function into a series of trigonometric functions is called its Fourier series. Expanding a function into its Fourier series is known as harmonic analysis.

## Mean values and standard deviation

The arithmetic means

$\overline{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$

$\overline{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$

Standard deviation from the mean

$\sigma = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} \left(x_i - \overline{x}\right)^2}$

For the standard deviation about the regression line, the mean value in this formula is replaced by the function value of the straight line at the respective point.
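The two formulas above can be sketched in a few lines of Python. The function names `mean` and `sample_std` and the sample data are assumptions made for this illustration; they are not part of the original page.

```python
import math

def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)

def sample_std(values):
    """Standard deviation from the mean, using the n - 1 divisor."""
    m = mean(values)
    squared = sum((v - m) ** 2 for v in values)
    return math.sqrt(squared / (len(values) - 1))

# Illustrative data only
x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
x_mean = mean(x)
x_std = sample_std(x)
print(x_mean, x_std)
```

At least two values are needed, since the divisor is n - 1.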

## Fit line for exponential laws

If the measured values follow an exponential relationship, the linear model can still be used to find a best-fit straight line. To do so, take the logarithm of the measured values; a substitution then yields a linear equation.

$y=b\cdot {a}^{x}$

Taking the logarithm leads to a linear equation.

$\mathrm{ln}y=\mathrm{ln}b+x\mathrm{ln}a$

With the logarithm of the measured values, $y' = \ln y$, and the substitutions $a' = \ln a$ and $b' = \ln b$, this is the linear model.

$y' = b' + a' x$

## Regression

Regression analysis is used to determine or estimate model parameters. The starting point is a set of measured values and a model on which the measured values are assumed to be based. Since measured values are generally subject to error, regression analysis adjusts the model parameters for the best possible fit. The basic procedure is the method of least squares.

## Least squares method

The method of least squares is a curve-fitting method. It computes an optimal compromise in which the squared deviations of the measured values from the model function are minimized.

Given the measured values $(x_i, y_i)$ and the model function $f$, the sum of the squared deviations is minimized. To achieve this, the parameters $a_j$ of the model function (the components of $\vec{a}$) are determined so that the following condition is satisfied.

$\sum_{i=1}^{n} \left( f\left(x_i, \vec{a}\right) - y_i \right)^2 \to \min$

### Model function: Linear fit

To determine the regression line, a linear model function $f$ is used in the least squares method.

$f\left(x,\stackrel{\to }{a}\right)={a}_{0}+{a}_{1}x$

The deviations of the regression line from the measured values are then given as follows.

$\begin{array}{c}{a}_{0}+{a}_{1}{x}_{1}-{y}_{1}={r}_{1}\\ {a}_{0}+{a}_{1}{x}_{2}-{y}_{2}={r}_{2}\\ \vdots \\ {a}_{0}+{a}_{1}{x}_{n}-{y}_{n}={r}_{n}\end{array}$

The goal now is to make the sum of the squared deviations $r_i$ from the straight line as small as possible.

$\sum_{i=1}^{n} r_i^2 = \left(a_0 + a_1 x_1 - y_1\right)^2 + \dots + \left(a_0 + a_1 x_n - y_n\right)^2 \to \min$

The extremum is found by setting the partial derivatives with respect to $a_0$ and $a_1$ to zero.

$\frac{\partial}{\partial a_0}\left[\left(a_0 + a_1 x_1 - y_1\right)^2 + \dots + \left(a_0 + a_1 x_n - y_n\right)^2\right] = 0$

$\frac{\partial}{\partial a_1}\left[\left(a_0 + a_1 x_1 - y_1\right)^2 + \dots + \left(a_0 + a_1 x_n - y_n\right)^2\right] = 0$
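As a sketch of the intermediate step: carrying out both differentiations and dividing by 2 gives two linear equations in the two unknowns, commonly called the normal equations.

$\begin{array}{rcl} n\,a_0 + a_1 \sum_{i=1}^{n} x_i & = & \sum_{i=1}^{n} y_i \\ a_0 \sum_{i=1}^{n} x_i + a_1 \sum_{i=1}^{n} x_i^2 & = & \sum_{i=1}^{n} x_i y_i \end{array}$

Dividing the first equation by $n$ shows that the fitted line passes through the point of the mean values $(\overline{x}, \overline{y})$.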

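Solving this system of equations yields the parameters $a_0$ and $a_1$ of the regression line.

$a_0 = \overline{y} - a_1 \overline{x}$

$a_1 = \frac{\sum_{i=1}^{n} \left(x_i - \overline{x}\right)\left(y_i - \overline{y}\right)}{\sum_{i=1}^{n} \left(x_i - \overline{x}\right)^2}$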
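The closed-form solution can be tried out with a short Python sketch. The names `fit_line` and `sum_of_squares` and the data points are assumptions for illustration, not the code behind the page's calculator.

```python
def fit_line(x, y):
    """Return (a0, a1) of the least-squares line f(x) = a0 + a1 * x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    a1 = sxy / sxx             # slope; needs at least two distinct x values
    a0 = y_mean - a1 * x_mean  # intercept
    return a0, a1

def sum_of_squares(x, y, a0, a1):
    """Sum of the squared deviations r_i = a0 + a1 * x_i - y_i."""
    return sum((a0 + a1 * xi - yi) ** 2 for xi, yi in zip(x, y))

# Illustrative data only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.2, 4.1, 6.2, 7.9, 10.1]
a0, a1 = fit_line(x, y)
print(a0, a1, sum_of_squares(x, y, a0, a1))
```

Shifting either parameter slightly away from the returned values increases `sum_of_squares`, which is a quick way to check that the minimum has been found.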

### Model function: power functions

A power function is fitted by reducing it to the linear model function.

$y=a\cdot {x}^{b}$

Taking the logarithm leads to a linear equation.

$\mathrm{ln}y=\mathrm{ln}a+b\mathrm{ln}x$

With the logarithms of the measured values, $y' = \ln y$ and $x' = \ln x$, and the substitution $a' = \ln a$, this is the linear model.

$y' = a' + b x'$

### Approximation (fit) of the Gaussian distribution to measured values

The Gaussian distribution, also called the normal distribution, is defined as follows:

$f\left(x\right)=\frac{1}{\sqrt{2\pi }\sigma }\phantom{\rule{0.3em}{0ex}}{e}^{-\frac{1}{2}\frac{{\left(x-\mu \right)}^{2}}{{\sigma }^{2}}}$

The Gaussian distribution is fitted to the measured values by forming the weighted mean of the values $x_i$, with the values $y_i$ as weights. This weighted mean corresponds to $\mu$ in the Gaussian distribution. The weighted standard deviation of the measured values from the mean is the $\sigma$ of the normal distribution.

$\mu =\frac{\sum _{i=1}^{n}{x}_{i}{y}_{i}}{\sum _{i=1}^{n}{y}_{i}}$

$\sigma = \sqrt{\frac{\sum_{i=1}^{n} \left(x_i - \mu\right)^2 y_i}{\sum_{i=1}^{n} y_i}}$
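A minimal Python sketch of these two formulas follows. The names `fit_gaussian` and `gaussian` and the sample values are assumptions for illustration.

```python
import math

def fit_gaussian(x, y):
    """Return (mu, sigma), using the y values as weights for the x values."""
    total = sum(y)
    mu = sum(xi * yi for xi, yi in zip(x, y)) / total
    variance = sum((xi - mu) ** 2 * yi for xi, yi in zip(x, y)) / total
    return mu, math.sqrt(variance)

def gaussian(x, mu, sigma):
    """Density of the normal distribution as defined above."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)

# Illustrative data only: y is symmetric about x = 3
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 4.0, 6.0, 4.0, 1.0]
mu, sigma = fit_gaussian(x, y)
print(mu, sigma, gaussian(mu, mu, sigma))
```

The fitted curve is a normalized density. To compare it with raw measured values it may need to be scaled, for example by the area under the measured data; that scaling is not covered by the formulas above.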

### Model function: Periodic (Fourier series)

Measured values can also be approximated by periodic functions. The procedure for this is the expansion into a Fourier series. The terms of the Fourier series are sine and cosine functions, and the expansion proceeds in ascending order of frequency.

The Fourier series is:

$s_n(x) = \frac{a_0}{2} + \sum_{k=1}^{n} \left( a_k \cos\left(k \omega x\right) + b_k \sin\left(k \omega x\right) \right)$

with the Fourier coefficients $a_k$ and $b_k$ and $\omega = 2\pi / T$. Here $T = b - a$ is the period, where $a$ is the start and $b$ the end of the interval.

The Fourier coefficients $a_k$ and $b_k$ satisfy the least squares condition for the associated cosine or sine function. They are calculated as follows.

$a_k = \frac{2}{T} \int_{a}^{b} f(x) \cos\left(k \omega x\right) \, dx$

$b_k = \frac{2}{T} \int_{a}^{b} f(x) \sin\left(k \omega x\right) \, dx$

#### Fourier coefficients

For measured values, the Fourier coefficients $a_k$ and $b_k$ are computed numerically, using the trapezoidal rule for the integration. The accuracy can be improved by increasing the number of measurement points in the interval.
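The numerical integration can be sketched in Python as follows. This is an illustration under stated assumptions, not the calculator's actual code: the function names are hypothetical, the measurement points are assumed to be sorted by x, and the interval is assumed to run from the first to the last point.

```python
import math

def trapezoid(x, f):
    """Trapezoidal rule for samples f[i] taken at increasing positions x[i]."""
    return sum((x[i + 1] - x[i]) * (f[i] + f[i + 1]) / 2.0
               for i in range(len(x) - 1))

def fourier_coefficients(x, y, terms):
    """Return (a, b, omega) with a[k], b[k] for k = 0..terms."""
    period = x[-1] - x[0]
    omega = 2.0 * math.pi / period
    a, b = [], []
    for k in range(terms + 1):
        cos_part = [yi * math.cos(k * omega * xi) for xi, yi in zip(x, y)]
        sin_part = [yi * math.sin(k * omega * xi) for xi, yi in zip(x, y)]
        a.append(2.0 / period * trapezoid(x, cos_part))
        b.append(2.0 / period * trapezoid(x, sin_part))
    return a, b, omega

def fourier_sum(t, a, b, omega):
    """Evaluate a0/2 + sum over k of a_k cos(k w t) + b_k sin(k w t)."""
    return a[0] / 2.0 + sum(a[k] * math.cos(k * omega * t)
                            + b[k] * math.sin(k * omega * t)
                            for k in range(1, len(a)))

# Illustrative data: one period of 1 + 2 sin(x) + 0.5 cos(2x)
n = 200
x = [2.0 * math.pi * i / n for i in range(n + 1)]
y = [1.0 + 2.0 * math.sin(xi) + 0.5 * math.cos(2.0 * xi) for xi in x]
a, b, omega = fourier_coefficients(x, y, terms=2)
print(a, b)
```

For this test signal the computed coefficients reproduce the amplitudes used to generate it; with fewer or unevenly spaced points the integration error grows, in line with the remark on accuracy above.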

### Polynomial approximation using the QR method

The linear approximation problem is solved by means of the QR decomposition. This yields the coefficients of a fitting polynomial of the chosen degree.

The starting point is the overdetermined system of equations:

$Ax=b$

$\text{with}\quad x \in \mathbb{R}^{n}, \quad A \in \mathbb{R}^{m \times n}, \quad b \in \mathbb{R}^{m} \quad \text{and} \quad m > n$

The QR decomposition leads to the factorization of the matrix A:

$A=QR$

For the least squares problem, this gives:

${||Ax-b||}_{2}^{2}={||QRx-b||}_{2}^{2}={||{R}^{*}x-{Q}^{\mathrm{T*}}b||}_{2}^{2}$

Here $R$ and $Q$ are reduced to the relevant part: $R^{*}$ is the upper triangular block of $R$, and $Q^{T*}$ contains the corresponding rows of $Q^{T}$. The rows dropped in this reduction contribute only a residual term that does not depend on $x$, so the minimizer is the solution of $R^{*}x = Q^{T*}b$.

Replacing $A$ by the Vandermonde matrix of the measured values $x_i$, and $b$ by the measured values $y_i$, yields the coefficients of the fitting polynomial as the solution of this system of equations.
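The procedure can be sketched in plain Python. The text does not say how the QR factors are computed, so this sketch swaps in modified Gram-Schmidt, which produces the reduced factors directly; library routines more commonly use Householder reflections. All function names and the data are assumptions for illustration.

```python
def vandermonde(x, degree):
    """Rows [1, x_i, x_i**2, ..., x_i**degree], one per measurement."""
    return [[xi ** j for j in range(degree + 1)] for xi in x]

def reduced_qr(A):
    """Modified Gram-Schmidt: returns the columns of Q* and the square R*."""
    m, n = len(A), len(A[0])
    columns = [[A[i][j] for i in range(m)] for j in range(n)]
    Q = []
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = columns[j][:]
        for k in range(j):
            R[k][j] = sum(q * vi for q, vi in zip(Q[k], v))
            v = [vi - R[k][j] * q for vi, q in zip(v, Q[k])]
        R[j][j] = sum(vi * vi for vi in v) ** 0.5  # zero if A is rank deficient
        Q.append([vi / R[j][j] for vi in v])
    return Q, R

def polyfit_qr(x, y, degree):
    """Coefficients c[0] + c[1]*x + ... of the least-squares polynomial."""
    Q, R = reduced_qr(vandermonde(x, degree))
    rhs = [sum(q * yi for q, yi in zip(col, y)) for col in Q]  # Q*^T b
    n = len(R)
    coeffs = [0.0] * n
    for i in reversed(range(n)):  # back substitution on R* x = Q*^T b
        s = sum(R[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (rhs[i] - s) / R[i][i]
    return coeffs

# Illustrative data generated from 1 - 2x + 0.5x^2
x = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
y = [1.0 - 2.0 * xi + 0.5 * xi ** 2 for xi in x]
coeffs = polyfit_qr(x, y, degree=2)
print(coeffs)
```

The sketch needs more measurement points than coefficients and pairwise distinct x values; otherwise a diagonal entry of R* becomes zero and the division fails.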
