Least squares fitting

Source: Internet
Author: User

In physical experiments it is often necessary to observe two physical quantities that are functionally related. Determining their functional curve from many groups of observations of the two quantities is the curve fitting problem of experimental data processing. The problem usually takes one of two forms. In the first, the functional form relating the two observed quantities x and y is known but some parameters are not, and the best estimates of the unknown parameters must be determined. In the second, the functional form relating x and y is unknown, and an empirical formula must be found. In the second case the relationship between x and y is often assumed to be a polynomial with undetermined coefficients. These coefficients are the unknown parameters, so the problem can be handled in much the same way as the first case.

Principle of the least squares method

Of the two observed quantities, one is usually measured much more precisely than the other. For simplicity, the more precise quantity is regarded as error-free and chosen as x, and all errors are attributed to y. Let the functional relationship between x and y be given by the theoretical formula

$$y = f(x; c_1, c_2, \ldots, c_m) \tag{0-0-1}$$

where $c_1, c_2, \ldots, c_m$ are m parameters to be determined by experiment. Each group of observed data $(x_i, y_i)$, i = 1, 2, ..., N, corresponds to a point in the xy plane. If there were no measurement error, these data points would all fall exactly on the theoretical curve. Substituting m groups of measurements into (0-0-1) would then give the system of equations

$$y_i = f(x_i; c_1, c_2, \ldots, c_m) \tag{0-0-2}$$

where i = 1, 2, ..., m. Solving these m simultaneous equations gives the values of the m parameters. Clearly, when N < m the parameters cannot be determined.

When N > m, the system (0-0-2) becomes an inconsistent (overdetermined) system of equations. The m parameter values cannot be obtained by solving the equations directly, and the problem can only be handled by curve fitting. Suppose the measurements contain no systematic error, or that it has already been corrected. Each observed value $y_i$ then fluctuates about its expected value $\langle y_i \rangle = f(x_i; c_1, c_2, \ldots, c_m)$. If the distribution is normal, the probability density of $y_i$ is

$$p(y_i) = \frac{1}{\sqrt{2\pi}\,\sigma_i} \exp\left\{ -\frac{[y_i - f(x_i; c_1, c_2, \ldots, c_m)]^2}{2\sigma_i^2} \right\},$$

where $\sigma_i$ is the standard error of the distribution. For brevity, C is used below to stand for $(c_1, c_2, \ldots, c_m)$. Since the individual measurements are independent of one another, the likelihood function of the observed values $(y_1, y_2, \ldots, y_N)$ is

$$L = \frac{1}{(\sqrt{2\pi})^N \sigma_1 \sigma_2 \cdots \sigma_N} \exp\left\{ -\frac{1}{2} \sum_{i=1}^{N} \frac{[y_i - f(x_i; C)]^2}{\sigma_i^2} \right\}.$$

Estimating the parameters C by maximizing the likelihood function L requires that

$$\sum_{i=1}^{N} \frac{1}{\sigma_i^2} [y_i - f(x_i; C)]^2 \Big|_{C = \hat{C}} \tag{0-0-3}$$

take its minimum value. When the distribution of y is not restricted to the normal distribution, (0-0-3) is called the least squares criterion. For a normal distribution, the maximum likelihood method and the least squares method agree. Since the weight factor is $\omega_i = 1/\sigma_i^2$, (0-0-3) shows that estimating the parameters by least squares requires the weighted sum of squared deviations of the measured values $y_i$ to be a minimum.
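The step from maximizing L to minimizing the weighted sum of squares can be made explicit by taking the logarithm of the likelihood. The following is a short sketch under the assumptions stated above (independent, normally distributed $y_i$ with known standard errors $\sigma_i$):

```latex
\ln L = -\sum_{i=1}^{N} \ln\!\left(\sqrt{2\pi}\,\sigma_i\right)
        - \frac{1}{2} \sum_{i=1}^{N} \frac{[y_i - f(x_i; C)]^2}{\sigma_i^2}
```

The first sum does not depend on C, so L is largest exactly when the second sum, which is the quantity in (0-0-3), is smallest.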

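According to the requirement of (0-0-3), we must have

$$\frac{\partial}{\partial c_k} \sum_{i=1}^{N} \frac{1}{\sigma_i^2} [y_i - f(x_i; C)]^2 \Big|_{C = \hat{C}} = 0 \qquad (k = 1, 2, \ldots, m),$$

which gives the system of equations

$$\sum_{i=1}^{N} \frac{1}{\sigma_i^2} [y_i - f(x_i; C)] \frac{\partial f(x_i; C)}{\partial c_k} \Big|_{C = \hat{C}} = 0 \qquad (k = 1, 2, \ldots, m). \tag{0-0-4}$$

Solving the system (0-0-4) gives the estimates $\hat{c}_1, \hat{c}_2, \ldots, \hat{c}_m$ of the m parameters, and hence the fitted curve $y = f(x; \hat{c}_1, \hat{c}_2, \ldots, \hat{c}_m)$.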
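As a minimal illustration of solving (0-0-4), consider a one-parameter model f(x; c) = c·x, for which the condition reduces to a single linear equation in c. The model, the data values and the function names below are assumptions chosen for illustration and do not come from the source.

```python
def weighted_sum_of_squares(c, xs, ys, sigmas):
    """Weighted sum of squared deviations for the model f(x; c) = c * x."""
    return sum(((y - c * x) / s) ** 2 for x, y, s in zip(xs, ys, sigmas))


def fit_proportional(xs, ys, sigmas):
    """Least-squares estimate of c for the model f(x; c) = c * x.

    Setting the derivative of the weighted sum of squares to zero gives
    sum((y_i - c * x_i) * x_i / sigma_i**2) = 0, which is linear in c.
    """
    numerator = sum(x * y / s ** 2 for x, y, s in zip(xs, ys, sigmas))
    denominator = sum(x * x / s ** 2 for x, s in zip(xs, sigmas))
    return numerator / denominator


# Hypothetical measurements: x is assumed exact, y has standard errors sigmas.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
sigmas = [0.1, 0.1, 0.2, 0.2]

c_hat = fit_proportional(xs, ys, sigmas)
q_min = weighted_sum_of_squares(c_hat, xs, ys, sigmas)
print(c_hat, q_min)
```

Any other value of c gives a larger weighted sum of squares than `q_min`. For models that are nonlinear in the parameters, (0-0-4) generally has to be solved numerically instead.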

The result of the fit should also be evaluated. If $y_i$ is normally distributed, the fit quantity $\chi^2$ can be introduced:

$$\chi^2 = \sum_{i=1}^{N} \frac{1}{\sigma_i^2} [y_i - f(x_i; C)]^2. \tag{0-0-5}$$

Substituting the parameter estimates $\hat{C}$ into this expression and comparing with (0-0-3) gives the minimum $\chi^2$ value

$$\chi^2_{\min} = \sum_{i=1}^{N} \frac{1}{\sigma_i^2} [y_i - f(x_i; \hat{C})]^2. \tag{0-0-6}$$

It can be shown that $\chi^2_{\min}$ follows a $\chi^2$ distribution with $\nu = N - m$ degrees of freedom, so the fitting result can be subjected to a $\chi^2$ test.

From the $\chi^2$ distribution, the expected value of the random variable $\chi^2_{\min}$ is N - m. If the value calculated from (0-0-6) is close to N - m, the fitting result is considered acceptable. If it is much larger than N - m, the fitting result is considered to contradict the observed values significantly.
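The evaluation described above can be sketched as follows: compute the minimum chi-square of (0-0-6) from the fitted parameters and set it beside the number of degrees of freedom N - m. The data, the common standard error of 0.2, the straight-line model and the function names are all assumptions made for this example.

```python
def chi_square(xs, ys, sigmas, model, params):
    """Weighted sum of squared deviations of y from model(x, *params)."""
    return sum(
        ((y - model(x, *params)) / s) ** 2
        for x, y, s in zip(xs, ys, sigmas)
    )


def line(x, a0, a1):
    """Straight-line model, used here only as an example of f(x; C)."""
    return a0 + a1 * x


# Hypothetical data with an assumed common standard error of 0.2.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
sigmas = [0.2] * len(xs)

# Least-squares estimates for this data set, assumed to be computed beforehand.
estimates = (1.04, 1.99)

chi2_min = chi_square(xs, ys, sigmas, line, estimates)
dof = len(xs) - len(estimates)  # N - m
print(f"chi2_min = {chi2_min:.3f}, degrees of freedom = {dof}")
```

The comparison only makes sense when the sigmas are realistic estimates of the measurement errors. If they are overestimated or underestimated, the minimum chi-square shifts accordingly.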

Least squares fitting of a straight line

The most basic and most commonly used form of curve fitting is straight-line fitting. Let the functional relationship between x and y be given by the linear equation

$$y = a_0 + a_1 x \tag{0-0-7}$$

which has two undetermined parameters: $a_0$ is the intercept and $a_1$ is the slope. For N groups of data $(x_i, y_i)$, i = 1, 2, ..., N, obtained by measurements of equal precision, the $x_i$ values are regarded as exact and all errors are associated only with $y_i$. The least squares method is used below to fit the observed data to a straight line.

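1. Estimation of the straight-line parameters

As pointed out above, estimating parameters by the least squares method requires the weighted sum of squared deviations of the observed values $y_i$ to be a minimum. For a straight-line fit of equal-precision observations, (0-0-3) reduces to requiring that

$$\sum_{i=1}^{N} [y_i - (a_0 + a_1 x_i)]^2 \Big|_{a = \hat{a}} \tag{0-0-8}$$

be a minimum. That is, the best estimate of the parameters a (standing for $a_0$, $a_1$) requires the sum of squared deviations of the observed values $y_i$ to be a minimum.

According to the requirement of (0-0-8), we must have

$$\frac{\partial}{\partial a_0} \sum_{i=1}^{N} [y_i - (a_0 + a_1 x_i)]^2 \Big|_{a = \hat{a}} = -2 \sum_{i=1}^{N} (y_i - \hat{a}_0 - \hat{a}_1 x_i) = 0,$$

$$\frac{\partial}{\partial a_1} \sum_{i=1}^{N} [y_i - (a_0 + a_1 x_i)]^2 \Big|_{a = \hat{a}} = -2 \sum_{i=1}^{N} (y_i - \hat{a}_0 - \hat{a}_1 x_i)\, x_i = 0.$$

Rearranging gives the normal equations

$$\hat{a}_0 N + \hat{a}_1 \sum x_i = \sum y_i, \qquad \hat{a}_0 \sum x_i + \hat{a}_1 \sum x_i^2 = \sum x_i y_i. \tag{0-0-9}$$

Solving the normal equations gives the best estimates $\hat{a}_0$ and $\hat{a}_1$ of the straight-line parameters $a_0$ and $a_1$:

$$\hat{a}_0 = \frac{\sum x_i^2 \sum y_i - \sum x_i \sum x_i y_i}{N \sum x_i^2 - \left(\sum x_i\right)^2} \tag{0-0-10}$$

$$\hat{a}_1 = \frac{N \sum x_i y_i - \sum x_i \sum y_i}{N \sum x_i^2 - \left(\sum x_i\right)^2} \tag{0-0-11}$$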
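The closed-form estimates (0-0-10) and (0-0-11) translate directly into code. The sketch below assumes equal-precision data with exact x values, as in the text. The function name and the sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept a0 and slope a1 for y = a0 + a1 * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    # Common denominator of both estimates: N * sum(x^2) - (sum(x))^2.
    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    a0 = (sum_xx * sum_y - sum_x * sum_xy) / denom  # intercept
    a1 = (n * sum_xy - sum_x * sum_y) / denom       # slope
    return a0, a1


# Hypothetical equal-precision measurements.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

a0_hat, a1_hat = fit_line(xs, ys)
print(a0_hat, a1_hat)
```

For large sums or widely spaced x values, subtracting the means of x and y before forming the sums is a common way to reduce rounding error. The direct form is kept here to mirror the formulas.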

2. Deviation of the fitting result

Because the estimates $\hat{a}_0$ and $\hat{a}_1$ of the straight-line parameters are calculated from observed data points that carry errors, they inevitably deviate from the true values. In addition, the observed data points do not all fall exactly on the fitted line, so there is also a deviation between each observed value $y_i$ and the corresponding value $\hat{y}_i = \hat{a}_0 + \hat{a}_1 x_i$ on the fitted line.

The standard deviation S of the measured values $y_i$ is discussed first. In (0-0-6), the $\sigma_i$ are all the same for equal-precision measurements and can be estimated by the standard deviation S of $y_i$, so for a straight-line fit of equal-precision measurements the formula becomes

$$\chi^2_{\min} = \frac{1}{S^2} \sum_{i=1}^{N} [y_i - (\hat{a}_0 + \hat{a}_1 x_i)]^2. \tag{0-0-12}$$

When the measured values are normally distributed, $\chi^2_{\min}$ follows a $\chi^2$ distribution with $\nu = N - 2$ degrees of freedom, and its expectation is

$$\langle \chi^2_{\min} \rangle = \left\langle \frac{1}{S^2} \sum_{i=1}^{N} [y_i - (\hat{a}_0 + \hat{a}_1 x_i)]^2 \right\rangle = N - 2.$$

The standard deviation of $y_i$ is therefore

$$S = \sqrt{\frac{1}{N - 2} \sum_{i=1}^{N} [y_i - (\hat{a}_0 + \hat{a}_1 x_i)]^2}. \tag{0-0-13}$$

This expression is easy to understand. It is consistent with the Bessel formula, except that here the calculation of S is constrained by the two estimating equations for $\hat{a}_0$ and $\hat{a}_1$, so the number of degrees of freedom becomes N - 2.

The value S given by (0-0-13) is also called the standard deviation of the fitted line, and it is an important indicator of whether the fitting result is valid. If two straight lines parallel to the fitted line are drawn in the xy plane,

$$y' = \hat{a}_0 + \hat{a}_1 x - S, \qquad y'' = \hat{a}_0 + \hat{a}_1 x + S,$$

as shown in Figure 0-0-1, then about 68.3% of all observed data points $(x_i, y_i)$ fall within the range between the two lines.

Figure 0-0-1 Distribution of data points on both sides of the fitted line
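The standard deviation of the fitted line and the band of width S on either side of it can be computed as in the sketch below. The data and the fitted parameters are hypothetical, with the parameters assumed to have been obtained by a least-squares fit beforehand. S is computed as the square root of the sum of squared residuals divided by N - 2.

```python
import math


def fitted_line_std(xs, ys, a0, a1):
    """Standard deviation of y about the line a0 + a1 * x (N - 2 d.o.f.)."""
    n = len(xs)
    if n <= 2:
        raise ValueError("need more than two points for N - 2 degrees of freedom")
    ss_res = sum((y - (a0 + a1 * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(ss_res / (n - 2))


def count_within_band(xs, ys, a0, a1, s):
    """Number of points lying between the lines a0 + a1*x - s and a0 + a1*x + s."""
    return sum(1 for x, y in zip(xs, ys) if abs(y - (a0 + a1 * x)) <= s)


# Hypothetical data and the least-squares estimates for it.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
a0_hat, a1_hat = 1.04, 1.99

s = fitted_line_std(xs, ys, a0_hat, a1_hat)
inside = count_within_band(xs, ys, a0_hat, a1_hat, s)
print(f"S = {s:.4f}, {inside} of {len(xs)} points inside the band")
```

With only a handful of points the observed fraction inside the band can differ noticeably from 68.3%, which is a statement about normally distributed errors over many measurements.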

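Next, the deviations of the fitted parameters are discussed. As (0-0-10) and (0-0-11) show, the two parameter estimates $\hat{a}_0$ and $\hat{a}_1$ of the straight-line fit are functions of the $y_i$. Since the $x_i$ are assumed to be exact and all measurement errors belong to the $y_i$, the standard deviations of the two estimated parameters can be obtained from the uncertainty propagation formula:

$$S_{a_0} = \sqrt{\sum_{i=1}^{N} \left( \frac{\partial \hat{a}_0}{\partial y_i} \right)^2 S^2}, \qquad S_{a_1} = \sqrt{\sum_{i=1}^{N} \left( \frac{\partial \hat{a}_1}{\partial y_i} \right)^2 S^2}.$$

Substituting (0-0-10) and (0-0-11) into these two expressions respectively gives

$$S_{a_0} = S \sqrt{\frac{\sum x_i^2}{N \sum x_i^2 - \left(\sum x_i\right)^2}} \tag{0-0-14}$$

$$S_{a_1} = S \sqrt{\frac{N}{N \sum x_i^2 - \left(\sum x_i\right)^2}} \tag{0-0-15}$$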
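The standard deviations of the intercept and slope can be checked numerically in two ways: from the closed forms S·sqrt(Σx²/D) and S·sqrt(N/D), with D = NΣx² - (Σx)², and from the propagation sums over the partial derivatives of the estimates with respect to each y_i. The sketch below does both, assuming equal-precision data. The function names and data are hypothetical.

```python
import math


def line_fit_with_errors(xs, ys):
    """Return (a0, a1, s, s_a0, s_a1) for an equal-precision straight-line fit."""
    n = len(xs)
    if n <= 2:
        raise ValueError("need more than two points")
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical")

    a0 = (sum_xx * sum_y - sum_x * sum_xy) / denom
    a1 = (n * sum_xy - sum_x * sum_y) / denom

    # Standard deviation of y about the fitted line, N - 2 degrees of freedom.
    ss_res = sum((y - (a0 + a1 * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(ss_res / (n - 2))

    # Closed-form standard deviations of the two estimates.
    s_a0 = s * math.sqrt(sum_xx / denom)
    s_a1 = s * math.sqrt(n / denom)
    return a0, a1, s, s_a0, s_a1


def propagated_errors(xs, s):
    """Same standard deviations from the sums of squared partial derivatives."""
    n = len(xs)
    sum_x = sum(xs)
    sum_xx = sum(x * x for x in xs)
    denom = n * sum_xx - sum_x ** 2
    # d(a0)/d(y_i) = (sum_xx - sum_x * x_i) / denom
    # d(a1)/d(y_i) = (n * x_i - sum_x) / denom
    s_a0 = s * math.sqrt(sum(((sum_xx - sum_x * x) / denom) ** 2 for x in xs))
    s_a1 = s * math.sqrt(sum(((n * x - sum_x) / denom) ** 2 for x in xs))
    return s_a0, s_a1


# Hypothetical equal-precision measurements.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

a0_hat, a1_hat, s, s_a0, s_a1 = line_fit_with_errors(xs, ys)
p_a0, p_a1 = propagated_errors(xs, s)
print(s_a0, s_a1, p_a0, p_a1)
```

The two routes agree to rounding error, which is a quick way to confirm the algebra that leads from the propagation formula to the closed forms.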

Correlation coefficient and its significance test

When the observed data points $(x_i, y_i)$ are fitted to a straight line, we do not yet know how close the linear relationship between x and y is. The correlation coefficient ρ(x, y) is used to judge this. Its definition has been given by formula (0-0-12) and is rewritten here in another form, with r denoting the correlation coefficient:

$$r = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2 \sum_{i=1}^{N} (y_i - \bar{y})^2}} \tag{0-0-16}$$

where $\bar{x}$ and $\bar{y}$ are the arithmetic means of x and y respectively. The value of r lies between -1 and +1, that is, -1 ≤ r ≤ 1. When r > 0 the slope of the straight line is positive, which is called positive correlation. When r < 0 the slope is negative, which is called negative correlation. When |r| = 1, all data points $(x_i, y_i)$ fall on the fitted line. If r = 0, there is no linear correlation between x and y. The closer r is to ±1, the closer the linear relationship between them.
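A minimal sketch of computing r as the sum of products of deviations from the means, divided by the square root of the product of the two sums of squared deviations, is given below. The function name and the three small data sets are hypothetical and chosen to show the limiting cases described above.

```python
import math


def correlation_coefficient(xs, ys):
    """Sample correlation coefficient r between xs and ys."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)

    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when x or y is constant")
    return s_xy / math.sqrt(s_xx * s_yy)


# Points exactly on a rising line, on a falling line, and on a symmetric curve.
r_positive = correlation_coefficient([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
r_negative = correlation_coefficient([1.0, 2.0, 3.0], [6.0, 4.0, 2.0])
r_zero = correlation_coefficient([-1.0, 0.0, 1.0], [1.0, 0.0, 1.0])
print(r_positive, r_negative, r_zero)
```

The third data set lies on y = x², so y depends entirely on x even though r is zero. This is why r = 0 should be read as the absence of a linear relationship only.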

Http://www.cnblogs.com/wenrenhua08/archive/2008/10/08/3993629.html
