Learning Big Data, Day 5: Python Implementation of Least Squares (II)

Source: Internet
Author: User

1.numpy.random.normal


numpy.random.normal(loc=0.0, scale=1.0, size=None)

Draw random samples from a normal (Gaussian) distribution.

The probability density function of the normal distribution, first derived by De Moivre and 200 years later by both Gauss and Laplace independently [R250], is often called the bell curve because of its characteristic shape (see the example below).

The normal distribution occurs often in nature. For example, it describes the commonly occurring distribution of samples influenced by a large number of tiny, random disturbances, each with its own unique distribution [R250].

Parameters:

loc : float

Mean ("centre") of the distribution.

scale : float

Standard deviation (spread or "width") of the distribution.

size : int or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
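
As a small sketch of how size controls the shape of the result (the seed and the variable names are illustrative choices, not part of the original text):

```python
import numpy as np

np.random.seed(0)  # fixed seed only so the run is repeatable

# size=None (the default): a single value is returned
single = np.random.normal(loc=0.0, scale=1.0)

# size=5: a 1-D array holding 5 samples
vector = np.random.normal(loc=0.0, scale=1.0, size=5)

# size=(2, 3, 4): 2 * 3 * 4 = 24 samples arranged in a 3-D array
cube = np.random.normal(loc=0.0, scale=1.0, size=(2, 3, 4))

print(vector.shape, cube.shape, cube.size)
```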

See Also

scipy.stats.distributions.norm
probability density function, distribution or cumulative density function, etc.

Notes

The probability density for the Gaussian distribution is

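$$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

where $\mu$ is the mean and $\sigma$ the standard deviation. The square of the standard deviation, $\sigma^2$, is called the variance.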

The function has its peak at the mean, and its "spread" increases with the standard deviation (the function reaches 0.607 times its maximum at $\mu + \sigma$ and $\mu - \sigma$ [R250]). This implies that numpy.random.normal is more likely to return samples lying close to the mean than those far away.
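
The 0.607 factor and the concentration of samples near the mean can be checked numerically. The following is a minimal sketch; the helper normal_pdf, the seed and the sample count are assumptions made for illustration and are not part of NumPy's documentation.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    """Gaussian probability density with mean mu and standard deviation sigma."""
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)

mu, sigma = 0.0, 0.1

# Height of the density one standard deviation from the mean, relative to the peak.
# Analytically this ratio is exp(-1/2), which is about 0.607.
peak = normal_pdf(mu, mu, sigma)
ratio = normal_pdf(mu + sigma, mu, sigma) / peak

# Fraction of random samples that land within one standard deviation of the mean.
np.random.seed(0)
s = np.random.normal(mu, sigma, 100000)
within_one_sigma = np.mean(np.abs(s - mu) < sigma)

print(round(ratio, 3), within_one_sigma)
```

For a normal distribution roughly 68% of the samples are expected within one standard deviation of the mean, so within_one_sigma should come out close to 0.68.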

References

[R249] Wikipedia, "Normal distribution", http://en.wikipedia.org/wiki/normal_distribution
[R250] P. R. Peebles Jr., "Central Limit Theorem" in "Probability, Random Variables and Random Signal Principles", 4th ed., 2001, pp. 51, 51, 125.

Examples

Draw samples from the distribution:

>>> import numpy as np
>>> mu, sigma = 0, 0.1  # mean and standard deviation
>>> s = np.random.normal(mu, sigma, 1000)

Verify the mean and the variance:

>>> abs(mu - np.mean(s)) < 0.01
True
>>> abs(sigma - np.std(s, ddof=1)) < 0.01
True

Display the histogram of the samples, along with the probability density function:

>>> import matplotlib.pyplot as plt
>>> # density=True replaces the normed=True argument of older Matplotlib releases
>>> count, bins, ignored = plt.hist(s, 30, density=True)
>>> plt.plot(bins, 1/(sigma * np.sqrt(2 * np.pi)) *
...          np.exp(-(bins - mu)**2 / (2 * sigma**2)),
...          linewidth=2, color='r')
>>> plt.show()


2.numpy.random.randn


import numpy as np
np.random.randn(2, 3)


array([[ 0.59941534,  1.0991949 ,  1.36316028],
       [-0.01979197,  1.30783162,  0.69808199]])

This means the values are drawn at random from the standard normal distribution (mean 0, standard deviation 1).
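
Because randn always samples the standard normal distribution, samples with another mean and standard deviation can be obtained by scaling and shifting. A short sketch follows; the values of mu and sigma, the seed and the sample count are arbitrary choices for illustration.

```python
import numpy as np

mu, sigma = 2.0, 0.5   # target mean and standard deviation (illustrative values)

np.random.seed(0)
z = np.random.randn(100000)   # standard normal samples: mean 0, standard deviation 1
scaled = mu + sigma * z       # shifted and scaled: mean mu, standard deviation sigma

# Drawing directly with numpy.random.normal gives the same distribution.
direct = np.random.normal(mu, sigma, 100000)

print(scaled.mean(), scaled.std())
print(direct.mean(), direct.std())
```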


3.scipy.optimize.leastsq

Least squares


import numpy as np
from scipy.optimize import leastsq


# Function to fit: x is the variable, p holds the parameters
def fun(x, p):
    a, b = p
    return a*x + b


# Error between the fitted values and the real data: p holds the parameters to be fitted, x and y are the corresponding real data
def residuals(p, x, y):
    return fun(x, p) - y


# A set of real data, generated with a=2, b=1
x1 = np.array([1, 2, 3, 4, 5, 6], dtype=float)
y1 = np.array([3, 5, 7, 9, 11, 13], dtype=float)


# Call the fitting routine: the first argument is the residual function, the second is the initial guess for the parameters, and args holds the extra arguments passed to the residual function
r = leastsq(residuals, [1, 1], args=(x1, y1))


# Print the result: r[0] holds the fitted parameters; with the default options r[1] is an integer status flag
print(r[0])


After running, the fitting result is


[2. 1.]
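
One way to sanity-check this result is to compare it with numpy.polyfit, which solves the same straight-line fit directly. This cross-check is not in the original article; it is a sketch that repeats the data so that it runs on its own.

```python
import numpy as np
from scipy.optimize import leastsq

x1 = np.array([1, 2, 3, 4, 5, 6], dtype=float)
y1 = np.array([3, 5, 7, 9, 11, 13], dtype=float)

def residuals(p, x, y):
    a, b = p
    return a * x + b - y

# With the default options, leastsq returns the solution and an integer status flag.
p_leastsq, ier = leastsq(residuals, [1, 1], args=(x1, y1))

# A degree-1 polyfit returns [slope, intercept], the same order as (a, b).
p_polyfit = np.polyfit(x1, y1, 1)

print(p_leastsq, p_polyfit, ier)
```

Both approaches should recover a close to 2 and b close to 1 for this noise-free data.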


In practice, however, the functions I need to fit are not that simple. One difficulty is that the function to fit is piecewise: the value of the argument decides which expression applies. For example, take the piecewise function y = ax + b when x > 3 and y = ax - b when x <= 3. Written in Python code:


def fun(x, p):
    a, b = p
    if x > 3:
        return a*x + b
    else:
        return a*x - b


If we were to fit with the original difference function, we would get an error like this:


ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
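
The error can be reproduced without leastsq at all. In this sketch the parameter values and the test inputs are arbitrary illustrative choices:

```python
import numpy as np

def fun(x, p):
    a, b = p
    if x > 3:            # fine for a scalar, ambiguous for an array
        return a * x + b
    else:
        return a * x - b

# A scalar argument works as expected.
scalar_result = fun(5.0, (2.0, 1.0))
print(scalar_result)

# An array argument makes "x > 3" an array of booleans, which "if" cannot interpret.
try:
    fun(np.array([1.0, 5.0]), (2.0, 1.0))
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)
```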


The reason is simple: fun as written can only handle a single value, so passing in an array raises an error. What to do? I was stuck on this for a while and asked for help on the SciPy mailing list, where the experts were very helpful and quickly pointed out the problem. It turned out that I had misunderstood the residual function: the function passed to leastsq must return an array of residuals, so we can modify it as follows:


def residuals(p, x, y):
    # Evaluate fun one element at a time, because fun only accepts scalars
    temp = np.zeros(len(x), dtype=float)
    for i in range(len(x)):
        temp[i] = fun(x[i], p)
    return temp - y
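
An alternative to looping over the elements is to make the piecewise function itself accept arrays with numpy.where. This is not the approach used in the original article; the names fun_vec and residuals_vec and the synthetic data (generated from a=2, b=1) are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import leastsq

def fun_vec(x, p):
    """Piecewise function: a*x + b where x > 3, a*x - b elsewhere."""
    a, b = p
    x = np.asarray(x, dtype=float)
    # np.where picks, element by element, between the two branches
    return np.where(x > 3, a * x + b, a * x - b)

def residuals_vec(p, x, y):
    # Returns an array of residuals, as leastsq requires
    return fun_vec(x, p) - y

# Synthetic noise-free data generated from a=2, b=1 (illustrative assumption)
x = np.array([1, 2, 3, 4, 5, 6], dtype=float)
y = fun_vec(x, (2.0, 1.0))

p_fit, ier = leastsq(residuals_vec, [1.0, 1.0], args=(x, y))
print(p_fit)
```

Since the data were generated without noise, the fitted parameters should come out close to a=2 and b=1.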


4.Polynomial fitting with leastsq

import numpy as np  # conventional alias
from scipy.optimize import leastsq  # the least-squares routine used here
import pylab as pl

m = 9  # number of polynomial coefficients (the polynomial has degree m - 1)

def real_func(x):
    return np.sin(2*np.pi*x)  # sin(2 pi x)

def fake_func(p, x):
    f = np.poly1d(p)  # polynomial built from the coefficients p
    return f(x)

# Residual function
def residuals(p, y, x):
    return y - fake_func(p, x)

# 9 evenly spaced points in [0, 1], used as x
x = np.linspace(0, 1, 9)
# Many points, needed to draw "continuous" curves when plotting
x_show = np.linspace(0, 1, 1000)

y0 = real_func(x)
# y after adding normally distributed noise
y1 = [np.random.normal(0, 0.1) + y for y in y0]

# Start from a random set of polynomial coefficients
p0 = np.random.randn(m)


plsq = leastsq(residuals, p0, args=(y1, x))

print('Fitting parameters:', plsq[0])  # print the fitted parameters

pl.plot(x_show, real_func(x_show), label='real')
pl.plot(x_show, fake_func(plsq[0], x_show), label='fitted curve')
pl.plot(x, y1, 'bo', label='with noise')
pl.legend()
pl.show()
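
Because a polynomial is linear in its coefficients, the same kind of fit can also be obtained directly with numpy.polyfit, which makes a convenient cross-check on leastsq. The sketch below is not from the original article: it uses a fixed seed and only 4 coefficients (a cubic) instead of the 9 used above, both chosen purely for illustration.

```python
import numpy as np
from scipy.optimize import leastsq

np.random.seed(0)  # fixed seed so the noisy data are repeatable

x = np.linspace(0, 1, 9)
y = np.sin(2 * np.pi * x) + np.random.normal(0, 0.1, size=x.shape)

def residuals(p, y, x):
    # np.poly1d expects coefficients ordered from the highest power down
    return y - np.poly1d(p)(x)

n_coeffs = 4  # cubic polynomial; an illustrative choice, smaller than m = 9 above

# Iterative fit starting from random coefficients
p_leastsq, ier = leastsq(residuals, np.random.randn(n_coeffs), args=(y, x))

# Direct linear least-squares fit of the same degree
p_polyfit = np.polyfit(x, y, n_coeffs - 1)

print(p_leastsq)
print(p_polyfit)
```

The two coefficient vectors should agree closely, since both minimise the same sum of squared residuals.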



