Mathematics in machine learning. If you find this useful, you are welcome to discuss it and learn together. Follow me.
This is an original article; if you reprint it, please keep the source.
This blog post contains my study notes for teacher Shambo's Machine Learning Math Course at July Online.
Moment
- For a random variable X, the k-th raw moment (moment about the origin) is \[E(X^{k})\]
- The k-th central moment of X is \[E([X-E(X)]^{k})\]
- The expectation is the first raw moment of the random variable X, and the variance is the second central moment of X.
- Coefficient of variation: the ratio of the standard deviation to the mean (expectation), written C.V.
- Skewness (third order)
- Kurtosis (fourth order)
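The definitions above can be sketched in a few lines of NumPy. This is a minimal illustration, not the course's code: the helper names `raw_moment` and `central_moment` and the small `sample` array are made up for the example, and skewness and kurtosis are taken here as the third and fourth central moments divided by the matching power of the standard deviation.

```python
import numpy as np

def raw_moment(x, k):
    """k-th raw moment (about the origin): sample estimate of E(X^k)."""
    x = np.asarray(x, dtype=float)
    return np.mean(x ** k)

def central_moment(x, k):
    """k-th central moment: sample estimate of E([X - E(X)]^k)."""
    x = np.asarray(x, dtype=float)
    return np.mean((x - x.mean()) ** k)

# Hypothetical sample, chosen only so the numbers are easy to check by hand.
sample = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

mean = raw_moment(sample, 1)            # expectation = 1st raw moment
variance = central_moment(sample, 2)    # variance = 2nd central moment
std = np.sqrt(variance)
cv = std / mean                         # coefficient of variation
skewness = central_moment(sample, 3) / std ** 3   # third order
kurtosis = central_moment(sample, 4) / std ** 4   # fourth order

print(mean, variance, cv, skewness, kurtosis)
```

Dividing by powers of the standard deviation makes skewness and kurtosis unitless, so rescaling the data does not change them.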
Skewness and Kurtosis
Simulating skewness and kurtosis with Matplotlib
Calculation of expectation and variance
    import matplotlib.pyplot as plt
    import math
    import numpy as np


    def calc(data):
        n = len(data)  # 10,000 numbers
        niu = 0.0   # niu is the mean, i.e. the expectation
        niu2 = 0.0  # niu2 is the mean of the squares
        niu3 = 0.0  # niu3 is the mean of the cubes
        for a in data:
            niu += a
            niu2 += a ** 2
            niu3 += a ** 3
        niu /= n
        niu2 /= n
        niu3 /= n
        sigma = math.sqrt(niu2 - niu * niu)  # standard deviation
        return [niu, sigma, niu3]
- niu is the sample mean, which is used as the expectation: \[niu=\bar{x}=\frac{\sum_{i=1}^{n}x_{i}}{n}\]
- \[niu2=\frac{\sum_{i=1}^{n}x_{i}^{2}}{n}\]
- \[niu3=\frac{\sum_{i=1}^{n}x_{i}^{3}}{n}\]
- sigma is the standard deviation, \[\sigma=\sqrt{E(X^{2})-E(X)^{2}}\] which is written in Python as sigma = math.sqrt(niu2 - niu*niu)
- The return value is [expectation, standard deviation, \(E(X^{3})\)]
- PS: The expectation E(X) is defined as \[E(X)=\sum_{i=1}^{n}p(i)x(i) \qquad (1)\] where p(i) is the probability that event i occurs and x(i) is the value (weight) assigned to that event.
- Here we directly use \[E(X)=\bar{x} \qquad (2)\] to represent the expectation.
- In formula (2), the \(x_{i}\) are pseudo-random numbers generated with numpy, and their mean is used to represent the expectation.
- In this case the weight given to each event in formula (1) defaults to 1, i.e. the formula is \[E(X)=\overline{(x_{i}\cdot 1)}\] Equivalently, every sample has the same probability p(i) = 1/n.
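As a quick check of the notes above, the following sketch compares formula (1) with equal probabilities p(i) = 1/n against the plain sample mean of formula (2), and compares the shortcut for the standard deviation against NumPy's own `np.std`. The seed and the variable names are assumptions made for the example only.

```python
import numpy as np

rng = np.random.default_rng(0)       # fixed seed, only for repeatability
x = rng.standard_normal(10_000)      # pseudo-random samples x_i
n = len(x)

# Formula (1) with every sample equally likely: p(i) = 1/n.
p = np.full(n, 1.0 / n)
e_weighted = np.sum(p * x)

# Formula (2): the sample mean.
e_mean = np.mean(x)

# Standard deviation via sqrt(E(X^2) - E(X)^2), as in calc().
e_x2 = np.mean(x ** 2)
sigma = np.sqrt(e_x2 - e_mean ** 2)

print(np.isclose(e_weighted, e_mean))   # the two expectations agree
print(np.isclose(sigma, np.std(x)))     # matches NumPy's population std (ddof=0)
```

Note that `np.std` divides by n by default, which is the same convention as the formula in the notes; the unbiased estimator (dividing by n - 1) would give a slightly different number.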
Calculation of skewness and kurtosis
    def calc_stat(data):
        [niu, sigma, niu3] = calc(data)
        n = len(data)
        niu4 = 0.0  # niu4 is the numerator of the kurtosis formula
        for a in data:
            a -= niu
            niu4 += a ** 4
        niu4 /= n
        # Skewness: third central moment divided by sigma cubed
        skew = (niu3 - 3 * niu * sigma ** 2 - niu ** 3) / (sigma ** 3)
        # Kurtosis: the denominator, the variance squared, is the fourth power of the standard deviation
        kurt = niu4 / (sigma ** 4)
        return [niu, sigma, skew, kurt]
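The numerator of the skewness expression in calc_stat is the third central moment written in terms of raw moments. The short derivation below spells out that step, writing \(\mu = E(X)\) (niu in the code) and \(\sigma^{2} = E(X^{2}) - \mu^{2}\); the symbols \(\mu\) and the names Skew and Kurt are notation chosen for this sketch.

```latex
\[
\begin{aligned}
E\big([X-\mu]^{3}\big)
  &= E(X^{3}) - 3\mu E(X^{2}) + 3\mu^{2}E(X) - \mu^{3} \\
  &= E(X^{3}) - 3\mu(\sigma^{2}+\mu^{2}) + 3\mu^{3} - \mu^{3} \\
  &= E(X^{3}) - 3\mu\sigma^{2} - \mu^{3}
\end{aligned}
\]
\[
\mathrm{Skew}(X) = \frac{E(X^{3}) - 3\mu\sigma^{2} - \mu^{3}}{\sigma^{3}},
\qquad
\mathrm{Kurt}(X) = \frac{E\big([X-\mu]^{4}\big)}{\sigma^{4}}
\]
```

With this definition the kurtosis of a Gaussian distribution is 3; some libraries report the excess kurtosis instead, which subtracts 3.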
Plotting the simulation with matplotlib
    if __name__ == "__main__":
        data = list(np.random.randn(10000))           # 10,000 numbers from a Gaussian distribution
        data2 = list(2 * np.random.randn(10000))      # Gaussian samples multiplied by 2, so the variance becomes 4 times as large
        data3 = [x for x in data if x > -0.5]         # keep only the values of data greater than -0.5
        data4 = list(np.random.uniform(0, 4, 10000))  # uniform distribution on 0~4
        [niu, sigma, skew, kurt] = calc_stat(data)
        [niu_2, sigma2, skew2, kurt2] = calc_stat(data2)
        [niu_3, sigma3, skew3, kurt3] = calc_stat(data3)
        [niu_4, sigma4, skew4, kurt4] = calc_stat(data4)
        print(niu, sigma, skew, kurt)
        print(niu_2, sigma2, skew2, kurt2)
        print(niu_3, sigma3, skew3, kurt3)
        print(niu_4, sigma4, skew4, kurt4)
        # Callout text for the figure
        info = r'$\mu=%.2f,\ \sigma=%.2f,\ skew=%.2f,\ kurt=%.2f$' % (niu, sigma, skew, kurt)
        info2 = r'$\mu=%.2f,\ \sigma=%.2f,\ skew=%.2f,\ kurt=%.2f$' % (niu_2, sigma2, skew2, kurt2)
        info3 = r'$\mu=%.2f,\ \sigma=%.2f,\ skew=%.2f,\ kurt=%.2f$' % (niu_3, sigma3, skew3, kurt3)
        plt.text(1, 0.38, info, bbox=dict(facecolor='red', alpha=0.25))
        plt.text(1, 0.35, info2, bbox=dict(facecolor='green', alpha=0.25))
        plt.text(1, 0.32, info3, bbox=dict(facecolor='blue', alpha=0.25))
        # The bin counts were lost from the source, so matplotlib's default is used here.
        # density=True replaces the older normed=True, which newer matplotlib releases no longer accept.
        plt.hist(data, density=True, facecolor='r', alpha=0.9)
        plt.hist(data2, density=True, facecolor='g', alpha=0.8)
        # Note: the blue callout above reports data3, while this blue histogram shows data4.
        plt.hist(data4, density=True, facecolor='b', alpha=0.7)
        plt.grid(True)
        plt.show()
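To know what to expect from the printed numbers, the sketch below recomputes skewness and kurtosis for the same four kinds of sample without any plotting. It is an illustration under assumptions: `skew_kurt` is a hypothetical vectorized stand-in for calc_stat, and the seed and the larger sample size are chosen only to make the estimates stable.

```python
import numpy as np

def skew_kurt(x):
    """Return (skewness, kurtosis) as standardized 3rd and 4th central moments."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    sigma = np.sqrt(np.mean(d ** 2))
    return np.mean(d ** 3) / sigma ** 3, np.mean(d ** 4) / sigma ** 4

rng = np.random.default_rng(42)   # assumed seed, for repeatability only
n = 200_000                       # larger than the 10,000 used in the notes

gauss = rng.standard_normal(n)    # Gaussian sample
scaled = 2 * gauss                # same shape, 4 times the variance
truncated = gauss[gauss > -0.5]   # left tail cut off
uniform = rng.uniform(0, 4, n)    # uniform on 0~4

for name, sample in [("gauss", gauss), ("scaled", scaled),
                     ("truncated", truncated), ("uniform", uniform)]:
    s, k = skew_kurt(sample)
    print(f"{name:>10}: skew={s:+.3f}  kurt={k:.3f}")
```

In theory a Gaussian has skewness 0 and kurtosis 3, multiplying by 2 changes neither value, cutting off the left tail makes the skewness positive, and a uniform distribution has skewness 0 and kurtosis 1.8. The sampled values should land close to these.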
- The figure shows the statistical distribution of the random numbers generated by NumPy's random number functions, drawn as histograms with matplotlib.pyplot.hist. The histograms are normalized, so the bar heights are densities and the total area of each histogram is 1.
- That is, the horizontal axis is the value, and the vertical axis is the normalized share of the 10,000 random numbers that fall at that value. If normalization is not used (normed=False, or density=False in newer matplotlib releases), the vertical axis is the number of occurrences.
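- About the matplotlib.pyplot.hist function
n, bins, patches = plt.hist(arr, bins=10, normed=0, facecolor='black', edgecolor='black', alpha=1, histtype='bar')
hist takes many parameters, but the ones below are the commonly used ones. Only the first is required; the rest are optional.
- arr: the one-dimensional array for which the histogram is computed
- bins: the number of bars in the histogram; optional, default 10
- normed: whether to normalize the resulting histogram vector; default 0 (newer matplotlib releases use density instead)
- facecolor: the fill color of the bars
- edgecolor: the border color of the bars
- alpha: transparency
- histtype: the histogram type, one of 'bar', 'barstacked', 'step', 'stepfilled'
Return values:
- n: the histogram vector; whether it is normalized is set by normed
- bins: the boundaries of the bins
- patches: the list of bar objects drawn for the bins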
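The effect of normalization on the first two return values can be checked without drawing anything. The sketch below uses `np.histogram`, which computes the same kind of counts and bin edges that plt.hist returns as n and bins; the seed and variable names are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(1)        # assumed seed, for repeatability only
data = rng.standard_normal(10_000)

# Without normalization: each entry is the number of samples in that bin.
counts, edges = np.histogram(data, bins=10)

# With normalization: each entry is a density, so that the total area is 1.
density, _ = np.histogram(data, bins=10, density=True)

widths = np.diff(edges)               # width of each bin
print(counts.sum())                   # total number of samples
print(np.sum(density * widths))       # total area under the normalized histogram
print(len(edges), len(counts))        # there is one more edge than there are bins
```

The density in each bin is the count divided by the total count and by the bin width, which is why normalized bar heights can exceed 1 when the bins are narrow.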
Machine Learning Mathematics | Skewness and kurtosis and their implementation in Python