Fitting and testing of distributions

Source: Internet
Author: User

"Fitting of distributions"

The sample distribution function (also called the empirical distribution function) is overlaid on a theoretical distribution function, such as that of a normal distribution, for comparison.
For example:

score = xlsread('Examp02_14.xls', 'Sheet1', 'g2:g52');
% Remove total scores of 0, i.e. students who missed the exam
score = score(score > 0);
figure;  % open a new figure window
% Plot the empirical distribution function and return the line handle h and the struct stats.
% stats has 5 fields: min, max, mean, median and std
[h, stats] = cdfplot(score);
set(h, 'Color', 'k', 'LineWidth', 2);  % black line, width 2
% ************ Plot the theoretical normal distribution function ************
% The endpoints of the original grid were lost from the source; spanning the
% observed range (stats.min to stats.max) is an assumption
x = stats.min:0.5:stats.max;
% Normal distribution function values at x, with mean stats.mean and standard deviation stats.std
y = normcdf(x, stats.mean, stats.std);
hold on
% Draw the normal distribution function as a black dotted line of width 2
plot(x, y, ':k', 'LineWidth', 2);
% Add a legend in the upper-left corner of the figure
legend('Empirical distribution function', 'Theoretical normal distribution', 'Location', 'NorthWest');

Results:


The plot shows that the sample approximately follows a normal distribution.
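
The visual comparison can be backed by a number: the largest vertical gap between the empirical distribution function and the fitted normal curve, which is the quantity that Kolmogorov-Smirnov type tests build on. The following is a minimal sketch using only base MATLAB functions. The sample x is synthetic (an assumption, since the spreadsheet is not reproduced here), and the names Fn_hi, Fn_lo, F and D are illustrative.

```matlab
% Synthetic stand-in for the exam scores (assumption: not the real data)
rng(0);                          % fix the seed so the run is repeatable
x = 79 + 10 * randn(50, 1);      % 50 draws from a normal distribution

% Empirical distribution function at the sorted sample points
xs = sort(x);
n  = numel(xs);
Fn_hi = (1:n)' / n;              % step height just after each jump
Fn_lo = (0:n-1)' / n;            % step height just before each jump

% Normal distribution function with the sample mean and standard deviation,
% written with erfc so that no toolbox is required
mu    = mean(xs);
sigma = std(xs);
F = 0.5 * erfc(-(xs - mu) / (sigma * sqrt(2)));

% Largest vertical distance between the step function and the smooth curve
D = max(max(Fn_hi - F), max(F - Fn_lo));
fprintf('Maximum gap between ECDF and fitted normal CDF: %.4f\n', D);
```

A small D means the two curves nearly coincide. Deciding whether a given D is small enough requires a reference distribution, which is what a formal test supplies.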

"Test of distribution"

(1) Use the kstest function to test whether a single sample follows a specified distribution (two-sided test), or whether its distribution function lies above or below a specified distribution function (one-sided test). Note that the distribution here must be fully specified and contain no unknown parameters.
For example:

% Read the data in g2:g52 of the first worksheet of Examp02_14.xls, i.e. the total scores
score = xlsread('Examp02_14.xls', 'Sheet1', 'g2:g52');
% Remove total scores of 0, i.e. students who missed the exam
score = score(score > 0);
% Build the matrix CDF that specifies the distribution: normal with mean 79 and standard deviation 10.1489
CDF = [score, normcdf(score, 79, 10.1489)];
% Call kstest to test whether the total scores follow the distribution specified by CDF
[h, p, ksstat, cv] = kstest(score, CDF)

Results:

Since h = 0 and p = 0.5486 > 0.05, the hypothesis that the total scores follow a normal distribution with mean 79 and standard deviation 10.1489 is not rejected.
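
The example above covers the two-sided case. For the one-sided alternatives mentioned in (1), current releases of the Statistics and Machine Learning Toolbox document a 'Tail' name-value argument for kstest. The sketch below assumes that syntax is available in your release and uses a synthetic sample in place of the spreadsheet data; all variable names are illustrative.

```matlab
% Synthetic stand-in for the exam scores (assumption: not the real data)
rng(0);
x = 79 + 10.1489 * randn(50, 1);

% Fully specified hypothesized distribution: normal, mean 79, std 10.1489
cdfMatrix = [x, normcdf(x, 79, 10.1489)];

% Two-sided test (the default, 'Tail' = 'unequal')
[h2, p2] = kstest(x, 'CDF', cdfMatrix);

% One-sided alternatives: the population distribution function is
% larger / smaller than the hypothesized one
[hL, pL] = kstest(x, 'CDF', cdfMatrix, 'Tail', 'larger');
[hS, pS] = kstest(x, 'CDF', cdfMatrix, 'Tail', 'smaller');

fprintf('two-sided: h = %d, p = %.4f\n', h2, p2);
fprintf('larger:    h = %d, p = %.4f\n', hL, pL);
fprintf('smaller:   h = %d, p = %.4f\n', hS, pS);
```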

(2) Use the kstest2 function to test whether two samples follow the same distribution (two-sided test), or whether the distribution function of one sample lies above or below that of the other sample (one-sided test). kstest2 compares the empirical distribution functions of the two samples. The distributions here are also fully determined.
"Example 1":

% Read the data in B2:b52 of the first worksheet of Examp02_14.xls, i.e. the class labels
banji = xlsread('Examp02_14.xls', 'Sheet1', 'B2:b52');
% Read the data in g2:g52 of the first worksheet of Examp02_14.xls, i.e. the total scores
score = xlsread('Examp02_14.xls', 'Sheet1', 'g2:g52');
% Remove the missing-exam data; filter banji first, while score still has its original length
banji = banji(score > 0);
score = score(score > 0);
% Extract the total scores of classes 60101 and 60102
score1 = score(banji == 60101);
score2 = score(banji == 60102);
% Call kstest2 to test whether the total scores of the two classes follow the same distribution
[h, p, ks2stat] = kstest2(score1, score2)
% Plot the empirical distribution function of score1; returns the handle h1 and the struct stats1
[h1, stats1] = cdfplot(score1);
set(h1, 'Color', 'k', 'LineWidth', 2);
hold on
% Plot the empirical distribution function of score2; returns the handle h2 and the struct stats2
[h2, stats2] = cdfplot(score2);
set(h2, 'Color', 'r', 'LineWidth', 2);

Results:


Since h = 0 and p = 0.7016 > 0.05, the hypothesis that the total scores of the two classes follow the same distribution is not rejected.
"Example 2": Use kstest2 to redo the example in (1)

score = xlsread('Examp02_14.xls', 'Sheet1', 'g2:g52');
% Remove the missing-exam data
score = score(score > 0);
randn('seed', 0)  % set the initial seed of the random number generator to 0
% Generate 10,000 normal random numbers with mean 79 and standard deviation 10.1489
% (taken from the sample as mean(score) and std(score)), forming a column vector x
x = normrnd(mean(score), std(score), 10000, 1);
% Call kstest2 to test whether the total scores and the random vector x follow the same distribution
[h, p] = kstest2(score, x, 0.05)

Results:

Since h = 0 and p = 0.5138 > 0.05, the hypothesis that the total scores follow a normal distribution with mean 79 and standard deviation 10.1489 is not rejected.
(3) Use the lillietest function to test whether a sample follows a specified family of distributions (normal by default). Note that here the distribution parameters are estimated from the sample.
"Example 1":

score = xlsread('Examp02_14.xls', 'Sheet1', 'g2:g52');
% Remove the missing-exam data
score = score(score > 0);
% Call lillietest to run the Lilliefors test of whether the total scores follow a normal distribution
[h, p, kstat, critval] = lillietest(score)

Results:

Since h = 0 and p = 0.1346 > 0.05, the hypothesis that the total scores follow a normal distribution is not rejected, where the mean and variance of the distribution are replaced by the sample mean and variance.
"Example 2":

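score = xlsread('examp02_14.xls', 'Sheet1', 'G2:G52');
% Remove the missing-exam data
score = score(score > 0);
% Call lillietest to run the Lilliefors test of whether the total scores follow an exponential distribution
[h, p] = lillietest(score, 0.05, 'exp')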

Results:

Since h = 1 and p < 0.05, the hypothesis that the total scores follow an exponential distribution is rejected.
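
The difference between (1) and (3) is whether the distribution parameters are given in advance or estimated from the sample. As a rough illustration, the sketch below runs both kstest (with normal parameters plugged in from the same sample) and lillietest on one synthetic sample. The data and variable names are assumptions made for illustration, and the 'CDF' name-value syntax of kstest is assumed to be available in your release.

```matlab
% Synthetic stand-in for the exam scores (assumption: not the real data)
rng(1);
x = 79 + 10 * randn(50, 1);

% kstest treats the plugged-in parameters as if they were known in advance
[hK, pK, statK] = kstest(x, 'CDF', [x, normcdf(x, mean(x), std(x))]);

% lillietest accounts for the parameters being estimated from the sample
[hL, pL, statL] = lillietest(x);

fprintf('kstest:     stat = %.4f, p = %.4f\n', statK, pK);
fprintf('lillietest: stat = %.4f, p = %.4f\n', statL, pL);
```

Both statistics measure the same maximum gap, so they should agree up to rounding, while the p-values generally differ because the reference distributions differ. Plugging estimated parameters into kstest tends to make the test conservative, which is the usual reason to prefer lillietest when the parameters come from the sample.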

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
