"Distribution fitting"
The empirical distribution function of the sample is overlaid on the distribution function of a theoretical distribution (such as the normal distribution) so that the two can be compared.
For example:
% Read the total scores from cells G2:G52 of Sheet1 in Examp02_14.xls
score = xlsread('Examp02_14.xls', 'Sheet1', 'G2:G52');
% Remove total scores of 0, i.e. students who missed the exam
score = score(score > 0);
figure;  % open a new figure window
% Plot the empirical distribution function and return the graphics handle h
% and the struct stats, whose 5 fields hold the minimum, maximum, mean,
% median and standard deviation of the sample
[h, stats] = cdfplot(score);
set(h, 'Color', 'k', 'LineWidth', 2);  % black line, width 2
% ****** Plot the theoretical normal distribution function ******
x = 40:0.5:100;  % new vector of x coordinates
% Evaluate at x the distribution function of the normal distribution with
% mean stats.mean and standard deviation stats.std
y = normcdf(x, stats.mean, stats.std);
hold on
% Plot the normal distribution function as a black dotted line of width 2
plot(x, y, ':k', 'LineWidth', 2);
% Add a legend and place it in the upper-left corner of the axes
legend('Empirical distribution function', 'Theoretical normal distribution', 'Location', 'NorthWest');
Results:
The plot shows that the sample approximately follows a normal distribution.
"Distribution testing"
(1) Use the kstest function to test whether a single sample follows a specified distribution (two-sided test), or whether its distribution function lies above or below a specified distribution function (one-sided test). Note that the distribution here must be fully specified and must not contain unknown parameters.
For example:
% Read the total scores from cells G2:G52 of Sheet1 in Examp02_14.xls
score = xlsread('Examp02_14.xls', 'Sheet1', 'G2:G52');
% Remove total scores of 0, i.e. students who missed the exam
score = score(score > 0);
% Build the matrix cdf that specifies the distribution: a normal distribution
% with mean 79 and standard deviation 10.1489
cdf = [score, normcdf(score, 79, 10.1489)];
% Call kstest to test whether the total score follows the distribution specified by cdf
[h, p, ksstat, cv] = kstest(score, cdf)
Results:
Since h = 0 and p = 0.5486 > 0.05, the hypothesis that the total score follows a normal distribution with mean 79 and standard deviation 10.1489 is accepted at the 0.05 significance level.
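To make the comparison behind kstest concrete, the two-sided statistic is the largest vertical gap between the empirical distribution function and the hypothesized one. The following is a minimal sketch of that calculation, not the toolbox's internal code. It assumes synthetic scores in place of Examp02_14.xls (which is not reproduced here) and assumes a release in which kstest accepts the 'CDF' name-value argument.

```matlab
% Hypothetical data: synthetic scores standing in for Examp02_14.xls
rng(0);                                  % fix the seed for reproducibility
x = sort(normrnd(79, 10.1489, 49, 1));   % sorted sample of 49 scores
n = numel(x);
% Hypothesized distribution function evaluated at the sorted sample points
F = normcdf(x, 79, 10.1489);
% The empirical distribution function jumps from (i-1)/n to i/n at x(i),
% so the largest gap must be checked on both sides of every jump
Dplus  = max((1:n)'/n - F);     % empirical function above the hypothesized one
Dminus = max(F - (0:n-1)'/n);   % empirical function below the hypothesized one
D = max(Dplus, Dminus);         % two-sided Kolmogorov-Smirnov statistic
% Compare with the statistic reported by kstest for the same distribution
[h, p, ksstat] = kstest(x, 'CDF', [x, F]);
fprintf('manual D = %.4f, kstest statistic = %.4f\n', D, ksstat);
```

The one-sided variants of the test use only Dplus or only Dminus instead of their maximum.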
(2) Use the kstest2 function to test whether two samples follow the same distribution (two-sided test), or whether the distribution function of one sample lies above or below that of the other (one-sided test). kstest2 compares the empirical distribution functions of the two samples, so the distributions involved are again fully determined.
"Example 1":
% Read the class data from cells B2:B52 of Sheet1 in Examp02_14.xls
banji = xlsread('Examp02_14.xls', 'Sheet1', 'B2:B52');
% Read the total scores from cells G2:G52 of Sheet1 in Examp02_14.xls
score = xlsread('Examp02_14.xls', 'Sheet1', 'G2:G52');
% Remove the records of students who missed the exam; build the mask first
% so that banji and score are filtered by the same rows
valid = score > 0;
banji = banji(valid);
score = score(valid);
% Extract the total scores of classes 60101 and 60102 respectively
score1 = score(banji == 60101);
score2 = score(banji == 60102);
% Call kstest2 to test whether the total scores of the two classes follow the same distribution
[h, p, ks2stat] = kstest2(score1, score2)
% Plot the empirical distribution function of score1 and return the graphics handle h1 and the struct stats1
[h1, stats1] = cdfplot(score1);
set(h1, 'Color', 'k', 'LineWidth', 2);
hold on
% Plot the empirical distribution function of score2 and return the graphics handle h2 and the struct stats2
[h2, stats2] = cdfplot(score2);
Results:
Since h = 0 and p = 0.7016 > 0.05, the hypothesis that the total scores of the two classes follow the same distribution is accepted at the 0.05 significance level.
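The one-sided variants mentioned in (2) can be sketched as follows. This is an illustrative example only: the two "classes" are synthetic samples (hypothetical data, not the contents of Examp02_14.xls), and it assumes a release in which kstest2 accepts the 'Tail' name-value argument with the values 'larger' and 'smaller'.

```matlab
% Hypothetical data: two synthetic classes, the second shifted upward by 5 points
rng(1);                                  % fix the seed for reproducibility
score1 = normrnd(75, 10, 30, 1);
score2 = normrnd(80, 10, 30, 1);
% Two-sided test: the alternative is that the two distribution functions differ
[h, p, d] = kstest2(score1, score2);
% One-sided test: the alternative is that the empirical distribution function
% of score1 lies above that of score2 (score1 tends to take smaller values)
[hL, pL, dL] = kstest2(score1, score2, 'Tail', 'larger');
% Opposite one-sided test: the distribution function of score1 lies below
[hS, pS, dS] = kstest2(score1, score2, 'Tail', 'smaller');
fprintf('two-sided: h = %d, p = %.4f, stat = %.4f\n', h, p, d);
fprintf('larger:    h = %d, p = %.4f, stat = %.4f\n', hL, pL, dL);
fprintf('smaller:   h = %d, p = %.4f, stat = %.4f\n', hS, pS, dS);
```

The two-sided statistic is the larger of the two one-sided statistics, which is why a one-sided test is the natural choice when the direction of the difference is specified in advance.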
"Example 2": Use kstest2 to redo the example from (1)
% Read the total scores from cells G2:G52 of Sheet1 in Examp02_14.xls
score = xlsread('Examp02_14.xls', 'Sheet1', 'G2:G52');
% Remove the records of students who missed the exam
score = score(score > 0);
% Set the initial seed of the random number generator to 0
% (legacy syntax; newer releases recommend rng)
randn('seed', 0)
% Generate 10000 random numbers from the normal distribution with mean 79 and
% standard deviation 10.1489 (the sample mean and standard deviation) as a column vector x
x = normrnd(mean(score), std(score), 10000, 1);
% Call kstest2 to test whether the total scores and the random vector x follow the same distribution
[h, p] = kstest2(score, x, 0.05)
Results:
Since h = 0 and p = 0.5138 > 0.05, the hypothesis that the total score follows a normal distribution with mean 79 and standard deviation 10.1489 is accepted at the 0.05 significance level.
(3) Use the lillietest function to test whether the sample follows a specified family of distributions (the normal distribution by default). Note that here the parameters of the distribution are estimated from the sample.
"Example 1":
% Read the total scores from cells G2:G52 of Sheet1 in Examp02_14.xls
score = xlsread('Examp02_14.xls', 'Sheet1', 'G2:G52');
% Remove the records of students who missed the exam
score = score(score > 0);
% Call lillietest to run the Lilliefors test of whether the total score follows a normal distribution
[h, p, kstat, critval] = lillietest(score)
Results:
Since h = 0 and p = 0.1346 > 0.05, the hypothesis that the total score follows a normal distribution is accepted, where the mean and variance of the distribution are replaced by the sample mean and sample variance.
"Example 2":
% Read the total scores from cells G2:G52 of Sheet1 in Examp02_14.xls
score = xlsread('Examp02_14.xls', 'Sheet1', 'G2:G52');
% Remove the records of students who missed the exam
score = score(score > 0);
% Call lillietest to run the Lilliefors test of whether the total score follows an exponential distribution
[h, p] = lillietest(score, 0.05, 'exp')
Results:
Since h = 1 and p < 0.05, the hypothesis that the total score follows an exponential distribution is rejected.
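The difference between (1) and (3), a fully specified distribution versus parameters estimated from the sample, can be seen by plugging the sample estimates into kstest and comparing with lillietest. The sketch below is illustrative only: it uses synthetic scores (hypothetical data) and assumes the 'CDF' name-value argument of kstest is available. Both calls should report the same statistic, while the p-values come from different reference distributions, because lillietest accounts for the fact that the parameters were estimated. lillietest may also warn when its p-value falls outside its tabulated range.

```matlab
% Hypothetical data: synthetic scores standing in for Examp02_14.xls
rng(2);                                  % fix the seed for reproducibility
x = sort(normrnd(79, 10, 49, 1));
% Lilliefors test: the normal parameters are estimated from x internally
[hL, pL, kstat] = lillietest(x);
% kstest with the same estimates plugged in as if they were known in advance
F = normcdf(x, mean(x), std(x));
[hK, pK, ksstat] = kstest(x, 'CDF', [x, F]);
fprintf('lillietest: stat = %.4f, p = %.4f\n', kstat, pL);
fprintf('kstest:     stat = %.4f, p = %.4f\n', ksstat, pK);
```

Using kstest this way treats the estimated parameters as known, which is exactly what (1) says it is not designed for; lillietest is the appropriate call when the parameters come from the sample.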