1. Kolmogorov-Smirnov normality test
The Kolmogorov-Smirnov (K-S) test compares the empirical distribution function F(x) of a sample with a theoretical distribution G(x), or compares the empirical distributions of two samples. If the gap between the two is very small, we infer that the sample comes from the specified distribution, or that the two samples come from the same distribution.
Function: ks.test() in the stats package, which is installed by default.
Note: ks.test() has four main parameters. The first, x, is the vector of observations. The second, y, is either a second vector of observations or a cumulative distribution function, given as a function or as its name in a character string, such as pnorm (the normal distribution function, which is what you pass directly for a normality test); only continuous CDFs are valid. The third, alternative, indicates whether the test is two-sided or one-sided ("two.sided", "less" or "greater"). The fourth, exact, is NULL or a logical value indicating whether an exact p-value should be computed. Parameters of the theoretical distribution, such as mean and sd for pnorm, can be supplied as additional arguments.
Results explained: the output reports the statistic D and the p-value.
The smaller the D value (the closer it is to 0), the closer the sample data is to the normal distribution.
P value: if the p-value is less than the significance level α (usually 0.05), H0 (the sample comes from the specified distribution) is rejected.
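The call described above can be sketched as follows. The data are simulated purely for illustration (an assumption, not part of the original text), and the mean and sd passed to pnorm are the values used in the simulation. When those parameters are instead estimated from the same sample, the plain K-S p-value is no longer accurate; the Lilliefors test in section 3 is designed for that case.

```r
# Simulated data (an assumption for illustration only).
set.seed(1)
x <- rnorm(100, mean = 5, sd = 2)

# One-sample, two-sided K-S test against a fully specified normal CDF.
# The extra arguments mean and sd are passed on to pnorm.
res <- ks.test(x, "pnorm", mean = 5, sd = 2)

res$statistic  # D: largest gap between the empirical and theoretical CDF
res$p.value    # compare with the chosen significance level, e.g. 0.05

# Two-sample form: do x and y come from the same distribution?
y <- runif(100, min = 0, max = 10)
res2 <- ks.test(x, y)
res2$p.value
```

Both calls return an object of class "htest", so the statistic and p-value can be extracted by name instead of being read off the printed output.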
Note: In a one-sample K-S test or normality test, R sometimes reports a message along the lines of "ties should not be present for the Kolmogorov-Smirnov test" (the exact wording depends on the R version). This is because the K-S test is only valid for continuous CDFs, and under a continuous CDF the probability of observing the same value twice is 0, so R complains when the data contain repeated values. This is also a reminder that, before running a normality test, we should first describe the data and get a general picture of it as a whole, so that we can choose the correct test method.
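A minimal sketch of how ties can be spotted before testing, and what ks.test() does when they are present. The rounded data are an illustrative assumption; in the R versions I am aware of the message is raised as a warning, so the sketch captures it with tryCatch() rather than letting it print.

```r
# Rounded data (an illustrative assumption) contains repeated values.
set.seed(2)
x_tied <- round(rnorm(50, mean = 10, sd = 3))

# Describe the data first: how many observations repeat an earlier value?
n_ties <- sum(duplicated(x_tied))
n_ties

# Capture the warning text, if any, instead of letting it print.
msg <- tryCatch(
  {
    ks.test(x_tied, "pnorm", mean = 10, sd = 3)
    NA_character_              # no warning was raised
  },
  warning = function(w) conditionMessage(w)
)
msg
```

If n_ties is greater than 0, the data are not behaving like draws from a continuous distribution, and another test from this list may be a better choice.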
2. Shapiro-Wilk test
The Shapiro-Wilk test is a very common normality test for small samples.
Function: shapiro.test() in the stats package, which is installed by default.
Note: shapiro.test() has only one parameter, x, a numeric vector. Missing values are allowed, but the number of non-missing values must be between 3 and 5000; this limit is imposed by R.
Results explained: the output reports the statistic W and the p-value.
W lies between 0 and 1; the closer the W value is to 1, the closer the sample data is to the normal distribution.
P value: if the p-value is less than the significance level α (usually 0.05), H0 (the sample comes from a normal distribution) is rejected.
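A short sketch of shapiro.test() on simulated data (the samples are assumptions for illustration). One sample is drawn from a normal distribution and includes a missing value; the other is drawn from an exponential distribution for contrast.

```r
set.seed(3)
x_norm <- c(rnorm(30), NA)  # missing values are allowed and are dropped
x_skew <- rexp(30)          # clearly non-normal, for contrast

sw_norm <- shapiro.test(x_norm)
sw_skew <- shapiro.test(x_skew)

sw_norm$statistic  # W for the normal sample
sw_norm$p.value
sw_skew$statistic  # W for the skewed sample
sw_skew$p.value
```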
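3. Lilliefors test
The Lilliefors test is a modification of the Kolmogorov-Smirnov test for the case where the mean and variance of the normal distribution are estimated from the sample; it is used as a normality test.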
Function: lillie.test() in the nortest package.
Note: lillie.test() has only one parameter, x, a numeric vector. Missing values are allowed, but the number of non-missing values must be greater than 4; this limit is imposed by R.
Results explained: the output reports the statistic D and the p-value.
The smaller the D value (the closer it is to 0), the closer the sample data is to the normal distribution.
P value: if the p-value is less than the significance level α (usually 0.05), H0 (the sample comes from a normal distribution) is rejected.
Note: The Lilliefors test in R is equivalent to the Kolmogorov-Smirnov test with Lilliefors correction reported among the tests of normality in the SPSS Explore analysis; the two give the same results.
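A hedged sketch of a Lilliefors test on simulated data (the sample is an assumption for illustration). Because nortest is not part of base R, the call is guarded with requireNamespace(). For comparison, the last lines run a plain K-S test with the mean and sd estimated from the same sample; its p-value ignores the estimation step, which is the problem the Lilliefors correction addresses.

```r
set.seed(4)
x <- rnorm(60, mean = 50, sd = 8)

lf <- NULL
if (requireNamespace("nortest", quietly = TRUE)) {
  lf <- nortest::lillie.test(x)
  print(lf$statistic)  # D
  print(lf$p.value)
} else {
  message("Package 'nortest' is not installed; try install.packages(\"nortest\").")
}

# Plain K-S with parameters estimated from the sample, for comparison only.
ks_est <- ks.test(x, "pnorm", mean = mean(x), sd = sd(x))
ks_est$statistic
ks_est$p.value
```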
4. Anderson-Darling normality test
Function: ad.test() in the nortest package.
Note: ad.test() has only one parameter, x, a numeric vector. Missing values are allowed, but the number of non-missing values must be greater than 7; this limit is imposed by R.
Results explained: the output reports the statistic A and the p-value.
The smaller the A value (the closer it is to 0), the closer the sample data is to the normal distribution.
P value: if the p-value is less than the significance level α (usually 0.05), H0 (the sample comes from a normal distribution) is rejected.
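A minimal sketch of an Anderson-Darling test, again on simulated data (an assumption for illustration) and with the nortest call guarded in case the package is not installed.

```r
set.seed(5)
x <- rnorm(40)  # more than 7 non-missing values, as ad.test() requires

ad <- NULL
if (requireNamespace("nortest", quietly = TRUE)) {
  ad <- nortest::ad.test(x)
  print(ad$statistic)  # A
  print(ad$p.value)
} else {
  message("Package 'nortest' is not installed; try install.packages(\"nortest\").")
}
```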
5. Jarque-Bera normality test
The Jarque-Bera (J-B) statistic is based on the sample skewness and kurtosis coefficients.
Functions: jarque.bera.test() in the tseries package
jb.norm.test() in the normtest package
ajb.norm.test() in the normtest package
Note: jarque.bera.test() has only one parameter, x, which can be a numeric vector or a time series. Missing values are not allowed, and R does not specify a minimum length for x. In addition to x, jb.norm.test() takes a second parameter giving the number of Monte Carlo replications, which defaults to 2000. ajb.norm.test() is an adjusted J-B test, mainly intended to address the slow convergence of the J-B statistic.
Results explained: the output reports the statistic X-squared (or JB), the degrees of freedom df, and the p-value.
The smaller the X-squared value (the closer it is to 0), the closer the sample data is to the normal distribution.
P value: if the p-value is less than the significance level α (usually 0.05), H0 (the sample comes from a normal distribution) is rejected.
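To make the link between skewness, kurtosis and the statistic concrete, the sketch below computes JB by hand in base R using the usual moment-based formula JB = n/6 * (S^2 + (K - 3)^2 / 4), and then, if tseries is installed, compares it with jarque.bera.test(). The data are simulated (an assumption for illustration), and the normtest functions are left out because that package may not be available.

```r
set.seed(6)
x <- rnorm(200)

# Hand computation from sample moments (divisor n), base R only.
n    <- length(x)
m    <- mean(x)
m2   <- mean((x - m)^2)
skew <- mean((x - m)^3) / m2^(3 / 2)  # sample skewness S
kurt <- mean((x - m)^4) / m2^2        # sample kurtosis K (3 for a normal)
jb   <- n / 6 * (skew^2 + (kurt - 3)^2 / 4)

# Asymptotic p-value from a chi-squared distribution with 2 degrees of freedom.
p_jb <- pchisq(jb, df = 2, lower.tail = FALSE)
jb
p_jb

# Package version, guarded in case tseries is not installed.
jb_pkg <- NULL
if (requireNamespace("tseries", quietly = TRUE)) {
  jb_pkg <- tseries::jarque.bera.test(x)
  print(jb_pkg)
}
```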