SPSS statistical analysis: normality tests

Source: Internet
Author: User

The importance of data distribution patterns

In data analysis, the distribution of the data directly affects the choice of analysis strategy, so it is important to judge how a data series is distributed. Common distributions include the normal distribution, the uniform distribution, the Poisson distribution, and the exponential distribution. Of these, the normal distribution matters most in practice, because many data analysis techniques assume interval-scale (or higher-level) variables that are normally distributed.


Below we introduce three methods commonly used in SPSS to check normality.


Judging the data distribution in SPSS

Histogram with normal curve

In the SPSS menu, choose "Analyze" - "Descriptive Statistics" - "Frequencies", and in the Charts options select a histogram with a normal curve.


A histogram with a normal curve lets you judge whether a data series is close to a normal distribution by comparing how well the histogram fits the curve. The following two pictures are histograms with normal curves for one class's Chinese and mathematics scores. Each graph shows the normal curve closest to the data series, that is, the one with the same mean and standard deviation. The Chinese scores follow the normal curve closely, while the distribution of the mathematics scores is far from it.
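The comparison that the histogram makes visually can also be sketched numerically: count the observations in each bin and compare them with the counts a fitted normal distribution would predict. The sketch below uses only the Python standard library; the function name `histogram_vs_normal`, the bin count, and the score list are hypothetical, invented for illustration, and are not the class data from the article.

```python
from statistics import NormalDist, mean, stdev

def histogram_vs_normal(data, n_bins=8):
    """Return (bin_edges, observed_counts, expected_counts) for a fitted normal."""
    # Normal curve with the same mean and standard deviation as the sample
    fitted = NormalDist(mean(data), stdev(data))
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    observed = [0] * n_bins
    for x in data:
        # The sample maximum is placed in the last bin
        idx = min(int((x - lo) / width), n_bins - 1)
        observed[idx] += 1
    n = len(data)
    # Expected count in a bin = n * probability mass of the fitted normal in that bin
    expected = [n * (fitted.cdf(edges[i + 1]) - fitted.cdf(edges[i]))
                for i in range(n_bins)]
    return edges, observed, expected

# Hypothetical scores, invented for illustration only
scores = [62, 65, 66, 68, 68, 69, 70, 70, 71, 71, 72, 73, 74, 75, 78]
edges, observed, expected = histogram_vs_normal(scores, n_bins=4)
for i, (obs, exp) in enumerate(zip(observed, expected)):
    print(f"{edges[i]:6.1f} to {edges[i + 1]:6.1f}: observed {obs:2d}, expected {exp:5.2f}")
```

Bins where the observed and expected counts differ widely correspond to the places where the bars sit far above or below the curve.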



Q-Q and P-P plots

In the SPSS menu, choose "Analyze" - "Descriptive Statistics" - "P-P Plots" or "Q-Q Plots".


Q-Q plots and P-P plots are also used to judge whether a data series is close to a normal distribution. The two plots are read in the same way and differ only in the quantities on the axes: a P-P plot uses cumulative proportions, while a Q-Q plot uses quantiles. The P-P plot is used as the example here. In the two pictures on the left, the points of the Chinese P-P plot lie close to the diagonal, so that data series conforms to the normal distribution. The points for mathematics deviate from the diagonal, so that series does not.
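To make the difference between the two plots concrete, the following sketch computes the points each one would display for a normal distribution fitted to the sample. It is a minimal illustration using the Python standard library, not SPSS's own procedure: the plotting position (i - 0.5)/n is an assumption (SPSS offers several proportion estimation formulas), and the function name and scores are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def pp_qq_points(data):
    """Return (pp, qq) point lists for a normal distribution fitted to the sample."""
    xs = sorted(data)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    pp, qq = [], []
    for i, x in enumerate(xs, start=1):
        p = (i - 0.5) / n  # assumed plotting position for the i-th ordered value
        # P-P: observed cumulative proportion vs. expected cumulative proportion
        pp.append((p, fitted.cdf(x)))
        # Q-Q: observed value vs. expected normal quantile
        qq.append((x, fitted.inv_cdf(p)))
    return pp, qq

# Hypothetical scores, invented for illustration only
scores = [62, 65, 66, 68, 68, 69, 70, 70, 71, 71, 72, 73, 74, 75, 78]
pp, qq = pp_qq_points(scores)
for (obs_p, exp_p), (obs_x, exp_x) in zip(pp, qq):
    print(f"P-P: ({obs_p:.3f}, {exp_p:.3f})   Q-Q: ({obs_x:5.1f}, {exp_x:6.2f})")
```

In both lists, points whose two coordinates are nearly equal fall on the diagonal, which is what a normally distributed series looks like on either plot.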


The two graphs on the right are detrended normal probability plots, with the cumulative probability on the horizontal axis and the deviation from the normal distribution on the vertical axis. A perfectly normal series would therefore lie on the horizontal line in the middle. Both graphs have many points scattered on either side of that line, but the vertical axis for Chinese spans only about -0.06 to 0.06, while for mathematics it spans about -0.3 to 0.3. Relative to a total cumulative probability of 1, the deviations of the Chinese scores are very small, so they can be considered basically in line with the normal distribution.
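The detrended plot can be sketched as the P-P points with the diagonal subtracted out. The code below is an illustration under stated assumptions: the deviation is taken as observed minus expected cumulative proportion (the sign convention is an assumption), the plotting position (i - 0.5)/n is likewise assumed, and the function name and scores are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def detrended_pp(data):
    """Return (observed cumulative proportion, deviation from normal) pairs."""
    xs = sorted(data)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    points = []
    for i, x in enumerate(xs, start=1):
        observed_p = (i - 0.5) / n  # assumed plotting position
        # Assumed convention: deviation = observed - expected cumulative proportion
        points.append((observed_p, observed_p - fitted.cdf(x)))
    return points

# Hypothetical scores, invented for illustration only
scores = [62, 65, 66, 68, 68, 69, 70, 70, 71, 71, 72, 73, 74, 75, 78]
points = detrended_pp(scores)
max_dev = max(abs(d) for _, d in points)
print(f"largest absolute deviation: {max_dev:.3f}")
```

The largest absolute deviation plays the role of the vertical axis range discussed above: the smaller it is relative to 1, the closer the series is to normal.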



K-S normality test

In the SPSS menu, choose "Analyze" - "Nonparametric Tests" - "1-Sample K-S" (under "Legacy Dialogs" in recent versions).


The one-sample K-S (Kolmogorov-Smirnov) test in SPSS is used to determine whether a data series is close to a normal distribution. Judging normality from a histogram, Q-Q plot, or P-P plot relies mainly on the analyst's subjective judgment. The K-S test instead measures the largest difference between the cumulative distribution of the data and that of a normal distribution, and tests whether that difference is statistically significant. The following table shows the K-S normality test results from SPSS. In the last row, the p-value for Chinese is 0.200, which is greater than 0.05, so there is no significant difference between the Chinese scores and a normal distribution. The p-value for mathematics is reported as .000 (that is, below 0.001), which is less than 0.05, so the mathematics scores differ significantly from a normal distribution.

One-Sample Kolmogorov-Smirnov Test

                                             Chinese       Mathematics
N                                            40            40
Normal parameters (a,b)    Mean              69.9825       78.0874
                           Std. deviation    5.15620       12.84712
Most extreme differences   Absolute          .103          .280
                           Positive          .069          .219
                           Negative          -.103         -.280
Test statistic                               .103          .280
Asymp. Sig. (2-tailed)                       .200 (c,d)    .000 (c)

a. Test distribution is Normal.

b. Calculated from data.

c. Lilliefors Significance Correction.

d. This is a lower bound of the true significance.
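The "most extreme differences" rows of such a table can be sketched as follows: the positive and negative values are the largest gaps between the empirical cumulative distribution and the fitted normal one in each direction, and the test statistic is the larger of the two in absolute value. This is a minimal standard-library illustration, not SPSS's implementation. It does not compute the p-value, because the Lilliefors correction needs special tables or simulation. The function name and scores are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def ks_normal_differences(data):
    """Return (absolute, positive, negative) extreme differences vs. a fitted normal."""
    xs = sorted(data)
    n = len(xs)
    # Mean and standard deviation are estimated from the data itself
    fitted = NormalDist(mean(xs), stdev(xs))
    # Largest amount by which the empirical CDF rises above the normal CDF
    d_plus = max(i / n - fitted.cdf(x) for i, x in enumerate(xs, start=1))
    # Largest amount by which the empirical CDF falls below the normal CDF
    d_minus = max(fitted.cdf(x) - (i - 1) / n for i, x in enumerate(xs, start=1))
    return max(d_plus, d_minus), d_plus, -d_minus

# Hypothetical scores, invented for illustration only
scores = [62, 65, 66, 68, 68, 69, 70, 70, 71, 71, 72, 73, 74, 75, 78]
d_abs, d_pos, d_neg = ks_normal_differences(scores)
print(f"Absolute {d_abs:.3f}, Positive {d_pos:.3f}, Negative {d_neg:.3f}")
```

Because the normal parameters are estimated from the same data, the ordinary K-S critical values would be too lenient, which is why the table notes a Lilliefors correction.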
