The t test and analysis of variance are mainly for continuous variables, rank-sum tests are mainly for ordered categorical variables, and the chi-square test is mainly for unordered categorical variables (it can also be used for continuous variables, but they must be discretized first). The chi-square test is very widely used, and many other statistical methods have been derived from the chi-square statistic.
The chi-square test is based on the chi-square distribution. It is a nonparametric test whose statistic is constructed from frequency counts. In SPSS, the chi-square test can be called from both the crosstabs analysis and the nonparametric tests.
The chi-square test has two main types of application.
First, the goodness-of-fit test
1. Test whether the observed frequencies agree with the theoretical frequencies for each category of a single unordered categorical variable
This type of problem is a single-variable test. The theoretical frequencies must be specified first; they are known from subject-matter knowledge or experience. The null hypothesis is that the observed frequencies agree with the theoretical frequencies.
"Example": 60 senior students were randomly selected and asked whether arts and sciences should be split into separate tracks. 39 answered in favour and 21 against. Is there a significant difference of opinion on the split?
Analysis: If there is no difference of opinion, those in favour and those against should each make up half, i.e. 30 people, so the theoretical frequency is 30 for each group.
"Example": The number of people suffering from depression on each day of a week is shown in the following table. Test whether the daily counts follow the ratio 1:1:2:2:1:1:1.
In this case the theoretical frequencies are not equal shares but follow the given ratio 1:1:2:2:1:1:1.
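Both examples above use the same statistic; only the theoretical frequencies differ. The following is a minimal Python sketch of that calculation, not the SPSS procedure itself. The helper names `chi_square_stat` and `expected_from_ratio` are invented for this illustration, the `erfc` shortcut for the p-value holds only for one degree of freedom, and the weekly total of 90 is a hypothetical figure used only to show how the ratio turns into frequencies.

```python
import math

def chi_square_stat(observed, expected):
    """Pearson chi-square: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def expected_from_ratio(ratio, total):
    """Split `total` observations across categories in proportion to `ratio`."""
    ratio_sum = sum(ratio)
    return [total * r / ratio_sum for r in ratio]

# Example 1: 39 in favour, 21 against; 30 expected in each group
stat = chi_square_stat([39, 21], [30, 30])
# With 1 degree of freedom the upper-tail p-value reduces to erfc(sqrt(stat / 2))
p = math.erfc(math.sqrt(stat / 2))
print(f"chi-square = {stat:.2f}, p = {p:.4f}")

# Example 2: theoretical frequencies under the ratio 1:1:2:2:1:1:1
weekly_total = 90  # HYPOTHETICAL total, not taken from the source
weekly_expected = expected_from_ratio([1, 1, 2, 2, 1, 1, 1], weekly_total)
print(weekly_expected)
```

For the first example the statistic is 5.4, which exceeds 3.841, the 0.05 critical value for one degree of freedom, so the hypothesis of no difference of opinion is rejected at that level.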
2. Test whether the categories of a categorical variable occur with equal probability
This kind of problem is also a single-variable test. For example, a tossed coin lands heads or tails with probability 1/2 each, and each face of a die comes up with probability 1/6. The null hypothesis is that all categories of the variable are equally probable.
"Example": A die is thrown 120 times and the number of points is recorded each time. Is there a problem with the die?
Analysis: If the die is normal, every number of points should be equally probable. The procedure is the same as before: use the nonparametric tests procedure and keep the default option that all categories are equal.
In fact, the first example above can be converted into this form: no difference of opinion means that in favour and against are equally probable. For the die, each number of points is expected to appear 120*1/6=20 times.
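The equal-probability case can be sketched as follows. This is an illustration only: `chi_square_equal` is a made-up helper name, and the die counts are hypothetical because the source gives no data for the 120 throws.

```python
def chi_square_equal(observed):
    """Chi-square statistic when every category is expected equally often."""
    n = sum(observed)
    expected = n / len(observed)   # e.g. 120 * 1/6 = 20 for the die
    return sum((o - expected) ** 2 / expected for o in observed)

# HYPOTHETICAL counts for 1 to 6 points over 120 throws (no data in the source)
die_counts = [18, 23, 16, 21, 24, 18]
die_stat = chi_square_equal(die_counts)

# The opinion example is the same test with two categories
opinion_stat = chi_square_equal([39, 21])

CRITICAL_DF5 = 11.070   # 0.05 critical value for 6 - 1 = 5 degrees of freedom
print(f"die: {die_stat:.2f}, reject at 0.05: {die_stat > CRITICAL_DF5}")
print(f"opinions: {opinion_stat:.2f}")
```

With six categories the statistic has 5 degrees of freedom, so it is compared with 11.070 at the 0.05 level.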
3. Test whether the distribution of a continuous variable agrees with a given theoretical distribution
The chi-square test is mainly used for categorical variables, but it can also test the goodness of fit of a continuous variable. The basic idea is as follows. Divide the range of the population X into k non-overlapping intervals A1, A2, ..., Ak. The number of sample values falling into the i-th interval Ai is the observed frequency, and the observed frequencies sum to the sample size n. From the theoretical distribution, calculate the probability pi that X falls into each interval Ai; n*pi is then the theoretical frequency of sample values falling into Ai. With the observed and theoretical frequencies, the chi-square statistic can be calculated and the chi-square test carried out.
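The interval procedure described above can be sketched in Python. Everything specific here is an assumption made for illustration: the theoretical distribution is taken to be standard normal, the sample is simulated, and the cut points at -1, 0 and 1 are arbitrary.

```python
import random
from statistics import NormalDist

random.seed(0)
dist = NormalDist(mu=0, sigma=1)                    # assumed theoretical distribution
sample = [random.gauss(0, 1) for _ in range(200)]   # simulated data, not from the source
n = len(sample)

# Cut points define k = 4 intervals: (-inf,-1], (-1,0], (0,1], (1,+inf)
cuts = [-1.0, 0.0, 1.0]
edges = [float("-inf")] + cuts + [float("inf")]
cdf_at_edges = [0.0] + [dist.cdf(c) for c in cuts] + [1.0]

observed, expected = [], []
for i in range(len(edges) - 1):
    lo, hi = edges[i], edges[i + 1]
    observed.append(sum(1 for x in sample if lo < x <= hi))   # actual frequency in Ai
    p_i = cdf_at_edges[i + 1] - cdf_at_edges[i]               # P(X in Ai)
    expected.append(n * p_i)                                  # theoretical frequency n * p_i

stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(observed, [round(e, 1) for e in expected], round(stat, 3))
```

With k = 4 intervals and no parameters estimated from the data, the statistic is compared with the chi-square distribution with k - 1 = 3 degrees of freedom, whose 0.05 critical value is 7.815.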
Second, the independence test
The independence test analyzes whether two variables are independent of each other, or whether they are independent of each other after controlling for a certain factor. The null hypothesis is that the two variables are independent of each other, that is, that there is no association between them.
The observed data for two variables are generally recorded in a contingency table, which is either a fourfold (2*2) table or an R*C table. Depending on the chi-square statistic and the type of the categorical variables, several correlation coefficients have also been derived; these were mentioned in the correlation analysis.
"Example": To understand the attitudes of men and women toward smoking in public places, 100 men and 80 women were randomly surveyed. 58 of the men were in favour and 42 were not, while 61 of the women were in favour and 19 were not. Do men and women differ in their attitude toward smoking in public places? In other words, does the attitude change with sex?
Independence between two variables means that one variable does not change as the other changes. The question here is whether men and women differ in their attitude toward smoking in public places. This looks similar to a goodness-of-fit problem, but it involves two variables, gender and attitude, and therefore calls for the independence test.
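The calculation for this example can be sketched in Python. This is a minimal illustration of the Pearson chi-square for a fourfold table without a continuity correction, not a reproduction of the SPSS output; the variable names are chosen for this sketch.

```python
import math

# Rows: men, women; columns: in favour, not in favour (counts from the example)
table = [[58, 42],
         [61, 19]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# Under independence, expected count = row total * column total / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

stat = sum((table[i][j] - expected[i][j]) ** 2 / expected[i][j]
           for i in range(2) for j in range(2))

# For a 2*2 table df = (2-1)*(2-1) = 1, so erfc gives the upper-tail p-value
p = math.erfc(math.sqrt(stat / 2))
print(f"chi-square = {stat:.3f}, p = {p:.4f}")
```

The statistic is about 6.61, above the 0.05 critical value of 3.841 for one degree of freedom, so the hypothesis that attitude is independent of sex is rejected at that level.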
On the surface, the goodness-of-fit test and the independence test look the same, both in the form of the table and in the chi-square formula, so both are often simply called the chi-square test. But there are differences between the two.
First, the sampling methods differ. If the sampling is carried out separately within each category and the proportions are calculated for the respective categories, it is a goodness-of-fit test. If the sample is not classified in advance, and the selected units are classified by the two variables only after sampling, according to the research content, to form a contingency table, it is an independence test.
Secondly, the hypotheses of the two tests differ. The null hypothesis of the goodness-of-fit test usually states that the population proportions of the categories equal some expected probabilities, whereas the independence test assumes that the two variables are independent of each other.
Finally, the expected frequencies are calculated differently. The goodness-of-fit test uses the expected probabilities in the null hypothesis: the expected frequency is the total number of observations multiplied by the expected probability. In the independence test, the joint probability of a cell is the product of the two marginal probabilities, so the expected frequency of a cell is the row total times the column total divided by the sample size.
SPSS data analysis: chi-square test