Chapter 9: Contingency Table Analysis

Source: Internet
Author: User

Main topics of this chapter

1 Categorical data and contingency tables

1 Categorical data

Examples: intact family / divorced family; first-grade / second-grade / third-grade products; ...

2 Construction of contingency tables

A contingency table is a frequency distribution table in which two or more variables are cross-classified.

3 Distribution of contingency tables

The distribution of a contingency table can be viewed from two perspectives: the distribution of the observed values and the distribution of the expected values.
(1) Distribution of observed values
Conditional frequencies, row marginal frequencies, column marginal frequencies, and percentages.
(2) Distribution of expected values
The expected value of each cell is derived from the overall proportions.
For example, suppose respondents at four companies are asked whether they are in favour of or against a reform programme. The total sample is 420 (100 + 120 + 90 + 110), of whom 279 are in favour, that is, 66.4% of the total. If the companies hold the same view of the reform programme, then for the company with 100 respondents the expected number in favour is 0.664 × 100 ≈ 66, and the expected and observed values should be very close.
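The arithmetic in this example can be checked with a short sketch. It uses only the totals quoted in the text (the four company sizes and the 279 respondents in favour). The variable names are illustrative.

```python
# Figures taken from the text: four companies, 420 respondents, 279 in favour.
company_sizes = [100, 120, 90, 110]
total_in_favour = 279

n = sum(company_sizes)                   # total sample size
share_in_favour = total_in_favour / n    # overall proportion in favour

# Expected counts per company if every company held the same view.
expected_in_favour = [share_in_favour * size for size in company_sizes]
expected_against = [size - e for size, e in zip(company_sizes, expected_in_favour)]

for i, (fav, ag) in enumerate(zip(expected_in_favour, expected_against), start=1):
    print(f"company {i}: expected in favour = {fav:.1f}, expected against = {ag:.1f}")
```

The expected counts in favour add up to 279 again, because the same overall proportion is applied to every company.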
To test the hypothesis π1 = π2 = π3 = π4 = 0.664 (where πi is the proportion in favour of the reform programme in the i-th company), a χ² test can be used.
In general, the expected frequency of any cell is:
$$f_e = \frac{RT}{n} \times \frac{CT}{n} \times n = \frac{RT \times CT}{n}$$
where RT is the total of the row containing the given cell, CT is the total of the column containing the given cell, and n is the total number of observations, that is, the sample size.

2 χ² test

If f_o denotes the observed frequency and f_e denotes the expected frequency, the χ² statistic is:
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
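As a minimal sketch, the statistic can be computed directly from a table of observed counts, taking each expected frequency as row total × column total / n. The cell counts below are hypothetical. They were chosen only to agree with the totals quoted earlier (company sizes 100, 120, 90 and 110, with 279 in favour) and are not given in the text. The function names are illustrative as well.

```python
def expected_counts(observed):
    """Expected frequency of each cell: row total * column total / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]


def chi_square(observed):
    """Sum of (f_o - f_e)^2 / f_e over all cells of the table."""
    expected = expected_counts(observed)
    return sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(observed, expected)
        for fo, fe in zip(obs_row, exp_row)
    )


# Hypothetical observed counts (rows: in favour / against; columns: companies).
observed = [
    [68, 75, 57, 79],
    [32, 45, 33, 31],
]

chi2 = chi_square(observed)
print(f"chi-square statistic: {chi2:.4f}")
```

A small value means the observed counts sit close to the expected ones.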
Steps:
(1) State the hypotheses. H0: there is no difference; H1: there is a difference.
(2) Calculate the test statistic and the critical value.
The χ² distribution has (R − 1)(C − 1) degrees of freedom, where R is the number of rows and C is the number of columns.
(3) Compare the test statistic with the critical value and decide whether to reject the null hypothesis.

3 Measures of association in contingency tables

The χ² distribution is used to test statistically whether two categorical variables are associated. If the variables are independent of each other, there is no association between them; otherwise, they are considered to be associated. If there is an association, the next question is how strong it is.
Association between categorical variables is called qualitative correlation.

1 φ correlation coefficient

The φ correlation coefficient is the most commonly used measure of the degree of association in a 2×2 contingency table. The formula is:
$$\varphi = \sqrt{\frac{\chi^2}{n}}$$
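The following sketch computes φ for a 2×2 table by first computing χ² and then dividing by the sample size. The counts are hypothetical and serve only as an illustration. The helper names are not from the text.

```python
import math


def chi_square(observed):
    """Chi-square statistic with expected counts row total * column total / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return sum(
        (fo - rt * ct / n) ** 2 / (rt * ct / n)
        for row, rt in zip(observed, row_totals)
        for fo, ct in zip(row, col_totals)
    )


def phi_coefficient(observed):
    """phi = sqrt(chi-square / n)."""
    n = sum(sum(row) for row in observed)
    return math.sqrt(chi_square(observed) / n)


# Hypothetical 2x2 table of counts.
table = [
    [30, 20],
    [10, 40],
]
phi = phi_coefficient(table)
print(f"phi = {phi:.3f}")
```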
For a 2×2 table, φ ranges between 0 and 1, and the larger the absolute value of φ, the stronger the association between the variables. However, when the number of rows R or the number of columns C is greater than 2, φ grows as R or C increases and has no upper bound. The meaning of φ as a measure of association then becomes unclear, and the contingency coefficient can be used instead.

2 Contingency coefficient

The contingency correlation coefficient, also called the contingency coefficient or the C coefficient for short, is mainly used for tables larger than 2×2. The formula is:
$$C = \sqrt{\frac{\chi^2}{\chi^2 + n}}$$
Features: when the variables are independent, the coefficient is 0. It is always less than 1, and its maximum possible value depends on the number of rows and columns of the contingency table, increasing as R and C increase.
Disadvantage: contingency coefficients calculated from tables with different numbers of rows and columns are not easy to compare, unless the two contingency tables have the same number of rows and columns.

3 V correlation coefficient

Because the φ coefficient has no upper limit and the C coefficient is always less than 1, Cramér proposed the V coefficient. The formula is:
$$V = \sqrt{\frac{\chi^2}{n \times \min[(R-1), (C-1)]}}$$
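To see how the three coefficients behave at their extremes, the sketch below evaluates φ, C and V on two perfectly associated tables, one 2×2 and one 3×3. Both tables are artificial and were constructed for illustration. The function names are assumptions and do not come from the text.

```python
import math


def chi_square(observed):
    """Chi-square statistic with expected counts row total * column total / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return sum(
        (fo - rt * ct / n) ** 2 / (rt * ct / n)
        for row, rt in zip(observed, row_totals)
        for fo, ct in zip(row, col_totals)
    )


def association_measures(observed):
    """Return (phi, C, V) for a table of observed counts."""
    n = sum(sum(row) for row in observed)
    r, c = len(observed), len(observed[0])
    chi2 = chi_square(observed)
    phi = math.sqrt(chi2 / n)
    c_coef = math.sqrt(chi2 / (chi2 + n))
    v = math.sqrt(chi2 / (n * min(r - 1, c - 1)))
    return phi, c_coef, v


# Artificial tables in which the row category fully determines the column category.
perfect_2x2 = [[50, 0], [0, 50]]
perfect_3x3 = [[30, 0, 0], [0, 30, 0], [0, 0, 30]]

for name, table in [("2x2", perfect_2x2), ("3x3", perfect_3x3)]:
    phi, c_coef, v = association_measures(table)
    print(f"{name}: phi = {phi:.3f}, C = {c_coef:.3f}, V = {v:.3f}")
```

In this sketch φ exceeds 1 for the 3×3 table, C stays below 1 but its ceiling rises with the table size, and V reaches 1 in both cases.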
The value of V lies between 0 and 1.

4 Numerical analysis

When describing the degree of association, the calculated coefficient can be compared with the maximum possible value of that coefficient to judge how strong the association is.

4 Issues to note in contingency table analysis

1 Direction of the conditional percentage table

In general, the position of a variable in a contingency table is arbitrary. If there is a causal relationship between the variables X and Y, with X as the independent variable and Y as the dependent variable, then X is generally placed in the column position and the conditional percentages are calculated in the direction of the independent variable. There are exceptions, however.

2 Expected-frequency criteria for the χ² distribution

A test of independence based on the χ² distribution requires a sufficiently large sample. In particular, the expected frequency (theoretical frequency) of each cell should not be too small; otherwise the χ² test may lead to wrong conclusions.
There are usually two criteria for small expected cell frequencies:
(1) If there are only two cells, the expected frequency of each cell must be 5 or more.
(2) If there are more than two cells and 20% of the cells have an expected frequency f_e below 5, the χ² test should not be applied.
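The two criteria above can be sketched as a small check on a list of expected frequencies. The function name is hypothetical. Treating a share of exactly 20% as failing is an assumption about how the rule is meant to be read. The sample values are illustrative.

```python
def chi_square_applicable(expected):
    """Return True if the expected frequencies pass the two criteria in the text."""
    cells = list(expected)
    small = sum(1 for fe in cells if fe < 5)
    if len(cells) <= 2:
        # Criterion (1): with only two cells, every expected frequency must be 5 or more.
        return small == 0
    # Criterion (2): with more than two cells, the test is ruled out once 20% of
    # the cells fall below 5 (counting exactly 20% as failing is an assumption).
    return small * 100 < 20 * len(cells)


# Illustrative expected frequencies.
print(chi_square_applicable([66.4, 33.6]))               # two cells, both large
print(chi_square_applicable([4.0, 20.0]))                # two cells, one too small
print(chi_square_applicable([1.03, 3.97, 4.97, 19.03]))  # four cells, three too small
```

Integer arithmetic is used for the 20% comparison so that the boundary case does not depend on floating-point rounding.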
