Chi-square test (χ² test)

Source: Internet
Author: User

The chi-square test (χ² test) is a widely used hypothesis test. It can be divided into two types: comparison between groups (unpaired data) and comparison within individuals (paired data, that is, comparison on the same subject).

1. χ² test for four-grid tables

Example 20.7: A hospital treated patients with ovarian cancer either with chemotherapy alone or with chemotherapy plus radiotherapy. The results are shown in Table 20-11. Is there a difference in efficacy between the two methods?

Table 20-11 Comparison of the efficacy of two methods in the treatment of ovarian cancer

Group                          Effective   Ineffective   Total   Effective rate (%)
Chemotherapy                       19           24         43          44.2
Chemotherapy + radiotherapy        34           10         44          77.3
Total                              53           34         87          60.9

The four values in the body of the table (19, 24, 34 and 10) are its basic data; all other values are calculated from them. Such a table is called a four-grid table (fourfold table), or a 2 × 2 contingency table. The effective rates calculated from the data are 44.2% and 77.3%. The difference between them may be due to sampling error, or the two treatments may truly differ in their overall (population) effective rates. The χ² test can be used to judge whether the difference is statistically significant. The basic formula of the test is:

$$\chi^2 = \sum \frac{(A - T)^2}{T} \qquad \text{formula (20.12)}$$

In the formula, A is the actual (observed) number; the four values in the four cells above are actual numbers. T is the theoretical (expected) number, derived from the test hypothesis that the two treatments for ovarian cancer do not differ in efficacy and that the observed difference is due only to sampling error. Under this hypothesis, the pooled effective rate of the two treatments, 53/87 = 60.9%, can be used as the theoretical effective rate, and from it the theoretical number for each of the four cells can be calculated. The data of Table 20-11 are used as an example below.

Test procedure:

1. Establish test hypothesis:

H0: π1 = π2

H1: π1 ≠ π2

α = 0.05

2. Calculate the theoretical numbers (T_RC). The formula is as follows:

$$T_{RC} = \frac{n_R \, n_C}{n} \qquad \text{formula (20.13)}$$

In the formula, T_RC is the theoretical number for the cell in row R and column C, n_R is the total of the row containing that cell, n_C is the total of the column containing that cell, and n is the total number of cases.

Row 1, column 1: 43 × 53/87 = 26.2

Row 1, column 2: 43 × 34/87 = 16.8

Row 2, column 1: 44 × 53/87 = 26.8

Row 2, column 2: 44 × 34/87 = 17.2
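The four theoretical numbers above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `expected_counts` and the list-of-lists layout are illustrative choices, not part of the source.

```python
def expected_counts(table):
    """Return the theoretical number T_RC = n_R * n_C / n for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Actual numbers from Table 20-11: (effective, ineffective) for each group
observed = [[19, 24],
            [34, 10]]

theoretical = expected_counts(observed)
for row in theoretical:
    print([round(t, 1) for t in row])
# [26.2, 16.8]
# [26.8, 17.2]
```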

The theoretical numbers can be entered in parentheses beside the four actual numbers, giving Table 20-12:

Table 20-12 Comparison of the efficacy of two methods in the treatment of ovarian cancer

Group                          Effective     Ineffective   Total
Chemotherapy                   19 (26.2)     24 (16.8)       43
Chemotherapy + radiotherapy    34 (26.8)     10 (17.2)       44
Total                          53            34              87

Because the row and column totals of the table are fixed, only one theoretical number needs to be obtained from the T_RC formula (for example, T1.1 = 26.2). The other three can then be found by subtracting from the corresponding row or column total:

T1.1 = 26.2

T1.2 = 43 − 26.2 = 16.8

T2.1 = 53 − 26.2 = 26.8

T2.2 = 44 − 26.8 = 17.2
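The subtraction shortcut can be checked in the same way. The sketch below computes only T1.1 from the formula and derives the rest from the fixed margins; the variable names are illustrative.

```python
row_totals = [43, 44]   # chemotherapy, chemotherapy + radiotherapy
col_totals = [53, 34]   # effective, ineffective
n = 87

# Only one theoretical number comes from the T_RC formula ...
t11 = round(row_totals[0] * col_totals[0] / n, 1)

# ... the other three follow from the row and column totals.
t12 = round(row_totals[0] - t11, 1)   # rest of row 1
t21 = round(col_totals[0] - t11, 1)   # rest of column 1
t22 = round(row_totals[1] - t21, 1)   # rest of row 2

print(t11, t12, t21, t22)   # 26.2 16.8 26.8 17.2
```

This works because a 2 × 2 table with fixed margins has a single degree of freedom: once one cell is known, the others are determined.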

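3. Calculate the χ² value by substituting into formula (20.12):

$$\chi^2 = \frac{(19-26.2)^2}{26.2} + \frac{(24-16.8)^2}{16.8} + \frac{(34-26.8)^2}{26.8} + \frac{(10-17.2)^2}{17.2} = 10.01$$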

4. Look up the χ² critical value table to find the P value

The degrees of freedom must be known before consulting the table. For the χ² test, ν = (number of rows − 1)(number of columns − 1); here ν = (2 − 1)(2 − 1) = 1. From the χ² critical value table (Appendix Table 20-1), χ²0.01(1) = 6.63. In this example χ² = 10.01 > χ²0.01(1), so P < 0.01 and the difference is highly statistically significant. At the α = 0.05 level H0 is rejected, and it can be concluded that chemotherapy plus radiotherapy is more effective than chemotherapy alone in the treatment of ovarian cancer.
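The whole procedure for this example can be sketched in Python as follows. The function name `chi_square` is illustrative, and the critical values are simply copied from the appendix table for ν = 1. Note that the code keeps the theoretical numbers unrounded, so it gives about 10.00 rather than the hand-calculated 10.01, which uses theoretical numbers rounded to one decimal place.

```python
def chi_square(table):
    """Basic chi-square statistic: sum over all cells of (A - T)^2 / T."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, actual in enumerate(row):
            theoretical = row_totals[i] * col_totals[j] / n
            total += (actual - theoretical) ** 2 / theoretical
    return total


observed = [[19, 24],
            [34, 10]]                      # Table 20-11

chi2 = chi_square(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Critical values for 1 degree of freedom, taken from the appendix table
critical = {0.05: 3.84, 0.01: 6.63}

print(round(chi2, 2), df)
print("reject H0 at alpha = 0.05:", chi2 > critical[0.05])
print("P < 0.01:", chi2 > critical[0.01])
```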

Working through the example shows what the basic formula means: the smaller the differences between the theoretical and actual numbers, the smaller the value of χ². If the two agree exactly, χ² is zero; χ² is never negative. Because every pair of theoretical and actual numbers contributes a term to χ², the more cells there are, the larger χ² tends to be, so the number of cells (through the degrees of freedom) must be taken into account when judging the significance of χ². This is why the critical value of χ² increases with the degrees of freedom.

2. Special formulas for four-grid tables

For four-grid table data, the χ² value can also be obtained from the following special formula:

$$\chi^2 = \frac{(ad - bc)^2 \, n}{(a+b)(c+d)(a+c)(b+d)}$$

In the formula, a, b, c and d are the four actual numbers in the four-grid table and n is the total number of cases. The data of Table 20-12 are labelled with these symbols below (Table 20-13) to demonstrate the calculation.

Table 20-13 Efficacy of two treatments in patients with ovarian cancer

Group                          Effective     Ineffective   Total
Chemotherapy                   19 (a)        24 (b)        43 (a + b)
Chemotherapy + radiotherapy    34 (c)        10 (d)        44 (c + d)
Total                          53 (a + c)    34 (b + d)    87 (n)

$$\chi^2 = \frac{(19 \times 10 - 24 \times 34)^2 \times 87}{43 \times 44 \times 53 \times 34} = 10.00$$

The result agrees with that of the basic formula. The difference of 0.01 arises from rounding the theoretical numbers in the basic calculation.
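A short sketch of the special formula, assuming its usual form (ad − bc)² n / [(a + b)(c + d)(a + c)(b + d)] with the cell labels of Table 20-13. The function name `chi_square_2x2` is illustrative.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square for a four-grid table computed directly from the four cells."""
    n = a + b + c + d
    numerator = (a * d - b * c) ** 2 * n
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


chi2 = chi_square_2x2(19, 24, 34, 10)   # a, b, c, d from Table 20-13
print(round(chi2, 2))                    # 10.0
```

No theoretical numbers are needed here, which is why this form avoids the small rounding discrepancy of the hand calculation with the basic formula.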

3. Correction of the χ² value for four-grid tables

The χ² critical value table is derived in mathematical statistics from the normal distribution and is an approximation. The approximation is good when the degrees of freedom are greater than 1 and the theoretical numbers are all greater than 5. When the degrees of freedom equal 1, and especially when 1 < T < 5 with n > 40, the following correction formula should be applied:

$$\chi^2 = \sum \frac{(|A - T| - 0.5)^2}{T} \qquad \text{formula (20.15)}$$

If the special formula for four-grid tables is used, apply the following corrected form:

$$\chi^2 = \frac{\left(|ad - bc| - \frac{n}{2}\right)^2 n}{(a+b)(c+d)(a+c)(b+d)}$$

Example 20.8: A physician treated simple dyspepsia in children with therapy A and therapy B. The results are shown in Table 20-14. Is there a difference in efficacy between the two therapies?

Table 20-14 Comparison of the efficacy of two therapies

Therapy    Cured          Not cured     Total
A          26 (28.82)     7 (4.18)        33
B          36 (33.18)     2 (4.82)        38
Total      62             9               71

Table 20-14 shows that T1.2 and T2.2 are both less than 5 while the total number of cases is greater than 40, so the correction formula (20.15) should be used. The procedure is as follows:

1. Test hypothesis:

H0: π1 = π2

H1: π1 ≠ π2

α = 0.05

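2. Calculate the theoretical numbers (shown in parentheses in the four-grid table)

3. Calculate the χ² value using formula (20.15):

$$\chi^2 = \frac{(|26-28.82|-0.5)^2}{28.82} + \frac{(|7-4.18|-0.5)^2}{4.18} + \frac{(|36-33.18|-0.5)^2}{33.18} + \frac{(|2-4.82|-0.5)^2}{4.82} = 2.75$$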

From the χ² critical value table, χ²0.05(1) = 3.84. Therefore χ² < χ²0.05(1) and P > 0.05.

H0 is not rejected at the α = 0.05 level. The difference in efficacy between the two therapies is not statistically significant.

If the original basic formula is used without the correction, the result is χ² = 4.068, which exceeds 3.84 and leads to the opposite conclusion.
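The effect of the correction on this example can be sketched as follows. The code uses the four-cell form of the correction, (|ad − bc| − n/2)² n / [(a + b)(c + d)(a + c)(b + d)], which is algebraically equivalent to the cell-by-cell form for a 2 × 2 table; the function name, the `correct` flag and the guard against a negative corrected difference are illustrative additions. Because nothing is rounded along the way, the uncorrected value comes out near 4.06 rather than the hand-calculated 4.068.

```python
def chi_square_2x2(a, b, c, d, correct=False):
    """Four-grid chi-square, optionally with the continuity correction."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if correct:
        # Subtract n/2; the max() guard (an added safeguard) avoids a negative value.
        diff = max(diff - n / 2, 0)
    return diff ** 2 * n / ((a + b) * (c + d) * (a + c) * (b + d))


# Table 20-14: therapy A (26 cured, 7 not), therapy B (36 cured, 2 not)
corrected = chi_square_2x2(26, 7, 36, 2, correct=True)
uncorrected = chi_square_2x2(26, 7, 36, 2)

print(round(corrected, 2), round(uncorrected, 2))   # 2.75 4.06
print("critical value 3.84 lies between them:", corrected < 3.84 < uncorrected)
```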

If any theoretical number T < 1, or n < 40, the correction above is not appropriate for four-grid table data. In that case the probability can be calculated directly with the exact probability method; see the medical statistics textbook for preventive medicine.
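The source only points to a textbook for the exact method. As an illustration, the sketch below assumes the commonly used Fisher exact test, which sums hypergeometric probabilities over all tables with the same margins that are no more likely than the observed one. The function name and the small example table are hypothetical and do not come from the source.

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided exact P value for a 2 x 2 table with all margins fixed."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    total_ways = comb(n, col1)

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k.
        return Fraction(comb(row1, k) * comb(row2, col1 - k), total_ways)

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_observed = prob(a)
    # Add up every table that is no more likely than the observed one.
    p_value = sum(p for p in map(prob, range(k_min, k_max + 1)) if p <= p_observed)
    return float(p_value)


# Hypothetical small sample (n = 8), too small for the chi-square approximation
p = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p, 4))   # 0.4857
```

Exact fractions are used for the comparison so that ties between equally likely tables are not broken by floating-point error.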

4. χ² test for row × column tables (χ² test for R × C tables)

It is used to test the significance of differences among the rates or percentages of two or more groups. The test steps are the same as those described above. The simplified calculation formula is as follows:

$$\chi^2 = n \left( \sum \frac{A^2}{n_R \, n_C} - 1 \right) \qquad \text{formula (20.17)}$$

In the formula, n is the total number of cases, A is each observed (actual) number, and n_R and n_C are the row total and column total corresponding to each A.

Example 20.9: In the north, winter days are short and the sun shifts toward the south. It is therefore important that residential design secure the maximum amount of sunshine, so as to strengthen residents' constitution and reduce rickets in children. In a 1986 study of sunshine hygiene standards for residential buildings in Beijing, Hu examined 712 infants and young children living in 214 buildings, 379 of whom had mild rickets, and compared living-room orientation with prevalence. The data are summarized in Table 20-15 for a row × column χ² test.

Table 20-15 Comparison of living-room orientation and the prevalence of rickets in infants and young children

                      Living-room orientation
Examination result    South    West, Southwest    East, Southeast    North, Northeast, Northwest    Total
Diseased               180          14                 120                      65                   379
Not diseased           200          16                  84                      33                   333
Total                  380          30                 204                      98                   712
Prevalence (%)         47.4         46.7               58.8                     66.3                 53.2

This table has 2 rows and 4 columns and is called a 2 × 4 table. Formula (20.17) can be used for the test.

(1) Test steps

1. Test hypotheses

H0: the prevalence of rickets in infants and young children is the same for the four orientations.

H1: the prevalence of rickets in infants and young children is not the same for all four orientations.

α = 0.05

2. Calculate the χ² value

Substituting the data of Table 20-15 into formula (20.17) gives χ² = 15.08.

3. Determine the P value and analyze

In this example ν = (2 − 1)(4 − 1) = 3. From Appendix Table 20-1:

χ²0.01(3) = 11.34. Here χ² = 15.08 > χ²0.01(3), so P < 0.01. At the α = 0.05 level H0 is rejected, and it can be concluded that the prevalence of rickets in infants and young children differs among homes whose living rooms face different directions.
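The value χ² = 15.08 can be reproduced with a short sketch. It assumes the usual shortcut form of the row × column formula, n (Σ A² / (n_R n_C) − 1), which matches the symbol definitions given for the formula; the function name `chi_square_rxc` is illustrative.

```python
def chi_square_rxc(table):
    """Row x column chi-square via the shortcut n * (sum(A^2 / (n_R * n_C)) - 1)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    s = sum(
        actual ** 2 / (row_totals[i] * col_totals[j])
        for i, row in enumerate(table)
        for j, actual in enumerate(row)
    )
    return n * (s - 1)


# Table 20-15: rows = diseased / not diseased, columns = four orientations
observed = [[180, 14, 120, 65],
            [200, 16, 84, 33]]

chi2 = chi_square_rxc(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)
print(round(chi2, 2), df)   # 15.08 3
```

The shortcut needs no theoretical numbers, only the cell counts and the margins, which is convenient for larger tables.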

(2) Notes on the χ² test for row × column tables

1. It is generally held that in a row × column table no more than 1/5 of the cells should have a theoretical number less than 5, and no cell should have a theoretical number less than 1. When theoretical numbers are too small, the following remedies are available: ① increase the sample size so that the theoretical numbers increase; ② delete the rows or columns whose theoretical numbers are too small; ③ merge the actual numbers of a row or column whose theoretical numbers are too small into an adjacent row or column of similar nature, so that the recalculated theoretical numbers increase. The last two remedies may lose information and impair the randomness of the sample, and different ways of merging may lead to different conclusions, so they should not be used routinely. In addition, categories of a different nature must not be merged; for example, different blood groups cannot be merged in a study of blood groups.
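The rule of thumb in note 1 is easy to check before running the test. The sketch below is one possible way to do so; the function name `small_expected_check` and its return format are illustrative.

```python
def small_expected_check(table):
    """Check the rule: at most 1/5 of cells with T < 5, and no cell with T < 1.

    Returns (rule_satisfied, cells_below_5, cells_below_1).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    theoretical = [r * c / n for r in row_totals for c in col_totals]
    below_5 = sum(t < 5 for t in theoretical)
    below_1 = sum(t < 1 for t in theoretical)
    satisfied = below_5 <= len(theoretical) / 5 and below_1 == 0
    return satisfied, below_5, below_1


# Table 20-15 (2 x 4): all theoretical numbers are comfortably large
print(small_expected_check([[180, 14, 120, 65], [200, 16, 84, 33]]))   # (True, 0, 0)

# Table 20-14 (2 x 2): two of the four theoretical numbers are below 5
print(small_expected_check([[26, 7], [36, 2]]))                         # (False, 2, 0)
```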

2. If the test rejects the hypothesis, it can only be concluded that the overall rates or composition ratios are not all the same. It cannot be concluded that every pair differs, nor that any particular two differ.

5. χ² test for paired enumeration data (χ² test of paired comparison of enumeration data)

For measurement data, the comparison of before-and-after differences on the same subject, or of paired data, is handled differently from the comparison of two sample means. The same holds for enumeration (count) data. For example, Table 20-16 shows 28 throat smear specimens, each inoculated under the same conditions onto culture medium A and culture medium B, to observe the growth of Escherichia coli. Compare the performance of the two media.

Table 20-16 Comparison of the culture results for Escherichia coli on two media

Medium A      Medium B               Total
              +           −
+             11 (a)      9 (b)        20
−              1 (c)      7 (d)         8
Total         12          16           28

The table shows four kinds of results: (a) A+ B+, (b) A+ B−, (c) A− B+, (d) A− B−. If the aim is to compare the two media, results (a) and (d) are concordant and contribute nothing to the comparison, so they can be ignored. Only the discordant results (b) and (c) are considered, to see whether the difference between them is significant. The following simple formula can be used:

$$\chi^2 = \frac{(|b - c| - 1)^2}{b + c}$$

Test procedure:

1. Test hypotheses

H0: π1 = π2

H1: π1 ≠ π2

α = 0.05

2. Calculate the χ² value

$$\chi^2 = \frac{(|9 - 1| - 1)^2}{9 + 1} = 4.9$$

3. Determine the P value and analyze

For paired data ν = 1. From Appendix Table 20-1, χ²0.05(1) = 3.84; χ² > χ²0.05(1), so P < 0.05. At the α = 0.05 level H0 is rejected, and it can be concluded that Escherichia coli grows better on medium A.

If b + c > 40, the uncorrected form can be used:

$$\chi^2 = \frac{(b - c)^2}{b + c}$$
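Both versions of the paired test can be sketched in one small function. It assumes the usual forms (|b − c| − 1)² / (b + c) with correction and (b − c)² / (b + c) without; the function name and the `correct` flag are illustrative.

```python
def paired_chi_square(b, c, correct=True):
    """Chi-square for paired count data, using only the discordant pairs b and c."""
    diff = abs(b - c)
    if correct:
        # Continuity correction, intended for b + c <= 40.
        diff = max(diff - 1, 0)
    return diff ** 2 / (b + c)


# Table 20-16: b = 9 (A+ B-), c = 1 (A- B+)
chi2 = paired_chi_square(9, 1)
print(chi2)                                      # 4.9
print("reject H0 at alpha = 0.05:", chi2 > 3.84)

# Uncorrected form, meant for b + c > 40 (shown here only for comparison)
print(paired_chi_square(9, 1, correct=False))    # 6.4
```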

There are also methods for comparing more than two treatments. For details, see the chapter on statistical methods in Preventive Medicine.

Appendix Table 20-1 χ² critical values

 ν            P                       ν            P
      0.05    0.01    0.001                0.05    0.01    0.001
 1    3.84    6.63    10.83          16   26.30   32.00   39.25
 2    5.99    9.21    13.81          17   27.59   33.41   40.79
 3    7.81   11.34    16.27          18   28.87   34.81   42.31
 4    9.49   13.28    18.47          19   30.14   36.19   43.82
 5   11.07   15.09    20.52          20   31.41   37.57   45.32
 6   12.59   16.81    22.46          21   32.67   38.93   46.80
 7   14.07   18.48    24.32          22   33.92   40.29   48.27
 8   15.51   20.09    26.12          23   35.17   41.64   49.73
 9   16.92   21.67    27.88          24   36.42   42.98   51.18
10   18.31   23.21    29.59          25   37.65   44.31   52.62
11   19.68   24.72    31.26          26   38.89   45.64   54.05
12   21.03   26.22    32.91          27   40.11   46.96   55.48
13   22.36   27.69    34.53          28   41.34   48.28   56.89
14   23.68   29.14    36.12          29   42.56   49.59   58.30
15   25.00   30.58    37.70          30   43.77   50.89   59.70
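The tabulated critical values can be cross-checked numerically. The sketch below computes the upper-tail probability of the χ² distribution for whole-number degrees of freedom using the standard closed-form series (a finite sum for even ν, the complementary error function plus a finite sum for odd ν). The function name `chi2_sf` is illustrative; feeding it a critical value from the table should return approximately the P value at the head of that column.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) + 2*phi(sqrt(x)) * sum_r x^(r - 1/2) / (1*3*...*(2r - 1))
    root = math.sqrt(x)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    normal_pdf = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    return math.erfc(root / math.sqrt(2)) + 2 * normal_pdf * total


# Each critical value should give back roughly its column's P value
for x, df in [(3.84, 1), (5.99, 2), (11.34, 3), (9.49, 4)]:
    print(df, x, round(chi2_sf(x, df), 3))
```

The same function turns a computed statistic into a P value directly, for example `chi2_sf(15.08, 3)` for the row × column example, instead of bracketing it between table columns.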

From: http://www.med66.com/html/2005/8/hu79050353148500210950.html
