U-test and t-test

Source: Internet
Author: User

The U-test and the t-test can be used to compare a sample mean with a population mean, or to compare two sample means. In theory, the samples should come from normally distributed populations. However, the U-test can be applied when the sample size n is large, or when n is small but the population standard deviation σ is known. When n is small and σ is unknown, the t-test can be applied, but the sample must come from a normally distributed population. When two sample means are compared, the two population variances must also be equal.

I. Comparison of a sample mean with a population mean

The purpose of the comparison is to infer whether the unknown population mean μ represented by the sample differs from a known population mean μ0. Generally, a theoretical value, a standard value, or a stable value obtained from a large number of observations is treated as μ0. Whether to use the U-test or the t-test depends on the sample size n and on whether the population standard deviation σ is known.

(1) The U-test is used when σ is known, or when σ is unknown but n is large enough [in the latter case the sample standard deviation S is used as an estimate of σ and substituted into formula (19.6)].

The conclusion is drawn from the calculated statistic u according to the relationships shown in Table 19-3.

Table 19-3 |u| values, P values, and statistical conclusions

α      |u| (two-sided)   |u| (one-sided)   P value   Statistical conclusion
0.05   < 1.96            < 1.645           > 0.05    Do not reject H0; the difference is not statistically significant
0.05   ≥ 1.96            ≥ 1.645           ≤ 0.05    Reject H0 and accept H1; the difference is statistically significant
0.01   ≥ 2.58            ≥ 2.33            ≤ 0.01    Reject H0 and accept H1; the difference is highly statistically significant
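
The decision rule in Table 19-3 can be sketched as a small lookup. This is only an illustration: the helper name `u_conclusion` and the dictionary layout are assumptions made here, and the critical values are the ones listed in the table.

```python
# Critical |u| values as listed in Table 19-3.
CRITICAL_U = {
    0.05: {"two-sided": 1.96, "one-sided": 1.645},
    0.01: {"two-sided": 2.58, "one-sided": 2.33},
}


def u_conclusion(u, sides="two-sided"):
    """Hypothetical helper: map a calculated u to the Table 19-3 conclusion."""
    abs_u = abs(u)
    if abs_u >= CRITICAL_U[0.01][sides]:
        return "P <= 0.01: reject H0, difference is highly statistically significant"
    if abs_u >= CRITICAL_U[0.05][sides]:
        return "P <= 0.05: reject H0, difference is statistically significant"
    return "P > 0.05: do not reject H0, difference is not statistically significant"


print(u_conclusion(1.833, sides="one-sided"))
print(u_conclusion(1.5))
```

The function checks the stricter 0.01 threshold first so that a very large |u| is reported as highly significant rather than merely significant.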

Example 19.3 According to a large number of surveys, the mean pulse rate of healthy adult men is 72 beats/min, with a standard deviation of 6.0 beats/min. A doctor randomly examined 25 healthy adult men in a mountainous area and obtained a mean pulse rate of 74.2 beats/min. Can the pulse rate of adult men in the mountainous area be considered higher than that of adult men in general?

According to the problem, the mean of 72 beats/min and the standard deviation of 6.0 beats/min obtained from the large number of surveys can be taken as the population mean μ0 and the population standard deviation σ. The sample mean X̄ is 74.2 beats/min and the sample size n is 25.

H0: μ = μ0

H1: μ > μ0

α = 0.05 (one-sided test)

The calculated statistic is u = 1.833 > 1.645, so P < 0.05. H0 is rejected at the α = 0.05 level and H1 is accepted: the pulse rate of healthy adult men in the mountainous area can be considered higher than that of adult men in general.
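
The arithmetic of Example 19.3 can be checked with a short sketch. The formula itself did not survive in this copy, so the code assumes the usual one-sample form u = (X̄ - μ0) / (σ / √n), which reproduces the value 1.833 quoted above. The one-sided P value from the standard normal distribution is an addition for illustration and is not given in the source.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 72.0    # population mean, beats/min (from the surveys)
sigma = 6.0   # population standard deviation, beats/min
x_bar = 74.2  # sample mean, beats/min
n = 25        # sample size

# Assumed one-sample u statistic: (sample mean - mu0) / standard error.
u = (x_bar - mu0) / (sigma / sqrt(n))

# One-sided P value: upper-tail area of the standard normal beyond u.
p_one_sided = 1.0 - NormalDist().cdf(u)

print(f"u = {u:.3f}, one-sided P = {p_one_sided:.4f}")
print("reject H0" if u >= 1.645 else "do not reject H0")
```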

(2) The t-test is used when σ is unknown and n is small.

The conclusion is drawn from the calculated statistic t according to the relationships shown in Table 19-4.

Table 19-4 |t| values, P values, and statistical conclusions

α      |t| value     P value   Statistical conclusion
0.05   < t0.05(ν)    > 0.05    Do not reject H0; the difference is not statistically significant
0.05   ≥ t0.05(ν)    ≤ 0.05    Reject H0 and accept H1; the difference is statistically significant
0.01   ≥ t0.01(ν)    ≤ 0.01    Reject H0 and accept H1; the difference is highly statistically significant

Example 19.4 Suppose the population standard deviation σ in Example 19.3 is unknown, but the sample standard deviation has been obtained, S = 6.5 beats/min. The remaining data are the same as in Example 19.3.

The difference from Example 19.3 is that σ is unknown, so the t-test is used.

H0: μ = μ0

H1: μ > μ0

α = 0.05 (one-sided test)

In this example, the degrees of freedom are ν = 25 - 1 = 24, and t0.05(24) = 1.711 is obtained from the table of critical t values (one-sided) (Table 19-1). The calculated statistic is t = 1.692 < 1.711, so P > 0.05. H0 is not rejected at the α = 0.05 level: the pulse rate of adult men in the mountainous area cannot yet be considered higher than that of adult men in general.
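
Example 19.4 can be checked the same way. The sketch below assumes the usual one-sample form t = (X̄ - μ0) / (S / √n), which reproduces the quoted 1.692. The critical value 1.711 is copied from the text rather than computed.

```python
from math import sqrt

mu0 = 72.0    # population mean, beats/min
s = 6.5       # sample standard deviation, beats/min
x_bar = 74.2  # sample mean, beats/min
n = 25        # sample size

# Assumed one-sample t statistic with n - 1 degrees of freedom.
t = (x_bar - mu0) / (s / sqrt(n))
df = n - 1

t_crit = 1.711  # one-sided t0.05(24), as quoted in the text
reject_h0 = t >= t_crit

print(f"t = {t:.3f}, df = {df}, reject H0: {reject_h0}")
```

Replacing σ = 6.0 with the larger estimate S = 6.5 widens the standard error, which is why the same sample mean no longer reaches significance.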

II. Comparison of paired data

In medical research, paired designs are commonly used. There are four main situations in a paired design: ① data from the same subject before and after treatment; ② data from two sites on the same subject; ③ results of testing the same sample with two methods (instruments, etc.); ④ data from paired subjects who each receive one of two treatments. Situation ① is used to infer whether the treatment has an effect; situations ②, ③, and ④ are used to infer whether the results of the two treatments (or methods) differ.

Formula (19.8)

In the formula, 0 is the population mean of the differences: if the treatment has no effect, or the two methods do not differ, the population mean of the differences is 0. d̄ is the mean of the paired differences d (d is short for difference), calculated in the same way as formula (18.1). S_d̄ is the standard error of the mean difference, and S_d is the standard deviation of the differences, calculated in the same way as formula (18.3). n is the number of pairs.

The calculated statistic is t, and the conclusion is drawn according to the relationships shown in Table 19-4.

Example 19.5 Nine patients with hypertension were treated with a certain drug. Their diastolic blood pressures before and after treatment are shown in Table 19-5. Did the diastolic blood pressure change after treatment?

Table 19-5 Diastolic blood pressure (kPa) of hypertensive patients before and after treatment with a certain drug

Patient ID   Before treatment   After treatment   Difference d   d²
1            12.8               11.7              1.1            1.21
2            13.1               13.1              0.0            0.00
3            14.9               14.4              0.5            0.25
4            14.4               13.6              0.8            0.64
5            13.6               13.1              0.5            0.25
6            13.1               13.3              -0.2           0.04
7            13.3               12.8              0.5            0.25
8            14.1               13.6              0.5            0.25
9            13.3               12.3              1.0            1.00
Total                                             4.7            3.89

H0: the diastolic blood pressure does not change after treatment, that is, μd = 0

H1: the diastolic blood pressure changes after treatment, that is, μd ≠ 0

α = 0.05

The degrees of freedom are ν = n - 1 = 8. From the table of critical t values, t0.05(8) = 2.306 and t0.01(8) = 3.355. The calculated t exceeds t0.01(8), so P < 0.01. H0 is rejected at the α = 0.05 level and H1 is accepted: the diastolic blood pressure can be considered to change after treatment, that is, the drug lowers diastolic blood pressure.
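
The paired calculation for Example 19.5 can be sketched as follows. The numerical value of t is missing from this copy of the text, so the code recomputes it from Table 19-5, assuming the usual paired form t = d̄ / (S_d / √n) with S_d the sample standard deviation of the differences. The critical values are the ones quoted above.

```python
from math import sqrt
from statistics import mean, stdev

# Diastolic blood pressure (kPa) from Table 19-5.
before = [12.8, 13.1, 14.9, 14.4, 13.6, 13.1, 13.3, 14.1, 13.3]
after = [11.7, 13.1, 14.4, 13.6, 13.1, 13.3, 12.8, 13.6, 12.3]

# Paired differences, rounded to one decimal to remove floating-point noise.
d = [round(b - a, 1) for b, a in zip(before, after)]
n = len(d)

d_bar = mean(d)             # mean difference
s_d = stdev(d)              # sample standard deviation of the differences
se_d_bar = s_d / sqrt(n)    # standard error of the mean difference
t = d_bar / se_d_bar        # assumed paired t statistic, df = n - 1

print(f"sum(d) = {sum(d):.1f}, mean d = {d_bar:.3f}, S_d = {s_d:.4f}")
print(f"t = {t:.3f}, df = {n - 1}")
print("exceeds t0.01(8) = 3.355:", t > 3.355)
```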

III. Comparison of the means of two samples in a completely randomized design

This is also known as group comparison. The purpose is to infer whether the population means μ1 and μ2 represented by the two samples are equal. Depending on the sample sizes, either the U-test or the t-test is used.

(1) The U-test can be used when the two sample sizes n1 and n2 are both large enough, for example both greater than 50 or 100.

Formula (19.9)

The calculated statistic is u, and the conclusion is drawn according to the relationships shown in Table 19-3.

Example 19.6 The red blood cell counts of some healthy adults in a certain area were sampled. For 360 men the mean was 4.660 × 10^12/L with a standard deviation of 0.575 × 10^12/L; for 255 women the mean was 4.178 × 10^12/L with a standard deviation of 0.291 × 10^12/L. Do the mean red blood cell counts of men and women in this area differ?

H0: μ1 = μ2

H1: μ1 ≠ μ2

α = 0.05

Here X̄1 = 4.660 × 10^12/L, S1 = 0.575 × 10^12/L, n1 = 360;

X̄2 = 4.178 × 10^12/L, S2 = 0.291 × 10^12/L, n2 = 255.

The calculated u = 13.63 > 2.58, so P < 0.01. H0 is rejected at the α = 0.05 level and H1 is accepted: the mean red blood cell counts of men and women in this area differ, and that of men is higher than that of women.
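
Example 19.6 can be reproduced with a few lines. Formula (19.9) is not shown in this copy, so the sketch assumes the usual large-sample form u = (X̄1 - X̄2) / √(S1²/n1 + S2²/n2), which gives the quoted 13.63. Values are in units of 10^12/L.

```python
from math import sqrt

# Summary statistics from Example 19.6 (units: 10^12 cells per litre).
x1_bar, s1, n1 = 4.660, 0.575, 360  # men
x2_bar, s2, n2 = 4.178, 0.291, 255  # women

# Standard error of the difference between two independent sample means.
se_diff = sqrt(s1**2 / n1 + s2**2 / n2)

# Assumed two-sample u statistic for large samples.
u = (x1_bar - x2_bar) / se_diff

print(f"SE of difference = {se_diff:.5f}, u = {u:.2f}")
print("exceeds 2.58 (two-sided, alpha = 0.01):", abs(u) >= 2.58)
```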

(2) The t-test can be used when the two sample sizes n1 and n2 are small, provided that the two population variances are equal, that is, the variances are homogeneous (homoscedasticity). If the variances of the two samples differ considerably and the difference is statistically significant, the t'-test, which does not assume equal variances, is required instead.

Formula (19.10)

Formula (19.11)

Formula (19.12)

In the formulas, S(X̄1-X̄2) is the standard error of the difference between the two sample means, and Sc² is the pooled estimate of the variance (combined estimate variance). The calculated statistic is t, and the conclusion is drawn according to the relationships shown in Table 19-4.

Example 19.7 A doctor measured pelvic X-ray data of 50 normal women each of the Yao and Dong nationalities in Guangxi. For the anteroposterior diameter of the pelvic inlet, the Yao women had a mean of 12.002 cm with a standard deviation of 0.948 cm, and the Dong women had a mean of 11.456 cm with a standard deviation of 1.215 cm. Does the anteroposterior diameter of the pelvic inlet differ between the women of the two nationalities?

H0: μ1 = μ2

H1: μ1 ≠ μ2

α = 0.05

It is known that n1 = n2 = 50, X̄1 = 12.002 cm, S1 = 0.948 cm;

X̄2 = 11.456 cm, S2 = 1.215 cm.

In this example, the degrees of freedom are ν = n1 + n2 - 2 = 98. Look up the table of critical t values [the table has no entry for 98 degrees of freedom; interpolation (omitted here) can be used, or the value for ν = 100 can be taken as an estimate]: t0.05(100) = 1.984 and t0.01(100) = 2.626. The calculated t = 2.505 > t0.05(100), so P < 0.05. H0 is rejected at the α = 0.05 level and H1 is accepted: the anteroposterior diameter of the pelvic inlet differs between Yao and Dong women in Guangxi, and that of the former is greater than that of the latter.
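
Example 19.7 can be verified with the standard pooled-variance t-test. Formulas (19.10) to (19.12) are not shown in this copy, so the form below is an assumption: pooled variance Sc² = [(n1 - 1)S1² + (n2 - 1)S2²] / (n1 + n2 - 2), standard error √(Sc²(1/n1 + 1/n2)), and t as the difference in means divided by that standard error. It reproduces the quoted t = 2.505.

```python
from math import sqrt

# Summary statistics from Example 19.7 (cm).
x1_bar, s1, n1 = 12.002, 0.948, 50  # Yao women
x2_bar, s2, n2 = 11.456, 1.215, 50  # Dong women

df = n1 + n2 - 2

# Pooled (combined) estimate of the common variance.
s2_c = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df

# Standard error of the difference between the two sample means.
se_diff = sqrt(s2_c * (1 / n1 + 1 / n2))

t = (x1_bar - x2_bar) / se_diff

print(f"pooled variance = {s2_c:.4f}, SE = {se_diff:.4f}")
print(f"t = {t:.3f}, df = {df}")
print("exceeds t0.05(100) = 1.984:", abs(t) >= 1.984)
```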

IV. Comparison of the geometric means of two samples in a completely randomized design

Some medical data form a geometric (equal-ratio) series or follow a log-normal distribution, and the geometric mean is the appropriate measure of their average level. The purpose of comparing the geometric means of two samples is to infer whether they represent the same population geometric mean. In this case, the original data X should first be log-transformed, and the transformed data substituted into formulas (19.10), (19.11), and (19.12) to calculate the t value.

Example 19.8 Sera from 20 patients with leptospirosis were randomly divided into two groups and tested by agglutination with either a standard strain or an aquatic strain. The dilution factors (titers) measured are listed below. Do the mean titers of the two groups differ?

X1: Standard strain (11 persons) 100, 200, 400, 400, 400, 400, 800, 1600, 1600, 1600, 3200

X2: Aquatic strain (9 persons) 100, 100, 100, 200, 200, 200, 200, 400, 400

H0: μ1 = μ2

H1: μ1 ≠ μ2

α = 0.05

Take the base-10 logarithms of the two groups of data as the new variables x1 and x2.

x1: 2.000, 2.301, 2.602, 2.602, 2.602, 2.602, 2.903, 3.204, 3.204, 3.204, 3.505

x2: 2.000, 2.000, 2.000, 2.301, 2.301, 2.301, 2.301, 2.602, 2.602

Use the transformed data to calculate x̄1, S1², x̄2, and S2², and then use formulas (19.10), (19.11), and (19.12) to calculate the t value.

x̄1 = 2.794, S1² = 0.2043; x̄2 = 2.268, S2² = 0.0554

The degrees of freedom are ν = 11 + 9 - 2 = 18, and t0.01(18) = 2.878 from the table of critical t values. The calculated t = 3.150 > 2.878, so P < 0.01. H0 is rejected at the α = 0.05 level and H1 is accepted: the mean titers of the two groups differ, and that of the standard strain is higher than that of the aquatic strain.
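
The log-transform procedure of Example 19.8 can be sketched end to end. The raw titers used here are the antilogs of the eleven and nine log values listed above (for example 3.204 corresponds to 1600), and the t statistic assumes the same standard pooled-variance form as in Section III, since formulas (19.10) to (19.12) are not reproduced in this copy.

```python
from math import log10, sqrt
from statistics import mean, variance

# Titers recovered from the listed log values (10**2.000 = 100, ..., 10**3.505 = 3200).
standard = [100, 200, 400, 400, 400, 400, 800, 1600, 1600, 1600, 3200]
aquatic = [100, 100, 100, 200, 200, 200, 200, 400, 400]

# Step 1: log-transform the raw data.
x1 = [log10(v) for v in standard]
x2 = [log10(v) for v in aquatic]
n1, n2 = len(x1), len(x2)

# Step 2: means and sample variances of the transformed data.
x1_bar, x2_bar = mean(x1), mean(x2)
s1_sq, s2_sq = variance(x1), variance(x2)

# Step 3: pooled-variance t-test on the log scale (assumed standard form).
df = n1 + n2 - 2
s2_c = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df
t = (x1_bar - x2_bar) / sqrt(s2_c * (1 / n1 + 1 / n2))

print(f"x1 mean = {x1_bar:.3f}, S1^2 = {s1_sq:.4f}")
print(f"x2 mean = {x2_bar:.3f}, S2^2 = {s2_sq:.4f}")
print(f"t = {t:.3f}, df = {df}")
print("exceeds t0.01(18) = 2.878:", abs(t) >= 2.878)
```

Working on the log scale means the test compares geometric means: the antilog of each log-scale mean is the geometric mean titer of that group.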

From: http://www.med66.com/html/2005/8/hu6404551514850022505.html

 
