U-test and t-test

Last Update:2018-12-07 Source: Internet

Author: User

The U-test and t-test can be used to compare a sample mean with a population mean, or to compare two sample means. In theory, the samples must come from a normally distributed population. However, when the sample size n is large, or when n is small but the population standard deviation σ is known, the U-test can be applied. When n is small and σ is unknown, the t-test can be applied, but the sample must come from a normally distributed population. When two sample means are compared, the two population variances must also be equal.

I. Comparison between a sample mean and a population mean

The purpose of the comparison is to infer whether the mean μ of the unknown population represented by the sample differs from the known population mean μ0. Generally, a theoretical value, a standard value, or a stable value obtained from a large number of observations is treated as μ0. Whether the U-test or the t-test is used depends on the sample size n and on whether the population standard deviation σ is known.

(1) The U-test is used when σ is known, or when σ is unknown but n is large enough [in which case the sample standard deviation S is used as an estimate of σ and substituted into formula (19.6)].

$$u = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} \tag{19.6}$$

The conclusion is then drawn from the calculated u according to the relationships shown in Table 19-3.

Table 19-3 |u| values, P values, and statistical conclusions

α	|u| value (two-sided)	|u| value (one-sided)	P value	Statistical conclusion
0.05	<1.96	<1.645	>0.05	Do not reject H0. The difference is not statistically significant.
0.05	≥1.96	≥1.645	≤0.05	Reject H0 and accept H1. The difference is statistically significant.
0.01	≥2.58	≥2.33	≤0.01	Reject H0 and accept H1. The difference is highly statistically significant.

Example 19.3 According to a large number of surveys, the mean pulse rate of healthy adult men is 72 beats/minute, with a standard deviation of 6.0 beats/minute. A doctor randomly examined 25 healthy adult men in a mountainous area and obtained a mean pulse rate of 74.2 beats/minute. Can the pulse rate of adult men in the mountainous area be considered higher than that of adult men in general?

Here the mean of 72 beats/minute and the standard deviation of 6.0 beats/minute obtained from a large number of surveys are taken as the population mean μ0 and the population standard deviation σ. The sample mean X̄ is 74.2 beats/minute and the sample size n is 25.

H0: μ = μ0

H1: μ > μ0

α = 0.05 (one-sided test)

The calculated statistic is u = (74.2 − 72)/(6.0/√25) = 1.833 > 1.645, so P < 0.05. H0 is rejected at the α = 0.05 level, and the pulse rate of healthy adult men in the mountainous area can be considered higher than that of adult men in general.
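The calculation in Example 19.3 can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `one_sample_u` and the constant name are illustrative choices, not part of the original text. It should reproduce the u ≈ 1.833 reported above.

```python
import math


def one_sample_u(sample_mean, mu0, sigma, n):
    """u statistic for a sample mean against a known population mean and SD."""
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - mu0) / standard_error


# Values from Example 19.3
u = one_sample_u(sample_mean=74.2, mu0=72.0, sigma=6.0, n=25)

# One-sided critical value for alpha = 0.05, taken from Table 19-3
U_CRITICAL_ONE_SIDED_005 = 1.645
reject_h0 = u >= U_CRITICAL_ONE_SIDED_005

print(f"u = {u:.3f}, reject H0: {reject_h0}")
```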

(2) The t-test is used when σ is unknown and n is small.

The conclusion is drawn from the calculated statistic t according to the relationships shown in Table 19-4.

Table 19-4 |t| values, P values, and statistical conclusions

α	|t| value	P value	Statistical conclusion
0.05	<t0.05(v)	>0.05	Do not reject H0. The difference is not statistically significant.
0.05	≥t0.05(v)	≤0.05	Reject H0 and accept H1. The difference is statistically significant.
0.01	≥t0.01(v)	≤0.01	Reject H0 and accept H1. The difference is highly statistically significant.

Example 19.4 Suppose the population standard deviation σ in Example 19.3 is unknown, but the sample standard deviation has been obtained, S = 6.5 beats/minute, and the remaining data are the same as in Example 19.3.

The difference from Example 19.3 is that σ is unknown, so the t-test is used.

H0: μ = μ0

H1: μ > μ0

α = 0.05 (one-sided test)

In this example, the degrees of freedom are v = 25 − 1 = 24, and the one-sided t critical value table (Table 19-1) gives t0.05(24) = 1.711. The calculated statistic is t = (74.2 − 72)/(6.5/√25) = 1.692 < 1.711, so P > 0.05. H0 is not rejected at the α = 0.05 level, and the pulse rate of adult men in the mountainous area cannot yet be considered higher than that of adult men in general.
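As a rough sketch, the one-sample t statistic of Example 19.4 can be computed as follows. The function name `one_sample_t` is illustrative, and the critical value 1.711 is copied from the t critical value table cited in the text rather than computed.

```python
import math


def one_sample_t(sample_mean, mu0, sample_sd, n):
    """t statistic and degrees of freedom when the population SD is unknown."""
    standard_error = sample_sd / math.sqrt(n)
    t_value = (sample_mean - mu0) / standard_error
    return t_value, n - 1


# Values from Example 19.4
t, df = one_sample_t(sample_mean=74.2, mu0=72.0, sample_sd=6.5, n=25)

# One-sided critical value t0.05(24) quoted in the text
T_CRITICAL_ONE_SIDED_005 = 1.711
reject_h0 = t >= T_CRITICAL_ONE_SIDED_005

print(f"t = {t:.3f}, df = {df}, reject H0: {reject_h0}")
```

With the same sample mean, replacing the known σ = 6.0 by the estimated S = 6.5 and using the t distribution is enough to move the result from "reject" to "do not reject".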

II. Comparison of paired data

Paired designs are commonly used in medical research. There are four main scenarios: ① data from the same subject before and after treatment; ② data from two parts of the same subject; ③ results of testing the same sample with two methods (instruments, etc.); ④ data from matched pairs of subjects, each member of a pair receiving one of two treatments. Scenario ① is used to infer whether the treatment is effective; scenarios ②, ③, and ④ are used to infer whether the results of the two treatments or methods differ.

$$t = \frac{\bar{d} - 0}{S_{\bar{d}}} = \frac{\bar{d}}{S_d / \sqrt{n}} \tag{19.8}$$

In the formula, 0 is the population mean of the differences: if the treatment has no effect, or the two methods do not differ, the population mean of the differences is 0. d̄ is the mean of the paired differences d (d is short for difference), calculated as in formula (18.1). S_d̄ is the standard error of the mean difference, S_d is the standard deviation of the differences, calculated as in formula (18.3), and n is the number of pairs.

Because the calculated statistic is t, the conclusion is drawn according to the relationships shown in Table 19-4.

Example 19.5 Nine patients with hypertension were treated with a certain drug. Their diastolic blood pressure before and after treatment is shown in Table 19-5. Did the diastolic blood pressure change after treatment?

Table 19-5 Diastolic blood pressure (kPa) of hypertensive patients before and after treatment with a certain drug

Patient ID	Before treatment	After treatment	Difference d	d²
1	12.8	11.7	1.1	1.21
2	13.1	13.1	0.0	0.00
3	14.9	14.4	0.5	0.25
4	14.4	13.6	0.8	0.64
5	13.6	13.1	0.5	0.25
6	13.1	13.3	-0.2	0.04
7	13.3	12.8	0.5	0.25
8	14.1	13.6	0.5	0.25
9	13.3	12.3	1.0	1.00
Total			4.7	3.89

H0: the diastolic blood pressure does not change after treatment, that is, μd = 0

H1: the diastolic blood pressure changes after treatment, that is, μd ≠ 0

α = 0.05

Degrees of freedom v = n − 1 = 8. From Table 19-5, d̄ = 4.7/9 ≈ 0.522 and S_d ≈ 0.424, so t = d̄/(S_d/√n) ≈ 3.70. The t critical value table gives t0.05(8) = 2.306 and t0.01(8) = 3.355, so t > t0.01(8) and P < 0.01. H0 is rejected at the α = 0.05 level and H1 is accepted: the diastolic blood pressure can be considered to have changed after treatment, that is, the drug lowers blood pressure.
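The paired calculation can be checked with a short Python sketch using the before and after values from Table 19-5. The function name `paired_t` and the variable names are illustrative; only the standard library is used, and the critical value is the tabled t0.01(8) quoted above.

```python
import math
import statistics

# Diastolic blood pressure (kPa) from Table 19-5
before = [12.8, 13.1, 14.9, 14.4, 13.6, 13.1, 13.3, 14.1, 13.3]
after = [11.7, 13.1, 14.4, 13.6, 13.1, 13.3, 12.8, 13.6, 12.3]


def paired_t(first, second):
    """Paired t statistic: mean difference divided by its standard error."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    mean_d = statistics.mean(differences)
    sd_d = statistics.stdev(differences)  # sample SD, n - 1 in the denominator
    standard_error = sd_d / math.sqrt(n)
    return mean_d / standard_error, n - 1


t, df = paired_t(before, after)

# Two-sided critical value t0.01(8) quoted in the text
T_CRITICAL_001 = 3.355
print(f"t = {t:.2f}, df = {df}, exceeds t0.01: {abs(t) >= T_CRITICAL_001}")
```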

III. Comparison of the means of two samples in a completely randomized design

This is also known as group comparison. The purpose is to infer whether the population means μ1 and μ2 represented by the two samples are equal. Depending on the sample sizes, either the U-test or the t-test is used.

(1) The U-test can be used when the two sample sizes n1 and n2 are both large enough, for example both greater than 50 or 100.

$$u = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} \tag{19.9}$$

The calculated statistic is u, and the conclusion is drawn according to the relationships shown in Table 19-3.

Example 19.6 The red blood cell counts of healthy adults in a certain area were sampled. The 360 men had a mean of 4.660 × 10^12/L and a standard deviation of 0.575 × 10^12/L; the 255 women had a mean of 4.178 × 10^12/L and a standard deviation of 0.291 × 10^12/L. Do the mean red blood cell counts of men and women in the area differ?

H0: μ1 = μ2

H1: μ1 ≠ μ2

α = 0.05

Here, X̄1 = 4.660 × 10^12/L, S1 = 0.575 × 10^12/L, n1 = 360;

X̄2 = 4.178 × 10^12/L, S2 = 0.291 × 10^12/L, n2 = 255.

The calculated u = 13.63 > 2.58, so P < 0.01. H0 is rejected at the α = 0.05 level and H1 is accepted. The mean red blood cell counts of men and women in the area differ, and the mean for men is higher than that for women.
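A minimal sketch of the large-sample comparison in Example 19.6 follows. It assumes the usual form of the two-sample u statistic, in which each sample variance is divided by its own sample size; that form reproduces the reported u = 13.63. The function name `two_sample_u` is illustrative, and the means and standard deviations are in units of 10^12/L.

```python
import math


def two_sample_u(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Large-sample u statistic for the difference between two means."""
    standard_error = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return (mean_1 - mean_2) / standard_error


# Values from Example 19.6 (units: 10^12/L)
u = two_sample_u(mean_1=4.660, sd_1=0.575, n_1=360,
                 mean_2=4.178, sd_2=0.291, n_2=255)

# Two-sided critical value for alpha = 0.01, from Table 19-3
U_CRITICAL_TWO_SIDED_001 = 2.58
print(f"u = {u:.2f}, exceeds 2.58: {abs(u) >= U_CRITICAL_TWO_SIDED_001}")
```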

(2) The t-test can be used when the two sample sizes n1 and n2 are relatively small, provided the two population variances are equal, that is, the variances are homogeneous (homoscedasticity). If the two sample variances differ markedly and the difference is statistically significant, the t' test (an approximate t-test for unequal variances) is required instead.

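$$t = \frac{\bar{X}_1 - \bar{X}_2}{S_{\bar{X}_1 - \bar{X}_2}} \tag{19.10}$$

$$S_{\bar{X}_1 - \bar{X}_2} = \sqrt{S_c^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} \tag{19.11}$$

$$S_c^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2} \tag{19.12}$$

In the formulas, S_{X̄1−X̄2} is the standard error of the difference between the two sample means, and S_c² is the pooled estimate of the variance (combined estimated variance). The calculated statistic is t, and the conclusion is drawn according to the relationships shown in Table 19-4.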

Example 19.7 A doctor measured pelvic X-ray data of 50 normal Yao women and 50 normal Dong women in Guangxi. For the anteroposterior diameter of the pelvic inlet, the Yao women had a mean of 12.002 cm and a standard deviation of 0.948 cm; the corresponding values for the Dong women were 11.456 cm and 1.215 cm. Does the anteroposterior diameter of the pelvic inlet differ between the two groups of women?

H0: μ1 = μ2

H1: μ1 ≠ μ2

α = 0.05

It is known that n1 = n2 = 50, X̄1 = 12.002 cm, S1 = 0.948 cm;

X̄2 = 11.456 cm, S2 = 1.215 cm.

In this example, the degrees of freedom are v = n1 + n2 − 2 = 98. The t critical value table has no entry for 98 degrees of freedom, so the value can be obtained by interpolation (omitted) or approximated with v = 100: t0.05(100) = 1.984 and t0.01(100) = 2.626. Here t = 2.505 > t0.05(100), so P < 0.05. H0 is rejected at the α = 0.05 level and H1 is accepted. The anteroposterior diameter of the pelvic inlet differs between Yao and Dong women in Guangxi, and that of the former is greater than that of the latter.
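The pooled-variance t statistic of Example 19.7 can be sketched from the summary statistics alone. This assumes the standard pooled form (a degrees-of-freedom-weighted average of the two sample variances), which reproduces the reported t = 2.505. The function name `pooled_t` is illustrative, and the critical value is the tabled t0.05(100) used in the text as an approximation for 98 degrees of freedom.

```python
import math


def pooled_t(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Two-sample t statistic assuming equal population variances."""
    df = n_1 + n_2 - 2
    pooled_variance = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_1 + 1 / n_2))
    return (mean_1 - mean_2) / standard_error, df


# Values from Example 19.7 (cm)
t, df = pooled_t(mean_1=12.002, sd_1=0.948, n_1=50,
                 mean_2=11.456, sd_2=1.215, n_2=50)

# Two-sided critical value t0.05(100), used in place of t0.05(98)
T_CRITICAL_005 = 1.984
print(f"t = {t:.3f}, df = {df}, reject H0: {abs(t) >= T_CRITICAL_005}")
```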

IV. Comparison of the geometric means of two samples in a completely randomized design

Some medical data form a geometric series (for example, serial dilution titers) or follow a log-normal distribution, and the geometric mean is the appropriate measure of their average level. The purpose of comparing the geometric means of two samples is to infer whether they represent the same population geometric mean. In this case, the original data X are first transformed to logarithms, and the transformed data are substituted into formulas (19.10), (19.11), and (19.12) to calculate the t value.

Example 19.8 The sera of 20 patients with leptospirosis were randomly divided into two groups and tested by agglutination with either the standard strain or a water-derived strain. The dilution multiples measured are listed below. Do the mean titers of the two groups differ?

X1: Standard strain (11 persons) 100, 200, 400, 400, 400, 400, 800, 1600, 1600, 1600, 3200

X2: Water-derived strain (9 persons) 100, 100, 100, 200, 200, 200, 200, 400, 400

H0: μ1 = μ2

H1: μ1 ≠ μ2

α = 0.05

Take the base-10 logarithms of the two groups of data as the new variables x1 and x2.

x1: 2.000, 2.301, 2.602, 2.602, 2.602, 2.602, 2.903, 3.204, 3.204, 3.204, 3.505

x2: 2.000, 2.000, 2.000, 2.301, 2.301, 2.301, 2.301, 2.602, 2.602

Use the transformed data to calculate the means x̄1 and x̄2 and the variances S1² and S2², and then use formulas (19.10), (19.11), and (19.12) to calculate the t value.

x̄1 = 2.794, S1² = 0.2043; x̄2 = 2.268, S2² = 0.0554

Degrees of freedom v = 11 + 9 − 2 = 18. The t critical value table gives t0.01(18) = 2.878. Here t = 3.150 > 2.878, so P < 0.01. H0 is rejected at the α = 0.05 level and H1 is accepted. The mean titers of the two groups differ, and the titer obtained with the standard strain is higher than that obtained with the water-derived strain.
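The log-transform procedure of Example 19.8 can be sketched as follows. The titers for the first group are recovered from the logarithms listed above, the pooled-variance t statistic is assumed to take its standard form, and the function and variable names are illustrative. The geometric means are obtained by back-transforming the means of the logarithms.

```python
import math
import statistics

# Dilution multiples; group 1 is recovered from the listed log10 values
group_1 = [100, 200, 400, 400, 400, 400, 800, 1600, 1600, 1600, 3200]
group_2 = [100, 100, 100, 200, 200, 200, 200, 400, 400]


def pooled_t_on_logs(titers_1, titers_2):
    """Pooled-variance t statistic computed on log10-transformed data."""
    logs_1 = [math.log10(x) for x in titers_1]
    logs_2 = [math.log10(x) for x in titers_2]
    n_1, n_2 = len(logs_1), len(logs_2)
    mean_1, mean_2 = statistics.mean(logs_1), statistics.mean(logs_2)
    var_1, var_2 = statistics.variance(logs_1), statistics.variance(logs_2)
    df = n_1 + n_2 - 2
    pooled_variance = ((n_1 - 1) * var_1 + (n_2 - 1) * var_2) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_1 + 1 / n_2))
    t_value = (mean_1 - mean_2) / standard_error
    # Back-transform the log means to get the geometric means
    return t_value, df, 10 ** mean_1, 10 ** mean_2


t, df, geometric_mean_1, geometric_mean_2 = pooled_t_on_logs(group_1, group_2)
print(f"t = {t:.2f}, df = {df}")
print(f"geometric means: {geometric_mean_1:.0f} vs {geometric_mean_2:.0f}")
```

The t value does not depend on the base of the logarithm, because changing the base rescales the difference in means and its standard error by the same factor.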

From: http://www.med66.com/html/2005/8/hu6404551514850022505.html

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
