Chapter 8: Hypothesis Testing
Content Summary:
1. To infer unknown characteristics of a population whose distribution function is either completely unknown or known in form but with unknown parameters, we put forward a hypothesis about the population and then use a sample to decide whether to accept or reject it. This decision process is called hypothesis testing.
2. Steps for solving a hypothesis testing problem:
(1) Based on the practical problem, state the null hypothesis H0 and the alternative hypothesis H1;
(2) Fix the significance level $\alpha$ and the sample size $n$;
(3) Choose the test statistic and determine the form of the rejection region;
(4) Determine the rejection region from the condition $P\{\text{reject } H_0 \mid H_0 \text{ is true}\} \le \alpha$;
(5) Draw the sample and, based on the observed values, decide whether to accept H0 or reject H0.
3. Two types of errors:
(1) When H0 is actually true but the sample observations lead us to reject it, the error is called a Type I error, or the error of "rejecting the truth". Its probability is denoted by $\alpha$, that is, $\alpha = P\{\text{reject } H_0 \mid H_0 \text{ is true}\}$;
(2) When H0 is actually false but the sample observations lead us to accept it, the error is called a Type II error, or the error of "accepting a falsehood". Its probability is denoted by $\beta$, that is, $\beta = P\{\text{accept } H_0 \mid H_0 \text{ is false}\}$.
When the sample size is fixed, reducing the probability of a Type I error increases the probability of a Type II error, and reducing the probability of a Type II error increases the probability of a Type I error. With a fixed sample size it is impossible to reduce both probabilities at once; that requires a larger sample. In practice, a hypothesis test usually controls only the probability of a Type I error, keeping it small, and does not consider the probability of a Type II error.
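The trade-off between the two error probabilities can be illustrated numerically. The sketch below assumes a right-tailed z-test of H0: mu = 0 against the specific alternative mu = 0.5, with known sigma = 1 and n = 16. These numbers and the helper name `type2_error` are illustrative choices, not taken from the chapter.

```python
from math import sqrt
from statistics import NormalDist

def type2_error(alpha, mu1=0.5, sigma=1.0, n=16):
    """Return beta for a right-tailed z-test of H0: mu = 0 when in fact mu = mu1 > 0."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)      # reject H0 when Z >= z_alpha
    shift = mu1 * sqrt(n) / sigma       # mean of the statistic Z when mu = mu1
    return z.cdf(z_alpha - shift)       # P{accept H0 | mu = mu1}

# Fixed n: a smaller alpha gives a larger beta.
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}  beta = {type2_error(alpha):.3f}")

# Same alpha, larger sample: beta shrinks.
print(f"n = 64, alpha = 0.05  beta = {type2_error(0.05, n=64):.3f}")
```

With the sample size held at 16, beta grows as alpha shrinks; only the larger sample lowers beta without raising alpha.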
4. Table of hypothesis testing methods for the parameters of normal populations (significance level $\alpha$)
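5. The relationship between confidence intervals and hypothesis tests:
Consider the two-sided test of $H_0: \theta = \theta_0$ against $H_1: \theta \ne \theta_0$ at significance level $\alpha$. Suppose its acceptance region is
$\theta_L(x_1, \ldots, x_n) < \theta_0 < \theta_U(x_1, \ldots, x_n)$.
That is,
$P_{\theta_0}\{\theta_L(X_1, \ldots, X_n) < \theta_0 < \theta_U(X_1, \ldots, X_n)\} \ge 1 - \alpha$.
Since $\theta_0$ is arbitrary, for every $\theta$ we have
$P_{\theta}\{\theta_L(X_1, \ldots, X_n) < \theta < \theta_U(X_1, \ldots, X_n)\} \ge 1 - \alpha$.
Therefore $(\theta_L, \theta_U)$ is a confidence interval for the parameter $\theta$ with confidence level $1 - \alpha$. In other words, the lower and upper bounds of the acceptance region of the hypothesis test are the lower and upper confidence limits of the corresponding confidence interval.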
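This correspondence can be checked numerically. The sketch below assumes the simplest setting, a normal population with known sigma and a two-sided z-test. The sample values, sigma = 0.3, and the function names are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_interval(xs, sigma, alpha):
    """Two-sided confidence interval for mu at level 1 - alpha, sigma known."""
    half = NormalDist().inv_cdf(1 - alpha / 2) * sigma / sqrt(len(xs))
    m = mean(xs)
    return m - half, m + half

def z_test_accepts(xs, sigma, mu0, alpha):
    """True if the two-sided z-test of H0: mu = mu0 accepts H0."""
    z = (mean(xs) - mu0) * sqrt(len(xs)) / sigma
    return abs(z) < NormalDist().inv_cdf(1 - alpha / 2)

xs = [10.1, 9.7, 10.4, 10.0, 9.9, 10.3]    # made-up sample
lo, hi = z_interval(xs, sigma=0.3, alpha=0.05)
for mu0 in (9.7, 9.9, 10.1, 10.3, 10.5):
    inside = lo < mu0 < hi
    accepted = z_test_accepts(xs, 0.3, mu0, 0.05)
    print(f"mu0 = {mu0}: in interval = {inside}, H0 accepted = {accepted}")
```

For every candidate value mu0, the test accepts H0 exactly when mu0 lies inside the interval.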
Basic requirements:
1. Understand the concept of hypothesis testing and master the basic steps for solving a hypothesis testing problem.
2. Understand how the two types of errors in hypothesis testing arise and how their probabilities are defined.
3. Master the hypothesis tests for the parameters of one normal population and of two normal populations.
4. Understand the relationship between hypothesis tests and confidence intervals.
Difficult points of this chapter: the basic principle of hypothesis testing and the origin of the two types of errors.
Key point of this chapter: the steps for solving a hypothesis testing problem.
Analysis of difficult points:
1. The basic idea of hypothesis testing combines proof by contradiction with the basic principle of statistical inference (an event of small probability almost never occurs in a single trial). Concretely: given a hypothesis H0 to be tested, first presume that H0 is correct, and under this presumption construct an event A whose probability is very small when H0 holds, say $P\{A \mid H_0\} = \alpha$. Now perform one trial. If A occurs, a small-probability event has occurred, whereas by the principle it should almost never occur in a single trial. This "contradiction" indicates that the presumption "H0 is correct" is wrong, so H0 is rejected. Conversely, if the small-probability event does not occur, H0 is generally accepted.
2. The root cause of the two types of errors can be understood from two aspects. On the one hand, the principle says that a small-probability event almost never occurs in a single trial, not that it never occurs. So if H0 is correct but the small-probability event happens to occur in the trial, we decide to reject H0, and this decision is wrong. On the other hand, a hypothesis test infers the population from a sample, that is, the whole from a part, which in itself means that mistakes can never be ruled out entirely.
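A small simulation makes the first aspect concrete: when H0 is true, the small-probability event still occurs in a fraction of trials close to the significance level. The sketch below assumes a two-sided z-test with known sigma = 1, samples of size 10, and level 0.05. The function names, seed, and trial count are illustrative assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist

def rejects_h0(xs, mu0, sigma, alpha):
    """Two-sided z-test of H0: mu = mu0 with known sigma."""
    x_bar = sum(xs) / len(xs)
    z = (x_bar - mu0) * sqrt(len(xs)) / sigma
    return abs(z) >= NormalDist().inv_cdf(1 - alpha / 2)

def type1_rate(trials=20000, n=10, alpha=0.05, seed=0):
    """Fraction of samples drawn under a true H0 for which H0 is rejected."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]   # H0 (mu = 0) really holds
        hits += rejects_h0(xs, mu0=0.0, sigma=1.0, alpha=alpha)
    return hits / trials

print(f"observed rejection rate under H0: {type1_rate():.3f}")
```

Each rejection counted here is a Type I error, since every sample was generated with H0 true.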
Typical example Analysis:
Example 1: The assembly times (unit: minutes) of 20 randomly selected parts are as follows:
9.8 10.4 10.6 9.6 9.7 9.9 10.9 11.1 9.6 10.2
10.3 9.6 9.9 11.2 10.6 9.8 10.5 10.1 10.5 9.7
Suppose the assembly time follows a normal distribution $N(\mu, \sigma^2)$, with $\mu$ and $\sigma^2$ both unknown.
(1) Can the mean assembly time be considered equal to 10?
(2) Can the mean assembly time be considered significantly greater than 10?
Analysis: Hypothesis tests are either two-sided or one-sided, and the wording of the question determines which one applies.
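Solution: (1) From the problem, the population is $X \sim N(\mu, \sigma^2)$ with $\mu$ and $\sigma^2$ both unknown, and we are to test, at level $\alpha = 0.05$, the hypothesis
$H_0: \mu = 10$ against $H_1: \mu \ne 10$.
Because $\sigma^2$ is unknown, we use the t-test with the test statistic
$t = \dfrac{\bar{X} - 10}{S / \sqrt{n}}$.
Here $n = 20$, $\bar{x} = 10.2$, $s = 0.51$.
The rejection region is
$|t| \ge t_{\alpha/2}(n - 1) = t_{0.025}(19) = 2.093$.
The computed value is $|t| = 1.75 < 2.093$, so the test statistic does not fall in the rejection region. We therefore accept H0 at level $\alpha = 0.05$; that is, the mean assembly time can be considered to be 10.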
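(2) From the problem, the population is $X \sim N(\mu, \sigma^2)$ with $\mu$ and $\sigma^2$ both unknown, and we are to test, at level $\alpha = 0.05$, the hypothesis
$H_0: \mu \le 10$ against $H_1: \mu > 10$.
Because $\sigma^2$ is unknown, we use the t-test with the test statistic
$t = \dfrac{\bar{X} - 10}{S / \sqrt{n}}$.
Here $n = 20$, $\bar{x} = 10.2$, $s = 0.51$.
The rejection region is $t \ge t_{\alpha}(n - 1) = t_{0.05}(19) = 1.729$. The computed value is $t = 1.75 > 1.729$, so the test statistic falls in the rejection region. We therefore reject H0 at level $\alpha = 0.05$ and conclude that the mean assembly time is significantly greater than 10.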
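The arithmetic of Example 1 can be reproduced with the Python standard library. In the sketch below the critical values are hard-coded from a t table: 1.729 is the value quoted in the solution, and 2.093 is the standard table value of $t_{0.025}(19)$. The variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

times = [9.8, 10.4, 10.6, 9.6, 9.7, 9.9, 10.9, 11.1, 9.6, 10.2,
         10.3, 9.6, 9.9, 11.2, 10.6, 9.8, 10.5, 10.1, 10.5, 9.7]
mu0 = 10.0
n = len(times)
x_bar = mean(times)
s = stdev(times)                      # sample standard deviation (n - 1 divisor)
t = (x_bar - mu0) / (s / sqrt(n))     # observed value of the t statistic

T_TWO_SIDED = 2.093   # t_{0.025}(19), standard table value
T_ONE_SIDED = 1.729   # t_{0.05}(19), as quoted in the solution

print(f"x_bar = {x_bar:.2f}, s = {s:.2f}, t = {t:.2f}")
print("two-sided test rejects H0:", abs(t) >= T_TWO_SIDED)
print("right-tailed test rejects H0:", t >= T_ONE_SIDED)
```

The same observed statistic leads to different decisions in parts (1) and (2) because the one-sided test places the whole of the 0.05 rejection probability in the upper tail, which lowers the critical value.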
Example 2: The resistances (unit: Ω) of samples taken from two batches of electronic devices are:
Group A (x): 0.140 0.138 0.143 0.142 0.144 0.137
Group B (y): 0.135 0.140 0.142 0.136 0.138 0.140
Suppose the resistances of the two batches follow the normal distributions $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$ with all parameters unknown, and that the two samples are independent. At level $\alpha = 0.05$, can the two batches of electronic devices be considered to have the same resistance?
Analysis: Before carrying out a hypothesis test, read the question carefully and identify both the hypothesis to be tested and the premises the test requires. Here we need to test whether the means of two independent normal populations are equal, and that test requires the two population variances to be equal. So we first test whether the two population variances are equal; if equality of the variances is accepted, we then test whether the means are equal.
Solution: Let X denote the resistance of the devices in batch A and Y that of the devices in batch B, with $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$, where $\mu_1, \mu_2, \sigma_1^2, \sigma_2^2$ are all unknown.
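(1) At level $\alpha = 0.05$, test the hypothesis
$H_0: \sigma_1^2 = \sigma_2^2$ against $H_1: \sigma_1^2 \ne \sigma_2^2$.
Use the F-test with the test statistic
$F = S_1^2 / S_2^2$.
Here $n_1 = n_2 = 6$, and the rejection region is
$F \ge F_{0.025}(5, 5) = 7.15$ or $F \le F_{0.975}(5, 5) = 1/7.15 \approx 0.140$.
The computed value is $F = 1.108$. Since $0.140 < 1.108 < 7.15$, the test statistic does not fall in the rejection region, so at level $\alpha = 0.05$ we accept the hypothesis that the resistance variances of the two batches are equal.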
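(2) Given that the two population variances are equal, test at level $\alpha = 0.05$ the hypothesis $H_0: \mu_1 = \mu_2$ against $H_1: \mu_1 \ne \mu_2$ using the t-test. The test statistic is
$t = \dfrac{\bar{X} - \bar{Y}}{S_w \sqrt{1/n_1 + 1/n_2}}$, where $S_w^2 = \dfrac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}$.
Here $n_1 = n_2 = 6$.
The rejection region is
$|t| \ge t_{0.025}(10) = 2.2281$.
The computed value is $|t| = 1.3958 < 2.2281$, which does not fall in the rejection region. We therefore accept H0 at level $\alpha = 0.05$ and conclude that the mean resistances of the two batches of devices are equal.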
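Both steps of Example 2 can be recomputed from the raw data. The sketch below hard-codes the critical values 7.15 and 2.2281 quoted in the solution; the variable names are illustrative. Without intermediate rounding the t statistic comes out near 1.37 rather than the 1.3958 reported above, which appears to result from rounding the sample means and pooled standard deviation before dividing. The decision is the same either way.

```python
from math import sqrt
from statistics import mean, variance

a = [0.140, 0.138, 0.143, 0.142, 0.144, 0.137]   # group A (x)
b = [0.135, 0.140, 0.142, 0.136, 0.138, 0.140]   # group B (y)
n1, n2 = len(a), len(b)
s1_sq, s2_sq = variance(a), variance(b)           # sample variances (n - 1 divisor)

# Step 1: F-test of H0: sigma1^2 = sigma2^2
F_UPPER = 7.15            # F_{0.025}(5, 5), as quoted in the solution
F_LOWER = 1 / F_UPPER     # F_{0.975}(5, 5)
f = s1_sq / s2_sq
equal_var_accepted = F_LOWER < f < F_UPPER

# Step 2: pooled two-sample t-test of H0: mu1 = mu2
T_CRIT = 2.2281           # t_{0.025}(10), as quoted in the solution
sw = sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2))
t = (mean(a) - mean(b)) / (sw * sqrt(1 / n1 + 1 / n2))

print(f"F = {f:.3f}, equal variances accepted: {equal_var_accepted}")
print(f"t = {t:.3f}, equal means accepted: {abs(t) < T_CRIT}")
```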
Example 3: Two machines produce metal parts. Samples of sizes $n_1 = 60$ and $n_2 = 40$ are drawn from the parts produced by the two machines, and the sample variances $s_1^2$ and $s_2^2$ of the part weights (unit: kg) are measured. Suppose the two samples are independent and the two populations follow the normal distributions $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$, with all parameters unknown. Question: at level $\alpha$, can the variance of the weights of parts produced by the first machine be considered greater than that of parts produced by the second machine?
Solution: As the question requires, test the hypothesis
$H_0: \sigma_1^2 \le \sigma_2^2$ against $H_1: \sigma_1^2 > \sigma_2^2$.
Use the F-test with the test statistic
$F = S_1^2 / S_2^2$.
Here $n_1 = 60$ and $n_2 = 40$.
The rejection region is
$F \ge F_{\alpha}(59, 39)$.
The computed value of F does not fall in the rejection region, so H0 is accepted at level $\alpha$. That is, the variance of the weights of parts produced by the first machine cannot be considered greater than that of the second machine.
Self-Test questions
I. Multiple-choice questions
1. A hypothesis H0 about the mean of a normal population is tested at a given significance level, and H0 is accepted. If the test is carried out at another given level, which of the following conclusions is correct? ( )
(A) Accept H0 (B) Reject H0
(C) H0 may be accepted or rejected (D) Neither accept nor reject H0
2. Let the population X ~ , where is known and is unknown, and let ( ) be a sample from X. For the test of the hypothesis , the rejection region is ( )
(A) (B)
(C) (D)
3. In hypothesis testing, let H0 denote the null hypothesis to be tested. A Type I error is ( ).
(A) H0 is true, accept H0 (B) H0 is not true, reject H0
(C) H0 is true, reject H0 (D) H0 is not true, accept H0
II. Fill-in-the-blank questions
4. Let be a sample from the population , with unknown, and let , . Then the test statistic used in the t-test of the hypothesis is ___.
5. Let the population , with known, and let be a sample from the population X. The statistic used to test the hypothesis is ___; when holds, it follows the ______ distribution.
III. Problems
6. A chemical plant packs fertilizer with an automatic packing machine. On a certain day the weights (kg) of 9 bags of fertilizer were measured as follows:
49.7 49.8 50.3 50.5 49.7 50.1 49.9 50.5 50.4
Given that the bag weight follows a normal distribution, can the average weight of a bag be considered to be 50 kg ( )?
7. A machine processes parts whose specified length is 100 cm, with a standard deviation that must not exceed 2 cm.
The machine's operation is checked daily. On a certain day 10 parts are drawn, and the sample mean length $\bar{x}$ (cm) and the sample standard deviation $s$ (cm) are measured. The part length follows a normal distribution. Is the machine working normally ( )? (Hint: let the part length be X; first test a hypothesis about one parameter, then a hypothesis about the other.)
8. A metal can be smelted by two methods. A sample is taken from each, and the impurity content of the product (unit: grams) is as follows:
A: 26.9 22.8 25.7 23.0 22.3 24.2 26.1 26.4 27.2 30.2 24.5 29.5 25.1
B: 22.6 22.5 20.6 23.5 24.3 21.9 20.6 23.2 23.4
The impurity content of the product is known to follow a normal distribution. Determine:
(1) whether the variances of the impurity content under the two methods are equal;
(2) whether the impurity content of products from smelting method A is no greater than that from method B.
Answers: 1. A 2. D 3. C
4.5.
6. Yes
7. The machine was working normally that day
8. (1) No significant difference
(2) The impurity content of products from smelting method A is greater than that from method B
From:http://lxy.cumtb.edu.cn/gailvtongjidaoxue/chap8.htm