http://blog.renren.com/share/223170925/14708690013
Common significance tests
1. t test
Suitable for comparing small samples of measurement (quantitative) data that are normally distributed and have homogeneous variances. It covers three cases: comparing a sample mean with a population mean, comparing paired data, and comparing the means of two independent samples. The formulas for the three cases differ and must not be confused.
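Since the three formulas are easy to mix up, here is a minimal sketch of the three statistics side by side, using only the Python standard library. The function names are my own, and the two-sample version assumes the pooled (equal-variance) form described above.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """t statistic and degrees of freedom for a sample mean against a known mean mu0."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - mu0) / se, n - 1

def paired_t(before, after):
    """Paired t statistic: a one-sample test on the pairwise differences."""
    diffs = [a - b for a, b in zip(after, before)]
    return one_sample_t(diffs, 0.0)

def two_sample_t(x, y):
    """Pooled-variance t statistic for two independent samples."""
    n1, n2 = len(x), len(y)
    pooled = ((n1 - 1) * statistics.variance(x)
              + (n2 - 1) * statistics.variance(y)) / (n1 + n2 - 2)
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (statistics.mean(x) - statistics.mean(y)) / se, n1 + n2 - 2
```

Each function returns the statistic together with its degrees of freedom; the p-value would then be read from a t table or a statistics package.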
2. t′ test
The application conditions are largely the same as for the t test, but the t′ test is used when the variances of the two groups are not homogeneous. Its formula is in effect the t-test formula corrected for unequal variances.
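One common form of this correction is Welch's statistic with the Welch-Satterthwaite degrees of freedom. The sketch below assumes that form; textbooks also give other variants of the t′ test (for example Cochran-Cox), so treat this as an illustration and not as the only formula.

```python
import math
import statistics

def welch_t(x, y):
    """Unequal-variance t statistic with Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(x), len(y)
    v1 = statistics.variance(x) / n1  # squared standard error of mean(x)
    v2 = statistics.variance(y) / n2  # squared standard error of mean(y)
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df
```

When the two samples have equal size and equal variance, the degrees of freedom reduce to n1 + n2 - 2, the same as in the ordinary t test.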
3. u test
The application conditions are essentially the same as for the t test; the u test is used for large samples and the t test for small samples. The t test can replace the u test.
4. Analysis of variance
Used to compare the means of several groups of measurement data that are normally distributed with homogeneous variances. The common cases are comparison of multiple sample means under single-factor grouping and under two-factor grouping. Analysis of variance first tests the overall difference among the groups; if the overall difference is significant, pairwise comparisons between groups follow, using the q test or the LSD test.
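The overall comparison in the single-factor case comes down to one F ratio: between-group mean square over within-group mean square. A minimal sketch, with a function name of my own choosing:

```python
import statistics

def one_way_anova_f(groups):
    """F statistic for single-factor ANOVA, with its two degrees of freedom."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((v - statistics.mean(g)) ** 2 for v in g) for g in groups)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within
```

Only if this F is significant do the pairwise comparisons come into play.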
5. χ² test
The main significance test for count data. It is used to compare two or more percentages (rates). The common cases are fourfold-table (2×2) data, paired data, tables larger than 2 rows by 2 columns, and the grouped χ² test.
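For an unpaired table of counts, the statistic compares each observed count with the count expected from the row and column totals. The sketch below is the plain Pearson form without continuity correction; it does not cover the paired-data case, which uses a different formula.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an R x C count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df
```

For a fourfold table the degrees of freedom are 1, so the statistic is compared with 3.84 at the 5% level.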
6. Zero-reaction test
Used for count data. It is a special form of the χ² test for the case where a rate of 0 or 100% occurs in the experimental group or the control group. It is a direct probability calculation method.
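I take "direct probability calculation" here to mean the exact hypergeometric probability of a fourfold table with fixed margins, as in Fisher's exact test; that reading is my assumption. With a zero cell the observed table is already the most extreme one in its direction, so its probability is the one-sided p-value.

```python
from math import comb

def table_probability(a, b, c, d):
    """Exact probability of the 2x2 table [[a, b], [c, d]] given its margins."""
    return comb(a + b, a) * comb(c + d, c) / comb(a + b + c + d, a + c)

# Hypothetical data: 0 of 5 events in the treated group, 4 of 5 in the control group.
p_one_sided = table_probability(0, 5, 4, 1)
```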
7. Sign test, rank-sum test and Ridit analysis
All three are non-parametric methods whose shared advantages are that they are simple, fast and practical. They can be used for data that are not normally distributed, data whose distribution is unknown, and semi-quantitative data. Their main disadvantage is that they tend to lose information contained in the data. So when the data are normally distributed, or can be transformed to normality, these methods should be avoided as far as possible.
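The sign test shows both the simplicity and the information loss: it keeps only the sign of each paired difference and throws away the size. A minimal sketch of the exact two-sided version, with a function name of my own:

```python
from math import comb

def sign_test_p(diffs):
    """Exact two-sided sign test p-value; zero differences are dropped."""
    positives = sum(d > 0 for d in diffs)
    negatives = sum(d < 0 for d in diffs)
    n = positives + negatives
    k = min(positives, negatives)
    # Probability of k or fewer of the rarer sign under Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

Seven positive differences and one negative give the same p-value whether the differences are tiny or huge, which is exactly the information the test ignores.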
8. Hotelling test
Used for normally distributed measurement data, to test the significance of the overall difference between two groups across multiple indicators.
Discussion of test methods in econometrics
Econometrics has many test methods, and the test statistic used depends on the assumptions made. Here I discuss several of the more common ones.
Before discussing the individual tests, we need to know why we test and what we test. Without that, the exercise is either absurd or merely mechanical. The purpose of testing is to establish genuine cause and effect; the core of econometrics is to say what the causal relationship is. If there is no causal connection between two things, the cause we identified is wrong, and the result is meaningless or nearly so. Testing is how we confirm our results, and it is a key factor in judging their value. That is why we do statistical tests.
t test: the t test mainly tests the significance of an individual OLS parameter estimate. What does significance mean? We set a tolerance, a limit on how often we allow ourselves to be wrong. Errors are of two kinds: (1) the hypothesis is true but we reject it, and (2) the hypothesis is false but we accept it. Statistical tests are aimed mainly at the first kind. In econometrics the tolerance is usually 5%, meaning we accept a 5% probability of making the first kind of error. That statement is not exact, but it is easy to understand. The t statistic is built the same way a normal variable is standardized: the estimate minus the hypothesized value, divided by the standard error of the estimate. The hypothesized value is usually 0, which means there is no causal relationship. Under the classical assumptions this t statistic follows a t distribution. The t distribution resembles the normal distribution, especially in large samples; as a rule of thumb, with more than 120 observations it can be treated as normal.
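To make the "estimate minus hypothesized value, divided by standard error" recipe concrete, here is a sketch for the slope of a regression with a single regressor. The restriction to one regressor and the function name are my simplifications; with several regressors the standard error comes from the full covariance matrix.

```python
import math

def slope_t_stat(x, y, hypothesized=0.0):
    """OLS slope, its standard error, and the t statistic against a hypothesized value."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    sigma2 = sum(e * e for e in residuals) / (n - 2)  # error variance estimate
    std_err = math.sqrt(sigma2 / sxx)
    return slope, std_err, (slope - hypothesized) / std_err
```

The resulting statistic is compared with the t distribution with n - 2 degrees of freedom.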
F statistic: the F test is the main joint test. Its purpose is to test whether a set of explanatory variables taken together has an effect on the outcome. The F statistic is built mainly from three quantities: SSR, SST and SSE. One drawback of this test is that it is only valid under the classical assumptions.
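In one common textbook notation, which I assume here (SSR as the sum of squared residuals, SST as the total sum of squares, q restrictions, k regressors in the unrestricted model, n observations), the statistic can be written as follows. Other books swap the meanings of SSR and SSE, so check the notation before using it.

```latex
F = \frac{(SSR_r - SSR_{ur})/q}{SSR_{ur}/(n-k-1)} \sim F_{q,\,n-k-1},
\qquad
R^2 = 1 - \frac{SSR}{SST},
\qquad
F_{\text{overall}} = \frac{R^2/k}{(1-R^2)/(n-k-1)}
```

The subscripts r and ur mark the restricted and unrestricted models; the last expression is the special case where all k slopes are tested at once.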
LM test: this test serves the same purpose as the F test; both test joint significance. The difference is that the F statistic follows an F distribution, while the LM statistic follows a chi-square distribution. A chi-square variable is the sum of squares of independent standard normal variables, and an F variable is the ratio of two chi-square variables, each divided by its degrees of freedom; the numerator and denominator must be independent, which is why the F test is limited in scope. The statistic is LM = n*R^2, where R^2 comes from the auxiliary regression of the residuals on the explanatory variables.
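Written out, the two distributions and the large-sample behaviour of the LM statistic look like this. The symbols are my own labels: q is the number of restrictions and R_u^2 is the R-squared of the auxiliary regression.

```latex
\chi^2_q = \sum_{i=1}^{q} Z_i^2, \quad Z_i \overset{iid}{\sim} N(0,1);
\qquad
F_{k_1,k_2} = \frac{X_1/k_1}{X_2/k_2}, \quad X_1 \sim \chi^2_{k_1},\; X_2 \sim \chi^2_{k_2},\; X_1 \perp X_2;
\qquad
LM = n R_u^2 \overset{a}{\sim} \chi^2_q
```

The chi-square result for LM is asymptotic, so it does not lean on the normality assumption the way the exact F distribution does.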
As for the other tests, namely the White test and the Breusch-Pagan test (tests for heteroskedasticity) and the t test and DW test for serial correlation, the approach is basically the same.
The tests for heteroskedasticity and for serial correlation differ in their details, but the idea behind them is basically the same.
Discussion of tests for heteroskedasticity:
1, Breusch-Pagan test: the idea of this test is fairly simple. It studies the relationship between the squared residuals and x by running the regression u^2 = b0 + b1*x1 + ... + bn*xn + u', and then applying the F test or the LM test for joint significance. If the x variables are jointly insignificant, there is no evidence of heteroskedasticity; if they are jointly significant, heteroskedasticity is present.
2, White test: this is also a test for heteroskedasticity, but it regresses the squared residuals not only on the x variables but also on their squares and the cross products xi*xj. A simpler form uses the fitted values: u^2 = b0 + b1*y + b2*y^2 + u', where y is the fitted value from the original regression. Joint significance is again tested with the F test or the LM test. If the regressors are jointly insignificant, there is no evidence of heteroskedasticity; otherwise there is.
Discussion of tests for serial correlation:
With time series we first need to know about the autoregressive process, which textbooks usually introduce as the AR(1) process: the variable in the current period depends mainly on its value in the previous period plus a random error term. The expression is u_t = p*u_(t-1) + e_t.
Here I would like to go over a few concepts: I(1) (integrated of order one) and I(0) (integrated of order zero). An AR(1) process with |p| < 1 is I(0); the I(1) processes include the random walk and the random walk with drift. Random walk: u_t = u_(t-1) + e_t, that is, an AR(1) process with p equal to 1. Random walk with drift: u_t = a + u_(t-1) + e_t.
What separates the random walk from an AR(1) process with |p| < 1 is weak dependence. In practice we could regard almost any time series as weakly dependent; the real problem is that we do not know how weak the dependence is. More intuitively, we want to know how large p is. If p is 0.9, or otherwise close to 1, we may regard the series as highly persistent, meaning that the current value is still strongly related to values from much earlier periods.
For a first-order integrated process such as the random walk, e_t is independently distributed with conditional expectation 0 and constant variance. So the expectation of the series does not depend on the period: the expectation of u in period 0 is the same as the expectation of u far in the future. The variance is different: it increases with time.
These concepts can all be discussed in the first-order autoregressive setting. Note that autoregression here describes the error series; we do not discuss the properties of the other variables.
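The claim about the variance follows from a one-line recursion: with independent errors of variance sigma2, Var(u_t) = p^2 * Var(u_(t-1)) + sigma2. The sketch below just iterates that recursion; the function name and the starting variance of 0 are my assumptions.

```python
def variance_path(p, sigma2, periods, var0=0.0):
    """Var(u_t) for u_t = p*u_(t-1) + e_t, with independent e_t of variance sigma2."""
    path = [var0]
    for _ in range(periods):
        path.append(p * p * path[-1] + sigma2)
    return path

random_walk = variance_path(1.0, 1.0, 5)    # grows by sigma2 every period
stable_ar1 = variance_path(0.5, 1.0, 50)    # settles near sigma2 / (1 - p^2)
```

With p = 1 the variance grows without bound, while for |p| < 1 it converges, which is the difference between the I(1) and I(0) cases.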
Before discussing the tests, I should explain what to watch for in a time series when estimating by OLS. The main remedy for serial correlation is differencing: whether or not the series is highly persistent, suitable differencing can alleviate the problem.
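A small sketch of why differencing helps in the random walk case: the first difference of u_t = u_(t-1) + e_t is just e_t, the uncorrelated error. The shock values below are made up for illustration.

```python
from itertools import accumulate

def difference(series):
    """First difference: series[t] - series[t-1]."""
    return [current - previous for previous, current in zip(series, series[1:])]

shocks = [0.5, -1.0, 0.25, 2.0, -0.75]   # hypothetical e_t values
walk = list(accumulate(shocks))          # random walk built from the shocks
recovered = difference(walk)             # equals shocks[1:]
```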
1, t test. Suppose we know that the error follows an autoregressive process, and that the process is AR(1). Then we can proceed as follows. First we estimate by OLS and take the resulting residual series, which we suspect is first-order autocorrelated. To verify this, we regress u_t on u_(t-1); an intercept can be included. Then we check whether the estimated coefficient is significant, using the t test.
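The steps above can be sketched as follows. Since there is no real data here, the residuals are simulated; `simulate_ar1` is a hypothetical stand-in for the residuals of an actual OLS fit.

```python
import math
import random

def ar1_t_test(residuals):
    """Regress u_t on u_(t-1) with an intercept; return (rho_hat, t statistic)."""
    x, y = residuals[:-1], residuals[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    rho = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - rho * mean_x
    ssr = sum((yi - intercept - rho * xi) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(ssr / (n - 2) / sxx)
    return rho, rho / std_err

def simulate_ar1(p, n, seed=0):
    """Simulated residuals: u_t = p*u_(t-1) + e_t with e_t ~ N(0, 1)."""
    rng = random.Random(seed)
    u = [0.0]
    for _ in range(n - 1):
        u.append(p * u[-1] + rng.gauss(0.0, 1.0))
    return u

rho_hat, t_stat = ar1_t_test(simulate_ar1(0.7, 2000))
```

A large t statistic on rho_hat is evidence of first-order serial correlation; a value near zero is not.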
What is the difference between the t test and the F test?
1. The t test comes in three forms: the one-sample t test, the paired t test and the two-sample t test.
One-sample t test: the sample mean stands for the unknown population mean and is compared with a known population mean, to see whether this sample differs from that population.
Paired t test: used with a paired design in the following situations:
1, two homogeneous subjects each receive one of two different treatments;
2, the same subject receives two different treatments;
3, the same subject is measured before and after treatment.
The F test is also called the test of homogeneity of variance. It is used together with the two-sample t test. To compare two randomly drawn samples, we should first determine whether the two population variances are the same, that is, whether the variances are homogeneous. If the two population variances are equal, the t test can be used directly; if they are not, the t′ test, a variable transformation or the rank-sum test can be used. The F test is the way to determine whether the two population variances are equal.
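The variance homogeneity check is just a ratio of the two sample variances, conventionally with the larger one on top. A minimal sketch, with a function name of my own:

```python
import statistics

def variance_ratio_f(x, y):
    """F = larger sample variance / smaller one, with (numerator df, denominator df)."""
    var_x, var_y = statistics.variance(x), statistics.variance(y)
    if var_x >= var_y:
        return var_x / var_y, len(x) - 1, len(y) - 1
    return var_y / var_x, len(y) - 1, len(x) - 1
```

A ratio close to 1 supports using the ordinary two-sample t test; a large ratio points to one of the alternatives listed above.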
2. Preconditions and application of the t test and analysis of variance
The t test for comparing means can be divided into three categories:
The first category is for quantitative data from a single-group design;
The second category is for quantitative data from a paired design;
The third category is for quantitative data from a group design.
The difference between the latter two designs is whether the subjects in the two groups are paired on one or more similar characteristics. Whatever the type of t test, it is only reasonable to apply it when certain preconditions hold.
For a single-group design, a standard value or population mean must be given along with a set of quantitative observations, and the precondition for the t test is that the data are normally distributed. For a paired design, the differences within each pair must be normally distributed.
For a group design, the individuals must be independent of one another, both groups must be drawn from normally distributed populations, and the variances must be homogeneous.
These preconditions are required because the t statistic can only be calculated on this basis, and the t test takes the t distribution as its theoretical foundation. Note that analysis of variance has the same preconditions as the group-design t test: normality and homogeneity of variance.
The t test is the most frequently used method in medical research, and it is the hypothesis test most often applied to quantitative data in medical papers. It is so widely used for several reasons. Medical journals now make more demands on statistics, so research conclusions need statistical support. Traditional medical statistics teaching introduces the t test as the entry-level hypothesis test, which has made it the method most familiar to medical researchers. And the t test is simple to carry out, with results that are easy to explain. Simplicity, familiarity and external requirements have made the t test popular. However, because some people understand the method only partially, many problems arise in its application, some of them serious errors that directly affect the reliability of the conclusions. These problems fall broadly into two cases:
The t test is used to compare two groups without considering its preconditions.
Every type of experimental design is treated as a series of single-factor, two-level designs, and the t test is used repeatedly for pairwise comparisons of means.
In both cases the risk of drawing a false conclusion increases to varying degrees. Moreover, when there are two or more experimental factors, this approach cannot assess the size of the interaction between the factors.
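The extra risk from repeated pairwise t tests is easy to quantify under a simplifying assumption, namely that the comparisons are independent (they are not exactly, so this is only an approximation). With k groups there are k(k-1)/2 pairs, and the chance of at least one false positive grows quickly.

```python
from math import comb

def familywise_error(groups, alpha=0.05):
    """Number of pairwise tests and the chance of at least one false positive,
    assuming (as an approximation) that the tests are independent."""
    tests = comb(groups, 2)
    return tests, 1 - (1 - alpha) ** tests
```

For four groups this gives six comparisons and a false-positive risk of roughly 26% instead of the nominal 5%, which is why the overall comparison should come first.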
Differences and connections between the u test and the t test