Analysis of variance (ANOVA) (repost)

Source: Internet
Author: User

From: http://blog.sciencenet.cn/blog-116082-218338.html

Analysis of variance (ANOVA) is a method for testing the significance of differences among the means of several samples. An experiment with multiple treatments yields a series of differing observations, and the differences have several causes. Some are caused by the treatments themselves (the treatment effect). Others are caused by uncontrolled factors and measurement error during the experiment (the error effect). The basic idea of ANOVA is to partition the total variation of the data into treatment effects and experimental error according to the source of variation, and to estimate each quantitatively. To judge whether the variation of the observations comes from the treatment effect or from the error effect, we compute the mean square for treatments and the mean square for error and compare them, which tests for differences among the treatments.
Assume an experiment has k treatments with n observations each, so that nk observations are available in total. Let $x_{ij}$ denote the j-th observation of the i-th treatment, where i = 1, 2, 3, ..., k and j = 1, 2, 3, ..., n. Let $\mu_i$ denote the population mean of the observations of the i-th treatment and $\varepsilon_{ij}$ the experimental error. Then $x_{ij} = \mu_i + \varepsilon_{ij}$; that is, the j-th observation of the i-th treatment consists of the population mean of that treatment plus an unavoidable experimental error. For the overall mean $\mu$ (the mean over all nk observations) we have $\mu = \frac{1}{k}\sum_{i=1}^{k}\mu_i$. If the mean at each treatment level is regarded as the overall mean with a treatment effect $\tau_i$ applied on top of it, then $\mu_i = \mu + \tau_i$. Combining these gives $x_{ij} = \mu + \tau_i + \varepsilon_{ij}$; that is, every observation consists of the overall mean plus a treatment effect and an experimental error. Similarly, the linear model estimated from the sample is $x_{ij} = \bar{x} + t_i + e_{ij}$, where $\bar{x}$ is the sample mean, $t_i$ is the effect of the i-th treatment, and $e_{ij}$ is the experimental error.
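The decomposition of each observation into an overall mean, a treatment effect, and an error term can be checked numerically. The following is a minimal sketch, not part of the original post: the data are made-up numbers for k = 3 treatments with n = 5 observations each, and the helper name `decompose` is hypothetical.

```python
# Illustrative data only: k = 3 treatments, n = 5 observations each (made-up numbers).
data = {
    "A": [10, 12, 11, 13, 14],
    "B": [14, 15, 16, 15, 15],
    "C": [18, 17, 19, 20, 16],
}


def decompose(groups):
    """Split every observation into sample mean + treatment effect + residual."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = sum(all_values) / len(all_values)
    rows = []
    for name, values in groups.items():
        group_mean = sum(values) / len(values)
        effect = group_mean - grand_mean      # sample treatment effect
        for x in values:
            residual = x - group_mean         # sample experimental error
            rows.append((name, x, grand_mean, effect, residual))
    return rows


rows = decompose(data)
for name, x, grand_mean, effect, residual in rows:
    # Each observation is rebuilt exactly from its three components.
    assert abs(x - (grand_mean + effect + residual)) < 1e-9
    print(f"{name}: {x} = {grand_mean:.1f} + ({effect:+.1f}) + ({residual:+.1f})")
```

With equal numbers of observations per treatment, the estimated treatment effects sum to zero and the residuals sum to zero within each treatment, which is what makes the later partition of the sum of squares work.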
Depending on the assumptions made about the treatment effects, the model above falls into one of three types:
Fixed model: The effect of each treatment is fixed; that is, apart from random error, the effect produced by each treatment does not vary. The effect $\tau_i$ is a constant and $\sum \tau_i = 0$. In this case the treatment levels are usually chosen deliberately according to the purpose of the experiment, for example the germination of wheat seeds at different temperatures.
Random model: The effect of each treatment is not fixed but is produced by random factors. The effect $\tau_i$ is a random variable drawn from a normal population with mean 0 and variance $\sigma_\tau^2$. For example, when investigating the growth of a species in different habitats, the climate, soil, and water conditions of the habitats cannot be controlled by the experimenter, so a random model should be used.
Mixed model: In a multi-factor experiment that involves both fixed-effect factors and random-effect factors, the analysis should be based on a mixed model.
Different models have different emphases and different expected mean squares. The fixed model mainly concerns the estimation and comparison of the effect values, while the random model concerns the estimation and testing of the variance of the effects. Therefore, before analysis and testing, the basic assumptions about the model must be made clear. For single-factor ANOVA, there is little difference between the fixed model and the random model.
Steps for variance analysis:
(ANOVA requires independent samples, homogeneity of variance, and normally distributed data. If the variances are not homogeneous, as checked by an F test, the data can be transformed first, for example by taking logarithms.)
According to the basic idea of ANOVA, the total variation of the data is partitioned into treatment effect and experimental error, and the between-treatment variance is then compared with the within-treatment variance (the error variance) in an F test to determine whether the treatment effect is significant relative to the experimental error.
1. Calculation of the between-treatment and within-treatment variances:
(1) Partitioning the sum of squares:
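Let $\bar{x}_i$ be the mean of the n observations of the i-th treatment and $\bar{x}$ the mean of all nk observations. Then $x_{ij} - \bar{x} = (x_{ij} - \bar{x}_i) + (\bar{x}_i - \bar{x})$, where $x_{ij} - \bar{x}_i$ is the experimental error and $\bar{x}_i - \bar{x}$ is the treatment effect; that is, the total variation of an observation is the sum of the experimental error and the treatment effect. Squaring both sides gives $(x_{ij} - \bar{x})^2 = (x_{ij} - \bar{x}_i)^2 + 2(x_{ij} - \bar{x}_i)(\bar{x}_i - \bar{x}) + (\bar{x}_i - \bar{x})^2$. Now sum over the n observations of each treatment. Because $\bar{x}_i - \bar{x}$ is constant within a treatment and $\sum_{j=1}^{n}(x_{ij} - \bar{x}_i) = 0$, the cross term vanishes and $\sum_{j=1}^{n}(x_{ij} - \bar{x})^2 = \sum_{j=1}^{n}(x_{ij} - \bar{x}_i)^2 + n(\bar{x}_i - \bar{x})^2$. Summing over the k treatments gives $\sum_{i=1}^{k}\sum_{j=1}^{n}(x_{ij} - \bar{x})^2 = \sum_{i=1}^{k}\sum_{j=1}^{n}(x_{ij} - \bar{x}_i)^2 + n\sum_{i=1}^{k}(\bar{x}_i - \bar{x})^2$. Here $SS_T = \sum_{i}\sum_{j}(x_{ij} - \bar{x})^2$ is the total sum of squares, $SS_t = n\sum_{i}(\bar{x}_i - \bar{x})^2$ is the between-treatment sum of squares, and $SS_e = \sum_{i}\sum_{j}(x_{ij} - \bar{x}_i)^2$ is the within-treatment sum of squares. So $SS_T = SS_t + SS_e$.
(2) Partitioning the degrees of freedom: total degrees of freedom = between-treatment degrees of freedom + within-treatment degrees of freedom, that is, $df_T = df_t + df_e$ with $df_T = nk - 1$, $df_t = k - 1$, and $df_e = k(n - 1)$.
Finally, from the sum of squares and degrees of freedom of each source of variation, the between-treatment variance and the within-treatment variance are $s_t^2 = SS_t / df_t$ and $s_e^2 = SS_e / df_e$.
2. Significance test of the statistical hypothesis (F test): $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$, $H_A$: the $\mu_i$ are not all equal, $F = s_t^2 / s_e^2$. Compare the calculated F value with the critical value $F_\alpha(df_t, df_e)$ at the chosen significance level (such as 0.05) to determine whether the differences among treatments are significant. If they are, we need to go on to find out which treatments differ significantly from which.
3. Multiple comparisons
Common methods include the least significant difference (LSD) method and the least significant range (LSR) method.
LSD method: This is essentially the t test for comparing two means: $t = (\bar{x}_i - \bar{x}_j) / s_{\bar{x}_i - \bar{x}_j}$ with $s_{\bar{x}_i - \bar{x}_j} = \sqrt{2 s_e^2 / n}$, where $s_e^2$ is the within-treatment error variance and n is the number of replicates within each treatment. The smallest difference that reaches significance at level $\alpha$ is defined as $LSD_\alpha = t_\alpha(df_e) \cdot s_{\bar{x}_i - \bar{x}_j}$. When $|\bar{x}_i - \bar{x}_j| \ge LSD_\alpha$, the difference is significant at the given level $\alpha$; otherwise it is not.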
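The calculation steps above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions rather than the original author's procedure: the data are made-up numbers, the function names `one_way_anova` and `lsd_pairs` are hypothetical, the design must have equal replicates, and the critical values are passed in as table look-ups for $\alpha$ = 0.05 instead of being computed.

```python
from itertools import combinations
from math import sqrt

# Illustrative data only: k = 3 treatments, n = 5 replicates each (made-up numbers).
data = {
    "A": [10, 12, 11, 13, 14],
    "B": [14, 15, 16, 15, 15],
    "C": [18, 17, 19, 20, 16],
}


def one_way_anova(groups):
    """Sums of squares, degrees of freedom, mean squares and F for equal replicates."""
    k = len(groups)
    n = len(next(iter(groups.values())))
    if any(len(v) != n for v in groups.values()):
        raise ValueError("this sketch assumes the same number of replicates per treatment")
    means = {name: sum(v) / n for name, v in groups.items()}
    grand_mean = sum(sum(v) for v in groups.values()) / (n * k)
    ss_total = sum((x - grand_mean) ** 2 for v in groups.values() for x in v)
    ss_t = n * sum((m - grand_mean) ** 2 for m in means.values())            # between treatments
    ss_e = sum((x - means[name]) ** 2 for name, v in groups.items() for x in v)  # within treatments
    df_t, df_e = k - 1, k * (n - 1)
    ms_t, ms_e = ss_t / df_t, ss_e / df_e
    return {"means": means, "n": n, "SS_T": ss_total, "SS_t": ss_t, "SS_e": ss_e,
            "df_t": df_t, "df_e": df_e, "MS_t": ms_t, "MS_e": ms_e, "F": ms_t / ms_e}


def lsd_pairs(means, ms_e, n, t_crit):
    """Return LSD = t_crit * sqrt(2 * MS_e / n) and a significance flag per pair."""
    lsd = t_crit * sqrt(2 * ms_e / n)
    flags = {(a, b): abs(means[a] - means[b]) >= lsd for a, b in combinations(means, 2)}
    return lsd, flags


# Table look-ups for alpha = 0.05 (assumed inputs, not computed here).
F_CRIT = 3.89    # F_0.05 with (2, 12) degrees of freedom
T_CRIT = 2.179   # two-tailed t_0.05 with 12 degrees of freedom

table = one_way_anova(data)
print(f"SS_T={table['SS_T']:.1f} SS_t={table['SS_t']:.1f} SS_e={table['SS_e']:.1f} F={table['F']:.2f}")
if table["F"] > F_CRIT:
    lsd, flags = lsd_pairs(table["means"], table["MS_e"], table["n"], T_CRIT)
    print(f"LSD_0.05 = {lsd:.3f}")
    for pair, significant in flags.items():
        print(pair, "significant" if significant else "not significant")
```

Computing the total sum of squares directly, rather than as the sum of the other two, makes it easy to confirm that the partition holds for the data at hand.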
LSR method: Different pairs of means are compared against different significance standards. The test scale depends on the number of means M spanned by the range between the two means being compared (also known as the rank distance). Common methods include the new multiple range test (Duncan) and the q test (SNK).
New multiple range test: Also known as Duncan's method or the SSR method. Here $LSR_\alpha = SSR_\alpha \cdot s_{\bar{x}}$ with $s_{\bar{x}} = \sqrt{s_e^2 / n}$, where $\alpha$ is the significance level, $s_e^2$ is the within-treatment error variance, and n is the number of replicates within each treatment. Sort the means to be compared from largest to smallest; two adjacent means then have rank distance M = 2, two means separated by one other mean have M = 3, and so on. Using M and the error degrees of freedom, look up the value $SSR_\alpha$ in the table for the multiple range test. Compare the difference between the two means with the corresponding $LSR_\alpha$: when $|\bar{x}_i - \bar{x}_j| \ge LSR_\alpha$ the difference is significant; otherwise it is not.
q test: Also known as the SNK method. It is essentially the same as the new multiple range test, except that $SSR_\alpha$ is replaced by the value $q_\alpha$ looked up in the q table, so that $LSR_\alpha = q_\alpha \cdot s_{\bar{x}}$.
When the rank distance M is 3 or more, the test scales of the three methods are ordered as follows: LSD method ≤ new multiple range test ≤ q test.
Marking the results of multiple comparisons: One common way to present the results, for example those of the LSD method, is the letter-marking method. First, arrange all the means from largest to smallest and label the largest mean a. Compare this mean with each of the means below it in turn: every mean that does not differ significantly from it is also labeled a, until a mean that differs significantly is reached, which is labeled b. Then take this mean labeled b as the standard and compare it with the larger means above it; every mean that does not differ significantly from it is also given the letter b. Next take the largest mean labeled b as the standard and compare it with the unlabeled means below: those that do not differ significantly are labeled b, until a mean that differs significantly is reached, which is labeled c. Continue in this way until every mean has been labeled. Means that share a letter do not differ significantly.
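The letter-marking procedure is easier to follow as code. The sketch below is one possible reading of the steps, not a reference implementation: the function name `letter_marks` and the example means are made up, and it assumes a single threshold for every pair (as in the LSD method), so it does not cover the rank-dependent scales of the LSR methods.

```python
def letter_marks(means, threshold):
    """Assign letters so that means sharing a letter do not differ significantly.

    A difference counts as significant when it is >= threshold, where threshold
    is one least significant difference used for every pair.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    order = sorted(means, key=means.get, reverse=True)   # largest mean first
    labels = {name: "" for name in order}

    def differs(a, b):
        return abs(means[a] - means[b]) >= threshold

    start = 0            # position of the mean used as the current standard
    letter = ord("a")
    while True:
        # Going down from the standard, mark every mean that does not differ from it.
        i = start
        while i < len(order) and not differs(order[start], order[i]):
            labels[order[i]] += chr(letter)
            i += 1
        if i == len(order):
            return labels
        # order[i] is the first mean that differs, so it opens the next letter.
        # Going back up, the largest mean that does not differ from order[i]
        # becomes the new standard.
        start = i
        while not differs(order[start - 1], order[i]):
            start -= 1
        letter += 1


# Made-up treatment means and a made-up least significant difference of 1.5.
example_means = {"A": 18.0, "B": 17.0, "C": 16.0, "D": 12.0}
print(letter_marks(example_means, 1.5))
# prints {'A': 'a', 'B': 'ab', 'C': 'b', 'D': 'c'}
```

In this example B carries both letters because it differs significantly from neither A nor C, while A and C differ from each other.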
Note: When the number of observations within each treatment (the number of replicates) is not the same for all treatments, the calculation formulas change accordingly.
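As a sketch of how the formulas change, suppose treatment i has $n_i$ observations and $N = \sum_{i=1}^{k} n_i$, with $\bar{x}$ the mean of all N observations. These symbols are introduced here for illustration and are not part of the original post. The standard one-way results for unequal replication are:

```latex
\begin{aligned}
SS_t &= \sum_{i=1}^{k} n_i\,(\bar{x}_i - \bar{x})^2, & df_t &= k - 1,\\
SS_e &= \sum_{i=1}^{k}\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2, & df_e &= N - k,\\
s_{\bar{x}_i - \bar{x}_j} &= \sqrt{s_e^2\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}, &
LSD_\alpha &= t_\alpha(df_e)\, s_{\bar{x}_i - \bar{x}_j}.
\end{aligned}
```

When every $n_i$ equals n, these reduce to the equal-replicate case: $df_e = k(n - 1)$ and $s_{\bar{x}_i - \bar{x}_j} = \sqrt{2 s_e^2 / n}$.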
