[Statistics in small eyes] difference test and general linear model (1)

Last Update:2015-05-01 Source: Internet

Author: User


Anyone who has used SPSS knows that the analysis of variance (ANOVA) we use so often sits under the General Linear Model (GLM) menu. So what exactly is this GLM? Let's open the almighty wiki and look up "general linear model" ... What turns up, looking perfectly at home, is a fitted-line plot:

and the legendary multiple (linear) regression formula: $Y_{i}=\beta_{0} + \beta_{1}x_{i1} + \beta_{2}x_{i2} + \ldots + \beta_{p}x_{ip} + \epsilon_{i}$

Isn't that a regression problem? What does it have to do with analysis of variance, which tests for differences?

In fact, the parametric tests based on the normality assumption (t-test, ANOVA, MANOVA, ANCOVA, etc.) can all be expressed as regression problems.

Take the simplest kind of difference test, the comparison of two groups (group A versus group B). We usually run a t-test to examine the difference between A and B, and draw a bar chart with error bars to display it (for example, the left-hand figure).

A and B are two levels of the independent variable x, which we can code as 0 (A) and 1 (B). We can then write the dependent variable y as a function $y=f(x)$ of x. Assuming a linear relationship between the two gives the model $y = \alpha x + \beta$ (for example, the right-hand figure). The greater the difference between A and B, the steeper the slope $\alpha$ of the fitted line; in other words, the difference test can be expressed in the form of a regression.
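The coding step described above can be sketched in a few lines of Python. This is only an illustration: the two samples `group_a` and `group_b` are made-up numbers, not data from the article, and the fit is a plain closed-form least-squares calculation. It suggests that with 0/1 coding the fitted slope is the difference of the group means and the intercept is the mean of group A.

```python
from statistics import mean

# Hypothetical measurements for two groups of unequal size (assumed data).
group_a = [4.1, 5.0, 5.5, 4.7, 5.2]
group_b = [6.3, 5.9, 7.1, 6.8, 6.0, 6.6, 7.4]

# Dummy-code the group membership: A -> 0, B -> 1.
x = [0.0] * len(group_a) + [1.0] * len(group_b)
y = group_a + group_b

# Closed-form ordinary least squares for y = alpha * x + beta.
x_bar, y_bar = mean(x), mean(y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
alpha_hat = sxy / sxx
beta_hat = y_bar - alpha_hat * x_bar

# The slope should match the mean difference, the intercept the mean of A.
print("slope     :", alpha_hat)
print("mean diff :", mean(group_b) - mean(group_a))
print("intercept :", beta_hat)
print("mean of A :", mean(group_a))
```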

So are the two significance tests equivalent? That is, is the t statistic for the difference between A and B the same as the t statistic for the slope $\alpha$ of the fitted line? The answer is yes.

Let us assume that groups A and B have homogeneous variances and unequal sample sizes ($n_{a}$ for group A, $n_{b}$ for group B). Then the t value between group A and group B is

$$t =\frac{\overline{y_{b}}-\overline{y_{a}}}{s_{y_{a}y_{b}} \sqrt{\frac{1}{n_{a}} + \frac{1}{n_{b}}}}$$

where $s_{y_{a}y_{b}}=\sqrt{\frac{(n_{a}-1) s_{y_{a}}^2 + (n_{b}-1) s_{y_{b}}^2}{n_{a}+n_{b}-2}}$ is the pooled standard deviation, and $s_{y_{a}}$, $s_{y_{b}}$ are the standard deviations of group A and group B, respectively.

Let $mean=\overline{y_{b}}-\overline{y_{a}}$ and $se =s_{y_{a}y_{b}} \sqrt{\frac{1}{n_{a}} + \frac{1}{n_{b}}}$, so that $t=\frac{mean}{se}$.
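As a rough numerical companion to the definitions above, the sketch below computes the pooled standard deviation, $mean$, $se$ and $t$ step by step. The samples are hypothetical, and the variable names (`s_pooled`, `mean_diff`, `se`) are chosen here for illustration only.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical measurements for two groups of unequal size (assumed data).
group_a = [4.1, 5.0, 5.5, 4.7, 5.2]
group_b = [6.3, 5.9, 7.1, 6.8, 6.0, 6.6, 7.4]
n_a, n_b = len(group_a), len(group_b)

# statistics.variance divides by n - 1 (Bessel's correction).
var_a, var_b = variance(group_a), variance(group_b)

# Pooled standard deviation s_{y_a y_b}.
s_pooled = sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))

# t = mean / se, with mean taken as (B minus A).
mean_diff = mean(group_b) - mean(group_a)
se = s_pooled * sqrt(1 / n_a + 1 / n_b)
t = mean_diff / se

print("pooled sd :", s_pooled)
print("mean      :", mean_diff)
print("se        :", se)
print("t         :", t)
```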

We handle the simplest part, $mean$, first:

Substitute the linear model $y = \alpha x + \beta$ into $mean$, using the fact that the fitted value for each group is that group's mean. Since $x_{a_{i}}=0$ and $x_{b_{i}}=1$, we get: $mean = (\hat{\alpha} \times 1 +\hat{\beta})-(\hat{\alpha} \times 0 +\hat{\beta}) =\hat{\alpha}-0$

Next comes the $se$ part, starting with $s_{y_{a}y_{b}}$:

By Bessel's correction for the sample variance, $s_{y_{a}}^{2}=\frac{\sum_{i=1}^{n_{a}}(y_{a_{i}}-\overline{y_{a}})^{2}}{n_{a}-1}$ and $s_{y_{b}}^{2}=\frac{\sum_{i=1}^{n_{b}}(y_{b_{i}}-\overline{y_{b}})^{2}}{n_{b}-1}$. Substituting both into $s_{y_{a}y_{b}}$:

$$s_{y_{a}y_{b}}=\sqrt{\frac{\sum_{i=1}^{n_{a}}(y_{a_{i}}-\overline{y_{a}})^{2} + \sum_{i=1}^{n_{b}}(y_{b_{i}}-\overline{y_{b}})^{2}}{n_{a}+n_{b}-2}}$$

The means of group A and group B are the least-squares estimates (fitted values) of the points within each group, i.e. $\overline{y_{a}}=\hat{y}_{a_{i}}, i\in A$; $\overline{y_{b}}=\hat{y}_{b_{i}}, i\in B$:

$$s_{y_{a}y_{b}}=\sqrt{\frac{\sum_{i=1}^{n_{a}}(y_{a_{i}}-\hat{y}_{a_{i}})^{2} + \sum_{i=1}^{n_{b}}(y_{b_{i}}-\hat{y}_{b_{i}})^{2}}{n_{a}+n_{b}-2}}=\sqrt{\frac{\sum_{i=1}^{n}(y_{i}-\hat{y}_{i})^{2}}{n-2}}$$
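The identity just derived says that the pooled standard deviation is the residual standard error of the regression on the 0/1-coded variable. A small check on hypothetical data, fitting the line by closed-form least squares, might look like this (all names and numbers are illustrative assumptions):

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical measurements for two groups of unequal size (assumed data).
group_a = [4.1, 5.0, 5.5, 4.7, 5.2]
group_b = [6.3, 5.9, 7.1, 6.8, 6.0, 6.6, 7.4]
n_a, n_b = len(group_a), len(group_b)
n = n_a + n_b

# Left-hand side: pooled standard deviation from the two group variances.
s_pooled = sqrt(
    ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n - 2)
)

# Right-hand side: fit y = alpha * x + beta on the 0/1-coded x ...
x = [0.0] * n_a + [1.0] * n_b
y = group_a + group_b
x_bar, y_bar = mean(x), mean(y)
sxx = sum((xi - x_bar) ** 2 for xi in x)
alpha_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
beta_hat = y_bar - alpha_hat * x_bar

# ... and take the residual standard error with n - 2 degrees of freedom.
fitted = [alpha_hat * xi + beta_hat for xi in x]
ss_residual = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
residual_se = sqrt(ss_residual / (n - 2))

print("pooled sd   :", s_pooled)
print("residual se :", residual_se)
```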

Finally, look at $\sqrt{\frac{1}{n_{a}} + \frac{1}{n_{b}}}$, writing $n=n_{a}+n_{b}$ for the total sample size:

$$\frac{1}{n_{a}} + \frac{1}{n_{b}}=\frac{1}{\frac{n_{a}n_{b}}{n_{a}+n_{b}}}$$

$$=\frac{1}{\frac{n_{b} (n-n_{b})}{n}}$$

$$=\frac{1}{n_{b}-\frac{n_{b}^2}{n}}$$

$$=\frac{1}{n_{b}-2 \frac{n_{b}^2}{n}+\frac{n_{b}^2}{n}}$$

Since $x_{a_{i}}=0$ and $x_{b_{i}}=1$, we have $n_{b}=\sum^{n}x_{i}=\sum^{n}x_{i}^2$ and $\overline{x}=\frac{n_{b}}{n}$, so:

$$n_{b}-2 \frac{n_{b}^2}{n}+\frac{n_{b}^2}{n}=\sum^{n}x_{i}^2-2 \sum^{n} \left(\frac{n_{b}}{n} x_{i}\right) +\sum^{n} \left(\frac{n_{b}}{n}\right)^2$$

$$=\sum^{n} \left(x_{i}-\frac{n_{b}}{n}\right)^2=\sum^{n} (x_{i}-\overline{x})^2$$

i.e. $\large \sqrt{\frac{1}{n_{a}} + \frac{1}{n_{b}}}=\frac{1}{\sqrt{\sum_{i=1}^{n} (x_{i}-\overline{x}) ^2}}$
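The algebra above depends only on the group sizes, so it is easy to check numerically. The sketch below uses arbitrary sizes (`n_a = 5`, `n_b = 7`, chosen purely for illustration) and compares both sides of the identity for a 0/1-coded `x`.

```python
from math import sqrt

# Arbitrary, unequal group sizes (assumed for illustration).
n_a, n_b = 5, 7
n = n_a + n_b

# 0/1 coding: n_a zeros for group A, n_b ones for group B.
x = [0.0] * n_a + [1.0] * n_b
x_bar = sum(x) / n

# Sum of squared deviations of x from its mean.
sxx = sum((xi - x_bar) ** 2 for xi in x)

lhs = sqrt(1 / n_a + 1 / n_b)
rhs = 1 / sqrt(sxx)

print("sum of x, sum of x^2 :", sum(x), sum(xi ** 2 for xi in x))
print("sqrt(1/n_a + 1/n_b)  :", lhs)
print("1 / sqrt(Sxx)        :", rhs)
```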

Putting it all together, $\large t=\frac{mean}{se}=\frac{mean}{s_{y_{a}y_{b}} \sqrt{\frac{1}{n_{a}} + \frac{1}{n_{b}}}}=\frac{\hat{\alpha}-0}{\sqrt{\frac{\frac{1}{n-2} \sum_{i=1}^{n}(y_{i}-\hat{y}_{i})^{2}}{\sum_{i=1}^{n} (x_{i}-\overline{x})^2}}}=\frac{\hat{\alpha}-0}{se_{\hat{\alpha}}}$. The final expression is exactly the t statistic for testing whether $\hat{\alpha}$, the least-squares estimate of the slope of the linear model, differs from 0.

At this point, we have succeeded in proving that:

The t value for the difference between group A and group B, under homogeneous variances and unequal group sizes,

is equal to the t value for testing whether the slope $\alpha$ of the linear model $y = \alpha x + \beta$ ($x_{a}=0$, $x_{b}=1$) differs significantly from 0.
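To see the conclusion on actual numbers, here is a minimal end-to-end sketch. The helper names `pooled_t` and `slope_t` and the sample data are assumptions made for this example, not part of the original article. One function computes the equal-variance two-sample t statistic, the other the t statistic of the least-squares slope on the 0/1-coded variable, and the two values should agree up to floating-point rounding.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Two-sample t statistic (B minus A), assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    s_pooled = sqrt(
        ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b))
        / (n_a + n_b - 2)
    )
    return (mean(group_b) - mean(group_a)) / (s_pooled * sqrt(1 / n_a + 1 / n_b))


def slope_t(x, y):
    """t statistic of the least-squares slope of y = alpha * x + beta vs. 0."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    alpha_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    beta_hat = y_bar - alpha_hat * x_bar
    ss_residual = sum(
        (yi - (alpha_hat * xi + beta_hat)) ** 2 for xi, yi in zip(x, y)
    )
    # Standard error of the slope: sqrt(residual variance / Sxx).
    se_alpha = sqrt(ss_residual / (n - 2) / sxx)
    return alpha_hat / se_alpha


# Hypothetical measurements for two groups of unequal size (assumed data).
group_a = [4.1, 5.0, 5.5, 4.7, 5.2]
group_b = [6.3, 5.9, 7.1, 6.8, 6.0, 6.6, 7.4]

# Code A as 0 and B as 1, then compare the two statistics.
x = [0.0] * len(group_a) + [1.0] * len(group_b)
y = group_a + group_b

t_groups = pooled_t(group_a, group_b)
t_slope = slope_t(x, y)
print("t from two-sample test :", t_groups)
print("t from regression slope:", t_slope)
```

Both statistics have $n-2$ degrees of freedom, so agreement of the t values also means agreement of the p-values.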

As for the more general cases of unequal variances, paired-sample t-tests, and ANOVA comparisons among several groups (using dummy coding), stay tuned for the next installment ^ ^


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
