Support the Seven Pillars of statistics!


At JSM, the grand old man of statistics, Stephen Stigler, gave a keynote speech titled "The Seven Pillars of Statistics". The kind and diligent Rick Wicklin took notes, which is the only reason that I, who was probably still eating in Chinatown at the time, got to learn what Stigler actually said. Looking back at the notes, I think Stigler oversells statistics a bit. A pillar, by definition, is something we would collapse without. The seven pillars are:

Summary: We gain knowledge by aggregating data. In my humble opinion, summarizing is indeed the classic use of statistics, but summary (descriptive statistics) is only one side of it. The other side, equally important and comparatively more reliable, is prediction. I have always valued prediction over summary, because statistics is unreliable by nature: a wrong summary cannot be verified, whereas with a wrong prediction we at least know how far off it was.

Diminishing marginal returns: As the amount of data increases, the amount of information does not increase linearly, and beyond a certain point there may be little that is new. Stigler uses the square root of n (the sample size) to describe this diminishing return, which I find far-fetched. For example, the standard error of the sample mean does involve the square root of n, but what on earth does that have to do with information?
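The square-root behavior mentioned above can be checked with a small simulation. This is only a sketch under assumed settings (normally distributed data with standard deviation 1, a hypothetical helper `empirical_se_of_mean`, an arbitrary seed); it compares the simulated spread of the sample mean with 1/sqrt(n).

```python
import math
import random
import statistics

def empirical_se_of_mean(n, reps=1000, sigma=1.0, seed=0):
    """Standard deviation of the sample mean over `reps` simulated samples of size n."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(0.0, sigma) for _ in range(n))
        for _ in range(reps)
    ]
    return statistics.stdev(means)

# Quadrupling n should roughly halve the standard error.
for n in (25, 100, 400):
    simulated = empirical_se_of_mean(n)
    theoretical = 1.0 / math.sqrt(n)
    print(n, round(simulated, 3), round(theoretical, 3))
```

Whether this halving says anything about "information" is exactly the question raised above; the code only shows the standard-error fact itself.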

Likelihood/probability: Probability theory is of course a backbone of statistics. That depends on how we define statistics, but nobody would dispute that probability is the foundation of mathematical statistics. Some people say that statistics is "the science of uncertainty". The word I find most annoying these days is "science": everyone wants to upgrade their work to a science, but what is science? I think mathematics and mathematical statistics can be called a discipline, but not a science. Before claiming that what you do is science, go ask the people who raise rabbits and culture E. coli, then think again about whether what you do qualifies. There is nothing shameful in saying that you study a discipline. Yet this is the age of "Data Science" and of ironic names like "School of Mathematical Sciences" (three words that repeat one another; is it shameful to simply call it the "mathematics department"?), all of them terms coined by people who lack confidence. I admire the honest work of natural scientists. This is not to say that people who derive formulas with pen and paper do meaningless or easy work, only that it would be better to stop fighting over these labels.

Horizontal comparison: For example, comparing the difference between two means. Stigler says that other disciplines compare against a "gold standard", whereas we make comparisons within the data, as in analysis of variance (ANOVA) and the t-test. I do not quite see why this is a pillar, and it is not as if statistics never compares against a "gold standard".
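As a rough illustration of comparing "within the data", here is a sketch of Welch's two-sample t statistic, where the difference in means is scaled by a standard error estimated from the same two samples. The function name `welch_t` and the measurements are made up for the example.

```python
import math
import statistics

def welch_t(a, b):
    """Welch's t statistic: difference in means divided by its estimated standard error."""
    var_a = statistics.variance(a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(b)
    se = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.fmean(a) - statistics.fmean(b)) / se

# Made-up measurements, for illustration only.
group_a = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4]
group_b = [4.4, 4.7, 4.1, 4.9, 4.5, 4.3]

print(round(welch_t(group_a, group_b), 2))
```

Because the yardstick comes from the data, rescaling both groups (say, changing units) leaves the statistic unchanged; no external standard is needed.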

Regression and multivariate analysis: The regression of height is a classic example, and it is indeed an interesting discovery, but what is regression actually used for in practice? I feel that its main functions are to serve as cannon fodder in a flood of papers ("look, my method is better than regression") and to serve as a cure-all for people outside the profession ("look, I ran a regression and the coefficient is significant"). It is the combination of a method with domain knowledge that makes a pillar, not the method by itself. Without specific domain knowledge, running a regression with significant coefficients is just blind men feeling an elephant.
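The height example can be mimicked with simulated data. This is a sketch, not Galton's data: the correlation of 0.5, the mean of 170 and the standard deviation of 7 are assumptions chosen for illustration, and `simulate_heights` and `ols_slope` are hypothetical helper names.

```python
import random
import statistics

def ols_slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def simulate_heights(n=5000, rho=0.5, mean=170.0, sd=7.0, seed=1):
    """Parent and child heights with the same mean and spread, correlated at rho."""
    rng = random.Random(seed)
    parents, children = [], []
    for _ in range(n):
        zp = rng.gauss(0.0, 1.0)
        zc = rho * zp + (1 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        parents.append(mean + sd * zp)
        children.append(mean + sd * zc)
    return parents, children

parents, children = simulate_heights()
# A slope below 1 means children of tall parents are, on average, less extreme.
print(round(ols_slope(parents, children), 2))
```

The slope comes out near the assumed correlation rather than 1, which is the "regression toward the mean" effect. What the number means for any real question still depends on domain knowledge.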

Experimental design: This is of course important too, and I think it is the only one of the seven that can truly be called a pillar, because it remains effective even when separated from domain knowledge. Without comparison there is no discernment. We all know that we should compare, but how to compare is the key problem. For example, the recently popular Chinese dictation contest violated basic principles of experimental design such as randomization, replication and control. A competition run without guidance from probability is inevitably somewhat unfair.
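As a minimal sketch of the randomization principle, the snippet below assigns units to two groups by shuffling. The contestant labels, the group sizes and the helper `randomize` are hypothetical; the source does not describe how the contest actually assigned anything.

```python
import random

def randomize(units, n_treated, seed=None):
    """Randomly split `units` into a treatment group of size n_treated and a control group."""
    rng = random.Random(seed)
    shuffled = list(units)  # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    return {
        "treatment": sorted(shuffled[:n_treated]),
        "control": sorted(shuffled[n_treated:]),
    }

# Hypothetical contestants; a fixed seed makes the assignment reproducible.
contestants = [f"contestant_{i:02d}" for i in range(1, 13)]
groups = randomize(contestants, n_treated=6, seed=42)
print(groups)
```

Random assignment means no one chooses who faces which condition, so systematic advantages cancel on average. Replication and control would need further design choices not shown here.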

Models and residuals: This is a bit confined to the regression routine, since not all models involve residuals. If you do not check the distribution of the residuals, will statistics collapse? I do not think so. Even if the residuals still show obvious structure, the model is not necessarily wholly inappropriate. That depends on which part of the model you want to draw information from.
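The last point can be illustrated with a toy case, assuming noise-free data that follow y = x + 0.3x² on a symmetric grid (both choices are made up for the sketch, as is the helper `fit_line`). A straight-line fit leaves clearly curved residuals, yet the fitted slope still recovers the linear trend.

```python
import statistics

def fit_line(x, y):
    """Least-squares intercept and slope of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope

# Symmetric grid from -2 to 2; the true relationship has a quadratic term.
x = [i / 10 for i in range(-20, 21)]
y = [xi + 0.3 * xi ** 2 for xi in x]

intercept, slope = fit_line(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

print("slope:", round(slope, 3))
print("residuals at left end, center, right end:",
      round(residuals[0], 2), round(residuals[len(x) // 2], 2), round(residuals[-1], 2))
```

The residuals are positive at both ends and negative in the middle, so the line is a poor description of the curve. If only the average linear trend matters, though, the slope is still usable.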

Excerpted from Yihui Xie's "The Seven Pillars of Statistics"

Article Source: http://www.36dsj.com/archives/26528
