The P-value is one of the most commonly used statistical indicators in papers, but it is also one of the most frequently misused and misinterpreted. It is therefore worth explaining what the P-value means, how it is used, and what the common errors are.

The P-value is the probability that the observed difference between two comparison groups arose by chance. The smaller the P-value, the stronger the reason to believe a real difference exists. For example, p < 0.05 means there is less than a 5% chance that the observed difference arose by chance; equivalently, if others repeated the same study under the same conditions, the likelihood of reaching the opposite conclusion would be less than 5%. By convention, p > 0.05 is reported as "not significant", p ≤ 0.05 as "significant", and p ≤ 0.01 as "highly significant".

Because "significant" is the conventional label for the size of the P-value, the most common misuse is to confuse statistical significance with clinical or practical significance, i.e., to confuse "the difference is statistically significant" with "the difference is large". The former means p ≤ 0.05: the probability that the two samples came from the same population is less than 5%, so we conclude that the two differ, and the chance that this conclusion is wrong is at most 5%. The latter means the difference is genuinely large. For example, 4 and 40 differ so much that the difference is large in practical terms, whereas 4 and 4.2 hardly differ in practice; yet if the computed P-value is ≤ 0.05, the difference is "statistically significant" even though it is not practically important.

Because "there is a significant difference" and "the difference is significant" are so easily confused, some journals now advocate saying "the difference is statistically significant" instead of "the difference is significant", and "the difference is not statistically significant" / "the difference is highly statistically significant" instead of "the difference is not significant" / "the difference is highly significant". The journal *Chinese Parenteral Medicine*, for example, follows this practice.

If p > 0.05, can we conclude that there is no difference between the two? No. p > 0.05 only indicates that the evidence is insufficient to show a difference; it does not show that there is no difference or that the difference is small. Between the two extremes "there is a difference" and "there is no difference" lies an intermediate zone: insufficient evidence either for a difference or for its absence. To infer that there is no difference, or that the difference is small, one must use the statistical inference method of equivalence testing.

Indeed, the P-value is one of the most commonly used statistical indicators, and almost every statistical software package outputs P-values, so it is worth understanding the origin, calculation, and meaning of the P-value.
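Before turning to its origin, here is a minimal sketch (not from the original text; the sample size and group values are hypothetical) of the distinction above: a practically negligible difference such as 4 vs. 4.2 becomes "statistically significant" once the sample is large enough.

```python
# A minimal sketch: statistical significance is not practical significance.
# Group means and sample size are hypothetical illustration values.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Two populations whose true means differ only slightly: 4.0 vs. 4.2
# (the article's example of a difference that is small in practical terms).
n = 10_000                                   # a large sample size
group_a = rng.normal(loc=4.0, scale=1.0, size=n)
group_b = rng.normal(loc=4.2, scale=1.0, size=n)

t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"p-value = {p_value:.2e}")            # far below 0.05 for large n

# The difference is "statistically significant" (p <= 0.05), yet the
# estimated effect itself is tiny -- only about 0.2 units:
print(f"estimated difference = {group_b.mean() - group_a.mean():.3f}")
```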
The origin of the P-value
R. A. Fisher (1890-1962), the founder of hypothesis-testing theory, first put forward the concept of the P-value in hypothesis testing. He regarded hypothesis testing as a procedure by which researchers form a judgment about a population parameter; in other words, he saw it as a form of data analysis into which people inject subjective information. (At the time this view was opposed by Neyman and Pearson, who held that hypothesis testing is a method by which decision-makers operate under uncertainty, one that allows a clear choice between two alternatives while controlling the probability of error. The two schools fought a long and bitter controversy.) Although Fisher's view has also been challenged by modern statisticians, he made an enormous contribution to the development of modern hypothesis testing. Fisher's procedure was as follows (a code sketch follows the list):
- State a hypothesized value for the population parameter.
- Select a test statistic (such as a z- or t-statistic) whose distribution is fully known when the hypothesized parameter value is true.
- Draw a random sample from the population under study, compute the value of the test statistic, and from it compute the P-value (the observed significance level): the probability, when the hypothesis is true, that the test statistic is greater than or equal to the value actually observed.
- If p < 0.01, the result is a strong ground for decision: reject the hypothesized parameter value.
- If 0.01 < p < 0.05, the result is a weaker ground for decision, but the hypothesized parameter value is still rejected.
- If p > 0.05, the result leans toward accepting the hypothesized parameter value.
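Here is a minimal sketch of this procedure, assuming a one-sample z-test on a normal population with known standard deviation (all numbers are hypothetical):

```python
# A minimal sketch of Fisher's procedure, assuming a one-sample z-test on a
# normal population with known standard deviation; all numbers are hypothetical.
import numpy as np
from scipy import stats

mu0 = 50.0          # step 1: hypothesized parameter value
sigma = 10.0        # assumed known population standard deviation

# step 3: a random sample from the population under study (simulated here)
rng = np.random.default_rng(1)
sample = rng.normal(loc=53.0, scale=sigma, size=40)

# step 2: the z-statistic, whose null distribution is standard normal
z = (sample.mean() - mu0) / (sigma / np.sqrt(len(sample)))

# observed significance level: P(Z >= z) for a right-sided alternative
p_value = stats.norm.sf(z)

# Fisher's informal decision rule
if p_value < 0.01:
    print(f"p = {p_value:.4f}: strong evidence, reject mu = {mu0}")
elif p_value < 0.05:
    print(f"p = {p_value:.4f}: weaker evidence, still reject mu = {mu0}")
else:
    print(f"p = {p_value:.4f}: insufficient evidence to reject mu = {mu0}")
```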
However, in Fisher's era, computing a P-value was laborious for lack of computing machinery, so people used the fixed-level statistical testing method instead: the comparison of a t-value with a critical t-value that we all learned first. In that method the significance level α is chosen before the test, which means the rejection region is determined in advance. But if the same α is used throughout, every test conclusion carries the same nominal reliability, and the method cannot give a precise measure of how strongly the observed data contradict the null hypothesis. As long as the statistic falls in the rejection region, the conclusion is the same, namely "the result is significant"; yet in fact, statistics that fall at different points of the rejection region represent very different degrees of significance. The sketch below illustrates the contrast.
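A minimal sketch with hypothetical numbers: both statistics below land in the rejection region and receive the same fixed-level verdict, yet their exact P-values differ by orders of magnitude.

```python
# A minimal sketch (hypothetical numbers) of the fixed-level method: fix alpha
# in advance, find the critical value, and reject whenever the statistic falls
# in the rejection region -- regardless of how far inside it falls.
from scipy import stats

alpha = 0.05
df = 20                                      # degrees of freedom of the t-test
t_crit = stats.t.ppf(1 - alpha, df)          # right-sided critical value

for t_obs in (1.80, 4.50):                   # two statistics, both "significant"
    reject = t_obs > t_crit
    p = stats.t.sf(t_obs, df)                # the exact p-value
    print(f"t = {t_obs}: reject = {reject}, p = {p:.5f}")

# Both values fall in the rejection region, so the fixed-level method gives
# the same verdict -- yet their p-values differ by orders of magnitude.
```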
Now, with the development of computers, calculating P-values is no longer difficult, and the P-value has become one of the most commonly used statistical indicators.
To describe the calculation of the P-value, denote the test statistic by $Z$ and the value of the test statistic computed from the sample data by $z_C$.
Left-sided test: $H_0: \mu \ge \mu_0$ vs $H_1: \mu < \mu_0$

The P-value is the probability, when $\mu = \mu_0$, that the test statistic is less than or equal to the value computed from the actually observed sample data, i.e. P-value $= P(Z \le z_C \mid \mu = \mu_0)$.
Right-sided test: $H_0: \mu \le \mu_0$ vs $H_1: \mu > \mu_0$

The P-value is the probability, when $\mu = \mu_0$, that the test statistic is greater than or equal to the value computed from the actually observed sample data, i.e. P-value $= P(Z \ge z_C \mid \mu = \mu_0)$.
Two-sided test: $H_0: \mu = \mu_0$ vs $H_1: \mu \ne \mu_0$

The P-value is the probability, when $\mu = \mu_0$, that the test statistic is at least as large in absolute value as the value computed from the actually observed sample data, i.e. P-value $= 2P(Z \ge |z_C| \mid \mu = \mu_0)$.
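As a minimal sketch, the three P-values can be computed as follows, assuming a standard normal null distribution and a hypothetical observed statistic $z_C$:

```python
# A minimal sketch computing the three p-values above for a z-test, assuming
# a standard normal null distribution; the observed statistic z_C is hypothetical.
from scipy import stats

z_c = -1.96                                  # observed test statistic

p_left  = stats.norm.cdf(z_c)                # P(Z <= z_C), left-sided test
p_right = stats.norm.sf(z_c)                 # P(Z >= z_C), right-sided test
p_two   = 2 * stats.norm.sf(abs(z_c))        # 2 * P(Z >= |z_C|), two-sided test

print(f"left-sided:  {p_left:.4f}")          # ~0.0250
print(f"right-sided: {p_right:.4f}")         # ~0.9750
print(f"two-sided:   {p_two:.4f}")           # ~0.0500
```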
The significance of the P-value
The P-value is the probability, computed assuming the null hypothesis is true, of obtaining the observed sample result or one more extreme. A small P-value means such an outcome would rarely occur by chance; if it occurs anyway, then by the small-probability principle we have reason to reject the null hypothesis, and the smaller the P-value, the stronger the grounds for rejecting it.
In summary, the smaller the P-value, the more significant the result. But whether a test result is "significant", "moderately significant", or "highly significant" must be judged by us from both the magnitude of the P-value and the practical problem at hand.
P-Value, "significant difference" and "significant difference"