[Series] Lecture 4: Measurement accuracy, repeatability, reproducibility and standard deviation

http://www.jlbjb.com/edu/show.asp?Id=443

Metrology lectures: lectures on general metrology terminology

Shi Changyan, China Institute of Metrology

I. Measurement accuracy refers to the "closeness of agreement between a measurement result and the true value of the measurand" (clause 5.5 of JJF 1001-1998, General Metrological Terms and Definitions; only the clause number is cited below).

The closeness of agreement in this definition is qualitative, not quantitative. That accuracy is a qualitative concept can be understood from three angles. First, the true value of a measurand is in effect the measurand itself; a true value fully consistent with the definition of a given specific quantity is an idealized concept that is hard to put into practice. It is therefore impossible to state an exact quantitative value of accuracy. Second, traditional error theory treats accuracy as the combination of systematic and random errors, but no internationally unified method for combining them exists. Finally, "accuracy" as habitually used actually expresses a degree of inaccuracy; people simply prefer a commendatory term to a derogatory one. As a result, the higher the accuracy, the smaller the accuracy "value". When the accuracy is said to be less than 1%, does that mean the error is less than 1% or greater than 1%? This ambiguity sometimes leaves people unsure why the concept of accuracy needs to be introduced at all.

Accuracy is a term retained for historical reasons. In 1993 the seven international organizations stipulated that accuracy denotes only the agreement or closeness between a measurement result and the true value of the measurand. It is purely a qualitative concept and should not be quantified. For example, one may say "this research project places high demands on measurement accuracy" or "the measurement accuracy shall meet the requirements of use, or of a given technical specification or standard". In other words, one may say that the accuracy is high or low, that the accuracy is class 0.25 or grade 3, or that the accuracy meets standard ×, but one should not say that the accuracy is 0.25%, 16 mg, ≤ 16 mg or ± 16 mg. That is, accuracy should not be attached to a numerical value. If a number is required, uncertainty is available. For example, one may say "the expanded uncertainty of the measurement result is 2 μ…" but should not say "the accuracy is 2 μ…".

The "accuracy" stated in some instrument manuals or technical specifications is actually the maximum permissible error (or limit of permissible error) of the instrument and should not be confused with the term measurement accuracy as defined here. The accuracy class of a measuring instrument is the grade or class assigned to instruments that meet certain metrological requirements and keep their errors of indication within specified limits. The class is usually denoted by a number or symbol adopted by convention.

Do not use the term "precision" to mean "accuracy": the former reflects only dispersion and cannot replace the latter. The traditional definition of precision is the closeness of agreement between independent observations obtained under specified conditions. Precision therefore expresses only that measurement results cannot be exactly repeated or reproduced because of random effects, whereas accuracy expresses the disagreement between a measurement result and the true value caused by the combined influence of random and systematic effects. Precision is in fact also a qualitative concept and should not be used as a term for quantitative estimates. Instead, precision under repeatability conditions can be expressed quantitatively by the repeatability of the measurement results (see 5.6), and precision under reproducibility conditions by the reproducibility of the measurement results (see 5.7). For example, one may say that the repeatability of the measurement results is 2 mg, or that the repeatability standard deviation is 2 mg, but not that "the precision is 2 mg".

Because the term precision (in China often abbreviated in everyday usage) has been used so widely and loosely, sometimes even contrary to its traditional definition, it is now avoided internationally, and the seven international organizations no longer use it. When the influence of possible random errors or random effects on a measurement result must be expressed or estimated quantitatively, the repeatability standard deviation or the reproducibility standard deviation can be used. The term trueness (sometimes rendered "correctness"), used in the past, actually refers to the influence of systematic errors or systematic effects, which can likewise be expressed or estimated quantitatively.

II. Repeatability [of measurement results] refers to the "closeness of agreement between the results of successive measurements of the same measurand carried out under the same measurement conditions" (clause 5.6).

The "closeness of agreement" in this definition is quantitative. It can be expressed by the dispersion of the results obtained from repeated measurements of the same quantity under repeatability conditions. The quantity most commonly used to express this dispersion is the experimental standard deviation (see clause 5.8). The standard deviation calculated with the Bessel formula under repeatability conditions is called the "repeatability standard deviation" and is denoted $s_r$. The subscript $r$ stands for the "repeatability limit". This is the range within which the difference between two measurement results obtained under repeatability conditions lies with 95% probability; that is, the probability that the difference between two results is ≤ $r$ is 95%. If the results of repeated measurements are normally distributed and the calculated $s_r$ is sufficiently reliable (has enough degrees of freedom), the repeatability limit is roughly three times the repeatability standard deviation (more precisely, $r = 1.96\sqrt{2}\,s_r \approx 2.8\,s_r$). Observers can usually use the repeatability limit to judge the uncertainty contributed by the measurement method (see clause 5.9) and to assess whether the measurement results meet the requirements.
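As a rough illustration of the quantities above, the following Python sketch computes a repeatability standard deviation with the Bessel formula and derives a repeatability limit from it. The readings are made-up values and the function names are hypothetical; the factor $1.96\sqrt{2}$ assumes normally distributed results and a well-determined standard deviation.

```python
import math

def repeatability_std(readings):
    """Experimental standard deviation (Bessel formula, n - 1 in the denominator)."""
    n = len(readings)
    if n < 2:
        raise ValueError("at least two readings are needed")
    mean = sum(readings) / n
    return math.sqrt(sum((x - mean) ** 2 for x in readings) / (n - 1))

def repeatability_limit(s_r, coverage_factor=1.96):
    """Limit for |x1 - x2| at about 95 % probability, assuming normality.

    The difference of two independent results has standard deviation
    sqrt(2) * s_r, so r = 1.96 * sqrt(2) * s_r, about 2.8 * s_r.
    """
    return coverage_factor * math.sqrt(2) * s_r

# Hypothetical readings (in mg) taken under repeatability conditions.
readings_mg = [10.2, 10.4, 10.1, 10.3, 10.5, 10.3]
s_r = repeatability_std(readings_mg)
r = repeatability_limit(s_r)
print(f"s_r = {s_r:.3f} mg, r = {r:.3f} mg")
```

Two results obtained under the same conditions that differ by more than `r` would then be a reason to look for a disturbance in the measurement, not a proof that one exists.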

Repeatability conditions comprise the five items listed in Note 2 of the definition. In other words, the repeated measurements are completed within the shortest possible time interval, with the procedure, observer, instrument and environment kept as nearly identical as possible. "Short time" here can be understood as a period during which the first four conditions can be guaranteed to be the same or to remain unchanged; this depends mainly on the competence of the personnel, the performance of the instrument, and the monitoring of the various influence quantities (see clause 4.8). From the standpoint of mathematical statistics and data processing, the measurement should be in a state of statistical control during this period, that is, in a random state that obeys statistical laws. In plain terms, it is the period during which the measurement is in a normal state. Variability among repeated observations arises because the various influence quantities cannot be held perfectly constant. The repeatability standard deviation is also called the within-group standard deviation.

III. Reproducibility [of measurement results] refers to the "closeness of agreement between the results of measurements of the same measurand carried out under changed measurement conditions" (clause 5.7).

The "closeness of agreement" in this definition is quantitative. It can be expressed by the dispersion of the measurement results of the same quantity under reproducibility conditions. The quantity expressing this dispersion is usually calculated with the Bessel formula; it is called the "reproducibility standard deviation" and is denoted $s_R$. The subscript $R$ stands for the "reproducibility limit", whose meaning is analogous to that of the repeatability limit in clause 5.6. If the reproducibility conditions refer to two laboratories at different locations, the observer can use the reproducibility limit to check whether a significant systematic effect between the two laboratories contributes to the uncertainty.

Reproducibility conditions comprise the eight items listed in Note 2 of the definition. One, several or all of these items may be changed. A valid statement of reproducibility must therefore specify the conditions that were changed (the reproducibility conditions). For example, in a calibration laboratory comparison or proficiency test, the pilot laboratory circulates a third-class standard weight to several participating laboratories in turn, and each laboratory is required to measure it by the method specified in the verification regulation for third-class standard weights. Here the measurement principle, the measurement method and the conditions of use are unchanged, whereas the observer, the measuring instrument (balance), the reference measurement standard (second-class standard weight), the location and the time have all changed.

In this case, the result obtained by each laboratory should first be corrected by the correction of the reference measurement standard that the laboratory used, and only then should $s_R$ be calculated with the Bessel formula. This is the meaning of Note 4, "measurement results are here usually understood to be corrected results". By contrast, when a measurement is repeated several times under the repeatability conditions of clause 5.6, the same reference measurement standard (the same second-class standard weight) is used within one laboratory, so no correction for the reference standard is needed before calculating $s_r$. Two Chinese terms are in use for reproducibility; they are synonymous. The reproducibility standard deviation is also called the between-group standard deviation.
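The order of operations described above (correct first, then apply the Bessel formula) can be sketched in Python as follows. All laboratory names, readings and corrections are invented for illustration, and the sketch assumes that a correction is added algebraically to the uncorrected reading.

```python
import math

def bessel_std(values):
    """Experimental standard deviation with n - 1 in the denominator."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

# Hypothetical comparison data: each laboratory reports its reading of the
# circulated weight (deviation from nominal, in mg) and the correction of
# its own reference standard (in mg).
lab_data = {
    "lab_A": {"reading_mg": 0.52, "reference_correction_mg": -0.02},
    "lab_B": {"reading_mg": 0.47, "reference_correction_mg": 0.05},
    "lab_C": {"reading_mg": 0.55, "reference_correction_mg": -0.04},
    "lab_D": {"reading_mg": 0.49, "reference_correction_mg": 0.00},
}

# Step 1: correct each result for the laboratory's own reference standard.
corrected = [d["reading_mg"] + d["reference_correction_mg"]
             for d in lab_data.values()]

# Step 2: reproducibility standard deviation of the corrected results.
s_R = bessel_std(corrected)

# For comparison only: skipping the correction mixes the known offsets of
# the reference standards into the dispersion.
s_uncorrected = bessel_std([d["reading_mg"] for d in lab_data.values()])

print(f"s_R (corrected)   = {s_R:.4f} mg")
print(f"s   (uncorrected) = {s_uncorrected:.4f} mg")
```

With these invented numbers the uncorrected dispersion is larger, because the differences between the reference standards have not been removed; with real data the effect depends on the actual corrections.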

IV. Experimental standard deviation refers to "the quantity $s$ that characterizes the dispersion of the results of $n$ measurements of the same measurand, calculated as follows:

$$s = \sqrt{\frac{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}{n-1}}$$

where $x_i$ is the result of the $i$-th measurement and $\bar{x}$ is the arithmetic mean of the $n$ measurement results" (clause 5.8).

The results or observations from a finite number $n$ of measurements of the same measurand can be regarded as a sample drawn from the population of infinitely many measurement results. Mathematical statistics uses the information obtained from this sample (such as the arithmetic mean $\bar{x}$ and the experimental standard deviation $s$) to infer properties of the population (such as the expectation $\mu$ and the variance $\sigma^2$). Note 1 of the definition states: when the $n$ values are regarded as a sample from a distribution, $\bar{x}$ is an unbiased estimate of the expectation $\mu$ of that distribution, and $s^2$ is an unbiased estimate of its variance $\sigma^2$. The expectation is the arithmetic or weighted mean of the observations obtained from infinitely many measurements, also called the population mean $\mu$. It obviously exists only in theory and can be expressed as

$$\mu = \lim_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n} x_i$$

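The variance $\sigma^2$ in Note 1 is the arithmetic mean of the squared differences between the infinitely many measurement results $x_i$ and the expectation $\mu$. It too exists only in theory and can be expressed as

$$\sigma^2 = \lim_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \mu\right)^2$$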

The positive square root $\sigma$ of the variance is usually called the standard deviation, also known as the population standard deviation or the theoretical standard deviation. The experimental standard deviation $s$ of this definition, obtained from a finite number of measurements, is also called the sample standard deviation. $s$ is an estimate of $\sigma$.

Figure: Population mean and standard deviation of the normal distribution.

The figure shows normal distributions with population mean $\mu$ and population standard deviation $\sigma$. As panel (c) shows, the smaller $\sigma$ is, the more concentrated and sharply peaked the distribution curve, and the smaller the dispersion of the measurement results or observations; the larger $\sigma$ is, the flatter the curve and the larger the dispersion. As panel (a) shows, the curve has its maximum at $x = \mu$; it is unimodal and symmetric about the vertical line $x = \mu$, and it has two inflection points at $x = \mu \pm \sigma$. As panel (b) shows, the distribution is centered at $x = \mu$, so the value of $\mu$ determines the position of the curve on the $x$ axis. Panel (d) compares normal distribution curves with two different values of $\mu$ and different values of $\sigma$.
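Since the figure itself is not reproduced here, the properties it illustrates can be checked numerically. The Python sketch below evaluates the normal density for three arbitrarily chosen values of $\sigma$ and estimates its curvature just inside and just outside $x = \mu + \sigma$; the function names and the chosen values are assumptions made for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Probability density of the normal distribution N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def curvature(x, mu, sigma, h=1e-4):
    """Central-difference estimate of the second derivative of the density."""
    return (normal_pdf(x + h, mu, sigma)
            - 2.0 * normal_pdf(x, mu, sigma)
            + normal_pdf(x - h, mu, sigma)) / (h * h)

mu = 0.0
summary = {}
for sigma in (0.5, 1.0, 2.0):
    summary[sigma] = {
        # Height of the curve at its center x = mu.
        "peak": normal_pdf(mu, mu, sigma),
        # Curvature slightly inside and slightly outside x = mu + sigma.
        "curvature_inside": curvature(mu + 0.9 * sigma, mu, sigma),
        "curvature_outside": curvature(mu + 1.1 * sigma, mu, sigma),
    }
    print(sigma, summary[sigma])
```

A smaller $\sigma$ gives a higher peak, and the curvature changes sign between the two probe points, which is consistent with an inflection point at $x = \mu + \sigma$.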

$\bar{x}$ is an unbiased estimate of $\mu$, and $s^2$ is an unbiased estimate of $\sigma^2$. "Unbiased estimate" here can be understood as follows: the probability that $\bar{x}$ is larger than $\mu$ equals the probability that $\bar{x}$ is smaller than $\mu$ (both are 50%), and $(\bar{x} - \mu) \to 0$ as $n \to \infty$. It is worth noting that although $s^2$ is an unbiased estimate of $\sigma^2$, $s$ is not an unbiased estimate of $\sigma$ but a slightly low one: the probability that $(s - \sigma)$ is negative is greater than the probability that $(s - \sigma)$ is positive.
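The difference between $s^2$ and $s$ as estimators can be made visible with a small simulation. The Python sketch below draws many samples of size $n = 5$ from a normal distribution and averages $s^2$ and $s$ over the samples; the sample size, the number of trials, the seed and the distribution parameters are arbitrary choices for illustration.

```python
import math
import random

def sample_variance(values):
    """Sample variance s^2 with n - 1 in the denominator (Bessel formula)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)

rng = random.Random(12345)  # fixed seed so the run is repeatable
mu, sigma, n, trials = 10.0, 1.0, 5, 20000

s2_values = []
for _ in range(trials):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    s2_values.append(sample_variance(sample))

# Average of s^2 over all samples: should be close to sigma^2.
mean_s2 = sum(s2_values) / trials
# Average of s over all samples: expected to fall below sigma.
mean_s = sum(math.sqrt(v) for v in s2_values) / trials
# Fraction of samples in which s is smaller than sigma.
share_s_below_sigma = sum(v < sigma ** 2 for v in s2_values) / trials

print(f"mean of s^2 = {mean_s2:.3f} (sigma^2 = {sigma ** 2:.3f})")
print(f"mean of s   = {mean_s:.3f} (sigma   = {sigma:.3f})")
print(f"share of samples with s < sigma = {share_s_below_sigma:.3f}")
```

For small $n$ the effect is clearly visible; it shrinks as $n$ grows.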

$s$ is the experimental standard deviation of a single observation $x_i$, whereas $s/\sqrt{n}$ is the experimental standard deviation of the arithmetic mean $\bar{x}$ of the $n$ measurements; it is an estimate of the standard deviation of the distribution of $\bar{x}$. To distinguish the two, the former is written $s(x)$ and the latter $s(\bar{x})$, so that

$$s(\bar{x}) = \frac{s(x)}{\sqrt{n}}$$

Usually $s(x)$ characterizes the repeatability of a measuring instrument, while $s(\bar{x})$ is used to evaluate the dispersion of the result obtained from $n$ measurements with that instrument. As the number of measurements $n$ increases, the dispersion $s(\bar{x})$ of the result decreases in inverse proportion to $\sqrt{n}$, because positive and negative errors partly cancel when many observations are averaged. Therefore, when the measurement requirements are demanding or a small standard deviation of the result is desired, $n$ should be increased as appropriate. Once $n > 20$, however, the rate at which $s(\bar{x})$ decreases slows down as $n$ grows. The choice of $n$ is therefore a trade-off, because more measurements lengthen the measurement time and raise the measurement cost. In general $n \ge 3$ should be taken, and $n = 4$ to $20$ is recommended.
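The diminishing return from additional measurements can be seen from the factor $1/\sqrt{n}$ alone. The Python sketch below computes $s(x)$ and $s(\bar{x})$ for a set of made-up readings and then tabulates $1/\sqrt{n}$ for a few values of $n$; the readings and variable names are hypothetical.

```python
import math
import statistics

# Hypothetical readings (in mg) from one instrument under repeatability conditions.
readings_mg = [10.2, 10.4, 10.1, 10.3, 10.5, 10.3]

# statistics.stdev uses n - 1 in the denominator (Bessel formula).
s_x = statistics.stdev(readings_mg)
# Experimental standard deviation of the arithmetic mean.
s_mean = s_x / math.sqrt(len(readings_mg))
print(f"s(x) = {s_x:.4f} mg, s(mean) = {s_mean:.4f} mg")

# Ratio s(mean) / s(x) = 1 / sqrt(n) for several numbers of measurements.
factors = {n: 1.0 / math.sqrt(n) for n in (3, 4, 10, 20, 50, 100)}
for n, factor in factors.items():
    print(f"n = {n:3d}: s(mean) / s(x) = {factor:.3f}")
```

Going from 4 to 20 measurements reduces the ratio from 0.5 to about 0.22, while going from 20 to 100 measurements only reduces it to 0.1, at five times the measurement effort.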

It should be emphasized that $s(\bar{x})$ is the experimental standard deviation of the mean and should not be called the standard error of the mean.