Converting data that does not follow a normal distribution into a normal or approximately normal distribution

Source: Internet
Author: User

Variable transformation can be used to convert data from a non-normal distribution to a normal or approximately normal distribution. Commonly used transformations include the logarithmic transformation, the square root transformation, the reciprocal transformation, and the arcsine square root transformation. The appropriate method should be chosen according to the properties of the data.
1. Logarithmic transformation: the logarithm of the original data x is used as the new analysis data.
X' = lg(x)
X' = lg(x + 1) can be used when the original data contain small values or zeros.
You can also use X' = lg(x + k) or X' = lg(k - x) as required, where k is a constant.
The logarithmic transformation is commonly used: 1) to make data that follow a lognormal distribution normal. For example, the distribution of certain pollutants in the environment, or of some trace elements in the human body, can be brought closer to normal by a logarithmic transformation. 2) to achieve homogeneity of variance, in particular when the standard deviation of each sample is proportional to its mean, that is, when the coefficient of variation (CV) is close to constant.
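As a rough illustration of the logarithmic transformation, the sketch below applies lg(x + k) to simulated lognormal data and compares a simple skewness measure before and after. The function names `log_transform` and `skewness`, the seed, and the simulated sample are assumptions made for this example, not part of the original method description.

```python
import math
import random
import statistics


def log_transform(values, k=0.0):
    """Return lg(x + k) for each x; requires x + k > 0 for every value."""
    shifted = [x + k for x in values]
    if min(shifted) <= 0:
        raise ValueError("x + k must be positive for every value")
    return [math.log10(s) for s in shifted]


def skewness(values):
    """Moment-based sample skewness (close to 0 for a symmetric sample)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


# Simulated lognormal data (illustrative only, fixed seed for repeatability).
rng = random.Random(0)
raw = [rng.lognormvariate(0.0, 1.0) for _ in range(1000)]
transformed = log_transform(raw)

print("skewness before:", round(skewness(raw), 2))
print("skewness after: ", round(skewness(transformed), 2))
```

For data containing zeros, calling `log_transform(values, k=1)` corresponds to X' = lg(x + 1).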
2. Square root transformation: the square root of the original data x is used as the new analysis data.
X' = sqrt(x)
The square root transformation is commonly used: 1) to make count data that follow a Poisson distribution, or mildly skewed data, approximately normal. 2) to achieve homogeneity of variance when the variance of each sample is positively correlated with its mean.
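The variance-stabilizing effect of the square root transformation on count data can be sketched as follows. The example simulates Poisson counts with three different means and prints the variance before and after taking square roots. The helper `poisson_sample` (Knuth's multiplication method), the chosen means, and the seed are assumptions for illustration only.

```python
import math
import random
import statistics


def sqrt_transform(values):
    """Return sqrt(x) for each x; requires x >= 0."""
    if min(values) < 0:
        raise ValueError("square root needs non-negative data")
    return [math.sqrt(x) for x in values]


def poisson_sample(rng, lam):
    """Draw one Poisson(lam) count using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


rng = random.Random(1)  # fixed seed, illustrative only
variances = {}
for lam in (10, 40, 160):
    counts = [poisson_sample(rng, lam) for _ in range(2000)]
    roots = sqrt_transform(counts)
    variances[lam] = (statistics.variance(counts), statistics.variance(roots))
    print(f"mean {lam}: variance before {variances[lam][0]:.2f}, "
          f"after {variances[lam][1]:.3f}")
```

In this sketch the raw variance grows with the mean, while the variance of the transformed counts should stay roughly constant across the three groups.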
3. Reciprocal transformation: the reciprocal of the original data x is used as the new analysis data.
X' = 1/x
It is often used for data that fluctuate widely, where it can reduce the influence of extreme values.
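A minimal sketch of the reciprocal transformation is shown below, using a small made-up sample with one extreme value. The function name `reciprocal_transform` and the sample data are hypothetical. Note that zeros cannot be transformed, and that for positive data the transformation reverses the ordering of the values.

```python
def reciprocal_transform(values):
    """Return 1/x for each x; zero cannot be transformed."""
    if any(x == 0 for x in values):
        raise ValueError("reciprocal is undefined for zero")
    return [1.0 / x for x in values]


raw = [1.0, 2.0, 4.0, 5.0, 100.0]  # made-up data with one extreme value
transformed = reciprocal_transform(raw)

print(transformed)
print("range before:", max(raw) - min(raw))
print("range after: ", max(transformed) - min(transformed))
```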
4. Arcsine square root transformation: the arcsine of the square root of the original data x is used as the new analysis data.
X' = arcsin(sqrt(x))
It is often used for rates or percentages that follow a binomial distribution. It is generally considered that when the overall rate is small (such as <30%) or large (such as >70%), the deviation from normality is more pronounced. Applying the arcsine square root transformation to the sample rates can bring the data closer to a normal distribution and help meet the homogeneity of variance requirement.
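The effect of the arcsine square root transformation on rates can be sketched with simulated binomial data. The example below assumes rates expressed as proportions in [0, 1] (percentages would first be divided by 100); the names `arcsine_sqrt_transform` and `simulated_rates`, the number of trials, and the seed are assumptions made for this illustration.

```python
import math
import random
import statistics


def arcsine_sqrt_transform(proportions):
    """Return asin(sqrt(p)) in radians for each proportion p in [0, 1]."""
    if any(p < 0 or p > 1 for p in proportions):
        raise ValueError("proportions must lie in [0, 1]")
    return [math.asin(math.sqrt(p)) for p in proportions]


def simulated_rates(rng, p, trials, repeats):
    """Simulate `repeats` observed rates, each from `trials` Bernoulli(p) draws."""
    return [
        sum(rng.random() < p for _ in range(trials)) / trials
        for _ in range(repeats)
    ]


rng = random.Random(2)  # fixed seed, illustrative only
variances = {}
for p in (0.2, 0.5, 0.8):
    rates = simulated_rates(rng, p, trials=100, repeats=2000)
    angles = arcsine_sqrt_transform(rates)
    variances[p] = (statistics.variance(rates), statistics.variance(angles))
    print(f"p={p}: variance before {variances[p][0]:.5f}, "
          f"after {variances[p][1]:.5f}")
```

In this sketch the variance of the raw rates depends on p, while the variance of the transformed values should be roughly the same for all three rates.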


Choose the appropriate transformation based on your own data. Alternatively, other analysis methods that do not assume normality, such as the rank sum test, can be considered.
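As one possible rank-based alternative, the sketch below implements a two-sample Wilcoxon rank sum (Mann-Whitney U) test with a normal approximation. It is a simplified illustration: it uses average ranks for ties but applies no tie or continuity correction, so it is only reasonable for moderately large samples. The function names and the example data are hypothetical.

```python
import math
from statistics import NormalDist


def midranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + end) / 2 + 1  # ranks are 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = average
        start = end + 1
    return ranks


def rank_sum_test(a, b):
    """Return (U statistic for sample a, two-sided p-value by normal approximation)."""
    n1, n2 = len(a), len(b)
    ranks = midranks(list(a) + list(b))
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie or continuity correction
    z = (u - mean_u) / sd_u
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u, p_value


# Made-up samples for illustration.
u, p_value = rank_sum_test([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
print("U =", u, "p =", round(p_value, 4))
```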
