Statistics-Single Variable descriptive statistics

Source: Internet
Author: User

Review the statistical basis and prepare for the SPSS exam.

Get a strange set of data, like meeting a stranger, we meet a stranger, the first thing is often to look at TA, processing data is the same. Descriptive statistics are looking at a set of data and have a ballpark understanding of the data. In general, the data do three processing: centralized trend central tendency, discrete trend dispersion tendency, distribution morphology distribution tendency. Although simple, but the most basic, is the premise of our subsequent data analysis, through the descriptive statistics of the data, we can choose the appropriate statistical methods to prevent misuse.

Univariate statistical analysis In some books is also called a meta-statistic, only to face a variable, the method is more rigid fixed single.

Part one:central Tendency a positional statistic that sets the different observations (observation) of a variable to a value.

1.mean (X-bar), arithmetic mean (the average) a thing. -note that when used, the data should be normally distributed, even if it is not satisfied, it should be a single peak & basic symmetrical distribution. Do not choose to use the arithmetic mean when there are extreme values

2.median, the value of the middle number of the position. Find the location first, then look for the value. Position: (n+1)/2, odd position corresponding value, even? 5 about two numbers of mean. --arbitrary distribution patterns can be used

3.mode, majority

4. Other:

4.1 truncated mean trimmed mean, also known as the correction of the mean. Remove Max, min5%. The advantage is that it removes the effect of extreme values-which can be used when extreme values are available. The downside is that 10% of the data itself is real information, which is removed to reduce the information.

The 4.2 geometric mean G (geometric mean) is more used in medical statistics, when the data distribution is asymmetric, but the transformed representation of the symmetric distribution can be used.

4.3 Harmonic mean

Compared to mean and median, it should be said that the use of mean more widely, the use of more information, in the sample survey, the value of mean with the changes in the samples of the range of small, more stable, it should be said to be a better statistic, but once there is an extreme value of the existence of the mean will be greatly affected, Therefore, you should use median at this time.

Also, depending on the type of the variable, select the statistic. Nominal variables, can only use mode, otherwise meaningless, but the binary nominal variable may use the mean value. For fixed-order variables, you should use median, fixed-and constant-ratio variables using mean, in the absence of extreme values.

Supplementary, in the group distance grouping data, the mean value is calculated by the group median value method, the Median= group group lower bound +[(n/2-cfm-1)/fm]*i i is the group distance, the FM group frequency, CFM-1 group above the cumulative frequency.

Part two:dispersion Tendency, which is a scale statistic, measures the difference between data and how much the problem

1. Full distance range, foot distance

2. The use of variance is pushed down, but because of the unit of measure problem, the standard deviation treatment is used in practice. The standard deviation is consistent with the mean unit. Note the sample standard deviation. Note that the standard deviation calculation uses all the data and is also subject to extreme value interference.

3. The proportion of different people, see how the mode is representative.

4. Coefficient of variation and dispersion coefficient. Cv=s/mean. The dispersion of different samples can be compared.

5. Percentile, four percentile-eliminates extreme value interference. When the standard deviation is not available, it can be used.

Statistics-univariate descriptive statistics

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.