Using Excel for Data Mining (4): Highlighting Outliers
After you configure your environment, you can use Excel for data mining.
For environment configuration, see:
http://blog.csdn.net/xinxing__8185/article/details/46445435
The sample file dmaddins_sampledata.xlsx can be downloaded from:
http://download.csdn.net/detail/xinxing__8185/8780481
In the sample workbook, select the Table Analysis Tools Sample sheet. It contains customer information: marital status, gender, income, number of children, education level, occupation, whether the customer owns a house, number of cars, living area, age, whether the customer purchased a bicycle, and so on.
Click any cell in the table, and an additional tab appears on the ribbon.
In data obtained from a set of parallel measurements, individual values sometimes lie far from the rest of the data. These values are called outliers. There are many statistical methods for analyzing outliers.
On the surface, outliers are very large or very small values, which may or may not be caused by errors. Because such values are few and not representative, they are deleted so that they do not affect subsequent data mining.
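As one illustration of the statistical methods mentioned above, the following is a minimal Python sketch of the common interquartile range (IQR) rule. It is not the method the Excel add-in uses; the function name `iqr_outliers`, the factor `k=1.5`, and the income figures are assumptions made up for this example.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical yearly incomes; the last value is far from the rest.
incomes = [30000, 40000, 40000, 50000, 60000, 70000, 80000, 90000, 500000]
print(iqr_outliers(incomes))  # [500000]
```

A rule like this looks at one column at a time, so it can only catch values that are extreme on their own.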
Below, the sample data is analyzed using the Highlight Outliers feature (labeled Highlight Exceptions in the English version of the add-in):
Click Highlight Outliers, and the following dialog box appears:
Select the columns you want to analyze, and the following report is produced:
The results show that outliers are not detected in each column in isolation. Detection takes into account the combination of values across the selected columns.
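To see why a combination of columns matters, here is a rough Python sketch that flags rows whose combination of values is rare, even though each value is common on its own. It only illustrates the idea and is not the algorithm the add-in runs; the function `rare_combinations`, the `min_share` threshold, and the sample rows are hypothetical.

```python
from collections import Counter

def rare_combinations(rows, columns, min_share=0.1):
    """Return rows whose combination of values in `columns`
    occurs in less than `min_share` of all rows."""
    keys = [tuple(row[c] for c in columns) for row in rows]
    counts = Counter(keys)
    total = len(rows)
    return [row for row, key in zip(rows, keys)
            if counts[key] / total < min_share]

# Hypothetical data: "Manual" and "High" are both common values,
# but they almost never appear together.
rows = (
    [{"Occupation": "Manual", "Income": "Low"}] * 10
    + [{"Occupation": "Professional", "Income": "High"}] * 10
    + [{"Occupation": "Manual", "Income": "High"}]  # unusual pairing
)
flagged = rare_combinations(rows, ["Occupation", "Income"])
print(flagged)  # [{'Occupation': 'Manual', 'Income': 'High'}]
```

Checking either column alone would flag nothing here, since "Manual" and "High" each appear in about half of the rows.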
Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.