"Who says rookie doesn't have data analysis" Learning Notes 2_ Learning Notes

Source: Internet
Author: User
Tags numeric value

First, data cleaning

1. Duplication of data processing

(1) COUNTIF function constructs auxiliary columns


The C column is screened, the non 1 is selected, and then the deletion can leave the duplicates.

Or you can sort the C column in descending order, and then delete the first few rows that are not 1

(2) Excel Advanced Filter


(3) Conditional formatting: Highlighting duplicate values

(4) Pivot table: Count

(5) Excel deletes duplicates

2. Fill in missing data

(1) If the missing value is too high, there is a problem, acceptable missing value below 10%

(2) Four ways to handle trend values:

A. Using the average of sample statistics to replace

B. Value substitution calculated by statistical model

C. Delete records with missing values

D. There will be deficiencies worthy of record retention

(3) There are two main missing values: null value and Error identifier

A. Null value: CTRL+G, Position null value, directly enter the value to replace, press Ctrl+enter, all the empty values of the part is filled with the value just entered

B. Error identifiers: Excel find Replacements

Ps.shift continuously selected, CTRL does not continuously select

3. Detecting logical errors in data

According to the specific circumstances


The background is an option for a questionnaire that is titled Multiple choices and can only choose 3, 0 is not selected, 1 is selected

The figure uses two methods

The first method is to count the total number of options that are selected to be marked 1 in each row, or to mark an error if more than 3.

The second method is to use conditional format to select illegal numbers from 0 and 1. Using the OR function, =or (b3=1,b3=0) =false that if the proposition "B3 1 or 0" is wrong (=false), that is, B3 is neither 0 nor 1, then it is highlighted

Second, data processing

1. Data extraction

(1) Field breakdown

A.excel breakdown function

B.left function Right function

(2) fields combine:& and CONCATENATE functions


(3) field matching

VLOOKUP: to match a column

VLOOKUP (parameter 1, parameter 2, parameter 3, parameter 4)

Parameter 1: the area being searched

Parameter 2: the search area. Note that the search area must be in the first column of the search area, which is the column with parameter 1, and be sure to be in the first column of the parameter 2 search area.

Parameter 3: The returned serial number starts at 1

Parameter 4:0/false table exact match, 1/true table fuzzy match, default value is fuzzy match

2. Data calculation

(1) Simple calculation: +-*/

(2) Function calculation

A.sum and Average

B. Date-related


3. Data grouping

The threshold value is to find the nearest A2 in column D and not to A2, and the threshold is lower


4. Data Conversion

(1) Row and column conversions of data tables: Transpose

(2) Conversion between multiple selection and entry mode



HLOOKUP: Matching by row

Isnumber: Whether it is a numeric value

Search: Searching for functions

Iii. Sampling of data

Rand function: Returns the decimal number between [0,1]

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.