Stata commands for batch renaming, duplicate checking, long/wide reshaping, type conversion, substring extraction and variable generation, dataset merging, and more

Source: Internet
Author: User

I. Batch renaming variables:

For example, to change the suffix _2 of the variables a_2 b_2 c_2 d_2 e_2 to W (giving aW bW cW dW eW):

ren (*_2) (*W)
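
The rename rule above can be tried on a throwaway dataset. The sketch below assumes Stata 12 or later (where rename accepts wildcards); the one-row dataset is made up purely for illustration.

```stata
* Hypothetical toy dataset holding the five variables from the example
clear
set obs 1
foreach v in a b c d e {
    gen `v'_2 = 1
}

rename (*_2) (*W)      // a_2 -> aW, b_2 -> bW, and so on
describe, simple       // show the resulting variable names
```

To keep the underscore (a_W rather than aW), the new pattern would be *_W instead of *W.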

II. Commonly used commands for checking duplicate data:

duplicates report x   // report whether variable x has duplicates

duplicates list x   // list the duplicate records

bys x: gen cn = _n

browse if cn > 1   // browse the duplicated values before deciding how to handle them

drop cn

duplicates drop x, force   // delete duplicates, keeping the first record of each value (force is required when a variable list is given)
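
As a rough illustration of the workflow above, the sketch below runs it on a small made-up dataset. The variable names x, val and dup are assumptions for this example only; duplicates tag is used here as an alternative to the manual bys/gen counter.

```stata
* Hypothetical data: x = 2 appears twice, x = 3 appears three times
clear
input x str5 val
1 "a"
2 "b"
2 "c"
3 "d"
3 "e"
3 "f"
end

duplicates report x                 // summary of how often each x occurs
duplicates tag x, generate(dup)     // dup = number of OTHER records sharing the same x
list if dup > 0                     // inspect the duplicated records
duplicates drop x, force            // keep the first record for each x
count                               // number of records left
```

Note that duplicates drop with force discards the other variables' values in the dropped rows (val here), so inspecting them first is usually worthwhile.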

III. Converting data between long and wide format:
       long
    +------------+                  wide
    | i  j  stub |                 +----------------+
    |------------|                 | i  stub1 stub2 |
    | 1  1   4.1 |     reshape     |----------------|
    | 1  2   4.5 |   <--------->   | 1    4.1   4.5 |
    | 2  1   3.3 |                 | 2    3.3   3.0 |
    | 2  2   3.0 |                 +----------------+
    +------------+

Before reshaping, check for duplicate records: each combination of i and j must be unique, otherwise reshape wide cannot be carried out.

Long to wide: if j holds Chinese text, first recode it into a new variable (rt below) made of English letters or numbers, because its values are appended to the variable names:

gen rt = "BP" if j == "blood pressure"

replace rt = "height" if j == "height"

reshape wide varlist, i() j() string   // varlist: every variable whose value differs within the same i; add the string option if j is a string variable

Wide to long: the wide variables must be named stub1, stub2, and so on; reshape then generates a new j variable.

reshape long stub, i() j(new variable name)
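
The round trip shown in the diagram can be sketched as follows. The variable names i, j and stub are taken from the diagram; isid is added here as one possible way to confirm that i and j identify each record uniquely before reshaping.

```stata
* Long-format data from the diagram
clear
input i j stub
1 1 4.1
1 2 4.5
2 1 3.3
2 2 3.0
end

isid i j                        // stops with an error if (i, j) is not unique
reshape wide stub, i(i) j(j)    // long -> wide: creates stub1 and stub2
list

reshape long stub, i(i) j(j)    // wide -> long: restores j and stub
list
```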

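IV. Data type conversions:

destring varlist, replace force   // string to numeric; force sets non-numeric values to missing

tostring varlist, replace force   // numeric to string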
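
A small sketch of both conversions on made-up data follows. The variable names age_s, age and age_txt are hypothetical, and generate() is used instead of replace so the original column stays available for comparison.

```stata
* Hypothetical string variable with one non-numeric entry
clear
input str5 age_s
"34"
"51"
"n/a"
end

destring age_s, generate(age) force   // "n/a" cannot be converted and becomes missing
tostring age, generate(age_txt)       // numeric back to string
list
```

Because force silently turns unconvertible values into missing, it may be worth listing those records before relying on the converted variable.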

V. Extracting substrings to generate new variables: for example, splitting a blood pressure value (bp) of 130/85 into systolic (sbp) and diastolic (dbp) pressure:

gen sbp = real(substr(bp, 1, 3))   // first three characters

gen dbp = real(substr(bp, -2, 2))   // last two characters
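
Fixed character positions only work when the systolic value has three digits and the diastolic value has two. As a hedged alternative, the sketch below locates the "/" with strpos and splits around it; the sample readings and the helper variable slash are made up for illustration.

```stata
* Hypothetical readings with varying numbers of digits
clear
input str7 bp
"130/85"
"95/60"
"140/100"
end

gen slash = strpos(bp, "/")                 // position of the separator
gen sbp = real(substr(bp, 1, slash - 1))    // everything before "/"
gen dbp = real(substr(bp, slash + 1, .))    // everything after "/"
drop slash
list
```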

VI. Calculating a new variable: for example, computing BMI from height and weight data:

gen bmi = weight*10000/(height^2)   // weight (kg) multiplied by 10,000, divided by the square of height (cm)
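
A quick check of the calculation on two made-up records, assuming weight is in kilograms and height in centimetres (which is what the factor of 10,000 implies):

```stata
* Hypothetical records: weight in kg, height in cm
clear
input weight height
65 170
80 180
end

gen bmi = weight*10000/(height^2)   // weight times 10,000, divided by height squared
list
```

For the first record this is 650000/28900, roughly 22.5.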

VII. Combining datasets:

merge: adds new variables. append: keeps the variables unchanged and adds records.

merge 1:1 barcode using "FILE.DTA"   // or 1:m / m:1; the key variable (barcode) must not contain duplicates on a "1" side, so deal with duplicates first

append using filename
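
The difference between the two commands can be sketched with two small made-up datasets. The variables sbp and height and the temporary file bpdata are assumptions for this example; barcode is the key variable named in the text.

```stata
* Hypothetical "using" dataset, saved to a temporary file
clear
input barcode sbp
1 120
2 135
3 110
end
tempfile bpdata
save `bpdata'

* Hypothetical master dataset
clear
input barcode height
1 170
2 165
4 180
end

isid barcode                       // confirm the key is unique before a 1:1 merge
merge 1:1 barcode using `bpdata'   // adds sbp as a new variable
tab _merge                         // shows matched and unmatched records
drop _merge

append using `bpdata'              // adds the three records again as new rows
count
```

After the merge there is one row per barcode found in either dataset; the append then stacks the saved records underneath without matching on barcode.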

VIII. Counting the records that share the same value of a variable x:

For example, the variable x (tjid in the code below) is a medical record number; several identical numbers mean the person was seen several times. To select the people seen more than three times:

gen n = _n   // overall record number

by tjid, sort: gen n1 = _N   // n1 is the number of observations for each tjid

drop if n1 <= 3   // equivalently: keep if n1 > 3
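
A minimal sketch of this selection on made-up data follows. It assumes the identifier is called tjid and uses _N (the group size) rather than _n (the position within the group); the final step, which reduces the result to one row per person, is an addition for illustration.

```stata
* Hypothetical visits: 101 has four records, 102 has two, 103 has three
clear
input tjid
101
101
101
101
102
102
103
103
103
end

bysort tjid: gen n1 = _N          // n1 = number of records for each tjid
keep if n1 > 3                    // people with more than three records
bysort tjid: keep if _n == 1      // one row per remaining person
list
```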

