I. Batch renaming variables:
For example, to change the suffix _2 of the variables a_2 b_2 c_2 d_2 e_2 to W:
ren (*_2) (*W)
II. Commonly used commands for checking duplicate data:
duplicates report x // reports whether variable x has duplicates
duplicates list x // lists the duplicate records
bys x: gen cn = _n
browse if cn > 1
drop cn // the three lines above browse the specific duplicate values for further analysis and processing
duplicates drop x, force // deletes duplicates, keeping the first record of each duplicated value; force is required when a variable list is given
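The duplicate checks above can be tried on a toy dataset. This is only a sketch: the values are made up, and the helper variable dup is a hypothetical name. It uses duplicates tag as an alternative to the bys/_n approach for marking duplicated records.

```stata
clear
input x
1
2
2
3
3
3
end
duplicates report x               // tabulates how many copies each value of x has
duplicates tag x, generate(dup)   // dup = number of extra copies of each record
list if dup > 0                   // inspect only the duplicated records
duplicates drop x, force          // keep the first record of each value of x
count                             // 3 observations remain (x = 1, 2, 3)
```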
III. Converting data between long and wide format:
        long
    +------------+                  wide
    | i  j  stub |                 +----------------+
    |------------|                 | i  stub1 stub2 |
    | 1  1   4.1 |     reshape     |----------------|
    | 1  2   4.5 |   <--------->   | 1    4.1   4.5 |
    | 2  1   3.3 |                 | 2    3.3   3.0 |
    | 2  2   3.0 |                 +----------------+
    +------------+
Before reshaping, check whether i and j have duplicate records; reshape fails if the same combination of i and j appears more than once.
Long to wide: if the values of j are Chinese characters, first recode them into English letters (here, a new variable rt) or numbers, so that they can be appended to the variable name:
gen rt = "BP" if j == "blood pressure"
replace rt = "height" if j == "height"
reshape wide (all variables whose values differ within the same i), i() j() string // if j is a string variable, add the string option
Wide to long: the wide-format variables must be named stub1, stub2, and so on; reshape then generates a new j variable.
reshape long stub, i() j(new variable name)
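A minimal round trip with the numbers from the diagram above may make the syntax concrete. The variable names i, j and stub are simply those of the diagram. isid is used here as one way to confirm that i and j uniquely identify the records before reshaping.

```stata
clear
input i j stub
1 1 4.1
1 2 4.5
2 1 3.3
2 2 3.0
end
isid i j                       // stops with an error if i and j are not unique together
reshape wide stub, i(i) j(j)   // one row per i, with variables stub1 and stub2
list
reshape long stub, i(i) j(j)   // back to one row per i-j pair
list
```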
IV. Data type conversion:
destring, replace force
tostring, replace force
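As a small illustration of the two commands, the sketch below converts a made-up string variable to numeric and back. The names age_str, age and age_txt are hypothetical. It uses generate() instead of replace so that the original variable is kept. Note that force turns nonnumeric values into missing without stopping.

```stata
clear
input str5 age_str
"25"
"31"
"n/a"
end
destring age_str, generate(age) force   // "n/a" cannot be converted and becomes missing
tostring age, generate(age_txt)         // numeric back to string
list
```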
V. Extracting substrings to generate new variables: for example, splitting blood pressure (bp) 130/85 into systolic pressure (sbp) and diastolic pressure (dbp)
gen sbp = real(substr(bp, 1, 3))
gen dbp = real(substr(bp, -2, 2))
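The fixed positions above assume a three-digit systolic and a two-digit diastolic value. When that may not hold, one possible alternative is to split on the slash instead. The sketch below uses made-up readings, and bp_part is a hypothetical stub name.

```stata
clear
input str7 bp
"130/85"
"98/60"
"110/70"
end
split bp, parse("/") generate(bp_part) destring   // creates numeric bp_part1 and bp_part2
rename (bp_part1 bp_part2) (sbp dbp)
list
```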
VI. Calculating a new variable: for example, calculating BMI from height and weight data
gen bmi = weight*10000/(height^2) // weight multiplied by 10,000, divided by the square of height
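A quick check of the formula on made-up values. The factor of 10,000 suggests weight in kilograms and height in centimetres, which is an assumption here and not stated in the original.

```stata
clear
input weight height
60 165
80 180
end
gen bmi = weight*10000/(height^2)   // assumes weight in kg and height in cm
list
```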
VII. Combining datasets:
merge: adds new variables; append: variables unchanged, adds records
merge 1:1/1:m/m:1 barcode using "FILE.DTA" // the variable used for merging must not contain duplicates; deal with them first
append using filename
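A sketch of a 1:1 merge on two made-up datasets that share the key barcode. The temporary file extra and all the values are hypothetical. isid is used to confirm that the key has no duplicates before merging.

```stata
clear
input barcode weight
1 60
2 72
3 55
end
tempfile extra
save `extra'

clear
input barcode height
1 165
2 180
4 170
end
isid barcode                      // confirm the key is unique before a 1:1 merge
merge 1:1 barcode using `extra'   // adds weight; _merge records the match status
tabulate _merge
```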
VIII. Counting the number of records that share the same value of variable x:
For example, suppose the variable x is the medical examination number; several identical numbers mean that the person was examined several times. To keep only the people examined more than three times:
gen n = _n
by tjid, sort: gen n1 = _N // n1 is the number of observations for each tjid
drop if n1 <= 3 (or, equivalently, keep if n1 > 3)
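The counting step can be checked on a toy example. The tjid values are made up. Within a by group, _N is the group size, whereas _n is the running index of the observation inside the group.

```stata
clear
input tjid
101
101
101
101
102
102
103
end
bysort tjid: gen n1 = _N   // _N = number of records for each tjid
keep if n1 > 3             // only tjid 101 (4 records) remains
list
```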