The difference between SAS proc SQL and ordinary SAS statements
1:the PROC SQL Step does not require a RUN statement. PROC SQL executes each query automatically
2:unlike Many other SAS procedures, PROC SQL continues to run after you submit a step. To end of the procedure, you must submit another PROC step, a DATA step, or a QUIT statement
PROC SQL; Variables in the *sql process are not case-sensitive, and between the table and the table, variables and variables must be separated by commas;
CREATE TABLE Table-name as
SELECT column-1<,... column-n>
from Table-1 | View-1<, ... table-n | View-n>
<WHERE expression>
<GROUP by column-1<, ... column-n>>
having experssion
<ORDER by column-1<, ... column-n>>;
Unlike other SAS procedures, the order of clauses within a SELECT statement in PROC SQL is important
Select: In the SELECT clause, you can both specify existing columns (columns that is already stored in a Table) and create new columns, and insert the variable you want after input,new columns exist only for the duration of the query, Unless a table or a view is created (the new column is only in the query).
where: In the WHERE clause, you can specify any column (s) from the underlying table (s). The columns specified in the WHERE clause does not has the to is specified in the SELECT clause (WHERE the variable can be any one of the variables in the table).
Orderby: The following can be followed by multiple variables, separated by commas, for ascending or descending of the specified keyword after the variable, desc ASC, you can also specify a variable that is not in the SELECT clause, for variables in the clause such as
Select A,b,c from table; The order by can be replaced by the name of the order by the A=order by 1, for the ordering of multiple variables, with the semicolon interval, the value and the letter can be mixed with order by 1, the age;
groupby: used together with summary functions to have a grouping function, otherwise SAS will automatically convert it to the order by function.
proc SQL; Select sum as totalmiles from sasuser.frequentflyers Group by MemberType; /* Thegroup BY is followed by a select in which all variables without the summary function are used, otherwise the grouping effect */quit can not be obtained ;
having: A has clause works with the GROUP by clause to restrict the groups that is displayed in the output, BA sed on one or more specified conditions. and group by, as a restrictive condition, control the output content
procSQL; SelectJobcode,avg(Salary) asAvgsalary format=Dollar11.2 fromSasuser.payrollmasterGroup byJobcode having avg(Salary)>(Select avg(Salary) fromsasuser.payrollmaster); quit;
/ * First execute the statement in the subquery, calculate the average of all the salary, and then, in the HAVING clause, compare the AVG salary of each group with the total AVG salary of the subquery, and filter out the desired results */
Single-table operation
procSQL; Create TableClass as Select /*the middle of select and from are all variables to be obtained, divided into two types, one of which is the dataset itself, and the other is the variable derived from it .*/name, Case whenSex eq'female' Then '1' whenSex eq'male' Then '2' Else '3' End asSex_tran Label="Gender Conversion",/*as a keyword, form a variable, label is a keyword in the SAS*/ sum(weight- -) asNew1 format=16.2, mean (height- -) asNEW2 format=16.2,/*generally, the sum mean will appear after group by because it is to be aggregated according to the group*/(Calculated New1-Calculated New2) asNew/*calculated is a keyword that indicates that new1 is a derived variable, and if it needs to be manipulated, add calculated*/ fromSashelp.class (where=(Heightbetween - and -))/*between and in SAS are closed intervals*/ Group byName,calculated Sex_tran/*except for the field behind group by, the other fields in select must be numeric, otherwise the group*/ /*the group system defaults to the following fields in ascending order*/ Order byCalculated Sex_trandesc/*additional order by for fields that you want to be descending*/; quit;
Having the usage, seldom applied in practice, can use data step to use data step
/*Having a statement is usually followed by a group by, and if it is used alone, its function is the same as where*/procSQL; Create TableHave1 as SelectID,Max(TXN_SEQ) asMax_txn,sum(amount) asSum_amount from Temp. Having1Group byID havingCalculated Sum_amount GE -; Quit;/*The goal is to remove the last observation of each group, but the use of having has in the case of grouping related variables are not insured, so you can use data step processing as far as possible with data step*/procSQL; Create TableHave2 as SelectID, txn_seq, amount from Temp. Having2Group byID havingTXN_SEQ EQMax(TXN_SEQ); Quit;
Creating and updating tables
SQL (1) Basic syntax in SAS