Today, a co-worker called me and asked me about this problem that was listed in my daily development, and it's true that both of them are related to grouping, but note that one is a grouping function and the other is an analytic function, The table used to explain the example of Scott's EMP and dept tables in Oracle's schema:
Mister Orders said: Need to count the number of each department in the company and each department of wages issued by the sum; the first analysis of the number of each department and the sum of wages, refers to the different departments of the personnel are placed in different places, and then the number of different departments to come out, as well as a salary added, That is to say, the data in the table is scattered, by grouping functions to display a group of several pieces of data, it is necessary to note that after the completion of the packet block is a whole unit, so for this whole is not allowed to use such as for a single object in the block to do processing
, formal because of this, many beginners will make this type of error, OK, on the SQL;
SELECT DEPTNO Department Number, COUNT (*) as department number, sum (SAL) departmental payroll sum from EMP GROUP by DEPTNO ORDER by SUM (SAL) DESC;
The results are not posted, and you can see that the entire EMP table is divided into three pieces of data, each piece of data for a department, for the group function, it does not care about each group in the end is composed of who, or relative to the individual, the group function is to stand on a macro point of view of the data;
OK, then another demand is coming, if you need to show the name of each department, salary, the employee in the Department of the payroll, the Department of the sum of wages. So this time using partition by will be more convenient to see the code:
SELECT DEPTNO department Name, ENAME employee name, Sal employee's salary, row_number () over (PARTITION by deptno orders by Sal) in the Department of payroll from EMP;
When it comes to this, we're going to start figuring out what the difference is, and again, the sum of departmental wages:
--Use the group by method
SELECT deptno department number, sum (SAL) departmental payroll sum from the EMP group by DEPTNO;
The results of the execution are:
Here's another look:
--Use the parse function PARTITION
by method SELECT Deptno department number, sum (SAL) over (PARTITION by Sal) departmental payroll sum from EMP;
The output results are:
It may be strange for this result, but this is the difference between the two functions, if it is not clear, look at the following more specific example:
SELECT DEPTNO department Name, ENAME employee name, Sal employee's salary, sum (SAL) over (PARTITION by deptno Order by Sal) The total payroll from EMP;
Output results:
Through the above three examples, we can draw the following conclusions:
1, the literal meaning, the grouping function is for the whole, the analytic function is used to analyze the individual, the result of presenting also verifies this point;
2, for group by, is from the overall macro to grasp the packet after the data module, for partition by, although it is also talking about data grouping, but it is from the individual data module, the corresponding operation, such as the sum of the above wages, such as each department has 100 employees, After using it, there will be 100 summation data displayed;
In short, group by groups the blocks as a whole, ignoring the individual; PARTITION by is also regarded as the whole, but the individual is valued.