Aggregate functions, such as SUM, often need to be combined with a GROUP BY statement.
GROUP BY statement
The GROUP BY statement is used together with aggregate functions to group the result set by one or more columns.
SQL GROUP BY syntax
SELECT column_name, aggregate_function(column_name)
FROM table_name
WHERE column_name operator value
GROUP BY column_name
SQL GROUP BY example
We have the following "Orders" table:
| O_Id | OrderDate | OrderPrice | Customer |
| 1 | 2008/12/29 | 1000 | Bush |
| 2 | 2008/11/23 | 1600 | Carter |
| 3 | 2008/10/05 | 700 | Bush |
| 4 | 2008/09/28 | 300 | Bush |
| 5 | 2008/08/06 | 2000 | Adams |
| 6 | 2008/07/21 | 100 | Carter |
Now we want to find the total order amount for each customer.
To do that, we use the GROUP BY statement to group the rows by customer.
We use the following SQL statement:
SELECT Customer, SUM(OrderPrice) FROM Orders
GROUP BY Customer
The result set looks like this:
| Customer | SUM(OrderPrice) |
| Bush | 2000 |
| Carter | 1700 |
| Adams | 2000 |
It's great, isn't it?
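The grouped totals can be reproduced with a small script. This is a minimal sketch, assuming SQLite through Python's built-in sqlite3 module; the helper name make_orders_db, the column types and the added ORDER BY (used only to make the row order predictable) are illustrative choices, not part of the original example.

```python
import sqlite3

# Sample rows copied from the "Orders" table in this tutorial.
ORDERS = [
    (1, "2008/12/29", 1000, "Bush"),
    (2, "2008/11/23", 1600, "Carter"),
    (3, "2008/10/05", 700, "Bush"),
    (4, "2008/09/28", 300, "Bush"),
    (5, "2008/08/06", 2000, "Adams"),
    (6, "2008/07/21", 100, "Carter"),
]


def make_orders_db():
    """Build an in-memory SQLite database holding the Orders table (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Orders ("
        "O_Id INTEGER PRIMARY KEY, OrderDate TEXT, OrderPrice INTEGER, Customer TEXT)"
    )
    conn.executemany("INSERT INTO Orders VALUES (?, ?, ?, ?)", ORDERS)
    return conn


conn = make_orders_db()

# One output row per distinct Customer; SUM runs over the rows of each group.
totals = conn.execute(
    "SELECT Customer, SUM(OrderPrice) FROM Orders "
    "GROUP BY Customer ORDER BY Customer"
).fetchall()

for customer, total in totals:
    print(customer, total)
```

Each customer appears once, paired with the sum of that customer's own orders.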
Let's take a look at what happens if you omit GROUP BY:
SELECT Customer, SUM(OrderPrice) FROM Orders
Conceptually, this statement asks for a result set like this:
| Customer | SUM(OrderPrice) |
| Bush | 5700 |
| Carter | 5700 |
| Bush | 5700 |
| Bush | 5700 |
| Adams | 5700 |
| Carter | 5700 |
The result set above is not what we need.
Why does this SELECT statement not work? It specifies two columns (Customer and SUM(OrderPrice)). SUM(OrderPrice) returns a single value (the total of the "OrderPrice" column), while "Customer" returns 6 values (one for each row in the "Orders" table). The two cannot be paired in a meaningful way, so we do not get the right result. In practice, many database systems (PostgreSQL and SQL Server, for example) reject this statement with an error, and some others (SQLite, for example) return a single row with an arbitrarily chosen customer. As you have seen, the GROUP BY statement solves this problem.
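How the statement without GROUP BY behaves depends on the database system, so it is worth running it on the engine you actually use. The following is a minimal sketch, assuming SQLite through Python's sqlite3 module; other engines may behave differently, and many refuse the query. The table setup is repeated so the snippet runs on its own.

```python
import sqlite3

# Sample rows copied from the "Orders" table in this tutorial.
ORDERS = [
    (1, "2008/12/29", 1000, "Bush"),
    (2, "2008/11/23", 1600, "Carter"),
    (3, "2008/10/05", 700, "Bush"),
    (4, "2008/09/28", 300, "Bush"),
    (5, "2008/08/06", 2000, "Adams"),
    (6, "2008/07/21", 100, "Carter"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders ("
    "O_Id INTEGER PRIMARY KEY, OrderDate TEXT, OrderPrice INTEGER, Customer TEXT)"
)
conn.executemany("INSERT INTO Orders VALUES (?, ?, ?, ?)", ORDERS)

# No GROUP BY: SQLite treats the whole table as a single group, so SUM is
# computed once and Customer is taken from an arbitrarily chosen row.
rows = conn.execute("SELECT Customer, SUM(OrderPrice) FROM Orders").fetchall()

print(len(rows))   # number of rows returned
print(rows[0][1])  # grand total of OrderPrice
```

In SQLite the query collapses to one row carrying the grand total, which confirms that the per-customer breakdown is lost without GROUP BY.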
GROUP BY more than one column
We can also apply the GROUP BY statement to more than one column, like this:
SELECT Customer, OrderDate, SUM(OrderPrice) FROM Orders
GROUP BY Customer, OrderDate
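Grouping by two columns creates one group per distinct (Customer, OrderDate) pair. A minimal sketch of this, again assuming SQLite through Python's sqlite3 module, with an ORDER BY added only to keep the output order stable:

```python
import sqlite3

# Sample rows copied from the "Orders" table in this tutorial.
ORDERS = [
    (1, "2008/12/29", 1000, "Bush"),
    (2, "2008/11/23", 1600, "Carter"),
    (3, "2008/10/05", 700, "Bush"),
    (4, "2008/09/28", 300, "Bush"),
    (5, "2008/08/06", 2000, "Adams"),
    (6, "2008/07/21", 100, "Carter"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders ("
    "O_Id INTEGER PRIMARY KEY, OrderDate TEXT, OrderPrice INTEGER, Customer TEXT)"
)
conn.executemany("INSERT INTO Orders VALUES (?, ?, ?, ?)", ORDERS)

# Rows fall into the same group only if BOTH Customer and OrderDate match.
groups = conn.execute(
    "SELECT Customer, OrderDate, SUM(OrderPrice) FROM Orders "
    "GROUP BY Customer, OrderDate ORDER BY Customer, OrderDate"
).fetchall()

for customer, order_date, total in groups:
    print(customer, order_date, total)
```

With the sample data every customer and date pair happens to be unique, so the query returns one group per order. Totals would only combine if a customer placed several orders on the same date.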