GROUP BY clause specifies a result table that consists of a grouping of the rows of intermediate result table that is the result of the previous clause. The GROUP BY clause allows you to find the characteristics of groups of rows rather than individual rows.
ORDER BY clause specifies the particular order in which you want selected rows returned. This is sorted by ascending or descending collating sequence of a column’s or an expression’s value.
The ORDER BY statement is used to sequence the rows in the final output. The GROUP BY statement is used to combine multiple rows into one. The ORDER BY statement can only be applied to the result set of the SQL statement. Unlike the GROUP BY, it can not be used on any intermediate result set (e.g. a subquery or a nested-table expression). Nor can it be used in a view definition.
The GROUP BY statement is used to group individual rows into combined sets based on the value in one, or more, columns. The GROUPING SETS clause is used to define multiple independent GROUP BY clauses in one query. The ROLLUP and CUBE clauses are shorthand forms of the GROUPING SETS statement.
Notes:
- Specifying the same field multiple times in an ORDER BY list is allowed, but silly. Only the first specification of the field will have any impact on the data output order.
- If the ORDER BY column list does not uniquely identify each row, those rows with duplicate values will come out in random order. This is almost always the wrong thing to do when the data is being displayed to an end-user.
- Use the TRANSLATE function to order data regardless of the case. Note that this trick may not work consistently with some European character sets.
- NULL values always sort high.
GROUP BY & ORDER BY Rules and Restrictions:
- There can only be one GROUP BY per SELECT. Multiple select statements in the same query can each have their own GROUP BY.
- Every field in the SELECT list must either be specified in the GROUP BY or must have a column function applied against it.
- The result of a simple GROUP BY (i.e. with no GROUPING SETS, ROLLUP, or CUBE clause) is always a distinct set of rows, where the unique identifier is whatever fields were grouped on.
- There is no guarantee that the rows resulting from a GROUP BY will come back in any particular order unless an ORDER BY is also specified.
- Variable-length character fields with differing numbers on trailing blanks are treated as equal in the GROUP. The number of trailing blanks, if any, in the result is unpredictable.
- When grouping, all null values in the GROUP BY fields are considered equal.
GROUPING SETS Statement
The GROUPING SETS statement enables one to get multiple GROUP BY result sets from a single statement.
ROLLUP Statement
A ROLLUP expression displays sub-totals for the specified fields. This is equivalent to doing the original GROUP BY, and also doing more groupings on sets of the left-most columns.
CUBE Statement
A CUBE expression displays a cross-tabulation of the sub-totals for any specified fields. As such, it generates many more totals than the similar ROLLUP.
Input WEEK DAY_WEEK SALES_PERSON UNITS_SOLD 13 6 LUCCHESSI 3 13 6 LUCCHESSI 1 13 6 LEE 2 13 6 LEE 2 13 6 LEE 3 13 6 LEE 5 13 6 GOUNOT 3 13 6 GOUNOT 1 13 6 GOUNOT 7 13 7 LUCCHESSI 1 13 7 LUCCHESSI 2 13 7 LUCCHESSI 1 13 7 LEE 7 13 7 LEE 3 13 7 LEE 7 13 7 LEE 4 13 7 GOUNOT 2 13 7 GOUNOT 18 13 7 GOUNOT 1
Example 1: A query with a basic GROUP BY clause over 3 columns:
SELECT WEEK(SALES_DATE) AS WEEK, DAYOFWEEK(SALES_DATE) AS DAY_WEEK, SALES_PERSON, SUM(SALES) AS UNITS_SOLD FROM SALES WHERE WEEK(SALES_DATE) = 13 GROUP BY WEEK(SALES_DATE), DAYOFWEEK(SALES_DATE), SALES_PERSON ORDER BY WEEK, DAY_WEEK, SALES_PERSON; WEEK DAY_WEEK SALES_PERSON UNITS_SOLD 13 6 GOUNOT 11 13 6 LEE 12 13 6 LUCCHESSI 4 13 7 GOUNOT 21 13 7 LEE 21 13 7 LUCCHESSI 4
Example 2: The following query produces the result based on two different grouping sets of rows from the SALES table.
SELECT WEEK(SALES_DATE) AS WEEK, DAYOFWEEK(SALES_DATE) AS DAY_WEEK, SALES_PERSON, SUM(SALES) AS UNITS_SOLD FROM SALES WHERE WEEK(SALES_DATE) = 13 GROUP BY GROUPING SETS ((WEEK(SALES_DATE), SALES_PERSON), (DAYOFWEEK(SALES_DATE), SALES_PERSON)) ORDER BY WEEK, DAY_WEEK, SALES_PERSON; WEEK DAY_WEEK SALES_PERSON UNITS_SOLD 13 - GOUNOT 32 13 - LEE 33 13 - LUCCHESSI 8 - 6 GOUNOT 11 - 6 LEE 12 - 6 LUCCHESSI 4 - 7 GOUNOT 21 - 7 LEE 21 - 7 LUCCHESSI 4
The rows with WEEK 13 are from the first grouping set while the other rows are from the second grouping set.
Example 3: If you use the three distinct columns involved in the grouping sets in Example 2 in a ROLLUP clause, you can see grouping sets for (WEEK, DAY_WEEK, SALES_PERSON), (WEEK, DAY_WEEK), (WEEK), and grand-total.
SELECT WEEK(SALES_DATE) AS WEEK, DAYOFWEEK(SALES_DATE) AS DAY_WEEK, SALES_PERSON, SUM(SALES) AS UNITS_SOLD FROM SALES WHERE WEEK(SALES_DATE) = 13 GROUP BY ROLLUP (WEEK(SALES_DATE), DAYOFWEEK(SALES_DATE), SALES_PERSON) ORDER BY WEEK, DAY_WEEK, SALES_PERSON; WEEK DAY_WEEK SALES_PERSON UNITS_SOLD 13 6 GOUNOT 11 13 6 LEE 12 13 6 LUCCHESSI 4 13 6 - 27 13 7 GOUNOT 21 13 7 LEE 21 13 7 LUCCHESSI 4 13 7 - 46 13 - - 73 - - - 73
Example 4: The same query as Example 3, using CUBE instead of ROLLUP results in additional grouping sets for (WEEK, SALES_PERSON), (DAY_WEEK, SALES_PERSON), (DAY_WEEK), and (SALES_PERSON) in the result.
SELECT WEEK(SALES_DATE) AS WEEK, DAYOFWEEK(SALES_DATE) AS DAY_WEEK, SALES_PERSON, SUM(SALES) AS UNITS_SOLD FROM SALES WHERE WEEK(SALES_DATE) = 13 GROUP BY CUBE ( WEEK(SALES_DATE), DAYOFWEEK(SALES_DATE), SALES_PERSON ) ORDER BY WEEK, DAY_WEEK, SALES_PERSON; WEEK DAY_WEEK SALES_PERSON UNITS_SOLD 13 6 GOUNOT 11 13 6 LEE 12 13 6 LUCCHESSI 4 13 6 - 27 13 7 GOUNOT 21 13 7 LEE 21 13 7 LUCCHESSI 4 13 7 - 46 13 - GOUNOT 32 13 - LEE 33 13 - LUCCHESSI 8 13 - - 73 - 6 GOUNOT 11 - 6 LEE 12 - 6 LUCCHESSI 4 - 6 - 27 - 7 GOUNOT 21 - 7 LEE 21 - 7 LUCCHESSI 4 - 7 - 46 - - GOUNOT 32 - - LEE 33 - - LUCCHESSI 8 - - - 73
Example 5: The following query returns a result set that includes a grand total of the selected rows from the SALES table together with a group of rows aggregated by SALES_PERSON and MONTH.
SELECT SALES_PERSON, MONTH(SALES_DATE) AS MONTH, SUM(SALES) AS UNITS_SOLD FROM SALES GROUP BY GROUPING SETS((SALES_PERSON, MONTH(SALES_DATE)),()) ORDER BY SALES_PERSON, MONTH; SALES_PERSON MONTH UNITS_SOLD GOUNOT 3 35 GOUNOT 4 14 GOUNOT 12 1 LEE 3 60 LEE 4 25 LEE 12 6 LUCCHESSI 3 9 LUCCHESSI 4 4 LUCCHESSI 12 1 - - 155
Example 6: This example shows two simple ROLLUP queries followed by a query that treats the two ROLLUP clauses as grouping sets in a single result set and specifies row ordering for each column involved in the grouping sets.
SELECT WEEK(SALES_DATE) AS WEEK, DAYOFWEEK(SALES_DATE) AS DAY_WEEK, SUM(SALES) AS UNITS_SOLD FROM SALES GROUP BY ROLLUP (WEEK(SALES_DATE), DAYOFWEEK(SALES_DATE)) ORDER BY WEEK, DAY_WEEK; WEEK DAY_WEEK UNITS_SOLD13 6 27
13 7 46
13 - 73
14 1 31
14 2 43
14 - 74
53 1 8
53 - 8
- - 155
SELECT MONTH(SALES_DATE) AS MONTH, REGION, SUM(SALES) AS UNITS_SOLD FROM SALES GROUP BY ROLLUP (MONTH(SALES_DATE), REGION) ORDER BY MONTH, REGION; MONTH REGION UNITS_SOLD3 Manitoba 22
3 Ontario-North 8
3 Ontario-South 34
3 Quebec 40
3 - 104
4 Manitoba 17
4 Ontario-North 1
4 Ontario-South 14
4 Quebec 11
4 - 43
12 Manitoba 2
12 Ontario-South 4
12 Quebec 2
12 - 8
- - 155
SELECT WEEK(SALES_DATE) AS WEEK, DAYOFWEEK(SALES_DATE) AS DAY_WEEK, MONTH(SALES_DATE) AS MONTH, REGION, SUM(SALES) AS UNITS_SOLD FROM SALES GROUP BY GROUPING SETS (ROLLUP(WEEK(SALES_DATE), DAYOFWEEK(SALES_DATE)), ROLLUP(MONTH(SALES_DATE), REGION)) ORDER BY WEEK, DAY_WEEK, MONTH, REGION; WEEK DAY_WEEK MONTH REGION UNITS_SOLD13 6 - - 27
13 7 - - 46
13 - - - 73
14 1 - - 31
14 2 - - 43
14 - - - 74
53 1 - - 8
53 - - - 8
- - 3 Manitoba 22
- - 3 Ontario-North 8
- - 3 Ontario-South 34
- - 3 Quebec 40
- - 3 - 104
- - 4 Manitoba 17
- - 4 Ontario-North 1
- - 4 Ontario-South 14
- - 4 Quebec 11
- - 4 - 43
- - 12 Manitoba 2
- - 12 Ontario-South 4
- - 12 Quebec 2
- - 12 - 8
- - - - 155
- - - - 155
Using the two ROLLUP clauses as grouping sets causes the result to include duplicate rows. There are even two grand total rows.
Observe how the use of ORDER BY has affected the results:
- In the first grouped set, week 53 has been repositioned to the end.
- In the second grouped set, month 12 has now been positioned to the end and the regions are now displayed in alphabetic order.
- Null values are sorted high.
Group By in Join
ID NAME DEPT JOB YEARS SALARY COMM
-- ---- ---- ----- ------ ----- ------
10 Sanders 20 Mgr 7 18357.50 -
20 Pernal 20 Sales 8 18171.25 612.45
80 James 20 Clerk - 13504.60 128.20
90 Koonitz 42 Sales 6 18001.75 1386.70
100 Plotz 42 Mgr 7 18352.80 -
130 Yamaguchi 42 Clerk 6 10505.90 75.60
140 Fraye 51 Mgr 6 21150.00 -
150 Williams 51 Sales 6 19456.50 637.65
160 Molinare 10 Mgr 7 22959.20 -
190 Sneider 20 Clerk 8 14252.75 126.50
200 Scoutten 42 Clerk - 11508.60 84.20
210 Lu 10 Mgr 10 20010.00 -
220 Smith 51 Sales 7 17654.50 992.80
230 Lundquist 51 Clerk 3 13369.80 189.65
240 Daniels 10 Mgr 5 19260.25 -
250 Wheeler 51 Clerk 6 14460.00 513.30
260 Jones 10 Mgr 12 21234.00 -
One side of join – using common table expression
WITH STAFF2(DEPT, AVGSAL) AS (SELECT DEPT ,AVG(SALARY) FROM STAFF GROUP BY DEPT HAVING AVG(SALARY) > 18000 ) SELECT A.ID ,A.NAME ,A.DEPT FROM STAFF A ,STAFF2 B WHERE A.DEPT = B.DEPT ORDER BY A.ID; ANSWER ID NAME DEPT 160 Molinare 10 210 Lu 10 240 Daniels 10 260 Jones 10
One side of join – using full-select
SELECT A.ID ,A.NAME ,A.DEPT FROM STAFF A ,(SELECT DEPT AS DEPT ,AVG(SALARY) AS AVGSAL FROM STAFF GROUP BY DEPT HAVING AVG(SALARY) > 18000)AS B WHERE A.DEPT = B.DEPT ORDER BY A.ID; ANSWER ID NAME DEPT 160 Molinare 10 210 Lu 10 240 Daniels 10 260 Jones 10