Wednesday, June 5, 2013

OLAP Functionality

Three OLAP functions are now available in DB2: RANK, DENSE_RANK and ROW_NUMBER.

RANK and DENSE_RANK are very similar in that they will assign a ranking value to a row with respect to how it fares in comparison to other rows in the result set, based on a given criteria. RANK will assign a value that is relative to the top of the list , where DENSE_RANK will eliminate gaps in the sequencing of the assigned ranking.

ROW_NUMBER will simply assign the relative position of the row in the result set, starting at one.

These three functions can best be understood through a simple example that uses all three functions.

Given the following table:

CREATE TABLE SALARY_TABLE
    (NAME VARCHAR(20)
    ,SALARY DECIMAL(8,2)
    );

. . . that is populated with 20 rows of data, the following query is executed:

SELECT NAME, SALARY,
       RANK () OVER (ORDER BY SALARY DESC) AS RANK,
       DENSE_RANK () OVER (ORDER BY SALARY DESC) AS DENSE_RANK,
       ROW_NUMBER () OVER (ORDER BY SALARY DESC) AS ROW_NUMBER
    FROM SALARY_TABLE
    ORDER BY SALARY DESC;

The result of the query is shown below:

NAME        SALARY    RANK    DENSE_RANK    ROW_NUMBER

ELLEN    200000.00     1       1             1
PETE     175000.00     2       2             2
FRANK    150000.00     3       3             3
CATHY    125000.00     4       4             4
TOM      120000.00     5       5             5
JOE      100000.00     6       6             6
SUE      100000.00     6       6             7
DEB       90000.00     8       7             8
JOAN      85000.00     9       8             9
DAVE      80000.00    10       9            10
SALLY     80000.00    10       9            11
BILL      75000.00    12      10            12
BOB       70000.00    13      11            13
ANN       60000.00    14      12            14
PAUL      50000.00    15      13            15
KAREN     50000.00    15      13            16
MARY      50000.00    15      13            17
RAY       45000.00    18      14            18
TRACY     40000.00    19      15            19
JIM       25000.00    20      16            20

Ellen has the highest salary, and is therefore assigned a RANK value of 1. She is followed by Pete, Frank, Cathy and Tom who are ranked 2, 3, 4 and 5 respectively. Joe and Sue both have identical salaries and are therefore both assigned a RANK value of 6. They are followed by Deb, who has a RANK value of 8. Deb’s RANK value of 8 has been assigned because there are seven other people who have a salary greater than hers. Other “ties” in the ranking can be seen at numbers 10 and 15.

DENSE_RANK follows a similar pattern, except that in the case of ties in the ranking criteria, the next sequential value is assigned to avoid skips in the DENSE_RANK values assigned. In this example, following a tie between Joe and Sue at number 6, Deb is assigned a DENSE_RANK value of 7 as opposed to the assigned RANK value of 8.