In addition to OLAP functions such as RANK, DENSE_RANK and ROW_NUMBER, which were already supported in previous releases of DB2 LUW, v9.5 introduces five additional OLAP functions: LAG, LEAD, FIRST_VALUE, LAST_VALUE and RATIO_TO_REPORT.
The LAG function returns the expression value for the row that is “x” rows before the current row in an OLAP window of data.
The LEAD function returns the expression value for the row that is “x” rows after the current row in an OLAP window of data.
The FIRST_VALUE function returns the expression value for the first row in an OLAP window of data.
The LAST_VALUE function returns the expression value for the last row in an OLAP window of data.
The RATIO_TO_REPORT function returns the ratio of an argument to the sum of the arguments in an OLAP window of data.
These new functions can best be understood through a simple example that uses three of these new functions.
Given the following table:
CREATE TABLE SALARY_TABLE
(NAME VARCHAR(20)
,SALARY DECIMAL(8,2)
);
. . . that is populated with 20 rows of data, the following query is executed:
SELECT NAME, SALARY,
(LEAD (SALARY, 1) OVER (ORDER BY SALARY)) – SALARY AS LEAD,
FIRST_VALUE (SALARY) OVER (ORDER BY SALARY DESC) – SALARY AS FRST_VAL,
DECIMAL(RATIO_TO_REPORT (SALARY) OVER (), 5, 4) AS RATIO_TO_REPORT
FROM SALARY_TABLE
ORDER BY SALARY DESC;
The result of the query is shown below:
NAME SALARY LEAD FRST_VAL RATIO_TO_REPORT
ELLEN 200000.00 - 0.00 0.1130
PETE 175000.00 25000.00 25000.00 0.0989
FRANK 150000.00 25000.00 50000.00 0.0847
CATHY 125000.00 25000.00 75000.00 0.0706
TOM 120000.00 5000.00 80000.00 0.0678
JOE 100000.00 20000.00 100000.00 0.0565
SUE 100000.00 0.00 100000.00 0.0565
DEB 90000.00 10000.00 110000.00 0.0508
JOAN 85000.00 5000.00 115000.00 0.0480
DAVE 80000.00 5000.00 120000.00 0.0452
SALLY 80000.00 0.00 120000.00 0.0452
BILL 75000.00 5000.00 125000.00 0.0424
BOB 70000.00 5000.00 130000.00 0.0395
ANN 60000.00 10000.00 140000.00 0.0339
PAUL 50000.00 10000.00 150000.00 0.0282
KAREN 50000.00 0.00 150000.00 0.0282
MARY 50000.00 0.00 150000.00 0.0282
RAY 45000.00 5000.00 155000.00 0.0254
TRACY 40000.00 5000.00 160000.00 0.0226
JIM 25000.00 15000.00 175000.00 0.0141
First, let’s look at the LEAD function. We want to get the difference between each person’s salary, and the salary of the person directly before them in result set. Since the result set is returned in descending order of SALARY, Ellen is listed first with the highest salary and has a null value for LEAD, since nobody is listed before her. Pete is next, and his salary is 25,000.00 less than Ellen’s. Frank’s salary is 25,000.00 less than Pete’s, and so on. In the event of a two or more people with equal salaries, the first one listed has a delta salary compared to the person before him/her, and the second (and subsequent) “ties” have a LEAD value of 0.00.
The FIRST_VALUE function in this example is showing the difference between the person’s own salary and the salary of the highest paid person in the department (Ellen, with 200,000.00).
The RATIO_TO_REPORT function in this example shows the ratio (or percentage) of each person’s salary compared to the total salary of everybody listed. The total salary of the entire table is 1,770,000.00. With a salary of 200,000.00, Ellen is making 0.1130 of the total salary in the table. Jim, at the bottom with a salary of only 25,000.00, is only making 0.0141 of the total salary in the table.
OLAP functions can also be “partitioned” by another value. If our table contained a column for the DEPARTMENT that each employee worked in, the OLAP functions could be further broken down such that the FIRST_VALUE and RATIO_TO_REPORT functions would be relative to each employee’s standing within their department rather than the entire table as a whole.