Wednesday, June 5, 2013

Common Table Expressions and Recursive SQL

A common table expression is basically a temporary table that exists only for the duration of a single SQL statement. Ideally, this can be used to replace a process in which data is gathered and stored in an interim table, which is then accessed by a subsequent SQL statement. This can now be done in a single SQL statement.

A common table expression can reference another common table expression that was defined earlier within the same SQL statement, thus allowing you to “stack” a number of intermediate processing results and process them all within a single execution of an SQL statement.

A common table expression can also reference itself, which allows for recursive processing within SQL. This is especially useful when processing tables that are self-referencing, as in a hierarchy. An example of this would be an employee table that references the employee’s manager, who in turn is also an employee with their own manager, etc.

Example of a recursive common table expression:

Given the following table:

CREATE TABLE PARTLIST
    (PART CHAR(2)
    ,SUBPART CHAR(2)
    ,QUANTITY INTEGER
    );

. . . that is populated as follows:

PART    SUBPART    QUANTITY

00      01          5
00      05          3
01      02          2
01      03          3
01      04          4
01      06          3
02      05          7
02      06          6
03      07          6
04      08         10
04      09         11
05      10         10
05      11         10
06      12         10
06      13         10
07      14          8
07      12          8

Say you have a need to answer the question “what is the total quantity of each part required to build part ‘01’?”

This can be done with the following recursive common table expression:

WITH RPL (PART, SUBPART, QUANTITY) AS
    (SELECT ROOT.PART, ROOT.SUBPART, ROOT.QUANTITY
         FROM PARTLIST ROOT
         WHERE ROOT.PART = '01'
     UNION ALL
     SELECT PARENT.PART, CHILD.SUBPART,
            PARENT.QUANTITY*CHILD.QUANTITY
         FROM RPL PARENT, PARTLIST CHILD
         WHERE PARENT.SUBPART = CHILD.PART
    )
    SELECT PART, SUBPART, SUM(QUANTITY) AS TOTAL_QTY_USED
        FROM RPL
        GROUP BY PART, SUBPART
        ORDER BY PART, SUBPART;

The result of the query is shown below:

PART    SUBPART    TOTAL_QTY_USED

01      02           2
01      03           3
01      04           4
01      05          14
01      06          15
01      07          18
01      08          40
01      09          44
01      10         140
01      11         140
01      12         294
01      13         150
01      14         144