Wednesday, June 5, 2013

Multi-Row Fetch and Insert

DB2 is a relational database, which means that it works with “sets” of data at all times. DB2 tables are basically sets of zero to many rows of data, and SQL statements often process against sets of data (i.e. tables) and return sets of data as their results (i.e. cursors). However, because DB2 application programs are usually written in third generation programming languages such as COBOL, they can only process one row of data at a time. When working with a cursor that contains a result set, rows are fetched one at a time from within a loop in the program and processed as they are retrieved. If the result set cursor contains 100 rows, for example, then processing bounces back and forth between the application program and DB2 100 times, once for each fetch. This constant context switching between the application program and DB2 can generate a fair amount of CPU overhead, an inherent cost of the 3GL “one row at a time” style of processing.
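The traditional row-at-a-time pattern looks something like the sketch below. The cursor, table, and host-variable names are all illustrative, not from any particular application:

```cobol
      * Hypothetical single-row cursor: one trip to DB2 per row.
           EXEC SQL
               DECLARE EMP-CSR CURSOR FOR
               SELECT EMPNO, LASTNAME
               FROM   EMP
           END-EXEC.

           EXEC SQL OPEN EMP-CSR END-EXEC.

           PERFORM UNTIL SQLCODE = +100
               EXEC SQL
                   FETCH EMP-CSR
                   INTO :WS-EMPNO, :WS-LASTNAME
               END-EXEC
               IF SQLCODE = 0
                   PERFORM PROCESS-ONE-ROW
               END-IF
           END-PERFORM.

           EXEC SQL CLOSE EMP-CSR END-EXEC.
```

Each iteration of the PERFORM loop is a separate context switch into DB2, which is where the overhead described above comes from.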

Multi-row fetch alleviates this problem. A cursor in an application program can now return multiple rows, up to 32,767 at a time, in a single FETCH statement. This is done by declaring host-variable arrays rather than single host-variable fields within the host COBOL program, and coding the FETCH statement with the number of rows to be retrieved on each call. The result is that with multi-row fetch in COBOL programs, much of the CPU overhead of repeated fetches can be reduced or eliminated, improving both overall CPU consumption and elapsed processing time.
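A multi-row version of the same kind of cursor might look like the following sketch. The cursor is declared WITH ROWSET POSITIONING, the host variables become OCCURS arrays, and each FETCH asks for a rowset; the names and the rowset size of 100 are assumptions for illustration:

```cobol
       WORKING-STORAGE SECTION.
      * Host-variable arrays sized to match the rowset.
       01  WS-EMPNO-ARR      PIC X(06)  OCCURS 100 TIMES.
       01  WS-LASTNAME-ARR   PIC X(15)  OCCURS 100 TIMES.
       01  WS-ROWS-FETCHED   PIC S9(9)  COMP.

       PROCEDURE DIVISION.
           EXEC SQL
               DECLARE EMP-CSR CURSOR
               WITH ROWSET POSITIONING FOR
               SELECT EMPNO, LASTNAME
               FROM   EMP
           END-EXEC.

           EXEC SQL OPEN EMP-CSR END-EXEC.

           PERFORM UNTIL SQLCODE = +100
               EXEC SQL
                   FETCH NEXT ROWSET FROM EMP-CSR
                   FOR 100 ROWS
                   INTO :WS-EMPNO-ARR, :WS-LASTNAME-ARR
               END-EXEC
      * SQLERRD(3) holds the number of rows just fetched; the
      * final rowset may be partial, so it must still be processed.
               MOVE SQLERRD(3) TO WS-ROWS-FETCHED
               IF WS-ROWS-FETCHED > 0
                   PERFORM PROCESS-ROWSET
               END-IF
           END-PERFORM.

           EXEC SQL CLOSE EMP-CSR END-EXEC.
```

Note that the last fetch can return SQLCODE +100 together with a partial rowset, which is why the row count is checked rather than relying on SQLCODE alone.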

Benchmark testing of multi-row fetch has shown that performance gains begin to appear at around 10 rows per fetch, with optimal performance at around 100 or more rows per fetch.

Similar performance gains can be seen with multi-row insert, which works on the same principle: rows to be inserted are staged in host-variable arrays and then sent to DB2 in a single INSERT statement.
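A multi-row insert might be sketched as follows; again the table and field names are hypothetical. The NOT ATOMIC clause tells DB2 to keep going past individual row failures rather than rolling back the whole statement:

```cobol
      * Hypothetical multi-row insert: 100 staged rows, one statement.
       01  WS-EMPNO-ARR      PIC X(06)  OCCURS 100 TIMES.
       01  WS-LASTNAME-ARR   PIC X(15)  OCCURS 100 TIMES.
       01  WS-ROW-COUNT      PIC S9(4)  COMP VALUE +100.

           EXEC SQL
               INSERT INTO EMP (EMPNO, LASTNAME)
               VALUES (:WS-EMPNO-ARR, :WS-LASTNAME-ARR)
               FOR :WS-ROW-COUNT ROWS
               NOT ATOMIC CONTINUE ON SQLEXCEPTION
           END-EXEC.
```

The choice between ATOMIC and NOT ATOMIC is a design decision: ATOMIC backs out every row if any one fails, while NOT ATOMIC inserts the rows that succeed and leaves the program to diagnose the failures.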

Multi-row fetch and multi-row insert require a more sophisticated level of error checking, because a single statement can now partially succeed. The GET DIAGNOSTICS statement is available to help with error handling when using multi-row fetch/insert processing.
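For example, after a NOT ATOMIC multi-row INSERT, GET DIAGNOSTICS can report how many error conditions occurred and retrieve the SQLCODE for each one. A minimal sketch, with illustrative field names and a hypothetical LOG-ROW-ERROR paragraph:

```cobol
       01  WS-NUM-CONDITIONS  PIC S9(9)  COMP.
       01  WS-COND-SQLCODE    PIC S9(9)  COMP.
       01  WS-I               PIC S9(9)  COMP.

      * How many error conditions did the last statement raise?
           EXEC SQL
               GET DIAGNOSTICS :WS-NUM-CONDITIONS = NUMBER
           END-EXEC.

      * Retrieve the SQLCODE for each failed row in turn.
           PERFORM VARYING WS-I FROM 1 BY 1
                   UNTIL WS-I > WS-NUM-CONDITIONS
               EXEC SQL
                   GET DIAGNOSTICS CONDITION :WS-I
                       :WS-COND-SQLCODE = DB2_RETURNED_SQLCODE
               END-EXEC
               PERFORM LOG-ROW-ERROR
           END-PERFORM.
```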

Retrofitting multi-row fetch/insert processing into an existing DB2 COBOL program requires a moderate amount of programming changes, but the performance gains that can be attained may make it worth the effort if the program's typical workload involves fetching or inserting large sets of data per transaction, and CPU consumption and elapsed runtime are chronic constraints for the application.