DB2 magazine is a quarterly journal dedicated to database administrators, analysts, programmers, designers, consultants, and MIS/DP managers covering all DB2 platforms (including IBM AIX, Hewlett-Packard HP-UX, Sun Solaris, SCO UnixWare, Linux, Microsoft Windows NT, Microsoft Windows 95, Microsoft Windows 98, IBM OS/2, IBM OS/400, IBM MVS, IBM os/390, Various topics for IBM VM and IBM VSE.
To get the best performance from DB2 Information Integrator, the DBA needs to know something.
Most DBAs know what needs to be done to discover, evaluate, and implement improved methods that help improve their performance for poorly performing SQL queries. For federated queries, however, DBAs must learn new tricks and broaden the general Query-tuning task list to handle the new features provided by DB2 Information Integrator (DB2 II).
DB2 II supports federated queries across a variety of data sources, including SQL Server, Oracle, Informix, Sybase, and DB2 Universal Database (UDB) for z/OS, and ISeries, Linux, Un IX and Windows. The DB2 II product is a middleware that includes optimization and data access technologies that fully leverage the capabilities of the underlying data sources and provide a unified access language (SQL) for these different sources.
The DB2 II product has opened the door to information sources that were previously inaccessible or inaccessible. However, with each new data support technology emerging, new tuning opportunities have emerged. Using DB2 II does not guarantee optimal execution of each query. What the DBA is doing is still discovering and tuning poorly performing queries.
The federated object poses new challenges. In this issue, I will explain the tasks and tools required to tune the federated query. This article assumes that you have a basic understanding of DB2 II. You can review the knowledge of DB2 II through the articles listed in the Resources section, or through Ibm.com/software/data/integration. If you want to know how Merrill Lynch (my employer) uses DB2 II to solve real-life business problems, see the article "Taming the Beast".
Basic Tuning Tasks
Tuning any of the poorly performing SQL statements requires a methodical approach. Changing parameters without a plan makes it difficult to determine the effect of each change you make. If a change is counterproductive to performance, you need to know which changes should be returned.
Regardless of the tuning SQL, you should perform the following tasks:
Evaluates current performance and logs to the document.
Talk to the person who created the query and get reasonable performance expectations.
Establish business objectives for the query.
For queries from production systems, get a Explain plan.
If possible, rebuild the environment on the test system and rerun the query.
For all tables related to a query, be sure that you have recently executed a Runstats or equivalent remote command on these tables.
Ensure that the tables are reorganized to match their clustered indexes.
Find the SQL functions on the indexed columns in the WHERE clause, which may cause the optimizer to ignore the index.
Make sure that you use indexed columns wherever possible in the WHERE clause.
Find Cartesian product (Cartesian products).
Tracks the effect of each change, tracking one item at a time.
Federal Tuning Mission
Because federated queries are connected to different data sources, the first step in tuning is to split the query into two parts: local and remote. The local part uses objects that reside on the DB2 II server, and the remote part uses objects that reside on the remote host. Partitioning federated queries in this way helps to determine the root cause of performance problems.
All federated objects in the DB2 II server are identified as nicknames (Type=n) in the Syscat.tables catalog table. You can also identify federated objects in a query by using the EXPLAIN command, which displays the global access plan selected by the DB2 II engine. The output of the EXPLAIN command is the only way to determine whether the optimizer will use the overlay (push-down) processing (that is, the processing performed on the remote server) for the query. DB2 II uses information from wrappers, servers, and nickname objects to determine what tasks can be superimposed on a remote server.
Listing 1 shows an instance of a federated query stored in the Db2ii-query.sql file. Queries on bold-marked tables (MIDS.TBACCT and MIDS.TBACCT-HLDR) refer to remote objects. Table Hjg.iitbl is local to the DB2 II server. Explain will break down this query into several parts. For the part that will use the overlay, mark the word ship. Listing 2 shows an abridged edition of the Explain output. The optimizer marks the subquery #1 as "shipable".
Listing 1. Federated SQL Query
db2expln -d dbdsdr -stmtfile db2ii-query.sql -o db2ii-query.out -terminator ";" -g
select a.dsssca-no
, a.dssas-ty-cd
, imlp-1st-nm
, imlp-lst-nm
, mlp-st-cd
ac-eom-ast-am
from mids.tbacct a
, mids.tbacct-hldr ah
,hjg.iitbl
where a.dsssca-no =ah.dsssca-no
and ac-eom-ast-am > 1000000
and st-cd =mlp-st-cd and-mlp-st cd ='ny'
fetch first 1 rows only
Listing 2. Explain output, which marks the query to do the overlay processing
| Ship Distributed Subquery #1
| | #Columns = 6
Return Data to Application
| #Columns = 6
This part of the query is generated by the optimizer, which contains only those objects that access the remote data source. Listing 3 shows a portion of the Explain output, which has the rewritten query. The output in this list contains information about the query, including the SQL that will be sent to the remote data source, the nickname used, and the name and associated version of the remote server (here is midasp). If you are interested in the full Explain output, please download the file: IBM DB2 Universal Database SQL Explain Tool (DOC, 30K).
Listing 3. Optimizes the query that is written to access the remote object. Distributed substatement #1:
Server: MIDASP (DB2/UDB 7.1)
SQL Statement:
SELECT A0."DSSSCA-NO", A0."AC-EOM-AST-AM", A0."DSSAS-TY-CD", A1."MLP-ST- CD", A1."IMLP-1ST-NM", A1."IMLP-LST-NM"
FROM "MIDS"."TBACCT" A0, "MIDS"."TBACCT-HLDR" A1
WHERE (1000000 < A0."AC-EOM-AST-AM") AND (A1."MLP-ST-CD" ='NY') AND (A0."DSSSCA-NO" =A1."DSSSCA-NO") FOR READ ONLY
Nicknames Referenced:
MIDS.TBACCT-HLDR ID = 33125
Base = MIDS.TBACCT-HLDR
MIDS.TBACCT ID = 33124
Base = MIDS.TBACCT