Troubleshooting of DB2 redirection recovery and rollback

Source: Internet
Author: User

Foreword
An important part of database management and maintenance work is the regular backup and recovery of the database. The importance of this kind of work is not only reflected in the preservation of data and the disaster tolerance of the system, but also in the establishment of a database environment for the development and testing of application systems.

Imagine that in an application environment that is constantly updated and upgraded, the database data is constantly updated, and program developers are constantly developing new versions, establishing test data and environments, testing the applications, and then releasing them to the production environment. In this way, for a highly complex application, using real data in the production environment to establish a test environment becomes a reasonable choice. Of course, before using real data, some sensitive data may need to be deleted or modified in order to protect the company's interests.

In the above scenario, the database administrator will copy the data of the production environment to the development and test environment according to the requirements of the project development. Database redirection recovery technology provides a faster way to help administrators complete this work.

This article is aimed at readers who have preliminary knowledge of DB2 recovery, focusing on the problems often encountered in the process of redirected recovery and roll forward, and through some problems encountered in practical applications, how to avoid in advance and after the problem occurs Solution.

Introduction to DB2 redirected recovery and rollforward knowledge
From the scenario assumed in the previous section, we know that DB2 redirected recovery is often used to restore the database in different environments. Differences in these environments will cause some trouble for redirected recovery. For example: the memory of the production environment is usually larger than that of the test environment, and the space allocated to the transaction log in the production environment is also larger. In addition, the most obvious difference is that the location of the table space in the test environment is not the same as that in the original production environment. It's the same. For these differences, understanding and mastering in advance will help prevent and solve problems encountered during database recovery. The operation methods and related commands of redirect recovery and roll forward are introduced below.

Automatically generate redirection recovery script and redirection recovery status query
DB2 provides commands for users to extract database redirection recovery scripts from a database backup file. The example is as follows, where / db2_backup / db2inst1 / sample is the directory where the database backup file is located, and 20101023180128 is the timestamp of the database backup file.


db2 restore db sample from / db2_backup / db2inst1 / sample taken at 20101023180128
redirect generate script redirect_sample.sql
DB20000I The RESTORE DATABASE command completed successfully.
The generated redirect file redirect_sample.sql can be divided into three parts:

1. Restore statement

This statement is used to mark the beginning of a redirected restore operation command. It adds the redirect parameter to the normal restore command.


RESTORE DATABASE SAMPLE
FROM '/ db2_backup / db2inst1 / sample'
TAKEN AT 20101023180128
INTO SAMPLE
REDIRECT;
2. The set containers statement:

When the physical storage device of the target database is different from the original database, you need the following command to specify a new physical container.


SET TABLESPACE CONTAINERS FOR 0
 USING (
  PATH '/ db2inst1 / SAMPLE'
 );
 SET TABLESPACE CONTAINERS FOR 1
 USING (
  PATH '/ db2inst1 / temp'
 );
 SET TABLESPACE CONTAINERS FOR 2
 USING (
  DEVICE '/ dev / rsample_1G' 131072
 );
...
3. restore continue statement:

This statement means that the sequence of redirected recovery statements is completed, and the system starts to restore the database.

1
RESTORE DATABASE SAMPLE CONTINUE;
In the process of database restoration, we can check the status of restore through the list utilities command. Examples are as follows:


db2 list utilities show detail
 
ID = 4
Type = RESTORE
Database Name = SAMPLE
Partition Number = 0
Description = db
Start Time = 10/24/2010 13: 49: 17.515893
State = Executing
Invocation Type = User
Progress Monitoring:
 Completed Work = 2938126336 bytes
 Start Time = 10/24/2010 13: 49: 17.515898
The Completed Work represents the amount of data that has been completed, and the approximate completion time can be estimated compared with the size of the backup file.

Common roll-forward commands, determination of required log files and status query
There are many kinds of roll-forward commands, which are not listed here. The most commonly used statements are rollforward to and rollforward complete.

For example, using the log file of the specified directory, roll forward to a certain point in time:


rollforward db sample to 2010-11-21-17.00.00.000000
using local time overflow log path (/ db2_backup / sample / logs)
End of roll forward:


rollforward db sample complete overflow log path (/ db2_backup / sample / logs)
The most effective statement to query the rollforward status:


db2 rollforward db db_name query status
For example, if the restore ends successfully and rollforward has not started, you will get a similar result when viewing the status:


db2 rollforward db sample query status
 
 Rollforward Status
 
Input database alias = sample
Number of nodes have returned status = 1
 
Node number = 0
Rollforward status = DB pending
Next log file to be read = S0001519.LOG
Log files processed =-
Last committed transaction = 2010-10-23-08.41.52.000000 UTC
We can know that the next log file to be read by rollforward is S0001519.LOG.

In the process of database roll forward, we can also check the status of roll forward through list utilities.


$ db2 list utilities show detail
 
ID = 5
Type = ROLLFORWARD RECOVERY
Database Name = SAMPLE
Partition Number = 0
Description = Database Rollforward Recovery
Start Time = 10/25/2010 01: 45: 44.392021
State = Executing
Invocation Type = User
Progress Monitoring:
 Phase Number [Current] = 1
 Description = Forward
 Completed Work = 824384727 bytes
 Start Time = 10/25/2010 01: 45: 44.392051
 
 Phase Number = 2
 Description = Backward
 Completed Work = 0 bytes
 Start Time = Not Started
Analysis of common problems of DB2 redirection recovery
In the three stages of DB2 redirected recovery, errors often occur in the second stage, which is when set tablespace containers. Here are some common errors, and the solutions and prevention of these errors. For reference.

Size calculation error for bare device type containers
Commands and results:

1
2
db2 set tablespace containers for 8 using (DEVICE '/ dev / rsample_1G' 262144)
SQL1422N The size of the container is invalid. SQLSTATE = 54039
Solutions and prevention:

The container size 262144 is incorrect. Combine the check result of lslv and the pagesize of the tablespace to recalculate the container size.


twenty one
twenty two
twenty three
twenty four
25
26
lslv rsample_1G
LOGICAL VOLUME: rsample_1G VOLUME GROUP: datavg3
LV IDENTIFIER: 00c790ea00004c000000011fb9a36069.112 PERMISSION: read / write
VG STATE: active / complete
                LV STATE: opened / syncd
TYPE: raw
                WRITE VERIFY: off
MAX LPs: 512
                 
               PP SIZE: 64 megabyte (s)
COPIES: 1
                SCHED POLICY: parallel
LPs: 16
               PPs: 16
STALE PPs: 0
                BB POLICY: relocatable
INTER-POLICY: minimum
                RELOCATABLE: yes
INTRA-POLICY: middle
                UPPER BOUND: 1024
MOUNT POINT: N / A
                LABEL: None
MIRROR WRITE CONSISTENCY: on / ACTIVE
EACH LP COPY ON A SEPARATE PV?: Yes
Serialize IO?: NO
DEVICESUBTYPE: DS_LVZ
The calculation formula is as follows: PPs * PP Size / pagesize

Here, pagesize is calculated according to 8K.


db2 set tablespace containers for 8 using (DEVICE '/ dev / rsample_1G' 131072)
DB20000I The SET TABLESPACE CONTAINERS command completed successfully.
The specified container has already been used
Commands and results:


db2 set tablespace containers for 64 using (DEVICE '/ dev / rsample_4G' 524288)
SQL0294N The contain er is already in use. SQLSTATE = 42730
Solutions and prevention:

Check the status of the bare device through lslv, pay attention to the value of LV STATE. If it is opened / syncd, it means that this container is being used by other systems. If it is closed / syncd, it may be a usable device. "Maybe" here is because sometimes, for example, after the database is stopped, all the raw devices it uses are in the closed / syncd state, but if other applications use the raw device at this time, the database will be affected. damage. So be sure to confirm before use that no other system is using this bare device.

DB2 will set some flags when using a raw device to indicate which instance of the database is using the raw device. However, when DB2 deletes a table space or one of its containers, sometimes these flags will not be cleared. At this time, although no other database is using this device, the above error will still occur. After confirming that there is no other system in use, you can use the following DB2 command to manually clear these flags.


db2untag -f / dev / rsample_4G
The specified container type is inconsistent with the original container
Commands and results:


db2 set tablespace containers for 3 using (PATH '/ db2inst1 / SAMPLE / TBS / SYSTOOL ’)
SQL0298N Bad container path. SQLSTATE = 428B2
Solutions and prevention:

The original container is of type FILE. If it is specified as PATH during redirect recovery, an error will be reported.

After modification:


db2 set tablespace containers for 3
using (File '/db2inst1/SAMPLE/TBS/SYSTOOL.DAT' 100)
DB20000I The SET TABLESPACE CONTAINERS command completed successfully.
An error occurred in the specified container name
Commands and results:


db2 set tablespace containers for 106
using (DEVICE '/ dev / dev / rsample_500M' 65536)
SQL0298N Bad container path. SQLSTATE = 428B2
Solutions and prevention:

Ensure the correctness of the container name and path.

db2 set tablespace containers for 106
using (DEVICE '/ dev / rsample_500M' 65536)
DB20000I The SET TABLESPACE CONTAINERS command completed successfully.
An error occurred during Restore db continue, the database recovery directory is full

db2 restore db sample continue
SQL2544N The directory where the database is being restored has become full.
Solutions and prevention:

Check the statement containing PATH,


set tablespace containers for 0 using (
 PATH '/ db2inst1 / SAMPLE'
);
possible reason:

The directory / db2inst1 / SAMPLE is full.

Focus on checking the usage of the directory where the SMS tablespace is located. Replace or expand the file system. You can also get more detailed information through the db2diag.log file.

Rolling forward common problem analysis
When running rollforward, the log file is missing

Commands and results:


db2 "rollforward db sample to 2010-10-24-17.00.00
using local time overflow log path (/ db2_backup / db2inst1 / logs) "
 
SQL4970N Roll-forward recovery on database "SAMPLE" cannot reach the specified
stop point (end-of-log or point-in-time) on database partition (s) "0".
Roll-forward recovery processing has halted on log file "S0102805.LOG".
Error log (db2diag.log):


2010-11-23-03.03.17.731773-300 I2966741A419 LEVEL: Error
PID: 1089734 TID: 2615 PROC: db2sysc 0
INSTANCE: db2inst1 NODE: 000 DB: SAMPLE
EDUID: 2615 EDUNAME: db2loggr (SAMPLE) 0
FUNCTION: DB2 UDB, data protection services, sqlpgasn, probe: 650
RETCODE: ZRC = 0x801000BB = -2146434885 = SQLPR_MISSING_LOGFILES
"rollforward missing log files"
Solutions and prevention:

Obtain the required log files from the backup disk. Then run the rollforward command again.

You can also use the following command to prepare the required log files in advance to avoid errors. The last log file required by rollforward to a certain point in time can be judged from "Start Time" and "End Time".


db2 list history archive log since 20101023040030 for sample | more
 List History File for sample
Number of matching file entries = 30
Op Obj Timestamp + Sequence Type Dev Earliest Log Current Log Backup ID
---- -------- ---- --- ------- ------- --------
 X D 20101023045856 1 U S0102805.LOG C0000000
-------------------------------------------
-------------------------------------------
 Comment:
Start Time: 20101023045856
 End Time: 20101026033936
 Status: A
-------------------------------------------
When running rollforward complete, the active log space is full

Commands and results:


db2 "rollforward db sample complete overflow log path / db2_backup / db2inst1 / logs)"
SQL1004C There is not enough storage on the file system to process the command.
Error log (db2diag.log):


2010-10-27-23.06.28.470787-240 I172869537A496 LEVEL: Error
PID: 1970552 TID: 5655 PROC: db2sysc 0
INSTANCE: db2inst1 NODE: 000 DB: SAMPLE
APPHDL: 0-61 APPID: * LOCAL.db2inst1.101028030442
AUTHID: DB2inst1
EDUID: 5655 EDUNAME: db2agent (SAMPLE) 0
FUNCTION: DB2 UDB, recovery manager, sqlpForwardRecovery, probe: 2610
RETCODE: ZRC = 0x850F000C = -2062614516 = SQLO_DISK "Disk full."
 DIA8312C Disk was full.
Solutions and prevention:

Modify the database configuration parameter NEWLOGPATH to specify a directory with a larger space as the active log directory. Then run the rollforward command again.

1
db2 update db cfg for sample using NEWLOGPATH / db2_backup / db2inst1_log01 / sample
Or modify the redirect restore command in advance, and specify a larger directory as the active log directory when doing database recovery. This can avoid problems encountered during the rollforward process.


db2 "restore db sample \
 from / db2_backup / db2inst1 / backup \
 taken at 20101023084025 newlogpath / db2_backup / db2inst1_log01 / sample \
 redirect "
Buffer pool related errors, solutions and prevention

Commands and results:


db2 "rollforward db sample to 2010-11-21-17.00.00.000000
using local time overflow log path (/ db2_backup / db2inst1 / SAMPLE / logs) "
SQL1218N There are no pages currently available in bufferpool "".
SQLSTATE = 57011
Error log (db2diag.log):


2010-11-24-05.19.14.842891-300 I67571A941 LEVEL: Error
PID: 296330 TID: 75304
                PROC: db2sysc 0
INSTANCE: db2inst1 NODE: 000 DB: SAMPLE
APPHDL: 0-27 APPID: * LOCAL.db2inst1.101124101914
AUTHID: DB2INST1
EDUID: 75304 EDUNAME: db2agent (SAMPLE) 0
FUNCTION: DB2 UDB, SQO Memory Management, SqloMemController :: registerConsumer, p
robe: 1000
MESSAGE: ZRC = 0x8B0F0000 = -1961951232 = SQLO_NOMEM "No Memory Available"
 DIA8300C A memory heap error has occurred.
Solutions and prevention:

The error here is that the buffer set by the source database is too large and the system where the target database is located cannot be supported. We can modify the parameter value DB2_OVERRIDE_BPF to force DB2 to use a smaller buffer. After restarting the instance, perform the rollforward operation again.

1
db2set DB2_OVERRIDE_BPF = 500 (500 is the assumed value)
Table space state-related errors, solutions, and prevention

The LOAD operation may rollforward has a certain impact. Sometimes interaction is required during the rollforward process. If you select (t), the state of the table space will be abnormal.

Commands and results:


db2 "rollforward db sample to 2010-11-19-17.00.00
 using local time overflow log path (/ db2_backup / db2inst1 / logs) "
 
 SQL3799W Load recovery for table "TEST .WORK_DETAIL" at time
"20101116221501" on node "0" is pending due to warning "-2061" with additional
 information "/ dev / null".
 Do you want to continue (c), terminate this device only (d), abort the utility (t)?
Error log (db2diag.log):


2010-11-24-11.15.39.678474-300 I806561A381 LEVEL: Warning
PID: 2126036 TID: 1 PROC: db2redom (SAMPLE) 0
INSTANCE: db2inst1 NODE: 000 DB: SAMPLE
APPHDL: 0-668 APPID: * LOCAL.db2inst1.081124154042
FUNCTION: DB2 UDB, recovery manager, sqlpRecDbRedo, probe: 2129
MESSAGE: Tablespace 27 in restore pending state.
Solutions and prevention:

We can view the LOAD operation after a certain backup time.


db2 list history backup since 20101120170928 for sample | more
If there is a rollforward table space is not available. We can restore the corresponding table space. Or delete, rebuild the corresponding table space, and import the data.

Table state related errors, solutions and prevention

After the database restore and rollforward are completed, when you check the status of each data table, you may find that some data table statuses are unavailable. The simple method is to filter the db2diag.log file to find error messages similar to the following.

Error log (db2diag.log):


2010-11-26-08.01.51.170966-300 E21515A743 LEVEL: Warning
PID: 3104786 TID: 26364 PROC: db2sysc 0
INSTANCE: db2inst1 NODE: 000 DB: SAMPLE
APPHDL: 0-380 APPID: * LOCAL.db2inst1.101126125907
AUTHID: DB2INST1
EDUID: 26364 EDUNAME: db2redom (SAMPLE) 0
FUNCTION: DB2 UDB, data management, sqldMarkObjInErr, probe: 1
MESSAGE: ADM5571W DB2 is marking the "DATA" object with id "141" in
 tablespace "8" for table "TBSPACEID = 8.TABLEID = 141" unavailable.
 Either the table will have to be dropped, or if the object is part of
 a partitioned table the partition in error can be detached or the
 index in error can be dropped.
Solutions and prevention:

In general, we need to delete and rebuild these data tables. If necessary, we can restore data from other environments or backups.

Conclusion
This article lists some frequently encountered problems and solutions in DB2 redirected recovery, hoping to help readers solve some of the problems encountered in actual work.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.