Monday, July 19, 2010

Oracle10g RMAN Recovery Catalog Known Performance Issues

In this Document
Description
Likelihood of Occurrence
Possible Symptoms
Workaround or Resolution
Patches
Modification History
References


Applies to:

Oracle Server - Enterprise Edition - Version: 10.1.0.2 to 10.2.0.4 - Release: 10.1 to 10.2
Information in this document applies to any platform.

Description

The bugs listed below affect the RMAN recovery catalog and degrade RMAN performance. To reduce the impact the catalog can have on backup and recovery operations, make sure the workarounds for all of these bugs are implemented in the recover.bsq file.

NOTE: Before making any changes to recover.bsq you should make a backup copy of the original file.
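The backup step in the note above can be sketched as a small shell function. This is a minimal sketch, assuming ORACLE_HOME is set and that recover.bsq lives at the path given in the Patches section; the function name backup_recover_bsq and the ".orig.<timestamp>" suffix are illustrative choices, not part of the original note.

```shell
#!/bin/sh
# Copy recover.bsq to a timestamped backup next to the original.
# Assumption: ORACLE_HOME points at the home whose recover.bsq will be edited.
backup_recover_bsq() {
    if [ -z "$ORACLE_HOME" ]; then
        echo "ORACLE_HOME is not set" >&2
        return 1
    fi
    src="$ORACLE_HOME/rdbms/admin/recover.bsq"
    dest="$src.orig.$(date +%Y%m%d%H%M%S)"
    if [ ! -f "$src" ]; then
        echo "not found: $src" >&2
        return 1
    fi
    # -p keeps the original timestamps and permissions on the copy.
    cp -p "$src" "$dest" && echo "$dest"
}
```

Calling backup_recover_bsq prints the path of the copy, which can be restored over recover.bsq if an edit needs to be backed out.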

Likelihood of Occurrence

This can happen in any environment that uses RMAN with a recovery catalog at version 10.2.0.4.0 or earlier.

Possible Symptoms

Known RMAN catalog issues causing severe performance impact on RMAN Backup & Recovery operations.

BUG 6451722 RSR TABLE IN RMAN CATALOG DATABASE NEVER GETS CLEANED

BUG 5219484 CATALOG RESYNCS ARE VERY SLOW - ROUT TABLE HAS 6 MILLION ROWS +

BUG 6476935 10.2.0.3 RDBMS SLOW RESYNC DUE TO MISSING INDEX ON RSR TABLE

BUG 6034995 NEED NEW PURGING ALGORITHM FOR ROUT ROWS DURING RESYNC

BUG 7595777 RMAN TAKES A LONG TIME BEFORE STARTING THE BACKUP

BUG 7173341 RMAN internal cleanup may be slow


Workaround or Resolution

SOLUTIONS

BUG 7173341 RMAN internal cleanup may be slow

Workaround
==========
Replace the cleanupRSR query in recover.bsq with the following, then upgrade
the recovery catalog schema using the RMAN UPGRADE CATALOG command.

DELETE FROM rsr
WHERE rsr_end < nowTime-60
AND rsr.dbinc_key IN
(select dbinc_key from dbinc
where dbinc.db_key = this_db_key) ;
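The catalog upgrade that follows the recover.bsq edit can be sketched as below. RMAN asks for UPGRADE CATALOG to be entered a second time as confirmation, so the command appears twice. This is only a sketch: the function name upgrade_catalog_commands is hypothetical, and the way the commands are fed to the RMAN client should be adapted to local practice.

```shell
#!/bin/sh
# Print the RMAN commands for a recovery catalog upgrade.
# The second UPGRADE CATALOG is the confirmation RMAN requests.
upgrade_catalog_commands() {
    printf '%s\n' \
        'UPGRADE CATALOG;' \
        'UPGRADE CATALOG;' \
        'EXIT;'
}
```

One way to use it is to save the output to a command file and pass that file to an RMAN client connected to the catalog, for example "rman catalog rco@catdb @upgrade_catalog.rman", where rco, catdb, and the file name are placeholders.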

BUG 5219484 CATALOG RESYNCS ARE VERY SLOW - ROUT TABLE HAS 6 MILLION ROWS +

Workaround
==========

See Metalink Note 378234.1 Rman Catalog Resync Operation is Very slow at 10G

Reducing the retention period used when cleaning up ROUT rows from 60 days
to 7 days will also help. The ROUT table holds the output history of
previous RMAN sessions (in most cases, the output of scheduled backups).

PROCEDURE cleanupROUT IS
  start_time       date;
  high_stamp       number;
  high_session_key number;
BEGIN
  IF (this_db_key IS NULL) THEN
    raise_application_error(-20021, 'Database not set');
  END IF;

  start_time := SYSDATE;
  high_stamp := date2stamp(start_time-7);

  SELECT nvl(max(rsr_key), 0) INTO high_session_key
    FROM rsr, dbinc
   WHERE rsr.dbinc_key = dbinc.dbinc_key
     AND dbinc.db_key = this_db_key
     AND rsr.rsr_sstamp < high_stamp;

  DELETE FROM rout
   WHERE rout_skey <= high_session_key
     AND this_db_key = rout.db_key;

  deb('cleanupROUT - deleted ' || sql%rowcount || ' rows from rout table');
  deb('cleanupROUT - took ' || ((sysdate - start_time) * 86400) || ' seconds');

END cleanupROUT;

BUG 6476935 10.2.0.3 RDBMS SLOW RESYNC DUE TO MISSING INDEX ON RSR TABLE


Workaround
==========

SQL> create index rsr_i_stamp on rsr(rsr_sstamp, rsr_srecid);

BUG 6034995 NEED NEW PURGING ALGORITHM FOR ROUT ROWS DURING RESYNC

Details:
RMAN purges rows from the ROUT table based on age only, so the table can grow excessively large and slow down resync operations.
Workaround:
Manually delete older rows from ROUT.
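A manual purge could be prepared along the following lines. This is a hedged sketch, not Oracle's procedure: it reuses the table and column names that appear in the statements in this note (rout, rout_skey, rsr_key, rsr_end, dbinc), assumes rsr_end is a DATE as the cleanupRSR workaround suggests, and uses a hypothetical function name rout_purge_sql. It only prints the statement so it can be reviewed before being run in SQL*Plus as the catalog owner, and it deliberately omits a COMMIT.

```shell
#!/bin/sh
# Print a DELETE statement that removes ROUT rows for one database
# whose RMAN sessions ended more than DAYS days ago.
# Arguments: DB_KEY (catalog key of the database), DAYS (retention in days).
rout_purge_sql() {
    db_key=$1
    days=$2
    # Accept only non-empty, purely numeric arguments.
    for value in "$db_key" "$days"; do
        case "$value" in
            ''|*[!0-9]*)
                echo "usage: rout_purge_sql DB_KEY DAYS" >&2
                return 1
                ;;
        esac
    done
    cat <<EOF
DELETE FROM rout
 WHERE rout.db_key = $db_key
   AND rout_skey <= (SELECT NVL(MAX(rsr_key), 0)
                       FROM rsr, dbinc
                      WHERE rsr.dbinc_key = dbinc.dbinc_key
                        AND dbinc.db_key = $db_key
                        AND rsr.rsr_end < SYSDATE - $days);
EOF
}
```

For example, "rout_purge_sql 1 7" prints a statement that keeps roughly the last 7 days of output for the database with DB_KEY 1. The DB_KEY value can be looked up in the RC_DATABASE catalog view.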


BUG 7595777 RMAN TAKES A LONG TIME BEFORE STARTING THE BACKUP

Details: Standby resync is too slow. This bug will be fixed in 10.2.0.5 / 11gR2.
Workaround: Replace the wasresynced function in recover.bsq with the following:
function wasresynced(until_stamp IN number
                    ,high_stamp  IN number) return number is
  nodups      number; -- number of duplicates
  high        number;
  low         number;
  resyncstamp number;
begin
  high := high_stamp;
  low := until_stamp;
  nodups := 0;
  resyncstamp := 0;
  deb('resync', 'wasresynced high_stamp=' || high_stamp ||
      ' high_date=' || stamp2date(high_stamp), dbtype);

  for duprec in duprec_c(low, high) loop
    if (dbms_rcvcat.isDuplicateRecord(recid => duprec.recid
                                     ,stamp => duprec.stamp
                                     ,type  => duprec.type)) then
      if (resyncstamp = 0) then
        resyncstamp := duprec.stamp;
      end if;

      nodups := nodups + 1;
      if (nodups >= maxdups) then
        deb('resync', 'wasresynced resyncstamp=' || resyncstamp ||
            ' resyncdate=' || stamp2date(resyncstamp), dbtype);
        return resyncstamp;
      end if;
    else -- couldn't find 16 consecutive duplicate records.
      deb('resync', 'wasresynced could not find record recid=' ||
          duprec.recid || ' stamp=' || duprec.stamp || ' type=' ||
          duprec.type || ' maxdups=' || nodups, dbtype);
      return 0;
    end if;
  end loop;

  -- Timestamp range not enough to satisfy the number of duplicates.
  -- Retry using a higher timestamp
  deb('resync', 'timestamp range not enough - nodups=' || nodups,
      dbtype);
  return -1;
end;



Patches

You can review $ORACLE_HOME/rdbms/admin/recover.bsq to verify that you have these fixes. All of the fixes except the one for bug 7595777 are in the generic 10.2.0.4.0 code line. Any earlier version requires all 4 bug fixes to avoid a performance impact from the catalog connection during backup and recovery operations.
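A quick first pass over recover.bsq can be sketched with grep. The two strings searched for are taken literally from the workarounds above (the 7-day ROUT retention and the rsr_i_stamp index); this assumes the workaround text was applied verbatim, and a formally patched file may word the same fix differently, so a "missing" result is a prompt to read the file rather than proof that a fix is absent. The function name check_bsq_workarounds is hypothetical.

```shell
#!/bin/sh
# Report which workaround strings from this note appear in a recover.bsq file.
# Argument: path to recover.bsq (defaults to the copy under ORACLE_HOME).
check_bsq_workarounds() {
    bsq=${1:-$ORACLE_HOME/rdbms/admin/recover.bsq}
    if [ ! -f "$bsq" ]; then
        echo "not found: $bsq" >&2
        return 1
    fi
    # Fixed-string matches; spacing must match the file exactly.
    for pattern in 'start_time-7' 'rsr_i_stamp'; do
        if grep -qF -- "$pattern" "$bsq"; then
            echo "found:   $pattern"
        else
            echo "missing: $pattern"
        fi
    done
}
```

Running check_bsq_workarounds with no argument checks the file under the current ORACLE_HOME; passing a path checks a specific copy, such as a backup taken before editing.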

Modification History

01-OCT-2007 Document creation
18-OCT-2007 Alert published
22-OCT-2007 Publication date added to Modification History
11-DEC-2007 Removed bug# incorrectly added

References

BUG:6451722 - RSR TABLE IN RMAN CATALOG DATABASE NEVER GETS CLEANED
BUG:6476935 - SLOW RESYNC DUE TO MISSING INDEX ON RSR TABLE
BUG:7173341 - CLEANUPRSR IS TAKING VERY LONG TIME CAUSING SLOW RESYNC
BUG:7595777 - RMAN TAKES A LONG TIME BEFORE STARTING THE BACKUP
NOTE:247611.1 - Known RMAN Performance Problems
NOTE:363409.1 - Known RMAN issues in Oracle10g
NOTE:378234.1 - Rman Catalog Resync Operation is Very slow at 10G
NOTE:413098.1 - Extremely Poor RMAN Backup Performance to NFS After Upgrade to 10.2
