lunes, 31 de diciembre de 2012

Using Oracle Enterprise Manager Cloud Control 12c with Filer Snapshotting


Using Oracle Enterprise Manager Cloud Control 12c with Filer Snapshotting

by Porus Homi Havewala
December 2012

Introduction

Network Appliance's network-attached storage (NAS) filers have gained popularity in the corporate world because of their relative cost efficiency compared to Storage Area Networks (SANs). In NAS, servers are connected to the storage system by a standard Ethernet network, and they use standard file access protocols such as NFS and CIFS in order to make requests to the storage.
Oracle Databases are in production all over the world on NAS filers, including databases inside Oracle itself. Where EMC SANs use the concept of a Business Continuance Volume (BCV) to make a copy of an entire SAN volume (with the BCV utilizing as much expensive SAN space as the original volume), filers rely on a different backup paradigm, known as a snapshot.
Network Appliance's snapshot technology allows the creation of read-only copies of an entire file system. Such snapshots can be taken at any time and at predetermined intervals. This allows easy reverting to a file or file system in seconds. Another benefit is that snapshots also take up a minimal subset of disk space as compared to the original volume. The amount of space contained in a snapshot is calculated by the number of 4K data blocks that have been changed in the active file system, but are still held in a snapshot.
Snapshot restore software is then used to recover data, if required. A single file or an entire volume can be recovered using stored snapshot copies.
The alternative to snapshotting would be to use Oracle's Recovery Manager (RMAN), a technology that has evolved over the years and now consists of advanced mainframelike features such as block-level recovery, backup set compression, and the ability to back up a database without the need to place the database in the special backup mode. RMAN thus places less stress on redo log generation, whereas snapshotting still requires the old method of placing the entire database in backup mode.
Therefore, RMAN as a backup technology is clearly superior in the database technology sense of the word. However, the fact remains that in organizations with large filer implementations, the snapshot technology is normally agreed upon (by management) as the primary backup method for Oracle Databases, even if this means placing databases in backup mode. This may be due to the business compulsion to make use of the available filer technology for snapshotting.
The benefit is that the snapshotting takes place in the background and in minimal time, so the databases need not stay in backup mode for long. Also, for large databases above 200GB or 300GB in size, it makes sense to use the snapshotting method to save on the space required for backup. For databases smaller than 200GB, the RMAN-to-disk backup method can be used, providing faster recoverability and the ability to switch the database over to the RMAN backup copy in an emergency. Furthermore, RMAN features such as block-level recovery and compression can be used.
Oracle Enterprise Manager Cloud Control 12c is Oracle's premium enterprise management software for managing the entire Oracle technology stack—Oracle Databases, Middleware, Applications, Servers and Storage—across the enterprise. Highly scalable, Oracle Enterprise Manager Cloud Control 12c is typically used in large organizations that need a central repository of Oracle installations. Such central repositories are used for management purposes as well as for day-to-day administration and management activities by the database administrator (DBA) teams, as well as infrastructure, middleware and application teams.
Oracle Cloud Control has an excellent facility for creating and scheduling RMAN backups of the database using the GUI interface of the management console. The DBA can log in to the Oracle Cloud Control console, then go to the Database target, and selectAvailability..Backup & Recovery from the drop-down menu where the Backup Settings and Schedule Backup links are visible. Using these links, it is possible to set up an RMAN backup of any database under the DBA's control in under 10 minutes, running as a scheduled job in Oracle Enterprise Manager, without modifying and installing any shell scripts at the OS level.
Before this facility was available, UNIX shell scripts had to be written, tested, modified for each new database or server, and then installed and tested on each server, an activity that would take hours. Since the current push of many organizations is away from manual scripting, due to the high costs involved in maintaining such scripts and the costs in training new staff, this facility provided by Oracle Enterprise Manager was enthusiastically welcomed.
Support of snapshotting is important for organizations using snapshotting technology as the primary method of backing up large Oracle Databases. In this article, I will explain how to use Oracle Cloud Control to schedule snapshot backups, and as a plus point, to interface with RMAN. (Note however that this interface is only for the purpose of cataloging the snapshot copies in the database control file or the RMAN repository if used; no block-level recovery is possible.)

Background

Before starting this setup procedure, please note that the database volumes being used by the database should be set up without any scheduled snapshots at the filer level, in case the filer administrators have scheduled snapshots every hour along with nightly snapshots. Such hourly and nightly snapshots are a waste of space and should be removed, because snapshots of database files are considered for the most part useless (except in certain circumstances) if the database tablespaces have not been placed in backup mode. Oracle Cloud Control itself, via script, will be responsible for logging in remotely to the filer, and for creating the database snapshots at the filer level.
At the central Oracle Cloud Control level (we'll call it the EMCENTRAL site), we'll create a snapshot script to interact with the database and filer, and then create four snapshots per database to occur each day at six-hour intervals, as follows:
  • Snapshot 0 at 24:00 hours
  • Snapshot 1 at 06:00 hours
  • Snapshot 2 at 12:00 hours
  • Snapshot 3 at 18:00 hours
These snapshot copies will be recycled each day. Each snapshot will be deleted and then recreated at the appropriate time via a simple UNIX shell script placed on the database server, or in the case of an Oracle RAC cluster, on each node in the same location. It is also possible to place the script directly in Oracle Cloud Control, thus avoiding any script at the UNIX level.
For the purpose of filer capacity calculation, we have assumed that roughly 10 percent of blocks will change between the six-hour snapshots; but this will depend on application activity. (The amount of space contained in a snapshot is calculated by the number of 4K data blocks that have been changed in the active file system, but are still held in a snapshot.) Thus, if each snapshot takes up 10 percent of the database volume size, and 30 snapshots are stored, this would occupy extra filer capacity of 300 percent of the database volume size. This figure could be more or less, depending on application activity.
There is another storage caveat: If any file is deleted from the filer, the space cannot be reused unless the entire snapshot referring to that file is deleted from the filer. This may lead to an overhead of 10 to 20 percent of extra filer disk space as a requirement for the database volumes.

Cloud Control Filer Snapshot Setup

We presume that in a large organization with different departments or projects and different DBA teams looking after each department, the central Oracle Cloud Control site would normally have target groups set up by the central database team. These are lists of targets each DBA team should have access to, and include the host, database, listener, and other targets on each server. To handle the target group, a separate administrator, who has appropriate rights over the target group, can be created in Oracle Cloud Control.
Say, for example, that we are using the convention of CoXYZ_DeptABC_P for naming a production target group as well as the special administrator name. We can then proceed as follows:
  1. Log in to the Cloud Control console as the administrator CoXYZ_DeptABC_P .
  2. Navigate to Enterprise..Job..Library, and create a host command job.
  3. Select the target of the job as the host itself. (Note: In the case of an Oracle RAC, the cluster target may appear as down; whereas both agents, both hosts, and both instances may show as up in Oracle Cloud Control. In this case, only the host can be selected as a target and not the cluster. If the cluster is selected, the job is suspended immediately.)
  4. Type in the Job Name as SnapShotRmanJob
  5. Then type in the following command:
    /bin/ksh /home/oracle/dba/scripts/db_snapshot_backup.sh
  6. Under the Schedule, select a start time of either 12 a.m. or 6 a.m. or 12 p.m. or 6 p.m., and then set this to repeat every six hours.
Place the script supplied on both cluster nodes in the same location. In the script, change the line "SIDNAME="sidname1"" manually to the right instance security identifier (SID) as per the node, since in the case of Oracle RAC, the SIDS or instances are numbered from 1 to n. Other changes are also required, such as the IP address of the filer, and the volume names used by the database on the filer. In the case of the script, an Oracle RAC database with only two nodes is presumed.
The script supplied as a part of this article works in the following manner:
First, SQL commands are issued to ensure that an RMAN script will be generated on the server that will first uncatalog datafile copies in the previous six-hour database snapshot of the preceding day. This first RMAN script is then executed.
Second, another RMAN script is generated that will back up the control file of the database. This is also executed.
Third, a final RMAN script is generated that will alter the tablespaces into hot backup mode, log in remotely to the filer, delete the previous six-hour snapshots, again log in remotely to the filer, take a new six-hour snapshot, then alter the tablespaces out of hot backup mode, and finally catalog the datafile copies in the current snapshot. This RMAN script is finally executed.
A point to note is that after the snapshot is deleted, a UNIX sleep for some minutes has been forcibly introduced in order to make sure that the snapshot delete has completed, since the snapshot deletion occurs asynchronously in the background. If the deletion has not completed and the UNIX wait is not forced, then the new snapshot creation will fail. This has been noticed at times of heavy filer activity.
When the Oracle Enterprise Manager 12c Cloud Control job has been successfully created, you can sit back and enjoy the filer backups that will occur every six hours. But don't forget to check the RMAN logs that are being generated each six hours, to see if everything is working all right.
#!/bin/ksh
#
#    db_snapshot_backup.sh
#
#    function:    dbbackup using snapshots and cataloging in rman
#    Author:     Porus Homi Havewala
#

# job name 
JOBNAME="$0"
JOB=`basename $JOBNAME`
ERRMSG="
  $JOB: `date '+%H:%M:%S'` invalid parameter:
  $JOB <dbname> 
"

DBNAME="MYDBP"
SIDNAME="MYDBP1"
# Note: change sid manually as per node

# set environment vars such as Oracle Home etc. 
. envdb.sh $SIDNAME y
if [ $? -ne 0 ]; then
  echo "$JOB: `date '+%H:%M:%S'` Failed to set oracle env vars. Aborting.."
  exit ${ERROR}
fi

BACKUPDIR=/u50/flash_recovery_area/$DBNAME

if [ ! -d $BACKUPDIR ]; then
  echo "$JOB: `date '+%H:%M:%S'` Cannot find $BACKUPDIR directory. Aborting.."
  exit ${ERROR}
fi

# Filer specifics 

# Production Filer Ip and Db volume in Production for Main Database
MYFILER="<ip_address>" 
DBVOL="mydbp_vol1"
DBFILEMOUNTPOINT="u02/oradata/MYDBP/data"
UNDOFILEMOUNTPOINT1="u02/oradata/MYDBP/MYDBP1/undo"
UNDOFILEMOUNTPOINT2="u02/oradata/MYDBP/MYDBP2/undo"

# NLS_LANG is as per db setting
case $DBNAME in
   "MYDBP") NLS_LANG=AMERICAN_AMERICA.WE8ISO8859P1 ; export NLS_LANG 
         ;;
esac

# NLS_DATE_FORMAT is as per db setting
NLS_DATE_FORMAT='YYYY-MM-DD HH24:MI:SS' ; export NLS_DATE_FORMAT

# Following NLS settings are as per db setting, exporting to Unix env just in case 
# required
NLS_TIMESTAMP_FORMAT='YYYY-MM-DD HH24:MI:SS.FF3'; export NLS_TIMESTAMP_FORMAT
NLS_TIMESTAMP_TZ_FORMAT='YYYY-MM-DD HH24:MI:SS.FF3'; export NLS_TIMESTAMP_TZ_FORMAT

# copy parameter files for the Sid to the Filer so that a copy is included in the 
# snapshot
# This is not needed since in MyDbP Rac the spfile is on shared storage
# Therefore the following lines are commented

# if [ -f $ORACLE_BASE/admin/${ORACLE_SID}/pfile/init${ORACLE_SID}.ora ]; then
# cp $ORACLE_BASE/admin/${ORACLE_SID}/pfile/init${ORACLE_SID}.ora / 
    ${DBFILEMOUNTPOINT}/oradata/${ORACLE_SID}/init${ORACLE_SID}_backup.ora
# fi
# if [ -f $ORACLE_HOME/dbs/spfile${ORACLE_SID}.ora ]; then
# cp $ORACLE_HOME/dbs/spfile${ORACLE_SID}.ora /   
    ${DBFILEMOUNTPOINT}/oradata/${ORACLE_SID}/spfile${ORACLE_SID}_backup.ora
# fi

SCRIPT=`basename $0`
LOGDIR="$HOME/dba/logs"
CMDFILE1="$HOME/dba/scripts/snap_${DBNAME}_1.cmd"
CMDFILE2="$HOME/dba/scripts/snap_${DBNAME}_2.cmd"
CMDFILE3="$HOME/dba/scripts/snap_${DBNAME}_3.cmd"

NODENAME=`hostname`
# also include node name
LOGFILE1="${LOGDIR}/${SCRIPT}_${NODENAME}_${DBNAME}_`date 
'+%d%m%y-%H:%M'`_1.log"
LOGFILE2="${LOGDIR}/${SCRIPT}_${NODENAME}_${DBNAME}_`date 
'+%d%m%y-%H:%M'`_2.log"
LOGFILE3="${LOGDIR}/${SCRIPT}_${NODENAME}_${DBNAME}_`date 
'+%d%m%y-%H:%M'`_3.log"
LOGFILE4="${LOGDIR}/${SCRIPT}_${NODENAME}_${DBNAME}_`date 
'+%d%m%y-%H:%M'`_4.log"

sqlplus -S / <<EOF 
/* sql script to generate First RMAN script that will first uncatalog 
  datafile copies in the previous 6 hourly snapshot
*/
set feedback off
set heading off
set termout off
set linesize 1000
set pagesize 0
set trimspool on
set verify off
spool $CMDFILE1
select ' #
 # DO NOT EDIT THIS FILE - created dynamically by db_snapshot_backup.sh
 #' from dual;
select ' #
 # It is assumed that any datafile that is part of the database is already
 # part of the previous 6 hourly snapshot, so "uncatalog" that 
 # datafilecopy in RMAN before delete the 6 hourly snapshot.
 # Note that cannot "delete" the datafilecopy (as opposed to uncatalog)
 # because RMAN will try to delete the actual datafilecopy in the 
 # snapshot directory. This operation will fail since snapshots are read-only.
 #' from dual;
select distinct 'change datafilecopy ''' || name || ''' uncatalog;' from 
v\$datafile_copy
 where name like '%db_hot_' || to_char(sysdate,'HH24') || '%';
select 'exit;' from dual;
spool off;
exit;
EOF

rman nocatalog target=/ cmdfile=$CMDFILE1 log=$LOGFILE1
if [ $? -ne 0 ]; then
  echo "$JOB: `date '+%H:%M:%S'` Uncataloging failed, see $LOGFILE1. 
  Proceeding with next steps.."
fi

sqlplus -S / << EOF 
/* sql script to generate Second RMAN script that will backup the controlfile
*/
set feedback off
set heading off
set termout off
set linesize 1000
set pagesize 0
set trimspool on
set verify off
spool $CMDFILE2
select ' #
 # DO NOT EDIT THIS FILE - created dynamically by db_snapshot_backup.sh
 #' from dual;
select ' #
 # put a copy of the controlfile onto the Filer so that it becomes
 # part of the snapshot, also backup the controlfile to trace
 #' from dual; 
select 'sql "alter database backup controlfile to ''''/${DBFILEMOUNTPOINT}/' || 
NAME || '_controlfile_backup.ctrl'''' reuse";' from v\$database;
select 'sql ''alter database backup controlfile to trace'';' from dual;
select 'exit;' from dual;
spool off;
exit;
EOF

rman nocatalog target=/ cmdfile=$CMDFILE2 log=$LOGFILE2
if [ $? -ne 0 ]; then
  echo "$JOB: `date '+%H:%M:%S'` Backup of controlfile failed, see $LOGFILE2. 
  Proceeding with next steps.."
fi

sqlplus -S / << EOF 
/* sql script to generate Third RMAN script that will alter the tablespaces 
  into hot backup mode, delete the previous 6 hourly snapshot, 
  take a new 6 hourly snapshot, alter the tablespaces out 
  of hot backup mode and catalog the datafile copies in the current snapshot 
*/
set feedback off
set heading off
set termout off
set linesize 1000
set pagesize 0
set trimspool on
set verify off
spool $CMDFILE3
select ' #
 # DO NOT EDIT THIS FILE - created dynamically by db_snapshot_backup.sh
 #' from dual;
select ' #
 # bring existing backup mode tablespaces out of backup mode if any
 #' from dual; 
select 'sql ''alter tablespace ' || name || ' end backup'';' from V\$TABLESPACE 
where ts# in (select ts# from v\$datafile where file# in (select file# from v\$backup 
where status='ACTIVE') minus select ts# from v\$tempfile);
select ' #
 # Alter the tablespaces into backup mode
 #' from dual; 
select 'sql ''alter tablespace ' || name || ' begin backup'';' from V\$TABLESPACE 
where ts# not in (select ts# from v\$tempfile);
select ' #
 # Delete the previous 6 hourly snapshot 
 #' from dual; 
select 'host ''ssh oracledba@${MYFILER} snap delete ${DBVOL} db_hot_' || 
to_char(sysdate,'HH24') ||'_${DBNAME} | tee -a $LOGFILE4 ' || ''';' from dual;
select 'host ''sleep 480'';' from dual;
 # The sleep is to make sure the snap delete has completed, since it occurs in the 
 # background
select ' #
 # create a new 6 hourly snapshot which will replace the one just deleted
 #' from dual; 
select 'host ''ssh oracledba@${MYFILER} snap create ${DBVOL} db_hot_' || 
to_char(sysdate,'HH24') ||'_${DBNAME} | tee -a $LOGFILE4 ' || ''';' from dual;
select ' #
 # alter the tablespaces out of backup mode
 #' from dual; 
select 'sql ''alter tablespace ' || name || ' end backup'';' from V\$TABLESPACE 
where ts# not in (select ts# from v\$tempfile);
select 'sql ''alter system archive log current'';' from dual;
select 'host ''sleep 480'';' from dual;
select ' #
 # catalog the datafile copies that are part of the 6 hourly snapshot created
 #' from dual; 
select 'catalog datafilecopy ''/${DBFILEMOUNTPOINT}/.snapshot/db_hot_' || 
to_char(sysdate,'HH24') ||'_${DBNAME}' || substr(name,24) || 
''';' from v\$datafile where ts# not in (select ts# from v\$tablespace 
where name like 'UNDO%');
-- the substr(name,24) is to remove the /${DBFILEMOUNTPOINT} which is in the 
-- datafile but not in the snapshot file below the snapshot subdirectory

select 'catalog datafilecopy ''/${UNDOFILEMOUNTPOINT1}/.snapshot/db_hot_' || 
to_char(sysdate,'HH24') ||'_${DBNAME}' || substr(name,31) || 
''';' from v\$datafile where ts# in (select ts# from v\$tablespace where name in 
(Select value From SYS.GV_\$parameter where name = 'undo_tablespace' and inst_id = 1));
-- the substr(name,31) is to remove the /${UNDOFILEMOUNTPOINT1} which is in the datafile 
-- but not in the snapshot file below the snapshot subdirectory

select 'catalog datafilecopy ''/${UNDOFILEMOUNTPOINT2}/.snapshot/db_hot_' || 
to_char(sysdate,'HH24') ||'_${DBNAME}' || substr(name,31) || ''';' from v\$datafile 
where ts# in (select ts# from v\$tablespace where name in (Select value 
From SYS.GV_\$parameter where name = 'undo_tablespace' and inst_id = 2));
-- the substr(name,31) is to remove the /${UNDOFILEMOUNTPOINT2} which is in the 
-- datafile but not in the snapshot file below the snapshot subdirectory

select 'exit;' from dual;
spool off;
exit;
EOF

rman nocatalog target=/ cmdfile=$CMDFILE3 log=$LOGFILE3
if [ $? -ne 0 ]; then
  echo "$JOB: `date '+%H:%M:%S'` rman failed, see $LOGFILE3. Aborting.."
  exit ${ERROR}
fi

echo "$JOB: `date '+%H:%M:%S'` $ORACLE_SID snap backup completed"

# end db_snapshot_backup.sh

About the Author

Porus Homi Havewala is the Senior Manager in the Enterprise Technology Program Office of Oracle Singapore, and is the ASEAN regional SME and business development lead on Oracle Enterprise Manager technology. A Double Oracle Certified Master (OCM) in Oracle 10g and Oracle 11g, Porus has more than 25 years of experience in the IT industry, including more than 18 years of experience using Oracle technologies. He is the author of two books, Oracle Enterprise Manager Cloud Control 12c: Managing Data Center Chaos(2012, Packt Publishing) and Oracle Enterprise Manager Grid Control: Advanced OEM Techniques for the Real World (2010, Rampant TechPress), and the author of the Oracle Enterprise Manager Cloud Control 12c blog.  LinkedIn 

Source: http://www.oracle.com/technetwork/articles/oem/havewala-filer-oem12c-1883280.html

1 comentario: