
Friday

Why are scripts needlessly complex?

Tonight I’m restoring an environment to another server as a test. The source server is supported by another group, so they use their own scripts to back up the database. Before I started the restore I took a look at the scripts to see how it’s being backed up, file locations, etc., and what I found utterly shocked me.

To back up one database there were at least six scripts scheduled in cron, with a combined line count of almost 600! I couldn’t believe it. I try to keep things as simple as possible. For example, here is one of my backup scripts:

#!/bin/sh
. /home/oracle/.bash_profile
. /usr/local/bin/oraenv << END
ORCL
END

cd /home/oracle/scripts
logfile=/home/oracle/scripts/log/rman_ORCL_LVL0.log.`date '+%d%m%y'`

rman target / nocatalog CMDFILE /home/oracle/scripts/rman_ORCL_LVL0.sql LOG $logfile
status=$?

if [ $status -gt 0 ] ; then
   mailx -s "[BACKUP][FAILED] ORCL LVL0" me@myemail.com <<!
`cat $logfile`
!
else
   mailx -s "[BACKUP][SUCCESS] ORCL LVL0" me@myemail.com <<!
`cat $logfile`
!
fi

echo "Backup files removed (4+ days OLD):"
echo `find /u03/backup/ORCL -mtime +4 -print`
find /u03/backup/ORCL -type f -mtime +4 -exec rm -f {} \;

echo "Archive logs removed (2+ days OLD):"
echo `find /u03/archive/ORCL -mtime +2 -print`
find /u03/archive/ORCL -type f -mtime +2 -exec rm -f {} \;



It doesn’t get much simpler than that. I send emails on both SUCCESS and FAILURE of the backup because I’ve seen cron stop working before. I have filters in my mail client to separate them into different folders. Each day I check for failed backups, and periodically I check the successful folder to make sure my backups are working properly. The rman_ORCL_LVL0.sql file basically contains:

backup
   incremental level 0
   database
   tag 'ORCL_LVL0';

I know there are exceptions, but in this case these databases are small and have simple backup requirements. Yeah, I could spend a bit of time putting in variables so it’s more generic, but I don’t manage hundreds of databases, so that’s not a huge concern.

As for the original 600 lines of scripts… Yeah, it’s very generic, handles a multitude of scenarios and probably even does your laundry, but it’s worthless IMHO if it can’t be quickly understood. The last thing I want to be doing at 3am is trying to figure out someone’s scripts because I was the lucky person on call.

Monitoring your RMAN backup scripts

There is an easy way to monitor your RMAN backup scripts in a Unix environment. I've seen quite a few DBAs log the RMAN output to a file and then grep it for errors. For example:


rman target / CMDFILE /home/oracle/scripts/rman.sql LOG $logfile
errors=`cat $logfile | grep -i error | wc -l`
if test $errors -gt 0
then
   mailx -s "BACKUP FAILED" email1@myserver.com email2@myserver.com <<!
`cat $logfile`
!
fi

Another method would be to examine the return code from the RMAN client. RMAN returns a value of 0 to the shell if the backup was successful. Otherwise it returns a nonzero value, and from what I have seen this is always 1. (Have you seen any other values?)

So another way of writing the above script would be


rman target / CMDFILE /home/oracle/scripts/rman.sql LOG $logfile
status=$?

if [ $status -gt 0 ] ; then
   mailx -s "BACKUP FAILED: TESTDB" email1@myserver.com <<!
`cat $logfile`
!
else
   mailx -s "BACKUP SUCCESSFUL: TESTDB" email1@myserver.com <<!
`cat $logfile`
!
fi

As you can see above, after RMAN executes I set a status variable; $? is automatically set to the return code from the RMAN client. I then check the value, and if there was an error I send an email saying the backup failed, with the contents of the log file in the body of the message. If the backup was successful, I still send a copy of the logfile.

The reason I do this is that each day I expect to see either a successful or a failed email. I've had issues in the past with cron not executing scheduled jobs (for numerous reasons). So if I don't receive an email, I'll verify the backup manually.
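For reference, a script like this would typically be scheduled with a crontab entry along these lines (the schedule, script name, and output path here are just examples, not the actual ones from my environment):

```shell
# Hypothetical crontab entry: run the level 0 backup script nightly at 01:00
# and capture any stray output that isn't already in the RMAN log.
0 1 * * * /home/oracle/scripts/rman_ORCL_LVL0.sh >> /home/oracle/scripts/log/cron_ORCL_LVL0.out 2>&1
```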

Wednesday

11i and Listener Passwords

A standard practice for any Oracle database should be to enable passwords on your listeners. EBS is no exception, and Oracle recently published Metalink note 386374.1 on how to do this for 11.5.x environments.

There are some pre-reqs, specifically:

3453499 (11i.ADX.F)
5225940 (Post ADX.F Fixes)
5107107 - AUTOCONFIG ENGINE & CONFIG TOOLS ROLLUP PATCH N or higher

The note has detailed steps on how to set the password at the application and database tiers.
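For a plain (non-EBS) listener, the password is set interactively from lsnrctl and saved to listener.ora; the session looks roughly like the sketch below. LISTENER is just an example listener name, and you should follow the note's procedure for EBS environments rather than this.

```
LSNRCTL> change_password LISTENER
Old password:        <-- blank if none was set before
New password:
Reenter new password:
LSNRCTL> set password
Password:
LSNRCTL> save_config LISTENER
```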

In my opinion, just as important as setting a password is monitoring for invalid attempts afterwards. Otherwise, what's the point? At best you will only slow down a determined intruder. Once you have set a password, any attempt to execute a command via lsnrctl (without setting your password first) will result in the following error showing up in your listener log:

TNS-01169: The listener has not recognized the password
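Before reaching for anything fancier, a quick ad-hoc check is simply to grep the listener log for that error code (the log path below is an example; adjust it for your environment):

```shell
# Count invalid-password attempts recorded in the listener log.
# /var/opt/oracle/listener.log is an example path, not a fixed location.
LOG=/var/opt/oracle/listener.log
attempts=`grep -c 'TNS-01169' $LOG`
echo "Invalid listener password attempts: $attempts"
```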

Each DBA has a preferred language for scripts: some prefer shell, others perl. Myself, I prefer perl. Below is a perl script I have used to monitor the listener log for the above error.


#!/usr/bin/perl -w

if ($#ARGV != 1) {
    die "Usage: check_alert.pl <hostname> <listener_log>, e.g. check_alert.pl hostname /var/opt/oracle/listener.log\n";
}
sleep 2;

$interval = 60;       # How many seconds before we check to see if data has been written to the logfile
$email_threshold = 5; # How many errors within the interval before an email gets sent
$hostname = $ARGV[0];
$file = $ARGV[1];
open(filePtr, $file) or die "Can't find $file\n";

for (;;) {
    @errors = ("Subject: Listener Password Errors for $hostname\n");
    $currTime = localtime(time);
    push(@errors, "Here are some errors found at $currTime for $hostname.\n");

    while (<filePtr>) {
        chomp $_;
        if (/TNS-01169/) {
            push(@errors, "$_\n");
        }
    }

    if ($#errors > $email_threshold) {
        $rndExt = time;
        $rndFile = "alert_errors_$rndExt";
        open(TMPFILE, ">/tmp/$rndFile");

        foreach $error (@errors) {
            print TMPFILE $error;
        }
        close(TMPFILE);
        system("mail username\@mycompany.com < /tmp/$rndFile");
        system("rm /tmp/$rndFile");
    }

    sleep $interval;
    seek filePtr, 0, 1;  # clear EOF so new lines are picked up on the next pass
}




The script takes two parameters: <hostname> <path_to_listener_log>
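As a usage sketch (the host name and log path here are placeholders), I start it in the background so it survives the terminal closing:

```shell
# Example invocation -- myhost and the log location are placeholders.
nohup perl check_alert.pl myhost /var/opt/oracle/listener.log > /dev/null 2>&1 &
```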

A couple of key things you may want to customize in the above script are the $interval and $email_threshold variables. As well, there is an email address towards the bottom where alerts will be sent.

Now, I am by no means a perl expert, but the above works for me. I'm sure there are more efficient ways to monitor the log file; feel free to post a comment with your preferred method. If you cycle your logfiles, keep in mind you'll need to restart this script. I've monitored resource consumption and it's pretty minimal, but I know some DBAs will prefer not having a process run constantly.

Note: I've only tested this script on a 9i listener log. It's on my todo list to test it on 8.0.6.

UPDATE: Thanks to Howard Rogers for informing me of the 10g changes. Security has been enhanced such that the TNS listener administrator has to be a member of the local dba group, so unless you plan on administering your listener remotely you don't need a password. Please see the comments and Metalink note 340009.1.

Most of our databases are still 9i, but I'll be sure to remove the listener passwords as part of the upgrades.