A few months ago I joined a new company embarking on a new R12 implementation project. From a DBA perspective I haven't noticed major day-to-day differences from 11i. Patching, routine maintenance, etc. are all pretty similar. One big difference I have noticed is that R12 seems to be more stable. With 11i it seemed I was constantly applying one-off patches, but for this R12 implementation I think I have applied 6 at most.
Last week we made the decision to go live... and within 10 minutes production crashed! We could ping the server but could not log in, either remotely or at the console, so a hard reset was in order. When the server came back up we found the following in /var/log/messages:
Jun 3 11:55:18 myserver kernel: kernel BUG at kernel/exit.c:904!
Jun 3 11:55:18 myserver kernel: invalid operand: 0000 [#1]
Jun 3 11:55:18 myserver kernel: SMP
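For anyone who wants to check their own servers for the same signature, here is a minimal sketch that scans a syslog-style file for "kernel BUG at" entries like the one above. The function name find_kernel_bugs and the regular expression are my own illustration, built only from the format of the lines shown here; other kernels or syslog configurations may log this differently.

```python
import re

# Matches lines such as:
#   Jun 3 11:55:18 myserver kernel: kernel BUG at kernel/exit.c:904!
# The pattern is an assumption based on the excerpt above, not a general syslog parser.
BUG_PATTERN = re.compile(r"kernel: kernel BUG at (?P<file>\S+?):(?P<line>\d+)!")


def find_kernel_bugs(lines):
    """Return (source_file, line_number, raw_line) for each kernel BUG entry."""
    hits = []
    for raw in lines:
        match = BUG_PATTERN.search(raw)
        if match:
            hits.append(
                (match.group("file"), int(match.group("line")), raw.rstrip("\n"))
            )
    return hits


if __name__ == "__main__":
    # Reading /var/log/messages usually requires root privileges.
    with open("/var/log/messages", errors="replace") as log:
        for source_file, line_number, raw in find_kernel_bugs(log):
            print(f"{source_file}:{line_number}  {raw}")
```

Run from cron, something like this could mail the output whenever it is non-empty, which at least tells you after a reboot what happened.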
A quick search on Metalink turned up the following note:
Linux Crashes when Enterprise Manager Agent Starts on RHEL 4 Update 6 and 7
Doc ID: 729543.1
This note caught my eye immediately because we had just started to set up Grid Control for this server in preparation for going live. We had also just purchased OEL support from Oracle, so we opened an SR with them to confirm the diagnosis, which they did.
Over the past few weeks we had been debating which vendor to purchase Linux support from. There were two camps: one preferred Red Hat and the other (the DBAs) preferred Oracle. My argument was that Oracle would be more aware of Linux issues affecting their software. The other side argued that they weren't sure Oracle could deliver the same level of service as Red Hat.
Fast forward to the kernel bug. It turns out that OEL customers are not affected because the OEL 4.7 kernel already contains the fix. I realize this is just one case, but at least it adds some weight to my argument.
I won't be able to apply the kernel patch until the next maintenance window, so until then I'll have to monitor our production environment the old way, via scripts.
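As a rough idea of what "via scripts" can look like, here is a minimal sketch of a cron-style health check. It is not our actual monitoring; the function names, the 90% disk threshold, and the example host and port (1521 is only the usual default for an Oracle listener) are all assumptions for illustration.

```python
import shutil
import socket


def disk_usage_percent(path):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def run_checks(paths, endpoints, disk_limit=90.0):
    """Return a list of human-readable problems; an empty list means all clear."""
    problems = []
    for path in paths:
        pct = disk_usage_percent(path)
        if pct >= disk_limit:
            problems.append(f"{path} is {pct:.1f}% full (limit {disk_limit:.0f}%)")
    for host, port in endpoints:
        if not port_is_open(host, port):
            problems.append(f"{host}:{port} is not accepting connections")
    return problems


if __name__ == "__main__":
    # Hypothetical targets: adjust the mount points, host and port to your site.
    for problem in run_checks(["/"], [("localhost", 1521)]):
        print(problem)
```

Scheduled every few minutes with the output mailed to the DBA team, a check like this covers the basics until the agent can be safely started again.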