Friday

RAC and VMware issues

Over the past little while I've been playing around with RAC on VMware, specifically 10.2.0.1 on Oracle Enterprise Linux 5. Hopefully next week I'll be able to post the install steps, but here are a couple of pointers in case you are working on it now:

1. At random, at least one and sometimes both of my VMware instances would hang. At the time of the hang I could see that they were both accessing shared disk, specifically the voting and OCR disks.

In my vmware.log file I could see the following error:

Msg_Post: Error Mar 12 16:47:25: vmx| [msg.log.error.unrecoverable] VMware Server unrecoverable error: (vmx) Mar 12 16:47:25: vmx| NOT_IMPLEMENTED
C:/ob/bora-56528/pompeii2005/bora/devices/scsi/scsiDisk.c:2874 bugNr=41568


I searched in vain for a solution and finally sent an email to the Oracle-L list. Thankfully Edgar saw my post and provided me with the solution. If you're running on a slow computer (I didn't think my brand spanking new laptop was slow ;) ) you could have locking issues. In each of your VMware configuration files, put the following line:

reslck.timeout="1200"
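
If you have several VMs to update, the edit can be scripted. Here is a rough sketch in Python, assuming each .vmx file is plain text with one key = "value" setting per line. The helper name ensure_vmx_setting and the file name rac1.vmx are made up for illustration; only the reslck.timeout setting itself comes from the fix above. I would only run something like this with the VM powered off.

```python
from pathlib import Path


def ensure_vmx_setting(vmx_path, key, value):
    """Append key = "value" to a .vmx file unless the key is already set.

    Returns True if the file was changed, False if the key was already present.
    (Hypothetical helper; assumes one key = "value" setting per line.)
    """
    path = Path(vmx_path)
    lines = path.read_text().splitlines()
    for line in lines:
        # Compare only the part before '=', ignoring case and whitespace.
        name = line.split("=", 1)[0].strip().lower()
        if name == key.lower():
            return False
    lines.append(f'{key} = "{value}"')
    path.write_text("\n".join(lines) + "\n")
    return True


if __name__ == "__main__":
    # "rac1.vmx" is a placeholder; point this at your own configuration file.
    if Path("rac1.vmx").exists():
        ensure_vmx_setting("rac1.vmx", "reslck.timeout", "1200")
```

The function leaves an existing reslck.timeout line alone rather than overwriting it, so it is safe to run more than once.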

2. You should read the following notes before you start and modify your steps accordingly:

Subject: 10gR2 RAC Install issues on Oracle EL5 or RHEL5 or SLES10 (VIPCA Failures) Doc ID: Note:414163.1


Subject: VIPCA FAILS COMPLAINING THAT INTERFACE IS NOT PUBLIC Doc ID: Note:316583.1



If you are trying to install 10g on Oracle Enterprise Linux 5, as I am, you will hit errors installing clusterware. The first note above describes how a workaround used for a Linux threading bug is no longer valid. So before you run the root.sh script you will need to modify some files.

Note: The first time I installed clusterware I didn't see any errors. It was only when I verified the install I noticed something was wrong and found this note.

The second Metalink note describes how vipca (which is executed automatically when you run root.sh during the clusterware install) doesn't like private network IPs being used for your public interface. It describes how to execute vipca manually.
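
To know in advance whether you are likely to hit this, you can check whether the address on your public interface falls in a private range. The sketch below assumes "private" means the three RFC 1918 ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16); that is my assumption, not something I have confirmed against vipca itself. The function name is_rfc1918 is made up for illustration.

```python
import ipaddress

# The RFC 1918 private IPv4 ranges (assumed to be what vipca objects to).
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address):
    """Return True if the given IP address string is in an RFC 1918 range."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in RFC1918_NETWORKS)


if __name__ == "__main__":
    # Example addresses only; substitute the IP of your public interface.
    for addr in ("192.168.1.10", "172.32.0.1"):
        print(addr, is_rfc1918(addr))
```

If the check returns True for your public interface, which is typical for a VMware setup on a laptop, expect to run vipca manually as the note describes.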

I'm not 100% finished yet so I may encounter more issues. I have to recover from a backup and start my clusterware install again. While I was installing the database software my laptop BSOD'd and it corrupted my shared disks.

4 comments:

Anonymous said...

I'm finding that with 10g RAC, VMware Workstation 6 and CentOS 5.1, my raw disks get corrupted the minute the VM gets powered down.

vmware is messing up the disks somehow. I can reboot all I want, but if I shutdown, I have to reinstall.

Unknown said...

I had corruption issues as well but in my case the vmware instance would freeze. I would either have to restart all the vmware services or reboot my machine. For me, it turned out that since I was on a slow machine there were some timeout issues which would lock the environment. Occasionally after I restarted vmware, it would complain that some files were corrupted. I had to change the reslck.timeout value.

As well, I don't believe clustering is supported by vmware workstation. A few of my friends hit issues with it.

Anonymous said...

Hi!
I'm having a problem at the very start of the clusterware installation.
My config is VMware 2.0.1 on Windows, guest OS RHEL 5.3, RAC 10.2g.
The odd thing in my case is that the cluvfy test passes when I check the nodes one at a time. When I check both nodes together, the test fails.
So it must have something to do with VMware. This reslck.timeout="1200" setting looks promising. Is there anything else I could try in my case? Thanks for any help or comment.

Anonymous said...

Hello Guru,

Thanks a lot for your post. It helped me a lot.

Regards,
sam