TAF, Transparent Application Failover is the ability for a connection to failover and reconnect to a surviving node. Transactions are rolled back but TAF can be configured to resume SELECT operations. Before you can use TAF you have to create a Service. Initially services were used for connection load balancing but over the years has developed into a method to distribute and manage workload across database instances. For more information on services see Oracles documentation:
There are a few ways to create a service. You can use Grid Control, srvctl or dbca. For this exercise I am going to use DBCA.
Launch DBCA. Select Oracle Real Application Clusters database and click on next:
On the following screen click on the Add button. When prompted for a service name enter RAC and click on the OK button:
On the following screen verify both RAC instances (RACDB1, RACDB2) have the preferred option selected. Preferred means that the service will run on that instance when the environment first starts. An instance flagged as available means it will not run the service unless the preferred instances fail. Since this is a RAC environment we want users to be distributed between both nodes, thus they need to have the preferred option set.
The TAF policy is the failover settings. Select Basic. Basic means that the session is reconnected when a node fails. Preconnect means that a shadow process is created on another node in the cluster, to reduce time if there is a failure.
A check of the $ORACLE_HOME/network/admin/tnsnames.ora file will show the following new entry:
Lets break down this new entry and highlight a few key lines:
Line 5: (LOAD_BALANCE = yes)
This line indicates that Oracle will randomly select one of the addresses defined on the previously two lines and connect to that nodes listener. This is called client-side connect-time load balancing.
Lines 9 -13: These lines are the TAF settings.
Line 10: (TYPE = SELECT)
There are two types of failover. SELECT indicates that the session will be authenticated on a surviving node but as well SELECT statements will be re-executed. Rows already returned to the client are ignored.
The other option is SESSION. In a SESSION failover, only the users session is re-authenticated.
Line 11: (METHOD = BASIC)
BASIC means that the session will not reconnect to a surviving node until the failover occurs.
PRECONNECT means a shadow process is created on a backup instance to reduce failover time. There are some additional considerations when choosing this setting so be sure to read up on it.
Line 12 and 13: (RETRIES = 180)
(DELAY = 5)
Self-explanatory, the maximum number of retries to failover and the amount of time in seconds to wait between attempts.
Session 1: Login to the database via the service created above:
Session 2: Login as sysdba and query gv$session to determine which instance Scott is connected to:
Scott is connected to instance 2 (which resides on raclinux2), so lets shutdown that instance and see what happens. As sysdba, connect to RACDB2 and shutdown immediate. Once the instance has shutdown re-execute the query above to see which instance Scott is now connected to:
Why hasn’t the session failed over? Because when node 2 was shutdown scott wasn’t executing a query. Re-execute the select count(*) from emp statement, then query gv$session again:
If a session is inactive, it will not failover until another statement is issued.