Tuesday

AWR Operation failed

I know most are talking about 11g but we finally upgraded our production EBS database to 10g a few weeks ago. A few months ago, while a fellow DBA was on vacation, I used 10g's diagnostics features to troubleshoot a hanging issue in their data warehouse DB. The historical view which allows you to see what was happening in a database during a window of time is priceless and easily helped me resolve the issue. How often do users call you as they are experiencing performance problems? Never...! I'll be in a weekly meeting and users will say that they experienced issues 3 days ago! So I was looking forward to this ability.

Much to my surprise tho when I clicked on the performance tab in database control, the data was a couple of days old. 10 snapshots after the upgrade to 10g finished, it suddenly stopped working. I checked the alert log, trace files, etc but the only error I could find was if I tried to change the AWR settings. ie, changing the snapshot interval would consistently produce the following error:


ORA-13516: AWR Operation failed: only a subset of SQL can be issued ORA-06512: at
"SYS.DBMS_WORKLOAD_REPOSITORY", line 10 ORA-06512: at
"SYS.DBMS_WORKLOAD_REPOSITORY", line 33 ORA-06512: at line 1


A search of Metalink didn't result in any good hits so I opened a tar. An RDA and a few updates later they asked me to execute the following query to see if AWR was in restricted mode. In our case, it was not:

SQL> select nam.ksppinm name, val.KSPPSTVL, nam.ksppdesc description
2 from x$ksppi nam, x$ksppsv val
3 where nam.indx = val.indx AND
4 nam.ksppinm = '_awr_restrict_mode'
5 order By 1
6 /

NAME
--------------------------------------------------------------------------------
KSPPSTVL
--------------------------------------------------------------------------------
DESCRIPTION
--------------------------------------------------------------------------------
_awr_restrict_mode
FALSE
AWR Restrict Mode


The only other suggestion was to restart the database. This past weekend we had to apply some one-off patches so we took advantage of the maintenance window and restarted the database. Since then it has been working properly. I'm not a big fan of this kind of solution, I'd rather know the root cause and if its preventable. But in this case it seems Oracle Support is not sure either.

4 comments:

Anonymous said...

i have the same question!yesterday,a test db have no space on the disk when we used a job to insert large data,there is some archive log errors in the alert log and some background oracle processes down.so the awr snap down too.but,how can i startup these background process and not restart oracle database?

Anonymous said...

I have the same problem and also had difficulty finding any info on why the AWR would stop taking snapshots. Decidedly dodgy if you ask me. I discovered by chance that bouncing seems to clear it, but this is beginning to look suspiciously like a Microsoft product ;-)
cheers,
Alex

Anonymous said...

Good findings.
I have also written about this problem in http://arjudba.blogspot.com/2010/08/ora-13516-awr-operation-failed-swrf.html.

Unknown said...
This comment has been removed by the author.