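1: Version information
Operating system version: AIX 61009
Oracle Database version: 11.2.0.3.11 (RAC)
2: Error description
A routine check found that both instances of a RAC database running on ASM report ORA-32701 roughly once an hour. The relevant error messages from the alert log are excerpted below: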
Sat Dec 06 09:44:00 2014
Errors in file /oracle/app/oracle/diag/rdbms/egmmdb/egmmdb2/trace/egmmdb2_dia0_13500888.trc (incident=1041128):
ORA-32701: Possible hangs up to hang ID=0 detected
Incident details in: /oracle/app/oracle/diag/rdbms/egmmdb/egmmdb2/incident/incdir_1041128/egmmdb2_dia0_13500888_i1041128.trc
DIA0 terminating blocker (ospid: 15335610 sid: 1299 ser#: 5849) of hang with ID = 3
requested by master DIA0 process on instance 1
Hang Resolution Reason: Although the number of affected sessions did not
justify automatic hang resolution initially, this previously ignored
hang was automatically resolved.
by terminating session sid: 1299 ospid: 15335610
Sat Dec 06 09:44:01 2014
Sweep [inc][1041128]: completed
Sweep [inc2][1041128]: completed
DIA0 successfully terminated session sid:1299 ospid:15335610 with status 31.
Sat Dec 06 09:45:35 2014
Errors in file /oracle/app/oracle/diag/rdbms/egmmdb/egmmdb2/trace/egmmdb2_dia0_13500888.trc (incident=1041129):
ORA-32701: Possible hangs up to hang ID=0 detected
Incident details in: /oracle/app/oracle/diag/rdbms/egmmdb/egmmdb2/incident/incdir_1041129/egmmdb2_dia0_13500888_i1041129.trc
DIA0 terminating blocker (ospid: 15335610 sid: 1299 ser#: 5849) of hang with ID = 3
requested by master DIA0 process on instance 1
Hang Resolution Reason: Although the number of affected sessions did not
justify automatic hang resolution initially, this previously ignored
hang was automatically resolved.
by terminating the process
DIA0 successfully terminated process ospid:15335610.
Sat Dec 06 09:45:37 2014
Sweep [inc][1041129]: completed
Sweep [inc2][1041129]: completed
Sat Dec 06 10:45:12 2014
Errors in file /oracle/app/oracle/diag/rdbms/egmmdb/egmmdb2/trace/egmmdb2_dia0_13500888.trc (incident=1041130):
ORA-32701: Possible hangs up to hang ID=0 detected
Incident details in: /oracle/app/oracle/diag/rdbms/egmmdb/egmmdb2/incident/incdir_1041130/egmmdb2_dia0_13500888_i1041130.trc
Sat Dec 06 10:45:13 2014
Sweep [inc][1041130]: completed
Sweep [inc2][1041130]: completed
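The roughly hourly cadence can be checked directly against the alert log. The following is a minimal sketch in Python and not part of the original analysis: the function name `find_ora32701` and the embedded sample (an abbreviated copy of the excerpt above) are illustrative assumptions, and the parser assumes English-locale timestamps of the form `Sat Dec 06 09:44:00 2014`.

```python
import re
from datetime import datetime

# A timestamp line in the alert log, e.g. "Sat Dec 06 09:44:00 2014".
TS_RE = re.compile(r"^[A-Z][a-z]{2} [A-Z][a-z]{2} \d{2} \d{2}:\d{2}:\d{2} \d{4}$")
# The incident number on the "Errors in file ..." line.
INCIDENT_RE = re.compile(r"\(incident=(\d+)\)")


def find_ora32701(lines):
    """Return a list of (timestamp, incident_id) for each ORA-32701 line."""
    current_ts = None
    incident = None
    hits = []
    for raw in lines:
        line = raw.strip()
        if TS_RE.match(line):
            # Assumes an English/C locale for the day and month names.
            current_ts = datetime.strptime(line, "%a %b %d %H:%M:%S %Y")
            incident = None
            continue
        m = INCIDENT_RE.search(line)
        if m:
            incident = int(m.group(1))
        if line.startswith("ORA-32701"):
            hits.append((current_ts, incident))
    return hits


# Abbreviated sample taken from the alert-log excerpt above.
sample = """\
Sat Dec 06 09:44:00 2014
Errors in file /oracle/app/oracle/diag/rdbms/egmmdb/egmmdb2/trace/egmmdb2_dia0_13500888.trc (incident=1041128):
ORA-32701: Possible hangs up to hang ID=0 detected
Sat Dec 06 09:44:01 2014
Sweep [inc][1041128]: completed
Sat Dec 06 09:45:35 2014
Errors in file /oracle/app/oracle/diag/rdbms/egmmdb/egmmdb2/trace/egmmdb2_dia0_13500888.trc (incident=1041129):
ORA-32701: Possible hangs up to hang ID=0 detected
Sat Dec 06 10:45:12 2014
Errors in file /oracle/app/oracle/diag/rdbms/egmmdb/egmmdb2/trace/egmmdb2_dia0_13500888.trc (incident=1041130):
ORA-32701: Possible hangs up to hang ID=0 detected
"""

hits = find_ora32701(sample.splitlines())
for (prev_ts, _), (ts, inc) in zip(hits, hits[1:]):
    # Seconds elapsed since the previous ORA-32701 report.
    print(inc, ts, int((ts - prev_ts).total_seconds()))
```

On a real system the same function could be fed the full alert log (for example `open(path).read().splitlines()`), which makes it easy to see whether the reports cluster around a fixed schedule.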
The following information is excerpted from egmmdb2_dia0_13500888_i1041129.trc:
*** 2014-12-06 09:45:35.770
Resolvable Hangs in the System
                      Root       Chain Total               Hang
   Hang Hang          Inst Root  #hung #hung  Hang   Hang  Resolution
     ID Type Status   Num  Sess   Sess  Sess  Conf   Span  Action
  ----- ---- -------- ---- ----- ----- ----- ------ ------ -------------------
      3 HANG RSLNPEND    2  1299     2     2   HIGH GLOBAL Terminate Process
Hang Resolution Reason: Although the number of affected sessions did not
justify automatic hang resolution initially, this previously ignored
hang was automatically resolved.
  inst# SessId  Ser#     OSPID PrcNm Event
  ----- ------ ----- --------- ----- -----
      1   1444  7855  10420452  M000 enq: FU - contention
      2   1299  5849  15335610  M000 not in wait
Dumping process info of pid[155.15335610] (sid:1299, ser#:5849)
requested by master DIA0 process on instance 1.
*** 2014-12-06 09:45:35.770
Process diagnostic dump for oracle@egmmdb2 (M000), OS id=15335610,
pid: 155, proc_ser: 153, sid: 1299, sess_ser: 5849
-------------------------------------------------------------------------------
os thread scheduling delay history: (sampling every 1.000000 secs)
0.000000 secs at [ 09:45:35 ]
NOTE: scheduling delay has not been sampled for 0.376554 secs
0.000000 secs from [ 09:45:31 - 09:45:36 ], 5 sec avg
0.000000 secs from [ 09:44:36 - 09:45:36 ], 1 min avg
0.000000 secs from [ 09:40:36 - 09:45:36 ], 5 min avg
loadavg : 2.68 2.42 2.41
swap info: free_mem = 19881.13M rsv = 256.00M
alloc = 138.07M avail = 65536.00M swap_free = 65397.93M
F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
240001 A oracle 15335610 1 0 60 20 948d16590 209136 f1000a01500d48b0 08:37:22 - 0:01 ora_m000_egmmdb2
Short stack dump:
ksedsts()+360
-------------------------------------------------------------------------------
Process diagnostic dump actual duration=0.084000 sec
(max dump time=15.000000 sec)
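The session chain in the trace above (the `inst# SessId Ser# OSPID PrcNm Event` table) is the part that identifies who is waiting on whom. As a rough sketch, assuming the rows keep the six-column layout shown in this trace, the rows can be pulled out with a regular expression. The names `ROW_RE` and `parse_chain` are hypothetical helpers introduced here for illustration, not anything provided by Oracle.

```python
import re

# Four numeric columns (inst#, SessId, Ser#, OSPID), a process name,
# then the wait event as free text up to the end of the line.
ROW_RE = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\S+)\s+(.+?)\s*$")


def parse_chain(lines):
    """Extract session-chain rows from DIA0 incident trace lines."""
    rows = []
    for line in lines:
        m = ROW_RE.match(line)
        if not m:
            continue  # headers, separators and unrelated trace lines
        inst, sid, serial, ospid, prc, event = m.groups()
        rows.append({
            "inst": int(inst),
            "sid": int(sid),
            "serial": int(serial),
            "ospid": int(ospid),
            "process": prc,
            "event": event,
        })
    return rows


# Lines copied from the trace excerpt above.
trace = """\
inst# SessId Ser# OSPID PrcNm Event
----- ------ ----- --------- ----- -----
1 1444 7855 10420452 M000 enq: FU - contention
2 1299 5849 15335610 M000 not in wait
"""

rows = parse_chain(trace.splitlines())
for r in rows:
    print(r["inst"], r["sid"], r["process"], r["event"])
```

In this incident the chain has two M000 processes: the session on instance 1 waits on `enq: FU - contention`, while the session on instance 2 (sid 1299, the one DIA0 terminates) is shown as `not in wait`.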