11.2.0.4 RAC TO RAC FOR ADG环境。由于历史原因,备库节点二一直没有启动,一直是启动节点一对外提供服务。节点一alert报错,lgwr进行kill实例操作并自行重启。
Mon Dec 24 16:11:24 2018
Archived Log entry 262740 added for thread 2 sequence 185858 ID 0x92570693 dest 1:
Mon Dec 24 16:12:28 2018
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x7FFCD7489FF8] [PC:0x9899B16, qcsAnalyzeBooleanExpr()+144] [flags: 0x0, count: 1]
Errors in file /u01/app/oracle/diag/rdbms/qdb/qdb1/trace/qdb1_ora_18912.trc (incident=580470):
ORA-07445: exception encountered: core dump [qcsAnalyzeBooleanExpr()+144] [SIGSEGV] [ADDR:0x7FFCD7489FF8] [PC:0x9899B16] [Address not mapped to object] []
Incident details in: /u01/app/oracle/diag/rdbms/qdb/qdb1/incident/incdir_580470/qdb1_ora_18912_i580470.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
根据抛出的trc文件,我们可以追踪到时间段内的某条SQL
** 2018-12-24 16:12:28.596
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x3, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=34vxxxxxx) -----
select from MSS.T_MSS_MOBILE_LOGIN_ADDR where UUID in ('7xxxxx4xxxxxx', '48xxxxxxx, 。。。。。。
整个trc文件190M,文本内容UUID几乎占了整个文件。
new 2: where owner = 'MSS' and segment_name = 'T_MSS_MOBILE_LOGIN_ADDR'
OWNER SEGMENT_NAME SEGMENT_TYPE Total_Bytes(MB)
MSS T_MSS_MOBILE_LOGIN_ADDR TABLE PARTITION 230054
[oracle@qdb1 trace]$ du -sh /u01/app/oracle/diag/rdbms/qdb/qdb1/incident/incdir_580470/qdb1_ora_18912_i580470.trc
190M /u01/app/oracle/diag/rdbms/qdb/qdb1/incident/incdir_580470/qdb1_ora_18912_i580470.trc
紧接着系统报了4021错误代码
Mon Dec 24 16:30:17 2018
System State dumped to trace file /u01/app/oracle/diag/rdbms/qdb/qdb1/trace/qdb1_ora_67693.trc
Mon Dec 24 16:32:32 2018
Errors in file /u01/app/oracle/diag/rdbms/qdb/qdb1/trace/qdb1_lgwr_39157.trc:
ORA-04021: timeout occurred while waiting to lock object
LGWR (ospid: 39157): terminating the instance due to error 4021
一直到最后的kill 实例,自动重启
Mon Dec 24 16:32:39 2018
License high water mark = 1203
Instance terminated by LGWR, pid = 39157
USER (ospid: 6702): terminating the instance
Instance terminated by USER, pid = 6702
Mon Dec 24 16:32:48 2018
Starting ORACLE instance (normal)
查询MOS相关文档,有一篇文档与我们的环境相符
ORA-04021: timeout occurred while waiting to lock object : DR Instance terminated by LGWR (文档 ID 2183882.1)
命中了BUG了。根据bug描述,需要修改参数
SQL> show parameter cursor_sharing
NAME TYPE VALUE
cursor_sharing string EXACT
在cursor设置为exact时,两条sql语句如果存在一点不同,就不会共享cursor,而进行两次硬解析。
设置为force时Oracle对输入的SQL值,会将where条件取值自动替换为绑定变量。以后在输入相同的结构SQL语句时,会进行cursor sharing共享游标
根据这个临时方案,我做了一个小实验。我在主库创建了一个t表作为该参数的测试
14:35:02 SYS@bapdb1(bapdb1)> create table t as select * from dba_objects;
Table created.
然后到备库进行具体实验。
Mark一下,择日进行参数修改。