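Recovering Oracle 11.2.0.4 Grid Infrastructure After the CRS Disk Group Is Damaged
This article covers how to recover the CRS disks when the CRS disk group (the disk group that holds the OCR and the voting disk) is damaged.
1. Simulate the failure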
[grid@node1 ~]$ ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 3152
Available space (kbytes) : 258968
ID : 1580323731
Device/File Name : +DATA
Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check bypassed due to non-privileged user
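Before damaging anything, it helps to record where the OCR actually lives. The following is a minimal sketch, assuming the `Device/File Name : <value>` line format shown in the output above; the function name `ocr_locations` is hypothetical and not part of any Oracle tool.

```shell
#!/bin/sh
# Print every OCR location found in `ocrcheck` output read from standard input.
# Assumption: configured locations appear on "Device/File Name : <value>" lines.
ocr_locations() {
    awk -F':' '/Device\/File Name/ {
        gsub(/^[ \t]+|[ \t]+$/, "", $2)   # trim spaces around the value
        print $2
    }'
}
```

It would be used as `ocrcheck | ocr_locations`; for the output above it should print the single location +DATA.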
[oracle@node1 ~]$ sqlplus / as sysdba
SQL> select name,path from v$asm_disk;
NAME         PATH
ARCH_0000    /dev/raw/raw3
DATA_0000    /dev/raw/raw1
DAA_0000     /dev/raw/raw2
VOTE2_0000   /dev/raw/raw5
             /dev/raw/raw4
             /dev/raw/raw6
6 rows selected.
Check the OCR backups
[grid@node1 grid]$ ocrconfig -showbackup
node1 2016/11/01 13:33:12 /u01/11.2.0/grid/cdata/cluster/backup00.ocr
node1 2016/11/01 13:33:12 /u01/11.2.0/grid/cdata/cluster/day.ocr
node1 2016/11/01 13:33:12 /u01/11.2.0/grid/cdata/cluster/week.ocr
node1 2016/10/31 17:03:27 /u01/11.2.0/grid/cdata/cluster/backup_20161031_170327.ocr
node1 2016/10/31 17:03:12 /u01/11.2.0/grid/cdata/cluster/backup_20161031_170312.ocr
node1 2016/10/31 17:02:44 /u01/11.2.0/grid/cdata/cluster/backup_20161031_170244.ocr
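When several backups exist, the most recent one taken before the damage is usually the one to restore. The sketch below picks the newest entry from the listing, assuming each backup line has the form `node date time path` as shown above and that paths contain no spaces; `latest_ocr_backup` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the path of the newest backup in `ocrconfig -showbackup` output (stdin).
# Lines whose second field is not a YYYY/MM/DD date (blank lines, messages) are skipped.
latest_ocr_backup() {
    awk '$2 ~ /^[0-9][0-9][0-9][0-9]\/[0-9][0-9]\/[0-9][0-9]$/ && NF >= 4 {
             print $2, $3, $4           # date, time, path
         }' |
        LC_ALL=C sort -r |              # newest date and time first
        head -n 1 |
        cut -d' ' -f3                   # keep only the path
}
```

It would be used as `ocrconfig -showbackup | latest_ocr_backup`. When several backups share the same timestamp, as the three 13:33:12 entries do here, the tie is broken by path name only, so check the result by eye before restoring.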
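Wipe the CRS disk (/dev/raw/raw1 is DATA_0000, the disk behind +DATA; this is destructive and is done here only to simulate the failure):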
[root@node1 ~]# dd if=/dev/zero of=/dev/raw/raw1 bs=10M count=100
100+0 records in
100+0 records out
1048576000 bytes (1.0 GB) copied, 15.8185 s, 66.3 MB/s
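The byte count reported by dd follows directly from the block size and count. A small sketch of the arithmetic, using a hypothetical helper `dd_bytes` and the fact that dd's `M` suffix means 1024 * 1024 bytes:

```shell
#!/bin/sh
# Bytes written by dd for a block size given in MiB and a block count.
dd_bytes() {
    bs_mib=$1
    count=$2
    echo $(( bs_mib * 1024 * 1024 * count ))
}

dd_bytes 10 100   # prints 1048576000, the figure dd reported above
```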
2. Stop the cluster:
[root@node1 grid]# crsctl stop crs
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'node1'
CRS-2673: Attempting to stop 'ora.crsd' on 'node1'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'node1'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'node1'
CRS-2673: Attempting to stop 'ora.node2.vip' on 'node1'
CRS-2673: Attempting to stop 'ora.ARCH.dg' on 'node1'
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'node1'
CRS-2673: Attempting to stop 'ora.VOTE2.dg' on 'node1'
CRS-2673: Attempting to stop 'ora.registry.acfs' on 'node1'
CRS-2673: Attempting to stop 'ora.orcl.db' on 'node1'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'node1'
CRS-2673: Attempting to stop 'ora.cvu' on 'node1'
CRS-2673: Attempting to stop 'ora.oc4j' on 'node1'
CRS-2677: Stop of 'ora.node2.vip' on 'node1' succeeded
CRS-2677: Stop of 'ora.cvu' on 'node1' succeeded
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.node1.vip' on 'node1'
CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.scan1.vip' on 'node1'
CRS-2677: Stop of 'ora.orcl.db' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.DAA.dg' on 'node1'
CRS-2677: Stop of 'ora.node1.vip' on 'node1' succeeded
CRS-2677: Stop of 'ora.scan1.vip' on 'node1' succeeded
CRS-2677: Stop of 'ora.DAA.dg' on 'node1' succeeded
CRS-2677: Stop of 'ora.ARCH.dg' on 'node1' succeeded
CRS-2677: Stop of 'ora.VOTE2.dg' on 'node1' succeeded
CRS-4549: Unexpected disconnect while executing shutdown request.
CRS-5022: Stop of resource "ora.crsd" failed: current state is "INTERMEDIATE"
CRS-2675: Stop of 'ora.crsd' on 'node1' failed
CRS-2679: Attempting to clean 'ora.crsd' on 'node1'
CRS-2681: Clean of 'ora.crsd' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.crf' on 'node1'
CRS-2673: Attempting to stop 'ora.ctssd' on 'node1'
CRS-2673: Attempting to stop 'ora.evmd' on 'node1'
CRS-2673: Attempting to stop 'ora.asm' on 'node1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'node1'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'node1'
CRS-2677: Stop of 'ora.crf' on 'node1' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'node1' succeeded
CRS-2677: Stop of 'ora.evmd' on 'node1' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'node1' succeeded
CRS-5022: Stop of resource "ora.asm" failed: current state is "UNKNOWN"
CRS-2675: Stop of 'ora.asm' on 'node1' failed
CRS-2679: Attempting to clean 'ora.asm' on 'node1'
3. Start CRS and repair the OCR and voting disk
Start CRS on node1 in exclusive mode, without starting crsd:
[root@node1 grid]# crsctl start crs -excl -nocrs
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.mdnsd' on 'node1'
CRS-2676: Start of 'ora.mdnsd' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'node1'
CRS-2676: Start of 'ora.gpnpd' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'node1'
CRS-2672: Attempting to start 'ora.gipcd' on 'node1'
CRS-2676: Start of 'ora.cssdmonitor' on 'node1' succeeded
CRS-2676: Start of 'ora.gipcd' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'node1'
CRS-2672: Attempting to start 'ora.diskmon' on 'node1'
CRS-2676: Start of 'ora.diskmon' on 'node1' succeeded
CRS-2676: Start of 'ora.cssd' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'node1'
CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'node1'
CRS-2672: Attempting to start 'ora.ctssd' on 'node1'
CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'node1'
CRS-2676: Start of 'ora.drivers.acfs' on 'node1' succeeded
CRS-2676: Start of 'ora.ctssd' on 'node1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'node1'
CRS-2676: Start of 'ora.asm' on 'node1' succeeded
[root@node1 grid]# ps -ef|grep smon
grid 7264 1 0 15:34 ? 00:00:00 asm_smon_+ASM1
root 7317 6509 0 15:34 pts/0 00:00:00 grep smon
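Create the ASM disk group for the OCR and voting disk (COMPATIBLE.ASM must be 11.2 or higher for a disk group to hold them)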
[grid@node1 ~]$ sqlplus / as sysasm
SQL> create diskgroup data external redundancy disk '/dev/raw/raw1' attribute 'COMPATIBLE.ASM'='11.2';
Restore the OCR
[root@node1 cluster]# ocrconfig -restore week.ocr
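Restore the voting disk
Note:
1. If the voting disk is stored in ASM, repair it with crsctl replace votedisk +diskgroup.
2. If it is stored on a cluster file system, first list the voting disks with crsctl query css votedisk, then delete the failed voting disk and add a new one.
The command to add one is: crsctl add css votedisk 'path'
This example uses ASM, so the voting disk is restored as follows.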
[root@node1 cluster]# crsctl replace votedisk +DATA
Successful addition of voting disk 4414805f23ee4fcbbfc5b63ea5a121b5.
Successfully replaced voting disk group with +DATA.
CRS-4266: Voting file(s) successfully replaced
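After the replace, `crsctl query css votedisk` can confirm that the new voting file is usable. The sketch below counts the voting files reported as ONLINE. It assumes the listing puts the state in the second whitespace-separated column of each voting-file row, which should be checked against the real output on your system; `online_votedisks` is a hypothetical helper name.

```shell
#!/bin/sh
# Count voting files whose state column reads ONLINE in
# `crsctl query css votedisk` output read from standard input.
# Assumption: rows look like " 1. ONLINE <file universal id> (<path>) [<diskgroup>]".
online_votedisks() {
    awk '$2 == "ONLINE" { n++ } END { print n + 0 }'
}
```

It would be used as `crsctl query css votedisk | online_votedisks`. With external redundancy, as in this example, a single ONLINE voting file is what one would expect to see.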
Stop CRS
[root@node1 cluster]# crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'node1'
CRS-2673: Attempting to stop 'ora.crsd' on 'node1'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'node1'
CRS-2673: Attempting to stop 'ora.node2.vip' on 'node1'
CRS-2673: Attempting to stop 'ora.cvu' on 'node1'
CRS-2677: Stop of 'ora.cvu' on 'node1' succeeded
CRS-2677: Stop of 'ora.node2.vip' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.oc4j' on 'node1'
CRS-2677: Stop of 'ora.oc4j' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'node1'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'node1'
CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.scan1.vip' on 'node1'
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.node1.vip' on 'node1'
CRS-2677: Stop of 'ora.scan1.vip' on 'node1' succeeded
CRS-2677: Stop of 'ora.node1.vip' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.ARCH.dg' on 'node1'
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'node1'
CRS-2673: Attempting to stop 'ora.VOTE2.dg' on 'node1'
CRS-2673: Attempting to stop 'ora.registry.acfs' on 'node1'
CRS-2673: Attempting to stop 'ora.orcl.db' on 'node1'
CRS-2677: Stop of 'ora.ARCH.dg' on 'node1' succeeded
CRS-2677: Stop of 'ora.VOTE2.dg' on 'node1' succeeded
CRS-2677: Stop of 'ora.orcl.db' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.DAA.dg' on 'node1'
CRS-2677: Stop of 'ora.registry.acfs' on 'node1' succeeded
CRS-2677: Stop of 'ora.DAA.dg' on 'node1' succeeded
CRS-2677: Stop of 'ora.DATA.dg' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'node1'
CRS-2677: Stop of 'ora.asm' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.ons' on 'node1'
CRS-2677: Stop of 'ora.ons' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'node1'
CRS-2677: Stop of 'ora.net1.network' on 'node1' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'node1' has completed
CRS-2677: Stop of 'ora.crsd' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.mdnsd' on 'node1'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'node1'
CRS-2673: Attempting to stop 'ora.crf' on 'node1'
CRS-2673: Attempting to stop 'ora.ctssd' on 'node1'
CRS-2673: Attempting to stop 'ora.evmd' on 'node1'
CRS-2673: Attempting to stop 'ora.asm' on 'node1'
CRS-2677: Stop of 'ora.mdnsd' on 'node1' succeeded
CRS-2677: Stop of 'ora.crf' on 'node1' succeeded
CRS-2677: Stop of 'ora.evmd' on 'node1' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'node1' succeeded
CRS-2677: Stop of 'ora.asm' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'node1'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'node1'
CRS-2677: Stop of 'ora.drivers.acfs' on 'node1' succeeded
CRS-2677: Stop of 'ora.cssd' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'node1'
CRS-2677: Stop of 'ora.gipcd' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'node1'
CRS-2677: Stop of 'ora.gpnpd' on 'node1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'node1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
Start CRS
[root@node1 cluster]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
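CRS-4123 only means that the stack has begun to start, so it is worth checking afterwards that every daemon actually came up. The sketch below reads the output of `crsctl check crs` and succeeds only if every service line reports online. It assumes each service is reported on its own `CRS-` line ending in "is online"; `crs_all_online` is a hypothetical helper name.

```shell
#!/bin/sh
# Exit 0 only if every "CRS-" line read from standard input ends in "is online"
# and at least one such line was seen; exit 1 otherwise.
crs_all_online() {
    awk 'BEGIN { total = 0; ok = 0 }
         /^CRS-/ {
             total++
             if ($0 ~ /is online[ \t]*$/) ok++
         }
         END { exit ((total > 0 && ok == total) ? 0 : 1) }'
}
```

It would be used as `crsctl check crs | crs_all_online && echo "stack is up"`. The stack can take a few minutes to come up after `crsctl start crs`, so the check may need to be repeated, and `crsctl stat res -t` gives the per-resource view once it passes.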