从DPF删除一个数据库分区决不是像直接编辑db2nodes.cfg那么简单,正确的做法是使用命令“db2stop drop partitionnum”。
如果你直接编辑db2nodes.cfg删除对应分区那行或者那些行,而恰好有数据库分区组(database partition group)还在使用被删除这个或者这些分区,那么会导致问题。 看下面的例子:
[db2inst1@testhost ~]$ cat sqllib/db2nodes.cfg
0 testhost 0
1 testhost 1
2 testhost 2
3 testhost 3
4 testhost 4
[db2inst1@testhost ~]$ db2 "select TBSP_NAME,DBPARTITIONNUM,TBSP_UTILIZATION_PERCENT from SYSIBMADM.TBSP_UTILIZATION order by TBSP_NAME with ur"
[db2inst1@testhost ~]$ db2 "create table t1(i int not null primary key, c char(10)) in TBS_TEST"
DB20000I The SQL command completed successfully.
[db2inst1@testhost ~]$ db2 terminate
DB20000I The TERMINATE command completed successfully.
[db2inst1@testhost ~]$./insert.sh
[db2inst1@testhost ~]$ db2stop
03/25/2018 18:54:41 1 0 SQL1064N DB2STOP processing was successful.
03/25/2018 18:54:41 2 0 SQL1064N DB2STOP processing was successful.
03/25/2018 18:54:42 0 0 SQL1064N DB2STOP processing was successful.
03/25/2018 18:54:43 3 0 SQL1064N DB2STOP processing was successful.
03/25/2018 18:54:43 4 0 SQL1064N DB2STOP processing was successful.
SQL1064N DB2STOP processing was successful.
[db2inst1@testhost ~]$ ipclean -a
Application ipclean: Removing all IPC resources for db2inst1
[db2inst1@testhost ~]$ db2_kill
rah: primary monitoring process for db2nkill is 61762
testhost: db2nkill: DB2 member 0 with PID 56900 does not exist.
testhost: db2nkill: DB2 member 1 with PID 57016 does not exist.
testhost: db2nkill: DB2 member 2 with PID 57113 does not exist.
testhost: db2nkill: DB2 member 3 with PID 57236 does not exist.
testhost: db2nkill: DB2 member 4 with PID 57269 does not exist.
testhost: db2nkill [] completed ok
[db2inst1@testhost ~]$ db2start
03/25/2018 19:06:23 3 0 SQL1063N DB2START processing was successful.
03/25/2018 19:06:25 2 0 SQL1063N DB2START processing was successful.
03/25/2018 19:06:25 1 0 SQL1063N DB2START processing was successful.
03/25/2018 19:06:26 0 0 SQL1063N DB2START processing was successful.
SQL1063N DB2START processing was successful.
[db2inst1@testhost ~]$ db2 connect to testdb
Database Connection Information
Database server = DB2/LINUXX8664 10.5.7
SQL authorization ID = DB2INST1
Local database alias = TESTDB
[db2inst1@testhost ~]$ db2 create table t2 like t1
DB21034E The command was processed as an SQL statement because it was not a
valid Command Line Processor command. During SQL processing it returned:
SQL1034C The database was damaged, so all applications processing the
database were stopped. SQLSTATE=58031
可以看到无法创建与t1一样的表。
[db2inst1@testhost ~]$ db2 "select * from t1"
SQL0901N The SQL statement or command failed because of a database system
error. (Reason "sqlno_node_set_add [100]:rc( 0) ".)
SQLSTATE=58004
x
2018-03-25-19.08.45.376940-420 I2972162E756 LEVEL: Info
PID : 64558 TID : 140661319788288 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : TESTDB
APPHDL : 0-62 APPID: *N0.db2inst1.180326020755
AUTHID : DB2INST1 HOSTNAME: testhost
EDUID : 64 EDUNAME: db2agent (TESTDB) 0
FUNCTION: DB2 UDB, access plan manager, sqlra_dumpEventsForType, probe:3317
DATA #1 : <preformatted>
Dumped APM/RDS event recorder information to file "/home/db2inst1/sqllib/db2dump/FODC_AppErr_2018-03-25-19.08.43.975256_64558_64_000/TESTDB.20180325190845.000.SQLRA.sqlraLOW.events.bin".
If IBM service is contacted to help resolve a recovery problem, this file may be requested.
2018-03-25-19.08.45.378378-420 I2972919E756 LEVEL: Info
PID : 64558 TID : 140661319788288 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : TESTDB
APPHDL : 0-62 APPID: *N0.db2inst1.180326020755
AUTHID : DB2INST1 HOSTNAME: testhost
EDUID : 64 EDUNAME: db2agent (TESTDB) 0
FUNCTION: DB2 UDB, access plan manager, sqlra_dumpEventsForType, probe:3317
DATA #1 : <preformatted>
Dumped APM/RDS event recorder information to file "/home/db2inst1/sqllib/db2dump/FODC_AppErr_2018-03-25-19.08.43.975256_64558_64_000/TESTDB.20180325190845.000.SQLRA.sqlraMED.events.bin".
If IBM service is contacted to help resolve a recovery problem, this file may be requested.
2018-03-25-19.08.45.379713-420 I2973676E757 LEVEL: Info
PID : 64558 TID : 140661319788288 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : TESTDB
APPHDL : 0-62 APPID: *N0.db2inst1.180326020755
AUTHID : DB2INST1 HOSTNAME: testhost
EDUID : 64 EDUNAME: db2agent (TESTDB) 0
FUNCTION: DB2 UDB, access plan manager, sqlra_dumpEventsForType, probe:3317
DATA #1 : <preformatted>
Dumped APM/RDS event recorder information to file "/home/db2inst1/sqllib/db2dump/FODC_AppErr_2018-03-25-19.08.43.975256_64558_64_000/TESTDB.20180325190845.000.SQLRA.sqlraHIGH.events.bin".
If IBM service is contacted to help resolve a recovery problem, this file may be requested.
2018-03-25-19.08.45.380027-420 I2974434E233 LEVEL: Severe
PID:64558 TID:140661319788288 NODE:000 Title: RDS DBCB
Dump File:/home/db2inst1/sqllib/db2dump/FODC_AppErr_2018-03-25-19.08.43.975256_64558_64_000/64558.64.000.dump.bin
……
重新创建在db2nodes.cfg中编辑恢复节点
下面是正确的方法,使用的是“db2stop drop partitionnum”,当然还有一些其它命令做为辅助。 主要是通过"drop dbpartitionnum verify"去确认当前分区不存在任何数据库的任何分区组(database partition group)上,如果存在,返回SQL6035W,否则返回SQL6034W。 如果返回的是SQL6035W,你则需要使用命令“redistribute database partition group”重分布分区组,去掉对当前分区的使用。 下面是具体过程:
[db2inst1@testhost ~]$ db2 "select * from syscat.dbpartitiongroupdef"
DBPGNAME DBPARTITIONNUM IN_USE
----------------- -------------------------- ----------
IBMCATGROUP 0 Y
IBMDEFAULTGROUP 0 Y
IBMDEFAULTGROUP 1 Y
IBMDEFAULTGROUP 2 Y
IBMDEFAULTGROUP 3 Y
IBMDEFAULTGROUP 4 Y
DATAGROUP 0 Y
DATAGROUP 1 Y
DATAGROUP 2 Y
DATAGROUP 3 Y
10 record(s) selected.
[db2inst1@testhost ~]$ export DB2NODE=4
[db2inst1@testhost ~]$ db2 terminate
DB20000I The TERMINATE command completed successfully.
[db2inst1@testhost ~]$ db2 drop dbpartitionnum verify
SQL6035W Database partition "4" is being used by database "TESTDB".
[db2inst1@testhost ~]$ export DB2NODE=0
[db2inst1@testhost ~]$ db2 terminate
[db2inst1@testhost ~]$ db2 connect to testdb
[db2inst1@testhost ~]$ db2 "redistribute database partition group DATAGROUP uniform drop dbpartitionnums (4)"
DB20000I The REDISTRIBUTE NODEGROUP command completed successfully.
[db2inst1@testhost ~]$ db2 force applications all
DB20000I The FORCE APPLICATION command completed successfully.
DB21024I This command is asynchronous and may not be effective immediately.
[db2inst1@testhost ~]$ export DB2NODE=4
[db2inst1@testhost ~]$ db2 terminate
DB20000I The TERMINATE command completed successfully.
[db2inst1@testhost ~]$ db2 drop dbpartitionnum verify
SQL6035W Database partition "4" is being used by database "TESTDB".
[db2inst1@testhost ~]$ db2stop drop dbpartitionnum 4
SQL6076W If you continue, this command will remove all database files for the specified database partition from the current instance. Before continuing, ensure that the specified database partition is not in use.
Do you want to continue ? (y/n)y
03/25/2018 23:02:06 3 0 SQL1064N DB2STOP processing was successful.
03/25/2018 23:02:07 1 0 SQL1064N DB2STOP processing was successful.
03/25/2018 23:02:07 0 0 SQL1064N DB2STOP processing was successful.
03/25/2018 23:02:07 2 0 SQL1064N DB2STOP processing was successful.
03/25/2018 23:02:07 4 0 SQL1064N DB2STOP processing was successful.
03/25/2018 23:02:11 3 0 SQL1063N DB2START processing was successful.
03/25/2018 23:02:11 1 0 SQL1063N DB2START processing was successful.
03/25/2018 23:02:11 4 0 SQL1063N DB2START processing was successful.
03/25/2018 23:02:11 2 0 SQL1063N DB2START processing was successful.
03/25/2018 23:02:11 0 0 SQL1063N DB2START processing was successful.
03/25/2018 23:02:14 4 0 SQL6034W Database partition "4" is not being used by any databases.
03/25/2018 23:02:17 0 0 SQL1064N DB2STOP processing was successful.
03/25/2018 23:02:18 2 0 SQL1064N DB2STOP processing was successful.
03/25/2018 23:02:19 1 0 SQL1064N DB2STOP processing was successful.
03/25/2018 23:02:19 3 0 SQL1064N DB2STOP processing was successful.
03/25/2018 23:02:19 4 0 SQL1064N DB2STOP processing was successful.
03/25/2018 23:02:21 4 0 SQL1064N DB2STOP processing was successful.
SQL1064N DB2STOP processing was successful.
[db2inst1@testhost ~]$ cat sqllib/db2nodes.cfg
0 testhost 0
1 testhost 1
2 testhost 2
3 testhost 3
[db2inst1@testhost ~]$ db2start
03/25/2018 23:04:32 2 0 SQL1063N DB2START processing was successful.
03/25/2018 23:04:32 1 0 SQL1063N DB2START processing was successful.
03/25/2018 23:04:33 3 0 SQL1063N DB2START processing was successful.
03/25/2018 23:04:34 0 0 SQL1063N DB2START processing was successful.
SQL1063N DB2START processing was successful.
[db2inst1@testhost ~]$ db2 connect to testdb
SQL1469N Instance "db2inst1" (database partition number "0") does not have
node "4" specified in its db2nodes.cfg file. SQLSTATE=08004
[db2inst1@testhost ~]$ export DB2NODE=0
[db2inst1@testhost ~]$ db2 terminate
DB20000I The TERMINATE command completed successfully.
[db2inst1@testhost ~]$ db2 create table t2 like t1
DB20000I The SQL command completed successfully.