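MySQL 5.6.37. Someone accidentally dropped tables on the master in a test environment, and replication between master and slave broke. The replication status on the slave showed: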
Last_Errno: 1837
Last_Error: Worker 3 failed executing transaction '' at master log mysql-bin.013343, end_log_pos 289330740; Error 'When @@SESSION.GTID_NEXT is set to a GTID,
you must explicitly set it to a different value after a COMMIT or ROLLBACK. Please check GTID_NEXT variable manual page for detailed explanation. Current
@@SESSION.GTID_NEXT is '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618462902'.' on query. Default database: 'DBNAME'. Query: 'DROP TABLE IF EXISTS TABLENAME_1,TABLENAME_2 ...
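Dropping tables should not by itself break replication. After asking, I learned that the drop had been triggered by mistake through a tool and was stopped immediately.
Possible causes:
1. create table table_name as select * from table_name; is split into two transactions, a CREATE TABLE and an INSERT. When it reaches the slave, the slave finishes the CREATE TABLE and then has no GTID for the INSERT, so it reports an error.
2. MyISAM storage engine. MyISAM supports the INSERT DELAYED syntax. INSERT DELAYED writes asynchronously, that is, it returns success to the client as soon as it is issued. Internally, MySQL merges the INSERT DELAYED rows from multiple threads and executes them together, but generates only one GTID. On the slave, because the table is MyISAM, only the first SQL statement can be executed under that GTID, so it reports an error.
3. On the master, an InnoDB transaction produces only one GTID. MyISAM does not support transactions, so once the first statement of the transaction has been executed, the second statement has no GTID, and an error is reported.
4. Temporary tables
5. A bug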
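Causes 1 to 3 come down to the rule quoted in the error text: once a GTID has been committed, GTID_NEXT has to be set to a different value before anything else runs. The sketch below is a toy model of that rule only, not MySQL's implementation. `ToySession`, `GtidNextError`, and `replay` are hypothetical names invented for illustration, and the assumption that every statement commits on a non-transactional engine is a simplification.

```python
class GtidNextError(Exception):
    """Stands in for MySQL error 1837 in this toy model."""


class ToySession:
    """Toy model of one applier session (hypothetical, for illustration)."""

    def __init__(self):
        self.gtid_next = None   # GTID the next transaction will use
        self.consumed = False   # True once that GTID has been committed
        self.applied = []       # (gtid, statement) pairs applied so far

    def set_gtid_next(self, gtid):
        self.gtid_next = gtid
        self.consumed = False

    def execute(self, statement, commits=True):
        # Rule from the error text: after a commit, GTID_NEXT must be set
        # to a different value before another statement may run.
        if self.consumed:
            raise GtidNextError(
                f"GTID_NEXT is still {self.gtid_next!r} after a commit"
            )
        self.applied.append((self.gtid_next, statement))
        if commits:
            self.consumed = True


def replay(session, gtid, statements, transactional):
    """Replay one GTID group. Assumption: on a transactional engine only the
    last statement commits; on a non-transactional one every statement does."""
    session.set_gtid_next(gtid)
    for i, stmt in enumerate(statements):
        is_last = i == len(statements) - 1
        session.execute(stmt, commits=is_last or not transactional)


group = ["INSERT INTO t VALUES (1)", "INSERT INTO t VALUES (2)"]

innodb_like = ToySession()
replay(innodb_like, "uuid:1", group, transactional=True)   # both applied

myisam_like = ToySession()
try:
    replay(myisam_like, "uuid:1", group, transactional=False)
except GtidNextError as exc:
    print("second statement rejected:", exc)
```

In this model the transactional replay applies both statements under one GTID, while the non-transactional replay applies the first and rejects the second, which is the shape of failure described in causes 1 to 3.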
This incident
1. Checked that the database has no MyISAM tables (the query lists every non-InnoDB table):
select table_schema,table_name,engine from information_schema.tables where engine !='innodb' and table_schema = 'DBNAME';
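2. enforce_gtid_consistency was checked and is ON on both master and slave, so a CREATE TABLE ... SELECT statement cannot execute successfully. This incident also does not involve a CREATE TABLE ... SELECT statement, so this cause is ruled out.
3. The storage engines are the same on master and slave
4. No temporary tables
The audit log on the master shows that drop schema dbname; was executed: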
20200707 12:55:20 ' drop schema DBNAME'
Master binlog:
# at 289328506
#200707 12:55:08 server id 100 end_log_pos 289328554 CRC32 0x1401e82a GTID [commit=yes]
SET @@SESSION.GTID_NEXT= '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618462902';
# at 289328554
#200707 12:55:08 server id 100 end_log_pos 289329642 CRC32 0x0728afdf Query thread_id=2388454119 exec_time=12 error_code=0
SET TIMESTAMP=1594097708;
SET @@session.sql_mode=270532608;
;
SET @@session.character_set_client=45,@@session.collation_connection=45,@@session.collation_server=33;
DROP TABLE IF EXISTS TABLENAME_1,TABLENAME_2,TABLENAME_3,TABLENAME_4,......;
# at 289329642
#200707 12:55:08 server id 100 end_log_pos 289330740 CRC32 0x0d0122e4 Query thread_id=2388454119 exec_time=12 error_code=0
SET TIMESTAMP=1594097708;
DROP TABLE IF EXISTS TABLENAME_5,TABLENAME_6,TABLENAME_7,TABLENAME_8,......;
# at 289330740
#200707 12:55:08 server id 100 end_log_pos 289331832 CRC32 0xd8409afa Query thread_id=2388454119 exec_time=12 error_code=0
SET TIMESTAMP=1594097708;
DROP TABLE IF EXISTS TABLENAME_9,TABLENAME_10,TABLENAME_11,TABLENAME_12,......;
# at 289331832
#200707 12:55:08 server id 100 end_log_pos 289332298 CRC32 0xa6657cc5 Query thread_id=2388454119 exec_time=12 error_code=0
SET TIMESTAMP=1594097708;
DROP TABLE IF EXISTS TABLENAME_13,TABLENAME_14,TABLENAME_15,TABLENAME_16,......;
# at 289332298
#200707 12:55:20 server id 100 end_log_pos 289332346 CRC32 0x0cc19e83 GTID [commit=yes]
SET @@SESSION.GTID_NEXT= '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618462903';
Slave binlog:
# at 19856236
#200707 12:55:08 server id 100 end_log_pos 19856284 CRC32 0x5e42595e GTID [commit=yes]
SET @@SESSION.GTID_NEXT= '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618462902';
DROP TABLE IF EXISTS TABLENAME_1,TABLENAME_2,TABLENAME_3,TABLENAME_4,......
# at 19857398
#200707 12:55:23 server id 100 end_log_pos 19857446 CRC32 0x81998bd5 GTID [commit=yes]
SET @@SESSION.GTID_NEXT= '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618463065';
# at 19857446
#200707 12:55:23 server id 100 end_log_pos 19857509 CRC32 0x0916a5e2 Query thread_id=2388456043 exec_time=0 error_code=0
SET TIMESTAMP=1594097723;
SET @@session.sql_mode=524288;
;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=33;
BEGIN
;
# at 19857509
#200707 12:55:23 server id 100 end_log_pos 19859435 CRC32 0x2eec3842 Rows_query
# insert into tablename....
......
......
#200707 12:55:23 server id 100 end_log_pos 19860287 CRC32 0xc983ec89 Xid = 5405680966
COMMIT;
# at 19860287
#200707 12:55:08 server id 100 end_log_pos 19861411 CRC32 0x20f2bc78 Query thread_id=2388454119 exec_time=1669 error_code=0
SET TIMESTAMP=1594097708;
SET @@session.sql_mode=270532608;
;
SET @@session.character_set_client=45,@@session.collation_connection=45,@@session.collation_server=33;
DROP TABLE IF EXISTS TABLENAME_5,TABLENAME_6,TABLENAME_7,TABLENAME_8,......;
# at 19861411
#200707 12:55:08 server id 100 end_log_pos 19862529 CRC32 0x35204dfe Query thread_id=2388454119 exec_time=1672 error_code=0
SET TIMESTAMP=1594097708;
DROP TABLE IF EXISTS TABLENAME_9,TABLENAME_10,TABLENAME_11,TABLENAME_12,......;
# at 19862529
#200707 12:55:23 server id 100 end_log_pos 19862577 CRC32 0x48b02b33 GTID [commit=yes]
SET @@SESSION.GTID_NEXT= '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618463066';
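One thing to look for in a dump like the slave excerpt above is a statement that follows a COMMIT with no new SET @@SESSION.GTID_NEXT in between, which is the condition the 1837 error text describes. The following is a minimal sketch of such a check over mysqlbinlog text. The function name `statements_after_commit` and the keyword list are assumptions made for illustration. Only explicit COMMIT lines are treated as ending a transaction, and the implicit commit of DDL is deliberately ignored.

```python
import re

GTID_RE = re.compile(r"SET @@SESSION\.GTID_NEXT= '([^']+)'")
STATEMENT_RE = re.compile(r"^(DROP|CREATE|ALTER|INSERT|UPDATE|DELETE)\b", re.I)


def statements_after_commit(dump):
    """Return (gtid, statement) for each statement that follows an explicit
    COMMIT with no new SET @@SESSION.GTID_NEXT in between."""
    gtid = None          # last GTID_NEXT value seen
    committed = False    # True after COMMIT until the next GTID_NEXT
    found = []
    for raw in dump.splitlines():
        line = raw.strip()
        match = GTID_RE.search(line)
        if match:
            gtid = match.group(1)
            committed = False
        elif line.rstrip(";").upper() == "COMMIT":
            committed = True
        elif committed and STATEMENT_RE.match(line):
            found.append((gtid, line))
    return found


# Abbreviated copy of the slave excerpt; "......" is kept as in the post.
SLAVE_EXCERPT = """\
SET @@SESSION.GTID_NEXT= '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618462902';
DROP TABLE IF EXISTS TABLENAME_1,TABLENAME_2,TABLENAME_3,TABLENAME_4,......
SET @@SESSION.GTID_NEXT= '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618463065';
BEGIN
COMMIT;
DROP TABLE IF EXISTS TABLENAME_5,TABLENAME_6,TABLENAME_7,TABLENAME_8,......;
DROP TABLE IF EXISTS TABLENAME_9,TABLENAME_10,TABLENAME_11,TABLENAME_12,......;
SET @@SESSION.GTID_NEXT= '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618463066';
"""

for gtid, stmt in statements_after_commit(SLAVE_EXCERPT):
    print(gtid.rsplit(":", 1)[1], stmt[:40])
```

On the abbreviated excerpt this reports the two DROP TABLE IF EXISTS statements for TABLENAME_5... and TABLENAME_9..., both attributed to the last GTID seen before them, the one ending in 2618463065.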
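1. The database has GTID replication enabled, so each GTID must correspond to exactly one transaction. On the slave, "drop schema dbname;" was split into multiple DROP TABLE statements.
2. Comparing the GTIDs on master and slave shows that the slave is missing the GTIDs from '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618462903' through '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618463064'.
3. In addition, under GTID '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618462902' the slave executed one DROP TABLE IF EXISTS statement and then went straight to GTID '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618463065', where it executed the insert into statement and COMMIT.
4. After the COMMIT, a different @@SESSION.GTID_NEXT should have been set, but it was not. Instead, DROP TABLE IF EXISTS statements were executed again. The transaction was split abnormally after GTID '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618462902', which is why replication reported the error.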
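The gap in point 2 can be derived from the GTIDs visible in the slave binlog excerpt (2618462902, 2618463065, 2618463066). The sketch below assumes all GTIDs share one server UUID. `missing_ranges` is a hypothetical helper, and it only describes gaps within the values it is given, not the slave's full executed GTID set.

```python
def missing_ranges(gtids):
    """Given GTID strings 'uuid:seq' from one server UUID, return the
    inclusive (first, last) sequence ranges absent between those seen."""
    seqs = sorted({int(g.rsplit(":", 1)[1]) for g in gtids})
    return [(a + 1, b - 1) for a, b in zip(seqs, seqs[1:]) if b - a > 1]


UUID = "07fd3067-b250-11e7-a2f0-1866da9e4b15"
seen_on_slave = [f"{UUID}:{n}" for n in (2618462902, 2618463065, 2618463066)]

for first, last in missing_ranges(seen_on_slave):
    # Print the gap in GTID-set notation plus the number of transactions.
    print(f"{UUID}:{first}-{last}", "count:", last - first + 1)
```

For the three GTIDs above this yields the single range 2618462903-2618463064, that is, 162 sequence numbers, matching the range named in point 2.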