PostgreSQL索引失效会发生什么-编程学习网

前段时间碰到个奇怪的索引失效的问题，实际情况类似下面这样：

bill=# begin;
BEGIN
bill=*# create index idx_t1 on t1(id);
CREATE INDEX
bill=*# explain select * from t1 where id = 1;
                     QUERY PLAN
----------------------------------------------------
 Seq Scan on t1  (cost=0.00..25.88 rows=6 width=36)
   Filter: (id = 1)
(2 rows)

bill=*# end;
COMMIT
bill=# explain select * from t1 where id = 1;
                             QUERY PLAN
---------------------------------------------------------------------
 Bitmap Heap Scan on t1  (cost=1.50..7.01 rows=6 width=36)
   Recheck Cond: (id = 1)
   ->  Bitmap Index Scan on idx_t1  (cost=0.00..1.50 rows=6 width=0)
         Index Cond: (id = 1)
(4 rows)

很显然的问题就是，我在事务中创建了索引，却没办法使用。但是当事务提交了后便可以正常使用了，这是什么情况呢？

这个其实和pg_index中indcheckxmin属性有关，关于这个字段的解释如下：

If true, queries must not use the index until the xmin of this pg_index row is below their TransactionXmin event horizon, because the table may contain broken HOT chains with incompatible rows that they can see

经检查也确实如此:

bill=*# select indcheckxmin from pg_index where indexrelid = 'idx_t1'::regclass;
 indcheckxmin
--------------
 t
(1 row)

那么问题来了，什么情况下创建索引时会将索引的该属性设置为true呢？

从前面官方文档对于该字段的解释，如果表中包含broken HOT chains 则会为true，那什么是broken HOT chains ？似乎和HOT机制有关。那是不是只有存在broken HOT chains 才会设置为true呢？

这里就不卖关子了，直接给出结论，然后我们再去一一验证。

经测试发现，以下两种情况会导致索引的indcheckxmin设置为true：

当前事务中表上存在broken HOT chains，即官方文档中所说；
当old_snapshot_threshold被设置时。

场景一：broken HOT chains

这种情况，只要在当前事务中表中存在HOT更新的行时就会存在。那么什么时候会进行HOT更新呢？两个前提：

新的元组和旧元组必须在同一个page中；
索引字段不能进行更新。

既然如此，实际中常见的两种情况就是：

对表上最后一个page进行更新；
表设置了fillfactor，即每个page上有预留的空闲空间。

例子：

表中插入10条数据，自然只有1个page：

bill=# insert into t1 select generate_series(1,10),md5(random()::text);
INSERT 0 10

进行更新：

bill=# update t1 set info = 'bill' where id = 10;
UPDATE 1

查看发现的确是HOT更新：

关于t_infomask2字段的解释这里就不再赘述。

PostgreSQL索引失效会发生什么

接下来我们创建索引：

可以发现indcheckxmin被设置为true，在当前事务中索引不可用。

PostgreSQL索引失效会发生什么

经过验证，在index_build阶段，判断到BrokenHotChain，便将indcheckxmin修改为true。

PostgreSQL索引失效会发生什么

具体的修改代码如下：

	
if ((indexInfo->ii_BrokenHotChain || EarlyPruningEnabled(heapRelation)) &&
		!isreindex &&
		!indexInfo->ii_Concurrent)
	{
		Oid			indexId = RelationGetRelid(indexRelation);
		Relation	pg_index;
		HeapTuple	indexTuple;
		Form_pg_index indexForm;
		pg_index = table_open(IndexRelationId, RowExclusiveLock);
		indexTuple = SearchSysCacheCopy1(INDEXRELID,
										 ObjectIdGetDatum(indexId));
		if (!HeapTupleIsValid(indexTuple))
			elog(ERROR, "cache lookup failed for index %u", indexId);
		indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
		
		Assert(!indexForm->indcheckxmin);

		indexForm->indcheckxmin = true;
		CatalogTupleUpdate(pg_index, &indexTuple->t_self, indexTuple);
		heap_freetuple(indexTuple);
		table_close(pg_index, RowExclusiveLock);
	}

同样我们也可以验证得知，的确是因为brokenhotchains导致的indcheckxmin被设置为true。

场景二：old_snapshot_threshold

先来看例子：

最简单的场景，完全的一张空表，在事务中创建索引indcheckxmin就会被设置为true，果然索引也是不可用。

bill=# drop table t1;
DROP TABLE
bill=# create table t1(id int,info text);
CREATE TABLE
bill=# begin;
BEGIN
bill=*# create index idx_t1 on t1(id);
CREATE INDEX
bill=*# select indcheckxmin from pg_index where indexrelid = 'idx_t1'::regclass;
 indcheckxmin
--------------
 t
(1 row)

bill=*# explain select * from t1 where id = 1;
                     QUERY PLAN
----------------------------------------------------
 Seq Scan on t1  (cost=0.00..25.88 rows=6 width=36)
   Filter: (id = 1)
(2 rows)

那么为什么old_snapshot_threshold会产生这样的影响呢？

经过跟踪发现，当开启该参数时，在事务中创建索引的snapshotdata结构如下：

(SnapshotData) $6 = {
snapshot_type = SNAPSHOT_MVCC
xmin = 856
xmax = 856
xip = 0x00007fd55c804fc0
xcnt = 0
subxip = 0x00007fd55ad5d000
subxcnt = 0
suboverflowed = false
takenDuringRecovery = false
copied = false
curcid = 1
speculativeToken = 0
vistest = NULL
active_count = 0
regd_count = 0
ph_node = {
first_child = NULL
next_sibling = NULL
prev_or_parent = NULL
}
whenTaken = 691752041261069
lsn = 208079736
}

而禁用该参数呢？

(SnapshotData) $7 = {
snapshot_type = SNAPSHOT_MVCC
xmin = 828
xmax = 828
xip = 0x00007fad31704780
xcnt = 0
subxip = 0x00007fad3155d000
subxcnt = 0
suboverflowed = false
takenDuringRecovery = false
copied = false
curcid = 1
speculativeToken = 0
active_count = 0
regd_count = 0
ph_node = {
first_child = NULL
next_sibling = NULL
prev_or_parent = NULL
}
whenTaken = 0
lsn = 0
}

可以看到，区别在于不使用该参数时，创建snapshotdata不会设置whenTaken和lsn，那么这两个参数是干嘛的呢？

先来看看snapshotdata的结构：

typedef struct SnapshotData
{
    SnapshotType snapshot_type; 
    
    TransactionId xmin;         
    TransactionId xmax;         
    
    TransactionId *xip;
    uint32      xcnt;           
    
    TransactionId *subxip;
    int32       subxcnt;        
    bool        suboverflowed;  
    bool        takenDuringRecovery;    
    bool        copied;         
    CommandId   curcid;         
    
    uint32      speculativeToken;
    
    struct GlobalVisState *vistest;
    
    uint32      active_count;   
    uint32      regd_count;     
    pairingheap_node ph_node;   
    TimestampTz whenTaken;      
    XLogRecPtr  lsn;            
    
    uint64      snapXactCompletionCount;
} SnapshotData;

如上所示，TimestampTz表示snapshot何时产生的，为什么启用old_snapshot_threshold时会设置该值呢？

因为该值正是用来判断快照是否过旧的：


void
TestForOldSnapshot_impl(Snapshot snapshot, Relation relation)
{
        if (RelationAllowsEarlyPruning(relation)
                && (snapshot)->whenTaken < GetOldSnapshotThresholdTimestamp())
                ereport(ERROR,
                                (errcode(ERRCODE_SNAPSHOT_TOO_OLD),
                                 errmsg("snapshot too old")));
}

这样我们也比较好理解为什么设置了该参数时创建的索引在当前事务中不可用：

因为我们不设置该参数时，在事务中创建索引是可以保证MVCC的一致性，那么索引便是安全可用的。

而使用参数时，由于TimestampTz被设置，数据库会对其进行判断该行数据是否已经过期，如果过期了那便会被清理掉，这样对于索引来说便是不安全的，没法保证数据的一致性，对于不是hot-safe的索引，自然要将其indcheckxmin设置为true，防止在事务中创建索引后数据实际已经过期被删除的情况。

总结

当pg_index的indcheckxmin字段被设置为true时，直到此pg_index行的xmin低于查询的TransactionXmin视界之前，查询都不能使用此索引。

而产生这种现象主要有两种情况：

1. 表上在当前事务中存在broken HOT chains；

2. old_snapshot_threshold被设置时。

到此这篇关于PostgreSQL索引失效会发生什么的文章就介绍到这了,更多相关PostgreSQL索引失效内容请搜索我们以前的文章或继续浏览下面的相关文章希望大家以后多多支持我们！

文章详情

PostgreSQL索引失效会发生什么

软考中级精品资料免费领

相关文章

猜你喜欢

PostgreSQL索引失效会发生什么

mysql索引为什么会失效

mysql索引什么情况下会失效

mysql中什么情况会导致索引失效

10张图，搞懂索引为什么会失效？

MySQL细数发生索引失效的情况

mysql中联合索引生效的条件及索引失效的条件是什么

面试突击：什么情况会导致 MySQL 索引失效？

索引失效的原因是什么

MySQL调优之索引在什么情况下会失效详解

索引失效底层原理是什么

MySQL索引失效的原理是什么

MySQL细数发生索引失效的情况实例分析

mysql in索引失效的原因是什么

容易引起oracle索引失效的原因是什么

mysql联合索引失效的原理是什么

mysql中出现索引失效的原因是什么

mysql联合索引生效的原则是什么

MySQL索引失效原因及SQL查询语句不走索引原因是什么

跳槽必备之你设计索引的原则是什么?怎么避免索引失效?