文章详情

短信预约-IT技能 免费直播动态提醒

请输入下面的图形验证码

提交验证

短信预约提醒成功

PostgreSQL 源码解读(58)- 查询语句#43(make_one_rel函数#8-B...

2024-04-02 19:55

关注

这一小节继续介绍查询物理优化中的create_index_paths->create_bitmap_heap_path函数,该函数创建位图堆扫描访问路径节点。
关于BitmapHeapScan的相关知识,请参照PostgreSQL DBA(6) - SeqScan vs IndexScan vs BitmapHeapScan这篇文章.
本节没有描述具体的Cost成本计算方法(公式),后续再行详述。

一、数据结构

Cost相关
注意:实际使用的参数值通过系统配置文件定义,而不是这里的常量定义!

 typedef double Cost; 

 
 
 
 
 #define DEFAULT_SEQ_PAGE_COST  1.0       //顺序扫描page的成本
 #define DEFAULT_RANDOM_PAGE_COST  4.0      //随机扫描page的成本
 #define DEFAULT_CPU_TUPLE_COST  0.01     //处理一个元组的CPU成本
 #define DEFAULT_CPU_INDEX_TUPLE_COST 0.005   //处理一个索引元组的CPU成本
 #define DEFAULT_CPU_OPERATOR_COST  0.0025    //执行一次操作或函数的CPU成本
 #define DEFAULT_PARALLEL_TUPLE_COST 0.1    //并行执行,从一个worker传输一个元组到另一个worker的成本
 #define DEFAULT_PARALLEL_SETUP_COST  1000.0  //构建并行执行环境的成本
 
 #define DEFAULT_EFFECTIVE_CACHE_SIZE  524288    

 double      seq_page_cost = DEFAULT_SEQ_PAGE_COST;
 double      random_page_cost = DEFAULT_RANDOM_PAGE_COST;
 double      cpu_tuple_cost = DEFAULT_CPU_TUPLE_COST;
 double      cpu_index_tuple_cost = DEFAULT_CPU_INDEX_TUPLE_COST;
 double      cpu_operator_cost = DEFAULT_CPU_OPERATOR_COST;
 double      parallel_tuple_cost = DEFAULT_PARALLEL_TUPLE_COST;
 double      parallel_setup_cost = DEFAULT_PARALLEL_SETUP_COST;
 
 int         effective_cache_size = DEFAULT_EFFECTIVE_CACHE_SIZE;
 Cost        disable_cost = 1.0e10;//1后面10个0,通过设置一个巨大的成本,让优化器自动放弃此路径
 
 int         max_parallel_workers_per_gather = 2;//每次gather使用的worker数

二、源码解读

create_bitmap_heap_path函数
create_index_paths->create_bitmap_heap_path函数,创建位图堆扫描访问路径节点.

 
 BitmapHeapPath *
 create_bitmap_heap_path(PlannerInfo *root,
                         RelOptInfo *rel,
                         Path *bitmapqual,
                         Relids required_outer,
                         double loop_count,
                         int parallel_degree)
 {
     BitmapHeapPath *pathnode = makeNode(BitmapHeapPath);//创建节点
 
     pathnode->path.pathtype = T_BitmapHeapScan;
     pathnode->path.parent = rel;
     pathnode->path.pathtarget = rel->reltarget;
     pathnode->path.param_info = get_baserel_parampathinfo(root, rel,
                                                           required_outer);
     pathnode->path.parallel_aware = parallel_degree > 0 ? true : false;
     pathnode->path.parallel_safe = rel->consider_parallel;
     pathnode->path.parallel_workers = parallel_degree;
     pathnode->path.pathkeys = NIL;  
 
     pathnode->bitmapqual = bitmapqual;
 
     cost_bitmap_heap_scan(&pathnode->path, root, rel,
                           pathnode->path.param_info,
                           bitmapqual, loop_count);//成本估算
 
     return pathnode;//返回结果
 }
 
//-------------------------------------------------------- cost_bitmap_heap_scan
 
 void
 cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
                       ParamPathInfo *param_info,
                       Path *bitmapqual, double loop_count)
 {
     Cost        startup_cost = 0;//启动成本
     Cost        run_cost = 0;//执行成本
     Cost        indexTotalCost;//索引扫描总成本
     QualCost    qpqual_cost;//表达式成本
     Cost        cpu_per_tuple;
     Cost        cost_per_page;
     Cost        cpu_run_cost;
     double      tuples_fetched;
     double      pages_fetched;
     double      spc_seq_page_cost,
                 spc_random_page_cost;
     double      T;
 
     
     Assert(IsA(baserel, RelOptInfo));
     Assert(baserel->relid > 0);
     Assert(baserel->rtekind == RTE_RELATION);
 
     
     if (param_info)
         path->rows = param_info->ppi_rows;
     else
         path->rows = baserel->rows;
 
     if (!enable_bitmapscan)//不允许位图扫描
         startup_cost += disable_cost;//禁用之
 
     pages_fetched = compute_bitmap_pages(root, baserel, bitmapqual,
                                          loop_count, &indexTotalCost,
                                          &tuples_fetched);//计算页面数
 
     startup_cost += indexTotalCost;//启动成本为BitmapIndexScan的总成本
     T = (baserel->pages > 1) ? (double) baserel->pages : 1.0;//页面数
 
     
     get_tablespace_page_costs(baserel->reltablespace,
                               &spc_random_page_cost,
                               &spc_seq_page_cost);//访问表空间页面成本
 
     
     if (pages_fetched >= 2.0)
         cost_per_page = spc_random_page_cost -
             (spc_random_page_cost - spc_seq_page_cost)
             * sqrt(pages_fetched / T);
     else
         cost_per_page = spc_random_page_cost;
 
     run_cost += pages_fetched * cost_per_page;//执行成本
 
     
     get_restriction_qual_cost(root, baserel, param_info, &qpqual_cost);//获取条件表达式
 
     startup_cost += qpqual_cost.startup;//增加启动成本
     cpu_per_tuple = cpu_tuple_cost + qpqual_cost.per_tuple;//增加处理每个元组的CPU成本
     cpu_run_cost = cpu_per_tuple * tuples_fetched;//CPU运行成本
 
     
     if (path->parallel_workers > 0)//是否并行?
     {
         double      parallel_divisor = get_parallel_divisor(path);
 
         
         cpu_run_cost /= parallel_divisor;
 
         path->rows = clamp_row_est(path->rows / parallel_divisor);
     }
 
     //计算最终成本
     run_cost += cpu_run_cost;
 
     
     startup_cost += path->pathtarget->cost.startup;
     run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
     path->startup_cost = startup_cost;
     path->total_cost = startup_cost + run_cost;
 }

//--------------------------------------- compute_bitmap_pages

 
 double
 compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
                      int loop_count, Cost *cost, double *tuple)
 {
     Cost        indexTotalCost;
     Selectivity indexSelectivity;
     double      T;
     double      pages_fetched;
     double      tuples_fetched;
     double      heap_pages;
     long        maxentries;
 
     
     cost_bitmap_tree_node(bitmapqual, &indexTotalCost, &indexSelectivity);
 
     
     tuples_fetched = clamp_row_est(indexSelectivity * baserel->tuples);//计算总元组数
 
     T = (baserel->pages > 1) ? (double) baserel->pages : 1.0;
 
     
     pages_fetched = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
 
     
     heap_pages = Min(pages_fetched, baserel->pages);//堆页面数
     maxentries = tbm_calculate_entries(work_mem * 1024L);//位图最大入口数
 
     if (loop_count > 1)
     {
         
         pages_fetched = index_pages_fetched(tuples_fetched * loop_count,
                                             baserel->pages,
                                             get_indexpath_pages(bitmapqual),
                                             root);
         pages_fetched /= loop_count;
     }
 
     if (pages_fetched >= T)
         pages_fetched = T;//数据字典中的页面数
     else
         pages_fetched = ceil(pages_fetched);
 
     if (maxentries < heap_pages)//最大入口数小于堆页面数
     {
         double      exact_pages;
         double      lossy_pages;
 
         
         lossy_pages = Max(0, heap_pages - maxentries / 2);
         exact_pages = heap_pages - lossy_pages;
 
         
         if (lossy_pages > 0)
             tuples_fetched =
                 clamp_row_est(indexSelectivity *
                               (exact_pages / heap_pages) * baserel->tuples +
                               (lossy_pages / heap_pages) * baserel->tuples);
     }
 
     if (cost)
         *cost = indexTotalCost;
     if (tuple)
         *tuple = tuples_fetched;
 
     return pages_fetched;
 }

//--------------------------- tbm_calculate_entries

 
 long
 tbm_calculate_entries(double maxbytes)
 {
     long        nbuckets;
 
     
     nbuckets = maxbytes /
         (sizeof(PagetableEntry) + sizeof(Pointer) + sizeof(Pointer));//桶数
     nbuckets = Min(nbuckets, INT_MAX - 1);  
     nbuckets = Max(nbuckets, 16);   
 
     return nbuckets;
 }

//--------------------------- cost_bitmap_tree_node

 
 void
 cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec)
 {
     if (IsA(path, IndexPath))//索引访问路径
     {
         *cost = ((IndexPath *) path)->indextotalcost;
         *selec = ((IndexPath *) path)->indexselectivity;
 
         
         *cost += 0.1 * cpu_operator_cost * path->rows;
     }
     else if (IsA(path, BitmapAndPath))//BitmapAndPath
     {
         *cost = path->total_cost;
         *selec = ((BitmapAndPath *) path)->bitmapselectivity;
     }
     else if (IsA(path, BitmapOrPath))//BitmapOrPath
     {
         *cost = path->total_cost;
         *selec = ((BitmapOrPath *) path)->bitmapselectivity;
     }
     else
     {
         elog(ERROR, "unrecognized node type: %d", nodeTag(path));
         *cost = *selec = 0;     
     }
 }

三、跟踪分析

测试脚本如下

select t1.* 
from t_dwxx t1 
where dwbh > '10000' and dwbh < '30000';

启动gdb跟踪

(gdb) b create_bitmap_heap_path
Breakpoint 1 at 0x78f1c1: file pathnode.c, line 1090.
(gdb) c
Continuing.

Breakpoint 1, create_bitmap_heap_path (root=0x23d93d8, rel=0x248a788, bitmapqual=0x2473a08, required_outer=0x0, 
    loop_count=1, parallel_degree=0) at pathnode.c:1090
1090        BitmapHeapPath *pathnode = makeNode(BitmapHeapPath);

创建节点,并赋值

1090        BitmapHeapPath *pathnode = makeNode(BitmapHeapPath);
(gdb) n
1092        pathnode->path.pathtype = T_BitmapHeapScan;
(gdb) n
1093        pathnode->path.parent = rel;
(gdb) n
1094        pathnode->path.pathtarget = rel->reltarget;
(gdb) n
1095        pathnode->path.param_info = get_baserel_parampathinfo(root, rel,
(gdb) 
1097        pathnode->path.parallel_aware = parallel_degree > 0 ? true : false;
(gdb) 
1098        pathnode->path.parallel_safe = rel->consider_parallel;
(gdb) 
1099        pathnode->path.parallel_workers = parallel_degree;
(gdb) 
1100        pathnode->path.pathkeys = NIL;  
(gdb) 
1102        pathnode->bitmapqual = bitmapqual;

进入cost_bitmap_heap_scan函数

(gdb) 
1104        cost_bitmap_heap_scan(&pathnode->path, root, rel,
(gdb) step
cost_bitmap_heap_scan (path=0x24737d8, root=0x23d93d8, baserel=0x248a788, param_info=0x0, bitmapqual=0x2473a08, 
    loop_count=1) at costsize.c:949
949     Cost        startup_cost = 0;

输入参数,其中bitmapqual为T_IndexPath节点
路径的其他关键信息:rows = 2223, startup_cost = 0.28500000000000003, total_cost = 169.23871600907944

(gdb) p *(IndexPath *)bitmapqual
$2 = {path = {type = T_IndexPath, pathtype = T_IndexScan, parent = 0x248a788, pathtarget = 0x248a998, param_info = 0x0, 
    parallel_aware = false, parallel_safe = true, parallel_workers = 0, rows = 2223, startup_cost = 0.28500000000000003, 
    total_cost = 169.23871600907944, pathkeys = 0x0}, indexinfo = 0x23b63b8, indexclauses = 0x2473948, 
  indexquals = 0x2473b38, indexqualcols = 0x2473b88, indexorderbys = 0x0, indexorderbycols = 0x0, 
  indexscandir = ForwardScanDirection, indextotalcost = 50.515000000000001, indexselectivity = 0.22227191011235958}

开始计算成本

...
980     startup_cost += indexTotalCost;
(gdb) p indexTotalCost
$16 = 51.070750000000004
(gdb) p startup_cost
$17 = 0
(gdb) p pages_fetched
$18 = 64
(gdb) p baserel->pages
$19 = 64
...
(gdb) p qpqual_cost
$20 = {startup = 0, per_tuple = 0.0050000000000000001}

最终的访问路径信息

(gdb) p *(BitmapHeapPath *)path
$22 = {path = {type = T_BitmapHeapPath, pathtype = T_BitmapHeapScan, parent = 0x248a788, pathtarget = 0x248a998, 
    param_info = 0x0, parallel_aware = false, parallel_safe = true, parallel_workers = 0, rows = 2223, 
    startup_cost = 51.070750000000004, total_cost = 148.41575, pathkeys = 0x0}, bitmapqual = 0x2473a08}

除了BitmapHeapPath,还有BitmapOr和BitmapAnd,这两种Path的解析后续再详述.

四、参考资料

allpaths.c
cost.h
costsize.c
PG Document:Query Planning

免责声明:

① 本站未注明“稿件来源”的信息均来自网络整理。其文字、图片和音视频稿件的所属权归原作者所有。本站收集整理出于非商业性的教育和科研之目的,并不意味着本站赞同其观点或证实其内容的真实性。仅作为临时的测试数据,供内部测试之用。本站并未授权任何人以任何方式主动获取本站任何信息。

② 本站未注明“稿件来源”的临时测试数据将在测试完成后最终做删除处理。有问题或投稿请发送至: 邮箱/279061341@qq.com QQ/279061341

软考中级精品资料免费领

  • 历年真题答案解析
  • 备考技巧名师总结
  • 高频考点精准押题
  • 资料下载
  • 历年真题
  • 2024年上半年信息系统项目管理师第二批次真题及答案解析(完整版)

    难度     813人已做
    查看
  • 【考后总结】2024年5月26日信息系统项目管理师第2批次考情分析

    难度     354人已做
    查看
  • 【考后总结】2024年5月25日信息系统项目管理师第1批次考情分析

    难度     318人已做
    查看
  • 2024年上半年软考高项第一、二批次真题考点汇总(完整版)

    难度     435人已做
    查看
  • 2024年上半年系统架构设计师考试综合知识真题

    难度     224人已做
    查看

相关文章

发现更多好内容

猜你喜欢

AI推送时光机
咦!没有更多了?去看看其它编程学习网 内容吧
首页课程
资料下载
问答资讯