postgresql 库中出现性能问题,对于复杂的sql, 常用分析过程:
- 简化SQL,定位性能异常点:
- 简化输出。像下面语句,可以先把输出的子查询去掉。有时也可以使用count(*)代替输出。
- 逐个测试union(minus),with子句。基于这些语句的独立性,可以逐个测试,逐渐添加条件,找到异常点。
- 分析执行计划,查看表数据量,连接方式,统计信息情况,索引情况
- Explain 各部分的消耗,连接方式等,如果语句可以在接受时间内执行,可以使用explain(analyze, buffers, timing)
- Pg_stat_user_table可以查看什么时候做的vacuum和analyze,live tuple和dead tuple个数,还有增删改查的次数等。
- Pg_stats 可以查看值的分布情况
回到下面的SQL:
1. 先做简化,使用count(*)替换所有输出:
explain(analyze , buffers, timing) select count(*)
from sms_task_content_info a,
tsk_type_tbl b,
tsk_plan_info c,
sm_code_tbl d,
smu_info e
where a.course_type = b.course_type
and a.course_id = c.content_id
and c.plan_maker = e.user_id
and e.region_code = d.region_code
and d.is_valid = "Y"
and c.date_plan >= to_date("2016-12-01", "yyyy-mm-dd")
and c.date_plan < to_date("2016-12-31", "yyyy-mm-dd") + 1
;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=2947.16..2947.18 rows=1 width=0) (actual time=49.154..49.154 rows=1 loops=1)
Buffers: shared hit=1602
-> Hash Join (cost=657.32..2928.06 rows=7643 width=0) (actual time=13.259..48.521 rows=7440 loops=1)
Hash Cond: ((c.content_id)::text = (a.course_id)::text)
Buffers: shared hit=1602
-> Hash Join (cost=459.24..2615.33 rows=7643 width=33) (actual time=10.020..42.532 rows=7440 loops=1)
Hash Cond: ((c.plan_maker)::text = (e.user_id)::text)
Buffers: shared hit=1491
-> Seq Scan on tsk_plan_info c (cost=0.00..2022.34 rows=7643 width=45) (actual time=0.629..29.272 rows=7440 loops=1)
Filter: ((date_plan >= to_date("2016-12-01"::text, "yyyy-mm-dd"::text)) AND (date_plan < (to_date("2016-12-31"::text, "yyyy-mm-dd"::text) + 1)))
Rows Removed by Filter: 25003
Buffers: shared hit=1286
-> Hash (cost=412.29..412.29 rows=3756 width=12) (actual time=9.377..9.377 rows=3756 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 164kB
Buffers: shared hit=205
-> Hash Join (cost=179.00..412.29 rows=3756 width=12) (actual time=3.754..7.788 rows=3756 loops=1)
Hash Cond: ((e.region_code)::text = (d.region_code)::text)
Buffers: shared hit=205
-> Seq Scan on smu_info e (cost=0.00..167.56 rows=3756 width=14) (actual time=0.006..1.228 rows=3756 loops=1)
Buffers: shared hit=130
-> Hash (cost=127.00..127.00 rows=4160 width=6) (actual time=3.736..3.736 rows=4103 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 156kB
Buffers: shared hit=75
-> Seq Scan on sms_region_code_tbl d (cost=0.00..127.00 rows=4160 width=6) (actual time=0.003..2.201 rows=4103 loops=1)
Filter: ((is_valid)::text = "Y"::text)
Rows Removed by Filter: 4
Buffers: shared hit=75
-> Hash (cost=171.94..171.94 rows=2092 width=33) (actual time=3.231..3.231 rows=2093 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 133kB
Buffers: shared hit=111
-> Hash Join (cost=12.25..171.94 rows=2092 width=33) (actual time=0.021..2.231 rows=2093 loops=1)
Hash Cond: ((a.course_type)::text = (b.course_type)::text)
Buffers: shared hit=111
-> Seq Scan on sms_task_content_info a (cost=0.00..130.92 rows=2092 width=35) (actual time=0.004..0.818 rows=2093 loops=1)
Buffers: shared hit=110
-> Hash (cost=11.00..11.00 rows=100 width=20) (actual time=0.009..0.009 rows=6 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
Buffers: shared hit=1
-> Seq Scan on tsk_type_tbl b (cost=0.00..11.00 rows=100 width=20) (actual time=0.003..0.005 rows=6 loops=1)
Buffers: shared hit=1
Planning time: 2.522 ms
Execution time: 49.270 ms
(42 rows)
Time: 71.990 ms
去掉子查询后,语句很快就输出了, 问题就在输出结果里的子查询,到最后输出7440行,就意味着那两个子查询都需要7440次。整体语句慢在这里。
select distinct d.description as "RANGE",
b.description as "COURSE_CLASSIFICATION_DESC",
to_char(c.date_make, "yyyy-mm-dd") as "DATE_MAKE",
to_char(c.date_end, "yyyy-mm-dd") as "DATE_END",
to_char(c.date_plan, "yyyy-mm-dd") as "DATE_PLAN",
(select cast((case
when (select count(1)
from sms_task_content_info a2,
tsk_plan_info b2,
smu_info s2
where a2.course_id = b2.content_id
and b2.plan_maker = s2.user_id
and b2.plan_status != "2"
and s2.region_code = e.region_code
and a2.course_type = a.course_type
and to_char(b2.date_make, "yyyy-mm-dd") =
to_char(c.date_make, "yyyy-mm-dd")
and to_char(b2.date_plan, "yyyy-mm-dd") =
to_char(c.date_plan, "yyyy-mm-dd")
and to_char(b2.date_end, "yyyy-mm-dd") =
to_char(c.date_end, "yyyy-mm-dd")) != 0 then
(cast(100 AS numeric(5, 2)) *
(select count(1)
from sms_task_content_info a1,
tsk_plan_info b1,
smu_info s1
where a1.course_id = b1.content_id
and b1.plan_maker = s1.user_id
and b1.plan_status = "1"
and s1.region_code = e.region_code
and a1.course_type = a.course_type
and to_char(b1.date_make, "yyyy-mm-dd") =
to_char(c.date_make, "yyyy-mm-dd")
and to_char(b1.date_plan, "yyyy-mm-dd") =
to_char(c.date_plan, "yyyy-mm-dd")
and to_char(b1.date_end, "yyyy-mm-dd") =
to_char(c.date_end, "yyyy-mm-dd")) /
(select count(1)
from sms_task_content_info a2,
tsk_plan_info b2,
smu_info s2
where a2.course_id = b2.content_id
and b2.plan_maker = s2.user_id
and b2.plan_status != "2"
and s2.region_code = e.region_code
and a2.course_type = a.course_type
and to_char(b2.date_make, "yyyy-mm-dd") =
to_char(c.date_make, "yyyy-mm-dd")
and to_char(b2.date_plan, "yyyy-mm-dd") =
to_char(c.date_plan, "yyyy-mm-dd")
and to_char(b2.date_end, "yyyy-mm-dd") =
to_char(c.date_end, "yyyy-mm-dd")))
else
"0"
end) AS numeric(5, 2)) || "%"
from dual) as "FINISH_RATIO",
d.region_code,
b.course_type
from sms_task_content_info a,
tsk_type_tbl b,
tsk_plan_info c,
sm_code_tbl d,
smu_info e
where a.course_type = b.course_type
and a.course_id = c.content_id
and c.plan_maker = e.user_id
and e.region_code = d.region_code
and d.is_valid = "Y"
-- and e.region_code in()
-- and a.course_type = "1"
and c.date_plan >= to_date("2016-12-01", "yyyy-mm-dd")
and c.date_plan < to_date("2016-12-31", "yyyy-mm-dd") + 1
order by d.region_code, b.course_type;