昨天中秋节,本该是团圆的好日子,苦逼的运维我还要值班(哈哈,吐槽一下)本以为会没有啥事,谁知道比较重要的一台Oracle服务器突然报警,CPU 2个core都飙到100%,load average也比较高,如下图:
AWS CloudWatch也可以看出来CPU长期使用率100%
从图可得:系统us比较高,sy基本可以忽略,Memory和IO都已经检查过,不存在瓶颈,根据以往经验,极有可能是Oracle数据库有SQL在长时间运行,并且没有释放,登录到数据库查看,可以看到sid为410,408,404进程执行的都是同一个SQL,
SYS@xxxxxx>SELECT b.sid oracleID,
b.username,
b.serial#,
spid,
paddr,
b.machine,
c.sql_text
FROM v$process a, v$session b, v$sqlarea c
WHERE a.addr = b.paddr AND b.sql_hash_value = c.hash_value;
ORACLEID USERNAME SERIAL# SPID PADDR MACHINE
---------- ------------------------------ ---------- ------------------------ ---------------- ----------------------------------------------------------------
SQL_TEXT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
410 PRERNAP2 371 16743 00000002DEC84E60 Prernap2-mbr
with cte as(select distinct weekdate, week_month, week_year, SHIPTO, BUYERPARTNUMBER, dense_rank() over(partition by week_year,SHIPTO, BUYERPARTNUMBER, week_month order by weekdate) as rank_week FROM
(Select weekdate, to_char(weekdate, 'MM') as week_month, to_char(weekdate, 'YYYY') as week_year, SHIPTO, BUYERPARTNUMBER from AUTO.ford_forecast_details ) order by weekdate), cte1 as (select foreca
stdate,forecast_month, forecast_year, shipto, buyerpartnumber from (select to_date(forecastdate, 'YYYYMMDD') as forecastdate, TO_CHAR(to_date(forecastdate, 'YYYYMMDD'), 'MM' ) as forecast_month, TO_C
HAR(to_date(forecastdate, 'YYYYMMDD'), 'YYYY' ) as forecast_year,SHIPTO, BUYERPARTNUMBER from AUTO.ford_forecast_details ) order by forecastdate ), cte2 as (select cte.shipto, cte.buyerpartnumber,ct
e.week_month, cte.week_year, max(rank_week) as max_week from cte group by cte.shipto, cte.buyerpartnumber,cte.week_month, cte.week_year), cte4 as (select cte2.*, cte1.forecast_month, cte1.forecast_y
408 PRERNAP21163 15129 00000002DEC916A0 Prernap2-mbr
with cte as(select distinct weekdate, week_month, week_year, SHIPTO, BUYERPARTNUMBER, dense_rank() over(partition by week_year,SHIPTO, BUYERPARTNUMBER, week_month order by weekdate) as rank_week FROM
(Select weekdate, to_char(weekdate, 'MM') as week_month, to_char(weekdate, 'YYYY') as week_year, SHIPTO, BUYERPARTNUMBER from AUTO.ford_forecast_details ) order by weekdate), cte1 as (select foreca
stdate,forecast_month, forecast_year, shipto, buyerpartnumber from (select to_date(forecastdate, 'YYYYMMDD') as forecastdate, TO_CHAR(to_date(forecastdate, 'YYYYMMDD'), 'MM' ) as forecast_month, TO_C
HAR(to_date(forecastdate, 'YYYYMMDD'), 'YYYY' ) as forecast_year,SHIPTO, BUYERPARTNUMBER from AUTO.ford_forecast_details ) order by forecastdate ), cte2 as (select cte.shipto, cte.buyerpartnumber,ct
e.week_month, cte.week_year, max(rank_week) as max_week from cte group by cte.shipto, cte.buyerpartnumber,cte.week_month, cte.week_year), cte4 as (select cte2.*, cte1.forecast_month, cte1.forecast_y
18 PRERNAP2 311 19710 00000002DEC948B0 Prernap2-mbr
with cte as(select distinct weekdate, week_month, week_year, SHIPTO, BUYERPARTNUMBER, dense_rank() over(partition by week_year,SHIPTO, BUYERPARTNUMBER, week_month order by weekdate) as rank_week FROM
(Select weekdate, to_char(weekdate, 'MM') as week_month, to_char(weekdate, 'YYYY') as week_year, SHIPTO, BUYERPARTNUMBER from AUTO.ford_forecast_details ) order by weekdate), cte1 as (select foreca
stdate,forecast_month, forecast_year, shipto, buyerpartnumber from (select to_date(forecastdate, 'YYYYMMDD') as forecastdate, TO_CHAR(to_date(forecastdate, 'YYYYMMDD'), 'MM' ) as forecast_month, TO_C
HAR(to_date(forecastdate, 'YYYYMMDD'), 'YYYY' ) as forecast_year,SHIPTO, BUYERPARTNUMBER from AUTO.ford_forecast_details ) order by forecastdate ), cte2 as (select cte.shipto, cte.buyerpartnumber,ct
e.week_month, cte.week_year, max(rank_week) as max_week from cte group by cte.shipto, cte.buyerpartnumber,cte.week_month, cte.week_year), cte4 as (select cte2.*, cte1.forecast_month, cte1.forecast_y
404 PRERNAP2 665 21911 00000002DEC95960 Prernap2-mbr
with cte as( select distinct weekdate, week_month, week_year, SHIPTO, BUYERPARTNUMBER, dense_rank() over(partition by week_year,SHIPTO, BUYERPARTNUMBER, week_month order by weekdate) as rank_week FROM
(Select weekdate, to_char(weekdate, 'MM') as week_month, to_char(weekdate, 'YYYY') as week_year, SHIPTO, BUYERPARTNUMBER from AUTO.ford_forecast_details ) order by weekdate ), cte1 as ( select for
ecastdate,forecast_month, forecast_year, shipto, buyerpartnumber from (select to_date(forecastdate, 'YYYYMMDD') as forecastdate, TO_CHAR(to_date(forecastdate, 'YYYYMMDD'), 'MM' ) as forecast_month, T
O_CHAR(to_date(forecastdate, 'YYYYMMDD'), 'YYYY' ) as forecast_year,SHIPTO, BUYERPARTNUMBER from AUTO.ford_forecast_details ) order by forecastdate ), cte2 as ( select cte.shipto, cte.buyerpartnumbe
r,cte.week_month, cte.week_year, max(rank_week) as max_week , cte1.forecast_month, cte1.forecast_year from cte inner join cte1 on cte1.forecast_year>=cte.week_year and cte.shipto = cte1.shipto and c
22 SYS 447 23888 00000002DEC96A10 ec2-admart-01
SELECT b.sid oracleID, b.username, b.serial#, spid,paddr, b.machine,c.sql_text FROM v$process a, v$session b, v$sqlarea c WHERE a.addr = b.paddrAND b.sq
l_hash_value = c.hash_value
387 PRERNAP2 313 24261 00000002DEC97AC0 Prernap2-mbr
with cte as( select distinct weekdate, week_month, week_year, SHIPTO, BUYERPARTNUMBER, dense_rank() over(partition by week_year,SHIPTO, BUYERPARTNUMBER, week_month order by weekdate) as rank_week FROM
(Select weekdate, to_char(weekdate, 'MM') as week_month, to_char(weekdate, 'YYYY') as week_year, SHIPTO, BUYERPARTNUMBER from AUTO.ford_forecast_details ) order by weekdate ), cte1 as ( select for
ecastdate,forecast_month, forecast_year, shipto, buyerpartnumber from (select to_date(forecastdate, 'YYYYMMDD') as forecastdate, TO_CHAR(to_date(forecastdate, 'YYYYMMDD'), 'MM' ) as forecast_month, T
O_CHAR(to_date(forecastdate, 'YYYYMMDD'), 'YYYY' ) as forecast_year,SHIPTO, BUYERPARTNUMBER from AUTO.ford_forecast_details ) order by forecastdate ), cte2 as ( select cte.shipto, cte.buyerpartnumbe
r,cte.week_month, cte.week_year, max(rank_week) as max_week , cte1.forecast_month, cte1.forecast_year from cte inner join cte1 on cte1.forecast_year>=cte.week_year and cte.shipto = cte1.shipto and c
6 rows selected.
SYS@xxxxxx>select b.sid,b.serial#,b.machine,b.terminal,b.program,b.process,b.status
from v$lock a, v$session b where a.SID = b.SID and username is not null and username not in ('SYS','SYSTEM');
SID SERIAL# MACHINE TERMINAL PROGRAM PROCESS STATUS
---------- ---------- ---------------------------------------------------------------- ------------------------------ ------------------------------------------------ ------------------------ --------
387 313 Prernap2-mbr unknown SQL Developer 6246 ACTIVE
387 313 Prernap2-mbr unknown SQL Developer 6246 ACTIVE
404 665 Prernap2-mbr unknown SQL Developer 4145 ACTIVE
387 313 Prernap2-mbr unknown SQL Developer 6246 ACTIVE
387 313 Prernap2-mbr unknown SQL Developer 6246 ACTIVE
408 1163 Prernap2-mbr unknown SQL Developer 3377 ACTIVE
387 313 Prernap2-mbr unknown SQL Developer 6246 ACTIVE
387 313 Prernap2-mbr unknown SQL Developer 6246 ACTIVE
404 665 Prernap2-mbr unknown SQL Developer 4145 ACTIVE
408 1163 Prernap2-mbr unknown SQL Developer 3377 ACTIVE
410 371 Prernap2-mbr unknown SQL Developer 5691 ACTIVE
18 311 Prernap2-mbr unknown SQL Developer 1497 ACTIVE
20 221 Prernap2-mbr unknown SQL Developer 4689 ACTIVE
20 221 Prernap2-mbr unknown SQL Developer 4689 ACTIVE
387 313 Prernap2-mbr unknown SQL Developer 6246 ACTIVE
15 rows selected.
SYS@xxxxxx>select sid, username, blocking_session from v$session where blocking_session is not null;
SID USERNAME BLOCKING_SESSION
---------- ------------------------------ ----------------
18 PRERNAP2 408
387 PRERNAP2 404
410 PRERNAP2 408
SYS@xxxxxx>select sid, serial#, username from v$session where sid='410';
SID SERIAL# USERNAME
---------- ---------- ------------------------------
410 371 PRERNAP2
解决方法
找到开发人员,询问原因,得到的反馈是在测试几条SQL(我擦,竟然在生产环境测试SQL,哎,一点敬畏之心都没有,可怕!)
kill掉blocked的进程,释放资源,再这么跑下去,系统随时可能崩溃,最后去优化一下的SQL,再去执行
alter system kill session '410,371';
......其他几个进程同理干掉即可