Original: http://www.oracle.com/technetwork/articles/schumacher-analysis-099313.html
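With Oracle Database 10g, many response-time metrics that were previously hard to obtain are now readily available.
Historically, in the pursuit of maximum database performance, Oracle DBAs and performance analysts have fought an uphill battle to obtain reliable response-time metrics for both the system and user sessions. The problem DBAs have faced has two sides: first, determining exactly where the database or a user session is spending its time; and second, objectively determining the nature of the user experience.
Given all the possible activities and interactions in a database, these tasks are far from trivial. The Oracle Wait Interface, introduced many versions ago, has been a great start for administrators who know how to use it, even though it lacks the ideal ability to tell a DBA how efficiently the system or a user session is processing transactions or queries. Enabling and studying trace files can yield this level of detail, but for most overworked DBAs who manage large database farms, that exercise is a luxury.
Fortunately, DBAs who have upgraded to Oracle Database 10g will find major response-time enhancements that provide a better picture of system-level and session-level response-time metrics. Most notably, the Oracle Automatic Database Diagnostic Monitor (ADDM) provides insight into response time and much more by automatically collecting and analyzing statistics, identifying problem areas, and even offering recommendations through the Oracle Enterprise Manager Grid Control GUI.
In addition, and most pertinent to our discussion here, Oracle Database 10g offers historical mechanisms that let DBAs look back in time and perform their own response-time trend analysis. This helps them determine peak and off-peak transaction and system times, and locate the rogue processes and SQL that lengthen batch cycles or ETL jobs.
In this article, I'll explore how to use some of these historical mechanisms at the system, session, and SQL levels. For more information about ADDM, see the Oracle documentation and the "ADDM and SQL Tuning Advisor" installment of Arup Nanda's "Oracle Database 10g: Top 20 Features for DBAs".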
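System-Level Response-Time Analysis
At the global, or system, level, DBAs typically want answers to these questions:
In general, how well is my database running? What defines efficiency?
What average response time are my users experiencing?
Which activities affect overall response time the most?
Before Oracle Database 10g, the answers to these questions were rather elusive for DBAs, but such metrics are now fairly easy to capture if you happen to be running the latest and greatest Oracle database.
First, part of the answer to how well, in general, a database is running can be obtained by issuing this query in Oracle Database 10g: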
select METRIC_NAME,
VALUE
from SYS.V_$SYSMETRIC
where METRIC_NAME IN ('Database CPU Time Ratio',
'Database Wait Time Ratio') AND
INTSIZE_CSEC =
(select max(INTSIZE_CSEC) from SYS.V_$SYSMETRIC);
METRIC_NAME VALUE
------------------------------ ----------
Database Wait Time Ratio 6
Database CPU Time Ratio 94
The Oracle Database 10g V$SYSMETRIC view contains several excellent response-time metrics, two of which are the Database Wait Time Ratio and Database CPU Time Ratio. The query above shows the latest snapshot of these two statistics, which help you determine if your database is currently experiencing a high percentage of waits/bottlenecks vs. smoothly running operations. The Database CPU Time Ratio is calculated by dividing the amount of CPU expended in the database by the amount of "database time," which is defined as the time spent by the database on user-level calls (with instance background process activity being excluded). High values (90-95+ percent) are good and indicate few wait/bottleneck actions, but take this threshold only as a general rule of thumb because every system is different.
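The calculation described above can also be reproduced by hand as a sanity check. The following is a minimal sketch, assuming the 'DB CPU' and 'DB time' statistics in V$SYS_TIME_MODEL are the inputs to the ratio; because those values accumulate from instance startup, the result will generally differ from the interval-based figure that V$SYSMETRIC reports.

```sql
-- Sketch only: CPU time as a percentage of database time,
-- computed from the cumulative time model statistics.
-- NULLIF guards against division by zero on a freshly started instance.
select round(100 * c.value / nullif(d.value, 0), 2) cpu_time_ratio
from   (select value
        from sys.v_$sys_time_model
        where stat_name = 'DB CPU') c,
       (select value
        from sys.v_$sys_time_model
        where stat_name = 'DB time') d;
```

In the sample output above the two ratios add up to 100, so a low CPU ratio from this check suggests a correspondingly high share of wait time.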
You can also take a quick look over the last hour to see if the database has experienced any dips in overall performance by using this query:
select end_time,
value
from sys.v_$sysmetric_history
where metric_name = 'Database CPU Time Ratio'
order by 1;
END_TIME VALUE
-------------------- ----------
22-NOV-2004 10:00:38 98
22-NOV-2004 10:01:39 96
22-NOV-2004 10:02:37 99
22-NOV-2004 10:03:38 100
22-NOV-2004 10:04:37 99
22-NOV-2004 10:05:38 77
22-NOV-2004 10:06:36 100
22-NOV-2004 10:07:37 96
22-NOV-2004 10:08:39 100
.
.
And, you can get a good idea of the minimum, maximum, and average values of overall database efficiency by querying the V$SYSMETRIC_SUMMARY view with a query such as this:
select CASE METRIC_NAME
WHEN 'SQL Service Response Time' then 'SQL Service Response Time (secs)'
WHEN 'Response Time Per Txn' then 'Response Time Per Txn (secs)'
ELSE METRIC_NAME
END METRIC_NAME,
CASE METRIC_NAME
WHEN 'SQL Service Response Time' then ROUND((MINVAL / 100),2)
WHEN 'Response Time Per Txn' then ROUND((MINVAL / 100),2)
ELSE MINVAL
END MINIMUM,
CASE METRIC_NAME
WHEN 'SQL Service Response Time' then ROUND((MAXVAL / 100),2)
WHEN 'Response Time Per Txn' then ROUND((MAXVAL / 100),2)
ELSE MAXVAL
END MAXIMUM,
CASE METRIC_NAME
WHEN 'SQL Service Response Time' then ROUND((AVERAGE / 100),2)
WHEN 'Response Time Per Txn' then ROUND((AVERAGE / 100),2)
ELSE AVERAGE
END AVERAGE
from SYS.V_$SYSMETRIC_SUMMARY
where METRIC_NAME in ('CPU Usage Per Sec',
'CPU Usage Per Txn',
'Database CPU Time Ratio',
'Database Wait Time Ratio',
'Executions Per Sec',
'Executions Per Txn',
'Response Time Per Txn',
'SQL Service Response Time',
'User Transaction Per Sec')
ORDER BY 1;
METRIC_NAME MINIMUM MAXIMUM AVERAGE
------------------------------ ---------- ---------- ----------
CPU Usage Per Sec 0 7 1
CPU Usage Per Txn 1 29 8
Database CPU Time Ratio 61 100 94
Database Wait Time Ratio 0 39 5
Executions Per Sec 2 60 8
Executions Per Txn 16 164 41
Response Time Per Txn (secs) 0 .28 .08
SQL Service Response Time (sec 0 0 0
User Transaction Per Sec 0 1 0
The query above contains more response-time metrics than simply the Database CPU and Wait Time Ratios (we'll cover those later), but you can see the benefit in being able to acquire this information. For this particular instance, the average Database CPU Time Ratio is 94, which is well within our acceptable limits.
The next question DBAs pose at the system level involves the average level of response time that their user community is experiencing. (Prior to Oracle Database 10g this type of data was difficult to capture, but not anymore.) The query shown above that interrogates the V$SYSMETRIC_SUMMARY view tells us what we need to know. If complaints of unacceptable response times are mounting from users, the DBA can check the Response Time Per Txn and SQL Service Response Time metrics to see if a database issue exists. For example, the statistics shown above report that the maximum response time per user transaction has been only .28 second, with the average response time being a blazing .08 second. Oracle certainly wouldn't be to blame in this case.
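To see when response time degraded, and not only the summary figures, the same metric can be pulled from the one-hour history. This is a sketch under two assumptions: that the metric is reported in centiseconds (as the division by 100 in the summary query implies), and that 0.25 second is a threshold worth flagging. The threshold is purely illustrative and should be replaced with whatever your users consider acceptable.

```sql
-- Sketch only: intervals in the last hour where the average
-- response time per transaction exceeded an illustrative threshold.
select end_time,
       round((value / 100),2) resp_time_secs
from sys.v_$sysmetric_history
where metric_name = 'Response Time Per Txn' and
      (value / 100) > 0.25   -- hypothetical threshold, in seconds
order by end_time;
```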
If, however, response times are longer than desired, the DBA will then want to know what types of user activities are responsible for making the database work so hard. Again, before Oracle Database 10g, this information was more difficult to acquire, but now the answer is only a query away:
select case db_stat_name
when 'parse time elapsed' then
'soft parse time'
else db_stat_name
end db_stat_name,
case db_stat_name
when 'sql execute elapsed time' then
time_secs - plsql_time
when 'parse time elapsed' then
time_secs - hard_parse_time
else time_secs
end time_secs,
case db_stat_name
when 'sql execute elapsed time' then
round(100 * (time_secs - plsql_time) / db_time,2)
when 'parse time elapsed' then
round(100 * (time_secs - hard_parse_time) / db_time,2)
else round(100 * time_secs / db_time,2)
end pct_time
from
(select stat_name db_stat_name,
round((value / 1000000),3) time_secs
from sys.v_$sys_time_model
where stat_name not in('DB time','background elapsed time',
'background cpu time','DB CPU')),
(select round((value / 1000000),3) db_time
from sys.v_$sys_time_model
where stat_name = 'DB time'),
(select round((value / 1000000),3) plsql_time
from sys.v_$sys_time_model
where stat_name = 'PL/SQL execution elapsed time'),
(select round((value / 1000000),3) hard_parse_time
from sys.v_$sys_time_model
where stat_name = 'hard parse elapsed time')
order by 2 desc;
DB_STAT TIME_SECS PCT_TIME
----------------------------- --------- --------
sql execute elapsed time 13263.707 45.84
PL/SQL execution elapsed time 13234.738 45.74
hard parse elapsed time 1943.687 6.72
soft parse time 520.584 1.8
.
.
The example output above shows a database that has spent the vast majority of its time handling SQL and PL/SQL requests. Complete descriptions of all the statistics supported by the V$SYS_TIME_MODEL view can be found in the Oracle Database Reference.
In addition to active time, a DBA will want to know the global wait times as well. Prior to Oracle Database 10g, a DBA had to view individual wait events to understand waits and bottlenecks, but now Oracle provides a summary/rollup mechanism for waits via wait classes:
select WAIT_CLASS,
TOTAL_WAITS,
round(100 * (TOTAL_WAITS / SUM_WAITS),2) PCT_WAITS,
ROUND((TIME_WAITED / 100),2) TIME_WAITED_SECS,
round(100 * (TIME_WAITED / SUM_TIME),2) PCT_TIME
from
(select WAIT_CLASS,
TOTAL_WAITS,
TIME_WAITED
from V$SYSTEM_WAIT_CLASS
where WAIT_CLASS != 'Idle'),
(select sum(TOTAL_WAITS) SUM_WAITS,
sum(TIME_WAITED) SUM_TIME
from V$SYSTEM_WAIT_CLASS
where WAIT_CLASS != 'Idle')
order by 5 desc;
WAIT_CLASS TOTAL_WAITS PCT_WAITS TIME_WAITED_SECS PCT_TIME
--------------- ----------- ---------- ---------------- ----------
User I/O 2245204 7.48 4839.43 54.39
System I/O 2438387 8.12 2486.21 27.94
Application 920385 3.07 513.56 5.77
Other 39962 .13 422.36 4.75
Commit 200872 .67 284.76 3.2
Network 24133213 80.38 162.26 1.82
Concurrency 6867 .02 102.63 1.15
Configuration 39377 .13 86.21 .97
It's much easier to tell now that the bulk of overall wait time is due, for example, to user I/O waits than to try to tally individual wait events to get a global picture. As with response-time metrics, you can also look back in time over the last hour with a query like this one:
select to_char(a.end_time,'DD-MON-YYYY HH24:MI:SS') end_time,
b.wait_class,
round((a.time_waited / 100),2) time_waited
from sys.v_$waitclassmetric_history a,
sys.v_$system_wait_class b
where a.wait_class# = b.wait_class# and
b.wait_class != 'Idle'
order by 1,2;
END_TIME WAIT_CLASS TIME_WAITED
-------------------- --------------- -----------
22-NOV-2004 11:28:37 Application 0
22-NOV-2004 11:28:37 Commit .02
22-NOV-2004 11:28:37 Concurrency 0
22-NOV-2004 11:28:37 Configuration 0
22-NOV-2004 11:28:37 Network .01
22-NOV-2004 11:28:37 Other 0
22-NOV-2004 11:28:37 System I/O .05
22-NOV-2004 11:28:37 User I/O 0
.
.
You can, of course, just focus on a single SID with the V$SESS_TIME_MODEL view and obtain data for all statistical areas of a session. You can also view current session wait activity, grouped by the new wait classes, with the following query:
select a.sid,
b.username,
a.wait_class,
a.total_waits,
round((a.time_waited / 100),2) time_waited_secs
from sys.v_$session_wait_class a,
sys.v_$session b
where b.sid = a.sid and
b.username is not null and
a.wait_class != 'Idle'
order by 5 desc;
SID USERNAME WAIT_CLASS TOTAL_WAITS TIME_WAITED_SECS
--- ---------- --------------- ----------- ----------------
257 SYSMAN Application 356104 75.22
255 SYSMAN Commit 14508 25.76
257 SYSMAN Commit 25026 22.02
257 SYSMAN User I/O 11924 19.98
.
.
.
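For the single-session view mentioned earlier, a query against V$SESS_TIME_MODEL might look like the sketch below. The SID 257 is simply taken from the sample output above and stands in for whichever session you are investigating; the division by 1,000,000 assumes microsecond values, as in the system-level time model query.

```sql
-- Sketch only: time model statistics for one session, largest first.
select stat_name,
       round((value / 1000000),3) time_secs
from sys.v_$sess_time_model
where sid = 257   -- placeholder SID; substitute your own
order by value desc;
```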
After this stage, you can check the standard individual wait events as you've been able to do in earlier versions of Oracle with V$SESSION_WAIT and V$SESSION_EVENT. You'll also find the new wait classes in these two modified views with Oracle Database 10g.
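As an illustration of that drill-down, the sketch below lists the individual wait events for one session together with the wait class each belongs to. It assumes the WAIT_CLASS column that Oracle Database 10g adds to V$SESSION_EVENT, and again uses SID 257 from the sample output only as a placeholder.

```sql
-- Sketch only: individual wait events for one session,
-- labelled with their wait class and ordered by time waited.
select event,
       wait_class,
       total_waits,
       round((time_waited / 100),2) time_waited_secs
from sys.v_$session_event
where sid = 257 and   -- placeholder SID; substitute your own
      wait_class != 'Idle'
order by time_waited desc;
```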
If you need to look back in time to discover what sessions were logged on and consuming the most resources, you can use the following query. In the example below, we're looking at activity from midnight to 5 a.m. on November 21, 2004, that involved user I/O waits:
select sess_id,
username,
program,
wait_event,
sess_time,
round(100 * (sess_time / total_time),2) pct_time_waited
from
(select a.session_id sess_id,
decode(session_type,'background',session_type,c.username) username,
a.program program,
b.name wait_event,
sum(a.time_waited) sess_time
from sys.v_$active_session_history a,
sys.v_$event_name b,
sys.dba_users c
where a.event# = b.event# and
a.user_id = c.user_id and
sample_time > to_timestamp('2004-11-21 00:00:00','YYYY-MM-DD HH24:MI:SS') and
sample_time < to_timestamp('2004-11-21 05:00:00','YYYY-MM-DD HH24:MI:SS') and
b.wait_class = 'User I/O'
group by a.session_id,
decode(session_type,'background',session_type,c.username),
a.program,
b.name),
(select sum(a.time_waited) total_time
from sys.v_$active_session_history a,
sys.v_$event_name b
where a.event# = b.event# and
sample_time > to_timestamp('2004-11-21 00:00:00','YYYY-MM-DD HH24:MI:SS') and
sample_time < to_timestamp('2004-11-21 05:00:00','YYYY-MM-DD HH24:MI:SS') and
b.wait_class = 'User I/O')
order by 6 desc;
SESS_ID USERNAME PROGRAM WAIT_EVENT SESS_TIME PCT_TIME_WAITED
------- -------- ---------- ------------------------- ---------- -------------
242 SYS exp@RHAT9K db file scattered read 3502978 33.49
242 SYS oracle@RHA db file sequential read 2368153 22.64
242 SYS oracle@RHA db file scattered read 1113896 10.65
243 SYS oracle@RHA db file sequential read 992168 9.49
The Oracle Database 10g V$ACTIVE_SESSION_HISTORY view comes into play here to provide an insightful look back in time at session experiences for a given time period. This view gives you a lot of excellent information without the need for laborious tracing functions. We'll see more use of it in the next section, which deals with analyzing the response times of SQL statements.
SQL Response-Time Analysis
Examining the response time of SQL statements became easier in Oracle9i, and with Oracle Database 10g, DBAs have many tools at their disposal to help them track inefficient database code.
Historically the applicable V$ view here has been V$SQLAREA. Starting with Oracle9i, Oracle added the ELAPSED_TIME and CPU_TIME columns, which have been a huge help in determining the actual end user experience of a SQL statement execution (at least, when dividing them by the EXECUTIONS column, which produces the average amount of time per execution).
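The per-execution averages described above can be computed directly in the query. This is a minimal sketch, assuming ELAPSED_TIME and CPU_TIME are in microseconds; it skips statements with zero executions to avoid dividing by zero.

```sql
-- Sketch only: top five statements by average elapsed time per execution.
select *
from
(select sql_id,
        executions,
        round((elapsed_time / executions / 1000000),3) avg_elapsed_secs,
        round((cpu_time / executions / 1000000),3) avg_cpu_secs
 from sys.v_$sqlarea
 where executions > 0
 order by elapsed_time / executions desc)
where rownum < 6;
```

Ranking by the average instead of the total tends to surface statements that are slow each time they run, as opposed to cheap statements that are merely executed very often.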
In Oracle Database 10g, six new wait-related and timing columns have been added to V$SQLAREA:
APPLICATION_WAIT_TIME
CONCURRENCY_WAIT_TIME
CLUSTER_WAIT_TIME
USER_IO_WAIT_TIME
PLSQL_EXEC_TIME
JAVA_EXEC_TIME
The new columns are helpful in determining, for example, the amount of time that a procedure spends in PL/SQL code vs. standard SQL execution, and if a SQL statement has experienced any particular user I/O waits. For example, a query you could use to find the top five SQL statements with the highest user I/O waits would be:
select *
from
(select sql_text,
sql_id,
elapsed_time,
cpu_time,
user_io_wait_time
from sys.v_$sqlarea
order by 5 desc)
where rownum < 6;
SQL_TEXT SQL_ID ELAPSED_TIME CPU_TIME USER_IO_WAIT_TIME
------------------------- ------------ ------------ ---------- ---------------
select bucket db78fxqxwxt7 47815369 19000939 3423
SELECT :"SYS_B_0" FROM SY agdpzr94rf6v 36182205 10170226 2649
select obj#,type#,ctime,m 04xtrk7uyhkn 28815527 16768040 1345
select grantee#,privilege 2q93zsrvbdw4 28565755 19619114 803
select bucket 96g93hntrzjt 9411028 3754542 606
Of course, getting the SQL statements with the highest elapsed time or wait time is good, but you need more detail to get to the heart of the matter, which is where the V$ACTIVE_SESSION_HISTORY view again comes into play. With this view, you can find out what actual wait events delayed the SQL statement along with the actual files, objects, and object blocks that caused the waits (where applicable).
For example, let's say you've found a particular SQL statement that appears to be extremely deficient in terms of user I/O wait time. You can issue the following query to get the individual wait events associated with the query along with the corresponding wait times, files, and objects that were the source of those waits:
select event,
time_waited,
owner,
object_name,
current_file#,
current_block#
from sys.v_$active_session_history a,
sys.dba_objects b
where sql_id = '6gvch2xu9ca3g' and
a.current_obj# = b.object_id and
time_waited <> 0;
EVENT TIME_WAITED OWNER OBJECT_NAME file block
------------------------- ----------- ------ --------------------- ---- ------
db file sequential read 27665 SYSMAN MGMT_METRICS_1HOUR_PK 3 29438
db file sequential read 3985 SYSMAN SEVERITY_PRIMARY_KEY 3 52877
Of course, you can use V$ACTIVE_SESSION_HISTORY in a historical fashion to narrow down unoptimized SQL statements for a particular time period. The point is that Oracle Database 10g makes it a lot easier to conduct response-time analysis on SQL statements with simplified data dictionary views, as opposed to the time-consuming trace-and-digest method.
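One way to do that historical narrowing is sketched below. The time window is a placeholder borrowed from the earlier session example, and the sample count is only a rough proxy for activity, since the view holds periodic samples of active sessions and not a complete record of every call.

```sql
-- Sketch only: SQL statements that appear most often in the
-- active session history for a chosen (placeholder) time window.
select sql_id,
       count(*) samples,
       sum(time_waited) total_time_waited
from sys.v_$active_session_history
where sample_time > to_timestamp('2004-11-21 00:00:00','YYYY-MM-DD HH24:MI:SS') and
      sample_time < to_timestamp('2004-11-21 05:00:00','YYYY-MM-DD HH24:MI:SS') and
      sql_id is not null
group by sql_id
order by 2 desc;
```

The SQL_ID values returned can then be fed into the per-statement wait query shown earlier in this section.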
Conclusion
DBAs and performance analysts who manage the performance of Oracle Database 10g will find many of the response-time metrics they've yearned for over the years now at their fingertips in the latest release of Oracle's flagship database. Such statistics will help accelerate the process of finding the proverbial needle in the haystack of large and complex database performance situations.