Performance Tuning (3): CPU Performance Analysis

dn001 2009-06-19 00:09:14

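1: CPU Architecture and How It Works
2: The Operating System and Processes
3: Metrics for Measuring How Busy the CPU Is
4: Signs That the CPU Has Become the Performance Bottleneck
5: Which Processes Are the Heavy CPU Consumers?
6: Analyzing CPU Utilization with sar
7: Analyzing Run Queue Length with sar
8: Analyzing System Calls with sar
9: Measuring the Efficiency of a Command or Program with time
10: Finding the Processes That Use the Most CPU with top
11: Checking Overall System Status with uptime
12: Analyzing CPU Utilization with GlancePlus
13: Performance Tuning for CPU-Intensive Systems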

CPU Architecture and How It Works

When we say CPU we usually mean the microprocessor. A CPU typically consists of the following main components:

CPU(central processing unit)
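cache: the cache is high-speed memory with an access time of typically 10-20 nanoseconds (ns), which lets the CPU access it once per clock cycle; ordinary main memory has an access time of 80-90 ns. The size of the cache has a large effect on CPU performance.
TLB (translation lookaside buffer): the TLB is a high-speed cache that holds recently used pairs of virtual addresses and their corresponding physical addresses, so that virtual addresses can be translated to physical addresses quickly. The TLB is a subset of the system translation tables kept in memory; each TLB entry normally maps a memory page rather than a single memory address. Its size also has a large effect on CPU performance.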
coprocessor
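Different CPUs generally have different clock frequencies and cache sizes.

In one clock cycle the CPU can generally fetch one instruction from the cache and execute it. In theory, therefore, the higher the clock frequency, the more instructions the CPU can execute per unit of time. Some current CPUs can execute several instructions per clock cycle; the PA8500, for example, can execute 4.
Cache size constrains CPU efficiency: however fast the clock, a CPU that cannot get its data simply idles. Cache size is therefore important. The cache is further divided into a data cache and an instruction cache, which hold data and instructions prefetched from memory for upcoming execution.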
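
To make the effect of the cache concrete, the sketch below computes an average access time for a given cache hit ratio. The 15 ns and 85 ns defaults are midpoints of the ranges quoted above; the hit ratios, the function name, and the one-level model itself are assumptions chosen for illustration.

```python
def average_access_ns(hit_ratio, cache_ns=15.0, memory_ns=85.0):
    """Average access time for a simple one-level cache model.

    A hit costs cache_ns and a miss costs memory_ns. Real hardware has
    several cache levels and overlapping accesses, which this ignores.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ns + (1.0 - hit_ratio) * memory_ns


# Assumed hit ratios, for illustration only.
for ratio in (0.50, 0.90, 0.99):
    print(f"hit ratio {ratio:.2f}: {average_access_ns(ratio):.1f} ns")
```

Under this model, every point of hit ratio lost moves the average closer to main-memory speed, which is why a larger cache can matter as much as a faster clock.
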
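Virtual addressing

The virtual address space of a system is usually far larger than its physical address space. On a 64-bit system, for example, the theoretical address space is 2 to the 64th power bytes (2**64, about 18,447 PB), but because of cost the physical memory actually installed is at most a dozen or so GB.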
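
The size of a 64-bit address space can be checked with a few lines of arithmetic. This is only a sketch; the function name is made up for the example, and the petabyte figure uses decimal units (10**15 bytes).

```python
def address_space_bytes(address_bits):
    """Number of distinct byte addresses expressible in address_bits bits."""
    return 2 ** address_bits


size = address_space_bytes(64)
print(size)                   # exact number of bytes
print(round(size / 10**15))   # decimal petabytes, rounded
print(size // 2**60)          # binary exbibytes
```
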

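Every process has its own unique virtual address space, but to run, its virtual addresses must be mapped to physical addresses, which requires the TLB, the cache, and memory to work together. If the required information is not in memory, a page fault occurs.

Pipelining

The TLB and cache try to supply the CPU with the information it needs within one clock cycle. Even if they succeed 100% of the time, however, the CPU must still spend one clock cycle fetching the next instruction and another executing it, so CPU utilization is only 50%. The usual way to keep the CPU busier is pipelining; the PA8500, for example, uses a 7-stage pipeline.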
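
The benefit of overlapping fetch and execute can be sketched with a simple cycle count. The model below assumes an ideal pipeline with no stalls or branch penalties, which real processors do not achieve; the function names and the instruction count are illustrative, and only the 7-stage figure comes from the text.

```python
def unpipelined_cycles(instructions, cycles_per_instruction=2):
    """Cycles when each instruction is fetched, then executed, with no overlap."""
    return instructions * cycles_per_instruction


def pipelined_cycles(instructions, stages):
    """Cycles for an ideal pipeline: fill it once, then finish one instruction per cycle."""
    if instructions <= 0:
        return 0
    return stages + instructions - 1


n = 1000  # assumed instruction count
for label, cycles in (
    ("no overlap", unpipelined_cycles(n)),
    ("7-stage pipeline", pipelined_cycles(n, 7)),
):
    print(f"{label}: {cycles} cycles, {n / cycles:.2f} instructions per cycle")
```

Without overlap the rate is fixed at half an instruction per cycle; in the ideal pipelined case it approaches one per cycle as the instruction stream grows.
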

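The Operating System and Processes

HP-UX is a multiuser, multitasking Unix operating system. Its performance depends on the number of users, the types of tasks they run, and the hardware and software configuration.

HP-UX has two execution levels:

User level: users interact with the operating system here, for example by running applications and system commands. User-level code reaches the kernel level through the system call interface.
Kernel level: the operating system runs its own functions here, mainly those that operate on hardware.
Under the operating system, user programs run as processes. A process can be in one of the following states: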

SRUN
SSLEEP
SZOMB
SIDL
SSTOP
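CPU scheduling

Once the data a process needs has been brought into memory, the process waits for the CPU scheduler to give it CPU time. In HP-UX, each process generally gets a fixed time slice in which to run; the slice is one tenth of a second (1/10 second).

Because HP-UX is a multitasking operating system, it needs a mechanism to control the order in which processes execute: the interrupt. The clock interrupt handler is the system software that services clock interrupts. Specifically, it gathers system and accounting statistics and performs context switching. System performance also depends on how often these interrupts occur.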

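Processes and priorities

Every process has its own priority;
Real-time priorities: -32 to 127; a process that is to run at a real-time priority must be set up with the rtprio command;
Timeshare system priorities: 128 to 177;
Timeshare user priorities: 178 to 251;
Priorities 252 to 255 are used by the system as virtual memory management priorities for process deactivation.
The initial priority of a timeshare process is a fixed value assigned by the system. Users can change the priority of a timeshare process by changing its nice value: as the process executes, its priority is lowered according to its nice value, and while it waits to execute, its priority is raised again according to its nice value. The system default nice value is 20.
In performance analysis we care not only about how long a process takes to complete, but also about where that time is spent and how much is spent in each place.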
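
When reading priority values from a process listing, it can help to map a number back to its band. The sketch below encodes only the ranges listed above; the band labels and the function name are made up for the example.

```python
# Priority bands as listed in the text: (lowest, highest, label).
PRIORITY_BANDS = [
    (-32, 127, "real-time"),
    (128, 177, "timeshare (system)"),
    (178, 251, "timeshare (user)"),
    (252, 255, "memory management"),
]


def classify_priority(priority):
    """Return the band label for a priority value, per the ranges above."""
    for low, high, label in PRIORITY_BANDS:
        if low <= priority <= high:
            return label
    raise ValueError(f"priority {priority} is outside the documented range")


# Assumed example values.
for value in (100, 154, 200, 253):
    print(value, classify_priority(value))
```
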

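Metrics for Measuring How Busy the CPU Is

Before deciding whether the system has enough CPU resources, we must know who is using the CPU, how much, and for how long. The following metrics are commonly used to measure how busy the CPU is:

1) CPU used by users

CPU running ordinary user processes
CPU running niced processes
CPU running real-time processes
2) CPU used by the system

System calls
I/O management: interrupts and drivers
Memory management: paging and swapping
Process management: context switches and process starts
3) WIO: the percentage of time the CPU is idle because processes are waiting for I/O, mainly block I/O, raw I/O, and VM paging/swapins;

4) CPU idle percentage, that is, idle time other than the WIO above;

5) Percentage of CPU spent on context switching (Context Switch CPU utilization)

6) nice

7) real-time

8) Run queue length, that is, the number of processes in the runnable state; what we really care about, though, is how long these processes spend waiting to be scheduled onto the CPU;

9) Load average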

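Signs That the CPU Has Become the Performance Bottleneck

The CPU is like the human brain: it carries out whatever tasks it is handed. Given too many tasks it cannot keep up, and its efficiency drops. Just as an illness shows typical symptoms, a CPU that has become the system's performance bottleneck shows typical symptoms too:

Slow response time
Zero percent idle CPU
High percent user CPU
High percent system CPU
Large run queue size sustained over time
Processes blocked on priority

Note that these symptoms do not necessarily mean the system is short of CPU. Some of them may well be caused by a shortage of another resource. When memory is short, for example, the CPU is kept busy with memory management; on the surface CPU utilization is 100% and the CPU even looks insufficient, but concluding from this alone that adding CPUs will solve the problem would be a serious mistake.

So, once again: conclusions should be drawn only after analyzing the system with different tools and from different angles, and even then experience plays an irreplaceable role.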

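Which Processes Are the Heavy CPU Consumers?

Not all processes in an operating system use the CPU in the same way. Some typically need more CPU time slices than others to get their work done. Typical heavy consumers of CPU are:

Process creation
Terminal character processes (MUX- and LAN-based)
Compute-intensive processes and real-time processes
X-terminals and X-servers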

Analyzing CPU Utilization with sar

Command forms for analyzing CPU utilization with sar:

#sar -u, which reports data collected periodically in the background by sa1;
#sar -u 5 100, which samples every 5 seconds, 100 times in total;
sar -u: Report CPU utilization (the default); portion of time running in one of several modes. On a multi-processor system, if the -M option is used together with the -u option, per-CPU utilization as well as the average CPU utilization of all the processors are reported. If the -M option is not used, only the average CPU utilization of all the processors is reported:

cpu: cpu number (only on a multi-processor system with the -M option);
%usr: user mode;
%sys: system mode;
%wio: idle with some process waiting for I/O (only block I/O, raw I/O, or VM pageins/swapins indicated);
%idle: otherwise idle;
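Analyzing the results

First look at the %idle column. If it is close to zero, look at the corresponding %wio column; if that value is greater than 7, the disks or other I/O may have a problem and further analysis is needed:

Use iostat to see how busy each disk is, for example #iostat -t 5 2, which samples every 5 seconds, 2 times in total;
Use sar -d to analyze the activity of each block device (disk, tape);
Use sar -b to analyze buffer cache activity;
Use sar -w to analyze the deactivation/reactivation and switching activities of the system;
If %idle is very small and the corresponding %wio is also very small, look at the %usr and %sys columns. A large %usr means user processes are taking a lot of CPU time; a large %sys means a lot of time is going to system management. Further analysis is needed:

Use GlancePlus to examine the process using the most CPU time individually and find out why it uses so much.
If %sys is large, use sar -c to break down the system calls and see what they are mainly doing. Also check for other bottlenecks: paging, for example, can drive %sys up, in which case use sar -q to check the run queue length, and GlancePlus or vmstat to check memory usage;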
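
The decision path above can be written down as a small function. This is a sketch rather than a definitive rule set: the %wio limit of 7 comes from the text, while the idle_floor of 5 is an assumed reading of "close to zero", and the function name, messages, and sample values are hypothetical.

```python
def triage_sar_u(usr, sys_pct, wio, idle, idle_floor=5.0, wio_limit=7.0):
    """Suggest the next step from one sar -u sample (all values in percent).

    wio_limit follows the text above; idle_floor is an assumed threshold
    for "close to zero" and should be tuned for the system at hand.
    """
    if idle > idle_floor:
        return "idle time available: CPU is not the obvious bottleneck"
    if wio > wio_limit:
        return "I/O wait is high: check iostat, sar -d, sar -b, sar -w"
    if usr >= sys_pct:
        return "user time dominates: inspect the top process in GlancePlus"
    return "system time dominates: run sar -c, then check paging with sar -q and vmstat"


# Hypothetical sample values, not real sar output.
print(triage_sar_u(usr=20, sys_pct=76, wio=2, idle=2))
```

A single sample proves little; applying the same check across a run such as sar -u 5 100 shows whether the condition is sustained.
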

Analyzing Run Queue Length with sar

Command forms for analyzing run queue length with sar:

#sar -q, which reports data collected periodically in the background by sa1;
#sar -q 5 100, which samples every 5 seconds, 100 times in total;
sar -q: Report average queue length while occupied, and percent of time occupied. On a multi-processor machine, if the -M option is used together with the -q option, the per-CPU run queue as well as the average run queue of all the processors are reported. If the -M option is not used, only the average run queue information of all the processors is reported:

cpu: cpu number (only on a multi-processor system with the -M option);
runq-sz: Average length of the run queue(s) of processes (in memory and runnable);
%runocc: The percentage of time the run queue(s) were occupied by processes (in memory and runnable);
swpq-sz: Average length of the swap queue of runnable processes (processes swapped out but ready to run);
%swpocc: The percentage of time the swap queue of runnable processes (processes swapped out but ready to run) was occupied.
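Analyzing the results:

The smaller these values, the better.

If runq-sz is greater than 4, or %swpocc is greater than 5, the CPU or memory may have a problem and further analysis is needed:

Use sar -u to analyze CPU usage;
Use sar -w to analyze the deactivation/reactivation and switching activities of the system;
GlancePlus can also be used;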
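
Applied to a series of samples, the two thresholds above pick out the intervals worth a closer look. The following sketch assumes the runq-sz and %swpocc values have already been extracted from sar -q output; the function name and the readings are hypothetical.

```python
def flag_run_queue_samples(samples, runq_limit=4.0, swpocc_limit=5.0):
    """Return the indexes of samples that exceed either threshold.

    Each sample is a (runq_sz, swpocc) pair from one sar -q interval.
    """
    return [
        index
        for index, (runq_sz, swpocc) in enumerate(samples)
        if runq_sz > runq_limit or swpocc > swpocc_limit
    ]


# Hypothetical readings, not real sar output.
readings = [(1.2, 0.0), (5.5, 0.0), (2.0, 12.0), (4.0, 5.0)]
print(flag_run_queue_samples(readings))
```

Values exactly at the limits are not flagged, matching the "greater than" wording of the rule.
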

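Analyzing System Calls with sar

Command forms for analyzing system calls with sar:

#sar -c, which reports data collected periodically in the background by sa1;
#sar -c 5 100, which samples every 5 seconds, 100 times in total;
sar -c: Report system calls: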

scall/s: Number of system calls of all types per second;
sread/s: Number of read() and/or readv() system calls per second;
swrit/s: Number of write() and/or writev() system calls per second;
fork/s: Number of fork() and/or vfork() system calls per second;
exec/s: Number of exec() system calls per second;
rchar/s: Number of characters transferred by read system calls (block devices only) per second;
wchar/s: Number of characters transferred by write system calls (block devices only) per second.
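Analyzing the results:

If the scall/s value is very high, the reason for so many system calls must be examined carefully.

Check the fork/s and exec/s columns to see whether the system is creating a large number of new processes.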
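
One way to read a sar -c sample is to express each category as a share of scall/s. The sketch below assumes the rates have already been read from the report; the function name and the sample figures are hypothetical.

```python
def syscall_shares(scall, sread, swrit, fork, exec_):
    """Return each category's share of all system calls, as fractions of scall."""
    if scall <= 0:
        raise ValueError("scall must be positive")
    named = {"read": sread, "write": swrit, "fork": fork, "exec": exec_}
    shares = {name: rate / scall for name, rate in named.items()}
    # Everything sar -c does not itemize ends up here.
    shares["other"] = 1.0 - sum(shares.values())
    return shares


# Hypothetical rates per second, not real sar output.
for name, share in syscall_shares(1000, 300, 200, 5, 5).items():
    print(f"{name}: {share:.1%}")
```
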

Measuring the Efficiency of a Command or Program with time

The time command measures how efficiently a command executes. Its syntax is:

time command

command is executed. Upon completion, time prints the elapsed time during the command, the time spent in the system, and the time spent executing the command. Times are reported in seconds.

Execution time can depend on the performance of the memory in which the program is running.

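When a process seems to perform poorly, the simplest first step is to use time to see how its execution time is distributed, and then analyze further with other tools.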
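
The same three figures can be collected from a script. The sketch below is not how time itself is implemented; it assumes a Unix-like system, where os.times() reports the CPU time of finished child processes, and the function name is made up for the example.

```python
import os
import subprocess
import sys
import time


def time_command(argv):
    """Run argv and return its elapsed, user, and system time in seconds."""
    before = os.times()
    start = time.perf_counter()
    completed = subprocess.run(argv, check=False)
    elapsed = time.perf_counter() - start
    after = os.times()
    return {
        "real": elapsed,
        # CPU time charged to child processes that have been waited for.
        "user": after.children_user - before.children_user,
        "sys": after.children_system - before.children_system,
        "returncode": completed.returncode,
    }


if __name__ == "__main__":
    # Time a short Python one-liner as a stand-in for a real command.
    print(time_command([sys.executable, "-c", "sum(range(200000))"]))
```

A large gap between real and user plus sys suggests the command spent most of its time waiting rather than computing.
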

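Finding the Processes That Use the Most CPU with top

The top command shows the processes using the most CPU. Its display also updates dynamically as the CPU usage of processes changes.

Its syntax is: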

top [-s time] [-d count] [-q] [-u] [-h] [-n number]

The options have the following meanings:

-s time: set the screen refresh interval to time seconds; the default is 5 seconds;
-d count: exit top after count screen refreshes;
-q: This option runs the top program at the same priority as if it is executed via a nice -20 command so that it will execute faster (see nice(1)). This can be very useful in discovering any system problem when the system is very sluggish. This option is accessible only to users who have appropriate privileges.
-u: User ID (uid) numbers are displayed instead of usernames. This improves execution speed by eliminating the additional time required to map uid numbers to user names.
-h: Hides the individual CPU state information for systems having multiple processors. Only the average CPU status will be displayed.
-n number: Show only number processes per screen. Note that this option is ignored if number is greater than the maximum number of processes that can be displayed per screen.
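While top is running, the following keys page through the display:

j: page forward;
k: page backward;
t: return to the first page;
Analyzing the results:

top gives a quick picture of current CPU usage; the processes using the most CPU deserve particular attention.

The RES column (the current size of the process resident in memory) shows how much memory each process occupies.

The NICE column shows whether a nice value is being used to adjust the process's share of the workload.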

Checking Overall System Status with uptime

uptime prints the current time, the length of time the system has been up, the number of users logged on to the system, and the average number of jobs in the run queue over the last 1, 5, and 15 minutes.

w is linked to uptime and prints the same output as uptime -w, displaying a summary of the current activity on the system.

Its syntax is:

uptime [-hlsuw] [user]

w [-hlsuw] [user]

The options have the following meanings:

-h: Suppress the first line and the heading line. This option should not be used with the -u option. This option assumes the use of the -w option to uptime.
-l: Use long output. This option assumes the use of the -w option to uptime.
-s: Use the short form of output for displaying terminal information. The terminal name is abbreviated; the login time and CPU times are suppressed.
-u: Print only the first line describing the overall state of the system. This is the default for the uptime command.
-w: Print a summary of the current activity on the system for each user. This is the default for the w command.
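
The load averages that uptime prints are easier to judge relative to the number of processors. The sketch below uses Python's os.getloadavg(), which is available on most Unix-like systems; dividing by the CPU count is a common rule of thumb rather than something uptime does, and the function name is made up for the example.

```python
import os


def load_per_cpu():
    """Return the 1, 5, and 15 minute load averages divided by the CPU count.

    Returns None where the platform cannot report a load average.
    """
    try:
        averages = os.getloadavg()
    except (AttributeError, OSError):
        return None
    cpus = os.cpu_count() or 1  # cpu_count() may return None
    return tuple(value / cpus for value in averages)


if __name__ == "__main__":
    print(load_per_cpu())
```

Values that stay well above 1.0 per CPU mean runnable jobs are regularly waiting for a processor.
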

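Analyzing CPU Utilization with GlancePlus

HP's GlancePlus tool can analyze in detail both overall activity and any individual process.

1) Analyzing overall CPU usage:

Start GlancePlus;
Press ? to open the online help screen;
Press c to open the CPU detail screen;
Press b to page backward and f to page forward;
The CPU Detail Screen shows how CPU time is distributed: how much is used by users, how much by the system, and so on.

2) Analyzing the CPU usage of a single process:

Start GlancePlus;
Press ? to open the online help screen;
Press g to open the process list screen;
Press s to open the process selection screen; the busiest process is usually offered as the default;
Enter the ID of the process you want to examine;
Press b to page backward and f to page forward;
When analyzing a single process, we usually watch the following values: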

CPU Usage;
User CPU;
System CPU;
Priority;
Logical and Physical Reads and Writes;
Total RSS/VSS;
blocked on (shown by pressing shift+>);

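Performance Tuning for CPU-Intensive Systems

1) Hardware-based approaches:

Upgrade to a faster CPU;
Upgrade to a larger cache;
Add more CPUs;
Distribute applications across multiple systems;
Use diskless nodes;
Add a floating-point processor;
2) Software-based approaches:

Run batch jobs outside peak hours;
Nice unimportant applications;
Use the rtprio command to help important applications;
Use plock to help important applications;
Turn off system accounting;
Consider using Taskbroker or DCE;
Optimize the application;
Consider using Process Resource Manager (PRM), although PRM is available only on HP-UX.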
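
As an illustration of the "nice unimportant applications" item, a script can start a low-priority job with a raised nice value. This is a sketch assuming a Unix-like system; the function name and the increment of 10 are arbitrary choices, and preexec_fn should be avoided in multithreaded programs.

```python
import os
import subprocess
import sys


def run_niced(argv, increment=10):
    """Run argv with its nice value raised by increment, lowering its priority."""
    return subprocess.run(
        argv,
        preexec_fn=lambda: os.nice(increment),  # runs in the child before exec
        capture_output=True,
        text=True,
        check=False,
    )


if __name__ == "__main__":
    # The child reports its own nice value; os.nice(0) reads it unchanged.
    result = run_niced([sys.executable, "-c", "import os; print(os.nice(0))"])
    print("child nice value:", result.stdout.strip())
```
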
