You can use the ps command to find the top CPU consumers on a
UNIX/Linux server. The command below lists the Oracle-related
processes using the most CPU, along with their process IDs:
$ ps -e -o pcpu,pid,user,tty,args | grep -i oracle | sort -n -k 1 -r | head
To refresh the output periodically, enclose the ps pipeline in
double quotes and pass it to the watch command:
$ watch "ps -ef | awk -F' ' '{print \$2}'"
$ watch "ps -e -o pcpu,pid,user,args | sort -k 1 -n -r | head -10"
If you've spent much time working in a UNIX environment you've
probably seen the load averages more than a few times.
load averages: 2.43, 2.96, 3.41
In his blog entry from late last year, Zach sums it up quite
nicely:
In short it is the average sum of the number of processes
waiting in the run-queue plus the number currently executing
over 1, 5, and 15 minute time periods.
The formula is a bit more complicated than that, but this
serves well as a functional definition. Zach provides a bit more
detail in his article and also points out Dr. Neil Gunther's
article on the topic, which goes into as much depth as anyone
could ever ask.
So what does this mean about your system?
Well, for a quick example let's consider the output below.
The load average of a system can typically be found by running
top or uptime, and users typically don't need any special
privileges for these commands.
load averages: 2.43, 2.96, 3.41
Here we see that the one-minute load average is 2.43, the
five-minute is 2.96, and the fifteen-minute is 3.41.
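To work with these three numbers in a script, it helps to pull them out of the line first. The following is a minimal sketch, not a standard tool: parse_load is a hypothetical helper name, and it assumes the line contains either "load average:" or "load averages:" followed by three numbers that use a period as the decimal separator.

```shell
#!/bin/sh
# parse_load (hypothetical helper): print the 1-, 5-, and 15-minute
# load averages found in a line of text, separated by spaces.
parse_load() {
    # Drop everything up to "load average:" or "load averages:",
    # then remove the commas between the values.
    printf '%s\n' "$1" | sed -e 's/.*load averages\{0,1\}: *//' -e 's/,//g'
}

# The sample line from the text above.
parse_load "load averages: 2.43, 2.96, 3.41"
```

On a live system the same helper could be fed the output of uptime, for example parse_load "$(uptime)", provided the output follows the format assumed above.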
Here are some conclusions we can draw from this.
- On average, over the past one minute there have been
2.43 processes running or waiting for a resource
- Overall the load is trending down, since the average
number of processes running or waiting in the past minute
(2.43) is lower than the average running or waiting over the
past 5 minutes (2.96) and 15 minutes (3.41)
- This system is busy, but we cannot conclude how busy
solely from load averages.
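The trend reading in the list above can be sketched as a simple comparison of the three values. This is only an illustration: load_trend is a hypothetical helper name, and the labels "falling", "rising", and "mixed" are chosen here, not taken from any standard tool.

```shell
#!/bin/sh
# load_trend (hypothetical helper): label the trend given the
# 1-, 5-, and 15-minute load averages as three arguments.
load_trend() {
    awk -v one="$1" -v five="$2" -v fifteen="$3" 'BEGIN {
        # Adding 0 forces numeric comparison of the arguments.
        one += 0; five += 0; fifteen += 0
        if (one < five && five < fifteen)      print "falling"
        else if (one > five && five > fifteen) print "rising"
        else                                   print "mixed"
    }'
}

# The sample values from the text: 2.43 < 2.96 < 3.41.
load_trend 2.43 2.96 3.41   # prints "falling"
```

As the last item in the list notes, the label says which way the load is moving and nothing about how busy the system is.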
It is important to mention here that the load average does
not take into account the number of CPUs. Another critical
detail is that processes could be waiting for any number of
things, including CPU, disk, or network.
So what we do know is that a system with a load average
significantly higher than its number of CPUs is probably pretty
busy, or bogged down by some bottleneck. Conversely, a system
with a load average significantly lower than its number of
CPUs is probably doing just fine.
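The comparison against the CPU count can be sketched as follows. This is a rough illustration under stated assumptions: compare_load_to_cpus is a hypothetical helper name, the CPU count is passed in explicitly, and the helper only reports which side of the CPU count the load falls on. How far apart the two numbers must be to count as "significantly" higher or lower is left to the reader.

```shell
#!/bin/sh
# compare_load_to_cpus (hypothetical helper): report whether a load
# average is above the CPU count. Arguments: load average, CPU count.
compare_load_to_cpus() {
    awk -v load="$1" -v cpus="$2" 'BEGIN {
        # Adding 0 forces numeric comparison of the arguments.
        if (load + 0 > cpus + 0) print "above CPU count"
        else                     print "at or below CPU count"
    }'
}

# The one-minute sample value from the text on two example machines.
compare_load_to_cpus 2.43 4   # prints "at or below CPU count"
compare_load_to_cpus 2.43 1   # prints "above CPU count"
```

On Linux the CPU count could be supplied with nproc or getconf _NPROCESSORS_ONLN; neither is guaranteed to be available on every UNIX system, so treat that part as an assumption to check on your platform.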