How to monitor CPU usage on a Linux VPS

I read a discussion on LowEndBox where a VPS user was suspended because his CPU usage exceeded the allowed load. So I set out to find how to check the load on my own VPS.

The commands top and htop both show current stats.

Install htop with:

Debian

apt-get install htop

CentOS 6

wget http://pkgs.repoforge.org/htop/htop-1.0.2-1.el6.rf.x86_64.rpm
rpm -Uvh htop-1.0.2-1.el6.rf.x86_64.rpm
rm htop-1.0.2-1.el6.rf.x86_64.rpm

Run it:

htop

[Screenshot: htop running on the VPS]

Note the stats at the top right, which show the load average. A load of 1 is roughly equivalent to full utilization of one core (I think), so on a 4-core machine a load of 4 would mean every core is busy. The CPU% column shows the load due to each process.
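To put the load average in context, it helps to divide it by the number of cores. A minimal sketch, assuming a Linux system with /proc/loadavg and the coreutils nproc command; load_per_core is a hypothetical helper name, not a standard tool:

```shell
# load_per_core (hypothetical helper): divide a load average by a core count.
load_per_core() {
    # $1 = load average, $2 = number of cores
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# The first field of /proc/loadavg is the 1-minute load average.
load=$(cut -d ' ' -f 1 /proc/loadavg)
# nproc prints the number of processing units available to this shell.
cores=$(nproc)
echo "1-min load: $load, cores: $cores, load per core: $(load_per_core "$load" "$cores")"
```

A result near or above 1.00 would suggest the cores are, on average, fully occupied.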

The mpstat command, part of the sysstat package, reports the CPU load overall and for each core.

To get a total load report:

mpstat
Linux 2.6.32-042stab072.10 (hermes)     04/13/13        _x86_64_        (8 CPU)

12:45:14     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
12:45:14     all    1.07    0.00    0.11    0.02    0.00    0.00    0.02    0.00   98.78

And to get it broken down per core:

mpstat -P ALL
Linux 2.6.32-042stab072.10 (hermes)     04/13/13        _x86_64_        (8 CPU)

12:46:03     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
12:46:03     all    1.07    0.00    0.11    0.02    0.00    0.00    0.02    0.00   98.79
12:46:03       0    1.91    0.00    0.17    0.01    0.00    0.00    0.03    0.00   97.89
12:46:03       1    1.30    0.00    0.16    0.01    0.00    0.00    0.02    0.00   98.51
12:46:03       2    0.49    0.00    0.04    0.00    0.00    0.00    0.01    0.00   99.45
12:46:03       3    0.58    0.00    0.05    0.04    0.00    0.00    0.02    0.00   99.31
12:46:03       4    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
12:46:03       5    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
12:46:03       6    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
12:46:03       7    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
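The per-core table can be condensed into a single non-idle figure per CPU. A minimal sketch, assuming the column layout shown above (a CPU column followed by percentage columns ending in %idle); busy_per_cpu is a hypothetical helper name. It sums the columns between CPU and %idle, so %iowait is counted as busy, and cores that report all zeros come out as 0.00:

```shell
# busy_per_cpu (hypothetical helper): reads `mpstat -P ALL` output on stdin
# and prints "<cpu> <non-idle %>" for every data row.
busy_per_cpu() {
    awk '
        # Header row: remember which columns hold the CPU label and %idle.
        /%idle/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "CPU")   c = i
                if ($i == "%idle") d = i
            }
            next
        }
        # Data rows: only after a header was seen, and only full-width rows.
        d && NF >= d {
            s = 0
            for (i = c + 1; i < d; i++) s += $i
            printf "%s %.2f\n", $c, s
        }
    '
}
```

Usage would be: mpstat -P ALL | busy_per_cpu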

You can also enable data collection by editing the file /etc/default/sysstat.

By default, it is:

egrep -v '^#|^$' /etc/default/sysstat
ENABLED="false"
SA1_OPTIONS="-S DISK"
SA2_OPTIONS=""

I used egrep to filter out blank and comment lines.

Change the ENABLED= value to true.
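The same change can be made without opening an editor. A minimal sketch, assuming GNU sed and a line reading exactly ENABLED="false"; enable_sysstat is a hypothetical helper name, and it leaves a .bak copy of the original file next to it:

```shell
# enable_sysstat (hypothetical helper): flip ENABLED="false" to ENABLED="true".
# $1 = config file to edit; defaults to the Debian location.
enable_sysstat() {
    conf="${1:-/etc/default/sysstat}"
    # -i.bak edits in place and keeps the original as <file>.bak (GNU sed).
    sed -i.bak 's/^ENABLED="false"/ENABLED="true"/' "$conf"
}
```

Run as root, calling enable_sysstat with no argument edits /etc/default/sysstat; other lines in the file are left untouched.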

Note: To run sar to monitor the system activity, you need to first enable it with:

dpkg-reconfigure sysstat

Otherwise sar reports this error:

#sar
Cannot open /var/log/sysstat/sa13: No such file or directory
Please check if data collecting is enabled in /etc/default/sysstat

Now start the service with:

/etc/init.d/sysstat start
Starting the system activity data collector: sadc.

Now to see the collected data:

sar -A
Linux 2.6.32-042stab072.10 (hermes)     04/13/13        _x86_64_        (8 CPU)

18:34:16          LINUX RESTART

That is only a fresh, empty log. I’ll need to repeat the command after some time to get useful information.

To check CPU load every 5 seconds and report 10 samples:

sar -u 5 10
Linux 2.6.32-042stab072.10 (hermes)     04/13/13        _x86_64_        (8 CPU)

18:41:24        CPU     %user     %nice   %system   %iowait    %steal     %idle
18:41:29        all      0.85      0.00      0.10      0.00      0.05     99.00
18:41:34        all      0.05      0.00      0.05      0.00      0.00     99.90
18:41:39        all      0.05      0.00      0.05      0.00      0.00     99.90
18:41:44        all      0.00      0.00      0.05      0.00      0.00     99.95
18:41:49        all      0.05      0.00      0.05      0.00      0.00     99.90
18:41:54        all      0.05      0.00      0.10      0.00      0.00     99.85
18:41:59        all      0.95      0.00      0.10      0.05      0.00     98.90
18:42:04        all      1.50      0.00      0.30      0.00      0.00     98.19
18:42:09        all      2.20      0.00      0.35      0.00      0.05     97.39
18:42:14        all      0.70      0.00      0.20      0.00      0.00     99.10
Average:        all      0.64      0.00      0.14      0.01      0.01     99.21

You can keep sampling even after logging off by running sar in the background with nohup and saving the samples to a file, then read the final stats later:

nohup sar -o /root/cpustats.log 10 6 >/dev/null 2>&1 &

That samples every 10 seconds, 6 times, so it reports for 60 seconds in total: sar takes its interval in seconds, not minutes. To report for 60 minutes with a check every 10 minutes, the arguments would be 600 6.
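The interval and count arithmetic can be sketched as follows, assuming the interval is given in seconds as sar expects. samples_needed is a hypothetical helper name, and the 3600 and 600 second values are only illustrative. The script prints the resulting command instead of running it:

```shell
# samples_needed (hypothetical helper): how many samples cover a duration.
# $1 = total duration in seconds, $2 = interval in seconds
samples_needed() {
    echo $(( $1 / $2 ))
}

duration=3600   # one hour, in seconds
interval=600    # ten minutes, in seconds
count=$(samples_needed "$duration" "$interval")

# Show the sar invocation these values lead to (printed, not executed).
echo "nohup sar -o /root/cpustats.log $interval $count >/dev/null 2>&1 &"
```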

You can of course create a cron job to sample this data every N hours.

Once the data is saved to this file, which incidentally is a binary file, you can get the report by:

sar -f /root/cpustats.log

To find the most resource consuming processes:

To get the top CPU-hogging processes (head -20 prints the header line plus the first 19 processes):

#ps -eo pcpu,pid,user,args | sort -k 1 -r | head -20
%CPU   PID USER     COMMAND
 1.0  6386 vu2003   /usr/bin/php5-cgi
 0.9  5930 vu2003   /usr/bin/php5-cgi
 0.6  5680 vu2003   /usr/bin/php5-cgi
 0.5  6412 vu2003   /usr/bin/php5-cgi
 0.5  6061 vu2003   /usr/bin/php5-cgi
 0.5  5913 vu2003   /usr/bin/php5-cgi
 0.4  5932 vu2003   /usr/bin/php5-cgi
 0.3  6063 vu2003   /usr/bin/php5-cgi
 0.1   633 mysql    /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306
 0.0  6677 root     head -20
 0.0  6676 root     sort -k 1 -r
 0.0  6675 root     ps -eo pcpu,pid,user,args
 0.0  6674 sshd     sshd: unknown [net]
 0.0  6673 root     sshd: unknown [priv]
 0.0  6408 vu2003   /usr/bin/php5-cgi
 0.0  5964 root     /usr/bin/perl /var/www/imscp/engine/imscp-apache-logger
 0.0  5934 postfix  pickup -l -t fifo -u -c
 0.0  5742 root     -bash
 0.0  5739 root     sshd: root@pts/0
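One caveat with the pipeline above: sort -k 1 -r compares the %CPU column as text, and where the header ends up depends on the locale. A minimal sketch of a numeric variant that keeps the header on top; sort_by_cpu is a hypothetical helper name, and it assumes %CPU is the first column, as in the ps format used here:

```shell
# sort_by_cpu (hypothetical helper): reads ps output on stdin, keeps the
# header line first and sorts the remaining lines by the first column.
sort_by_cpu() {
    # Pass the header line through untouched.
    IFS= read -r header && printf '%s\n' "$header"
    # -n compares field 1 as a number, -r puts the largest values first.
    LC_ALL=C sort -k1,1 -nr
}
```

Usage would be: ps -eo pcpu,pid,user,args | sort_by_cpu | head -21, which gives the header plus 20 processes. The procps version of ps can also sort by itself with --sort=-pcpu.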

To get all of them, one screenful at a time:

ps -eo pcpu,pid,user,args | sort -r -k1 | less

 


You are reading this post on Joel G Mathew’s tech blog. Joel's personal blog is the Eyrie, hosted here.