Today the BizHat.com server crashed without any apparent reason. After rebooting, the server load was found to be too high.
10:36:45 up 3:22, 1 user, load average: 13.44, 12.99, 11.64
161 processes: 143 sleeping, 14 running, 3 zombie, 1 stopped
CPU states: cpu user nice system irq softirq iowait idle
total 194.6% 0.0% 3.6% 0.0% 0.2% 0.0% 1.0%
cpu00 98.4% 0.0% 0.9% 0.1% 0.3% 0.0% 0.0%
cpu01 96.2% 0.0% 2.7% 0.0% 0.0% 0.0% 0.9%
Mem: 1000772k av, 964904k used, 35868k free, 0k shrd, 94668k buff
397532k active, 422624k inactive
Swap: 2096440k av, 48k used, 2096392k free 627512k cached
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
27616 root 25 0 1516 448 1352 R 18.1 0.0 15:05 1 sadc
31070 root 25 0 1516 448 1352 R 16.1 0.0 38:49 0 sadc
19843 root 25 0 1512 444 1352 R 16.1 0.0 18:48 1 sadc
15733 root 25 0 1520 452 1352 R 16.1 0.0 6:52 1 sadc
24981 root 25 0 1516 448 1352 R 15.9 0.0 49:31 0 sadc
22411 root 25 0 1516 448 1352 R 15.9 0.0 4:36 1 sadc
28613 root 25 0 1512 444 1352 R 15.9 0.0 2:47 1 sadc
2991 root 25 0 1520 452 1352 R 15.9 0.0 1:02 0 sadc
2221 root 25 0 1520 452 1352 R 15.1 0.0 11:58 0 sadc
13076 root 25 0 1516 448 1352 R 14.7 0.0 23:29 1 sadc
3267 root 25 0 1512 444 1352 R 14.1 0.0 29:34 0 sadc
11301 root 25 0 1516 448 1352 R 13.9 0.0 9:12 0 sadc
8205 bizhatc 15 0 0 0 0 Z 4.9 0.0 0:00 0 php
8126 root 17 0 5228 1216 4892 R 1.7 0.1 0:00 1 top
7420 root 17 0 2116 1032 1932 S 0.1 0.1 0:00 1 prm
1 root 16 0 1504 512 1356 S 0.0 0.0 0:00 1 init
2 root RT 0 0 0 0 SW 0.0 0.0 0:00 0 migration/0
3 root 34 19 0 0 0 SWN 0.0 0.0 0:00 0 ksoftirqd/0
4 root RT 0 0 0 0 SW 0.0 0.0 0:00 1 migration/1
5 root 34 19 0 0 0 SWN 0.0 0.0 0:00 1 ksoftirqd/1
6 root 5 -10 0 0 0 SW< 0.0 0.0 0:00 0 events/0
7 root 5 -10 0 0 0 SW< 0.0 0.0 0:00 1 events/1
8 root 5 -10 0 0 0 SW< 0.0 0.0 0:00 1 khelper
Too many sadc processes were running and taking up too much CPU time.
CPU overheating warnings also started to appear.
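A quick way to confirm how many copies of one command are running is to count them from ps output. The following is only a minimal sketch: count_procs is a hypothetical helper name, not part of sysstat or procps, and it assumes its input has one command name per line, as "ps -e -o comm=" prints.

```shell
#!/bin/sh
# count_procs NAME (hypothetical helper)
# Reads one command name per line on stdin (the format printed by
# "ps -e -o comm=") and prints how many lines match NAME exactly.
count_procs() {
    awk -v name="$1" '$1 == name { n++ } END { print n + 0 }'
}

# On a live system (not run here):
#   ps -e -o comm= | count_procs sadc
```

On a server in the state shown above, the count would be expected to match the number of sadc lines in the top listing.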
[root@server10 root]#
Message from syslogd@server10 at Tue Sep 21 11:01:37 2004 …
server10 kernel: CPU0: Temperature above threshold
Message from syslogd@server10 at Tue Sep 21 11:01:37 2004 …
server10 kernel: CPU1: Temperature above threshold
It turned out that sadc (the sysstat data collector) is run by cron every 10 minutes and is not an important process.
The cron job was removed by moving its file out of /etc/cron.d:
mkdir /root/backup
mv /etc/cron.d/sysstat /root/backup/
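The same two steps can be wrapped in a small function that refuses to run when the cron file is missing. This is a sketch under assumptions: disable_cron_job is a hypothetical helper name, and the paths passed to it are whatever applies on your system.

```shell
#!/bin/sh
# disable_cron_job FILE BACKUP_DIR (hypothetical helper)
# Moves a cron definition file into a backup directory so that cron
# no longer schedules it, while keeping a copy for later restoration.
disable_cron_job() {
    file=$1
    backup_dir=$2
    # Stop early if there is nothing to move.
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
    # -p: do not fail if the backup directory already exists.
    mkdir -p "$backup_dir" || return 1
    mv "$file" "$backup_dir"/
}

# Equivalent to the commands above (not run here):
#   disable_cron_job /etc/cron.d/sysstat /root/backup
```

Moving the file only prevents new runs. Copies of sadc that are already running are not affected, which is why a reboot was still needed here; stopping them by hand with something like "pkill -x sadc" would probably work as well, though that was not tried on this server.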
After rebooting, the server load returned to normal and the CPU heating warnings stopped.
load average: 0.48, 0.49, 0.21
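To judge whether a load average is too high, one rough rule of thumb is to compare the 1-minute figure with the number of CPUs (two on this server). The sketch below assumes that rule and an uptime-style line containing "load average:"; load_is_high is a hypothetical helper name.

```shell
#!/bin/sh
# load_is_high NCPUS (hypothetical helper)
# Reads an uptime-style line on stdin and succeeds (exit status 0)
# when the 1-minute load average is greater than NCPUS.
load_is_high() {
    awk -v ncpus="$1" '
        BEGIN { high = 0 }
        match($0, /load average: */) {
            # Everything after "load average:" is "1min, 5min, 15min".
            split(substr($0, RSTART + RLENGTH), loads, /, */)
            if (loads[1] + 0 > ncpus + 0) high = 1
        }
        END { exit !high }
    '
}

# On a live system (not run here):
#   uptime | load_is_high 2 && echo "load is high"
```

With the figures from this post, the load of 13.44 before the fix counts as high for two CPUs, and 0.48 afterwards does not.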