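There are many commands for checking CPU usage, such as top, vmstat, and sar, but they only show the load at the moment you run them, so after a CPU spike has passed you cannot tell which process caused it. I therefore wrote the script below, which emails the process list sorted by CPU usage, highest first, whenever CPU load occurs.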

<cpu-usage.sh>

#!/bin/bash
PREV_TOTAL=0
PREV_USER=0

while true; do
  # Read the aggregate CPU counters from the "cpu" line of /proc/stat.
  CPU=($(grep '^cpu ' /proc/stat))      # Get the total CPU statistics.
  unset 'CPU[0]'                        # Discard the "cpu" prefix.
  USER=${CPU[1]}                        # Get the user-mode CPU time.

  # Calculate the total CPU time.
  TOTAL=0
  for VALUE in "${CPU[@]}"; do
    let "TOTAL=$TOTAL+$VALUE"
  done

  # Calculate the CPU usage since we last checked.
  # On the first pass PREV_* are 0, so the result is the average since boot.
  let "DIFF_USER=$USER-$PREV_USER"
  let "DIFF_TOTAL=$TOTAL-$PREV_TOTAL"
  let "DIFF_USAGE=$DIFF_USER*100/$DIFF_TOTAL"
  #echo -en "CPU: $DIFF_USAGE%  \n"

  # Remember the total and user CPU times for the next check.
  PREV_TOTAL="$TOTAL"
  PREV_USER="$USER"

  if [ "$DIFF_USAGE" -ge 20 ]; then
    #Process=`/bin/ps -eo pmem,pcpu,rss,vsize,args | /bin/sort -k 2 -r | /usr/bin/head -n 20`
    # When the threshold is exceeded, mail the process list so that the
    # process responsible for the load can be identified.
    s_time=$(date '+%Y-%m-%d %H:%M:%S')  # Timestamp of this alert, not of script start.
    Process=`/bin/ps -eo ppid,user,bsdstart,bsdtime,%mem,%cpu,args --sort=-%cpu | /usr/bin/head -n 20`
    echo -e "$(hostname) as on $s_time \n $Process\n" | mail -s "Alert: CPU usage at $DIFF_USAGE%" root@test.com
  fi

  # Wait before checking again (300 seconds, so the check runs every 5 minutes).
  sleep 300
done
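
The script above counts only the user-time column of /proc/stat, so system time and I/O wait do not contribute to the percentage it reports. If overall busy time is of more interest, the same two-sample calculation can be done against the idle column instead. The following is a minimal sketch under that assumption; the function name cpu_busy_percent is hypothetical and is not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper: overall busy percentage between two "cpu" lines
# taken from /proc/stat at different times.
# Usage: cpu_busy_percent "<earlier cpu line>" "<later cpu line>"
cpu_busy_percent() {
  local -a prev=($1) cur=($2)   # Word-split each line into fields.
  local prev_total=0 cur_total=0 i n

  # Field 0 is the "cpu" label. Sum at most the first 8 counters
  # (user..steal); guest time is already included in user/nice.
  n=${#prev[@]}
  ((n > 9)) && n=9
  for ((i = 1; i < n; i++)); do
    prev_total=$((prev_total + prev[i]))
    cur_total=$((cur_total + cur[i]))
  done

  local diff_total=$((cur_total - prev_total))
  local diff_idle=$((cur[4] - prev[4]))   # Field 4 is the idle counter.

  # Guard against division by zero when the samples are identical.
  if ((diff_total <= 0)); then
    echo 0
    return
  fi
  echo $(((diff_total - diff_idle) * 100 / diff_total))
}

# Example: sample twice, one second apart.
#   first=$(grep '^cpu ' /proc/stat); sleep 1; second=$(grep '^cpu ' /proc/stat)
#   cpu_busy_percent "$first" "$second"
```

Taking the two samples as arguments keeps the arithmetic separate from the reading of /proc/stat, which makes it easy to check the calculation against known counter values.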

Once the script has been written as above, set it to run in the background.

# Run in the background (use bash: the script relies on arrays and let, which a plain sh may not support)
bash <root directory>/cpu-usage.sh &

# Check that the process is running
ps -ef | grep cpu
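
Note that grepping the ps output for "cpu" also matches the grep command itself and any unrelated process with "cpu" in its command line. One alternative, sketched here, is to record the PID at launch and test it with kill -0. The function name is_running and the pidfile path are hypothetical choices for illustration.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the PID stored in a pidfile is still alive.
# Usage: is_running <pidfile>
is_running() {
  local pidfile=$1 pid
  [ -r "$pidfile" ] || return 1          # No readable pidfile: not running.
  pid=$(cat "$pidfile")
  # kill -0 sends no signal; it only checks that the process exists.
  [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Example launch (paths are placeholders):
#   nohup bash /path/to/cpu-usage.sh >/dev/null 2>&1 &
#   echo $! > /tmp/cpu-usage.pid
#   is_running /tmp/cpu-usage.pid && echo "monitor is running"
```

Starting the script with nohup also keeps it alive after the launching terminal is closed, which a bare & does not always guarantee.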

# Install stress and run a load test

stress --cpu 8 --io 4 --vm 2 --vm-bytes 128M --timeout 400s

Site : http://weather.ou.edu/~apw/projects/stress/

1. Installation
# wget http://weather.ou.edu/~apw/projects/stress/stress-1.0.4.tar.gz
# tar xvfz stress-1.0.4.tar.gz
# cd stress-1.0.4
# ./configure
# make && make install


2. Running
# stress
`stress' imposes certain types of compute stress on your system

Usage: stress [OPTION [ARG]] ...
 -?, --help         show this help statement
     --version      show version statement
 -v, --verbose      be verbose
 -q, --quiet        be quiet
 -n, --dry-run      show what would have been done
 -t, --timeout N    timeout after N seconds
     --backoff N    wait factor of N microseconds before work starts
 -c, --cpu N        spawn N workers spinning on sqrt()
 -i, --io N         spawn N workers spinning on sync()
 -m, --vm N         spawn N workers spinning on malloc()/free()
     --vm-bytes B   malloc B bytes per vm worker (default is 256MB)
     --vm-stride B  touch a byte every B bytes (default is 4096)
     --vm-hang N    sleep N secs before free (default none, 0 is inf)
     --vm-keep      redirty memory instead of freeing and reallocating
 -d, --hdd N        spawn N workers spinning on write()/unlink()
     --hdd-bytes B  write B bytes per hdd worker (default is 1GB)

Example: stress --cpu 8 --io 4 --vm 2 --vm-bytes 128M --timeout 10s

Note: Numbers may be suffixed with s,m,h,d,y (time) or B,K,M,G (size).
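
When choosing a --timeout for the load test, it helps to compare it with the 300-second sleep in cpu-usage.sh: the test command earlier uses 400s, which outlasts one polling interval, while the 10s of the built-in example covers only a small part of one. The sketch below converts a duration with one of the time suffixes listed in the note to seconds so that the comparison can be scripted. The function name to_seconds is hypothetical, and two details are assumptions rather than documented behaviour: a bare number is treated as seconds, and y is treated as 365 days.

```shell
#!/bin/bash
# Hypothetical helper: convert a stress-style duration (e.g. 400s, 5m) to seconds.
# Usage: to_seconds <duration>
to_seconds() {
  local value=$1 number unit
  number=${value%[smhdy]}    # Strip one trailing unit letter, if present.
  unit=${value#"$number"}    # The stripped letter ("" when there was none).

  # Reject anything whose numeric part is empty or not all digits.
  case $number in
    ''|*[!0-9]*) echo "invalid duration: $value" >&2; return 1 ;;
  esac

  # 10# forces base 10 so that a leading zero is not read as octal.
  case $unit in
    ''|s) echo $((10#$number)) ;;             # Assumption: no suffix means seconds.
    m)    echo $((10#$number * 60)) ;;
    h)    echo $((10#$number * 3600)) ;;
    d)    echo $((10#$number * 86400)) ;;
    y)    echo $((10#$number * 31536000)) ;;  # Assumption: a 365-day year.
  esac
}

# Example: check that a test run outlasts the monitor's polling interval.
#   [ "$(to_seconds 400s)" -gt 300 ] && echo "covers a full interval"
```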

 

Posted by 박물지