Detect I/O activity on Linux

How to detect I/O activity on a Linux system.

You can use the top command and look at the CPU line: I/O activity shows up in the CPU wait field (wa):

Cpu(s): 59.1%us,  1.8%sy,  0.0%ni, 28.4%id, 10.3%wa,  0.0%hi,  0.4%si,  0.0%st

If your I/O wait percentage is greater than (1 / number of CPU cores) x 100, then your CPUs are spending a significant amount of time waiting for the disk subsystem to catch up.
In the output above, I/O wait is 10.3%. This server has 8 cores (checked with cat /proc/cpuinfo | grep ^processor | wc -l), so the threshold is 1/8 x 100 = 12.5%. The measured I/O wait is very close to that value, so disk access may be slowing the application down if it stays consistently around this threshold.
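
As a quick sanity check, here is a minimal sketch of that rule of thumb in shell. It assumes mpstat (from the sysstat package) is installed; the awk field index for %iowait may need adjusting for your sysstat version:

#!/bin/bash
# Rough check of the 1/(number of cores) rule of thumb for I/O wait.
cores=$(grep -c ^processor /proc/cpuinfo)
threshold=$(echo "scale=1; 100 / $cores" | bc)
# %iowait is usually the 6th field of mpstat's "Average" line; adjust if needed.
iowait=$(mpstat 1 3 | awk '/Average/ {print $6}')
echo "cores=$cores  threshold=${threshold}%  current iowait=${iowait}%"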

After that, you can use the iostat command:

$ iostat -xkd 1 
Linux 2.6.18-164.15.1.el5 (myserver)     07/06/2012
Device:         rrqm/s   wrqm/s   r/s   w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00    75.00  3.00 362.00    36.00  9436.00    51.90    44.66  138.31   2.55  93.00
sda1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda2              0.00    75.00  3.00 362.00    36.00  9436.00    51.90    44.66  138.31   2.55  93.00
dm-0              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-3              0.00     0.00  0.00 40.00     0.00   160.00     8.00     1.35   33.75   1.95   7.80
dm-4              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-5              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-6              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-7              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-8              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-9              0.00     0.00  0.00 24.00     0.00    96.00     8.00     2.52  104.88   7.92  19.00
dm-10             0.00     0.00  3.00 264.00    36.00  7292.00    54.89    46.78  198.82   3.48  93.00

Here we can see one physical disk (sda) and its two partitions, sda1 and sda2. Partition sda2 is currently 93% utilized (%util), mostly with writes (wkB/s).

The dm-x entries above are the LVM logical volumes. Use the following to map each dm-x device to its LV name:

$ sudo lvdisplay|awk  '/LV Name/{n=$3} /Block device/{d=$3; sub(".*:","dm-",d); print d,n;}'
dm-0 /dev/Volume00/root
dm-1 /dev/Volume00/var
dm-2 /dev/Volume00/var_log
dm-3 /dev/Volume00/opt
dm-4 /dev/Volume00/tmp
dm-5 /dev/Volume00/home
dm-6 /dev/Volume00/usr
dm-7 /dev/Volume00/swap
dm-8 /dev/Volume00/logs
dm-9 /dev/Volume00/data
dm-10 /dev/Volume00/mysql
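
Another way to see the same mapping (on LVM over device-mapper) is to query device-mapper directly; the minor number shown in the output corresponds to the dm-<minor> name reported by iostat:

$ sudo dmsetup ls
$ ls -l /dev/mapper/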

Now we know that most of the I/O activity on sda2 comes from the mysql logical volume (dm-10).

If we want to know which process is performing the I/O, we can enable block I/O logging in the kernel:

echo 1 > /proc/sys/vm/block_dump

This makes the kernel log a message for every I/O operation that takes place; the information can then be read with dmesg. Do not forget to reset this value to 0 when you're done!
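
For example, to switch the logging off again once you have collected enough data:

echo 0 > /proc/sys/vm/block_dump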

$ dmesg | egrep "READ|WRITE|dirtied" | egrep -o '([a-zA-Z]*)' | sort | uniq -c | sort -rn | head
   1040 mysqld
   1030 kjournald
    438 java
     50 sendmail
     47 pdflush
      7 procmail
      5 bash
      4 crond
      3 syslogd
$ dmesg | awk '/(READ|WRITE|dirtied)/ {process[$1]++} END {for (x in process) \
print process[x],x}' |sort -nr |awk '{print $2 " " $1}' |  head -n 10
mysqld(19332): 1114
kjournald(19768): 395
mysqld(2137): 241
kjournald(19774): 132
mysqld(31220): 124
pdflush(27124): 97
kjournald(19762): 89
kjournald(19756): 82
java(24661): 78
java(12055): 41

As expected, the process doing most of the I/O on this partition is mysqld.

Now, if we want to know which files this I/O is hitting, let's dump the inode numbers being accessed on dm-10 to a file:

dmesg -c | grep "java.*dm-10" | awk '{print $4}' | sort -n | uniq -c > /tmp/i

and then look up the corresponding file names on the partition:

for i in $(cat /tmp/i|awk '{print $2}'); do find /opt/mysql -inum  $i ; done

We look under /opt/mysql because that is where /dev/Volume00/mysql is mounted.
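
If the filesystem is ext3, debugfs can do this inode-to-path lookup faster than find; a rough sketch (adjust the device path to your logical volume):

for i in $(awk '{print $2}' /tmp/i); do
    debugfs -R "ncheck $i" /dev/Volume00/mysql 2>/dev/null
done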

So we have now looked at the I/O activity on the server, identified which disk and partition were loaded, found the process generating most of the load, and located the files it was hitting.


Nagios active plugin with metrics

The plugin

We use the standard check_http plugin and reformat its metrics output:

$ cat /usr/lib64/nagios/plugins/check_kk_http
#!/bin/bash
# Wrap check_http: add a space after the perfdata separator "|" and strip the
# unit letters and ";;;0..." threshold fields so only plain key=value pairs remain.
/usr/lib64/nagios/plugins/check_http "$@" | sed 's/|/| /' | sed 's/[a-zA-Z];;;0[.0-9]*//g'

Run the plugin

Run the plugin manually to check what Nagios and nagiosperfd will receive:

$ /usr/lib64/nagios/plugins/check_kk_http -I 10.76.98.97  -p 80  -u "/monitor.html" -r "monitorOK"
HTTP OK: HTTP/1.1 200 OK - 123386 bytes in 3.619 second response time | time=3.618789 size=123386

Define the service

Define the command and the service that uses it:

define command{
        command_name                    check_URL_category_serving
        command_line                    $USER1$/check_kk_http -I $HOSTADDRESS$ -p 80 -u "/monitor.html" -r "monitorOK" 

}
define service{
        use                             active-service
        hostgroup_name                  datanav_ws
        normal_check_interval           5
        max_check_attempts              3
        service_description             URL_category_serving
        check_command                   check_URL_category_serving
        register                        1
}

 

Define the Perl module for metrics

Define the Perl module that handles this plugin's metrics:

#!/usr/bin/perl -w
#
#
### Build from F5cpu-template.pm ###
package Nagios::Metrics::URL_category_serving;
use strict;
use base qw(Nagios::Metrics);

my $SERVICE = 'URL_category_serving';

my %keymap = (
    $SERVICE => {
        'keys' => {
            'time'       => ["$SERVICE", 'stats'],
            'size'       => ["$SERVICE", 'stats']
        } 
    }
);

my %rrdschema = (
    "$SERVICE" => {
        'stats' => { 
            'ds' => {
                'time'               => ['GAUGE', undef, 0, undef],
                'size'               => ['GAUGE', undef, 0, undef]
            } 
        } 
    }
);

sub new {
    my $class = shift;
    my $args = shift;
    my $self = Nagios::Metrics->new({
            'service'    => $SERVICE,
            'log'        => $args->{log},
            'keymap'    => \%keymap,
            'rrdschema'    => \%rrdschema,
            'namekey'       => '^([\w]+)\.*(.*)',
            preprocess    => sub {
                my ($instance, $value) = ($_[0] =~ /^([^=]*)=(.*)$/);
                $instance="$SERVICE.$instance";
                return "$SERVICE/$instance=$value";
            }
    });
    bless $self, $class;
    return $self;
}

1;

Nagios On-Demand Macros and cluster service

This post will address two points in Nagios :

  1. Cluster service
  2. On-Demand Macros

This short write-up should give you some pointers on how to monitor clusters of services.
Imagine we have a host running several services (service1, service2 ... serviceN), and some of these services monitor the same application in different ways.
Example:

service appli : raises an alert if appli is not running
service appli perf : raises an alert if the performance of appli is bad (> critical threshold)
service appli load : raises an alert if the memory or CPU load of appli is bad (> critical threshold)

The problem is that when the load is bad, performance is also bad, and two alerts are raised for a single problem.
We want to raise only one alert when one or more of these services are in a critical state. Let's create a cluster to do that!

Plugin check_cluster

We use the check_cluster plugin (http://nagiosplugins.org/man/check_cluster):

Usage: check_cluster (-s | -h) -d val1[,val2,...,valn] [-l label]
[-w threshold] [-c threshold] [-v] [--help]

Options:
 --extra-opts=[section][@file]
    Read additionnal options from ini file
 -s, --service
    Check service cluster status
 -h, --host
    Check host cluster status
 -l, --label=STRING
    Optional prepended text output (i.e. "Host cluster")
 -w, --warning=THRESHOLD
    Specifies the range of hosts or services in cluster that must be in a
    non-OK state in order to return a WARNING status level
 -c, --critical=THRESHOLD
    Specifies the range of hosts or services in cluster that must be in a
    non-OK state in order to return a CRITICAL status level
 -d, --data=LIST
    The status codes of the hosts or services in the cluster, separated by
    commas
 -v, --verbose
    Show details for command-line debugging (Nagios may truncate output)

Examples:

#  Will alert critical if there are more than 1 service in a non-OK state.

$ /usr/lib64/nagios/plugins/check_cluster -s -l "my service cluster" -c 1 -d 0,0,0,0
CLUSTER OK: my service cluster: 4 ok, 0 warning, 0 unknown, 0 critical

# Will alert critical if there are more than 1 service in a non-OK state.

$ /usr/lib64/nagios/plugins/check_cluster -s -l "my service cluster" -c 1 -d 0,1,0,0
CLUSTER OK: my service cluster: 3 ok, 1 warning, 0 unknown, 0 critical

# Will alert critical if there are 1 or more services in a non-OK state.

$ /usr/lib64/nagios/plugins/check_cluster -s -l "my service cluster" -c @1 -d 0,1,0,0
CLUSTER CRITICAL: my service cluster: 3 ok, 1 warning, 0 unknown, 0 critical

Threshold format

You can have a look at http://nagiosplug.sourceforge.net/developer-guidelines.html#THRESHOLDFORMAT for THRESHOLD format and examples:

This is the generalised format for ranges:
[@]start:end

Notes:

  1. start ≤ end
  2. start and “:” is not required if start=0
  3. if range is of format “start:” and end is not specified, assume end is infinity
  4. to specify negative infinity, use “~”
  5. alert is raised if metric is outside start and end range (inclusive of endpoints)
  6. if range starts with “@”, then alert if inside this range (inclusive of endpoints)

Note: Not all plugins are coded to expect ranges in this format yet. There will be some work in providing multiple metrics.
Table 3. Example ranges

Range definition    Generate an alert if x...
10                  < 0 or > 10 (outside the range {0 .. 10})
10:                 < 10 (outside {10 .. ∞})
~:10                > 10 (outside the range {-∞ .. 10})
10:20               < 10 or > 20 (outside the range {10 .. 20})
@10:20              ≥ 10 and ≤ 20 (inside the range {10 .. 20})
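
For example, assuming check_cluster on your system supports the full range syntax (as noted above, not all plugins do), a critical threshold of @2: would mean "go critical when two or more members of the cluster are non-OK":

$ /usr/lib64/nagios/plugins/check_cluster -s -l "my service cluster" -c @2: -d 0,2,2,0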


Monitoring Service Clusters

Let’s say you have three DNS servers that provide redundant services on your network. First off, you need to be monitoring each of these DNS servers separately before you can monitor them as a cluster. We’ll assume that you already have three separate services (all called “DNS Service”) associated with your DNS hosts (called “host1”, “host2” and “host3”).

In order to monitor the services as a cluster, you’ll need to create a new “cluster” service. However, before you do that, make sure you have a service cluster check command configured. Let’s assume that you have a command called check_service_cluster defined as follows:

define command {
        command_name    check_service_cluster
        command_line    /usr/lib64/nagios/plugins/check_cluster --service -l $ARG1$ -w $ARG2$ -c $ARG3$ -d $ARG4$
}

Now you'll need to create the "cluster" service and use the check_service_cluster command you just created as the cluster's check command. We'll have to pass the service states of all the services in the cluster as $ARG4$. This is where on-demand macros come in.

Nagios on-demand macros

If you would like to reference values for another host or service in a command (for which the command is not being run), you can use what are called “on-demand” macros. On-demand macros look like normal macros, except for the fact that they contain an identifier for the host or service from which they should get their value. Here’s the basic format for on-demand macros:

  • $HOSTMACRO:host_name$
  • $SERVICEMACRO:host_name:service_description$

Note that the macro name is separated from the host or service identifier by a colon (:). For on-demand service macros, the service identifier consists of both a host name and a service description; these are separated by a colon (:) as well.

Examples of on-demand host and service macros follow:

$HOSTDOWNTIME:myhost$
$SERVICESTATEID:novellserver:DS Database$

Let’s use the on-demand macros for our cluster.

The example below will generate a CRITICAL alert if 2 or more services in the cluster are in a non-OK state, and a WARNING alert if only 1 of the services is in a non-OK state. If all the individual service members of the cluster are OK, the cluster check will return an OK state as well.

define service {
        ...
        check_command   check_service_cluster!"DNS Cluster"!0!1!$SERVICESTATEID:host1:DNS Service$,$SERVICESTATEID:host2:DNS Service$,$SERVICESTATEID:host3:DNS Service$
        ...
}

Note that we are passing a comma-delimited list of on-demand service state macros as the $ARG4$ macro of the cluster check command. The on-demand macros expand to the current service state IDs (numerical values, rather than text strings) of the individual members of the cluster.

But imagine you have thousands of services or hosts to add to your cluster; filling $ARG4$ by hand quickly becomes impractical.

Can we use something like $SERVICESTATEID:$HOSTNAME$:service name$? Unfortunately, it doesn't work.

So we will have to use a trick:

instead, we will use an on-demand servicegroup macro of the form $SERVICESTATEID:servicegroup_name:delimiter$ and pass the service description as the delimiter. This macro returns the state of the service for every member of the service group, but joined by that delimiter rather than as the comma-separated list check_cluster expects (e.g. 0,1,0,2).

So we will create a new plugin script that replaces the service name with a comma to match the check_cluster format:

Create a new plugin called /usr/lib64/nagios/plugins/check_servicecluster

#!/bin/bash
# $3 is the macro output, in which the service description (my_service) is the delimiter;
# turn it back into the comma-separated list check_cluster expects.
/usr/lib64/nagios/plugins/check_cluster -s -l "$1" -c "$2" -d "$(echo "$3" | sed 's/my_service/,/g')"
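
You can test the wrapper by hand with the kind of string the macro produces (the states below are just an illustration for four services separated by the my_service delimiter):

$ /usr/lib64/nagios/plugins/check_servicecluster "Services Cluster description" 1 "0my_service2my_service0my_service0"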

Now define a Nagios command :

define command {
        command_name                    check_servicecluster
        # $ARG1$ = the critical threshold
        # $ARG2$ = the data list
        command_line                    /usr/lib64/nagios/plugins/check_servicecluster "Services Cluster description" $ARG1$ $ARG2$
}

Define a Nagios service group for the services you want to monitor and declare it in the service definition :

define servicegroup{
        servicegroup_name               my_servicegroup
        alias                           my_servicegroup
}
define service{
...
        hostgroup_name                  my_hostgroup       ; a hostgroup with several hosts
        service_description             my_service         ; the service name
        servicegroups                   my_servicegroup    ; declare this service part of my_servicegroup group
...
}

And now define the service cluster to monitor all my_service services :

define service {
         use                             active-service
         hostgroup_name                  my_hostgroup
         normal_check_interval           1
         service_description             my_servicecluster
         servicegroups                   my_servicegroup
         # declare the check command with critical threshold = 1 (>1)
         # to check the state of all service my_service declared in the service group my_servicegroup :
         check_command                   check_servicecluster!1!$SERVICESTATEID:my_servicegroup:my_service$
         contact_groups                  my_contactgroup
         register                        1
}

So now, even if we have hundreds of hosts in the service group, they are all included in the cluster with a single on-demand macro!


Nagios RRD dynamic template

What is the advantage of a regex-based graph?

It is useful if you have a dynamic plugin that sends metrics to Nagios and these metrics can differ from one server to another, or if you want metrics coming from different servers combined into the same graph.

The client sends its status and metrics to the Nagios server through NSCA.

Here is an example of the data sent:

myclientserver.com    hibiscus_mem    0    2011-06-14 13:05:01 (CRIT mem=70% cpu=90% / WARN mem=65% cpu=80%) Tomcat mem/cpu usage - OK:( [hibiscus mem=21.3% cpu=35.0%])| hibiscus.mem=21.3 hibiscus.cpu=35.0
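
Such a line is typically pushed from the client with send_nsca; a rough sketch (the Nagios host name and the send_nsca config path are placeholders):

echo -e "myclientserver.com\thibiscus_mem\t0\tTomcat mem/cpu usage - OK:( [hibiscus mem=21.3% cpu=35.0%])| hibiscus.mem=21.3 hibiscus.cpu=35.0" \
  | send_nsca -H nagios.example.com -c /etc/nagios/send_nsca.cfg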

We can see that the metrics sent are hibiscus.mem and hibiscus.cpu (hibiscus is the metric name, and mem and cpu are the metric keys). But if this plugin runs on another server, the metric name can be different (like metric2.mem, tomcat.mem, etc.), and we do not want to edit the graph for each metric or create a new graph for each new metric.

Let's create a regex-based graph for these dynamic metrics.

  • The base regular expression is based on the plugin name "hibiscus_mem". It is used to match the Base Regular Expression field below the Update button.
  • The selection is "$1", which means it is based on the server name. When the template is used, we will pick a server from a list and get a graph with all the metrics returned by the hibiscus_mem plugin for that server.
  • The red line shows the base regular expression, but with $1 replaced by the selected server.

  • Here the File RE is the same as the red line defined above.
  • The DS (data source) is "mem" or "cpu", the metric keys.
  • Element = UN,0,$,IF, which means use 0 when the metric's value is undefined (NaN). If you want to make sure you get a real value for the combined data source, and not NaN whenever one of the element values is NaN, use "UN,0,$,IF" in the "Element" field (see the rrdtool sketch after this list).
  • The Formula is empty here because we chose a specific server, so there is only one DS per metricname.metrickey. Make sure you put "Nothing" in the Type Color column.
  • Choose some colors; as we only have one DS we only pick one color, drawn with LINE1. Make sure the RRA CDEF is set to "Avg", otherwise it won't work.
  • The Label Format is "$1 mem" or "$1 cpu", which means the server name ($1) followed by the metric key.
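
To make the settings above more concrete, here is roughly what the template boils down to as an rrdtool graph command for one selected server and the mem data source. The RRD file path and the color are placeholders; the real file layout depends on how nagiosperfd stores its RRDs:

rrdtool graph hibiscus.png \
  --lower-limit 0 --upper-limit 100 --rigid \
  DEF:mem=/path/to/rrd/myclientserver.com/hibiscus_mem/stats.rrd:mem:AVERAGE \
  CDEF:mem0=mem,UN,0,mem,IF \
  LINE1:mem0#FF0000:"myclientserver.com mem"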

Graph options :

As we are dealing with percentage values, we set the minimum value to 0 and the maximum to 100 with rigid boundaries.

Save the graph template.

Here is the result:

If the plugin script later returns a new metric name from the same server (for example hibiscus2.mem and hibiscus2.cpu), it will automatically be added to the graph (provided the Perl module for nagiosperfd can handle it automatically too).

Here is how to write the nagiosperfd Perl module so that it is dynamic with respect to the metric name:

package Nagios::Metrics::hibiscus_mem;
use strict;
use base qw(Nagios::Metrics);

my $SERVICE = 'hibiscus_mem';
my $SERVICEID = 'hibiscus_mem';

#  SUBSYSTEM => keys => data source name (metric) => [ SUBSYSTEM, RRDFILE (, RRD DS NAME) ]
my %keymap = (
    "$SERVICEID" => {
        'keys' => {
            "mem" =>     [ "$SERVICEID", "stats" ],
            "cpu" =>     [ "$SERVICEID", "stats" ],
        },
   },
);
sub fixStats
{
    my $self = shift;
    my $name = shift;
    my $inst = shift;
    my $metrics = shift;
}

# SUBSYS->RRDFILE               DSTYPE      HB      MIN     MAX         AF
my %rrdschema = (
    "$SERVICEID" => {
        "stats" => {
             ds => {
                "mem" =>      ["GAUGE",   undef,  0,  undef ],
                "cpu" =>      ["GAUGE",   undef,  0,  undef ],
             },
         },
     },
);

sub new {
    my $class = shift;
    my $args = shift;
    my $self = Nagios::Metrics->new({
        log           => $args->{log},
        service       => $SERVICE,
        step          => 300,                 # how often are results submitted
        keymap        => \%keymap,
        rrdschema     => \%rrdschema,
        # This regname allows a check result coming as 'SERVICE-instance'
        # or 'SERVICE.instance'
        # to be recognized by this module
        # checks coming as 'SERVICE' work also just fine
        #regname       => '(?:' . $SERVICE . '|' . $SERVICE . '[\.\-].*)',
        'namekey'       => '^([\w]+)\.*(.*)',
        preprocess    => sub {
        my ($instance, $value) = ($_[0] =~ /^([^=]*)=(.*)$/);
          return "$SERVICEID/$instance=$value";
        },

    });
    bless $self, $class;
    return $self;
}

1;

As you can see, we never defined the metric name, only the metric keys. The name is extracted automatically by this code: preprocess splits each perfdata token (for example hibiscus.mem=21.3) into an instance and a value, and the namekey regex then separates the metric name (hibiscus) from the metric key (mem), which is matched against the keymap:

'namekey'       => '^([\w]+)\.*(.*)',
        preprocess    => sub {
        my ($instance, $value) = ($_[0] =~ /^([^=]*)=(.*)$/);
          return "$SERVICEID/$instance=$value";
        },

Fix increasing Seconds_Behind_Master issue

The problem

Sometimes, with InnoDB replication, the slave lags behind the master. This shows up as a Seconds_Behind_Master value that keeps increasing while the master is inserting or updating rows.

Example:

mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
 Master_Host: masterdb.somewhere.com
 Master_User: replication
 Master_Port: 3306
 Connect_Retry: 60
 Master_Log_File: mysql-bin.007781
 Read_Master_Log_Pos: 365235732
 Relay_Log_File: mysqld-3306-relay-bin.004386
 Relay_Log_Pos: 259967669
 Relay_Master_Log_File: mysql-bin.007781
 Slave_IO_Running: Yes
 Slave_SQL_Running: Yes
(...)
 Seconds_Behind_Master: 33
1 row in set (0.00 sec)

In this example we can see that the slave is 33 seconds behind the master, while neither the CPU nor the I/O subsystem is overloaded.

Why ?

With the default setting, the log buffer is written out to the log file at each transaction commit, and a flush-to-disk operation is performed on the log file. Writing and flushing for every single transaction is expensive; it is cheaper to perform these operations once for a batch of transactions.

How to do it ?

Just change the value of innodb_flush_log_at_trx_commit from 1 (the default) to 0. With a value of 0, the log buffer is written to the log file and flushed to disk roughly once per second instead of at every commit, so up to about one second of transactions can be lost if mysqld crashes:

mysql> SET GLOBAL innodb_flush_log_at_trx_commit = 0 ;

Also add it to my.cnf so that the setting is taken into account the next time the MySQL server restarts.
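
A minimal sketch of the corresponding my.cnf entry (add it to your existing [mysqld] section), plus a way to verify the running value:

[mysqld]
innodb_flush_log_at_trx_commit = 0

mysql> SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';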

More information :  http://dev.mysql.com/doc/refman/5.0/en/innodb-parameters.html#sysvar_innodb_flush_log_at_trx_commit

Synchronize a slave with the master without stopping the master

1. Stop the MySQL server on the slave:

/etc/init.d/mysqld stop

2. On the master, flush all tables and block write statements by executing:

mysql> flush tables with read lock;
Query OK, 0 rows affected (0.00 sec)

3. While the read lock placed by FLUSH TABLES WITH READ LOCK is in effect, read the value of the current binary log name and offset on the master:

mysql>  SHOW MASTER STATUS;
+--------------------+----------+--------------+------------------+
| File               | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+--------------------+----------+--------------+------------------+
| mysql-bin.000017   |   685677 |              |                  |
+--------------------+----------+--------------+------------------+

1 row in set (0.00 sec)

4. While the read lock is placed on the master, copy the data from the master to the slave (command to execute on the slave):

SLAVE> rsync --delete --exclude 'mysql' --exclude '*.info' -Wave ssh  root@MASTER:/opt/mysql/data/ /opt/mysql/data/

5. After you have taken the snapshot and recorded the log name and offset, you can re-enable write activity on the master:

mysql> unlock tables;
Query OK, 0 rows affected (0.00 sec)

6. Remove the *.info files on the slave (they hold the slave's old replication coordinates, which are no longer valid):

rm /opt/mysql/data/*info

7. Start the MySQL server on the slave:

/etc/init.d/mysqld start

8. Stop the slave threads on the slave server:

mysql> slave stop;
Query OK, 0 rows affected (0.00 sec)

9. Execute the following statement on the slave, replacing the option values with the actual values relevant to your system:

mysql> CHANGE MASTER TO
->     MASTER_HOST='master_host_name',
->     MASTER_USER='replication_user_name',
->     MASTER_PASSWORD='replication_password',
->     MASTER_LOG_FILE='mysql-bin.000017',
->     MASTER_LOG_POS=685677;

10. Start the slave threads on the slave server:

mysql> slave start;
Query OK, 0 rows affected (0.00 sec)

11. Check the slave thread status on the slave server with:

mysql> show slave status\G

Slave_IO_Running and Slave_SQL_Running must both be equal to Yes.
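
Once replication is running again, you can keep an eye on the catch-up with something like this rough sketch (assuming the mysql client can log in without prompting, e.g. via ~/.my.cnf):

watch -n 5 "mysql -e 'SHOW SLAVE STATUS\G' | grep Seconds_Behind_Master"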
