Module: check_mk
Branch: master
Commit: b334e12eef275766117a107ca300825d603480ba
URL: http://git.mathias-kettner.de/git/?p=check_mk.git;a=commit;h=b334e12eef2757…
Author: Lars Michelsen <lm(a)mathias-kettner.de>
Date: Thu Jan 31 16:50:37 2013 +0100
Fixed small problems with mkeventd initscript
---
mkeventd/omd/mkeventd.init | 29 ++++++++++++++++++-----------
1 files changed, 18 insertions(+), 11 deletions(-)
diff --git a/mkeventd/omd/mkeventd.init b/mkeventd/omd/mkeventd.init
index 4e6237a..6b402b3 100755
--- a/mkeventd/omd/mkeventd.init
+++ b/mkeventd/omd/mkeventd.init
@@ -32,17 +32,24 @@ case "$1" in
echo 'Not running.'
else
echo -n "killing $THE_PID..."
- kill $THE_PID
- N=0
- while [ -e "$PIDFILE" ] ; do
- sleep 0.5
- echo -n .
- N=$((N + 1))
- if [ $N -gt 20 ] ; then
- echo "PID file did not vanish."
- exit 1
- fi
- done
+ kill $THE_PID 2>/dev/null
+ if [ $? -eq 0 ]; then
+ # Only wait for pidfile removal when the signal could be sent
+ N=0
+ while [ -e "$PIDFILE" ] ; do
+ sleep 0.5
+ echo -n .
+ N=$((N + 1))
+ if [ $N -gt 20 ] ; then
+ echo "PID file did not vanish."
+ exit 1
+ fi
+ done
+ else
+ # Remove the stale pidfile to have a clean state after this
+ rm -f "$PIDFILE"
+ fi
+ echo 'OK'
fi
;;
restart)
Module: check_mk
Branch: master
Commit: a64413c4358b2a8c3a62b4bc7e424877e55426c4
URL: http://git.mathias-kettner.de/git/?p=check_mk.git;a=commit;h=a64413c4358b2a…
Author: Lars Michelsen <lm(a)mathias-kettner.de>
Date: Thu Jan 31 16:38:03 2013 +0100
Updated bug entries #0908, #0909
---
.bugs/908 | 11 +++++++++++
.bugs/909 | 14 ++++++++++++++
2 files changed, 25 insertions(+), 0 deletions(-)
diff --git a/.bugs/908 b/.bugs/908
new file mode 100644
index 0000000..9bc9191
--- /dev/null
+++ b/.bugs/908
@@ -0,0 +1,11 @@
+Title: mkeventd init script should detect "already running" on start
+Component: ec
+State: open
+Date: 2013-01-31 16:32:50
+Targetversion: 1.2.2
+Class: cleanup
+
+When the mkeventd is started using the init-script and is already running,
+the init script should detect that and print something like:
+
+Starting mkeventd...Already running.
diff --git a/.bugs/909 b/.bugs/909
new file mode 100644
index 0000000..13b895c
--- /dev/null
+++ b/.bugs/909
@@ -0,0 +1,14 @@
+Title: mkeventd init script does not handle stale pidfiles correctly
+Component: ec
+State: open
+Date: 2013-01-31 16:36:01
+Targetversion: 1.2.2
+Class: bug
+
+When running etc/init.d/mkeventd stop with a pidfile present but no running
+process, the "kill $THE_PID" call results in an error message which should
+be suppressed. Instead it should show "OK" and delete the stale pidfile.
+
+The status command also has this error message which should be suppressed.
+
+Restart is completely broken in this case.
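
The stale-pidfile handling described in bug #0909 can be sketched in Python.
This is only an illustration and not part of the commit: the helper names
process_alive and cleanup_stale_pidfile are hypothetical, and the real init
script is a shell script. The sketch relies on the fact that sending signal 0
checks whether a process exists without delivering anything to it.

```python
import errno
import os


def process_alive(pid):
    """Return True if a process with this PID exists (hypothetical helper)."""
    try:
        os.kill(pid, 0)  # signal 0: existence and permission check only
    except OSError as e:
        if e.errno == errno.ESRCH:
            return False  # no such process
        if e.errno == errno.EPERM:
            return True  # process exists but belongs to another user
        raise
    return True


def cleanup_stale_pidfile(pidfile):
    """Remove the pidfile if it does not point to a running process.

    Returns one of "not running", "stale" or "running".
    """
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (IOError, OSError):
        return "not running"  # no pidfile at all
    except ValueError:
        os.remove(pidfile)  # unreadable content is treated as stale
        return "stale"
    if process_alive(pid):
        return "running"
    os.remove(pidfile)  # leave a clean state behind
    return "stale"
```

A stop action built on this would print "OK" for both "stale" and
"not running" instead of surfacing the error message from kill.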
Module: check_mk
Branch: master
Commit: 596d07e38774ee6f6f15a4eddff52ba292e12624
URL: http://git.mathias-kettner.de/git/?p=check_mk.git;a=commit;h=596d07e38774ee…
Author: Mathias Kettner <mk(a)mathias-kettner.de>
Date: Thu Jan 31 16:34:51 2013 +0100
lnx_bonding: new check for checking bonding interfaces on Linux
---
ChangeLog | 1 +
agents/check_mk_agent.linux | 8 ++
checkman/lnx_bonding | 37 ++++++++
checks/lnx_bonding | 167 ++++++++++++++++++++++++++++++++++
web/plugins/wato/check_parameters.py | 23 +++++
5 files changed, 236 insertions(+), 0 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index 8834d50..edf97cf 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -25,6 +25,7 @@
* livestatus_status: new check for monitoring performance of monitoring
* FIX: diskstat.include: fix computation of queue length on windows
(thanks to K.H. Fiebig)
+ * lnx_bonding: new check for checking bonding interfaces on Linux
Multisite:
* Added comment painter to notification related views
diff --git a/agents/check_mk_agent.linux b/agents/check_mk_agent.linux
index 80e883c..decc537 100755
--- a/agents/check_mk_agent.linux
+++ b/agents/check_mk_agent.linux
@@ -161,6 +161,14 @@ then
done
fi
+
+# Current state of bonding interfaces
+if [ -e /proc/net/bonding ] ; then
+ echo '<<<lnx_bonding:sep(58)>>>'
+ pushd /proc/net/bonding > /dev/null ; head -v -n 1000 * ; popd
+fi
+
+
# Number of TCP connections in the various states
echo '<<<tcp_conn_stats>>>'
netstat -nt | awk ' /^tcp/ { c[$6]++; } END { for (x in c) { print x, c[x]; } }'
diff --git a/checkman/lnx_bonding b/checkman/lnx_bonding
new file mode 100644
index 0000000..a610fb9
--- /dev/null
+++ b/checkman/lnx_bonding
@@ -0,0 +1,37 @@
+title: Check state of Bonding network interface on Linux
+agents: linux
+author: Mathias Kettner <mk(a)mathias-kettner.de>
+license: GPL
+distribution: check_mk
+description:
+ This check checks the current state of a Linux bonding interface.
+ If the total bonding state of the interface is down, then the
+ check gets CRIT. If at least one slave interface is down
+ or the currently active slave is not the expected one, then the
+ check gets WARN.
+
+item:
+ The name of the bonding interface, e.g. {bond0}.
+
+examples:
+ # On linux hosts the lowest numbered eth interface should be
+ # the active one.
+ check_parameters = [
+ ( { "expect_active" : "lowest" }, [ 'linux'], ALL_HOSTS, ALL_SERVICES )
+ ]
+
+inventory:
+ One check per {active} bonding interface will be created.
+
+[parameters]
+parameters (dict): Currently two parameters are available in this dict:
+
+ {"primary"}: The assumed primary interface. This is for agents that do
+ not provide this information. In that case this is set automatically
+ by the inventory to the currently active interface.
+
+ {"expect_active"}: Which interface should be expected active. This is one
+ of the following strings: {"primary"}: The interface set by the
+ parameter {"primary"}, {"lowest"}: The interface sorting lowest
+ alphabetically, {"ignore"}: Ignore which interface is active.
+
diff --git a/checks/lnx_bonding b/checks/lnx_bonding
new file mode 100644
index 0000000..b252300
--- /dev/null
+++ b/checks/lnx_bonding
@@ -0,0 +1,167 @@
+#!/usr/bin/python
+# -*- encoding: utf-8; py-indent-offset: 4 -*-
+# +------------------------------------------------------------------+
+# | ____ _ _ __ __ _ __ |
+# | / ___| |__ ___ ___| | __ | \/ | |/ / |
+# | | | | '_ \ / _ \/ __| |/ / | |\/| | ' / |
+# | | |___| | | | __/ (__| < | | | | . \ |
+# | \____|_| |_|\___|\___|_|\_\___|_| |_|_|\_\ |
+# | |
+# | Copyright Mathias Kettner 2013 mk(a)mathias-kettner.de |
+# +------------------------------------------------------------------+
+#
+# This file is part of Check_MK.
+# The official homepage is at http://mathias-kettner.de/check_mk.
+#
+# check_mk is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation in version 2. check_mk is distributed
+# in the hope that it will be useful, but WITHOUT ANY WARRANTY; with-
+# out even the implied warranty of MERCHANTABILITY or FITNESS FOR A
+# PARTICULAR PURPOSE. See the GNU General Public License for more de-
+# tails. You should have received a copy of the GNU General Public
+# License along with GNU Make; see the file COPYING. If not, write
+# to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor,
+# Boston, MA 02110-1301 USA.
+
+# <<<lnx_bonding:sep(58)>>>
+# ==> bond0 <==
+# Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
+#
+# Bonding Mode: load balancing (round-robin)
+# MII Status: down
+# MII Polling Interval (ms): 0
+# Up Delay (ms): 0
+# Down Delay (ms): 0
+#
+# ==> bond1 <==
+# Ethernet Channel Bonding Driver: v3.2.5 (March 21, 2008)
+#
+# Bonding Mode: fault-tolerance (active-backup)
+# Primary Slave: eth0
+# Currently Active Slave: eth0
+# MII Status: up
+# MII Polling Interval (ms): 100
+# Up Delay (ms): 0
+# Down Delay (ms): 0
+#
+# Slave Interface: eth4
+# MII Status: up
+# Link Failure Count: 0
+# Permanent HW addr: 00:1b:21:49:d4:e4
+#
+# Slave Interface: eth0
+# MII Status: up
+# Link Failure Count: 1
+# Permanent HW addr: 00:26:b9:7d:89:2e
+
+def parse_lnx_bonding(info):
+ lines = iter(info)
+ bonds = {}
+
+ # Skip header with bonding version
+ try:
+ bond = lines.next()[0].split()[1]
+ bonds[bond] = {}
+ while True:
+ # ==> bond0 <==
+ lines.next() # Skip Channel Bonding Driver
+
+ # Parse global part
+ main = {}
+ bonds[bond]["main"] = main
+ while True:
+ line = lines.next()
+ main[line[0].strip()] = line[1].strip()
+ if line[0].strip() == "Down Delay (ms)":
+ break
+
+ # Parse interfaces
+ interfaces = {}
+ bonds[bond]["interfaces"] = interfaces
+ while True:
+ line = lines.next()
+ if line[0].startswith("==>"):
+ bond = line[0].split()[1]
+ bonds[bond] = {}
+ break
+ elif line[0].strip() == "Slave Interface":
+ eth = line[1].strip()
+ interfaces[eth] = {}
+ elif line:
+ interfaces[eth][line[0].strip()] = ":".join(line[1:]).strip()
+
+ except StopIteration:
+ pass
+
+ # Now convert to generic dict, also used by other bonding checks
+ converted = {}
+ for bond, status in bonds.items():
+ interfaces = {}
+ for eth, ethstatus in status["interfaces"].items():
+ interfaces[eth] = {
+ "status" : ethstatus["MII Status"],
+ "hwaddr" : ethstatus.get("Permanent HW addr", ""),
+ "failures" : int(ethstatus["Link Failure Count"]),
+ }
+ converted[bond] = {
+ "status" : status["main"]["MII Status"],
+ "primary" : status["main"].get("Primary Slave"),
+ "active" : status["main"].get("Currently Active Slave"),
+ "mode" : status["main"]["Bonding Mode"].split('(')[0].strip(),
+ "interfaces" : interfaces,
+ }
+ return converted
+
+
+def inventory_lnx_bonding(info):
+ parsed = parse_lnx_bonding(info)
+ return [ (bond, { "primary" : status["primary"]}) for (bond, status) in parsed.items()
+ if status["status"] == "up" ]
+
+
+def check_lnx_bonding(item, params, info):
+ parsed = parse_lnx_bonding(info)
+ if item not in parsed:
+ return (3, "UNKNOWN - no such bonding interface")
+ status = parsed[item]
+ if status["status"] != "up":
+ return 2, "CRIT - interface is " + status["status"]
+
+ infos = []
+ state = 0
+ for eth, slave in status["interfaces"].items():
+ infos.append("%s/%s %s" % (eth, slave["hwaddr"], slave["status"]))
+ if slave["status"] != 'up':
+ state = 1
+ infos[-1] += "(!)"
+
+ primary = status.get("primary") or params.get("primary")
+ if primary:
+ infos.append("primary: " + primary)
+ active = status["active"]
+ if active:
+ infos.append("active: " + active)
+
+ expect = params.get("expect_active", "ignore")
+ if expect in [ "primary", "lowest" ]:
+ if expect == "primary":
+ expected_active = primary
+ else: # "lowest"
+ slaves = status["interfaces"].keys()
+ slaves.sort()
+ expected_active = slaves[0]
+ if expected_active != active:
+ infos[-1] += "(!)"
+ infos.append("expected is %s" % expected_active)
+ state = 1
+
+ return state, nagios_state_names[state] + " - " + ", ".join(infos)
+
+
+check_info['lnx_bonding'] = {
+ "check_function" : check_lnx_bonding,
+ "inventory_function" : inventory_lnx_bonding,
+ "service_description" : "Bonding Interface %s",
+ "group" : "bonding",
+}
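
The "expect_active" selection in the check above can be isolated as a small
standalone sketch. The function name expected_active_slave is hypothetical and
does not appear in the commit; it only mirrors the three documented values
"primary", "lowest" and "ignore". Note that "lowest" is a plain string sort,
so eth10 sorts before eth2.

```python
def expected_active_slave(expect, primary, slaves):
    """Return the slave expected to be active, or None if nothing is expected.

    expect  -- one of "primary", "lowest", "ignore"
    primary -- name of the primary slave, may be None
    slaves  -- iterable of slave interface names
    """
    if expect == "primary":
        return primary
    if expect == "lowest":
        ordered = sorted(slaves)  # alphabetical, not numeric
        return ordered[0] if ordered else None
    return None  # "ignore" or any unknown value


if __name__ == "__main__":
    print(expected_active_slave("lowest", "eth4", ["eth4", "eth0"]))
    print(expected_active_slave("primary", "eth4", ["eth4", "eth0"]))
```

With the sample agent output for bond1 (slaves eth4 and eth0), "lowest"
expects eth0 to be active.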
diff --git a/web/plugins/wato/check_parameters.py b/web/plugins/wato/check_parameters.py
index ccd96d2..e67bf33 100644
--- a/web/plugins/wato/check_parameters.py
+++ b/web/plugins/wato/check_parameters.py
@@ -1027,6 +1027,29 @@ checkgroups.append((
"dict")
)
+register_check_parameters(
+ subgroup_networking,
+ "bonding",
+ _("Status of Linux bonding interfaces"),
+ Dictionary(
+ elements = [
+ ( "expect_active",
+ DropdownChoice(
+ title = _("Warn on unexpected active interface"),
+ choices = [
+ ( "ignore", _("ignore which one is active") ),
+ ( "primary", _("require primary interface to be active") ),
+ ( "lowest", _("require interface that sorts lowest alphabetically") ),
+ ]
+ )
+ ),
+ ]
+ ),
+ TextAscii(
+ title = _("Name of the bonding interface"),
+ ),
+ "dict")
+
checkgroups.append((
subgroup_networking,
"if",
Module: check_mk
Branch: master
Commit: 0802cbb7a10bfc2d34ddc7494130bb40df853b1a
URL: http://git.mathias-kettner.de/git/?p=check_mk.git;a=commit;h=0802cbb7a10bfc…
Author: Mathias Kettner <mk(a)mathias-kettner.de>
Date: Thu Jan 31 14:58:45 2013 +0100
EC: Speed up rule matches in some special cases by factor of 100 and more
---
ChangeLog | 1 +
mkeventd/bin/mkeventd | 5 +++++
2 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index 16be061..8834d50 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -44,6 +44,7 @@
Event Console:
* Added UNIX socket for sending events to the EC
+ * Speed up rule matches in some special cases by factor of 100 and more
1.2.1i5:
Core:
diff --git a/mkeventd/bin/mkeventd b/mkeventd/bin/mkeventd
index 72256a2..fc05d77 100755
--- a/mkeventd/bin/mkeventd
+++ b/mkeventd/bin/mkeventd
@@ -1122,6 +1122,11 @@ class EventServer:
for key in [ "match", "match_ok", "match_host", "match_application" ]:
if key in rule:
value = rule[key].strip()
+ # Remove leading .* from regex. It is redundant and
+ # dramatically degrades performance when doing an infix search.
+ if key in [ "match", "match_ok" ]:
+ while value.startswith(".*") and not value.startswith(".*?"):
+ value = value[2:]
if not value:
del rule[key]
continue
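
The prefix stripping added in this commit can be tried out in isolation. The
following is a minimal sketch, not the mkeventd code itself: the function name
strip_leading_dotstar is hypothetical. It assumes the pattern is later used
with an infix search such as re.search() on single-line text, where a leading
greedy ".*" does not change whether a match is found. Which text a capture
group picks up can still differ, so the sketch only compares match/no-match.

```python
import re


def strip_leading_dotstar(pattern):
    """Drop redundant leading greedy '.*' from an infix-search pattern."""
    value = pattern.strip()
    # ".*?" is left alone, as in the commit
    while value.startswith(".*") and not value.startswith(".*?"):
        value = value[2:]
    return value


if __name__ == "__main__":
    text = "Jan 31 14:58:45 host kernel: disk full on /var"
    for raw in [".*disk full", ".*.*kernel", ".*?host", "no match here"]:
        cleaned = strip_leading_dotstar(raw)
        same = bool(re.search(raw, text)) == bool(re.search(cleaned, text))
        print("%r -> %r (same result: %s)" % (raw, cleaned, same))
```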
Module: check_mk
Branch: master
Commit: d9ec2bf3130412b1d5984ecf54b4b402d6bc8db4
URL: http://git.mathias-kettner.de/git/?p=check_mk.git;a=commit;h=d9ec2bf3130412…
Author: Lars Michelsen <lm(a)mathias-kettner.de>
Date: Thu Jan 31 11:28:03 2013 +0100
Updated bug entries #0906
---
.bugs/906 | 15 +++++++++++++++
1 files changed, 15 insertions(+), 0 deletions(-)
diff --git a/.bugs/906 b/.bugs/906
new file mode 100644
index 0000000..c788695
--- /dev/null
+++ b/.bugs/906
@@ -0,0 +1,15 @@
+Title: logwatch.ec: Need some fallback mechanism if processing takes too long
+Component: checks
+State: open
+Date: 2013-01-31 11:23:38
+Targetversion: 1.2.2
+Class: bug
+
+Nagios kills Check_MK when it takes too long. When e.g. logwatch.ec is taking
+more than 60 seconds, the script is terminated and all unprocessed messages
+are lost. This results in data-loss and must be prevented!
+
+This does not only depend on the number of messages to be processed at once:
+when the event console is hanging or taking too long for any reason,
+the pending messages are lost as well. Just limiting the number of messages is
+no real solution.