Module: check_mk
Branch: master
Commit: cb886e22d7d678507acdb7084c615adcb806e1ed
URL: http://git.mathias-kettner.de/git/?p=check_mk.git;a=commit;h=cb886e22d7d678…
Author: Mathias Kettner <mk(a)mathias-kettner.de>
Date: Sun Jan 29 22:30:11 2012 +0100
mounts: ignore change in commit= option
---
.bugs/476 | 13 ++++++++++---
ChangeLog | 2 ++
checkman/mounts | 20 ++++++++++++--------
checks/mounts | 33 +++++++++++++++++++++++++++------
4 files changed, 51 insertions(+), 17 deletions(-)
diff --git a/.bugs/476 b/.bugs/476
index 1e88e12..22cbf79 100644
--- a/.bugs/476
+++ b/.bugs/476
@@ -1,13 +1,20 @@
-Title: mounts Checks should only check ro flag
+Title: mounts check should only check ro flag
Component: checks
+State: done
+Class: bug
Benefit: 1
-State: open
Cost: 1
Date: 2011-12-06 08:52:27
Targetversion: 1.2.0
-Class: bug
Currently on my laptop the mounts check is flapping around because
the kernel seems to add a mount option with "commit" and change
it all the time. Basically we'd need to just monitor the ro
status.
+
+Another idea is to consolidate the check into one single
+check. This is difficult, however, because we need to
+save one per-filesystem parameter: the expected options.
+
+2012-01-29 22:29:13: changed state open -> done
+commit is now excluded.
diff --git a/ChangeLog b/ChangeLog
index 2f96498..9f88f2d 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -42,6 +42,8 @@
link speed (fixes speed of 10GBit/s and 20GBit/s ports, thanks Marco Poet)
* cmctc.temp: service has been renamed from "CMC Temperature %s" to just
"Temperature %s", in order to be consistent with the other checks.
+ * mounts: exclude changes of the commit option (might change on laptops),
+ make only switch to ro critical, other changes warning.
Multisite:
* Improve transaction handling and reload detection: user can have
diff --git a/checkman/mounts b/checkman/mounts
index 59da5e4..1030bf5 100644
--- a/checkman/mounts
+++ b/checkman/mounts
@@ -6,13 +6,16 @@ distribution: check_mk
description:
This check monitors the options with which a filesystem is mounted by
using the output of {/proc/mounts}. In normal operation mount options
- should never change. If they do a severe problem can be the case: In
- case of an I/O error while writing to disk, most filesystems (e.g.
- {ext3}) will switch to read-only per default in order to avoid further
- filesystem corruption. This check makes sure that you get to know
- if this happens.
+ should never change. In recent versions of Linux (e.g. Ubuntu Oneiric),
+ the mount option {commit=} may change automatically depending
+ on the current battery state, so this option is excluded from
+ the monitoring.
+
+ This check goes critical if the mount option {ro} appears. This
+ might indicate I/O errors or even filesystem corruption!
+
+ Any other change of mount options will trigger a warning state.
- Any change in the mount options results in a {CRITICAL} state.
If the filesystem is not being found mounted, the check returns
an {UNKNOWN} state.
@@ -27,5 +30,6 @@ inventory:
by the agent.
[parameters]
-target_options (list of strings): The list of expected mount options. Any change
- in that list will result in a critical state. Example: {['data=ordered', 'rw']}
+target_options (list of strings): The list of expected mount options. During
+ inventory this list is stored in the check's parameters.
+ Example: {['data=ordered', 'rw']}
diff --git a/checks/mounts b/checks/mounts
index c82dab7..5764cbe 100644
--- a/checks/mounts
+++ b/checks/mounts
@@ -36,13 +36,34 @@ def inventory_mounts(info):
 def check_mounts(item, targetopts, info):
     for dev, mp, fstype, options, dump, fsck in info:
         if item == mp:
-            targetopts.sort()
             opts = options.split(",")
-            opts.sort()
-            if opts != targetopts:
-                return (2, "CRIT - mount options are %s, expected are %s" % (",".join(opts), ",".join(targetopts)))
-            else:
-                return (0, "OK - mount options are %s" % (",".join(opts)))
+            # Now compute the exact difference.
+            exceeding = []
+            missing = []
+            for o in opts:
+                if o not in targetopts and not o.startswith("commit="):
+                    exceeding.append(o)
+            for o in targetopts:
+                if o not in opts and not o.startswith("commit="):
+                    missing.append(o)
+
+            if not missing and not exceeding:
+                return (0, "OK - mount options exactly as expected")
+
+            infos = []
+            if missing:
+                infos.append("missing: %s" % ",".join(missing))
+            if exceeding:
+                infos.append("exceeding: %s" % ",".join(exceeding))
+            infotext = ", ".join(infos)
+
+            if "ro" in exceeding:
+                return (2, "CRIT - filesystem has switched to read-only "
+                           "and is probably corrupted(!!), " + infotext)
+
+            # Just warn in other cases
+            return (1, "WARN - " + infotext)
+
     return (3, "UNKNOWN - filesystem not mounted")
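
The comparison this commit describes (ignore commit=, go critical when ro appears, warn on any other change) can be tried outside of check_mk with a small standalone sketch. The function names diff_mount_options and classify are hypothetical and not part of check_mk, and the CRIT text is shortened. It only illustrates the logic under those assumptions.

```python
def diff_mount_options(current, expected, ignored_prefixes=("commit=",)):
    """Return (missing, exceeding) option lists, skipping ignored options."""
    def ignored(opt):
        return any(opt.startswith(p) for p in ignored_prefixes)

    # Options that are mounted but were not expected.
    exceeding = [o for o in current if o not in expected and not ignored(o)]
    # Options that were expected but are no longer present.
    missing = [o for o in expected if o not in current and not ignored(o)]
    return missing, exceeding


def classify(current, expected):
    """Map an option difference to (state, text): 0 OK, 1 WARN, 2 CRIT."""
    missing, exceeding = diff_mount_options(current, expected)
    if not missing and not exceeding:
        return (0, "OK - mount options exactly as expected")
    infos = []
    if missing:
        infos.append("missing: %s" % ",".join(missing))
    if exceeding:
        infos.append("exceeding: %s" % ",".join(exceeding))
    infotext = ", ".join(infos)
    if "ro" in exceeding:
        return (2, "CRIT - filesystem has switched to read-only, " + infotext)
    return (1, "WARN - " + infotext)


# Hypothetical sample data, as it might be split from /proc/mounts.
expected = ["rw", "data=ordered", "commit=600"]
print(classify("rw,data=ordered,commit=0".split(","), expected))
print(classify("ro,data=ordered,commit=600".split(","), expected))
print(classify("rw,noatime,data=ordered".split(","), expected))
```

The first call stays OK because only the commit= value differs, the second goes critical because ro appeared, and the third only warns about the extra noatime option.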
Module: check_mk
Branch: master
Commit: 607df1d1d598d653e906a539fa801d4a493cc30e
URL: http://git.mathias-kettner.de/git/?p=check_mk.git;a=commit;h=607df1d1d598d6…
Author: Mathias Kettner <mk(a)mathias-kettner.de>
Date: Sun Jan 29 17:01:21 2012 +0100
Updated bug entries #0412
---
.bugs/412 | 8 ++++++--
1 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/.bugs/412 b/.bugs/412
index 1dfc6c2..c1c0e39 100644
--- a/.bugs/412
+++ b/.bugs/412
@@ -1,10 +1,14 @@
Title: Documentation for dashboards is missing
Component: docu
+State: done
+Class: todo
Benefit: 3
-State: open
Cost: 3
Date: 2011-10-14 12:30:36
-Class: todo
+Targetversion: future
A common question will be how to customize the dashboard. Create basic docs
for the dashboard format and how to create custom dashlets and dashboards
+
+2012-01-29 17:01:19: changed state open -> done
+Documentation is finally written. Phew.
Module: check_mk
Branch: master
Commit: 540d799d9809a877741c8b3213021a744a25c735
URL: http://git.mathias-kettner.de/git/?p=check_mk.git;a=commit;h=540d799d9809a8…
Author: Mathias Kettner <mk(a)mathias-kettner.de>
Date: Sat Jan 28 14:27:59 2012 +0100
Updated bug entries #0594
---
.bugs/594 | 8 ++++++--
1 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/.bugs/594 b/.bugs/594
index 9096822..fef2c04 100644
--- a/.bugs/594
+++ b/.bugs/594
@@ -1,9 +1,13 @@
-Title: CPU Load always at 100% with -nv (Windows)
+Title: CPU Utilization always at 100% with -nv (Windows)
Component: core
State: open
Date: 2012-01-10 17:58:52
Targetversion: 1.2.0
Class: nastiness
-Windows hosts always show a CPU Load of 100% when running cmk -nv.
+Windows hosts always show a CPU Utilization of 100% when running cmk -nv.
The normal check works. Occurs here with 1.1.13i2.
+
+Reason: the check works with counters. It should nevertheless be possible
+to make it work if values from earlier runs already exist. On the first run
+the check should be pending (analogous to disk IO).
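
The note in bug #594 says the check works with counters and should be pending when no earlier value exists. A minimal sketch of that idea follows. The names CounterPending and counter_rate and the plain dict used as storage are assumptions for illustration only, not the check_mk implementation.

```python
class CounterPending(Exception):
    """Raised when no usable earlier counter value exists yet."""


def counter_rate(store, name, now, value):
    """Return the per-second rate of a monotonically increasing counter.

    `store` is a plain dict mapping counter names to (timestamp, value).
    The new sample is always remembered for the next call.
    """
    previous = store.get(name)
    store[name] = (now, value)
    if previous is None:
        # First sample: nothing to compare with, so the result is pending.
        raise CounterPending("counter %s initialized" % name)
    last_time, last_value = previous
    elapsed = now - last_time
    if elapsed <= 0:
        raise CounterPending("no time elapsed for counter %s" % name)
    return (value - last_value) / float(elapsed)


store = {}
try:
    counter_rate(store, "cpu.busy", 100, 5000)
except CounterPending as exc:
    print("pending: %s" % exc)
print(counter_rate(store, "cpu.busy", 160, 5300))
```

If the dict is persisted between runs, a one-off run such as cmk -nv can compute a rate from the stored sample instead of starting from scratch.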
Module: check_mk
Branch: master
Commit: 7e738a59146d307e2581292763e30f15af4616d3
URL: http://git.mathias-kettner.de/git/?p=check_mk.git;a=commit;h=7e738a59146d30…
Author: Mathias Kettner <mk(a)mathias-kettner.de>
Date: Sat Jan 28 13:10:53 2012 +0100
Updated bug entries #0608, #0607
---
.bugs/607 | 8 ++++++--
.bugs/608 | 8 ++++++--
2 files changed, 12 insertions(+), 4 deletions(-)
diff --git a/.bugs/607 b/.bugs/607
index b16128f..26c6db7 100644
--- a/.bugs/607
+++ b/.bugs/607
@@ -5,5 +5,9 @@ Date: 2012-01-27 14:46:10
Targetversion: 1.2.0
Class: nastiness
-Changing the setting wato_activation_method via Wato requires an Apache restart to take
-effect. However, Wato does not perform this restart.
+Changing the setting wato_activation_method via Wato requires an
+Apache restart to take effect. However, Wato does not perform this restart.
+
+Mathias: That cannot really be the case. Changes in multisite.mk never
+need a restart. WATO also cannot restart Apache. The
+error must lie somewhere else.
diff --git a/.bugs/608 b/.bugs/608
index b60b14e..ebd2214 100644
--- a/.bugs/608
+++ b/.bugs/608
@@ -5,6 +5,10 @@ Date: 2012-01-27 16:09:02
Targetversion: 1.2.0
Class: nastiness
-If a Nagios reload is performed via Wato instead of a restart,
-reloading the Livestatus module takes too long and the sitebar runs into a
+If a Nagios reload is performed via Wato instead of a restart, reloading
+the Livestatus module takes too long and the sidebar runs into a
Livestatus timeout and has to be reloaded.
+
+Mathias: I think we will solve this problem in the Livestatus API for Python.
+If the connect fails, we keep retrying for a while. There
+is a timeout setting for that, after all.
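
The approach proposed for bug #608, retrying a failed connect for a while until a timeout expires, could look roughly like the sketch below. The helpers retry_until_timeout and open_unix_socket are hypothetical and not part of the Livestatus API for Python. The socket path in the usage comment is an assumption as well.

```python
import socket
import time


def retry_until_timeout(connect, timeout=10.0, interval=0.5, errors=(OSError,)):
    """Call `connect()` repeatedly until it succeeds or `timeout` seconds pass.

    The last error is re-raised once the deadline has been reached.
    """
    deadline = time.time() + timeout
    while True:
        try:
            return connect()
        except errors:
            if time.time() >= deadline:
                raise
            time.sleep(interval)


def open_unix_socket(path):
    """Open a stream connection to a UNIX socket, closing it on failure."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(path)
    except OSError:
        sock.close()
        raise
    return sock


# Usage (the path is an assumption):
# sock = retry_until_timeout(lambda: open_unix_socket("/path/to/live"), timeout=10.0)
```

With this shape, a short outage while the module reloads is bridged transparently, and a real outage still surfaces as an error after the configured timeout.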