Module: check_mk
Branch: master
Commit: d59176f1d458b918a9d08ca2ed596d265f74d93b
URL:
http://git.mathias-kettner.de/git/?p=check_mk.git;a=commit;h=d59176f1d458b9…
Author: Mathias Kettner <mk(a)mathias-kettner.de>
Date: Wed Dec 3 18:45:26 2014 +0100
#1622 FIX megaraid_bbu: handle case isSOHGood and consider it as critical
Thanks to Jukka Santala, who wrote:
"Discovered (the hard way) that when MegaRAID BBU batteries degrade too low,
disabling WriteBack cache, the firmware doesn't turn on "battery pack needs
replacing" flag. Instead <tt>isSOHGood</tt> seems to be only reliable way
to detect this condition. In this patch I've ranked it as critical, as it
is fairly equivalent to battery pack missing. Unfortunately it's generally
too late when this goes on, so it would be nice to have little warning.
The current agents do not query "Absolute State of charge" off the controller,
and on the DELL 12G RAID-controllers this statistics doesn't even exist;
they also have different design capacity. Because of these limitations I've
resolved to just listing the "Full Charge Capacity" in the check status. I
considered having the passed on perfdata for graphing, but as they're only
updated on battery re-learn, it's limited usability. This patch is against
latest innovation versions, and has been tested again various DELL 9G,
11G and 12G servers with 5/i, 6/i, H700 and H710P RAID-controllers."
---
.werks/1622 | 24 ++++++++++++++++++++++++
ChangeLog | 1 +
checks/megaraid_bbu | 4 +++-
3 files changed, 28 insertions(+), 1 deletion(-)
diff --git a/.werks/1622 b/.werks/1622
new file mode 100644
index 0000000..e07e28a
--- /dev/null
+++ b/.werks/1622
@@ -0,0 +1,24 @@
+Title: megaraid_bbu: handle case isSOHGood and consider it as critical
+Level: 1
+Component: checks
+Compatible: compat
+Version: 1.2.5i7
+Date: 1417628636
+Class: fix
+
+Thanks to Jukka Santala, who wrote:
+
+"Discovered (the hard way) that when MegaRAID BBU batteries degrade too low,
+disabling WriteBack cache, the firmware doesn't turn on "battery pack needs
+replacing" flag. Instead <tt>isSOHGood</tt> seems to be only reliable way
+to detect this condition. In this patch I've ranked it as critical, as it
+is fairly equivalent to battery pack missing. Unfortunately it's generally
+too late when this goes on, so it would be nice to have little warning.
+The current agents do not query "Absolute State of charge" off the controller,
+and on the DELL 12G RAID-controllers this statistics doesn't even exist;
+they also have different design capacity. Because of these limitations I've
+resolved to just listing the "Full Charge Capacity" in the check status. I
+considered having the passed on perfdata for graphing, but as they're only
+updated on battery re-learn, it's limited usability. This patch is against
+latest innovation versions, and has been tested again various DELL 9G,
+11G and 12G servers with 5/i, 6/i, H700 and H710P RAID-controllers."
diff --git a/ChangeLog b/ChangeLog
index 9c93ad7..d7a9062 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -75,6 +75,7 @@
* 1566 FIX: 3ware_disks: consider VERIFYING state as OK now...
* 1612 FIX: job: Fixed wrong reported start time for running jobs
* 1571 FIX: check_mk_agent.linux: fix output of lnx_if on Ubuntu 8.04 (on older
kernels), repairs tcp_conn_stats...
+ * 1622 FIX: megaraid_bbu: handle case isSOHGood and consider it as critical...
Multisite:
* 1508 Allow input of plugin output and perfdata when faking check results...
diff --git a/checks/megaraid_bbu b/checks/megaraid_bbu
index 76ae5d1..baa07cc 100644
--- a/checks/megaraid_bbu
+++ b/checks/megaraid_bbu
@@ -44,6 +44,7 @@ megaraid_bbu_refvalues = {
'Over Temperature' : ('No', 2),
'Over Charged' : ('No', 1),
'Voltage' : ('OK', 2), # nolearn
+ 'isSOHGood' : ('Yes', 2),
}
@@ -85,7 +86,8 @@ def check_megaraid_bbu(item, _no_params, info):
if 'Relative State of Charge' not in controller:
charge = ", No charge information reported for this controller"
else:
- charge = ", Charge is %s" % controller['Relative State of Charge']
+ charge = ", Charge is %s, Capacity is %s" % \
+ (controller['Relative State of Charge'], controller['Full Charge Capacity'])
# verify defined important parameters to current level
for varname, (refvalue, refstate) in megaraid_bbu_refvalues.items():
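
For readers unfamiliar with the check, the following is a minimal, standalone sketch of how a reference table like megaraid_bbu_refvalues can drive the resulting state. It is not the code from checks/megaraid_bbu: the function name evaluate_bbu, the REFVALUES copy, the summary wording and the use of .get() for missing fields are all assumptions made for illustration, and the loop body of the real check is not shown in this diff.

```python
# Hypothetical, simplified stand-in for the check logic. This is NOT the
# actual checks/megaraid_bbu implementation; names and wording are assumed.

# Field name -> (expected value, state to raise if the value differs).
# States follow the usual Nagios convention: 0 = OK, 1 = WARN, 2 = CRIT.
REFVALUES = {
    'Over Temperature': ('No', 2),
    'Over Charged': ('No', 1),
    'Voltage': ('OK', 2),
    'isSOHGood': ('Yes', 2),  # the entry added by this commit
}


def evaluate_bbu(controller, refvalues=REFVALUES):
    """Return (state, text) for one controller's parsed BBU fields."""
    state = 0
    details = []
    for varname, (refvalue, refstate) in sorted(refvalues.items()):
        value = controller.get(varname)
        if value is None:
            # Assumption: fields the controller does not report are skipped.
            continue
        if value != refvalue:
            state = max(state, refstate)
            details.append("%s is %s (expected %s)" % (varname, value, refvalue))

    charge = controller.get('Relative State of Charge')
    capacity = controller.get('Full Charge Capacity')
    if charge is None:
        summary = "No charge information reported for this controller"
    elif capacity is None:
        # Assumption: fall back gracefully if the capacity field is absent.
        summary = "Charge is %s" % charge
    else:
        summary = "Charge is %s, Capacity is %s" % (charge, capacity)

    return state, ", ".join([summary] + details)


if __name__ == "__main__":
    degraded = {
        'isSOHGood': 'No',
        'Voltage': 'OK',
        'Relative State of Charge': '97 %',
        'Full Charge Capacity': '1200 mAh',
    }
    print(evaluate_bbu(degraded))
```

In this sketch a battery reporting isSOHGood as "No" yields state 2 (critical) even when every other field looks healthy, which is the situation described in the commit message.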