Module: check_mk
Branch: master
Commit: 8b1f650c5bb673ab1e6502922c3a3bc5ac67d216
URL: http://git.mathias-kettner.de/git/?p=check_mk.git;a=commit;h=8b1f650c5bb673…
Author: Florian Heigl <fh(a)mathias-kettner.de>
Date: Mon Aug 20 17:24:17 2012 +0200
checks/megaraid_bbu: The check for BBU status now silences while a battery learn cycle is running
---
ChangeLog | 1 +
checkman/megaraid_bbu | 21 ++++++++++++++++-----
checks/megaraid_bbu | 22 ++++++++++++++--------
3 files changed, 31 insertions(+), 13 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index e34dad1..e704d88 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -9,6 +9,7 @@
* New Checks for Siemens Blades (BX600)
* New Checks for Fortigate Firewalls
* FIX: megaraid_pdisks: handle case where no enclosure device exists
+ * FIX: megaraid_bbu: handle the controller's learn cycle. No errors in that period.
* mysql_capacity: cleaned up check, levels are in MB now
* jolokia_info, jolokia_metrics: new rewritten checks for jolokia (formerly
jmx4perl). You need the new plugin mk_jokokia for using them
diff --git a/checkman/megaraid_bbu b/checkman/megaraid_bbu
index e4ea82f..0e7f83e 100644
--- a/checkman/megaraid_bbu
+++ b/checkman/megaraid_bbu
@@ -4,14 +4,25 @@ author: Florian Heigl <fh(a)mathias-kettner.de>
license: GPL
distribution: check_mk
description:
- This check monitors the existance and status of battery backup units on controllers that are based on the mid- and highend LSI Megaraid chipsets. The entry chipsets do not support BBUs. The Linux agent will try find any existing BBUs - if {MegaCli} is found in your search path.
- The BBUs might come in various types {(iBBU, BBU)} and also some other vendors are using this RAID chip. It is tested against Intel, Dell, IBM and FSC models.
+ This check monitors the existance and status of battery backup units on controllers
+ that are based on the mid- and highend LSI Megaraid chipsets. The entry chipsets do
+ not support BBUs. The Linux agent will try find any existing BBUs - if {MegaCli} is
+ found in your search path.
+ The BBUs might come in various types {(iBBU, BBU)} and also some other vendors are
+ using this RAID chip. It is tested against Intel, Dell, IBM and FSC models.
- The check works by matching the agent output against a dictionary of expected values. If you have MegaCli installed and some values are not detected, it might be neccessary to update your version of MegaCli.
+ The check works by matching the agent output against a dictionary of expected values.
+ If you have MegaCli installed and some values are not detected, it might be neccessary
+ to update your version of MegaCli.
- {MegaCli} can be downloaded from LSI at the following URL {http://www.lsi.com/downloads/Public/MegaRAID%20Common%20Files/8.02.16_MegaCLI.zip}
+ {MegaCli} can be downloaded from LSI at the following URL
+ {http://www.lsi.com/downloads/Public/MegaRAID%20Common%20Files/8.02.16_MegaCLI.zip}
- It would be possible to make the warning / critical levels user specifiable. See the check source for this if you have a need to influence those parameters.
+ It would be possible to make the warning / critical levels user specifiable.
+ See the check source for this if you have a need to influence those parameters.
+ Most controllers run a "battery learn cycle" periodically or on user request.
+ The check detects this learn cycle and suppresses all errors while this cycle is active.
+ This affects all models that do not have a flash / capacitor based BBU system.
item:
A string "RAID Adapter/BBU" followed by the ID of the adapter as reported by
MegaCli.
diff --git a/checks/megaraid_bbu b/checks/megaraid_bbu
index 1e0066b..be2f9b2 100644
--- a/checks/megaraid_bbu
+++ b/checks/megaraid_bbu
@@ -31,18 +31,19 @@
# Load a fake controller with known good values for the most
# important parameters only and try to define their importance
megaraid_bbu_refvalues = {
- 'Remaining Capacity Low' : ('No', 1),
- 'I2c Errors Detected' : ('No', 1),
+ 'Remaining Capacity Low' : ('No', 1), # nolearn
+ 'I2c Errors Detected' : ('No', 1),
'Temperature' : ('OK', 2),
'Pack is about to fail & should be replaced': ('No', 1),
- 'Charging Status' : ('None', 1),
- 'Battery State' : ('Operational', 2),
+ 'Charging Status' : ('None', 1), # nolearn
+ 'Battery State' : ('Operational', 2), # nolearn
'Learn Cycle Status' : ('OK', 1),
+ 'Learn Cycle Active' : ('Yes', 0),
'Battery Pack Missing' : ('No', 2),
'Battery Replacement required' : ('No', 1),
'Over Temperature' : ('No', 2),
'Over Charged' : ('No', 1),
- 'Voltage' : ('OK', 2),
+ 'Voltage' : ('OK', 2), # nolearn
}
@@ -80,19 +81,23 @@ def check_megaraid_bbu(item, _no_params, info):
# get current charge level
charge = (", Charge is %s" % controller['Relative State of Charge'])
+
# verify defined important parameters to current level
for varname, (refvalue, refstate) in megaraid_bbu_refvalues.items():
+ # the try/except should handle controller types that don't have certain values
+ # if your bbu chipset fails and you still get a partial response this will lead
+ # to a false result. but people asked for it :>
try:
- value = controller[varname]
- # build a list of all errors
if controller[varname] != refvalue:
broken.append("%s is %s, but should be %s(%s)" % (varname, value, refvalue, "!" * refstate))
state = max(state, refstate)
except:
pass
+ if controller["Learn Cycle Active"] == "Yes":
+ return (0, "OK - no states to check (controller is in learn cycle)" + charge)
# return assembled info
- if broken:
+ elif broken:
return (state, nagios_state_names[state] + " - " + ", ".join(broken) + charge)
else:
return (0, "OK - all states as expected" + charge)
@@ -100,4 +105,5 @@ def check_megaraid_bbu(item, _no_params, info):
return (3, "UNKNOWN - Check not implemented")
+
check_info["megaraid_bbu"] = (check_megaraid_bbu, "RAID Adapter/BBU %s", 0, inventory_megaraid_bbu)
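
For readers who want to try the learn-cycle behaviour outside of check_mk, here is a minimal standalone sketch of the logic this commit describes. It is not the shipped check: the names `REFVALUES`, `STATE_NAMES` and `evaluate_bbu` are hypothetical, the reference table is only a subset of `megaraid_bbu_refvalues`, and `STATE_NAMES` is an assumed stand-in for `nagios_state_names`. Two things differ from the hunk above, both as assumptions. First, the sketch binds `value` explicitly, because the hunk as shown removes `value = controller[varname]` while the message still formats `value`; unless `value` is set elsewhere in the function, the bare `except` would swallow the resulting error. Second, it reads "Learn Cycle Active" and the charge with `dict.get`, so a controller that does not report those keys is treated as not learning rather than raising.

```python
# Hypothetical standalone sketch of the learn-cycle handling; not the shipped check.

# Subset of the reference values from the diff: name -> (expected value, state if different)
REFVALUES = {
    "Temperature": ("OK", 2),
    "Battery State": ("Operational", 2),
    "Charging Status": ("None", 1),
    "Battery Pack Missing": ("No", 2),
}

# Assumed stand-in for check_mk's nagios_state_names
STATE_NAMES = {0: "OK", 1: "WARN", 2: "CRIT", 3: "UNKNOWN"}


def evaluate_bbu(controller):
    """Return (state, text) for one controller given as a dict of MegaCli key/value pairs."""
    # Assumption: fall back to "unknown" if the agent did not report a charge level.
    charge = ", Charge is %s" % controller.get("Relative State of Charge", "unknown")

    # While a learn cycle runs, report OK and skip all comparisons.
    # Assumption: a missing key means "no learn cycle".
    if controller.get("Learn Cycle Active") == "Yes":
        return (0, "OK - no states to check (controller is in learn cycle)" + charge)

    state = 0
    broken = []
    for varname, (refvalue, refstate) in REFVALUES.items():
        value = controller.get(varname)
        if value is None:
            # Controller type does not report this parameter; skip it.
            continue
        if value != refvalue:
            broken.append("%s is %s, but should be %s(%s)"
                          % (varname, value, refvalue, "!" * refstate))
            state = max(state, refstate)

    if broken:
        return (state, STATE_NAMES[state] + " - " + ", ".join(broken) + charge)
    return (0, "OK - all states as expected" + charge)
```

Calling `evaluate_bbu` with a dict that contains `"Learn Cycle Active": "Yes"` returns state 0 regardless of the other values, while the same dict with `"No"` is compared against the reference table as usual.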