Module: check_mk
Branch: master
Commit: dd767cb1be3047c5ed8b06d16ec065dcb5e5f2d9
URL: http://git.mathias-kettner.de/git/?p=check_mk.git;a=commit;h=dd767cb1be3047…
Author: Lars Michelsen <lm(a)mathias-kettner.de>
Date: Fri Feb 25 13:52:50 2011 +0100
Updated bug entries
---
.bugs/121 | 10 +++++++---
1 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/.bugs/121 b/.bugs/121
index 9c0a887..d5c6371 100644
--- a/.bugs/121
+++ b/.bugs/121
@@ -1,11 +1,15 @@
Title: Adding more than one column at once fails
Component: multisite
+State: done
+Class: bug
+Date: 2011-02-04 10:06:44
Benefit: 2
-State: open
Cost: 1
-Date: 2011-02-04 10:06:44
-Class: bug
+Fun: 0
In view editor if you add more than one column in one turn
without saving in between the first added column will be
erased.
+
+2011-02-25 13:52:41: changed state open -> done
+Fixed column adding mechanism not to use innerHTML on field container directly. Using DOM nodes instead.
Module: check_mk
Branch: master
Commit: b25ac1ad7a2314a3510dcd3852bce2490dad2ff1
URL: http://git.mathias-kettner.de/git/?p=check_mk.git;a=commit;h=b25ac1ad7a2314…
Author: Lars Michelsen <lm(a)mathias-kettner.de>
Date: Fri Feb 25 13:53:56 2011 +0100
Multisite: Fixed column editor forgetting uncommitted changes
---
ChangeLog | 2 ++
web/htdocs/js/check_mk.js | 11 ++++++++++-
2 files changed, 12 insertions(+), 1 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index 6b1bd57..9e1cf2e 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -29,6 +29,8 @@
mode (such as host or service detail)
* FIX: fix PNP icon in cases where host and service icons are displayed in
same view (found by Wolfgang Barth)
+ * FIX: Fixed view column editor forgetting pending changes to other form
+ fields
WATO:
* FIX: fix problem with vanishing services on Windows. Affected were services
diff --git a/web/htdocs/js/check_mk.js b/web/htdocs/js/check_mk.js
index 050bf22..cec6c96 100644
--- a/web/htdocs/js/check_mk.js
+++ b/web/htdocs/js/check_mk.js
@@ -320,8 +320,17 @@ function column_swap_ids(o1, o2) {
}
function add_view_column_handler(id, code) {
+ // Appending to the innerHTML of the target container re-parses all of
+ // its fields and drops their unsaved changes. So parse the new code in a
+ // temporary container first, fetch the just created DOM node of the
+ // editor fields and add it to the real container afterwards.
+ var tmpContainer = document.createElement('div');
+ tmpContainer.innerHTML = code;
+ var oNewEditor = tmpContainer.lastChild;
+
var oContainer = document.getElementById('ed_'+id).firstChild;
- oContainer.innerHTML += code;
+ oContainer.appendChild(oNewEditor);
+ tmpContainer = null;
if (oContainer.lastChild.previousSibling)
fix_buttons(oContainer, oContainer.lastChild.previousSibling);
Module: check_mk
Branch: master
Commit: b2d60016b9f1d363de1044df501256df6811b2ed
URL: http://git.mathias-kettner.de/git/?p=check_mk.git;a=commit;h=b2d60016b9f1d3…
Author: Mathias Kettner <mk(a)mathias-kettner.de>
Date: Fri Feb 25 12:49:26 2011 +0100
Tuned man-page of h3c_lanswitch_cpu
---
README.writing_checks | 29 +++++++++++++------------
checkman/h3c_lanswitch_cpu | 49 +++++++++++++++++++++++--------------------
checks/h3c_lanswitch_cpu | 2 +-
3 files changed, 42 insertions(+), 38 deletions(-)
diff --git a/README.writing_checks b/README.writing_checks
index 687720b..6620511 100644
--- a/README.writing_checks
+++ b/README.writing_checks
@@ -121,9 +121,6 @@ Other issues:
with the rest of the data and produce useful results. use int() in all other cases,
e.g. if the check does not make any sense if you have no valid data.
-Manpages:
-*
-
Performance data:
* Only set perfdata flag when the check really produces performance data
output.
@@ -136,14 +133,18 @@ SNMP based checks:
* Only use numeric OIDs in your checks. Name based OIDs rely on MIB files
and the check won't work when the MIB files are not in place.
-* Scan function:
-
-Agent based checks:
-
-* Put sample output snippets of the agent as comments into
- the check source code. Add examples for all cases handled
- by your check. -> That makes the code more easy to understand
- and help not to break something if someone changes something
- in the check's parser.
- In an optimal case you include several code examples of different
- states.
+Manpages
+
+* Each check *must* have a man page. This should be:
+ - complete
+ - precise
+ - terse
+ - helpful!!
+
+* Information that must be contained in the description:
+ - What exactly does the check do?
+ - Under which circumstances does the check go to WARN/CRIT?
+ - Which devices are supported by the check?
+ - Does the check require some configuration of the agent or
+ some separate plugin? (example: The logwatch check requires
+ the agent plugin mk_logwatch to be installed)
diff --git a/checkman/h3c_lanswitch_cpu b/checkman/h3c_lanswitch_cpu
index 3c742c0..a852990 100644
--- a/checkman/h3c_lanswitch_cpu
+++ b/checkman/h3c_lanswitch_cpu
@@ -1,39 +1,42 @@
-title: Check CPU Utilization of 3COM and H3C switches.
+title: Check CPU utilization of 3COM and H3C switches
agents: snmp
author: Florian Heigl <fh(a)mathias-kettner.de>
license: GPL
distribution: check_mk
description:
-This Check queries the CPU utilization for various LAN switches that are supporting the MIB for H3C lan switches.
-These are:
-Newer 3Com switches (Superstack 4 and later), H3C branded switches and now for example the HP "A-Series" Networking gear. The same switches from the original maker Huawei are not supporting the MIB. These also only support cpu stats, whereas the H3C Mib also gives Memory stats and Fabric status etc.
-The check should correctly label the CPU in your switch / stack / core switch module slots.
-
-Note some of these switches don't handle snmp well and might generate up to 90% cpu load while queried.
-To accomodate for this you should lower the frequency of SNMP based checks with them.
-The CPU usage ratio returned by the check is the average load of the last 5 minutes.
+ This check queries the CPU utilization of various LAN switches that
+ support the MIB for H3C LAN switches. These are: newer 3COM switches
+ (Superstack 4 and later), H3C branded switches and now for example the HP
+ "A-Series" Networking gear. The same switches from the original maker Huawei
+ do {not} support the MIB. The check tries to correctly label the CPU in
+ your switch / stack / core switch module slots.
+ The check goes WARN/CRIT if the average CPU usage of the last {five minutes}
+ exceeds configurable levels.
+ Note that some of these switches don't handle SNMP well and might generate
+ up to 90% CPU load while queried. To accommodate this you should lower
+ the frequency of SNMP based checks with them.
+ Some devices have a high usage even in normal conditions (e.g. running full
+ BGP tables). In that case you might have to fine tune the check parameters
+ or even have to disable the check.
inventory:
- The check will add one service per Unit/CPU in a stack.
+ The check will create one separate service per CPU.
-[parameters]
-warning (int): the percentage of 1-minute average load at which the switch is considered too busy.
+item:
+ A string describing the switch and unit number, for example {"Switch 2 CPU 0"}
+ or {"Switch 1 Slot 11"}.
-critical (int): the percentage of 1-minute average load at which the switch is considered overloaded.
+perfdata:
+ One value: the 5 minute average CPU usage in percent (from 0 to 100).
+[parameters]
+warning (int): the percentage of 5-minute average usage at which the check returns WARN.
+critical (int): the percentage of 5-minute average usage at which the check goes critical.
[configuration]
-h3c_lanswitch_cpu_default_levels (int, int): the warn and critical levels for the check.
- They are set to default to 50% for "WARN" and 75% for "CRIT".
- If you expect heavy load (i.e. running full BGP tables) then you might want to adjust these.
+switch_cpu_default_levels (int, int): the default warn and critical levels.
+ They are preset to {(50,75)} (meaning 50% for WARN and 75% for CRIT).
-
-item:
- The CPU found by inventory (it's OID is a multiple of 65536 for stackable switches, or the slot Id it resides at for multicontroller switches and will be divided to come to a sane value). Later multicore CPU switches should be supported already.
-
-
-perfdata:
- It does generate performance data (the 5min average CPU load)
diff --git a/checks/h3c_lanswitch_cpu b/checks/h3c_lanswitch_cpu
index 227411e..fdadc26 100644
--- a/checks/h3c_lanswitch_cpu
+++ b/checks/h3c_lanswitch_cpu
@@ -73,7 +73,7 @@ def check_h3c_lanswitch_cpu(item, params, info):
if h3c_lanswitch_cpu_genitem(line[0]) == item:
util = int(line[1])
infotext = (" - average usage was %d%% over last 5 minutes." % util)
- perfdata = [ ( "usage", util, warn, crit, 0) ]
+ perfdata = [ ( "usage", util, warn, crit, 0, 100) ]
if util > crit:
return (2, "CRIT" + infotext, perfdata)
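Note on the perfdata change above: the added sixth field gives graphing tools a fixed upper bound of 100 for the percentage value. The following is only a minimal standalone sketch of the level comparison and the resulting tuple. The function name check_cpu_usage is hypothetical, and the WARN condition (util > warn) is an assumption, since the hunk does not show that line.

```python
def check_cpu_usage(util, params):
    """Return (state, text, perfdata) for a CPU usage given in percent.

    Hypothetical helper, not part of Check_MK; it only mirrors the
    shape of the check function shown in the diff.
    """
    warn, crit = params
    infotext = " - average usage was %d%% over last 5 minutes." % util
    # Fields: name, value, warn, crit, min, max.
    # The value is a percentage, so it is bounded by 0 and 100.
    perfdata = [("usage", util, warn, crit, 0, 100)]
    if util > crit:
        return (2, "CRIT" + infotext, perfdata)
    elif util > warn:  # assumed condition, not visible in the hunk
        return (1, "WARN" + infotext, perfdata)
    return (0, "OK" + infotext, perfdata)


if __name__ == "__main__":
    for usage in (10, 60, 80):
        print(check_cpu_usage(usage, (50, 75)))
```

With the preset levels (50, 75), a usage of exactly 75 is still WARN, because both comparisons are strict.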
Module: check_mk
Branch: master
Commit: 9823d40205f9321bd5977b36bfe64d078c8df597
URL: http://git.mathias-kettner.de/git/?p=check_mk.git;a=commit;h=9823d40205f932…
Author: Mathias Kettner <mk(a)mathias-kettner.de>
Date: Fri Feb 25 12:20:18 2011 +0100
h3c_lanswitch_cpu: make conform to new guidelines
---
README.writing_checks | 45 +++++++++++++++++++++++++++++--
checks/h3c_lanswitch_cpu | 65 ++++++++++++++++++++++++++-------------------
2 files changed, 79 insertions(+), 31 deletions(-)
diff --git a/README.writing_checks b/README.writing_checks
index 308cb82..687720b 100644
--- a/README.writing_checks
+++ b/README.writing_checks
@@ -1,5 +1,7 @@
This file will help you to write *good* checks for Check_MK.
+Naming
+
* Check file names should be named short and unique. They must consist
only of lower case characters, digits and underscores and begin
with a lower case character.
@@ -16,10 +18,9 @@ This file will help you to write *good* checks for Check_MK.
not be named after the vendor but after the MIB. An example are the
hr_* checks.
-In case of SNMP checks you might use the name of the MIB you fetch data from as
- part of check name
-Header notes:
+Coding style
+
* If the check is contributed by a third party (like you), you must
add your name and your email address.
@@ -82,6 +83,44 @@ Header notes:
You can use it to get nagios state string from nagios return codes. e.g.:
nagios_state_names[0] gives you 'OK'.
+SNMP Scan function:
+
+* Every SNMP check *must* have an SNMP scan function. That function
+ should be as *minimal* as possible: It should only fire at those devices
+ that really can support the check. Reason: unnecessary SNMP walks
+ on devices not supporting that check must be avoided.
+ The scan function must on the other hand not be so strict that it
+ rules out devices where the check would work. If in conflict between
+ these two issues then rather make the scan function not too strict.
+ The scan function should avoid fetching non-standard-OIDs by any
+ means. It should rather try to use the basic SMIv2 OIDs as these will
+ already have been fetched and cached by the scan functions of other checks!
+ All scan functions of all checks together should fetch as few OIDs as
+ possible!
+
+Other issues:
+
+* Default values for check parameters (e.g. switch_cpu_default_levels) must be
+ chosen in a way that they make sense for *everybody*, not just for your special
+ case. In case you are not sure then rather choose too loose than too tight levels.
+ This helps avoid false alarms.
+
+* If the same configuration variable is used in multiple checks, *all* of them
+ must set a default value and all those values must be identical!
+
+* Your check should assume that the agent is always producing valid data. It
+ should *not* try to handle cases where the agent output is broken. This is
+ handled by Check_MK via Python exceptions. Otherwise you would make the code
+ uglier and also disable the debug handler.
+
+* int() vs. saveint(), float() vs. savefloat(): int(s) will throw an exception if
+ s is not a valid number string (or empty). Then Check_MK will catch the exception
+ and make the check result "UNKNOWN" with an according error message. saveint(s) will
+ assume 0, if s is not valid. Important: use saveint() in all places, where you know
+ or suspect that some device does not supply valid data *but* the check can work
+ with the rest of the data and produce useful results. use int() in all other cases,
+ e.g. if the check does not make any sense if you have no valid data.
+
Manpages:
*
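Note on int() vs. saveint() from the guideline above: the difference can be sketched as follows. The saveint() defined here is only a stand-in written for this example, based on the behaviour the README describes (fall back to 0 on invalid input); it is not the Check_MK implementation.

```python
def saveint(s):
    """Stand-in for Check_MK's saveint(): return 0 if s is not a valid number."""
    try:
        return int(s)
    except (ValueError, TypeError):
        return 0


if __name__ == "__main__":
    # A field some devices leave empty, while the check can still
    # produce a useful result from the rest of the data:
    print(saveint(""))

    # A field the check cannot work without: int() raises, and
    # (according to the guideline) Check_MK turns that into UNKNOWN.
    try:
        int("n/a")
    except ValueError as exc:
        print("int() raised: %s" % exc)
```

The design point is that saveint() hides broken input, so it belongs only where the missing value is harmless.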
diff --git a/checks/h3c_lanswitch_cpu b/checks/h3c_lanswitch_cpu
index 6b799bb..227411e 100644
--- a/checks/h3c_lanswitch_cpu
+++ b/checks/h3c_lanswitch_cpu
@@ -34,11 +34,38 @@
# SNMPv2-SMI::enterprises.43.45.1.6.1.1.1.3.13 = Gauge32: 16
# Reasonably low warning and crit levels
-h3c_lanswitch_cpu_default_levels = (50, 75)
+switch_cpu_default_levels = (50, 75)
+
+
+# We do not want to use the end OID as item, since it is hard to read.
+# We prefer "Switch 1 CPU 1" over "65537"...
+def h3c_lanswitch_cpu_genitem(item):
+ # decide switch class here (stacked or standalone/modular)
+ cpuid = int(item)
+
+ # if we have a cpuid lower than (hopefully) 256 it is not hashed with a unit ID
+ if cpuid < 256:
+ switchid = 1
+ cputype = "Slot"
+ cpunum = cpuid
+
+ # otherwise, if above 64k it is a known stackable switch
+ elif cpuid >= 65536:
+ switchid = cpuid / 65536
+ cputype = "CPU"
+ cpunum = cpuid % 65536
+
+ # if we end up here 3com has added another hash method.
+ else:
+ switchid = 1
+ cputype = "Unknown"
+ cpunum = cpuid
+ return ("Switch %d %s %d" % (switchid, cputype, cpunum))
def inventory_h3c_lanswitch_cpu(checkname, info):
- return [ (h3c_lanswitch_cpu_genitem(line[0]), "h3c_lanswitch_cpu_default_levels") for line in info ]
+ return [ (h3c_lanswitch_cpu_genitem(line[0]), "switch_cpu_default_levels") for line in info ]
+
def check_h3c_lanswitch_cpu(item, params, info):
warn, crit = params
@@ -46,7 +73,7 @@ def check_h3c_lanswitch_cpu(item, params, info):
if h3c_lanswitch_cpu_genitem(line[0]) == item:
util = int(line[1])
infotext = (" - average usage was %d%% over last 5 minutes." % util)
- perfdata = [ ( "Usage", util, warn, crit, 0) ]
+ perfdata = [ ( "usage", util, warn, crit, 0) ]
if util > crit:
return (2, "CRIT" + infotext, perfdata)
@@ -54,38 +81,20 @@ def check_h3c_lanswitch_cpu(item, params, info):
return (1, "WARN" + infotext, perfdata)
else:
return (0, "OK" + infotext, perfdata)
- return (3, "UNKNOWN - Unit/CPU %s not found" % item)
-# decide switch class here (stacked or standalone/modular) and make a check item for it.
-def h3c_lanswitch_cpu_genitem(item):
- cpuid = int(item)
-# if we have a cpuid lower than 512 it is not hashed with a unit ID
- if cpuid < 256:
- switchid = 1
- cputype = "Slot"
- cpunum = cpuid
-# othwise, if above 64k it is a known stackable switch
- elif cpuid >= 65536:
- switchid = cpuid / 65536
- cputype = "CPU"
- cpunum = cpuid % 65536
-# if we end up here 3com has added another hash method.
- else:
- switchid = 1
- cputype = "Unknown"
- cpunum = cpuid
- return ("Switch %d %s %d" % (switchid, cputype, cpunum))
+ return (3, "UNKNOWN - %s not found" % item)
+
check_info["h3c_lanswitch_cpu"] = (check_h3c_lanswitch_cpu, "CPU Load %s", 1, inventory_h3c_lanswitch_cpu )
-# get only the 5-min average load.
snmp_info["h3c_lanswitch_cpu"] = \
- ( "1.3.6.1.4.1.43.45.1.6.1.1.1", [ OID_END, "3" ] )
+ ( "1.3.6.1.4.1.43.45.1.6.1.1.1", [
+ OID_END,
+ "3" # 5-min average load.
+ ])
# just a rough match that will handle most devices.
snmp_scan_functions["h3c_lanswitch_cpu"] = \
- lambda oid: oid (".1.3.6.1.2.1.1.1.0").lower().startswith('3com s')
-
-
+ lambda oid: oid(".1.3.6.1.2.1.1.1.0").lower().startswith('3com s')
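Note on the item naming above: the mapping from the end OID to the item can be tried standalone with the sketch below. The function name genitem is shortened for the example, and // is used instead of / so that the division stays an integer division under Python 3 as well (an adaptation for this sketch, not what the commit contains).

```python
def genitem(oid_end):
    """Translate the end OID of a CPU table row into a readable item name."""
    cpuid = int(oid_end)
    if cpuid < 256:
        # standalone/modular switch: the end OID is the slot number
        return "Switch 1 Slot %d" % cpuid
    elif cpuid >= 65536:
        # stackable switch: the end OID is unit * 65536 + cpu
        return "Switch %d CPU %d" % (cpuid // 65536, cpuid % 65536)
    # unknown numbering scheme
    return "Switch 1 Unknown %d" % cpuid


if __name__ == "__main__":
    for oid_end in ("11", "65536", "65537", "131072", "300"):
        print("%s -> %s" % (oid_end, genitem(oid_end)))
```

An end OID of 131072 thus becomes "Switch 2 CPU 0", and 11 becomes "Switch 1 Slot 11", which matches the examples given in the man page.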
Module: check_mk
Branch: master
Commit: c2b8b88ceb9b254e7b76b207b03064f3b0e7d439
URL: http://git.mathias-kettner.de/git/?p=check_mk.git;a=commit;h=c2b8b88ceb9b25…
Author: Mathias Kettner <mk(a)mathias-kettner.de>
Date: Fri Feb 25 11:27:42 2011 +0100
Improved check guidelines
---
README.writing_checks | 51 +++++++++++++++++++++++++++++++++++++++++++++-
checks/decru_cpu | 2 +-
checks/h3c_lanswitch_cpu | 22 ++++++-------------
3 files changed, 58 insertions(+), 17 deletions(-)
diff --git a/README.writing_checks b/README.writing_checks
index feaffd1..63172aa 100644
--- a/README.writing_checks
+++ b/README.writing_checks
@@ -1,8 +1,57 @@
This file will help you to write *good* checks for Check_MK.
Code styling:
+
* Use four spaces for intending your code. Just don't use tab chars.
- And if you relly can't life without tabs set the tab width to 8 spaces.
+ And if you really can't live without tabs set the tab width to 8 spaces.
+
+* For files that are part of the official Check_MK, the file header with the
+ copyright information must be present. This will be automatically
+ created if you call 'make headers' in the main source directory.
+
+* TCP-Agent based checks *must* put an output example of the
+ agent in comments into the check file right after the header
+ and before the implementation. If the agent output can have
+ different formats or output style then put an example for each
+ kind of style the check supports (e.g.: the output of multipath -l
+ has changed its layout between SLES 10 and SLES 11).
+
+ For SNMP based checks put examples if the kind of output is
+ in some respect remarkable.
+
+ The example output is very helpful for understanding how the
+ check parser works.
+
+* Configuration variables for main.mk should be named after
+ the check, if they are only used by this check. This does
+ not hold for variables that are used by several checks
+ (e.g. filesystem_levels is used by df, hr_fs, df_netapp, ...)
+
+* The service description of different check types that essentially
+ do the same must be identical (e.g. if/if64/ifoperstatus). Reason:
+ this makes rules in main.mk simpler for the user!
+
+* If a check does not use check parameters, then the inventory function
+ must return None as parameter and the check function must name
+ the parameter argument _no_params.
+
+* The names of the inventory and check functions must be prefixed
+ with the name of the check type, for example inventory_h3c_lanswitch_cpu
+ for 'h3c_lanswitch_cpu'
+
+* Order of implementation:
+
+ 1. file header with GPL
+ 2. example output from agent
+ 3. default settings of configuration variables
+ 4. helper functions and variables, if any are needed
+ 5. inventory function
+ 6. check function
+ 7. check_info[] definition
+ 8. snmp_info[] definition
+ 9. snmp_scan_functions[] definition
+
+* Configuration variables for main.mk
Manpages:
*
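Note on the order of implementation listed above: a check file following that order might look roughly like the sketch below. Everything named example_cpu, the enterprise OID 99999 and the "example" sysDescr prefix are hypothetical. The three dictionaries and OID_END are normally provided by Check_MK when it loads a check file and are defined here only as stand-ins so that the sketch runs on its own.

```python
# Stand-ins for names Check_MK provides to check files (assumption).
check_info, snmp_info, snmp_scan_functions = {}, {}, {}
OID_END = 0  # placeholder for Check_MK's OID_END marker

# 1. file header with GPL (created by 'make headers'), omitted here

# 2. example output from agent
# SNMPv2-SMI::enterprises.99999.1.1.3.1 = Gauge32: 11

# 3. default settings of configuration variables
example_cpu_default_levels = (80, 90)

# 4. helper functions
def example_cpu_genitem(oid_end):
    return "CPU %d" % int(oid_end)

# 5. inventory function
def inventory_example_cpu(checkname, info):
    return [(example_cpu_genitem(line[0]), "example_cpu_default_levels")
            for line in info]

# 6. check function
def check_example_cpu(item, params, info):
    warn, crit = params
    for line in info:
        if example_cpu_genitem(line[0]) == item:
            util = int(line[1])
            perfdata = [("usage", util, warn, crit, 0, 100)]
            if util > crit:
                return (2, "CRIT - usage is %d%%" % util, perfdata)
            elif util > warn:
                return (1, "WARN - usage is %d%%" % util, perfdata)
            return (0, "OK - usage is %d%%" % util, perfdata)
    return (3, "UNKNOWN - %s not found" % item)

# 7. check_info[] definition
check_info["example_cpu"] = (check_example_cpu, "CPU Load %s", 1,
                             inventory_example_cpu)

# 8. snmp_info[] definition
snmp_info["example_cpu"] = (".1.3.6.1.4.1.99999.1.1", [OID_END, "3"])

# 9. snmp_scan_functions[] definition: only look at sysDescr
snmp_scan_functions["example_cpu"] = \
    lambda oid: oid(".1.3.6.1.2.1.1.1.0").lower().startswith("example")
```

The scan function only reads sysDescr (.1.3.6.1.2.1.1.1.0), a standard OID, which keeps it cheap to evaluate on devices the check does not support.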
diff --git a/checks/decru_cpu b/checks/decru_cpu
index cf1bd6b..03c1170 100644
--- a/checks/decru_cpu
+++ b/checks/decru_cpu
@@ -28,7 +28,7 @@ def inventory_decru_cpu(checkname, info):
if len(info) == 5:
return [ (None, None) ]
-def check_decru_cpu(item, params, info):
+def check_decru_cpu(item, _no_params, info):
user, nice, system, interrupt, idle = map(lambda x:float(x[0]) / 10.0, info)
user += nice
diff --git a/checks/h3c_lanswitch_cpu b/checks/h3c_lanswitch_cpu
index 1fe3002..6b799bb 100644
--- a/checks/h3c_lanswitch_cpu
+++ b/checks/h3c_lanswitch_cpu
@@ -24,22 +24,14 @@
# to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor,
# Boston, MA 02110-1301 USA.
-# check for switches using H3C lanswitch MIB
-#
-#
-# on standalone switches (SS500-EL, etc) we will get stats per cpu.
-#SNMPv2-SMI::enterprises.43.45.1.6.1.1.1.3.1 = Gauge32: 11
-# the id for the CPU will be "65536". The unit Id display will show '1'.
-#
-# on multi-unit stacks (SS5500-EL, etc) we will get stats per cpu in each switch.
-# the table id for the CPU is (unitId*65536)+0.
-#SNMPv2-SMI::enterprises.43.45.1.6.1.1.1.3.65536 = Gauge32: 11
+# Example output of multi-unit stack (SS5500-EL, etc):
+# SNMPv2-SMI::enterprises.43.45.1.6.1.1.1.3.65536 = Gauge32: 11
#
-# on multi-slot switches (SS8800 etc) we will get stats for each card.
-#SNMPv2-SMI::enterprises.43.45.1.6.1.1.1.3.0 = Gauge32: 11
-#[...]
-#SNMPv2-SMI::enterprises.43.45.1.6.1.1.1.3.12 = Gauge32: 16
-#SNMPv2-SMI::enterprises.43.45.1.6.1.1.1.3.13 = Gauge32: 16
+# Example output of multi-slot switch (SS8800 etc):
+# SNMPv2-SMI::enterprises.43.45.1.6.1.1.1.3.0 = Gauge32: 11
+# [...]
+# SNMPv2-SMI::enterprises.43.45.1.6.1.1.1.3.12 = Gauge32: 16
+# SNMPv2-SMI::enterprises.43.45.1.6.1.1.1.3.13 = Gauge32: 16
# Reasonably low warning and crit levels
h3c_lanswitch_cpu_default_levels = (50, 75)