Module: check_mk
Branch: master
Commit: 53f722f1b3d36fa8b295090a5999d8574e9d5674
URL:
http://git.mathias-kettner.de/git/?p=check_mk.git;a=commit;h=53f722f1b3d36f…
Author: Mathias Kettner <mk(a)mathias-kettner.de>
Date: Thu Aug 18 16:52:44 2011 +0200
mk_logwatch: configurable limits in new lines and time
---
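Note: the sketch below restates the limit handling added to mk_logwatch
in this commit in a compact, standalone form. It is not the plugin's code.
The helper names parse_options and scan, the dictionary layout, and the
sample filename line are assumptions made for illustration. Only the option
names (maxlines, maxtime, overflow), the default overflow level C, and the
two overflow messages are taken from the diff. It is written to run on
current Python, whereas the plugin itself targets Python 2.

```python
import time

# Severity of the overflow marker, as in the diff: C=2, W=1, I=0
OVERFLOW_LEVELS = {'C': 2, 'W': 1, 'I': 0}

def parse_options(words):
    """Split the words of a filename line into globs and key=value options."""
    opts = {'maxlines': None, 'maxtime': None, 'overflow': 'C'}
    globs = []
    for word in words:
        if '=' not in word:
            globs.append(word)
            continue
        key, value = word.split('=', 1)
        if key == 'maxlines':
            opts['maxlines'] = int(value)
        elif key == 'maxtime':
            opts['maxtime'] = float(value)
        elif key == 'overflow':
            if value not in OVERFLOW_LEVELS:
                raise ValueError("Invalid value %s for overflow. Allowed are C, I and W" % value)
            opts['overflow'] = value
        else:
            raise ValueError("Invalid option %s" % key)
    return globs, opts

def scan(lines, opts, clock=time.time):
    """Classify new log lines until a limit is hit; return (messages, worst)."""
    messages = []
    worst = 0
    start = clock()
    for count, line in enumerate(lines, 1):
        # Limit on the number of new log messages
        if opts['maxlines'] is not None and count > opts['maxlines']:
            messages.append("%s Maximum number (%d) of new log messages exceeded."
                            % (opts['overflow'], opts['maxlines']))
            worst = max(worst, OVERFLOW_LEVELS[opts['overflow']])
            break
        # Limit on processing time, checked once per 100 lines to save system calls
        if (opts['maxtime'] is not None and count % 100 == 10
                and clock() - start > opts['maxtime']):
            messages.append("%s Maximum parsing time (%.1f sec) of this log file exceeded."
                            % (opts['overflow'], opts['maxtime']))
            worst = max(worst, OVERFLOW_LEVELS[opts['overflow']])
            break
        # Pattern matching is omitted here; every line counts as context (".")
        messages.append(". " + line.rstrip("\n"))
    return messages, worst

# Hypothetical filename line, split into words
globs, opts = parse_options(
    "/var/log/syslog /var/log/kern.log maxlines=2 overflow=W".split())
messages, worst = scan(["a\n", "b\n", "c\n", "d\n"], opts)
```

In the diff itself the plugin additionally seeks to the end of the log file
when a limit is hit, so the remaining lines of that run are skipped.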
.bugs/116 | 13 +++++++--
.bugs/26 | 10 +++++--
ChangeLog | 5 +++
agents/.f12 | 4 +++
agents/logwatch.cfg | 2 +-
agents/plugins/.f12 | 3 ++
agents/plugins/mk_logwatch | 60 ++++++++++++++++++++++++++++++++++++++++---
checkman/logwatch | 46 ++++++++++++---------------------
8 files changed, 102 insertions(+), 41 deletions(-)
diff --git a/.bugs/116 b/.bugs/116
index d9c6116..9f2c661 100644
--- a/.bugs/116
+++ b/.bugs/116
@@ -1,10 +1,11 @@
Title: Make configurable limit on lines processed by mk_logwatch
Component: checks
+State: done
+Class: feature
+Date: 2011-01-30 13:42:20
Benefit: 1
-State: open
Cost: 2
-Date: 2011-01-30 13:42:20
-Class: feature
+Fun: 0
Logfile that grow fast make problems with mk_logwatch since
the processing takes too long. We could make the following
@@ -19,3 +20,9 @@ or "W" if this limit is exceeded.
Exceeding lines are not lost but processed in the next
turn.
+
+We could also allow a time limit, e.g. maxtime=20s.
+
+
+2011-08-18 16:52:03: changed state open -> done
+Is implemented. Documentation is online.
diff --git a/.bugs/26 b/.bugs/26
index 7b7d7d7..ccc447c 100644
--- a/.bugs/26
+++ b/.bugs/26
@@ -1,11 +1,15 @@
Title: Handle corrupted logwatch state file
Component: checks
+State: done
+Class: bug
+Date: 2010-12-21 16:33:25
Benefit: 1
-State: open
Cost: 2
-Date: 2010-12-21 16:33:25
-Class: bug
+Fun: 0
If the state file of the logwatch agent is broken for whatever reason,
it should simply be recreated (catch the exceptions). And
somehow send an error message upstream (artificial log line?)
+
+2011-08-18 16:52:29: changed state open -> done
+An error in the file is simply ignored.
diff --git a/ChangeLog b/ChangeLog
index 7f2acd8..95cf0fa 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+1.1.11i3:
+ Checks & Agents:
+ * mk_logwatch: allow to set limits in processing time and number of
+ new log messages per log file
+
1.1.11i2:
Core, Setup, etc.:
* FIX: sort output of cmk --list-hosts alphabetically
diff --git a/agents/.f12 b/agents/.f12
index 1c27cc5..d1f6a26 100644
--- a/agents/.f12
+++ b/agents/.f12
@@ -1 +1,5 @@
sudo cp -r * /omd/versions/default/share/check_mk/agents/
+if [ -e /etc/check_mk ]
+then
+ sudo install -m 644 logwatch.cfg /etc/check_mk/
+fi
diff --git a/agents/logwatch.cfg b/agents/logwatch.cfg
index 6dedc9e..696785c 100644
--- a/agents/logwatch.cfg
+++ b/agents/logwatch.cfg
@@ -38,7 +38,7 @@
I mdadm.*: Rebuild.*event detected
W mdadm\[
-/var/log/kern /var/log/kern.log
+/var/log/syslog /var/log/kern.log
C panic
C Oops
diff --git a/agents/plugins/.f12 b/agents/plugins/.f12
new file mode 100755
index 0000000..9c4a18e
--- /dev/null
+++ b/agents/plugins/.f12
@@ -0,0 +1,3 @@
+#!/bin/bash
+sudo mkdir -p /usr/lib/check_mk_agent/plugins
+sudo install -m 755 * /usr/lib/check_mk_agent/plugins
diff --git a/agents/plugins/mk_logwatch b/agents/plugins/mk_logwatch
index 462054c..e4d86e3 100755
--- a/agents/plugins/mk_logwatch
+++ b/agents/plugins/mk_logwatch
@@ -26,7 +26,7 @@
# Call with -d for debug mode: colored output, no saving of status
-import sys,os,re
+import sys, os, re, time
if '-d' in sys.argv[1:]:
tty_red = '\033[1;31m'
@@ -179,7 +179,29 @@ def process_logfile(logfile, patterns):
f = os.fdopen(fl)
worst = 0
outputtxt = ""
+ lines_parsed = 0
+ start_time = time.time()
+
for line in f:
+ lines_parsed += 1
+ # Check if maximum number of new log messages is exceeded
+ if opt_maxlines != None and lines_parsed > opt_maxlines:
+ outputtxt += "%s Maximum number (%d) of new log messages exceeded.\n" % (
+ opt_overflow, opt_maxlines)
+ worst = max(worst, opt_overflow_level)
+ os.lseek(fl, 0, 2) # Seek to end of file, skip all other messages
+ break
+
+ # Check if maximum processing time (per file) is exceeded. Check only
+ # every 100'th line in order to save system calls
+ if opt_maxtime != None and lines_parsed % 100 == 10 \
+ and time.time() - start_time > opt_maxtime:
+ outputtxt += "%s Maximum parsing time (%.1f sec) of this log file exceeded.\n" % (
+ opt_overflow, opt_maxtime)
+ worst = max(worst, opt_overflow_level)
+ os.lseek(fl, 0, 2) # Seek to end of file, skip all other messages
+ break
+
level = "."
for lev, pattern in patterns:
if pattern.search(line[:-1]):
@@ -189,6 +211,7 @@ def process_logfile(logfile, patterns):
break
color = {'C': tty_red, 'W': tty_yellow, 'I': tty_blue, '.': ''}[level]
outputtxt += "%s%s %s%s\n" % (color, level, line[:-1], tty_normal)
+
new_offset = os.lseek(fl, 0, 1) # os.SEEK_CUR not available in Python 2.4
status[logfile] = new_offset, inode
@@ -203,17 +226,44 @@ except Exception, e:
print "CANNOT READ CONFIG FILE: %s" % e
sys.exit(1)
+# Simply ignore errors in the status file. In case of a corrupted status file we simply begin
+# with an empty status. That keeps the monitoring up and running - even if we might lose a
+# message in the extreme case of a corrupted status file.
try:
status = read_status()
-except IOError:
- status = {}
except Exception, e:
- print "CANNOT PARSE STATUS FILE: %s" % e
- sys.exit(1)
status = {}
+
+# The filename line may contain options like 'maxlines=100' or 'maxtime=10'
for filenames, patterns in config:
+ # Initialize options with default values
+ opt_maxlines = None
+ opt_maxtime = None
+ opt_overflow = 'C'
+ opt_overflow_level = 2
+ try:
+ options = [ o.split('=', 1) for o in filenames if '=' in o ]
+ for key, value in options:
+ if key == 'maxlines':
+ opt_maxlines = int(value)
+ elif key == 'maxtime':
+ opt_maxtime = float(value)
+ elif key == 'overflow':
+ if value not in [ 'C', 'I', 'W' ]:
+ raise Exception("Invalid value %s for overflow. Allowed are C, I and W" % value)
+ opt_overflow = value
+ opt_overflow_level = {'C':2, 'W':1, 'I':0}[value]
+ else:
+ raise Exception("Invalid option %s" % key)
+ except Exception, e:
+ print "INVALID CONFIGURATION: %s" % e
+ sys.exit(1)
+
+
for glob in filenames:
+ if '=' in glob:
+ continue
logfiles = [ l.strip() for l in os.popen("ls %s 2>/dev/null" % glob).readlines() ]
if len(logfiles) == 0:
print '[[[%s:missing]]]' % glob
diff --git a/checkman/logwatch b/checkman/logwatch
index e4512d5..fe9f1a8 100644
--- a/checkman/logwatch
+++ b/checkman/logwatch
@@ -4,41 +4,29 @@ author: Mathias Kettner <mk(a)mathias-kettner.de>
license: GPL
distribution: check_mk
description:
- This check processes the output of those agents
- having the logwatch extension. The windows agent
- has this extension built in. The logwatch extension
- of the Linux/UNIX agents needs a configuration file
- that lists all relevant logfiles and lists possible
- log lines that should result in warning or critical
- state. The windows agents does not need any configuration
- but sends all log files in the Windows event log.
- It uses the warning/error classification of Windows.
+ This check processes the output of agents with the logwatch plugin. The windows agent has this
+ extension built in. The logwatch extension of the Linux/UNIX agents needs a configuration file
+ that lists all relevant logfiles and lists possible log lines that should result in warning
+ or critical state. The windows agent does not need any configuration but sends all log files
+ in the Windows event log. It uses the warning/error classification of Windows.
- Relevant log messages found by the agent are stored
- locally into a text file. The check is critical, if
- at least one new {or old} log message exists that
- is classified as critical. If at least one warning
- message exists but no critical, the check results
- in a warning state.
+ Relevant log messages found by the agent are stored locally into a text file. The check is
+ critical, if at least one new {or old} log message exists that is classified as critical. If
+ at least one warning message exists but no critical, the check results in a warning state.
- The only way to bring the state back to OK is to
- delete the text file with the stored log messages.
- This is stored below {/var/lib/check_mk/logwatch}.
- Usually the logwatch webpage is used to browse and
- delete the messages. Please refer to the online
- documentation of check_mk for more details about logwatch.
+ The only way to bring the state back to OK is to delete the text file with the stored log
+ messages. This is stored below {/var/lib/check_mk/logwatch}. Usually the logwatch webpage is
+ used to browse and delete the messages. Please refer to the online documentation of check_mk
+ for more details about logwatch.
item:
- The name of the logfile. For Linux/UNIX this is
- the complete absolute path name of the logfile.
- For Windows this is the name as shown in the
- windows event log, for example {Application} (case
- sensitive!).
+ The name of the logfile. For Linux/UNIX this is the complete absolute path name of the logfile.
+ For Windows this is the name as shown in the windows event log, for example {Application}
+ (case sensitive!).
inventory:
- All logfiles sent by the agent are automatically
- inventorized. Please use standard inventory configuration
- methods if you want to ignore certain log files.
+ All logfiles sent by the agent are automatically inventorized. Please use standard inventory
+ configuration methods if you want to ignore certain log files.
examples: