Module: check_mk
Branch: master
Commit: 875f48ee6d52eeee5b2fdc5d8a86040380cb032a
URL:
http://git.mathias-kettner.de/git/?p=check_mk.git;a=commit;h=875f48ee6d52ee…
Author: Mathias Kettner <mk(a)mathias-kettner.de>
Date: Thu Oct 10 21:46:37 2013 +0200
Updated draft for rule based notifications
---
doc/drafts/LIESMICH.rule_based_notifications | 2 +
doc/drafts/README.rule_based_notifications | 74 ++++++++++++++++++++++++++
2 files changed, 76 insertions(+)
diff --git a/doc/drafts/LIESMICH.rule_based_notifications b/doc/drafts/LIESMICH.rule_based_notifications
index 89d3254..ca2c751 100644
--- a/doc/drafts/LIESMICH.rule_based_notifications
+++ b/doc/drafts/LIESMICH.rule_based_notifications
@@ -178,3 +178,5 @@ wurde.
During configuration generation for Nagios/CMC a special user
must be created who is a contact for everything.
+===> NOTE: There is now an English document that is somewhat more up to date
+and also covers bulk notifications.
diff --git a/doc/drafts/README.rule_based_notifications b/doc/drafts/README.rule_based_notifications
index 49f5775..169a81e 100644
--- a/doc/drafts/README.rule_based_notifications
+++ b/doc/drafts/README.rule_based_notifications
@@ -187,3 +187,77 @@ or not (presumed you have a high enough log level).
During the configuration generation for Nagios/CMC a special user
needs to be created who is contact for everything.
+
+
+BULK NOTIFICATIONS
+------------------
+The idea of "bulk notifications" or "notification aggregation" is to cut
+down the number of notifications in cases where some general problem causes
+hundreds or thousands of similar alerts at the same time. In some cases this
+problem can be tackled with Nagios "service dependencies". The drawback is
+that these are very tedious to configure and carry the risk of blocking
+critical notifications by mistake.
+
+The bulk notification feature would collect all notifications of the same
+channel and target user for a certain time into a pool and then send *one*
+notification email/sms/whatever that contains all those notifications. A
+typical time frame of such a pooling could be 1 minute. Assuming that your
+check period is one minute, that would mean that all alerts that originate from
+the same problem would be packed into one single notification.
+
+So that you are not blinded to *other* problems that happen during such a
+burst of alerts, the pooling could be done on a per-checktype, per-hostgroup
+or per-whatever basis. It could also take the service level of objects into
+account so that e.g. all alerts of the level "Tier 3" are bulked for 5 minutes,
+while the alerts of "Tier 1" would not be bulked but sent independently.
+
+The configuration of the notification aggregation could be done as a parameter
+to the notification rules. Example:
+
+Condition:
+- Service Level <= 10 (Tier 3)
+
+Notification:
+- Send to all configured monitoring contacts
+- Send via Email
+
+Aggregation:
+- Aggregate up to 2 minutes
+- Aggregate by check type
+
+Possible options for the aggregation:
+ - time range (e.g. collect notifications up to 2 minutes)
+ - force separate notifications for different
+ [ ] hosts
+ [ ] check types
+ [ ] states (WARN, CRIT, etc.)
+ [ ] service descriptions
+ [ ] service levels
+
+
+OVERRIDING
+----------
+If there is a global rule that configures a notification without aggregation,
+then a later rule can override that and add an aggregation. That way a user
+can enable bulk notifications for otherwise un-bulked ones.
+
+Vice versa: if a rule aggregates notifications and a later rule
+(e.g. one of a user) disables aggregation, the later setting takes precedence.
+
+IMPLEMENTATION
+--------------
+In order to send bulk notifications the notification plugin
+API needs to be extended. Currently each plugin expects information
+about *one* alert only, in environment variables like
+NOTIFY_HOSTNAME and NOTIFY_HOSTSTATE. A plugin supporting
+bulk notifications would need another API that can handle
+a *list* of such alerts. Instead of environment variables we
+could have the plugin read lines like
+NOTIFY_HOSTNAME=foo
+NOTIFY_HOSTSTATE=2
+
+A plugin should tell Check_MK that it is using that new API by
+putting some information into its first or second comment line.
+That way the notification configurator can make sure that the
+user can only select those plugins for bulk notification that
+support it.
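
Note on the IMPLEMENTATION section of the draft: the line-based plugin API it
proposes could be consumed roughly as sketched below. This is only an
illustration and not part of the commit. The draft does not say where the lines
come from or how one alert is separated from the next, so the sketch assumes
that the plugin reads them from a text stream such as standard input and that a
blank line ends one alert. The function names parse_bulk_notifications and
bulk_subject are hypothetical.

```python
def parse_bulk_notifications(stream):
    """Parse KEY=VALUE lines into a list of dicts, one dict per alert.

    ASSUMPTION (not specified in the draft): alerts are separated by a
    blank line and every other line has the form NOTIFY_XXX=value.
    """
    alerts = []
    current = {}
    for raw_line in stream:
        line = raw_line.strip()
        if not line:
            # A blank line closes the alert collected so far.
            if current:
                alerts.append(current)
                current = {}
            continue
        key, sep, value = line.partition("=")
        if not sep:
            # Skip lines without "=" instead of failing the whole bulk.
            continue
        current[key] = value
    if current:
        # The last alert does not need a trailing blank line.
        alerts.append(current)
    return alerts


def bulk_subject(alerts):
    """Build one summary line for a bulk of alerts (hypothetical helper)."""
    hosts = sorted({a.get("NOTIFY_HOSTNAME", "?") for a in alerts})
    return "%d alerts on %d hosts: %s" % (
        len(alerts), len(hosts), ", ".join(hosts))
```

Under these assumptions a bulk-capable plugin would call
parse_bulk_notifications(sys.stdin) once and then send a single message for the
whole list, instead of being started once per alert with environment variables.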