Module: check_mk
Branch: master
Commit: 35e3a9a139f6627ca2963ed74dd3e0bb38d4d145
URL: http://git.mathias-kettner.de/git/?p=check_mk.git;a=commit;h=35e3a9a139f662…
Author: Mathias Kettner <mk(a)mathias-kettner.de>
Date: Fri Oct 11 03:59:22 2013 +0200
Updated bug entries #0983
---
.bugs/983 | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/.bugs/983 b/.bugs/983
index b939259..6282a6f 100644
--- a/.bugs/983
+++ b/.bugs/983
@@ -1,9 +1,9 @@
Title: Multisite: cut down number of Livestatus queries
Component: multisite
-State: open
+Class: nastiness
+State: done
Date: 2013-10-10 23:24:22
Targetversion: 1.2.3i3
-Class: nastiness
Currently one GET status query is issued for each page.
On high-latency connections this hurts. We could either do without
@@ -29,3 +29,6 @@ if "\nCache:" in request:
Cache: 500 --> seconds
Cache: reload --> until config reload
+
+2013-10-11 03:59:21: changed state open -> done
+Done.
Module: check_mk
Branch: master
Commit: c66a942015572b64edef23933b45c5bba8cd0c32
URL: http://git.mathias-kettner.de/git/?p=check_mk.git;a=commit;h=c66a942015572b…
Author: Mathias Kettner <mk(a)mathias-kettner.de>
Date: Fri Oct 11 01:17:11 2013 +0200
Updated bug entries #0983
---
.bugs/983 | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/.bugs/983 b/.bugs/983
index 6e8d41d..b939259 100644
--- a/.bugs/983
+++ b/.bugs/983
@@ -15,3 +15,17 @@ with a cached result.
Also we should cache queries needed for the search filters that
are contained in many views. The caching could be done by
the liveproxyd.
+
+The caching should take into account whether the core has
+changed its configuration. For example, the list of check
+commands, host groups and the like can only change due to
+a configuration reload. So we can safely cache such a query.
+
+The question is how the proxy could know that it should cache.
+A new Cache: header? But then the proxy would need to parse
+the headers completely. Or we simply do a:
+if "\nCache:" in request:
+ ....
+
+Cache: 500 --> seconds
+Cache: reload --> until config reload
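
The proposal above can be sketched roughly as follows. This is only an illustration of the idea in the bug entry, not liveproxyd code: the Cache: header is a proposal, and the names cache_policy and QueryCache are hypothetical. The sketch keeps the cheap substring test from the entry, so requests without the header are never parsed, and it treats "Cache: reload" entries as valid until a configuration reload is signalled.

```python
import time


def cache_policy(request):
    """Return None, ("seconds", n) or ("reload", None) for a request text."""
    # Cheap substring test first, as suggested in the bug entry, so that
    # requests without the header are not parsed at all.
    if "\nCache:" not in request:
        return None
    for line in request.split("\n")[1:]:
        if line.startswith("Cache:"):
            value = line[len("Cache:"):].strip()
            if value == "reload":
                return ("reload", None)
            try:
                return ("seconds", int(value))
            except ValueError:
                return None  # malformed value: do not cache
    return None


class QueryCache(object):
    """Hypothetical in-memory cache keyed by the full request text."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._entries = {}  # request -> (response, expiry time or None)

    def get(self, request):
        entry = self._entries.get(request)
        if entry is None:
            return None
        response, expires = entry
        if expires is not None and self._clock() >= expires:
            del self._entries[request]
            return None
        return response

    def put(self, request, response):
        policy = cache_policy(request)
        if policy is None:
            return  # request did not ask for caching
        kind, seconds = policy
        expires = self._clock() + seconds if kind == "seconds" else None
        self._entries[request] = (response, expires)

    def config_reloaded(self):
        # Drop the entries that were only valid until the next reload.
        self._entries = dict(
            (req, entry) for req, entry in self._entries.items()
            if entry[1] is not None
        )
```

How the proxy would learn about a configuration reload is left open here; config_reloaded() simply assumes that something calls it.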
Module: check_mk
Branch: master
Commit: 15c8820707bbab924e8c21472de4287d214d5e12
URL: http://git.mathias-kettner.de/git/?p=check_mk.git;a=commit;h=15c8820707bbab…
Author: Mathias Kettner <mk(a)mathias-kettner.de>
Date: Thu Oct 10 23:32:34 2013 +0200
Updated bug entries #0983
---
.bugs/983 | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/.bugs/983 b/.bugs/983
new file mode 100644
index 0000000..6e8d41d
--- /dev/null
+++ b/.bugs/983
@@ -0,0 +1,17 @@
+Title: Multisite: cut down number of Livestatus queries
+Component: multisite
+State: open
+Date: 2013-10-10 23:24:22
+Targetversion: 1.2.3i3
+Class: nastiness
+
+Currently one GET status query is issued for each page.
+On high-latency connections this hurts. We could either do without
+that or let liveproxyd cache that kind of query. We could make
+the Heartbeat of the liveproxyd query exactly the same columns
+as Multisite needs for its status-query and then simply answer
+with a cached result.
+
+Also we should cache queries needed for the search filters that
+are contained in many views. The caching could be done by
+the liveproxyd.
Module: check_mk
Branch: master
Commit: 875f48ee6d52eeee5b2fdc5d8a86040380cb032a
URL: http://git.mathias-kettner.de/git/?p=check_mk.git;a=commit;h=875f48ee6d52ee…
Author: Mathias Kettner <mk(a)mathias-kettner.de>
Date: Thu Oct 10 21:46:37 2013 +0200
Updated draft for rule based notifications
---
doc/drafts/LIESMICH.rule_based_notifications | 2 +
doc/drafts/README.rule_based_notifications | 74 ++++++++++++++++++++++++++
2 files changed, 76 insertions(+)
diff --git a/doc/drafts/LIESMICH.rule_based_notifications b/doc/drafts/LIESMICH.rule_based_notifications
index 89d3254..ca2c751 100644
--- a/doc/drafts/LIESMICH.rule_based_notifications
+++ b/doc/drafts/LIESMICH.rule_based_notifications
@@ -178,3 +178,5 @@ wurde.
During the configuration generation for Nagios/CMC a special
user must be created who is a contact for everything.
+===> NOTE: There is now an English document that is somewhat more
+up to date and also covers bulk notifications.
diff --git a/doc/drafts/README.rule_based_notifications b/doc/drafts/README.rule_based_notifications
index 49f5775..169a81e 100644
--- a/doc/drafts/README.rule_based_notifications
+++ b/doc/drafts/README.rule_based_notifications
@@ -187,3 +187,77 @@ or not (presumed you have a high enough log level).
During the configuration generation for Nagios/CMC a special user
needs to be created who is a contact for everything.
+
+
+BULK NOTIFICATIONS
+------------------
+The idea of "bulk notifications" or "notification aggregation" is to cut
+down the number of notifications in cases where some general problem causes
+hundreds or thousands of similar alerts at the same time. In some cases this
+problem can be tackled with Nagios "service dependencies". The problem is
+that these are very tedious to configure and carry the risk of blocking
+critical notifications by mistake.
+
+The bulk notification feature would collect all notifications of the same
+channel and target user for a certain time into a pool and then send *one*
+notification email/sms/whatever that contains all those notifications. A
+typical time frame for such pooling could be 1 minute. Assuming that your
+check period is one minute, that would mean that all alerts that originate from
+the same problem would be packed into one single notification.
+
+So that you are not blind to *other* problems that happen during such a
+burst of alerts, the pooling could be done on a per-checktype or per-hostgroup
+or per-whatever basis. It could also take the service level of objects into
+account so that e.g. all alerts of the level "Tier 3" are bulked for 5 minutes,
+while the alerts of "Tier 1" would not be bulked but sent independently.
+
+The configuration of the notification aggregation could be done as a parameter
+to the notification rules. Example:
+
+Condition:
+- Service Level <= 10 (Tier 3)
+
+Notification:
+- Send to all configured monitoring contacts
+- Send via Email
+
+Aggregation:
+- Aggregate up to 2 minutes
+- Aggregate by check type
+
+Possible options for the aggregation:
+ - time range (e.g. collect notifications up to 2 minutes)
+ - force separate notifications for different
+ [ ] hosts
+ [ ] check types
+ [ ] states (WARN, CRIT, etc.)
+ [ ] service descriptions
+ [ ] service levels
+
+
+OVERRIDING
+----------
+If there is a global rule that configures a notification without aggregation,
+then a later rule can override that and add an aggregation. That way a user
+can enable bulk notifications for otherwise un-bulked ones.
+
+Vice versa: if a rule aggregates notifications and a later rule
+(e.g. one of a user) has aggregation disabled, the later setting takes precedence.
+
+IMPLEMENTATION
+--------------
+In order to send bulk notifications the notification plugin
+API needs to be extended. Currently each plugin expects information
+about *one* alert only, in environment variables like
+NOTIFY_HOSTNAME and NOTIFY_HOSTSTATE. A plugin supporting
+bulk notifications would need another API that allows it to handle
+a *list* of such alerts. Instead of environment variables we
+could have the plugin read lines like
+NOTIFY_HOSTNAME=foo
+NOTIFY_HOSTSTATE=2
+
+A plugin should tell Check_MK that it is using the new API by
+putting some information into its first or second comment line.
+That way the notification configurator can make sure that the
+user can only select those plugins for bulk notification that
+support it.
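
A minimal sketch of what the plugin side of such a line-based API might look like, under assumptions the draft does not settle. In particular, the draft does not say how individual alerts are separated in the input; this sketch assumes a blank line between alerts. The function names parse_bulk and group_alerts are hypothetical. The second function illustrates the "force separate notifications for different ..." options by splitting a pool of alerts on the chosen variables.

```python
def parse_bulk(lines):
    """Parse KEY=VALUE lines into a list of alert dictionaries.

    Assumption: alerts are separated by blank lines. The draft only
    shows the KEY=VALUE form and does not define a separator.
    """
    alerts = []
    current = {}
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the current alert, if there is one.
            if current:
                alerts.append(current)
                current = {}
            continue
        if "=" not in line:
            continue  # ignore lines that are not KEY=VALUE
        key, value = line.split("=", 1)
        current[key] = value
    if current:
        alerts.append(current)
    return alerts


def group_alerts(alerts, separate_by):
    """Split alerts into one group per distinct value combination.

    separate_by is a sequence of variable names, e.g.
    ("NOTIFY_HOSTNAME",) to force separate notifications per host.
    An empty sequence puts all alerts into a single group.
    """
    groups = {}
    for alert in alerts:
        key = tuple(alert.get(name, "") for name in separate_by)
        groups.setdefault(key, []).append(alert)
    return groups
```

A plugin could then call parse_bulk(sys.stdin) and send one message per group returned by group_alerts(), instead of one message per alert.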