ID: 8638
Title: Fixed broken notifications when Check_MK notification spooler not enabled (regression since 1.4.0i2)
Component: Notifications
Level: 3
Class: Bug fix
Version: 1.4.0i3
When updating an existing site to 1.4.0i2 or 1.4.0i2p2 while not having the "Check_MK notification spooler"
enabled via "omd config", the notification system stopped sending notifications.
The background is that the default notification mechanism had been changed from direct delivery to
asynchronous delivery. This was done even when the notification spooler, which is responsible for
delivering asynchronous notifications, was disabled.
The notifications are not lost; they have simply not been processed. If you are affected,
there is one file below <tt>var/check_mk/notify/spool/</tt> for each notification that was
created but not delivered.
To fix this issue, just install the new version. After the update, all pending notifications will be sent out.
If you want to suppress the stuck notifications instead, remove the <tt>.mk</tt> files from the directory
mentioned above before starting the site again.
As a workaround for the existing versions, you can execute "omd config", go to
"Distributed Monitoring > MKNOTIFYD" and enable this option. After starting the site again,
all your notifications will be processed and sent to the users.
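The spool cleanup described above can be sketched in a few shell commands. This is a hedged sketch: the <tt>SPOOL</tt> variable and the count step are only illustrative; the spool path is the one named above.

```shell
# Hedged sketch: count, and optionally discard, stuck notification files.
# OMD_ROOT is set in an OMD site shell; the spool path is the one named above.
SPOOL="${OMD_ROOT:-.}/var/check_mk/notify/spool"
# First see how many notifications are stuck:
ls "$SPOOL"/*.mk 2>/dev/null | wc -l
# Only if you really want to suppress them (run while the site is stopped):
# rm "$SPOOL"/*.mk
```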
ID: 8394
Title: New graph designer for creating custom graphs
Component: metrics
Level: 3
Class: New feature
Version: 1.2.7i4
With the new custom graphs you can now create performance graphs with arbitrary
metrics from arbitrary hosts and services. That way you can e.g. combine the
HTTP response times of three different servers in one graph.
The new custom graphs can be added to reports, graph collections and dashboards.
There are several paths to the graph designer:
LI:Click on the context menu of an existing graph and select <i>Add all metrics of this graph to: new custom graph</i>
LI:Go to a host or service detail, there to <i>Service Metrics</i>, use the small icon of one metric and select <i>Add this metric to custom graph...</i>
LI:Press <i>EDIT</i> in the Views, Bookmarks or Reports-Snapin and click the new button <i>Custom Graphs</i> at the top
ID: 8359
Title: Manage Extension Packages (MKPs) via WATO
Component: WATO
Level: 3
Class: New feature
Version: 1.2.7i4
WATO now has a new module for managing Check_MK Extension Packages (MKPs).
The new module helps you to create MKPs and also allows you to install MKPs (e.g.
from <a href="http://exchange.check-mk.org">The Check_MK Exchange</a>). MKPs can also
be distributed to your slave sites.
ID: 8358
Title: Automatic Check_MK Agent updates for Linux
Component: agents
Level: 3
Class: New feature
Version: 1.2.7i4
The Check_MK Agent Bakery now allows for automatically updating Check_MK
Agents via HTTP/HTTPS. This is realized by a new agent plugin called
<tt>cmk-update-agent</tt>, which is currently only available for Linux;
a Windows version will follow soon. The steps for setting up the
automatic updates are:
1. Go to the new page for the Agent Deployment. You will find a new button
for this on the Agent Bakery page.
2. Fulfill all of the prerequisites that are shown on that new page.
This involves creating agents that contain the new plugin with a
valid configuration and enabling the deployment.
3. Install the newly prepared agents on the target hosts of your choice.
4. Call <tt>cmk-update-agent register</tt> on these systems in order
to register the agents for updates.
5. Don't forget to enable the agent deployment's master switch!
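Step 4 on a target host can be sketched like this. The guard around the call is only illustrative; the register command itself is the one named above.

```shell
# Hedged sketch of step 4: register an installed agent for automatic updates.
# Run this on the target host after installing the baked agent package.
if command -v cmk-update-agent >/dev/null 2>&1; then
    cmk-update-agent register
else
    echo "cmk-update-agent not found - install a baked agent containing the plugin first"
fi
```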
ID: 8304
Title: New graphical frontend for metrics time graphs
Component: metrics
Level: 3
Class: New feature
Version: 1.2.7i3
The Check_MK Enterprise Edition now has a new graphical frontend for
displaying metrics time graphs. Instead of using pixelated PNG images it makes
use of interactive HTML5 vector graphics. These graphs do not only
look better. They also allow you to:
LI:Zoom horizontally using your mouse wheel
LI:Zoom vertically by dragging points up and down
LI:Scroll horizontally (into past and future) by dragging points left and right
LI:Resize a graph using the right bottom corner
If more than one graph is being displayed on the same HTML page then any zoom
or shift in time will be synchronized to <b>all</b> graphs on the same page.
The reporting also uses the new graphing engine and replaces the embedded
pixel images with real vector graphics.
A further advantage of the new system is that in distributed setups no
reverse HTTP proxy is needed any longer. The data is fetched via
Livestatus (an up-to-date core is necessary).
ID: 8301
Title: cmcdump: New tool for offline mirroring satellite sites into a central site
Component: cmc
Level: 3
Class: New feature
Version: 1.2.7i3
The CMC now has a new replication mechanism for mirroring the state
of satellite monitoring sites into a central site. This is similar to
<tt>livedump</tt> for the Nagios core, but much more powerful.
In order to set this up you need to call <tt>cmcdump -C > cmc.config</tt>
on the remote site and transfer that file to the central site into
<tt>etc/check_mk/conf.d/yourfile.mk</tt>. This will dump the configuration
of all hosts and services. Afterwards activate the updated configuration
with <tt>cmk -O</tt>. You need to repeat this from time to time so that
your central site stays up-to-date.
At a much shorter interval (e.g. once per minute) you call <tt>cmcdump >
cmc.state</tt> on the same remote site. This can easily be done with a cron
job. You also transfer that file to the central site via any mechanism
you like (scp, http, rsync, ...). There you read it into the core with:
C+:
OM:unixcat tmp/run/live < cmc.state
C-:
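The per-minute state transfer from the satellite can be sketched as a small script called from cron. This is a hedged sketch: the host name <tt>central</tt>, the <tt>scp</tt> transport and the file locations are assumptions; use whatever mechanism fits your setup, as described above.

```shell
#!/bin/sh
# Hedged sketch: dump the satellite state and push it to the central site.
# Meant to be called once per minute from the site user's crontab.
# "central" and scp are assumptions - any copy mechanism works.
STATE=/tmp/cmc.state
if command -v cmcdump >/dev/null 2>&1; then
    cmcdump > "$STATE" && scp -q "$STATE" central:/tmp/cmc.state
else
    echo "cmcdump not found - run this inside an OMD site" >&2
fi
```

On the central site a matching cron job would then feed the transferred file into the core with the <tt>unixcat</tt> call shown above.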
This will update the core's complete state of all hosts and services that
are contained in the dump. The transferred state will correctly reflect the
following variables:
LI:The actual state (PEND, OK, WARN, ...)
LI:The plugin output
LI:The long (multiline) plugin output
LI:The performance data
LI:Whether the object is flapping and the current level of flappiness
LI:The time of the last check execution
LI:The time of the last state change
LI:The check execution time
LI:The check latency
LI:The number of the current check attempt
LI:Whether the current state is hard or soft
LI:Whether a problem has been acknowledged
LI:Whether the object is currently in a scheduled downtime
On the central site this is handled almost - but not entirely - like
a check execution. One difference is that no notifications will be
sent. But performance data is being processed and graphs will be created. Also
the monitoring log is being updated and availability data can be processed.
Depending on the synchronization interval of your data transfer, the latter
might not be 100% precise, however.
The tool <tt>cmcdump</tt> is in your path and can be executed directly.
Call <tt>cmcdump --help</tt> for details on how to call this tool.
ID: 8275
Title: Alert Handlers - execute automatic actions upon state changes of hosts and services
Component: cmc
Level: 3
Class: New feature
Version: 1.2.7i2
Check_MK now supports automatic actions (scripts) to be executed upon the
state change of a host or service. This is similar to Nagios "Event Handlers"
but has a much more flexible configuration and other advantages.
Everything starts with a state change of a host or service. It does not
matter whether this change is "soft" - i.e. the maximum number of check
attempts has not yet been reached. All that matters is that the state has
changed from one of OK/WARN/CRIT/UNKNOWN to another.
Whenever this happens, a new global rule chain of <i>Alert Handler Rules</i>
is being processed. Each rule that matches calls an external script of
your choice. Typical use cases are restarting services, triggering garbage
collection in Java VMs or similar actions.
Please note that some folks insist that monitoring should not try to repair
things or by any other means actively <i>change</i> things. Whether you share
this opinion or not is your own decision. Alert handlers do not limit you
in what you exactly do with them. But you have been warned.
When you compare alert handlers with Rule Based Notifications (RBN),
here are some important differences:
LI:Notifications are being suppressed during downtimes, outside of the notification period, when the host is down and other situations. Alerts are never suppressed.
LI:Notifications cannot be triggered at soft state changes.
LI:Alert handlers only work with the Check_MK Micro Core. If you need Nagios or Icinga please use the Event Handlers of those cores.
LI:Alert handling rules do not allow cancelling. All matching rules are being executed.
LI:As long as no alert rule is defined the alert handling mechanism is deactivated in the core.
Note: this is just the first implementation of the Alert Handlers. Next steps
will be the introduction of error tracking, notifications tied to alert actions,
even more flexible conditions, a system for secure remote execution and much
more. Stay tuned!
H2:Setting up Alert handlers
For setting up alert handlers you first need to create a script that should
be called. This can be written in any programming language - most people
will use a simple BASH script. It must be installed in
<tt>~/local/share/check_mk/alert_handlers</tt> and made executable.
The script is provided with all information about the alert via environment
variables that begin with <tt>ALERT_</tt> - very similar to a notification
script. A good start for testing is the following script:
F+:/local/share/check_mk/alert_handlers/foo
#!/bin/bash
env | grep ALERT_ | sort > /tmp/alert.out
F-:
This will dump all the variables of the alert into the file <tt>/tmp/alert.out</tt>.
When specifying this script in the alert handler rule, simply write <tt>foo</tt>.
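Building on that, a handler that actually reacts to a state could look like the following sketch. The variable names <tt>ALERT_SERVICESTATE</tt> and <tt>ALERT_HOSTNAME</tt> are assumptions derived from the <tt>ALERT_</tt> naming scheme; dump the environment with the script above to confirm what your version actually provides.

```shell
#!/bin/bash
# Hedged sketch: react only to CRIT alerts of a service.
# ALERT_SERVICESTATE and ALERT_HOSTNAME are assumed variable names - verify
# them with the env dump script above before relying on this.
if [ "${ALERT_SERVICESTATE:-}" = "CRIT" ]; then
    # Replace this echo with your real action, e.g. a service restart via ssh.
    echo "CRIT on ${ALERT_HOSTNAME:-unknown} - would trigger restart here"
fi
```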
For debugging it is useful to set <i>Alert handler log level</i> to <i>Full dump
of all variables</i> and <i>Logging of the alert processing</i> to <i>on</i>
in the global settings. You will find information in <tt>~/var/log/cmc.log</tt>
and <tt>~/var/log/alerts.log</tt>.
ID: 8273
Title: Recurring Scheduled Downtimes - adhoc via command and also via rule set
Component: cmc
Level: 3
Class: New feature
Version: 1.2.7i2
Check_MK now supports <i>recurring scheduled downtimes</i> (or simply
<i>recurring downtimes</i>). Let's assume that you have a couple of servers
that are rebooted once a week always at the same time. You surely do not
want any notifications about that, you also do not want to have these hosts
displayed as "problems" in the problem views.
Up to now the only useful tool in such situations was the notification
period. But that way you can just suppress notifications - not the problem
display. Also setting up notification periods requires configuration
permissions (WATO).
The new recurring downtimes create normal scheduled downtimes for you on
a regular basis. This is a direct enhancement of the already existing
downtimes: they now have a new field where you can specify an interval at which
the downtime should be repeated. Everything is then handled by the monitoring
core (CMC) and no external cron job is involved. This has a few advantages:
LI:No cron job is needed
LI:The recurring downtimes are visible in the <i>Downtimes</i> view - even if they are currently not active
LI:Recurring downtimes can be set and removed by using the existing downtime commands
You can create recurring downtimes in two ways:
H2:Using commands
The easiest way is to just use the same commands as for creating one-time
downtimes. The command box now has a new option <i>Repeat this downtime
on a regular base every ____</i>. You can choose between <i>hourly</i>,
<i>daily</i> and <i>weekly</i>. If you e.g. create a downtime for 12:34 on
Monday and select <i>weekly</i> then this downtime will be repeated on every
Monday from now on. Changes in the daylight saving time are compensated,
so the time of day (12:34) will be valid in and out of DST.
Such downtimes behave exactly like one-time downtimes - with the single
difference that they are not being deleted when they end but shifted to the
next interval instead.
H2:Using rule sets
There is a second way to create recurring downtimes: two new WATO rule sets
called <i>Recurring downtimes for hosts</i> and <i>Recurring downtimes for
services</i>. Using these you can base the downtimes on WATO folders and host
tags. While this could also be done by selecting hosts and services via the GUI
and applying commands, it still has one advantage: you can specify downtimes
for objects that <i>do not exist yet</i>. If you create a recurring downtime
for all servers with the tag <i>Windows</i>, then Windows hosts that are
added at a later time will automatically get that recurring downtime as well.
Recurring downtimes that have been created via a rule cannot be deleted by
the operator via a command, of course. All downtime views have a new column
<i>Origin</i> that shows you whether a downtime exists due to a <i>command</i>
or due to <i>configuration</i>.
ID: 8090
Title: Report scheduler - automatically email reports on a regular basis
Component: Reporting & Availability
Level: 3
Class: New feature
Version: 1.2.6b1
The new reporting scheduler allows users to have reports emailed at
regular intervals. The scheduler is available for admins and normal
users. Normal users can only mail reports to themselves.
Currently the schedule can be daily, weekly or monthly, where you
can configure the time of day and the relative day in the week
or month.
The scheduler allows you to specify contexts for reports that need
one. For example, you can use the shipped example report
<i>Report of host</i> and specify the host to report about
directly in the scheduler.
ID: 8078
Title: All Linux agent plugins now supported by agent bakery - except ORACLE
Component: agents
Level: 3
Class: New feature
Version: 1.2.5i6
The new agent bakery now supports all official Check_MK agent plugins
for the Linux agent - in the variants RPM and DEB. The only exceptions
are the ORACLE agent plugins, since these are currently being rewritten.
As soon as they are finished, they will also be supported.