ID: 8052
Title: Speed up availability queries with new caching (disabled by default)
Component: Livestatus
Level: 3
Class: New feature
Version: 1.2.5i4
The Check_MK Micro Core now has an alternative implementation of the
Livestatus table <tt>statehist</tt>. This table is the basis for all
availability computations. In the current implementation, which is still
the only one available when using the Nagios core, all historic logfiles
that cover the query range have to be evaluated for each query. Despite
caching, this can mean intense CPU and IO usage. If you have a large number
of hosts and services, a query for a long time frame can take minutes.
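As a rough illustration of the kind of query this affects, the following sketch builds a Livestatus request against <tt>statehist</tt> for a given time range. The helper name <tt>build_statehist_query</tt> and the chosen column set are assumptions made for this example, not part of Check_MK.

```python
def build_statehist_query(start, end,
                          columns=("host_name", "service_description",
                                   "state", "duration")):
    """Build a Livestatus query text for the statehist table.

    start and end are Unix timestamps that limit the evaluated range.
    The helper itself is hypothetical; it only assembles the query text.
    """
    lines = [
        "GET statehist",
        "Columns: " + " ".join(columns),
        "Filter: time >= %d" % start,
        "Filter: time < %d" % end,
    ]
    # A Livestatus query is terminated by an empty line.
    return "\n".join(lines) + "\n\n"


if __name__ == "__main__":
    print(build_statehist_query(1400000000, 1400086400))
```

The wider the range between the two <tt>time</tt> filters, the more history has to be covered by the query.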
The new implementation needs to be enabled in the global settings
for the Check_MK Micro Core: <i>In-memory cache for availability data
(experimental)</i>. You also have to configure a time range. This limits how
far into the past you can run availability queries. The default setting is
two years.
During the start of The Core all historic log files for that time range are
parsed into a very efficient in-memory database, so that future availability
queries do not need any disk IO or logfile parsing. The cache is automatically
updated when new alerts happen. Please also note that The Core is not
restarted during normal operation and activation of changes, so the cache
is only invalidated when you reboot your server or do a software update
of Check_MK.
The parser can process 500,000 messages per second and more, so if your disk
IO is fast enough even parsing a large history does not take longer than
a couple of minutes. This is done in the background and does not prevent
The Core from working or queries from being answered. Even availability
queries are being answered while the cache is still being built up. If the
queried time range is already in the cache then the query can immediately
be processed. Otherwise it waits for the cache to be ready.
When it comes to timeperiod definitions, the new implementation behaves
differently: it reflects later changes in the definitions of your
timeperiods. This is convenient when you want to work with service periods
for your availability queries. The classical implementation evaluates the
<tt>TIMEPERIOD TRANSITION</tt> entries in your logfiles. The new one directly
takes the current definitions into account and computes them for the time
range in the past.
<b>Note:</b> As of today this implementation is still highly <i>experimental</i>
and might not only produce wrong results but also crash your core.
ID: 8050
Title: Use the same problem id through all notifications of a problem
Component: cmc
Level: 2
Class: Bug fix
Version: 1.2.5i3
When a host or service goes from OK into a hard non-OK state, a new
problem ID is generated. When the state then changes between several
different non-OK states and finally goes back to OK, the same
problem ID is reused in all notifications. That way notifications can be
matched in an external system. Previously each state
change created a new problem ID.
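The behaviour described above can be sketched as a small state tracker. This is a minimal illustration and not the actual code of the Micro Core; the class name <tt>ProblemIdTracker</tt> and its method are hypothetical.

```python
import itertools


class ProblemIdTracker:
    """Hypothetical sketch: one problem ID per OK -> non-OK -> OK cycle."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._active = {}  # object name -> ID of the currently open problem

    def on_hard_state_change(self, name, new_state):
        """Return the problem ID to use in the notification for this change."""
        if new_state != "OK":
            # Allocate an ID only when the problem starts; changes between
            # non-OK states reuse the ID that is already open.
            if name not in self._active:
                self._active[name] = next(self._next_id)
            return self._active[name]
        # Recovery: report the ID of the problem that ends, then forget it.
        return self._active.pop(name, None)


if __name__ == "__main__":
    tracker = ProblemIdTracker()
    for state in ("WARN", "CRIT", "OK", "CRIT"):
        print(state, tracker.on_hard_state_change("myhost", state))
```

An external system can then use the ID as a key to open a ticket on the first notification and close it on the recovery.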
ID: 8051
Title: Smart PING now uses same layout as vanilla PING
Component: cmc
Level: 2
Class: New feature
Version: 1.2.5i3
The Smart PING of the Micro Core now by default creates exactly the same type of
packets that a plain command line <tt>ping</tt> does. This has the disadvantage of
creating larger packets than necessary, but has the advantage of being compatible
with more firewalls. Some of those tend to regard ICMP ECHO REQUEST packets without
payload as a bogus attack and drop them.
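To make the size difference concrete, the following sketch builds an ICMP ECHO REQUEST with and without payload. It is only an illustration of the packet layout, not the code of the Micro Core. The 56 byte payload is an assumption based on the default of the common Linux <tt>ping</tt>, and the function names are made up for this example.

```python
import struct


def icmp_checksum(data):
    """Internet checksum (one's complement sum of 16 bit words)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    total = (total >> 16) + (total & 0xFFFF)
    total += total >> 16
    return ~total & 0xFFFF


def build_echo_request(ident, seq, payload=b""):
    """Return the bytes of an ICMP ECHO REQUEST (type 8, code 0)."""
    # The checksum is computed over the packet with a zeroed checksum field.
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)
    checksum = icmp_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, ident, seq) + payload


if __name__ == "__main__":
    bare = build_echo_request(1, 1)
    padded = build_echo_request(1, 1, b"\x00" * 56)  # assumed ping default
    print(len(bare), len(padded))
```

The packet without payload consists of the 8 byte ICMP header only, which is the shape some firewalls reject.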
You can re-enable PING packets without payload via a new global option for
<tt>main.mk</tt>:
F+:main.mk
cmc_smartping_omit_payload = True
F-:
This is also available via WATO in the global settings of the Micro Core.
ID: 8045
Title: Do not increase notification number on recovery notifications
Component: cmc
Level: 2
Class: Bug fix
Version: 1.2.5i3
When a notification of the type RECOVERY is created, the
notification number is no longer increased. This is necessary because
otherwise RECOVERY notifications would always have a number of at least 2,
which garbles possible escalations.
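The rule can be sketched as follows. The function name <tt>next_notification_number</tt> is hypothetical and the sketch only covers the numbering itself, not the rest of the notification logic.

```python
def next_notification_number(current_number, notification_type):
    """Return the number carried by the notification that is created now.

    Hypothetical sketch: RECOVERY keeps the current number, every other
    notification type counts one up.
    """
    if notification_type == "RECOVERY":
        return current_number
    return current_number + 1


if __name__ == "__main__":
    number = 0
    for ntype in ("PROBLEM", "RECOVERY"):
        number = next_notification_number(number, ntype)
        print(ntype, number)
```

With this rule a problem that recovers right after its first notification sends both notifications with the number 1, so an escalation step limited to the first notification also sees the recovery.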
ID: 8044
Title: Fixed different behaviour of inline SNMP and normal SNMP mode which resulted in check exceptions
Component: inline-snmp
Level: 2
Class: Bug fix
Version: 1.2.5i5
Fixed a bug in our inline SNMP bulk walk implementation which resulted in
different behaviour compared to classic SNMP mode. This resulted (at least
in our case) in exceptions during checking.
ID: 8038
Title: Adapted bulkwalk behaviour of snmp command to fix issue with missing OID
Component: inline-snmp
Level: 2
Class: Bug fix
Version: 1.2.5i1
The previous implementation of the bulkwalk in our inline SNMP code did not
behave like e.g. the snmpbulkwalk command of net-snmp. That command
asks for an OID via bulkwalk and processes the response. When none of the
returned OIDs matches the requested OID, it performs an explicit SNMP GET
on the requested OID to get that single value. This has now been added.
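The fallback can be sketched like this. It is a simplified illustration and not the inline SNMP code itself: <tt>bulkwalk</tt> and <tt>get</tt> are stand-in callables supplied by the caller, and "matches" is interpreted here as "is the requested OID or lies below it", which is an assumption.

```python
def walk_with_get_fallback(oid, bulkwalk, get):
    """Walk below oid; fall back to a single GET if the walk finds nothing.

    bulkwalk(oid) must return a list of (oid, value) pairs, get(oid) must
    return a value or None. Both are hypothetical stand-ins for real SNMP
    requests.
    """
    rows = [(o, v) for o, v in bulkwalk(oid)
            if o == oid or o.startswith(oid + ".")]
    if rows:
        return rows
    # Nothing below the requested OID: it may be a leaf, so ask for it directly.
    value = get(oid)
    if value is None:
        return []
    return [(oid, value)]


if __name__ == "__main__":
    data = {".1.3.6.1.2.1.1.5.0": "myhost"}

    def fake_bulkwalk(oid):
        return [(o, v) for o, v in sorted(data.items())
                if o.startswith(oid + ".")]

    def fake_get(oid):
        return data.get(oid)

    print(walk_with_get_fallback(".1.3.6.1.2.1.1.5.0", fake_bulkwalk, fake_get))
```

Without the fallback, a request for a leaf OID such as the one in the example would return no rows at all.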
ID: 8035
Title: Inline-SNMP now handles SNMP errors like "noSuchName ..."
Component: inline-snmp
Level: 2
Class: Bug fix
Version: 1.2.5i1
Previous versions made SNMP communication fail once a single OID could not be
fetched. This has been fixed to match the behaviour of the classic SNMP code.
ID: 8034
Title: Automatically packages RPM and DEB agents for Linux
Component: agents
Level: 3
Class: New feature
Version: 1.2.5i3
Check_MK now creates RPM and DEB packages for your Linux agents via WATO.
Via rules in the new section <i>Monitoring Agents</i> you can configure
settings and plugins for the agent. The new WATO module <i>Monitoring Agents</i>
allows you to download the RPM or DEB packages of the various configurations.
<b>Note</b>: This feature is not finished yet and still experimental. Most
of the corresponding WATO rules are still missing. There is no documentation
yet. <b>Use at your own risk!</b>
ID: 8025
Title: Add all custom variables of host, service and contact to notification
Component: cmc
Level: 2
Class: New feature
Version: 1.2.5i1
The CMC now automatically adds all custom variables of hosts, services and
contacts to the notification context. The variable name is converted to upper
case and prefixed with the word <tt>HOST</tt>, <tt>SERVICE</tt> or
<tt>CUSTOM</tt>. So a host variable with
the name <tt>_foobar</tt> will be available as <tt>HOST_FOOBAR</tt> in the
notification context. The names are then prefixed with <tt>NOTIFY_</tt> and put
into the environment of the notification plugin. So in the end the variable
will be available as <tt>NOTIFY_HOST_FOOBAR</tt>, e.g. in a shell script:
F+:mynotify.sh
echo "Foobar: $NOTIFY_HOST_FOOBAR"
F-:
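The naming scheme can be summarised in a few lines of Python. The function name <tt>notification_env_name</tt> is made up for this sketch; it only reproduces the mapping described above and is not taken from the Check_MK code.

```python
def notification_env_name(prefix, custom_var):
    """Map a custom variable to its environment variable name.

    prefix is the word for the object type (e.g. "HOST" or "SERVICE"),
    custom_var is the variable name including the leading underscore.
    """
    if not custom_var.startswith("_"):
        raise ValueError("custom variable names must begin with an underscore")
    return "NOTIFY_" + prefix + custom_var.upper()


if __name__ == "__main__":
    print(notification_env_name("HOST", "_foobar"))
```

A notification plugin written in Python could then read the value with <tt>os.environ.get("NOTIFY_HOST_FOOBAR")</tt>.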
H2:Notes
<ul>
<li>In the configuration files (<tt>main.mk</tt> or files below <tt>conf.d</tt>)
the variables can be set via <tt>extra_host_conf</tt> or <tt>extra_service_conf</tt>.
The variable names have to begin with an underscore. So a variable name of <tt>foobar</tt>
is not allowed. You have to write either <tt>_foobar</tt> or <tt>_FOOBAR</tt>.</li>
<li>When you add contact custom variables via WATO (button <i>Custom Attributes</i>
in the user management) the underscore will <i>automatically</i> be added.</li>
<li>When using Nagios as monitoring core you have to adapt <tt>check_mk_templates.cfg</tt>
whenever you add a new custom variable.</li>
</ul>
ID: 8018
Title: Flapping notifications for services are no longer sent if switched off
Component: cmc
Level: 2
Class: Bug fix
Version: 1.2.5i1
The user's global notification option "Service" - "Start or end of flapping state"
was not processed correctly by the core, so the user received flapping
notifications even if this option was disabled.