ID: 8060
Title: Implemented limiting of fetched OIDs for if/if64 checks
Component: inline-snmp
Level: 1
Class: New feature
Version: 1.2.6b1
It is now possible to configure a rule in WATO which limits the number of
OIDs fetched by a check. For example, the if/if64 check can then fetch the
information for only some interfaces instead of all of them.
This is a performance optimisation feature which is only useful for special
use cases. The OID ranges to fetch need to be specified manually and can
only be configured for a small subset of checks where such a limit makes
sense. It applies only when using bulk walks with Inline-SNMP.
In case of the interface checks this rule might be useful when you have
e.g. a network switch with several hundreds of interfaces and only want
to monitor a small number of them, for example the first 10 interfaces.
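A minimal sketch of how such a rule could look in <tt>main.mk</tt>; the rule
set name <tt>snmp_limit_oid_range</tt> and the exact value format are
assumptions for illustration only - the authoritative place to configure
this is the WATO rule:
F+:main.mk
# Hypothetical example: for all hosts with the tag "switch" fetch only
# the first 10 interface indexes for the if64 check. Rule set name and
# value format are assumptions - create the rule in WATO to get the
# exact syntax.
snmp_limit_oid_range = [
    ( ("if64", [(1, 10)]), ["switch"], ALL_HOSTS ),
]
F-: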
ID: 8061
Title: Fixed segfaults on check timeout when using Inline-SNMP
Component: inline-snmp
Level: 1
Class: Bug fix
Version: 1.2.6b1
When a check timeout (SIGALRM) occurred while the Inline-SNMP code was
executing Net-SNMP library code, this resulted in a segmentation fault.
The consequence was completely terminated checks or broken WATO service
discovery pages. This issue has now been fixed.
ID: 8055
Title: Change authorization method for host/service groups to loose
Component: config
Level: 2
Class: incomp
Version: 1.2.5i5
Up to now the CMC adopted the behaviour of Nagios when it came to the
authorization for seeing host and service groups: Nagios lets a user see a
host group only if he is a contact for <b>every</b> host in that group. That
leads to anomalies, however, because in the details of a host the group is
visible nevertheless, and some views effectively display a host group anyway,
simply by printing the host together with the group it is contained in.
Both in normal Livestatus and with the CMC you can change this behaviour.
In the CMC this is done with the <i>Authorization settings</i> in the global
settings.
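For normal Livestatus this is done with the <tt>group_authorization</tt>
option of the broker module. A minimal sketch for <tt>nagios.cfg</tt>; the
module and socket paths are just examples and depend on your installation:
F+:nagios.cfg
# group_authorization may be "strict" (old Nagios-like behaviour) or "loose"
broker_module=/usr/lib/check_mk/livestatus.o /var/lib/nagios/rw/live group_authorization=loose
F-: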
<b>Note:</b> The new default setting is now <i>loose</i>. If you want the
previous behaviour back, please change it in the global settings.
ID: 8056
Title: Process Performance data on passive check results
Component: cmc
Level: 1
Class: Bug fix
Version: 1.2.5i5
When passive check results were sent via the legacy command pipe, the
performance data was not processed correctly but interpreted as part of
the check output. This has been fixed.
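For reference, a passive check result with performance data is written to
the command pipe in the usual external command format, with the performance
data separated from the plugin output by a pipe character. A minimal sketch
in Python; the path of the command pipe depends on your installation:
F+:submit_passive_result.py
#!/usr/bin/env python
import time

# Path of the command pipe - adjust to your installation (example: OMD site)
COMMAND_PIPE = "/omd/sites/mysite/tmp/run/nagios.cmd"

host    = "myhost"
service = "My Service"
state   = 0  # 0=OK, 1=WARN, 2=CRIT, 3=UNKNOWN
# Performance data follows the plugin output after the pipe character
output  = "Everything fine | response_time=0.123s;1;2;0"

line = "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
    int(time.time()), host, service, state, output)

with open(COMMAND_PIPE, "w") as pipe:
    pipe.write(line)
F-: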
ID: 8053
Title: Fix sporadic invalid OK status for active checks
Component: cmc
Level: 2
Class: Bug fix
Version: 1.2.5i4
In some very rare cases the CMC misinterpreted the check result status of active
checks as OK, even though the actual status was WARN, CRIT or UNKNOWN. This was
due to an invalid byte offset. This has been fixed.
ID: 8054
Title: Reschedule checks if next check would be too far in future after config change
Component: cmc
Level: 2
Class: Bug fix
Version: 1.2.5i5
If, due to a configuration change (in the check period or in the check interval),
the next scheduled check of a host or service would be too far in the future, it
is now rescheduled to the correctly expected time.
ID: 8051
Title: Smart PING now uses same layout as vanilla PING
Component: cmc
Level: 2
Class: New feature
Version: 1.2.5i3
The Smart PING of the Micro Core now by default creates exactly the same type of
packets that a plain command line <tt>ping</tt> does. This has the disadvantage of
creating larger packets than necessary, but has the advantage of being compatible
with more firewalls. Some of those tend to regard ICMP ECHO REQUEST packets without
payload as a bogus attack and drop them.
You can re-enable PING packets without payload via a new global option for
<tt>main.mk</tt>:
<tt>main.mk</tt>:
F+:main.mk
cmc_smartping_omit_payload = True
F-:
This is also available via WATO in the global settings of the Micro Core.
ID: 8052
Title: Speedup availability queries by new caching (disabled per default)
Component: Livestatus
Level: 3
Class: New feature
Version: 1.2.5i4
The Check_MK Micro Core now has an alternative implementation of the
Livestatus table <tt>statehist</tt>. This table is the basis for all
availability computations. In the current implementation, which is still
the only one available when using the Nagios core, all historic logfiles
that cover the query range have to be evaluated for each query. Despite
caching this can mean considerable CPU and IO usage. If you have a larger
number of hosts and services, a query for a larger time frame can take minutes.
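For reference, availability data is read from the <tt>statehist</tt> table
via a Livestatus query. A minimal sketch in Python over the local Livestatus
socket; the socket path is just an example for an OMD site:
F+:query_statehist.py
#!/usr/bin/env python
import socket, time

# Livestatus UNIX socket - adjust to your site
SOCKET_PATH = "/omd/sites/mysite/tmp/run/live"

# Query the state history of the last 7 days
since = int(time.time()) - 7 * 86400
query = ("GET statehist\n"
         "Columns: host_name service_description state duration\n"
         "Filter: time >= %d\n" % since)

s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
s.connect(SOCKET_PATH)
s.sendall(query.encode("utf-8"))
s.shutdown(socket.SHUT_WR)  # signal end of the query

data = b""
while True:
    chunk = s.recv(4096)
    if not chunk:
        break
    data += chunk
print(data.decode("utf-8"))
F-: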
The new implementation needs to be enabled in the global settings
for the Check_MK Micro Core: <i>In-memory cache for availability data
(experimental)</i>. You also have to configure a time range. This limits how
long into the past you can do availability queries. The default setting is
two years.
During the start of the core all historic log files for that time range are
parsed into a very efficient in-memory database, so that future availability
queries do not need any disk IO or logfile parsing. The cache is automatically
updated when new alerts happen. Please also note that the core is not
restarted during normal operation and activation of changes, so the cache
is only invalidated when you reboot your server or do a software update
of Check_MK.
The parser can process 500,000 messages per second and more, so if your disk
IO is fast enough even parsing a large history does not take longer than
a couple of minutes. This is done in the background and does not prevent
the core from working or queries from being answered. Even availability
queries are answered while the cache is still being built up. If the
queried time range is already in the cache, the query can be processed
immediately. Otherwise it waits for the cache to be ready.
When it comes to timeperiod definitions the new implementation has a
different behaviour: It reflects later changes in the definitions of your
timeperiods. This is convenient when you want to work with service periods
for your availability queries. The classical implementation evaluates the
<tt>TIMEPERIOD TRANSITION</tt> entries in your logfiles. The new one directly
takes the current definitions into account and computes them for the time
range in the past.
<b>Note:</b> As of today this implementation is still highly <i>experimental</i>
and might not only produce wrong results, but might also crash your core.
ID: 8049
Title: Fix exception of cmcrushd on RedHat 5.X
Component: cmc
Level: 1
Class: Bug fix
Version: 1.2.5i3
The problem was due to a different behaviour of Python when <tt>sys.exit</tt>
is called.
ID: 8050
Title: Use the same problem id through all notifications of a problem
Component: cmc
Level: 2
Class: Bug fix
Version: 1.2.5i3
When a host or service goes from OK into a hard non-OK state, a new problem
ID is generated. When the state then changes between several different non-OK
states and finally goes back to OK, the same problem ID is reused in all
notifications. That way notifications can be matched in an external system.
Previously each state change created a new problem ID.
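In a notification script the problem ID can be used as a correlation key for
an external system. A minimal sketch in Python, assuming the problem ID is
exposed via <tt>NOTIFY_SERVICEPROBLEMID</tt> / <tt>NOTIFY_HOSTPROBLEMID</tt>
environment variables; check your notification environment for the exact names:
F+:notify_correlate.py
#!/usr/bin/env python
# Hypothetical notification script: use the problem ID as a correlation
# key so that all notifications of one problem end up in the same ticket.
# The NOTIFY_* variable names are assumptions.
import os

problem_id = os.environ.get("NOTIFY_SERVICEPROBLEMID") \
          or os.environ.get("NOTIFY_HOSTPROBLEMID")
state      = os.environ.get("NOTIFY_SERVICESTATE",
                            os.environ.get("NOTIFY_HOSTSTATE", ""))

# All notifications with the same problem_id belong to the same problem,
# so an external ticket system can update an existing ticket instead of
# creating a new one for every state change.
print("correlation key: %s, current state: %s" % (problem_id, state))
F-: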