ID: 1663
Title: winperf_if: fixed incorrect enumeration of interface index
Component: Checks & Agents
Level: 2
Class: Bug Fix
Version: 1.2.7i1
The previous update broke services that used the interface index as item.
This has been fixed.
ID: 1659
Title: windows agent: fixed output of 64 bit performance counters
Component: Checks & Agents
Level: 2
Class: Bug Fix
Version: 1.2.5i7
The Windows agent was unable to output 64 bit performance counters.
Those values were truncated to 32 bits, which caused spurious counter wraps in the checks.
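The effect of the truncation can be illustrated with a little arithmetic. The following sketch is not agent code: the helper <tt>truncate_to_32bit</tt> and the sample values are made up for illustration. It shows how a steadily growing 64 bit counter looks like it wrapped once only the lower 32 bits are output.

```python
MASK_32 = 0xFFFFFFFF  # bit mask for the lower 32 bits


def truncate_to_32bit(value):
    """Model of the bug: keep only the lower 32 bits (hypothetical helper)."""
    return value & MASK_32


# Two consecutive readings of a growing 64 bit counter (made-up values)
previous = 2**32 - 10
current = 2**32 + 10

seen_previous = truncate_to_32bit(previous)
seen_current = truncate_to_32bit(current)

# The real counter increased, but the truncated value decreased.
# A check comparing the two truncated readings sees a counter wrap.
real_delta = current - previous
looks_like_wrap = seen_current < seen_previous
```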
ID: 1726
Title: Move variable data of Linux/UNIX agents to /var/lib/check_mk_agent
Component: Checks & Agents
Level: 2
Class: New Feature
Version: 1.2.5i7
The Linux, AIX, Solaris and other UNIX agents no longer put state and cache files
below <tt>/etc/check_mk</tt>. By default these files now go below <tt>/var/lib/check_mk_agent</tt>.
This can be changed directly in the agent in the following line:
F+:check_mk_agent.*
export MK_VARDIR=/var/lib/check_mk_agent
F-:
If you update to the new version of the agent, keep the following
in mind:
<ul>
<li>Check_MK agent will not find saved <tt>logwatch.state</tt> files and will consider all logfiles as new. This means
that you will miss up to one check cycle of possible new logfile entries.</li>
<li>Check_MK agent will forget its cached data and recompute all asynchronous checks.</li>
<li>Other plugins may lose their saved state as well.</li>
</ul>
You can prevent this by manually copying selected data from
<tt>/etc/check_mk</tt> to the new directory.
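A minimal sketch of such a one-time copy, assuming the default paths named above. The function <tt>copy_state_files</tt> is a hypothetical helper and not part of the agent. Which files are worth keeping depends on the plugins in use; <tt>logwatch.state</tt> is the only file name taken from this werk.

```python
import os
import shutil


def copy_state_files(old_dir, new_dir, names):
    """Copy the named files from old_dir to new_dir and return the names copied.

    Files missing in old_dir are skipped. Files that already exist in
    new_dir are left untouched so that newer state is not overwritten.
    (Hypothetical helper, not part of the agent.)
    """
    os.makedirs(new_dir, exist_ok=True)
    copied = []
    for name in names:
        src = os.path.join(old_dir, name)
        dst = os.path.join(new_dir, name)
        if os.path.isfile(src) and not os.path.exists(dst):
            shutil.copy2(src, dst)  # copy2 also preserves the modification time
            copied.append(name)
    return copied
```

On a monitored host this would be called once, before the first run of the new agent, for example as <tt>copy_state_files("/etc/check_mk", "/var/lib/check_mk_agent", ["logwatch.state"])</tt>.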
ID: 1725
Title: The get_average() function from now on only returns one value: the average
Component: Core & Setup
Level: 2
Class: New Feature
Version: 1.2.5i7
Note to all developers of checks that use <tt>get_average()</tt>: In order to simplify
the check API, the function <tt>get_average()</tt> no longer returns
the additional <tt>timedif</tt> value, just the average. Please review your checks
for uses of this function.
ID: 1723
Title: New check API function get_rate() as more intelligent replacement for get_counter()
Component: Core & Setup
Level: 2
Class: New Feature
Version: 1.2.5i7
The function <tt>get_counter()</tt> is now deprecated for use in
checks. There is a new function called <tt>get_rate()</tt> that should be
used as a replacement.
F+:
def get_rate(countername, this_time, this_val, allow_negative=False, onwrap=SKIP):
    ...
    return rate
F-:
The call syntax is almost the same, just with the new optional parameter
<tt>onwrap</tt>. Important, however: now only the rate (counter steps
per second) is returned. The former additional return value
<tt>timedif</tt> has been dropped since it is of no real use. So the return
type has changed from tuple to float.
The most important change, however, is in the handling of counter wraps. A
<i>counter wrap</i> is assumed in three situations:
<ul>
<li>When the counter is seen for the first time (initialization)</li>
<li>When the previous value of the counter is larger than the new one</li>
<li>When the time difference since the last counter update was less than one second</li>
</ul>
Wraps usually happen when a device reboots or when the valid range of the
counter is exceeded and it starts again at zero.
The old function <tt>get_counter()</tt> used to raise an exception of type
<tt>MKCounterWrapped</tt>. This exception was handled by the core of
Check_MK, which skipped that check for one cycle. This was a problem for checks
with more than one counter: at the point of initialization the code of the
check was aborted after the first of these counters had been initialized.
If you had 10 counters, you would need 10 check cycles until the first
check result was returned. To avoid that, the check had
to catch the <tt>MKCounterWrapped</tt> itself and handle this situation,
which is very ugly.
The new function <tt>get_rate()</tt> implements a different approach.
By default no exception is raised in case of a counter wrap. Instead the
value <tt>0.00</tt> is returned, and Check_MK keeps a record of this wrap
event. After the check function has completed (and all counters are handled),
Check_MK raises <i>one final</i> <tt>MKCounterWrapped</tt> exception, so
that the (invalid) check result is skipped as it should be. This
reduces the check programmer's burden a bit: even if the
check has several counters, it does not need to catch counter wraps.
In order to give the check more flexibility there are two other behaviours
that can be selected with the optional argument <tt>onwrap</tt>:
<table>
<tr><th>onwrap</th><th>behaviour</th></tr>
<tr><td class=tt>SKIP</td><td>Skip result of check, after all counters are handled (default)</td></tr>
<tr><td class=tt>RAISE</td><td>Immediately raise a <tt>MKCounterWrapped</tt> exception (legacy behaviour)</td></tr>
<tr><td class=tt>ZERO</td><td>Ignore the wrap and return a rate of 0.0 (be careful!)</td></tr>
</table>
Note: Using <tt>ZERO</tt> is generally <i>not</i> a good idea. It can
make a service jump from CRIT to OK every now and then and generate bogus
notifications.
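The deferred handling of wraps can be sketched as follows. This is a simplified model for illustration and not the actual Check_MK implementation: the module-level state, the wrapper <tt>run_check</tt> and the sample check <tt>check_traffic</tt> are assumptions, and <tt>allow_negative</tt> is assumed to mean that a decreasing value is not treated as a wrap.

```python
SKIP, RAISE, ZERO = "skip", "raise", "zero"


class MKCounterWrapped(Exception):
    pass


_counters = {}      # countername -> (last_time, last_value)
_wrap_seen = False  # set when a counter wrapped in SKIP mode


def get_rate(countername, this_time, this_val, allow_negative=False, onwrap=SKIP):
    """Return the counter rate per second, or 0.0 on a counter wrap."""
    global _wrap_seen
    previous = _counters.get(countername)
    _counters[countername] = (this_time, this_val)

    wrapped = previous is None  # first time this counter is seen
    if not wrapped:
        last_time, last_val = previous
        timedif = this_time - last_time
        wrapped = timedif < 1 or (this_val < last_val and not allow_negative)

    if wrapped:
        if onwrap == RAISE:
            raise MKCounterWrapped(countername)  # legacy behaviour
        if onwrap == SKIP:
            _wrap_seen = True  # remember the wrap, but keep going
        return 0.0

    return (this_val - last_val) / float(timedif)


def run_check(check_function, *args):
    """Run a check; return its result, or None if this cycle is skipped."""
    global _wrap_seen
    _wrap_seen = False
    try:
        result = check_function(*args)
        if _wrap_seen:
            # one final exception after all counters have been handled
            raise MKCounterWrapped("at least one counter wrapped")
        return result
    except MKCounterWrapped:
        return None


def check_traffic(now, in_octets, out_octets):
    """Hypothetical check with two counters and no wrap handling of its own."""
    rate_in = get_rate("if.in", now, in_octets)
    rate_out = get_rate("if.out", now, out_octets)
    return (0, "in: %.1f/s, out: %.1f/s" % (rate_in, rate_out))
```

In this model the first call of <tt>run_check(check_traffic, ...)</tt> initializes both counters and is skipped as a whole. The second call already returns a result, regardless of how many counters the check uses.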
ID: 1719
Title: Allow matching both host name and IP address when checking for events in Event Console
Component: Event Console
Level: 2
Class: New Feature
Version: 1.2.5i7
The check <i>Check for events in Event Console</i> used to allow a
match on either the host name or the IP address when trying to find events that
belong to a monitored host. Now there is a third option <i>Try both
host name and IP address</i> that tries to match both the host name and the
IP address and matches the event if either of the two
succeeds. This helps in situations where the Event Console sometimes
gets only the IP address of the remote host, but no host name.
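The three options can be sketched as a small matching function. This is an illustration only: the function name <tt>event_matches_host</tt> and the mode values are made up and are not taken from the Event Console code.

```python
def event_matches_host(event_host, hostname, ipaddress, mode):
    """Return True if the host field of an event belongs to the monitored host.

    mode is one of the hypothetical values "hostname", "ipaddress" or "both".
    """
    if mode == "hostname":
        return event_host == hostname
    if mode == "ipaddress":
        return event_host == ipaddress
    if mode == "both":
        # the event matches if either of the two comparisons succeeds
        return event_host in (hostname, ipaddress)
    raise ValueError("unknown match mode: %r" % mode)
```

With the mode <tt>"both"</tt>, an event that carries only the IP address of the remote host is still assigned to the host, while the mode <tt>"hostname"</tt> would miss it.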
ID: 1710
Title: omd_status: Fix totally missing section in Linux agent
Component: Checks & Agents
Level: 2
Class: Bug Fix
Version: 1.2.5i7
The reason was an invalid call to <tt>run_cached</tt>. As a result the agent section
<tt>omd_status</tt> was always empty. This has been fixed.
ID: 1704
Title: Fix notification analyser in case there are non-ASCII characters in the notification context
Component: Notifications
Level: 2
Class: Bug Fix
Version: 1.2.5i7
ID: 1703
Title: ups_test: Fix computation of time since last self test
Component: Checks & Agents
Level: 2
Class: Bug Fix
Version: 1.2.5i7
Instead of the time since the last self test of the battery, the check used
the time the last test had <i>lasted</i>. So the check never triggered
an alarm for an overdue self test, only for the summary state of the last test.
Thanks to Cyril Pawelko for the bug report!
ID: 1700
Title: Enable icon for link to host/service parameters by default
Component: Multisite
Level: 2
Class: New Feature
Version: 1.2.5i7
Also let it point to the general configuration page of that host/service, not
to its check parameters. Now also active checks and hosts have this icon.