ID: 8538
Title: Make graph data resolution visible in the graphs
Component: Metrics System
Level: 1
Class: Bug Fix
Version: 1.2.9i1
Previously, users could not see which data resolution the graphs were calculated and rendered with.
To make this transparent, the step of each graph is now shown at the top right of the graph,
next to the time specification.
ID: 8612
Title: New alert handler for safely executing remote commands on Linux hosts
Component: Alert Handlers
Level: 1
Class: New Feature
Version: 1.2.9i1
Check_MK now ships its own system for safely executing commands on remote Linux
hosts via SSH. Everything is set up with the agent bakery, including the SSH key
exchange. Please refer to the new article in the user manual for details.
ID: 8611
Title: Alert handlers now log success or failure to monitoring history
Component: Alert Handlers
Level: 2
Class: New Feature
Version: 1.2.9i1
Alert handlers now send feedback to the monitoring core. The history of the
affected host or service now gets two log entries: one for the launch of
the handler and one for its termination.
Also, two identical alert handlers are no longer allowed to run
in parallel. This avoids a flood of processes when handlers hang.
ID: 8539
Title: Made consolidation of values transparent to the user
Component: Metrics System
Level: 2
Class: Bug Fix
Version: 1.2.9i1
When showing graphs of larger time ranges, the graphs are normally based on
aggregated values, which means multiple measured values are consolidated to
a single value.
For example, when a graph is based on the "max" aggregation function, as
nearly all graphs are by default, the values in the Average column are not
the real average values over the shown time range, but the averages of the
max values of each step of the graph. The step of the graph is now shown at
the top right of the graph.
To clarify this: if you look at a 7 day graph, you have a step of 30 minutes.
If the service is checked at a 1 minute interval, each point of the graph is
the maximum of the 30 single values measured within that 30 minute step.
So your average values are the 7 day average of the 30 minute maximum values.
This is obviously not what most users would expect, but we would have to fetch
three times as much data per graph to get the real average and the real min values
for each graph. The graphs already produce a lot of load and we try
to keep the impact low, so we decided to make the calculation transparent and
changeable by the user.
If you have a max graph, the min and average columns are slightly grayed out
to visualize the situation. If you hover over the column titles, you get a
description of it. You can now click on the column titles to change the
aggregation function the graph is based on.
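The effect described above can be reproduced with a few lines of Python. This
is only an illustrative sketch: the sample data and the helper names
(consolidate, mean) are invented for this example and are not part of Check_MK.

```python
# Sketch: why the "average of max values" differs from the real average.
# All data and names here are made up for illustration.

def consolidate(values, per_step, func):
    """Collapse each run of `per_step` raw values into one value using `func`."""
    return [func(values[i:i + per_step]) for i in range(0, len(values), per_step)]

def mean(values):
    return sum(values) / len(values)

# One value per minute: mostly 1.0, with a spike of 31.0 once every 30 minutes.
raw = ([1.0] * 29 + [31.0]) * 4  # two hours of data

max_steps = consolidate(raw, 30, max)    # basis of a "max" graph
avg_steps = consolidate(raw, 30, mean)   # basis of an "average" graph

average_of_max = mean(max_steps)  # what the Average column shows on a max graph
real_average = mean(avg_steps)    # average over all raw values

print(average_of_max, real_average)  # prints "31.0 2.0"
```

With a single short spike per step, the Average column of a max graph reports
the spike height, while the real average stays close to the idle value.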
ID: 8540
Title: cmk-update-agent: Fixed exception at end of registration when using auth secret
Component: Agent Bakery
Level: 1
Class: Bug Fix
Version: 1.2.9i1
ID: 8570
Title: Make PROCESS_HOST_CHECK_RESULT's behavior more Nagios-like
Component: The Check_MK Micro Core
Level: 1
Class: Bug Fix
Version: 1.2.9i1
The Nagios documentation for the external command PROCESS_HOST_CHECK_RESULT
explicitly states that the return code in the command line is already the
host state, not the result code of a check result (which was the core's
previous interpretation). We now follow Nagios more closely and map the
return code in the command line as follows:
0 => UP
1 => DOWN/UNREACHABLE (previously this meant UP, too)
2 => DOWN/UNREACHABLE
We still deviate a bit from Nagios, because the actual decision whether a host
is DOWN or UNREACHABLE is made by the core and can't be overridden from the
outside.
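As a rough illustration of the mapping above, the following Python sketch
builds a PROCESS_HOST_CHECK_RESULT line in the standard Nagios external command
format and looks up the resulting host state. The function and variable names
are hypothetical, and rejecting return codes other than 0, 1 and 2 is an
assumption of this sketch, not something stated in this werk.

```python
import time

# Mapping as described in this werk. The core itself decides whether a
# non-UP host is DOWN or UNREACHABLE.
HOST_STATE_BY_RETURN_CODE = {
    0: "UP",
    1: "DOWN/UNREACHABLE",
    2: "DOWN/UNREACHABLE",
}

def host_check_result_command(host_name, return_code, output, now=None):
    """Build an external command line (hypothetical helper for this example)."""
    if return_code not in HOST_STATE_BY_RETURN_CODE:
        # Assumption of this sketch: other codes are rejected.
        raise ValueError("unsupported return code: %r" % (return_code,))
    timestamp = int(time.time() if now is None else now)
    return "[%d] PROCESS_HOST_CHECK_RESULT;%s;%d;%s" % (
        timestamp, host_name, return_code, output)

command = host_check_result_command("myhost", 1, "no response", now=1000)
print(command)                       # [1000] PROCESS_HOST_CHECK_RESULT;myhost;1;no response
print(HOST_STATE_BY_RETURN_CODE[1])  # DOWN/UNREACHABLE
```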
ID: 8613
Title: Fix exception when signing agents and a host once had an agent but now doesn't
Component: Agent Bakery
Level: 2
Class: Bug Fix
Version: 1.4.0i1
ID: 8541
Title: Service check timeouts can now be configured individually for services
Component: The Check_MK Micro Core
Level: 2
Class: New Feature
Version: 1.2.9i1
It is a good idea to keep the service check timeout values as low as possible. In all previous
versions this option could only be configured globally. So if only some of your services needed
a longer execution time, you had to increase this global value for all your services.
Now you can use the rule set "Service check timeout (Microcore)" to control this timeout per
service. It overrides the globally configured option for all matching services.
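The idea of a per-service override with a global fallback can be sketched in
Python as follows. This is not the actual rule evaluation of Check_MK: the rule
format, the example patterns, the timeout values and the "first matching rule
wins" behavior are all assumptions made for this illustration.

```python
import re

GLOBAL_TIMEOUT = 60  # assumed global timeout in seconds

# Hypothetical rules: (regex on the service description, timeout in seconds)
RULES = [
    (r"^Backup ", 300),
    (r"^Oracle ", 120),
]

def effective_timeout(service_description, rules=RULES, global_timeout=GLOBAL_TIMEOUT):
    """Return the timeout of the first matching rule, else the global value."""
    for pattern, timeout in rules:
        if re.search(pattern, service_description):
            return timeout
    return global_timeout

print(effective_timeout("Backup nightly"))  # 300
print(effective_timeout("CPU load"))        # 60
```

Only the services that really need more time get a higher timeout, while the
global value can stay low for everything else.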