ID: 8572
Title: Made scheduling of check helpers more robust
Component: cmc
Level: 2
Class: Bug fix
Version: 1.4.0i1
Under very rare circumstances the monitoring core miscounted the number of
idle helpers, leading to an endless loop with 100% CPU load.
ID: 8573
Title: Handle large regular expressions more gracefully
Component: cmc
Level: 1
Class: Bug fix
Version: 1.4.0i2
Livestatus queries with large regular expressions in their filters could
lead to a stack overflow and consequently to a termination of the micro
core. To handle this in a more robust way, we set a limit on the size and
complexity of a regular expression, and bump the stack sizes of the
Livestatus threads from 64kB to 256kB. The net result is that we can
correctly handle regular expression patterns of up to roughly 2k characters;
the exact value depends on the regex features used. For larger expressions
we return a failure status in a clean way.
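The idea of rejecting oversized patterns cleanly can be sketched on the
client side as follows. This is an illustration only, not the core's code:
the helper name build_regex_query and the fixed limit of 2000 characters are
assumptions, since the real limit depends on the regex features used.

```python
# Assumption: a fixed illustrative limit; the core's real limit is only
# "roughly 2k characters" and varies with the regex features used.
MAX_PATTERN_LENGTH = 2000


def build_regex_query(table, column, pattern):
    """Build a Livestatus query that filters one column with a regex.

    Hypothetical helper: refuses patterns above the assumed limit instead
    of sending them to the core.
    """
    if len(pattern) > MAX_PATTERN_LENGTH:
        raise ValueError(
            "pattern has %d characters, limit is %d"
            % (len(pattern), MAX_PATTERN_LENGTH)
        )
    # A Livestatus query is terminated by an empty line.
    return "GET %s\nColumns: %s\nFilter: %s ~ %s\n\n" % (
        table, column, column, pattern)
```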
ID: 8574
Title: Added support for warn/crit/min/max values via Carbon/InfluxDB connections
Component: cmc
Level: 1
Class: New feature
Version: 1.4.0i2
In addition to the value of a metric itself, it is now possible to send the
warn/crit/min/max values of a metric to Carbon/InfluxDB, too. This is
configurable per service via rules (previously, you could only configure
whether the metric value itself was sent).
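As a rough illustration of what the additional data could look like on the
wire, the sketch below formats lines in Carbon's plaintext protocol
("<path> <value> <timestamp>"). The function name carbon_lines, the metric
path and the ".warn"/".crit"/".min"/".max" suffixes are assumptions made for
this example; the naming actually used by the core is not described here.

```python
def carbon_lines(path, value, timestamp,
                 warn=None, crit=None, minimum=None, maximum=None):
    """Format a metric and its optional thresholds as Carbon plaintext lines.

    Hypothetical helper: the suffix naming is an assumption for illustration.
    """
    lines = ["%s %s %d" % (path, value, timestamp)]
    extras = (("warn", warn), ("crit", crit),
              ("min", minimum), ("max", maximum))
    for suffix, extra in extras:
        # Only emit the values that are actually known for this metric.
        if extra is not None:
            lines.append("%s.%s %s %d" % (path, suffix, extra, timestamp))
    return lines
```

Values that are not set for a metric are simply left out, so a metric without
thresholds still produces exactly one line.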
ID: 8575
Title: Fixed segfaults and incorrect Livestatus replies when history file is missing
Component: cmc
Level: 1
Class: Bug fix
Version: 1.4.0i3
When the history file was missing at cmc startup time, the next "GET log"
query caused a segfault. After the automatic restart, further similar
queries just returned an empty reply until the next log rotation,
although the core had created a history file by then.
ID: 8570
Title: Make PROCESS_HOST_CHECK_RESULT's behavior more Nagios-like
Component: cmc
Level: 1
Class: Bug fix
Version: 1.4.0i1
The Nagios documentation for the external command PROCESS_HOST_CHECK_RESULT
explicitly states that the return code in the command line is already the
host state, not the result code of a check result (which was the core's
previous interpretation). We now follow Nagios more closely and map the
return code in the command line as follows:
0 => UP
1 => DOWN/UNREACHABLE (previously this meant UP, too)
2 => DOWN/UNREACHABLE
We still deviate a bit from Nagios, because the actual decision whether a
host is DOWN or UNREACHABLE is made by the core and cannot be overridden
from the outside.
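The mapping above can be sketched in Python as follows. This is only an
illustration, not the core's implementation: the names
host_state_from_return_code and format_command are hypothetical, the command
layout follows the Nagios documentation for PROCESS_HOST_CHECK_RESULT, and
rejecting other return codes is an assumption, since their handling is not
specified here.

```python
import time

# Mapping taken from the list above. Whether a non-UP host ends up DOWN or
# UNREACHABLE is decided by the core, so both are kept as one entry.
HOST_STATE_BY_RETURN_CODE = {
    0: "UP",
    1: "DOWN/UNREACHABLE",
    2: "DOWN/UNREACHABLE",
}


def host_state_from_return_code(return_code):
    """Translate the return code of the external command to a host state."""
    if return_code not in HOST_STATE_BY_RETURN_CODE:
        # Assumption: other codes are rejected in this sketch.
        raise ValueError("unsupported return code: %r" % (return_code,))
    return HOST_STATE_BY_RETURN_CODE[return_code]


def format_command(host_name, return_code, plugin_output, timestamp=None):
    """Build the external command line for a passive host check result."""
    if timestamp is None:
        timestamp = int(time.time())
    return "[%d] PROCESS_HOST_CHECK_RESULT;%s;%d;%s" % (
        timestamp, host_name, return_code, plugin_output)
```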
ID: 8567
Title: Invertible Graphite mangling
Component: cmc
Level: 1
Class: New feature
Version: 1.4.0i1
When sending performance data to Graphite, the host/service/variable part needs
to be mangled. The result of the default mangling is more readable (it simply
replaces special characters with an underscore), but it is not invertible. This
change adds C-style mangling using the standard octal escapes ('\ooo'), so you
can reconstruct the various parts from the data sent to Graphite.
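The difference between the two manglings can be sketched as follows. This is
a minimal illustration, not the core's code: the function names and the exact
set of characters treated as "special" (everything except ASCII letters,
digits, underscore and hyphen) are assumptions.

```python
import re

# Assumption: every byte outside this set counts as "special". The backslash
# is special too, which is what keeps the octal mangling unambiguous.
_SPECIAL = re.compile(rb"[^A-Za-z0-9_-]")
_OCTAL_ESCAPE = re.compile(rb"\\([0-7]{3})")


def mangle_default(text):
    """Readable but lossy: every special byte becomes an underscore."""
    return _SPECIAL.sub(b"_", text.encode("utf-8")).decode("ascii")


def mangle_octal(text):
    """Invertible: every special byte becomes a backslash plus 3 octal digits."""
    return _SPECIAL.sub(
        lambda m: b"\\%03o" % m.group(0)[0],
        text.encode("utf-8"),
    ).decode("ascii")


def demangle_octal(mangled):
    """Reconstruct the original text from its octal-mangled form."""
    raw = _OCTAL_ESCAPE.sub(
        lambda m: bytes([int(m.group(1), 8)]),
        mangled.encode("ascii"),
    )
    return raw.decode("utf-8")
```

With these definitions, "fs /var" and "fs_/var" both become "fs__var" under
the default mangling, whereas the octal variant yields distinct results that
demangle_octal maps back to the original names.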