ID: 5539
Title: Improved performance of pages showing many graphs
Component: metrics
Level: 2
Class: New feature
Version: 1.5.0i2
Previously the GUI loaded all graphs while rendering a view. The
view was only shown to the user once the information for all graphs
was known. This mechanism made pages that show a lot of graphs load
slowly.
With this change the views are rendered with a placeholder for each
graph, which is a lot faster. After the view has been loaded, the
browser renders the graphs asynchronously and in parallel, which
also reduces the total loading time.
ID: 5465
Title: Fixed calculation of standard deviation.
Component: Livestatus
Level: 1
Class: Bug fix
Version: 1.5.0i2
Note: This fixes only a slightly esoteric feature, namely "Stats: std ..."
headers. Normal users are not affected, only those using this header via
self-written scripts.
Livestatus incorrectly used a bias correction when calculating the standard
deviation. For more mathematical background see:
https://en.wikipedia.org/wiki/Bessel%27s_correction
http://mathworld.wolfram.com/StandardDeviation.html
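
The difference between the two variants can be illustrated with a short Python sketch. It assumes that the fix means dividing by N instead of N - 1. The function names are made up for this illustration and are not part of Livestatus.

```python
import math


def population_std(values):
    """Standard deviation without bias correction (divide by N)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / n)


def sample_std(values):
    """Standard deviation with Bessel's correction (divide by N - 1)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


values = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_std(values))  # 2.0
print(sample_std(values))      # slightly larger, about 2.14
```

For small row counts the two results differ noticeably. The gap shrinks as the number of values grows.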
ID: 5463
Title: Report invalid table names via response header.
Component: Livestatus
Level: 1
Class: Bug fix
Version: 1.5.0i2
Even when "ResponseHeader: fixed16" was used, an invalid table name in a GET
request was not reported via Livestatus (only in the log file of the
monitoring core). This was a regression compared to 1.2.8 and has been
fixed.
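
A self-written script can detect such errors by evaluating the 16 byte response header. The following Python sketch assumes the documented fixed16 layout (three digit status code, a space, the body length padded to 11 characters, a linefeed). The helper name parse_fixed16, the table name and the status code 404 for an unknown table are assumptions made for this illustration.

```python
def parse_fixed16(header):
    """Split a fixed16 response header into (status code, body length)."""
    if len(header) != 16 or header[3:4] != b" " or header[15:16] != b"\n":
        raise ValueError("not a fixed16 header: %r" % (header,))
    text = header.decode("ascii")
    status = int(text[0:3])
    length = int(text[4:15].strip())
    return status, length


# Query with a (hypothetical) invalid table name
query = "GET nosuchtable\nResponseHeader: fixed16\n\n"

# Example header as it might be sent for such a query
header = ("%03d %11d\n" % (404, 35)).encode("ascii")
status, length = parse_fixed16(header)
if status != 200:
    print("Livestatus error %d, %d bytes of error text follow" % (status, length))
```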
ID: 5549
Title: mk_oracle.ps1: Fixed scattered information for configuration
Component: Checks & agents
Level: 1
Class: Bug fix
Version: 1.5.0i2
Information about the correct configuration of the Oracle plugin for Windows
was scattered across several places in Check_MK. From now on all needed
information is available in the script itself.
ID: 5088
Title: postgres_stats: Age for never checked tables is now configurable
Component: Checks & agents
Level: 1
Class: New feature
Version: 1.5.0i1
ID: 5538
Title: Improved performance when processing a large amount of piggyback data
Component: Core & setup
Level: 1
Class: New feature
Version: 1.5.0i2
When Check_MK needs to handle a large amount of piggyback data (a lot of piggybacked
hosts from a lot of piggyback source hosts, several hundreds to thousands),
the performance of Check_MK could decrease during regular monitoring. This was caused
by overly expensive housekeeping logic that was executed too often.
The mechanism has now been changed to work like this:
<ul>
<li>During regular monitoring no piggyback data is removed from the disk anymore.</li>
<li>New piggyback data is written to disk when communicating with the source host.</li>
<li>When monitoring piggybacked hosts, the outdated piggyback data available on the
disk is filtered.</li>
<li>There is a dedicated housekeeping cron job, executed from the site's crontab daily
at 00:10, which removes outdated piggyback data. This job is mostly used to free up some
tmpfs space, the outdated stored data is not read by the monitoring anymore.</li>
</ul>
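
The split between filtering while reading and removing in a separate job can be sketched in Python as follows. This is not the Check_MK implementation. The function names, the flat directory layout and the use of the file modification time as the age criterion are assumptions made for this illustration.

```python
import os
import time


def fresh_piggyback_files(directory, max_age, now=None):
    """Monitoring path: return only files that are not outdated.

    Nothing is deleted here, outdated files are simply skipped.
    """
    now = time.time() if now is None else now
    fresh = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if now - os.stat(path).st_mtime <= max_age:
            fresh.append(path)
    return fresh


def remove_outdated_piggyback_files(directory, max_age, now=None):
    """Housekeeping path (e.g. a daily cron job): delete outdated files."""
    now = time.time() if now is None else now
    removed = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if now - os.stat(path).st_mtime > max_age:
            os.remove(path)
            removed.append(path)
    return removed
```

The monitoring path only reads and filters. The expensive removal runs once per day and no longer slows down each check cycle.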
ID: 5411
Title: Windows agent: handle WMI timeouts
Component: Checks & agents
Level: 1
Class: Bug fix
Version: 1.5.0i2
All sections depending on WMI (Windows Management Instrumentation)
queries suffered from periodic freezes, typically occurring every
18 to 20 minutes. At those moments, the Windows agent did not deliver
any output for some of its WMI-dependent sections (e.g. ps, uptime,
dotnet_clrmemory, wmi_cpuload, msexch and wmi_webservices). The
corresponding checks issued error messages of type "Missing agent
sections...". Various strategies have previously been tried to cope
with the periodic problems with WMI. Werk #4008 introduced a timeout
of 10s in order to prevent the agent from blocking completely if a WMI
query freezes. However, this led to the described problem of the agent
output missing entirely when a WMI query got no response within 10s.
Moreover, multiple WMI queries waiting for 10s one after another led
to periodic long execution times of the Windows agent.
This Werk introduces a new strategy for coping with the periodic
freezing of WMI queries. The timeout is reduced from 10s to 2.5s per
query, reducing the total execution time of the Windows agent by
approximately 75% when the problem occurs. Upon a WMI timeout, the
Windows agent reports the timeout in its output so that the affected
checks can handle it by setting their state to UNKNOWN. In normal
cases, the check should get back to OK when the agent is contacted the
next time and the WMI freeze is most likely gone.
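
The strategy can be sketched in Python as follows. The Windows agent itself is not written in Python, so this only illustrates the idea. The function name run_queries and the marker string are assumptions made for this illustration, only the value of 2.5s is taken from this Werk.

```python
import concurrent.futures
import time

QUERY_TIMEOUT = 2.5  # seconds per query, as stated in this Werk
TIMEOUT_MARKER = "WMItimeout"  # hypothetical marker written to the output


def run_queries(queries, timeout=QUERY_TIMEOUT):
    """Run the queries one after another, each with its own timeout.

    A query that does not answer in time yields TIMEOUT_MARKER instead
    of silently missing from the result.
    """
    results = {}
    # One worker per query, so a hanging query cannot block the next one
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=max(1, len(queries)))
    try:
        for name, query in queries.items():
            future = pool.submit(query)
            try:
                results[name] = future.result(timeout=timeout)
            except concurrent.futures.TimeoutError:
                results[name] = TIMEOUT_MARKER
    finally:
        pool.shutdown(wait=False)
    return results
```

With four frozen queries the worst case drops from 4 x 10s = 40s to 4 x 2.5s = 10s, which corresponds to the 75% mentioned above.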
The WMI freezes seem to be connected to the Windows service WMI
Performance Adapter. https://lokna.no/?p=1430 suggests setting the
startup type of this service to automatic, ensuring that the service
is running. Without this, the WMI Performance Adapter seems to get
started periodically when WMI is queried. Testing with the WMI
Performance Adapter service running has shown clear signs of
improvement: the frequency of freezing WMI queries is reduced, even
though the freezes do not stop completely.