Module: check_mk
Branch: master
Commit: 70efbe6af639077067075bc61f7ace6ae2efc031
URL: http://git.mathias-kettner.de/git/?p=check_mk.git;a=commit;h=70efbe6af63907…
Author: Lars Michelsen <lm(a)mathias-kettner.de>
Date: Wed Jun 14 09:55:39 2017 +0200
4756 FIX Fix possible timeouts when changing configurations or changing user profile properties
Check_MK uses generic functions to perform file IO, for example to save configuration files
of WATO or the user properties of the GUI. With 1.4.0b1 we introduced a change that was
intended to prevent the loss of just-written files during hard OS crashes. This change has now
been reverted because it behaves badly in high IO load situations and might itself cause
high IO load when files are written at a high frequency.
The IO behaviour in this situation has simply been changed back to the state before
1.4.0b1.
In future releases (1.5+) we'll find a solution to improve the handling of OS crash recovery
situations.
Change-Id: Ic07b8151e419a7cc821b4480f56367da140eb7da
---
.werks/4756 | 21 +++++++++++++++++++++
lib/store.py | 26 ++++++++++++++++++++++++--
2 files changed, 45 insertions(+), 2 deletions(-)
diff --git a/.werks/4756 b/.werks/4756
new file mode 100644
index 0000000..22a2a75
--- /dev/null
+++ b/.werks/4756
@@ -0,0 +1,21 @@
+Title: Fix possible timeouts when changing configurations or changing user profile properties
+Level: 1
+Component: core
+Class: fix
+Compatible: compat
+Edition: cre
+State: unknown
+Version: 1.5.0i1
+Date: 1497426322
+
+Check_MK uses generic functions to perform file IO, for example to save configuration files
+of WATO or the user properties of the GUI. With 1.4.0b1 we introduced a change that was
+intended to prevent the loss of just-written files during hard OS crashes. This change has now
+been reverted because it behaves badly in high IO load situations and might itself cause
+high IO load when files are written at a high frequency.
+
+The IO behaviour in this situation has simply been changed back to the state before
+1.4.0b1.
+
+In future releases (1.5+) we'll find a solution to improve the handling of OS crash recovery
+situations.
diff --git a/lib/store.py b/lib/store.py
index 51f4c69..b6c7037 100644
--- a/lib/store.py
+++ b/lib/store.py
@@ -166,8 +166,30 @@ def save_file(path, content, mode=0660):
os.chmod(tmp_path, mode)
tmp.write(content)
- tmp.flush()
- os.fsync(tmp.fileno())
+ # The goal of the fsync would be to ensure that there is a consistent file after a
+ # crash. Without the fsync it may happen that the file renamed below is just an empty
+ # file. That may lead into unexpected situations during loading.
+ #
+ # Don't do a fsync here because this may run into IO performance issues. Even when
+ # we can specify the fsync on a fd, the disk cache may be flushed completely because
+ # the disk does not know anything about fds, only about blocks.
+ #
+ # For Check_MK 1.4 we can not introduce a good solution for this, because the changes
+ # would affect too many parts of Check_MK with possible new issues. For the moment we
+ # stick with the IO behaviour of previous Check_MK versions.
+ #
+ # In the future we'll find a solution to deal better with OS crash recovery situations,
+ # for example like this:
+ #
+ # TODO(lm): The consistency of the file can be ensured using copies of the
+ # original file which are made before replacing it with the new one. After the first
+ # successful loading of the just written file the possibly existing copies of this
+ # file are deleted.
+ # We can achieve this by calling os.link() before the os.rename() below. Then we need
+ # to define in which situations we want to check out the backup file(s) and in which
+ # cases we can safely delete them.
+ #tmp.flush()
+ #os.fsync(tmp.fileno())
os.rename(tmp_path, path)
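
The TODO in the comment above describes keeping a copy of the original file via os.link() before the os.rename(). The following is only a rough sketch of how that idea might look, not the code in lib/store.py and not a planned implementation. The names save_file_with_backup and confirm_loaded and the ".bak" suffix are made up for illustration, and the sketch uses Python 3 syntax (0o660) where the patched module writes 0660.

```python
import os
import tempfile


def save_file_with_backup(path, content, mode=0o660, backup_suffix=".bak"):
    """Replace `path` with `content`, keeping a hard link to the old version.

    Hypothetical helper: name, signature and suffix are assumptions.
    """
    directory = os.path.dirname(os.path.abspath(path))
    backup_path = path + backup_suffix

    # Write to a temporary file in the same directory, so that the final
    # os.rename() stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as tmp:
            os.chmod(tmp_path, mode)
            tmp.write(content)
            # Deliberately no tmp.flush() / os.fsync() here, as in the patch.

        # Keep the previous version reachable under a second name. A hard
        # link copies no data, so this does not add write load.
        if os.path.exists(path):
            if os.path.exists(backup_path):
                os.remove(backup_path)
            os.link(path, backup_path)

        os.rename(tmp_path, path)
    except Exception:
        # Do not leave a stray temporary file behind on failure.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def confirm_loaded(path, backup_suffix=".bak"):
    """Drop the backup after the new file has been loaded successfully."""
    backup_path = path + backup_suffix
    if os.path.exists(backup_path):
        os.remove(backup_path)
```

A caller would save with save_file_with_backup(), load the file, and call confirm_loaded() only after the load succeeded. What the sketch leaves open is exactly what the TODO leaves open: when a loader should fall back to the backup, and when deleting it is safe.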