kubernetes/pkg/util
Kubernetes Submit Queue 6dfe5c49f6 Merge pull request #38865 from vwfs/ext4_no_lazy_init
Automatic merge from submit-queue

Enable lazy initialization of ext3/ext4 filesystems

**What this PR does / why we need it**: It enables lazy inode table and journal initialization in ext3 and ext4.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #30752, fixes #30240

**Release note**:
```release-note
Enable lazy inode table and journal initialization for ext3 and ext4
```

**Special notes for your reviewer**:
This PR removes the extended options passed to mkfs.ext3/mkfs.ext4, so that the defaults for lazy initialization (enabled) are used.

These extended options come from a script that was historically located at */usr/share/google/safe_format_and_mount* and was later ported to Go so that the dependency on the script could be removed. After some searching, I found the original script here: https://github.com/GoogleCloudPlatform/compute-image-packages/blob/legacy/google-startup-scripts/usr/share/google/safe_format_and_mount

Checking the history of this script, I found the commit [Disable lazy init of inode table and journal.](4d7346f7f5). This one introduces the extended flags with this description:
```
Now that discard with guaranteed zeroing is supported by PD,
initializing them is really fast and prevents perf from being affected
when the filesystem is first mounted.
```

The problem is that this is not true for all cloud providers and all disk types, e.g. Azure and AWS. I only tested with magnetic disks on Azure and AWS, so it may be different for SSDs on these cloud providers. The result is that this performance optimization dramatically increases the time needed to format a disk in such cases.

When mkfs.ext4 is told not to lazily initialize the inode tables and the check for guaranteed zeroing on discard fails, it falls back to a very naive implementation that simply loops and writes zeroed buffers to the disk. The performance of this fallback depends heavily on free memory, and it also consumes all of that free memory for write caching, which reduces the performance of everything else on the system.

According to https://github.com/kubernetes/kubernetes/issues/30752, something inside kubelet also degrades the performance of this process. It is not known exactly what it is, but I'd assume it has something to do with cgroups throttling IO or memory.

I checked the kernel code for lazy inode table initialization. The nice thing is that the kernel also performs the check for guaranteed zeroing on discard. If zeroing is guaranteed, the kernel uses discard for the lazy initialization, which should finish in just a few seconds. If it is not guaranteed, it falls back to using *bio*s, which do not require the write cache. The result is that free memory is neither required nor touched, so performance is maximized and the rest of the system does not suffer.

As the original reason for disabling lazy init was a performance optimization, and the kernel already performs this optimization by default (and in a much better way), I suggest removing these flags completely and relying on the kernel to do it in the best way.
2017-01-18 09:09:52 -08:00

| Name | Last commit | Date |
| --- | --- | --- |
| async | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| bandwidth | refactor: use metav1.ObjectMeta in other types | 2017-01-17 16:17:19 -05:00 |
| cert | genericapiserver: cut off certificates api dependency | 2017-01-16 14:10:59 +01:00 |
| chmod | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| chown | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| clock | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| config | Enable streaming proxy redirects by default (beta) | 2017-01-17 12:56:03 -08:00 |
| configz | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| crlf | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| dbus | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| ebtables | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| env | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| errors | add back just enough empty packages to allow heapster cycles to succeed | 2017-01-17 08:07:30 -05:00 |
| exec | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| flag | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| flock | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| flowcontrol | tolerate clock change in throttle testing | 2017-01-09 14:03:09 -05:00 |
| framer | add back just enough empty packages to allow heapster cycles to succeed | 2017-01-17 08:07:30 -05:00 |
| goroutinemap | start the apimachinery repo | 2017-01-11 09:09:48 -05:00 |
| hash | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| homedir | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| httpstream | mechanical | 2017-01-16 09:35:12 -05:00 |
| i18n | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| initsystem | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| integer | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| interrupt | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| intstr | move openapi types to pkg/openapi | 2017-01-16 13:40:14 -05:00 |
| io | mechanical repercussions | 2017-01-13 08:27:14 -05:00 |
| iptables | start the apimachinery repo | 2017-01-11 09:09:48 -05:00 |
| json | add back just enough empty packages to allow heapster cycles to succeed | 2017-01-17 08:07:30 -05:00 |
| jsonpath | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| keymutex | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| labels | Update deployment equality helper | 2017-01-11 18:34:12 +01:00 |
| limitwriter | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| logs | start the apimachinery repo | 2017-01-11 09:09:48 -05:00 |
| maps | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| metrics | start the apimachinery repo | 2017-01-11 09:09:48 -05:00 |
| mount | Merge pull request #38865 from vwfs/ext4_no_lazy_init | 2017-01-18 09:09:52 -08:00 |
| net | add back just enough empty packages to allow heapster cycles to succeed | 2017-01-17 08:07:30 -05:00 |
| netsh | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| node | refactor: use metav1.ObjectMeta in other types | 2017-01-17 16:17:19 -05:00 |
| oom | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| parsers | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| procfs | start the apimachinery repo | 2017-01-11 09:09:48 -05:00 |
| proxy | start the apimachinery repo | 2017-01-11 09:09:48 -05:00 |
| rand | move pkg/util/rand | 2017-01-16 16:04:03 -05:00 |
| resourcecontainer | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| rlimit | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| runtime | add back just enough empty packages to allow heapster cycles to succeed | 2017-01-17 08:07:30 -05:00 |
| selinux | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| sets | add back just enough empty packages to allow heapster cycles to succeed | 2017-01-17 08:07:30 -05:00 |
| slice | move pkg/util/rand | 2017-01-16 16:04:03 -05:00 |
| strategicpatch | add unit test | 2017-01-12 15:01:38 -08:00 |
| strings | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| sysctl | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| system | refactor: use metav1.ObjectMeta in other types | 2017-01-17 16:17:19 -05:00 |
| taints | start the apimachinery repo | 2017-01-11 09:09:48 -05:00 |
| term | start the apimachinery repo | 2017-01-11 09:09:48 -05:00 |
| testing | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| threading | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| uuid | start the apimachinery repo | 2017-01-11 09:09:48 -05:00 |
| validation | add back just enough empty packages to allow heapster cycles to succeed | 2017-01-17 08:07:30 -05:00 |
| version | Enable auto-generating sources rules | 2017-01-05 14:14:13 -08:00 |
| wait | add back just enough empty packages to allow heapster cycles to succeed | 2017-01-17 08:07:30 -05:00 |
| workqueue | start the apimachinery repo | 2017-01-11 09:09:48 -05:00 |
| yaml | add back just enough empty packages to allow heapster cycles to succeed | 2017-01-17 08:07:30 -05:00 |
| BUILD | add back just enough empty packages to allow heapster cycles to succeed | 2017-01-17 08:07:30 -05:00 |
| doc.go | Use Go canonical import paths | 2016-07-16 13:48:21 -04:00 |
| template.go | Remove "All rights reserved" from all the headers. | 2016-06-29 17:47:36 -07:00 |
| template_test.go | Remove "All rights reserved" from all the headers. | 2016-06-29 17:47:36 -07:00 |
| trace.go | Remove "All rights reserved" from all the headers. | 2016-06-29 17:47:36 -07:00 |
| trie.go | Make Kubernetes OpenAPI operation IDs unique | 2016-10-12 14:54:12 -07:00 |
| umask.go | Remove "All rights reserved" from all the headers. | 2016-06-29 17:47:36 -07:00 |
| umask_windows.go | delete ErrorTimeout() function and modify Umask() args | 2017-01-12 11:05:30 +08:00 |
| util.go | kubelet: storage: don't hang kubelet on unresponsive nfs | 2016-10-18 08:45:40 -05:00 |
| util_test.go | start the apimachinery repo | 2017-01-11 09:09:48 -05:00 |