The CFQ I/O scheduler has known performance issues in certain OS
configurations. For example, I/O throughput drops by 10x to 30x when
the following command is run with the CFQ I/O scheduler:

    dd if=/dev/zero of=/root/test.img bs=512 count=10000 oflag=dsync

Throughput with CFQ: 60 KB/s
Throughput with noop or deadline: 1.5 MB/s - 2 MB/s
This performance drop is caused by an undesirable interaction among
four components:
- the blkio cgroup controller being enabled
- ext4 with the jbd2 kthread running in the root blkio cgroup
- dd running on ext4, in any blkio cgroup other than that of jbd2
- the CFQ I/O scheduler with default values for slice_idle and group_idle
When docker is enabled, systemd creates a blkio cgroup called
system.slice for system services (and docker), and a separate blkio
cgroup called user.slice for user processes. So, when dd is invoked,
it runs under user.slice.
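One way to confirm which blkio cgroup a process runs in is to read its /proc/<pid>/cgroup file. The sketch below assumes the cgroup v1 line format "hierarchy-ID:controller-list:cgroup-path"; the helper name blkio_cgroup_of is made up for illustration and is not part of this change.

```shell
#!/bin/sh
# Hypothetical helper: print the blkio cgroup path recorded in a
# cgroup-v1 style file such as /proc/self/cgroup.
# $1: path to the cgroup file to inspect
blkio_cgroup_of() {
    awk -F: '{
        # Field 2 is a comma-separated controller list, e.g. "cpu,cpuacct".
        n = split($2, ctrl, ",")
        for (i = 1; i <= n; i++) {
            if (ctrl[i] == "blkio") {
                # Everything after the second colon is the cgroup path.
                path = $0
                sub(/^[^:]*:[^:]*:/, "", path)
                print path
                exit
            }
        }
    }' "$1"
}

# Example calls (the pgrep lookup is an assumption about the local setup):
#   blkio_cgroup_of /proc/self/cgroup                 # the current shell
#   blkio_cgroup_of "/proc/$(pgrep -o jbd2)/cgroup"    # a jbd2 kernel thread
```

If the two calls report different paths (for example /user.slice for the shell and / for jbd2), the cgroup precondition described here is met.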
The dd command above includes the dsync flag, which in effect performs
an fdatasync after every write to the output file. Since dd is writing
to a file on ext4, jbd2 will be active, committing transactions
corresponding to those fdatasync requests from dd. (In other words, dd
depends on jbd2 in order to make forward progress.) But jbd2, being a
kernel thread, runs in the root blkio cgroup, as opposed to dd, which
runs under user.slice.
Now, if the I/O scheduler in use for the underlying block device is
CFQ, then its inter-queue/inter-group idling takes effect (via the
slice_idle and group_idle parameters, both of which default to 8ms).
Therefore, every time CFQ switches between processing requests from dd
and from jbd2, this 8ms idle time is injected, which drastically
reduces the overall throughput.
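As a back-of-the-envelope check on the idling explanation, the sketch below computes the throughput ceiling if each 512-byte synchronous write pays one 8ms idle window. The one-window-per-write model and the function name are assumptions for illustration; the actual I/O time is ignored.

```shell
#!/bin/sh
# Hypothetical helper: throughput ceiling in bytes per second when every
# write of $1 bytes is delayed by one idle window of $2 milliseconds.
idle_bound_bytes_per_sec() {
    echo $(( $1 * 1000 / $2 ))
}

# bs=512 from the dd command, 8ms from the slice_idle/group_idle default:
idle_bound_bytes_per_sec 512 8   # 512 * 1000 / 8 = 64000 bytes/s
```

The result, about 64 KB/s, is in the same range as the 60 KB/s measured with CFQ, which is consistent with idling dominating the cost of each write.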
Unfortunately, the preconditions that cause this performance drop
match most of the common configurations of Photon OS. Fixing CFQ
itself is challenging (and is still being discussed on the Linux
kernel mailing list [1]), so switch the default I/O scheduler to
'deadline' in the meantime.
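To check which scheduler a block device is actually using, the active one appears in square brackets in /sys/block/<dev>/queue/scheduler. The sketch below extracts it; the helper name active_scheduler and the device name sda are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the active I/O scheduler from a sysfs
# scheduler file, whose content looks like "noop deadline [cfq]".
# $1: path to the scheduler file, e.g. /sys/block/sda/queue/scheduler
active_scheduler() {
    sed -n 's/.*\[\(.*\)\].*/\1/p' "$1"
}

# Example calls (sda is an assumed device name):
#   active_scheduler /sys/block/sda/queue/scheduler
# Switching at runtime, as root, without rebuilding the kernel:
#   echo deadline > /sys/block/sda/queue/scheduler
```

The runtime switch affects only the named device until the next reboot, whereas the config change in this commit sets the built-in default.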
For more details on this problem, as well as the ongoing discussion
around its fix, refer to [1].
[1]. https://lore.kernel.org/lkml/8d72fcf7-bbb4-2965-1a06-e9fc177a8938@csail.mit.edu/
Change-Id: I257deacbfd15cfe35f99072440da4e09472e2ed3
Reviewed-on: http://photon-jenkins.eng.vmware.com:8082/7323
Tested-by: gerrit-photon <photon-checkins@vmware.com>
Reviewed-by: Srinidhi Rao <srinidhir@vmware.com>
Reviewed-by: Alexey Makhalov <amakhalov@vmware.com>
(cherry picked from commit 7974ca9f70e37dee132758ead578700f7369c1c9)
Reviewed-on: http://photon-jenkins.eng.vmware.com:8082/7341
Reviewed-by: Srivatsa S. Bhat <srivatsab@vmware.com>
...
@@ -1,6 +1,6 @@
 #
 # Automatically generated file; DO NOT EDIT.
-# Linux/x86 4.19.26 Kernel Configuration
+# Linux/x86 4.19.40 Kernel Configuration
 #
 
 #
...
@@ -869,10 +869,10 @@ CONFIG_IOSCHED_NOOP=y
 CONFIG_IOSCHED_DEADLINE=y
 CONFIG_IOSCHED_CFQ=y
 CONFIG_CFQ_GROUP_IOSCHED=y
-# CONFIG_DEFAULT_DEADLINE is not set
-CONFIG_DEFAULT_CFQ=y
+CONFIG_DEFAULT_DEADLINE=y
+# CONFIG_DEFAULT_CFQ is not set
 # CONFIG_DEFAULT_NOOP is not set
-CONFIG_DEFAULT_IOSCHED="cfq"
+CONFIG_DEFAULT_IOSCHED="deadline"
 CONFIG_MQ_IOSCHED_DEADLINE=y
 CONFIG_MQ_IOSCHED_KYBER=y
 # CONFIG_IOSCHED_BFQ is not set
...
@@ -1,6 +1,6 @@
 #
 # Automatically generated file; DO NOT EDIT.
-# Linux/x86 4.19.26 Kernel Configuration
+# Linux/x86 4.19.40 Kernel Configuration
 #
 
 #
...
@@ -835,10 +835,10 @@ CONFIG_IOSCHED_NOOP=y
 CONFIG_IOSCHED_DEADLINE=y
 CONFIG_IOSCHED_CFQ=y
 CONFIG_CFQ_GROUP_IOSCHED=y
-# CONFIG_DEFAULT_DEADLINE is not set
-CONFIG_DEFAULT_CFQ=y
+CONFIG_DEFAULT_DEADLINE=y
+# CONFIG_DEFAULT_CFQ is not set
 # CONFIG_DEFAULT_NOOP is not set
-CONFIG_DEFAULT_IOSCHED="cfq"
+CONFIG_DEFAULT_IOSCHED="deadline"
 # CONFIG_MQ_IOSCHED_DEADLINE is not set
 # CONFIG_MQ_IOSCHED_KYBER is not set
 # CONFIG_IOSCHED_BFQ is not set
...
@@ -2,7 +2,7 @@
 Summary: Kernel
 Name: linux-secure
 Version: 4.19.40
-Release: 2%{?kat_build:.%kat_build}%{?dist}
+Release: 3%{?kat_build:.%kat_build}%{?dist}
 License: GPLv2
 URL: http://www.kernel.org/
 Group: System Environment/Kernel
...
@@ -239,6 +239,8 @@ ln -sf linux-%{uname_r}.cfg /boot/photon.cfg
 /usr/src/linux-headers-%{uname_r}
 
 %changelog
+* Tue May 28 2019 Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu> 4.19.40-3
+- Change default I/O scheduler to 'deadline' to fix performance issue.
 * Tue May 14 2019 Keerthana K <keerthanak@vmware.com> 4.19.40-2
 - Fix to parse through /boot folder and update symlink (/boot/photon.cfg) if
 - mulitple kernels are installed and current linux kernel is removed.
...
@@ -2,7 +2,7 @@
 Summary: Kernel
 Name: linux
 Version: 4.19.40
-Release: 2%{?kat_build:.%kat_build}%{?dist}
+Release: 3%{?kat_build:.%kat_build}%{?dist}
 License: GPLv2
 URL: http://www.kernel.org/
 Group: System Environment/Kernel
...
@@ -442,6 +442,8 @@ ln -sf %{name}-%{uname_r}.cfg /boot/photon.cfg
 %endif
 
 %changelog
+* Tue May 28 2019 Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu> 4.19.40-3
+- Change default I/O scheduler to 'deadline' to fix performance issue.
 * Tue May 14 2019 Keerthana K <keerthanak@vmware.com> 4.19.40-2
 - Fix to parse through /boot folder and update symlink (/boot/photon.cfg) if
 - mulitple kernels are installed and current linux kernel is removed.