On bookworm, AppArmor failed to start inside the container, which can be
seen at startup of the dev-container:

    Created symlink /etc/systemd/system/systemd-firstboot.service → /dev/null.
    Created symlink /etc/systemd/system/systemd-udevd.service → /dev/null.
    Created symlink /etc/systemd/system/multi-user.target.wants/docker-entrypoint.service → /etc/systemd/system/docker-entrypoint.service.
    hack/dind-systemd: starting /lib/systemd/systemd --show-status=false --unit=docker-entrypoint.target
    systemd 252.17-1~deb12u1 running in system mode (+PAM +AUDIT +SELINUX +APPARMOR +IMA +SMACK +SECCOMP +GCRYPT -GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT +QRENCODE +TPM2 +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -BPF_FRAMEWORK -XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified)
    Detected virtualization docker.
    Detected architecture x86-64.
    modprobe@configfs.service: Deactivated successfully.
    modprobe@dm_mod.service: Deactivated successfully.
    modprobe@drm.service: Deactivated successfully.
    modprobe@efi_pstore.service: Deactivated successfully.
    modprobe@fuse.service: Deactivated successfully.
    modprobe@loop.service: Deactivated successfully.
    apparmor.service: Starting requested but asserts failed.
    proc-sys-fs-binfmt_misc.automount: Got automount request for /proc/sys/fs/binfmt_misc, triggered by 49 (systemd-binfmt)
    + source /etc/docker-entrypoint-cmd
    ++ hack/make.sh dynbinary test-integration

When checking "aa-status", an error was printed that the filesystem was
not mounted:

    aa-status
    apparmor filesystem is not mounted.
    apparmor module is loaded.

Checking whether "local-fs.target" was loaded showed that it was:

    systemctl status local-fs.target
    ● local-fs.target - Local File Systems
         Loaded: loaded (/lib/systemd/system/local-fs.target; static)
         Active: active since Mon 2023-11-27 10:48:38 UTC; 18s ago
           Docs: man:systemd.special(7)

However, **on the host**, "/sys/kernel/security" has a mount, which was not
present inside the container:

    mount | grep securityfs
    securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)

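
The same comparison between host and container can be scripted. The following is a minimal sketch, not part of this change: the `has_mount` helper and its optional mounts-file argument are hypothetical names introduced for illustration, and it assumes a Linux-style `/proc/mounts` (device, mount point, filesystem type, options).

```shell
#!/bin/sh
# has_mount FSTYPE MOUNTPOINT [MOUNTS_FILE]
#
# Hypothetical helper: returns 0 if MOUNTS_FILE (default: /proc/mounts)
# lists a filesystem of type FSTYPE mounted at MOUNTPOINT, 1 otherwise.
has_mount() {
	awk -v fstype="$1" -v mnt="$2" '
		$2 == mnt && $3 == fstype { found = 1 }
		END { exit !found }
	' "${3:-/proc/mounts}"
}

# Run this both on the host and inside the container to compare.
if has_mount securityfs /sys/kernel/security; then
	echo "securityfs is mounted on /sys/kernel/security"
else
	echo "securityfs is NOT mounted on /sys/kernel/security"
fi
```

Matching on both the mount point and the filesystem type avoids false positives from an unrelated mount whose options merely contain the string "securityfs".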
Interestingly, `debian:bullseye` did not have this mount either; no
`securityfs` mount was present inside the container. There, the apparmor
service did not actually start AppArmor, but it skipped silently and
reported success:

    mount | grep securityfs
    systemctl start apparmor
    systemctl status apparmor
    ● apparmor.service - Load AppArmor profiles
         Loaded: loaded (/lib/systemd/system/apparmor.service; enabled; vendor preset: enabled)
         Active: active (exited) since Mon 2023-11-27 11:59:09 UTC; 44s ago
           Docs: man:apparmor(7)
                 https://gitlab.com/apparmor/apparmor/wikis/home/
        Process: 43 ExecStart=/lib/apparmor/apparmor.systemd reload (code=exited, status=0/SUCCESS)
       Main PID: 43 (code=exited, status=0/SUCCESS)
            CPU: 10ms

    Nov 27 11:59:09 9519f89cade1 apparmor.systemd[43]: Not starting AppArmor in container

The same happened when using the `/etc/init.d/apparmor` script:

    /etc/init.d/apparmor start
    Starting apparmor (via systemctl): apparmor.service.
    echo $?
    0

And apparmor was not actually active:

    aa-status
    apparmor module is loaded.
    apparmor filesystem is not mounted.

    aa-enabled
    Maybe - policy interface not available.

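
Because both `systemctl start apparmor` and the init script exited with status 0 here, the exit status alone does not show whether AppArmor is usable. A minimal sketch of a more direct check follows, assuming the usual kernel paths (`/sys/kernel/security/apparmor` for the policy interface and `/sys/module/apparmor/parameters/enabled` for the module state). The `apparmor_usable` function and its optional root-prefix argument are hypothetical and exist only so the check can be exercised against a fake directory tree.

```shell
#!/bin/sh
# apparmor_usable [ROOT]
#
# Hypothetical helper: returns 0 only if the AppArmor policy interface is
# present under securityfs AND the kernel module reports itself as enabled.
# ROOT is an optional path prefix (default: empty, i.e. the real /sys).
apparmor_usable() {
	aa_root="${1:-}"

	# Without a securityfs mount this directory does not exist, which
	# corresponds to the "apparmor filesystem is not mounted" case.
	[ -d "$aa_root/sys/kernel/security/apparmor" ] || return 1

	# Assumption: the module parameter reads "Y" when AppArmor is enabled.
	aa_enabled=$(cat "$aa_root/sys/module/apparmor/parameters/enabled" 2>/dev/null)
	case "$aa_enabled" in
		Y*) return 0 ;;
		*) return 1 ;;
	esac
}

if apparmor_usable; then
	echo "AppArmor policy interface is available"
else
	echo "AppArmor policy interface is not available"
fi
```

Checking the policy interface directly means a service that skips its work but still exits 0 is not mistaken for a working AppArmor setup.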
After further investigation, I found that the non-systemd dind script
had a mount for AppArmor, which was added in 31638ab2ad2a5380d447780f05f7aa078c9421f5.
The systemd variant was missing this mount, which may have gone unnoticed
because the apparmor service on `debian:bullseye` silently skipped starting
AppArmor inside a container instead of failing.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

@@ -1,5 +1,11 @@
 #!/bin/bash
 set -e
+
+# Set the container env-var, so that AppArmor is enabled in the daemon and
+# containerd when running docker-in-docker.
+#
+# see: https://github.com/containerd/containerd/blob/787943dc1027a67f3b52631e084db0d4a6be2ccc/pkg/apparmor/apparmor_linux.go#L29-L45
+# see: https://github.com/moby/moby/commit/de191e86321f7d3136ff42ff75826b8107399497
 container=docker
 export container
 
@@ -18,6 +24,38 @@ fi
 # running in a container.
 mount --make-rshared /
 
+# Allow AppArmor to work inside the container;
+#
+# aa-status
+# apparmor filesystem is not mounted.
+# apparmor module is loaded.
+#
+# mount -t securityfs none /sys/kernel/security
+#
+# aa-status
+# apparmor module is loaded.
+# 30 profiles are loaded.
+# 30 profiles are in enforce mode.
+# /snap/snapd/18357/usr/lib/snapd/snap-confine
+# ...
+#
+# Note: https://0xn3va.gitbook.io/cheat-sheets/container/escaping/sensitive-mounts#sys-kernel-security
+#
+# ## /sys/kernel/security
+#
+# In /sys/kernel/security mounted the securityfs interface, which allows
+# configuration of Linux Security Modules. This allows configuration of
+# AppArmor policies, and so access to this may allow a container to disable
+# its MAC system.
+#
+# Given that we're running privileged already, this should not be an issue.
+if [ -d /sys/kernel/security ] && ! mountpoint -q /sys/kernel/security; then
+	mount -t securityfs none /sys/kernel/security || {
+		echo >&2 'Could not mount /sys/kernel/security.'
+		echo >&2 'AppArmor detection and --privileged mode might break.'
+	}
+fi
+
 env > /etc/docker-entrypoint-env
 
 cat > /etc/systemd/system/docker-entrypoint.target << EOF