Remove the experimental docs for user namespaces and add similar content
to the `docker daemon` command documentation.
Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
| ... | ... |
@@ -62,6 +62,7 @@ weight = -1 |
| 62 | 62 |
--tlscert="~/.docker/cert.pem" Path to TLS certificate file |
| 63 | 63 |
--tlskey="~/.docker/key.pem" Path to TLS key file |
| 64 | 64 |
--tlsverify Use TLS and verify the remote |
| 65 |
+ --userns-remap="default" Enable user namespace remapping |
|
| 65 | 66 |
--userland-proxy=true Use userland proxy for loopback traffic |
| 66 | 67 |
|
| 67 | 68 |
Options with [] may be specified multiple times. |
| ... | ... |
@@ -628,6 +629,133 @@ For information about how to create an authorization plugin, see [authorization |
| 628 | 628 |
plugin](../../extend/authorization.md) section in the Docker extend section of this documentation. |
| 629 | 629 |
|
| 630 | 630 |
|
| 631 |
+## Daemon user namespace options |
|
| 632 |
+ |
|
| 633 |
+The Linux kernel [user namespace support](http://man7.org/linux/man-pages/man7/user_namespaces.7.html) provides additional security by enabling |
|
| 634 |
+a process, and therefore a container, to have a unique range of user and |
|
| 635 |
+group IDs which are outside the traditional user and group range utilized by |
|
| 636 |
+the host system. Potentially the most important security improvement is that, |
|
| 637 |
+by default, container processes running as the `root` user will have expected |
|
| 638 |
+administrative privilege (with some restrictions) inside the container but will |
|
| 639 |
+effectively be mapped to an unprivileged `uid` on the host. |
|
| 640 |
+ |
|
| 641 |
+When user namespace support is enabled, Docker creates a single daemon-wide mapping |
|
| 642 |
+for all containers running on the same engine instance. The mappings will |
|
| 643 |
+utilize the existing subordinate user and group ID feature available on all modern |
|
| 644 |
+Linux distributions. |
|
| 645 |
+The [`/etc/subuid`](http://man7.org/linux/man-pages/man5/subuid.5.html) and |
|
| 646 |
+[`/etc/subgid`](http://man7.org/linux/man-pages/man5/subgid.5.html) files will be |
|
| 647 |
+read for the user, and optional group, specified to the `--userns-remap` |
|
| 648 |
+parameter. If you do not wish to specify your own user and/or group, you can |
|
| 649 |
+provide `default` as the value to this flag, and a user will be created on your behalf |
|
| 650 |
+and provided subordinate uid and gid ranges. This default user will be named |
|
| 651 |
+`dockremap`, and entries will be created for it in `/etc/passwd` and |
|
| 652 |
+`/etc/group` using your distro's standard user and group creation tools. |
|
| 653 |
+ |
|
| 654 |
+> **Note**: The single mapping per-daemon restriction is in place for now |
|
| 655 |
+> because Docker shares image layers from its local cache across all |
|
| 656 |
+> containers running on the engine instance. Since file ownership must be |
|
| 657 |
+> the same for all containers sharing the same layer content, the decision |
|
| 658 |
+> was made to map the file ownership on `docker pull` to the daemon's user and |
|
| 659 |
+> group mappings so that there is no delay for running containers once the |
|
| 660 |
+> content is downloaded. This design preserves the same performance for `docker |
|
| 661 |
+> pull`, `docker push`, and container startup as users expect with |
|
| 662 |
+> user namespaces disabled. |
|
| 663 |
+ |
|
| 664 |
+### Starting the daemon with user namespaces enabled |
|
| 665 |
+ |
|
| 666 |
+To enable user namespace support, start the daemon with the |
|
| 667 |
+`--userns-remap` flag, which accepts values in the following formats: |
|
| 668 |
+ |
|
| 669 |
+ - uid |
|
| 670 |
+ - uid:gid |
|
| 671 |
+ - username |
|
| 672 |
+ - username:groupname |
|
| 673 |
+ |
|
| 674 |
+If numeric IDs are provided, translation back to valid user or group names |
|
| 675 |
+will occur so that the subordinate uid and gid information can be read, given |
|
| 676 |
+these resources are name-based, not id-based. If the numeric ID information |
|
| 677 |
+provided does not exist as entries in `/etc/passwd` or `/etc/group`, daemon |
|
| 678 |
+startup will fail with an error message. |
|
| 679 |
+ |
|
| 680 |
+*Example: starting with default Docker user management:* |
|
| 681 |
+ |
|
| 682 |
+``` |
|
| 683 |
+ $ docker daemon --userns-remap=default |
|
| 684 |
+``` |
|
| 685 |
+When `default` is provided, Docker will create - or find the existing - user and group |
|
| 686 |
+named `dockremap`. If the user is created, and the Linux distribution has |
|
| 687 |
+appropriate support, the `/etc/subuid` and `/etc/subgid` files will be populated |
|
| 688 |
+with a contiguous 65536 length range of subordinate user and group IDs, starting |
|
| 689 |
+at an offset based on prior entries in those files. For example, Ubuntu will |
|
| 690 |
+create the following range, based on an existing user named `user1` already owning |
|
| 691 |
+the first 65536 range: |
|
| 692 |
+ |
|
| 693 |
+``` |
|
| 694 |
+ $ cat /etc/subuid |
|
| 695 |
+ user1:100000:65536 |
|
| 696 |
+ dockremap:165536:65536 |
|
| 697 |
+``` |
|
| 698 |
+ |
|
| 699 |
+> **Note:** On a fresh Fedora install, we had to `touch` the |
|
| 700 |
+> `/etc/subuid` and `/etc/subgid` files to have ranges assigned when users |
|
| 701 |
+> were created. Once these files existed, range assignment on user creation |
|
| 702 |
+> worked properly. |
|
| 703 |
+ |
|
| 704 |
+If you have a preferred/self-managed user with subordinate ID mappings already |
|
| 705 |
+configured, you can provide that username or uid to the `--userns-remap` flag. |
|
| 706 |
+If you have a group that doesn't match the username, you may provide the `gid` |
|
| 707 |
+or group name as well; otherwise the username will be used as the group name |
|
| 708 |
+when querying the system for the subordinate group ID range. |
|
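As a rough illustration of the value handling described above, the following Python sketch resolves a `--userns-remap` value into a user name and a group name. It is not the daemon's implementation: the function names (`resolve_remap_spec`, `_to_name`) are hypothetical, the lookups use Python's standard `pwd` and `grp` modules, and the `default` case only returns the `dockremap` names without creating anything.

```python
import grp
import pwd
from typing import Callable, Tuple


def _system_user_name(uid: int) -> str:
    # pwd.getpwuid raises KeyError when the uid has no passwd entry.
    return pwd.getpwuid(uid).pw_name


def _system_group_name(gid: int) -> str:
    # grp.getgrgid raises KeyError when the gid has no group entry.
    return grp.getgrgid(gid).gr_name


def _to_name(value: str, name_by_id: Callable[[int], str]) -> str:
    """Translate a numeric ID to a name; pass names through unchanged."""
    if value.isascii() and value.isdigit():
        try:
            return name_by_id(int(value))
        except KeyError:
            # Mirrors the documented behavior: unknown numeric IDs are an error.
            raise ValueError(f"no entry found for numeric ID {value}") from None
    return value


def resolve_remap_spec(
    spec: str,
    user_by_id: Callable[[int], str] = _system_user_name,
    group_by_id: Callable[[int], str] = _system_group_name,
) -> Tuple[str, str]:
    """Return (user name, group name) for uid, uid:gid, username, username:groupname."""
    if spec == "default":
        # Assumption: only the names are returned; user creation is out of scope here.
        return "dockremap", "dockremap"
    user, sep, group = spec.partition(":")
    if not user or (sep and not group):
        raise ValueError(f"invalid remap value: {spec!r}")
    user_name = _to_name(user, user_by_id)
    # Without an explicit group, the user name doubles as the group name.
    group_name = _to_name(group, group_by_id) if group else user_name
    return user_name, group_name
```

The lookup functions are parameters so that the sketch can be exercised with plain dictionaries instead of the host's real `/etc/passwd` and `/etc/group`.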
| 709 |
+ |
|
| 710 |
+### Detailed information on `subuid`/`subgid` ranges |
|
| 711 |
+ |
|
| 712 |
+Given potential advanced use of the subordinate ID ranges by power users, the |
|
| 713 |
+following paragraphs define how the Docker daemon currently uses the range entries |
|
| 714 |
+found within the subordinate range files. |
|
| 715 |
+ |
|
| 716 |
+The simplest case is that only one contiguous range is defined for the |
|
| 717 |
+provided user or group. In this case, Docker will use that entire contiguous |
|
| 718 |
+range for the mapping of host uids and gids to the container process. This |
|
| 719 |
+means that the first ID in the range will be the remapped root user, and the |
|
| 720 |
+IDs above that initial ID will map to container IDs 1 through the end of the range. |
|
| 721 |
+ |
|
| 722 |
+From the example `/etc/subuid` content shown above, the remapped root |
|
| 723 |
+user would be uid 165536. |
|
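The single-range case can be checked with a short Python sketch. It assumes the three-field `name:start:count` line format shown in the `/etc/subuid` example; the helper names `parse_sub_ranges` and `host_id_for` are hypothetical and are not part of Docker.

```python
from typing import List, Tuple

# Sample content copied from the /etc/subuid example in the documentation.
SAMPLE = """\
user1:100000:65536
dockremap:165536:65536
"""


def parse_sub_ranges(content: str, name: str) -> List[Tuple[int, int]]:
    """Return the (start, length) pairs listed for `name`."""
    ranges = []
    for line in content.splitlines():
        parts = line.strip().split(":")
        # Skip blank or malformed lines and entries for other names.
        if len(parts) != 3 or parts[0] != name:
            continue
        ranges.append((int(parts[1]), int(parts[2])))
    return ranges


def host_id_for(container_id: int, start: int, length: int) -> int:
    """Map a container ID to a host ID within one contiguous range."""
    if not 0 <= container_id < length:
        raise ValueError(f"container ID {container_id} is outside the range")
    return start + container_id


if __name__ == "__main__":
    start, length = parse_sub_ranges(SAMPLE, "dockremap")[0]
    print(host_id_for(0, start, length))  # prints 165536, the remapped root
```

With the sample content, container uid 0 lands on host uid 165536 and the last usable container uid, 65535, lands on host uid 231071.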
| 724 |
+ |
|
| 725 |
+If the system administrator has set up multiple ranges for a single user or |
|
| 726 |
+group, the Docker daemon will read all the available ranges and use the |
|
| 727 |
+following algorithm to create the mapping ranges: |
|
| 728 |
+ |
|
| 729 |
+1. The range segments found for the particular user will be sorted by *start ID* ascending. |
|
| 730 |
+2. Map segments will be created from the ranges in ascending order, each with the same length as its range. The first ID of the range with the lowest starting value becomes the remapped root (container ID 0), and the rest of that range maps the container IDs that follow. As an example, if the lowest segment starts at ID 1000 and has a length of 100, then a map of 1000 -> 0 (the remapped root) up through 1099 -> 99 will be created from this segment. If the next segment starts at ID 10000, then the next map will start with mapping 10000 -> 100 and run for the length of this second segment. This will continue until no more segments are found in the subordinate files for this user. |
|
| 731 |
+3. If more than five range segments exist for a single user, only the first five will be utilized, matching the kernel's limitation of only five entries in `/proc/self/uid_map` and `/proc/self/gid_map`. |
|
| 732 |
+ |
|
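The three steps above can be sketched in Python as follows. This is an illustration of the documented algorithm under stated assumptions, not the daemon's code: `IDMap`, `build_id_maps`, and `MAX_SEGMENTS` are hypothetical names, and the input is assumed to be a list of `(start, length)` pairs already read from the subordinate ID file for one user or group.

```python
from typing import List, NamedTuple, Tuple


class IDMap(NamedTuple):
    container_id: int  # first ID as seen inside the container
    host_id: int       # first ID on the host
    size: int          # number of consecutive IDs covered


MAX_SEGMENTS = 5  # the five-entry limit described in step 3


def build_id_maps(ranges: List[Tuple[int, int]]) -> List[IDMap]:
    """Turn (start, length) ranges into container-to-host map segments."""
    maps = []
    next_container_id = 0
    # Step 1: sort by start ID ascending. Step 3: keep only the first five.
    for start, length in sorted(ranges)[:MAX_SEGMENTS]:
        # Step 2: each segment continues where the previous one ended.
        maps.append(IDMap(next_container_id, start, length))
        next_container_id += length
    return maps


if __name__ == "__main__":
    for m in build_id_maps([(10000, 50), (1000, 100)]):
        print(f"container {m.container_id} -> host {m.host_id} (length {m.size})")
```

For the two ranges in the usage example, the segment starting at host ID 1000 covers container IDs 0 through 99, and the segment starting at host ID 10000 begins at container ID 100.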
| 733 |
+### User namespace known restrictions |
|
| 734 |
+ |
|
| 735 |
+The following standard Docker features are currently incompatible when |
|
| 736 |
+running a Docker daemon with user namespaces enabled: |
|
| 737 |
+ |
|
| 738 |
+ - sharing PID or NET namespaces with the host (`--pid=host` or `--net=host`) |
|
| 739 |
+ - sharing a network namespace with an existing container (`--net=container:*other*`) |
|
| 740 |
+ - sharing an IPC namespace with an existing container (`--ipc=container:*other*`) |
|
| 741 |
+ - A `--read-only` container filesystem (this is a Linux kernel restriction against remounting a currently mounted filesystem with modified flags when inside a user namespace) |
|
| 742 |
+ - external (volume or graph) drivers which are unaware/incapable of using daemon user mappings |
|
| 743 |
+ - Using the `--privileged` flag on `docker run` |
|
| 744 |
+ |
|
| 745 |
+In general, user namespaces are an advanced feature and will require |
|
| 746 |
+coordination with other capabilities. For example, if volumes are mounted from |
|
| 747 |
+the host, file ownership will have to be pre-arranged if the user or |
|
| 748 |
+administrator wishes the containers to have expected access to the volume |
|
| 749 |
+contents. |
|
| 750 |
+ |
|
| 751 |
+Finally, while the `root` user inside a user namespaced container process has |
|
| 752 |
+many of the expected admin privileges that go along with being the superuser, the |
|
| 753 |
+Linux kernel has restrictions based on internal knowledge that this is a user namespaced |
|
| 754 |
+process. The most notable restriction that we are aware of at this time is the |
|
| 755 |
+inability to use `mknod`. Permission will be denied for device creation even as |
|
| 756 |
+container `root` inside a user namespace. |
|
| 757 |
+ |
|
| 631 | 758 |
## Miscellaneous options |
| 632 | 759 |
|
| 633 | 760 |
IP masquerading uses address translation to allow containers without a public |
| ... | ... |
@@ -72,7 +72,7 @@ to build a Docker binary with the experimental features enabled: |
| 72 | 72 |
## Current experimental features |
| 73 | 73 |
|
| 74 | 74 |
* [External graphdriver plugins](plugins_graphdriver.md) |
| 75 |
- * [User namespaces](userns.md) |
|
| 75 |
+ * The user namespaces feature has graduated from experimental. |
|
| 76 | 76 |
|
| 77 | 77 |
## How to comment on an experimental feature |
| 78 | 78 |
|
| 79 | 79 |
deleted file mode 100644 |
| ... | ... |
@@ -1,119 +0,0 @@ |
| 1 |
-# Experimental: User namespace support |
|
| 2 |
- |
|
| 3 |
-Linux kernel [user namespace support](http://man7.org/linux/man-pages/man7/user_namespaces.7.html) provides additional security by enabling |
|
| 4 |
-a process--and therefore a container--to have a unique range of user and |
|
| 5 |
-group IDs which are outside the traditional user and group range utilized by |
|
| 6 |
-the host system. Potentially the most important security improvement is that, |
|
| 7 |
-by default, container processes running as the `root` user will have expected |
|
| 8 |
-administrative privilege (with some restrictions) inside the container but will |
|
| 9 |
-effectively be mapped to an unprivileged `uid` on the host. |
|
| 10 |
- |
|
| 11 |
-In this experimental phase, the Docker daemon creates a single daemon-wide mapping |
|
| 12 |
-for all containers running on the same engine instance. The mappings will |
|
| 13 |
-utilize the existing subordinate user and group ID feature available on all modern |
|
| 14 |
-Linux distributions. |
|
| 15 |
-The [`/etc/subuid`](http://man7.org/linux/man-pages/man5/subuid.5.html) and |
|
| 16 |
-[`/etc/subgid`](http://man7.org/linux/man-pages/man5/subgid.5.html) files will be |
|
| 17 |
-read for the user, and optional group, specified to the `--userns-remap` |
|
| 18 |
-parameter. If you do not wish to specify your own user and/or group, you can |
|
| 19 |
-provide `default` as the value to this flag, and a user will be created on your behalf |
|
| 20 |
-and provided subordinate uid and gid ranges. This default user will be named |
|
| 21 |
-`dockremap`, and entries will be created for it in `/etc/passwd` and |
|
| 22 |
-`/etc/group` using your distro's standard user and group creation tools. |
|
| 23 |
- |
|
| 24 |
-> **Note**: The single mapping per-daemon restriction exists for this experimental |
|
| 25 |
-> phase because Docker shares image layers from its local cache across all |
|
| 26 |
-> containers running on the engine instance. Since file ownership must be |
|
| 27 |
-> the same for all containers sharing the same layer content, the decision |
|
| 28 |
-> was made to map the file ownership on `docker pull` to the daemon's user and |
|
| 29 |
-> group mappings so that there is no delay for running containers once the |
|
| 30 |
-> content is downloaded--exactly the same performance characteristics as with |
|
| 31 |
-> user namespaces disabled. |
|
| 32 |
- |
|
| 33 |
-## Starting the daemon with user namespaces enabled |
|
| 34 |
-To enable this experimental user namespace support for a Docker daemon instance, |
|
| 35 |
-start the daemon with the aforementioned `--userns-remap` flag, which accepts |
|
| 36 |
-values in the following formats: |
|
| 37 |
- |
|
| 38 |
- - uid |
|
| 39 |
- - uid:gid |
|
| 40 |
- - username |
|
| 41 |
- - username:groupname |
|
| 42 |
- |
|
| 43 |
-If numeric IDs are provided, translation back to valid user or group names |
|
| 44 |
-will occur so that the subordinate uid and gid information can be read, given |
|
| 45 |
-these resources are name-based, not id-based. If the numeric ID information |
|
| 46 |
-provided does not exist as entries in `/etc/passwd` or `/etc/group`, daemon |
|
| 47 |
-startup will fail with an error message. |
|
| 48 |
- |
|
| 49 |
-*An example: starting with default Docker user management:* |
|
| 50 |
- |
|
| 51 |
-``` |
|
| 52 |
- $ docker daemon --userns-remap=default |
|
| 53 |
-``` |
|
| 54 |
-In this case, Docker will create--or find the existing--user and group |
|
| 55 |
-named `dockremap`. If the user is created, and the Linux distribution has |
|
| 56 |
-appropriate support, the `/etc/subuid` and `/etc/subgid` files will be populated |
|
| 57 |
-with a contiguous 65536 length range of subordinate user and group IDs, starting |
|
| 58 |
-at an offset based on prior entries in those files. For example, Ubuntu will |
|
| 59 |
-create the following range, based on an existing user already having the first |
|
| 60 |
-65536 range: |
|
| 61 |
- |
|
| 62 |
-``` |
|
| 63 |
- $ cat /etc/subuid |
|
| 64 |
- user1:100000:65536 |
|
| 65 |
- dockremap:165536:65536 |
|
| 66 |
-``` |
|
| 67 |
- |
|
| 68 |
-> **Note:** On a fresh Fedora install, we found that we had to `touch` the |
|
| 69 |
-> `/etc/subuid` and `/etc/subgid` files to have ranges assigned when users |
|
| 70 |
-> were created. Once these files existed, range assignment on user creation |
|
| 71 |
-> worked properly. |
|
| 72 |
- |
|
| 73 |
-If you have a preferred/self-managed user with subordinate ID mappings already |
|
| 74 |
-configured, you can provide that username or uid to the `--userns-remap` flag. |
|
| 75 |
-If you have a group that doesn't match the username, you may provide the `gid` |
|
| 76 |
-or group name as well; otherwise the username will be used as the group name |
|
| 77 |
-when querying the system for the subordinate group ID range. |
|
| 78 |
- |
|
| 79 |
-## Detailed information on `subuid`/`subgid` ranges |
|
| 80 |
- |
|
| 81 |
-Given there may be advanced use of the subordinate ID ranges by power users, we will |
|
| 82 |
-describe how the Docker daemon uses the range entries within these files under the |
|
| 83 |
-current experimental user namespace support. |
|
| 84 |
- |
|
| 85 |
-The simplest case exists where only one contiguous range is defined for the |
|
| 86 |
-provided user or group. In this case, Docker will use that entire contiguous |
|
| 87 |
-range for the mapping of host uids and gids to the container process. This |
|
| 88 |
-means that the first ID in the range will be the remapped root user, and the |
|
| 89 |
-IDs above that initial ID will map host ID 1 through the end of the range. |
|
| 90 |
- |
|
| 91 |
-From the example `/etc/subid` content shown above, that means the remapped root |
|
| 92 |
-user would be uid 165536. |
|
| 93 |
- |
|
| 94 |
-If the system administrator has set up multiple ranges for a single user or |
|
| 95 |
-group, the Docker daemon will read all the available ranges and use the |
|
| 96 |
-following algorithm to create the mapping ranges: |
|
| 97 |
- |
|
| 98 |
-1. The ranges will be sorted by *start ID* ascending |
|
| 99 |
-2. Maps will be created from each range with where the host ID will increment starting at 0 for the first range, 0+*range1* length for the second, and so on. This means that the lowest range start ID will be the remapped root, and all further ranges will map IDs from 1 through the uid or gid that equals the sum of all range lengths. |
|
| 100 |
-3. Ranges segments above five will be ignored as the kernel ignores any ID maps after five (in `/proc/self/{u,g}id_map`)
|
|
| 101 |
- |
|
| 102 |
-## User namespace known restrictions |
|
| 103 |
- |
|
| 104 |
-The following standard Docker features are currently incompatible when |
|
| 105 |
-running a Docker daemon with experimental user namespaces enabled: |
|
| 106 |
- |
|
| 107 |
- - sharing namespaces with the host (--pid=host, --net=host, etc.) |
|
| 108 |
- - sharing namespaces with other containers (--net=container:*other*) |
|
| 109 |
- - A `--readonly` container filesystem (a Linux kernel restriction on remount with new flags of a currently mounted filesystem when inside a user namespace) |
|
| 110 |
- - external (volume/graph) drivers which are unaware/incapable of using daemon user mappings |
|
| 111 |
- - Using `--privileged` mode containers |
|
| 112 |
- - volume use without pre-arranging proper file ownership in mounted volumes |
|
| 113 |
- |
|
| 114 |
-Additionally, while the `root` user inside a user namespaced container |
|
| 115 |
-process has many of the privileges of the administrative root user, the |
|
| 116 |
-following operations will fail: |
|
| 117 |
- |
|
| 118 |
- - Use of `mknod` - permission is denied for device creation by the container root |
|
| 119 |
- - others will be listed here when fully tested |
| ... | ... |
@@ -53,6 +53,7 @@ docker-daemon - Enable daemon mode |
| 53 | 53 |
[**--tlskey**[=*~/.docker/key.pem*]] |
| 54 | 54 |
[**--tlsverify**] |
| 55 | 55 |
[**--userland-proxy**[=*true*]] |
| 56 |
+[**--userns-remap**[=*default*]] |
|
| 56 | 57 |
|
| 57 | 58 |
# DESCRIPTION |
| 58 | 59 |
**docker** has two distinct functions. It is used for starting the Docker |
| ... | ... |
@@ -223,6 +224,9 @@ unix://[/path/to/socket] to use. |
| 223 | 223 |
**--userland-proxy**=*true*|*false* |
| 224 | 224 |
Rely on a userland proxy implementation for inter-container and outside-to-container loopback communications. Default is true. |
| 225 | 225 |
|
| 226 |
+**--userns-remap**=*default*|*uid:gid*|*user:group*|*user*|*uid* |
|
| 227 |
+ Enable user namespaces for containers on the daemon. Specifying "default" will cause a new user and group to be created to handle UID and GID range remapping for the user namespace mappings used for contained processes. Specifying a user (or uid) and optionally a group (or gid) will cause the daemon to look up the subordinate ID ranges of that user and group for use as the user namespace mappings for contained processes. |
|
| 228 |
+ |
|
| 226 | 229 |
# STORAGE DRIVER OPTIONS |
| 227 | 230 |
|
| 228 | 231 |
Docker uses storage backends (known as "graphdrivers" in the Docker |