Signed-off-by: Charles Smith <charles.smith@docker.com>
@@ -5,7 +5,7 @@ aliases = [
]
title = "Swarm administration guide"
description = "Manager administration guide"
-keywords = ["docker, container, cluster, swarm, manager, raft"]
+keywords = ["docker, container, swarm, manager, raft"]
[menu.main]
identifier="manager_admin_guide"
parent="engine_swarm"
@@ -16,17 +16,20 @@ weight="20"
# Administer and maintain a swarm of Docker Engines

When you run a swarm of Docker Engines, **manager nodes** are the key components
-for managing the cluster and storing the cluster state. It is important to
+for managing the swarm and storing the swarm state. It is important to
understand some key features of manager nodes in order to properly deploy and
maintain the swarm.

This article covers the following swarm administration tasks:

-* [Add Manager nodes for fault tolerance](#add-manager-nodes-for-fault-tolerance)
-* [Distributing manager nodes](#distributing-manager-nodes)
+* [Using a static IP for manager node advertise address](#use-a-static-ip-for-manager-node-advertise-address)
+* [Adding manager nodes for fault tolerance](#add-manager-nodes-for-fault-tolerance)
+* [Distributing manager nodes](#distribute-manager-nodes)
* [Running manager-only nodes](#run-manager-only-nodes)
-* [Backing up the cluster state](#back-up-the-cluster-state)
+* [Backing up the swarm state](#back-up-the-swarm-state)
* [Monitoring the swarm health](#monitor-swarm-health)
+* [Troubleshooting a manager node](#troubleshoot-a-manager-node)
+* [Forcefully removing a node](#force-remove-a-node)
* [Recovering from disaster](#recover-from-disaster)

Refer to [How swarm mode nodes work](how-swarm-mode-works/nodes.md)
@@ -36,21 +39,36 @@ worker nodes.
## Operating manager nodes in a swarm

Swarm manager nodes use the [Raft Consensus Algorithm](raft.md) to manage the
-cluster state. You only need to understand some general concepts of Raft in
+swarm state. You only need to understand some general concepts of Raft in
order to manage a swarm.

There is no limit on the number of manager nodes. The decision about how many
manager nodes to implement is a trade-off between performance and
fault-tolerance. Adding manager nodes to a swarm makes the swarm more
fault-tolerant. However, additional manager nodes reduce write performance
-because more nodes must acknowledge proposals to update the cluster state.
+because more nodes must acknowledge proposals to update the swarm state.
This means more network round-trip traffic.

Raft requires a majority of managers, also called a quorum, to agree on proposed
-updates to the cluster. A quorum of managers must also agree on node additions
+updates to the swarm. A quorum of managers must also agree on node additions
and removals. Membership operations are subject to the same constraints as state
replication.

+## Use a static IP for manager node advertise address
+
+When initiating a swarm, you have to specify the `--advertise-addr` flag to
+advertise your address to other manager nodes in the swarm. For more
+information, see [Run Docker Engine in swarm mode](swarm-mode.md#configure-the-advertise-address). Because manager nodes are
+meant to be a stable component of the infrastructure, you should use a *fixed
+IP address* for the advertise address to prevent the swarm from becoming
+unstable on machine reboot.
+
+If the whole swarm restarts and every manager node subsequently gets a new IP
+address, there is no way for any node to contact an existing manager. Therefore
+the swarm hangs while nodes try to contact one another at their old IP addresses.
+
+Dynamic IP addresses are OK for worker nodes.
+
## Add manager nodes for fault tolerance

You should maintain an odd number of managers in the swarm to support manager
@@ -59,7 +77,7 @@ partition, there is a higher chance that a quorum remains available to process
requests if the network is partitioned into two sets. Keeping a quorum is not
guaranteed if you encounter more than two network partitions.

-| Cluster Size | Majority | Fault Tolerance |
+| Swarm Size | Majority | Fault Tolerance |
|:------------:|:----------:|:-----------------:|
| 1 | 1 | 0 |
| 2 | 2 | 0 |
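The majority and fault-tolerance columns in this table come straight from the Raft quorum formulas: majority is ⌊N/2⌋ + 1 and tolerated failures are ⌊(N-1)/2⌋. As an editor's aside (not part of the patched page), the table can be reproduced with plain shell arithmetic:

```shell
# Reproduce the fault-tolerance table for swarms of 1 to 7 managers.
# majority = floor(N/2) + 1; tolerated failures = floor((N-1)/2)
for n in 1 2 3 4 5 6 7; do
  echo "$n managers: majority $(( n / 2 + 1 )), tolerates $(( (n - 1) / 2 )) failure(s)"
done
```

Even manager counts add no fault tolerance over the preceding odd count (4 managers tolerate 1 failure, just like 3), which is why the guide recommends an odd number of managers.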
@@ -73,22 +91,22 @@ guaranteed if you encounter more than two network partitions.

For example, in a swarm with *5 nodes*, if you lose *3 nodes*, you don't have a
quorum. Therefore you can't add or remove nodes until you recover one of the
-unavailable manager nodes or recover the cluster with disaster recovery
+unavailable manager nodes or recover the swarm with disaster recovery
commands. See [Recover from disaster](#recover-from-disaster).

While it is possible to scale a swarm down to a single manager node, it is
impossible to demote the last manager node. This ensures you maintain access to
the swarm and that the swarm can still process requests. Scaling down to a
single manager is an unsafe operation and is not recommended. If
-the last node leaves the cluster unexpetedly during the demote operation, the
-cluster swarm will become unavailable until you reboot the node or restart with
+the last node leaves the swarm unexpectedly during the demote operation, the
+swarm will become unavailable until you reboot the node or restart with
`--force-new-cluster`.

-You manage cluster membership with the `docker swarm` and `docker node`
+You manage swarm membership with the `docker swarm` and `docker node`
subsystems. Refer to [Add nodes to a swarm](join-nodes.md) for more information
on how to add worker nodes and promote a worker node to be a manager.

-## Distributing manager nodes
+## Distribute manager nodes

In addition to maintaining an odd number of manager nodes, pay attention to
datacenter topology when placing managers. For optimal fault-tolerance, distribute
@@ -107,65 +125,54 @@ available to process requests and rebalance workloads.
## Run manager-only nodes

By default manager nodes also act as a worker nodes. This means the scheduler
-can assign tasks to a manager node. For small and non-critical clusters
+can assign tasks to a manager node. For small and non-critical swarms
assigning tasks to managers is relatively low-risk as long as you schedule
services using **resource constraints** for *cpu* and *memory*.

However, because manager nodes use the Raft consensus algorithm to replicate data
in a consistent way, they are sensitive to resource starvation. You should
-isolate managers in your swarm from processes that might block cluster
-operations like cluster heartbeat or leader elections.
+isolate managers in your swarm from processes that might block swarm
+operations like swarm heartbeat or leader elections.

To avoid interference with manager node operation, you can drain manager nodes
to make them unavailable as worker nodes:

```bash
-docker node update --availability drain <NODE-ID>
+docker node update --availability drain <NODE>
```

When you drain a node, the scheduler reassigns any tasks running on the node to
-other available worker nodes in the cluster. It also prevents the scheduler from
+other available worker nodes in the swarm. It also prevents the scheduler from
assigning tasks to the node.
| 129 | 129 |
|
| 130 |
-## Back up the cluster state |
|
| 130 |
+## Back up the swarm state |
|
| 131 | 131 |
|
| 132 |
-Docker manager nodes store the cluster state and manager logs in the following |
|
| 132 |
+Docker manager nodes store the swarm state and manager logs in the following |
|
| 133 | 133 |
directory: |
| 134 | 134 |
|
| 135 |
-`/var/lib/docker/swarm/raft` |
|
| 136 |
- |
|
| 137 |
-Back up the raft data directory often so that you can use it in case of disaster |
|
| 138 |
-recovery. |
|
| 139 |
- |
|
| 140 |
-You should never restart a manager node with the data directory from another |
|
| 141 |
-node (for example, by copying the `raft` directory from one node to another). |
|
| 142 |
-The data directory is unique to a node ID and a node can only use a given node |
|
| 143 |
-ID once to join the swarm. (ie. Node ID space should be globally unique) |
|
| 144 |
- |
|
| 145 |
-To cleanly re-join a manager node to a cluster: |
|
| 146 |
- |
|
| 147 |
-1. Run `docker node demote <id-node>` to demote the node to a worker. |
|
| 148 |
-2. Run `docker node rm <id-node>` before adding a node back with a fresh state. |
|
| 149 |
-3. Re-join the node to the cluster using `docker swarm join`. |
|
| 135 |
+```bash |
|
| 136 |
+/var/lib/docker/swarm/raft |
|
| 137 |
+``` |
|
| 150 | 138 |
|
| 151 |
-In case of [disaster recovery](#recover-from-disaster), you can take the raft data |
|
| 152 |
-directory of one of the manager nodes to restore to a new swarm cluster. |
|
| 139 |
+Back up the `raft` data directory often so that you can use it in case of |
|
| 140 |
+[disaster recovery](#recover-from-disaster). Then you can take the `raft` |
|
| 141 |
+directory of one of the manager nodes to restore to a new swarm. |
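A backup can be as simple as archiving that directory with `tar`. The sketch below is an editor's illustration, not part of the patch: it uses a scratch directory as a stand-in for `/var/lib/docker/swarm/raft` (which requires root on a real manager node), and the backup path is hypothetical:

```shell
# Stand-in for /var/lib/docker/swarm/raft; the real path needs root access.
src=$(mktemp -d)
echo "placeholder raft state" > "$src/wal"

# Archive the directory contents into a tarball.
backup=/tmp/swarm-raft-backup.tar.gz
tar -czf "$backup" -C "$src" .

# List the archive to verify it before relying on it for disaster recovery.
tar -tzf "$backup"
```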

## Monitor swarm health

-You can monitor the health of Manager nodes by querying the docker `nodes` API
+You can monitor the health of manager nodes by querying the docker `nodes` API
in JSON format through the `/nodes` HTTP endpoint. Refer to the [nodes API documentation](../reference/api/docker_remote_api_v1.24.md#36-nodes)
for more information.

From the command line, run `docker node inspect <id-node>` to query the nodes.
-For instance, to query the reachability of the node as a Manager:
+For instance, to query the reachability of the node as a manager:

```bash
docker node inspect manager1 --format "{{ .ManagerStatus.Reachability }}"
reachable
```

-To query the status of the node as a Worker that accept tasks:
+To query the status of the node as a worker that accepts tasks:

```bash
docker node inspect manager1 --format "{{ .Status.State }}"
@@ -181,12 +188,13 @@ manager:

- Restart the daemon and see if the manager comes back as reachable.
- Reboot the machine.
-- If neither restarting or rebooting work, you should add another manager node or promote a worker to be a manager node. You also need to cleanly remove the failed node entry from the Manager set with `docker node demote <id-node>` and `docker node rm <id-node>`.
+- If neither restarting nor rebooting works, you should add another manager node or promote a worker to be a manager node. You also need to cleanly remove the failed node entry from the manager set with `docker node demote <NODE>` and `docker node rm <NODE>`.

-Alternatively you can also get an overview of the cluster health with `docker node ls`:
+Alternatively, you can get an overview of the swarm health from a manager
+node with `docker node ls`:

```bash
-# From a Manager node
+
docker node ls
ID HOSTNAME MEMBERSHIP STATUS AVAILABILITY MANAGER STATUS
1mhtdwhvsgr3c26xxbnzdc3yp node05 Accepted Ready Active
@@ -197,44 +205,61 @@ bb1nrq2cswhtbg4mrsqnlx1ck node03 Accepted Ready Active Reachab
di9wxgz8dtuh9d2hn089ecqkf node06 Accepted Ready Active
```
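To act on that overview programmatically, the listing can be filtered with standard text tools. The sketch below is an editor's aside using canned sample output (the third node ID and the `Leader` row are invented for illustration), so it runs without a live swarm:

```shell
# Count how many listed nodes report a manager status of Reachable or Leader.
# The here-variable stands in for real `docker node ls` output.
sample='ID                         HOSTNAME  MEMBERSHIP  STATUS  AVAILABILITY  MANAGER STATUS
1mhtdwhvsgr3c26xxbnzdc3yp  node05    Accepted    Ready   Active
bb1nrq2cswhtbg4mrsqnlx1ck  node03    Accepted    Ready   Active        Reachable
7ln70fl22uw2dvjn2ft53m3q5  node02    Accepted    Ready   Active        Leader'
printf '%s\n' "$sample" | awk 'NR > 1 && (/Reachable/ || /Leader/) { n++ } END { print n + 0 }'
```

For the canned sample this prints `2`. Against a live manager you would pipe real `docker node ls` output instead.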

-## Manager advertise address
+## Troubleshoot a manager node

-When initiating or joining a swarm, you have to specify the `--listen-addr`
-flag to advertise your address to other Manager nodes in the cluster.
+You should never restart a manager node by copying the `raft` directory from another node. The data directory is unique to a node ID. A node can only use a node ID once to join the swarm. The node ID space should be globally unique.

-We recommend that you use a *fixed IP address* for the advertised address, otherwise
-the cluster could become unstable on machine reboot.
+To cleanly re-join a manager node to the swarm:
+
+1. To demote the node to a worker, run `docker node demote <NODE>`.
+2. To remove the node from the swarm, run `docker node rm <NODE>`.
+3. Re-join the node to the swarm with a fresh state using `docker swarm join`.
+
+For more information on joining a manager node to a swarm, refer to
+[Join nodes to a swarm](join-nodes.md).
+
+## Force remove a node
+
+In most cases, you should shut down a node before removing it from a swarm with the `docker node rm` command. If a node becomes unreachable, unresponsive, or compromised, you can forcefully remove the node without shutting it down by passing the `--force` flag. For instance, if `node9` becomes compromised:
+
+<!-- bash hint breaks block quote -->
+```
+$ docker node rm node9
+
+Error response from daemon: rpc error: code = 9 desc = node node9 is not down and can't be removed
+
+$ docker node rm --force node9
+
+Node node9 removed from swarm
+```

-Indeed if the whole cluster restarts and every Manager gets a new IP address on
-restart, there is no way for any of those nodes to contact an existing Manager
-and the cluster will stay stuck trying to contact other nodes through their old address.
-While having dynamic IP addresses for Worker nodes is acceptable, Managers are
-meant to be a stable piece in the infrastructure thus it is highly recommended to
-deploy those critical nodes with static IPs.
+Before you forcefully remove a manager node, you must first demote it to the
+worker role. Make sure that you always have an odd number of manager nodes if
+you demote or remove a manager.
|
| 214 | 231 |
|
| 215 | 232 |
## Recover from disaster |
| 216 | 233 |
|
| 217 |
-Swarm is resilient to failures and the cluster can recover from any number |
|
| 234 |
+Swarm is resilient to failures and the swarm can recover from any number |
|
| 218 | 235 |
of temporary node failures (machine reboots or crash with restart). |
| 219 | 236 |
|
| 220 | 237 |
In a swarm of `N` managers, there must be a quorum of manager nodes greater than |
| 221 | 238 |
50% of the total number of managers (or `(N/2)+1`) in order for the swarm to |
| 222 | 239 |
process requests and remain available. This means the swarm can tolerate up to |
| 223 |
-`(N-1)/2` permanent failures beyond which requests involving cluster management |
|
| 240 |
+`(N-1)/2` permanent failures beyond which requests involving swarm management |
|
| 224 | 241 |
cannot be processed. These types of failures include data corruption or hardware |
| 225 | 242 |
failures. |
| 226 | 243 |
|
| 227 | 244 |
Even if you follow the guidelines here, it is possible that you can lose a |
| 228 | 245 |
quorum of manager nodes. If you can't recover the quorum by conventional |
| 229 |
-means such as restarting faulty nodes, you can recover the cluster by running |
|
| 246 |
+means such as restarting faulty nodes, you can recover the swarm by running |
|
| 230 | 247 |
`docker swarm init --force-new-cluster` on a manager node. |
| 231 | 248 |
|
| 232 | 249 |
```bash |
| 233 | 250 |
# From the node to recover |
| 234 |
-docker swarm init --force-new-cluster --listen-addr node01:2377 |
|
| 251 |
+docker swarm init --force-new-cluster --advertise-addr node01:2377 |
|
| 235 | 252 |
``` |
| 236 | 253 |
|
| 237 | 254 |
The `--force-new-cluster` flag puts the Docker Engine into swarm mode as a |
| 238 |
-manager node of a single-node cluster. It discards cluster membership information |
|
| 255 |
+manager node of a single-node swarm. It discards swarm membership information |
|
| 239 | 256 |
that existed before the loss of the quorum but it retains data necessary to the |
| 240 |
-Swarm cluster such as services, tasks and the list of worker nodes. |
|
| 257 |
+Swarm such as services, tasks and the list of worker nodes. |