:title: Runtime Metrics
:description: Measure the behavior of running containers
:keywords: docker, metrics, CPU, memory, disk, IO, run, runtime

.. _run_metrics:


Runtime Metrics
===============

Linux Containers rely on `control groups
<https://www.kernel.org/doc/Documentation/cgroups/cgroups.txt>`_ which
not only track groups of processes, but also expose metrics about CPU,
memory, and block I/O usage. You can access those metrics and obtain
network usage metrics as well. This is relevant for "pure" LXC
containers, as well as for Docker containers.

Control Groups
--------------

Control groups are exposed through a pseudo-filesystem. On recent
distros, you should find this filesystem under ``/sys/fs/cgroup``.
Under that directory, you will see multiple sub-directories, called
``devices``, ``freezer``, ``blkio``, etc.; each sub-directory actually
corresponds to a different cgroup hierarchy.

On older systems, the control groups might be mounted on ``/cgroup``,
without distinct hierarchies. In that case, instead of seeing the
sub-directories, you will see a bunch of files in that directory, and
possibly some directories corresponding to existing containers.

To figure out where your control groups are mounted, you can run:

::

    grep cgroup /proc/mounts

.. _run_findpid:

Enumerating Cgroups
-------------------

You can look into ``/proc/cgroups`` to see the different control group
subsystems known to the system, the hierarchy they belong to, and how
many groups they contain.

You can also look at ``/proc/<pid>/cgroup`` to see which control
groups a process belongs to. The control group will be shown as a path
relative to the root of the hierarchy mountpoint; e.g., ``/`` means
"this process has not been assigned to a particular group", while
``/lxc/pumpkin`` means that the process is likely to be a member of a
container named ``pumpkin``.

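For instance, you can inspect the cgroup membership of your current shell this way (a quick sketch; the exact hierarchy names and line format vary between distros and kernel versions):

```shell
# Show which control groups the current process belongs to.
# Each line looks like <hierarchy-id>:<subsystems>:<path>.
cat /proc/self/cgroup
```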
Finding the Cgroup for a Given Container
----------------------------------------

For each container, one cgroup will be created in each hierarchy. On
older systems with older versions of the LXC userland tools, the name
of the cgroup will be the name of the container. With more recent
versions of the LXC tools, the cgroup will be ``lxc/<container_name>``.

For Docker containers using cgroups, the container name will be the
full ID or long ID of the container. If a container shows up as
``ae836c95b4c3`` in ``docker ps``, its long ID might be something like
``ae836c95b4c3c9e9179e0e91015512da89fdec91612f63cebae57df9a5444c79``. You
can look it up with ``docker inspect`` or ``docker ps -notrunc``.

Putting everything together, to look at the memory metrics for a Docker
container, take a look at ``/sys/fs/cgroup/memory/lxc/<longid>/``.

Metrics from Cgroups: Memory, CPU, Block IO
-------------------------------------------

For each subsystem (memory, CPU, and block I/O), you will find one or
more pseudo-files containing statistics.

Memory Metrics: ``memory.stat``
...............................

Memory metrics are found in the "memory" cgroup. Note that the memory
control group adds a little overhead, because it does very
fine-grained accounting of the memory usage on your host. Therefore,
many distros chose not to enable it by default. Generally, to enable
it, all you have to do is to add some kernel command-line parameters:
``cgroup_enable=memory swapaccount=1``.

The metrics are in the pseudo-file ``memory.stat``. Here is what it
will look like:

::

    cache 11492564992
    rss 1930993664
    mapped_file 306728960
    pgpgin 406632648
    pgpgout 403355412
    swap 0
    pgfault 728281223
    pgmajfault 1724
    inactive_anon 46608384
    active_anon 1884520448
    inactive_file 7003344896
    active_file 4489052160
    unevictable 32768
    hierarchical_memory_limit 9223372036854775807
    hierarchical_memsw_limit 9223372036854775807
    total_cache 11492564992
    total_rss 1930993664
    total_mapped_file 306728960
    total_pgpgin 406632648
    total_pgpgout 403355412
    total_swap 0
    total_pgfault 728281223
    total_pgmajfault 1724
    total_inactive_anon 46608384
    total_active_anon 1884520448
    total_inactive_file 7003344896
    total_active_file 4489052160
    total_unevictable 32768

The first half (without the ``total_`` prefix) contains statistics
relevant to the processes within the cgroup, excluding
sub-cgroups. The second half (with the ``total_`` prefix) includes
sub-cgroups as well.

Some metrics are "gauges", i.e., values that can increase or decrease
(e.g., ``swap``, the amount of swap space used by the members of the
cgroup). Others are "counters", i.e., values that can only go up,
because they represent occurrences of a specific event (e.g.,
``pgfault``, which indicates the number of page faults which happened
since the creation of the cgroup; this number can never decrease).

cache
  the amount of memory used by the processes of this control group
  that can be associated precisely with a block on a block
  device. When you read from and write to files on disk, this amount
  will increase. This will be the case if you use "conventional" I/O
  (``open``, ``read``, ``write`` syscalls) as well as mapped files
  (with ``mmap``). It also accounts for the memory used by ``tmpfs``
  mounts, though the reasons are unclear.

rss
  the amount of memory that *doesn't* correspond to anything on
  disk: stacks, heaps, and anonymous memory maps.

mapped_file
  indicates the amount of memory mapped by the processes in the
  control group. It doesn't give you information about *how much*
  memory is used; it rather tells you *how* it is used.

pgfault and pgmajfault
  indicate the number of times that a process of the cgroup triggered
  a "page fault" and a "major fault", respectively. A page fault
  happens when a process accesses a part of its virtual memory space
  which is nonexistent or protected. The former can happen if the
  process is buggy and tries to access an invalid address (it will
  then be sent a ``SIGSEGV`` signal, typically killing it with the
  famous ``Segmentation fault`` message). The latter can happen when
  the process reads from a memory zone which has been swapped out, or
  which corresponds to a mapped file: in that case, the kernel will
  load the page from disk, and let the CPU complete the memory
  access. It can also happen when the process writes to a
  copy-on-write memory zone: likewise, the kernel will preempt the
  process, duplicate the memory page, and resume the write operation
  on the process' own copy of the page. "Major" faults happen when the
  kernel actually has to read the data from disk. When it just has to
  duplicate an existing page, or allocate an empty page, it's a
  regular (or "minor") fault.

swap
  the amount of swap currently used by the processes in this cgroup.

active_anon and inactive_anon
  the amount of *anonymous* memory that has been identified as
  respectively *active* and *inactive* by the kernel. "Anonymous"
  memory is the memory that is *not* linked to disk pages. In other
  words, that's the equivalent of the rss counter described above. In
  fact, the very definition of the rss counter is **active_anon** +
  **inactive_anon** - **tmpfs** (where tmpfs is the amount of memory
  used up by ``tmpfs`` filesystems mounted by this control
  group). Now, what's the difference between "active" and "inactive"?
  Pages are initially "active"; at regular intervals, the kernel
  sweeps over the memory, and tags some pages as "inactive". Whenever
  they are accessed again, they are immediately retagged
  "active". When the kernel is almost out of memory, and time comes to
  swap out to disk, the kernel will swap "inactive" pages.

active_file and inactive_file
  cache memory, with *active* and *inactive* similar to the *anon*
  memory above. The exact formula is cache = **active_file** +
  **inactive_file** + **tmpfs**. The exact rules used by the kernel to
  move memory pages between active and inactive sets are different
  from the ones used for anonymous memory, but the general principle
  is the same. Note that when the kernel needs to reclaim memory, it
  is cheaper to reclaim a clean (i.e., non-modified) page from this
  pool, since it can be reclaimed immediately (while anonymous pages
  and dirty/modified pages have to be written to disk first).

unevictable
  the amount of memory that cannot be reclaimed; generally, it
  accounts for memory that has been "locked" with ``mlock``. It is
  often used by crypto frameworks to make sure that secret keys and
  other sensitive material never get swapped out to disk.

memory and memsw limits
  These are not really metrics, but a reminder of the limits applied
  to this cgroup. The first one indicates the maximum amount of
  physical memory that can be used by the processes of this control
  group; the second one indicates the maximum amount of RAM+swap.

Accounting for memory in the page cache is very complex. If two
processes in different control groups both read the same file
(ultimately relying on the same blocks on disk), the corresponding
memory charge will be split between the control groups. That's nice,
but it also means that when a cgroup is terminated, it could increase
the memory usage of another cgroup, because they are no longer
splitting the cost of those memory pages.

CPU metrics: ``cpuacct.stat``
.............................

Now that we've covered memory metrics, everything else will look very
simple in comparison. CPU metrics will be found in the ``cpuacct``
controller.

For each container, you will find a pseudo-file ``cpuacct.stat``,
containing the CPU usage accumulated by the processes of the
container, broken down between ``user`` and ``system`` time. If you're
not familiar with the distinction, ``user`` is the time during which
the processes were in direct control of the CPU (i.e., executing
process code), and ``system`` is the time during which the CPU was
executing system calls on behalf of those processes.

Those times are expressed in ticks of 1/100th of a second. Actually,
they are expressed in "user jiffies". There are ``USER_HZ``
*"jiffies"* per second, and on x86 systems, ``USER_HZ`` is 100. This
used to map exactly to the number of scheduler "ticks" per second; but
with the advent of higher frequency scheduling, as well as `tickless
kernels <http://lwn.net/Articles/549580/>`_, the number of kernel
ticks wasn't relevant anymore. It stuck around anyway, mainly for
legacy and compatibility reasons.

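To make the ``USER_HZ`` arithmetic concrete, here is a small sketch that converts ``cpuacct.stat`` values to seconds. The sample values are made up; on a real system you would read them from ``/sys/fs/cgroup/cpuacct/lxc/<longid>/cpuacct.stat``.

```shell
# Sample cpuacct.stat contents: user and system time, in user jiffies.
stat="user 46958
system 8075"

# Divide by USER_HZ (100 on x86) to convert jiffies to seconds.
echo "$stat" | awk '{ printf "%s: %.2f seconds\n", $1, $2 / 100 }'
# → user: 469.58 seconds
# → system: 80.75 seconds
```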
Block I/O metrics
.................

Block I/O is accounted in the ``blkio`` controller. Different metrics
are scattered across different files. While you can find in-depth
details in the `blkio-controller
<https://www.kernel.org/doc/Documentation/cgroups/blkio-controller.txt>`_
file in the kernel documentation, here is a short list of the most
relevant ones:

blkio.sectors
  contains the number of 512-byte sectors read and written by the
  processes that are members of the cgroup, device by device. Reads
  and writes are merged in a single counter.

blkio.io_service_bytes
  indicates the number of bytes read and written by the cgroup. It has
  4 counters per device, because for each device, it differentiates
  between synchronous vs. asynchronous I/O, and reads vs. writes.

blkio.io_serviced
  the number of I/O operations performed, regardless of their size. It
  also has 4 counters per device.

blkio.io_queued
  indicates the number of I/O operations currently queued for this
  cgroup. In other words, if the cgroup isn't doing any I/O, this will
  be zero. Note that the opposite is not true: if there is no I/O
  queued, it does not mean that the cgroup is idle (I/O-wise). It
  could be doing purely synchronous reads on an otherwise quiescent
  device, which is therefore able to handle them immediately, without
  queuing. Also, while it is helpful to figure out which cgroup is
  putting stress on the I/O subsystem, keep in mind that it is a
  relative quantity. Even if a process group does not perform more
  I/O, its queue size can increase just because the device load
  increases due to other devices.

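As an illustration, here is a sketch that totals the read and write counters across devices. The sample contents are made up; a real file would live at ``/sys/fs/cgroup/blkio/lxc/<longid>/blkio.io_service_bytes``.

```shell
# Sample blkio.io_service_bytes contents: <major:minor> <operation> <bytes>.
sample="8:0 Read 12345344
8:0 Write 4096000
8:0 Sync 4096000
8:0 Async 12345344
8:0 Total 16441344
Total 16441344"

# Sum the Read and Write counters over all devices.
echo "$sample" | awk '$2 == "Read" || $2 == "Write" { total[$2] += $3 }
    END { printf "Read %d\nWrite %d\n", total["Read"], total["Write"] }'
# → Read 12345344
# → Write 4096000
```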
Network Metrics
---------------

Network metrics are not exposed directly by control groups. There is a
good explanation for that: network interfaces exist within the context
of *network namespaces*. The kernel could probably accumulate metrics
about packets and bytes sent and received by a group of processes, but
those metrics wouldn't be very useful. You want per-interface metrics
(because traffic happening on the local ``lo`` interface doesn't
really count). But since processes in a single cgroup can belong to
multiple network namespaces, those metrics would be harder to
interpret: multiple network namespaces means multiple ``lo``
interfaces, potentially multiple ``eth0`` interfaces, etc.; this is
why there is no easy way to gather network metrics with control
groups.

Instead, we can gather network metrics from other sources:

IPtables
........

IPtables (or rather, the netfilter framework for which iptables is
just an interface) can do some serious accounting.

For instance, you can set up a rule to account for the outbound HTTP
traffic on a web server:

::

    iptables -I OUTPUT -p tcp --sport 80

There is no ``-j`` or ``-g`` flag, so the rule will just count matched
packets and go on to the following rule.

Later, you can check the values of the counters, with:

::

    iptables -nxvL OUTPUT

Technically, ``-n`` is not required, but it will prevent iptables from
doing DNS reverse lookups, which are probably useless in this
scenario.

Counters include packets and bytes. If you want to set up metrics for
container traffic like this, you could execute a ``for`` loop to add
two ``iptables`` rules per container IP address (one in each
direction), in the ``FORWARD`` chain. This will only meter traffic
going through the NAT layer; you will also have to add traffic going
through the userland proxy.

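A sketch of that loop might look like this (``CONTAINER_IPS`` is a hypothetical variable holding your container addresses; the rules have no target, so they only count):

```shell
# Hypothetical list of container IP addresses (e.g. from `docker inspect`).
CONTAINER_IPS="172.17.0.2 172.17.0.3"

# Add one counting-only rule per direction for each container.
for ip in $CONTAINER_IPS; do
    iptables -I FORWARD -s $ip    # traffic sent by the container
    iptables -I FORWARD -d $ip    # traffic sent to the container
done
```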
Then, you will need to check those counters on a regular basis. If you
happen to use ``collectd``, there is a nice plugin to automate
iptables counters collection.

Interface-level counters
........................

Since each container has a virtual Ethernet interface, you might want
to check directly the TX and RX counters of this interface. You will
notice that each container is associated with a virtual Ethernet
interface on your host, with a name like ``vethKk8Zqi``. Figuring out
which interface corresponds to which container is, unfortunately,
difficult.

For now, the best way is to check the metrics *from within the
containers*. To accomplish this, you can run an executable from the
host environment within the network namespace of a container using
**ip-netns magic**.

The ``ip-netns exec`` command will let you execute any program
(present in the host system) within any network namespace visible to
the current process. This means that your host will be able to enter
the network namespace of your containers, but your containers won't be
able to access the host, nor their sibling containers. Containers will
be able to "see" and affect their sub-containers, though.

The exact format of the command is::

    ip netns exec <nsname> <command...>

For example::

    ip netns exec mycontainer netstat -i

``ip netns`` finds the "mycontainer" container by using namespace
pseudo-files. Each process belongs to one network namespace, one PID
namespace, one ``mnt`` namespace, etc., and those namespaces are
materialized under ``/proc/<pid>/ns/``. For example, the network
namespace of PID 42 is materialized by the pseudo-file
``/proc/42/ns/net``.

When you run ``ip netns exec mycontainer ...``, it expects
``/var/run/netns/mycontainer`` to be one of those
pseudo-files. (Symlinks are accepted.)

In other words, to execute a command within the network namespace of a
container, we need to:

* Find out the PID of any process within the container that we want to
  investigate;
* Create a symlink from ``/var/run/netns/<somename>`` to
  ``/proc/<thepid>/ns/net``;
* Execute ``ip netns exec <somename> ...``

Please review :ref:`run_findpid` to learn how to find the cgroup of a
process running in the container whose network usage you want to
measure. From there, you can examine the pseudo-file named ``tasks``,
which contains the PIDs that are in the control group (i.e., in the
container). Pick any one of them.

Putting everything together, if the "short ID" of a container is held
in the environment variable ``$CID``, then you can do this::

    TASKS=/sys/fs/cgroup/devices/$CID*/tasks
    PID=$(head -n 1 $TASKS)
    mkdir -p /var/run/netns
    ln -sf /proc/$PID/ns/net /var/run/netns/$CID
    ip netns exec $CID netstat -i


Tips for high-performance metric collection
-------------------------------------------

Note that running a new process each time you want to update metrics
is (relatively) expensive. If you want to collect metrics at high
resolutions, and/or over a large number of containers (think 1000
containers on a single host), you do not want to fork a new process
each time.

Here is how to collect metrics from a single process. You will have to
write your metric collector in C (or any language that lets you do
low-level system calls). You need to use a special system call,
``setns()``, which lets the current process enter any arbitrary
namespace. It requires, however, an open file descriptor to the
namespace pseudo-file (remember: that's the pseudo-file in
``/proc/<pid>/ns/net``).

However, there is a catch: you must not keep this file descriptor
open. If you do, when the last process of the control group exits, the
namespace will not be destroyed, and its network resources (like the
virtual interface of the container) will stay around forever (or
until you close that file descriptor).

The right approach is to keep track of the first PID of each
container, and re-open the namespace pseudo-file each time.

Collecting metrics when a container exits
-----------------------------------------

Sometimes, you do not care about real-time metric collection, but when
a container exits, you want to know how much CPU, memory, etc. it has
used.

Docker makes this difficult because it relies on ``lxc-start``, which
carefully cleans up after itself, but it is still possible. It is
usually easier to collect metrics at regular intervals (e.g., every
minute, with the collectd LXC plugin) and rely on that instead.

But, if you'd still like to gather the stats when a container stops,
here is how:

For each container, start a collection process, and move it to the
control groups that you want to monitor by writing its PID to the
``tasks`` file of the cgroup. The collection process should
periodically re-read the ``tasks`` file to check if it's the last
process of the control group. (If you also want to collect network
statistics as explained in the previous section, you should also move
the process to the appropriate network namespace.)

When the container exits, ``lxc-start`` will try to delete the
control groups. It will fail, since the control group is still in
use; but that's fine. Your process should now detect that it is the
only one remaining in the group. Now is the right time to collect
all the metrics you need!

Finally, your process should move itself back to the root control
group, and remove the container control group. To remove a control
group, just ``rmdir`` its directory. It's counter-intuitive to
``rmdir`` a directory while it still contains files; but remember that
this is a pseudo-filesystem, so usual rules don't apply. After the
cleanup is done, the collection process can exit safely.
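The cgroup maneuvers described above can be sketched in shell (the paths are illustrative; ``$CID`` is assumed to hold the container ID, and a real collector would loop instead of the placeholder comment):

```shell
# Join the container's memory cgroup by writing our PID to its tasks file.
CG=/sys/fs/cgroup/memory/lxc/$CID        # hypothetical cgroup directory
echo $$ > $CG/tasks

# ... periodically re-read $CG/tasks; once ours is the only PID left,
# collect the final numbers from memory.stat, cpuacct.stat, etc. ...

# Move back to the root control group, then remove the empty cgroup.
echo $$ > /sys/fs/cgroup/memory/tasks
rmdir $CG
```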
:title: Dockerfile Reference
:description: Dockerfiles use a simple DSL which allows you to automate the steps you would normally manually take to create an image.
:keywords: builder, docker, Dockerfile, automation, image creation

.. _dockerbuilder:

====================
Dockerfile Reference
====================

**Docker can act as a builder** and read instructions from a text
``Dockerfile`` to automate the steps you would otherwise take manually

...
To list available commands, either run ``docker`` with no parameters or execute

...

.. _cli_options:

Types of Options
----------------

Boolean
~~~~~~~

Boolean options look like ``-d=false``. The value you see is the
default value, which gets set if you do **not** use the boolean
flag. If you do call ``run -d``, that sets the opposite boolean value,
so in this case ``true``, and so ``docker run -d`` **will** run in
"detached" mode, in the background. Other boolean options are similar
-- specifying them will set the value to the opposite of the default
value.

Multi
~~~~~

Options like ``-a=[]`` indicate they can be specified multiple times::

    docker run -a stdin -a stdout -a stderr -i -t ubuntu /bin/bash

Sometimes this can use a more complex value string, as for ``-v``::

    docker run -v /host:/container example/mysql

Strings and Integers
~~~~~~~~~~~~~~~~~~~~

Options like ``-name=""`` expect a string, and they can only be
specified once. Options like ``-c=0`` expect an integer, and they can
only be specified once.

----

Commands
--------

.. _cli_daemon:

``daemon``
:title: Docker Run Reference
:description: Configure containers at runtime
:keywords: docker, run, configure, runtime

.. _run_docker:

====================
Docker Run Reference
====================

**Docker runs processes in isolated containers**. When an operator
executes ``docker run``, she starts a process with its own file
system, its own networking, and its own isolated process tree. The
:ref:`image_def` which starts the process may define defaults related
to the binary to run, the networking to expose, and more, but ``docker
run`` gives final control to the operator who starts the container
from the image. That's the main reason :ref:`cli_run` has more options
than any other ``docker`` command.

Every one of the :ref:`example_list` shows running containers, and so
here we try to give more in-depth guidance.

.. contents:: Table of Contents
   :depth: 2

.. _run_running:

General Form
============

As you've seen in the :ref:`example_list`, the basic ``run`` command
takes this form::

    docker run [OPTIONS] IMAGE[:TAG] [COMMAND] [ARG...]

To learn how to interpret the types of ``[OPTIONS]``, see
:ref:`cli_options`.

The list of ``[OPTIONS]`` breaks down into two groups:

1. Settings exclusive to operators, including:

   * Detached or Foreground running,
   * Container Identification,
   * Network settings,
   * Runtime Constraints on CPU and Memory, and
   * Privileges and LXC Configuration

2. Settings shared between operators and developers, where operators
   can override defaults developers set in images at build time.

Together, the ``docker run [OPTIONS]`` give complete control over
runtime behavior to the operator, allowing them to override all
defaults set by the developer during ``docker build`` and nearly all
the defaults set by the Docker runtime itself.

Operator Exclusive Options
==========================

Only the operator (the person executing ``docker run``) can set the
following options.

.. contents::
   :local:

Detached vs Foreground
----------------------

When starting a Docker container, you must first decide if you want to
run the container in the background in "detached" mode or in the
default foreground mode::

    -d=false: Detached mode: Run container in the background, print new container id

Detached (-d)
.............

In detached mode (``-d=true`` or just ``-d``), all I/O should be done
through network connections or shared volumes because the container is
no longer listening to the commandline where you executed ``docker
run``. You can reattach to a detached container with ``docker``
:ref:`cli_attach`. If you choose to run a container in detached
mode, then you cannot use the ``-rm`` option.

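For example, the following starts a container in the background and prints its new ID (a sketch; ``ubuntu`` is assumed to be an image you have already pulled):

```shell
# Run a long-lived loop detached; docker prints the new container's ID.
docker run -d ubuntu /bin/sh -c "while true; do echo hello world; sleep 1; done"
```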
| 84 |
Foreground
..........

In foreground mode (the default when ``-d`` is not specified),
``docker run`` can start the process in the container and attach the
console to the process's standard input, output, and standard
error. It can even pretend to be a TTY (this is what most commandline
executables expect) and pass along signals. All of that is
configurable::

  -a=[]          : Attach to ``stdin``, ``stdout`` and/or ``stderr``
  -t=false       : Allocate a pseudo-tty
  -sig-proxy=true: Proxify all received signals to the process (even in non-tty mode)
  -i=false       : Keep STDIN open even if not attached

If you do not specify ``-a`` then Docker will `attach everything
(stdin, stdout, stderr)
<https://github.com/dotcloud/docker/blob/75a7f4d90cde0295bcfb7213004abce8d4779b75/commands.go#L1797>`_. You
can specify to which of the three standard streams (``stdin``, ``stdout``,
``stderr``) you'd like to connect instead, as in::

  docker run -a stdin -a stdout -i -t ubuntu /bin/bash

For interactive processes (like a shell) you will typically want a tty
as well as persistent standard input (``stdin``), so you'll use ``-i -t``
together in most interactive cases.

Container Identification
------------------------

Name (-name)
............

The operator can identify a container in three ways:

* UUID long identifier ("f78375b1c487e03c9438c729345e54db9d20cfa2ac1fc3494b6eb60872e74778")
* UUID short identifier ("f78375b1c487")
* Name ("evil_ptolemy")

The UUID identifiers come from the Docker daemon, and if you do not
assign a name to the container with ``-name`` then the daemon will
generate a random string name. The name can become a handy way to add
meaning to a container since you can use this name when defining
:ref:`links <working_with_links_names>` (or any other place you need
to identify a container). This works for both background and
foreground Docker containers.

PID Equivalent
..............

And finally, to help with automation, you can have Docker write the
container ID out to a file of your choosing. This is similar to how
some programs might write out their process ID to a file (you've seen
them as PID files)::

  -cidfile="": Write the container ID to the file

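For example, a wrapper script can record the ID at startup and read it
back later. This is only a sketch: the path and server command are
made up, the ``docker`` invocations (which need a running daemon) are
left as comments, and the ID is written by hand here purely so the
shell logic is self-contained::

  CIDFILE=/tmp/web.cid

  # With a daemon available, Docker would write the real ID:
  #   docker run -d -cidfile="$CIDFILE" ubuntu /usr/bin/some-server
  # Simulate what Docker would have written:
  echo "f78375b1c487" > "$CIDFILE"

  # Later tooling can act on the recorded ID:
  CID=$(cat "$CIDFILE")
  echo "managing container $CID"
  #   docker stop "$CID"
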
Network Settings
----------------

::

  -n=true : Enable networking for this container
  -dns=[] : Set custom dns servers for the container

By default, all containers have networking enabled and they can make
any outgoing connections. The operator can completely disable
networking with ``docker run -n=false``, which disables all incoming
and outgoing networking. In cases like this, you would perform I/O
through files or STDIN/STDOUT only.

Your container will use the same DNS servers as the host by default,
but you can override this with ``-dns``.

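For example (the resolver address below is just an illustrative public
DNS server; a running daemon is required)::

  # Resolve names through 8.8.8.8 instead of the host's DNS configuration:
  docker run -dns 8.8.8.8 ubuntu /bin/cat /etc/resolv.conf
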
Clean Up (-rm)
--------------

By default a container's file system persists even after the container
exits. This makes debugging a lot easier (since you can inspect the
final state) and you retain all your data by default. But if you are
running short-term **foreground** processes, these container file
systems can really pile up. If instead you'd like Docker to
**automatically clean up the container and remove the file system when
the container exits**, you can add the ``-rm`` flag::

  -rm=false: Automatically remove the container when it exits (incompatible with -d)

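For example, a throwaway interactive shell whose container is removed
as soon as you exit it::

  # No container or filesystem is left behind after this shell exits:
  docker run -rm -i -t ubuntu /bin/bash
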
Runtime Constraints on CPU and Memory
-------------------------------------

The operator can also adjust the performance parameters of the container::

  -m="": Memory limit (format: <number><optional unit>, where unit = b, k, m or g)
  -c=0 : CPU shares (relative weight)

The operator can constrain the memory available to a container with
``docker run -m``. If the host supports swap memory, then the ``-m``
memory setting can be larger than physical RAM.

Similarly the operator can increase the priority of this container
with the ``-c`` option. By default, all containers run at the same
priority and get the same proportion of CPU cycles, but you can tell
the kernel to give more shares of CPU time to one or more containers
when you start them via Docker.

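As a sketch (the image is a placeholder; the kernel's default cgroup
weight is 1024 shares, so 2048 roughly doubles this container's CPU
weight relative to the others)::

  # Cap memory at 1 GB and give this container twice the default CPU weight:
  docker run -m 1g -c 2048 -i -t ubuntu /bin/bash
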
Runtime Privilege and LXC Configuration
---------------------------------------

::

  -privileged=false: Give extended privileges to this container
  -lxc-conf=[]     : Add custom lxc options -lxc-conf="lxc.cgroup.cpuset.cpus = 0,1"

By default, Docker containers are "unprivileged" and cannot, for
example, run a Docker daemon inside a Docker container. This is
because by default a container is not allowed to access any devices,
but a "privileged" container is given access to all devices (see
lxc-template.go_ and the documentation on `cgroups devices
<https://www.kernel.org/doc/Documentation/cgroups/devices.txt>`_).

When the operator executes ``docker run -privileged``, Docker will
enable access to all devices on the host as well as set some
configuration in AppArmor to allow the container nearly all the same
access to the host as processes running outside containers on the
host. Additional information about running with ``-privileged`` is
available on the `Docker Blog
<http://blog.docker.io/2013/09/docker-can-now-run-within-docker/>`_.

An operator can also specify LXC options using one or more
``-lxc-conf`` parameters. These can be new parameters or override
existing parameters from lxc-template.go_. Note that in the future, a
given host's Docker daemon may not use LXC, so this is an
implementation-specific configuration meant for operators already
familiar with using LXC directly.

.. _lxc-template.go: https://github.com/dotcloud/docker/blob/master/execdriver/lxc/lxc_template.go

Overriding ``Dockerfile`` Image Defaults
========================================

When a developer builds an image from a :ref:`Dockerfile
<dockerbuilder>` or when she commits it, the developer can set a
number of default parameters that take effect when the image starts up
as a container.

Four of the ``Dockerfile`` commands cannot be overridden at runtime:
``FROM``, ``MAINTAINER``, ``RUN``, and ``ADD``. Everything else has a
corresponding override in ``docker run``. We'll go through what the
developer might have set in each ``Dockerfile`` instruction and how the
operator can override that setting.

.. contents::
   :local:

CMD (Default Command or Options)
--------------------------------

Recall the optional ``COMMAND`` in the Docker commandline::

  docker run [OPTIONS] IMAGE[:TAG] [COMMAND] [ARG...]

This command is optional because the person who created the ``IMAGE``
may have already provided a default ``COMMAND`` using the ``Dockerfile``
``CMD``. As the operator (the person running a container from the
image), you can override that ``CMD`` just by specifying a new
``COMMAND``.

If the image also specifies an ``ENTRYPOINT`` then the ``CMD`` or
``COMMAND`` gets appended as arguments to the ``ENTRYPOINT``.

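For example, assuming the ``ubuntu`` image's default ``CMD`` is a
shell, the operator can run a one-off command in its place::

  # The trailing COMMAND replaces the image's default CMD for this run only:
  docker run ubuntu /bin/echo "overriding the default command"
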
ENTRYPOINT (Default Command to Execute at Runtime)
--------------------------------------------------

::

  -entrypoint="": Overwrite the default entrypoint set by the image

The ``ENTRYPOINT`` of an image is similar to a ``COMMAND`` because it
specifies what executable to run when the container starts, but it is
(purposely) more difficult to override. The ``ENTRYPOINT`` gives a
container its default nature or behavior, so that when you set an
``ENTRYPOINT`` you can run the container *as if it were that binary*,
complete with default options, and you can pass in more options via
the ``COMMAND``. But, sometimes an operator may want to run something else
inside the container, so you can override the default ``ENTRYPOINT`` at
runtime by using a string to specify the new ``ENTRYPOINT``. Here is an
example of how to run a shell in a container that has been set up to
automatically run something else (like ``/usr/bin/redis-server``)::

  docker run -i -t -entrypoint /bin/bash example/redis

or two examples of how to pass more parameters to that ``ENTRYPOINT``::

  docker run -i -t -entrypoint /bin/bash example/redis -c ls -l
  docker run -i -t -entrypoint /usr/bin/redis-cli example/redis --help

EXPOSE (Incoming Ports)
-----------------------

The ``Dockerfile`` doesn't give much control over networking, only
providing the ``EXPOSE`` instruction to give a hint to the operator
about what incoming ports might provide services. The following
options work with or override the ``Dockerfile``'s exposed defaults::

  -expose=[]: Expose a port from the container
              without publishing it to your host
  -P=false  : Publish all exposed ports to the host interfaces
  -p=[]     : Publish a container's port to the host (format:
              ip:hostPort:containerPort | ip::containerPort |
              hostPort:containerPort)
              (use 'docker port' to see the actual mapping)
  -link=""  : Add link to another container (name:alias)

As mentioned previously, ``EXPOSE`` (and ``-expose``) make a port
available **in** a container for incoming connections. The port number
on the inside of the container (where the service listens) does not
need to be the same number as the port exposed on the outside of the
container (where clients connect), so inside the container you might
have an HTTP service listening on port 80 (and so you ``EXPOSE 80`` in
the ``Dockerfile``), but outside the container the port might be 42800.

To help a new client container reach the server container's internal
port ``-expose``'d by the operator or ``EXPOSE``'d by the developer,
the operator has three choices: start the server container with ``-P``
or ``-p``, or start the client container with ``-link``.

If the operator uses ``-P`` or ``-p`` then Docker will make the
exposed port accessible on the host and the ports will be available to
any client that can reach the host. To find the map between the host
ports and the exposed ports, use ``docker port``.

If the operator uses ``-link`` when starting the new client container,
then the client container can access the exposed port via a private
networking interface. Docker will set some environment variables in
the client container to help indicate which interface and port to use.

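To illustrate the three ``-p`` formats (``example/web`` is a
hypothetical image whose service listens on port 80)::

  # hostPort:containerPort -- fixed host port 8080 on all interfaces:
  docker run -d -p 8080:80 example/web

  # ip:hostPort:containerPort -- bind only to localhost:
  docker run -d -p 127.0.0.1:8080:80 example/web

  # ip::containerPort -- let Docker pick a free host port on localhost:
  docker run -d -p 127.0.0.1::80 example/web

  # In the last case, ask Docker which host port it chose:
  #   docker port <container-id> 80
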
ENV (Environment Variables)
---------------------------

The operator can **set any environment variable** in the container by
using one or more ``-e`` flags, even overriding those already defined by the
developer with a ``Dockerfile`` ``ENV``::

  $ docker run -e "deep=purple" -rm ubuntu /bin/bash -c export
  declare -x HOME="/"
  declare -x HOSTNAME="85bc26a0e200"
  declare -x OLDPWD
  declare -x PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
  declare -x PWD="/"
  declare -x SHLVL="1"
  declare -x container="lxc"
  declare -x deep="purple"

Similarly the operator can set the **hostname** with ``-h``.

``-link name:alias`` also sets environment variables, using the
*alias* string to define environment variables within the container
that give the IP and PORT information for connecting to the service
container. Let's imagine we have a container running Redis::

  # Start the service container, named redis-name
  $ docker run -d -name redis-name dockerfiles/redis
  4241164edf6f5aca5b0e9e4c9eccd899b0b8080c64c0cd26efe02166c73208f3

  # The redis-name container exposed port 6379
  $ docker ps
  CONTAINER ID        IMAGE                      COMMAND                CREATED             STATUS              PORTS               NAMES
  4241164edf6f        dockerfiles/redis:latest   /redis-stable/src/re   5 seconds ago       Up 4 seconds        6379/tcp            redis-name

  # Note that there are no public ports exposed since we didn't use -p or -P
  $ docker port 4241164edf6f 6379
  2014/01/25 00:55:38 Error: No public port '6379' published for 4241164edf6f

Yet we can get information about the Redis container's exposed ports
with ``-link``. Choose an alias that will form a valid environment
variable!

::

  $ docker run -rm -link redis-name:redis_alias -entrypoint /bin/bash dockerfiles/redis -c export
  declare -x HOME="/"
  declare -x HOSTNAME="acda7f7b1cdc"
  declare -x OLDPWD
  declare -x PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
  declare -x PWD="/"
  declare -x REDIS_ALIAS_NAME="/distracted_wright/redis"
  declare -x REDIS_ALIAS_PORT="tcp://172.17.0.32:6379"
  declare -x REDIS_ALIAS_PORT_6379_TCP="tcp://172.17.0.32:6379"
  declare -x REDIS_ALIAS_PORT_6379_TCP_ADDR="172.17.0.32"
  declare -x REDIS_ALIAS_PORT_6379_TCP_PORT="6379"
  declare -x REDIS_ALIAS_PORT_6379_TCP_PROTO="tcp"
  declare -x SHLVL="1"
  declare -x container="lxc"

And we can use that information to connect from another container as a client::

  $ docker run -i -t -rm -link redis-name:redis_alias -entrypoint /bin/bash dockerfiles/redis -c '/redis-stable/src/redis-cli -h $REDIS_ALIAS_PORT_6379_TCP_ADDR -p $REDIS_ALIAS_PORT_6379_TCP_PORT'
  172.17.0.32:6379>

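A client script inside the linked container can then compose its
connection target from those variables instead of hardcoding an
address. In the sketch below the two variables are set by hand purely
so the snippet is self-contained; in a real ``-link``'ed container
Docker injects them::

  # Simulated: in a container started with -link redis-name:redis_alias,
  # Docker itself would export these.
  REDIS_ALIAS_PORT_6379_TCP_ADDR="172.17.0.32"
  REDIS_ALIAS_PORT_6379_TCP_PORT="6379"

  # Build the target without hardcoding the service's location:
  REDIS_TARGET="$REDIS_ALIAS_PORT_6379_TCP_ADDR:$REDIS_ALIAS_PORT_6379_TCP_PORT"
  echo "$REDIS_TARGET"
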
VOLUME (Shared Filesystems)
---------------------------

::

  -v=[]           : Create a bind mount with: [host-dir]:[container-dir]:[rw|ro].
                    If "container-dir" is missing, then docker creates a new volume.
  -volumes-from="": Mount all volumes from the given container(s)

The volumes commands are complex enough to have their own
documentation in section :ref:`volume_def`. A developer can define one
or more ``VOLUME``\ s associated with an image, but only the operator can
give access from one container to another (or from a container to a
volume mounted on the host).

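As a brief sketch (the paths and the ``data`` container name are
placeholders)::

  # Mount a host directory read-only inside the container:
  docker run -v /host/logs:/container/logs:ro ubuntu /bin/bash

  # Reuse all volumes defined by an existing container named "data":
  #   docker run -volumes-from data ubuntu /bin/bash
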
USER
----

The default user within a container is ``root`` (id = 0), but if the
developer created additional users, those are accessible too. The
developer can set a default user to run the first process with the
``Dockerfile USER`` command, but the operator can override it::

  -u="": Username or UID

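For example (assuming the image defines the user, as the stock
``ubuntu`` image defines ``daemon``)::

  # Run the command as the "daemon" user instead of root:
  docker run -u daemon ubuntu whoami
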
WORKDIR
-------

The default working directory for running binaries within a container
is the root directory (``/``), but the developer can set a different
default with the ``Dockerfile WORKDIR`` command. The operator can
override this with::

  -w="": Working directory inside the container
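For example::

  # pwd runs in /tmp rather than in the root directory:
  docker run -w /tmp ubuntu pwd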