Adding Photon OS Troubleshooting Guide.

Change-Id: I5aac2db7b279e96ed5bbf055a5e740b12efbe40b
Reviewed-on: http://photon-jenkins.eng.vmware.com:8082/1491
Reviewed-by: Steve Hoenisch <shoenisch@vmware.com>
Tested-by: Steve Hoenisch <shoenisch@vmware.com>

shoenisch authored on 2016/10/06 06:57:26
Showing 1 changed files
new file mode 100644
@@ -0,0 +1,1614 @@
0
+# Photon OS Linux Troubleshooting Guide
1
+
2
+-   [Introduction](#introduction)
3
+    -   [Systemd and TDNF](#systemd-and-tdnf)
4
+    -   [The Root Account and the `sudo` and `su`
5
+        Commands](#the-root-account-and-the-sudo-and-su-commands)
6
+    -   [Checking the Version and Build
7
+        Number](#checking-the-version-and-build-number)
8
+    -   [General Best Practices](#general-best-practices)
9
+    -   [Logs on Photon OS](#logs-on-photon-os)
10
+    -   [Troubleshooting Progression](#troubleshooting-progression)
11
+-   [Solutions to Common Problems](#solutions-to-common-problems)
12
+    -   [Resetting a Lost Root
13
+        Password](#resetting-a-lost-root-password)
14
+    -   [Fixing Permissions on Network Config
15
+        Files](#fixing-permissions-on-network-config-files)
16
+    -   [Permitting Root Login with
17
+        SSH](#permitting-root-login-with-ssh)
18
+    -   [Fixing Sendmail If Installed Before an FQDN Was
19
+        Set](#fixing-sendmail-if-installed-before-an-fqdn-was-set)
20
+-   [Common Troubleshooting Tools on Photon
21
+    OS](#common-troubleshooting-tools-on-photon-os)
22
+    -   [Top](#top)
23
+    -   [ps](#ps)
24
+    -   [netstat](#netstat)
25
+    -   [find](#find)
26
+    -   [Locate](#locate)
27
+    -   [df](#df)
28
+    -   [md5sum and sha256sum](#md5sum-and-sha256sum)
29
+    -   [strace](#strace)
30
+    -   [file](#file)
31
+    -   [stat](#stat)
32
+    -   [watch](#watch)
33
+    -   [vmstat and fdisk](#vmstat-and-fdisk)
34
+    -   [lsof](#lsof)
35
+    -   [fuser](#fuser)
36
+    -   [ldd](#ldd)
37
+    -   [gdb](#gdb)
38
+    -   [Other Troubleshooting Tools Installed by
39
+        Default](#other-troubleshooting-tools-installed-by-default)
40
+    -   [Installing More Tools from
41
+        Repositories](#installing-more-tools-from-repositories)
42
+    -   [Linux Troubleshooting Tools Not on Photon
43
+        OS](#linux-troubleshooting-tools-not-on-photon-os)
44
+-   [Systemd](#systemd)
45
+    -   [Viewing Services](#viewing-services)
46
+    -   [Using Systemd Commands Instead of Init.d
47
+        Commands](#using-systemd-commands-instead-of-init.d-commands)
48
+    -   [Analyzing System Logs with
49
+        journalctl](#analyzing-system-logs-with-journalctl)
50
+    -   [Inspecting Services with
51
+        `systemd-analyze`](#inspecting-services-with-systemd-analyze)
52
+-   [Networking](#networking)
53
+    -   [Managing the Network
54
+        Configuration](#managing-the-network-configuration)
55
+    -   [Use `ip` and `ss` Commands Instead of `ifconfig` and
56
+        `netstat`](#use-ip-and-ss-commands-instead-of-ifconfig-and-netstat)
57
+    -   [Inspecting the Status of Network Links with
58
+        `networkctl`](#inspecting-the-status-of-network-links-with-networkctl)
59
+    -   [Turning on Network
60
+        Debugging](#turning-on-network-debugging)
61
+    -   [Installing the Packages for tcpdump and netcat with
62
+        tdnf](#installing-the-packages-for-tcpdump-and-netcat-with-tdnf)
63
+    -   [Checking Firewall Rules](#checking-firewall-rules)
64
+    -   [Netmgr](#netmgr)
65
+-   [File System](#file-system)
66
+    -   [Checking Disk Space](#checking-disk-space)
67
+    -   [Adding a Disk and Partitioning
68
+        It](#adding-a-disk-and-partitioning-it)
69
+    -   [fdisk](#fdisk)
70
+    -   [fsck](#fsck)
71
+    -   [Fixing File System Errors When fsck
72
+        Fails](#fixing-file-system-errors-when-fsck-fails)
73
+-   [Packages](#packages)
74
+-   [Troubleshooting Kernel Problems, Boot Problems, and Login
75
+    Problems](#troubleshooting-kernel-problems-boot-problems-and-login-problems)
76
+    -   [Kernel Overview](#kernel-overview)
77
+    -   [Boot Process Overview](#boot-process-overview)
78
+    -   [Blank Screen on Reboot](#blank-screen-on-reboot)
79
+    -   [Investigating Strange
80
+        Behavior](#investigating-strange-behavior)
81
+    -   [Investigating the Guest Kernel When You Cannot Log
82
+        On](#investigating-the-guest-kernel-when-you-cannot-log-on)
83
+    -   [Kernel Log Replication with
84
+        VProbes](#kernel-log-replication-with-vprobes)
85
+-   [Troubleshooting Performance
86
+    Issues](#troubleshooting-performance-issues)
87
+
88
+
89
+## Introduction 
90
+
91
+This guide describes the fundamentals of troubleshooting problems on Photon OS. An open-source minimalist Linux operating system from VMware, Photon OS is optimized for cloud computing platforms, VMware vSphere deployments, virtual appliances, and applications native to the cloud.
92
+
93
+This guide covers the basics of troubleshooting systemd, packages, network interfaces, services such as SSH and Sendmail, the file system, and the Linux kernel. The guide includes a quick tour of the tools that you can use for troubleshooting and provides examples along the way. The guide also demonstrates how to access the system's log files. 
94
+
95
+For information on how to set up and manage Photon OS, see the [Photon OS Administration Guide](https://github.com/vmware/photon/blob/master/docs/photon-admin-guide.md).
96
+
97
+### Systemd and TDNF
98
+
99
+Two characteristics of Photon OS stand out: It manages services with systemd, and it manages packages with its own open source, yum-compatible package manager called tdnf, for Tiny DNF. 
100
+
101
+By using systemd, Photon OS adopts a contemporary Linux standard to bootstrap the user space and concurrently start services--an architecture that differs from traditional Linux systems such as SUSE Linux Enterprise Server 11.
102
+
105
+A traditional Linux system contains an initialization system called SysVinit. With SLES 11, for instance, SysVinit-style init programs control how the system starts up and shuts down. Init implements system runlevels. A SysVinit runlevel defines a state in which a process or service runs. In contrast to a SysVinit system, systemd defines no such runlevels. Instead, systemd uses a dependency tree of _targets_ to determine which services to start when.
106
+
107
+Because the systemd commands differ from those of an init.d-based Linux system, a section later in this guide illustrates how to troubleshoot by using systemctl commands instead of init.d-style commands. 
108
+
109
+Tdnf keeps the operating system as small as possible while preserving yum's robust package-management capabilities. On Photon OS, tdnf is the default package manager for installing new packages. Since troubleshooting with tdnf differs from using yum, a later section of this guide describes how to solve problems with packages and repositories by using tdnf commands.
110
+
111
+### The Root Account and the `sudo` and `su` Commands
112
+
113
+This guide assumes that you are logged in to Photon OS with the root account and running commands as root. The sudo program comes with the full version of Photon OS. On the minimal version, you must install sudo with tdnf if you want to use it. As an alternative to installing sudo on the minimal version, you can switch users as needed with the `su` command to run commands that require root privileges.
114
+
115
+### Checking the Version and Build Number
116
+
117
+To check the version and build number of Photon OS, display the contents of `/etc/photon-release` with `cat`. Example: 
118
+
119
+	cat /etc/photon-release
120
+	VMware Photon Linux 1.0
121
+	PHOTON_BUILD_NUMBER=a6f0f63
122
+
123
+The build number in the results corresponds to the abbreviated commit hash on the VMware Photon OS GitHub [commits page](https://github.com/vmware/photon/commits/master).
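As a minimal sketch of how this check could be scripted, the helper below prints only the build number from a release file. The function name `photon_build_number` and its optional file argument are illustrative assumptions, not part of Photon OS; only the `PHOTON_BUILD_NUMBER=` line format comes from the example above.

```shell
# Print the value of PHOTON_BUILD_NUMBER from a release file.
# The file defaults to /etc/photon-release; accepting another path is an
# assumption of this sketch that makes the function easy to try out.
photon_build_number() {
    release_file="${1:-/etc/photon-release}"
    # Print only the text that follows "PHOTON_BUILD_NUMBER=".
    sed -n 's/^PHOTON_BUILD_NUMBER=//p' "$release_file"
}
```

For the example file shown above, calling `photon_build_number` would print `a6f0f63`, which you can then search for on the commits page.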
124
+
125
+### General Best Practices
126
+
127
+When troubleshooting, you should follow some general best practices:
128
+
129
+* **Take a snapshot.** Before you do anything to a virtual machine running Photon OS, take a snapshot of the VM so that you can restore it if need be. 
130
+
131
+* **Make a backup copy.** Before you change a configuration file, make a copy of the original in case you need to restore it later; example: `cp /etc/tdnf/tdnf.conf /etc/tdnf/tdnf.conf.orig`
132
+
133
+* **Collect logs.** Save the log files associated with a Photon OS problem; you or others might need them later. Include not only the log files on the guest but also the `vmware.log` file on the host; `vmware.log` is in the host's directory that contains the VM.
134
+
135
+* **Know what's in your toolbox.** Glance at the man page for a tool before you use it so that you know what your options are. The options can help focus the command's output on the problem you're trying to solve.
136
+
137
+* **Understand the system.** The more you know about the operating system and how it works, the better you can troubleshoot.
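The backup practice in the list above can be sketched as a small helper. The function name `backup_config` is hypothetical; it only wraps the `cp` command from the guide and adds a guard so that a second run does not overwrite the first, pristine copy.

```shell
# Copy a configuration file to <file>.orig unless a backup already exists,
# so that repeated runs never overwrite the original backup.
backup_config() {
    src="$1"
    if [ ! -f "$src" ]; then
        echo "backup_config: $src not found" >&2
        return 1
    fi
    if [ -e "$src.orig" ]; then
        echo "backup_config: $src.orig already exists, leaving it alone" >&2
        return 0
    fi
    # -p preserves the mode, ownership, and timestamps of the original.
    cp -p "$src" "$src.orig"
}
```

A typical call would be `backup_config /etc/tdnf/tdnf.conf` before editing that file.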
138
+
139
+### Logs on Photon OS
140
+
141
+On Photon OS, all the system logs except the installation log and the cloud-init log are written into the systemd journal. The `journalctl` command queries the contents of the systemd journal.
142
+
143
+The installation log files and the cloud-init log files reside in `/var/log`. If Photon OS is running on a virtual machine in a VMware hypervisor, the log file for the VMware tools (vmware-vmsvc.log) also resides in `/var/log`. 
144
+
145
+### Troubleshooting Progression
146
+
147
+If you encounter a problem running an application or appliance on Photon OS and you suspect it involves the operating system, you can troubleshoot by proceeding as follows. 
148
+
149
+First, check the services running on Photon OS:
150
+
151
+	systemctl status
152
+
153
+Second, check your application's log files for clues. (For VMware applications, see [Location of Log Files for VMware Products](https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1021806).)
154
+
155
+Third, check the service controller or service monitor for your application or appliance. 
156
+
157
+Fourth, check the network interfaces and other aspects of the network service, which is managed by `systemd-networkd`, with commands such as `networkctl`.
158
+
159
+Fifth, check the operating system's log files: 
160
+
161
+	journalctl
162
+
163
+Next, run the following command to view the chain of time-critical units and when each one started:
164
+
165
+	systemd-analyze critical-chain 
166
+
167
+Finally, if the previous steps have not revealed enough information to isolate the problem, turn to the troubleshooting tool that you think is most likely to help with the issue at hand. You could, for example, use `strace` to identify the location of the failure. See the list of troubleshooting tools on Photon OS in a later section. 
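The commands in this progression can be gathered into one report, sketched below under some assumptions: the function names `capture` and `collect_photon_report` and the default directory `./photon-report` are hypothetical, and the `--no-pager` and `-b` (current boot) options are added so the commands write plain output to files.

```shell
# Run a command and save its combined output under a report directory.
# A failing or missing command is noted in the file instead of aborting.
# Usage: capture <report-dir> <name> <command> [args...]
capture() {
    report_dir="$1"; name="$2"; shift 2
    mkdir -p "$report_dir" || return 1
    if ! "$@" > "$report_dir/$name.txt" 2>&1; then
        echo "command failed or unavailable: $*" >> "$report_dir/$name.txt"
    fi
}

# Collect the outputs named in the troubleshooting progression.
collect_photon_report() {
    dir="${1:-./photon-report}"
    capture "$dir" services systemctl status --no-pager
    capture "$dir" journal journalctl --no-pager -b
    capture "$dir" critical-chain systemd-analyze critical-chain --no-pager
}
```

Running `collect_photon_report /tmp/report` as root would leave one text file per command, which is convenient to attach to a support request alongside the application's own logs.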
168
+
169
+## Solutions to Common Problems
170
+
171
+This section describes solutions to problems that you're likely to encounter.
172
+
173
+### Resetting a Lost Root Password
174
+
175
+Here's how to reset a lost root password. 
176
+
177
+First, restart the Photon OS machine or the virtual machine running Photon OS. When the Photon OS splash screen appears as it restarts, type the letter `e` to go to the GNU GRUB edit menu. Be quick about it: Because Photon OS reboots so quickly, you won't have much time to type `e`. Remember that in vSphere and Workstation, you might have to give the console focus by clicking in its window before it will register input from the keyboard. 
178
+
179
+Second, in the GNU GRUB edit menu, go to the end of the line that starts with `linux`, add a space, and then add the following code exactly as it appears below:
180
+
181
+	rw init=/bin/bash
182
+
183
+After you add this code, the GNU GRUB edit menu should look exactly like this:
184
+
185
+![The modified GNU GRUB edit menu](images/grub-edit-menu-changepw.png) 
186
+
187
+Now press `F10`.
188
+
189
+At the command prompt, type `passwd` and then type (and re-enter) a new root password that conforms to the password complexity rules of Photon OS. Remember the password. 
190
+
191
+Next, type the following command:
192
+
193
+	umount /
194
+
195
+Finally, type the following command. You must include the `-f` option to force a reboot; otherwise, the kernel panics.
196
+
197
+	reboot -f
198
+
199
+This sequence of commands should look like this:
200
+
201
+![The series of commands to reset the root password](images/resetpw.png)
202
+
203
+After the Photon OS machine reboots, log in with the new root password. 
204
+
205
+### Fixing Permissions on Network Config Files
206
+
207
+If you, as the root user, create a new network configuration file on Photon OS, the network service might be unable to process it until you set the file's mode bits to `644`.
208
+
209
+If you query the journal with `journalctl -u systemd-networkd`, you might see the following error message along with an indication that the network service did not start: 
210
+
211
+	could not load configuration files. permission denied
212
+
213
+The permissions on the network files are the likely cause of this problem. Without the correct permissions, `systemd-networkd` cannot parse and apply the settings, and the network configuration that you created will not be loaded. 
214
+
215
+After you create a network configuration file with a `.network` extension, you must run the `chmod` command to set the new file's mode bits to `644`. Example: 
216
+
217
+    chmod 644 10-static-en.network
218
+
219
+For Photon OS to apply the new configuration, you must restart the `systemd-networkd` service by running the following command: 
220
+
221
+	systemctl restart systemd-networkd
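The permission fix described above can be sketched as a helper that finds every `.network` file with a mode other than `644` and resets it. The function name `fix_network_file_modes` is hypothetical, and the default directory `/etc/systemd/network` is assumed from the file paths shown elsewhere in this guide.

```shell
# Print and fix .network files in a directory whose mode is not 644.
# The directory defaults to /etc/systemd/network; pass another path to
# try the function on a scratch directory first.
fix_network_file_modes() {
    net_dir="${1:-/etc/systemd/network}"
    find "$net_dir" -maxdepth 1 -type f -name '*.network' ! -perm 644 \
        -print -exec chmod 644 {} +
}
```

If the function prints any file names, restart `systemd-networkd` afterward so that the corrected files are loaded.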
222
+
223
+### Permitting Root Login with SSH
224
+
225
+The full version of Photon OS prevents root login with SSH by default. To permit root login over SSH, open `/etc/ssh/sshd_config` with the vim text editor and set `PermitRootLogin` to `yes`. 
226
+
227
+Vim is the default text editor available in both the full and minimal versions of Photon OS. (Nano is also in the full version.) After you modify the SSH daemon's configuration file, you must restart the sshd daemon for the changes to take effect. Example: 
228
+
229
+	vim /etc/ssh/sshd_config
230
+
231
+	# override default of no subsystems
232
+	Subsystem       sftp    /usr/libexec/sftp-server
233
+
234
+	# Example of overriding settings on a per-user basis
235
+	#Match User anoncvs
236
+	#       X11Forwarding no
237
+	#       AllowTcpForwarding no
238
+	#       PermitTTY no
239
+	#       ForceCommand cvs server
240
+	PermitRootLogin yes
241
+	UsePAM yes
242
+
243
+Save your changes in vim and then restart the sshd daemon: 
244
+
245
+	systemctl restart sshd
246
+
247
+You can then connect to the Photon OS machine with the root account over SSH:
248
+
249
+	steve@ubuntu:~$ ssh root@198.51.100.131
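To confirm the setting before restarting the daemon, a quick check of the configuration file might look like the sketch below. The function name `permit_root_login_setting` is hypothetical, and the sketch deliberately ignores `Match` blocks and `Include` directives.

```shell
# Print the first uncommented PermitRootLogin value in an sshd config file.
# sshd uses the first value it finds for a keyword, so the first match is
# the one that counts outside of Match blocks.
permit_root_login_setting() {
    config="${1:-/etc/ssh/sshd_config}"
    awk 'tolower($1) == "permitrootlogin" { print $2; exit }' "$config"
}
```

On a running system, `sshd -T` prints the effective configuration and is a more complete check; the function above is only a fast look at the file you just edited.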
250
+
251
+### Fixing Sendmail If Installed Before an FQDN Was Set
252
+
253
+If Sendmail is behaving improperly or if it hangs during installation, it is likely that an FQDN is not set. Take the following corrective action. 
254
+
255
+First, set an FQDN for your Photon OS machine. 
256
+
257
+Then, run the following commands in the order below: 
258
+
259
+    echo $(hostname -f) > /etc/mail/local-host-names
260
+    
261
+    cat > /etc/mail/aliases << "EOF"
262
+    postmaster: root
263
+    MAILER-DAEMON: root
264
+    EOF
265
+
266
+    /bin/newaliases
267
+
268
+    cd /etc/mail
269
+
270
+    m4 m4/cf.m4 sendmail.mc > sendmail.cf
271
+
272
+    chmod 700 /var/spool/clientmqueue
273
+
274
+    chown smmsp:smmsp /var/spool/clientmqueue
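Before running the Sendmail commands above, it can help to confirm that the host name really is fully qualified. The following is a heuristic sketch, not a DNS check; the function name `looks_like_fqdn` is hypothetical.

```shell
# Succeed if the given name looks fully qualified, meaning it contains at
# least one dot with text on both sides of it.
looks_like_fqdn() {
    case "$1" in
        *?.?*) return 0 ;;
        *)     return 1 ;;
    esac
}
```

For example, `looks_like_fqdn "$(hostname -f)" || echo "Set an FQDN first" >&2` warns you when the machine still has a short host name.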
275
+
276
+## Common Troubleshooting Tools on Photon OS
277
+
278
+This section describes tools that can help troubleshoot problems. These tools are installed by default on the full version of Photon OS. On the minimal version of Photon OS, you may have to install a tool before you can use it. 
279
+
280
+There is a manual, or man page, on Photon OS for all the tools covered in this section. The man pages provide more information about each tool's commands, options, and output. To view a tool's man page, on the Photon OS command line, type `man` and then the name of the tool. Example: 
281
+
282
+	man strace
283
+
284
+Some of the examples in this section are marked as abridged with ellipsis (`...`).
285
+
286
+### Top
287
+
288
+Photon OS includes the Top tool to monitor system resources, workloads, and performance. It can unmask problems caused by processes or applications overconsuming CPU time or RAM. 
289
+
290
+To view a textual display of resource consumption, run the `top` command: 
291
+
292
+	top
293
+
294
+In Top, you can kill a runaway or stalled process by typing `k` followed by its process ID (PID). 
295
+
296
+![Top on Photon OS](images/top-in-photon-os.png)
297
+
298
+If the percent of CPU utilization is consistently high with little idle time, there might be a runaway process overconsuming CPUs. Restarting the service might solve the problem. 
299
+
300
+A handy trick while troubleshooting an unknown issue is to run Top in batch mode and append its output to a file in order to collect data about performance over time. The following command takes a sample every 120 seconds:
301
+
302
+	top -b -d 120 >> top120second.output
303
+
304
+For a list of options that filter top output and other information, see the man page for Top.
305
+
306
+### ps
307
+
308
+The `ps` tool shows the processes running on the machine. The `ps` tool derives flexibility and power from its options, all of which are covered in the tool's Photon OS man page:
309
+
310
+	man ps
311
+
312
+Here are several popular invocations of `ps` for troubleshooting. 
313
+
314
+Show processes by user: 
315
+
316
+	ps aux
317
+
318
+Show processes and child processes by user: 
319
+
320
+	ps auxf
321
+
322
+Show processes containing the string `ssh`:
323
+
324
+	ps aux | grep ssh
325
+
326
+Show processes and the command and options with which they were started: 
327
+
328
+	ps auxww
329
+
330
+Example abridged output: 
331
+
332
+	ps auxww
333
+	USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
334
+	root          1  0.0  0.9  32724  3300 ?        Ss   07:51   0:32 /lib/systemd/systemd --switched-root --system --deserialize 22
335
+
336
+### netstat
337
+
338
+The `netstat` command can identify bottlenecks causing performance issues. It lists network connections, listening sockets, port information, and interface statistics for different protocols. Examples: 
339
+
340
+	netstat --statistics
341
+	netstat --listening
342
+
343
+### find
344
+
345
+The `find` command can be a useful starting point to troubleshoot a Photon OS machine that has stopped working. The following command, for example, lists the files anywhere under the root directory that have changed in the past day: 
346
+
347
+	find / -mtime -1 
348
+
349
+See the `find` [manual](https://www.gnu.org/software/findutils/manual/find.html). Take note of the security considerations listed in the `find` manual if you are using `find` to troubleshoot an appliance running on Photon OS. 
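A search of the whole tree also descends into virtual file systems such as `/proc`, which adds noise. One way to narrow it, sketched here with a hypothetical function name `recently_changed`, is to stay on one file system with `-xdev` and list regular files only.

```shell
# List regular files under a directory that were modified in the past day,
# without crossing into other mounted file systems.
recently_changed() {
    find "${1:-/}" -xdev -type f -mtime -1 2>/dev/null
}
```

Calling `recently_changed /etc` limits the search to configuration files, which is often where an unexpected change is found.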
350
+
351
+### Locate
352
+
353
+The `locate` command is a fast way to find files and directories when all you have is a keyword. Similar to `find` and part of the same `findutils` package preinstalled on the full version of Photon OS by default, the `locate` command finds file names in the file names database. Before you can use `locate` accurately, you should update its database: 
354
+
355
+	updatedb
356
+
357
+Then you can run `locate` to quickly find a file, such as any file name containing `.network`, which can be helpful to see all the system's `.network` configuration files; abridged example: 
358
+
359
+	locate .network
360
+	/etc/dbus-1/system.d/org.freedesktop.network1.conf
361
+	/etc/systemd/network/10-dhcp-en.network
362
+	/usr/lib/systemd/network/80-container-host0.network
363
+	/usr/lib/systemd/network/80-container-ve.network
364
+	/usr/lib/systemd/system/busnames.target.wants/org.freedesktop.network1.busname
365
+	/usr/lib/systemd/system/dbus-org.freedesktop.network1.service
366
+	/usr/lib/systemd/system/org.freedesktop.network1.busname
367
+	/usr/share/dbus-1/system-services/org.freedesktop.network1.service
368
+
369
+The `locate` command is also a quick way to see whether a troubleshooting tool is installed on Photon OS. Examples: 
370
+
371
+	locate strace
372
+	/usr/bin/strace
373
+	/usr/bin/strace-graph
374
+	/usr/bin/strace-log-merge
375
+	/usr/share/man/man1/strace.1.gz
376
+	/usr/share/vim/vim74/syntax/strace.vim
377
+
378
+	locate traceroute
379
+
380
+The `strace` tool is there but `traceroute` is not. You can, however, quickly install `traceroute` from the Photon OS repository: 
381
+
382
+	tdnf install traceroute
383
+
384
+
385
+### df
386
+
387
+The `df` command reports the disk space available on the file system. Because running out of disk space can lead an application to fail, a quick check of the available space makes sense as an early troubleshooting step: 
388
+
389
+	df -h
390
+
391
+The `-h` option prints out the available and used space in human-readable sizes. After checking the space, you should also check the number of available inodes. Too few available inodes can lead to difficult-to-diagnose problems:
392
+
393
+	df -i
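Both checks can be automated with a small filter, sketched below. The function name `full_filesystems` and the default threshold of 90 percent are assumptions for illustration. The filter expects the POSIX output format of `df -P`, which keeps each file system on one line, and it assumes mount points contain no spaces.

```shell
# Read `df -P` output on stdin and print the mount point and usage of each
# file system at or above a threshold (percent, default 90).
full_filesystems() {
    threshold="${1:-90}"
    awk -v limit="$threshold" 'NR > 1 {
        use = $5; sub(/%/, "", use)      # strip the percent sign
        if (use + 0 >= limit + 0) print $6, $5
    }'
}
```

Use it as `df -P | full_filesystems 90` for disk space; with GNU `df`, `df -Pi | full_filesystems 90` applies the same test to inode usage.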
394
+
395
+### md5sum and sha256sum
396
+
397
+`md5sum` calculates 128-bit MD5 hashes--a message digest, or fingerprint, of a file--to identify a file and verify its integrity after file transfers, downloads, or disk errors when the security of the file is not in question. Photon OS also includes `sha256sum`, which is the preferred tool for verifying that a file has not been tampered with when security is a concern. Photon OS also includes `shasum`, `sha1sum`, `sha384sum`, and `sha512sum`. See the man pages for `md5sum`, `sha256sum`, and the other SHA utilities. 
398
+
399
+`md5sum` can help troubleshoot installation issues by verifying that the checksum of the Photon OS image being installed matches the checksum published on the Bintray download page. If, for instance, bytes were dropped during the download, the checksums will not match; try downloading the image again. 
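A checksum comparison is easy to get wrong by eye, so it can be scripted. The sketch below uses `sha256sum`; the function name `verify_sha256` is hypothetical, and the expected digest is whatever value the download page publishes.

```shell
# Compare a file's SHA-256 digest with an expected hex string.
# Returns 0 on a match and 1 otherwise.
verify_sha256() {
    file="$1"
    # Accept upper- or lowercase hex in the expected value.
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    actual=$(sha256sum "$file" | cut -d ' ' -f 1)
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual)" >&2
        return 1
    fi
}
```

Call it with the path to the downloaded image and the published digest. A mismatch suggests a damaged or altered download.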
400
+
401
+### strace
402
+
403
+The `strace` utility follows system calls and signals as they are executed so that you can see what an application, command, or process is doing. `strace` can trace failed commands, identify where a process obtains its configuration, monitor file activity, and find the location of a crash. 
404
+
405
+By tracing system calls, `strace` can help troubleshoot a broad range of problems, including issues with input-output, memory, interprocess communication, network usage, and application performance. 
406
+
407
+For troubleshooting a problem that gives off few or no clues, the following command displays every system call: 
408
+
409
+	strace ls -al
410
+
411
+With strace commands, you can route the output to a file to make it easier to analyze: 
412
+
413
+	strace -o output.txt ls -al
414
+
415
+`strace` can reveal the files that an application is trying to open with the `-eopen` option. This combination can help troubleshoot an application that is failing because it is missing files or being denied access to a file it needs. If, for example, you see "No such file or directory" in the results of `strace -eopen`, something might be wrong: 
416
+
417
+	strace -eopen sshd
418
+	open("/usr/lib/x86_64/libpam.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
419
+	open("/usr/lib/libpam.so.0", O_RDONLY|O_CLOEXEC) = 3
420
+
421
+In the results above, it's OK that the first file is missing because it is found in the next line. In other cases, the application might be unable to open one of its configuration files, or it might be reading the wrong one. If the results say "permission denied" for one of the files, check the permissions of the file with `ls -l` or `stat`.
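When the trace is long, filtering a saved log for failed calls is quicker than reading it line by line. The sketch below assumes the log was written with the `-o` option shown earlier; the function name `failed_file_calls` is hypothetical.

```shell
# Print the lines of a saved strace log where a call failed with
# ENOENT (no such file or directory) or EACCES (permission denied).
failed_file_calls() {
    grep -E '= -1 (ENOENT|EACCES) ' "$1"
}
```

After `strace -o output.txt ls -al`, for example, `failed_file_calls output.txt` lists only the failures. As noted above, a missing file is harmless when a later line shows the same file found elsewhere.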
422
+
423
+When troubleshooting with `strace`, you can include the process ID in its commands. Here's an example of how to find a process ID: 
424
+
425
+	ps -ef | grep apache
426
+
427
+And you can then use `strace` to examine the files a process is working with: 
428
+
429
+	strace -e trace=file -p 1719
430
+
431
+A similar command can trace network traffic: 
432
+
433
+	strace -p 812 -e trace=network
434
+
435
+If an application is crashing, use `strace` to trace the application and then analyze what happens right before the application crashes.
436
+
437
+You can also trace the child processes that an application spawns with the fork system call, and you can do so with systemctl commands that start a process to identify why an application crashes immediately or fails to start: 
438
+
439
+	strace -f -o output.txt systemctl start httpd
440
+
441
+Here's another example. If journalctl is showing that networkd is failing, you can run strace to help determine why: 
442
+
443
+	strace -o output.txt systemctl restart systemd-networkd
444
+
445
+And then grep inside the results for something, such as _exit_ or _error_: 
446
+
447
+	grep exit output.txt
448
+
449
+Maybe the results indicate systemd-resolved is going wrong, and you can then strace it, too: 
450
+
451
+	strace -f -o output.txt systemctl restart systemd-resolved
452
+
453
+### file
454
+
455
+The `file` command determines the file type, which can help troubleshoot problems when an application mistakes one type of file for another, leading it to misbehave. Example: 
456
+
457
+	file /usr/sbin/sshd
458
+	/usr/sbin/sshd: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, stripped
459
+
460
+### stat
461
+
462
+The `stat` command can help troubleshoot problems with files or the file system by showing when a file was last accessed, modified, and changed, along with other information. Example:  
463
+
464
+	stat /dev/sda1
465
+	File: '/dev/sda1'
466
+	Size: 0               Blocks: 0          IO Block: 4096   block special file
467
+	Device: 6h/6d   Inode: 6614        Links: 1     Device type: 8,1
468
+	Access: (0660/brw-rw----)  Uid: (    0/    root)   Gid: (    8/    disk)
469
+	Access: 2016-09-02 12:23:56.135999936 +0000
470
+	Modify: 2016-09-02 12:23:52.879999981 +0000
471
+	Change: 2016-09-02 12:23:52.879999981 +0000
472
+	Birth: -
473
+
474
+On Photon OS, `stat` is handy to show permissions for a file or directory in both their absolute octal notation and their read-write-execute abbreviation; truncated example: 
475
+
476
+	chmod 777 tester.md
477
+	stat tester.md
478
+	  File: 'tester.md'
479
+	  Size: 0               Blocks: 0          IO Block: 4096   regular empty file
480
+	Device: 801h/2049d      Inode: 316385      Links: 1
481
+	Access: (0777/-rwxrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
482
+
483
+### watch
484
+
485
+The `watch` utility runs a command at regular intervals so you can observe how its output changes over time. `watch` can help dynamically monitor network links, routes, and other information when you are troubleshooting networking or performance issues. Examples: 
486
+
487
+	watch -n0 --differences ss
488
+	watch -n1 --differences ip route
489
+	
490
+Here's another example with a screenshot of the command's output. This command monitors the traffic on your network links. The highlighted numbers are updated every second so you can see the traffic fluctuating: 
491
+
492
+	watch -n1 --differences ip -s link show up
493
+
494
+![The dynamic output of the watch utility](images/watchcmd.png)  
495
+
496
+### vmstat and fdisk
497
+
498
+The `vmstat` tool displays statistics about virtual memory, processes, block input-output, disks, and CPU activity. This tool can help diagnose performance problems, especially system bottlenecks.  
499
+
500
+Its output on a Photon OS virtual machine running in VMware Workstation 12 Pro without a heavy load looks like this: 
501
+
502
+	vmstat
503
+	procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
504
+	 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
505
+	 0  0      0   5980  72084 172488    0    0    27    44  106  294  1  0 98  1  0
506
+
507
+What do all these codes mean? They are explained in the vmstat man page. 
508
+
509
+If `r`, the number of runnable processes, is higher than 10, the machine is under stress; consider intervening to reduce the number of processes or to distribute some of the processes to other machines. In other words, the machine has a bottleneck in executing processes.
510
+
511
+If `cs`, the number of context switches per second, is really high, there may be too many jobs running on the machine. 
512
+
513
+If `in`, the number of interrupts per second, is relatively high, there might be a bottleneck for network or disk IO. 
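These rules of thumb can be applied to a series of samples rather than a single reading. The filter below is a sketch; the function name `busy_samples` is hypothetical, and the default limit of 10 is simply the run-queue threshold mentioned above.

```shell
# Read vmstat output on stdin and report samples whose run queue
# (column r) exceeds a limit, along with the cs and in columns.
busy_samples() {
    limit="${1:-10}"
    # Data rows start with a number; the two header rows do not.
    awk -v limit="$limit" '$1 ~ /^[0-9]+$/ && $1 + 0 > limit + 0 {
        print "r=" $1, "cs=" $12, "in=" $11
    }'
}
```

For example, `vmstat 5 12 | busy_samples` takes twelve samples five seconds apart and prints only those in which more than 10 processes were runnable.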
514
+
515
+You can investigate disk IO further by using vmstat's `-d` option to report disk statistics; abridged example on a machine with little load: 
516
+
517
+	vmstat -d
518
+	disk- ------------reads------------ ------------writes----------- -----IO------
519
+	       total merged sectors      ms  total merged sectors      ms    cur    sec
520
+	ram0       0      0       0       0      0      0       0       0      0      0
521
+	ram1       0      0       0       0      0      0       0       0      0      0
522
+	loop0      0      0       0       0      0      0       0       0      0      0
523
+	loop1      0      0       0       0      0      0       0       0      0      0
524
+	sr0        0      0       0       0      0      0       0       0      0      0
525
+	sda    22744    676  470604   12908  72888  24949  805224  127692      0    130
526
+
527
+The `-D` option summarizes disk statistics:
528
+
529
+	vmstat -D
530
+	           26 disks
531
+	            2 partitions
532
+	        22744 total reads
533
+	          676 merged reads
534
+	       470604 read sectors
535
+	        12908 milli reading
536
+	        73040 writes
537
+	        25001 merged writes
538
+	       806872 written sectors
539
+	       127808 milli writing
540
+	            0 inprogress IO
541
+	          130 milli spent IO
542
+
543
+You can also get statistics about a partition. First, run the `fdisk -l` command to list the machine's devices. Then run `vmstat -p` with the name of a device to view its stats: 
544
+
545
+
546
+	fdisk -l
547
+	Disk /dev/ram0: 4 MiB, 4194304 bytes, 8192 sectors
548
+	Units: sectors of 1 * 512 = 512 bytes
549
+	Sector size (logical/physical): 512 bytes / 4096 bytes
550
+	I/O size (minimum/optimal): 4096 bytes / 4096 bytes
551
+	...
552
+	Device        Start      End  Sectors Size Type
553
+	/dev/sda1      2048 16771071 16769024   8G Linux filesystem
554
+	/dev/sda2  16771072 16777182     6111   3M BIOS boot
555
+
556
+	vmstat -p /dev/sda1
557
+	sda1          reads   read sectors  writes    requested writes
558
+	               22579     473306      78510     866088
559
+
560
+See the vmstat man page for more options. 
561
+
562
+### lsof
563
+
564
+The `lsof` command lists open files. And this tool's definition of an open file is quite broad--directories, libraries, streams, domain sockets, and Internet sockets are all considered files, making `lsof` broadly applicable as a mid-level troubleshooting tool to identify the files a process is using. Because a Linux system like Photon OS uses files to do its work, you can run `lsof` as root to see how the system is using them and to see how an application works. 
565
+
566
+If, for example, you cannot unmount a disk because it is in use, you can run `lsof` to identify the files on the disk that are being used. Here's an example showing what's using `/root`, the root user's home directory: 
567
+
568
+	lsof /root
569
+	COMMAND    PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
570
+	bash       879 root  cwd    DIR    8,1     4096 262159 /root
571
+	bash      1265 root  cwd    DIR    8,1     4096 262159 /root
572
+	sftp-serv 1326 root  cwd    DIR    8,1     4096 262159 /root
573
+	gdb       1351 root  cwd    DIR    8,1     4096 262159 /root
574
+	bash      1395 root  cwd    DIR    8,1     4096 262159 /root
575
+	lsof      1730 root  cwd    DIR    8,1     4096 262159 /root
576
+
577
+You can do the same with an application or virtual appliance by running `lsof` with the user name or process ID of the app. Here's an example that lists the open files used by the Apache HTTP Server:  
578
+
579
+	lsof -u apache
580
+
581
+Running the command with the `-i` option lists all the open network and Internet files, which can help troubleshoot network problems: 
582
+
583
+	lsof -i
584
+
585
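+See the Unix domain sockets opened by a user like _zookeeper_. The `-a` option ANDs the two selections; without it, `lsof` ORs them and also lists every Unix socket on the system: 
586
+
587
+	lsof -a -u zookeeper -U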
588
+
589
+And here's an example that shows the processes using TCP ports 1 through 80:
590
+
591
+	lsof -i TCP:1-80
592
+	COMMAND  PID   USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
593
+	httpd    403   root    3u  IPv6  10733      0t0  TCP *:http (LISTEN)
594
+	httpd    407 apache    3u  IPv6  10733      0t0  TCP *:http (LISTEN)
595
+	httpd    408 apache    3u  IPv6  10733      0t0  TCP *:http (LISTEN)
596
+	httpd    409 apache    3u  IPv6  10733      0t0  TCP *:http (LISTEN)
597
+	sshd     820   root    3u  IPv4  11336      0t0  TCP *:ssh (LISTEN)
598
+	sshd     820   root    4u  IPv6  11343      0t0  TCP *:ssh (LISTEN)
599
+	sshd    1258   root    3u  IPv4  48040      0t0  TCP 198.51.100.143:ssh->198.51.100.1:49759 (ESTABLISHED)
600
+	sshd    1319   root    3u  IPv4  50866      0t0  TCP 198.51.100.143:ssh->198.51.100.1:51054 (ESTABLISHED)
601
+	sshd    1388   root    3u  IPv4  56438      0t0  TCP 198.51.100.143:ssh->198.51.100.1:60335 (ESTABLISHED)
602
+
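When the `lsof -i` listing is long, a small filter can reduce it to the sockets that are waiting for connections. This is a sketch only: `listening_sockets` is a hypothetical helper, and it assumes the column layout shown in the sample output above, where the state is the last field and the address is the field before it.

```shell
# Print each unique "COMMAND ADDRESS:PORT" pair that is in the LISTEN state.
# Reads `lsof -i` style output on standard input.
listening_sockets() {
    awk '$NF == "(LISTEN)" { print $1, $(NF - 1) }' | sort -u
}

# Demonstration on lines copied from the sample output above.
listening_sockets << 'EOF'
COMMAND  PID   USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
httpd    403   root    3u  IPv6  10733      0t0  TCP *:http (LISTEN)
httpd    407 apache    3u  IPv6  10733      0t0  TCP *:http (LISTEN)
sshd     820   root    3u  IPv4  11336      0t0  TCP *:ssh (LISTEN)
sshd    1258   root    3u  IPv4  48040      0t0  TCP 198.51.100.143:ssh->198.51.100.1:49759 (ESTABLISHED)
EOF
```

On a live system, you would pipe the real command into the filter instead, for example `lsof -i TCP:1-80 | listening_sockets`.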
603
+You can also inspect the files opened by a process by specifying its process ID. Here's a truncated example that lists the files opened by the systemd network service: 
604
+
605
+	lsof -p 1917
606
+	COMMAND    PID            USER   FD      TYPE             DEVICE SIZE/OFF   NODE NAME
607
+	systemd-n 1917 systemd-network  cwd       DIR                8,1     4096      2 /
608
+	systemd-n 1917 systemd-network  txt       REG                8,1   887896 272389 /usr/lib/systemd/systemd-networkd
609
+	systemd-n 1917 systemd-network  mem       REG                8,1   270680 262267 /usr/lib/libnss_files-2.22.so
610
+	systemd-n 1917 systemd-network    0r      CHR                1,3      0t0   5959 /dev/null
611
+	systemd-n 1917 systemd-network    1u     unix 0x0000000000000000      0t0  45734 type=STREAM
612
+	systemd-n 1917 systemd-network    3u  netlink                         0t0   6867 ROUTE
613
+	systemd-n 1917 systemd-network    4u     unix 0x0000000000000000      0t0  45744 type=DGRAM
614
+	systemd-n 1917 systemd-network    9u  netlink                         0t0  45754 KOBJECT_UEVENT
615
+	systemd-n 1917 systemd-network   12u  a_inode               0,11        0   5955 [timerfd]
616
+	systemd-n 1917 systemd-network   13u     IPv4             104292      0t0    UDP 198.51.100.143:bootpc
617
+
618
+### fuser
619
+
620
+The `fuser` command identifies the process IDs of processes using files or sockets. The term _process_ is, in this case, synonymous with _user_. To identify the process ID of a process using a socket, run `fuser` with its namespace option (`-n`) and specify `tcp` or `udp` followed by the service name or port number. Examples: 
621
+
622
+	fuser -n tcp ssh
623
+	ssh/tcp:               940  1308
624
+	fuser -n tcp http
625
+	http/tcp:              592   594   595   596
626
+	fuser -n tcp 80
627
+	80/tcp:                592   594   595   596
628
+
629
+
630
+### ldd
631
+
632
+By revealing the shared libraries that a program depends on, `ldd` can help troubleshoot an application that is missing a library or finding the wrong one.
633
+
634
+If, for example, you find output that says "not found," check the path to the library.  
635
+
636
+	ldd /usr/sbin/sshd
637
+	linux-vdso.so.1 (0x00007ffc0e3e3000)
638
+	libpam.so.0 => not found
639
+	libcrypto.so.1.0.0 => /usr/lib/libcrypto.so.1.0.0 (0x00007f624e570000)
640
+
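For a program with many dependencies, you can filter the `ldd` output so that only the unresolved libraries remain. The helper below is a hypothetical sketch, not part of Photon OS; it assumes that an unresolved entry contains the words "not found" and that the library name is the first field.

```shell
# Print the names of shared libraries that ldd could not resolve.
# Reads ldd output on standard input.
missing_libs() {
    awk '/not found/ { print $1 }'
}

# On a live system (the binary path is only an example):
#   ldd /usr/sbin/sshd | missing_libs
printf '%s\n' \
    'linux-vdso.so.1 (0x00007ffc0e3e3000)' \
    'libpam.so.0 => not found' \
    'libcrypto.so.1.0.0 => /usr/lib/libcrypto.so.1.0.0 (0x00007f624e570000)' |
    missing_libs
```

An empty result suggests that every dependency was resolved; it does not prove that the resolved copy is the version the application expects.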
641
+You can also use the `objdump` command to show dependencies for a program's object files; example:
642
+
643
+	objdump -p /usr/sbin/sshd | grep NEEDED
644
+
645
+### gdb
646
+
647
+The gdb tool is the GNU debugger. It lets you peer inside a program while it executes or when it crashes so that you can catch bugs on the fly. The gdb tool is typically used to debug programs written in C and C++. On Photon OS, gdb can help you determine why an application crashed. See the man page for gdb for instructions on how to run it. For an extensive example of how to use gdb to troubleshoot Photon OS running on a VM when you cannot log in to Photon OS, see the section on troubleshooting boot and logon problems. 
648
+
649
+### Other Troubleshooting Tools Installed by Default
650
+
651
+The following troubleshooting tools are included in the full version of Photon OS: 
652
+
653
+* `grep` searches files for patterns. 
654
+* `ping` tests network connectivity. 
655
+* `strings` displays the printable character sequences in a file, including a binary file, to help identify its contents.
656
+* `lsmod` lists loaded modules.
657
+* `ipcs` shows data about the inter-process communication (IPC) resources to which a process has read access--typically, shared memory segments, message queues, and semaphore arrays.
658
+* `nm` lists symbols from object files. 
659
+* `diff` compares files line by line. Useful to compare two configuration files when one version works and the other doesn't. 
660
+
661
+### Installing More Tools from Repositories
662
+
663
+You can install several troubleshooting tools from the Photon OS repositories by using the default package management system, `tdnf`. 
664
+
665
+If a tool you need is not installed, the first thing you should do is search the repositories to see whether it's available. The traceroute tool, for example, is not installed by default. Here's how to search for it in the repositories:  
666
+
667
+	tdnf search traceroute
668
+	traceroute : Traces the route taken by packets over an IPv4/IPv6 network
669
+
670
+The results of the above command show that traceroute exists in the repository. You install it with `tdnf`: 
671
+
672
+	tdnf install traceroute
673
+
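Before you troubleshoot, it can help to check which tools are already present. The following sketch uses only the POSIX `command -v` built-in; `check_tool` is a hypothetical helper, and the assumption that a package has the same name as its command holds for `traceroute` but not for every tool.

```shell
# Report whether a command is available; if it is not, print the tdnf
# command that would install the package assumed to provide it.
check_tool() {
    tool=$1
    pkg=${2:-$1}    # assumption: the package name defaults to the tool name
    if command -v "$tool" >/dev/null 2>&1; then
        echo "$tool: installed"
    else
        echo "$tool: missing; try: tdnf install $pkg"
        return 1
    fi
}

check_tool sh
check_tool traceroute || true    # do not abort when the tool is absent
```

Pass a second argument when the package name differs from the command name, for example `check_tool sar sysstat`.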
674
+The following additional tools are not installed by default but are available in the repository for installation with `tdnf`: 
675
+
676
+* `net-tools`: networking tools.
677
+* `ltrace`: tool for intercepting and recording dynamic library calls. It can identify the function an application was calling when it crashed, making it useful for debugging.
678
+* `nfs-utils`: client tools for the kernel Network File System, or NFS, including showmount; installed by default in the full version of Photon OS but not in the minimal version. 
679
+* `pcstat`: A tool that inspects which pages of a file or files are being cached by the Linux kernel.
680
+* `sysstat` and `sar`: Utilities to monitor system performance and usage activity. Installing sysstat also installs sar.
681
+* `systemtap` and `crash`: The systemtap utility is a programmable instrumentation system for diagnosing problems of performance or function. Installing systemtap also installs crash, which is a kernel crash analysis utility for live systems and dump files.
682
+* `dstat`: versatile tool for viewing and analyzing statistics about system resources.
683
+
684
+The `dstat` tool, for example, can help troubleshoot system performance. The tool shows a live, running list of statistics about system resources: 
685
+
686
+	dstat
687
+	You did not select any stats, using -cdngy by default.
688
+	----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
689
+	usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
690
+	  1   0  98   1   0   0|4036B   42k|   0     0 |   0     0 |  95   276
691
+	  1   0  98   1   0   0|   0    64k|  60B  940B|   0     0 | 142   320
692
+	  1   1  98   0   0   0|   0    52k|  60B  476B|   0     0 | 149   385
693
+
694
+
695
+### Linux Troubleshooting Tools Not on Photon OS
696
+
697
+The following Linux troubleshooting tools are neither installed on Photon OS by default nor available in the Photon OS repositories: 
698
+
699
+* iostat
700
+* telnet (use SSH instead)
701
+* Iprm
702
+* hdparm
703
+* syslog (use journalctl instead)
704
+* ddd
705
+* ksymoops
706
+* xev
707
+* GUI tools (because Photon OS has no GUI)
708
+
709
+## Systemd
710
+
711
+Photon OS manages services with systemd and its command-line utility for inspecting and controlling the system, `systemctl`, not the deprecated commands of init.d. For example, instead of running the /etc/init.d/ssh script to stop and start the OpenSSH server on an init.d-based Linux system, you control the service by running the following systemctl commands on Photon OS: 
712
+
713
+	systemctl stop sshd
714
+	systemctl start sshd
715
+
716
+For an overview of systemd, see [systemd System and Service Manager](https://www.freedesktop.org/wiki/Software/systemd/) and the [man page for systemd](https://www.freedesktop.org/software/systemd/man/systemd.html). The systemd man pages are listed at [https://www.freedesktop.org/software/systemd/man/](https://www.freedesktop.org/software/systemd/man/).
717
+
718
+### Viewing Services 
719
+
720
+To view a description of all the active, loaded units, execute the systemctl command without any options or arguments: 
721
+
722
+	systemctl
723
+
724
+To see all the loaded, active, and inactive units and their description, run this command: 
725
+
726
+	systemctl --all
727
+
728
+To see all the unit files and their current status but no description, run this command: 
729
+
730
+	systemctl list-unit-files
731
+
732
+The `grep` command filters the services by a search term, a helpful tactic to recall the exact name of a unit file without looking through a long list of names. Example: 
733
+
734
+	systemctl list-unit-files | grep network
735
+	org.freedesktop.network1.busname           static
736
+	dbus-org.freedesktop.network1.service      enabled
737
+	systemd-networkd-wait-online.service       enabled
738
+	systemd-networkd.service                   enabled
739
+	systemd-networkd.socket                    enabled
740
+	network-online.target                      static
741
+	network-pre.target                         static
742
+	network.target  
743
+
744
+### Using Systemd Commands Instead of Init.d Commands
745
+
746
+Basic system administration commands on Photon OS differ from those on operating systems that use SysVinit. Since Photon OS uses systemd instead of SysVinit, you must use systemd commands to manage services. 
747
+
748
+For example, to list all the services that you can manage on Photon OS, you run the following command instead of `ls /etc/rc.d/init.d/`: 
749
+
750
+	systemctl list-unit-files --type=service
751
+
752
+Similarly, to check whether the `sshd` service is enabled, on Photon OS you run the following command instead of `chkconfig sshd`:
753
+
754
+	systemctl is-enabled sshd
755
+
756
+The `chkconfig --list` command that shows which services are enabled for which runlevel on a SysVinit computer becomes substantially different on Photon OS because there are no runlevels, only targets: 
757
+
758
+	ls /etc/systemd/system/*.wants
759
+
760
+You can also display similar information with the following command: 
761
+
762
+	systemctl list-unit-files --type=service
763
+
764
+Here is a list of some of the systemd commands that take the place of SysVinit commands on Photon OS: 
765
+
766
+	USE THIS SYSTEMD COMMAND 	INSTEAD OF THIS SYSVINIT COMMAND
767
+	systemctl start sshd 		service sshd start
768
+	systemctl stop sshd 		service sshd stop
769
+	systemctl restart sshd 		service sshd restart
770
+	systemctl reload sshd 		service sshd reload
771
+	systemctl condrestart sshd 	service sshd condrestart
772
+	systemctl status sshd 		service sshd status
773
+	systemctl enable sshd 		chkconfig sshd on
774
+	systemctl disable sshd 		chkconfig sshd off
775
+	systemctl daemon-reload		chkconfig sshd --add
776
+
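If you are converting old notes or scripts, the table above can be expressed as a small lookup. This is an illustrative sketch, not a Photon OS utility: `sysv_to_systemctl` is a hypothetical name, and it covers only the `service` and `chkconfig` forms that this section mentions.

```shell
# Print the systemctl equivalent of a SysVinit-style command line.
# Usage: sysv_to_systemctl service UNIT ACTION
#        sysv_to_systemctl chkconfig UNIT [on|off]
sysv_to_systemctl() {
    tool=${1:-} unit=${2:-} action=${3:-}
    case "$tool $action" in
        "service start"|"service stop"|"service restart"|\
        "service reload"|"service condrestart"|"service status")
            echo "systemctl $action $unit" ;;
        "chkconfig on")  echo "systemctl enable $unit" ;;
        "chkconfig off") echo "systemctl disable $unit" ;;
        "chkconfig ")    echo "systemctl is-enabled $unit" ;;
        *) echo "no mapping for: $*" >&2; return 1 ;;
    esac
}

sysv_to_systemctl service sshd restart
sysv_to_systemctl chkconfig sshd on
```

The function only prints the translated command, so you can review it before running it.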
777
+### Analyzing System Logs with journalctl
778
+
779
+The journalctl tool queries the contents of the systemd journal. On Photon OS, all the system logs except the installation log and the cloud-init log are written into the systemd journal. 
780
+
781
+If called without parameters, the `journalctl` command shows all the contents of the journal, beginning with the oldest entry. To display the output in reverse order with new entries first, include the `-r` option in the command:
782
+
783
+	journalctl -r
784
+
785
+The `journalctl` command includes many options to filter its output. For help troubleshooting systemd, two journalctl queries are particularly useful: showing the log entries for the current boot and showing the log entries for a systemd service unit. This command displays the messages logged since the machine last started: 
786
+
787
+	journalctl -b
788
+
789
+This command reveals the messages for only the systemd service unit specified by the `-u` option, which in the following example is the auditing service: 
790
+
791
+	journalctl -u auditd
792
+
793
+You can look at the messages for systemd itself or for the network service:
794
+
795
+	journalctl -u systemd
796
+	journalctl -u systemd-networkd
797
+
798
+Example:  
799
+
800
+	root@photon-1a0375a0392e [ ~ ]# journalctl -u systemd-networkd
801
+	-- Logs begin at Tue 2016-08-23 14:35:50 UTC, end at Tue 2016-08-23 23:45:44 UTC. --
802
+	Aug 23 14:35:52 photon-1a0375a0392e systemd[1]: Starting Network Service...
803
+	Aug 23 14:35:52 photon-1a0375a0392e systemd-networkd[458]: Enumeration completed
804
+	Aug 23 14:35:52 photon-1a0375a0392e systemd[1]: Started Network Service.
805
+	Aug 23 14:35:52 photon-1a0375a0392e systemd-networkd[458]: eth0: Gained carrier
806
+	Aug 23 14:35:53 photon-1a0375a0392e systemd-networkd[458]: eth0: DHCPv4 address 198.51.100.1
807
+	Aug 23 14:35:54 photon-1a0375a0392e systemd-networkd[458]: eth0: Gained IPv6LL
808
+	Aug 23 14:35:54 photon-1a0375a0392e systemd-networkd[458]: eth0: Configured
809
+
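Because journal output is plain text, you can pipe it through standard tools to pull out one detail. The following sketch extracts the DHCPv4 address that each interface obtained. It assumes the message wording shown in the example above (`eth0: DHCPv4 address ...`), and `dhcp_leases` is a hypothetical helper name.

```shell
# Print "INTERFACE ADDRESS" for every DHCPv4 lease message.
# Reads `journalctl -u systemd-networkd` output on standard input.
dhcp_leases() {
    awk '{
        for (i = 2; i <= NF - 2; i++)
            if ($i == "DHCPv4" && $(i + 1) == "address") {
                iface = $(i - 1)
                sub(/:$/, "", iface)    # drop the trailing colon from "eth0:"
                print iface, $(i + 2)
            }
    }'
}

# Demonstration on two lines copied from the example above.
dhcp_leases << 'EOF'
Aug 23 14:35:52 photon-1a0375a0392e systemd-networkd[458]: eth0: Gained carrier
Aug 23 14:35:53 photon-1a0375a0392e systemd-networkd[458]: eth0: DHCPv4 address 198.51.100.1
EOF
```

On a live system, you might run `journalctl -b -u systemd-networkd | dhcp_leases` to limit the search to the current boot.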
810
+
811
+For more information, see [journalctl](https://www.freedesktop.org/software/systemd/man/journalctl.html) or the journalctl man page by running this command: `man journalctl`
812
+
813
+### Inspecting Services with `systemd-analyze`
814
+
815
+The `systemd-analyze` command reveals performance statistics for boot times, traces system services, and verifies unit files. It can help troubleshoot slow system boots and incorrect unit files. See the man page for a list of options. Examples:
816
+
817
+	systemd-analyze blame
818
+
819
+	systemd-analyze dump
820
+
821
+## Networking
822
+
823
+### Managing the Network Configuration
824
+
825
+The network service, which is enabled by default, starts when the system boots. You manage the network service by using systemd components and commands, such as `systemd-networkd`, `systemd-resolved`, and `networkctl`. You can check the status of the network service by running the following command: 
826
+
827
+	systemctl status systemd-networkd
828
+
829
+Here is a healthy result of the command: 
830
+
831
+	* systemd-networkd.service - Network Service
832
+	   Loaded: loaded (/usr/lib/systemd/system/systemd-networkd.service; enabled; vendor preset: enabled)
833
+	   Active: active (running) since Fri 2016-04-29 15:08:51 UTC; 6 days ago
834
+	     Docs: man:systemd-networkd.service(8)
835
+	 Main PID: 291 (systemd-network)
836
+	   Status: "Processing requests..."
837
+	   CGroup: /system.slice/systemd-networkd.service
838
+	           `-291 /lib/systemd/systemd-networkd
839
+
840
+Because Photon OS relies on systemd to manage services, you should employ the systemd suite of commands, not deprecated init.d commands or other deprecated commands, to manage networking. 
841
+
842
+### Use `ip` and `ss` Commands Instead of `ifconfig` and `netstat`
843
+
844
+Although the `ifconfig` command and the `netstat` command work on Photon OS, VMware recommends that you use the `ip` or `ss` commands. The `ifconfig` and `netstat` commands are deprecated. 
845
+
846
+For example, instead of running `netstat` to display a list of network sockets, consider running the `ss` command. Similarly, to display information for IP addresses, instead of running `ifconfig -a`, run the `ip addr` command. Examples:
847
+
848
+	USE THIS IPROUTE COMMAND 	INSTEAD OF THIS NET-TOOL COMMAND
849
+	ip addr 					ifconfig -a
850
+	ss 							netstat
851
+	ip route 					route
852
+	ip maddr 					netstat -g
853
+	ip link set eth0 up 		ifconfig eth0 up
854
+	ip -s neigh					arp -v
855
+	ip link set eth0 mtu 9000	ifconfig eth0 mtu 9000
856
+
857
+Using the iproute2 version of a command instead of the net-tools version often provides more complete, accurate information on Photon OS. In the following example, `ip neigh` reports the state of each neighbor entry, which `arp -a` omits: 
858
+
859
+	ip neigh
860
+	198.51.100.2 dev eth0 lladdr 00:50:56:e2:02:0f STALE
861
+	198.51.100.254 dev eth0 lladdr 00:50:56:e7:13:d9 STALE
862
+	198.51.100.1 dev eth0 lladdr 00:50:56:c0:00:08 DELAY
863
+
864
+	arp -a
865
+	? (198.51.100.2) at 00:50:56:e2:02:0f [ether] on eth0
866
+	? (198.51.100.254) at 00:50:56:e7:13:d9 [ether] on eth0
867
+	? (198.51.100.1) at 00:50:56:c0:00:08 [ether] on eth0
868
+
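The line-oriented output of the `ip` commands is also easy to post-process. As a sketch, the hypothetical helper below reduces `ip neigh` output to the address, hardware address, and state of each entry; it assumes the `lladdr` keyword layout shown in the example above.

```shell
# Print "ADDRESS MAC STATE" for each neighbor entry that has a link-layer address.
# Reads `ip neigh` output on standard input.
neigh_table() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "lladdr") print $1, $(i + 1), $NF
    }'
}

# Demonstration on lines copied from the example above.
neigh_table << 'EOF'
198.51.100.2 dev eth0 lladdr 00:50:56:e2:02:0f STALE
198.51.100.1 dev eth0 lladdr 00:50:56:c0:00:08 DELAY
EOF
```

Entries without a resolved hardware address produce no output, so a neighbor that is missing from the result is itself a clue worth investigating.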
869
+**Important:** If you modify an IPv6 configuration or add an IPv6 interface, you must restart `systemd-networkd`. Traditional methods of using `ifconfig` commands will be inadequate to register the changes. Run the following command instead: 
870
+
871
+	systemctl restart systemd-networkd
872
+
873
+
874
+### Inspecting the Status of Network Links with `networkctl`
875
+
876
+The `networkctl` command shows information about network connections that helps you configure networking services and troubleshoot networking problems. You can, for example, progressively add options and arguments to the `networkctl` command to move from general information about network connections to specific information about a network connection. 
877
+
878
+Running `networkctl` without options defaults to the list command:  
879
+
880
+	networkctl
881
+	IDX LINK             TYPE               OPERATIONAL SETUP
882
+	  1 lo               loopback           carrier     unmanaged
883
+	  2 eth0             ether              routable    configured
884
+	  3 docker0          ether              routable    unmanaged
885
+	 11 vethb0aa7a6      ether              degraded    unmanaged
886
+	 4 links listed.
887
+
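On a host with many links, you can filter the list by operational state. The sketch below assumes the five-column layout shown above (IDX, LINK, TYPE, OPERATIONAL, SETUP); `links_in_state` is a hypothetical helper name.

```shell
# Print the names of links whose OPERATIONAL column matches the given state.
# Reads the output of `networkctl` on standard input.
links_in_state() {
    awk -v state="$1" '$1 ~ /^[0-9]+$/ && $4 == state { print $2 }'
}

# Demonstration on the sample listing above.
links_in_state degraded << 'EOF'
IDX LINK             TYPE               OPERATIONAL SETUP
  1 lo               loopback           carrier     unmanaged
  2 eth0             ether              routable    configured
  3 docker0          ether              routable    unmanaged
 11 vethb0aa7a6      ether              degraded    unmanaged
 4 links listed.
EOF
```

On a live system, `networkctl | links_in_state degraded` would list the links to inspect next with `networkctl status`.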
888
+Running `networkctl` with the status command displays information that looks like this; you can see there are active network links with IP addresses for not only the Ethernet connection but also the Docker bridge, `docker0`. 
889
+
890
+	root@photon-rc [ ~ ]# networkctl status
891
+	*      State: routable
892
+	     Address: 198.51.100.131 on eth0
893
+	              172.17.0.1 on docker0
894
+	              fe80::20c:29ff:fe55:3ca6 on eth0
895
+	              fe80::42:f0ff:fef7:bd81 on docker0
896
+	              fe80::4c84:caff:fe76:a23f on vethb0aa7a6
897
+	     Gateway: 198.51.100.2 on eth0
898
+	         DNS: 198.51.100.2
899
+
900
+You can then add a network link, such as the Ethernet connection, as the argument of the status command to show specific information about the link: 
901
+
902
+	root@photon-rc [ ~ ]# networkctl status eth0
903
+	* 2: eth0
904
+	       Link File: /usr/lib/systemd/network/99-default.link
905
+	    Network File: /etc/systemd/network/10-dhcp-en.network
906
+	            Type: ether
907
+	           State: routable (configured)
908
+	            Path: pci-0000:02:01.0
909
+	          Driver: e1000
910
+	      HW Address: 00:0c:29:55:3c:a6 (VMware, Inc.)
911
+	             MTU: 1500
912
+	         Address: 198.51.100.131
913
+	                  fe80::20c:29ff:fe55:3ca6
914
+	         Gateway: 198.51.100.2
915
+	             DNS: 198.51.100.2
916
+	        CLIENTID: ffb6220feb00020000ab116724f520a0a77337
917
+
918
+And you can do the same thing with the Docker bridge: 
919
+
920
+	networkctl status docker0
921
+	* 3: docker0
922
+	       Link File: /usr/lib/systemd/network/99-default.link
923
+	    Network File: n/a
924
+	            Type: ether
925
+	           State: routable (unmanaged)
926
+	          Driver: bridge
927
+	      HW Address: 02:42:f0:f7:bd:81
928
+	             MTU: 1500
929
+	         Address: 172.17.0.1
930
+	                  fe80::42:f0ff:fef7:bd81
931
+
932
+In the example above, it is OK that the state of the Docker bridge is unmanaged; Docker handles the networking for its containers without using systemd-resolved or systemd-networkd. Instead, Docker manages the containers' connections by using its bridge driver.
933
+
934
+For more information about `networkctl` commands and options, see https://www.freedesktop.org/software/systemd/man/networkctl.html.
935
+
936
+### Turning on Network Debugging
937
+
938
+You can set `systemd-networkd` to work in debug mode so that you can analyze log files with debugging information to help troubleshoot networking problems. The following procedure turns on network debugging by adding a drop-in file in /etc/systemd to customize the default systemd configuration in /usr/lib/systemd. 
939
+
940
+First, run the following command as root to create a directory with this exact name, including the `.d` extension:
941
+
942
+	mkdir -p /etc/systemd/system/systemd-networkd.service.d/
943
+
944
+Second, run the following command as root to establish a systemd drop-in unit with a debugging configuration for the network service:
945
+
946
+	cat > /etc/systemd/system/systemd-networkd.service.d/10-loglevel-debug.conf << "EOF"
947
+	[Service]
948
+	Environment=SYSTEMD_LOG_LEVEL=debug
949
+	EOF
950
+ 
951
+You must reload the systemd manager configuration and restart the systemd-networkd service for the changes to take effect: 
952
+
953
+	systemctl daemon-reload
954
+	systemctl restart systemd-networkd
955
+
956
+Verify that your changes took effect:
957
+
958
+	systemd-delta --type=extended
959
+
960
+View the log files by running this command: 
961
+
962
+	journalctl -u systemd-networkd
963
+
964
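+When you are finished debugging the network connections, turn debugging off by deleting the drop-in file. Then run `systemctl daemon-reload` and `systemctl restart systemd-networkd` again so that the change takes effect: 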
965
+
966
+	rm /etc/systemd/system/systemd-networkd.service.d/10-loglevel-debug.conf
967
+
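If you turn debugging on and off often, you can wrap the drop-in creation in a function. The following is a sketch under stated assumptions: `write_debug_dropin` is a hypothetical name, and its optional argument is an alternate root directory that lets you rehearse the change in a scratch location before touching the live `/etc`.

```shell
# Create the debug drop-in for systemd-networkd and print its path.
# An optional first argument names an alternate root for a dry run.
write_debug_dropin() {
    root=${1:-}
    dir="$root/etc/systemd/system/systemd-networkd.service.d"
    mkdir -p "$dir" || return 1
    cat > "$dir/10-loglevel-debug.conf" << "EOF"
[Service]
Environment=SYSTEMD_LOG_LEVEL=debug
EOF
    echo "$dir/10-loglevel-debug.conf"
}

# Dry run against a scratch directory instead of the live /etc.
scratch=$(mktemp -d)
conf=$(write_debug_dropin "$scratch")
cat "$conf"
rm -rf "$scratch"
```

Called without an argument as root, the function writes the same file that the procedure above creates; you still need to run `systemctl daemon-reload` and `systemctl restart systemd-networkd` afterward.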
968
+### Installing the Packages for tcpdump and netcat with tdnf
969
+
970
+The minimal version of Photon OS leaves out several useful networking tools to keep the operating system lean. Tcpdump, for example, is absent in the minimal version but available in the repository. The minimal version does, however, include the iproute2 tools by default. 
971
+
972
+Tcpdump captures and analyzes packets on a network interface. On Photon OS, you install tcpdump and its accompanying package libpcap, a C/C++ library for capturing network traffic, by using tdnf, Photon's command-line package manager: 
973
+
974
+	tdnf install tcpdump
975
+
976
+Netcat, a tool for sending data over network connections with TCP or UDP, appears in neither the minimal nor the full version of Photon OS. But since netcat furnishes powerful options for analyzing, troubleshooting, and debugging network connections, you might want to install it. To do so, run the following command: 
977
+
978
+	tdnf install netcat
979
+
980
+### Checking Firewall Rules
981
+
982
+The design of Photon OS emphasizes security. On the minimal and full versions of Photon OS, the default security policy turns on the firewall and drops packets from external interfaces and applications. As a result, you might need to add rules to iptables to permit forwarding, allow protocols like HTTP, and open ports. In other words, you must configure the firewall for your applications and requirements. 
983
+
984
+The default iptables settings on the full version look like this:
985
+
986
+	iptables --list
987
+	Chain INPUT (policy DROP)
988
+	target     prot opt source               destination
989
+	ACCEPT     all  --  anywhere             anywhere
990
+	ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED
991
+	ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:ssh
992
+
993
+	Chain FORWARD (policy DROP)
994
+	target     prot opt source               destination
995
+
996
+	Chain OUTPUT (policy DROP)
997
+	target     prot opt source               destination
998
+	ACCEPT     all  --  anywhere             anywhere
999
+
1000
+
1001
+To find out how to adjust the settings, see the man page for iptables. 
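As an illustration of opening a port, the sketch below builds an `iptables` command that appends an ACCEPT rule for one TCP port to the INPUT chain. The helper name `allow_tcp_port` is hypothetical, and the function only prints the command so that you can review it; a rule added this way lasts until the next reboot unless you save it by whatever mechanism your system uses.

```shell
# Print an iptables command that accepts inbound TCP traffic on one port.
# The command is printed, not executed, so that it can be reviewed first.
allow_tcp_port() {
    port=$1
    case "$port" in
        ''|*[!0-9]*) echo "invalid port: $port" >&2; return 1 ;;
    esac
    if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
        echo "port out of range: $port" >&2
        return 1
    fi
    echo "iptables -A INPUT -p tcp --dport $port -j ACCEPT"
}

allow_tcp_port 80    # a rule that would admit HTTP
```

After you run the printed command as root, `iptables --list` should show the new rule at the end of the INPUT chain.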
1002
+
1003
+Although the default iptables policy accepts SSH connections, the `sshd` configuration file on the full version of Photon OS is set to reject root logins over SSH. See [Permitting Root Login with SSH](#permitting-root-login-with-ssh).
1004
+
1005
+If you are unable to ping a Photon OS machine, one of the first things you should do is check the firewall rules. Do they allow connectivity for the port and protocol in question? You can supplement the `iptables` commands by using `lsof` to, for instance, see the processes listening on ports: 
1006
+
1007
+	lsof -i -P -n
1008
+
1009
+### Netmgr
1010
+
1011
+If you are running a VMware appliance on Photon OS and the VAMI module has problems or if there are networking issues, you can use the Photon OS `netmgr` utility to inspect the networking settings. Make sure, in particular, that the IP addresses for the DNS server and other infrastructure are correct. Use `tcpdump` to analyze the issues. 
1012
+
1013
+If you get an error code from netmgr, it is a standard Unix error code--enter it into a search engine to obtain more information.
1014
+
1015
+## File System
1016
+
1017
+This section covers troubleshooting the file system.
1018
+
1019
+### Checking Disk Space
1020
+
1021
+One of the first simple steps to take when you're troubleshooting is to check how much disk space is available by running the `df` command: 
1022
+
1023
+	df -h
1024
+
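To spot a nearly full file system quickly, you can filter the `df` output by usage. This is a minimal sketch that assumes the six-column layout that `df -h` and `df -P` print, with the usage percentage in the fifth column; `filesystems_over` is a hypothetical helper name.

```shell
# Print "MOUNTPOINT USE%" for file systems at or above a usage threshold.
# Reads df output on standard input; the first line is treated as a header.
filesystems_over() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%$/, "", use)
        if (use + 0 >= limit) print $6, $5
    }'
}

# Demonstration on sample lines in the format that df prints.
filesystems_over 50 << 'EOF'
Filesystem      Size  Used Avail Use% Mounted on
/dev/root       7.8G  4.4G  3.1G  59% /
tmpfs           173M  664K  172M   1% /run
EOF
```

On a live system, `df -P | filesystems_over 90` keeps each entry on a single line even when a device name is long.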
1025
+### Adding a Disk and Partitioning It
1026
+
1027
+If the `df` command shows that the file system is indeed nearing capacity, you can add a new disk on the fly and partition it to increase capacity. 
1028
+
1029
+First, add a new disk. You can, for example, add a new disk to a virtual machine by using the VMware vSphere Client. After adding a new disk, check for the new disk by using `fdisk`; see the section on `fdisk` below. In the following example, the new disk is named `/dev/sdb`:
1030
+
1031
+	fdisk -l
1032
+	Device        Start      End  Sectors Size Type
1033
+	/dev/sda1      2048 16771071 16769024   8G Linux filesystem
1034
+	/dev/sda2  16771072 16777182     6111   3M BIOS boot
1035
+	
1036
+	Disk /dev/sdb: 1 GiB, 1073741824 bytes, 2097152 sectors
1037
+	Units: sectors of 1 * 512 = 512 bytes
1038
+	Sector size (logical/physical): 512 bytes / 512 bytes
1039
+	I/O size (minimum/optimal): 512 bytes / 512 bytes
1040
+
1041
+After you confirm that Photon OS registers the new disk, you can partition it with the `parted` wizard. The command to partition the disk on Photon OS is as follows: 
1042
+
1043
+	parted /dev/sdb
1044
+
1045
+At the `parted` prompt, create a partition table and then a partition (see the man page for `parted` for more information):
1046
+
1047
+	mklabel gpt
1048
+	mkpart ext3 1 1024
1049
+
1050
+Then you must create a file system on the partition:
1051
+
1052
+	mkfs -t ext3 /dev/sdb1
1053
+
1054
+Make a directory where you will mount the new file system: 
1055
+
1056
+	mkdir /newdata
1057
+
1058
+Next, open `/etc/fstab` and add the new file system with the options that you want: 
1059
+
1060
+	#system mnt-pt  type    options dump    fsck
1061
+	/dev/sda1       /       ext4    defaults,barrier,noatime,noacl,data=ord$
1062
+	/dev/cdrom      /mnt/cdrom      iso9660 ro,noauto       0       0
1063
+	/dev/sdb1       /newdata        ext3    defaults        0		0
1064
+
1065
+Mount it for now: 
1066
+
1067
+	mount /newdata
1068
+
1069
+Check your work: 
1070
+
1071
+	df -h
1072
+	Filesystem      Size  Used Avail Use% Mounted on
1073
+	/dev/root       7.8G  4.4G  3.1G  59% /
1074
+	devtmpfs        172M     0  172M   0% /dev
1075
+	tmpfs           173M     0  173M   0% /dev/shm
1076
+	tmpfs           173M  664K  172M   1% /run
1077
+	tmpfs           173M     0  173M   0% /sys/fs/cgroup
1078
+	tmpfs           173M   36K  173M   1% /tmp
1079
+	tmpfs            35M     0   35M   0% /run/user/0
1080
+	/dev/sdb1       945M  1.3M  895M   1% /newdata
1081
+
1082
+### fdisk
1083
+
1084
+The `fdisk` command manipulates the disk partition table. You can, for example, use `fdisk` to list the disk partitions so that you can identify the root Linux file system. Here is a truncated example showing `/dev/sda1` to be the root Linux partition: 
1085
+
1086
+	fdisk -l
1087
+	Disk /dev/ram0: 4 MiB, 4194304 bytes, 8192 sectors
1088
+	Units: sectors of 1 * 512 = 512 bytes
1089
+	Sector size (logical/physical): 512 bytes / 4096 bytes
1090
+	I/O size (minimum/optimal): 4096 bytes / 4096 bytes
1091
+	...
1092
+	Disk /dev/sda: 8 GiB, 8589934592 bytes, 16777216 sectors
1093
+	Units: sectors of 1 * 512 = 512 bytes
1094
+	Sector size (logical/physical): 512 bytes / 512 bytes
1095
+	I/O size (minimum/optimal): 512 bytes / 512 bytes
1096
+	Disklabel type: gpt
1097
+	Disk identifier: 3CFA568B-2C89-4290-8B52-548732A3972D
1098
+
1099
+	Device        Start      End  Sectors Size Type
1100
+	/dev/sda1      2048 16771071 16769024   8G Linux filesystem
1101
+	/dev/sda2  16771072 16777182     6111   3M BIOS boot
1102
+
1103
+Remember the `fdisk -l` command--it is also used in the section that demonstrates how to reset a lost root password. 
1104
+
1105
+### fsck
1106
+
1107
+Photon OS supports the btrfs and ext4 file systems. The default root file system is ext4, which you can see by looking at the file system configuration file, `/etc/fstab`: 
1108
+
1109
+	cat /etc/fstab
1110
+	#system mnt-pt  type    options dump    fsck
1111
+	/dev/sda1       /       ext4    defaults,barrier,noatime,noacl,data=ordered     1       1
1112
+	/dev/cdrom      /mnt/cdrom      iso9660 ro,noauto       0       0
1113
+
1114
+The `1` in the sixth column, under `fsck`, indicates that fsck checks the file system when the system boots. The fifth column, `dump`, is a flag for the dump backup utility.
1115
+
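In the fstab format, the sixth field is the fsck pass number: `0` skips the check, and the root file system normally uses `1`. The following sketch lists that field for each entry; `fsck_passes` is a hypothetical helper name, and it ignores comment lines and entries with fewer than six fields.

```shell
# Print "PASS DEVICE MOUNTPOINT" for each fstab entry.
# Reads fstab-formatted text on standard input.
fsck_passes() {
    awk '$1 !~ /^#/ && NF >= 6 { print $6, $1, $2 }'
}

# Demonstration on the fstab shown above.
fsck_passes << 'EOF'
#system mnt-pt  type    options dump    fsck
/dev/sda1       /       ext4    defaults,barrier,noatime,noacl,data=ordered     1       1
/dev/cdrom      /mnt/cdrom      iso9660 ro,noauto       0       0
EOF
```

On a live system, run `fsck_passes < /etc/fstab` to see which file systems are checked at boot.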
1116
+You can manually check the file system by using the file system consistency check tool, `fsck`, after you unmount the file system. You can also perform a read-only check without unmounting it:
1117
+
1118
+	fsck -nf /dev/sda1
1119
+	fsck from util-linux 2.27.1
1120
+	e2fsck 1.42.13 (17-May-2015)
1121
+	Warning!  /dev/sda1 is mounted.
1122
+	Warning: skipping journal recovery because doing a read-only filesystem check.
1123
+	Pass 1: Checking inodes, blocks, and sizes
1124
+	Pass 2: Checking directory structure
1125
+	Pass 3: Checking directory connectivity
1126
+	Pass 4: Checking reference counts
1127
+	Pass 5: Checking group summary information
1128
+	Free blocks count wrong (1439651, counted=1423942).
1129
+	Fix? no
1130
+	Free inodes count wrong (428404, counted=428397).
1131
+	Fix? no
1132
+	/dev/sda1: 95884/524288 files (0.3% non-contiguous), 656477/2096128 blocks
1133
+
1134
+The free block and inode counts are probably off because the file system is mounted and in use. To fix problems, you must first unmount the file system and then run fsck again: 
1135
+
1136
+	umount /dev/sda1
1137
+	umount: /: target is busy
1138
+	        (In some cases useful info about processes that
1139
+	         use the device is found by lsof(8) or fuser(1).)
1140
+
1141
+So check it with `lsof`:
1142
+
1143
+	lsof | grep ^jbd2/sd
1144
+	jbd2/sda1   99                root  cwd       DIR                8,1     4096          2 /
1145
+	jbd2/sda1   99                root  rtd       DIR                8,1     4096          2 /
1146
+	jbd2/sda1   99                root  txt   unknown                                        /proc/99/exe
1147
+
1148
+The file system is indeed in use. What troubleshooting tool would you use next to further explore the applications or processes that are using the file system?  
1149
+
1150
+### Fixing File System Errors When fsck Fails
1151
+
1152
+A potential issue is that when `fsck` runs during startup, it finds a problem that prevents the system from fully booting until you fix the issue by running fsck manually. This kind of problem can occur when Photon OS is the operating system for a VM running an appliance.
1153
+
1154
+If fsck fails when the computer boots and an error message says to run fsck manually, you can troubleshoot by restarting the VM, altering the GRUB edit menu to enter emergency mode before Photon OS fully boots, and running fsck.
1155
+
1156
+1. Take a snapshot of the virtual machine. 
1157
+
1158
+1. Restart the virtual machine running Photon OS. 
1159
+
1160
+1. When the Photon OS splash screen appears as it restarts, type the letter `e` to go to the GNU GRUB edit menu. Be quick about it: Because Photon OS reboots so quickly, you won't have much time to type `e`. Remember that in VMware vSphere or VMware Workstation Pro, you might have to give the console focus by clicking in its window before it will register input from the keyboard. 
1161
+
1162
+1. In the GNU GRUB edit menu, go to the end of the line that starts with `linux`, add a space, and then add the following code exactly as it appears below:
1163
+
1164
+	`systemd.unit=emergency.target`
1165
+
1166
+1. Press `F10`.
1167
+
1168
+1. In the bash shell, run one of the following commands to fix the file system errors, depending on whether `sda1` or `sda2` represents the root file system: 
1169
+
1170
+	`e2fsck -y /dev/sda1`
1171
+
1172
+	or
1173
+
1174
+	`e2fsck -y /dev/sda2`
1175
+
1176
+1. Restart the virtual machine.
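When you run `e2fsck` in the procedure above, its exit status tells you whether the repair succeeded. The status is a bit mask documented in the `fsck(8)` and `e2fsck(8)` man pages. The sketch below decodes it; `describe_fsck_status` is a hypothetical helper name, and the wording of each message is a paraphrase of the man page, not tool output.

```shell
# Translate an fsck/e2fsck exit status into words. The status is a bit
# mask, so several conditions can be reported at once (see fsck(8)).
# (describe_fsck_status is an illustrative helper name.)
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    msg=""
    if [ $(($status & 1)) -ne 0 ]; then msg="$msg, errors corrected"; fi
    if [ $(($status & 2)) -ne 0 ]; then msg="$msg, reboot required"; fi
    if [ $(($status & 4)) -ne 0 ]; then msg="$msg, errors left uncorrected"; fi
    if [ $(($status & 8)) -ne 0 ]; then msg="$msg, operational error"; fi
    if [ $(($status & 16)) -ne 0 ]; then msg="$msg, usage error"; fi
    if [ $(($status & 32)) -ne 0 ]; then msg="$msg, canceled by user"; fi
    if [ $(($status & 128)) -ne 0 ]; then msg="$msg, shared-library error"; fi
    echo "${msg#, }"
}

# Example use in emergency mode (not run here):
#   e2fsck -y /dev/sda1; describe_fsck_status $?
```

A status that includes "errors left uncorrected" suggests that the file system still needs attention before you restart the virtual machine.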
1177
+
1178
+
1179
+## Packages
1180
+
1181
+On Photon OS, tdnf is the default package manager. The standard syntax for `tdnf` commands is the same as that for DNF and Yum: 
1182
+
1183
+	tdnf [options] <command> [<arguments>...]
1184
+
1185
+The main configuration file is `/etc/tdnf/tdnf.conf`. The repositories appear in `/etc/yum.repos.d/` with `.repo` file extensions. For more information, see the [Photon OS Administration Guide](https://github.com/vmware/photon/blob/master/docs/photon-admin-guide.md).
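When a package cannot be found, it can help to confirm which repositories are switched on. The following sketch scans the `.repo` files and prints the IDs of repositories that explicitly set `enabled=1`. The function name `list_enabled_repos` is invented for this example, and the sketch assumes the usual INI-style layout of `.repo` files; it does not try to reproduce how tdnf treats a repository that omits the `enabled` key.

```shell
# Print the IDs of repositories that set enabled=1 in the .repo files
# of a directory (default: /etc/yum.repos.d).
# (list_enabled_repos is an illustrative helper name.)
list_enabled_repos() {
    dir=${1:-/etc/yum.repos.d}
    for f in "$dir"/*.repo; do
        [ -e "$f" ] || continue
        awk -F= '
            # A [section] header names the repository that follows.
            /^\[.*\]/ { id = $0; sub(/^\[/, "", id); sub(/\].*$/, "", id) }
            # An "enabled" key applies to the most recent section.
            $1 ~ /^[ \t]*enabled[ \t]*$/ {
                val = $2
                gsub(/[ \t]/, "", val)
                if (val == "1" && id != "") print id
            }
        ' "$f"
    done
}

# Prints nothing if the directory does not exist on this machine.
list_enabled_repos
```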
1186
+
1187
+The cache files for data and metadata reside in `/var/cache/tdnf`. The local cache is populated with data from the repository: 
1188
+
1189
+	ls -l /var/cache/tdnf/photon
1190
+	total 8
1191
+	drwxr-xr-x 2 root root 4096 May 18 22:52 repodata
1192
+	d-wxr----t 3 root root 4096 May  3 22:51 rpms
1193
+
1194
+You can clear the cache to help troubleshoot a problem, but keep in mind that doing so might slow the performance of tdnf until the cache becomes repopulated with data. Cleaning the cache can remove stale information. Here is how to clear the cache: 
1195
+
1196
+	tdnf clean all
1197
+	Cleaning repos: photon photon-extras photon-updates lightwave
1198
+	Cleaning up everything
1199
+
1200
+Some tdnf commands can help you troubleshoot problems with packages:
1201
+
1202
+`tdnf makecache`: This command updates the cached binary metadata for all known repositories. You can run it after you clean the cache to make sure you are working with the latest repository data as you troubleshoot. Example:
1203
+
1204
+	tdnf makecache
1205
+	Refreshing metadata for: 'VMware Lightwave 1.0(x86_64)'
1206
+	Refreshing metadata for: 'VMware Photon Linux 1.0(x86_64)Updates'
1207
+	Refreshing metadata for: 'VMware Photon Extras 1.0(x86_64)'
1208
+	Refreshing metadata for: 'VMware Photon Linux 1.0(x86_64)'
1209
+	Metadata cache created.
1210
+
1211
+`tdnf check-local`: This command resolves dependencies by using the local RPMs to help check RPMs for quality assurance before publishing them. To check RPMs with this command, you must create a local directory and place your RPMs in it. The command, which includes no options, takes the path to the local directory containing the RPMs as its argument. The command does not, however, recursively parse directories; it checks the RPMs only in the directory that you specify. For example, after creating a directory named `/tmp/myrpms` and placing your RPMs in it, you can run the following command to check them:  
1212
+
1213
+	tdnf check-local /tmp/myrpms
1214
+	Checking all packages from: /tmp/myrpms
1215
+	Found 10 packages
1216
+	Check completed without issues
1217
+
1218
+`tdnf provides`: This command finds the packages that provide the package that you supply as an argument. If you are used to a package name for another system, you can use `tdnf provides` to find the corresponding name of the package on Photon OS. Example: 
1219
+
1220
+	tdnf provides docker
1221
+	docker-1.11.0-1.ph1.x86_64 : Docker
1222
+	Repo     : photon
1223
+	docker-1.11.0-1.ph1.x86_64 : Docker
1224
+	Repo     : @System
1225
+
1226
+For a file, you must provide the full path. Here's an example: 
1227
+
1228
+	tdnf provides /usr/include/stdio.h
1229
+	glibc-devel-2.22-8.ph1.x86_64 : Header files for glibc
1230
+	Repo     : photon
1231
+	glibc-devel-2.22-8.ph1.x86_64 : Header files for glibc
1232
+	Repo     : @System
1233
+
1234
+Here's an example that shows you how to find the package that provides a pluggable authentication module, which you might need to find if the system is mishandling passwords. 
1235
+
1236
+	tdnf provides /etc/pam.d/system-account
1237
+	shadow-4.2.1-7.ph1.x86_64 : Programs for handling passwords in a secure way
1238
+	Repo     : photon
1239
+	shadow-4.2.1-8.ph1.x86_64 : Programs for handling passwords in a secure way
1240
+	Repo     : photon-updates
1241
+
1242
+Additional commands appear in the [Photon OS Administration Guide](https://github.com/vmware/photon/blob/master/docs/photon-admin-guide.md).
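The output of `tdnf provides` pairs each package line with a `Repo` line, which makes it easy to miss that a newer build sits in a different repository. The sketch below condenses that output to one `package repo` pair per line. It assumes the two-line layout shown in the examples above; `provides_summary` is an illustrative name.

```shell
# Condense `tdnf provides` output (read from stdin) to "package repo" pairs.
# Assumes each package line is followed by a "Repo : name" line.
# (provides_summary is an illustrative helper name.)
provides_summary() {
    awk -F' : ' '
        /^Repo[ \t]*:/ { sub(/^Repo[ \t]*:[ \t]*/, ""); print pkg, $0; next }
        NF >= 2        { pkg = $1 }
    '
}

# Example use on a Photon OS machine (not run here):
#   tdnf provides /etc/pam.d/system-account | provides_summary
```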
1243
+
1244
+If you find a package that is installed but is not working, try re-installing it; example: 
1245
+
1246
+	tdnf reinstall shadow
1247
+	Reinstalling:
1248
+	shadow 	x86_64 	4.2.1-7.ph1   3.85 M
1249
+
1250
+## Kernel Problems, Boot Problems, and Login Problems
1251
+
1252
+### Kernel Overview
1253
+
1254
+Photon OS 1.0 uses Linux kernel version 4.4. Troubleshooting kernel problems starts with `dmesg`. The `dmesg` command prints messages from the kernel ring buffer. The following command, for example, presents kernel messages in a human-readable format: 
1255
+
1256
+	dmesg --human --kernel
1257
+
1258
+To examine kernel messages as you perform actions, such as reproducing a problem, in another terminal, you can run the command with the `--follow` option, which waits for new messages and prints them as they occur: 
1259
+
1260
+	dmesg --human --kernel --follow
1261
+
1262
+The kernel ring buffer is limited in memory size. As a result, the kernel overwrites the oldest messages in the buffer from which dmesg pulls information. The systemd journal, however, saves the information from the buffer to a log file so that you can access older information. To view it, run the following command:
1263
+
1264
+	journalctl -k
1265
+
1266
+If need be, you can check the modules that are loaded on your Photon OS machine by running the `lsmod` command; truncated example:  
1267
+
1268
+	lsmod
1269
+	Module                  Size  Used by
1270
+	vmw_vsock_vmci_transport    28672  1
1271
+	vsock                  36864  2 vmw_vsock_vmci_transport
1272
+	coretemp               16384  0
1273
+	hwmon                  16384  1 coretemp
1274
+	crc32c_intel           24576  0
1275
+	hid_generic            16384  0
1276
+	usbhid                 28672  0
1277
+	hid                   106496  2 hid_generic,usbhid
1278
+	xt_conntrack           16384  1
1279
+	iptable_nat            16384  0
1280
+	nf_conntrack_ipv4      16384  2
1281
+	nf_defrag_ipv4         16384  1 nf_conntrack_ipv4
1282
+	nf_nat_ipv4            16384  1 iptable_nat
1283
+	nf_nat                 24576  1 nf_nat_ipv4
1284
+	iptable_filter         16384  1
1285
+	ip_tables              24576  2 iptable_filter,iptable_nat
1286
+
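If you are looking for one particular module, scanning the full `lsmod` listing by eye is error-prone. A small filter such as the following sketch can answer the question directly. It matches the module name exactly against the first column and skips the header line; `module_loaded` is a name chosen for this example.

```shell
# Succeed (exit status 0) if the named module appears in lsmod-formatted
# text on stdin; fail otherwise. The header line is skipped.
# (module_loaded is an illustrative helper name.)
module_loaded() {
    awk -v name="$1" 'NR > 1 && $1 == name { found = 1 } END { exit (found ? 0 : 1) }'
}

# Example use (not run here):
#   lsmod | module_loaded vsock && echo "vsock is loaded"
```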
1287
+
1288
+### Boot Process Overview
1289
+
1290
+When a Photon OS machine boots, the BIOS initializes the hardware and uses a boot loader to start the kernel. After the kernel starts, systemd takes over and boots the rest of the operating system. 
1291
+
1292
+More specifically, the BIOS checks the memory and initializes the keyboard, the screen, and other peripherals. When the BIOS finds the first hard disk, it reads the master boot record (MBR) and hands control to the boot loader, GNU GRUB 2.02. GRUB then loads the kernel and the initial RAM disk (initrd) into memory. The device manager, udev, provides initrd with the drivers it needs to access the device containing the root file system. Here's what the GNU GRUB edit menu looks like in Photon OS with its default commands to load the boot record and initialize the RAM disk:
1293
+
1294
+![The GNU GRUB edit menu in the full and minimal versions of Photon OS](images/grub-edit-menu-orig.png)  
1295
+
1296
+At this point, the Linux kernel in Photon OS, which is kernel version 4.4.8, takes control. Systemd kicks in, initializes services in parallel, mounts the rest of the file system, and checks the file system for errors. 
1297
+
1298
+### Blank Screen on Reboot
1299
+
1300
+If the Photon OS kernel enters a state of panic during a reboot and all you see is a blank screen, note the name of the virtual machine running Photon OS and then power off the VM. 
1301
+
1302
+In the host, open the `vmware.log` file for the VM. When a kernel panics, the guest VM prints the entire kernel log to `vmware.log` in the directory on the host that contains the VM. This log file contains the output of the `dmesg` command from the guest, and you can analyze it to help identify the cause of the boot problem.
1303
+
1304
+Here's an example. After searching for `Guest:` in the following abridged `vmware.log`, this line appears, identifying the root cause of the reboot problem: 
1305
+
1306
+	2016-08-30T16:02:43.220-07:00| vcpu-0| I125: Guest: 
1307
+	<0>[1.125804] Kernel panic - not syncing: 
1308
+	VFS: Unable to mount root fs on unknown-block(0,0)
1309
+
1310
+Further inspection finds the following lines: 
1311
+
1312
+	2016-08-30T16:02:43.217-07:00| vcpu-0| I125: Guest: 
1313
+	<4>[    1.125782] VFS: Cannot open root device "sdc1" or unknown-block(0,0): error -6
1314
+	2016-08-30T16:02:43.217-07:00| vcpu-0| I125: Guest: 
1315
+	<4>[    1.125783] Please append a correct "root=" boot option; 
1316
+	here are the available partitions: 
1317
+	2016-08-30T16:02:43.217-07:00| vcpu-0| I125: Guest: 
1318
+	<4>[    1.125785] 0100            4096 ram0  (driver?)
1319
+	...
1320
+	0800         8388608 sda  driver: sd
1321
+	2016-08-30T16:02:43.220-07:00| vcpu-0| I125: Guest: 
1322
+	<4>[    1.125802]   0801         8384512 sda1 611e2d9a-a3da-4ac7-9eb9-8d09cb151a93
1323
+	2016-08-30T16:02:43.220-07:00| vcpu-0| I125: Guest: 
1324
+	<4>[    1.125803]   0802            3055 sda2 8159e59c-b382-40b9-9070-3c5586f3c7d6
1325
+
1326
+In this unlikely case, the GRUB configuration points to a root device named `sdc1` instead of the correct root device, `sda1`. You can fix the problem by restoring the GNU GRUB edit menu and the GRUB configuration file (`/boot/grub/grub.cfg`) to their original configurations.
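On a large `vmware.log`, a targeted search can be quicker than reading every `Guest:` line. The sketch below looks for the message text from the example above and prints matching lines with their line numbers. It assumes you are working on a copy of `vmware.log` on a machine with `grep`; the function name `find_boot_failure` and the choice of search strings are illustrative, and other failures will produce different messages.

```shell
# Print vmware.log lines that mention a kernel panic or a root-device
# failure, with line numbers. $1 is the path to a copy of vmware.log
# (default: vmware.log in the current directory).
# (find_boot_failure is an illustrative helper name.)
find_boot_failure() {
    grep -n -E 'Kernel panic|Cannot open root device|Unable to mount root fs' "${1:-vmware.log}"
}

# Example use (not run here):
#   find_boot_failure /path/to/copy/of/vmware.log
```

Because the search keys on the message text, it should work whether the message sits on the same line as `Guest:` or on the line after it.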
1327
+
1328
+### Investigating Strange Behavior
1329
+
1330
+If you rebooted to address strange behavior before the reboot or if you encountered strange behavior during the reboot but have reached the shell, you should analyze what happened since the previous boot. Start broad by running the following command to check the logs:
1331
+
1332
+	journalctl
1333
+
1334
+Next, run the following command to look at the log from the previous boot, which covers what happened before the most recent reboot:
1335
+
1336
+	journalctl --boot=-1
1337
+
1338
+Then look at the log from the current boot:
1339
+
1340
+	journalctl -b
1341
+
1342
+If need be, examine the logs for the kernel: 
1343
+
1344
+	journalctl -k
1345
+
1346
+Check which kernel is in use:
1347
+
1348
+	uname -r
1349
+
1350
+The kernel version of Photon OS in the full version is 4.4.8. The kernel version in the OVA version is 4.4.8-esx. With the ESX version of the kernel, some services might not start. Run this command to check the overall status of services:
1351
+
1352
+	systemctl status 
1353
+
1354
+If a service is in red, check it: 
1355
+
1356
+	systemctl status service-name
1357
+
1358
+Start it if need be: 
1359
+
1360
+	systemctl start service-name
1361
+
1362
+If looking at the journal and checking the status of services gets you nowhere, run the following `systemd-analyze` commands to examine the boot time and the speed with which services start.
1363
+
1364
+	systemd-analyze time
1365
+	systemd-analyze blame
1366
+	systemd-analyze critical-chain
1367
+ 
1368
+Keep in mind that the output of these commands might be misleading because one service might just be waiting for another service to finish initializing.
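The commands in this section can be strung together into a first-pass triage, as in the following sketch. The function names `run_if_present` and `triage` are made up for this example. The `-p err` option of `journalctl` and the `--failed` option of `systemctl` are standard options that this guide does not otherwise cover; they narrow the output to error-level messages and failed units.

```shell
# Run a command only if it is installed; otherwise report that it was skipped.
# (run_if_present and triage are illustrative helper names.)
run_if_present() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "skipped: $1 not found"
    fi
}

# First-pass triage that strings together commands from this section.
triage() {
    run_if_present uname -r                          # kernel in use
    run_if_present journalctl --no-pager -b -p err   # current boot, priority err and more severe
    run_if_present systemctl --no-pager --failed     # units in the failed state
    run_if_present systemd-analyze critical-chain    # slowest chain of units at boot
}

# Call `triage` as root on the machine that you are troubleshooting.
```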
1369
+
1370
+### Investigating the Guest Kernel When You Cannot Log On
1371
+
1372
+If a VM running Photon OS and an application or virtual appliance is behaving so oddly that, for example, you cannot log on to the machine, you can still troubleshoot by extracting the kernel logs from the guest's memory and analyzing them with `gdb`. 
1373
+
1374
+This advanced troubleshooting method works when you are running Photon OS as the operating system for an application or appliance on VMware Workstation, Fusion, or ESXi. This approach assumes that the virtual machine running Photon OS is functioning normally. 
1375
+
1376
+This troubleshooting method has the following requirements: 
1377
+
1378
+* Root access to a Linux machine other than the one you are troubleshooting. It can be another Photon OS machine, Ubuntu, or another Linux variant. 
1379
+* The `vmss2core` utility from VMware. It is installed by default in VMware Workstation and some other VMware products. If your system doesn't already contain it, you can download it for free from https://labs.vmware.com/flings/vmss2core.
1380
+* A local copy of the Photon OS ISO of the exact same version and release number as the Photon OS machine that you are troubleshooting. 
1381
+
1382
+The process to use this troubleshooting method varies by environment. The examples in this section assume that the troublesome Photon OS virtual machine is running in VMware Workstation 12 Pro on a Microsoft Windows 8 Enterprise host. The examples also use an additional, fully functional Photon OS virtual machine running in Workstation.
1383
+
1384
+You can, however, use other hosts, hypervisors, and operating systems--but you will have to adapt the example process below to them. Directory paths, file names, and other aspects might be different on other systems. 
1385
+
1386
+**Overview**   
1387
+
1388
+The process to apply this troubleshooting method goes like this: On a local computer, you open a file on the Photon OS ISO that contains Linux debugging information. Then you suspend the troublesome Photon OS VM and extract the kernel memory logs from the VMware hypervisor running Photon OS. 
1389
+
1390
+Next, you use the vmss2core tool to convert the memory logs into core dump files. The vmss2core utility converts VMware checkpoint state files into formats that third-party debugging tools understand. It can handle both suspend (.vmss) and snapshot (.vmsn) checkpoint state files (hereafter referred to as a _vmss file_) as well as monolithic and non-monolithic (separate .vmem file) encapsulation of checkpoint state data. See [Debugging Virtual Machines with the Checkpoint to Core Tool](http://www.vmware.com/pdf/snapshot2core_technote.pdf).
1391
+
1392
+Finally, you prepare to run the gdb tool by creating a `.gdbinit` file of helper macros, and then you use the debug info file from the ISO to analyze the core dump in the gdb shell on your local Linux machine.
1393
+
1394
+All three components (the debug info file, the `.gdbinit` file, and the core dump) must be in the same directory on a Linux machine.
1395
+
1396
+**Process**
1397
+
1398
+First, obtain a local copy of the Photon OS ISO of the exact same version and release number as the Photon OS machine that you are troubleshooting and mount the ISO on a Linux machine (or open it on a Windows machine):
1399
+
1400
+	mount /mnt/cdrom
1401
+
1402
+Second, locate the following file. (If you opened the Photon OS ISO on a Windows computer, copy the following file to the root folder of a Linux machine.)
1403
+
1404
+	/RPMS/x86_64/linux-debuginfo-4.4.8-6.ph1.x86_64.rpm
1405
+
1406
+Third, on a Linux machine, run the following `rpm2cpio` command to convert the RPM file to a cpio file and to extract the contents of the RPM to the current directory:
1407
+
1408
+	rpm2cpio /mnt/cdrom/RPMS/x86_64/linux-debuginfo-4.4.8-6.ph1.x86_64.rpm | cpio -idmv
1409
+
1410
+From the extracted files, copy the following file to your current directory: 
1411
+
1412
+	cp usr/lib/debug/lib/modules/4.4.8/vmlinux-4.4.8.debug .
1413
+
1414
+Run the following command to download the dmesg functions that will help extract the kernel log from the coredump: <!--	wget https://www.kernel.org/doc/Documentation/kdump/gdbmacros.txt
1415
+-->
1416
+
1417
+	wget https://raw.githubusercontent.com/vmware/photon/master/tools/scripts/gdbmacros-for-linux.txt
1418
+
1419
+Rename the file as follows:
1420
+
1421
+	mv gdbmacros-for-linux.txt .gdbinit
1422
+
1423
+Next, switch to your host machine so you can get the kernel memory files from the VM. Suspend the troublesome VM and locate the `.vmss` and `.vmem` files in the virtual machine's directory on the host. Example: 
1424
+
1425
+	C:\Users\tester\Documents\Virtual Machines\VMware Photon 64-bit (7)>dir
1426
+	 Volume in drive C is Windows
1427
+	 Directory of C:\Users\tester\Documents\Virtual Machines\VMware Photon 64-bit
1428
+	 (7)
1429
+	09/20/2016  12:22 PM    <DIR>          .
1430
+	09/20/2016  12:22 PM    <DIR>          ..
1431
+	09/19/2016  03:39 PM       402,653,184 VMware Photon 64-bit (7)-f6b070cd.vmem
1432
+	09/20/2016  12:11 PM         5,586,907 VMware Photon 64-bit (7)-f6b070cd.vmss
1433
+	09/20/2016  12:11 PM     1,561,001,984 VMware Photon 64-bit (7)-s001.vmdk
1434
+	...
1435
+	09/20/2016  12:11 PM           300,430 vmware.log
1436
+	...
1437
+
1438
+Now that you have located the `.vmss` and `.vmem` files, convert them to one or more core dump files by using the vmss2core tool that comes with Workstation. Here is an example of how to run the command. Be careful with your pathing, escaping, file names, and so forth--all of which might be different from this example on your Windows machine. 
1439
+
1440
+	C:\Users\tester\Documents\Virtual Machines\VMware Photon 64-bit (7)>C:\"Program Files (x86)\VMware\VMware Workstation"\vmss2core.exe "VMware Photon 64-bit (7)-f6b070cd.vmss" "VMware Photon 64-bit (7)-f6b070cd.vmem"
1441
+
1442
+The result of this command is one or more files with a `.core` extension plus a digit. Truncated example: 
1443
+
1444
+	C:\Users\tester\Documents\Virtual Machines\VMware Photon 64-bit (7)>dir
1445
+	 Directory of C:\Users\tester\Documents\Virtual Machines\VMware Photon 64-bit(7)
1446
+	09/20/2016  12:22 PM       729,706,496 vmss.core0
1447
+
1448
+Copy the `.core` file or files to your current directory on the Linux machine so that you can analyze them with gdb.
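Before starting gdb, you might want to confirm that the debug info file, the `.gdbinit` file, and at least one core file are all in the working directory. The following sketch does that check. The file name patterns are taken from the examples in this section and might differ on your system; `check_gdb_inputs` is an illustrative name.

```shell
# Report which of the three gdb inputs are present in a directory
# (default: the current directory). Returns nonzero if any are missing.
# The patterns follow the file names used in this section.
# (check_gdb_inputs is an illustrative helper name.)
check_gdb_inputs() {
    dir=${1:-.}
    missing=0
    for pattern in 'vmlinux-*.debug' '.gdbinit' 'vmss.core*'; do
        found=no
        # $pattern is left unquoted on purpose so that the shell expands it.
        for f in "$dir"/$pattern; do
            [ -e "$f" ] && found=yes
        done
        echo "$pattern: $found"
        [ "$found" = yes ] || missing=1
    done
    return "$missing"
}

# Example use (not run here):
#   check_gdb_inputs . && gdb vmlinux-4.4.8.debug vmss.core0
```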
1449
+
1450
+Run the following `gdb` command to enter the gdb shell attached to the memory core dump file. You might have to change the name of the `vmss.core` file in the example to match your `.core` file:
1451
+
1452
+	gdb vmlinux-4.4.8.debug vmss.core0
1453
+
1454
+	GNU gdb (GDB) 7.8.2
1455
+	Copyright (C) 2014 Free Software Foundation, Inc.
1456
+	License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
1457
+	This is free software: you are free to change and redistribute it. 
1458
+	There is NO WARRANTY, to the extent permitted by law. ...
1459
+	Type "show configuration" for configuration details.
1460
+	For bug reporting instructions, please see:
1461
+	<http://www.gnu.org/software/gdb/bugs/>.
1462
+	Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>.
1463
+	For help, type "help".
1464
+	Type "apropos word" to search for commands related to "word"...
1465
+	Reading symbols from vmlinux-4.4.8.debug...done.
1466
+	warning: core file may not match specified executable file.
1467
+	[New LWP 12345]
1468
+	Core was generated by `GuestVM'.
1469
+	Program terminated with signal SIGSEGV, Segmentation fault.
1470
+	#0  0xffffffff813df39a in insb (count=0, addr=0xffffc90000144000, port=<optimized out>)
1471
+	    at arch/x86/include/asm/io.h:316
1472
+	316     arch/x86/include/asm/io.h: No such file or directory.
1473
+	(gdb)
1474
+
1475
+In the results above, the _(gdb)_ of the last line is the prompt of the gdb shell. You can now analyze the core dump by using commands like `bt` (to perform a backtrace) and `dmesg` (to view the Photon OS kernel log and see Photon OS kernel error messages). 
1476
+
1477
+### Kernel Log Replication with VProbes
1478
+
1479
+Replicating the Photon OS kernel logs on the VMware ESXi host is an advanced but powerful method of troubleshooting a kernel problem. This method is applicable when the virtual machine running Photon OS is hanging or inaccessible because, for instance, the hard disk has failed.
1480
+
1481
+There is a prerequisite, however: You must have preemptively enabled the VMware VProbes facility on the VM before a problem rendered it inaccessible. You must also create a VProbes script on the ESXi host, but you can do that after the fact. 
1482
+
1483
+Although the foresight to implement these prerequisites might limit the application of this troubleshooting method for production systems, the method can be particularly useful in analyzing kernel issues when testing an application or appliance that is running on Photon OS.   
1484
+
1485
+There are two similar ways in which you can replicate the Photon OS kernel logs on ESXi by using VProbes. The first modifies the VProbes script so that it works only for the VM that you set; it uses a hard-coded address. The second uses an abstraction instead of a hard-coded address so that the same VProbes script can be used for any VM on an ESXi host, provided that you have enabled VProbes on the VM and copied its kernel symbol table (kallsyms) to the ESXi host.
1486
+
1487
+For more information on VMware VProbes, see [VProbes: Deep Observability Into the ESXi Hypervisor](https://labs.vmware.com/vmtj/vprobes-deep-observability-into-the-esxi-hypervisor) and the [VProbes Programming Reference](http://www.vmware.com/pdf/ws7_f3_vprobes_reference.pdf).
1488
+
1489
+**Using VProbes Script with a Hard-Coded Address**
1490
+
1491
+Here's how to set a VProbe for an individual VM: 
1492
+
1493
+First, power off the VM so that you can turn on the VProbe facility. Edit the `.vmx` configuration file for the VM. The file resides in the directory that contains the VM in the ESXi data store. Add the following line of code to the `.vmx` file and then power the VM on: 
1494
+
1495
+	vprobe.enable = "TRUE"
1496
+
1497
+When you edit the `.vmx` file to add the above line of code, you must first turn off the VM--otherwise, your changes will not persist. 
1498
+
1499
+Second, obtain the kernel log_store function address by connecting to the VM with SSH and running the following commands as root. (Photon OS uses the `kptr_restrict` setting to place restrictions on the kernel addresses exposed through `/proc` and other interfaces. This setting hides exposed kernel pointers to prevent attackers from exploiting kernel write vulnerabilities. When you are done using VProbes, you should return `kptr_restrict` to the original setting of `2` by rebooting.)
1500
+
1501
+	echo 0 > /proc/sys/kernel/kptr_restrict
1502
+	grep log_store /proc/kallsyms
1503
+
1504
+The output of the `grep` command will look similar to the following string. The first set of characters (without the `t`) is the log_store function address:
1505
+
1506
+	ffffffff810bb680 t log_store
1507
+
1508
+Third, connect to the ESXi host with SSH so that you can create a VProbes script. Here's the template for the script; `log_store` in the first line is a placeholder for the VM's log_store function address: 
1509
+
1510
+	GUEST:ENTER:log_store {
1511
+	   string dst;
1512
+	   getgueststr(dst, getguest(RSP+16) & 0xff, getguest(RSP+8));
1513
+	   printf("%s\n", dst);
1514
+	}
1515
+
1516
+On the ESXi host, create a new file, add the template to it, and then change `log_store` to the function address that was the output from the grep command on the VM. 
1517
+
1518
+You must add a `0x` prefix to the function address. In this example, the modified template looks like this: 
1519
+
1520
+	GUEST:ENTER:0xffffffff810bb680 {
1521
+	   string dst;
1522
+	   getgueststr(dst, getguest(RSP+16) & 0xff, getguest(RSP+8));
1523
+	   printf("%s\n", dst);
1524
+	}
1525
+
1526
+Save your VProbes script as `console.emt` in the `/tmp` directory. (The file extension for VProbe scripts is `.emt`.) 
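Editing the address into the template by hand invites typos. As an alternative, the following sketch generates the script text from kallsyms-formatted input. It assumes you run it in a shell with `awk`, such as the Photon OS guest, and then copy the result to the ESXi host; `make_vprobe_script` is a name invented for this example. The script body is the template shown above, unchanged.

```shell
# Build the hard-coded VProbes script from kallsyms-formatted text on stdin.
# Fails if log_store is missing or if its address is hidden (all zeros),
# which is what /proc/kallsyms shows while kptr_restrict is still in effect.
# (make_vprobe_script is an illustrative helper name.)
make_vprobe_script() {
    addr=$(awk '$3 == "log_store" { print $1; exit }')
    if [ -z "$addr" ]; then
        echo "log_store not found" >&2
        return 1
    fi
    case $addr in
        *[!0]*) ;;   # at least one nonzero digit: a usable address
        *) echo "address is hidden; set kptr_restrict to 0 first" >&2
           return 1 ;;
    esac
    # "\\n" in this unquoted here-document is written out as "\n".
    cat <<EOF
GUEST:ENTER:0x$addr {
   string dst;
   getgueststr(dst, getguest(RSP+16) & 0xff, getguest(RSP+8));
   printf("%s\\n", dst);
}
EOF
}

# Example use on the guest, as root (not run here):
#   make_vprobe_script < /proc/kallsyms > console.emt
```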
1527
+
1528
+While still connected to the ESXi host with SSH, run the following command to obtain the ID of the virtual machine that you want to troubleshoot: 
1529
+
1530
+	vim-cmd vmsvc/getallvms
1531
+
1532
+This command lists all the VMs running on the ESXi host. Find the VM you want to troubleshoot in the list and make a note of its ID. 
1533
+
1534
+Finally, run the following command to print all the kernel messages from Photon OS in your SSH console; replace `<VM ID>` with the ID of your VM:  
1535
+
1536
+	vprobe -m <VM ID> /tmp/console.emt
1537
+
1538
+When you're done, type `Ctrl-C` to stop the loop. 
1539
+
1540
+**A Reusable VProbe Script Using the kallsyms File**
1541
+
1542
+Here's how to create one VProbe script and use it for all the VMs on your ESXi host.
1543
+
1544
+First, power off the VM and turn on the VProbe facility on each VM that you want to be able to analyze. Add `vprobe.enable = "TRUE"` to the VM's `.vmx` configuration file. See the instructions above. 
1545
+
1546
+Second, power on the VM, connect to it with SSH, and run the following command as root: 
1547
+
1548
+	echo 0 > /proc/sys/kernel/kptr_restrict
1549
+
1550
+Third, connect to the ESXi host with SSH to create the following VProbes script and save it as `/tmp/console.emt`:
1551
+
1552
+	GUEST:ENTER:log_store {
1553
+	   string dst;
1554
+	   getgueststr(dst, getguest(RSP+16) & 0xff, getguest(RSP+8));
1555
+	   printf("%s\n", dst);
1556
+	}
1557
+
1558
+Fourth, from the ESXi host, run the following command to copy the VM's `kallsyms` file to the `/tmp` directory on the ESXi host:
1559
+
1560
+	scp root@<vm ip address>:/proc/kallsyms /tmp
1561
+
1562
+While still connected to the ESXi host with SSH, run the following command to obtain the ID of the virtual machine that you want to troubleshoot: 
1563
+
1564
+	vim-cmd vmsvc/getallvms
1565
+
1566
+This command lists all the VMs running on the ESXi host. Find the VM you want to troubleshoot in the list and make a note of its ID. 
1567
+
1568
+Finally, run the following command to print all the kernel messages from Photon OS in your SSH console; replace `<VM ID>` with the ID of your VM. When you're done, type `Ctrl-C` to stop the loop.  
1569
+
1570
+	vprobe -m <VM ID> -k /tmp/kallsyms /tmp/console.emt
1571
+
1572
+You can use a directory other than `/tmp` if you want.
1573
+
1574
+<!--
1575
+## Deep Kernel Analysis with the Crash Utility
1576
+
1577
+-->
1578
+
1579
+## Performance Issues
1580
+
1581
+Performance issues can be difficult to troubleshoot because so many variables play a role in overall system performance. Interpreting performance data often depends on the context and the situation. To better identify and isolate variables and to gain insight into performance data, you can use the troubleshooting tools on Photon OS to diagnose the system.  
1582
+
1583
+If you have no indication what the cause of a performance degradation might be, start by getting a broad picture of the system's state. Then look for clues in the data that might point to a cause. The systemd journal is a useful place to start. 
1584
+
1585
+The `top` tool can unmask problems caused by processes or applications that consume too much CPU time or RAM. If CPU utilization is consistently high with little idle time, for example, there might be a runaway process. Identify it in the `top` output and restart it.
1586
+
1587
+The `netstat --statistics` command can identify bottlenecks causing performance issues. It lists interface statistics for different protocols. 
1588
+
1589
+If `top` and `netstat` reveal no clues, run a command under `strace`, for example `strace ls -al`, to view every system call that the command makes.
1590
+
1591
+The following `watch` command can help dynamically monitor a command to help troubleshoot performance issues:
1592
+
1593
+	watch -n0 --differences <command>
1594
+
1595
+You can, for example, combine `watch` with the `vmstat` command to dig deeper into statistics about virtual memory, processes, block input-output, disks, and CPU activity. Are there any bottlenecks? 
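When you collect `vmstat` samples over time, it can help to pull out a single column rather than read the whole table. The following sketch extracts the CPU idle percentage from each sample. It finds the `id` column by name in the header row, on the assumption that the header uses the standard `vmstat` column labels; `vmstat_idle` is an illustrative name.

```shell
# Print the CPU idle percentage (the "id" column) from each sample in
# vmstat output read on stdin. The column is located by its header label.
# (vmstat_idle is an illustrative helper name.)
vmstat_idle() {
    awk '
        # Header row that starts with "r": remember where "id" is.
        $1 == "r" { for (i = 1; i <= NF; i++) if ($i == "id") col = i; next }
        # Data rows start with a number.
        col && $1 ~ /^[0-9]+$/ { print $col }
    '
}

# Example use: five samples, one second apart (not run here):
#   vmstat 1 5 | vmstat_idle
```

Consistently low idle values point back to the kind of runaway process that `top` can identify.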
1596
+
1597
+Another option is to use the `dstat` utility. It shows a live, running list of statistics about system resources. 
1598
+
1599
+In addition, `systemd-analyze`, which reveals performance statistics for boot times, can help troubleshoot slow system boots and incorrect unit files.
1600
+
1601
+The additional tools that you select depend on the clues that your initial investigation reveals. The following tools can also help troubleshoot performance: `sysstat`, `sar`, `systemtap`, and `crash`. 
1602
+
1603
+
1604
+
1605
+
1606
+
1607
+
1608
+
1609
+
1610
+
1611
+
1612
+
1613
+