Update sig docs to have more info about sig writing

Andrew authored on 2018/11/06 07:53:51
Showing 1 changed files
... ...
@@ -4,7 +4,6 @@ Table of Contents
4 4
 
5 5
 - [Creating signatures for ClamAV](#creating-signatures-for-clamav)
6 6
 - [Introduction](#introduction)
7
-- [Debug information from libclamav](#debug-information-from-libclamav)
8 7
 - [Signature formats](#signature-formats)
9 8
     - [Hash-based signatures](#hash-based-signatures)
10 9
         - [MD5 hash-based signatures](#md5-hash-based-signatures)
... ...
@@ -32,190 +31,22 @@ Table of Contents
32 32
     - [Signature names](#signature-names)
33 33
     - [Using YARA rules in ClamAV](#using-yara-rules-in-clamav)
34 34
     - [Passwords for archive files \[experimental\]](#passwords-for-archive-files-experimental)
35
-- [Special files](#special-files)
36
-    - [HTML](#html)
37
-    - [Text files](#text-files)
38
-    - [Compressed Portable Executable files](#compressed-portable-executable-files)
35
+- [Signature writing tips and tricks](#signature-writing-tips-and-tricks)
36
+    - [Testing rules with clamscan](#testing-rules-with-clamscan)
37
+    - [Debug information from libclamav](#debug-information-from-libclamav)
38
+    - [Writing signatures for special files](#writing-signatures-for-special-files)
39
+        - [HTML](#html)
40
+        - [Text files](#text-files)
41
+        - [Compressed Portable Executable files](#compressed-portable-executable-files)
42
+    - [Using sigtool](#using-sigtool)
43
+    - [Inspecting signatures inside a CVD file](#inspecting-signatures-inside-a-cvd-file)
44
+    - [External tools](#external-tools)
39 45
 
40 46
 # Introduction
41 47
 
42
-CVD (ClamAV Virus Database) is a digitally signed container that includes signature databases in various text formats. The header of the container is a 512 bytes long string with colon separated fields:
48
+In order to detect malware and other file-based threats, ClamAV relies on signatures to differentiate clean and malicious/unwanted files.  ClamAV signatures are primarily text-based and conform to one of the ClamAV-specific signature formats associated with a given method of detection.  These formats are explained in the [Signature formats](#signature-formats) section below.  In addition, ClamAV 0.99 and above support signatures written in the YARA format.  More information on this can be found in the [Using YARA rules in ClamAV](#using-yara-rules-in-clamav) section.
43 49
 
44
-```
45
-ClamAV-VDB:build time:version:number of signatures:functionality level required:MD5 checksum:digital signature:builder name:build time (sec)
46
-```
47
-
48
-`sigtool --info` displays detailed information about a given CVD file:
49
-
50
-```bash
51
-zolw@localhost:/usr/local/share/clamav$ sigtool -i main.cvd
52
-File: main.cvd
53
-Build time: 09 Dec 2007 15:50 +0000
54
-Version: 45
55
-Signatures: 169676
56
-Functionality level: 21
57
-Builder: sven
58
-MD5: b35429d8d5d60368eea9630062f7c75a
59
-Digital signature: dxsusO/HWP3/GAA7VuZpxYwVsE9b+tCk+tPN6OyjVF/U8
60
-JVh4vYmW8mZ62ZHYMlM903TMZFg5hZIxcjQB3SX0TapdF1SFNzoWjsyH53eXvMDY
61
-eaPVNe2ccXLfEegoda4xU2TezbGfbSEGoU1qolyQYLX674sNA2Ni6l6/CEKYYh
62
-Verification OK.
63
-```
64
-
65
-The ClamAV project distributes a number of CVD files, including *main.cvd* and *daily.cvd*.
66
-
67
-# Debug information from libclamav
68
-
69
-In order to create efficient signatures for ClamAV it’s important to understand how the engine handles input files. The best way to see how it works is having a look at the debug information from libclamav. You can do it by calling `clamscan` with the `--debug` and `--leave-temps` flags. The first switch makes clamscan display all the interesting information from libclamav and the second one avoids deleting temporary files so they can be analyzed further.
70
-
71
-The now important part of the info is:
72
-
73
-```bash
74
-$ clamscan --debug attachment.exe
75
-[...]
76
-LibClamAV debug: Recognized MS-EXE/DLL file
77
-LibClamAV debug: Matched signature for file type PE
78
-LibClamAV debug: File type: Executable
79
-```
80
-
81
-The engine recognized a windows executable.
82
-
83
-```bash
84
-LibClamAV debug: Machine type: 80386
85
-LibClamAV debug: NumberOfSections: 3
86
-LibClamAV debug: TimeDateStamp: Fri Jan 10 04:57:55 2003
87
-LibClamAV debug: SizeOfOptionalHeader: e0
88
-LibClamAV debug: File format: PE
89
-LibClamAV debug: MajorLinkerVersion: 6
90
-LibClamAV debug: MinorLinkerVersion: 0
91
-LibClamAV debug: SizeOfCode: 0x9000
92
-LibClamAV debug: SizeOfInitializedData: 0x1000
93
-LibClamAV debug: SizeOfUninitializedData: 0x1e000
94
-LibClamAV debug: AddressOfEntryPoint: 0x27070
95
-LibClamAV debug: BaseOfCode: 0x1f000
96
-LibClamAV debug: SectionAlignment: 0x1000
97
-LibClamAV debug: FileAlignment: 0x200
98
-LibClamAV debug: MajorSubsystemVersion: 4
99
-LibClamAV debug: MinorSubsystemVersion: 0
100
-LibClamAV debug: SizeOfImage: 0x29000
101
-LibClamAV debug: SizeOfHeaders: 0x400
102
-LibClamAV debug: NumberOfRvaAndSizes: 16
103
-LibClamAV debug: Subsystem: Win32 GUI
104
-LibClamAV debug: ------------------------------------
105
-LibClamAV debug: Section 0
106
-LibClamAV debug: Section name: UPX0
107
-LibClamAV debug: Section data (from headers - in memory)
108
-LibClamAV debug: VirtualSize: 0x1e000 0x1e000
109
-LibClamAV debug: VirtualAddress: 0x1000 0x1000
110
-LibClamAV debug: SizeOfRawData: 0x0 0x0
111
-LibClamAV debug: PointerToRawData: 0x400 0x400
112
-LibClamAV debug: Section's memory is executable
113
-LibClamAV debug: Section's memory is writeable
114
-LibClamAV debug: ------------------------------------
115
-LibClamAV debug: Section 1
116
-LibClamAV debug: Section name: UPX1
117
-LibClamAV debug: Section data (from headers - in memory)
118
-LibClamAV debug: VirtualSize: 0x9000 0x9000
119
-LibClamAV debug: VirtualAddress: 0x1f000 0x1f000
120
-LibClamAV debug: SizeOfRawData: 0x8200 0x8200
121
-LibClamAV debug: PointerToRawData: 0x400 0x400
122
-LibClamAV debug: Section's memory is executable
123
-LibClamAV debug: Section's memory is writeable
124
-LibClamAV debug: ------------------------------------
125
-LibClamAV debug: Section 2
126
-LibClamAV debug: Section name: UPX2
127
-LibClamAV debug: Section data (from headers - in memory)
128
-LibClamAV debug: VirtualSize: 0x1000 0x1000
129
-LibClamAV debug: VirtualAddress: 0x28000 0x28000
130
-LibClamAV debug: SizeOfRawData: 0x200 0x1ff
131
-LibClamAV debug: PointerToRawData: 0x8600 0x8600
132
-LibClamAV debug: Section's memory is writeable
133
-LibClamAV debug: ------------------------------------
134
-LibClamAV debug: EntryPoint offset: 0x8470 (33904)
135
-```
136
-
137
-The section structure displayed above suggests the executable is packed
138
-with UPX.
139
-
140
-```bash
141
-LibClamAV debug: ------------------------------------
142
-LibClamAV debug: EntryPoint offset: 0x8470 (33904)
143
-LibClamAV debug: UPX/FSG/MEW: empty section found - assuming
144
-                 compression
145
-LibClamAV debug: UPX: bad magic - scanning for imports
146
-LibClamAV debug: UPX: PE structure rebuilt from compressed file
147
-LibClamAV debug: UPX: Successfully decompressed with NRV2B
148
-LibClamAV debug: UPX/FSG: Decompressed data saved in
149
-                 /tmp/clamav-90d2d25c9dca42bae6fa9a764a4bcede
150
-LibClamAV debug: ***** Scanning decompressed file *****
151
-LibClamAV debug: Recognized MS-EXE/DLL file
152
-LibClamAV debug: Matched signature for file type PE
153
-```
154
-
155
-Indeed, libclamav recognizes the UPX data and saves the decompressed
156
-(and rebuilt) executable into
157
-`/tmp/clamav-90d2d25c9dca42bae6fa9a764a4bcede`. Then it continues by
158
-scanning this new file:
159
-
160
-```bash
161
-LibClamAV debug: File type: Executable
162
-LibClamAV debug: Machine type: 80386
163
-LibClamAV debug: NumberOfSections: 3
164
-LibClamAV debug: TimeDateStamp: Thu Jan 27 11:43:15 2011
165
-LibClamAV debug: SizeOfOptionalHeader: e0
166
-LibClamAV debug: File format: PE
167
-LibClamAV debug: MajorLinkerVersion: 6
168
-LibClamAV debug: MinorLinkerVersion: 0
169
-LibClamAV debug: SizeOfCode: 0xc000
170
-LibClamAV debug: SizeOfInitializedData: 0x19000
171
-LibClamAV debug: SizeOfUninitializedData: 0x0
172
-LibClamAV debug: AddressOfEntryPoint: 0x7b9f
173
-LibClamAV debug: BaseOfCode: 0x1000
174
-LibClamAV debug: SectionAlignment: 0x1000
175
-LibClamAV debug: FileAlignment: 0x1000
176
-LibClamAV debug: MajorSubsystemVersion: 4
177
-LibClamAV debug: MinorSubsystemVersion: 0
178
-LibClamAV debug: SizeOfImage: 0x26000
179
-LibClamAV debug: SizeOfHeaders: 0x1000
180
-LibClamAV debug: NumberOfRvaAndSizes: 16
181
-LibClamAV debug: Subsystem: Win32 GUI
182
-LibClamAV debug: ------------------------------------
183
-LibClamAV debug: Section 0
184
-LibClamAV debug: Section name: .text
185
-LibClamAV debug: Section data (from headers - in memory)
186
-LibClamAV debug: VirtualSize: 0xc000 0xc000
187
-LibClamAV debug: VirtualAddress: 0x1000 0x1000
188
-LibClamAV debug: SizeOfRawData: 0xc000 0xc000
189
-LibClamAV debug: PointerToRawData: 0x1000 0x1000
190
-LibClamAV debug: Section contains executable code
191
-LibClamAV debug: Section's memory is executable
192
-LibClamAV debug: ------------------------------------
193
-LibClamAV debug: Section 1
194
-LibClamAV debug: Section name: .rdata
195
-LibClamAV debug: Section data (from headers - in memory)
196
-LibClamAV debug: VirtualSize: 0x2000 0x2000
197
-LibClamAV debug: VirtualAddress: 0xd000 0xd000
198
-LibClamAV debug: SizeOfRawData: 0x2000 0x2000
199
-LibClamAV debug: PointerToRawData: 0xd000 0xd000
200
-LibClamAV debug: ------------------------------------
201
-LibClamAV debug: Section 2
202
-LibClamAV debug: Section name: .data
203
-LibClamAV debug: Section data (from headers - in memory)
204
-LibClamAV debug: VirtualSize: 0x17000 0x17000
205
-LibClamAV debug: VirtualAddress: 0xf000 0xf000
206
-LibClamAV debug: SizeOfRawData: 0x17000 0x17000
207
-LibClamAV debug: PointerToRawData: 0xf000 0xf000
208
-LibClamAV debug: Section's memory is writeable
209
-LibClamAV debug: ------------------------------------
210
-LibClamAV debug: EntryPoint offset: 0x7b9f (31647)
211
-LibClamAV debug: Bytecode executing hook id 257 (0 hooks)
212
-attachment.exe: OK
213
-[...]
214
-```
215
-
216
-No additional files get created by libclamav. By writing a signature for the decompressed file you have more chances that the engine will detect the target data when it gets compressed with another packer.
217
-
218
-This method should be applied to all files for which you want to create signatures. By analyzing the debug information you can quickly see how the engine recognizes and preprocesses the data and what additional files get created. Signatures created for bottom-level temporary files are usually more generic and should help detecting the same malware in different forms.
50
+The ClamAV project distributes a collection of signatures in the form of CVD (ClamAV Virus Database) files.  The CVD file format provides a digitally-signed container that encapsulates the signatures and ensures that they can't be modified by a malicious third party.  This signature set is actively maintained by [Cisco Talos](https://www.talosintelligence.com/) and can be downloaded using the `freshclam` application that ships with ClamAV.  For more details, see the [Inspecting signatures inside a CVD file](#inspecting-signatures-inside-a-cvd-file) section.
219 51
 
220 52
 # Signature formats
221 53
 
... ...
@@ -1006,25 +837,215 @@ where:
1006 1006
 
1007 1007
 The signatures for password attempts are stored inside `.pwdb` files.
1008 1008
 
1009
-# Special files
1009
+# Signature writing tips and tricks
1010
+## Testing rules with clamscan
1011
+
1012
+To test a new signature, first create a text file with the extension corresponding to the signature type (e.g. `.ldb` for logical signatures).  Then, add the signature as its own line within the file. This file can be passed to `clamscan` via the `-d` option, which tells ClamAV to load signatures from the specified file.  If the signature is not formatted correctly, ClamAV will display an error; run `clamscan` with `--debug --verbose` to see additional information about the error.  Some common causes of errors include:
1013
+ - The signature file has the incorrect extension type for the signatures contained within
1014
+ - The file has one or more blank lines
1015
+ - For logical signatures, a semicolon exists at the end of the file
1016
+
1017
+If the rule is formatted correctly, clamscan will load the signature(s) in and scan any files specified via the command line invocation (or all files in the current directory if none are specified).  A successful detection will look like the following:
1018
+
1019
+```bash
1020
+clamscan -d test.ldb test.exe
1021
+test.exe: Win.Malware.Agent.UNOFFICIAL FOUND
1022
+
1023
+----------- SCAN SUMMARY -----------
1024
+Known viruses: 1
1025
+Engine version: 0.100.0
1026
+Scanned directories: 0
1027
+Scanned files: 1
1028
+Infected files: 1
1029
+Data scanned: 17.45 MB
1030
+Data read: 17.45 MB (ratio 1.00:1)
1031
+Time: 0.400 sec (0 m 0 s)
1032
+```
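As a rough sketch of the workflow above, the following script writes a single hash-based signature to a `.hdb` file and scans a sample with it. The `MD5:size:name` layout follows the hash-based signature format; the sample contents, the detection name `Test.Sample.Hash`, and the use of `md5sum` from GNU coreutils are assumptions made for illustration, and the scan only runs when `clamscan` is installed.

```bash
#!/usr/bin/env bash
set -eu

# Print an .hdb-style line (MD5:size:name) for a file.
# Assumes md5sum (GNU coreutils) is available.
make_hdb_sig() {
    local file=$1 name=$2 md5 size
    md5=$(md5sum "$file" | cut -d' ' -f1)
    size=$(wc -c < "$file" | tr -d ' ')
    printf '%s:%s:%s\n' "$md5" "$size" "$name"
}

workdir=$(mktemp -d)
sample="$workdir/sample.bin"
printf 'hypothetical sample contents\n' > "$sample"

# One signature per line, no blank lines, extension matching the type.
make_hdb_sig "$sample" "Test.Sample.Hash" > "$workdir/test.hdb"

# clamscan exits with 1 when a signature matches, so do not treat it as failure.
if command -v clamscan >/dev/null 2>&1; then
    clamscan -d "$workdir/test.hdb" "$sample" || true
fi
```

Generating the line with a helper avoids two of the common errors listed earlier: the file ends up with exactly one signature per line and no blank lines.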
1033
+
1034
+If the rule did not match as intended:
1035
+ - The file may have exceeded one or more of the default scanning limits built into ClamAV.  Try running clamscan with the following options to see if raising the limits addresses the issue: `--max-filesize=2000M --max-scansize=2000M --max-files=2000000 --max-recursion=2000000 --max-embeddedpe=2000M --max-htmlnormalize=2000000 --max-htmlnotags=2000000 --max-scriptnormalize=2000000 --max-ziptypercg=2000000 --max-partitions=2000000 --max-iconspe=2000000 --max-rechwp3=2000000 --pcre-match-limit=2000000 --pcre-recmatch-limit=2000000 --pcre-max-filesize=2000M`.
1036
+ - If matching on HTML or text files, ClamAV might be performing normalization that causes the content of the scanned file to change.  See the [HTML](#html) and [Text files](#text-files) sections for more details.
1037
+ - libclamav may have been unable to unpack or otherwise process the file.  See [Debug information from libclamav](#debug-information-from-libclamav) for more details.
1038
+
1039
+NOTE: If you run `clamscan` with a `-d` flag, ClamAV will not load in the signatures downloaded via `freshclam`.  This means that:
1040
+ - some of ClamAV's unpacking support might be disabled, since some unpackers are implemented as bytecode signatures
1041
+ - PE whitelisting based on Authenticode signatures won't work, since this functionality relies on .crb rules
1042
+If any of this functionality is needed, load in the CVD files manually with additional `-d` flags.
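One way to combine a custom signature file with the official databases is sketched below. The helper name `build_db_args`, the `.cld` extension handling, and the `/var/lib/clamav` location are assumptions for illustration; the script only prints the resulting command rather than running it.

```bash
#!/usr/bin/env bash
set -eu

# Fill the global array DB_ARGS with -d flags for a custom signature file
# plus every .cvd/.cld database found in db_dir.
build_db_args() {
    local custom_sig=$1 db_dir=$2 f
    DB_ARGS=(-d "$custom_sig")
    for f in "$db_dir"/*.cvd "$db_dir"/*.cld; do
        if [ -e "$f" ]; then      # skip unmatched glob patterns
            DB_ARGS+=(-d "$f")
        fi
    done
}

# /var/lib/clamav is an assumed database location; adjust for your install.
build_db_args test.ldb /var/lib/clamav
printf 'clamscan %s test.exe\n' "${DB_ARGS[*]}"
```

Listing the databases explicitly keeps the test reproducible, since the exact set of signatures loaded is visible on the command line.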
1043
+
1044
+## Debug information from libclamav
1045
+
1046
+In order to create efficient signatures for ClamAV it’s important to understand how the engine handles input files. The best way to see this is to look at the debug information from libclamav, which you can get by calling `clamscan` with the `--debug` and `--leave-temps` flags. The first flag makes clamscan display debug output from libclamav, and the second keeps temporary files from being deleted so they can be analyzed further.
1047
+
1048
+The important part of the output is:
1049
+
1050
+```bash
1051
+$ clamscan --debug attachment.exe
1052
+[...]
1053
+LibClamAV debug: Recognized MS-EXE/DLL file
1054
+LibClamAV debug: Matched signature for file type PE
1055
+LibClamAV debug: File type: Executable
1056
+```
1010 1057
 
1011
-## HTML
1058
+The engine recognized a Windows executable.
1012 1059
 
1013
-ClamAV contains a special HTML normalisation code which helps to detect HTML exploits. Running `sigtool --html-normalise` on a HTML file should generate the following files:
1060
+```bash
1061
+LibClamAV debug: Machine type: 80386
1062
+LibClamAV debug: NumberOfSections: 3
1063
+LibClamAV debug: TimeDateStamp: Fri Jan 10 04:57:55 2003
1064
+LibClamAV debug: SizeOfOptionalHeader: e0
1065
+LibClamAV debug: File format: PE
1066
+LibClamAV debug: MajorLinkerVersion: 6
1067
+LibClamAV debug: MinorLinkerVersion: 0
1068
+LibClamAV debug: SizeOfCode: 0x9000
1069
+LibClamAV debug: SizeOfInitializedData: 0x1000
1070
+LibClamAV debug: SizeOfUninitializedData: 0x1e000
1071
+LibClamAV debug: AddressOfEntryPoint: 0x27070
1072
+LibClamAV debug: BaseOfCode: 0x1f000
1073
+LibClamAV debug: SectionAlignment: 0x1000
1074
+LibClamAV debug: FileAlignment: 0x200
1075
+LibClamAV debug: MajorSubsystemVersion: 4
1076
+LibClamAV debug: MinorSubsystemVersion: 0
1077
+LibClamAV debug: SizeOfImage: 0x29000
1078
+LibClamAV debug: SizeOfHeaders: 0x400
1079
+LibClamAV debug: NumberOfRvaAndSizes: 16
1080
+LibClamAV debug: Subsystem: Win32 GUI
1081
+LibClamAV debug: ------------------------------------
1082
+LibClamAV debug: Section 0
1083
+LibClamAV debug: Section name: UPX0
1084
+LibClamAV debug: Section data (from headers - in memory)
1085
+LibClamAV debug: VirtualSize: 0x1e000 0x1e000
1086
+LibClamAV debug: VirtualAddress: 0x1000 0x1000
1087
+LibClamAV debug: SizeOfRawData: 0x0 0x0
1088
+LibClamAV debug: PointerToRawData: 0x400 0x400
1089
+LibClamAV debug: Section's memory is executable
1090
+LibClamAV debug: Section's memory is writeable
1091
+LibClamAV debug: ------------------------------------
1092
+LibClamAV debug: Section 1
1093
+LibClamAV debug: Section name: UPX1
1094
+LibClamAV debug: Section data (from headers - in memory)
1095
+LibClamAV debug: VirtualSize: 0x9000 0x9000
1096
+LibClamAV debug: VirtualAddress: 0x1f000 0x1f000
1097
+LibClamAV debug: SizeOfRawData: 0x8200 0x8200
1098
+LibClamAV debug: PointerToRawData: 0x400 0x400
1099
+LibClamAV debug: Section's memory is executable
1100
+LibClamAV debug: Section's memory is writeable
1101
+LibClamAV debug: ------------------------------------
1102
+LibClamAV debug: Section 2
1103
+LibClamAV debug: Section name: UPX2
1104
+LibClamAV debug: Section data (from headers - in memory)
1105
+LibClamAV debug: VirtualSize: 0x1000 0x1000
1106
+LibClamAV debug: VirtualAddress: 0x28000 0x28000
1107
+LibClamAV debug: SizeOfRawData: 0x200 0x1ff
1108
+LibClamAV debug: PointerToRawData: 0x8600 0x8600
1109
+LibClamAV debug: Section's memory is writeable
1110
+LibClamAV debug: ------------------------------------
1111
+LibClamAV debug: EntryPoint offset: 0x8470 (33904)
1112
+```
1113
+
1114
+The section structure displayed above suggests the executable is packed
1115
+with UPX.
1116
+
1117
+```bash
1118
+LibClamAV debug: ------------------------------------
1119
+LibClamAV debug: EntryPoint offset: 0x8470 (33904)
1120
+LibClamAV debug: UPX/FSG/MEW: empty section found - assuming
1121
+                 compression
1122
+LibClamAV debug: UPX: bad magic - scanning for imports
1123
+LibClamAV debug: UPX: PE structure rebuilt from compressed file
1124
+LibClamAV debug: UPX: Successfully decompressed with NRV2B
1125
+LibClamAV debug: UPX/FSG: Decompressed data saved in
1126
+                 /tmp/clamav-90d2d25c9dca42bae6fa9a764a4bcede
1127
+LibClamAV debug: ***** Scanning decompressed file *****
1128
+LibClamAV debug: Recognized MS-EXE/DLL file
1129
+LibClamAV debug: Matched signature for file type PE
1130
+```
1131
+
1132
+Indeed, libclamav recognizes the UPX data and saves the decompressed
1133
+(and rebuilt) executable into
1134
+`/tmp/clamav-90d2d25c9dca42bae6fa9a764a4bcede`. Then it continues by
1135
+scanning this new file:
1136
+
1137
+```bash
1138
+LibClamAV debug: File type: Executable
1139
+LibClamAV debug: Machine type: 80386
1140
+LibClamAV debug: NumberOfSections: 3
1141
+LibClamAV debug: TimeDateStamp: Thu Jan 27 11:43:15 2011
1142
+LibClamAV debug: SizeOfOptionalHeader: e0
1143
+LibClamAV debug: File format: PE
1144
+LibClamAV debug: MajorLinkerVersion: 6
1145
+LibClamAV debug: MinorLinkerVersion: 0
1146
+LibClamAV debug: SizeOfCode: 0xc000
1147
+LibClamAV debug: SizeOfInitializedData: 0x19000
1148
+LibClamAV debug: SizeOfUninitializedData: 0x0
1149
+LibClamAV debug: AddressOfEntryPoint: 0x7b9f
1150
+LibClamAV debug: BaseOfCode: 0x1000
1151
+LibClamAV debug: SectionAlignment: 0x1000
1152
+LibClamAV debug: FileAlignment: 0x1000
1153
+LibClamAV debug: MajorSubsystemVersion: 4
1154
+LibClamAV debug: MinorSubsystemVersion: 0
1155
+LibClamAV debug: SizeOfImage: 0x26000
1156
+LibClamAV debug: SizeOfHeaders: 0x1000
1157
+LibClamAV debug: NumberOfRvaAndSizes: 16
1158
+LibClamAV debug: Subsystem: Win32 GUI
1159
+LibClamAV debug: ------------------------------------
1160
+LibClamAV debug: Section 0
1161
+LibClamAV debug: Section name: .text
1162
+LibClamAV debug: Section data (from headers - in memory)
1163
+LibClamAV debug: VirtualSize: 0xc000 0xc000
1164
+LibClamAV debug: VirtualAddress: 0x1000 0x1000
1165
+LibClamAV debug: SizeOfRawData: 0xc000 0xc000
1166
+LibClamAV debug: PointerToRawData: 0x1000 0x1000
1167
+LibClamAV debug: Section contains executable code
1168
+LibClamAV debug: Section's memory is executable
1169
+LibClamAV debug: ------------------------------------
1170
+LibClamAV debug: Section 1
1171
+LibClamAV debug: Section name: .rdata
1172
+LibClamAV debug: Section data (from headers - in memory)
1173
+LibClamAV debug: VirtualSize: 0x2000 0x2000
1174
+LibClamAV debug: VirtualAddress: 0xd000 0xd000
1175
+LibClamAV debug: SizeOfRawData: 0x2000 0x2000
1176
+LibClamAV debug: PointerToRawData: 0xd000 0xd000
1177
+LibClamAV debug: ------------------------------------
1178
+LibClamAV debug: Section 2
1179
+LibClamAV debug: Section name: .data
1180
+LibClamAV debug: Section data (from headers - in memory)
1181
+LibClamAV debug: VirtualSize: 0x17000 0x17000
1182
+LibClamAV debug: VirtualAddress: 0xf000 0xf000
1183
+LibClamAV debug: SizeOfRawData: 0x17000 0x17000
1184
+LibClamAV debug: PointerToRawData: 0xf000 0xf000
1185
+LibClamAV debug: Section's memory is writeable
1186
+LibClamAV debug: ------------------------------------
1187
+LibClamAV debug: EntryPoint offset: 0x7b9f (31647)
1188
+LibClamAV debug: Bytecode executing hook id 257 (0 hooks)
1189
+attachment.exe: OK
1190
+[...]
1191
+```
1192
+
1193
+No additional files get created by libclamav. Writing a signature for the decompressed file improves the chances that the engine will detect the target data when it is compressed with a different packer.
1194
+
1195
+This method should be applied to all files for which you want to create signatures. By analyzing the debug information you can quickly see how the engine recognizes and preprocesses the data and what additional files get created. Signatures created for bottom-level temporary files are usually more generic and should help detect the same malware in different forms.
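To pick the saved temporary files out of a long debug log, a small filter along these lines may help. It is only a sketch: the function name `list_temp_files` is made up, and the pattern assumes temporary names of the form `clamav-` followed by 32 hex digits, as in the log above; other ClamAV versions or a custom temporary directory may produce different names.

```bash
#!/usr/bin/env bash
set -eu

# Read libclamav debug output on stdin and print each unique temporary
# file path of the form <tmpdir>/clamav-<32 hex digits>.
list_temp_files() {
    grep -oE '/[^[:space:]]*clamav-[0-9a-f]{32}' | sort -u
}

# Demonstration on two lines copied from the log above.
list_temp_files <<'EOF'
LibClamAV debug: UPX/FSG: Decompressed data saved in
                 /tmp/clamav-90d2d25c9dca42bae6fa9a764a4bcede
EOF
```

With ClamAV installed, the same filter could be applied to a real scan with `clamscan --debug --leave-temps attachment.exe 2>&1 | list_temp_files`.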
1196
+
1197
+## Writing signatures for special files
1198
+
1199
+### HTML
1200
+
1201
+ClamAV contains HTML normalization code which makes it easier to write signatures for HTML data that might differ based on white space, capitalization, and other insignificant differences. Running `sigtool --html-normalise` on an HTML file shows what the file's contents will look like after normalization.  This command should generate the following files:
1014 1202
 
1015 1203
 - nocomment.html - the file is normalized, lower-case, with all comments and superfluous white space removed
1016 1204
 
1017 1205
 - notags.html - as above but with all HTML tags removed
1018 1206
 
1019
-The code automatically decodes JScript.encode parts and char ref’s (e.g. `f`). You need to create a signature against one of the created files. To eliminate potential false positive alerts the target type should be set to 3.
1207
+- javascript - any script contents are normalized and the results appended to this file
1020 1208
 
1021
-## Text files
1209
+The code automatically decodes JScript.encode parts and character references (e.g. `&#102;`). To create a successful signature for the input file type, the rule must match on the contents of one of the created files.  Signatures matching on normalized HTML should have a target type of 3.
1022 1210
 
1023
-Similarly to HTML all ASCII text files get normalized (converted to lower-case, all superfluous white space and control characters removed, etc.) before scanning. Use `clamscan --leave-temps` to obtain a normalized file then create a signature with the target type 7.
1211
+### Text files
1024 1212
 
1025
-## Compressed Portable Executable files
1213
+As with HTML, all ASCII text files are normalized (converted to lower-case, all superfluous white space and control characters removed, etc.) before scanning. Running `sigtool --ascii-normalise` on a text file writes a normalized version to a file named 'normalised\_text'.  Rules matching on normalized ASCII text should have a target type of 7.
1026 1214
 
1027
-If the file is compressed with UPX, FSG, Petite or other PE packer supported by libclamav, run `clamscan` with `--debug --leave-temps`. Example output for a FSG compressed file:
1215
+### Compressed Portable Executable files
1216
+
1217
+If the file is compressed with UPX, FSG, Petite or another PE packer supported by libclamav, ClamAV will attempt to automatically unpack the executable and evaluate signatures against the unpacked executable.  To inspect the executable that results from ClamAV's unpacking process, run `clamscan` with `--debug --leave-temps`. Example output for an FSG-compressed file:
1028 1218
 
1029 1219
 ```bash
1030 1220
 LibClamAV debug: UPX/FSG/MEW: empty section found - assuming compression
... ...
@@ -1034,4 +1055,72 @@ LibClamAV debug: FSG: Unpacked and rebuilt executable saved in
1034 1034
 
1035 1035
 ```
1036 1036
 
1037
-Next create a type 1 signature for `/tmp/clamav-f592b20f9329ac1c91f0e12137bcce6c`
1037
+In the example above, `/tmp/clamav-f592b20f9329ac1c91f0e12137bcce6c` is the unpacked executable, and a signature can be written based on this file.
1038
+
1039
+## Using sigtool
1040
+sigtool pulls in libclamav and provides shortcuts for tasks that clamscan performs behind the scenes.  These can be especially useful when writing a signature or when investigating a signature that might be causing false positives or performance problems.
1041
+
1042
+The following sigtool flags can be especially useful for signature writing:
1043
+
1044
+- `--md5` / `--sha1` / `--sha256`: Generate the MD5/SHA1/SHA256 hash and calculate the file size, outputting both as a properly-formatted .hdb/.hsb signature
1045
+
1046
+- `--mdb`: Generate section hashes of the specified file.  This is useful when generating .mdb signatures.
1047
+
1048
+- `--decode`: Given a ClamAV signature from STDIN, show a more user-friendly representation of it.  An example usage of this flag is `cat test.ldb | sigtool --decode`.
1049
+
1050
+- `--hex-dump`: Given a sequence of bytes from STDIN, print the hex equivalent. An example usage of this flag is `echo -n "Match on this" | sigtool --hex-dump`.
1051
+
1052
+- `--html-normalise`: Normalize the specified HTML file in the way that clamscan will before looking for rule matches.  Writing signatures off of these files makes it easier to write rules for target type HTML (you'll know what white space, capitalization, etc. to expect). See the [HTML](#html) section for more details.
1053
+
1054
+- `--ascii-normalise`: Normalize the specified ASCII text file in the way that clamscan will before looking for rule matches. Writing signatures off of this normalized file data makes it easier to write rules for target type Txt (you'll know what white space, capitalization, etc. to expect). See the [Text files](#text-files) section for more details.
1055
+
1056
+- `--print-certs`: Print the Authenticode signatures of any PE files specified.
1057
+  This is useful when writing signature-based .crb rule files.
1058
+
1059
+- `--vba`: Extract VBA/Word6 macro code
1060
+
1061
+- `--test-sigs`: Given a signature and a sample, determine whether the signature matches and, if so, display the offset into the file where the match occurred.  This can be useful for investigating false positive matches in clean files.
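For readers without sigtool at hand, the hex conversion that `--hex-dump` performs can be approximated with standard tools. The sketch below is not sigtool itself: `to_hex` is a made-up helper built on `od`, and the signature line it prints uses a hypothetical name with the `Name:TargetType:Offset:HexSignature` layout assumed for extended (`.ndb`) signatures.

```bash
#!/usr/bin/env bash
set -eu

# Print stdin as a lowercase hex string (two digits per byte).
to_hex() {
    od -An -tx1 | tr -d ' \n'
}

hex=$(printf '%s' 'Match on this' | to_hex)
echo "$hex"

# Hypothetical .ndb-style line: name, target type 0 (any file), any offset.
printf 'Test.Example.Match:0:*:%s\n' "$hex"
```

Note that `printf '%s'` is used instead of `echo` so that no trailing newline ends up in the hex string, mirroring the `echo -n` in the sigtool example.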
1062
+
1063
+## Inspecting signatures inside a CVD file
1064
+
1065
+CVD (ClamAV Virus Database) is a digitally signed container that includes signature databases in various text formats. The header of the container is a 512-byte string with colon-separated fields:
1066
+
1067
+```
1068
+ClamAV-VDB:build time:version:number of signatures:functionality level required:MD5 checksum:digital signature:builder name:build time (sec)
1069
+```
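Because the header is plain text, individual fields can be read without sigtool. The following is a minimal sketch under stated assumptions: `cvd_header_field` is a made-up helper, and the demo header is fabricated for illustration (including writing the time without a colon so that the fields split cleanly); it is not taken from a real database.

```bash
#!/usr/bin/env bash
set -eu

# Print one colon-separated field (1-based) from the 512-byte CVD header.
cvd_header_field() {
    local file=$1 index=$2
    head -c 512 "$file" | cut -d: -f"$index"
}

# Build a stand-in file with a made-up header padded to 512 bytes.
demo=$(mktemp)
printf '%-512s' 'ClamAV-VDB:09 Dec 2007 15-50 +0000:45:169676:21:b35429d8d5d60368eea9630062f7c75a:c2lnbmF0dXJl:sven:1197215400' > "$demo"

echo "version:    $(cvd_header_field "$demo" 3)"
echo "signatures: $(cvd_header_field "$demo" 4)"
```

This only reads the header; it does not verify the digital signature, which is what `sigtool --info` is for.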
1070
+
1071
+`sigtool --info` displays detailed information about a given CVD file:
1072
+
1073
+```bash
1074
+zolw@localhost:/usr/local/share/clamav$ sigtool -i main.cvd
1075
+File: main.cvd
1076
+Build time: 09 Dec 2007 15:50 +0000
1077
+Version: 45
1078
+Signatures: 169676
1079
+Functionality level: 21
1080
+Builder: sven
1081
+MD5: b35429d8d5d60368eea9630062f7c75a
1082
+Digital signature: dxsusO/HWP3/GAA7VuZpxYwVsE9b+tCk+tPN6OyjVF/U8
1083
+JVh4vYmW8mZ62ZHYMlM903TMZFg5hZIxcjQB3SX0TapdF1SFNzoWjsyH53eXvMDY
1084
+eaPVNe2ccXLfEegoda4xU2TezbGfbSEGoU1qolyQYLX674sNA2Ni6l6/CEKYYh
1085
+Verification OK.
1086
+```
1087
+
1088
+The ClamAV project distributes a number of CVD files, including *main.cvd* and *daily.cvd*.
1089
+
1090
+To view the signature associated with a given detection name, the CVD files can be unpacked and the underlying text files searched for a rule definition using a tool like `grep`.  To do this, use sigtool's `--unpack` flag as follows:
1091
+
1092
+```bash
1093
+$ mkdir /tmp/clamav-sigs
1094
+$ cd /tmp/clamav-sigs/
1095
+$ sigtool --unpack /var/lib/clamav/main.cvd
1096
+$ ls
1097
+COPYING   main.fp   main.hsb   main.mdb  main.ndb
1098
+main.crb  main.hdb  main.info  main.msb  main.sfp
1099
+```
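Once the database is unpacked, the search itself might look like the sketch below. The helper name `find_signature` and the two signature lines are made up for illustration; the demo runs against a stand-in directory rather than a real unpacked CVD.

```bash
#!/usr/bin/env bash
set -eu

# Print file:line:content for every line under dir that contains the
# detection name as a fixed string; succeed even when nothing matches.
find_signature() {
    local name=$1 dir=$2
    grep -rnF -- "$name" "$dir" || true
}

# Stand-in for an unpacked database directory (contents are made up).
sigdir=$(mktemp -d)
printf '%s\n' 'Hypothetical.Sample.A:0:*:deadbeef' \
              'Hypothetical.Sample.B:0:*:cafebabe' > "$sigdir/main.ndb"

find_signature 'Hypothetical.Sample.B' "$sigdir"
```

The `-F` flag matters here because detection names contain dots, which would otherwise be treated as regular-expression wildcards.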
1100
+
1101
+## External tools
1102
+
1103
+Below are tools that can be helpful when writing ClamAV signatures:
1104
+
1105
+ - [CASC](https://github.com/Cisco-Talos/CASC) - CASC is a plugin for IDA Pro that allows the user to highlight sections of code and create a signature based on the underlying instructions (with options to ignore bytes associated with registers, addresses, and offsets).  It also contains SigAlyzer, a tool to take an existing signature and locate the regions within the binary that match the subsignatures.