update documentation

git-svn: trunk@3730

Tomasz Kojm authored on 2008/03/21 06:15:14
Showing 3 changed files
... ...
@@ -1,3 +1,7 @@
1
+Thu Mar 20 21:27:22 CET 2008 (tk)
2
+---------------------------------
3
+  * doc/signatures.[pdf,tex]: update documentation
4
+
1 5
 Thu Mar 20 21:06:30 CET 2008 (acab)
2 6
 -----------------------------------
3 7
  * libclamav/blob.[ch]: Fix for "bad file descriptor" under win32, properly
4 8
Binary files a/docs/signatures.pdf and b/docs/signatures.pdf differ
... ...
@@ -15,34 +15,38 @@
15 15
 
16 16
     \noindent
17 17
     \section{Introduction}
18
-    CVD (ClamAV Virus Database) is a digitally signed tarball file that
19
-    contains one or more databases. The header is a 512 bytes long string
20
-    with colon separated fields:
18
+    CVD (ClamAV Virus Database) is a digitally signed container that
19
+    includes signature databases in various text formats. The header
20
+    of the container is a 512-byte string with colon-separated fields:
21 21
     \begin{verbatim}
22 22
 ClamAV-VDB:build time:version:number of signatures:functionality
23
-level required:MD5 checksum:digital signature:builder name:build time (sec)
23
+level required:MD5 checksum:digital signature:builder name:build
24
+time (sec)
24 25
     \end{verbatim}
25
-    \verb+sigtool --info+ displays detailed information about a CVD file:
26
+    \verb+sigtool --info+ displays detailed information about a given CVD file:
26 27
     \begin{verbatim}
27 28
 zolw@localhost:/usr/local/share/clamav$ sigtool -i main.cvd
28
-Build time: 09 Jun 2006 22-19 +0200
29
-Version: 39
30
-# of signatures: 58116
31
-Functionality level: 8
32
-Builder: tkojm
33
-MD5: a9a400e70dcbfe2c9e11d78416e1c0cc
34
-Digital signature: 0s12V8OxLWO95fNNv+kTxj7CEWBW/1TKOGC7G4RelhogruBYw8dJeIX2+yhxex/XsLohxoEuXxC2CaFXiiTbrbvpK2USIxkpn53n6LYVV6jKgkP5sa08MdJE7cl29H1slfCrdaevBUZ1Z/UefkRnV6p3iQVpDPsBwqFRbrem33b
29
+File: main.cvd
30
+Build time: 09 Dec 2007 15:50 +0000
31
+Version: 45
32
+Signatures: 169676
33
+Functionality level: 21
34
+Builder: sven
35
+MD5: b35429d8d5d60368eea9630062f7c75a
36
+Digital signature: dxsusO/HWP3/GAA7VuZpxYwVsE9b+tCk+tPN6OyjVF/U8
37
+JVh4vYmW8mZ62ZHYMlM903TMZFg5hZIxcjQB3SX0TapdF1SFNzoWjsyH53eXvMDY
38
+eaPVNe2ccXLfEegoda4xU2TezbGfbSEGoU1qolyQYLX674sNA2Ni6l6/CEKYYh
35 39
 Verification OK.
36 40
     \end{verbatim}
37
-    There are two CVD databases in ClamAV: \emph{main.cvd} and \emph{daily.cvd}
38
-    for daily updates.
41
+    The ClamAV project distributes two CVD files: \emph{main.cvd} and
42
+    \emph{daily.cvd}.
39 43
 
40
-    \section{Signature format}
44
+    \section{Signature formats}
41 45
 
42 46
     \subsection{MD5}
43
-    There's an easy way to create signatures for static malware using MD5
44
-    checksums. To create a signature for \verb+test.exe+ use the \verb+--md5+
45
-    option of sigtool:
47
+    The easiest way to create signatures for ClamAV is to use MD5 checksums,
48
+    however this method can only be used against static malware. To create
49
+    a signature for \verb+test.exe+ use the \verb+--md5+ option of sigtool:
46 50
     \begin{verbatim}
47 51
 zolw@localhost:/tmp/test$ sigtool --md5 test.exe > test.hdb
48 52
 zolw@localhost:/tmp/test$ cat test.hdb 
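As a reading aid for the Introduction text in the hunk above: the CVD header is a single 512-byte, colon-separated string, so it can be split into the nine fields listed in the verbatim block. The Python sketch below is not part of ClamAV. The function name `parse_cvd_header`, the field keys, and the sample header are hypothetical, and the sketch assumes that the header is padded to 512 bytes with spaces or NUL bytes and that no field contains a colon (the sample therefore writes the build time as `15-50`, as in the older sigtool output).

```python
# Hypothetical helper, not part of ClamAV: split a CVD header into named fields.
CVD_HEADER_SIZE = 512

# Field order follows the header layout quoted in the documentation.
FIELD_NAMES = (
    "magic", "build_time", "version", "signatures", "functionality_level",
    "md5", "digital_signature", "builder", "build_time_sec",
)


def parse_cvd_header(raw: bytes) -> dict:
    """Return the header fields of a CVD container as a dict of strings."""
    if len(raw) < CVD_HEADER_SIZE:
        raise ValueError("header shorter than 512 bytes")
    # Assumption: unused header space is padded with spaces or NUL bytes.
    text = raw[:CVD_HEADER_SIZE].decode("ascii").rstrip(" \x00\r\n")
    parts = text.split(":")
    if len(parts) != len(FIELD_NAMES) or parts[0] != "ClamAV-VDB":
        raise ValueError("not a CVD header in the documented layout")
    return dict(zip(FIELD_NAMES, parts))


# Synthetic header for demonstration only; the signature and the
# build time in seconds are placeholders, not real values.
sample = (
    "ClamAV-VDB:09 Dec 2007 15-50 +0000:45:169676:21:"
    "b35429d8d5d60368eea9630062f7c75a:c2lnbmF0dXJl:sven:1197215400"
).encode("ascii").ljust(CVD_HEADER_SIZE, b" ")

header = parse_cvd_header(sample)
print(header["version"], header["signatures"], header["builder"])
```

For a real file the first 512 bytes would be read with `open(path, "rb").read(512)`; verifying the digital signature is out of scope here and is what `sigtool --info` does.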
... ...
@@ -56,33 +60,36 @@ test.exe: test.exe FOUND
56 56
 ----------- SCAN SUMMARY -----------
57 57
 Known viruses: 1
58 58
 Scanned directories: 0
59
-Engine version: 0.88.2
59
+Engine version: 0.92.1
60 60
 Scanned files: 1
61 61
 Infected files: 1
62 62
 Data scanned: 0.02 MB
63 63
 Time: 0.024 sec (0 m 0 s)
64 64
     \end{verbatim}
65
-    You can edit it to change the name (by default sigtool uses the file name).
66
-    Remember that all MD5 signatures must be placed inside \verb+*.hdb+ files
67
-    and you can include any number of signatures inside a single file. To get
68
-    them automatically loaded every time clamscan/clamd starts just copy them
69
-    to the local virus database directory.
65
+    You can change the name (by default sigtool uses the name of the file)
66
+    and place it inside a \verb+*.hdb+ file. A single database file can
67
+    include any number of signatures. To get them automatically loaded
68
+    each time clamscan/clamd starts just copy the database file(s) into
69
+    the local virus database directory (e.g. /usr/local/share/clamav).
70 70
 
71 71
     \subsection{MD5, PE section based}
72
-    You can create an MD5 signature for a specific section in a PE file.
73
-    Such signatures are stored in .mdb files in the following format:
72
+    You can create an MD5 signature for a specific section in a PE file.
73
+    Such signatures shall be stored inside \verb+.mdb+ files in the
74
+    following format:
74 75
     \begin{verbatim}
75 76
 PESectionSize:MD5:MalwareName
76 77
     \end{verbatim}
78
+    The easiest way to generate MD5 based section signatures is to extract
79
+    target PE sections into separate files and then run sigtool with the
80
+    option \verb+--mdb+.
77 81
 
78 82
     \subsection{Hexadecimal signatures}
79
-    ClamAV keeps viral fragments in hexadecimal format. If you don't know how
80
-    to get a proper signature please try the MD5 method or submit your sample
81
-    at \url{http://www.clamav.net/sendvirus}
83
+    ClamAV stores all signatures in a hexadecimal format. By a hex-signature
84
+    here we mean a fragment of a malware's body converted into a hexadecimal
85
+    string which can be additionally extended with various wildcards.
82 86
 
83 87
     \subsubsection{Hexadecimal format}
84
-    You can use \verb+sigtool --hex-dump+ to convert arbitrary data into
85
-    hexadecimal format:
88
+    You can use \verb+sigtool --hex-dump+ to convert any data into a hex-string:
86 89
     \begin{verbatim}
87 90
 zolw@localhost:/tmp/test$ sigtool --hex-dump
88 91
 How do I look in hex?
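The two conversions described in this hunk, turning raw bytes into a hex-string and building a `PESectionSize:MD5:MalwareName` entry, can be sketched in Python. This only illustrates the data formats and is not a replacement for sigtool. The function names and the sample name `Test.Section` are hypothetical, and the sketch assumes that `PESectionSize` is the byte length of the extracted section and that the MD5 is written as lower-case hex.

```python
import hashlib


def to_hex_string(data: bytes) -> str:
    """Convert arbitrary bytes into a lower-case hex-string."""
    return data.hex()


def mdb_entry(section: bytes, malware_name: str) -> str:
    """Build one PESectionSize:MD5:MalwareName line for an extracted section.

    Assumption: the size field is the length in bytes of the section data.
    """
    digest = hashlib.md5(section).hexdigest()
    return f"{len(section)}:{digest}:{malware_name}"


# Hex-string for the text typed in the sigtool session (without a newline).
print(to_hex_string(b"How do I look in hex?"))

# Section signature for a stand-in "section" consisting of three bytes.
print(mdb_entry(b"abc", "Test.Section"))
```

In practice the section bytes would come from a file extracted from the PE, and `sigtool --mdb` remains the documented way to generate such entries.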
... ...
@@ -95,12 +102,13 @@ How do I look in hex?
95 95
 	\item \verb+??+\\
96 96
 	Match any byte.
97 97
 	\item \verb+a?+\\
98
-	Match high nibble (high four bits). \textbf{IMPORTANT NOTE:} Nibble
99
-	matching is only available in libclamav with the functionality level
100
-	17 therefore please only use it with .ndb signatures, each followed
101
-	by ":17" (MinEngineFunctionalityLevel, see \ref{ndb}).
98
+	Match a high nibble (the four high bits). \textbf{IMPORTANT NOTE:}
99
+	The nibble matching is only available in libclamav with the
100
+	functionality level 17 and higher; therefore, please only use it with
101
+	.ndb signatures followed by ":17" (MinEngineFunctionalityLevel,
102
+	see \ref{ndb}).
102 103
 	\item \verb+?a+\\
103
-	Match low nibble (low four bits).
104
+	Match a low nibble (the four low bits).
104 105
 	\item \verb+*+\\
105 106
 	Match any number of bytes.
106 107
 	\item \verb+{n}+\\
... ...
@@ -109,47 +117,56 @@ How do I look in hex?
109 109
 	Match n or less bytes.
110 110
 	\item \verb+{n-}+\\
111 111
 	Match n or more bytes.
112
-	\item \verb+(a|b)+\\
113
-	Match a or b (you can use more alternate characters).
112
+	\item \verb+(aa|bb|cc|..)+\\
113
+	Match aa or bb or cc, etc.
114
+	\item \verb+HEXSIG[x-y]aa+ or \verb+aa[x-y]HEXSIG+\\
115
+	Match aa anchored to a hex-signature, see
116
+	\url{https://wwws.clamav.net/bugzilla/show_bug.cgi?id=776} for
117
+	a discussion and examples.
114 118
     \end{itemize}
119
+    The range signatures \verb+*+ and \verb+{}+ virtually separate
120
+    a hex-signature into two parts, e.g. \verb+aabbcc*bbaacc+ is treated
121
+    as two sub-signatures \verb+aabbcc+ and \verb+bbaacc+ with any number
122
+    of bytes between them. It's a requirement that each sub-signature
123
+    includes a block of two static characters somewhere in its body.
115 124
 
116 125
     \subsubsection{Basic signature format}
117
-    The simplest signatures are of the format:
126
+    The simplest (and now deprecated) signature format is:
118 127
     \begin{verbatim}
119 128
 MalwareName=HexSignature
120 129
     \end{verbatim}
121
-    ClamAV will analyse a whole content of a file trying to match it. All
122
-    signatures of this type must be placed in \verb+*.db+ files.
130
+    ClamAV will scan the entire file looking for HexSignature. All
131
+    signatures of this type must be placed inside \verb+*.db+ files.
123 132
 
124 133
     \subsubsection{Extended signature format}\label{ndb}
125
-    Extended signature format allows on including additional information about
126
-    target file type, virus offset and required engine version.
127
-    The format is:
134
+    The extended signature format allows for specification of additional
135
+    information such as a target file type, virus offset or engine version,
136
+    making the detection more reliable. The format is:
128 137
     \begin{verbatim}
129 138
 MalwareName:TargetType:Offset:HexSignature[:MinEngineFunctionalityLevel:[Max]]
130 139
     \end{verbatim}
131
-    where \verb+TargetType+ is one of the following decimal numbers describing
132
-    the target file type:
140
+    where \verb+TargetType+ is one of the following numbers specifying
141
+    the type of the target file:
133 142
     \begin{itemize}
134 143
 	\item 0 = any file
135 144
 	\item 1 = Portable Executable
136
-	\item 2 = OLE2 component (e.g. VBA script)
145
+	\item 2 = OLE2 component (e.g. a VBA script)
137 146
 	\item 3 = HTML (normalised)
138 147
 	\item 4 = Mail file
139
-	\item 5 = Graphics (to help catching exploits in JPEG files)
148
+	\item 5 = Graphics
140 149
 	\item 6 = ELF
150
+	\item 7 = ASCII text file (normalised)
141 151
     \end{itemize}
142 152
     And	\verb+Offset+ is an asterisk or a decimal number \verb+n+ possibly
143
-    combined with a special string:
153
+    combined with a special modifier:
144 154
     \begin{itemize}
145 155
 	\item \verb+*+ = any
146 156
 	\item \verb+n+ = absolute offset
147 157
 	\item \verb+EOF-n+ = end of file minus \verb+n+ bytes
148 158
     \end{itemize}
149
-    Signatures for Portable Executables files (target = 1) also support:
159
+    Signatures for PE and ELF files additionally support:
150 160
     \begin{itemize}
151
-	\item \verb#EP+n# = entry point plus n bytes (\verb#EP+0# if you
152
-	want to anchor to \verb+EP+)
161
+	\item \verb#EP+n# = entry point plus n bytes (\verb#EP+0# for \verb+EP+)
153 162
 	\item \verb#EP-n# = entry point minus n bytes
154 163
 	\item \verb#Sx+n# = start of section \verb+x+'s (counted from 0)
155 164
 	data plus \verb+n+ bytes
... ...
@@ -166,15 +183,17 @@ MalwareName:TargetType:Offset:HexSignature[:MinEngineFunctionalityLevel:[Max]]
166 166
     0.91 will silently ignore the \verb+MaxShift+ extension and only use
167 167
     \verb+Offset+.\\
168 168
 
169
+    \noindent
169 170
     All signatures in the extended format must be placed inside \verb+*.ndb+ files.
170 171
 
171 172
     \subsection{Signatures based on archive metadata}
172
-    In order to detect some malware which spreads inside of Zip or RAR archives
173
-    (especially encrypted ones) you can try to create a signature describing
174
-    a malicious archived file. The general format is:
173
+    Signatures based on metadata inside archive files can provide an effective
174
+    protection against malware that spreads via encrypted zip or rar
175
+    archives. The format of a metadata signature is:
175 176
 \begin{verbatim}
176 177
 virname:encrypted:filename:normal size:csize:crc32:cmethod:fileno:max depth
177 178
 \end{verbatim}
179
+    where the corresponding fields are:
178 180
     \begin{itemize}
179 181
 	\item Virus name
180 182
 	\item Encryption flag (1 -- encrypted, 0 -- not encrypted)
... ...
@@ -186,15 +205,22 @@ virname:encrypted:filename:normal size:csize:crc32:cmethod:fileno:max depth
186 186
 	\item File position in archive (* to ignore)
187 187
 	\item Maximum number of nested archives (* to ignore)
188 188
     \end{itemize}
189
-    The database should have the extension \verb+.zmd+ or \verb+.rmd+ for
190
-    Zip or RAR archive respectively.
189
+    The database file should have the extension of \verb+.zmd+ or
190
+    \verb+.rmd+ for zip or rar metadata respectively.
191 191
 
192
-    \subsection{Whitelist database}
192
+    \subsection{Whitelist databases}
193 193
     To whitelist a specific file use the MD5 signature format and place
194
-    it in the database with the extension \verb+.fp+.
194
+    it inside a database file with the extension of \verb+.fp+.\\
195
+
196
+    \noindent
197
+    To whitelist a specific signature inside main.cvd add the following
198
+    entry into daily.ign or a local file local.ign:
199
+\begin{verbatim}
200
+db_name:line_number:signature_name
201
+\end{verbatim}
195 202
 
196 203
     \subsection{Signature names}
197
-    ClamAV uses the following prefixes for particular malware:
204
+    ClamAV uses the following prefixes for signature names:
198 205
     \begin{itemize}
199 206
 	\item \emph{Worm} for Internet worms
200 207
 	\item \emph{Trojan} for backdoor programs
... ...
@@ -210,7 +236,7 @@ virname:encrypted:filename:normal size:csize:crc32:cmethod:fileno:max depth
210 210
 	\item \emph{BAT} for BAT malware
211 211
 	\item \emph{W97M}, \emph{W2000M} for Word macro viruses
212 212
 	\item \emph{X97M}, \emph{X2000M} for Excel macro viruses
213
-	\item \emph{O97M}, \emph{O2000M} for general Office macro viruses
213
+	\item \emph{O97M}, \emph{O2000M} for generic Office macro viruses
214 214
 	\item \emph{DoS} for Denial of Service attack software
215 215
 	\item \emph{DOS} for old DOS malware
216 216
 	\item \emph{Exploit} for popular exploits
... ...
@@ -230,30 +256,35 @@ virname:encrypted:filename:normal size:csize:crc32:cmethod:fileno:max depth
230 230
     \section{Special files}
231 231
 
232 232
     \subsection{HTML}
233
-    ClamAV contains a special HTML normalisation code required to detect
233
+    ClamAV contains special HTML normalisation code which helps to detect
234 234
     HTML exploits. Running \verb+sigtool --html-normalise+ on a HTML file
235
-    should create the following files:
235
+    should generate the following files:
236 236
     \begin{itemize}
237
-	\item comment.html - the whole file normalised
238
-	\item nocomment.html - the file normalised, with all comments removed
239
-	\item script.html - the parts of the file in \verb+<script>+ tags
240
-	      (lowercased)
237
+	\item nocomment.html - the file is normalised, lower-case, with all
238
+	comments and superfluous white space removed
239
+	\item notags.html - as above but with all HTML tags removed
241 240
     \end{itemize}
242 241
     The code automatically decodes JScript.encode parts and char ref's (e.g.
243 242
     \verb+&#102;+). You need to create a signature against one of the created
244
-    files. To eliminate potential false positive alerts you should use
245
-    extended signature format with target type of 3.
243
+    files. To eliminate potential false positive alerts the target type should
244
+    be set to 3.
245
+
246
+    \subsection{Text files}
247
+    As with HTML, all ASCII text files get normalised (converted
248
+    to lower-case, all superfluous white space and control characters removed,
249
+    etc.) before scanning. Use \verb+clamscan --leave-temps+ to obtain
250
+    a normalised file, then create a signature with the target type 7.
246 251
 
247 252
     \subsection{Compressed Portable Executable files}
248
-    If the file is compressed with UPX, FSG, Petite or other executable packer
249
-    (supported by libclamav) run \verb+clamscan+ with
250
-    \verb+--debug --leave-temps+. Example output on FSG compressed file:
253
+    If the file is compressed with UPX, FSG, Petite or other PE packer
254
+    supported by libclamav, run \verb+clamscan+ with
255
+    \verb+--debug --leave-temps+. Example output for an FSG-compressed file:
251 256
     \begin{verbatim}
252
-LibClamAV debug: UPX/FSG: empty section found - assuming compression
253
-LibClamAV debug: FSG: found old EP @1554
254
-LibClamAV debug: FSG: Successfully decompressed
255
-LibClamAV debug: UPX/FSG: Decompressed data saved in /tmp/clamav-4eba73ff4050a26
257
+LibClamAV debug: UPX/FSG/MEW: empty section found - assuming compression
258
+LibClamAV debug: FSG: found old EP @119e0
259
+LibClamAV debug: FSG: Unpacked and rebuilt executable saved in
260
+/tmp/clamav-f592b20f9329ac1c91f0e12137bcce6c
256 261
     \end{verbatim}
257
-    and then create a signature for \verb+/tmp/clamav-4eba73ff4050a26+
262
+    Next, create a type 1 signature for \verb+/tmp/clamav-f592b20f9329ac1c91f0e12137bcce6c+
258 263
 
259 264
 \end{document}
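
The extended signature format documented in the diff above (`MalwareName:TargetType:Offset:HexSignature[:MinEngineFunctionalityLevel:[Max]]`) can be illustrated with a small parser. This is a sketch under stated assumptions, not libclamav's loader: the names `NdbSignature` and `parse_ndb_line` and the sample signature `Example.Test` are hypothetical, the target type table is copied from the documentation, and the offset and hex-signature are kept as opaque strings rather than validated.

```python
from typing import NamedTuple, Optional

# Target types as listed in the documentation.
TARGET_TYPES = {
    0: "any file",
    1: "Portable Executable",
    2: "OLE2 component",
    3: "HTML (normalised)",
    4: "Mail file",
    5: "Graphics",
    6: "ELF",
    7: "ASCII text file (normalised)",
}


class NdbSignature(NamedTuple):
    name: str
    target_type: int
    offset: str            # e.g. "*", "1234", "EOF-16", "EP+0" (not validated)
    hex_signature: str     # may contain wildcards such as ?? or *
    min_level: Optional[int]
    max_level: Optional[int]


def parse_ndb_line(line: str) -> NdbSignature:
    """Split one extended-format line into its fields (hypothetical helper)."""
    parts = line.strip().split(":")
    if not 4 <= len(parts) <= 6:
        raise ValueError("expected 4 to 6 colon-separated fields")
    name, target, offset, hex_signature = parts[:4]
    target_type = int(target)
    if target_type not in TARGET_TYPES:
        raise ValueError(f"unknown target type {target_type}")
    # The two functionality level fields are optional.
    min_level = int(parts[4]) if len(parts) > 4 and parts[4] else None
    max_level = int(parts[5]) if len(parts) > 5 and parts[5] else None
    return NdbSignature(name, target_type, offset, hex_signature,
                        min_level, max_level)


sig = parse_ndb_line("Example.Test:1:EP+0:aabbcc*bbaacc:17")
print(sig.name, TARGET_TYPES[sig.target_type], sig.offset, sig.min_level)

# The * wildcard separates the hex-signature into two sub-signatures.
print(sig.hex_signature.split("*"))
```

Splitting on `*` mirrors the documentation's own `aabbcc*bbaacc` example; the `{}` range wildcards would need separate handling that this sketch does not attempt.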