docs/signatures.pdf: describe logical signatures; other minor improvements (bb#1582)

git-svn: trunk@5066

Tomasz Kojm authored on 2009/05/06 22:47:25
Showing 3 changed files
... ...
@@ -1,3 +1,8 @@
1
+Wed May  6 15:43:27 CEST 2009 (tk)
2
+----------------------------------
3
+ * docs/signatures.pdf: describe logical signatures;
4
+			other minor improvements (bb#1582)
5
+
1 6
 Wed May  6 14:30:51 EEST 2009 (edwin)
2 7
 -------------------------------------
3 8
  * configure, configure.in: add -fno-strict-aliasing, so that
4 9
Binary files a/docs/signatures.pdf and b/docs/signatures.pdf differ
... ...
@@ -102,7 +102,7 @@ How do I look in hex?
102 102
 	\item \verb+??+\\
103 103
 	Match any byte.
104 104
 	\item \verb+a?+\\
105
-	Match a high nibble (the four high bits). \textbf{IMPORTANT NOTE:}
105
+	Match a high nibble (the four high bits).\\ \textbf{IMPORTANT NOTE:}
106 106
 	The nibble matching is only available in libclamav with the
107 107
 	functionality level 17 and higher therefore please only use it with
108 108
 	.ndb signatures followed by ":17" (MinEngineFunctionalityLevel,
... ...
@@ -112,11 +112,13 @@ How do I look in hex?
112 112
 	\item \verb+*+\\
113 113
 	Match any number of bytes.
114 114
 	\item \verb+{n}+\\
115
-	Match n bytes.
115
+	Match $n$ bytes.
116 116
 	\item \verb+{-n}+\\
117
-	Match n or less bytes.
117
+	Match $n$ or fewer bytes.
118 118
 	\item \verb+{n-}+\\
119
-	Match n or more bytes.
119
+	Match $n$ or more bytes.
120
+	\item \verb+{n-m}+\\
121
+	Match between $n$ and $m$ bytes ($m > n$).
120 122
 	\item \verb+(aa|bb|cc|..)+\\
121 123
 	Match aa or bb or cc..
122 124
 	\item \verb+HEXSIG[x-y]aa+ or \verb+aa[x-y]HEXSIG+\\
... ...
@@ -149,13 +151,21 @@ MalwareName:TargetType:Offset:HexSignature[:MinEngineFunctionalityLevel:[Max]]
149 149
     the type of the target file:
150 150
     \begin{itemize}
151 151
 	\item 0 = any file
152
-	\item 1 = Portable Executable
153
-	\item 2 = OLE2 component (e.g. a VBA script)
154
-	\item 3 = HTML (normalised)
152
+	\item 1 = Portable Executable, both 32- and 64-bit.
153
+	\item 2 = file inside OLE2 container (e.g. image, embedded executable,
154
+	VBA script). The OLE2 format is primarily used by MS Office and MSI
155
+	installation files.
156
+	\item 3 = HTML (normalized: whitespace transformed to spaces, tags/tag
157
+	attributes normalized, all lowercase). JavaScript is normalized too:
158
+	all strings are normalized (hex encoding is decoded), numbers are
159
+	parsed and normalized, local variables/function names are normalized
160
+	to 'n001' format, argument to eval() is parsed as JS again,
161
+	unescape() is handled, some simple JS packers are handled,
162
+	output is whitespace normalized.
155 163
 	\item 4 = Mail file
156 164
 	\item 5 = Graphics
157 165
 	\item 6 = ELF
158
-	\item 7 = ASCII text file (normalised)
166
+	\item 7 = ASCII text file (normalized)
159 167
     \end{itemize}
160 168
     And	\verb+Offset+ is an asterisk or a decimal number \verb+n+ possibly
161 169
     combined with a special modifier:
... ...
@@ -186,6 +196,72 @@ MalwareName:TargetType:Offset:HexSignature[:MinEngineFunctionalityLevel:[Max]]
186 186
     \noindent
187 187
     All signatures in the extended format must be placed inside \verb+*.ndb+ files.
188 188
 
189
+    \subsubsection{Logical signatures}\label{ndb}
190
+    Logical signatures allow combining multiple signatures in extended
191
+    format using logical operators. They can provide both more detailed and
192
+    flexible pattern matching. The logical sigs are stored inside \verb+*.ldb+
193
+    files in the following format:
194
+    \begin{verbatim}
195
+SignatureName;TargetDescriptionBlock;LogicalExpression;Subsig0;
196
+Subsig1;Subsig2;...
197
+    \end{verbatim}
198
+    where:
199
+    \begin{itemize}
200
+	\item \verb+TargetDescriptionBlock+ provides information about the
201
+	engine and target file as comma-separated \verb+Arg:Val+ pairs;
202
+	currently (as of 0.95.1) only \verb+Target:X+ and \verb+Engine:X-Y+
203
+	are supported.
204
+	\item \verb+LogicalExpression+ specifies the logical expression
205
+	describing the relationship between \verb+Subsig0...SubsigN+.\\
206
+	\textbf{Basis clause:} 0,1,...,N decimal indexes are SUB-EXPRESSIONS
207
+	representing \verb+Subsig0, Subsig1,...,SubsigN+ respectively.\\
208
+	\textbf{Inductive clause:} if \verb+A+ and \verb+B+ are
209
+	SUB-EXPRESSIONS and \verb+X, Y+ are decimal numbers then
210
+	\verb+(A&B)+, \verb+(A|B)+, \verb+A=X+, \verb+A=X,Y+, \verb+A>X+,
211
+	\verb+A>X,Y+, \verb+A<X+ and \verb+A<X,Y+ are SUB-EXPRESSIONS.
212
+	\item \verb+SubsigN+ is the n-th subsignature in extended format, possibly
213
+	preceded by an offset. Up to 64 subsigs can be specified.
214
+    \end{itemize}
215
+    Modifiers for subexpressions:
216
+    \begin{itemize}
217
+	\item \verb+A=X+: If the SUB-EXPRESSION A refers to a single signature
218
+	then this signature must get matched exactly X times; if it refers to
219
+	a (logical) block of signatures then this block must generate exactly
220
+	X matches (with any of its sigs).
221
+	\item \verb+A=0+ specifies negation (signature or block of signatures
222
+	must not be matched).
223
+	\item \verb+A=X,Y+: If the SUB-EXPRESSION A refers to a single signature
224
+	then this signature must be matched exactly X times; if it refers to
225
+	a (logical) block of signatures then this block must generate exactly X matches
226
+	and at least Y different signatures must get matched.
227
+	\item \verb+A>X+: If the SUB-EXPRESSION A refers to a single signature
228
+	then this signature must get matched more than X times; if it refers to
229
+	a (logical) block of signatures then this block must generate more
230
+	than X matches (with any of its sigs).
231
+	\item \verb+A>X,Y+: If the SUB-EXPRESSION A refers to a single signature
232
+	then this signature must get matched more than X times; if it refers to
233
+	a (logical) block of signatures then this block must generate more than
234
+	X matches and at least Y different signatures must be matched.
235
+	\item \verb+A<X+ and \verb+A<X,Y+: as above, with ``more'' replaced
236
+	by ``fewer''.
237
+    \end{itemize}
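The modifier rules above can be sketched in Python as follows. This is only an illustration, not libclamav's implementation: the function name `modifier_holds` and its parameters are hypothetical, and it assumes that the number of matches generated by a block is the sum of the match counts of its signatures.

```python
from typing import Optional, Sequence


def modifier_holds(counts: Sequence[int], op: str, x: int,
                   y: Optional[int] = None) -> bool:
    """Check A=X, A>X, A<X (optionally with ,Y) against match counts.

    counts holds one match count per signature in A; a single-element
    sequence stands for a single signature.
    Assumption: a block's matches are the sum of its signatures' matches.
    """
    total = sum(counts)
    if op == "=":
        ok = total == x
    elif op == ">":
        ok = total > x
    elif op == "<":
        ok = total < x
    else:
        raise ValueError("unknown operator: " + op)

    # Y only applies to blocks: at least Y different signatures must match.
    if y is not None and len(counts) > 1:
        distinct = sum(1 for c in counts if c > 0)
        ok = ok and distinct >= y
    return ok
```

For instance, `modifier_holds([3, 0, 4], ">", 5, 2)` models `(0|1|2)>5,2` for a file in which subsig 0 matched three times and subsig 2 matched four times. `modifier_holds([0, 0], "=", 0)` models a negated block.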
238
+    Examples:
239
+    \begin{verbatim}
240
+Sig1;Target:0;(0&1&2&3)&(4|1);6b6f74656b;616c61;7a6f6c77;7374656
241
+6616e;deadbeef
242
+
243
+Sig2;Target:0;((0|1|2)>5,2)&(3|1);6b6f74656b;616c61;7a6f6c77;737
244
+46566616e  
245
+
246
+Sig3;Target:0;((0|1|2|3)=2)&(4|1);6b6f74656b;616c61;7a6f6c77;737
247
+46566616e;deadbeef
248
+
249
+Sig4;Target:1,Engine:18-20;((0|1)&(2|3))&4;EP+123:33c06834f04100
250
+f2aef7d14951684cf04100e8110a00;S2+78:22??232c2d252229{-15}6e6573
251
+(63|64)61706528;S+50:68efa311c3b9963cb1ee8e586d32aeb9043e;f9c58d
252
+cf43987e4f519d629b103375;SL+550:6300680065005c0046006900
253
+    \end{verbatim}
254
+
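To make the field layout of a logical signature concrete, here is a minimal Python sketch that splits one `*.ldb` line into its parts. It is not part of ClamAV; the names `LogicalSig` and `parse_ldb_line` are hypothetical, the input is assumed to be a single unwrapped line (the examples above are wrapped only for display), and the index check is a simple heuristic rather than a full expression parser.

```python
import re
from typing import Dict, List, NamedTuple, Optional, Tuple


class LogicalSig(NamedTuple):
    name: str
    tdb: Dict[str, str]          # TargetDescriptionBlock as Arg -> Val
    expression: str              # LogicalExpression, kept as text
    subsigs: List[Tuple[Optional[str], str]]  # (offset or None, hex sig)


def parse_ldb_line(line: str) -> LogicalSig:
    fields = line.strip().split(";")
    if len(fields) < 4:
        raise ValueError("expected name, TDB, expression and >= 1 subsig")
    name, tdb_raw, expression = fields[:3]

    # Comma separated Arg:Val pairs, e.g. "Target:1,Engine:18-20".
    tdb = dict(pair.split(":", 1) for pair in tdb_raw.split(","))

    subsigs = []
    for raw in fields[3:]:
        # An optional offset precedes the hex signature, e.g. "EP+123:33c0".
        offset, sep, hexsig = raw.rpartition(":")
        subsigs.append((offset if sep else None, hexsig))
    if len(subsigs) > 64:
        raise ValueError("at most 64 subsigs are allowed")

    # Numbers not preceded by =, <, > or a comma are subsig indexes;
    # the others are the X and Y values of a modifier.
    for idx in re.findall(r"(?<![=<>,\d])\d+", expression):
        if int(idx) >= len(subsigs):
            raise ValueError("expression refers to missing Subsig" + idx)
    return LogicalSig(name, tdb, expression, subsigs)
```

Applied to the `Sig1` example joined onto one line, this yields the name `Sig1`, the block `{"Target": "0"}` and five subsigs without offsets. A line whose expression refers to an index with no corresponding subsig is rejected with `ValueError`.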
189 255
     \subsection{Signatures based on archive metadata}
190 256
     Signatures based on metadata inside archive files can provide an effective
191 257
     protection against malware that spreads via encrypted zip or rar
... ...
@@ -260,7 +336,7 @@ db_name:line_number:signature_name
260 260
     HTML exploits. Running \verb+sigtool --html-normalise+ on an HTML file
261 261
     should generate the following files:
262 262
     \begin{itemize}
263
-	\item nocomment.html - the file is normalised, lower-case, with all
263
+	\item nocomment.html - the file is normalized, lower-case, with all
264 264
 	comments and superfluous white space removed
265 265
 	\item notags.html - as above but with all HTML tags removed
266 266
     \end{itemize}
... ...
@@ -270,10 +346,10 @@ db_name:line_number:signature_name
270 270
     be set to 3.
271 271
 
272 272
     \subsection{Text files}
273
-    Similarly to HTML all ASCII text files get normalised (converted
273
+    Similarly to HTML all ASCII text files get normalized (converted
274 274
     to lower-case, all superfluous white space and control characters removed,
275 275
     etc.) before scanning. Use \verb+clamscan --leave-temps+ to obtain
276
-    a normalised file then create a signature with the target type 7.
276
+    a normalized file then create a signature with the target type 7.
277 277
 
278 278
     \subsection{Compressed Portable Executable files}
279 279
     If the file is compressed with UPX, FSG, Petite or other PE packer