1 | 1 |
deleted file mode 100644 |
... | ... |
@@ -1,36 +0,0 @@ |
1 |
-# Base package |
|
2 |
- |
|
3 |
-## Supported platforms |
|
4 |
- |
|
5 |
-Clam AntiVirus is highly cross-platform. The development team cannot test every OS, so we have chosen to test ClamAV using the two most recent Long Term Support (LTS) versions of each of the most popular desktop operating systems. Our selection includes: |
|
6 |
- |
|
7 |
-- GNU/Linux |
|
8 |
- - Ubuntu |
|
9 |
- - 14.04 |
|
10 |
- - 16.04 |
|
11 |
- - Debian |
|
12 |
- - 7 |
|
13 |
- - 8 |
|
14 |
- - CentOS |
|
15 |
- - 6 |
|
16 |
- - 7 |
|
17 |
-- UNIX |
|
18 |
- - Solaris |
|
19 |
- - 10 |
|
20 |
- - 11 |
|
21 |
- - FreeBSD |
|
22 |
- - 10 |
|
23 |
- - 11 |
|
24 |
- - macOS |
|
25 |
- - 10.12 (Sierra) |
|
26 |
- - 10.13 (High Sierra) |
|
27 |
-- Windows |
|
28 |
- - 7 |
|
29 |
- - 10 |
|
30 |
- |
|
31 |
-## Binary packages |
|
32 |
- |
|
33 |
-As an alternative to building and installing from source, most Linux package managers provide pre-compiled ClamAV packages. |
|
34 |
- |
|
35 |
-For more information about installing ClamAV via a Package Manager, please visit: |
|
36 |
-<https://www.clamav.net/download.html#otherversions> |
37 | 1 |
deleted file mode 100644 |
... | ... |
@@ -1,26 +0,0 @@ |
1 |
-# Clam AntiVirus 0.100.0 *User Manual* |
|
2 |
- |
|
3 |
-![image](images/demon.png) |
|
4 |
- |
|
5 |
- |
|
6 |
-Table Of Contents |
|
7 |
- |
|
8 |
-1. [Introduction](Introduction.md) |
|
9 |
-2. [Base Package](BasePackage.md) |
|
10 |
-3. [Installation](Installation.md) |
|
11 |
-4. [Configuration](Configuration.md) |
|
12 |
-5. [Usage](Usage.md) |
|
13 |
-6. [libclamav](libclamav.md) |
|
14 |
-7. [Signatures](Signatures.md) |
|
15 |
-8. [PhishSigs](PhishSigs.md) |
|
16 |
- |
|
17 |
- |
|
18 |
-ClamAV User Manual © 2018 Cisco Systems, Inc. |
|
19 |
- |
|
20 |
-This document is distributed under the terms of the GNU General Public License v2. |
|
21 |
- |
|
22 |
-Clam AntiVirus is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. |
|
23 |
- |
|
24 |
-ClamAV and Clam AntiVirus are trademarks of Cisco Systems, Inc. |
25 | 1 |
deleted file mode 100644 |
... | ... |
@@ -1,105 +0,0 @@ |
1 |
-# Configuration |
|
2 |
- |
|
3 |
-Before proceeding with the steps below, you should run the ’clamconf’ command, which gives important information about your ClamAV configuration. See section [5.8](#sec:clamconf) for more details. |
|
4 |
- |
|
5 |
-## clamd |
|
6 |
- |
|
7 |
-Before you start using the daemon you have to edit the configuration file (otherwise `clamd` won’t run): |
|
8 |
- |
|
9 |
-```bash |
|
10 |
- $ clamd |
|
11 |
- ERROR: Please edit the example config file /etc/clamd.conf. |
|
12 |
-``` |
|
13 |
- |
|
14 |
-This shows the location of the default configuration file. The format and options of this file are fully described in the *clamd.conf(5)* manual. The config file is well commented and configuration should be straightforward. |
|
15 |
- |
|
16 |
-### On-access scanning |
|
17 |
- |
|
18 |
-One of the interesting features of `clamd` is on-access scanning based on fanotify, included in Linux since kernel 2.6.36. **This is not required to run clamd**. At the moment the fanotify header is only available for Linux. |
|
19 |
- |
|
20 |
-Configure on-access scanning in `clamd.conf` and read the [on-access](Usage.md#On-access-Scanning) section for on-access scanning usage. |
|
21 |
- |
|
22 |
-## clamav-milter |
|
23 |
- |
|
24 |
-ClamAV (v0.95) includes a new, redesigned clamav-milter. The most notable difference is that the internal mode has been dropped and now a working clamd companion is required. The second important difference is that now the milter has got its own configuration and log files. |
|
25 |
- |
|
26 |
-To compile ClamAV with the clamav-milter just run `./configure --enable-milter` and make as usual. In order to use the `--enable-milter` option with `configure`, your system MUST have the milter library installed. If you use the `--enable-milter` option without the library being installed, you will most likely see output like this during ’configure’: |
|
27 |
- |
|
28 |
-```bash |
|
29 |
- checking for libiconv_open in -liconv... no |
|
30 |
- checking for iconv... yes |
|
31 |
- checking whether in_port_t is defined... yes |
|
32 |
- checking for in_addr_t definition... yes |
|
33 |
- checking for mi_stop in -lmilter... no |
|
34 |
- checking for library containing strlcpy... no |
|
35 |
- checking for mi_stop in -lmilter... no |
|
36 |
- configure: error: Cannot find libmilter |
|
37 |
-``` |
|
38 |
- |
|
39 |
-At this point the ’configure’ script will stop processing. |
|
40 |
- |
|
41 |
-Please consult your MTA’s manual on how to connect ClamAV with the milter. |
|
42 |
- |
|
43 |
-## Testing |
|
44 |
- |
|
45 |
-Try scanning the source directory recursively: |
|
46 |
- |
|
47 |
-```bash |
|
48 |
- $ clamscan -r -l scan.txt clamav-x.yz |
|
49 |
-``` |
|
50 |
- |
|
51 |
-It should find some test files in the clamav-x.yz/test directory. The scan result will be saved in the `scan.txt` log file \[7\]. To test `clamd`, start it and use `clamdscan` (or instead connect directly to its socket and run the SCAN command): |
|
52 |
- |
|
53 |
-```bash |
|
54 |
- $ clamdscan -l scan.txt clamav-x.yz |
|
55 |
-``` |
|
56 |
- |
|
57 |
-Please note that the scanned files must be accessible by the user running `clamd` or you will get an error. |
|
58 |
- |
|
59 |
-## Setting up auto-updating |
|
60 |
- |
|
61 |
-`freshclam` is the automatic database update tool for Clam AntiVirus. It can work in two modes: |
|
62 |
- |
|
63 |
-- interactive - on demand from command line |
|
64 |
-- daemon - silently in the background |
|
65 |
- |
|
66 |
-`freshclam` is an advanced tool: it supports scripted updates (instead of transferring the whole CVD file at each update it only transfers the differences between the latest and the current database via a special script), database version checks through DNS, proxy servers (with authentication), digital signatures and various error scenarios. **Quick test: run freshclam (as superuser) with no parameters and check the output.** If everything is OK you may create the log file in /var/log (owned by *clamav* or whichever user `freshclam` will run as): |
|
67 |
- |
|
68 |
-```bash |
|
69 |
- # touch /var/log/freshclam.log |
|
70 |
- # chmod 600 /var/log/freshclam.log |
|
71 |
- # chown clamav /var/log/freshclam.log |
|
72 |
-``` |
|
73 |
- |
|
74 |
-Now you *should* edit the configuration file `freshclam.conf` and point the *UpdateLogFile* directive to the log file. Finally, to run `freshclam` in the daemon mode, execute: |
|
75 |
- |
|
76 |
-```bash |
|
77 |
- # freshclam -d |
|
78 |
-``` |
|
79 |
- |
|
80 |
-The other way is to use the *cron* daemon. You have to add the following line to the crontab of **root** or **clamav** user: |
|
81 |
- |
|
82 |
-```cron |
|
83 |
-N * * * * /usr/local/bin/freshclam --quiet |
|
84 |
-``` |
|
85 |
- |
|
86 |
-to check for a new database every hour. **N should be a number between 3 and 57 of your choice. Please don’t choose any multiple of 10, because there are already too many clients using those time slots.** Proxy settings are only configurable via the configuration file and `freshclam` will require strict permission settings for the config file when `HTTPProxyPassword` is turned on. |
|
87 |
- |
|
88 |
-```bash |
|
89 |
- HTTPProxyServer myproxyserver.com |
|
90 |
- HTTPProxyPort 1234 |
|
91 |
- HTTPProxyUsername myusername |
|
92 |
- HTTPProxyPassword mypass |
|
93 |
-``` |
|
94 |
- |
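The advice above on choosing the cron minute `N` (between 3 and 57, never a multiple of 10) can be sketched in a few lines of Python. This is only an illustration: the helper names `pick_cron_minute` and `crontab_line` are hypothetical, and the `freshclam` path is the one used in the crontab example above.

```python
import random


def pick_cron_minute(rng=random):
    """Return a minute N with 3 <= N <= 57 that is not a multiple of 10."""
    candidates = [n for n in range(3, 58) if n % 10 != 0]
    return rng.choice(candidates)


def crontab_line(minute, freshclam="/usr/local/bin/freshclam"):
    """Build the hourly crontab entry for the chosen minute."""
    return f"{minute} * * * * {freshclam} --quiet"


if __name__ == "__main__":
    # Print a line that could be pasted into the crontab of root or clamav.
    print(crontab_line(pick_cron_minute()))
```

Picking the minute at random once per host spreads clients across the hour, which is the point of the request not to use the popular time slots.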
|
95 |
-### Closest mirrors |
|
96 |
- |
|
97 |
-The `DatabaseMirror` directive in the config file specifies the database server `freshclam` will attempt (up to `MaxAttempts` times) to download the database from. The default database mirror is [database.clamav.net](database.clamav.net) but multiple directives are allowed. In order to download the database from the closest mirror you should configure `freshclam` to use [db.xx.clamav.net](db.xx.clamav.net) where xx represents your country code. For example, if your server is in "Ascension Island" you should have the following lines included in `freshclam.conf`: |
|
98 |
- |
|
99 |
-```bash |
|
100 |
- DNSDatabaseInfo current.cvd.clamav.net |
|
101 |
- DatabaseMirror db.ac.clamav.net |
|
102 |
- DatabaseMirror database.clamav.net |
|
103 |
-``` |
|
104 |
- |
|
105 |
-The second entry acts as a fallback in case the connection to the first mirror fails for some reason. The full list of two-letters country codes is available at <http://www.iana.org/cctld/cctld-whois.htm> |
106 | 1 |
deleted file mode 100644 |
... | ... |
@@ -1,190 +0,0 @@ |
1 |
-# Installation |
|
2 |
- |
|
3 |
-## Requirements |
|
4 |
- |
|
5 |
-The following components are required to compile ClamAV under UNIX: |
|
6 |
- |
|
7 |
-- zlib and zlib-devel packages |
|
8 |
-- openssl version 0.9.8 or higher and libssl-devel packages |
|
9 |
-- gcc compiler suite (tested with 2.9x, 3.x and 4.x series) **If you are compiling with higher optimization levels than the default one ( for gcc), be aware that there have been reports of misoptimizations. The build system of ClamAV only checks for bugs affecting the default settings, it is your responsibility to check that your compiler version doesn’t have any bugs.** |
|
10 |
-- GNU make (gmake) |
|
11 |
- |
|
12 |
-The following packages are optional but **highly recommended**: |
|
13 |
- |
|
14 |
-- bzip2 and bzip2-devel library |
|
15 |
-- libxml2 and libxml2-dev library |
|
16 |
-- `check` unit testing framework \[3\]. |
|
17 |
- |
|
18 |
-The following packages are optional, but **required for bytecode JIT support**: |
|
19 |
- |
|
20 |
-- GCC C and C++ compilers (minimum 4.1.3, recommended 4.3.4 or newer). The packages for these compilers are usually called gcc, g++, or gcc-c++. \[5\] |
|
21 |
-- OSX Xcode versions prior to 5.0 use a g++ compiler frontend (llvm-gcc) that is not compatible with ClamAV JIT. It is recommended to either compile ClamAV JIT with clang++ or to compile ClamAV without JIT. |
|
22 |
-- A supported CPU for the JIT, either of: X86, X86-64, PowerPC, PowerPC64 |
|
23 |
- |
|
24 |
-The following packages are optional, but needed for the JIT unit tests: |
|
25 |
- |
|
26 |
-- GNU Make (version 3.79, recommended 3.81) |
|
27 |
-- Python (version 2.5.4 or newer), for running the JIT unit tests |
|
28 |
- |
|
29 |
-The following packages are optional, but required for clamsubmit: |
|
30 |
- |
|
31 |
-- libcurl-devel library |
|
32 |
-- libjson-c-dev library |
|
33 |
- |
|
34 |
-## Installing on shell account |
|
35 |
- |
|
36 |
-To install ClamAV locally on an unprivileged shell account you need not create any additional users or groups. Assuming your home directory is `/home/gary` you should build it as follows: |
|
37 |
- |
|
38 |
-```bash |
|
39 |
- $ ./configure --prefix=/home/gary/clamav --disable-clamav |
|
40 |
- $ make; make install |
|
41 |
-``` |
|
42 |
- |
|
43 |
-To test your installation execute: |
|
44 |
- |
|
45 |
-```bash |
|
46 |
- $ ~/clamav/bin/freshclam |
|
47 |
- $ ~/clamav/bin/clamscan ~ |
|
48 |
-``` |
|
49 |
- |
|
50 |
-The `--disable-clamav` switch disables the check for existence of the *clamav* user and group but `clamscan` would still require an unprivileged account to work in a superuser mode. |
|
51 |
- |
|
52 |
-## Adding new system user and group |
|
53 |
- |
|
54 |
-If you are installing ClamAV for the first time, you have to add a new user and group to your system: |
|
55 |
- |
|
56 |
-```bash |
|
57 |
- # groupadd clamav |
|
58 |
- # useradd -g clamav -s /bin/false -c "Clam AntiVirus" clamav |
|
59 |
-``` |
|
60 |
- |
|
61 |
-Consult a system manual if your OS does not have the *groupadd* and *useradd* utilities. **Don’t forget to lock access to the account\!** |
|
62 |
- |
|
63 |
-## Compilation of base package |
|
64 |
- |
|
65 |
-Once you have created the clamav user and group, please extract the archive: |
|
66 |
- |
|
67 |
-```bash |
|
68 |
- $ zcat clamav-x.yz.tar.gz | tar xvf - |
|
69 |
- $ cd clamav-x.yz |
|
70 |
-``` |
|
71 |
- |
|
72 |
-Assuming you want to install the configuration files in /etc, configure and build the software as follows: |
|
73 |
- |
|
74 |
-```bash |
|
75 |
- $ ./configure --sysconfdir=/etc |
|
76 |
- $ make |
|
77 |
- $ su -c "make install" |
|
78 |
-``` |
|
79 |
- |
|
80 |
-In the last step the software is installed into the /usr/local directory and the config files into /etc. **WARNING: Never enable the SUID or SGID bits for Clam AntiVirus binaries.** |
|
81 |
- |
|
82 |
-## Compilation with clamav-milter enabled |
|
83 |
- |
|
84 |
-libmilter and its development files are required. To enable clamav-milter, configure ClamAV with |
|
85 |
- |
|
86 |
-```bash |
|
87 |
- $ ./configure --enable-milter |
|
88 |
-``` |
|
89 |
- |
|
90 |
-See the [clamav-milter](Configuration.md#clamav-milter) section for more details on clamav-milter. |
|
91 |
- |
|
92 |
-## Using the system LLVM |
|
93 |
- |
|
94 |
-Some problems have been reported when compiling ClamAV’s built-in LLVM with recent C++ compiler releases. These problems may be avoided by installing and using an external LLVM system library. To configure ClamAV to use LLVM that is installed as a system library instead of the built-in LLVM JIT, use the following: |
|
95 |
- |
|
96 |
-```bash |
|
97 |
- $ ./configure --with-system-llvm=/myllvm/bin/llvm-config |
|
98 |
- $ make |
|
99 |
- $ sudo make install |
|
100 |
-``` |
|
101 |
- |
|
102 |
-The argument to `--with-system-llvm` is optional, indicating the path name of the LLVM configuration utility (llvm-config). With no argument to `--with-system-llvm`, `./configure` will search for LLVM in /usr/local/ and then /usr. |
|
103 |
- |
|
104 |
-Recommended versions of LLVM are 3.2, 3.3, 3.4, 3.5, and 3.6. Some installations have reported problems using earlier LLVM versions. Versions of LLVM beyond 3.6 are not currently supported in ClamAV. |
|
105 |
- |
|
106 |
-## Running unit tests |
|
107 |
- |
|
108 |
-ClamAV includes unit tests that allow you to test that the compiled binaries work correctly on your platform. |
|
109 |
- |
|
110 |
-The first step is to use your OS’s package manager to install the `check` package. If your OS doesn’t have that package, you can download it from <http://check.sourceforge.net/>, build it and install it. |
|
111 |
- |
|
112 |
-To help clamav’s configure script locate `check`, it is recommended that you install `pkg-config`, preferably using your OS’s package manager, or from <http://pkg-config.freedesktop.org>. |
|
113 |
- |
|
114 |
-The recommended way to run unit-tests is the following, which ensures you will get an error if unit tests cannot be built: \[6\] |
|
115 |
- |
|
116 |
-```bash |
|
117 |
- $ ./configure --enable-check |
|
118 |
- $ make |
|
119 |
- $ make check |
|
120 |
-``` |
|
121 |
- |
|
122 |
-When `make check` is finished, you should get a message similar to this: |
|
123 |
- |
|
124 |
-```bash |
|
125 |
-================== |
|
126 |
-All 8 tests passed |
|
127 |
-================== |
|
128 |
-``` |
|
129 |
- |
|
130 |
-If a unit test fails, you get a message similar to the following. Note that in older versions, `make check` may report failures due to the absence of optional packages. Please make sure you have the latest versions of the components noted in the [Requirements](#Requirements) section. See the next section on how to report a bug when a unit test fails. |
|
131 |
- |
|
132 |
-```bash |
|
133 |
-======================================== |
|
134 |
-1 of 8 tests failed |
|
135 |
-Please report to https://bugzilla.clamav.net/ |
|
136 |
-======================================== |
|
137 |
-``` |
|
138 |
- |
|
139 |
-If unit tests are disabled (and you didn’t use `--enable-check`), you will get this message: |
|
140 |
- |
|
141 |
-```bash |
|
142 |
-*** Unit tests disabled in this build |
|
143 |
-*** Use ./configure --enable-check to enable them |
|
144 |
- |
|
145 |
-SKIP: check_clamav |
|
146 |
-PASS: check_clamd.sh |
|
147 |
-PASS: check_freshclam.sh |
|
148 |
-PASS: check_sigtool.sh |
|
149 |
-PASS: check_clamscan.sh |
|
150 |
-====================== |
|
151 |
-All 4 tests passed |
|
152 |
-(1 tests were not run) |
|
153 |
-====================== |
|
154 |
-``` |
|
155 |
- |
|
156 |
-Running `./configure --enable-check` should tell you why. |
|
157 |
- |
|
158 |
-## Reporting a unit test failure bug |
|
159 |
- |
|
160 |
-If `make check` says that some tests failed we encourage you to report a bug on our bugzilla: <https://bugzilla.clamav.net>. The information we need is: |
|
161 |
- |
|
162 |
-- The exact output from `make check` |
|
163 |
-- Output of `uname -mrsp` |
|
164 |
-- your `config.log` |
|
165 |
-- The following files from the `unit_tests/` directory: |
|
166 |
- - `test.log` |
|
167 |
- - `clamscan.log` |
|
168 |
- - `clamdscan.log` |
|
169 |
- |
|
170 |
-- `/tmp/clamd-test.log` if it exists |
|
171 |
-- where and how you installed the check package |
|
172 |
-- Output of `pkg-config check --cflags --libs` |
|
173 |
-- Optionally if `valgrind` is available on your platform, the output of the following: |
|
174 |
- ```bash |
|
175 |
- $ make check |
|
176 |
- $ CK_FORK=no ./libtool --mode=execute valgrind unit_tests/check_clamav |
|
177 |
- ``` |
|
178 |
- |
|
179 |
-## Obtain Latest ClamAV anti-virus signature databases |
|
180 |
- |
|
181 |
-Before you can run ClamAV in daemon mode (clamd), ’clamdscan’, or ’clamscan’ which is ClamAV’s command line virus scanner, you must have ClamAV Virus Database (.cvd) file(s) installed in the appropriate location on your system. The default location for these database files is /usr/local/share/clamav (in Linux/Unix). |
|
182 |
- |
|
183 |
-Here is a listing of currently available ClamAV Virus Database Files: |
|
184 |
- |
|
185 |
-- bytecode.cvd (signatures to detect bytecode in files) |
|
186 |
-- main.cvd (main ClamAV virus database file) |
|
187 |
-- daily.cvd (daily update file for ClamAV virus databases) |
|
188 |
-- safebrowsing.cvd (virus signatures for safe browsing) |
|
189 |
- |
|
190 |
-These files can be downloaded via HTTP from the main ClamAV website or via the ’freshclam’ utility on a periodic basis. Using ’freshclam’ is the preferred method of keeping the ClamAV virus database files up to date without manual intervention (see the [freshclam configuration](Configuration.md#Setting-up-auto\-updating) section for information on how to configure ’freshclam’ for automatic updating and the main [freshclam](Usage.md#freshclam) section for additional details on freshclam). |
191 | 1 |
deleted file mode 100644 |
... | ... |
@@ -1,85 +0,0 @@ |
1 |
-# Introduction |
|
2 |
- |
|
3 |
-Clam AntiVirus is an open source (GPL) anti-virus toolkit for UNIX, designed especially for e-mail scanning on mail gateways. It provides a number of utilities including a flexible and scalable multi-threaded daemon, a command line scanner and an advanced tool for automatic database updates. The core of the package is an anti-virus engine available in the form of a shared library. |
|
4 |
- |
|
5 |
-## Features |
|
6 |
- |
|
7 |
-- Licensed under the GNU General Public License, Version 2 |
|
8 |
-- POSIX compliant, portable |
|
9 |
-- Fast scanning |
|
10 |
-- Supports on-access scanning (Linux only) |
|
11 |
-- Detects over 1 million viruses, worms and trojans, including Microsoft Office macro viruses, mobile malware, and other threats |
|
12 |
-- Built-in bytecode interpreter allows the ClamAV signature writers to create and distribute very complex detection routines and remotely enhance the scanner’s functionality |
|
13 |
-- Scans within archives and compressed files (also protects against archive bombs), built-in support includes: |
|
14 |
- - Zip (including SFX) |
|
15 |
- - RAR (including SFX) |
|
16 |
- - 7Zip |
|
17 |
- - ARJ (including SFX) |
|
18 |
- - Tar |
|
19 |
- - CPIO |
|
20 |
- - Gzip |
|
21 |
- - Bzip2 |
|
22 |
- - DMG |
|
23 |
- - IMG |
|
24 |
- - ISO 9660 |
|
25 |
- - PKG |
|
26 |
- - HFS+ partition |
|
27 |
- - HFSX partition |
|
28 |
- - APM disk image |
|
29 |
- - GPT disk image |
|
30 |
- - MBR disk image |
|
31 |
- - XAR |
|
32 |
- - XZ |
|
33 |
- - MS OLE2 |
|
34 |
- - MS Cabinet Files (including SFX) |
|
35 |
- - MS CHM (Compiled HTML) |
|
36 |
- - MS SZDD compression format |
|
37 |
- - BinHex |
|
38 |
- - SIS (SymbianOS packages) |
|
39 |
- - AutoIt |
|
40 |
- - InstallShield |
|
41 |
-- Supports Portable Executable (32/64-bit) files compressed or obfuscated with: |
|
42 |
- - AsPack |
|
43 |
- - UPX |
|
44 |
- - FSG |
|
45 |
- - Petite |
|
46 |
- - PeSpin |
|
47 |
- - NsPack |
|
48 |
- - wwpack32 |
|
49 |
- - MEW |
|
50 |
- - Upack |
|
51 |
- - Y0da Cryptor |
|
52 |
-- Supports ELF and Mach-O files (both 32- and 64-bit) |
|
53 |
-- Supports almost all mail file formats |
|
54 |
-- Support for other special files/formats includes: |
|
55 |
- - HTML |
|
56 |
- - RTF |
|
57 |
|
|
58 |
- - Files encrypted with CryptFF and ScrEnc |
|
59 |
- - uuencode |
|
60 |
- - TNEF (winmail.dat) |
|
61 |
-- Advanced database updater with support for scripted updates, digital signatures and DNS based database version queries |
|
62 |
- |
|
63 |
-## Mailing lists and IRC channel |
|
64 |
- |
|
65 |
-If you have trouble installing or using ClamAV try asking on our mailing lists. There are four lists available: |
|
66 |
- |
|
67 |
-- **clamav-announce\*lists.clamav.net** - info about new versions, moderated\[1\]. |
|
68 |
-- **clamav-users\*lists.clamav.net** - user questions |
|
69 |
-- **clamav-devel\*lists.clamav.net** - technical discussions |
|
70 |
-- **clamav-virusdb\*lists.clamav.net** - database update announcements, moderated |
|
71 |
- |
|
72 |
-You can subscribe and search the mailing list archives at: <https://www.clamav.net/contact.html#ml> |
|
73 |
- |
|
74 |
-Alternatively you can try asking on the `#clamav` IRC channel - launch your favourite irc client and type: |
|
75 |
- |
|
76 |
-```bash |
|
77 |
- /server irc.freenode.net |
|
78 |
- /join #clamav |
|
79 |
-``` |
|
80 |
- |
|
81 |
-## Virus submitting |
|
82 |
- |
|
83 |
-If you have got a virus which is not detected by your ClamAV with the latest databases, please submit the sample at our website: |
|
84 |
- |
|
85 |
-<https://www.clamav.net/reports/malware> |
|
86 | 1 |
\ No newline at end of file |
87 | 2 |
deleted file mode 100644 |
... | ... |
@@ -1,681 +0,0 @@ |
1 |
-# PhishSigs |
|
2 |
- |
|
3 |
-- [PhishSigs](#phishsigs) |
|
4 |
-- [Database file format](#database-file-format) |
|
5 |
- - [PDB format](#pdb-format) |
|
6 |
- - [GDB format](#gdb-format) |
|
7 |
- - [WDB format](#wdb-format) |
|
8 |
- - [Hints](#hints) |
|
9 |
- - [Examples of PDB signatures](#examples-of-pdb-signatures) |
|
10 |
- - [Examples of WDB signatures](#examples-of-wdb-signatures) |
|
11 |
- - [Example for how the URL extractor works](#example-for-how-the-url-extractor-works) |
|
12 |
- - [How matching works](#how-matching-works) |
|
13 |
- - [RealURL, displayedURL concatenation](#realurl-displayedurl-concatenation) |
|
14 |
- - [What happens when a match is found](#what-happens-when-a-match-is-found) |
|
15 |
- - [Extraction of realURL, displayedURL from HTML tags](#extraction-of-realurl-displayedurl-from-html-tags) |
|
16 |
- - [Example](#example) |
|
17 |
- - [Simple patterns](#simple-patterns) |
|
18 |
- - [Regular expressions](#regular-expressions) |
|
19 |
- - [Flags](#flags) |
|
20 |
-- [Introduction to regular expressions](#introduction-to-regular-expressions) |
|
21 |
- - [Special characters](#special-characters) |
|
22 |
- - [Character classes](#character-classes) |
|
23 |
- - [Escaping](#escaping) |
|
24 |
- - [Alternation](#alternation) |
|
25 |
- - [Optional matching, and repetition](#optional-matching-and-repetition) |
|
26 |
- - [Groups](#groups) |
|
27 |
-- [How to create database files](#how-to-create-database-files) |
|
28 |
- - [How to create and maintain the whitelist (daily.wdb)](#how-to-create-and-maintain-the-whitelist-dailywdb) |
|
29 |
- - [How to create and maintain the domainlist (daily.pdb)](#how-to-create-and-maintain-the-domainlist-dailypdb) |
|
30 |
- - [Dealing with false positives, and undetected phishing mails](#dealing-with-false-positives-and-undetected-phishing-mails) |
|
31 |
- - [False positives](#false-positives) |
|
32 |
- - [Undetected phish mails](#undetected-phish-mails) |
|
33 |
- |
|
34 |
-# Database file format |
|
35 |
- |
|
36 |
-## PDB format |
|
37 |
- |
|
38 |
-This file contains urls/hosts that are the target of phishing attempts. It |
|
39 |
-contains lines in the following format: |
|
40 |
- |
|
41 |
-``` |
|
42 |
- R[Filter]:RealURL:DisplayedURL[:FuncLevelSpec] |
|
43 |
- H[Filter]:DisplayedHostname[:FuncLevelSpec] |
|
44 |
-``` |
|
45 |
- |
|
46 |
-- `R` |
|
47 |
- |
|
48 |
- regular expression, for the concatenated URL |
|
49 |
- |
|
50 |
-- `H` |
|
51 |
- |
|
52 |
- matches the `DisplayedHostname` as a simple pattern (literally, no regular expression) |
|
53 |
- |
|
54 |
- - the pattern can match either the full hostname |
|
55 |
- |
|
56 |
- - or a subdomain of the specified hostname |
|
57 |
- |
|
58 |
- - to avoid false matches in case of subdomain matches, the engine checks that there is a dot(`.`) or a space(` `) before the matched portion |
|
59 |
- |
|
60 |
-- `Filter` |
|
61 |
- |
|
62 |
- is ignored for R and H for compatibility reasons |
|
63 |
- |
|
64 |
-- `RealURL` |
|
65 |
- |
|
66 |
- is the URL the user is sent to, example: *href* attribute of an html anchor (*\<a\> tag*) |
|
67 |
- |
|
68 |
-- `DisplayedURL` |
|
69 |
- |
|
70 |
-  is the URL description displayed to the user, where it is *claimed* they are sent, example: contents of an html anchor (*\<a\> tag*) |
|
71 |
- |
|
72 |
-- `DisplayedHostname` |
|
73 |
- |
|
74 |
- is the hostname portion of the DisplayedURL |
|
75 |
- |
|
76 |
-- `FuncLevelSpec` |
|
77 |
- |
|
78 |
- an (optional) functionality level, 2 formats are possible: |
|
79 |
- |
|
80 |
- - `minlevel` all engines having functionality level \>= `minlevel` will load this line |
|
81 |
- |
|
82 |
-  - `minlevel-maxlevel` engines with functionality level \>= `minlevel`, and \< `maxlevel` will load this line |
|
83 |
- |
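The `H` matching rule described in the list above (match the full hostname, or a subdomain only when a dot or a space precedes the matched portion) can be sketched as follows. This is a minimal illustration, not the engine's actual code: the function name `host_pattern_matches` is hypothetical, and it assumes the pattern is compared against the end of the displayed hostname.

```python
def host_pattern_matches(pattern: str, displayed_hostname: str) -> bool:
    """Sketch of the H rule: literal match on the hostname or a subdomain of it."""
    # Assumption: the pattern must line up with the end of the hostname.
    if not displayed_hostname.endswith(pattern):
        return False
    # Full-hostname match.
    if len(displayed_hostname) == len(pattern):
        return True
    # Subdomain match: the character just before the matched portion
    # must be a dot or a space, to avoid false matches.
    return displayed_hostname[-len(pattern) - 1] in ". "
```

With this rule, `www.amazon.com` matches the pattern `amazon.com`, while `evilamazon.com` does not, because the character before the matched portion is neither a dot nor a space.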
|
84 |
-## GDB format |
|
85 |
- |
|
86 |
-This file contains URL hashes in the following format: |
|
87 |
- |
|
88 |
- S:P:HostPrefix[:FuncLevelSpec] |
|
89 |
- S:F:Sha256hash[:FuncLevelSpec] |
|
90 |
- S1:P:HostPrefix[:FuncLevelSpec] |
|
91 |
- S1:F:Sha256hash[:FuncLevelSpec] |
|
92 |
- S2:P:HostPrefix[:FuncLevelSpec] |
|
93 |
- S2:F:Sha256hash[:FuncLevelSpec] |
|
94 |
- S:W:Sha256hash[:FuncLevelSpec] |
|
95 |
- |
|
96 |
-- `S:` |
|
97 |
- |
|
98 |
- These are hashes for Google Safe Browsing - malware sites, and should not be used for other purposes. |
|
99 |
- |
|
100 |
-- `S2:` |
|
101 |
- |
|
102 |
- These are hashes for Google Safe Browsing - phishing sites, and should not be used for other purposes. |
|
103 |
- |
|
104 |
-- `S1:` |
|
105 |
- |
|
106 |
- Hashes for blacklisting phishing sites. Virus name: Phishing.URL.Blacklisted |
|
107 |
- |
|
108 |
-- `S:W:` |
|
109 |
- |
|
110 |
- Locally whitelisted hashes. |
|
111 |
- |
|
112 |
-- `HostPrefix` |
|
113 |
- |
|
114 |
- 4-byte prefix of the sha256 hash of the last 2 or 3 components of the hostname. If prefix doesn’t match, no further lookups are performed. |
|
115 |
- |
|
116 |
-- `Sha256hash` |
|
117 |
- |
|
118 |
-  sha256 hash of the canonicalized URL, or a sha256 hash of its prefix/suffix according to the Google Safe Browsing “Performing Lookups” rules. There should be a corresponding `:P:HostPrefix` entry for the hash to be taken into consideration. |
|
119 |
- |
|
120 |
-To see which hash/URL matched, look at the `clamscan --debug` output, and look for the following strings: `Looking up hash`, `prefix matched`, and `Hash matched`. Local whitelisting of .gdb entries can be done by creating a local.gdb file, and adding a line `S:W:<HASH>`. |
|
121 |
- |
|
122 |
-## WDB format |
|
123 |
- |
|
124 |
-This file contains whitelisted url pairs. It contains lines in the following format: |
|
125 |
- |
|
126 |
-``` |
|
127 |
- X:RealURL:DisplayedURL[:FuncLevelSpec] |
|
128 |
- M:RealHostname:DisplayedHostname[:FuncLevelSpec] |
|
129 |
-``` |
|
130 |
- |
|
131 |
-- `X` |
|
132 |
- |
|
133 |
- regular expression, for the *entire URL*, not just the hostname |
|
134 |
- |
|
135 |
- - The regular expression is by default anchored to start-of-line and end-of-line, as if you have used `^RegularExpression$` |
|
136 |
- |
|
137 |
- - A trailing `/` is automatically added both to the regex, and the input string to avoid false matches |
|
138 |
- |
|
139 |
- - The regular expression matches the *concatenation* of the RealURL, a colon(`:`), and the DisplayedURL as a single string. It doesn’t separately match RealURL and DisplayedURL\! |
|
140 |
- |
|
141 |
-- `M` |
|
142 |
- |
|
143 |
- matches hostname, or subdomain of it, see notes for H above |
|
144 |
- |
|
145 |
-## Hints |
|
146 |
- |
|
147 |
-- empty lines are ignored |
|
148 |
- |
|
149 |
-- the colons are mandatory |
|
150 |
- |
|
151 |
-- Don’t leave extra spaces on the end of a line\! |
|
152 |
- |
|
153 |
-- if any of the lines don’t conform to this format, clamav will abort with a Malformed Database Error |
|
154 |
- |
|
155 |
-- see section [Extraction-of-realURL](#Extraction-of-realURL,-displayedURL-from-HTML-tags) for more details on realURL/displayedURL |
|
156 |
- |
|
157 |
-## Examples of PDB signatures |
|
158 |
- |
|
159 |
-To check for phishing mails that target amazon.com, or subdomains of |
|
160 |
-amazon.com: |
|
161 |
- |
|
162 |
-``` |
|
163 |
- H:amazon.com |
|
164 |
-``` |
|
165 |
- |
|
166 |
-To do the same, but for amazon.co.uk: |
|
167 |
- |
|
168 |
-``` |
|
169 |
- H:amazon.co.uk |
|
170 |
-``` |
|
171 |
- |
|
172 |
-To limit the signatures to certain engine versions: |
|
173 |
- |
|
174 |
-``` |
|
175 |
- H:amazon.co.uk:20-30 |
|
176 |
- H:amazon.co.uk:20- |
|
177 |
- H:amazon.co.uk:0-20 |
|
178 |
-``` |
|
179 |
- |
|
180 |
-First line: engine versions 20, 21, ..., 29 can load it |
|
181 |
- |
|
182 |
-Second line: engine versions \>= 20 can load it |
|
183 |
- |
|
184 |
-Third line: engine versions \< 20 can load it |
|
185 |
- |
|
186 |
-In a real situation, you’d probably use the second form. A situation like that would be if you are using a feature of the signatures not available in earlier versions, or if earlier versions have bugs with your signature. Neither is the case here; the above examples are for illustrative purposes only. |
|
187 |
- |
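The `FuncLevelSpec` ranges used in the examples above can be checked with a short Python sketch. It follows the semantics stated in this document (inclusive `minlevel`, exclusive `maxlevel`, open-ended when `maxlevel` is omitted); the function name `level_allows` is hypothetical and this is not the engine's own parser.

```python
def level_allows(spec: str, engine_level: int) -> bool:
    """Return True if an engine at engine_level would load a line with this spec."""
    if "-" not in spec:
        # Plain "minlevel": every engine at or above it loads the line.
        return engine_level >= int(spec)
    low, high = spec.split("-", 1)
    if engine_level < int(low):
        return False
    # "minlevel-" has no upper bound; otherwise maxlevel is exclusive.
    return high == "" or engine_level < int(high)
```

For instance, `level_allows("20-30", 29)` is true and `level_allows("20-30", 30)` is false, which mirrors the "20, 21, ..., 29" reading of the first example line.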
|
188 |
-## Examples of WDB signatures |
|
189 |
- |
|
190 |
-To allow amazon’s country specific domains and amazon.com, to mix domain names in DisplayedURL, and RealURL: |
|
191 |
- |
|
192 |
- X:.+\.amazon\.(at|ca|co\.uk|co\.jp|de|fr)([/?].*)?:.+\.amazon\.com([/?].*)?:17- |
|
193 |
- |
|
194 |
-Explanation of this signature: |
|
195 |
- |
|
196 |
-- `X:` |
|
197 |
- |
|
198 |
- this is a regular expression |
|
199 |
- |
|
200 |
-- `:17-` |
|
201 |
- |
|
202 |
- load signature only for engines with functionality level \>= 17 (recommended for type X) |
|
203 |
- |
|
204 |
-The regular expression is the following (X:, :17- stripped, and a / appended) |
|
205 |
- |
|
206 |
-``` |
|
207 |
- .+\.amazon\.(at|ca|co\.uk|co\.jp|de|fr)([/?].*)?:.+\.amazon\.com([/?].*)?/ |
|
208 |
-``` |
|
209 |
- |
|
210 |
-Explanation of this regular expression (note that it is a single regular expression, and not 2 regular expressions split at the :). |
|
211 |
- |
|
212 |
-- `.+` |
|
213 |
- |
|
214 |
- any subdomain of |
|
215 |
- |
|
216 |
-- `\.amazon\.` |
|
217 |
- |
|
218 |
- domain we are whitelisting (RealURL part) |
|
219 |
- |
|
220 |
-- `(at|ca|co\.uk|co\.jp|de|fr)` |
|
221 |
- |
|
222 |
- country-domains: at, ca, co.uk, co.jp, de, fr |
|
223 |
- |
|
224 |
-- `([/?].*)?` |
|
225 |
- |
|
226 |
- recommended way to end the RealURL part of a whitelist entry; this protects against embedded URLs (evilurl.example.com/amazon.co.uk/) |
|
227 |
- |
|
228 |
-- `:` |
|
229 |
- |
|
230 |
- RealURL and DisplayedURL are concatenated via a :, so match a literal : here |
|
231 |
- |
|
232 |
-- `.+` |
|
233 |
- |
|
234 |
- any subdomain of |
|
235 |
- |
|
236 |
-- `\.amazon\.com` |
|
237 |
- |
|
238 |
- whitelisted DisplayedURL |
|
239 |
- |
|
240 |
-- `([/?].*)?` |
|
241 |
- |
|
242 |
- recommended way to end displayed url part, to protect against embedded URLs |
|
243 |
- |
|
244 |
-- `/` |
|
245 |
- |
|
246 |
- automatically added to further protect against embedded URLs |
|
247 |
- |
|
248 |
-When you whitelist an entry, make sure that both domains are owned by the same entity. What this whitelist entry allows is: links that claim to point to amazon.com (DisplayedURL) but really go to a country-specific Amazon domain (RealURL). |
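As a rough check of the expression explained above, the following Python sketch matches it against a concatenated `RealURL:DisplayedURL` string. It is an approximation: Python's `re` module stands in for the POSIX regex core, and the sketch assumes that the candidate string also ends in `/`, to pair with the `/` appended to the pattern. `is_whitelisted` is a hypothetical helper name.

```python
import re

# The signature body with "X:" and ":17-" stripped and "/" appended, as shown above.
WDB_REGEX = r".+\.amazon\.(at|ca|co\.uk|co\.jp|de|fr)([/?].*)?:.+\.amazon\.com([/?].*)?/"


def is_whitelisted(real_url: str, displayed_url: str) -> bool:
    # Assumption: the engine compares "RealURL:DisplayedURL/" against the whole pattern.
    candidate = f"{real_url}:{displayed_url}/"
    return re.fullmatch(WDB_REGEX, candidate) is not None


print(is_whitelisted("www.amazon.de", "www.amazon.com"))
print(is_whitelisted("evilurl.example.com/amazon.co.uk/", "www.amazon.com"))
```

The second call uses the embedded-URL example from the explanation; it is rejected because `amazon` is preceded by `/` rather than `.`.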
|
249 |
- |
|
250 |
-## Example for how the URL extractor works |
|
251 |
- |
|
252 |
-Consider the following HTML file: |
|
253 |
- |
|
254 |
-```html |
|
255 |
- <html> |
|
256 |
- <a href="http://1.realurl.example.com/"> |
|
257 |
- 1.displayedurl.example.com |
|
258 |
- </a> |
|
259 |
- <a href="http://2.realurl.example.com"> |
|
260 |
- 2 d<b>i<p>splayedurl.e</b>xa<i>mple.com |
|
261 |
- </a> |
|
262 |
- <a href="http://3.realurl.example.com"> |
|
263 |
- 3.nested.example.com |
|
264 |
- <a href="http://4.realurl.example.com"> |
|
265 |
- 4.displayedurl.example.com |
|
266 |
- </a> |
|
267 |
- </a> |
|
268 |
- <form action="http://5.realurl.example.com"> |
|
269 |
- sometext |
|
270 |
- <img src="http://5.displayedurl.example.com/img0.gif"/> |
|
271 |
- <a href="http://5.form.nested.displayedurl.example.com"> |
|
272 |
- 5.form.nested.link-displayedurl.example.com |
|
273 |
- </a> |
|
274 |
- </form> |
|
275 |
- <a href="http://6.realurl.example.com"> |
|
276 |
- 6.displ |
|
277 |
- <img src="6.displayedurl.example.com/img1.gif"/> |
|
278 |
- ayedurl.example.com |
|
279 |
- </a> |
|
280 |
- <a href="http://7.realurl.example.com"> |
|
281 |
- <iframe src="http://7.displayedurl.example.com"> |
|
282 |
- </a> |
|
283 |
-``` |
|
284 |
- |
|
285 |
-The phishing engine extracts the following |
|
286 |
-RealURL/DisplayedURL pairs from it: |
|
287 |
- |
|
288 |
-``` |
|
289 |
- http://1.realurl.example.com/ |
|
290 |
- 1.displayedurl.example.com |
|
291 |
- |
|
292 |
- http://2.realurl.example.com |
|
293 |
- 2displayedurl.example.com |
|
294 |
- |
|
295 |
- http://3.realurl.example.com |
|
296 |
- 3.nested.example.com |
|
297 |
- |
|
298 |
- http://4.realurl.example.com |
|
299 |
- 4.displayedurl.example.com |
|
300 |
- |
|
301 |
- http://5.realurl.example.com |
|
302 |
- http://5.displayedurl.example.com/img0.gif |
|
303 |
- |
|
304 |
- http://5.realurl.example.com |
|
305 |
- http://5.form.nested.displayedurl.example.com |
|
306 |
- |
|
307 |
- http://5.form.nested.displayedurl.example.com |
|
308 |
- 5.form.nested.link-displayedurl.example.com |
|
309 |
- |
|
310 |
- http://6.realurl.example.com |
|
311 |
- 6.displayedurl.example.com |
|
312 |
- |
|
313 |
- http://6.realurl.example.com |
|
314 |
- 6.displayedurl.example.com/img1.gif |
|
315 |
-``` |
|
316 |
- |
|
317 |
-## How matching works |
|
318 |
- |
|
319 |
-### RealURL, displayedURL concatenation |
|
320 |
- |
|
321 |
-The phishing detection module processes pairs of RealURL/DisplayedURL. Matching against daily.wdb is done as follows: the RealURL is concatenated with a `:` and with the DisplayedURL, and that *line* is then matched against the lines in daily.wdb/daily.pdb. |
|
322 |
- |
|
323 |
-So if you have this line in daily.wdb: |
|
324 |
- |
|
325 |
- M:www.google.ro:www.google.com |
|
326 |
- |
|
327 |
-and this href: `<a href='http://www.google.ro'>www.google.com</a>` then it will be whitelisted, but: `<a href='http://images.google.com'>www.google.com</a>` will not. |
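The concatenation rule can be illustrated with a short Python sketch. It assumes that only the hostnames are compared (as in the `M:` line above) and that `M:` entries are literal; `host_of` and `matches_literal_entry` are hypothetical names used for illustration only.

```python
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    # urlsplit only fills in .hostname when a "//" authority marker is present.
    if "//" not in url:
        url = "//" + url
    return urlsplit(url).hostname or ""


def matches_literal_entry(wdb_lines, real_url: str, displayed_url: str) -> bool:
    # Build "RealURL:DisplayedURL" and look it up among the literal M: entries.
    candidate = f"{host_of(real_url)}:{host_of(displayed_url)}"
    literals = {line[2:] for line in wdb_lines if line.startswith("M:")}
    return candidate in literals


wdb = ["M:www.google.ro:www.google.com"]
print(matches_literal_entry(wdb, "http://www.google.ro", "www.google.com"))
print(matches_literal_entry(wdb, "http://images.google.com", "www.google.com"))
```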
|
328 |
- |
|
329 |
-### What happens when a match is found |
|
330 |
- |
|
331 |
-In the case of the whitelist, a match means that the RealURL/DisplayedURL combination is considered clean, and no further checks are performed on it. |
|
332 |
- |
|
333 |
-In the case of the domainlist, a match means that the RealURL/displayedURL is going to be checked for phishing attempts. |
|
334 |
- |
|
335 |
-Furthermore you can restrict what checks are to be performed by specifying the 3-digit hexnumber. |
|
336 |
- |
|
337 |
-### Extraction of realURL, displayedURL from HTML tags |
|
338 |
- |
|
339 |
-The html parser extracts pairs of realURL/displayedURL based on the following rules. |
|
340 |
- |
|
341 |
-In version 0.93: After URLs have been extracted, they are normalized, and cut after the hostname. `http://test.example.com/path/somecgi?queryparameters` becomes `http://test.example.com/` |
|
342 |
- |
|
343 |
-- `a` |
|
344 |
- |
|
345 |
- (anchor) the *href* is the realURL, its *contents* is the displayedURL |
|
346 |
- |
|
347 |
- - contents |
|
348 |
- is the tag-stripped contents of the \<a\> tags, so for example \<b\> tags are stripped (but not their contents) |
|
349 |
- |
|
350 |
- nesting another \<a\> tag within an \<a\> tag (besides being invalid HTML) is treated as a \</a\>\<a.. |
|
351 |
- |
|
352 |
-- `form` |
|
353 |
- |
|
354 |
- the *action* attribute is the realURL, and a nested \<a\> tag is the displayedURL |
|
355 |
- |
|
356 |
-- `img/area` |
|
357 |
- |
|
358 |
- if nested within an *\<a\>* tag, the realURL is the *href* of the a tag, and the *src/dynsrc/area* is the displayedURL of the img |
|
359 |
- |
|
360 |
- if nested within a *form* tag, then the action attribute of the *form* tag is the realURL |
|
361 |
- |
|
362 |
-- `iframe` |
|
363 |
- |
|
364 |
- if nested within an *\<a\>* tag, the *src* attribute is the displayedURL, and the *href* of its parent *a* tag is the realURL |
|
365 |
- |
|
366 |
- if nested within a *form* tag, then the action attribute of the *form* tag is the realURL |
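The `a` rule above (href as realURL, tag-stripped contents as displayedURL, nested anchors treated as `</a><a ...>`) can be sketched with Python's standard `html.parser`. This is a simplified illustration, not the ClamAV HTML parser: it ignores the `form`, `img/area` and `iframe` rules, and the whitespace handling is an assumption.

```python
from html.parser import HTMLParser


class AnchorPairs(HTMLParser):
    """Collect (realURL, displayedURL) pairs from <a> tags only."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._href = None
        self._text = []

    def _flush(self):
        if self._href is not None:
            # Assumption: whitespace runs are collapsed to single spaces.
            displayed = " ".join("".join(self._text).split())
            self.pairs.append((self._href, displayed))
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._flush()  # a nested <a> behaves like </a><a ...>
            self._href = dict(attrs).get("href")

    def handle_endtag(self, tag):
        if tag == "a":
            self._flush()

    def handle_data(self, data):
        # Other tags (<b>, <i>, ...) are dropped; only their text is kept.
        if self._href is not None:
            self._text.append(data)


def extract_pairs(html: str):
    parser = AnchorPairs()
    parser.feed(html)
    parser.close()
    parser._flush()  # emit an anchor that was never closed
    return parser.pairs


print(extract_pairs('<a href="evilurl">www.<b>paypal</b>.com</a>'))
```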
|
367 |
- |
|
368 |
-### Example |
|
369 |
- |
|
370 |
-Consider this html file: |
|
371 |
- |
|
372 |
-```html |
|
373 |
-<a href="evilurl">www.paypal.com</a> |
|
374 |
- |
|
375 |
-<a href="evilurl2" title="www.ebay.com">click here to sign |
|
376 |
-in</a> |
|
377 |
- |
|
378 |
-<form action="evilurl_form"> |
|
379 |
- |
|
380 |
-Please sign in to <a href="cgi.ebay.com">Ebay</a> using this |
|
381 |
-form |
|
382 |
- |
|
383 |
-<input type='text' name='username'>Username</input> |
|
384 |
- |
|
385 |
-.... |
|
386 |
- |
|
387 |
-</form> |
|
388 |
- |
|
389 |
-<a href="evilurl"><img src="images.paypal.com/secure.jpg"></a> |
|
390 |
-``` |
|
391 |
- |
|
392 |
-The resulting realURL/displayedURL pairs will be (note that one tag can generate multiple pairs): |
|
393 |
- |
|
394 |
-- evilurl / www.paypal.com |
|
395 |
- |
|
396 |
-- evilurl2 / click here to sign in |
|
397 |
- |
|
398 |
-- evilurl2 / www.ebay.com |
|
399 |
- |
|
400 |
-- evilurl_form / cgi.ebay.com |
|
401 |
- |
|
402 |
-- cgi.ebay.com / Ebay |
|
403 |
- |
|
404 |
-- evilurl / images.paypal.com/secure.jpg |
|
405 |
- |
|
406 |
-## Simple patterns |
|
407 |
- |
|
408 |
-Simple patterns are matched literally, i.e. if you say: |
|
409 |
- |
|
410 |
-``` |
|
411 |
-www.google.com |
|
412 |
-``` |
|
413 |
- |
|
414 |
-it is going to match *www.google.com*, and only that. The `.` (dot) character has no special meaning here (see the section on [regular expressions](#sec:Regular-expressions) for how the dot behaves there). |
|
415 |
- |
|
416 |
-## Regular expressions |
|
417 |
- |
|
418 |
-POSIX regular expressions are supported. Internally, each expression is treated as if it were wrapped in *^* and *$*; in other words, the regular expression has to match the entire concatenated URL (see section [RealURL,-displayedURL-concatenation](#RealURL,-displayedURL-concatenation) for details on concatenation). |
|
419 |
- |
|
420 |
-It is recommended that you read section [Introduction-to-regular](#Introduction-to-regular) to learn how to write regular expressions, and then come back and read this for hints. |
|
421 |
- |
|
422 |
-Be advised that ClamAV contains an internal, very basic regex matcher to reduce the load on the regex matching core. It is therefore recommended that you avoid regex syntax it does not support at the very beginning of a regex (at least in the first few characters). |
|
423 |
- |
|
424 |
-Currently the clamav regex matcher supports: |
|
425 |
- |
|
426 |
-- `.` (dot) character |
|
427 |
- |
|
428 |
-- `\` (backslash, escaping special characters) |
|
429 |
- |
|
430 |
-- `|` (pipe) alternatives |
|
431 |
- |
|
432 |
-- `[]` (character classes) |
|
433 |
- |
|
434 |
-- `()` (parenthesis for grouping, but no group extraction is performed) |
|
435 |
- |
|
436 |
-- other non-special characters |
|
437 |
- |
|
438 |
-Thus the following are not supported: |
|
439 |
- |
|
440 |
-- `+` repetition |
|
441 |
- |
|
442 |
-- `*` repetition |
|
443 |
- |
|
444 |
-- `{}` repetition |
|
445 |
- |
|
446 |
-- backreferences |
|
447 |
- |
|
448 |
-- lookaround |
|
449 |
- |
|
450 |
-- other “advanced” features not listed in the supported list ;) |
|
451 |
- |
|
452 |
-This, however, shouldn’t discourage you from using the “not directly supported” features: if the internal engine encounters unsupported syntax, it passes the expression on to the POSIX regex core, beginning from the first unsupported token (everything before that is still processed by the internal matcher). An example might make this clearer: |
|
453 |
- |
|
454 |
-`www\.google\.(com|ro|it) ([a-zA-Z])+\.google\.(com|ro|it)` |
|
455 |
- |
|
456 |
-Everything up to `([a-zA-Z])+` is processed internally; that parenthesis (and everything beyond it) is processed by the POSIX core. |
|
457 |
- |
|
458 |
-Examples of url pairs that match: |
|
459 |
- |
|
460 |
-- *www.google.ro images.google.ro* |
|
461 |
- |
|
462 |
-- www.google.com images.google.ro |
|
463 |
- |
|
464 |
-Example of url pairs that don’t match: |
|
465 |
- |
|
466 |
-- www.google.ro images1.google.ro |
|
467 |
- |
|
468 |
-- images.google.com image.google.com |
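The matching and non-matching pairs listed above can be checked against the example expression with a short Python sketch. Python's `re` module is used here as a stand-in for the POSIX regex core, and `re.fullmatch` mimics the implied `^` and `$`; the helper name `pair_matches` is hypothetical.

```python
import re

# The example expression from this section; the space separates the two URLs.
PATTERN = r"www\.google\.(com|ro|it) ([a-zA-Z])+\.google\.(com|ro|it)"


def pair_matches(pair: str) -> bool:
    # fullmatch corresponds to the implicit ^ ... $ wrapping described above.
    return re.fullmatch(PATTERN, pair) is not None


for pair in (
    "www.google.ro images.google.ro",
    "www.google.com images.google.ro",
    "www.google.ro images1.google.ro",
    "images.google.com image.google.com",
):
    print(pair, "->", pair_matches(pair))
```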
|
469 |
- |
|
470 |
-## Flags |
|
471 |
- |
|
472 |
-Flags are a binary OR of the following numbers: |
|
473 |
- |
|
474 |
-- HOST_SUFFICIENT |
|
475 |
- |
|
476 |
- 1 |
|
477 |
- |
|
478 |
-- DOMAIN_SUFFICIENT |
|
479 |
- |
|
480 |
- 2 |
|
481 |
- |
|
482 |
-- DO_REVERSE_LOOKUP |
|
483 |
- |
|
484 |
- 4 |
|
485 |
- |
|
486 |
-- CHECK_REDIR |
|
487 |
- |
|
488 |
- 8 |
|
489 |
- |
|
490 |
-- CHECK_SSL |
|
491 |
- |
|
492 |
- 16 |
|
493 |
- |
|
494 |
-- CHECK_CLOAKING |
|
495 |
- |
|
496 |
- 32 |
|
497 |
- |
|
498 |
-- CLEANUP_URL |
|
499 |
- |
|
500 |
- 64 |
|
501 |
- |
|
502 |
-- CHECK_DOMAIN_REVERSE |
|
503 |
- |
|
504 |
- 128 |
|
505 |
- |
|
506 |
-- CHECK_IMG_URL |
|
507 |
- |
|
508 |
- 256 |
|
509 |
- |
|
510 |
-- DOMAINLIST_REQUIRED |
|
511 |
- |
|
512 |
- 512 |
|
513 |
- |
|
514 |
-The names of the constants are self-explanatory. |
|
515 |
- |
|
516 |
-These constants are defined in libclamav/phishcheck.h, you can check there for the latest flags. |
|
517 |
- |
|
518 |
-There is a default set of flags that are enabled, these are currently: |
|
519 |
- |
|
520 |
- ( CLEANUP_URL | CHECK_SSL | CHECK_CLOAKING | CHECK_IMG_URL ) |
|
521 |
- |
|
522 |
-SSL checking is currently performed only for `a` tags. |
|
523 |
- |
|
524 |
-You must decide for each line in the domainlist whether you want to filter any flags (that is, you don’t want certain checks to be done), then calculate the binary OR of those constants and convert it into a 3-digit hexadecimal number. For example, you decide that `DOMAIN_SUFFICIENT` shouldn’t be used for ebay.com, and you don’t want to check images either, so you come up with this flag number: 2 | 256 = 258 (decimal) = 102 (hexadecimal). |
|
525 |
- |
|
526 |
-So you add this line to daily.wdb: |
|
527 |
- |
|
528 |
-- R102 www.ebay.com .+ |
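The flag arithmetic can be reproduced with a small Python sketch. The constant values are copied from the list in this section; the `PhishFlag` class and the `flag_field` helper are hypothetical names used only for illustration, and `libclamav/phishcheck.h` remains the authoritative source for the values.

```python
from enum import IntFlag


class PhishFlag(IntFlag):
    HOST_SUFFICIENT = 1
    DOMAIN_SUFFICIENT = 2
    DO_REVERSE_LOOKUP = 4
    CHECK_REDIR = 8
    CHECK_SSL = 16
    CHECK_CLOAKING = 32
    CLEANUP_URL = 64
    CHECK_DOMAIN_REVERSE = 128
    CHECK_IMG_URL = 256
    DOMAINLIST_REQUIRED = 512


def flag_field(*flags: PhishFlag) -> str:
    """OR the given flags together and render them as a 3-digit hex number."""
    value = 0
    for flag in flags:
        value |= flag
    return format(int(value), "03x")


# The ebay.com example from the text: 2 | 256 = 258 = 0x102
print(flag_field(PhishFlag.DOMAIN_SUFFICIENT, PhishFlag.CHECK_IMG_URL))
```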
|
529 |
- |
|
530 |
-# Introduction to regular expressions |
|
531 |
- |
|
532 |
-Recommended reading: |
|
533 |
- |
|
534 |
-- http://www.regular-expressions.info/quickstart.html |
|
535 |
- |
|
536 |
-- http://www.regular-expressions.info/tutorial.html |
|
537 |
- |
|
538 |
-- regex(7) man-page: http://www.tin.org/bin/man.cgi?section=7\&topic=regex |
|
539 |
- |
|
540 |
-## Special characters |
|
541 |
- |
|
542 |
-- \[ |
|
543 |
- |
|
544 |
- the opening square bracket - it marks the beginning of a character class, see section [Character-classes](#Character-classes) |
|
545 |
- |
|
546 |
-- \\ |
|
547 |
- |
|
548 |
- the backslash - escapes special characters, see section [Escaping](#Escaping) |
|
549 |
- |
|
550 |
-- ^ |
|
551 |
- |
|
552 |
- the caret - matches the beginning of a line (not needed in clamav regexes, this is implied) |
|
553 |
- |
|
554 |
-- $ |
|
555 |
- |
|
556 |
- the dollar sign - matches the end of a line (not needed in clamav regexes, this is implied) |
|
557 |
- |
|
558 |
-- . |
|
559 |
- |
|
560 |
- the period or dot - matches *any* character |
|
561 |
- |
|
562 |
-- | |
|
563 |
- |
|
564 |
- the vertical bar or pipe symbol - matches either the token on its left side or the token on its right side, see section [Alternation](#sub:Alternation) |
|
565 |
- |
|
566 |
-- ? |
|
567 |
- |
|
568 |
- the question mark - optionally matches the left-side token, see section [Optional-matching,-and](Optional-matching,-and) |
|
569 |
- |
|
570 |
-- \* |
|
571 |
- |
|
572 |
- the asterisk or star - matches 0 or more occurrences of the left-side token, see section [Optional-matching,-and](Optional-matching,-and) |
|
573 |
- |
|
574 |
-- + |
|
575 |
- |
|
576 |
- the plus sign - matches 1 or more occurrences of the left-side token, see section [Optional-matching,-and](Optional-matching,-and) |
|
577 |
- |
|
578 |
-- ( |
|
579 |
- |
|
580 |
- the opening round bracket - marks beginning of a group, see section [Groups](Groups) |
|
581 |
- |
|
582 |
-- ) |
|
583 |
- |
|
584 |
- the closing round bracket - marks end of a group, see section [Groups](Groups) |
|
585 |
- |
|
586 |
-## Character classes |
|
587 |
- |
|
588 |
-## Escaping |
|
589 |
- |
|
590 |
-Escaping has two purposes: |
|
591 |
- |
|
592 |
-- it allows you to actually match the special characters themselves, for example to match the literal `+`, you would write `\+` |
|
593 |
- |
|
594 |
-- it also allows you to match non-printable characters, such as the tab (`\t`), newline (`\n`), .. |
|
595 |
- |
|
596 |
-However, since non-printable characters are not valid inside a URL, you won’t have a reason to use them. |
|
597 |
- |
|
598 |
-## Alternation |
|
599 |
- |
|
600 |
-## Optional matching, and repetition |
|
601 |
- |
|
602 |
-## Groups |
|
603 |
- |
|
604 |
-Groups are usually used together with repetition, or alternation. For example: *(com|it)+* means: match 1 or more repetitions of *com* or *it,* that is it matches: com, it, comcom, comcomcom, comit, itit, ititcom,... you get the idea. |
|
605 |
- |
|
606 |
-Groups can also be used to extract substrings, but this is not supported by the clam engine, and is not needed in this case either. |
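The `(com|it)+` example can be tried out directly. The sketch below uses Python's `re` module as a stand-in for a POSIX regex implementation; for this simple pattern the two behave the same. `repeats_com_or_it` is a hypothetical helper name.

```python
import re


def repeats_com_or_it(text: str) -> bool:
    # One or more repetitions of "com" or "it", and nothing else.
    return re.fullmatch(r"(com|it)+", text) is not None


for sample in ("com", "it", "comcom", "comit", "ititcom", "co", ""):
    print(repr(sample), repeats_com_or_it(sample))
```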
|
607 |
- |
|
608 |
-# How to create database files |
|
609 |
- |
|
610 |
-## How to create and maintain the whitelist (daily.wdb) |
|
611 |
- |
|
612 |
-If the phishing code claims that a certain mail is phishing, but it’s not, you have 2 choices: |
|
613 |
- |
|
614 |
-- examine your rules in daily.pdb, and fix them if necessary (see section [How-to-create](How-to-create)) |
|
615 |
- |
|
616 |
-- add it to the whitelist (discussed here) |
|
617 |
- |
|
618 |
-Let’s assume you are having problems because of links like this in a mail: |
|
619 |
- |
|
620 |
-```html |
|
621 |
- <a href="http://69.0.241.57/bCentral/L.asp?L=XXXXXXXX"> |
|
622 |
- http://www.bcentral.it/ |
|
623 |
- </a> |
|
624 |
-``` |
|
625 |
- |
|
626 |
-After investigating those sites further, you decide they are no threat, and create a line like this in daily.wdb: |
|
627 |
- |
|
628 |
-``` |
|
629 |
-R http://www\.bcentral\.it/.+ |
|
630 |
-http://69\.0\.241\.57/bCentral/L\.asp?L=.+ |
|
631 |
-``` |
|
632 |
- |
|
633 |
-Note: urls like the above can be used to track unique mail recipients, and thus know if somebody actually reads mails (so they can send more spam). However since this site required no authentication information, it is safe from a phishing point of view. |
|
634 |
- |
|
635 |
-## How to create and maintain the domainlist (daily.pdb) |
|
636 |
- |
|
637 |
-When not using `--phish-scan-alldomains` (production environments for example), you need to decide which URLs you are going to check. |
|
638 |
- |
|
639 |
-Although at a first glance it might seem a good idea to check everything, it would produce false positives. Particularly newsletters, ads, etc. are likely to use URLs that look like phishing attempts. |
|
640 |
- |
|
641 |
-Let’s assume that you’ve recently seen many phishing attempts claiming they come from Paypal. Thus you need to add paypal to daily.pdb: |
|
642 |
- |
|
643 |
-``` |
|
644 |
-R .+ .+\.paypal\.com |
|
645 |
-``` |
|
646 |
- |
|
647 |
-The above line will block (detect as phishing) mails that contain URLs that claim to lead to paypal but in fact do not. |
|
648 |
- |
|
649 |
-Be careful not to create regexes that match too broad a range of URLs, though. |
|
650 |
- |
|
651 |
-## Dealing with false positives, and undetected phishing mails |
|
652 |
- |
|
653 |
-### False positives |
|
654 |
- |
|
655 |
-Whenever you see a false positive (mail that is detected as phishing, but isn’t), you need to examine *why* ClamAV decided that it is phishing. You can do this easily by building ClamAV with debugging (`./configure --enable-experimental --enable-debug`), and then running a tool: |
|
656 |
- |
|
657 |
-```bash |
|
658 |
-$ contrib/phishing/why.py phishing.eml |
|
659 |
-``` |
|
660 |
- |
|
661 |
-This will show the URL that triggers the phish verdict, and the reason why that URL is considered a phishing attempt. |
|
662 |
- |
|
663 |
-Once you know the reason, you might need to modify daily.pdb (if one of your rules in there is too broad), or you need to add the URL to daily.wdb. If you think the algorithm is incorrect, please file a bug report on bugzilla.clamav.net, including the output of *why.py*. |
|
664 |
- |
|
665 |
-### Undetected phish mails |
|
666 |
- |
|
667 |
-Using why.py doesn’t help here unfortunately (it will say: clean), so all you can do is: |
|
668 |
- |
|
669 |
-```bash |
|
670 |
-$ clamscan/clamscan --phish-scan-alldomains undetected.eml |
|
671 |
-``` |
|
672 |
- |
|
673 |
-Then see whether the mail is detected. If it is, you need to add an appropriate line to daily.pdb (see section [How-to-create](How-to-create)). |
|
674 |
- |
|
675 |
-If the mail is not detected, then try using: |
|
676 |
- |
|
677 |
-```bash |
|
678 |
-$ clamscan/clamscan --debug undetected.eml | less |
|
679 |
-``` |
|
680 |
- |
|
681 |
-Then see what urls are being checked, see if any of them is in a whitelist, see if all urls are detected, etc. |
682 | 1 |
deleted file mode 100644 |
... | ... |
@@ -1,1013 +0,0 @@ |
1 |
-# Creating signatures for ClamAV |
|
2 |
- |
|
3 |
-- [Creating signatures for ClamAV](#creating-signatures-for-clamav) |
|
4 |
-- [Introduction](#introduction) |
|
5 |
-- [Debug information from libclamav](#debug-information-from-libclamav) |
|
6 |
-- [Signature formats](#signature-formats) |
|
7 |
- - [Hash-based signatures](#hash-based-signatures) |
|
8 |
- - [MD5 hash-based signatures](#md5-hash-based-signatures) |
|
9 |
- - [SHA1 and SHA256 hash-based signatures](#sha1-and-sha256-hash-based-signatures) |
|
10 |
- - [PE section based hash signatures](#pe-section-based-hash-signatures) |
|
11 |
- - [Hash signatures with unknown size](#hash-signatures-with-unknown-size) |
|
12 |
- - [Body-based signatures](#body-based-signatures) |
|
13 |
- - [Hexadecimal format](#hexadecimal-format) |
|
14 |
- - [Wildcards](#wildcards) |
|
15 |
- - [Character classes](#character-classes) |
|
16 |
- - [Alternate strings](#alternate-strings) |
|
17 |
- - [Basic signature format](#basic-signature-format) |
|
18 |
- - [Extended signature format](#extended-signature-format) |
|
19 |
- - [Logical signatures](#logical-signatures) |
|
20 |
- - [Subsignature Modifiers](#subsignature-modifiers) |
|
21 |
- - [Special Subsignature Types](#special-subsignature-types) |
|
22 |
- - [Macro subsignatures (clamav-0.96) : <span class="nodecor">`${min-max}MACROID$`</span>](#macro-subsignatures-clamav-096-span-classnodecormin-maxmacroidspan) |
|
23 |
- - [PCRE subsignatures (clamav-0.99) : <span class="nodecor">`Trigger/PCRE/[Flags]`</span>](#pcre-subsignatures-clamav-099-span-classnodecortriggerpcreflagsspan) |
|
24 |
- - [Icon signatures for PE files](#icon-signatures-for-pe-files) |
|
25 |
- - [Signatures for Version Information metadata in PE files](#signatures-for-version-information-metadata-in-pe-files) |
|
26 |
- - [Trusted and Revoked Certificates](#trusted-and-revoked-certificates) |
|
27 |
- - [Signatures based on container metadata](#signatures-based-on-container-metadata) |
|
28 |
- - [Signatures based on ZIP/RAR metadata (obsolete)](#signatures-based-on-ziprar-metadata-obsolete) |
|
29 |
- - [Whitelist databases](#whitelist-databases) |
|
30 |
- - [Signature names](#signature-names) |
|
31 |
- - [Using YARA rules in ClamAV](#using-yara-rules-in-clamav) |
|
32 |
- - [Passwords for archive files \[experimental\]](#passwords-for-archive-files-experimental) |
|
33 |
-- [Special files](#special-files) |
|
34 |
- - [HTML](#html) |
|
35 |
- - [Text files](#text-files) |
|
36 |
- - [Compressed Portable Executable files](#compressed-portable-executable-files) |
|
37 |
- |
|
38 |
-# Introduction |
|
39 |
- |
|
40 |
-CVD (ClamAV Virus Database) is a digitally signed container that includes signature databases in various text formats. The header of the container is a 512-byte string with colon-separated fields: |
|
41 |
- |
|
42 |
-``` |
|
43 |
-ClamAV-VDB:build time:version:number of signatures:functionality level required:MD5 checksum:digital signature:builder name:build time (sec) |
|
44 |
-``` |
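A minimal Python sketch of reading these fields is shown below. It assumes only what is stated above (a 512-byte header with colon-separated fields) plus two labeled assumptions: that unused header bytes are whitespace padding, and that no field value itself contains a colon. The field names and the `parse_cvd_header` helper are hypothetical; this does not verify the digital signature the way `sigtool` does.

```python
CVD_FIELDS = (
    "magic",
    "build_time",
    "version",
    "signatures",
    "functionality_level",
    "md5",
    "digital_signature",
    "builder",
    "build_time_sec",
)


def parse_cvd_header(raw: bytes) -> dict:
    """Split the first 512 bytes of a CVD file into named header fields."""
    # Assumption: the header is ASCII and padded with whitespace up to 512 bytes.
    header = raw[:512].decode("ascii", errors="replace").strip()
    values = header.split(":")
    if len(values) != len(CVD_FIELDS) or values[0] != "ClamAV-VDB":
        raise ValueError("not a recognizable CVD header")
    return dict(zip(CVD_FIELDS, values))


# Synthetic header with placeholder values, followed by fake container data.
sample = b"ClamAV-VDB:build-time:45:169676:21:md5sum:dsig:sven:0".ljust(512) + b"payload"
print(parse_cvd_header(sample)["functionality_level"])
```

To inspect a real file, the same function could be called with the bytes read from it, for example `parse_cvd_header(open("main.cvd", "rb").read(512))`.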
|
45 |
- |
|
46 |
-`sigtool --info` displays detailed information about a given CVD file: |
|
47 |
- |
|
48 |
-```bash |
|
49 |
-zolw@localhost:/usr/local/share/clamav$ sigtool -i main.cvd |
|
50 |
-File: main.cvd |
|
51 |
-Build time: 09 Dec 2007 15:50 +0000 |
|
52 |
-Version: 45 |
|
53 |
-Signatures: 169676 |
|
54 |
-Functionality level: 21 |
|
55 |
-Builder: sven |
|
56 |
-MD5: b35429d8d5d60368eea9630062f7c75a |
|
57 |
-Digital signature: dxsusO/HWP3/GAA7VuZpxYwVsE9b+tCk+tPN6OyjVF/U8 |
|
58 |
-JVh4vYmW8mZ62ZHYMlM903TMZFg5hZIxcjQB3SX0TapdF1SFNzoWjsyH53eXvMDY |
|
59 |
-eaPVNe2ccXLfEegoda4xU2TezbGfbSEGoU1qolyQYLX674sNA2Ni6l6/CEKYYh |
|
60 |
-Verification OK. |
|
61 |
-``` |
|
62 |
- |
|
63 |
-The ClamAV project distributes a number of CVD files, including *main.cvd* and *daily.cvd*. |
|
64 |
- |
|
65 |
-# Debug information from libclamav |
|
66 |
- |
|
67 |
-In order to create efficient signatures for ClamAV it’s important to understand how the engine handles input files. The best way to see how it works is having a look at the debug information from libclamav. You can do it by calling `clamscan` with the `--debug` and `--leave-temps` flags. The first switch makes clamscan display all the interesting information from libclamav and the second one avoids deleting temporary files so they can be analyzed further. |
|
68 |
- |
|
69 |
-The important part of the output is: |
|
70 |
- |
|
71 |
-```bash |
|
72 |
-$ clamscan --debug attachment.exe |
|
73 |
-[...] |
|
74 |
-LibClamAV debug: Recognized MS-EXE/DLL file |
|
75 |
-LibClamAV debug: Matched signature for file type PE |
|
76 |
-LibClamAV debug: File type: Executable |
|
77 |
-``` |
|
78 |
- |
|
79 |
-The engine recognized a Windows executable. |
|
80 |
- |
|
81 |
-```bash |
|
82 |
-LibClamAV debug: Machine type: 80386 |
|
83 |
-LibClamAV debug: NumberOfSections: 3 |
|
84 |
-LibClamAV debug: TimeDateStamp: Fri Jan 10 04:57:55 2003 |
|
85 |
-LibClamAV debug: SizeOfOptionalHeader: e0 |
|
86 |
-LibClamAV debug: File format: PE |
|
87 |
-LibClamAV debug: MajorLinkerVersion: 6 |
|
88 |
-LibClamAV debug: MinorLinkerVersion: 0 |
|
89 |
-LibClamAV debug: SizeOfCode: 0x9000 |
|
90 |
-LibClamAV debug: SizeOfInitializedData: 0x1000 |
|
91 |
-LibClamAV debug: SizeOfUninitializedData: 0x1e000 |
|
92 |
-LibClamAV debug: AddressOfEntryPoint: 0x27070 |
|
93 |
-LibClamAV debug: BaseOfCode: 0x1f000 |
|
94 |
-LibClamAV debug: SectionAlignment: 0x1000 |
|
95 |
-LibClamAV debug: FileAlignment: 0x200 |
|
96 |
-LibClamAV debug: MajorSubsystemVersion: 4 |
|
97 |
-LibClamAV debug: MinorSubsystemVersion: 0 |
|
98 |
-LibClamAV debug: SizeOfImage: 0x29000 |
|
99 |
-LibClamAV debug: SizeOfHeaders: 0x400 |
|
100 |
-LibClamAV debug: NumberOfRvaAndSizes: 16 |
|
101 |
-LibClamAV debug: Subsystem: Win32 GUI |
|
102 |
-LibClamAV debug: ------------------------------------ |
|
103 |
-LibClamAV debug: Section 0 |
|
104 |
-LibClamAV debug: Section name: UPX0 |
|
105 |
-LibClamAV debug: Section data (from headers - in memory) |
|
106 |
-LibClamAV debug: VirtualSize: 0x1e000 0x1e000 |
|
107 |
-LibClamAV debug: VirtualAddress: 0x1000 0x1000 |
|
108 |
-LibClamAV debug: SizeOfRawData: 0x0 0x0 |
|
109 |
-LibClamAV debug: PointerToRawData: 0x400 0x400 |
|
110 |
-LibClamAV debug: Section's memory is executable |
|
111 |
-LibClamAV debug: Section's memory is writeable |
|
112 |
-LibClamAV debug: ------------------------------------ |
|
113 |
-LibClamAV debug: Section 1 |
|
114 |
-LibClamAV debug: Section name: UPX1 |
|
115 |
-LibClamAV debug: Section data (from headers - in memory) |
|
116 |
-LibClamAV debug: VirtualSize: 0x9000 0x9000 |
|
117 |
-LibClamAV debug: VirtualAddress: 0x1f000 0x1f000 |
|
118 |
-LibClamAV debug: SizeOfRawData: 0x8200 0x8200 |
|
119 |
-LibClamAV debug: PointerToRawData: 0x400 0x400 |
|
120 |
-LibClamAV debug: Section's memory is executable |
|
121 |
-LibClamAV debug: Section's memory is writeable |
|
122 |
-LibClamAV debug: ------------------------------------ |
|
123 |
-LibClamAV debug: Section 2 |
|
124 |
-LibClamAV debug: Section name: UPX2 |
|
125 |
-LibClamAV debug: Section data (from headers - in memory) |
|
126 |
-LibClamAV debug: VirtualSize: 0x1000 0x1000 |
|
127 |
-LibClamAV debug: VirtualAddress: 0x28000 0x28000 |
|
128 |
-LibClamAV debug: SizeOfRawData: 0x200 0x1ff |
|
129 |
-LibClamAV debug: PointerToRawData: 0x8600 0x8600 |
|
130 |
-LibClamAV debug: Section's memory is writeable |
|
131 |
-LibClamAV debug: ------------------------------------ |
|
132 |
-LibClamAV debug: EntryPoint offset: 0x8470 (33904) |
|
133 |
-``` |
|
134 |
- |
|
135 |
-The section structure displayed above suggests the executable is packed |
|
136 |
-with UPX. |
|
137 |
- |
|
138 |
-```bash |
|
139 |
-LibClamAV debug: ------------------------------------ |
|
140 |
-LibClamAV debug: EntryPoint offset: 0x8470 (33904) |
|
141 |
-LibClamAV debug: UPX/FSG/MEW: empty section found - assuming |
|
142 |
- compression |
|
143 |
-LibClamAV debug: UPX: bad magic - scanning for imports |
|
144 |
-LibClamAV debug: UPX: PE structure rebuilt from compressed file |
|
145 |
-LibClamAV debug: UPX: Successfully decompressed with NRV2B |
|
146 |
-LibClamAV debug: UPX/FSG: Decompressed data saved in |
|
147 |
- /tmp/clamav-90d2d25c9dca42bae6fa9a764a4bcede |
|
148 |
-LibClamAV debug: ***** Scanning decompressed file ***** |
|
149 |
-LibClamAV debug: Recognized MS-EXE/DLL file |
|
150 |
-LibClamAV debug: Matched signature for file type PE |
|
151 |
-``` |
|
152 |
- |
|
153 |
-Indeed, libclamav recognizes the UPX data and saves the decompressed |
|
154 |
-(and rebuilt) executable into |
|
155 |
-`/tmp/clamav-90d2d25c9dca42bae6fa9a764a4bcede`. Then it continues by |
|
156 |
-scanning this new file: |
|
157 |
- |
|
158 |
-```bash |
|
159 |
-LibClamAV debug: File type: Executable |
|
160 |
-LibClamAV debug: Machine type: 80386 |
|
161 |
-LibClamAV debug: NumberOfSections: 3 |
|
162 |
-LibClamAV debug: TimeDateStamp: Thu Jan 27 11:43:15 2011 |
|
163 |
-LibClamAV debug: SizeOfOptionalHeader: e0 |
|
164 |
-LibClamAV debug: File format: PE |
|
165 |
-LibClamAV debug: MajorLinkerVersion: 6 |
|
166 |
-LibClamAV debug: MinorLinkerVersion: 0 |
|
167 |
-LibClamAV debug: SizeOfCode: 0xc000 |
|
168 |
-LibClamAV debug: SizeOfInitializedData: 0x19000 |
|
169 |
-LibClamAV debug: SizeOfUninitializedData: 0x0 |
|
170 |
-LibClamAV debug: AddressOfEntryPoint: 0x7b9f |
|
171 |
-LibClamAV debug: BaseOfCode: 0x1000 |
|
172 |
-LibClamAV debug: SectionAlignment: 0x1000 |
|
173 |
-LibClamAV debug: FileAlignment: 0x1000 |
|
174 |
-LibClamAV debug: MajorSubsystemVersion: 4 |
|
175 |
-LibClamAV debug: MinorSubsystemVersion: 0 |
|
176 |
-LibClamAV debug: SizeOfImage: 0x26000 |
|
177 |
-LibClamAV debug: SizeOfHeaders: 0x1000 |
|
178 |
-LibClamAV debug: NumberOfRvaAndSizes: 16 |
|
179 |
-LibClamAV debug: Subsystem: Win32 GUI |
|
180 |
-LibClamAV debug: ------------------------------------ |
|
181 |
-LibClamAV debug: Section 0 |
|
182 |
-LibClamAV debug: Section name: .text |
|
183 |
-LibClamAV debug: Section data (from headers - in memory) |
|
184 |
-LibClamAV debug: VirtualSize: 0xc000 0xc000 |
|
185 |
-LibClamAV debug: VirtualAddress: 0x1000 0x1000 |
|
186 |
-LibClamAV debug: SizeOfRawData: 0xc000 0xc000 |
|
187 |
-LibClamAV debug: PointerToRawData: 0x1000 0x1000 |
|
188 |
-LibClamAV debug: Section contains executable code |
|
189 |
-LibClamAV debug: Section's memory is executable |
|
190 |
-LibClamAV debug: ------------------------------------ |
|
191 |
-LibClamAV debug: Section 1 |
|
192 |
-LibClamAV debug: Section name: .rdata |
|
193 |
-LibClamAV debug: Section data (from headers - in memory) |
|
194 |
-LibClamAV debug: VirtualSize: 0x2000 0x2000 |
|
195 |
-LibClamAV debug: VirtualAddress: 0xd000 0xd000 |
|
196 |
-LibClamAV debug: SizeOfRawData: 0x2000 0x2000 |
|
197 |
-LibClamAV debug: PointerToRawData: 0xd000 0xd000 |
|
198 |
-LibClamAV debug: ------------------------------------ |
|
199 |
-LibClamAV debug: Section 2 |
|
200 |
-LibClamAV debug: Section name: .data |
|
201 |
-LibClamAV debug: Section data (from headers - in memory) |
|
202 |
-LibClamAV debug: VirtualSize: 0x17000 0x17000 |
|
203 |
-LibClamAV debug: VirtualAddress: 0xf000 0xf000 |
|
204 |
-LibClamAV debug: SizeOfRawData: 0x17000 0x17000 |
|
205 |
-LibClamAV debug: PointerToRawData: 0xf000 0xf000 |
|
206 |
-LibClamAV debug: Section's memory is writeable |
|
207 |
-LibClamAV debug: ------------------------------------ |
|
208 |
-LibClamAV debug: EntryPoint offset: 0x7b9f (31647) |
|
209 |
-LibClamAV debug: Bytecode executing hook id 257 (0 hooks) |
|
210 |
-attachment.exe: OK |
|
211 |
-[...] |
|
212 |
-``` |
|
213 |
- |
|
214 |
-No additional files get created by libclamav. By writing a signature for the decompressed file you have more chances that the engine will detect the target data when it gets compressed with another packer. |
|
215 |
- |
|
216 |
-This method should be applied to all files for which you want to create signatures. By analyzing the debug information you can quickly see how the engine recognizes and preprocesses the data and what additional files get created. Signatures created for bottom-level temporary files are usually more generic and should help detecting the same malware in different forms. |
|
217 |
- |
|
218 |
-# Signature formats |
|
219 |
- |
|
220 |
-## Hash-based signatures |
|
221 |
- |
|
222 |
-The easiest way to create signatures for ClamAV is to use file hash checksums; however, this method can only be used against static malware. |
|
223 |
- |
|
224 |
-### MD5 hash-based signatures |
|
225 |
- |
|
226 |
-To create an MD5 signature for `test.exe` use the `--md5` option of |
|
227 |
-sigtool: |
|
228 |
- |
|
229 |
-```bash |
|
230 |
-zolw@localhost:/tmp/test$ sigtool --md5 test.exe > test.hdb |
|
231 |
-zolw@localhost:/tmp/test$ cat test.hdb |
|
232 |
-48c4533230e1ae1c118c741c0db19dfb:17387:test.exe |
|
233 |
-``` |
|
234 |
- |
|
235 |
-That’s it! The signature is ready for use: |
|
236 |
- |
|
237 |
-```bash |
|
238 |
-zolw@localhost:/tmp/test$ clamscan -d test.hdb test.exe |
|
239 |
-test.exe: test.exe FOUND |
|
240 |
- |
|
241 |
-Known viruses: 1 |
|
242 |
-Scanned directories: 0 |
|
243 |
-Engine version: 0.92.1 |
|
244 |
-Scanned files: 1 |
|
245 |
-Infected files: 1 |
|
246 |
-Data scanned: 0.02 MB |
|
247 |
-Time: 0.024 sec (0 m 0 s) |
|
248 |
-``` |
|
249 |
- |
|
250 |
-You can change the name (by default sigtool uses the name of the file) and place it inside a `*.hdb` file. A single database file can include any number of signatures. To have them loaded automatically each time clamscan/clamd starts, copy the database file(s) into the local virus database directory (e.g. `/usr/local/share/clamav`). |
|
251 |
- |
|
252 |
-*Hash-based signatures should not be used for text files, HTML and any other data that gets internally preprocessed before pattern matching. If you really want to use a hash signature in such a case, run clamscan with the `--debug` and `--leave-temps` flags as described above and create a signature for a preprocessed file left in /tmp. Please keep in mind that a hash signature will stop matching as soon as a single byte changes in the target file.* |
|
253 |
- |
|
254 |
-### SHA1 and SHA256 hash-based signatures |
|
255 |
- |
|
256 |
-ClamAV 0.98 added support for SHA1 and SHA256 file checksums. The format is the same as for MD5 file checksums. ClamAV differentiates between them based on the length of the hash string in the signature. For best backwards compatibility, these should be placed inside a `*.hsb` file. The format is: |
|
257 |
- |
|
258 |
-``` |
|
259 |
-HashString:FileSize:MalwareName |
|
260 |
-``` |
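As a rough illustration of the `HashString:FileSize:MalwareName` layout, the following Python sketch builds such a line for a byte string. The helper name `make_hash_sig` and the malware name `Example.Test.Sig` are made up for this example; in practice `sigtool` generates these entries.

```python
import hashlib


def make_hash_sig(data: bytes, name: str, algo: str = "sha256") -> str:
    """Return a 'HashString:FileSize:MalwareName' line for the given bytes."""
    if algo not in ("md5", "sha1", "sha256"):
        raise ValueError("unsupported hash algorithm: " + algo)
    digest = hashlib.new(algo, data).hexdigest()
    # The size field is the length of the hashed data in bytes.
    return f"{digest}:{len(data)}:{name}"


sample = b"example payload"
line = make_hash_sig(sample, "Example.Test.Sig")
print(line)
```

The hash string length differs per algorithm (32 hex characters for MD5, 40 for SHA1, 64 for SHA256), which is what lets a loader tell the three apart.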
|
261 |
- |
|
262 |
-### PE section based hash signatures |
|
263 |
- |
|
264 |
-You can create a hash signature for a specific section in a PE file. Such signatures shall be stored inside `.mdb` files in the following format: |
|
265 |
- |
|
266 |
-``` |
|
267 |
-PESectionSize:PESectionHash:MalwareName |
|
268 |
-``` |
|
269 |
- |
|
270 |
-The easiest way to generate MD5 based section signatures is to extract target PE sections into separate files and then run sigtool with the option `--mdb`. |
|
271 |
- |
|
272 |
-ClamAV 0.98 also added support for SHA1 and SHA256 section based signatures. The format is the same as for MD5 PE section based signatures. ClamAV differentiates between them based on the length of the hash string in the signature. For best backwards compatibility, these should be placed inside a `*.msb` file. |
|
273 |
- |
|
274 |
-### Hash signatures with unknown size |
|
275 |
- |
|
276 |
-ClamAV 0.98 also added support for hash signatures where the size is not known but the hash is. It is much more performance-efficient to use signatures with specific sizes, so be cautious when using this feature. For these cases, the `*` character can be used in the size field. To ensure proper backwards compatibility with older versions of ClamAV, these signatures must have a minimum functional level of 73 or higher. Signatures that use the wildcard size without this level set will be rejected as malformed. |
|
277 |
- |
|
278 |
-``` |
|
279 |
-Sample .hsb signature matching any size |
|
280 |
-HashString:*:MalwareName:73 |
|
281 |
- |
|
282 |
-Sample .msb signature matching any size |
|
283 |
-*:PESectionHash:MalwareName:73 |
|
284 |
-``` |
|
285 |
- |
|
286 |
-## Body-based signatures |
|
287 |
- |
|
288 |
-ClamAV stores all body-based signatures in a hexadecimal format. In this section, a hex-signature means a fragment of the malware’s body converted into a hexadecimal string, which can additionally be extended using various wildcards. |
|
289 |
- |
|
290 |
-### Hexadecimal format |
|
291 |
- |
|
292 |
-You can use `sigtool --hex-dump` to convert any data into a hex-string: |
|
293 |
- |
|
294 |
-```bash |
|
295 |
-zolw@localhost:/tmp/test$ sigtool --hex-dump |
|
296 |
-How do I look in hex? |
|
297 |
-486f7720646f2049206c6f6f6b20696e206865783f0a |
|
298 |
-``` |
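The same conversion can be reproduced outside of sigtool. A minimal Python sketch, where the function name `hex_dump` is only an illustrative choice:

```python
def hex_dump(data: bytes) -> str:
    """Return the lowercase hex string for raw bytes, two digits per byte."""
    return data.hex()


# Same input as the sigtool session above, including the trailing newline.
print(hex_dump(b"How do I look in hex?\n"))
# prints 486f7720646f2049206c6f6f6b20696e206865783f0a
```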
|
299 |
- |
|
300 |
-### Wildcards |
|
301 |
- |
|
302 |
-ClamAV supports the following wildcards for hex-signatures: |
|
303 |
- |
|
304 |
-- `??` |
|
305 |
- |
|
306 |
- Match any byte. |
|
307 |
- |
|
308 |
-- `a?` |
|
309 |
- |
|
310 |
- Match a high nibble (the four high bits). |
|
311 |
- **IMPORTANT NOTE:** The nibble matching is only available in |
|
312 |
- libclamav with the functionality level 17 and higher therefore |
|
313 |
- please only use it with .ndb signatures followed by ":17" |
|
314 |
- (MinEngineFunctionalityLevel, see [3.2.7](#ndb)). |
|
315 |
- |
|
316 |
-- `?a` |
|
317 |
- |
|
318 |
- Match a low nibble (the four low bits). |
|
319 |
- |
|
320 |
-- `*` |
|
321 |
- |
|
322 |
- Match any number of bytes. |
|
323 |
- |
|
324 |
-- `{n}` |
|
325 |
- |
|
326 |
- Match `n` bytes. |
|
327 |
- |
|
328 |
-- `{-n}` |
|
329 |
- |
|
330 |
- Match `n` or fewer bytes. |
|
331 |
- |
|
332 |
-- `{n-}` |
|
333 |
- |
|
334 |
- Match `n` or more bytes. |
|
335 |
- |
|
336 |
-- `{n-m}` |
|
337 |
- |
|
338 |
- Match between `n` and `m` bytes (`m > n`). |
|
339 |
- |
|
340 |
-- `HEXSIG[x-y]aa` or `aa[x-y]HEXSIG` |
|
341 |
- |
|
342 |
- Match aa anchored to a hex-signature, see |
|
343 |
- <https://bugzilla.clamav.net/show_bug.cgi?id=776> for discussion and |
|
344 |
- examples. |
|
345 |
- |
|
346 |
-The range wildcards `*` and `{}` virtually separate a hex-signature into two parts, e.g. `aabbcc*bbaacc` is treated as two sub-signatures `aabbcc` and `bbaacc` with any number of bytes between them. Each sub-signature is required to include a block of two static characters somewhere in its body. There is one exception to this restriction: when the range wildcard is of the form `{n}` with `n<128`, ClamAV uses an optimization and translates `{n}` into a string of `n` `??` character wildcards. Character wildcards do not divide hex signatures into two parts, so the two static character requirement does not apply. |
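To make the wildcard semantics concrete, the sketch below translates a subset of the syntax (`??`, `a?`, `?a`, `*`, `{n}`, `{-n}`, `{n-}`, `{n-m}`) into a Python bytes regex. This is an illustration only, not how libclamav matches signatures: the function name `hexsig_to_regex` is hypothetical, anchored `[x-y]` wildcards are not handled, and the two-static-character requirement is not enforced.

```python
import re


def hexsig_to_regex(sig: str) -> "re.Pattern[bytes]":
    """Translate a subset of hex-signature wildcards into a bytes regex."""
    parts = []
    i = 0
    while i < len(sig):
        if sig[i] == "*":                       # any number of bytes
            parts.append(b".*")
            i += 1
        elif sig[i] == "{":                     # {n}, {-n}, {n-}, {n-m}
            end = sig.index("}", i)
            lo, sep, hi = sig[i + 1:end].partition("-")
            if not sep:                         # {n}: exactly n bytes
                hi = lo
            parts.append((".{%s,%s}" % (lo or "0", hi)).encode())
            i = end + 1
        else:
            pair = sig[i:i + 2].lower()
            if len(pair) != 2:
                raise ValueError("dangling nibble at position %d" % i)
            if pair == "??":                    # any byte
                parts.append(b".")
            elif pair[1] == "?":                # a?: high nibble is fixed
                base = int(pair[0], 16) << 4
                parts.append(b"[\\x%02x-\\x%02x]" % (base, base | 0x0F))
            elif pair[0] == "?":                # ?a: low nibble is fixed
                low = int(pair[1], 16)
                members = b"".join(
                    b"\\x%02x" % ((high << 4) | low) for high in range(16))
                parts.append(b"[" + members + b"]")
            else:                               # literal byte
                parts.append(b"\\x%02x" % int(pair, 16))
            i += 2
    # DOTALL lets '.' match every byte value, including 0x0a.
    return re.compile(b"".join(parts), re.DOTALL)


pattern = hexsig_to_regex("aabbcc*bbaacc")
print(pattern.search(bytes.fromhex("00aabbcc1122bbaacc")) is not None)
```

A regex is a convenient way to experiment with what a signature would match on a small sample, although its performance characteristics are unrelated to those of the real engine.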
|
347 |
- |
|
348 |
-### Character classes |
|
349 |
- |
|
350 |
-ClamAV supports the following character classes for hex-signatures: |
|
351 |
- |
|
352 |
-- `(B)` |
|
353 |
- |
|
354 |
- Match word boundary (including file boundaries). |
|
355 |
- |
|
356 |
-- `(L)` |
|
357 |
- |
|
358 |
- Match CR, CRLF or file boundaries. |
|
359 |
- |
|
360 |
-- `(W)` |
|
361 |
- |
|
362 |
- Match a non-alphanumeric character. |
|
363 |
- |
|
364 |
-### Alternate strings |
|
365 |
- |
|
366 |
-- Single-byte alternates (clamav-0.96) `(aa|bb|cc|...)` or `!(aa|bb|cc|...)` Match a member from a set of bytes \[aa, bb, cc, ...\]. |
|
367 |
- - Negation operation can be applied to match any non-member, assumed to be one-byte in length. |
|
368 |
- - Signature modifiers and wildcards cannot be applied. |
|
369 |
- |
|
370 |
-- Multi-byte fixed length alternates `(aaaa|bbbb|cccc|...)` or `!(aaaa|bbbb|cccc|...)` Match a member from a set of multi-byte alternates \[aaaa, bbbb, cccc, ...\] of n-length. |
|
371 |
- - All set members must be the same length. |
|
372 |
- - Negation operation can be applied to match any non-member, assumed to be n-bytes in length (clamav-0.98.2). |
|
373 |
- - Signature modifiers and wildcards cannot be applied. |
|
374 |
- |
|
375 |
-- Generic alternates (clamav-0.99) `(alt1|alt2|alt3|...)` Match a member from a set of alternates \[alt1, alt2, alt3, ...\] that can be of variable lengths. |
|
376 |
- - Negation operation cannot be applied. |
|
377 |
- - Signature modifiers and nibble wildcards \[`??, a?, ?a`\] can be applied. |
|
378 |
- - Ranged wildcards \[`{n-m}`\] are limited to a fixed range of less than 128 bytes \[`{1} -> {127}`\]. |
|
379 |
- |
|
380 |
-Note that using signature modifiers or wildcards causes the alternate to be classified as a generic alternate. Thus single-byte alternates and multi-byte fixed length alternates can use signature modifiers and wildcards, but will then be classified as generic alternates. This means that negation cannot be applied in this situation and there is a slight performance impact. |
|
381 |
- |
|
382 |
-### Basic signature format |
|
383 |
- |
|
384 |
-The simplest (and now deprecated) signature format is: |
|
385 |
- |
|
386 |
-``` |
|
387 |
-MalwareName=HexSignature |
|
388 |
-``` |
|
389 |
- |
|
390 |
-ClamAV will scan the entire file looking for HexSignature. All signatures of this type must be placed inside `*.db` files. |
|
391 |
- |
|
392 |
-### Extended signature format |
|
393 |
- |
|
394 |
-The extended signature format allows for specification of additional information such as a target file type, virus offset or engine version, making the detection more reliable. The format is: |
|
395 |
- |
|
396 |
-``` |
|
397 |
-MalwareName:TargetType:Offset:HexSignature[:MinFL:[MaxFL]] |
|
398 |
-``` |
|
399 |
- |
|
400 |
-where `TargetType` is one of the following numbers specifying the type of the target file: |
|
401 |
- |
|
402 |
-- 0 = any file |
|
403 |
- |
|
404 |
-- 1 = Portable Executable, both 32- and 64-bit. |
|
405 |
- |
|
406 |
-- 2 = OLE2 containers, including their specific macros. The OLE2 format is primarily used by MS Office and MSI installation files. |
|
407 |
- |
|
408 |
-- 3 = HTML (normalized: whitespace transformed to spaces, tags/tag attributes normalized, all lowercase). JavaScript is normalized too: all strings are normalized (hex encoding is decoded), numbers are parsed and normalized, local variable/function names are normalized to ’n001’ format, the argument to eval() is parsed as JS again, unescape() is handled, some simple JS packers are handled, and the output is whitespace normalized. |
|
409 |
- |
|
410 |
-- 4 = Mail file |
|
411 |
- |
|
412 |
-- 5 = Graphics |
|
413 |
- |
|
414 |
-- 6 = ELF |
|
415 |
- |
|
416 |
-- 7 = ASCII text file (normalized) |
|
417 |
- |
|
418 |
-- 8 = Unused |
|
419 |
- |
|
420 |
-- 9 = Mach-O files |
|
421 |
- |
|
422 |
-- 10 = PDF files |
|
423 |
- |
|
424 |
-- 11 = Flash files |
|
425 |
- |
|
426 |
-- 12 = Java class files |
|
427 |
- |
|
428 |
-And `Offset` is an asterisk or a decimal number `n` possibly combined with a special modifier: |
|
429 |
- |
|
430 |
-- `*` = any |
|
431 |
- |
|
432 |
-- `n` = absolute offset |
|
433 |
- |
|
434 |
-- `EOF-n` = end of file minus `n` bytes |
|
435 |
- |
|
436 |
-Signatures for PE, ELF and Mach-O files additionally support: |
|
437 |
- |
|
438 |
-- `EP+n` = entry point plus n bytes (`EP+0` for `EP`) |
|
439 |
- |
|
440 |
-- `EP-n` = entry point minus n bytes |
|
441 |
- |
|
442 |
-- `Sx+n` = start of section `x`’s (counted from 0) data plus `n` bytes |
|
443 |
- |
|
444 |
-- `SEx` = entire section `x` (offset must lie within section boundaries) |
|
445 |
- |
|
446 |
-- `SL+n` = start of last section plus `n` bytes |
|
447 |
- |
|
448 |
-All the above offsets except `*` can be turned into **floating offsets** and represented as `Offset,MaxShift` where `MaxShift` is an unsigned integer. A floating offset will match every offset between `Offset` and `Offset+MaxShift`, e.g. `10,5` will match all offsets from 10 to 15 and `EP+n,y` will match all offsets from `EP+n` to `EP+n+y`. Versions of ClamAV older than 0.91 will silently ignore the `MaxShift` extension and only use `Offset`. The optional `MinFL` and `MaxFL` parameters can restrict the signature to specific engine releases. All signatures in the extended format must be placed inside `*.ndb` files. |
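The field layout of an extended signature can be sketched in Python as follows. The names `ExtendedSig` and `parse_ndb_line`, and the sample signature `Example.Sig`, are assumptions made for this illustration; the sketch only splits fields and does not validate target types or offset syntax.

```python
from typing import NamedTuple, Optional


class ExtendedSig(NamedTuple):
    name: str
    target_type: int
    offset: str               # e.g. "*", "10", "EOF-4", "EP+10"
    max_shift: Optional[int]  # set only for floating offsets
    hex_signature: str
    min_fl: Optional[int]
    max_fl: Optional[int]


def parse_ndb_line(line: str) -> ExtendedSig:
    """Split MalwareName:TargetType:Offset:HexSignature[:MinFL:[MaxFL]]."""
    fields = line.strip().split(":")
    if not 4 <= len(fields) <= 6:
        raise ValueError("expected 4 to 6 colon-separated fields")
    name, target, offset, hexsig = fields[:4]
    # A floating offset is written as Offset,MaxShift.
    base, sep, shift = offset.partition(",")
    min_fl = int(fields[4]) if len(fields) > 4 and fields[4] else None
    max_fl = int(fields[5]) if len(fields) > 5 and fields[5] else None
    return ExtendedSig(name, int(target), base,
                       int(shift) if sep else None, hexsig, min_fl, max_fl)


sig = parse_ndb_line("Example.Sig:1:EP+10,5:aabbcc??dd:17")
print(sig.offset, sig.max_shift, sig.min_fl)
```

For this hypothetical line the floating offset covers `EP+10` through `EP+15`, and the trailing `17` is the minimum functionality level.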
|
449 |
- |
|
450 |
-### Logical signatures |
|
451 |
- |
|
452 |
-Logical signatures allow combining multiple signatures in extended format using logical operators. They can provide more detailed and more flexible pattern matching. Logical signatures are stored inside `*.ldb` files in the following format: |
|
453 |
- |
|
454 |
-``` |
|
455 |
-SignatureName;TargetDescriptionBlock;LogicalExpression;Subsig0; |
|
456 |
-Subsig1;Subsig2;... |
|
457 |
-``` |
|
458 |
- |
|
459 |
-where: |
|
460 |
- |
|
461 |
-- `TargetDescriptionBlock` provides information about the engine and target file with comma separated `Arg:Val` pairs. For args where `Val` is a range, the minimum and maximum values should be expressed as `min-max`. |
|
462 |
- |
|
463 |
-- `LogicalExpression` specifies the logical expression describing the relationship between `Subsig0...SubsigN`. **Basis clause:** 0,1,...,N decimal indexes are SUB-EXPRESSIONS representing `Subsig0, Subsig1,...,SubsigN` respectively. **Inductive clause:** if `A` and `B` are SUB-EXPRESSIONS and `X, Y` are decimal numbers then `(A&B)`, `(A|B)`, `A=X`, `A=X,Y`, `A>X`, `A>X,Y`, `A<X` and `A<X,Y` are SUB-EXPRESSIONS. |
|
464 |
- |
|
465 |
-- `SubsigN` is the n-th subsignature in extended format, possibly preceded by an offset. Up to 64 subsigs can be specified. |
|
466 |
- |
|
467 |
-Keywords used in `TargetDescriptionBlock`: |
|
468 |
- |
|
469 |
-- `Target:X`: Target file type |
|
470 |
- |
|
471 |
-- `Engine:X-Y`: Required engine functionality (range; 0.96). Note that if the `Engine` keyword is used, it must be the first one in the `TargetDescriptionBlock` for backwards compatibility |
|
472 |
- |
|
473 |
-- `FileSize:X-Y`: Required file size (range in bytes; 0.96) |
|
474 |
- |
|
475 |
-- `EntryPoint`: Entry point offset (range in bytes; 0.96) |
|
476 |
- |
|
477 |
-- `NumberOfSections`: Required number of sections in executable (range; 0.96) |
|
478 |
- |
|
479 |
-- `Container:CL_TYPE_*`: File type of the container which stores the scanned file. Specifying `CL_TYPE_ANY` matches on root objects only. |
|
480 |
- |
|
481 |
-- `Intermediates:CL_TYPE_*>CL_TYPE_*`: File types of the intermediate containers which store the scanned file. Specify 1-16 file types separated by ’`>`’ in top-down order (the ’`>`’ separator is not needed for a single file type); the last type should be the immediate container for the malicious content. `CL_TYPE_ANY` can be used as a wildcard file type. (expr; 0.100.0) |
|
482 |
- |
|
483 |
-- `IconGroup1`: Icon group name 1 from .idb signature Required engine functionality (range; 0.96) |
|
484 |
- |
|
485 |
-- `IconGroup2`: Icon group name 2 from .idb signature Required engine functionality (range; 0.96) |
|
486 |
- |
|
487 |
-Modifiers for subexpressions: |
|
488 |
- |
|
489 |
-- `A=X`: If the SUB-EXPRESSION A refers to a single signature then this signature must get matched exactly X times; if it refers to a (logical) block of signatures then this block must generate exactly X matches (with any of its sigs). |
|
490 |
- |
|
491 |
-- `A=0` specifies negation (signature or block of signatures cannot be matched) |
|
492 |
- |
|
493 |
-- `A=X,Y`: If the SUB-EXPRESSION A refers to a single signature then this signature must be matched exactly X times; if it refers to a (logical) block of signatures then this block must generate X matches and at least Y different signatures must get matched. |
|
494 |
- |
|
495 |
-- `A>X`: If the SUB-EXPRESSION A refers to a single signature then this signature must get matched more than X times; if it refers to a (logical) block of signatures then this block must generate more than X matches (with any of its sigs). |
|
496 |
- |
|
497 |
-- `A>X,Y`: If the SUB-EXPRESSION A refers to a single signature then this signature must get matched more than X times; if it refers to a (logical) block of signatures then this block must generate more than X matches and at least Y different signatures must be matched. |
|
498 |
- |
|
499 |
-- `A<X` and `A<X,Y` as above with the change of "more" to "fewer". |
|
500 |
- |
|
501 |
-Examples: |
|
502 |
- |
|
503 |
-``` |
|
504 |
-Sig1;Target:0;(0&1&2&3)&(4|1);6b6f74656b;616c61;7a6f6c77;7374656 |
|
505 |
-6616e;deadbeef |
|
506 |
- |
|
507 |
-Sig2;Target:0;((0|1|2)>5,2)&(3|1);6b6f74656b;616c61;7a6f6c77;737 |
|
508 |
-46566616e |
|
509 |
- |
|
510 |
-Sig3;Target:0;((0|1|2|3)=2)&(4|1);6b6f74656b;616c61;7a6f6c77;737 |
|
511 |
-46566616e;deadbeef |
|
512 |
- |
|
513 |
-Sig4;Engine:51-255,Target:1;((0|1)&(2|3))&4;EP+123:33c06834f04100 |
|
514 |
-f2aef7d14951684cf04100e8110a00;S2+78:22??232c2d252229{-15}6e6573 |
|
515 |
-(63|64)61706528;S3+50:68efa311c3b9963cb1ee8e586d32aeb9043e;f9c58 |
|
516 |
-dcf43987e4f519d629b103375;SL+550:6300680065005c0046006900 |
|
517 |
-``` |
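The overall layout of a logical signature line can be taken apart with a few string splits. The Python sketch below uses `Sig1` from the examples above, with its wrapped line rejoined; the function name `parse_ldb_line` is hypothetical, and the sketch neither evaluates the `LogicalExpression` nor validates subsignatures.

```python
def parse_ldb_line(line: str) -> dict:
    """Split one logical signature line into its four kinds of fields."""
    name, tdb, expression, *subsigs = line.strip().split(";")
    subsigs = [s for s in subsigs if s]          # tolerate a trailing ';'
    if not 1 <= len(subsigs) <= 64:
        raise ValueError("expected between 1 and 64 subsignatures")
    # TargetDescriptionBlock: comma-separated Arg:Val pairs.
    target_description = dict(pair.split(":", 1) for pair in tdb.split(","))
    return {
        "name": name,
        "target_description": target_description,
        "expression": expression,
        "subsigs": subsigs,
    }


sig1 = ("Sig1;Target:0;(0&1&2&3)&(4|1);"
        "6b6f74656b;616c61;7a6f6c77;73746566616e;deadbeef")
parsed = parse_ldb_line(sig1)
print(parsed["target_description"], parsed["expression"], len(parsed["subsigs"]))
```

Splitting on `;` is safe here because a literal `;` inside a PCRE subsignature has to be written as `\x3B`.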
|
518 |
- |
|
519 |
-### Subsignature Modifiers |
|
520 |
- |
|
521 |
-ClamAV (clamav-0.99) supports a number of additional subsignature |
|
522 |
-modifiers for logical signatures. This is done by specifying ’::’ |
|
523 |
-followed by a number of characters representing the desired options. |
|
524 |
-Signatures using subsignature modifiers require `Engine:81-255` for |
|
525 |
-backwards-compatibility. |
|
526 |
- |
|
527 |
-- Case-Insensitive \[`i`\] |
|
528 |
- |
|
529 |
- Specifying the `i` modifier causes ClamAV to match all alphabetic hex bytes as case-insensitive. All patterns in ClamAV are case-sensitive by default. |
|
530 |
- |
|
531 |
-- Wide \[`w`\] |
|
532 |
- |
|
533 |
- Specifying the `w` modifier causes ClamAV to match all hex bytes encoded with two bytes per character. Note this simply interweaves each character with NULL characters and does not truly support UTF-16 characters. Wildcards for ’wide’ subsignatures are not treated as wide (i.e. there can be an odd number of intervening characters). This can be combined with `a` to search for patterns in both wide and ascii. |
|
534 |
- |
|
535 |
-- Fullword \[`f`\] |
|
536 |
- |
|
537 |
- Match subsignature as a fullword (delimited by non-alphanumeric characters). |
|
538 |
- |
|
539 |
-- Ascii \[`a`\] |
|
540 |
- |
|
541 |
- Match subsignature as ascii characters. This can be combined with `w` to search for patterns in both ascii and wide. |
|
542 |
- |
|
543 |
-Examples: |
|
544 |
- |
|
545 |
-``` |
|
546 |
-clamav-nocase-A;Engine:81-255,Target:0;0&1;41414141::i;424242424242::i |
|
547 |
- -matches 'AAAA'(nocase) and 'BBBBBB'(nocase) |
|
548 |
- |
|
549 |
-clamav-fullword-A;Engine:81-255,Target:0;0&1;414141;68656c6c6f::f |
|
550 |
- -matches 'AAA' and 'hello'(fullword) |
|
551 |
-clamav-fullword-B;Engine:81-255,Target:0;0&1;414141;68656c6c6f::fi |
|
552 |
- -matches 'AAA' and 'hello'(fullword nocase) |
|
553 |
- |
|
554 |
-clamav-wide-B2;Engine:81-255,Target:0;0&1;414141;68656c6c6f::wa |
|
555 |
- -matches 'AAA' and 'hello'(wide ascii) |
|
556 |
-clamav-wide-C0;Engine:81-255,Target:0;0&1;414141;68656c6c6f::iwfa |
|
557 |
- -matches 'AAA' and 'hello'(nocase wide fullword ascii) |
|
558 |
-``` |
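The interleaving described for the `w` modifier can be pictured with a short Python sketch. It shows only the byte layout of the "wide" form of a pattern; the function name `to_wide` is illustrative and this is not the engine's implementation.

```python
def to_wide(hex_sig: str) -> str:
    """Follow every byte of a plain hex pattern with a NUL byte."""
    raw = bytes.fromhex(hex_sig)
    return b"".join(bytes([b, 0]) for b in raw).hex()


# 68656c6c6f is 'hello', as used in the examples above.
print(to_wide("68656c6c6f"))
# prints 680065006c006c006f00
```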
|
559 |
- |
|
560 |
-## Special Subsignature Types |
|
561 |
- |
|
562 |
-### Macro subsignatures (clamav-0.96) : <span class="nodecor">`${min-max}MACROID$`</span> |
|
563 |
- |
|
564 |
-Macro subsignatures are used to combine a number of existing extended |
|
565 |
-signatures (`.ndb`) into an on-the-fly generated alternate string logical |
|
566 |
-signature (`.ldb`). Signatures using macro subsignatures require |
|
567 |
-`Engine:51-255` for backwards-compatibility. |
|
568 |
- |
|
569 |
-Example: |
|
570 |
- |
|
571 |
-``` |
|
572 |
- test.ldb: |
|
573 |
- TestMacro;Engine:51-255,Target:0;0&1;616161;${6-7}12$ |
|
574 |
- |
|
575 |
- test.ndb: |
|
576 |
- D1:0:$12:626262 |
|
577 |
- D2:0:$12:636363 |
|
578 |
- D3:0:$30:626264 |
|
579 |
-``` |
|
580 |
- |
|
581 |
-The example logical signature `TestMacro` is functionally equivalent |
|
582 |
-to: |
|
583 |
- |
|
584 |
-``` |
|
585 |
-TestMacro;Engine:51-255,Target:0;0;616161{3-4}(626262|636363) |
|
586 |
-``` |
|
587 |
- |
|
588 |
-- `MACROID` points to a group of signatures; there can be at most 32 macro groups. |
|
589 |
- |
|
590 |
- - In the example, `MACROID` is `12` and both `D1` and `D2` are members of macro group `12`. `D3` is a member of separate macro group `30`. |
|
591 |
- |
|
592 |
-- `{min-max}` specifies the offset range at which one of the group signatures should match; the offset range is relative to the starting offset of the preceding subsignature. This means a macro subsignature cannot be the first subsignature. |
|
593 |
- |
|
594 |
- - In the example, `{min-max}` is `{6-7}` and it is relative to the start of a `616161` match. |
|
595 |
- |
|
596 |
-- For more information and examples please see <https://wwws.clamav.net/bugzilla/show_bug.cgi?id=164>. |
|
597 |
- |
|
598 |
-### PCRE subsignatures (clamav-0.99) : <span class="nodecor">`Trigger/PCRE/[Flags]`</span> |
|
599 |
- |
|
600 |
-PCRE subsignatures are used within a logical signature (`.ldb`) to specify regex matches that execute once triggered by a conditional based on preceding subsignatures. Signatures using PCRE subsignatures require `Engine:81-255` for backwards-compatibility. |
|
601 |
- |
|
602 |
-- `Trigger` is a required field that is a valid `LogicalExpression` and may refer to any subsignatures that precede this subsignature. Triggers cannot be self-referential and cannot refer to subsequent subsignatures. |
|
603 |
- |
|
604 |
-- `PCRE` is the expression representing the regex to execute. `PCRE` must be delimited by ’/’ and any ’/’ within the expression needs to be escaped. For backward compatibility, ’;’ within the expression must be expressed as ’`\x3B`’. `PCRE` cannot be empty and the `(?UTF*)` control sequence is not allowed. If debug is specified, named capture groups are displayed in a post-execution report. |
|
605 |
- |
|
606 |
-- `Flags` are a series of characters which affect the compilation and execution of `PCRE` within the PCRE compiler and the ClamAV engine. This field is optional. |
|
607 |
- |
|
608 |
- - `g [CLAMAV_GLOBAL]` specifies to search for ALL matches of PCRE (default is to search for first match). NOTE: INCREASES the time needed to run the PCRE. |
|
609 |
- |
|
610 |
- - `r [CLAMAV_ROLLING]` specifies to use the given offset as the starting location to search for a match as opposed to the only location; applies to subsigs without maxshifts. By default, in order to facilitate normal ClamAV offset behavior, PCREs are auto-anchored (only attempt match on first offset); using the rolling option disables the auto-anchoring. |
|
611 |
- |
|
612 |
- - `e [CLAMAV_ENCOMPASS]` specifies to CONFINE matching between the specified offset and maxshift; applies only when maxshift is specified. Note: DECREASES time needed to run the PCRE. |
|
613 |
- |
|
614 |
- - `i [PCRE_CASELESS]` |
|
615 |
- |
|
616 |
- - `s [PCRE_DOTALL]` |
|
617 |
- |
|
618 |
- - `m [PCRE_MULTILINE]` |
|
619 |
- |
|
620 |
- - `x [PCRE_EXTENDED]` |
|
621 |
- |
|
622 |
- - `A [PCRE_ANCHORED]` |
|
623 |
- |
|
624 |
- - `E [PCRE_DOLLAR_ENDONLY]` |
|
625 |
- |
|
626 |
- - `U [PCRE_UNGREEDY]` |
|
627 |
- |
|
628 |
-Examples: |
|
629 |
- |
|
630 |
-``` |
|
631 |
-Find.All.ClamAV;Engine:81-255,Target:0;1;6265676c6164697427736e6f7462797465636f6465;0/clamav/g |
|
632 |
- |
|
633 |
-Find.ClamAV.OnlyAt.299;Engine:81-255,Target:0;2;7374756c747a67657473;7063726572656765786c6f6c;299:0&1/clamav/ |
|
634 |
- |
|
635 |
-Find.ClamAV.StartAt.300;Engine:81-255,Target:0;3;616c61696e;62756731393238;636c6f736564;300:0&1&2/clamav/r |
|
636 |
- |
|
637 |
-Find.All.Encompassed.ClamAV;Engine:81-255,Target:0;3;7768796172656e2774;796f757573696e67;79617261;200,300:0&1&2/clamav/ge |
|
638 |
- |
|
639 |
-Named.CapGroup.Pcre;Engine:81-255,Target:0;3;636f75727479617264;616c62756d;74657272696572;50:0&1&2/variable=(?<nilshell>.{16})end/gr |
|
640 |
- |
|
641 |
-Firefox.TreeRange.UseAfterFree;Engine:81-255,Target:0,Engine:81-255;0&1&2;2e766965772e73656c656374696f6e;2e696e76616c696461746553656c656374696f6e;0&1/\x2Eview\x2Eselection.*?\x2Etree\s*\x3D\s*null.*?\x2Einvalidate/smi |
|
642 |
- |
|
643 |
-Firefox.IDB.UseAfterFree;Engine:81-255,Target:0;0&1;4944424b657952616e6765;0/^\x2e(only|lowerBound|upperBound|bound)\x28.*?\x29.*?\x2e(lower|upper|lowerOpen|upperOpen)/smi |
|
644 |
- |
|
645 |
-Firefox.boundElements;Engine:81-255,Target:0;0&1&2;6576656e742e6 |
|
646 |
-26f756e64456c656d656e7473;77696e646f772e636c6f7365;0&1/on(load|click)\s*=\s*\x22?window\.close\s*\x28/si |
|
647 |
-``` |
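The pieces of a PCRE subsignature (optional offset, `Trigger`, `PCRE` and `Flags`) can be separated as in the Python sketch below. The function name `split_pcre_subsig` is hypothetical; the sketch relies on the rule that any `/` inside the expression is escaped, so the first and last `/` are the delimiters, and it does not compile or validate the regex.

```python
def split_pcre_subsig(subsig: str) -> dict:
    """Split '[Offset:]Trigger/PCRE/[Flags]' into its parts."""
    head, sep, rest = subsig.partition("/")      # first '/' opens the regex
    if not sep or "/" not in rest:
        raise ValueError("PCRE must be delimited by '/'")
    pattern, flags = rest.rsplit("/", 1)         # last '/' closes the regex
    if not pattern:
        raise ValueError("PCRE cannot be empty")
    offset, colon, trigger = head.rpartition(":")
    return {
        "offset": offset if colon else None,
        "trigger": trigger,
        "pcre": pattern,
        "flags": flags,
    }


print(split_pcre_subsig("200,300:0&1&2/clamav/ge"))
```

With the subsignatures from the examples above, `0/clamav/g` yields no offset and trigger `0`, while `299:0&1/clamav/` yields offset `299`, trigger `0&1` and an empty flags field.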
|
648 |
- |
|
649 |
-## Icon signatures for PE files |
|
650 |
- |
|
651 |
-ClamAV 0.96 includes an approximate/fuzzy icon matcher to help detect malicious executables disguising themselves as innocent looking image files, office documents and the like. |
|
652 |
- |
|
653 |
-Icon matching is only triggered via .ldb signatures using the special attribute tokens `IconGroup1` or `IconGroup2`. These identify two (optional) groups of icons defined in a .idb database file. The format of the .idb file is: |
|
654 |
- |
|
655 |
-``` |
|
656 |
-ICON_NAME:GROUP1:GROUP2:ICON_HASH |
|
657 |
-``` |
|
658 |
- |
|
659 |
-where: |
|
660 |
- |
|
661 |
-- `ICON_NAME` is a unique string identifier for a specific icon, |
|
662 |
- |
|
663 |
-- `GROUP1` is a string identifier for the first group of icons (`IconGroup1`) |
|
664 |
- |
|
665 |
-- `GROUP2` is a string identifier for the second group of icons (`IconGroup2`), |
|
666 |
- |
|
667 |
-- `ICON_HASH` is a fuzzy hash of the icon image |
|
668 |
- |
|
669 |
-The `ICON_HASH` field can be obtained from the debug output of libclamav. For example: |
|
670 |
- |
|
671 |
-```bash |
|
672 |
-LibClamAV debug: ICO SIGNATURE: |
|
673 |
-ICON_NAME:GROUP1:GROUP2:18e2e0304ce60a0cc3a09053a30000414100057e000afe0000e 80006e510078b0a08910d11ad04105e0811510f084e01040c080a1d0b0021000a39002a41 |
|
674 |
-``` |
|
675 |
- |
|
676 |
-## Signatures for Version Information metadata in PE files |
|
677 |
- |
|
678 |
-Starting with ClamAV 0.96 it is possible to easily match certain information built into PE files (executables and dynamic link libraries). Whenever you look up the properties of a PE executable file in Windows, you are presented with a number of details about the file itself. |
|
679 |
- |
|
680 |
-This information is stored in a special area of the file resources which goes under the name of `VS_VERSION_INFORMATION` (or versioninfo for short). It is divided into 2 parts. The first part (which is rather uninteresting) is a set of numbers and flags indicating the product and file version. It was originally intended for use with installers which, after parsing it, should be able to determine whether a certain executable or library is to be upgraded/overwritten or is already up to date. Suffice it to say, this approach never really worked and is generally never used. |
|
681 |
- |
|
682 |
-The second block is much more interesting: it is a simple list of key/value strings, intended for user information and completely ignored by the OS. For example, if you look at ping.exe you can see the company being *"Microsoft Corporation"*, the description *"TCP/IP Ping command"*, the internal name *"ping.exe"* and so on... Depending on the OS version, some keys may be given peculiar visibility in the file properties dialog, however they are internally all the same. |
|
683 |
- |
|
684 |
-To match a versioninfo key/value pair, the special file offset anchor `VI` was introduced. This is similar to the other anchors (like `EP` and `SL`) except that, instead of matching the hex pattern against a single offset, it checks it against each and every key/value pair in the file. The `VI` token neither needs nor accepts a `+/-` offset like e.g. `EP+1`. As for the hex signature itself, it’s just the UTF-16 dump of the key and value. Only the `??` and `(aa|bb)` wildcards are allowed in the signature. Usually, you don’t need to bother figuring it out: each key/value pair together with the corresponding VI-based signature is printed by `clamscan` when the `--debug` option is given. |
|
685 |
- |
|
686 |
-For example `clamscan --debug freecell.exe` produces: |
|
687 |
- |
|
688 |
-```bash |
|
689 |
-[...] |
|
690 |
-Recognized MS-EXE/DLL file |
|
691 |
-in cli_peheader |
|
692 |
-versioninfo_cb: type: 10, name: 1, lang: 410, rva: 9608 |
|
693 |
-cli_peheader: parsing version info @ rva 9608 (1/1) |
|
694 |
-VersionInfo (d2de): 'CompanyName'='Microsoft Corporation' - |
|
695 |
-VI:43006f006d00700061006e0079004e0061006d006500000000004d006900 |
|
696 |
-630072006f0073006f0066007400200043006f00720070006f0072006100740 |
|
697 |
-069006f006e000000 |
|
698 |
-VersionInfo (d32a): 'FileDescription'='Entertainment Pack |
|
699 |
-FreeCell Game' - VI:460069006c006500440065007300630072006900700 |
|
700 |
-0740069006f006e000000000045006e007400650072007400610069006e006d |
|
701 |
-0065006e00740020005000610063006b0020004600720065006500430065006 |
|
702 |
-c006c002000470061006d0065000000 |
|
703 |
-VersionInfo (d396): 'FileVersion'='5.1.2600.0 (xpclient.010817 |
|
704 |
--1148)' - VI:460069006c006500560065007200730069006f006e00000000 |
|
705 |
-0035002e0031002e0032003600300030002e003000200028007800700063006 |
|
706 |
-c00690065006e0074002e003000310030003800310037002d00310031003400 |
|
707 |
-380029000000 |
|
708 |
-VersionInfo (d3fa): 'InternalName'='freecell' - VI:49006e007400 |
|
709 |
-650072006e0061006c004e0061006d006500000066007200650065006300650 |
|
710 |
-06c006c000000 |
|
711 |
-VersionInfo (d4ba): 'OriginalFilename'='freecell' - VI:4f007200 |
|
712 |
-6900670069006e0061006c00460069006c0065006e0061006d0065000000660 |
|
713 |
-0720065006500630065006c006c000000 |
|
714 |
-VersionInfo (d4f6): 'ProductName'='Sistema operativo Microsoft |
|
715 |
-Windows' - VI:500072006f0064007500630074004e0061006d00650000000 |
|
716 |
-000530069007300740065006d00610020006f00700065007200610074006900 |
|
717 |
-76006f0020004d006900630072006f0073006f0066007400ae0020005700690 |
|
718 |
-06e0064006f0077007300ae000000 |
|
719 |
-VersionInfo (d562): 'ProductVersion'='5.1.2600.0' - VI:50007200 |
|
720 |
-6f006400750063007400560065007200730069006f006e00000035002e00310 |
|
721 |
-02e0032003600300030002e0030000000 |
|
722 |
-[...] |
|
723 |
-``` |
|
724 |
- |
|
725 |
-Although VI-based signatures are intended for use in logical signatures you can test them using ordinary `.ndb` files. For example: |
|
726 |
- |
|
727 |
-``` |
|
728 |
-my_test_vi_sig:1:VI:paste_your_hex_sig_here |
|
729 |
-``` |
|
730 |
- |
|
731 |
-Final note. If you want to decode a VI-based signature into a human readable form you can use: |
|
732 |
- |
|
733 |
-```bash |
|
734 |
-echo hex_string | xxd -r -p | strings -el |
|
735 |
-``` |
|
736 |
- |
|
737 |
-For example: |
|
738 |
- |
|
739 |
-```bash |
|
740 |
-$ echo 460069006c0065004400650073006300720069007000740069006f006e |
|
741 |
-000000000045006e007400650072007400610069006e006d0065006e007400200 |
|
742 |
-05000610063006b0020004600720065006500430065006c006c00200047006100 |
|
743 |
-6d0065000000 | xxd -r -p | strings -el |
|
744 |
-FileDescription |
|
745 |
-Entertainment Pack FreeCell Game |
|
746 |
-``` |
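Where `xxd` and `strings` are not available, roughly the same decoding can be done in Python. The sketch below uses the `InternalName` signature from the debug output above and assumes the hex string is a little-endian UTF-16 dump with NUL-terminated key and value; the function name `decode_vi_sig` is illustrative.

```python
def decode_vi_sig(hex_sig: str) -> list:
    """Decode a VI hex signature into its non-empty key/value strings."""
    text = bytes.fromhex(hex_sig).decode("utf-16-le")
    # Key and value are separated and terminated by NUL characters.
    return [part for part in text.split("\x00") if part]


# 'InternalName'='freecell' from the clamscan --debug output above.
hex_sig = (
    "49006e0074006500" "72006e0061006c00" "4e0061006d006500"
    "0000660072006500" "6500630065006c00" "6c000000"
)
print(decode_vi_sig(hex_sig))
```

Unlike `strings -el`, this sketch does not skip short strings and will raise an error on input that is not valid UTF-16.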
|
747 |
- |
|
748 |
-## Trusted and Revoked Certificates |
|
749 |
- |
|
750 |
-ClamAV 0.98 checks signed PE files for certificates and verifies each certificate in the chain against a database of trusted and revoked certificates. The signature format is: |
|
751 |
- |
|
752 |
-``` |
|
753 |
- Name;Trusted;Subject;Serial;Pubkey;Exponent;CodeSign;TimeSign;CertSign; |
|
754 |
- NotBefore;Comment[;minFL[;maxFL]] |
|
755 |
-``` |
|
756 |
- |
|
757 |
-where the corresponding fields are: |
|
758 |
- |
|
759 |
-- `Name:` name of the entry |
|
760 |
- |
|
761 |
-- `Trusted:` bit field, specifying whether the cert is trusted. 1 for trusted. 0 for revoked |
|
762 |
- |
|
763 |
-- `Subject:` sha1 of the Subject field in hex |
|
764 |
- |
|
765 |
-- `Serial:` the serial number as reported by `clamscan --debug --verbose` |
|
766 |
- |
|
767 |
-- `Pubkey:` the public key in hex |
|
768 |
- |
|
769 |
-- `Exponent:` the exponent in hex. Currently ignored and hardcoded to 010001 (in hex) |
|
770 |
- |
|
771 |
-- `CodeSign:` bit field, specifying whether this cert can sign code. 1 for true, 0 for false |
|
772 |
- |
|
773 |
-- `TimeSign:` bit field. 1 for true, 0 for false |
|
774 |
- |
|
775 |
-- `CertSign:` bit field, specifying whether this cert can sign other certs. 1 for true, 0 for false |
|
776 |
- |
|
777 |
-- `NotBefore:` integer, cert should not be added before this variable. Defaults to 0 if left empty |
|
778 |
- |
|
779 |
-- `Comment:` comments for this entry |
|
780 |
- |
|
781 |
-The signatures for certs are stored inside `.crb` files. |
|
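
To make the field order concrete, here is a minimal Python sketch that splits one `.crb` line into named fields. It is not ClamAV code: `parse_crb_line` is a hypothetical helper, the sample entry is invented for illustration, and the sketch assumes the `Comment` field contains no semicolons.

```python
from typing import Dict

# Field names in the order given by the signature format above.
CRB_REQUIRED = [
    "Name", "Trusted", "Subject", "Serial", "Pubkey", "Exponent",
    "CodeSign", "TimeSign", "CertSign", "NotBefore", "Comment",
]
CRB_OPTIONAL = ["minFL", "maxFL"]


def parse_crb_line(line: str) -> Dict[str, str]:
    """Split a .crb entry into a dict keyed by field name."""
    parts = line.strip().split(";")
    if not len(CRB_REQUIRED) <= len(parts) <= len(CRB_REQUIRED) + len(CRB_OPTIONAL):
        raise ValueError(f"unexpected number of fields: {len(parts)}")
    entry = dict(zip(CRB_REQUIRED + CRB_OPTIONAL, parts))
    if entry["Trusted"] not in ("0", "1"):
        raise ValueError("Trusted must be 1 (trusted) or 0 (revoked)")
    # NotBefore defaults to 0 when left empty.
    if not entry["NotBefore"]:
        entry["NotBefore"] = "0"
    return entry


# Invented entry; the hex values are placeholders, not a real certificate.
example = "Example-Cert;1;aabb;01;ccdd;010001;1;0;1;;demo entry"
print(parse_crb_line(example)["Name"])
```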
782 |
- |
|
783 |
-## Signatures based on container metadata |
|
784 |
- |
|
785 |
-ClamAV 0.96 allows creating generic signatures matching files stored inside different container types which meet specific conditions. The signature format is |
|
786 |
- |
|
787 |
-``` |
|
788 |
- VirusName:ContainerType:ContainerSize:FileNameREGEX: |
|
789 |
- FileSizeInContainer:FileSizeReal:IsEncrypted:FilePos: |
|
790 |
- Res1:Res2[:MinFL[:MaxFL]] |
|
791 |
-``` |
|
792 |
- |
|
793 |
-where the corresponding fields are: |
|
794 |
- |
|
795 |
-- `VirusName:` Virus name to be displayed when signature matches |
|
796 |
- |
|
797 |
-- `ContainerType:` one of |
|
798 |
- - `CL_TYPE_ZIP`, |
|
799 |
- - `CL_TYPE_RAR`, |
|
800 |
- - `CL_TYPE_ARJ`, |
|
801 |
- - `CL_TYPE_MSCAB`, |
|
802 |
- - `CL_TYPE_7Z`, |
|
803 |
- - `CL_TYPE_MAIL`, |
|
804 |
- - `CL_TYPE_(POSIX|OLD)_TAR`, |
|
805 |
- - `CL_TYPE_CPIO_(OLD|ODC|NEWC|CRC)` or |
|
806 |
- - `*` to match any of the container types listed here |
|
807 |
- |
|
808 |
-- `ContainerSize:` size of the container file itself (eg. size of the zip archive) specified in bytes as absolute value or range `x-y` |
|
809 |
- |
|
810 |
-- `FileNameREGEX:` regular expression describing name of the target file |
|
811 |
- |
|
812 |
-- `FileSizeInContainer:` usually compressed size; for MAIL, TAR and CPIO == `FileSizeReal`; specified in bytes as absolute value or range |
|
813 |
- |
|
814 |
-- `FileSizeReal:` usually uncompressed size; for MAIL, TAR and CPIO == `FileSizeInContainer`; absolute value or range |
|
815 |
- |
|
816 |
-- `IsEncrypted`: 1 if the target file is encrypted, 0 if it’s not and `*` to ignore |
|
817 |
- |
|
818 |
-- `FilePos`: file position in container (counting from 1); absolute value or range |
|
819 |
- |
|
820 |
-- `Res1`: when `ContainerType` is `CL_TYPE_ZIP` or `CL_TYPE_RAR` this field is treated as a CRC sum of the target file specified in hexadecimal format; for other container types it’s ignored |
|
821 |
- |
|
822 |
-- `Res2`: not used as of ClamAV 0.96 |
|
823 |
- |
|
824 |
-The signatures for container files are stored inside `.cdb` files. |
|
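
The size and position fields accept either an absolute value or a range `x-y`. The Python sketch below shows one way to read a `.cdb` line under those rules. It is illustrative only: the helper names and the sample signature are hypothetical, `*` is treated as "ignore" for every size-like field (an assumption), and the regular expression is assumed to contain no colons.

```python
from typing import Dict, Optional, Tuple

CDB_FIELDS = [
    "VirusName", "ContainerType", "ContainerSize", "FileNameREGEX",
    "FileSizeInContainer", "FileSizeReal", "IsEncrypted", "FilePos",
    "Res1", "Res2",
]
RANGE_FIELDS = ("ContainerSize", "FileSizeInContainer", "FileSizeReal", "FilePos")


def parse_range(value: str) -> Optional[Tuple[int, int]]:
    """Return (low, high) for 'x' or 'x-y', or None for the '*' wildcard."""
    if value == "*":
        return None
    if "-" in value:
        low, high = value.split("-", 1)
        return int(low), int(high)
    number = int(value)
    return number, number


def parse_cdb_line(line: str) -> Dict[str, object]:
    """Split a .cdb entry into named fields, expanding size ranges."""
    parts = line.strip().split(":")
    if not 10 <= len(parts) <= 12:
        raise ValueError(f"unexpected number of fields: {len(parts)}")
    entry: Dict[str, object] = dict(zip(CDB_FIELDS + ["MinFL", "MaxFL"], parts))
    for name in RANGE_FIELDS:
        entry[name] = parse_range(parts[CDB_FIELDS.index(name)])
    return entry


# Invented signature: first file in any-size ZIP, name ending in .exe,
# uncompressed size between 1000 and 2000 bytes, not encrypted.
example = r"Example.Zip.Exe:CL_TYPE_ZIP:*:\.exe$:*:1000-2000:0:1:*:*"
print(parse_cdb_line(example)["FileSizeReal"])
```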
825 |
- |
|
826 |
-## Signatures based on ZIP/RAR metadata (obsolete) |
|
827 |
- |
|
828 |
-The (now obsolete) archive metadata signatures can be only applied to |
|
829 |
-ZIP and RAR files and have the following format: |
|
830 |
- |
|
831 |
-``` |
|
832 |
- virname:encrypted:filename:normal size:csize:crc32:cmethod: |
|
833 |
- fileno:max depth |
|
834 |
-``` |
|
835 |
- |
|
836 |
-where the corresponding fields are: |
|
837 |
- |
|
838 |
-- Virus name |
|
839 |
- |
|
840 |
-- Encryption flag (1 – encrypted, 0 – not encrypted) |
|
841 |
- |
|
842 |
-- File name (this is a regular expression - \* to ignore) |
|
843 |
- |
|
844 |
-- Normal (uncompressed) size (\* to ignore) |
|
845 |
- |
|
846 |
-- Compressed size (\* to ignore) |
|
847 |
- |
|
848 |
-- CRC32 (\* to ignore) |
|
849 |
- |
|
850 |
-- Compression method (\* to ignore) |
|
851 |
- |
|
852 |
-- File position in archive (\* to ignore) |
|
853 |
- |
|
854 |
-- Maximum number of nested archives (\* to ignore) |
|
855 |
- |
|
856 |
-The database file should have the extension of `.zmd` or `.rmd` for zip or rar metadata respectively. |
|
857 |
- |
|
858 |
-## Whitelist databases |
|
859 |
- |
|
860 |
-To whitelist a specific file, use the MD5 signature format and place it inside a database file with the extension `.fp`. To whitelist a specific file with the SHA1 or SHA256 file hash signature format, place the signature inside a database file with the extension `.sfp`. To whitelist a specific signature from the database, add its name to a local file named `local.ign2` stored inside the database directory. You can additionally follow the signature name with the MD5 of the entire database entry for this signature, e.g.: |
|
861 |
- |
|
862 |
-``` |
|
863 |
- Eicar-Test-Signature:bc356bae4c42f19a3de16e333ba3569c |
|
864 |
-``` |
|
865 |
- |
|
866 |
-In such a case, the signature will no longer be whitelisted when its entry in the database gets modified (eg. the signature gets updated to avoid false alerts). |
|
867 |
- |
|
868 |
-## Signature names |
|
869 |
- |
|
870 |
-ClamAV uses the following prefixes for signature names: |
|
871 |
- |
|
872 |
-- *Worm* for Internet worms |
|
873 |
- |
|
874 |
-- *Trojan* for backdoor programs |
|
875 |
- |
|
876 |
-- *Adware* for adware |
|
877 |
- |
|
878 |
-- *Flooder* for flooders |
|
879 |
- |
|
880 |
-- *HTML* for HTML files |
|
881 |
- |
|
882 |
-- *Email* for email messages |
|
883 |
- |
|
884 |
-- *IRC* for IRC trojans |
|
885 |
- |
|
886 |
-- *JS* for JavaScript malware |
|
887 |
- |
|
888 |
-- *PHP* for PHP malware |
|
889 |
- |
|
890 |
-- *ASP* for ASP malware |
|
891 |
- |
|
892 |
-- *VBS* for VBS malware |
|
893 |
- |
|
894 |
-- *BAT* for BAT malware |
|
895 |
- |
|
896 |
-- *W97M*, *W2000M* for Word macro viruses |
|
897 |
- |
|
898 |
-- *X97M*, *X2000M* for Excel macro viruses |
|
899 |
- |
|
900 |
-- *O97M*, *O2000M* for generic Office macro viruses |
|
901 |
- |
|
902 |
-- *DoS* for Denial of Service attack software |
|
903 |
- |
|
904 |
-- *DOS* for old DOS malware |
|
905 |
- |
|
906 |
-- *Exploit* for popular exploits |
|
907 |
- |
|
908 |
-- *VirTool* for virus construction kits |
|
909 |
- |
|
910 |
-- *Dialer* for dialers |
|
911 |
- |
|
912 |
-- *Joke* for hoaxes |
|
913 |
- |
|
914 |
-Important rules of the naming convention: |
|
915 |
- |
|
916 |
-- always use a -zippwd suffix in the malware name for signatures of type zmd, |
|
917 |
- |
|
918 |
-- always use a -rarpwd suffix in the malware name for signatures of type rmd, |
|
919 |
- |
|
920 |
-- only use alphanumeric characters, dashes (-), dots (.), and underscores (_) in malware names; never use spaces, apostrophes, or quote marks. |
|
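
The naming rules above are simple enough to check mechanically. The following Python sketch is one possible validator, not a ClamAV tool; `is_valid_signature_name` and its `sig_type` parameter are hypothetical names introduced for illustration.

```python
import re
from typing import Optional

# Alphanumerics, dash, dot, and underscore only.
_NAME_RE = re.compile(r"[A-Za-z0-9._-]+")


def is_valid_signature_name(name: str, sig_type: Optional[str] = None) -> bool:
    """Check a malware name against the naming rules listed above."""
    if not _NAME_RE.fullmatch(name):
        return False
    if sig_type == "zmd":
        return name.endswith("-zippwd")
    if sig_type == "rmd":
        return name.endswith("-rarpwd")
    return True


print(is_valid_signature_name("Worm.Mydoom.U"))
print(is_valid_signature_name("Bad Name"))
```

The two calls print `True` and `False` respectively, since the second name contains a space.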
921 |
- |
|
922 |
-## Using YARA rules in ClamAV |
|
923 |
- |
|
924 |
-ClamAV version 0.99 and above can process YARA rules. ClamAV virus database file names ending with “.yar” or “.yara” are parsed as YARA rule files. The YARA rule grammar documentation may be found at http://plusvic.github.io/yara/. There are currently a few limitations on using YARA rules within ClamAV: |
|
925 |
- |
|
926 |
-- YARA modules are not yet supported by ClamAV. This includes the “import” keyword and any YARA module-specific keywords. |
|
927 |
- |
|
928 |
-- Global rules (“global” keyword) are not supported by ClamAV. |
|
929 |
- |
|
930 |
-- External variables (“contains” and “matches” keywords) are not supported. |
|
931 |
- |
|
932 |
-- YARA rules pre-compiled with the *yarac* command are not supported. |
|
933 |
- |
|
934 |
-- As in the ClamAV logical and extended signature formats, YARA strings and segments of strings separated by wild cards must represent at least two octets of data. |
|
935 |
- |
|
936 |
-- There is a maximum of 64 strings per YARA rule. |
|
937 |
- |
|
938 |
-- YARA rules in ClamAV must contain at least one literal, hexadecimal, or regular expression string. |
|
939 |
- |
|
940 |
-In addition, there are a few more ClamAV processing modes that may affect the outcome of YARA rules. |
|
941 |
- |
|
942 |
-- *File decomposition and decompression* - Since ClamAV uses file decomposition and decompression to find viruses within de-archived and uncompressed inner files, YARA rules executed by ClamAV will match against these files as well. |
|
943 |
- |
|
944 |
-- *Normalization* - By default, ClamAV normalizes HTML, JavaScript, and ASCII text files. YARA rules in ClamAV will match against the normalized result. The effects of normalization of these file types may be captured using `clamscan --leave-temps --tempdir=mytempdir`. YARA rules may then be written using the normalized file(s) found in `mytempdir`. Alternatively, starting with ClamAV 0.100.0, `clamscan --normalize=no` will prevent normalization and only scan the raw file. To obtain similar behavior prior to 0.99.2, use `clamscan --scan-html=no`. The corresponding parameters for clamd.conf are `Normalize` and `ScanHTML`. |
|
945 |
- |
|
946 |
-- *YARA conditions driven by string matches* - All YARA conditions are driven by string matches in ClamAV. This saves from executing every YARA rule on every file. Any YARA condition may be augmented with a string match clause which is always true, such as: |
|
947 |
- |
|
948 |
-```yara |
|
949 |
- rule CheckFileSize |
|
950 |
- { |
|
951 |
- strings: |
|
952 |
- $abc = "abc" |
|
953 |
- condition: |
|
954 |
- ($abc or not $abc) and filesize < 200KB |
|
955 |
- } |
|
956 |
-``` |
|
957 |
- |
|
958 |
-This will ensure that the YARA condition always performs the desired action (checking the file size in this example). |
|
959 |
- |
|
960 |
-## Passwords for archive files \[experimental\] |
|
961 |
- |
|
962 |
-ClamAV 0.99 allows users to specify password attempts for certain password-compatible archives. Passwords will be attempted in order of appearance in the password signature file, which uses the extension `.pwdb`. If no passwords apply or none are provided, ClamAV will default to the original behavior of parsing the file. Currently, as of ClamAV 0.99 \[flevel 81\], only `.zip` archives using the traditional PKWARE encryption are supported. The signature format is: |
|
963 |
- |
|
964 |
-``` |
|
965 |
- SignatureName;TargetDescriptionBlock;PWStorageType;Password |
|
966 |
-``` |
|
967 |
- |
|
968 |
-where: |
|
969 |
- |
|
970 |
-- `SignatureName`: name to be displayed during debug when a password is successful |
|
971 |
- |
|
972 |
-- `TargetDescriptionBlock`: provides information about the engine and target file with comma separated Arg:Val pairs |
|
973 |
- - `Engine:X-Y`: Required engine functionality |
|
974 |
- - `Container:CL_TYPE_*`: File type of applicable containers |
|
975 |
- |
|
976 |
-- `PWStorageType`: determines how the password field is parsed |
|
977 |
- - 0 = cleartext |
|
978 |
- - 1 = hex |
|
979 |
- |
|
980 |
-- `Password`: value used in password attempt |
|
981 |
- |
|
982 |
-The signatures for password attempts are stored inside `.pwdb` files. |
|
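
As an illustration of the format, the Python sketch below assembles one `.pwdb` line and optionally hex-encodes the password (`PWStorageType` 1). It is a sketch under stated assumptions: the helper `format_pwdb_entry`, the signature name, and the engine range `81-255` are hypothetical (only flevel 81 is mentioned above), and the hex form is assumed to be the password's bytes written as hexadecimal.

```python
def format_pwdb_entry(
    name: str,
    password: str,
    engine_min: int,
    engine_max: int,
    container: str = "CL_TYPE_ZIP",
    as_hex: bool = False,
) -> str:
    """Build a 'SignatureName;TargetDescriptionBlock;PWStorageType;Password' line."""
    if ";" in name:
        raise ValueError("signature name must not contain ';'")
    # Comma-separated Arg:Val pairs, as described for TargetDescriptionBlock.
    target = f"Engine:{engine_min}-{engine_max},Container:{container}"
    if as_hex:
        storage_type, value = 1, password.encode("utf-8").hex()
    else:
        if ";" in password:
            raise ValueError("cleartext password contains ';'; use as_hex=True")
        storage_type, value = 0, password
    return f"{name};{target};{storage_type};{value}"


print(format_pwdb_entry("Example.Password", "secret", 81, 255, as_hex=True))
```

Hex storage is the safer choice when a password contains a semicolon or non-printable bytes, since those would otherwise collide with the field delimiter.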
983 |
- |
|
984 |
-# Special files |
|
985 |
- |
|
986 |
-## HTML |
|
987 |
- |
|
988 |
-ClamAV contains special HTML normalisation code which helps to detect HTML exploits. Running `sigtool --html-normalise` on an HTML file should generate the following files: |
|
989 |
- |
|
990 |
-- nocomment.html - the file is normalized, lower-case, with all comments and superfluous white space removed |
|
991 |
- |
|
992 |
-- notags.html - as above but with all HTML tags removed |
|
993 |
- |
|
994 |
-The code automatically decodes JScript.encode parts and character references (e.g. `&#102;`). You need to create a signature against one of the created files. To eliminate potential false positive alerts, the target type should be set to 3. |
|
995 |
- |
|
996 |
-## Text files |
|
997 |
- |
|
998 |
-As with HTML, all ASCII text files are normalized (converted to lower-case, all superfluous white space and control characters removed, etc.) before scanning. Use `clamscan --leave-temps` to obtain a normalized file, then create a signature with the target type 7. |
|
999 |
- |
|
1000 |
-## Compressed Portable Executable files |
|
1001 |
- |
|
1002 |
-If the file is compressed with UPX, FSG, Petite or another PE packer supported by libclamav, run `clamscan` with `--debug --leave-temps`. Example output for an FSG-compressed file: |
|
1003 |
- |
|
1004 |
-```bash |
|
1005 |
-LibClamAV debug: UPX/FSG/MEW: empty section found - assuming compression |
|
1006 |
-LibClamAV debug: FSG: found old EP @119e0 |
|
1007 |
-LibClamAV debug: FSG: Unpacked and rebuilt executable saved in |
|
1008 |
-/tmp/clamav-f592b20f9329ac1c91f0e12137bcce6c |
|
1009 |
- |
|
1010 |
-``` |
|
1011 |
- |
|
1012 |
-Next create a type 1 signature for `/tmp/clamav-f592b20f9329ac1c91f0e12137bcce6c` |
1013 | 1 |
deleted file mode 100644 |
... | ... |
@@ -1,242 +0,0 @@ |
1 |
-# Usage |
|
2 |
- |
|
3 |
-## Clam daemon |
|
4 |
- |
|
5 |
-`clamd` is a multi-threaded daemon that uses *libclamav* to scan files for viruses. It can listen on one or both of the following: |
|
6 |
- |
|
7 |
-- Unix (local) socket |
|
8 |
-- TCP socket |
|
9 |
- |
|
10 |
-The daemon is fully configurable via the `clamd.conf` file \[8\]. `clamd` recognizes the following commands: |
|
11 |
- |
|
12 |
-- **PING** |
|
13 |
- Check the daemon’s state (should reply with "PONG"). |
|
14 |
-- **VERSION** |
|
15 |
- Print program and database versions. |
|
16 |
-- **RELOAD** |
|
17 |
- Reload the databases. |
|
18 |
-- **SHUTDOWN** |
|
19 |
- Perform a clean exit. |
|
20 |
-- **SCAN file/directory** |
|
21 |
- Scan file or directory (recursively) with archive support enabled (a full path is required). |
|
22 |
-- **RAWSCAN file/directory** |
|
23 |
- Scan file or directory (recursively) with archive and special file support disabled (a full path is required). |
|
24 |
-- **CONTSCAN file/directory** |
|
25 |
- Scan file or directory (recursively) with archive support enabled and don’t stop the scanning when a virus is found. |
|
26 |
-- **MULTISCAN file/directory** |
|
27 |
- Scan file in a standard way or scan directory (recursively) using multiple threads (to make the scanning faster on SMP machines). |
|
28 |
-- **ALLMATCHSCAN file/directory** |
|
29 |
- ALLMATCHSCAN works just like SCAN except that it sets a mode in which clamd, after finding a virus within a file, continues scanning for additional viruses. |
|
30 |
-- **INSTREAM** |
|
31 |
- *It is mandatory to prefix this command with **n** or **z**.* Scan a stream of data. The stream is sent to clamd in chunks, after INSTREAM, on the same socket on which the command was sent. This avoids the overhead of establishing new TCP connections and problems with NAT. The format of the chunk is: `<length><data>` where `<length>` is the size of the following data in bytes expressed as a 4 byte unsigned integer in network byte order and `<data>` is the actual chunk. Streaming is terminated by sending a zero-length chunk. Note: do not exceed StreamMaxLength as defined in clamd.conf, otherwise clamd will reply with *INSTREAM size limit exceeded* and close the connection. |
|
32 |
-- **FILDES** |
|
33 |
- *It is mandatory to newline terminate this command, or prefix with **n** or **z**. This command only works on UNIX domain sockets.* Scan a file descriptor. After issuing a FILDES command a subsequent rfc2292/bsd4.4 style packet (with at least one dummy character) is sent to clamd carrying the file descriptor to be scanned inside the ancillary data. Alternatively the file descriptor may be sent in the same packet, including the extra character. |
|
34 |
-- **STATS** |
|
35 |
- *It is mandatory to newline terminate this command, or prefix with **n** or **z**, it is recommended to only use the **z** prefix.* On this command clamd provides statistics about the scan queue, contents of scan queue, and memory usage. The exact reply format is subject to changes in future releases. |
|
36 |
-- **IDSESSION, END** |
|
37 |
- *It is mandatory to prefix this command with **n** or **z**, also all commands inside **IDSESSION** must be prefixed.* Start/end a clamd session. Within a session multiple SCAN, INSTREAM, FILDES, VERSION, STATS commands can be sent on the same socket without opening new connections. Replies from clamd will be in the form `<id>: <response>` where `<id>` is the request number (in ASCII, starting from 1) and `<response>` is the usual clamd reply. The reply lines have the same delimiter as the corresponding command had. Clamd will process the commands asynchronously, and reply as soon as it has finished processing. Clamd requires clients to read all the replies it sent, before sending more commands to prevent send() deadlocks. The recommended way to implement a client that uses IDSESSION is with non-blocking sockets, and a select()/poll() loop: whenever send would block, sleep in select/poll until either you can write more data, or read more replies. *Note that using non-blocking sockets without the select/poll loop and alternating recv()/send() doesn’t comply with clamd’s requirements.* If clamd detects that a client has deadlocked, it will close the connection. Note that clamd may close an IDSESSION connection too if the client doesn’t follow the protocol’s requirements. |
|
38 |
-- **STREAM** (deprecated, use **INSTREAM** instead) |
|
39 |
- Scan stream: clamd will return a new port number you should connect to and send data to scan. |
|
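
The INSTREAM framing described in the list above can be sketched in Python as follows. This is a minimal illustration rather than a reference client: `instream_frames` and `scan_bytes` are hypothetical helper names, port 3310 is taken from the telnet example elsewhere in this manual, and the `z` prefix (NULL-delimited command and reply) is used throughout.

```python
import socket
import struct
from typing import Iterable, Iterator


def instream_frames(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield the byte frames for one zINSTREAM request."""
    yield b"zINSTREAM\0"
    for chunk in chunks:
        if chunk:
            # <length> is a 4-byte unsigned integer in network byte order.
            yield struct.pack("!I", len(chunk)) + chunk
    # A zero-length chunk terminates the stream.
    yield struct.pack("!I", 0)


def scan_bytes(data: bytes, host: str = "127.0.0.1", port: int = 3310,
               chunk_size: int = 8192) -> str:
    """Send data to a running clamd over TCP and return its reply text."""
    chunks = (data[i:i + chunk_size] for i in range(0, len(data), chunk_size))
    with socket.create_connection((host, port), timeout=30) as sock:
        for frame in instream_frames(chunks):
            sock.sendall(frame)
        reply = b""
        # With the z prefix, the reply is terminated by a NULL character.
        while not reply.endswith(b"\0"):
            part = sock.recv(4096)
            if not part:
                break
            reply += part
    return reply.rstrip(b"\0").decode("utf-8", errors="replace")
```

`scan_bytes` needs a reachable clamd and does not enforce `StreamMaxLength`; a real client would compare the payload size against that limit before sending.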
40 |
- |
|
41 |
-It’s recommended to prefix clamd commands with the letter **z** (eg. zSCAN) to indicate that the command will be delimited by a NULL character and that clamd should continue reading command data until a NULL character is read. The null delimiter assures that the complete command and its entire argument will be processed as a single command. Alternatively commands may be prefixed with the letter **n** (e.g. nSCAN) to use a newline character as the delimiter. Clamd replies will honour the requested terminator in turn. If clamd doesn’t recognize the command, or the command doesn’t follow the requirements specified below, it will reply with an error message, and close the connection. Clamd can handle the following signals: |
|
42 |
- |
|
43 |
-- **SIGTERM** - perform a clean exit |
|
44 |
-- **SIGHUP** - reopen the log file |
|
45 |
-- **SIGUSR2** - reload the database |
|
46 |
- |
|
47 |
-Clamd should not be started in the background using the shell operator `&` or external tools. Instead, run clamd and wait for it to load the database and daemonize itself. After that, clamd is immediately ready to accept connections and perform file scanning. |
|
48 |
- |
|
49 |
-## Clam**d**scan |
|
50 |
- |
|
51 |
-`clamdscan` is a simple `clamd` client. In many cases you can use it as a `clamscan` replacement; however, you must remember that: |
|
52 |
- |
|
53 |
-- it only depends on `clamd` |
|
54 |
-- although it accepts the same command line options as `clamscan` most of them are ignored because they must be enabled directly in `clamd`, i.e. `clamd.conf` |
|
55 |
-- in TCP mode scanned files must be accessible to `clamd`; if you enabled LocalSocket in clamd.conf, clamdscan will try to work around this limitation by using FILDES |
|
56 |
- |
|
57 |
-## On-access Scanning |
|
58 |
- |
|
59 |
-There is a special thread in `clamd` that performs on-access scanning under Linux and shares internal virus database with the daemon. By default, this thread will only notify you when potential threats are discovered. If you turn on prevention via `clamd.conf` then **you must follow some important rules when using it:** |
|
60 |
- |
|
61 |
-- Always stop the daemon cleanly, using the SHUTDOWN command or the SIGTERM signal. Otherwise you can lose access to protected files until the system is restarted. |
|
62 |
-- Never protect the directory your mail-scanner software uses for attachment unpacking. Access to all infected files will be automatically blocked and the scanner (including `clamd`\!) will not be able to detect any viruses. As a result, **all infected mails may be delivered.** |
|
63 |
-- Watch your entire filesystem only using the `clamd.conf` OnAccessMountPath option. While this will disable on-access prevention, it will avoid potential system lockups caused by fanotify’s blocking functionality. |
|
64 |
-- Using the On-Access Scanner to watch a virtual filesystem will result in undefined behaviour. |
|
65 |
- |
|
66 |
-The default configuration utilizes inotify to recursively keep track of directories. If you need to protect more than 8192 directories it will be necessary to change inotify’s `max_user_watches` value. |
|
67 |
- |
|
68 |
-This can be done temporarily with: |
|
69 |
- |
|
70 |
-```bash |
|
71 |
- $ sysctl fs.inotify.max_user_watches=<n> |
|
72 |
-``` |
|
73 |
- |
|
74 |
-Where `<n>` is the new maximum desired. |
|
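
To estimate whether the limit needs raising, you can compare the number of directories under the watched path with the current kernel setting. The Python sketch below is one way to do that; the helper names are hypothetical, and it assumes one inotify watch per directory, which is a simplification.

```python
import os
from pathlib import Path
from typing import Optional

INOTIFY_LIMIT_FILE = "/proc/sys/fs/inotify/max_user_watches"


def count_directories(root: str) -> int:
    """Count root and every directory below it (symlinks are not followed)."""
    return sum(1 for _ in os.walk(root))


def read_inotify_limit(path: str = INOTIFY_LIMIT_FILE) -> Optional[int]:
    """Return the current max_user_watches value, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def needs_higher_limit(root: str, limit: Optional[int]) -> bool:
    """True when the directory count under root exceeds a known limit."""
    return limit is not None and count_directories(root) > limit
```

For example, `needs_higher_limit("/home", read_inotify_limit())` reports whether `/home` contains more directories than the current limit allows.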
75 |
- |
|
76 |
-To watch your entire filesystem add the following lines to `clamd.conf`: |
|
77 |
- |
|
78 |
-```ini |
|
79 |
- ScanOnAccess yes |
|
80 |
- OnAccessMountPath / |
|
81 |
-``` |
|
82 |
- |
|
83 |
-Similarly, to protect your home directory add the following lines to |
|
84 |
-`clamd.conf`: |
|
85 |
- |
|
86 |
-```ini |
|
87 |
- ScanOnAccess yes |
|
88 |
- OnAccessIncludePath /home |
|
89 |
- OnAccessExcludePath /home/user/temp/dir/of/your/mail/scanning/software |
|
90 |
- OnAccessPrevention yes |
|
91 |
-``` |
|
92 |
- |
|
93 |
-For more configuration options, type ’man clamd.conf’ or reference the example clamd.conf. |
|
94 |
- |
|
95 |
-## Clamdtop |
|
96 |
- |
|
97 |
-`clamdtop` is a tool to monitor one or multiple instances of clamd. It has a (color) ncurses interface that shows the jobs in clamd’s queue, memory usage, and information about the loaded signature database. You can specify on the command line which clamd instance(s) it should connect to. By default it will attempt to connect to the local clamd as defined in clamd.conf. |
|
98 |
- |
|
99 |
-For more detailed help, type `man clamdtop` or `clamdtop --help`. |
|
100 |
- |
|
101 |
-## Clamscan |
|
102 |
- |
|
103 |
-`clamscan` is ClamAV’s command line virus scanner. It can be used to scan files and/or directories for viruses. In order for clamscan to work properly, the ClamAV virus database files must be installed on the system you are using clamscan on. |
|
104 |
- |
|
105 |
-The general usage of clamscan is: clamscan \[options\] |
|
106 |
-\[file/directory/-\] |
|
107 |
- |
|
108 |
-For more detailed help, type `man clamscan` or `clamscan --help`. |
|
109 |
- |
|
110 |
-## ClamBC |
|
111 |
- |
|
112 |
-`clambc` is Clam AntiVirus’ bytecode testing tool. It can be used to test files which contain bytecode. For more detailed help, type `man clambc` or `clambc --help`. |
|
113 |
- |
|
114 |
-## Freshclam |
|
115 |
- |
|
116 |
-`freshclam` is ClamAV’s virus database update tool. It reads its configuration from the file `freshclam.conf` (this may be overridden by command line options). Freshclam’s default behavior is to attempt to update databases that are paired with downloaded cdiffs. Potentially corrupted databases are not updated and are automatically fully replaced after several failed attempts unless otherwise specified. |
|
117 |
- |
|
118 |
-Here is a sample usage including cdiffs: |
|
119 |
- |
|
120 |
-```bash |
|
121 |
-$ freshclam |
|
122 |
- |
|
123 |
-ClamAV update process started at Mon Oct 7 08:15:10 2013 |
|
124 |
-main.cld is up to date (version: 55, sigs: 2424225, f-level: 60, builder: neo) |
|
125 |
-Downloading daily-17945.cdiff [100%] |
|
126 |
-Downloading daily-17946.cdiff [100%] |
|
127 |
-Downloading daily-17947.cdiff [100%] |
|
128 |
-daily.cld updated (version: 17947, sigs: 406951, f-level: 63, builder: neo) |
|
129 |
-Downloading bytecode-227.cdiff [100%] |
|
130 |
-Downloading bytecode-228.cdiff [100%] |
|
131 |
-bytecode.cld updated (version: 228, sigs: 43, f-level: 63, builder: neo) |
|
132 |
-Database updated (2831219 signatures) from database.clamav.net (IP: 64.6.100.177) |
|
133 |
-``` |
|
134 |
- |
|
135 |
-For more detailed help, type `man freshclam` or `freshclam --help`. |
|
136 |
- |
|
137 |
-## Clamconf |
|
138 |
- |
|
139 |
-`clamconf` is the Clam AntiVirus configuration utility. It displays the values of ClamAV configuration options: the contents of clamd.conf (or a notice if it is not properly configured), the contents of freshclam.conf, and information about software settings, the database, the platform, and the build. Here is a sample clamconf output: |
|
140 |
- |
|
141 |
-```bash |
|
142 |
-$ clamconf |
|
143 |
- |
|
144 |
-Checking configuration files in /etc/clamav |
|
145 |
- |
|
146 |
-Config file: clamd.conf |
|
147 |
-ERROR: Please edit the example config file /etc/clamav/clamd.conf |
|
148 |
- |
|
149 |
-Config file: freshclam.conf |
|
150 |
-ERROR: Please edit the example config file /etc/clamav/freshclam.conf |
|
151 |
- |
|
152 |
-clamav-milter.conf not found |
|
153 |
- |
|
154 |
-Software settings |
|
155 |
-Version: 0.98.2 |
|
156 |
-Optional features supported: MEMPOOL IPv6 AUTOIT_EA06 BZIP2 RAR JIT |
|
157 |
- |
|
158 |
-Database information |
|
159 |
-Database directory: /xclam/gcc/release/share/clamav |
|
160 |
-WARNING: freshclam.conf and clamd.conf point to different database directories |
|
161 |
-print_dbs: Can't open directory /xclam/gcc/release/share/clamav |
|
162 |
- |
|
163 |
-Platform information |
|
164 |
-uname: Linux 3.5.0-44-generic #67~precise1-Ubuntu SMP Wed Nov 13 16:20:03 UTC 2013 i686 |
|
165 |
-OS: linux-gnu, ARCH: i386, CPU: i686 |
|
166 |
-Full OS version: Ubuntu 12.04.3 LTS |
|
167 |
-zlib version: 1.2.3.4 (1.2.3.4), compile flags: 55 |
|
168 |
-Triple: i386-pc-linux-gnu |
|
169 |
-CPU: i686, Little-endian |
|
170 |
-platform id: 0x0a114d4d0404060401040604 |
|
171 |
- |
|
172 |
-Build information |
|
173 |
-GNU C: 4.6.4 (4.6.4) |
|
174 |
-GNU C++: 4.6.4 (4.6.4) |
|
175 |
-CPPFLAGS: |
|
176 |
-CFLAGS: -g -O0 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE |
|
177 |
-CXXFLAGS: |
|
178 |
-LDFLAGS: |
|
179 |
-Configure: '--prefix=/xclam/gcc/release/' '--disable-clamav' '--enable-debug' 'CFLAGS=-g -O0' |
|
180 |
-sizeof(void*) = 4 |
|
181 |
-Engine flevel: 77, dconf: 77 |
|
182 |
- |
|
183 |
-``` |
|
184 |
- |
|
185 |
-For more detailed help, type `man clamconf` or `clamconf --help`. |
|
186 |
- |
|
187 |
-## Output format |
|
188 |
- |
|
189 |
-### clamscan |
|
190 |
- |
|
191 |
-`clamscan` writes all regular program messages to **stdout** and errors/warnings to **stderr**. You can use the option `--stdout` to redirect all program messages to **stdout**. Warnings and error messages from `libclamav` are always printed to **stderr**. A typical output from `clamscan` looks like this: |
|
192 |
- |
|
193 |
-```bash |
|
194 |
- /tmp/test/removal-tool.exe: Worm.Sober FOUND |
|
195 |
- /tmp/test/md5.o: OK |
|
196 |
- /tmp/test/blob.c: OK |
|
197 |
- /tmp/test/message.c: OK |
|
198 |
- /tmp/test/error.hta: VBS.Inor.D FOUND |
|
199 |
-``` |
|
200 |
- |
|
201 |
-When a virus is found, its name is printed between the `filename:` and `FOUND` strings. For archives, the scanner depends on libclamav and only prints the first virus found within an archive: |
|
202 |
- |
|
203 |
-```bash |
|
204 |
- $ clamscan malware.zip |
|
205 |
- malware.zip: Worm.Mydoom.U FOUND |
|
206 |
-``` |
|
207 |
- |
|
208 |
-When using the `--allmatch` (`-z`) flag, clamscan may print multiple virus `FOUND` lines for archives and files. |
|
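
Because each result line follows the pattern `filename: [detail] STATUS`, the output is straightforward to post-process. The Python sketch below parses the `FOUND`, `OK`, and `ERROR` lines shown in this section. `ScanResult` and `parse_scan_line` are hypothetical names, and the sketch assumes file names do not themselves contain a colon followed by a space.

```python
import re
from typing import NamedTuple, Optional


class ScanResult(NamedTuple):
    path: str
    status: str            # "FOUND", "OK", or "ERROR"
    detail: Optional[str]  # virus name or error text; None for OK


_LINE_RE = re.compile(
    r"^(?P<path>.+?): (?:(?P<detail>.+) )?(?P<status>FOUND|OK|ERROR)$"
)


def parse_scan_line(line: str) -> Optional[ScanResult]:
    """Parse one result line; return None for lines in any other format."""
    match = _LINE_RE.match(line.strip())
    if match is None:
        return None
    return ScanResult(match["path"], match["status"], match["detail"])


for sample in (
    "/tmp/test/removal-tool.exe: Worm.Sober FOUND",
    "/tmp/test/md5.o: OK",
):
    print(parse_scan_line(sample))
```

With `--allmatch`, the same file may yield several `FOUND` results, so callers should group by `path` rather than assume one line per file.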
209 |
- |
|
210 |
-### clamd |
|
211 |
- |
|
212 |
-The output format of `clamd` is very similar to `clamscan`. |
|
213 |
- |
|
214 |
-```bash |
|
215 |
- $ telnet localhost 3310 |
|
216 |
- Trying 127.0.0.1... |
|
217 |
- Connected to localhost. |
|
218 |
- Escape character is '^]'. |
|
219 |
- SCAN /home/zolw/test |
|
220 |
- /home/zolw/test/clam.exe: ClamAV-Test-File FOUND |
|
221 |
- Connection closed by foreign host. |
|
222 |
-``` |
|
223 |
- |
|
224 |
-In **SCAN** mode, clamd closes the connection when the first virus is found. |
|
225 |
- |
|
226 |
-```bash |
|
227 |
- SCAN /home/zolw/test/clam.zip |
|
228 |
- /home/zolw/test/clam.zip: ClamAV-Test-File FOUND |
|
229 |
-``` |
|
230 |
- |
|
231 |
-**CONTSCAN** and **MULTISCAN** don’t stop scanning in case a virus is found. Error messages are printed in the following format: |
|
232 |
- |
|
233 |
-```bash |
|
234 |
- SCAN /no/such/file |
|
235 |
- /no/such/file: Can't stat() the file. ERROR |
|
236 |
-``` |
237 | 1 |
new file mode 100644 |
... | ... |
@@ -0,0 +1,25 @@ |
0 |
+# Clam AntiVirus 0.100.0 *User Manual* |
|
1 |
+ |
|
2 |
+![image](images/demon.png) |
|
3 |
+ |
|
4 |
+----- |
|
5 |
+ |
|
6 |
+Table Of Contents |
|
7 |
+ |
|
8 |
+1. [Introduction to ClamAV](UserManual/Introduction.md) |
|
9 |
+2. [Installing ClamAV](UserManual/Installation.md) |
|
10 |
+3. [Configuring ClamAV](UserManual/Configuration.md) |
|
11 |
+4. [Using ClamAV](UserManual/Usage.md) |
|
12 |
+5. [Build \[lib\]ClamAV Into Your Programs](UserManual/libclamav.md) |
|
13 |
+6. [Writing ClamAV Signatures](UserManual/Signatures.md) |
|
14 |
+7. [Writing ClamAV Phishing Signatures](UserManual/PhishSigs.md) |
|
15 |
+ |
|
16 |
+----- |
|
17 |
+ |
|
18 |
+ClamAV User Manual © 2018 Cisco Systems, Inc. |
|
19 |
+ |
|
20 |
+This document is distributed under the terms of the GNU General Public License v2. |
|
21 |
+ |
|
22 |
+Clam AntiVirus is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. |
|
23 |
+ |
|
24 |
+ClamAV and Clam AntiVirus are trademarks of Cisco Systems, Inc. |
0 | 2 |
new file mode 100644 |
... | ... |
@@ -0,0 +1,105 @@ |
0 |
+# Configuration |
|
1 |
+ |
|
2 |
+Before proceeding with the steps below, you should run the ’clamconf’ command, which gives important information about your ClamAV configuration. See section [5.8](#sec:clamconf) for more details. |
|
3 |
+ |
|
4 |
+## clamd |
|
5 |
+ |
|
6 |
+Before you start using the daemon you have to edit the configuration file (otherwise `clamd` won’t run): |
|
7 |
+ |
|
8 |
+```bash |
|
9 |
+ $ clamd |
|
10 |
+ ERROR: Please edit the example config file /etc/clamd.conf. |
|
11 |
+``` |
|
12 |
+ |
|
13 |
+This shows the location of the default configuration file. The format and options of this file are fully described in the *clamd.conf(5)* manual. The config file is well commented and configuration should be straightforward. |
|
14 |
+ |
|
15 |
+### On-access scanning |
|
16 |
+ |
|
17 |
+One of the interesting features of `clamd` is on-access scanning based on fanotify, included in Linux since kernel 2.6.36. **This is not required to run clamd**. At the moment the fanotify header is only available for Linux. |
|
18 |
+ |
|
19 |
+Configure on-access scanning in `clamd.conf` and read the [on-access](Usage.md#On-access-Scanning) section for on-access scanning usage. |
|
20 |
+ |
|
21 |
+## clamav-milter |
|
22 |
+ |
|
23 |
+ClamAV (v0.95) includes a new, redesigned clamav-milter. The most notable difference is that the internal mode has been dropped and a working clamd companion is now required. The second important difference is that the milter now has its own configuration and log files. |
|
24 |
+ |
|
25 |
+To compile ClamAV with the clamav-milter, just run `./configure --enable-milter` and make as usual. In order to use the `--enable-milter` option with `configure`, your system MUST have the milter library installed. If you use the `--enable-milter` option without the library being installed, you will most likely see output like this during `configure`: |
|
26 |
+ |
|
27 |
+```bash |
|
28 |
+ checking for libiconv_open in -liconv... no |
|
29 |
+ checking for iconv... yes |
|
30 |
+ checking whether in_port_t is defined... yes |
|
31 |
+ checking for in_addr_t definition... yes |
|
32 |
+ checking for mi_stop in -lmilter... no |
|
33 |
+ checking for library containing strlcpy... no |
|
34 |
+ checking for mi_stop in -lmilter... no |
|
35 |
+ configure: error: Cannot find libmilter |
|
36 |
+``` |
|
37 |
+ |
|
38 |
+At this point the `configure` script will stop processing. |
|
39 |
+ |
|
40 |
+Please consult your MTA’s manual on how to connect ClamAV with the milter. |
|
41 |
+ |
|
42 |
+## Testing |
|
43 |
+ |
|
44 |
+Try scanning the source directory recursively: |
|
45 |
+ |
|
46 |
+```bash |
|
47 |
+ $ clamscan -r -l scan.txt clamav-x.yz |
|
48 |
+``` |
|
49 |
+ |
|
50 |
+It should find some test files in the clamav-x.yz/test directory. The scan result will be saved in the `scan.txt` log file \[7\]. To test `clamd`, start it and use `clamdscan` (or instead connect directly to its socket and run the SCAN command): |
|
51 |
+ |
|
52 |
+```bash |
|
53 |
+ $ clamdscan -l scan.txt clamav-x.yz |
|
54 |
+``` |
|
55 |
+ |
|
56 |
+Please note that the scanned files must be accessible by the user running `clamd` or you will get an error. |
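
The "connect directly to its socket and run the SCAN command" route mentioned above can be sketched in Python. This is a minimal illustration, not part of ClamAV: the socket path is an assumption and must match the `LocalSocket` directive in your `clamd.conf`, the helper names are hypothetical, and the path passed to SCAN should be absolute.

```python
import socket

# Assumption: adjust to the LocalSocket value configured in your clamd.conf.
CLAMD_SOCKET = "/tmp/clamd.socket"


def build_scan_command(path: str) -> bytes:
    """Build a newline-delimited SCAN command (the 'n' prefix marks the delimiter)."""
    return f"nSCAN {path}\n".encode("utf-8")


def scan_via_socket(path: str, socket_path: str = CLAMD_SOCKET) -> str:
    """Send SCAN to a running clamd over its Unix socket and return the reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(build_scan_command(path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # clamd closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace").strip()
```

Calling `scan_via_socket("/home/gary/clamav-x.yz")` requires a running `clamd` that can read the given path; the same permission caveat as for `clamdscan` applies.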
|
57 |
+ |
|
58 |
+## Setting up auto-updating |
|
59 |
+ |
|
60 |
+`freshclam` is the automatic database update tool for Clam AntiVirus. It can work in two modes: |
|
61 |
+ |
|
62 |
+- interactive - on demand from command line |
|
63 |
+- daemon - silently in the background |
|
64 |
+ |
|
65 |
+`freshclam` is an advanced tool: it supports scripted updates (instead of transferring the whole CVD file at each update it only transfers the differences between the latest and the current database via a special script), database version checks through DNS, proxy servers (with authentication), digital signatures and various error scenarios. **Quick test: run freshclam (as superuser) with no parameters and check the output.** If everything is OK, you may create the log file in /var/log (owned by *clamav* or whichever user `freshclam` will run as): |
|
66 |
+ |
|
67 |
+```bash |
|
68 |
+ # touch /var/log/freshclam.log |
|
69 |
+ # chmod 600 /var/log/freshclam.log |
|
70 |
+ # chown clamav /var/log/freshclam.log |
|
71 |
+``` |
|
72 |
+ |
|
73 |
+Now you *should* edit the configuration file `freshclam.conf` and point the *UpdateLogFile* directive to the log file. Finally, to run `freshclam` in the daemon mode, execute: |
|
74 |
+ |
|
75 |
+```bash |
|
76 |
+ # freshclam -d |
|
77 |
+``` |
|
78 |
+ |
|
79 |
+The other way is to use the *cron* daemon. You have to add the following line to the crontab of the **root** or **clamav** user: |
|
80 |
+ |
|
81 |
+```cron |
|
82 |
+N * * * * /usr/local/bin/freshclam --quiet |
|
83 |
+``` |
|
84 |
+ |
|
85 |
+to check for a new database every hour. **N should be a number between 3 and 57 of your choice. Please don’t choose any multiple of 10, because there are already too many clients using those time slots.** Proxy settings are only configurable via the configuration file and `freshclam` will require strict permission settings for the config file when `HTTPProxyPassword` is turned on. |
|
86 |
+ |
|
87 |
+```bash |
|
88 |
+ HTTPProxyServer myproxyserver.com |
|
89 |
+ HTTPProxyPort 1234 |
|
90 |
+ HTTPProxyUsername myusername |
|
91 |
+ HTTPProxyPassword mypass |
|
92 |
+``` |
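
The rule for choosing N in the crontab entry above (between 3 and 57, never a multiple of 10) can be sketched as a small Python helper. The function names are hypothetical and the default `freshclam` path is simply the one used in the crontab example; adjust it to your installation.

```python
import random


def pick_cron_minute(rng: random.Random) -> int:
    """Pick a minute between 3 and 57 that is not a multiple of 10."""
    candidates = [m for m in range(3, 58) if m % 10 != 0]
    return rng.choice(candidates)


def freshclam_cron_line(minute: int,
                        freshclam_path: str = "/usr/local/bin/freshclam") -> str:
    """Format an hourly crontab entry for freshclam at the given minute."""
    if not 3 <= minute <= 57 or minute % 10 == 0:
        raise ValueError("minute must be in 3..57 and not a multiple of 10")
    return f"{minute} * * * * {freshclam_path} --quiet"


if __name__ == "__main__":
    print(freshclam_cron_line(pick_cron_minute(random.Random())))
```

Picking the minute at random spreads clients across the hour, which is the point of avoiding the crowded time slots.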
|
93 |
+ |
|
94 |
+### Closest mirrors |
|
95 |
+ |
|
96 |
+The `DatabaseMirror` directive in the config file specifies the database server `freshclam` will attempt (up to `MaxAttempts` times) to download the database from. The default database mirror is [database.clamav.net](database.clamav.net) but multiple directives are allowed. In order to download the database from the closest mirror you should configure `freshclam` to use [db.xx.clamav.net](db.xx.clamav.net) where xx represents your country code. For example, if your server is in "Ascension Island" you should have the following lines included in `freshclam.conf`: |
|
97 |
+ |
|
98 |
+```bash |
|
99 |
+ DNSDatabaseInfo current.cvd.clamav.net |
|
100 |
+ DatabaseMirror db.ac.clamav.net |
|
101 |
+ DatabaseMirror database.clamav.net |
|
102 |
+``` |
|
103 |
+ |
|
104 |
+The second entry acts as a fallback in case the connection to the first mirror fails for some reason. The full list of two-letter country codes is available at <http://www.iana.org/cctld/cctld-whois.htm> |
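
As a rough illustration of the mirror naming scheme described above, the following Python sketch builds the three `freshclam.conf` lines for a given country code. The function name is hypothetical, and the sketch does not check whether a mirror for that country code actually exists.

```python
def mirror_directives(country_code: str) -> list[str]:
    """Return freshclam.conf lines: country mirror first, default mirror as fallback."""
    code = country_code.strip().lower()
    if len(code) != 2 or not (code.isascii() and code.isalpha()):
        raise ValueError("expected a two-letter country code")
    return [
        "DNSDatabaseInfo current.cvd.clamav.net",
        f"DatabaseMirror db.{code}.clamav.net",
        "DatabaseMirror database.clamav.net",
    ]


if __name__ == "__main__":
    print("\n".join(mirror_directives("ac")))
```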
0 | 105 |
new file mode 100644 |
... | ... |
@@ -0,0 +1,197 @@ |
0 |
+# Installation from Source |
|
1 |
+ |
|
2 |
+## Requirements |
|
3 |
+ |
|
4 |
+The following components are required to compile ClamAV under UNIX: |
|
5 |
+ |
|
6 |
+- zlib and zlib-devel packages |
|
7 |
+- openssl version 0.9.8 or higher and libssl-devel packages |
|
8 |
+- gcc compiler suite (tested with 2.9x, 3.x and 4.x series) **If you are compiling with higher optimization levels than the default one ( for gcc), be aware that there have been reports of misoptimizations. The build system of ClamAV only checks for bugs affecting the default settings, it is your responsibility to check that your compiler version doesn’t have any bugs.** |
|
9 |
+- GNU make (gmake) |
|
10 |
+ |
|
11 |
+The following packages are optional but **highly recommended**: |
|
12 |
+ |
|
13 |
+- bzip2 and bzip2-devel library |
|
14 |
+- libxml2 and libxml2-dev library |
|
15 |
+- `check` unit testing framework \[3\]. |
|
16 |
+ |
|
17 |
+The following packages are optional, but **required for bytecode JIT support**: |
|
18 |
+ |
|
19 |
+- GCC C and C++ compilers (minimum 4.1.3, recommended 4.3.4 or newer). The packages for these compilers are usually called gcc, g++, or gcc-c++. \[5\] |
|
20 |
+- OSX Xcode versions prior to 5.0 use a g++ compiler frontend (llvm-gcc) that is not compatible with ClamAV JIT. It is recommended to either compile ClamAV JIT with clang++ or to compile ClamAV without JIT. |
|
21 |
+- A supported CPU for the JIT, either of: X86, X86-64, PowerPC, PowerPC64 |
|
22 |
+ |
|
23 |
+The following packages are optional, but needed for the JIT unit tests: |
|
24 |
+ |
|
25 |
+- GNU Make (version 3.79, recommended 3.81) |
|
26 |
+- Python (version 2.5.4 or newer), for running the JIT unit tests |
|
27 |
+ |
|
28 |
+The following packages are optional, but required for clamsubmit: |
|
29 |
+ |
|
30 |
+- libcurl-devel library |
|
31 |
+- libjson-c-dev library |
|
32 |
+ |
|
33 |
+## Installing on shell account |
|
34 |
+ |
|
35 |
+To install ClamAV locally on an unprivileged shell account you need not create any additional users or groups. Assuming your home directory is `/home/gary` you should build it as follows: |
|
36 |
+ |
|
37 |
+```bash |
|
38 |
+ $ ./configure --prefix=/home/gary/clamav --disable-clamav |
|
39 |
+ $ make; make install |
|
40 |
+``` |
|
41 |
+ |
|
42 |
+To test your installation execute: |
|
43 |
+ |
|
44 |
+```bash |
|
45 |
+ $ ~/clamav/bin/freshclam |
|
46 |
+ $ ~/clamav/bin/clamscan ~ |
|
47 |
+``` |
|
48 |
+ |
|
49 |
+The `--disable-clamav` switch disables the check for existence of the *clamav* user and group but `clamscan` would still require an unprivileged account to work in a superuser mode. |
|
50 |
+ |
|
51 |
+## Adding new system user and group |
|
52 |
+ |
|
53 |
+If you are installing ClamAV for the first time, you have to add a new user and group to your system: |
|
54 |
+ |
|
55 |
+```bash |
|
56 |
+ # groupadd clamav |
|
57 |
+ # useradd -g clamav -s /bin/false -c "Clam AntiVirus" clamav |
|
58 |
+``` |
|
59 |
+ |
|
60 |
+Consult your system manual if your OS does not have the *groupadd* and *useradd* utilities. **Don’t forget to lock access to the account\!** |
|
61 |
+ |
|
62 |
+## Compilation of base package |
|
63 |
+ |
|
64 |
+Once you have created the clamav user and group, please extract the archive: |
|
65 |
+ |
|
66 |
+```bash |
|
67 |
+ $ zcat clamav-x.yz.tar.gz | tar xvf - |
|
68 |
+ $ cd clamav-x.yz |
|
69 |
+``` |
|
70 |
+ |
|
71 |
+Assuming you want to install the configuration files in /etc, configure and build the software as follows: |
|
72 |
+ |
|
73 |
+```bash |
|
74 |
+ $ ./configure --sysconfdir=/etc |
|
75 |
+ $ make |
|
76 |
+ $ su -c "make install" |
|
77 |
+``` |
|
78 |
+ |
|
79 |
+In the last step the software is installed into the /usr/local directory and the config files into /etc. **WARNING: Never enable the SUID or SGID bits for Clam AntiVirus binaries.** |
|
80 |
+ |
|
81 |
+## Compilation with clamav-milter enabled |
|
82 |
+ |
|
83 |
+libmilter and its development files are required. To enable clamav-milter, configure ClamAV with |
|
84 |
+ |
|
85 |
+```bash |
|
86 |
+ $ ./configure --enable-milter |
|
87 |
+``` |
|
88 |
+ |
|
89 |
+See the [clamav-milter](Configuration.md#clamav-milter) section for more details on clamav-milter. |
|
90 |
+ |
|
91 |
+## Using the system LLVM |
|
92 |
+ |
|
93 |
+Some problems have been reported when compiling ClamAV’s built-in LLVM with recent C++ compiler releases. These problems may be avoided by installing and using an external LLVM system library. To configure ClamAV to use an LLVM that is installed as a system library instead of the built-in LLVM JIT, use the following: |
|
94 |
+ |
|
95 |
+```bash |
|
96 |
+ $ ./configure --with-system-llvm=/myllvm/bin/llvm-config |
|
97 |
+ $ make |
|
98 |
+ $ sudo make install |
|
99 |
+``` |
|
100 |
+ |
|
101 |
+The argument to `--with-system-llvm` is optional, indicating the path name of the LLVM configuration utility (llvm-config). With no argument to `--with-system-llvm`, `./configure` will search for LLVM in /usr/local/ and then /usr. |
|
102 |
+ |
|
103 |
+Recommended versions of LLVM are 3.2, 3.3, 3.4, 3.5, and 3.6. Some installations have reported problems using earlier LLVM versions. Versions of LLVM beyond 3.6 are not currently supported in ClamAV. |
|
104 |
+ |
|
105 |
+## Running unit tests |
|
106 |
+ |
|
107 |
+ClamAV includes unit tests that allow you to test that the compiled binaries work correctly on your platform. |
|
108 |
+ |
|
109 |
+The first step is to use your OS’s package manager to install the `check` package. If your OS doesn’t have that package, you can download it from <http://check.sourceforge.net/>, build it and install it. |
|
110 |
+ |
|
111 |
+To help clamav’s configure script locate `check`, it is recommended that you install `pkg-config`, preferably using your OS’s package manager, or from <http://pkg-config.freedesktop.org>. |
|
112 |
+ |
|
113 |
+The recommended way to run unit-tests is the following, which ensures you will get an error if unit tests cannot be built: \[6\] |
|
114 |
+ |
|
115 |
+```bash |
|
116 |
+ $ ./configure --enable-check |
|
117 |
+ $ make |
|
118 |
+ $ make check |
|
119 |
+``` |
|
120 |
+ |
|
121 |
+When `make check` is finished, you should get a message similar to this: |
|
122 |
+ |
|
123 |
+```bash |
|
124 |
+================== |
|
125 |
+All 8 tests passed |
|
126 |
+================== |
|
127 |
+``` |
|
128 |
+ |
|
129 |
+If a unit test fails, you get a message similar to the following. Note that in older versions, `make check` may report failures due to the absence of optional packages. Please make sure you have the latest versions of the components noted in the [Requirements](#requirements) section. See the next section on how to report a bug when a unit test fails. |
|
130 |
+ |
|
131 |
+```bash |
|
132 |
+======================================== |
|
133 |
+1 of 8 tests failed |
|
134 |
+Please report to https://bugzilla.clamav.net/ |
|
135 |
+======================================== |
|
136 |
+``` |
|
137 |
+ |
|
138 |
+If unit tests are disabled (and you didn’t use `--enable-check`), you will get this message: |
|
139 |
+ |
|
140 |
+```bash |
|
141 |
+*** Unit tests disabled in this build |
|
142 |
+*** Use ./configure --enable-check to enable them |
|
143 |
+ |
|
144 |
+SKIP: check_clamav |
|
145 |
+PASS: check_clamd.sh |
|
146 |
+PASS: check_freshclam.sh |
|
147 |
+PASS: check_sigtool.sh |
|
148 |
+PASS: check_clamscan.sh |
|
149 |
+====================== |
|
150 |
+All 4 tests passed |
|
151 |
+(1 tests were not run) |
|
152 |
+====================== |
|
153 |
+``` |
|
154 |
+ |
|
155 |
+Running `./configure --enable-check` should tell you why. |
|
156 |
+ |
|
157 |
+## Reporting a unit test failure bug |
|
158 |
+ |
|
159 |
+If `make check` says that some tests failed we encourage you to report a bug on our bugzilla: <https://bugzilla.clamav.net>. The information we need is: |
|
160 |
+ |
|
161 |
+- The exact output from `make check` |
|
162 |
+- Output of `uname -mrsp` |
|
163 |
+- your `config.log` |
|
164 |
+- The following files from the `unit_tests/` directory: |
|
165 |
+ - `test.log` |
|
166 |
+ - `clamscan.log` |
|
167 |
+ - `clamdscan.log` |
|
168 |
+ |
|
169 |
+- `/tmp/clamd-test.log` if it exists |
|
170 |
+- where and how you installed the check package |
|
171 |
+- Output of `pkg-config check --cflags --libs` |
|
172 |
+- Optionally if `valgrind` is available on your platform, the output of the following: |
|
173 |
+ ```bash |
|
174 |
+ $ make check |
|
175 |
+ $ CK_FORK=no ./libtool --mode=execute valgrind unit_tests/check_clamav |
|
176 |
+ ``` |
|
177 |
+ |
|
178 |
+## Obtain Latest ClamAV anti-virus signature databases |
|
179 |
+ |
|
180 |
+Before you can run ClamAV in daemon mode (`clamd`), `clamdscan`, or `clamscan` (ClamAV’s command line virus scanner), you must have ClamAV Virus Database (.cvd) file(s) installed in the appropriate location on your system. The default location for these database files is /usr/local/share/clamav (on Linux/Unix). |
|
181 |
+ |
|
182 |
+Here is a listing of currently available ClamAV Virus Database Files: |
|
183 |
+ |
|
184 |
+- bytecode.cvd (signatures to detect bytecode in files) |
|
185 |
+- main.cvd (main ClamAV virus database file) |
|
186 |
+- daily.cvd (daily update file for ClamAV virus databases) |
|
187 |
+- safebrowsing.cvd (virus signatures for safe browsing) |
|
188 |
+ |
|
189 |
+These files can be downloaded via HTTP from the main ClamAV website or via the `freshclam` utility on a periodic basis. Using `freshclam` is the preferred method of keeping the ClamAV virus database files up to date without manual intervention. See the [freshclam configuration](Configuration.md#Setting-up-auto-updating) section for information on how to configure `freshclam` for automatic updating, and the main [freshclam](Usage.md#freshclam) section for additional details. |
|
190 |
+ |
|
191 |
+## Binary packages |
|
192 |
+ |
|
193 |
+As an alternative to building and installing from source, most Linux package managers provide pre-compiled ClamAV packages. |
|
194 |
+ |
|
195 |
+For more information about installing ClamAV via a Package Manager, please visit: |
|
196 |
+<https://www.clamav.net/download.html#otherversions> |
0 | 197 |
new file mode 100644 |
... | ... |
@@ -0,0 +1,122 @@ |
0 |
+# Introduction |
|
1 |
+ |
|
2 |
+Clam AntiVirus is an open source (GPLv2) anti-virus toolkit, designed especially for e-mail scanning on mail gateways. It provides a number of utilities including a flexible and scalable multi-threaded daemon, a command line scanner and an advanced tool for automatic database updates. The core of the package is an anti-virus engine available in the form of a shared library. |
|
3 |
+ |
|
4 |
+## Features |
|
5 |
+ |
|
6 |
+### Capabilities |
|
7 |
+ |
|
8 |
+- ClamAV is designed to scan files quickly. |
|
9 |
+- Real time protection (Linux only). Our scanning daemon supports on-access scanning on modern versions of Linux, including the ability to block file access until a file has been scanned. |
|
10 |
+- ClamAV detects over 1 million viruses, worms and trojans, including Microsoft Office macro viruses, mobile malware, and other threats. |
|
11 |
+- The built-in bytecode interpreter allows the ClamAV signature writers to create and distribute very complex detection routines and remotely enhance the scanner’s functionality. |
|
12 |
+- Signed signature databases ensure that ClamAV will only execute trusted signature definitions. |
|
13 |
+- ClamAV scans within archives and compressed files but also protects against archive bombs. Built-in archive extraction capabilities include: |
|
14 |
+ - Zip (including SFX) |
|
15 |
+ - RAR (including SFX) |
|
16 |
+ - 7Zip |
|
17 |
+ - ARJ (including SFX) |
|
18 |
+ - Tar |
|
19 |
+ - CPIO |
|
20 |
+ - Gzip |
|
21 |
+ - Bzip2 |
|
22 |
+ - DMG |
|
23 |
+ - IMG |
|
24 |
+ - ISO 9660 |
|
25 |
+ - PKG |
|
26 |
+ - HFS+ partition |
|
27 |
+ - HFSX partition |
|
28 |
+ - APM disk image |
|
29 |
+ - GPT disk image |
|
30 |
+ - MBR disk image |
|
31 |
+ - XAR |
|
32 |
+ - XZ |
|
33 |
+ - MS OLE2 |
|
34 |
+ - MS Cabinet Files (including SFX) |
|
35 |
+ - MS CHM (Compiled HTML) |
|
36 |
+ - MS SZDD compression format |
|
37 |
+ - BinHex |
|
38 |
+ - SIS (SymbianOS packages) |
|
39 |
+ - AutoIt |
|
40 |
+ - InstallShield |
|
41 |
+- Supports Windows executable file parsing, also known as Portable Executables (PE) both 32/64-bit, including PE files that are compressed or obfuscated with: |
|
42 |
+ - AsPack |
|
43 |
+ - UPX |
|
44 |
+ - FSG |
|
45 |
+ - Petite |
|
46 |
+ - PeSpin |
|
47 |
+ - NsPack |
|
48 |
+ - wwpack32 |
|
49 |
+ - MEW |
|
50 |
+ - Upack |
|
51 |
+ - Y0da Cryptor |
|
52 |
+- Supports ELF and Mach-O files (both 32- and 64-bit) |
|
53 |
+- Supports almost all mail file formats |
|
54 |
+- Support for other special files/formats includes: |
|
55 |
+ - HTML |
|
56 |
+ - RTF |
|
57 |
|
|
58 |
+ - Files encrypted with CryptFF and ScrEnc |
|
59 |
+ - uuencode |
|
60 |
+ - TNEF (winmail.dat) |
|
61 |
+- Advanced database updater with support for scripted updates, digital signatures and DNS based database version queries |
|
62 |
+ |
|
63 |
+### License |
|
64 |
+ |
|
65 |
+ClamAV is licensed under the GNU General Public License, Version 2 |
|
66 |
+ |
|
67 |
+### Supported platforms |
|
68 |
+ |
|
69 |
+Clam AntiVirus is highly cross-platform. The development team cannot test every OS, so we have chosen to test ClamAV using the two most recent Long Term Support (LTS) versions of each of the most popular desktop operating systems. Our regularly tested operating systems include: |
|
70 |
+ |
|
71 |
+- GNU/Linux |
|
72 |
+ - Ubuntu |
|
73 |
+ - 14.04 |
|
74 |
+ - 16.04 |
|
75 |
+ - Debian |
|
76 |
+ - 7 |
|
77 |
+ - 8 |
|
78 |
+ - CentOS |
|
79 |
+ - 6 |
|
80 |
+ - 7 |
|
81 |
+- UNIX |
|
82 |
+ - Solaris |
|
83 |
+ - 10 |
|
84 |
+ - 11 |
|
85 |
+ - FreeBSD |
|
86 |
+ - 10 |
|
87 |
+ - 11 |
|
88 |
+ - macOS |
|
89 |
+ - 10.12 (Sierra) |
|
90 |
+ - 10.13 (High Sierra) |
|
91 |
+- Windows |
|
92 |
+ - 7 |
|
93 |
+ - 10 |
|
94 |
+ |
|
95 |
+## Mailing lists and IRC channel |
|
96 |
+ |
|
97 |
+If you have trouble installing or using ClamAV, try asking on our mailing lists. There are four lists available: |
|
98 |
+ |
|
99 |
+- **clamav-announce\*lists.clamav.net** - info about new versions, moderated\[1\]. |
|
100 |
+- **clamav-users\*lists.clamav.net** - user questions |
|
101 |
+- **clamav-devel\*lists.clamav.net** - technical discussions |
|
102 |
+- **clamav-virusdb\*lists.clamav.net** - database update announcements, moderated |
|
103 |
+ |
|
104 |
+You can subscribe and search the mailing list archives at: <https://www.clamav.net/contact.html#ml> |
|
105 |
+ |
|
106 |
+Alternatively you can try asking on the `#clamav` IRC channel - launch your favourite irc client and type: |
|
107 |
+ |
|
108 |
+```bash |
|
109 |
+ /server irc.freenode.net |
|
110 |
+ /join #clamav |
|
111 |
+``` |
|
112 |
+ |
|
113 |
+## Submitting New or Otherwise Undetected Malware |
|
114 |
+ |
|
115 |
+If you've got a virus which is not detected by the current version of ClamAV using the latest signature databases, please submit the sample at our website: |
|
116 |
+ |
|
117 |
+<https://www.clamav.net/reports/malware> |
|
118 |
+ |
|
119 |
+Likewise, if you have a benign file that is flagged as a virus and you wish to report a False Positive, please submit the sample for review at our website: |
|
120 |
+ |
|
121 |
+<https://www.clamav.net/reports/fp> |
|
0 | 122 |
\ No newline at end of file |
1 | 123 |
new file mode 100644 |
... | ... |
@@ -0,0 +1,681 @@ |
0 |
+# PhishSigs |
|
1 |
+ |
|
2 |
+- [PhishSigs](#phishsigs) |
|
3 |
+- [Database file format](#database-file-format) |
|
4 |
+ - [PDB format](#pdb-format) |
|
5 |
+ - [GDB format](#gdb-format) |
|
6 |
+ - [WDB format](#wdb-format) |
|
7 |
+ - [Hints](#hints) |
|
8 |
+ - [Examples of PDB signatures](#examples-of-pdb-signatures) |
|
9 |
+ - [Examples of WDB signatures](#examples-of-wdb-signatures) |
|
10 |
+ - [Example for how the URL extractor works](#example-for-how-the-url-extractor-works) |
|
11 |
+ - [How matching works](#how-matching-works) |
|
12 |
+ - [RealURL, displayedURL concatenation](#realurl-displayedurl-concatenation) |
|
13 |
+ - [What happens when a match is found](#what-happens-when-a-match-is-found) |
|
14 |
+ - [Extraction of realURL, displayedURL from HTML tags](#extraction-of-realurl-displayedurl-from-html-tags) |
|
15 |
+ - [Example](#example) |
|
16 |
+ - [Simple patterns](#simple-patterns) |
|
17 |
+ - [Regular expressions](#regular-expressions) |
|
18 |
+ - [Flags](#flags) |
|
19 |
+- [Introduction to regular expressions](#introduction-to-regular-expressions) |
|
20 |
+ - [Special characters](#special-characters) |
|
21 |
+ - [Character classes](#character-classes) |
|
22 |
+ - [Escaping](#escaping) |
|
23 |
+ - [Alternation](#alternation) |
|
24 |
+ - [Optional matching, and repetition](#optional-matching-and-repetition) |
|
25 |
+ - [Groups](#groups) |
|
26 |
+- [How to create database files](#how-to-create-database-files) |
|
27 |
+ - [How to create and maintain the whitelist (daily.wdb)](#how-to-create-and-maintain-the-whitelist-dailywdb) |
|
28 |
+ - [How to create and maintain the domainlist (daily.pdb)](#how-to-create-and-maintain-the-domainlist-dailypdb) |
|
29 |
+ - [Dealing with false positives, and undetected phishing mails](#dealing-with-false-positives-and-undetected-phishing-mails) |
|
30 |
+ - [False positives](#false-positives) |
|
31 |
+ - [Undetected phish mails](#undetected-phish-mails) |
|
32 |
+ |
|
33 |
+# Database file format |
|
34 |
+ |
|
35 |
+## PDB format |
|
36 |
+ |
|
37 |
+This file contains URLs/hosts that are targets of phishing attempts. It |
|
38 |
+contains lines in the following format: |
|
39 |
+ |
|
40 |
+``` |
|
41 |
+ R[Filter]:RealURL:DisplayedURL[:FuncLevelSpec] |
|
42 |
+ H[Filter]:DisplayedHostname[:FuncLevelSpec] |
|
43 |
+``` |
|
44 |
+ |
|
45 |
+- `R` |
|
46 |
+ |
|
47 |
+ regular expression, for the concatenated URL |
|
48 |
+ |
|
49 |
+- `H` |
|
50 |
+ |
|
51 |
+ matches the `DisplayedHostname` as a simple pattern (literally, no regular expression) |
|
52 |
+ |
|
53 |
+ - the pattern can match either the full hostname |
|
54 |
+ |
|
55 |
+ - or a subdomain of the specified hostname |
|
56 |
+ |
|
57 |
+ - to avoid false matches in case of subdomain matches, the engine checks that there is a dot(`.`) or a space(` `) before the matched portion |
|
58 |
+ |
|
59 |
+- `Filter` |
|
60 |
+ |
|
61 |
+ is ignored for R and H for compatibility reasons |
|
62 |
+ |
|
63 |
+- `RealURL` |
|
64 |
+ |
|
65 |
+ is the URL the user is sent to, example: *href* attribute of an html anchor (*\<a\> tag*) |
|
66 |
+ |
|
67 |
+- `DisplayedURL` |
|
68 |
+ |
|
69 |
+ is the URL description displayed to the user, where it's *claimed* they are sent, example: contents of an html anchor (*\<a\> tag*) |
|
70 |
+ |
|
71 |
+- `DisplayedHostname` |
|
72 |
+ |
|
73 |
+ is the hostname portion of the DisplayedURL |
|
74 |
+ |
|
75 |
+- `FuncLevelSpec` |
|
76 |
+ |
|
77 |
+ an (optional) functionality level, 2 formats are possible: |
|
78 |
+ |
|
79 |
+ - `minlevel` all engines having functionality level \>= `minlevel` will load this line |
|
80 |
+ |
|
81 |
+ - `minlevel-maxlevel` engines with functionality level \>= `minlevel` and \< `maxlevel` will load this line |
|
82 |
+ |
|
83 |
+## GDB format |
|
84 |
+ |
|
85 |
+This file contains URL hashes in the following format: |
|
86 |
+ |
|
87 |
+ S:P:HostPrefix[:FuncLevelSpec] |
|
88 |
+ S:F:Sha256hash[:FuncLevelSpec] |
|
89 |
+ S1:P:HostPrefix[:FuncLevelSpec] |
|
90 |
+ S1:F:Sha256hash[:FuncLevelSpec] |
|
91 |
+ S2:P:HostPrefix[:FuncLevelSpec] |
|
92 |
+ S2:F:Sha256hash[:FuncLevelSpec] |
|
93 |
+ S:W:Sha256hash[:FuncLevelSpec] |
|
94 |
+ |
|
95 |
+- `S:` |
|
96 |
+ |
|
97 |
+ These are hashes for Google Safe Browsing - malware sites, and should not be used for other purposes. |
|
98 |
+ |
|
99 |
+- `S2:` |
|
100 |
+ |
|
101 |
+ These are hashes for Google Safe Browsing - phishing sites, and should not be used for other purposes. |
|
102 |
+ |
|
103 |
+- `S1:` |
|
104 |
+ |
|
105 |
+ Hashes for blacklisting phishing sites. Virus name: Phishing.URL.Blacklisted |
|
106 |
+ |
|
107 |
+- `S:W:` |
|
108 |
+ |
|
109 |
+ Locally whitelisted hashes. |
|
110 |
+ |
|
111 |
+- `HostPrefix` |
|
112 |
+ |
|
113 |
+ 4-byte prefix of the sha256 hash of the last 2 or 3 components of the hostname. If prefix doesn’t match, no further lookups are performed. |
|
114 |
+ |
|
115 |
+- `Sha256hash` |
|
116 |
+ |
|
117 |
+ sha256 hash of the canonicalized URL, or a sha256 hash of its prefix/suffix according to the Google Safe Browsing “Performing Lookups” rules. There should be a corresponding `:P:HostkeyPrefix` entry for the hash to be taken into consideration. |
|
118 |
+ |
|
119 |
+To see which hash/URL matched, look at the `clamscan --debug` output, and look for the following strings: `Looking up hash`, `prefix matched`, and `Hash matched`. Local whitelisting of .gdb entries can be done by creating a local.gdb file, and adding a line `S:W:<HASH>`. |
|
120 |
+ |
|
121 |
+## WDB format |
|
122 |
+ |
|
123 |
+This file contains whitelisted URL pairs. It contains lines in the following format: |
|
124 |
+ |
|
125 |
+``` |
|
126 |
+ X:RealURL:DisplayedURL[:FuncLevelSpec] |
|
127 |
+ M:RealHostname:DisplayedHostname[:FuncLevelSpec] |
|
128 |
+``` |
|
129 |
+ |
|
130 |
+- `X` |
|
131 |
+ |
|
132 |
+ regular expression, for the *entire URL*, not just the hostname |
|
133 |
+ |
|
134 |
+ - The regular expression is by default anchored to start-of-line and end-of-line, as if you have used `^RegularExpression$` |
|
135 |
+ |
|
136 |
+ - A trailing `/` is automatically added both to the regex, and the input string to avoid false matches |
|
137 |
+ |
|
138 |
+ - The regular expression matches the *concatenation* of the RealURL, a colon(`:`), and the DisplayedURL as a single string. It doesn’t separately match RealURL and DisplayedURL\! |
|
139 |
+ |
|
140 |
+- `M` |
|
141 |
+ |
|
142 |
+ matches hostname, or subdomain of it, see notes for H above |
|
143 |
+ |
|
144 |
+## Hints |
|
145 |
+ |
|
146 |
+- empty lines are ignored |
|
147 |
+ |
|
148 |
+- the colons are mandatory |
|
149 |
+ |
|
150 |
+- Don’t leave extra spaces on the end of a line\! |
|
151 |
+ |
|
152 |
+- if any of the lines don’t conform to this format, clamav will abort with a Malformed Database Error |
|
153 |
+ |
|
154 |
+- see section [Extraction-of-realURL](#Extraction-of-realURL,-displayedURL-from-HTML-tags) for more details on realURL/displayedURL |
|
155 |
+ |
|
156 |
+## Examples of PDB signatures |
|
157 |
+ |
|
158 |
+To check for phishing mails that target amazon.com, or subdomains of |
|
159 |
+amazon.com: |
|
160 |
+ |
|
161 |
+``` |
|
162 |
+ H:amazon.com |
|
163 |
+``` |
|
164 |
+ |
|
165 |
+To do the same, but for amazon.co.uk: |
|
166 |
+ |
|
167 |
+``` |
|
168 |
+ H:amazon.co.uk |
|
169 |
+``` |
|
170 |
+ |
|
171 |
+To limit the signatures to certain engine versions: |
|
172 |
+ |
|
173 |
+``` |
|
174 |
+ H:amazon.co.uk:20-30 |
|
175 |
+ H:amazon.co.uk:20- |
|
176 |
+ H:amazon.co.uk:0-20 |
|
177 |
+``` |
|
178 |
+ |
|
179 |
+First line: engine versions 20, 21, ..., 29 can load it |
|
180 |
+ |
|
181 |
+Second line: engine versions \>= 20 can load it |
|
182 |
+ |
|
183 |
+Third line: engine versions \< 20 can load it |
|
184 |
+ |
|
185 |
+In a real situation, you’d probably use the second form. A situation like that would be if you are using a feature of the signatures not available in earlier versions, or if earlier versions have bugs with your signature. Neither is the case here; the above examples are for illustrative purposes only. |
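
The `FuncLevelSpec` rules illustrated by the three example lines can be expressed as a short Python check. This is only a sketch of the rule as described in this document (lower bound inclusive, upper bound exclusive); the function name is hypothetical and is not part of ClamAV.

```python
def loads_at_level(spec: str, engine_level: int) -> bool:
    """Return True if an engine with the given functionality level loads the line.

    spec is "minlevel", "minlevel-" or "minlevel-maxlevel".
    """
    if "-" not in spec:
        return engine_level >= int(spec)
    low, high = spec.split("-", 1)
    if engine_level < int(low):
        return False
    # An empty upper bound ("20-") means there is no maximum.
    return high == "" or engine_level < int(high)


if __name__ == "__main__":
    for spec in ("20-30", "20-", "0-20"):
        print(spec, [lvl for lvl in (19, 20, 29, 30) if loads_at_level(spec, lvl)])
```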
|
186 |
+ |
|
187 |
+## Examples of WDB signatures |
|
188 |
+ |
|
189 |
+To allow amazon’s country specific domains and amazon.com, to mix domain names in DisplayedURL, and RealURL: |
|
190 |
+ |
|
191 |
+ X:.+\.amazon\.(at|ca|co\.uk|co\.jp|de|fr)([/?].*)?:.+\.amazon\.com([/?].*)?:17- |
|
192 |
+ |
|
193 |
+Explanation of this signature: |
|
194 |
+ |
|
195 |
+- `X:` |
|
196 |
+ |
|
197 |
+ this is a regular expression |
|
198 |
+ |
|
199 |
+- `:17-` |
|
200 |
+ |
|
201 |
+ load signature only for engines with functionality level \>= 17 (recommended for type X) |
|
202 |
+ |
|
203 |
+The regular expression is the following (X:, :17- stripped, and a / appended) |
|
204 |
+ |
|
205 |
+``` |
|
206 |
+ .+\.amazon\.(at|ca|co\.uk|co\.jp|de|fr)([/?].*)?:.+\.amazon\.com([/?].*)?/ |
|
207 |
+``` |
|
208 |
+ |
|
209 |
+Explanation of this regular expression (note that it is a single regular expression, and not 2 regular expressions split at the `:`). |
|
210 |
+ |
|
211 |
+- `.+` |
|
212 |
+ |
|
213 |
+ any subdomain of |
|
214 |
+ |
|
215 |
+- `\.amazon\.` |
|
216 |
+ |
|
217 |
+ domain we are whitelisting (RealURL part) |
|
218 |
+ |
|
219 |
+- `(at|ca|co\.uk|co\.jp|de|fr)` |
|
220 |
+ |
|
221 |
+ country-domains: at, ca, co.uk, co.jp, de, fr |
|
222 |
+ |
|
223 |
+- `([/?].*)?` |
|
224 |
+ |
|
225 |
+ recommended way to end the real URL part of a whitelist entry; this protects against embedded URLs (evilurl.example.com/amazon.co.uk/) |
|
226 |
+ |
|
227 |
+- `:` |
|
228 |
+ |
|
229 |
+ RealURL and DisplayedURL are concatenated via a :, so match a literal : here |
|
230 |
+ |
|
231 |
+- `.+` |
|
232 |
+ |
|
233 |
+ any subdomain of |
|
234 |
+ |
|
235 |
+- `\.amazon\.com` |
|
236 |
+ |
|
237 |
+ whitelisted DisplayedURL |
|
238 |
+ |
|
239 |
+- `([/?].*)?` |
|
240 |
+ |
|
241 |
+ recommended way to end displayed url part, to protect against embedded URLs |
|
242 |
+ |
|
243 |
+- `/` |
|
244 |
+ |
|
245 |
+ automatically added to further protect against embedded URLs |
|
246 |
+ |
|
247 |
+When you whitelist an entry, make sure you check that both domains are owned by the same entity. What this whitelist entry allows is: links that claim to point to amazon.com (DisplayedURL), but really go to a country-specific domain of amazon (RealURL). |
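
To see how the single regular expression is applied to the concatenated string, here is a minimal Python sketch. It assumes that Python's `re` syntax behaves like the engine's regular expressions for this particular pattern, and that the URLs have already been reduced to hostnames; `is_whitelisted` is a hypothetical helper, not a ClamAV function.

```python
import re

# The X: signature from above with "X:" and ":17-" stripped.
WDB_REGEX = r".+\.amazon\.(at|ca|co\.uk|co\.jp|de|fr)([/?].*)?:.+\.amazon\.com([/?].*)?"


def is_whitelisted(real_url: str, displayed_url: str, regex: str = WDB_REGEX) -> bool:
    """Match RealURL + ':' + DisplayedURL as ONE string, anchored at both ends.

    A trailing '/' is appended to both the regex and the input, as described above.
    """
    candidate = f"{real_url}:{displayed_url}/"
    return re.fullmatch(regex + "/", candidate) is not None


if __name__ == "__main__":
    print(is_whitelisted("www.amazon.de", "www.amazon.com"))
    print(is_whitelisted("www.amazon.de.evil.example.com", "www.amazon.com"))
```

The second call does not match because `([/?].*)?` only lets the RealURL part continue with `/` or `?` after the country domain, which is what blocks the embedded-domain trick.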
|
248 |
+ |
|
249 |
+## Example for how the URL extractor works |
|
250 |
+ |
|
251 |
+Consider the following HTML file: |
|
252 |
+ |
|
253 |
+```html |
|
254 |
+ <html> |
|
255 |
+ <a href="http://1.realurl.example.com/"> |
|
256 |
+ 1.displayedurl.example.com |
|
257 |
+ </a> |
|
258 |
+ <a href="http://2.realurl.example.com"> |
|
259 |
+ 2 d<b>i<p>splayedurl.e</b>xa<i>mple.com |
|
260 |
+ </a> |
|
261 |
+ <a href="http://3.realurl.example.com"> |
|
262 |
+ 3.nested.example.com |
|
263 |
+ <a href="http://4.realurl.example.com"> |
|
264 |
+ 4.displayedurl.example.com |
|
265 |
+ </a> |
|
266 |
+ </a> |
|
267 |
+ <form action="http://5.realurl.example.com"> |
|
268 |
+ sometext |
|
269 |
+ <img src="http://5.displayedurl.example.com/img0.gif"/> |
|
270 |
+ <a href="http://5.form.nested.displayedurl.example.com"> |
|
271 |
+ 5.form.nested.link-displayedurl.example.com |
|
272 |
+ </a> |
|
273 |
+ </form> |
|
274 |
+ <a href="http://6.realurl.example.com"> |
|
275 |
+ 6.displ |
|
276 |
+ <img src="6.displayedurl.example.com/img1.gif"/> |
|
277 |
+ ayedurl.example.com |
|
278 |
+ </a> |
|
279 |
+ <a href="http://7.realurl.example.com"> |
|
280 |
+ <iframe src="http://7.displayedurl.example.com"> |
|
281 |
+ </a> |
|
282 |
+``` |
|
283 |
+ |
|
284 |
+The phishing engine extracts the following |
|
285 |
+RealURL/DisplayedURL pairs from it: |
|
286 |
+ |
|
287 |
+``` |
|
288 |
+ http://1.realurl.example.com/ |
|
289 |
+ 1.displayedurl.example.com |
|
290 |
+ |
|
291 |
+ http://2.realurl.example.com |
|
292 |
+ 2displayedurl.example.com |
|
293 |
+ |
|
294 |
+ http://3.realurl.example.com |
|
295 |
+ 3.nested.example.com |
|
296 |
+ |
|
297 |
+ http://4.realurl.example.com |
|
298 |
+ 4.displayedurl.example.com |
|
299 |
+ |
|
300 |
+ http://5.realurl.example.com |
|
301 |
+ http://5.displayedurl.example.com/img0.gif |
|
302 |
+ |
|
303 |
+ http://5.realurl.example.com |
|
304 |
+ http://5.form.nested.displayedurl.example.com |
|
305 |
+ |
|
306 |
+ http://5.form.nested.displayedurl.example.com |
|
307 |
+ 5.form.nested.link-displayedurl.example.com |
|
308 |
+ |
|
309 |
+ http://6.realurl.example.com |
|
310 |
+ 6.displayedurl.example.com |
|
311 |
+ |
|
312 |
+ http://6.realurl.example.com |
|
313 |
+ 6.displayedurl.example.com/img1.gif |
|
314 |
+``` |
|
315 |
+ |
|
316 |
+## How matching works |
|
317 |
+ |
|
318 |
+### RealURL, displayedURL concatenation |
|
319 |
+ |
|
320 |
+The phishing detection module processes RealURL/DisplayedURL pairs. Matching against daily.wdb works as follows: the RealURL is concatenated with a `:` and then with the DisplayedURL, and the resulting *line* is matched against the lines in daily.wdb/daily.pdb. |
|
321 |
+ |
|
322 |
+So if you have this line in daily.wdb: |
|
323 |
+ |
|
324 |
+ M:www.google.ro:www.google.com |
|
325 |
+ |
|
326 |
+and this href: `<a href='http://www.google.ro'>www.google.com</a>` then it will be whitelisted, but: `<a href='http://images.google.com'>www.google.com</a>` will not. |
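The concatenation rule can be sketched in Python. This is a minimal illustration, not ClamAV's implementation: the helper name `is_whitelisted` is hypothetical, it assumes that the part of an `M:` entry after the prefix is compared literally against `RealURL:DisplayedURL`, and it reduces the href to its hostname with the standard `urllib.parse` module.

```python
from urllib.parse import urlsplit


def is_whitelisted(entry: str, href: str, displayed: str) -> bool:
    """Compare 'RealURL:DisplayedURL' literally against one M: entry (illustrative only)."""
    prefix, _, pattern = entry.partition(":")
    if prefix != "M":
        raise ValueError("this sketch only handles M: entries")
    # Assumption: only the hostname of the href takes part in the comparison.
    real_host = urlsplit(href).hostname or href
    return f"{real_host}:{displayed}" == pattern


entry = "M:www.google.ro:www.google.com"
print(is_whitelisted(entry, "http://www.google.ro", "www.google.com"))      # True
print(is_whitelisted(entry, "http://images.google.com", "www.google.com"))  # False
```

The second call fails because the concatenated line becomes `images.google.com:www.google.com`, which is not the whitelisted line.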
|
327 |
+ |
|
328 |
+### What happens when a match is found |
|
329 |
+ |
|
330 |
+In the case of the whitelist, a match means that the RealURL/DisplayedURL combination is considered clean, and no further checks are performed on it. |
|
331 |
+ |
|
332 |
+In the case of the domainlist, a match means that the RealURL/displayedURL is going to be checked for phishing attempts. |
|
333 |
+ |
|
334 |
+Furthermore you can restrict what checks are to be performed by specifying the 3-digit hexnumber. |
|
335 |
+ |
|
336 |
+### Extraction of realURL, displayedURL from HTML tags |
|
337 |
+ |
|
338 |
+The html parser extracts pairs of realURL/displayedURL based on the following rules. |
|
339 |
+ |
|
340 |
+In version 0.93: After URLs have been extracted, they are normalized, and cut after the hostname. `http://test.example.com/path/somecgi?queryparameters` becomes `http://test.example.com/` |
|
341 |
+ |
|
342 |
+- `a` |
|
343 |
+ |
|
344 |
+ (anchor) the *href* is the realURL, its *contents* is the displayedURL |
|
345 |
+ |
|
346 |
+ - contents |
|
347 |
+ is the tag-stripped contents of the \<a\> tags, so for example \<b\> tags are stripped (but not their contents) |
|
348 |
+ |
|
349 |
+    nesting another \<a\> tag within an \<a\> tag (besides being invalid HTML) is treated as a \</a\>\<a.. |
|
350 |
+ |
|
351 |
+- `form` |
|
352 |
+ |
|
353 |
+ the *action* attribute is the realURL, and a nested \<a\> tag is the displayedURL |
|
354 |
+ |
|
355 |
+- `img/area` |
|
356 |
+ |
|
357 |
+ if nested within an *\<a\>* tag, the realURL is the *href* of the a tag, and the *src/dynsrc/area* is the displayedURL of the img |
|
358 |
+ |
|
359 |
+    if nested within a *form* tag, then the action attribute of the *form* tag is the realURL |
|
360 |
+ |
|
361 |
+- `iframe` |
|
362 |
+ |
|
363 |
+    if nested within an *\<a\>* tag, the *src* attribute is the displayedURL, and the *href* of its parent *a* tag is the realURL |
|
364 |
+ |
|
365 |
+    if nested within a *form* tag, then the action attribute of the *form* tag is the realURL |
|
366 |
+ |
|
367 |
+### Example |
|
368 |
+ |
|
369 |
+Consider this html file: |
|
370 |
+ |
|
371 |
+```html |
|
372 |
+<a href="evilurl">www.paypal.com</a> |
+ |
+<a href="evilurl2" title="www.ebay.com">click here to sign in</a> |
+ |
+<form action="evilurl_form"> |
+ |
+Please sign in to <a href="cgi.ebay.com">Ebay</a> using this form |
+ |
+<input type='text' name='username'>Username</input> |
+ |
+.... |
+ |
+</form> |
+ |
+<a href="evilurl"><img src="images.paypal.com/secure.jpg"></a> |
|
389 |
+``` |
|
390 |
+ |
|
391 |
+The resulting realURL/displayedURL pairs will be (note that one tag can generate multiple pairs): |
|
392 |
+ |
|
393 |
+- evilurl / www.paypal.com |
|
394 |
+ |
|
395 |
+- evilurl2 / click here to sign in |
|
396 |
+ |
|
397 |
+- evilurl2 / www.ebay.com |
|
398 |
+ |
|
399 |
+- evilurl_form / cgi.ebay.com |
|
400 |
+ |
|
401 |
+- cgi.ebay.com / Ebay |
|
402 |
+ |
|
403 |
+- evilurl / images.paypal.com/secure.jpg |
|
404 |
+ |
|
405 |
+## Simple patterns |
|
406 |
+ |
|
407 |
+Simple patterns are matched literally, i.e. if you say: |
|
408 |
+ |
|
409 |
+``` |
|
410 |
+www.google.com |
|
411 |
+``` |
|
412 |
+ |
|
413 |
+it is going to match *www.google.com*, and only that. The *. (dot)* character has no special meaning (see the section on [regular expressions](#regular-expressions) for how the *. (dot)* character behaves there). |
|
414 |
+ |
|
415 |
+## Regular expressions |
|
416 |
+ |
|
417 |
+POSIX regular expressions are supported, and you can consider each one to be wrapped internally by `^` and `$`. In other words, the regular expression has to match the entire concatenated URL (see section [RealURL, displayedURL concatenation](#realurl-displayedurl-concatenation) for details on concatenation). |
|
418 |
+ |
|
419 |
+It is recommended that you read section [Introduction to regular expressions](#introduction-to-regular-expressions) to learn how to write regular expressions, and then come back and read this for hints. |
|
420 |
+ |
|
421 |
+Be advised that clamav contains an internal, very basic regex matcher to reduce the load on the regex matching core. Thus it is recommended that you avoid using regex syntax not supported by it at the very beginning of regexes (at least the first few characters). |
|
422 |
+ |
|
423 |
+Currently the clamav regex matcher supports: |
|
424 |
+ |
|
425 |
+- `.` (dot) character |
|
426 |
+ |
|
427 |
+- `\` (backslash, for escaping special characters) |
|
428 |
+ |
|
429 |
+- `|` (pipe) alternatives |
|
430 |
+ |
|
431 |
+- `[]` (character classes) |
|
432 |
+ |
|
433 |
+- `()` (parenthesis for grouping, but no group extraction is performed) |
|
434 |
+ |
|
435 |
+- other non-special characters |
|
436 |
+ |
|
437 |
+Thus the following are not supported: |
|
438 |
+ |
|
439 |
+- `+` repetition |
|
440 |
+ |
|
441 |
+- `*` repetition |
|
442 |
+ |
|
443 |
+- `{}` repetition |
|
444 |
+ |
|
445 |
+- backreferences |
|
446 |
+ |
|
447 |
+- lookaround |
|
448 |
+ |
|
449 |
+- other “advanced” features not listed in the supported list ;) |
|
450 |
+ |
|
451 |
+This however shouldn’t discourage you from using the “not directly supported” features, because if the internal engine encounters unsupported syntax, it passes it on to the POSIX regex core (beginning from the first unsupported token; everything before that is still processed by the internal matcher). An example might make this clearer: |
|
452 |
+ |
|
453 |
+`www\.google\.(com|ro|it) ([a-zA-Z])+\.google\.(com|ro|it)` |
|
454 |
+ |
|
455 |
+Everything up to `([a-zA-Z])+` is processed internally; that parenthesis (and everything beyond) is processed by the POSIX core. |
|
456 |
+ |
|
457 |
+Examples of url pairs that match: |
|
458 |
+ |
|
459 |
+- *www.google.ro images.google.ro* |
|
460 |
+ |
|
461 |
+- www.google.com images.google.ro |
|
462 |
+ |
|
463 |
+Example of url pairs that don’t match: |
|
464 |
+ |
|
465 |
+- www.google.ro images1.google.ro |
|
466 |
+ |
|
467 |
+- images.google.com image.google.com |
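The matching and non-matching pairs listed above can be checked with a short Python sketch. It is only an approximation: Python's `re` module is not the POSIX regex core that ClamAV uses, but for this particular pattern the two should agree, and `fullmatch` stands in for the implied `^` and `$`.

```python
import re

# The example pattern from this section, written with plain backslash escapes.
PATTERN = re.compile(r"www\.google\.(com|ro|it) ([a-zA-Z])+\.google\.(com|ro|it)")

pairs = [
    "www.google.ro images.google.ro",
    "www.google.com images.google.ro",
    "www.google.ro images1.google.ro",      # the digit is outside [a-zA-Z]
    "images.google.com image.google.com",   # does not start with www
]

for pair in pairs:
    # fullmatch requires the whole string to match, like the implicit anchors.
    print(pair, "->", bool(PATTERN.fullmatch(pair)))
```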
|
468 |
+ |
|
469 |
+## Flags |
|
470 |
+ |
|
471 |
+Flags are a binary OR of the following numbers: |
|
472 |
+ |
|
473 |
+- HOST_SUFFICIENT |
|
474 |
+ |
|
475 |
+ 1 |
|
476 |
+ |
|
477 |
+- DOMAIN_SUFFICIENT |
|
478 |
+ |
|
479 |
+ 2 |
|
480 |
+ |
|
481 |
+- DO_REVERSE_LOOKUP |
|
482 |
+ |
|
483 |
+ 4 |
|
484 |
+ |
|
485 |
+- CHECK_REDIR |
|
486 |
+ |
|
487 |
+ 8 |
|
488 |
+ |
|
489 |
+- CHECK_SSL |
|
490 |
+ |
|
491 |
+ 16 |
|
492 |
+ |
|
493 |
+- CHECK_CLOAKING |
|
494 |
+ |
|
495 |
+ 32 |
|
496 |
+ |
|
497 |
+- CLEANUP_URL |
|
498 |
+ |
|
499 |
+ 64 |
|
500 |
+ |
|
501 |
+- CHECK_DOMAIN_REVERSE |
|
502 |
+ |
|
503 |
+ 128 |
|
504 |
+ |
|
505 |
+- CHECK_IMG_URL |
|
506 |
+ |
|
507 |
+ 256 |
|
508 |
+ |
|
509 |
+- DOMAINLIST_REQUIRED |
|
510 |
+ |
|
511 |
+ 512 |
|
512 |
+ |
|
513 |
+The names of the constants are self-explanatory. |
|
514 |
+ |
|
515 |
+These constants are defined in libclamav/phishcheck.h, you can check there for the latest flags. |
|
516 |
+ |
|
517 |
+There is a default set of flags that are enabled, these are currently: |
|
518 |
+ |
|
519 |
+ ( CLEANUP_URL | CHECK_SSL | CHECK_CLOAKING | CHECK_IMG_URL ) |
|
520 |
+ |
|
521 |
+SSL checking is currently performed only for `a` tags. |
|
522 |
+ |
|
523 |
+You must decide for each line in the domainlist whether you want to filter any flags (that is, you don’t want certain checks to be done), then calculate the binary OR of those constants and convert it into a 3-digit hexadecimal number. For example, you decide that DOMAIN_SUFFICIENT shouldn’t be used for ebay.com, and you don’t want to check images either, so you come up with this flag number: `2 | 256 = 258` (decimal), which is `102` in hexadecimal. |
|
524 |
+ |
|
525 |
+So you add this line to daily.wdb: |
|
526 |
+ |
|
527 |
+- R102 www.ebay.com .+ |
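The flag arithmetic can be reproduced with a few lines of Python. This is a sketch for checking the calculation by hand; the dictionary simply copies the values listed in this section, and the helper name `filter_value` is made up for the example (the authoritative definitions are in libclamav/phishcheck.h).

```python
# Flag values as listed in this section.
FLAGS = {
    "HOST_SUFFICIENT": 1,
    "DOMAIN_SUFFICIENT": 2,
    "DO_REVERSE_LOOKUP": 4,
    "CHECK_REDIR": 8,
    "CHECK_SSL": 16,
    "CHECK_CLOAKING": 32,
    "CLEANUP_URL": 64,
    "CHECK_DOMAIN_REVERSE": 128,
    "CHECK_IMG_URL": 256,
    "DOMAINLIST_REQUIRED": 512,
}


def filter_value(*names: str) -> str:
    """OR the named flags together and format the result as 3 hex digits."""
    value = 0
    for name in names:
        value |= FLAGS[name]
    return f"{value:03x}"


print(filter_value("DOMAIN_SUFFICIENT", "CHECK_IMG_URL"))  # 102
```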
|
528 |
+ |
|
529 |
+# Introduction to regular expressions |
|
530 |
+ |
|
531 |
+Recommended reading: |
|
532 |
+ |
|
533 |
+- http://www.regular-expressions.info/quickstart.html |
|
534 |
+ |
|
535 |
+- http://www.regular-expressions.info/tutorial.html |
|
536 |
+ |
|
537 |
+- regex(7) man-page: http://www.tin.org/bin/man.cgi?section=7&topic=regex |
|
538 |
+ |
|
539 |
+## Special characters |
|
540 |
+ |
|
541 |
+- \[ |
|
542 |
+ |
|
543 |
+    the opening square bracket - it marks the beginning of a character class, see section [Character classes](#character-classes) |
|
544 |
+ |
|
545 |
+- \\ |
|
546 |
+ |
|
547 |
+    the backslash - escapes special characters, see section [Escaping](#escaping) |
|
548 |
+ |
|
549 |
+- ^ |
|
550 |
+ |
|
551 |
+ the caret - matches the beginning of a line (not needed in clamav regexes, this is implied) |
|
552 |
+ |
|
553 |
+- $ |
|
554 |
+ |
|
555 |
+ the dollar sign - matches the end of a line (not needed in clamav regexes, this is implied) |
|
556 |
+ |
|
557 |
+- . |
|
558 |
+ |
|
559 |
+ the period or dot - matches *any* character |
|
560 |
+ |
|
561 |
+- | |
|
562 |
+ |
|
563 |
+    the vertical bar or pipe symbol - matches either the token on its left or the token on its right, see section [Alternation](#alternation) |
|
564 |
+ |
|
565 |
+- ? |
|
566 |
+ |
|
567 |
+    the question mark - optionally matches the left-side token, see section [Optional matching, and repetition](#optional-matching-and-repetition) |
|
568 |
+ |
|
569 |
+- \* |
|
570 |
+ |
|
571 |
+    the asterisk or star - matches 0 or more occurrences of the left-side token, see section [Optional matching, and repetition](#optional-matching-and-repetition) |
|
572 |
+ |
|
573 |
+- + |
|
574 |
+ |
|
575 |
+    the plus sign - matches 1 or more occurrences of the left-side token, see section [Optional matching, and repetition](#optional-matching-and-repetition) |
|
576 |
+ |
|
577 |
+- ( |
|
578 |
+ |
|
579 |
+    the opening round bracket - marks the beginning of a group, see section [Groups](#groups) |
|
580 |
+ |
|
581 |
+- ) |
|
582 |
+ |
|
583 |
+    the closing round bracket - marks the end of a group, see section [Groups](#groups) |
|
584 |
+ |
|
585 |
+## Character classes |
|
586 |
+ |
|
587 |
+## Escaping |
|
588 |
+ |
|
589 |
+Escaping has two purposes: |
|
590 |
+ |
|
591 |
+- it allows you to actually match the special characters themselves, for example to match the literal `+`, you would write `\+` |
|
592 |
+ |
|
593 |
+- it also allows you to match non-printable characters, such as the tab (`\t`), newline (`\n`), and so on |
|
594 |
+ |
|
595 |
+However, since non-printable characters are not valid inside a URL, you won’t have a reason to use them. |
|
596 |
+ |
|
597 |
+## Alternation |
|
598 |
+ |
|
599 |
+## Optional matching, and repetition |
|
600 |
+ |
|
601 |
+## Groups |
|
602 |
+ |
|
603 |
+Groups are usually used together with repetition or alternation. For example: *(com|it)+* means: match 1 or more repetitions of *com* or *it*, that is, it matches: com, it, comcom, comcomcom, comit, itit, ititcom, and so on. |
|
604 |
+ |
|
605 |
+Groups can also be used to extract substrings, but this is not supported by the clam engine, and not needed either in this case. |
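The behaviour of a repeated group can be tried out in Python. This is a sketch using Python's `re` module rather than the POSIX engine; for a pattern this simple the two are expected to behave the same, and `fullmatch` again plays the role of the implied `^` and `$`.

```python
import re

# One or more repetitions of "com" or "it".
GROUP = re.compile(r"(com|it)+")

candidates = ["com", "it", "comcom", "comit", "ititcom", "", "co", "org"]

for text in candidates:
    print(repr(text), "->", bool(GROUP.fullmatch(text)))
```

The empty string and `co` do not match: `+` needs at least one complete alternative.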
|
606 |
+ |
|
607 |
+# How to create database files |
|
608 |
+ |
|
609 |
+## How to create and maintain the whitelist (daily.wdb) |
|
610 |
+ |
|
611 |
+If the phishing code claims that a certain mail is phishing, but it’s not, you have 2 choices: |
|
612 |
+ |
|
613 |
+- examine your rules in daily.pdb, and fix them if necessary (see section [How to create and maintain the domainlist (daily.pdb)](#how-to-create-and-maintain-the-domainlist-dailypdb)) |
|
614 |
+ |
|
615 |
+- add it to the whitelist (discussed here) |
|
616 |
+ |
|
617 |
+Let’s assume you are having problems because of links like this in a mail: |
|
618 |
+ |
|
619 |
+```html |
|
620 |
+    <a href="http://69.0.241.57/bCentral/L.asp?L=XXXXXXXX"> |
|
621 |
+ http://www.bcentral.it/ |
|
622 |
+ </a> |
|
623 |
+``` |
|
624 |
+ |
|
625 |
+After investigating those sites further, you decide they are no threat, and create a line like this in daily.wdb: |
|
626 |
+ |
|
627 |
+``` |
|
628 |
+R http://www\.bcentral\.it/.+ |
|
629 |
+http://69\.0\.241\.57/bCentral/L\.asp?L=.+ |
|
630 |
+``` |
|
631 |
+ |
|
632 |
+Note: urls like the above can be used to track unique mail recipients, and thus know if somebody actually reads mails (so they can send more spam). However since this site required no authentication information, it is safe from a phishing point of view. |
|
633 |
+ |
|
634 |
+## How to create and maintain the domainlist (daily.pdb) |
|
635 |
+ |
|
636 |
+When not using `--phish-scan-alldomains` (production environments for example), you need to decide which URLs you are going to check. |
|
637 |
+ |
|
638 |
+Although at first glance it might seem a good idea to check everything, it would produce false positives. Newsletters, ads, and similar mail in particular are likely to use URLs that look like phishing attempts. |
|
639 |
+ |
|
640 |
+Let’s assume that you’ve recently seen many phishing attempts claiming they come from Paypal. Thus you need to add paypal to daily.pdb: |
|
641 |
+ |
|
642 |
+``` |
|
643 |
+R .+ .+\.paypal\.com |
|
644 |
+``` |
|
645 |
+ |
|
646 |
+The above line will block (detect as phishing) mails that contain URLs that claim to lead to paypal but in fact do not. |
|
647 |
+ |
|
648 |
+Be careful not to create regexes that match too broad a range of URLs, though. |
|
649 |
+ |
|
650 |
+## Dealing with false positives, and undetected phishing mails |
|
651 |
+ |
|
652 |
+### False positives |
|
653 |
+ |
|
654 |
+Whenever you see a false positive (mail that is detected as phishing, but it’s not), you need to examine *why* clamav decided that it’s phishing. You can do this easily by building clamav with debugging (`./configure --enable-experimental --enable-debug`), and then running a tool: |
|
655 |
+ |
|
656 |
+```bash |
|
657 |
+$ contrib/phishing/why.py phishing.eml |
|
658 |
+``` |
|
659 |
+ |
|
660 |
+This will show the URL that triggers the phish verdict, and a reason why that URL is considered a phishing attempt. |
|
661 |
+ |
|
662 |
+Once you know the reason, you might need to modify daily.pdb (if one of your rules in there is too broad), or you need to add the URL to daily.wdb. If you think the algorithm is incorrect, please file a bug report on bugzilla.clamav.net, including the output of *why.py*. |
|
663 |
+ |
|
664 |
+### Undetected phish mails |
|
665 |
+ |
|
666 |
+Using why.py doesn’t help here unfortunately (it will say: clean), so all you can do is: |
|
667 |
+ |
|
668 |
+```bash |
|
669 |
+$ clamscan/clamscan --phish-scan-alldomains undetected.eml |
|
670 |
+``` |
|
671 |
+ |
|
672 |
+Then see if the mail is detected. If it is, you need to add an appropriate line to daily.pdb (see section [How to create and maintain the domainlist (daily.pdb)](#how-to-create-and-maintain-the-domainlist-dailypdb)). |
|
673 |
+ |
|
674 |
+If the mail is not detected, then try using: |
|
675 |
+ |
|
676 |
+```bash |
|
677 |
+$ clamscan/clamscan --debug undetected.eml | less |
|
678 |
+``` |
|
679 |
+ |
|
680 |
+Then see what urls are being checked, see if any of them is in a whitelist, see if all urls are detected, etc. |
0 | 681 |
new file mode 100644 |
... | ... |
@@ -0,0 +1,1013 @@ |
0 |
+# Creating signatures for ClamAV |
|
1 |
+ |
|
2 |
+- [Creating signatures for ClamAV](#creating-signatures-for-clamav) |
|
3 |
+- [Introduction](#introduction) |
|
4 |
+- [Debug information from libclamav](#debug-information-from-libclamav) |
|
5 |
+- [Signature formats](#signature-formats) |
|
6 |
+ - [Hash-based signatures](#hash-based-signatures) |
|
7 |
+ - [MD5 hash-based signatures](#md5-hash-based-signatures) |
|
8 |
+ - [SHA1 and SHA256 hash-based signatures](#sha1-and-sha256-hash-based-signatures) |
|
9 |
+ - [PE section based hash signatures](#pe-section-based-hash-signatures) |
|
10 |
+ - [Hash signatures with unknown size](#hash-signatures-with-unknown-size) |
|
11 |
+ - [Body-based signatures](#body-based-signatures) |
|
12 |
+ - [Hexadecimal format](#hexadecimal-format) |
|
13 |
+ - [Wildcards](#wildcards) |
|
14 |
+ - [Character classes](#character-classes) |
|
15 |
+ - [Alternate strings](#alternate-strings) |
|
16 |
+ - [Basic signature format](#basic-signature-format) |
|
17 |
+ - [Extended signature format](#extended-signature-format) |
|
18 |
+ - [Logical signatures](#logical-signatures) |
|
19 |
+ - [Subsignature Modifiers](#subsignature-modifiers) |
|
20 |
+ - [Special Subsignature Types](#special-subsignature-types) |
|
21 |
+ - [Macro subsignatures (clamav-0.96) : <span class="nodecor">`${min-max}MACROID$`</span>](#macro-subsignatures-clamav-096-span-classnodecormin-maxmacroidspan) |
|
22 |
+ - [PCRE subsignatures (clamav-0.99) : <span class="nodecor">`Trigger/PCRE/[Flags]`</span>](#pcre-subsignatures-clamav-099-span-classnodecortriggerpcreflagsspan) |
|
23 |
+ - [Icon signatures for PE files](#icon-signatures-for-pe-files) |
|
24 |
+ - [Signatures for Version Information metadata in PE files](#signatures-for-version-information-metadata-in-pe-files) |
|
25 |
+ - [Trusted and Revoked Certificates](#trusted-and-revoked-certificates) |
|
26 |
+ - [Signatures based on container metadata](#signatures-based-on-container-metadata) |
|
27 |
+ - [Signatures based on ZIP/RAR metadata (obsolete)](#signatures-based-on-ziprar-metadata-obsolete) |
|
28 |
+ - [Whitelist databases](#whitelist-databases) |
|
29 |
+ - [Signature names](#signature-names) |
|
30 |
+ - [Using YARA rules in ClamAV](#using-yara-rules-in-clamav) |
|
31 |
+ - [Passwords for archive files \[experimental\]](#passwords-for-archive-files-experimental) |
|
32 |
+- [Special files](#special-files) |
|
33 |
+ - [HTML](#html) |
|
34 |
+ - [Text files](#text-files) |
|
35 |
+ - [Compressed Portable Executable files](#compressed-portable-executable-files) |
|
36 |
+ |
|
37 |
+# Introduction |
|
38 |
+ |
|
39 |
+CVD (ClamAV Virus Database) is a digitally signed container that includes signature databases in various text formats. The header of the container is a 512-byte string with colon-separated fields: |
|
40 |
+ |
|
41 |
+``` |
|
42 |
+ClamAV-VDB:build time:version:number of signatures:functionality level required:MD5 checksum:digital signature:builder name:build time (sec) |
|
43 |
+``` |
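To make the header layout concrete, here is a minimal Python sketch that splits such a header into named fields. It is an illustration under stated assumptions, not a replacement for `sigtool`: the field names and the function `parse_cvd_header` are invented for this example, the header is assumed to be padded with spaces up to 512 bytes, and no field is assumed to contain a colon of its own. The sample header uses placeholder values, not real CVD data.

```python
# Hypothetical field names, in the order given by the header format above.
FIELDS = (
    "magic", "build_time", "version", "num_signatures",
    "functionality_level", "md5", "digital_signature",
    "builder", "build_time_sec",
)


def parse_cvd_header(header: bytes) -> dict:
    """Split the first 512 bytes of a CVD file into colon-separated fields."""
    text = header[:512].decode("ascii", errors="replace").rstrip()
    values = text.split(":")
    if len(values) != len(FIELDS) or values[0] != "ClamAV-VDB":
        raise ValueError("not a CVD header in the assumed layout")
    return dict(zip(FIELDS, values))


# Placeholder header, padded to 512 bytes; the values are made up.
sample = b"ClamAV-VDB:build time:45:169676:21:md5sum:signature:sven:0".ljust(512)
info = parse_cvd_header(sample)
print(info["version"], info["num_signatures"], info["builder"])
```

For a real file, the first 512 bytes could be read with `open(path, "rb").read(512)` and passed to the function; `sigtool --info` remains the tool that also verifies the digital signature.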
|
44 |
+ |
|
45 |
+`sigtool --info` displays detailed information about a given CVD file: |
|
46 |
+ |
|
47 |
+```bash |
|
48 |
+zolw@localhost:/usr/local/share/clamav$ sigtool -i main.cvd |
|
49 |
+File: main.cvd |
|
50 |
+Build time: 09 Dec 2007 15:50 +0000 |
|
51 |
+Version: 45 |
|
52 |
+Signatures: 169676 |
|
53 |
+Functionality level: 21 |
|
54 |
+Builder: sven |
|
55 |
+MD5: b35429d8d5d60368eea9630062f7c75a |
|
56 |
+Digital signature: dxsusO/HWP3/GAA7VuZpxYwVsE9b+tCk+tPN6OyjVF/U8 |
|
57 |
+JVh4vYmW8mZ62ZHYMlM903TMZFg5hZIxcjQB3SX0TapdF1SFNzoWjsyH53eXvMDY |
|
58 |
+eaPVNe2ccXLfEegoda4xU2TezbGfbSEGoU1qolyQYLX674sNA2Ni6l6/CEKYYh |
|
59 |
+Verification OK. |
|
60 |
+``` |
|
61 |
+ |
|
62 |
+The ClamAV project distributes a number of CVD files, including *main.cvd* and *daily.cvd*. |
|
63 |
+ |
|
64 |
+# Debug information from libclamav |
|
65 |
+ |
|
66 |
+In order to create efficient signatures for ClamAV it’s important to understand how the engine handles input files. The best way to see how it works is having a look at the debug information from libclamav. You can do it by calling `clamscan` with the `--debug` and `--leave-temps` flags. The first switch makes clamscan display all the interesting information from libclamav and the second one avoids deleting temporary files so they can be analyzed further. |
|
67 |
+ |
|
68 |
+The important part of the output is: |
|
69 |
+ |
|
70 |
+```bash |
|
71 |
+$ clamscan --debug attachment.exe |
|
72 |
+[...] |
|
73 |
+LibClamAV debug: Recognized MS-EXE/DLL file |
|
74 |
+LibClamAV debug: Matched signature for file type PE |
|
75 |
+LibClamAV debug: File type: Executable |
|
76 |
+``` |
|
77 |
+ |
|
78 |
+The engine recognized a Windows executable. |
|
79 |
+ |
|
80 |
+```bash |
|
81 |
+LibClamAV debug: Machine type: 80386 |
|
82 |
+LibClamAV debug: NumberOfSections: 3 |
|
83 |
+LibClamAV debug: TimeDateStamp: Fri Jan 10 04:57:55 2003 |
|
84 |
+LibClamAV debug: SizeOfOptionalHeader: e0 |
|
85 |
+LibClamAV debug: File format: PE |
|
86 |
+LibClamAV debug: MajorLinkerVersion: 6 |
|
87 |
+LibClamAV debug: MinorLinkerVersion: 0 |
|
88 |
+LibClamAV debug: SizeOfCode: 0x9000 |
|
89 |
+LibClamAV debug: SizeOfInitializedData: 0x1000 |
|
90 |
+LibClamAV debug: SizeOfUninitializedData: 0x1e000 |
|
91 |
+LibClamAV debug: AddressOfEntryPoint: 0x27070 |
|
92 |
+LibClamAV debug: BaseOfCode: 0x1f000 |
|
93 |
+LibClamAV debug: SectionAlignment: 0x1000 |
|
94 |
+LibClamAV debug: FileAlignment: 0x200 |
|
95 |
+LibClamAV debug: MajorSubsystemVersion: 4 |
|
96 |
+LibClamAV debug: MinorSubsystemVersion: 0 |
|
97 |
+LibClamAV debug: SizeOfImage: 0x29000 |
|
98 |
+LibClamAV debug: SizeOfHeaders: 0x400 |
|
99 |
+LibClamAV debug: NumberOfRvaAndSizes: 16 |
|
100 |
+LibClamAV debug: Subsystem: Win32 GUI |
|
101 |
+LibClamAV debug: ------------------------------------ |
|
102 |
+LibClamAV debug: Section 0 |
|
103 |
+LibClamAV debug: Section name: UPX0 |
|
104 |
+LibClamAV debug: Section data (from headers - in memory) |
|
105 |
+LibClamAV debug: VirtualSize: 0x1e000 0x1e000 |
|
106 |
+LibClamAV debug: VirtualAddress: 0x1000 0x1000 |
|
107 |
+LibClamAV debug: SizeOfRawData: 0x0 0x0 |
|
108 |
+LibClamAV debug: PointerToRawData: 0x400 0x400 |
|
109 |
+LibClamAV debug: Section's memory is executable |
|
110 |
+LibClamAV debug: Section's memory is writeable |
|
111 |
+LibClamAV debug: ------------------------------------ |
|
112 |
+LibClamAV debug: Section 1 |
|
113 |
+LibClamAV debug: Section name: UPX1 |
|
114 |
+LibClamAV debug: Section data (from headers - in memory) |
|
115 |
+LibClamAV debug: VirtualSize: 0x9000 0x9000 |
|
116 |
+LibClamAV debug: VirtualAddress: 0x1f000 0x1f000 |
|
117 |
+LibClamAV debug: SizeOfRawData: 0x8200 0x8200 |
|
118 |
+LibClamAV debug: PointerToRawData: 0x400 0x400 |
|
119 |
+LibClamAV debug: Section's memory is executable |
|
120 |
+LibClamAV debug: Section's memory is writeable |
|
121 |
+LibClamAV debug: ------------------------------------ |
|
122 |
+LibClamAV debug: Section 2 |
|
123 |
+LibClamAV debug: Section name: UPX2 |
|
124 |
+LibClamAV debug: Section data (from headers - in memory) |
|
125 |
+LibClamAV debug: VirtualSize: 0x1000 0x1000 |
|
126 |
+LibClamAV debug: VirtualAddress: 0x28000 0x28000 |
|
127 |
+LibClamAV debug: SizeOfRawData: 0x200 0x1ff |
|
128 |
+LibClamAV debug: PointerToRawData: 0x8600 0x8600 |
|
129 |
+LibClamAV debug: Section's memory is writeable |
|
130 |
+LibClamAV debug: ------------------------------------ |
|
131 |
+LibClamAV debug: EntryPoint offset: 0x8470 (33904) |
|
132 |
+``` |
|
133 |
+ |
|
134 |
+The section structure displayed above suggests the executable is packed |
|
135 |
+with UPX. |
|
136 |
+ |
|
137 |
+```bash |
|
138 |
+LibClamAV debug: ------------------------------------ |
|
139 |
+LibClamAV debug: EntryPoint offset: 0x8470 (33904) |
|
140 |
+LibClamAV debug: UPX/FSG/MEW: empty section found - assuming |
|
141 |
+ compression |
|
142 |
+LibClamAV debug: UPX: bad magic - scanning for imports |
|
143 |
+LibClamAV debug: UPX: PE structure rebuilt from compressed file |
|
144 |
+LibClamAV debug: UPX: Successfully decompressed with NRV2B |
|
145 |
+LibClamAV debug: UPX/FSG: Decompressed data saved in |
|
146 |
+ /tmp/clamav-90d2d25c9dca42bae6fa9a764a4bcede |
|
147 |
+LibClamAV debug: ***** Scanning decompressed file ***** |
|
148 |
+LibClamAV debug: Recognized MS-EXE/DLL file |
|
149 |
+LibClamAV debug: Matched signature for file type PE |
|
150 |
+``` |
|
151 |
+ |
|
152 |
+Indeed, libclamav recognizes the UPX data and saves the decompressed |
|
153 |
+(and rebuilt) executable into |
|
154 |
+`/tmp/clamav-90d2d25c9dca42bae6fa9a764a4bcede`. Then it continues by |
|
155 |
+scanning this new file: |
|
156 |
+ |
|
157 |
+```bash |
|
158 |
+LibClamAV debug: File type: Executable |
|
159 |
+LibClamAV debug: Machine type: 80386 |
|
160 |
+LibClamAV debug: NumberOfSections: 3 |
|
161 |
+LibClamAV debug: TimeDateStamp: Thu Jan 27 11:43:15 2011 |
|
162 |
+LibClamAV debug: SizeOfOptionalHeader: e0 |
|
163 |
+LibClamAV debug: File format: PE |
|
164 |
+LibClamAV debug: MajorLinkerVersion: 6 |
|
165 |
+LibClamAV debug: MinorLinkerVersion: 0 |
|
166 |
+LibClamAV debug: SizeOfCode: 0xc000 |
|
167 |
+LibClamAV debug: SizeOfInitializedData: 0x19000 |
|
168 |
+LibClamAV debug: SizeOfUninitializedData: 0x0 |
|
169 |
+LibClamAV debug: AddressOfEntryPoint: 0x7b9f |
|
170 |
+LibClamAV debug: BaseOfCode: 0x1000 |
|
171 |
+LibClamAV debug: SectionAlignment: 0x1000 |
|
172 |
+LibClamAV debug: FileAlignment: 0x1000 |
|
173 |
+LibClamAV debug: MajorSubsystemVersion: 4 |
|
174 |
+LibClamAV debug: MinorSubsystemVersion: 0 |
|
175 |
+LibClamAV debug: SizeOfImage: 0x26000 |
|
176 |
+LibClamAV debug: SizeOfHeaders: 0x1000 |
|
177 |
+LibClamAV debug: NumberOfRvaAndSizes: 16 |
|
178 |
+LibClamAV debug: Subsystem: Win32 GUI |
|
179 |
+LibClamAV debug: ------------------------------------ |
|
180 |
+LibClamAV debug: Section 0 |
|
181 |
+LibClamAV debug: Section name: .text |
|
182 |
+LibClamAV debug: Section data (from headers - in memory) |
|
183 |
+LibClamAV debug: VirtualSize: 0xc000 0xc000 |
|
184 |
+LibClamAV debug: VirtualAddress: 0x1000 0x1000 |
|
185 |
+LibClamAV debug: SizeOfRawData: 0xc000 0xc000 |
|
186 |
+LibClamAV debug: PointerToRawData: 0x1000 0x1000 |
|
187 |
+LibClamAV debug: Section contains executable code |
|
188 |
+LibClamAV debug: Section's memory is executable |
|
189 |
+LibClamAV debug: ------------------------------------ |
|
190 |
+LibClamAV debug: Section 1 |
|
191 |
+LibClamAV debug: Section name: .rdata |
|
192 |
+LibClamAV debug: Section data (from headers - in memory) |
|
193 |
+LibClamAV debug: VirtualSize: 0x2000 0x2000 |
|
194 |
+LibClamAV debug: VirtualAddress: 0xd000 0xd000 |
|
195 |
+LibClamAV debug: SizeOfRawData: 0x2000 0x2000 |
|
196 |
+LibClamAV debug: PointerToRawData: 0xd000 0xd000 |
|
197 |
+LibClamAV debug: ------------------------------------ |
|
198 |
+LibClamAV debug: Section 2 |
|
199 |
+LibClamAV debug: Section name: .data |
|
200 |
+LibClamAV debug: Section data (from headers - in memory) |
|
201 |
+LibClamAV debug: VirtualSize: 0x17000 0x17000 |
|
202 |
+LibClamAV debug: VirtualAddress: 0xf000 0xf000 |
|
203 |
+LibClamAV debug: SizeOfRawData: 0x17000 0x17000 |
|
204 |
+LibClamAV debug: PointerToRawData: 0xf000 0xf000 |
|
205 |
+LibClamAV debug: Section's memory is writeable |
|
206 |
+LibClamAV debug: ------------------------------------ |
|
207 |
+LibClamAV debug: EntryPoint offset: 0x7b9f (31647) |
|
208 |
+LibClamAV debug: Bytecode executing hook id 257 (0 hooks) |
|
209 |
+attachment.exe: OK |
|
210 |
+[...] |
|
211 |
+``` |
|
212 |
+ |
|
213 |
+No additional files get created by libclamav. By writing a signature for the decompressed file, you improve the chances that the engine will detect the target data when it gets compressed with another packer. |
|
214 |
+ |
|
215 |
+This method should be applied to all files for which you want to create signatures. By analyzing the debug information you can quickly see how the engine recognizes and preprocesses the data and what additional files get created. Signatures created for bottom-level temporary files are usually more generic and should help detecting the same malware in different forms. |
|
216 |
+ |
|
217 |
+# Signature formats |
|
218 |
+ |
|
219 |
+## Hash-based signatures |
|
220 |
+ |
|
221 |
+The easiest way to create signatures for ClamAV is to use file hash checksums; however, this method can only be used against static malware. |
|
222 |
+ |
|
223 |
+### MD5 hash-based signatures |
|
224 |
+ |
|
225 |
+To create an MD5 signature for `test.exe` use the `--md5` option of |
|
226 |
+sigtool: |
|
227 |
+ |
|
228 |
+```bash |
|
229 |
+zolw@localhost:/tmp/test$ sigtool --md5 test.exe > test.hdb |
|
230 |
+zolw@localhost:/tmp/test$ cat test.hdb |
|
231 |
+48c4533230e1ae1c118c741c0db19dfb:17387:test.exe |
|
232 |
+``` |
|
233 |
+ |
|
234 |
+That’s it! The signature is ready for use: |
|
235 |
+ |
|
236 |
+```bash |
|
237 |
+zolw@localhost:/tmp/test$ clamscan -d test.hdb test.exe |
|
238 |
+test.exe: test.exe FOUND |
|
239 |
+ |
|
240 |
+----------- SCAN SUMMARY ----------- |
|
241 |
+Known viruses: 1 |
|
242 |
+Scanned directories: 0 |
|
243 |
+Engine version: 0.92.1 |
|
244 |
+Scanned files: 1 |
|
245 |
+Infected files: 1 |
|
246 |
+Data scanned: 0.02 MB |
|
247 |
+Time: 0.024 sec (0 m 0 s) |
|
248 |
+``` |
|
249 |
+ |
|
250 |
+You can change the name (by default sigtool uses the name of the file) and place it inside a `*.hdb` file. A single database file can include any number of signatures. To get them automatically loaded each time clamscan/clamd starts, just copy the database file(s) into the local virus database directory (e.g. /usr/local/share/clamav). |
|
251 |
+ |
|
252 |
+*The hash-based signatures shall not be used for text files, HTML and any other data that gets internally preprocessed before pattern matching. If you really want to use a hash signature in such a case, run clamscan with the `--debug` and `--leave-temps` flags as described above and create a signature for a preprocessed file left in /tmp. Please keep in mind that a hash signature will stop matching as soon as a single byte changes in the target file.* |
|
253 |
+ |
|
254 |
+### SHA1 and SHA256 hash-based signatures |
|
255 |
+ |
|
256 |
+ClamAV 0.98 also added support for SHA1 and SHA256 file checksums. The format is the same as for MD5 file checksums. ClamAV differentiates between them based on the length of the hash string in the signature. For best backwards compatibility, these should be placed inside a `*.hsb` file. The format is: |
|
257 |
+ |
|
258 |
+``` |
|
259 |
+HashString:FileSize:MalwareName |
|
260 |
+``` |
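The `HashString:FileSize:MalwareName` layout can be illustrated with Python's standard `hashlib` module. This is only a sketch of how such a line is composed; the function name `hash_signature` and the signature name `Example.Sig` are made up, and for real databases `sigtool` is the supported way to generate entries.

```python
import hashlib


def hash_signature(data: bytes, name: str, algorithm: str = "md5") -> str:
    """Build a HashString:FileSize:MalwareName line for the given bytes."""
    digest = hashlib.new(algorithm, data).hexdigest()
    return f"{digest}:{len(data)}:{name}"


# MD5 line (as used in *.hdb) and SHA256 line (as used in *.hsb).
print(hash_signature(b"abc", "Example.Sig"))
print(hash_signature(b"abc", "Example.Sig", "sha256"))
```

The two lines differ only in the length of the hash string, which is what allows the hash type to be told apart.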
|
261 |
+ |
|
262 |
+### PE section based hash signatures |
|
263 |
+ |
|
264 |
+You can create a hash signature for a specific section in a PE file. Such signatures shall be stored inside `.mdb` files in the following format: |
|
265 |
+ |
|
266 |
+``` |
|
267 |
+PESectionSize:PESectionHash:MalwareName |
|
268 |
+``` |
|
269 |
+ |
|
270 |
+The easiest way to generate MD5-based section signatures is to extract target PE sections into separate files and then run sigtool with the option `--mdb`. |
|
271 |
+ |
|
272 |
+ClamAV 0.98 has also added support for SHA1 and SHA256 section based signatures. The format is the same as for MD5 PE section based signatures. It can differentiate between them based on the length of the hash string in the signature. For best backwards compatibility, these should be placed inside a `*.msb` file. |
|
273 |
+ |
|
274 |
+### Hash signatures with unknown size |
|
275 |
+ |
|
276 |
+ClamAV 0.98 also added support for hash signatures where the hash is known but the size is not. Signatures with specific sizes are much more performance-efficient, so be cautious when using this feature. For these cases, the `*` character can be used in the size field. To ensure proper backwards compatibility with older versions of ClamAV, these signatures must have a minimum functional level of 73 or higher. Signatures that use the wildcard size without this level set will be rejected as malformed. |
|
277 |
+ |
|
278 |
+``` |
|
279 |
+Sample .hsb signature matching any size |
|
280 |
+HashString:*:MalwareName:73 |
|
281 |
+ |
|
282 |
+Sample .msb signature matching any size |
|
283 |
+*:PESectionHash:MalwareName:73 |
|
284 |
+``` |
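The rules above (hash type inferred from the hash length, `*` allowed as the size only with a functionality level of 73 or higher) can be sketched as a small parser. This is an illustration for the file-hash field order only; the names `HashSig` and `parse_file_hash_sig` are hypothetical, and the checks are a simplification rather than a description of how ClamAV validates its databases.

```python
from typing import NamedTuple, Optional

# Hex digest lengths of the hash types named in this section.
HASH_TYPES = {32: "md5", 40: "sha1", 64: "sha256"}


class HashSig(NamedTuple):
    hash_type: str
    digest: str
    size: Optional[int]      # None when the size field is the '*' wildcard
    name: str
    min_fl: Optional[int]    # minimum functionality level, if given


def parse_file_hash_sig(line: str) -> HashSig:
    """Parse HashString:FileSize:MalwareName[:MinFL]."""
    fields = line.strip().split(":")
    if len(fields) < 3:
        raise ValueError("expected at least three ':'-separated fields")
    digest, size, name = fields[0], fields[1], fields[2]
    min_fl = int(fields[3]) if len(fields) > 3 else None
    if len(digest) not in HASH_TYPES:
        raise ValueError(f"unexpected hash length {len(digest)}")
    if size == "*" and (min_fl is None or min_fl < 73):
        raise ValueError("wildcard size requires functionality level 73 or higher")
    parsed_size = None if size == "*" else int(size)
    return HashSig(HASH_TYPES[len(digest)], digest.lower(), parsed_size, name, min_fl)
```

For PE section signatures the first two fields are swapped (`PESectionSize:PESectionHash`), so a corresponding parser would read them in the opposite order.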
|
285 |
+ |
|
286 |
+## Body-based signatures |
|
287 |
+ |
|
288 |
+ClamAV stores all body-based signatures in a hexadecimal format. In this section, a hex-signature means a fragment of a malware body converted into a hexadecimal string, which can additionally be extended using various wildcards. |
|
289 |
+ |
|
290 |
+### Hexadecimal format |
|
291 |
+ |
|
292 |
+You can use `sigtool --hex-dump` to convert any data into a hex-string: |
|
293 |
+ |
|
294 |
+```bash |
|
295 |
+zolw@localhost:/tmp/test$ sigtool --hex-dump |
|
296 |
+How do I look in hex? |
|
297 |
+486f7720646f2049206c6f6f6b20696e206865783f0a |
|
298 |
+``` |
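The same conversion can be reproduced outside sigtool, which is handy when generating signatures from a script. A minimal Python sketch (the helper name `hex_dump` is hypothetical):

```python
def hex_dump(data: bytes) -> str:
    """Return the lowercase hexadecimal string for the given bytes."""
    return data.hex()


if __name__ == "__main__":
    # Same input as the sigtool session above, including the trailing newline.
    print(hex_dump(b"How do I look in hex?\n"))
```

Note that the trailing `0a` in the sigtool output is the newline typed after the question mark; omit it from the input if it is not part of the pattern.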
|
299 |
+ |
|
300 |
+### Wildcards |
|
301 |
+ |
|
302 |
+ClamAV supports the following wildcards for hex-signatures: |
|
303 |
+ |
|
304 |
+- `??` |
|
305 |
+ |
|
306 |
+ Match any byte. |
|
307 |
+ |
|
308 |
+- `a?` |
|
309 |
+ |
|
310 |
+ Match a high nibble (the four high bits). |
|
311 |
+ **IMPORTANT NOTE:** The nibble matching is only available in |
|
312 |
+ libclamav with functionality level 17 and higher; therefore |
|
313 |
+ please only use it with .ndb signatures followed by ":17" |
|
314 |
+ (MinEngineFunctionalityLevel, see [3.2.7](#ndb)). |
|
315 |
+ |
|
316 |
+- `?a` |
|
317 |
+ |
|
318 |
+ Match a low nibble (the four low bits). |
|
319 |
+ |
|
320 |
+- `*` |
|
321 |
+ |
|
322 |
+ Match any number of bytes. |
|
323 |
+ |
|
324 |
+- `{n}` |
|
325 |
+ |
|
326 |
+ Match `n` bytes. |
|
327 |
+ |
|
328 |
+- `{-n}` |
|
329 |
+ |
|
330 |
+ Match `n` or fewer bytes. |
|
331 |
+ |
|
332 |
+- `{n-}` |
|
333 |
+ |
|
334 |
+ Match `n` or more bytes. |
|
335 |
+ |
|
336 |
+- `{n-m}` |
|
337 |
+ |
|
338 |
+ Match between `n` and `m` bytes (`m > n`). |
|
339 |
+ |
|
340 |
+- `HEXSIG[x-y]aa` or `aa[x-y]HEXSIG` |
|
341 |
+ |
|
342 |
+ Match aa anchored to a hex-signature, see |
|
343 |
+ <https://bugzilla.clamav.net/show_bug.cgi?id=776> for discussion and |
|
344 |
+ examples. |
|
345 |
+ |
|
346 |
+The range wildcards `*` and `{}` virtually separate a hex-signature into two parts, e.g. `aabbcc*bbaacc` is treated as two sub-signatures `aabbcc` and `bbaacc` with any number of bytes between them. Each sub-signature is required to include a block of two static characters somewhere in its body. There is one exception to this restriction: when the range wildcard is of the form `{n}` with `n<128`, ClamAV uses an optimization and translates `{n}` into a string of `n` `??` character wildcards. Character wildcards do not divide hex signatures into two parts, so the two static character requirement does not apply. |
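To make the wildcard semantics concrete, the sketch below translates a hex-signature into a Python `re` pattern over bytes. It only illustrates what each wildcard means; it is not how ClamAV matches signatures internally, and it does not handle alternates, `[x-y]` anchors or character classes. The function name `hexsig_to_regex` is hypothetical, and malformed input (for example an odd number of hex digits) is not validated.

```python
import re


def hexsig_to_regex(sig: str) -> "re.Pattern[bytes]":
    """Translate a hex-signature with basic wildcards into a bytes regex."""
    parts = []
    i = 0
    while i < len(sig):
        if sig[i] == "*":                      # any number of bytes
            parts.append(".*")
            i += 1
        elif sig[i] == "{":                    # {n}, {-n}, {n-} or {n-m}
            end = sig.index("}", i)
            body = sig[i + 1:end]
            if "-" not in body:
                parts.append(".{%d}" % int(body))
            else:
                low, high = body.split("-", 1)
                parts.append(".{%s,%s}" % (low or "0", high))
            i = end + 1
        else:
            hi, lo = sig[i], sig[i + 1]
            if hi == "?" and lo == "?":        # ?? matches any byte
                parts.append(".")
            elif lo == "?":                    # a? fixes the high nibble
                parts.append("[\\x%s0-\\x%sf]" % (hi, hi))
            elif hi == "?":                    # ?a fixes the low nibble
                members = "".join("\\x%x%s" % (h, lo) for h in range(16))
                parts.append("[%s]" % members)
            else:                              # literal byte
                parts.append("\\x%s%s" % (hi, lo))
            i += 2
    # DOTALL so that '.' also matches the newline byte.
    return re.compile("".join(parts).encode("ascii"), re.DOTALL)
```

For example, `hexsig_to_regex("aabbcc*bbaacc").search(data)` returns a match object when `data` contains the bytes `aa bb cc`, followed later by `bb aa cc`.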
|
347 |
+ |
|
348 |
+### Character classes |
|
349 |
+ |
|
350 |
+ClamAV supports the following character classes for hex-signatures: |
|
351 |
+ |
|
352 |
+- `(B)` |
|
353 |
+ |
|
354 |
+ Match word boundary (including file boundaries). |
|
355 |
+ |
|
356 |
+- `(L)` |
|
357 |
+ |
|
358 |
+ Match CR, CRLF or file boundaries. |
|
359 |
+ |
|
360 |
+- `(W)` |
|
361 |
+ |
|
362 |
+ Match a non-alphanumeric character. |
|
363 |
+ |
|
364 |
+### Alternate strings |
|
365 |
+ |
|
366 |
+- Single-byte alternates (clamav-0.96) `(aa|bb|cc|...)` or `!(aa|bb|cc|...)` Match a member from a set of bytes \[aa, bb, cc, ...\]. |
|
367 |
+ - Negation operation can be applied to match any non-member, assumed to be one-byte in length. |
|
368 |
+ - Signature modifiers and wildcards cannot be applied. |
|
369 |
+ |
|
370 |
+- Multi-byte fixed length alternates `(aaaa|bbbb|cccc|...)` or `!(aaaa|bbbb|cccc|...)` Match a member from a set of multi-byte alternates \[aaaa, bbbb, cccc, ...\] of n-length. |
|
371 |
+ - All set members must be the same length. |
|
372 |
+ - Negation operation can be applied to match any non-member, assumed to be n-bytes in length (clamav-0.98.2). |
|
373 |
+ - Signature modifiers and wildcards cannot be applied. |
|
374 |
+ |
|
375 |
+- Generic alternates (clamav-0.99) `(alt1|alt2|alt3|...)` Match a member from a set of alternates \[alt1, alt2, alt3, ...\] that can be of variable lengths. |
|
376 |
+ - Negation operation cannot be applied. |
|
377 |
+ - Signature modifiers and nibble wildcards \[`??, a?, ?a`\] can be applied. |
|
378 |
+ - Ranged wildcards \[`{n-m}`\] are limited to a fixed range of less than 128 bytes \[`{1} -> {127}`\]. |
|
379 |
+ |
|
380 |
+Note that using signature modifiers and wildcards makes ClamAV classify the alternate as a generic alternate. Single-byte alternates and multi-byte fixed length alternates can therefore use signature modifiers and wildcards, but they will be classified as generic alternates. This means that negation cannot be applied in this situation and there is a slight performance impact. |
|
381 |
+ |
|
382 |
+### Basic signature format |
|
383 |
+ |
|
384 |
+The simplest (and now deprecated) signature format is: |
|
385 |
+ |
|
386 |
+``` |
|
387 |
+MalwareName=HexSignature |
|
388 |
+``` |
|
389 |
+ |
|
390 |
+ClamAV will scan the entire file looking for HexSignature. All signatures of this type must be placed inside `*.db` files. |
|
391 |
+ |
|
392 |
+### Extended signature format |
|
393 |
+ |
|
394 |
+The extended signature format allows for specification of additional information such as a target file type, virus offset or engine version, making the detection more reliable. The format is: |
|
395 |
+ |
|
396 |
+``` |
|
397 |
+MalwareName:TargetType:Offset:HexSignature[:MinFL:[MaxFL]] |
|
398 |
+``` |
|
399 |
+ |
|
400 |
+where `TargetType` is one of the following numbers specifying the type of the target file: |
|
401 |
+ |
|
402 |
+- 0 = any file |
|
403 |
+ |
|
404 |
+- 1 = Portable Executable, both 32- and 64-bit. |
|
405 |
+ |
|
406 |
+- 2 = OLE2 containers, including their specific macros. The OLE2 format is primarily used by MS Office and MSI installation files. |
|
407 |
+ |
|
408 |
+- 3 = HTML (normalized: whitespace transformed to spaces, tags/tag attributes normalized, all lowercase). JavaScript is normalized too: all strings are normalized (hex encoding is decoded), numbers are parsed and normalized, local variables/function names are normalized to the ’n001’ format, the argument to eval() is parsed as JS again, unescape() is handled, some simple JS packers are handled, and the output is whitespace normalized. |
|
409 |
+ |
|
410 |
+- 4 = Mail file |
|
411 |
+ |
|
412 |
+- 5 = Graphics |
|
413 |
+ |
|
414 |
+- 6 = ELF |
|
415 |
+ |
|
416 |
+- 7 = ASCII text file (normalized) |
|
417 |
+ |
|
418 |
+- 8 = Unused |
|
419 |
+ |
|
420 |
+- 9 = Mach-O files |
|
421 |
+ |
|
422 |
+- 10 = PDF files |
|
423 |
+ |
|
424 |
+- 11 = Flash files |
|
425 |
+ |
|
426 |
+- 12 = Java class files |
|
427 |
+ |
|
428 |
+And `Offset` is an asterisk or a decimal number `n` possibly combined with a special modifier: |
|
429 |
+ |
|
430 |
+- `*` = any |
|
431 |
+ |
|
432 |
+- `n` = absolute offset |
|
433 |
+ |
|
434 |
+- `EOF-n` = end of file minus `n` bytes |
|
435 |
+ |
|
436 |
+Signatures for PE, ELF and Mach-O files additionally support: |
|
437 |
+ |
|
438 |
+- `EP+n` = entry point plus n bytes (`EP+0` for `EP`) |
|
439 |
+ |
|
440 |
+- `EP-n` = entry point minus n bytes |
|
441 |
+ |
|
442 |
+- `Sx+n` = start of section `x`’s (counted from 0) data plus `n` bytes |
|
443 |
+ |
|
444 |
+- `SEx` = entire section `x` (offset must lie within section boundaries) |
|
445 |
+ |
|
446 |
+- `SL+n` = start of last section plus `n` bytes |
|
447 |
+ |
|
448 |
+All the above offsets except `*` can be turned into **floating offsets** and represented as `Offset,MaxShift` where `MaxShift` is an unsigned integer. A floating offset will match every offset between `Offset` and `Offset+MaxShift`, e.g. `10,5` will match all offsets from 10 to 15 and `EP+n,y` will match all offsets from `EP+n` to `EP+n+y`. Versions of ClamAV older than 0.91 will silently ignore the `MaxShift` extension and only use `Offset`. The optional `MinFL` and `MaxFL` parameters can restrict the signature to specific engine releases. All signatures in the extended format must be placed inside `*.ndb` files. |
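The field layout and the floating-offset rule can be sketched as follows. This is a simplified illustration, not ClamAV's own parser; the names `parse_ndb_line`, `split_offset` and `floating_offsets` and the sample signature are hypothetical, and symbolic offsets such as `EP+0` are kept as strings because resolving them requires parsing the target file.

```python
from typing import Optional, Tuple

# Target types as listed above (8 is unused).
TARGET_TYPES = {
    0: "any file", 1: "Portable Executable", 2: "OLE2", 3: "HTML",
    4: "Mail", 5: "Graphics", 6: "ELF", 7: "ASCII text",
    9: "Mach-O", 10: "PDF", 11: "Flash", 12: "Java class",
}


def split_offset(offset: str) -> Tuple[str, Optional[int]]:
    """Split 'Offset,MaxShift' into (Offset, MaxShift or None)."""
    base, sep, shift = offset.partition(",")
    return base, (int(shift) if sep else None)


def parse_ndb_line(line: str) -> dict:
    """Parse MalwareName:TargetType:Offset:HexSignature[:MinFL:[MaxFL]]."""
    fields = line.strip().split(":")
    if len(fields) < 4:
        raise ValueError("expected at least four ':'-separated fields")
    base, shift = split_offset(fields[2])
    return {
        "name": fields[0],
        "target": TARGET_TYPES.get(int(fields[1]), "unknown"),
        "offset": base,
        "max_shift": shift,
        "hex_signature": fields[3],
        "min_fl": int(fields[4]) if len(fields) > 4 and fields[4] else None,
        "max_fl": int(fields[5]) if len(fields) > 5 and fields[5] else None,
    }


def floating_offsets(offset: int, max_shift: int) -> range:
    """Absolute offsets covered by a numeric floating offset (inclusive)."""
    return range(offset, offset + max_shift + 1)
```

With these helpers, `list(floating_offsets(10, 5))` yields the offsets 10 through 15, matching the `10,5` example above.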
|
449 |
+ |
|
450 |
+### Logical signatures |
|
451 |
+ |
|
452 |
+Logical signatures combine multiple signatures in extended format using logical operators. They can provide more detailed and more flexible pattern matching. Logical signatures are stored inside `*.ldb` files in the following format: |
|
453 |
+ |
|
454 |
+``` |
|
455 |
+SignatureName;TargetDescriptionBlock;LogicalExpression;Subsig0; |
|
456 |
+Subsig1;Subsig2;... |
|
457 |
+``` |
|
458 |
+ |
|
459 |
+where: |
|
460 |
+ |
|
461 |
+- `TargetDescriptionBlock` provides information about the engine and target file with comma separated `Arg:Val` pairs. For args where `Val` is a range, the minimum and maximum values should be expressed as `min-max`. |
|
462 |
+ |
|
463 |
+- `LogicalExpression` specifies the logical expression describing the relationship between `Subsig0...SubsigN`. **Basis clause:** 0,1,...,N decimal indexes are SUB-EXPRESSIONS representing `Subsig0, Subsig1,...,SubsigN` respectively. **Inductive clause:** if `A` and `B` are SUB-EXPRESSIONS and `X, Y` are decimal numbers then `(A&B)`, `(A|B)`, `A=X`, `A=X,Y`, `A>X`, `A>X,Y`, `A<X` and `A<X,Y` are SUB-EXPRESSIONS |
|
464 |
+ |
|
465 |
+- `SubsigN` is the n-th subsignature in extended format, optionally preceded by an offset. Up to 64 subsignatures can be specified. |
|
466 |
+ |
|
467 |
+Keywords used in `TargetDescriptionBlock`: |
|
468 |
+ |
|
469 |
+- `Target:X`: Target file type |
|
470 |
+ |
|
471 |
+- `Engine:X-Y`: Required engine functionality (range; 0.96). Note that if the `Engine` keyword is used, it must be the first one in the `TargetDescriptionBlock` for backwards compatibility |
|
472 |
+ |
|
473 |
+- `FileSize:X-Y`: Required file size (range in bytes; 0.96) |
|
474 |
+ |
|
475 |
+- `EntryPoint`: Entry point offset (range in bytes; 0.96) |
|
476 |
+ |
|
477 |
+- `NumberOfSections`: Required number of sections in executable (range; 0.96) |
|
478 |
+ |
|
479 |
+- `Container:CL_TYPE_*`: File type of the container which stores the scanned file. Specifying `CL_TYPE_ANY` matches on root objects only. |
|
480 |
+ |
|
481 |
+- `Intermediates:CL_TYPE_*>CL_TYPE_*`: File types of the intermediate containers that store the scanned file. Specify 1-16 file types separated by ’`>`’ in top-down order (the ’`>`’ separator is not needed for a single file type); the last type should be the immediate container for the malicious content. `CL_TYPE_ANY` can be used as a wildcard file type. (expr; 0.100.0) |
|
482 |
+ |
|
483 |
+- `IconGroup1`: Icon group name 1 from .idb signature (0.96) |
|
484 |
+ |
|
485 |
+- `IconGroup2`: Icon group name 2 from .idb signature (0.96) |
|
486 |
+ |
|
487 |
+Modifiers for subexpressions: |
|
488 |
+ |
|
489 |
+- `A=X`: If the SUB-EXPRESSION A refers to a single signature then this signature must get matched exactly X times; if it refers to a (logical) block of signatures then this block must generate exactly X matches (with any of its sigs). |
|
490 |
+ |
|
491 |
+- `A=0` specifies negation (signature or block of signatures cannot be matched) |
|
492 |
+ |
|
493 |
+- `A=X,Y`: If the SUB-EXPRESSION A refers to a single signature then this signature must be matched exactly X times; if it refers to a (logical) block of signatures then this block must generate X matches and at least Y different signatures must get matched. |
|
494 |
+ |
|
495 |
+- `A>X`: If the SUB-EXPRESSION A refers to a single signature then this signature must get matched more than X times; if it refers to a (logical) block of signatures then this block must generate more than X matches (with any of its sigs). |
|
496 |
+ |
|
497 |
+- `A>X,Y`: If the SUB-EXPRESSION A refers to a single signature then this signature must get matched more than X times; if it refers to a (logical) block of signatures then this block must generate more than X matches and at least Y different signatures must be matched. |
|
498 |
+ |
|
499 |
+- `A<X` and `A<X,Y` as above with the change of "more" to "less". |
|
500 |
+ |
|
501 |
+Examples: |
|
502 |
+ |
|
503 |
+``` |
|
504 |
+Sig1;Target:0;(0&1&2&3)&(4|1);6b6f74656b;616c61;7a6f6c77;7374656 |
|
505 |
+6616e;deadbeef |
|
506 |
+ |
|
507 |
+Sig2;Target:0;((0|1|2)>5,2)&(3|1);6b6f74656b;616c61;7a6f6c77;737 |
|
508 |
+46566616e |
|
509 |
+ |
|
510 |
+Sig3;Target:0;((0|1|2|3)=2)&(4|1);6b6f74656b;616c61;7a6f6c77;737 |
|
511 |
+46566616e;deadbeef |
|
512 |
+ |
|
513 |
+Sig4;Engine:51-255,Target:1;((0|1)&(2|3))&4;EP+123:33c06834f04100 |
|
514 |
+f2aef7d14951684cf04100e8110a00;S2+78:22??232c2d252229{-15}6e6573 |
|
515 |
+(63|64)61706528;S3+50:68efa311c3b9963cb1ee8e586d32aeb9043e;f9c58 |
|
516 |
+dcf43987e4f519d629b103375;SL+550:6300680065005c0046006900 |
|
517 |
+``` |
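The overall structure of a logical signature can be illustrated with a short splitter that separates the name, the `TargetDescriptionBlock`, the `LogicalExpression` and the subsignatures. It is a sketch under the assumption that the signature is on a single line (the examples above are wrapped only for display); the function name `parse_ldb_line` is hypothetical and the logical expression itself is not evaluated.

```python
def parse_ldb_line(line: str) -> dict:
    """Split SignatureName;TargetDescriptionBlock;LogicalExpression;Subsig0;..."""
    # Unpacking raises ValueError if fewer than three fields are present.
    name, tdb, expression, *subsigs = line.strip().split(";")

    # The TargetDescriptionBlock is a comma-separated list of Arg:Val pairs.
    target_block = {}
    for pair in tdb.split(","):
        key, _, value = pair.partition(":")
        target_block[key] = value

    subsigs = [s for s in subsigs if s]        # tolerate a trailing ';'
    if not 1 <= len(subsigs) <= 64:
        raise ValueError("expected between 1 and 64 subsignatures")

    return {
        "name": name,
        "target_block": target_block,
        "expression": expression,
        "subsigs": subsigs,
    }
```

Applied to `Sig1` from the examples, this yields the expression `(0&1&2&3)&(4|1)` and five subsignatures, so the indexes 0 to 4 used in the expression all refer to existing subsignatures.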
|
518 |
+ |
|
519 |
+### Subsignature Modifiers |
|
520 |
+ |
|
521 |
+ClamAV (clamav-0.99) supports a number of additional subsignature |
|
522 |
+modifiers for logical signatures. This is done by specifying ’::’ |
|
523 |
+followed by a number of characters representing the desired options. |
|
524 |
+Signatures using subsignature modifiers require `Engine:81-255` for |
|
525 |
+backwards-compatibility. |
|
526 |
+ |
|
527 |
+- Case-Insensitive \[`i`\] |
|
528 |
+ |
|
529 |
+ Specifying the `i` modifier causes ClamAV to match all alphabetic hex bytes as case-insensitive. All patterns in ClamAV are case-sensitive by default. |
|
530 |
+ |
|
531 |
+- Wide \[`w`\] |
|
532 |
+ |
|
533 |
+ Specifying the `w` modifier causes ClamAV to match all hex bytes encoded with two bytes per character. Note this simply interleaves each character with NULL characters and does not truly support UTF-16 characters. Wildcards in ’wide’ subsignatures are not treated as wide (i.e. there can be an odd number of intervening characters). This can be combined with `a` to search for patterns in both wide and ASCII. |
|
534 |
+ |
|
535 |
+- Fullword \[`f`\] |
|
536 |
+ |
|
537 |
+ Match subsignature as a fullword (delimited by non-alphanumeric characters). |
|
538 |
+ |
|
539 |
+- Ascii \[`a`\] |
|
540 |
+ |
|
541 |
+ Match subsignature as ASCII characters. This can be combined with `w` to search for patterns in both ASCII and wide. |
|
542 |
+ |
|
543 |
+Examples: |
|
544 |
+ |
|
545 |
+``` |
|
546 |
+clamav-nocase-A;Engine:81-255,Target:0;0&1;41414141::i;424242424242::i |
|
547 |
+ -matches 'AAAA'(nocase) and 'BBBBBB'(nocase) |
|
548 |
+ |
|
549 |
+clamav-fullword-A;Engine:81-255,Target:0;0&1;414141;68656c6c6f::f |
|
550 |
+ -matches 'AAA' and 'hello'(fullword) |
|
551 |
+clamav-fullword-B;Engine:81-255,Target:0;0&1;414141;68656c6c6f::fi |
|
552 |
+ -matches 'AAA' and 'hello'(fullword nocase) |
|
553 |
+ |
|
554 |
+clamav-wide-B2;Engine:81-255,Target:0;0&1;414141;68656c6c6f::wa |
|
555 |
+ -matches 'AAA' and 'hello'(wide ascii) |
|
556 |
+clamav-wide-C0;Engine:81-255,Target:0;0&1;414141;68656c6c6f::iwfa |
|
557 |
+ -matches 'AAA' and 'hello'(nocase wide fullword ascii) |
|
558 |
+``` |
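To show what the `w` modifier is described as doing, the sketch below interleaves each byte of a plain hex pattern with a NULL byte. It assumes the NULL byte follows each character (as in little-endian UTF-16) and handles only plain hex strings without wildcards; the function name `widen_hex` is hypothetical and this is not ClamAV's implementation.

```python
def widen_hex(hexsig: str) -> str:
    """Interleave each byte of a plain hex string with a NULL byte."""
    raw = bytes.fromhex(hexsig)
    return b"".join(bytes([value, 0]) for value in raw).hex()


if __name__ == "__main__":
    # 'hello' as used in the examples above.
    print(widen_hex("68656c6c6f"))
```

Under that assumption, the `wa` combination in `clamav-wide-B2` would search for both `68656c6c6f` and its widened form.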
|
559 |
+ |
|
560 |
+## Special Subsignature Types |
|
561 |
+ |
|
562 |
+### Macro subsignatures (clamav-0.96): `${min-max}MACROID$` |
|
563 |
+ |
|
564 |
+Macro subsignatures are used to combine a number of existing extended |
|
565 |
+signatures (`.ndb`) into an on-the-fly generated alternate string logical |
|
566 |
+signature (`.ldb`). Signatures using macro subsignatures require |
|
567 |
+`Engine:51-255` for backwards-compatibility. |
|
568 |
+ |
|
569 |
+Example: |
|
570 |
+ |
|
571 |
+``` |
|
572 |
+ test.ldb: |
|
573 |
+ TestMacro;Engine:51-255,Target:0;0&1;616161;${6-7}12$ |
|
574 |
+ |
|
575 |
+ test.ndb: |
|
576 |
+ D1:0:$12:626262 |
|
577 |
+ D2:0:$12:636363 |
|
578 |
+ D3:0:$30:626264 |
|
579 |
+``` |
|
580 |
+ |
|
581 |
+The example logical signature `TestMacro` is functionally equivalent |
|
582 |
+to: |
|
583 |
+ |
|
584 |
+``` |
|
585 |
+TestMacro;Engine:51-255,Target:0;0;616161{3-4}(626262|636363) |
|
586 |
+``` |
|
587 |
+ |
|
588 |
+- `MACROID` points to a group of signatures; there can be at most 32 macro groups. |
|
589 |
+ |
|
590 |
+ - In the example, `MACROID` is `12` and both `D1` and `D2` are members of macro group `12`. `D3` is a member of separate macro group `30`. |
|
591 |
+ |
|
592 |
+- `{min-max}` specifies the offset range at which one of the group signatures should match; the offset range is relative to the starting offset of the preceding subsignature. This means a macro subsignature cannot be the first subsignature. |
|
593 |
+ |
|
594 |
+ - In the example, `{min-max}` is `{6-7}` and it is relative to the start of a `616161` match. |
|
595 |
+ |
|
596 |
+- For more information and examples please see <https://wwws.clamav.net/bugzilla/show_bug.cgi?id=164>. |
|
597 |
+ |
|
598 |
+### PCRE subsignatures (clamav-0.99): `Trigger/PCRE/[Flags]` |
|
599 |
+ |
|
600 |
+PCRE subsignatures are used within a logical signature (`.ldb`) to specify regex matches that execute once triggered by a conditional based on preceding subsignatures. Signatures using PCRE subsignatures require `Engine:81-255` for backwards-compatibility. |
|
601 |
+ |
|
602 |
+- `Trigger` is a required field that is a valid `LogicalExpression` and may refer to any subsignatures that precede this subsignature. Triggers cannot be self-referential and cannot refer to subsequent subsignatures. |
|
603 |
+ |
|
604 |
+- `PCRE` is the expression representing the regex to execute. `PCRE` must be delimited by ’/’, and any ’/’ within the expression must be escaped. For backward compatibility, ’;’ within the expression must be expressed as ’`\x3B`’. `PCRE` cannot be empty and the `(?UTF*)` control sequence is not allowed. If debug is specified, named capture groups are displayed in a post-execution report. |
|
605 |
+ |
|
606 |
+- `Flags` are a series of characters which affect the compilation and execution of `PCRE` within the PCRE compiler and the ClamAV engine. This field is optional. |
|
607 |
+ |
|
608 |
+ - `g [CLAMAV_GLOBAL]` specifies to search for ALL matches of PCRE (default is to search for first match). NOTE: INCREASES the time needed to run the PCRE. |
|
609 |
+ |
|
610 |
+ - `r [CLAMAV_ROLLING]` specifies to use the given offset as the starting location to search for a match, as opposed to the only location; applies to subsigs without maxshifts. By default, in order to facilitate normal ClamAV offset behavior, PCREs are auto-anchored (only attempt a match at the first offset); using the rolling option disables the auto-anchoring. |
|
611 |
+ |
|
612 |
+ - `e [CLAMAV_ENCOMPASS]` specifies to CONFINE matching between the specified offset and maxshift; applies only when maxshift is specified. Note: DECREASES time needed to run the PCRE. |
|
613 |
+ |
|
614 |
+ - `i [PCRE_CASELESS]` |
|
615 |
+ |
|
616 |
+ - `s [PCRE_DOTALL]` |
|
617 |
+ |
|
618 |
+ - `m [PCRE_MULTILINE]` |
|
619 |
+ |
|
620 |
+ - `x [PCRE_EXTENDED]` |
|
621 |
+ |
|
622 |
+ - `A [PCRE_ANCHORED]` |
|
623 |
+ |
|
624 |
+ - `E [PCRE_DOLLAR_ENDONLY]` |
|
625 |
+ |
|
626 |
+ - `U [PCRE_UNGREEDY]` |
|
627 |
+ |
|
628 |
+Examples: |
|
629 |
+ |
|
630 |
+``` |
|
631 |
+Find.All.ClamAV;Engine:81-255,Target:0;1;6265676c6164697427736e6f7462797465636f6465;0/clamav/g |
|
632 |
+ |
|
633 |
+Find.ClamAV.OnlyAt.299;Engine:81-255,Target:0;2;7374756c747a67657473;7063726572656765786c6f6c;299:0&1/clamav/ |
|
634 |
+ |
|
635 |
+Find.ClamAV.StartAt.300;Engine:81-255,Target:0;3;616c61696e;62756731393238;636c6f736564;300:0&1&2/clamav/r |
|
636 |
+ |
|
637 |
+Find.All.Encompassed.ClamAV;Engine:81-255,Target:0;3;7768796172656e2774;796f757573696e67;79617261;200,300:0&1&2/clamav/ge |
|
638 |
+ |
|
639 |
+Named.CapGroup.Pcre;Engine:81-255,Target:0;3;636f75727479617264;616c62756d;74657272696572;50:0&1&2/variable=(?<nilshell>.{16})end/gr |
|
640 |
+ |
|
641 |
+Firefox.TreeRange.UseAfterFree;Engine:81-255,Target:0;0&1&2;2e766965772e73656c656374696f6e;2e696e76616c696461746553656c656374696f6e;0&1/\x2Eview\x2Eselection.*?\x2Etree\s*\x3D\s*null.*?\x2Einvalidate/smi |
|
642 |
+ |
|
643 |
+Firefox.IDB.UseAfterFree;Engine:81-255,Target:0;0&1;4944424b657952616e6765;0/^\x2e(only|lowerBound|upperBound|bound)\x28.*?\x29.*?\x2e(lower|upper|lowerOpen|upperOpen)/smi |
|
644 |
+ |
|
645 |
+Firefox.boundElements;Engine:81-255,Target:0;0&1&2;6576656e742e6 |
|
646 |
+26f756e64456c656d656e7473;77696e646f772e636c6f7365;0&1/on(load|click)\s*=\s*\x22?window\.close\s*\x28/si |
|
647 |
+``` |
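The parts of a PCRE subsignature in the examples above (an optional offset, the trigger, the expression and the flags) can be separated with a few string operations. The sketch below is illustrative only: the names `PcreSubsig`, `parse_pcre_subsig` and `escape_semicolons` are hypothetical, and it relies on the rule stated above that the flags never contain ’/’, so the last ’/’ is always the closing delimiter.

```python
from typing import NamedTuple, Optional


class PcreSubsig(NamedTuple):
    offset: Optional[str]    # e.g. "299" or "200,300"; None when absent
    trigger: str             # logical expression over preceding subsignatures
    pattern: str             # the regex between the '/' delimiters
    flags: str               # e.g. "g", "ge", "smi" or ""


def parse_pcre_subsig(subsig: str) -> PcreSubsig:
    """Split [Offset:]Trigger/PCRE/[Flags] into its parts."""
    first = subsig.find("/")
    last = subsig.rfind("/")
    if first == -1 or first == last:
        raise ValueError("expected Trigger/PCRE/[Flags]")
    head = subsig[:first]
    pattern = subsig[first + 1:last]
    flags = subsig[last + 1:]
    if not pattern:
        raise ValueError("PCRE cannot be empty")
    offset, sep, trigger = head.rpartition(":")
    return PcreSubsig(offset if sep else None, trigger, pattern, flags)


def escape_semicolons(pattern: str) -> str:
    """Express ';' as \\x3B so it does not split the logical signature."""
    return pattern.replace(";", "\\x3B")
```

For instance, `parse_pcre_subsig("200,300:0&1&2/clamav/ge")` separates the floating offset `200,300`, the trigger `0&1&2`, the expression `clamav` and the flags `ge`.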
|
648 |
+ |
|
649 |
+## Icon signatures for PE files |
|
650 |
+ |
|
651 |
+ClamAV 0.96 includes an approximate/fuzzy icon matcher to help detect malicious executables disguising themselves as innocent-looking image files, office documents and the like. |
|
652 |
+ |
|
653 |
+Icon matching is only triggered via .ldb signatures using the special attribute tokens `IconGroup1` or `IconGroup2`. These identify two (optional) groups of icons defined in a .idb database file. The format of the .idb file is: |
|
654 |
+ |
|
655 |
+``` |
|
656 |
+ICON_NAME:GROUP1:GROUP2:ICON_HASH |
|
657 |
+``` |
|
658 |
+ |
|
659 |
+where: |
|
660 |
+ |
|
661 |
+- `ICON_NAME` is a unique string identifier for a specific icon |
|
662 |
+ |
|
663 |
+- `GROUP1` is a string identifier for the first group of icons (`IconGroup1`) |
|
664 |
+ |
|
665 |
+- `GROUP2` is a string identifier for the second group of icons (`IconGroup2`) |
|
666 |
+ |
|
667 |
+- `ICON_HASH` is a fuzzy hash of the icon image |
|
668 |
+ |
|
669 |
+The `ICON_HASH` field can be obtained from the debug output of libclamav. For example: |
|
670 |
+ |
|
671 |
+```bash |
|
672 |
+LibClamAV debug: ICO SIGNATURE: |
|
673 |
+ICON_NAME:GROUP1:GROUP2:18e2e0304ce60a0cc3a09053a30000414100057e000afe0000e80006e510078b0a08910d11ad04105e0811510f084e01040c080a1d0b0021000a39002a41 |
|
674 |
+``` |
|
675 |
+ |
|
676 |
+## Signatures for Version Information metadata in PE files |
|
677 |
+ |
|
678 |
+Starting with ClamAV 0.96 it is possible to easily match certain information built into PE files (executables and dynamic link libraries). Whenever you look up the properties of a PE executable file in Windows, you are presented with a number of details about the file itself. |
|
679 |
+ |
|
680 |
+This information is stored in a special area of the file resources called `VS_VERSION_INFORMATION` (or versioninfo for short). It is divided into two parts. The first part (which is rather uninteresting) is really a set of numbers and flags indicating the product and file version. It was originally intended for use with installers which, after parsing it, should be able to determine whether a certain executable or library is to be upgraded/overwritten or is already up to date. Suffice it to say, this approach never really worked and is generally never used. |
|
681 |
+ |
|
682 |
+The second block is much more interesting: it is a simple list of key/value strings, intended for user information and completely ignored by the OS. For example, if you look at ping.exe you can see the company being *"Microsoft Corporation"*, the description *"TCP/IP Ping command"*, the internal name *"ping.exe"* and so on. Depending on the OS version, some keys may be given particular visibility in the file properties dialog; however, internally they are all the same. |
|
683 |
+ |
|
684 |
+To match a versioninfo key/value pair, the special file offset anchor `VI` was introduced. This is similar to the other anchors (like `EP` and `SL`) except that, instead of matching the hex pattern against a single offset, it checks it against each and every key/value pair in the file. The `VI` token neither needs nor accepts a `+/-` offset like e.g. `EP+1`. As for the hex signature itself, it’s just the UTF-16 dump of the key and value. Only the `??` and `(aa|bb)` wildcards are allowed in the signature. Usually, you don’t need to bother figuring it out: each key/value pair together with the corresponding VI-based signature is printed by `clamscan` when the `--debug` option is given. |
|
685 |
+ |
|
686 |
+For example `clamscan --debug freecell.exe` produces: |
|
687 |
+ |
|
688 |
+```bash |
|
689 |
+[...] |
|
690 |
+Recognized MS-EXE/DLL file |
|
691 |
+in cli_peheader |
|
692 |
+versioninfo_cb: type: 10, name: 1, lang: 410, rva: 9608 |
|
693 |
+cli_peheader: parsing version info @ rva 9608 (1/1) |
|
694 |
+VersionInfo (d2de): 'CompanyName'='Microsoft Corporation' - |
|
695 |
+VI:43006f006d00700061006e0079004e0061006d006500000000004d006900 |
|
696 |
+630072006f0073006f0066007400200043006f00720070006f0072006100740 |
|
697 |
+069006f006e000000 |
|
698 |
+VersionInfo (d32a): 'FileDescription'='Entertainment Pack |
|
699 |
+FreeCell Game' - VI:460069006c006500440065007300630072006900700 |
|
700 |
+0740069006f006e000000000045006e007400650072007400610069006e006d |
|
701 |
+0065006e00740020005000610063006b0020004600720065006500430065006 |
|
702 |
+c006c002000470061006d0065000000 |
|
703 |
+VersionInfo (d396): 'FileVersion'='5.1.2600.0 (xpclient.010817 |
|
704 |
+-1148)' - VI:460069006c006500560065007200730069006f006e00000000 |
|
705 |
+0035002e0031002e0032003600300030002e003000200028007800700063006 |
|
706 |
+c00690065006e0074002e003000310030003800310037002d00310031003400 |
|
707 |
+380029000000 |
|
708 |
+VersionInfo (d3fa): 'InternalName'='freecell' - VI:49006e007400 |
|
709 |
+650072006e0061006c004e0061006d006500000066007200650065006300650 |
|
710 |
+06c006c000000 |
|
711 |
+VersionInfo (d4ba): 'OriginalFilename'='freecell' - VI:4f007200 |
|
712 |
+6900670069006e0061006c00460069006c0065006e0061006d0065000000660 |
|
713 |
+0720065006500630065006c006c000000 |
|
714 |
+VersionInfo (d4f6): 'ProductName'='Sistema operativo Microsoft |
|
715 |
+Windows' - VI:500072006f0064007500630074004e0061006d00650000000 |
|
716 |
+000530069007300740065006d00610020006f00700065007200610074006900 |
|
717 |
+76006f0020004d006900630072006f0073006f0066007400ae0020005700690 |
|
718 |
+06e0064006f0077007300ae000000 |
|
719 |
+VersionInfo (d562): 'ProductVersion'='5.1.2600.0' - VI:50007200 |
|
720 |
+6f006400750063007400560065007200730069006f006e00000035002e00310 |
|
721 |
+02e0032003600300030002e0030000000 |
|
722 |
+[...] |
|
723 |
+``` |
|
724 |
+ |
|
725 |
+Although VI-based signatures are intended for use in logical signatures you can test them using ordinary `.ndb` files. For example: |
|
726 |
+ |
|
727 |
+``` |
|
728 |
+my_test_vi_sig:1:VI:paste_your_hex_sig_here |
|
729 |
+``` |
|
730 |
+ |
|
731 |
+Finally, if you want to decode a VI-based signature into a human-readable form, you can use: |
|
732 |
+ |
|
733 |
+```bash |
|
734 |
+echo hex_string | xxd -r -p | strings -el |
|
735 |
+``` |
|
736 |
+ |
|
737 |
+For example: |
|
738 |
+ |
|
739 |
+```bash |
|
740 |
+$ echo 460069006c0065004400650073006300720069007000740069006f006e |
|
741 |
+000000000045006e007400650072007400610069006e006d0065006e007400200 |
|
742 |
+05000610063006b0020004600720065006500430065006c006c00200047006100 |
|
743 |
+6d0065000000 | xxd -r -p | strings -el |
|
744 |
+FileDescription |
|
745 |
+Entertainment Pack FreeCell Game |
|
746 |
+``` |
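Where `xxd` and `strings` are not available, the same decoding can be approximated in Python. The sketch assumes the hex string is a complete little-endian UTF-16 dump without wildcards; the function name `decode_vi_signature` is hypothetical, and the sample below is generated in the script rather than taken from a real file.

```python
from typing import List


def decode_vi_signature(hex_string: str) -> List[str]:
    """Decode a VI hex signature into its non-empty UTF-16 strings."""
    text = bytes.fromhex(hex_string).decode("utf-16-le")
    # Key and value are separated and terminated by NULL characters.
    return [part for part in text.split("\x00") if part]


if __name__ == "__main__":
    # Build a sample dump from a key and a value, then decode it again.
    sample = "InternalName\x00freecell\x00".encode("utf-16-le").hex()
    print(sample)
    print(decode_vi_signature(sample))
```

Splitting on NULL characters returns the key first and the value second, regardless of how many padding NULLs sit between them.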
|
747 |
+ |
|
748 |
+## Trusted and Revoked Certificates |
|
749 |
+ |
|
750 |
+ClamAV 0.98 checks signed PE files for certificates and verifies each certificate in the chain against a database of trusted and revoked certificates. The signature format is: |
|
751 |
+ |
|
752 |
+``` |
|
753 |
+ Name;Trusted;Subject;Serial;Pubkey;Exponent;CodeSign;TimeSign;CertSign; |
|
754 |
+ NotBefore;Comment[;minFL[;maxFL]] |
|
755 |
+``` |
|
756 |
+ |
|
757 |
+where the corresponding fields are: |
|
758 |
+ |
|
759 |
+- `Name:` name of the entry |
|
760 |
+ |
|
761 |
+- `Trusted:` bit field, specifying whether the cert is trusted. 1 for trusted. 0 for revoked |
|
762 |
+ |
|
763 |
+- `Subject:` sha1 of the Subject field in hex |
|
764 |
+ |
|
765 |
+- `Serial:` the serial number as reported by `clamscan --debug --verbose` |
|
766 |
+ |
|
767 |
+- `Pubkey:` the public key in hex |
|
768 |
+ |
|
769 |
+- `Exponent:` the exponent in hex. Currently ignored and hardcoded to 010001 (in hex) |
|
770 |
+ |
|
771 |
+- `CodeSign:` bit field, specifying whether this cert can sign code. 1 for true, 0 for false |
|
772 |
+ |
|
773 |
+- `TimeSign:` bit field. 1 for true, 0 for false |
|
774 |
+ |
|
775 |
+- `CertSign:` bit field, specifying whether this cert can sign other certs. 1 for true, 0 for false |
|
776 |
+ |
|
777 |
+- `NotBefore:` integer, cert should not be added before this variable. Defaults to 0 if left empty |
|
778 |
+ |
|
779 |
+- `Comment:` comments for this entry |
|
780 |
+ |
|
781 |
+The signatures for certs are stored inside `.crb` files. |
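The field order above maps naturally onto a small record type. The following sketch splits one `.crb` entry into named fields; the names `CertEntry` and `parse_crb_line` are hypothetical, the sample values used with it are made up, and no cryptographic validation of the subject hash or public key is attempted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CertEntry:
    name: str
    trusted: bool            # True for trusted (1), False for revoked (0)
    subject: str             # SHA1 of the Subject field, in hex
    serial: str
    pubkey: str
    exponent: str
    code_sign: bool
    time_sign: bool
    cert_sign: bool
    not_before: int          # defaults to 0 when the field is empty
    comment: str
    min_fl: Optional[int] = None
    max_fl: Optional[int] = None


def parse_crb_line(line: str) -> CertEntry:
    """Parse Name;Trusted;Subject;...;NotBefore;Comment[;minFL[;maxFL]]."""
    f = line.strip().split(";")
    if len(f) < 11:
        raise ValueError("expected at least 11 ';'-separated fields")

    def optional_int(index: int) -> Optional[int]:
        return int(f[index]) if len(f) > index and f[index] else None

    return CertEntry(
        name=f[0], trusted=f[1] == "1", subject=f[2], serial=f[3],
        pubkey=f[4], exponent=f[5], code_sign=f[6] == "1",
        time_sign=f[7] == "1", cert_sign=f[8] == "1",
        not_before=int(f[9] or 0), comment=f[10],
        min_fl=optional_int(11), max_fl=optional_int(12),
    )
```

An entry with an empty `NotBefore` field is parsed with `not_before` set to 0, following the default described above.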
|
782 |
+ |
|
783 |
+## Signatures based on container metadata |
|
784 |
+ |
|
785 |
+ClamAV 0.96 allows creating generic signatures matching files stored inside different container types which meet specific conditions. The signature format is: |
|
786 |
+ |
|
787 |
+``` |
|
788 |
+ VirusName:ContainerType:ContainerSize:FileNameREGEX: |
|
789 |
+ FileSizeInContainer:FileSizeReal:IsEncrypted:FilePos: |
|
790 |
+ Res1:Res2[:MinFL[:MaxFL]] |
|
791 |
+``` |
|
792 |
+ |
|
793 |
+where the corresponding fields are: |
|
794 |
+ |
|
795 |
+- `VirusName:` Virus name to be displayed when signature matches |
|
796 |
+ |
|
797 |
+- `ContainerType:` one of |
|
798 |
+ - `CL_TYPE_ZIP`, |
|
799 |
+ - `CL_TYPE_RAR`, |
|
800 |
+ - `CL_TYPE_ARJ`, |
|
801 |
+ - `CL_TYPE_MSCAB`, |
|
802 |
+ - `CL_TYPE_7Z`, |
|
803 |
+ - `CL_TYPE_MAIL`, |
|
804 |
+ - `CL_TYPE_(POSIX|OLD)_TAR`, |
|
805 |
+ - `CL_TYPE_CPIO_(OLD|ODC|NEWC|CRC)` or |
|
806 |
+ - `*` to match any of the container types listed here |
|
807 |
+ |
|
808 |
+- `ContainerSize:` size of the container file itself (e.g. size of the zip archive) specified in bytes as absolute value or range `x-y` |
|
809 |
+ |
|
810 |
+- `FileNameREGEX:` regular expression describing name of the target file |
|
811 |
+ |
|
812 |
+- `FileSizeInContainer:` usually compressed size; for MAIL, TAR and CPIO == `FileSizeReal`; specified in bytes as absolute value or range |
|
813 |
+ |
|
814 |
+- `FileSizeReal:` usually uncompressed size; for MAIL, TAR and CPIO == `FileSizeInContainer`; absolute value or range |
|
815 |
+ |
|
816 |
+- `IsEncrypted`: 1 if the target file is encrypted, 0 if it’s not and `*` to ignore |
|
817 |
+ |
|
818 |
+- `FilePos`: file position in container (counting from 1); absolute value or range |
|
819 |
+ |
|
820 |
+- `Res1`: when `ContainerType` is `CL_TYPE_ZIP` or `CL_TYPE_RAR` this field is treated as a CRC sum of the target file specified in hexadecimal format; for other container types it’s ignored |
|
821 |
+ |
|
822 |
+- `Res2`: not used as of ClamAV 0.96 |
|
823 |
+ |
|
824 |
+The signatures for container files are stored inside `.cdb` files. |
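As an illustration of the field layout, the sketch below splits a `.cdb` entry into named fields and converts the numeric fields into inclusive ranges. It makes two assumptions that are not stated above: that `*` may be used in the numeric fields to mean "ignore", and that the file name regular expression contains no ’:’ character. The names `parse_cdb_line` and `parse_range` and the sample signature are hypothetical.

```python
from typing import Optional, Tuple

CDB_FIELDS = (
    "VirusName", "ContainerType", "ContainerSize", "FileNameREGEX",
    "FileSizeInContainer", "FileSizeReal", "IsEncrypted", "FilePos",
    "Res1", "Res2",
)
RANGE_FIELDS = ("ContainerSize", "FileSizeInContainer", "FileSizeReal", "FilePos")


def parse_range(field: str) -> Optional[Tuple[int, int]]:
    """Return None for '*', else an inclusive (low, high) tuple."""
    if field == "*":
        return None
    low, sep, high = field.partition("-")
    return (int(low), int(high) if sep else int(low))


def parse_cdb_line(line: str) -> dict:
    """Parse a container metadata signature into a dictionary."""
    fields = line.strip().split(":")
    if len(fields) < len(CDB_FIELDS):
        raise ValueError("expected at least 10 ':'-separated fields")
    entry = dict(zip(CDB_FIELDS, fields))
    for key in RANGE_FIELDS:
        entry[key] = parse_range(entry[key])
    # Optional trailing functionality levels.
    entry["MinFL"] = int(fields[10]) if len(fields) > 10 and fields[10] else None
    entry["MaxFL"] = int(fields[11]) if len(fields) > 11 and fields[11] else None
    return entry
```

An absolute value such as `1` is returned as the range `(1, 1)`, so callers can treat absolute values and `x-y` ranges uniformly.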
|
825 |
+ |
|
826 |
+## Signatures based on ZIP/RAR metadata (obsolete) |
|
827 |
+ |
|
828 |
+The (now obsolete) archive metadata signatures can only be applied to |
|
829 |
+ZIP and RAR files and have the following format: |
|
830 |
+ |
|
831 |
+``` |
|
832 |
+ virname:encrypted:filename:normal size:csize:crc32:cmethod: |
|
833 |
+ fileno:max depth |
|
834 |
+``` |
|
835 |
+ |
|
836 |
+where the corresponding fields are: |
|
837 |
+ |
|
838 |
+- Virus name |
|
839 |
+ |
|
840 |
+- Encryption flag (1 – encrypted, 0 – not encrypted) |
|
841 |
+ |
|
842 |
+- File name (this is a regular expression - \* to ignore) |
|
843 |
+ |
|
844 |
+- Normal (uncompressed) size (\* to ignore) |
|
845 |
+ |
|
846 |
+- Compressed size (\* to ignore) |
|
847 |
+ |
|
848 |
+- CRC32 (\* to ignore) |
|
849 |
+ |
|
850 |
+- Compression method (\* to ignore) |
|
851 |
+ |
|
852 |
+- File position in archive (\* to ignore) |
|
853 |
+ |
|
854 |
+- Maximum number of nested archives (\* to ignore) |
|
855 |
+ |
|
856 |
+The database file should have the extension of `.zmd` or `.rmd` for zip or rar metadata respectively. |
|
857 |
+ |
|
858 |
+## Whitelist databases |
|
859 |
+ |
|
860 |
+To whitelist a specific file, use the MD5 signature format and place the signature inside a database file with the `.fp` extension. To whitelist a specific file with the SHA1 or SHA256 file hash signature format, place the signature inside a database file with the `.sfp` extension. To whitelist a specific signature from the database, add its name to a local file called `local.ign2` stored inside the database directory. You can additionally follow the signature name with the MD5 of the entire database entry for this signature, e.g.: |
|
861 |
+ |
|
862 |
+``` |
|
863 |
+ Eicar-Test-Signature:bc356bae4c42f19a3de16e333ba3569c |
|
864 |
+``` |
|
865 |
+ |
|
866 |
+In such a case, the signature will no longer be whitelisted when its entry in the database is modified (e.g. when the signature is updated to avoid false alerts). |
|
867 |
+ |
|
868 |
+## Signature names |
|
869 |
+ |
|
870 |
+ClamAV uses the following prefixes for signature names: |
|
871 |
+ |
|
872 |
+- *Worm* for Internet worms |
|
873 |
+ |
|
874 |
+- *Trojan* for backdoor programs |
|
875 |
+ |
|
876 |
+- *Adware* for adware |
|
877 |
+ |
|
878 |
+- *Flooder* for flooders |
|
879 |
+ |
|
880 |
+- *HTML* for HTML files |
|
881 |
+ |
|
882 |
+- *Email* for email messages |
|
883 |
+ |
|
884 |
+- *IRC* for IRC trojans |
|
885 |
+ |
|
886 |
+- *JS* for JavaScript malware |
|
887 |
+ |
|
888 |
+- *PHP* for PHP malware |
|
889 |
+ |
|
890 |
+- *ASP* for ASP malware |
|
891 |
+ |
|
892 |
+- *VBS* for VBS malware |
|
893 |
+ |
|
894 |
+- *BAT* for BAT malware |
|
895 |
+ |
|
896 |
+- *W97M*, *W2000M* for Word macro viruses |
|
897 |
+ |
|
898 |
+- *X97M*, *X2000M* for Excel macro viruses |
|
899 |
+ |
|
900 |
+- *O97M*, *O2000M* for generic Office macro viruses |
|
901 |
+ |
|
902 |
+- *DoS* for Denial of Service attack software |
|
903 |
+ |
|
904 |
+- *DOS* for old DOS malware |
|
905 |
+ |
|
906 |
+- *Exploit* for popular exploits |
|
907 |
+ |
|
908 |
+- *VirTool* for virus construction kits |
|
909 |
+ |
|
910 |
+- *Dialer* for dialers |
|
911 |
+ |
|
912 |
+- *Joke* for hoaxes |
|
913 |
+ |
|
914 |
+Important rules of the naming convention: |
|
915 |
+ |
|
916 |
+- always use a -zippwd suffix in the malware name for signatures of type zmd, |
|
917 |
+ |
|
918 |
+- always use a -rarpwd suffix in the malware name for signatures of type rmd, |
|
919 |
+ |
|
920 |
+- only use alphanumeric characters, dashes (-), dots (.) and underscores (_) in malware names; never use spaces, apostrophes or quote marks. |
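The naming rules above can be checked mechanically. The sketch below is one possible check, assuming "alphanumeric" means ASCII letters and digits. The function name `is_valid_signature_name` and the `sig_type` argument are illustrative and not part of any ClamAV tool.

```python
import re

# Characters the naming rules allow: alphanumerics, dash, dot, underscore.
_NAME_RE = re.compile(r"[A-Za-z0-9._-]+")

# Suffix the rules require for each (obsolete) metadata database type.
_REQUIRED_SUFFIX = {"zmd": "-zippwd", "rmd": "-rarpwd"}


def is_valid_signature_name(name, sig_type=None):
    """Check a malware name against the naming rules listed above."""
    if not _NAME_RE.fullmatch(name):
        return False
    suffix = _REQUIRED_SUFFIX.get(sig_type)
    return suffix is None or name.endswith(suffix)


print(is_valid_signature_name("Worm.Sober"))
print(is_valid_signature_name("Worm Sober"))
print(is_valid_signature_name("Trojan.Example-zippwd", "zmd"))
```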
|
921 |
+ |
|
922 |
+## Using YARA rules in ClamAV |
|
923 |
+ |
|
924 |
+ClamAV version 0.99 and above can process YARA rules. ClamAV virus database file names ending with “.yar” or “.yara” are parsed as YARA rule files. The YARA rule grammar documentation may be found at http://plusvic.github.io/yara/. There are currently a few limitations on using YARA rules within ClamAV: |
|
925 |
+ |
|
926 |
+- YARA modules are not yet supported by ClamAV. This includes the “import” keyword and any YARA module-specific keywords. |
|
927 |
+ |
|
928 |
+- Global rules (“global” keyword) are not supported by ClamAV. |
|
929 |
+ |
|
930 |
+- External variables (“contains” and “matches” keywords) are not supported. |
|
931 |
+ |
|
932 |
+- YARA rules pre-compiled with the *yarac* command are not supported. |
|
933 |
+ |
|
934 |
+- As in the ClamAV logical and extended signature formats, YARA strings and segments of strings separated by wild cards must represent at least two octets of data. |
|
935 |
+ |
|
936 |
+- There is a maximum of 64 strings per YARA rule. |
|
937 |
+ |
|
938 |
+- YARA rules in ClamAV must contain at least one literal, hexadecimal, or regular expression string. |
|
939 |
+ |
|
940 |
+In addition, there are a few more ClamAV processing modes that may affect the outcome of YARA rules. |
|
941 |
+ |
|
942 |
+- *File decomposition and decompression* - Since ClamAV uses file decomposition and decompression to find viruses within de-archived and uncompressed inner files, YARA rules executed by ClamAV will match against these files as well. |
|
943 |
+ |
|
944 |
+- *Normalization* - By default, ClamAV normalizes HTML, JavaScript, and ASCII text files. YARA rules in ClamAV will match against the normalized result. The effects of normalization of these file types may be captured using `clamscan --leave-temps --tempdir=mytempdir`. YARA rules may then be written using the normalized file(s) found in `mytempdir`. Alternatively, starting with ClamAV 0.100.0, `clamscan --normalize=no` will prevent normalization and only scan the raw file. To obtain similar behavior prior to 0.99.2, use `clamscan --scan-html=no`. The corresponding parameters for clamd.conf are `Normalize` and `ScanHTML`. |
|
945 |
+ |
|
946 |
+- *YARA conditions driven by string matches* - All YARA conditions are driven by string matches in ClamAV. This avoids executing every YARA rule on every file. Any YARA condition may be augmented with a string match clause which is always true, such as: |
|
947 |
+ |
|
948 |
+```yara |
|
949 |
+ rule CheckFileSize |
|
950 |
+ { |
|
951 |
+ strings: |
|
952 |
+ $abc = "abc" |
|
953 |
+ condition: |
|
954 |
+ ($abc or not $abc) and filesize < 200KB |
|
955 |
+ } |
|
956 |
+``` |
|
957 |
+ |
|
958 |
+This will ensure that the YARA condition always performs the desired action (checking the file size in this example). |
|
959 |
+ |
|
960 |
+## Passwords for archive files \[experimental\] |
|
961 |
+ |
|
962 |
+ClamAV 0.99 allows users to specify password attempts for certain password-compatible archives. Passwords are attempted in order of appearance in the password signature file, which uses the `.pwdb` extension. If no passwords apply or none are provided, ClamAV defaults to the original behavior of parsing the file. As of ClamAV 0.99 \[flevel 81\], only `.zip` archives using the traditional PKWARE encryption are supported. The signature format is: |
|
963 |
+ |
|
964 |
+``` |
|
965 |
+ SignatureName;TargetDescriptionBlock;PWStorageType;Password |
|
966 |
+``` |
|
967 |
+ |
|
968 |
+where: |
|
969 |
+ |
|
970 |
+- `SignatureName`: name to be displayed during debug when a password is successful |
|
971 |
+ |
|
972 |
+- `TargetDescriptionBlock`: provides information about the engine and target file with comma separated Arg:Val pairs |
|
973 |
+ - `Engine:X-Y`: Required engine functionality |
|
974 |
+ - `Container:CL_TYPE_*`: File type of applicable containers |
|
975 |
+ |
|
976 |
+- `PWStorageType`: determines how the password field is parsed |
|
977 |
+ - 0 = cleartext |
|
978 |
+ - 1 = hex |
|
979 |
+ |
|
980 |
+- `Password`: value used in password attempt |
|
981 |
+ |
|
982 |
+The signatures for password attempts are stored inside `.pwdb` files. |
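To make the four-field layout concrete, here is a small Python sketch that parses one `.pwdb` line. The sample lines, the `Engine:81-255` range and the function name `parse_pwdb_line` are made up for illustration. Encoding a cleartext password as UTF-8 is also an assumption, since the text above does not specify an encoding.

```python
def parse_pwdb_line(line):
    """Parse one password signature into its four documented fields."""
    name, tdb, storage, password = line.rstrip("\r\n").split(";", 3)
    # TargetDescriptionBlock: comma separated Arg:Val pairs.
    target = dict(pair.split(":", 1) for pair in tdb.split(","))
    if storage == "0":      # 0 = cleartext
        secret = password.encode("utf-8")
    elif storage == "1":    # 1 = hex
        secret = bytes.fromhex(password)
    else:
        raise ValueError("unknown PWStorageType: %r" % storage)
    return {"name": name, "target": target, "password": secret}


# Hypothetical signatures: the same password stored as cleartext and as hex.
clear = parse_pwdb_line("Example.Password.1;Engine:81-255,Container:CL_TYPE_ZIP;0;infected")
hexed = parse_pwdb_line("Example.Password.2;Engine:81-255,Container:CL_TYPE_ZIP;1;696e666563746564")
print(clear["target"], clear["password"] == hexed["password"])
```

Splitting with a maximum of three separators keeps any `;` inside a cleartext password intact.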
|
983 |
+ |
|
984 |
+# Special files |
|
985 |
+ |
|
986 |
+## HTML |
|
987 |
+ |
|
988 |
+ClamAV contains special HTML normalisation code which helps to detect HTML exploits. Running `sigtool --html-normalise` on an HTML file should generate the following files: |
|
989 |
+ |
|
990 |
+- nocomment.html - the file is normalized, lower-case, with all comments and superfluous white space removed |
|
991 |
+ |
|
992 |
+- notags.html - as above but with all HTML tags removed |
|
993 |
+ |
|
994 |
+The code automatically decodes JScript.encode parts and character references (e.g. `&#102;`). You need to create a signature against one of the created files. To eliminate potential false positive alerts the target type should be set to 3. |
|
995 |
+ |
|
996 |
+## Text files |
|
997 |
+ |
|
998 |
+As with HTML, all ASCII text files are normalized (converted to lower-case, all superfluous white space and control characters removed, etc.) before scanning. Use `clamscan --leave-temps` to obtain a normalized file, then create a signature with target type 7. |
|
999 |
+ |
|
1000 |
+## Compressed Portable Executable files |
|
1001 |
+ |
|
1002 |
+If the file is compressed with UPX, FSG, Petite or another PE packer supported by libclamav, run `clamscan` with `--debug --leave-temps`. Example output for an FSG-compressed file: |
|
1003 |
+ |
|
1004 |
+```bash |
|
1005 |
+LibClamAV debug: UPX/FSG/MEW: empty section found - assuming compression |
|
1006 |
+LibClamAV debug: FSG: found old EP @119e0 |
|
1007 |
+LibClamAV debug: FSG: Unpacked and rebuilt executable saved in |
|
1008 |
+/tmp/clamav-f592b20f9329ac1c91f0e12137bcce6c |
|
1009 |
+ |
|
1010 |
+``` |
|
1011 |
+ |
|
1012 |
+Next create a type 1 signature for `/tmp/clamav-f592b20f9329ac1c91f0e12137bcce6c` |
0 | 1013 |
new file mode 100644 |
... | ... |
@@ -0,0 +1,242 @@ |
0 |
+# Usage |
|
1 |
+ |
|
2 |
+## Clam daemon |
|
3 |
+ |
|
4 |
+`clamd` is a multi-threaded daemon that uses *libclamav* to scan files for viruses. It may work in one or both modes listening on: |
|
5 |
+ |
|
6 |
+- Unix (local) socket |
|
7 |
+- TCP socket |
|
8 |
+ |
|
9 |
+The daemon is fully configurable via the `clamd.conf` file \[8\]. `clamd` recognizes the following commands: |
|
10 |
+ |
|
11 |
+- **PING** |
|
12 |
+ Check the daemon’s state (should reply with "PONG"). |
|
13 |
+- **VERSION** |
|
14 |
+ Print program and database versions. |
|
15 |
+- **RELOAD** |
|
16 |
+ Reload the databases. |
|
17 |
+- **SHUTDOWN** |
|
18 |
+ Perform a clean exit. |
|
19 |
+- **SCAN file/directory** |
|
20 |
+ Scan file or directory (recursively) with archive support enabled (a full path is required). |
|
21 |
+- **RAWSCAN file/directory** |
|
22 |
+ Scan file or directory (recursively) with archive and special file support disabled (a full path is required). |
|
23 |
+- **CONTSCAN file/directory** |
|
24 |
+ Scan file or directory (recursively) with archive support enabled and don’t stop the scanning when a virus is found. |
|
25 |
+- **MULTISCAN file/directory** |
|
26 |
+ Scan file in a standard way or scan directory (recursively) using multiple threads (to make the scanning faster on SMP machines). |
|
27 |
+- **ALLMATCHSCAN file/directory** |
|
28 |
+ ALLMATCHSCAN works just like SCAN except that it sets a mode in which clamd, after finding a virus within a file, continues scanning for additional viruses. |
|
29 |
+- **INSTREAM** |
|
30 |
+ *It is mandatory to prefix this command with **n** or **z**.* Scan a stream of data. The stream is sent to clamd in chunks, after INSTREAM, on the same socket on which the command was sent. This avoids the overhead of establishing new TCP connections and problems with NAT. The format of the chunk is: `<length><data>` where `<length>` is the size of the following data in bytes expressed as a 4 byte unsigned integer in network byte order and `<data>` is the actual chunk. Streaming is terminated by sending a zero-length chunk. Note: do not exceed StreamMaxLength as defined in clamd.conf, otherwise clamd will reply with *INSTREAM size limit exceeded* and close the connection. |
|
31 |
+- **FILDES** |
|
32 |
+ *It is mandatory to newline terminate this command, or prefix with **n** or **z**. This command only works on UNIX domain sockets.* Scan a file descriptor. After issuing a FILDES command a subsequent rfc2292/bsd4.4 style packet (with at least one dummy character) is sent to clamd carrying the file descriptor to be scanned inside the ancillary data. Alternatively the file descriptor may be sent in the same packet, including the extra character. |
|
33 |
+- **STATS** |
|
34 |
+ *It is mandatory to newline terminate this command, or prefix with **n** or **z**, it is recommended to only use the **z** prefix.* On this command clamd provides statistics about the scan queue, contents of scan queue, and memory usage. The exact reply format is subject to changes in future releases. |
|
35 |
+- **IDSESSION, END** |
|
36 |
+ *It is mandatory to prefix this command with **n** or **z**, also all commands inside **IDSESSION** must be prefixed.* Start/end a clamd session. Within a session multiple SCAN, INSTREAM, FILDES, VERSION, STATS commands can be sent on the same socket without opening new connections. Replies from clamd will be in the form `<id>: <response>` where `<id>` is the request number (in ASCII, starting from 1) and `<response>` is the usual clamd reply. The reply lines have the same delimiter as the corresponding command had. Clamd will process the commands asynchronously, and reply as soon as it has finished processing. Clamd requires clients to read all the replies it sent, before sending more commands to prevent send() deadlocks. The recommended way to implement a client that uses IDSESSION is with non-blocking sockets, and a select()/poll() loop: whenever send would block, sleep in select/poll until either you can write more data, or read more replies. *Note that using non-blocking sockets without the select/poll loop and alternating recv()/send() doesn’t comply with clamd’s requirements.* If clamd detects that a client has deadlocked, it will close the connection. Note that clamd may close an IDSESSION connection too if the client doesn’t follow the protocol’s requirements. |
|
37 |
+- **STREAM** (deprecated, use **INSTREAM** instead) |
|
38 |
+ Scan stream: clamd will return a new port number you should connect to and send data to scan. |
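The INSTREAM framing described in the list above can be sketched in a few lines of Python. This only builds the bytes a client would write to the socket; it does not open a connection. The function name `build_instream_request` and the default chunk size of 4096 bytes are arbitrary choices for this illustration, and a real client must also keep the total size below StreamMaxLength.

```python
import struct


def build_instream_request(data, chunk_size=4096):
    """Return the bytes a client would send for a zINSTREAM request."""
    out = [b"zINSTREAM\0"]            # 'z' prefix: command is NULL-delimited
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        # <length> is a 4 byte unsigned integer in network byte order.
        out.append(struct.pack("!I", len(chunk)) + chunk)
    out.append(struct.pack("!I", 0))  # zero-length chunk terminates the stream
    return b"".join(out)


request = build_instream_request(b"hello world", chunk_size=8)
print(request)
```

Building the request as a pure function keeps the framing logic separate from socket handling, which makes it easy to check against the format above.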
|
39 |
+ |
|
40 |
+It’s recommended to prefix clamd commands with the letter **z** (e.g. zSCAN) to indicate that the command will be delimited by a NULL character and that clamd should continue reading command data until a NULL character is read. The null delimiter ensures that the complete command and its entire argument will be processed as a single command. Alternatively, commands may be prefixed with the letter **n** (e.g. nSCAN) to use a newline character as the delimiter. Clamd replies will honour the requested terminator in turn. If clamd doesn’t recognize the command, or the command doesn’t follow the requirements specified above, it will reply with an error message and close the connection. Clamd can handle the following signals: |
|
41 |
+ |
|
42 |
+- **SIGTERM** - perform a clean exit |
|
43 |
+- **SIGHUP** - reopen the log file |
|
44 |
+- **SIGUSR2** - reload the database |
|
45 |
+ |
|
46 |
+Clamd should not be started in the background using the shell operator `&` or external tools. Instead, you should run and wait for clamd to load the database and daemonize itself. After that, clamd is instantly ready to accept connections and perform file scanning. |
|
47 |
+ |
|
48 |
+## Clam**d**scan |
|
49 |
+ |
|
50 |
+`clamdscan` is a simple `clamd` client. In many cases you can use it as a `clamscan` replacement; however, you must remember that: |
|
51 |
+ |
|
52 |
+- it only depends on `clamd` |
|
53 |
+- although it accepts the same command line options as `clamscan` most of them are ignored because they must be enabled directly in `clamd`, i.e. `clamd.conf` |
|
54 |
+- in TCP mode, scanned files must be accessible to `clamd`; if you enabled LocalSocket in clamd.conf, clamdscan will try to work around this limitation by using FILDES |
|
55 |
+ |
|
56 |
+## On-access Scanning |
|
57 |
+ |
|
58 |
+There is a special thread in `clamd` that performs on-access scanning under Linux and shares the internal virus database with the daemon. By default, this thread will only notify you when potential threats are discovered. If you turn on prevention via `clamd.conf` then **you must follow some important rules when using it:** |
|
59 |
+ |
|
60 |
+- Always stop the daemon cleanly, using the SHUTDOWN command or the SIGTERM signal. Otherwise you can lose access to protected files until the system is restarted. |
|
61 |
+- Never protect the directory your mail-scanner software uses for attachment unpacking. Access to all infected files will be automatically blocked and the scanner (including `clamd`\!) will not be able to detect any viruses. As a result **all infected mails may be delivered.** |
|
62 |
+- Watch your entire filesystem only using the `clamd.conf` OnAccessMountPath option. While this will disable on-access prevention, it will avoid potential system lockups caused by fanotify’s blocking functionality. |
|
63 |
+- Using the On-Access Scanner to watch a virtual filesystem will result in undefined behaviour. |
|
64 |
+ |
|
65 |
+The default configuration utilizes inotify to recursively keep track of directories. If you need to protect more than 8192 directories it will be necessary to change inotify’s `max_user_watches` value. |
|
66 |
+ |
|
67 |
+This can be done temporarily with: |
|
68 |
+ |
|
69 |
+```bash |
|
70 |
+ $ sysctl fs.inotify.max_user_watches=<n> |
|
71 |
+``` |
|
72 |
+ |
|
73 |
+Where `<n>` is the new maximum desired. |
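To decide whether the limit needs raising, it can help to count the directories under the path you intend to watch. The Python sketch below assumes one inotify watch per watched directory and uses the 8192 figure quoted above as the default limit. Both function names are illustrative, and the temporary directory tree exists only to demonstrate the count.

```python
import os
import tempfile


def count_directories(root):
    """Count root plus every readable directory below it (symlinks not followed)."""
    return sum(1 for _dirpath, _dirnames, _filenames in os.walk(root))


def needs_higher_limit(directory_count, limit=8192):
    """True if watching this many directories would exceed the given limit."""
    return directory_count > limit


# Demonstration on a throwaway tree: root, root/a, root/a/b and root/c.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b"))
    os.makedirs(os.path.join(root, "c"))
    demo_count = count_directories(root)

print(demo_count, needs_higher_limit(demo_count))
```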
|
74 |
+ |
|
75 |
+To watch your entire filesystem add the following lines to `clamd.conf`: |
|
76 |
+ |
|
77 |
+```ini |
|
78 |
+ ScanOnAccess yes |
|
79 |
+ OnAccessMountPath / |
|
80 |
+``` |
|
81 |
+ |
|
82 |
+Similarly, to protect your home directory add the following lines to |
|
83 |
+`clamd.conf`: |
|
84 |
+ |
|
85 |
+```ini |
|
86 |
+ ScanOnAccess yes |
|
87 |
+ OnAccessIncludePath /home |
|
88 |
+ OnAccessExcludePath /home/user/temp/dir/of/your/mail/scanning/software |
|
89 |
+ OnAccessPrevention yes |
|
90 |
+``` |
|
91 |
+ |
|
92 |
+For more configuration options, type `man clamd.conf` or refer to the example clamd.conf. |
|
93 |
+ |
|
94 |
+## Clamdtop |
|
95 |
+ |
|
96 |
+`clamdtop` is a tool to monitor one or multiple instances of clamd. It has a (color) ncurses interface that shows the jobs in clamd’s queue, memory usage, and information about the loaded signature database. You can specify on the command line which clamd(s) it should connect to. By default it will attempt to connect to the local clamd as defined in clamd.conf. |
|
97 |
+ |
|
98 |
+For more detailed help, type `man clamdtop` or `clamdtop --help`. |
|
99 |
+ |
|
100 |
+## Clamscan |
|
101 |
+ |
|
102 |
+`clamscan` is ClamAV’s command line virus scanner. It can be used to scan files and/or directories for viruses. In order for clamscan to work properly, the ClamAV virus database files must be installed on the system you are using clamscan on. |
|
103 |
+ |
|
104 |
+The general usage of clamscan is: clamscan \[options\] |
|
105 |
+\[file/directory/-\] |
|
106 |
+ |
|
107 |
+For more detailed help, type `man clamscan` or `clamscan --help`. |
|
108 |
+ |
|
109 |
+## ClamBC |
|
110 |
+ |
|
111 |
+`clambc` is Clam Anti-Virus’ bytecode testing tool. It can be used to test files which contain bytecode. For more detailed help, type `man clambc` or `clambc --help`. |
|
112 |
+ |
|
113 |
+## Freshclam |
|
114 |
+ |
|
115 |
+`freshclam` is ClamAV’s virus database update tool and reads its configuration from the file `freshclam.conf` (this may be overridden by command line options). Freshclam’s default behavior is to attempt to update databases that are paired with downloaded cdiffs. Potentially corrupted databases are not updated and are automatically fully replaced after several failed attempts unless otherwise specified. |
|
116 |
+ |
|
117 |
+Here is a sample usage including cdiffs: |
|
118 |
+ |
|
119 |
+```bash |
|
120 |
+$ freshclam |
|
121 |
+ |
|
122 |
+ClamAV update process started at Mon Oct 7 08:15:10 2013 |
|
123 |
+main.cld is up to date (version: 55, sigs: 2424225, f-level: 60, builder: neo) |
|
124 |
+Downloading daily-17945.cdiff [100%] |
|
125 |
+Downloading daily-17946.cdiff [100%] |
|
126 |
+Downloading daily-17947.cdiff [100%] |
|
127 |
+daily.cld updated (version: 17947, sigs: 406951, f-level: 63, builder: neo) |
|
128 |
+Downloading bytecode-227.cdiff [100%] |
|
129 |
+Downloading bytecode-228.cdiff [100%] |
|
130 |
+bytecode.cld updated (version: 228, sigs: 43, f-level: 63, builder: neo) |
|
131 |
+Database updated (2831219 signatures) from database.clamav.net (IP: 64.6.100.177) |
|
132 |
+``` |
|
133 |
+ |
|
134 |
+For more detailed help, type `man freshclam` or `freshclam --help`. |
|
135 |
+ |
|
136 |
+## Clamconf |
|
137 |
+ |
|
138 |
+`clamconf` is the Clam Anti-Virus configuration utility. It displays the values of ClamAV configuration options: the contents of clamd.conf (or a notice if it is not properly configured), the contents of freshclam.conf, and information about software settings, the database, the platform, and the build. Here is a sample clamconf output: |
|
139 |
+ |
|
140 |
+```bash |
|
141 |
+$ clamconf |
|
142 |
+ |
|
143 |
+Checking configuration files in /etc/clamav |
|
144 |
+ |
|
145 |
+Config file: clamd.conf |
|
146 |
+----------------------- |
|
147 |
+ERROR: Please edit the example config file /etc/clamav/clamd.conf |
|
148 |
+ |
|
149 |
+Config file: freshclam.conf |
|
150 |
+--------------------------- |
|
151 |
+ERROR: Please edit the example config file /etc/clamav/freshclam.conf |
|
152 |
+ |
|
153 |
+clamav-milter.conf not found |
|
154 |
+ |
|
155 |
+Software settings |
|
156 |
+----------------- |
|
157 |
+Version: 0.98.2 |
|
158 |
+Optional features supported: MEMPOOL IPv6 AUTOIT_EA06 BZIP2 RAR JIT |
|
159 |
+ |
|
160 |
+Database information |
|
161 |
+-------------------- |
|
162 |
+Database directory: /xclam/gcc/release/share/clamav |
|
163 |
+WARNING: freshclam.conf and clamd.conf point to different database directories |
|
164 |
+print_dbs: Can't open directory /xclam/gcc/release/share/clamav |
|
165 |
+ |
|
166 |
+Platform information |
|
167 |
+-------------------- |
|
168 |
+uname: Linux 3.5.0-44-generic #67~precise1-Ubuntu SMP Wed Nov 13 16:20:03 UTC 2013 i686 |
|
169 |
+OS: linux-gnu, ARCH: i386, CPU: i686 |
|
170 |
+Full OS version: Ubuntu 12.04.3 LTS |
|
171 |
+zlib version: 1.2.3.4 (1.2.3.4), compile flags: 55 |
|
172 |
+Triple: i386-pc-linux-gnu |
|
173 |
+CPU: i686, Little-endian |
|
174 |
+platform id: 0x0a114d4d0404060401040604 |
|
175 |
+ |
|
176 |
+Build information |
|
177 |
+----------------- |
|
178 |
+GNU C: 4.6.4 (4.6.4) |
|
179 |
+GNU C++: 4.6.4 (4.6.4) |
|
180 |
+CPPFLAGS: |
|
181 |
+CFLAGS: -g -O0 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE |
|
182 |
+CXXFLAGS: |
|
183 |
+LDFLAGS: |
|
184 |
+Configure: '--prefix=/xclam/gcc/release/' '--disable-clamav' '--enable-debug' 'CFLAGS=-g -O0' |
|
185 |
+sizeof(void*) = 4 |
|
186 |
+Engine flevel: 77, dconf: 77 |
|
187 |
+ |
|
188 |
+``` |
|
189 |
+ |
|
190 |
+For more detailed help, type `man clamconf` or `clamconf --help`. |
|
191 |
+ |
|
192 |
+## Output format |
|
193 |
+ |
|
194 |
+### clamscan |
|
195 |
+ |
|
196 |
+`clamscan` writes all regular program messages to **stdout** and errors/warnings to **stderr**. You can use the option `--stdout` to redirect all program messages to **stdout**. Warnings and error messages from `libclamav` are always printed to **stderr**. A typical output from `clamscan` looks like this: |
|
197 |
+ |
|
198 |
+```bash |
|
199 |
+ /tmp/test/removal-tool.exe: Worm.Sober FOUND |
|
200 |
+ /tmp/test/md5.o: OK |
|
201 |
+ /tmp/test/blob.c: OK |
|
202 |
+ /tmp/test/message.c: OK |
|
203 |
+ /tmp/test/error.hta: VBS.Inor.D FOUND |
|
204 |
+``` |
|
205 |
+ |
|
206 |
+When a virus is found its name is printed between the `filename:` and `FOUND` strings. In case of archives the scanner depends on libclamav and only prints the first virus found within an archive: |
|
207 |
+ |
|
208 |
+```bash |
|
209 |
+ $ clamscan malware.zip |
|
210 |
+ malware.zip: Worm.Mydoom.U FOUND |
|
211 |
+``` |
|
212 |
+ |
|
213 |
+When using the `--allmatch` (`-z`) flag, clamscan may print multiple virus `FOUND` lines for archives and files. |
|
214 |
+ |
|
215 |
+### clamd |
|
216 |
+ |
|
217 |
+The output format of `clamd` is very similar to `clamscan`. |
|
218 |
+ |
|
219 |
+```bash |
|
220 |
+ $ telnet localhost 3310 |
|
221 |
+ Trying 127.0.0.1... |
|
222 |
+ Connected to localhost. |
|
223 |
+ Escape character is '^]'. |
|
224 |
+ SCAN /home/zolw/test |
|
225 |
+ /home/zolw/test/clam.exe: ClamAV-Test-File FOUND |
|
226 |
+ Connection closed by foreign host. |
|
227 |
+``` |
|
228 |
+ |
|
229 |
+In the **SCAN** mode it closes the connection when the first virus is found. |
|
230 |
+ |
|
231 |
+```bash |
|
232 |
+ SCAN /home/zolw/test/clam.zip |
|
233 |
+ /home/zolw/test/clam.zip: ClamAV-Test-File FOUND |
|
234 |
+``` |
|
235 |
+ |
|
236 |
+**CONTSCAN** and **MULTISCAN** don’t stop scanning in case a virus is found. Error messages are printed in the following format: |
|
237 |
+ |
|
238 |
+```bash |
|
239 |
+ SCAN /no/such/file |
|
240 |
+ /no/such/file: Can't stat() the file. ERROR |
|
241 |
+``` |
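Because `clamscan` and `clamd` share the `path: result` line shape shown in the examples above, a client can classify replies with a small parser. The Python sketch below is based only on those sample lines; the function name `parse_scan_reply` is hypothetical. It splits at the first `": "`, so a path that itself contains a colon followed by a space would be misparsed.

```python
def parse_scan_reply(line):
    """Split a clamscan/clamd result line into (path, status, detail)."""
    # Drop surrounding whitespace and a NULL terminator from a 'z' command reply.
    line = line.strip("\0\r\n ")
    path, sep, rest = line.partition(": ")
    if not sep:
        raise ValueError("unrecognized reply: %r" % line)
    if rest == "OK":
        return path, "OK", None
    for status in ("FOUND", "ERROR"):
        if rest.endswith(" " + status):
            # detail is the virus name (FOUND) or the error message (ERROR)
            return path, status, rest[: -len(status) - 1]
    raise ValueError("unrecognized reply: %r" % line)


for sample in (
    "/tmp/test/removal-tool.exe: Worm.Sober FOUND",
    "/tmp/test/md5.o: OK",
    "/no/such/file: Can't stat() the file. ERROR",
):
    print(parse_scan_reply(sample))
```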
0 | 242 |
new file mode 100644 |
... | ... |
@@ -0,0 +1,400 @@ |
0 |
+# LibClamAV |
|
1 |
+ |
|
2 |
+Libclamav provides an easy and effective way to add virus protection to your software. The library is thread-safe and transparently recognizes and scans within archives, mail files, MS Office document files, executables and other special formats. |
|
3 |
+ |
|
4 |
+## License |
|
5 |
+ |
|
6 |
+Libclamav is licensed under the GNU GPL v2 license. This means you are **not allowed** to link commercial, closed-source software against it. All software using libclamav must be GPL compliant. |
|
7 |
+ |
|
8 |
+## Supported formats and features |
|
9 |
+ |
|
10 |
+### Executables |
|
11 |
+ |
|
12 |
+The library has built-in support for 32- and 64-bit Portable Executable, ELF and Mach-O files. Additionally, it can handle PE files compressed or obfuscated with the following tools: |
|
13 |
+ |
|
14 |
+- Aspack (2.12) |
|
15 |
+- UPX (all versions) |
|
16 |
+- FSG (1.3, 1.31, 1.33, 2.0) |
|
17 |
+- Petite (2.x) |
|
18 |
+- PeSpin (1.1) |
|
19 |
+- NsPack |
|
20 |
+- wwpack32 (1.20) |
|
21 |
+- MEW |
|
22 |
+- Upack |
|
23 |
+- Y0da Cryptor (1.3) |
|
24 |
+ |
|
25 |
+### Mail files |
|
26 |
+ |
|
27 |
+Libclamav can handle almost every mail file format including TNEF (winmail.dat) attachments. |
|
28 |
+ |
|
29 |
+### Archives and compressed files |
|
30 |
+ |
|
31 |
+The following archive and compression formats are supported by internal handlers: |
|
32 |
+ |
|
33 |
+- Zip (+ SFX) |
|
34 |
+- RAR (+ SFX) |
|
35 |
+- 7Zip |
|
36 |
+- Tar |
|
37 |
+- CPIO |
|
38 |
+- Gzip |
|
39 |
+- Bzip2 |
|
40 |
+- DMG |
|
41 |
+- IMG |
|
42 |
+- ISO 9660 |
|
43 |
+- PKG |
|
44 |
+- HFS+ partition |
|
45 |
+- HFSX partition |
|
46 |
+- APM disk image |
|
47 |
+- GPT disk image |
|
48 |
+- MBR disk image |
|
49 |
+- XAR |
|
50 |
+- XZ |
|
51 |
+- MS OLE2 |
|
52 |
+- MS Cabinet Files (+ SFX) |
|
53 |
+- MS CHM (Compiled HTML) |
|
54 |
+- MS SZDD compression format |
|
55 |
+- BinHex |
|
56 |
+- SIS (SymbianOS packages) |
|
57 |
+- AutoIt |
|
58 |
+- NSIS |
|
59 |
+- InstallShield |
|
60 |
+ |
|
61 |
+### Documents |
|
62 |
+ |
|
63 |
+The most popular file formats are supported: |
|
64 |
+ |
|
65 |
+- MS Office and MacOffice files |
|
66 |
+- RTF |
|
67 |
+- PDF |
|
68 |
+- HTML |
|
69 |
+ |
|
70 |
+In the case of Office, RTF and PDF files, libclamav will only extract the embedded objects and will not decode the text data itself. The text decoding and normalization is only performed for HTML files. |
|
71 |
+ |
|
72 |
+### Data Loss Prevention |
|
73 |
+ |
|
74 |
+Libclamav includes a DLP module which can detect credit card numbers from the following issuers: AMEX, VISA, MasterCard, Discover, Diner’s Club, and JCB, as well as U.S. social security numbers, inside text files. |
|
75 |
+ |
|
76 |
+Future versions of Libclamav may include additional features to detect other credit cards and other forms of PII (Personally Identifiable Information) which may be transmitted without the benefit of being encrypted. |
|
77 |
+ |
|
78 |
+### Others |
|
79 |
+ |
|
80 |
+Libclamav can handle various obfuscators, encoders, files vulnerable to security risks such as: |
|
81 |
+ |
|
82 |
+- JPEG (exploit detection) |
|
83 |
+- RIFF (exploit detection) |
|
84 |
+- uuencode |
|
85 |
+- ScrEnc obfuscation |
|
86 |
+- CryptFF |
|
87 |
+ |
|
88 |
+## API |
|
89 |
+ |
|
90 |
+### Header file |
|
91 |
+ |
|
92 |
+Every program using libclamav must include the header file `clamav.h`: |
|
93 |
+ |
|
94 |
+```c |
|
95 |
+ #include <clamav.h> |
|
96 |
+``` |
|
97 |
+ |
|
98 |
+### Initialization |
|
99 |
+ |
|
100 |
+Before using libclamav, you should call `cl_init()` to initialize it. `CL_INIT_DEFAULT` is a macro that can be passed to `cl_init()` representing the default initialization settings. When it’s done, you’re ready to create a new scan engine by calling `cl_engine_new()`. To free resources allocated by the engine use `cl_engine_free()`. Function prototypes: |
|
101 |
+ |
|
102 |
+```c |
|
103 |
+ int cl_init(unsigned int options); |
|
104 |
+ struct cl_engine *cl_engine_new(void); |
|
105 |
+ int cl_engine_free(struct cl_engine *engine); |
|
106 |
+``` |
|
107 |
+ |
|
108 |
+`cl_init()` and `cl_engine_free()` return `CL_SUCCESS` on success or another code on error. `cl_engine_new()` returns a pointer, or NULL if there’s not enough memory to allocate a new engine structure. |
|
109 |
+ |
|
110 |
+### Database loading |
|
111 |
+ |
|
112 |
+The following set of functions provides an interface for loading the virus database: |
|
113 |
+ |
|
114 |
+```c |
|
115 |
+ const char *cl_retdbdir(void); |
|
116 |
+ |
|
117 |
+ int cl_load(const char *path, struct cl_engine *engine, |
|
118 |
+ unsigned int *signo, unsigned int options); |
|
119 |
+``` |
|
120 |
+ |
|
121 |
+`cl_retdbdir()` returns the default (hardcoded) path to the directory with ClamAV databases. `cl_load()` loads a single database file or all databases from a given directory (when `path` points to a directory). The second argument is a pointer to the engine, which must have been allocated beforehand with `cl_engine_new()`. The number of loaded signatures will be **added** to `signo`. The last argument can pass the following flags: |
|
122 |
+ |
|
123 |
+- **CL_DB_STDOPT** |
|
124 |
+ This is an alias for a recommended set of database loading options. |
|
125 |
+- **CL_DB_PHISHING** |
|
126 |
+ Load phishing signatures. |
|
127 |
+- **CL_DB_PHISHING_URLS** |
|
128 |
+ Initialize the phishing detection module and load .wdb and .pdb |
|
129 |
+ files. |
|
130 |
+- **CL_DB_PUA** |
|
131 |
+ Load signatures for Potentially Unwanted Applications. |
|
132 |
+- **CL_DB_OFFICIAL_ONLY** |
|
133 |
+ Only load official signatures from digitally signed databases. |
|
134 |
+- **CL_DB_BYTECODE** |
|
135 |
+ Load bytecode. |
|
136 |
+ |
|
137 |
+`cl_load()` returns `CL_SUCCESS` on success and another code on failure. |
|
138 |
+ |
|
139 |
+```c |
|
140 |
+ ... |
|
141 |
+ struct cl_engine *engine; |
|
142 |
+ unsigned int sigs = 0; |
|
143 |
+ int ret; |
|
144 |
+ |
|
145 |
+ if((ret = cl_init(CL_INIT_DEFAULT)) != CL_SUCCESS) { |
|
146 |
+ printf("cl_init() error: %s\n", cl_strerror(ret)); |
|
147 |
+ return 1; |
|
148 |
+ } |
|
149 |
+ |
|
150 |
+ if(!(engine = cl_engine_new())) { |
|
151 |
+ printf("Can't create new engine\n"); |
|
152 |
+ return 1; |
|
153 |
+ } |
|
154 |
+ |
|
155 |
+ ret = cl_load(cl_retdbdir(), engine, &sigs, CL_DB_STDOPT); |
|
156 |
+``` |
|
157 |
+ |
|
158 |
+### Error handling |
|
159 |
+ |
|
160 |
+Use `cl_strerror()` to convert error codes into human readable messages. The function returns a statically allocated string: |
|
161 |
+ |
|
162 |
+```c |
|
163 |
+ if(ret != CL_SUCCESS) { |
|
164 |
+ printf("cl_load() error: %s\n", cl_strerror(ret)); |
|
165 |
+ cl_engine_free(engine); |
|
166 |
+ return 1; |
|
167 |
+ } |
|
168 |
+``` |
|
169 |
+ |
|
170 |
+### Engine structure |
|
171 |
+ |
|
172 |
+When all required databases are loaded you should prepare the detection engine by calling `cl_engine_compile()`. In case of failure you should still free the memory allocated to the engine with `cl_engine_free()`: |
|
173 |
+ |
|
174 |
+```c |
|
175 |
+ int cl_engine_compile(struct cl_engine *engine); |
|
176 |
+``` |
|
177 |
+ |
|
178 |
+In our example: |
|
179 |
+ |
|
180 |
+```c |
|
181 |
+ if((ret = cl_engine_compile(engine)) != CL_SUCCESS) { |
|
182 |
+ printf("cl_engine_compile() error: %s\n", cl_strerror(ret)); |
|
183 |
+ cl_engine_free(engine); |
|
184 |
+ return 1; |
|
185 |
+ } |
|
186 |
+``` |
|
187 |
+ |
|
188 |
+### Limits |
|
189 |
+ |
|
190 |
+When you create a new engine with `cl_engine_new()`, it will have all internal settings set to default values as recommended by the ClamAV authors. It’s possible to check and modify the values (numerical and strings) using the following set of functions: |
|
191 |
+ |
|
192 |
+```c |
|
193 |
+int cl_engine_set_num(struct cl_engine *engine, |
|
194 |
+ enum cl_engine_field field, long long num); |
|
195 |
+ |
|
196 |
+long long cl_engine_get_num(const struct cl_engine *engine, |
|
197 |
+ enum cl_engine_field field, int *err); |
|
198 |
+ |
|
199 |
+int cl_engine_set_str(struct cl_engine *engine, |
|
200 |
+ enum cl_engine_field field, const char *str); |
|
201 |
+ |
|
202 |
+const char *cl_engine_get_str(const struct cl_engine *engine, |
|
203 |
+ enum cl_engine_field field, int *err); |
|
204 |
+``` |
|
205 |
+ |
|
206 |
+Please don’t modify the default values unless you know what you’re doing. Refer to the ClamAV sources (clamscan, clamd) for examples. |
|
207 |
+ |
|
208 |
+### Database checks |
|
209 |
+ |
|
210 |
+It’s very important to keep the internal instance of the database up to date. You can watch database changes with the `cl_stat..()` family of functions. |
|
211 |
+ |
|
212 |
+```c |
|
213 |
+ int cl_statinidir(const char *dirname, struct cl_stat *dbstat); |
|
214 |
+ int cl_statchkdir(const struct cl_stat *dbstat); |
|
215 |
+ int cl_statfree(struct cl_stat *dbstat); |
|
216 |
+``` |
|
217 |
+ |
|
218 |
+Initialization: |
|
219 |
+ |
|
220 |
+```c |
|
221 |
+ ... |
|
222 |
+ struct cl_stat dbstat; |
|
223 |
+ |
|
224 |
+ memset(&dbstat, 0, sizeof(struct cl_stat)); |
|
225 |
+ cl_statinidir(dbdir, &dbstat); |
|
226 |
+``` |
|
227 |
+ |
|
228 |
+To check for a change, call `cl_statchkdir()` and check its return value (0: no change, 1: some change occurred). Remember to reset the `cl_stat` structure with `cl_statfree()` and `cl_statinidir()` after reloading the database. |
|
229 |
+ |
|
230 |
+```c |
|
231 |
+ if(cl_statchkdir(&dbstat) == 1) { |
|
232 |
+ /* reload the database here (application-specific) */ |
|
233 |
+ cl_statfree(&dbstat); |
|
234 |
+ cl_statinidir(cl_retdbdir(), &dbstat); |
|
235 |
+ } |
|
236 |
+``` |
|
237 |
+ |
|
238 |
+Libclamav 0.96 and later includes an additional call to check the number of signatures that can be loaded from a given directory: |
|
239 |
+ |
|
240 |
+```c |
|
241 |
+ int cl_countsigs(const char *path, unsigned int countoptions, |
|
242 |
+ unsigned int *sigs); |
|
243 |
+``` |
|
244 |
+ |
|
245 |
+The first argument points to the database directory, the second one specifies what signatures should be counted: `CL_COUNTSIGS_OFFICIAL` (official signatures), `CL_COUNTSIGS_UNOFFICIAL` (third party signatures), `CL_COUNTSIGS_ALL` (all signatures). The last argument points to the counter to which the number of detected signatures will be added (therefore the counter should be initially set to 0). The call returns `CL_SUCCESS` or an error code. |
|
246 |
+ |
|
247 |
+### Data scan functions |
|
248 |
+ |
|
249 |
+It’s possible to scan a file or descriptor using: |
|
250 |
+ |
|
251 |
+```c |
|
252 |
+ int cl_scanfile(const char *filename, const char **virname, |
|
253 |
+ unsigned long int *scanned, const struct cl_engine *engine, |
|
254 |
+ unsigned int options); |
|
255 |
+ |
|
256 |
+ int cl_scandesc(int desc, const char **virname, unsigned |
|
257 |
+ long int *scanned, const struct cl_engine *engine, |
|
258 |
+ unsigned int options); |
|
259 |
+``` |
|
260 |
+ |
|
261 |
+Both functions store the virus name at the address pointed to by `virname`. The name is part of the engine structure and must not be released directly. If the third argument (`scanned`) is not NULL, the functions increase its value by the size of the scanned data (in `CL_COUNT_PRECISION` units). The last argument (`options`) specifies the scan options and supports the following flags, which can be combined with the bitwise OR operator: |
|
262 |
+ |
|
263 |
+- **CL_SCAN_STDOPT** |
|
264 |
+ This is an alias for a recommended set of scan options. You should use it to make your software ready for new features in the future versions of libclamav. |
|
265 |
+- **CL_SCAN_RAW** |
|
266 |
+ Use it alone if you want to disable support for special files. |
|
267 |
+- **CL_SCAN_ARCHIVE** |
|
268 |
+ This flag enables transparent scanning of various archive formats. |
|
269 |
+- **CL_SCAN_BLOCKENCRYPTED** |
|
270 |
+ With this flag the library will mark encrypted archives as viruses (Encrypted.Zip, Encrypted.RAR). |
|
271 |
+- **CL_SCAN_MAIL** |
|
272 |
+ Enable support for mail files. |
|
273 |
+- **CL_SCAN_OLE2** |
|
274 |
+ Enables support for OLE2 containers (used by MS Office and .msi files). |
|
275 |
+- **CL_SCAN_PDF** |
|
276 |
+ Enables scanning within PDF files. |
|
277 |
+- **CL_SCAN_SWF** |
|
278 |
+ Enables scanning within SWF files, notably compressed SWF. |
|
279 |
+- **CL_SCAN_PE** |
|
280 |
+ This flag enables deep scanning of Portable Executable files and allows libclamav to unpack executables compressed with run-time unpackers. |
|
281 |
+- **CL_SCAN_ELF** |
|
282 |
+ Enable support for ELF files. |
|
283 |
+- **CL_SCAN_BLOCKBROKEN** |
|
284 |
+ libclamav will try to detect broken executables and mark them as Broken.Executable. |
|
285 |
+- **CL_SCAN_HTML** |
|
286 |
+ This flag enables HTML normalisation (including ScrEnc decryption). |
|
287 |
+- **CL_SCAN_ALGORITHMIC** |
|
288 |
+ Enable algorithmic detection of viruses. |
|
289 |
+- **CL_SCAN_PHISHING_BLOCKSSL** |
|
290 |
+ Phishing module: always block SSL mismatches in URLs. |
|
291 |
+- **CL_SCAN_PHISHING_BLOCKCLOAK** |
|
292 |
+ Phishing module: always block cloaked URLs. |
|
293 |
+- **CL_SCAN_STRUCTURED** |
|
294 |
+ Enable the DLP module which scans for credit card and SSN numbers. |
|
295 |
+- **CL_SCAN_STRUCTURED_SSN_NORMAL** |
|
296 |
+ Search for SSNs formatted as xxx-yy-zzzz. |
|
297 |
+- **CL_SCAN_STRUCTURED_SSN_STRIPPED** |
|
298 |
+ Search for SSNs formatted as xxxyyzzzz. |
|
299 |
+- **CL_SCAN_PARTIAL_MESSAGE** |
|
300 |
+ Scan RFC1341 messages split over many emails. You will need to periodically clean up the `$TemporaryDirectory/clamav-partial` directory. |
|
301 |
+- **CL_SCAN_HEURISTIC_PRECEDENCE** |
|
302 |
+ Allow heuristic matches to take precedence. When enabled, a heuristic scan (such as phishingScan) that detects a possible virus or phish stops the scan immediately. This is recommended because it saves CPU time. When disabled, a virus or phish detected by heuristic scans is reported only at the end of the scan. If an archive contains both a heuristic detection and real malware, the real malware is reported. |
|
303 |
+- **CL_SCAN_BLOCKMACROS** |
|
304 |
+ OLE2 containers that contain VBA macros will be marked as infected (Heuristics.OLE2.ContainsMacros). |
|
305 |
+ |
|
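Combining these flags follows the usual C bitmask pattern. The sketch below illustrates the pattern only: the `DEMO_SCAN_*` constants and the helper names are hypothetical stand-ins, not part of libclamav, and real code should use the `CL_SCAN_*` values from `clamav.h`.

```c
/* DEMO_SCAN_* are hypothetical stand-ins, NOT the real CL_SCAN_* values. */
#define DEMO_SCAN_ARCHIVE 0x1u
#define DEMO_SCAN_MAIL    0x2u
#define DEMO_SCAN_PDF     0x4u

/* Turn a flag on: bitwise OR keeps every bit that was already set. */
unsigned int flags_enable(unsigned int options, unsigned int flag)
{
    return options | flag;
}

/* Turn a flag off: AND with the complement clears only that flag's bits. */
unsigned int flags_disable(unsigned int options, unsigned int flag)
{
    return options & ~flag;
}

/* Non-zero when every bit of `flag` is set in `options`. */
int flags_has(unsigned int options, unsigned int flag)
{
    return (options & flag) == flag;
}
```

With the real header, the same pattern would read, for example, `CL_SCAN_STDOPT | CL_SCAN_BLOCKENCRYPTED` for the `options` argument.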
306 |
+Both functions return `CL_CLEAN` when the file seems clean, `CL_VIRUS` when a virus is detected, and another value on failure. |
|
307 |
+ |
|
308 |
+```c |
|
309 |
+ ... |
|
310 |
+ const char *virname; |
|
311 |
+ |
|
312 |
+ if((ret = cl_scanfile("/tmp/test.exe", &virname, NULL, engine, |
|
313 |
+ CL_SCAN_STDOPT)) == CL_VIRUS) { |
|
314 |
+ printf("Virus detected: %s\n", virname); |
|
315 |
+ } else { |
|
316 |
+ printf("No virus detected.\n"); |
|
317 |
+ if(ret != CL_CLEAN) |
|
318 |
+ printf("Error: %s\n", cl_strerror(ret)); |
|
319 |
+ } |
|
320 |
+``` |
|
321 |
+ |
|
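When a non-NULL `scanned` pointer is passed instead of `NULL`, the counter is expressed in `CL_COUNT_PRECISION` units rather than bytes. A minimal sketch of converting it for reporting follows; the helper names are hypothetical, and the fallback value of 4096 is an assumption that should be checked against the definition in your `clamav.h`.

```c
/* clamav.h defines CL_COUNT_PRECISION. The fallback below is an assumed
 * value so that this sketch also builds without the header. */
#ifndef CL_COUNT_PRECISION
#define CL_COUNT_PRECISION 4096
#endif

/* Convert a `scanned` counter (in CL_COUNT_PRECISION units) to bytes. */
unsigned long long scanned_to_bytes(unsigned long int scanned)
{
    return (unsigned long long)scanned * CL_COUNT_PRECISION;
}

/* The same amount expressed in mebibytes, convenient for log output. */
double scanned_to_mib(unsigned long int scanned)
{
    return (double)scanned_to_bytes(scanned) / (1024.0 * 1024.0);
}
```

Because the scan functions add to the counter, a single variable initialized to 0 can accumulate the total across many files before it is converted once at the end.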
322 |
+### Memory |
|
323 |
+ |
|
324 |
+Because the engine structure occupies a few megabytes of system memory, you should release it with `cl_engine_free()` if you no longer need to scan files. |
|
325 |
+ |
|
326 |
+### Forking daemons |
|
327 |
+ |
|
328 |
+If you’re using libclamav with a forking daemon you should call `srand()` inside a forked child before making any calls to the libclamav functions. This will avoid possible collisions with temporary filenames created by other processes of the daemon. This procedure is not required for multi-threaded daemons. |
|
329 |
+ |
|
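One possible way to follow this advice on a POSIX system is sketched below. The helper names are hypothetical, and mixing the process ID into the seed is an assumption about what makes a reasonable per-child seed; the text above only requires that `srand()` is called in the child.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: derive a seed that differs between children forked
 * within the same second by mixing the PID into the clock value. */
unsigned int child_seed(pid_t pid, time_t now)
{
    unsigned int p = (unsigned int)pid;

    return (unsigned int)now ^ (p << 16) ^ p;
}

/* Call this first in the forked child, before any libclamav function. */
void reseed_after_fork(void)
{
    srand(child_seed(getpid(), time(NULL)));
}
```

In a daemon, `reseed_after_fork()` would be the first statement in the branch where `fork()` returned 0. Seeding with `time(NULL)` alone could give identical sequences to children created in the same second, which is why the PID is included here.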
330 |
+### clamav-config |
|
331 |
+ |
|
332 |
+Use `clamav-config` to check compilation information for libclamav. |
|
333 |
+ |
|
334 |
+```bash |
|
335 |
+ $ clamav-config --libs |
|
336 |
+ -L/usr/local/lib -lz -lbz2 -lgmp -lpthread |
|
337 |
+ $ clamav-config --cflags |
|
338 |
+ -I/usr/local/include -g -O2 |
|
339 |
+``` |
|
340 |
+ |
|
341 |
+### Example |
|
342 |
+ |
|
343 |
+You will find an example scanner application in the clamav source package (/example). Provided you have ClamAV already installed, execute the following to compile it: |
|
344 |
+ |
|
345 |
+```bash |
|
346 |
+ gcc -Wall ex1.c -o ex1 -lclamav |
|
347 |
+``` |
|
348 |
+ |
|
349 |
+## CVD format |
|
350 |
+ |
|
351 |
+CVD (ClamAV Virus Database) is a digitally signed tarball containing one or more databases. The header is a 512-byte string with colon-separated fields: |
|
352 |
+ |
|
353 |
+```ini |
|
354 |
+ClamAV-VDB:build time:version:number of signatures:functionality level required:MD5 checksum:digital signature:builder name:build time (sec) |
|
356 |
+``` |
|
357 |
+ |
|
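The header layout above can be split into its fields with a few lines of C. This is only an illustrative sketch: the struct and function names are hypothetical (they are not libclamav API), and it assumes that no field contains a colon and that any padding after the last field consists of spaces, newlines, or NUL bytes.

```c
#include <stddef.h>
#include <string.h>

#define CVD_HEADER_SIZE 512 /* header length stated in the text */
#define CVD_FIELD_COUNT 9   /* colon-separated fields listed above */

/* Hypothetical container for the parsed fields; not a libclamav type. */
struct cvd_header_fields {
    char buf[CVD_HEADER_SIZE + 1];      /* private NUL-terminated copy */
    const char *field[CVD_FIELD_COUNT]; /* pointers into buf */
};

/* Split a raw header into fields. Returns 0 on success, -1 when the
 * magic string or the number of fields does not match the layout. */
int cvd_header_split(const char *raw, size_t len, struct cvd_header_fields *out)
{
    size_t i, n = 0;

    if (raw == NULL || out == NULL)
        return -1;
    if (len > CVD_HEADER_SIZE)
        len = CVD_HEADER_SIZE;
    memcpy(out->buf, raw, len);

    /* Drop trailing padding (assumed: spaces, newlines, or NUL bytes). */
    while (len > 0 && (out->buf[len - 1] == ' ' || out->buf[len - 1] == '\n' ||
                       out->buf[len - 1] == '\r' || out->buf[len - 1] == '\0'))
        len--;
    out->buf[len] = '\0';

    out->field[n++] = out->buf;
    for (i = 0; i < len; i++) {
        if (out->buf[i] != ':')
            continue;
        if (n == CVD_FIELD_COUNT)
            return -1; /* more separators than the layout allows */
        out->buf[i] = '\0';
        out->field[n++] = &out->buf[i + 1];
    }
    if (n != CVD_FIELD_COUNT || strcmp(out->field[0], "ClamAV-VDB") != 0)
        return -1;
    return 0;
}
```

After a successful call, `field[2]` holds the version and `field[3]` the number of signatures, in the order given by the layout. The sketch does not verify the MD5 checksum or the digital signature; `sigtool --info` remains the tool for that.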
358 |
+`sigtool --info` displays detailed information on CVD files: |
|
359 |
+ |
|
360 |
+```bash |
|
361 |
+$ sigtool -i daily.cvd |
|
362 |
+File: daily.cvd |
|
363 |
+Build time: 10 Mar 2008 10:45 +0000 |
|
364 |
+Version: 6191 |
|
365 |
+Signatures: 59084 |
|
366 |
+Functionality level: 26 |
|
367 |
+Builder: ccordes |
|
368 |
+MD5: 6e6e29dae36b4b7315932c921e568330 |
|
369 |
+Digital signature: zz9irc9irupR3z7yX6J+OR6XdFPUat4HIM9ERn3kAcOWpcMFxq |
|
370 |
+Fs4toG5WJsHda0Jj92IUusZ7wAgYjpai1Nr+jFfXHsJxv0dBkS5/XWMntj0T1ctNgqmiF |
|
371 |
++RLU6V0VeTl4Oej3Aya0cVpd9K4XXevEO2eTTvzWNCAq0ZzWNdjc |
|
372 |
+Verification OK. |
|
373 |
+``` |
|
374 |
+ |
|
375 |
+## Graphics |
|
376 |
+ |
|
377 |
+The current ClamAV logo was created by Alicia Willet, Talos. |
|
378 |
+ |
|
379 |
+## OpenAntiVirus |
|
380 |
+ |
|
381 |
+Our database includes the virus database (about 7000 signatures) from OpenAntiVirus (<http://OpenAntiVirus.org>). |
|
382 |
+ |
|
383 |
+1. Subscribers are not allowed to post to the mailing list |
|
384 |
+ |
|
385 |
+2. For Windows instructions please see win32/README in the main source code directory. |
|
386 |
+ |
|
387 |
+3. See section [3.7](#unit-testing) on how to run the unit tests |
|
388 |
+ |
|
389 |
+4. If not available, ClamAV will fall back to an interpreter. |
|
390 |
+ |
|
391 |
+5. Note that several versions of GCC have bugs when compiling LLVM, see <http://llvm.org/docs/GettingStarted.html#brokengcc> for a full list. |
|
392 |
+ |
|
393 |
+6. The configure script in ClamAV automatically enables the unit tests, if it finds the check framework, however it doesn’t consider it a fatal error if unit tests cannot be enabled. |
|
394 |
+ |
|
395 |
+7. For more information on clamscan options, run `man clamscan`. |
|
396 |
+ |
|
397 |
+8. man 5 clamd.conf |
|
398 |
+ |
|
399 |
+9. Remember to initialize the virus counter variable with 0. |
0 | 400 |
deleted file mode 100644 |
... | ... |
@@ -1,400 +0,0 @@ |
1 |
-# LibClamAV |
|
2 |
- |
|
3 |
-Libclamav provides an easy and effective way to add a virus protection into your software. The library is thread-safe and transparently recognizes and scans within archives, mail files, MS Office document files, executables and other special formats. |
|
4 |
- |
|
5 |
-## License |
|
6 |
- |
|
7 |
-Libclamav is licensed under the GNU GPL v2 license. This means you are **not allowed** to link commercial, closed-source software against it. All software using libclamav must be GPL compliant. |
|
8 |
- |
|
9 |
-## Supported formats and features |
|
10 |
- |
|
11 |
-### Executables |
|
12 |
- |
|
13 |
-The library has a built-in support for 32- and 64-bit Portable Executable, ELF and Mach-O files. Additionally, it can handle PE files compressed or obfuscated with the following tools: |
|
14 |
- |
|
15 |
-- Aspack (2.12) |
|
16 |
-- UPX (all versions) |
|
17 |
-- FSG (1.3, 1.31, 1.33, 2.0) |
|
18 |
-- Petite (2.x) |
|
19 |
-- PeSpin (1.1) |
|
20 |
-- NsPack |
|
21 |
-- wwpack32 (1.20) |
|
22 |
-- MEW |
|
23 |
-- Upack |
|
24 |
-- Y0da Cryptor (1.3) |
|
25 |
- |
|
26 |
-### Mail files |
|
27 |
- |
|
28 |
-Libclamav can handle almost every mail file format including TNEF (winmail.dat) attachments. |
|
29 |
- |
|
30 |
-### Archives and compressed files |
|
31 |
- |
|
32 |
-The following archive and compression formats are supported by internal handlers: |
|
33 |
- |
|
34 |
-- Zip (+ SFX) |
|
35 |
-- RAR (+ SFX) |
|
36 |
-- 7Zip |
|
37 |
-- Tar |
|
38 |
-- CPIO |
|
39 |
-- Gzip |
|
40 |
-- Bzip2 |
|
41 |
-- DMG |
|
42 |
-- IMG |
|
43 |
-- ISO 9660 |
|
44 |
-- PKG |
|
45 |
-- HFS+ partition |
|
46 |
-- HFSX partition |
|
47 |
-- APM disk image |
|
48 |
-- GPT disk image |
|
49 |
-- MBR disk image |
|
50 |
-- XAR |
|
51 |
-- XZ |
|
52 |
-- MS OLE2 |
|
53 |
-- MS Cabinet Files (+ SFX) |
|
54 |
-- MS CHM (Compiled HTML) |
|
55 |
-- MS SZDD compression format |
|
56 |
-- BinHex |
|
57 |
-- SIS (SymbianOS packages) |
|
58 |
-- AutoIt |
|
59 |
-- NSIS |
|
60 |
-- InstallShield |
|
61 |
- |
|
62 |
-### Documents |
|
63 |
- |
|
64 |
-The most popular file formats are supported: |
|
65 |
- |
|
66 |
-- MS Office and MacOffice files |
|
67 |
-- RTF |
|
68 |
-- PDF |
|
69 |
-- HTML |
|
70 |
- |
|
71 |
-In the case of Office, RTF and PDF files, libclamav will only extract the embedded objects and will not decode the text data itself. The text decoding and normalization is only performed for HTML files. |
|
72 |
- |
|
73 |
-### Data Loss Prevention |
|
74 |
- |
|
75 |
-Libclamav includes a DLP module which can detect the following credit card issuers: AMEX, VISA, MasterCard, Discover, Diner’s Club, and JCB and U.S. social security numbers inside text files. |
|
76 |
- |
|
77 |
-Future versions of Libclamav may include additional features to detect other credit cards and other forms of PII (Personally Identifiable Information) which may be transmitted without the benefit of being encrypted. |
|
78 |
- |
|
79 |
-### Others |
|
80 |
- |
|
81 |
-Libclamav can handle various obfuscators, encoders, files vulnerable to security risks such as: |
|
82 |
- |
|
83 |
-- JPEG (exploit detection) |
|
84 |
-- RIFF (exploit detection) |
|
85 |
-- uuencode |
|
86 |
-- ScrEnc obfuscation |
|
87 |
-- CryptFF |
|
88 |
- |
|
89 |
-## API |
|
90 |
- |
|
91 |
-### Header file |
|
92 |
- |
|
93 |
-Every program using libclamav must include the header file `clamav.h`: |
|
94 |
- |
|
95 |
-```c |
|
96 |
- #include <clamav.h> |
|
97 |
-``` |
|
98 |
- |
|
99 |
-### Initialization |
|
100 |
- |
|
101 |
-Before using libclamav, you should call `cl_init()` to initialize it. `CL_INIT_DEFAULT` is a macro that can be passed to `cl_init()` representing the default initialization settings. When it’s done, you’re ready to create a new scan engine by calling `cl_engine_new()`. To free resources allocated by the engine use `cl_engine_free()`. Function prototypes: |
|
102 |
- |
|
103 |
-```c |
|
104 |
- int cl_init(unsigned int options); |
|
105 |
- struct cl_engine *cl_engine_new(void); |
|
106 |
- int cl_engine_free(struct cl_engine *engine); |
|
107 |
-``` |
|
108 |
- |
|
109 |
-`cl_init()` and `cl_engine_free()` return `CL_SUCCESS` on success or another code on error. `cl_engine_new()` return a pointer or NULL if there’s not enough memory to allocate a new engine structure. |
|
110 |
- |
|
111 |
-### Database loading |
|
112 |
- |
|
113 |
-The following set of functions provides an interface for loading the virus database: |
|
114 |
- |
|
115 |
-```c |
|
116 |
- const char *cl_retdbdir(void); |
|
117 |
- |
|
118 |
- int cl_load(const char *path, struct cl_engine *engine, |
|
119 |
- unsigned int *signo, unsigned int options); |
|
120 |
-``` |
|
121 |
- |
|
122 |
-`cl_retdbdir()` returns the default (hardcoded) path to the directory with ClamAV databases. `cl_load()` loads a single database file or all databases from a given directory (when `path` points to a directory). The second argument is used for passing in the pointer to the engine that should be previously allocated with `cl_engine_new()`. A number of loaded signatures will be **added** to `signo`. The last argument can pass the following flags: |
|
123 |
- |
|
124 |
-- **CL_DB_STDOPT** |
|
125 |
- This is an alias for a recommended set of scan options. |
|
126 |
-- **CL_DB_PHISHING** |
|
127 |
- Load phishing signatures. |
|
128 |
-- **CL_DB_PHISHING_URLS** |
|
129 |
- Initialize the phishing detection module and load .wdb and .pdb |
|
130 |
- files. |
|
131 |
-- **CL_DB_PUA** |
|
132 |
- Load signatures for Potentially Unwanted Applications. |
|
133 |
-- **CL_DB_OFFICIAL_ONLY** |
|
134 |
- Only load official signatures from digitally signed databases. |
|
135 |
-- **CL_DB_BYTECODE** |
|
136 |
- Load bytecode. |
|
137 |
- |
|
138 |
-`cl_load()` returns `CL_SUCCESS` on success and another code on failure. |
|
139 |
- |
|
140 |
-```c |
|
141 |
- ... |
|
142 |
- struct cl_engine *engine; |
|
143 |
- unsigned int sigs = 0; |
|
144 |
- int ret; |
|
145 |
- |
|
146 |
- if((ret = cl_init(CL_INIT_DEFAULT)) != CL_SUCCESS) { |
|
147 |
- printf("cl_init() error: %s\n", cl_strerror(ret)); |
|
148 |
- return 1; |
|
149 |
- } |
|
150 |
- |
|
151 |
- if(!(engine = cl_engine_new())) { |
|
152 |
- printf("Can't create new engine\n"); |
|
153 |
- return 1; |
|
154 |
- } |
|
155 |
- |
|
156 |
- ret = cl_load(cl_retdbdir(), engine, &sigs, CL_DB_STDOPT); |
|
157 |
-``` |
|
158 |
- |
|
159 |
-### Error handling |
|
160 |
- |
|
161 |
-Use `cl_strerror()` to convert error codes into human readable messages. The function returns a statically allocated string: |
|
162 |
- |
|
163 |
-```c |
|
164 |
- if(ret != CL_SUCCESS) { |
|
165 |
- printf("cl_load() error: %s\n", cl_strerror(ret)); |
|
166 |
- cl_engine_free(engine); |
|
167 |
- return 1; |
|
168 |
- } |
|
169 |
-``` |
|
170 |
- |
|
171 |
-### Engine structure |
|
172 |
- |
|
173 |
-When all required databases are loaded you should prepare the detection engine by calling `cl_engine_compile()`. In case of failure you should still free the memory allocated to the engine with `cl_engine_free()`: |
|
174 |
- |
|
175 |
-```c |
|
176 |
- int cl_engine_compile(struct cl_engine *engine); |
|
177 |
-``` |
|
178 |
- |
|
179 |
-In our example: |
|
180 |
- |
|
181 |
-```c |
|
182 |
- if((ret = cl_engine_compile(engine)) != CL_SUCCESS) { |
|
183 |
- printf("cl_engine_compile() error: %s\n", cl_strerror(ret)); |
|
184 |
- cl_engine_free(engine); |
|
185 |
- return 1; |
|
186 |
- } |
|
187 |
-``` |
|
188 |
- |
|
189 |
-### Limits |
|
190 |
- |
|
191 |
-When you create a new engine with `cl_engine_new()`, it will have all internal settings set to default values as recommended by the ClamAV authors. It’s possible to check and modify the values (numerical and strings) using the following set of functions: |
|
192 |
- |
|
193 |
-```c |
|
194 |
-int cl_engine_set_num(struct cl_engine *engine, |
|
195 |
- enum cl_engine_field field, long long num); |
|
196 |
- |
|
197 |
-long long cl_engine_get_num(const struct cl_engine *engine, |
|
198 |
- enum cl_engine_field field, int *err); |
|
199 |
- |
|
200 |
-int cl_engine_set_str(struct cl_engine *engine, |
|
201 |
- enum cl_engine_field field, const char *str); |
|
202 |
- |
|
203 |
-const char *cl_engine_get_str(const struct cl_engine *engine, |
|
204 |
- enum cl_engine_field field, int *err); |
|
205 |
-``` |
|
206 |
- |
|
207 |
-Please don’t modify the default values unless you know what you’re doing. Refer to the ClamAV sources (clamscan, clamd) for examples. |
|
208 |
- |
|
209 |
-### Database checks |
|
210 |
- |
|
211 |
-It’s very important to keep the internal instance of the database up to date. You can watch database changes with the `cl_stat..()` family of functions. |
|
212 |
- |
|
213 |
-```c |
|
214 |
- int cl_statinidir(const char *dirname, struct cl_stat *dbstat); |
|
215 |
- int cl_statchkdir(const struct cl_stat *dbstat); |
|
216 |
- int cl_statfree(struct cl_stat *dbstat); |
|
217 |
-``` |
|
218 |
- |
|
219 |
-Initialization: |
|
220 |
- |
|
221 |
-```c |
|
222 |
- ... |
|
223 |
- struct cl_stat dbstat; |
|
224 |
- |
|
225 |
- memset(&dbstat, 0, sizeof(struct cl_stat)); |
|
226 |
- cl_statinidir(dbdir, &dbstat); |
|
227 |
-``` |
|
228 |
- |
|
229 |
-To check for a change you just need to call `cl_statchkdir` and check its return value (0 - no change, 1 - some change occurred). Remember to reset the `cl_stat` structure after reloading the database. |
|
230 |
- |
|
231 |
-```c |
|
232 |
- if(cl_statchkdir(&dbstat) == 1) { |
|
233 |
- reload_database...; |
|
234 |
- cl_statfree(&dbstat); |
|
235 |
- cl_statinidir(cl_retdbdir(), &dbstat); |
|
236 |
- } |
|
237 |
-``` |
|
238 |
- |
|
239 |
-Libclamav \(\ge0.96\) includes and additional call to check the number of signatures that can be loaded from a given directory: |
|
240 |
- |
|
241 |
-```c |
|
242 |
- int cl_countsigs(const char *path, unsigned int countoptions, |
|
243 |
- unsigned int *sigs); |
|
244 |
-``` |
|
245 |
- |
|
246 |
-The first argument points to the database directory, the second one specifies what signatures should be counted: `CL_COUNTSIGS_OFFICIAL` (official signatures), `CL_COUNTSIGS_UNOFFICIAL` (third party signatures), `CL_COUNTSIGS_ALL` (all signatures). The last argument points to the counter to which the number of detected signatures will be added (therefore the counter should be initially set to 0). The call returns `CL_SUCCESS` or an error code. |
|
247 |
- |
|
248 |
-### Data scan functions |
|
249 |
- |
|
250 |
-It’s possible to scan a file or descriptor using: |
|
251 |
- |
|
252 |
-```c |
|
253 |
- int cl_scanfile(const char *filename, const char **virname, |
|
254 |
- unsigned long int *scanned, const struct cl_engine *engine, |
|
255 |
- unsigned int options); |
|
256 |
- |
|
257 |
- int cl_scandesc(int desc, const char **virname, unsigned |
|
258 |
- long int *scanned, const struct cl_engine *engine, |
|
259 |
- unsigned int options); |
|
260 |
-``` |
|
261 |
- |
|
262 |
-Both functions will store a virus name under the pointer `virname`, the virus name is part of the engine structure and must not be released directly. If the third argument (`scanned`) is not NULL, the functions will increase its value with the size of scanned data (in `CL_COUNT_PRECISION` units). The last argument (`options`) specified the scan options and supports the following flags (which can be combined using bit operators): |
|
263 |
- |
|
264 |
-- **CL_SCAN_STDOPT** |
|
265 |
- This is an alias for a recommended set of scan options. You should use it to make your software ready for new features in the future versions of libclamav. |
|
266 |
-- **CL_SCAN_RAW** |
|
267 |
- Use it alone if you want to disable support for special files. |
|
268 |
-- **CL_SCAN_ARCHIVE** |
|
269 |
- This flag enables transparent scanning of various archive formats. |
|
270 |
-- **CL_SCAN_BLOCKENCRYPTED** |
|
271 |
- With this flag the library will mark encrypted archives as viruses (Encrypted.Zip, Encrypted.RAR). |
|
272 |
-- **CL_SCAN_MAIL** |
|
273 |
- Enable support for mail files. |
|
274 |
-- **CL_SCAN_OLE2** |
|
275 |
- Enables support for OLE2 containers (used by MS Office and .msi files). |
|
276 |
-- **CL_SCAN_PDF** |
|
277 |
- Enables scanning within PDF files. |
|
278 |
-- **CL_SCAN_SWF** |
|
279 |
- Enables scanning within SWF files, notably compressed SWF. |
|
280 |
-- **CL_SCAN_PE** |
|
281 |
- This flag enables deep scanning of Portable Executable files and allows libclamav to unpack executables compressed with run-time unpackers. |
|
282 |
-- **CL_SCAN_ELF** |
|
283 |
- Enable support for ELF files. |
|
284 |
-- **CL_SCAN_BLOCKBROKEN** |
|
285 |
- libclamav will try to detect broken executables and mark them as Broken.Executable. |
|
286 |
-- **CL_SCAN_HTML** |
|
287 |
- This flag enables HTML normalisation (including ScrEnc decryption). |
|
288 |
-- **CL_SCAN_ALGORITHMIC** |
|
289 |
- Enable algorithmic detection of viruses. |
|
290 |
-- **CL_SCAN_PHISHING_BLOCKSSL** |
|
291 |
- Phishing module: always block SSL mismatches in URLs. |
|
292 |
-- **CL_SCAN_PHISHING_BLOCKCLOAK** |
|
293 |
- Phishing module: always block cloaked URLs. |
|
294 |
-- **CL_SCAN_STRUCTURED** |
|
295 |
- Enable the DLP module which scans for credit card and SSN numbers. |
|
296 |
-- **CL_SCAN_STRUCTURED_SSN_NORMAL** |
|
297 |
- Search for SSNs formatted as xx-yy-zzzz. |
|
298 |
-- **CL_SCAN_STRUCTURED_SSN_STRIPPED** |
|
299 |
- Search for SSNs formatted as xxyyzzzz. |
|
300 |
-- **CL_SCAN_PARTIAL_MESSAGE** |
|
301 |
- Scan RFC1341 messages split over many emails. You will need to periodically clean up `$TemporaryDirectory/clamav-partial` directory. |
|
302 |
-- **CL_SCAN_HEURISTIC_PRECEDENCE** |
|
303 |
- Allow heuristic match to take precedence. When enabled, if a heuristic scan (such as phishingScan) detects a possible virus/phish it will stop scan immediately. Recommended, saves CPU scan-time. When disabled, virus/phish detected by heuristic scans will be reported only at the end of a scan. If an archive contains both a heuristically detected virus/phishing, and a real malware, the real malware will be reported. |
|
304 |
-- **CL_SCAN_BLOCKMACROS** |
|
305 |
- OLE2 containers, which contain VBA macros will be marked infected (Heuristics.OLE2.ContainsMacros). |
|
306 |
- |
|
307 |
-All functions return `CL_CLEAN` when the file seems clean, `CL_VIRUS` when a virus is detected and another value on failure. |
|
308 |
- |
|
309 |
-```c |
|
310 |
- ... |
|
311 |
- const char *virname; |
|
312 |
- |
|
313 |
- if((ret = cl_scanfile("/tmp/test.exe", &virname, NULL, engine, |
|
314 |
- CL_SCAN_STDOPT)) == CL_VIRUS) { |
|
315 |
- printf("Virus detected: %s\n", virname); |
|
316 |
- } else { |
|
317 |
- printf("No virus detected.\n"); |
|
318 |
- if(ret != CL_CLEAN) |
|
319 |
- printf("Error: %s\n", cl_strerror(ret)); |
|
320 |
- } |
|
321 |
-``` |
|
322 |
- |
|
323 |
-### Memory |
|
324 |
- |
|
325 |
-Because the engine structure occupies a few megabytes of system memory, you should release it with `cl_engine_free()` if you no longer need to scan files. |
|
326 |
- |
|
327 |
-### Forking daemons |
|
328 |
- |
|
329 |
-If you’re using libclamav with a forking daemon you should call `srand()` inside a forked child before making any calls to the libclamav functions. This will avoid possible collisions with temporary filenames created by other processes of the daemon. This procedure is not required for multi-threaded daemons. |
|
330 |
- |
|
331 |
-### clamav-config |
|
332 |
- |
|
333 |
-Use `clamav-config` to check compilation information for libclamav. |
|
334 |
- |
|
335 |
-```bash |
|
336 |
- $ clamav-config --libs |
|
337 |
- -L/usr/local/lib -lz -lbz2 -lgmp -lpthread |
|
338 |
- $ clamav-config --cflags |
|
339 |
- -I/usr/local/include -g -O2 |
|
340 |
-``` |
|
341 |
- |
|
342 |
-### Example |
|
343 |
- |
|
344 |
-You will find an example scanner application in the clamav source package (/example). Provided you have ClamAV already installed, execute the following to compile it: |
|
345 |
- |
|
346 |
-```bash |
|
347 |
- gcc -Wall ex1.c -o ex1 -lclamav |
|
348 |
-``` |
|
349 |
- |
|
350 |
-## CVD format |
|
351 |
- |
|
352 |
-CVD (ClamAV Virus Database) is a digitally signed tarball containing one or more databases. The header is a 512-bytes long string with colon separated fields: |
|
353 |
- |
|
354 |
-```ini |
|
355 |
-ClamAV-VDB:build time:version:number of signatures:functionality |
|
356 |
-level required:MD5 checksum:digital signature:builder name:build time (sec) |
|
357 |
-``` |
|
358 |
- |
|
359 |
-`sigtool --info` displays detailed information on CVD files: |
|
360 |
- |
|
361 |
-```bash |
|
362 |
-$ sigtool -i daily.cvd |
|
363 |
-File: daily.cvd |
|
364 |
-Build time: 10 Mar 2008 10:45 +0000 |
|
365 |
-Version: 6191 |
|
366 |
-Signatures: 59084 |
|
367 |
-Functionality level: 26 |
|
368 |
-Builder: ccordes |
|
369 |
-MD5: 6e6e29dae36b4b7315932c921e568330 |
|
370 |
-Digital signature: zz9irc9irupR3z7yX6J+OR6XdFPUat4HIM9ERn3kAcOWpcMFxq |
|
371 |
-Fs4toG5WJsHda0Jj92IUusZ7wAgYjpai1Nr+jFfXHsJxv0dBkS5/XWMntj0T1ctNgqmiF |
|
372 |
-+RLU6V0VeTl4Oej3Aya0cVpd9K4XXevEO2eTTvzWNCAq0ZzWNdjc |
|
373 |
-Verification OK. |
|
374 |
-``` |
|
375 |
- |
|
376 |
-## Graphics |
|
377 |
- |
|
378 |
-The current ClamAV logo was created by Alicia Willet, Talos. |
|
379 |
- |
|
380 |
-## OpenAntiVirus |
|
381 |
- |
|
382 |
-Our database includes the virus database (about 7000 signatures) from OpenAntiVirus (<http://OpenAntiVirus.org>). |
|
383 |
- |
|
384 |
-1. Subscribers are not allowed to post to the mailing list |
|
385 |
- |
|
386 |
-2. For Windows instructions please see win32/README in the main source code directory. |
|
387 |
- |
|
388 |
-3. See section [3.7](#unit-testing) on how to run the unit tests |
|
389 |
- |
|
390 |
-4. if not available ClamAV will fall back to an interpreter |
|
391 |
- |
|
392 |
-5. Note that several versions of GCC have bugs when compiling LLVM, see <http://llvm.org/docs/GettingStarted.html#brokengcc> for a full list. |
|
393 |
- |
|
394 |
-6. The configure script in ClamAV automatically enables the unit tests, if it finds the check framework, however it doesn’t consider it a fatal error if unit tests cannot be enabled. |
|
395 |
- |
|
396 |
-7. To get more info on clamscan options run ’man clamscan’ |
|
397 |
- |
|
398 |
-8. man 5 clamd.conf |
|
399 |
- |
|
400 |
-9. Remember to initialize the virus counter variable with 0. |