Update development docs based on Micah's feedback

Andrew authored on 2018/10/18 06:39:42
Showing 1 changed files
... ...
@@ -3,7 +3,52 @@ This page aims to provide information useful when developing, debugging, or
3 3
 profiling ClamAV.
4 4
 
5 5
 ## Building ClamAV for Development
6
-Below are some recommendations for building ClamAV so that it's easy to debug:
6
+Below are some recommendations for building ClamAV so that it's easy to debug.
7
+
8
+### Satisfying Build Dependencies
9
+To satisfy all build dependencies:
10
+
11
+#### Debian/Ubuntu
12
+```
13
+sudo apt-get install libxml2-dev libxml2 libbz2-dev bzip2 check make libssl-dev openssl zlib1g zlib1g-dev gcc gettext autoconf automake libtool cmake autoconf-archive pkg-config g++-multilib libmilter1.0.1 libmilter-dev valgrind libcurl4-openssl-dev libjson-c-dev ncurses-dev libpcre3-dev
14
+```
15
+
16
+#### CentOS/RHEL/Fedora
17
+```
18
+sudo yum install libxml2-devel libxml2 bzip2-devel bzip2 check make openssl-devel openssl zlib zlib-devel gcc gettext autoconf automake libtool cmake autoreconf pkg-config g++-multilib sendmail sendmail-devel libtool-ltdl-devel valgrind
19
+
20
+sudo yum groupinstall "Development Tools"
21
+```
22
+
23
+#### Solaris (using OpenCSW)
24
+```
25
+sudo /opt/csw/bin/pkgutil -y -i common coreutils automake autoconf libxml2_2 libxml2_dev bzip2 libbz2_dev libcheck0 libcheck_dev gmake cmake libssl1_0_0 libssl_dev openssl_utils libgcc_s1 libiconv2 zlib1 libstdc++6 libpcre1 libltdl7 lzlib_stub zlib_stub libmilter libtool ggrep gsed pkgconfig ggettext gcc4core gcc4g++ libgcc_s1 libgccpp1
26
+
27
+sudo pkg install system/header
28
+
29
+sudo ln -sf /opt/csw/bin/gnm /usr/bin/nm
30
+sudo ln -sf /opt/csw/bin/gsed /usr/bin/sed
31
+sudo ln -sf /opt/csw/bin/gmake /usr/bin/make
32
+```
33
+If you receive an error message like
34
+`gcc: error: /opt/csw/lib/libstdc++.so: No such file or directory`,
35
+change versions with `/opt/csw/sbin/alternatives --config automake`.
36
+
37
+#### FreeBSD
38
+The easiest way to install dependencies for FreeBSD is to just rely on ports:
39
+```
40
+cd /usr/ports/security/clamav
41
+make
42
+```
43
+
44
+### Download the Source
45
+```
46
+git clone https://github.com/Cisco-Talos/clamav-devel.git
47
+cd clamav-devel
48
+```
49
+
50
+If you intend to make changes and submit a pull request, fork the clamav-devel
51
+repo first and then clone your fork of the repository.
7 52
 
8 53
 ### Running ./configure
9 54
 Suggestions:
... ...
@@ -38,14 +83,10 @@ Suggestions:
38 38
       The json output contains additional metadata that might be helpful when
39 39
       debugging.
40 40
 
41
-    - `--enable-static --disable-shared`: This will only build libclamav and
42
-      the supporting libraries as static libraries, and will result in the
43
-      clamscan that is built having this code embedded.  This is useful for
44
-      running programs like gprof which don't handle profiling code in shared
45
-      objects.
46
-
47 41
     - `--with-systemdsystemunitdir=no`: Don't try to register clamd as a
48
-      systemd service
42
+      systemd service (on systems that use systemd). You likely don't want this
43
+      development build of clamd to register as a service, and this eliminates
44
+      the need to run `make install` with `sudo`.
49 45
 
50 46
     - You might want to include the following flags also so that the optional
51 47
       functionality is enabled: `--enable-experimental --enable-clamdtop
... ...
@@ -53,20 +94,39 @@ Suggestions:
53 53
       Note that this may require you to install additional development
54 54
       libraries.
55 55
 
56
-    - I ran into problems building with llvm on Ubuntu 18.04, so add
57
-      `--disable-llvm`
56
+    - `--disable-llvm`: When enabled, LLVM provides the capability to
57
+      just-in-time compile ClamAV bytecode signatures. Without LLVM, ClamAV
58
+      uses a built-in bytecode interpreter to execute bytecode signatures.
59
+      The mechanism is different, but the results are the same and the performance
60
+      overall is comparable.  At present only LLVM versions up to LLVM 3.6.2
61
+      are supported by ClamAV, and LLVM 3.6.2 is old enough that newer
62
+      distributions no longer provide it. Therefore, we recommend using
63
+      the `--disable-llvm` configure option.
58 64
 
59 65
 Altogether, the following configure command can be used:
60 66
 
61 67
 ```
62
-CFLAGS="-ggdb -O0" ./configure --prefix=`pwd`/built --enable-debug --enable-check --enable-coverage --enable-libjson --enable-static --disable-shared --with-systemdsystemunitdir=no --enable-experimental --enable-clamdtop --enable-libjson --enable-xml --enable-pcre --disable-llvm
63
-```
64
-To satisify all library dependencies, something like this should work
65
-(from Ubuntu 18.04):
66
-```
67
-sudo apt-get install git gcc libxml2-dev libssl-dev make libmilter-dev libcurl4-openssl-dev libjson-c-dev check pkgconf libncurses5-dev libpcre3-dev g++ libtool libbz2-dev
68
+CFLAGS="-ggdb -O0" ./configure --prefix=`pwd`/installed --enable-debug --enable-check --enable-coverage --enable-libjson --with-systemdsystemunitdir=no --enable-experimental --enable-clamdtop --enable-xml --enable-pcre --disable-llvm
68 69
 ```
69 70
 
71
+NOTE: It is possible to build libclamav as a static library and have it
72
+statically linked into clamscan/clamd (to do this, run `./configure` with
73
+`--enable-static --disable-shared`).  This is useful for using tools like gprof
74
+that do not support profiling code in shared objects.  However, there are two
75
+drawbacks to doing this:
76
+
77
+ - clamscan/clamd will not be able to extract files from RAR archives.  Based
78
+   on the software license of the unrar library that ClamAV uses, the library
79
+   can only be dynamically loaded.  ClamAV will attempt to dlopen the unrar
80
+   library shared object and will continue on without RAR extraction support
81
+   if the library can't be found (or if it doesn't get built, which is what
82
+   happens if you indicate that shared libraries should not be built).
83
+
84
+ - If you make changes to libclamav, you'll need to `make clean`, `make`, and
85
+   `make install` again to have clamscan/clamd rebuilt using the new
86
+   libclamav.a.  The makefiles don't seem to know to rebuild clamscan/clamd
87
+   when libclamav.a changes (TODO, fix this).
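The rebuild sequence described in the second drawback can be wrapped in a small shell function. This is only a sketch: `rebuild_static` is a hypothetical name, not something the ClamAV build system provides, and the default of 2 parallel jobs simply mirrors the `-j2` used in the make instructions on this page.

```shell
# Hypothetical helper (not part of ClamAV): force a full rebuild so that
# clamscan/clamd are relinked against the new libclamav.a.
# Usage: rebuild_static [jobs]   (jobs defaults to 2)
rebuild_static() {
    jobs="${1:-2}"
    # Stop at the first failing step so a broken build is never installed.
    make clean && make -j"$jobs" && make install
}
```

Run it from the top of the source tree, for example `rebuild_static 4` on a four-core machine.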
88
+
70 89
 ### Running make
71 90
 Run the following to finish building.  `-j2` in the code below is used to
72 91
 indicate that the build process should use 2 cores.  Increase this if your
... ...
@@ -79,15 +139,18 @@ Also, you can run 'make check' to run the unit tests
79 79
 
80 80
 ### Downloading the Official Ruleset
81 81
 If you plan to use custom rules for testing, you can invoke clamscan via
82
-`./built/bin/clamscan`, specifying your custom rule files via `-d` parameters.
82
+`./installed/bin/clamscan`, specifying your custom rule files via `-d` parameters.
83 83
 
84 84
 If you want to download the official ruleset to use with clamscan, do the
85 85
 following:
86
-1. Run `mkdir -p built/share/clamav`
86
+1. Run `mkdir -p installed/share/clamav`
87 87
 2. Comment out line 8 of etc/freshclam.conf.sample
88
-3. Run `./built/bin/freshclam --config-file etc/freshclam.conf.sample`
88
+3. Run `./installed/bin/freshclam --config-file etc/freshclam.conf.sample`
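The three steps above could be scripted roughly as follows. This is a sketch under a few assumptions: it is run from the top of the source tree after `make install`, line 8 of the sample config is still the line that needs commenting out, and the helper names `comment_out_line` and `fetch_official_rules` are hypothetical.

```shell
# Hypothetical helper: prefix line N of FILE with '#'. It writes through a
# temporary file rather than using `sed -i`, whose syntax differs between
# GNU and BSD sed.
comment_out_line() {
    file="$1"
    line="$2"
    tmp="$(mktemp)" || return 1
    sed "${line}s/^/#/" "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Hypothetical wrapper for the three steps; assumes the current directory
# is the top of the source tree and the build was installed to ./installed.
fetch_official_rules() {
    mkdir -p installed/share/clamav &&
    comment_out_line etc/freshclam.conf.sample 8 &&
    ./installed/bin/freshclam --config-file etc/freshclam.conf.sample
}
```

Note that calling `fetch_official_rules` more than once adds another `#` to line 8 each time, which is harmless for a commented-out line.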
89 89
 
90 90
 ## General Debugging
91
+NOTE: Some of the debugging/profiling tools mentioned in the sections below are
92
+specific to Linux.
93
+
91 94
 ### Useful clamscan Flags
92 95
 The following are useful flags to include when debugging clamscan:
93 96
 
... ...
@@ -105,7 +168,8 @@ The following are useful flags to include when debugging clamscan:
105 105
   an executable is determined to be broken, some functionality might not get
106 106
   invoked for the sample, and this could be an indication of an issue parsing
107 107
  the PE header or file.  This causes those binaries to generate an alert instead
108
-  of just continuing on.
108
+  of just continuing on.  NOTE: This will be renamed to `--alert-broken`
109
+  starting in ClamAV 0.101.
109 110
 
110 111
 - `--max-filesize=2000M --max-scansize=2000M --max-files=2000000
111 112
    --max-recursion=2000000 --max-embeddedpe=2000M --max-htmlnormalize=2000000
... ...
@@ -140,42 +204,11 @@ writing:
140 140
   about the certificates stored within the binary.  Note - sigtool has this
141 141
   functionality as well and doesn't require a rule match to view the cert data
142 142
 
143
-### Useful sigtool Flags
144
-sigtool pulls in libclamav and provides shortcuts to doing tasks that clamscan
145
-does behind the scenes.  These can be really useful when writing a signature or
146
-trying to get information about a signature that might be causing FPs or
147
-performance problems.
148
-
149
-The following sigtool flags can be useful when debugging:
150
-
151
-- `--unpack`: Unpack the specified CVD/CLD file
152
-
153
-- `--decode`: Given a ClamAV signature from STDIN, show a more user-friendly
154
-  representation of it
155
-
156
-- `--hex-dump`: Given a sequence of bytes from STDIN, print the hex equivalent
157
-
158
-- `--mdb`: Generate section hashes of the specified file
159
-
160
-- `--imp`: Generate import hashes of the specified file
161
-
162
-- `--html-normalise`: Normalize the specified HTML file in the way that
163
-  clamscan will before looking for rule matches.  This makes it either to write
164
-  rules that will actually match.
165
-
166
-- `--ascii-normalise`: Normalized the specified ASCII text file in the way that
167
-  clamscan will before looking for rule matches
168
-
169
-- `--print-certs`: Print the Authenticode signatures of any PE files specified.
170
-  This is useful when writing signature-based .crb rule files.
171
-
172
-- `--vba`: Extract VBA/Word6 macro code
173
-
174 143
 ### Using gdb
175 144
 Given that you might want to pass a lot of arguments to gdb, consider taking
176 145
 advantage of the `--args` parameter.  For example:
177 146
 ```
178
-gdb --args ./built/bin/clamscan -d /tmp/test.ldb -d /tmp/blacklist.crb -d --dumpcerts --debug --verbose --max-filesize=2000M --max-scansize=2000M --max-files=2000000 --max-recursion=2000000 --max-embeddedpe=2000M --max-iconspe=2000000 f8f101166fec5785b4e240e4b9e748fb6c14fdc3cd7815d74205fc59ce121515
147
+gdb --args ./installed/bin/clamscan -d /tmp/test.ldb -d /tmp/blacklist.crb --dumpcerts --debug --verbose --max-filesize=2000M --max-scansize=2000M --max-files=2000000 --max-recursion=2000000 --max-embeddedpe=2000M --max-iconspe=2000000 f8f101166fec5785b4e240e4b9e748fb6c14fdc3cd7815d74205fc59ce121515
179 148
 ```
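If you find yourself retyping the same long limit flags, one option is to keep them in a variable and wrap the gdb invocation in a function. The sketch below is an illustration only: `SCAN_LIMITS` and `debug_scan` are hypothetical names, and the flag values are copied from the example above.

```shell
# Permissive scan limits taken from the gdb example above.
SCAN_LIMITS="--max-filesize=2000M --max-scansize=2000M --max-files=2000000 --max-recursion=2000000 --max-embeddedpe=2000M --max-iconspe=2000000"

# Hypothetical wrapper: debug_scan RULE_FILE SAMPLE
debug_scan() {
    # $SCAN_LIMITS is intentionally unquoted so that the shell splits it
    # into separate arguments.
    gdb --args ./installed/bin/clamscan -d "$1" --debug --verbose $SCAN_LIMITS "$2"
}
```

For example, `debug_scan /tmp/test.ldb /path/to/sample` starts gdb with clamscan and all of the limits already on the command line.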
180 149
 
181 150
 When using ClamAV without libclamav statically linked, if you set breakpoints
... ...
@@ -195,12 +228,12 @@ If checking for leaks, be sure to run clamscan with samples that will hit as
195 195
 many of the unique code paths in the code you are testing.  An example
196 196
 invocation is as follows:
197 197
 ```
198
-valgrind --leak-check=full ./built/bin/clamscan -d /tmp/test.ldb --leave-temps --tempdir /tmp/test --debug --verbose /tmp/upx-samples/ > /tmp/upx-results-2.txt 2>&1
198
+valgrind --leak-check=full ./installed/bin/clamscan -d /tmp/test.ldb --leave-temps --tempdir /tmp/test --debug --verbose /tmp/upx-samples/ > /tmp/upx-results-2.txt 2>&1
199 199
 ```
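Because the invocation above redirects everything to a results file, it can be handy to check that file for leaks without reading it in full. The following is a minimal sketch; `has_definite_leaks` is a hypothetical helper, and it relies on valgrind's leak summary containing a `definitely lost:` line followed by a byte count.

```shell
# Hypothetical helper: succeeds (exit status 0) if the given valgrind log
# reports a non-zero "definitely lost" total, and fails otherwise.
has_definite_leaks() {
    grep -Eq 'definitely lost: [1-9]' "$1"
}
```

Usage might look like `has_definite_leaks /tmp/upx-results-2.txt && echo "leaks found"`.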
200 200
 Alternatively, on Linux, you can use glibc's built-in heap consistency checking
201 201
 functionality:
202 202
 ```
203
-MALLOC_CHECK_=7 ./built/bin/clamscan
203
+MALLOC_CHECK_=7 ./installed/bin/clamscan
204 204
 ```
205 205
 See the [mallopt man page](http://manpages.ubuntu.com/manpages/trusty/man3/mallopt.3.html) for more details.
206 206
 
... ...
@@ -251,7 +284,7 @@ $ sudo su
251 251
 Invoke clamscan via perf record as follows, and run perf script to collect the
252 252
 profiling data:
253 253
 ```
254
-perf record -F 100 -g -- ./built/bin/clamscan -d /tmp/test.ldb /tmp/2aa6b18d509090c60c3e4ecdd8aeb16e5f149807e3404c86892112710eab576d
254
+perf record -F 100 -g -- ./installed/bin/clamscan -d /tmp/test.ldb /tmp/2aa6b18d509090c60c3e4ecdd8aeb16e5f149807e3404c86892112710eab576d
255 255
 perf script > out.perf
256 256
 ```
257 257
 The '-F' parameter indicates how many samples should be collected during