2018-10-20 Stuart Caie * src/chmextract.c: add anti "../" and leading slash protection to chmextract. I'm not pleased about this. All the sample code provided with libmspack is meant to be simple examples of library use, not "productised" binaries. Making the "useful" code samples install as binaries was a mistake. They were never intended to protect you from unpacking archive files with relative/absolute paths, and I would prefer that they never will be. 2018-10-17 Stuart Caie * cab.h: Make the CAB block input buffer one byte larger, to allow a maximum-allowed-size input block and the special extra byte added after the block by cabd_sys_read_block to help Quantum alignment. Thanks to Henri Salo for reporting this. 2018-10-17 Stuart Caie * chmd_read_headers(): again reject files with blank filenames, this time because their 1st or 2nd byte is null, not because their length is zero. Thanks again to Hanno Böck for finding the issue. 2018-10-16 Stuart Caie * Makefile.am: using automake _DEPENDENCIES for chmd_test appears to override the default dependencies (e.g. sources), so libchmd.la was no longer considered a dependency of chmd_test. This breaks parallel builds like "make -j4". Added libchmd.la explicitly to dependencies. Thanks to Thomas Deutschmann for reporting this. 2018-10-16 Stuart Caie * cabd.c: add new parameter, MSCABD_PARAM_SALVAGE, which makes CAB file reading and extraction more lenient, to allow damaged or mangled CABs to be extracted. When enabled: - cabd->open() won't reject cabinets with files that have invalid folder indices or filenames. These files will simply be skipped - cabd->extract() won't reject files with invalid lengths, but will limit them to the maximum possible - block output sizes over 32768 bytes won't be rejected - invalid data block checksums won't be rejected It's still possible for corrupted files to fail extraction, but more data can be extracted before they do. This new parameter doesn't affect the existing MSCABD_PARAM_FIXMSZIP parameter, which ignores MSZIP decompression failures. You can enable both at once. Thanks to Micah Snyder from ClamAV for working with me to get this feature into libmspack. This also helps ClamAV move towards using a vanilla copy of libmspack without needing their own patchset. 2018-08-13 Stuart Caie * mspack.h: clarify that mspack_system.free() should allow NULL. If your mspack_system implementation doesn't, it would already have crashed, as there are several places where libmspack calls sys->free(NULL). This change makes it official, and amends a few "if (x) sys->free(x)" cases to the simpler "sys->free(x)" to make it clearer. 2018-08-09 Stuart Caie * Makefile.am: the test file cve-2015-4467-reset-interval-zero.chm is detected by ClamAV as BC.Legacy.Exploit.CVE_2012_1458-1 "infected". My hosting deletes anything that ClamAV calls "infected", so has been continually deleting the official libmspack 0.7alpha release. CVE-2012-1458 is the same issue as CVE-2015-4467: both libmspack, and ClamAV using libmspack, could get a division-by-zero crash when the LZX reset interval was zero. This was fixed years ago, but ClamAV still has it as a signature, which today prevents me from releasing libmspack. BC.Legacy.Exploit.CVE_2012_1458-1 is a bytecode signature, so I can't see the exact trigger conditions, but I can see that it looks for the "LZXC" signature of the LZX control file, so I've changed this to "lzxc" and added a step in the Makefile to change it back to LZXC, so I can release libmspack whether or not ClamAV keeps the signature. 2018-04-26 Stuart Caie * read_chunk(): the test that chunk numbers are in bounds was off by one, so read_chunk() returned a pointer taken from outside allocated memory that usually crashes libmspack when accessed. Thanks to Hanno Böck for finding the issue and providing a sample. * chmd_read_headers(): reject files with blank filenames. Thanks again to Hanno Böck for finding the issue and providing a sample file. 2018-02-06 Stuart Caie * chmd.c: fixed an off-by-one error in the TOLOWER() macro, reported by Dmitry Glavatskikh. Thanks Dmitry! 2017-11-26 Stuart Caie * kwajd_read_headers(): fix up the logic of reading the filename and extension headers to avoid a one or two byte overwrite. Thanks to Jakub Wilk for finding the issue. * test/kwajd_test.c: add tests for KWAJ filename.ext handling 2017-10-16 Stuart Caie * test/cabd_test.c: update the short string tests to expect not only MSPACK_ERR_DATAFORMAT but also MSPACK_ERR_READ, because of the recent change to cabd_read_string(). Thanks to maitreyee43 for spotting this. * test/msdecompile_md5: update the setup instructions for this script, and also change the script so it works with current Wine. Again, thanks to maitreyee43 for trying to use it and finding it not working. 2017-08-13 Stuart Caie * src/chmextract.c: support MinGW one-arg mkdir(). Thanks to AntumDeluge for reporting this. 2017-08-13 Stuart Caie * read_spaninfo(): a CHM file can have no ResetTable and have a negative length in SpanInfo, which then feeds a negative output length to lzxd_init(), which then sets frame_size to a value of your choosing, the lower 32 bits of output length, larger than LZX_FRAME_SIZE. If the first LZX block is uncompressed, this writes data beyond the end of the window. This issue was raised by ClamAV as CVE-2017-6419. Thanks to Sebastian Andrzej Siewior for finding this by chance! * lzxd_init(), lzxd_set_output_length(), mszipd_init(): due to the issue mentioned above, these functions now reject negative lengths 2017-08-05 Stuart Caie * cabd_read_string(): add missing error check on result of read(). If an mspack_system implementation returns an error, it's interpreted as a huge positive integer, which leads to reading past the end of the stack-based buffer. Thanks to Sebastian Andrzej Siewior for explaining the problem. This issue was raised by ClamAV as CVE-2017-11423 2016-04-20 Stuart Caie * configure.ac: change my email address to kyzer@cabextract.org.uk 2015-05-10 Stuart Caie * cabd_read_string(): correct rejection of empty strings. Thanks to Hanno Böck for finding the issue and providing a sample file. 2015-05-10 Stuart Caie * Makefile.am: Add subdir-objects option as suggested by autoreconf. * configure.ac: Add AM_PROG_AR as suggested by autoreconf. 2015-01-29 Stuart Caie * system.h: if C99 inttypes.h exists, use its PRI{d,u}{32,64} macros. Thanks to Johnathan Kollasch for the suggestion. 2015-01-18 Stuart Caie * lzxd_decompress(): the byte-alignment code for reading uncompressed block headers presumed it could wind i_ptr back 2 bytes, but this hasn't been true since READ_BYTES was allowed to read bytes straddling two blocks, leaving just 1 byte in the read buffer. Thanks to Jakub Wilk for finding the issue and providing a sample file. * inflate(): off-by-one error. Distance codes are 0-29, not 0-30. Thanks to Jakub Wilk again. * chmd_read_headers(), search_chunk(): another fix for checking pointer is within a chunk, thanks again to Jakub Wilk. 2015-01-17 Stuart Caie * GET_UTF8_CHAR(): Remove 5/6-byte encoding support and check decoded chars are no more than U+10FFFF. * chmd_init_decomp(): A reset interval of 0 is invalid. Thanks to Jakub Wilk for finding the issue and providing a sample and patch. 2015-01-15 Stuart Caie * chmd_read_headers(): add a bounds check to prevent over-reading data, which caused a segfault on 32-bit architectures. Thanks to Jakub Wilk. * search_chunk(): change the order of pointer arithmetic operations to avoid overflow during bounds checks, which lead to segfaults on 32-bit architectures. Again, thanks to Jakub Wilk for finding this issue, providing sample files and a patch. 2015-01-08 Stuart Caie * cabd_extract(): No longer uses broken state data if extracting from folder 1, 2, 1 and setting up folder 2 fails. This prevents a jump to null and thus segfault. Thanks to Jakub Wilk again. * cabd_read_string: reject empty strings. They are not found in any valid CAB files. Thanks to Hanno Böck for sending me an example. 2015-01-05 Stuart Caie * cabd_can_merge_folders(): disallow folder merging if the combined folder would have more than 65535 data blocks. * cabd_decompress(): disallow files if their offset, length or offset+length is more than 65535*32768, the maximum size of any folder. Thanks to Jakub Wilk for identifying the problem and providing a sample file. 2014-04-20 Stuart Caie * readhuff.h: fixed the table overflow check, which allowed one more code after capacity had been reached, resulting in a read of uninitialized data inside the decoding table. Thanks to Denis Kroshin for identifying the problem and providing a sample file. 2013-05-27 Stuart Caie * test/oabx.c: added new example command for unpacking OAB files. 2013-05-17 Stuart Caie * mspack.h: Support for decompressing a new file format, the Exchange Offline Address Book (OAB). Thanks to David Woodhouse for writing the implementation. I've bumped the version to 0.4alpha in celebration. 2012-04-15 Stuart Caie * chmd_read_headers(): More thorough validation of CHM header values. Thanks to Sergei Trofimovich for finding sample files. * read_reset_table(): Better test for overflow. Thanks again to Sergei Trofimovich for generating a good example. * test/chminfo.c: this test program reads the reset table by itself and was also susceptible to the same overflow problems. 2012-03-16 Stuart Caie * Makefile.am, configure.ac: make the GCC warning flags conditional on using the GCC compiler. Thanks to Dagobert Michelsen for letting me know. 2011-11-25 Stuart Caie * lzxd_decompress(): Prevent matches that go beyond the start of the LZX stream. Thanks to Sergei Trofimovich for testing with valgrind and finding a corrupt sample file that exercises this scenario. 2011-11-23 Stuart Caie * chmd_fast_find(): add a simple check against infinite PMGL loops. Thanks to Sergei Trofimovich for finding sample files. Multi-step PMGL/PMGI infinite loops remain possible. 2011-06-17 Stuart Caie * read_reset_table(): wasn't reading the right offset for getting the LZX uncompressed length. Thanks to Sergei Trofimovich for finding the bug. 2011-05-31 Stuart Caie * kwajd.c, mszipd.c: KWAJ type 4 files (MSZIP) are now supported. Thanks to Clive Turvey for sending me the format details. * doc/szdd_kwaj_format.html: Updated documentation to cover KWAJ's MSZIP compression. 2011-05-11 Stuart Caie * cabd_find(): rethought how large vs small file support is handled, as users were getting "library not compiled to support large files" message on some small files. Now checks for actual off_t overflow, rather than trying to preempt it. 2011-05-10: Stuart Caie * chmd.c: implemented fast_find() * test/chmx.c: removed the multiple extraction orders, now it just extracts in the fastest order * test/chmd_order.c: new program added to test that different extraction orders don't affect the results of extraction * test/chmd_find.c: new program to test that fast_find() works. Either supply your own filename to find, or it will try finding every file in the CHM. * configure.ac: because CHM fast find requires case-insensitive comparisons, tolower() or towlower() are used where possible. These functions and their headers are checked for. * mspack.h: exposed struct mschmd_sec_mscompressed's spaninfo and struct mschmd_header's first_pmgl, last_pmgl and chunk_cache to the world. Check that the CHM decoder version is v2 or higher before using them. * system.c: set CHM decoder version to v2 2011-04-27: Stuart Caie * many files: Made C++ compilers much happier with libmspack. Changed char * to const char * where possible. * mspack.h: Changed user-supplied char * to const char *. Unless you've written your own mspack_system implementation, you will likely be unaffected. If you have written your own mspack_system implementation: 1: change open() so it takes a const char *filename 2: change message() so it takes a const char *format If you cast your function into the mspack_system struct, you can change the cast instead of the function. 2011-04-27: Stuart Caie * Makefile.am: changed CFLAGS from "-Wsign-compare -Wconversion -pedantic" to "-W -Wno-unused". This enables more warnings, and disables these specific warnings which are now a hinderance. 2011-04-27: Stuart Caie * test/cabrip.c, test/chminfo.c: used macros from system.h for printing offsets and reading 64-bit values, rather than reinvent the wheel. * cabd_can_merge_folders(): declare variables at the start of a block so older C compilers won't choke. * cabd_find(): avoid compiler complaints about non-initialised variables. We know they'll get initialised before use, but the compiler can't reverse a state machine to draw the same conclusion. 2011-04-26: Stuart Caie * configure.ac, mspack/system.h: Added a configure test to get the size of off_t. If off_t is 8 bytes or more, we presume this system has large file support. This fixes LFS detection for Fedora x86_64 and Darwin/Mac OS X, neither of which declare FILESIZEBITS in . It's not against the POSIX standard to do this: "A definition of [FILESIZEBITS] shall be omitted from the header on specific implementations where the corresponding value is equal to or greater than the stated minimum, but where the value can vary depending on the file to which it is applied." (http://pubs.opengroup.org/onlinepubs/009695399/basedefs/limits.h.html) Thanks to Edward Sheldrake for the patch. 2011-04-26: Stuart Caie * chmd.c: all 64-bit integer reads are now consolidated into the read_off64() function * chmd_read_headers(): this function has been made resilient against accessing memory past the end of a chunk. Thanks to Sergei Trofimovich for sending me examples and analysis. * chmd_init_decomp(): this function now reads the SpanInfo file if the ResetTable file isn't available, it also checks that each system file it needs is large enough before accessing it, and some of its code has been split into several new functions: find_sys_file(), read_reset_table() and read_spaninfo() 2011-04-26: Stuart Caie * mspack.h, chmd.c: now reads the SpanInfo system file if the ResetTable file isn't available. This adds a new spaninfo pointer into struct mschmd_sec_mscompressed 2011-04-26: Stuart Caie * test/chminfo.c: more sanity checks for corrupted CHM files where entries go past the end of a PMGL/PMGI chunk, thanks to Sergei Trofimovich for sending me examples and analysis. 2011-04-25: Stuart Caie * cabd_merge(): Drew D'Addesio showed me spanning cabinets which don't have all the CFFILE entries they should, but otherwise have all necessary data for extraction. Changed the merging folders test to be less strict; if folders don't exactly match, warn which files are missing, but allow merging if at least one necessary file is present. 2010-09-24: Stuart Caie * readhuff.h: Don't let build_decode_table() allow empty trees. It's meant to be special case just for the LZX length tree, so move that logic out to the LZX code. Thanks to Danny Kroshin for discovering the bug. * lzxd.c: Allow empty length trees, but not other trees. If the length tree is empty, fail if asked to decode a length symbol. Again, thanks to Danny Kroshin for discovering the bug. 2010-09-20: Stuart Caie * Makefile.am: Set EXTRA_DIST so it doesn't include .svn directories in the distribution, but does include docs. 2010-09-20: Stuart Caie * Makefile.am, configure.ac: Use modern auto* practises; turn on automake silent rules where possible, use "m4" directory for libtool macros, use LT_INIT instead of AC_PROG_LIBTOOL and use AM_CPPFLAGS instead of INCLUDES. Thanks to Sergei Trofimovich for the patch. 2010-09-15: Stuart Caie * many files: Made the code compile with C++ - Renamed all 'this' variables/parameters to 'self' - Added casts to all memory allocations. - Added extern "C" to header files with extern declarations. - Made system.c include system.h. - Changed the K&R-style headers to ANSI-style headers in md5.c 2010-08-04: Stuart Caie * many files: removed unnecessary include 2010-07-19: Stuart Caie * cabd_md5.c, chmd_md5.c: Replace writing files to disk then MD5summing them, with an MD5summer built into mspack_system. Much, much faster results. * qtmd_decompress(): Robert Riebisch pointed out a Quantum data integrity check that could never be tripped, because frame_todo is unsigned, so it will never be decremented below zero. Replaced the check with one that assumes that decrementing past zero wraps frame_todo round to a number more than its maximum value (QTM_FRAME_SIZE). 2010-07-18: Stuart Caie * cabd.c: Special logic to pass cabd_sys_read() errors back to cabd_extract() wasn't compatible with the decompressor logic of returning the same error repeatedly once unpacking fails. This meant that if decompressing failed because of a read error, then the next file in the same folder would come back as "no error", but the decompressed wouldn't have even attempted to decompress the file. Added a new state variable, read_error, with the same lifespan as a decompressor, to pass the underlying reason for MSPACK_ERR_READ errors back. * mszipd.c: improve MS-ZIP recovery by saving all the bytes decoded prior to a block failing. This requires remembering how far we got through the block, so the code has been made slightly slower (about 0.003 seconds slower per gigabyte unpacked) by removing the local variable window_posn and keeping it in the state structure instead. 2010-07-16: Stuart Caie * Makefile.am: strange interactions. When -std=c99 is used, my Ubuntu's (libc6-dev 2.11.1-0ubuntu7.2) does NOT define fseeko() unless _LARGEFILE_SOURCE is also defined. But configure always uses -std=gnu99, not -std=c99, so its test determines _LARGEFILE_SOURCE isn't needed but HAVE_FSEEKO is true. The implicit fseeko definition has a 32-bit rather than 64-bit offset, which means the mode parameter is interpreted as part of the offset, and the mode is taken from the stack, which is generally 0 (SEEK_SET). This breaks all SEEK_CURs. The code works fine when -std=c99 is not set, so just remove it for the time being. 2010-07-12: Stuart Caie * system.c: Reject reading/writing a negative number of bytes. * chmd.c: allow zero-length files to be seen. Previously they were skipped because they were mistaken for directory entries. 2010-07-08: Stuart Caie * qtmd.c: Larry Frieson found an important bug in the Quantum decoder. Window wraps flush all unwritten data to disk. However, sometimes less data is needed, which makes out_bytes negative, which is then passed to write(). Some write() implementations treat negative sizes it as a large positive integer and segfault trying to write the buffer. * Makefile.am, test/*.c: fixed automake file so that the package passes a "make distcheck". 2010-07-07: Stuart Caie * doc/szdd_kwaj_format.html: explain SZDD/KWAJ file format. * lzssd.c: fixed SZDD decompression bugs. * test/chmd_compare: Add scripts for comparing chmd_md5 against Microsoft's own code. * test/chmd_md5.c: remove the need to decompress everything twice, as this is already in chmx.c if needed. 2010-07-06: Stuart Caie * many files: added SZDD and KWAJ decompression support. 2010-06-18: Stuart Caie * system.h: expanded the test for 64-bit largefile support so it also works on 64-bit native operating systems where you don't have to define _FILE_OFFSET_BITS. 2010-06-17: Stuart Caie * libmspack.pc.in: Added pkg-config support. Thanks to Patrice Dumas for the patch. 2010-06-14: Stuart Caie * qtmd.c, lzxd.c, mszipd.c: created new headers, readbits.h and readhuff.h, which bundle up the bit-reading and huffman-reading code found in the MSZIP, LZX and Quantum decoders. 2010-06-11: Stuart Caie * qtmd_static_init(): Removed function in favour of static const tables, same rationale as for lzxd_static_init(). * qtmd_read_input(), zipd_read_input(): After testing against my set of CABs from the wild, I've found both these functions _need_ an extra EOF flag, like lzxd_read_input() has. So I've added it. This means CABs get decoded properly AND there's no reading fictional bytes. 2010-06-03: Stuart Caie * test/cabd_md5.c: updated this so it has better output and doesn't need to be in the same directory as the files for multi- part sets. 2010-05-20: Stuart Caie * qtmd_read_input(), zipd_read_input(): Both these functions are essentially copies of lzxd_read_input(), but that has a feature they don't have - an extra EOF flag. So if EOF is encountered (sys->read() returns 0 bytes), these don't pass on the error. Their respective bit-reading functions that called them then go on to access at least one byte of the input buffer, which doesn't exist as sys->read() returned 0. Thanks to Michael Vidrevich for spotting this and providing a test case. 2010-05-20: Stuart Caie * system.h: It turns out no configure.ac tests are needed to decide between __func__ and __FUNCTION__, so I put the standard one (__func__) back into the D() macro, along with some special-case ifdefs for old versions of GCC. * lzxd_static_init(): Removed function in favour of static const tables. Jorge Lodos thinks it causes multithreading problems, I disagree. However, there are speed benefits to declaring the tables as static const. * cabd_init_decomp(): Fixed code which never runs but would write to a null pointer if it could. Changed it to an assert() as it will only trip if someone rewrites the internals of cabd.c. Thanks to Jorge Lodos for finding it. * inflate(): Fixed an off-by-one error: if the LITERAL table emitted code 286, this would read one byte past the end of lit_extrabits[]. Thanks to Jorge Lodos for finding it. 2010-05-06: Stuart Caie * test/cabrip.c, test/chminfo.c: add fseeko() support 2009-06-01: Stuart Caie * README: clarify the extended license terms * doc, Makefile.am: make the doxygen makefile work when using an alternate build directory 2006-09-20: Stuart Caie * system.h: I had a choice of adding more to configure.ac to test for __func__ and __FUNCTION__, or just removing __FUNCTION__ from the D() macro. I chose the latter. * Makefile.am: Now the --enable-debug in configure will actually apply -DDEBUG to the sources. 2006-09-20: Stuart Caie * qtmd_decompress(): Fixed a major bug in the QTM decoder, as reported by Tomasz Kojm last year. Removed the restriction on window sizes as a result. Correctly decodes the XLVIEW cabinets. 2006-08-31: Stuart Caie * lzxd_decompress(): Two major bugs fixed. Firstly, the R0/R1/R2 local variables weren't set to 1 after lzxd_reset_state(). Secondly, the LZX decompression stream can sometimes become odd-aligned (after an uncompressed block) and the next 16 bit fetch needs to be split across two input buffers, ENSURE_BITS() didn't cover this case. Many thanks to Igor Glucksmann for discovering both these bugs. 2005-06-30: Stuart Caie * cabd_search(): fixed problems with searching files > 4GB for cabinets. 2005-06-23: Stuart Caie * qtmd_init(): The QTM decoder is broken for QTM streams with a window size less than the frame size. Until this is fixed, fail to initialise QTM window sizes less than 15. Thanks to Tomasz Kojm for finding the bug. 2005-03-22: Stuart Caie * system.h: now undefs "read", as the latest glibc defines read() as a macro which messes everything up. Thanks to Ville Skyttä for the update. 2005-03-14: Stuart Caie * test/multifh.c: write an mspack_system implementation that can handle normal disk files, open file handles, open file descriptors and raw memory all at the same time. 2005-02-24: Stuart Caie * chmd_read_headers(): avoid infinite loop when chmhs1_ChunkSize is zero. Thanks to Serge Semashko for the research and discovery. 2005-02-18: Stuart Caie * mspack.h: renamed the "interface" parameter of mspack_version() to "entity", as interface is a reserved word in C++. Thanks to Yuriy Z for the discovery. 2004-12-09: Stuart Caie * lzss.h, szdd.h, szddd.h: more work on the SZDD/LZSS design. 2004-06-12: Stuart Caie * lzxd_static_init(): removed write to lzxd_extra_bits[52], thanks to Nigel Horne from the ClamAV project. 2004-04-23: Stuart Caie * mspack.h: changed 'this' parameters to 'self' to allow compiling in C++ compilers, thanks to Michal Cihar for the suggestion. * mspack.h, system.h, mspack.def, winbuild.sh: integrated some changes from Petr Blahos to let libmspack build as a Win32 DLL. * chmd_fast_find(): added the first part of this code, and comments sufficient to finish it :) 2004-04-08 Stuart Caie * test/chminfo.c: added a program for dumping useful data from CHM files, e.g. index entries and reset tables. I wrote this a while ago for investigating a corrupt cabinet, but I never committed it. 2004-03-26 Stuart Caie * test/cabd_memory.c: added a new test example which shows an mspack_system implementation that reads and writes from memory only, no file I/O. Even the source code has a little cab file embedded in it. 2004-03-10 Stuart Caie * cabd.c: updated the location of the CAB SDK. * cabd.c: changed a couple of MSPACK_ERR_READ errors not based on read() failures into MSPACK_ERR_DATAFORMAT errors. * mszipd_decompress(): repair mode now aborts after writing a repaired block if the error was a hard error (e.g. read error, out of blocks, etc) 2004-03-08 Stuart Caie * Makefile.am: now builds and installs a versioned library. * mszipd.c: completed a new MS-ZIP and inflate implementation. * system.c: added mspack_version() and committed to a versioned ABI for the library. * cabd.c: made mszip repair functionality work correctly. * cabd.c: now identifies invalid block headers * doc/: API documentation is now included with the library, not just on the web. * chmd.c: fixed error messages and 64-bit debug output. * chmd.c: now also catches NULL files in section 1. * test/chmx.c: now acts more like cabextract. 2003-08-29 Stuart Caie * ChangeLog: started keeping a ChangeLog :)