bulk_extractor

This is the development tree. Production downloads are at:



bulk_extractor - Release 2.1.1 Latest Release

Published by simsong 6 months ago

  • Renames the jpeg_carved feature extractor to jpeg, so that the flag -S jpeg_carve_mode=2 enables carving of all contiguous JPEGs.
  • Adds new help text to bulk_extractor -h that explains all available carve modes.

bulk_extractor -

Published by simsong 9 months ago

The digital forensics tool bulk_extractor version 2.1.0 is now available for general use.

Release download point:
https://github.com/simsong/bulk_extractor/releases

GIT repository:
https://github.com/simsong/bulk_extractor

I am pleased to announce the general availability of bulk_extractor version 2.1. This is the first release of bulk_extractor version 2 that is recommended for general use.

Bulk_extractor 2 is a significant rewrite of bulk_extractor. Version 2 significantly improves on the performance and portability of version 1. The rewrite started in 2016 and was largely completed by January 2021.

Details of the rewrite, including a detailed report of the performance improvements and lessons learned, can be found in "Sharpening Your Tools: Updating bulk_extractor for the 2020s," Simson Garfinkel and Jon Stewart, Communications of the ACM, August 2023.

Version 2.1 corrects a problem with the string search scanner that caused bulk_extractor to hang on open-ended regular expressions such as [a-z]*@company.com specified with the -F flag. We have replaced the C++17 regex compiler with Google's RE2 regex compiler, which avoids backtracking. As a result, these open-ended regular expressions no longer hang.
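
To illustrate why a matcher that never backtracks stays fast on a pattern like this, here is a minimal Python sketch. It is not RE2's algorithm and not bulk_extractor code: it is a hand-written single pass for this one pattern, the name scan_once is hypothetical, and the dot in company.com is treated as a literal dot (an assumption). Python's re module is a backtracking engine, so it is used here only as a reference to check the results.

```python
import re

# Reference pattern for comparison; Python's re engine backtracks.
PATTERN = re.compile(rb"[a-z]*@company\.com")

def scan_once(buf: bytes, domain: bytes = b"@company.com"):
    """Return (offset, match) pairs for [a-z]*@company\\.com in one pass.

    The scan remembers where the current run of lowercase letters began,
    so it never has to go back and retry earlier start positions.
    """
    hits = []
    run_start = 0          # start of the current run of [a-z] bytes
    i = 0
    n = len(buf)
    while i < n:
        if buf.startswith(domain, i):
            # The run of letters (possibly empty) plus the domain is a match.
            hits.append((run_start, buf[run_start:i + len(domain)]))
            i += len(domain)
            run_start = i  # the next match cannot overlap this one
        elif 97 <= buf[i] <= 122:   # ord('a') .. ord('z')
            i += 1                  # extend the current run
        else:
            i += 1
            run_start = i           # any other byte ends the run
    return hits

if __name__ == "__main__":
    sample = b"mail bob@company.com and alice@company.com"
    print(scan_once(sample))
```

On a long run of lowercase letters with no @ in it, a backtracking engine retries the greedy [a-z]* from every start position, while the single pass above touches each byte a bounded number of times.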

2.0/2.1 Improvements over Version 1:

  • BE2 is significantly faster on multi-core systems than BE1.

Release 2.1 Limitations

  • BEViewer is not included in this release. Although it works with Version 2, it is not yet officially supported.

  • scan_outlook and scan_hiberfile are now disabled by default because they did not have unit tests. These scanners can be re-enabled by specifying -eoutlook and -ehiberfile on the command line.

  • scan_aes no longer scans for 192-bit AES keys by default, although this behavior can be re-enabled.
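
As a rough sketch of how the -e and -S flags mentioned in these notes combine on one command line, the Python helper below only builds an argument list and does not run anything. The helper name build_argv and the paths disk.raw and be_out are hypothetical; the use of -o for the output directory and the image path as the final argument reflect the usual bulk_extractor invocation rather than anything stated in these notes.

```python
def build_argv(image, outdir, enable=(), settings=None):
    """Build a bulk_extractor argument list (illustrative helper only)."""
    argv = ["bulk_extractor", "-o", outdir]
    for name in enable:
        argv += ["-e", name]                 # re-enable a scanner, e.g. outlook
    for key, value in (settings or {}).items():
        argv += ["-S", f"{key}={value}"]     # scanner setting, e.g. jpeg_carve_mode=2
    argv.append(image)                       # the image to scan goes last
    return argv

if __name__ == "__main__":
    argv = build_argv(
        "disk.raw",                          # hypothetical input image
        "be_out",                            # hypothetical output directory
        enable=["outlook", "hiberfile"],
        settings={"jpeg_carve_mode": 2},
    )
    print(" ".join(argv))
    # To actually run it: subprocess.run(argv, check=True)
```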

Known bugs:

  • The RAR decompressor does not reliably decompress all RAR files and only supports RAR v1, v2, and v3.

  • The RAR scanner will not reliably name carved RAR file components that contain UTF-8 characters in their name.

You can help

We are looking for help to implement the following algorithms:

  • WkdmDecompress - http://www.opensource.apple.com/source/xnu/xnu-1456.1.26/iokit/Kernel/WKdmDecompress.c

  • xz, 7zip, and LZMA/LZMA2 decompression

  • lzo decompression

  • BZIP2 decompression

  • CAB decompression

  • Scanning for the start of BitLocker protected volumes.

  • NTFS decompression

  • Better handling of MIME encoding

  • Process more data with -e xor and look for CCN (credit card number) hits. Most will be false positives.

  • Demonstration of bulk_extractor running on a grid (how fast can it run?)

  • Python Bridge - run multiple copies of python to let scanners be written in python

  • scan_pipe - runs every sbuf through an external program.
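
For the -e xor item in the list above, the following Python sketch shows the general idea of undoing a single-byte XOR and then filtering candidate card numbers with the Luhn checksum, which is one way to discard many false positives. It is not the scan_xor or CCN scanner code; the function names, the 0xFF mask, and the 13 to 19 digit length range are assumptions made for illustration.

```python
import re

# Runs of 13-19 ASCII digits that are not part of a longer digit run.
CANDIDATE = re.compile(rb"(?<!\d)\d{13,19}(?!\d)")

def xor_bytes(buf: bytes, mask: int = 0xFF) -> bytes:
    """XOR every byte with a single-byte mask (0xFF is an assumed default)."""
    return bytes(b ^ mask for b in buf)

def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def ccn_hits_after_xor(buf: bytes, mask: int = 0xFF):
    """Undo the XOR, then keep only digit runs that pass the Luhn check."""
    plain = xor_bytes(buf, mask)
    hits = []
    for m in CANDIDATE.finditer(plain):
        digits = m.group().decode("ascii")
        if luhn_ok(digits):
            hits.append(digits)
    return hits

if __name__ == "__main__":
    # 4111111111111111 is a widely used test number, not a real card.
    hidden = xor_bytes(b"card 4111111111111111 end")
    print(ccn_hits_after_xor(hidden))
```

A Luhn check alone still passes roughly one random digit run in ten, so hits found this way would need further review.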

bulk_extractor - bulk_extractor 2.0.6

Published by simsong 10 months ago

Minor packaging updates.

bulk_extractor - bulk_extractor 2.0.3

Published by simsong over 1 year ago

Version 2.0.3 is released. However, please note:

  • There appears to be a hang in the multi-threaded logic on some systems. This is under review.
  • Carving for IPv6 packets is not 100% reliable.
  • There are compiler warnings when compiling on macOS 13.3.1.

bulk_extractor - bulk_extractor V2.0.0 RELEASE

Published by simsong over 2 years ago

Release 2.0.0 of bulk_extractor, a high-performance digital forensics tool that works like a "find evidence" button, pulling actionable intelligence out of disk images, files, memory dumps, network traffic, and just about anything else.

Note: we recommend using the bulk_extractor-2.0.0.tar.gz file attached, which is a proper release, rather than cloning the repo and all of the sub-repos and then using automake to create the configure script.

bulk_extractor - bulk_extractor V2.0.0 beta 3

Published by simsong almost 3 years ago

bulk_extractor is a high-performance digital forensics tool that scans a disk image, a file, or a directory of files and extracts information such as email addresses, JPEGs, and JSON snippets without parsing the file system or file system structures. It is written in C++ and highly parallelized.

This beta:

  • Adds additional regression tests.
  • Fixes bugs reported in betas 1 and 2.

Please report bugs to https://github.com/simsong/bulk_extractor/issues

bulk_extractor - bulk_extractor V2.0.0 beta 2

Published by simsong almost 3 years ago

bulk_extractor is a high-performance C++ program that scans a disk image, a file, or a directory of files and extracts information such as email addresses, JPEGs and JSON snippets without parsing the file system or file system structures.

This beta:

  • Addresses packaging concerns and adds additional regression tests.
  • Fixes handling of E01 files.

Download from: bulk_extractor-2.0.0-beta2.tar.gz

bulk_extractor - bulk_extractor V2.0.0 beta 1

Published by simsong about 3 years ago

bulk_extractor is a high-performance digital forensics tool that finds data including JPEG images, email addresses, social security numbers, and other kinds of "known formats" in files and on raw disk partitions, even if the data are compressed, BASE64 encoded, or transformed using other well-known algorithms.
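
The idea of finding data even when it is BASE64 encoded can be sketched in Python as follows: scan the raw bytes for a feature, and also decode anything that looks like BASE64 and scan the decoded bytes again. This is a simplified illustration, not bulk_extractor's scanner code; the function name scan, the regular expressions, the 16-character minimum run length, and the offset-BASE64-offset path format used in the output are all assumptions.

```python
import base64
import binascii
import re

# Deliberately simple patterns for illustration only.
EMAIL = re.compile(rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")
B64_RUN = re.compile(rb"[A-Za-z0-9+/]{16,}={0,2}")

def scan(buf: bytes, path: str = "", depth: int = 0, max_depth: int = 2):
    """Return (path, email) pairs found in buf and in BASE64 runs inside it."""
    hits = [(f"{path}{m.start()}", m.group().decode("ascii"))
            for m in EMAIL.finditer(buf)]
    if depth < max_depth:
        for m in B64_RUN.finditer(buf):
            try:
                decoded = base64.b64decode(m.group(), validate=True)
            except (binascii.Error, ValueError):
                continue    # not valid BASE64 after all; skip this run
            # Recurse into the decoded bytes, recording how we got there.
            hits += scan(decoded, f"{path}{m.start()}-BASE64-",
                         depth + 1, max_depth)
    return hits

if __name__ == "__main__":
    inner = b"contact alice@example.com now"
    blob = b"junk " + base64.b64encode(inner) + b" junk"
    for where, email in scan(blob):
        print(where, email)
```

The depth limit keeps the recursion bounded when decoded data itself contains further encoded runs.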

After six years, we have a new release of bulk_extractor! This version now requires C++17, includes a substantial test suite with significant code coverage, and is designed for systems with high numbers of CPU cores. Tested on Ubuntu, macOS, and Fedora.

bulk_extractor - Bulk_Extractor Release 1.5.3

Published by simsong about 10 years ago

Release 1.5.3 corrects minor bugs that were found in version 1.5.0, and represents a significant improvement over release 1.4.0.

bulk_extractor - Official 1.4.0 release.

Published by simsong almost 11 years ago

The official 1.4.0 release. Reasonably well tested.

bulk_extractor - Initial v1.4.0 beta

Published by simsong over 11 years ago

Please let us know how it works. We are especially interested in feedback on the XOR scanner.