Common Corpus (common-corpus)
August 16, 2023
Common Corpus is used to build coverage-minimized corpus data sets for fuzzing.
Usage
- Follow the initial setup instructions at "How to Build a Fuzzing Corpus" on the Isosceles blog (steps 1 through 7).
- Compile your target binary with SanitizerCoverage enabled (e.g. with -fsanitize=address -fsanitize-coverage=trace-pc-guard).
- Set up the configuration variables in the header of common_corpus.py. This includes information about the file format, the target command line, and the access keys that are used for reading Common Crawl data on S3.
- Run the common_corpus.py script and supply the CSV file created during the setup steps above as the first argument.
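To give a rough idea of what the configuration step involves, here is a minimal sketch of such a header. Every variable name, the example values, and the "@@" input placeholder are assumptions made for illustration; check the header of common_corpus.py for the names the script actually uses.

```python
import os
import shlex
from typing import List

# NOTE: all names and values below are hypothetical placeholders,
# not the real variable names in common_corpus.py.
FILE_FORMAT = "pdf"                   # file format to collect (assumed example)
TARGET_CMDLINE = "./target_asan @@"   # "@@" marks the input file path (assumed convention)

# Access keys for reading Common Crawl data on S3, taken from the
# environment here so they are not hard-coded in the script.
AWS_ACCESS_KEY_ID = os.environ.get("AWS_ACCESS_KEY_ID", "")
AWS_SECRET_ACCESS_KEY = os.environ.get("AWS_SECRET_ACCESS_KEY", "")


def build_command(input_path: str, template: str = TARGET_CMDLINE) -> List[str]:
    """Expand the command-line template into an argv list for one input file."""
    return [input_path if arg == "@@" else arg for arg in shlex.split(template)]
```

With these placeholder values, build_command("/tmp/sample.pdf") yields an argument list that runs ./target_asan on that file. Building an argv list rather than a shell string avoids quoting problems with unusual file names.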
Corpus files will be created in the out directory. The tool will output a "+" for each interesting file added to the corpus, and a "." for tests that did not result in new code coverage.
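The "+" and "." output can be understood through a simple greedy selection loop, sketched below. This is an illustration of the general idea, not the actual code in common_corpus.py: the function names are hypothetical, and coverage is modeled as a plain set of integer edge IDs supplied by a caller-provided function rather than collected from a SanitizerCoverage-instrumented binary.

```python
import io
import sys
from typing import Callable, Iterable, List, Set, TextIO


def minimize_corpus(
    candidates: Iterable[str],
    get_coverage: Callable[[str], Set[int]],
    out: TextIO = sys.stdout,
) -> List[str]:
    """Keep only the candidates that reach coverage not seen so far.

    get_coverage is a hypothetical callback that returns the set of
    covered edge IDs for one input file.
    """
    seen: Set[int] = set()
    kept: List[str] = []
    for name in candidates:
        new_edges = get_coverage(name) - seen
        if new_edges:
            seen |= new_edges
            kept.append(name)
            out.write("+")   # interesting: added to the corpus
        else:
            out.write(".")   # no new code coverage: skipped
        out.flush()
    out.write("\n")
    return kept


if __name__ == "__main__":
    # Made-up coverage data for three inputs.
    fake = {"a.bin": {1, 2}, "b.bin": {2}, "c.bin": {2, 3}}
    buf = io.StringIO()
    print(minimize_corpus(fake, lambda n: fake[n], out=buf))
    print(buf.getvalue(), end="")
```

In this toy run, b.bin is skipped because its only edge was already covered by a.bin, while c.bin is kept for contributing edge 3. Note that a greedy pass like this depends on the order in which files are processed.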