From 89b77ce2ca335f4cd6fd5ad6d343c3c4a9d348b3 Mon Sep 17 00:00:00 2001 From: Justin Bogner Date: Thu, 12 Oct 2017 01:44:24 +0000 Subject: [PATCH] docs: Add some information about Fuzzing LLVM itself This splits some content out of the libFuzzer docs and adds a fair amount of detail about the fuzzers in LLVM. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315544 91177308-0d34-0410-b5e6-96231b3b80d8 --- docs/FuzzingLLVM.rst | 221 +++++++++++++++++++++++++++++++++++++++++++ docs/LibFuzzer.rst | 72 ++------------ docs/index.rst | 4 + 3 files changed, 232 insertions(+), 65 deletions(-) create mode 100644 docs/FuzzingLLVM.rst diff --git a/docs/FuzzingLLVM.rst b/docs/FuzzingLLVM.rst new file mode 100644 index 00000000000..6aa0be7d19f --- /dev/null +++ b/docs/FuzzingLLVM.rst @@ -0,0 +1,221 @@ +================================ +Fuzzing LLVM libraries and tools +================================ + +.. contents:: + :local: + :depth: 2 + +Introduction +============ + +The LLVM tree includes a number of fuzzers for various components. These are +built on top of :doc:`LibFuzzer `. + + +Available Fuzzers +================= + +clang-fuzzer +------------ + +A |generic fuzzer| that tries to compile textual input as C++ code. Some of the +bugs this fuzzer has reported are `on bugzilla `__ +and `on OSS Fuzz's tracker +`__. + +clang-proto-fuzzer +------------------ + +A |protobuf fuzzer| that compiles valid C++ programs generated from a protobuf +class that describes a subset of the C++ language. + +This fuzzer accepts clang command line options after `ignore_remaining_args=1`. +For example, the following command will fuzz clang with a higher optimization +level: + +.. code-block:: shell + + % bin/clang-proto-fuzzer -ignore_remaining_args=1 -O3 + +clang-format-fuzzer +------------------- + +A |generic fuzzer| that runs clang-format_ on C++ text fragments. Some of the +bugs this fuzzer has reported are `on bugzilla `__ +and `on OSS Fuzz's tracker +`. 
+Some of the bugs this fuzzer has reported are `on bugzilla +`__ + +llvm-dwarfdump-fuzzer +--------------------- + +A |generic fuzzer| that interprets inputs as object files and runs +:doc:`llvm-dwarfdump ` on them. Some of the bugs +this fuzzer has reported are `on OSS Fuzz's tracker +` and the triple is required. For example, +the following command would fuzz AArch64 with :doc:`GlobalISel`: + +.. code-block:: shell + + % bin/llvm-isel-fuzzer -ignore_remaining_args=1 -mtriple aarch64 -global-isel -O0 + +llvm-mc-assemble-fuzzer +----------------------- + +A |generic fuzzer| that fuzzes the MC layer's assemblers by treating inputs as +target specific assembly. + +Note that this fuzzer has an unusual command line interface which is not fully +compatible with all of libFuzzer's features. Fuzzer arguments must be passed +after ``--fuzzer-args``, and any ``llc`` flags must use two dashes. For +example, to fuzz the AArch64 assembler you might use the following command: + +.. code-block:: console + + llvm-mc-fuzzer --triple=aarch64-linux-gnu --fuzzer-args -max_len=4 + +This scheme will likely change in the future. + +llvm-mc-disassemble-fuzzer +-------------------------- + +A |generic fuzzer| that fuzzes the MC layer's disassemblers by treating inputs +as assembled binary data. + +Note that this fuzzer has an unusual command line interface which is not fully +compatible with all of libFuzzer's features. See the notes above about +``llvm-mc-assemble-fuzzer`` for details. + + +.. |generic fuzzer| replace:: :ref:`generic fuzzer ` +.. |protobuf fuzzer| + replace:: :ref:`libprotobuf-mutator based fuzzer ` +.. |LLVM IR fuzzer| + replace:: :ref:`structured LLVM IR fuzzer ` + + +Mutators and Input Generators +============================= + +The inputs for a fuzz target are generated via random mutations of a +:ref:`corpus `. There are a few options for the kinds of +mutations that a fuzzer in LLVM might want. + +.. 
_fuzzing-llvm-generic: + +Generic Random Fuzzing +---------------------- + +The most basic form of input mutation is to use the built in mutators of +LibFuzzer. These simply treat the input corpus as a bag of bits and make random +mutations. This type of fuzzer is good for stressing the surface layers of a +program, and is good at testing things like lexers, parsers, or binary +protocols. + +Some of the in-tree fuzzers that use this type of mutator are `clang-fuzzer`_, +`clang-format-fuzzer`_, `llvm-as-fuzzer`_, `llvm-dwarfdump-fuzzer`_, +`llvm-mc-assemble-fuzzer`_, and `llvm-mc-disassemble-fuzzer`_. + +.. _fuzzing-llvm-protobuf: + +Structured Fuzzing using ``libprotobuf-mutator`` +------------------------------------------------ + +We can use libprotobuf-mutator_ in order to perform structured fuzzing and +stress deeper layers of programs. This works by defining a protobuf class that +translates arbitrary data into structurally interesting input. Specifically, we +use this to work with a subset of the C++ language and perform mutations that +produce valid C++ programs in order to exercise parts of clang that are more +interesting than parser error handling. + +To build this kind of fuzzer you need `protobuf`_ and its dependencies +installed, and you need to specify some extra flags when configuring the build +with :doc:`CMake `. For example, `clang-proto-fuzzer`_ can be enabled by +adding ``-DCLANG_ENABLE_PROTO_FUZZER=ON`` to the flags described in +:ref:`building-fuzzers`. + +The only in-tree fuzzer that uses ``libprotobuf-mutator`` today is +`clang-proto-fuzzer`_. + +.. _libprotobuf-mutator: https://github.com/google/libprotobuf-mutator +.. _protobuf: https://github.com/google/protobuf + +.. _fuzzing-llvm-ir: + +Structured Fuzzing of LLVM IR +----------------------------- + +We also use a more direct form of structured fuzzing for fuzzers that take +:doc:`LLVM IR ` as input. 
This is achieved through the ``FuzzMutate`` +library, which was `discussed at EuroLLVM 2017`_. + +The ``FuzzMutate`` library is used to structurally fuzz backends in +`llvm-isel-fuzzer`_. + +.. _discussed at EuroLLVM 2017: https://www.youtube.com/watch?v=UBbQ_s6hNgg + + +Building and Running +==================== + +.. _building-fuzzers: + +Configuring LLVM to Build Fuzzers +--------------------------------- + +Fuzzers will be built and linked to libFuzzer by default as long as you build +LLVM with sanitizer coverage enabled. You would typically also enable at least +one sanitizer for the fuzzers to be particularly likely to find bugs, so the most common way +to build the fuzzers is by adding the following two flags to your CMake +invocation: ``-DLLVM_USE_SANITIZER=Address -DLLVM_USE_SANITIZE_COVERAGE=On``. + +.. note:: If you have ``compiler-rt`` checked out in an LLVM tree when building + with sanitizers, you'll want to specify ``-DLLVM_BUILD_RUNTIME=Off`` + to avoid building the sanitizers themselves with sanitizers enabled. + +Continuously Running and Finding Bugs +------------------------------------- + +There used to be a public buildbot running LLVM fuzzers continuously, and while +this did find issues, it had no good way to report problems in an +actionable form. Because of this, we're moving towards using `OSS Fuzz`_ +instead. + +https://github.com/google/oss-fuzz/blob/master/projects/llvm/project.yaml +https://bugs.chromium.org/p/oss-fuzz/issues/list?q=Proj-llvm + +.. _OSS Fuzz: https://github.com/google/oss-fuzz + + +Utilities for Writing Fuzzers +============================= + +There are some utilities available for writing fuzzers in LLVM. + +Some helpers for handling the command line interface are available in +``include/llvm/FuzzMutate/FuzzerCLI.h``, including functions to parse command +line options in a consistent way and to implement standalone main functions so +your fuzzer can be built and tested when not built against libFuzzer.
+ +There is also some handling of the CMake config for fuzzers, where you should +use the ``add_llvm_fuzzer`` function to set up fuzzer targets. This function works +similarly to functions such as ``add_llvm_tool``, but it takes care of linking +to LibFuzzer when appropriate and can be passed the ``DUMMY_MAIN`` argument to +enable standalone testing. diff --git a/docs/LibFuzzer.rst b/docs/LibFuzzer.rst index c4baa2127c1..2ae84afeed8 100644 --- a/docs/LibFuzzer.rst +++ b/docs/LibFuzzer.rst @@ -42,10 +42,10 @@ This installs the Clang binary as ``./third_party/llvm-build/Release+Asserts/bin/clang``) The libFuzzer code resides in the LLVM repository, and requires a recent Clang -compiler to build (and is used to `fuzz various parts of LLVM itself`_). -However the fuzzer itself does not (and should not) depend on any part of LLVM -infrastructure and can be used for other projects without requiring the rest -of LLVM. +compiler to build (and is used to :doc:`fuzz various parts of LLVM itself +`). However the fuzzer itself does not (and should not) depend on +any part of LLVM infrastructure and can be used for other projects without +requiring the rest of LLVM. Getting Started @@ -137,6 +137,8 @@ Finally, link with ``libFuzzer.a``:: clang -fsanitize-coverage=trace-pc-guard -fsanitize=address your_lib.cc fuzz_target.cc libFuzzer.a -o my_fuzzer +.. _libfuzzer-corpus: + Corpus ------ @@ -627,66 +629,6 @@ which was configured with ``-DLIBFUZZER_ENABLE_TESTS=ON`` flag. ninja check-fuzzer -Fuzzing components of LLVM -========================== -.. contents:: - :local: - :depth: 1 - -To build any of the LLVM fuzz targets use the build instructions above. - -clang-format-fuzzer -------------------- -The inputs are random pieces of C++-like text. - -.. code-block:: console - - ninja clang-format-fuzzer - mkdir CORPUS_DIR - ./bin/clang-format-fuzzer CORPUS_DIR - -Optionally build other kinds of binaries (ASan+Debug, MSan, UBSan, etc).
- -Tracking bug: https://llvm.org/bugs/show_bug.cgi?id=23052 - -clang-fuzzer ------------- - -The behavior is very similar to ``clang-format-fuzzer``. - -Tracking bug: https://llvm.org/bugs/show_bug.cgi?id=23057 - -llvm-as-fuzzer --------------- - -Tracking bug: https://llvm.org/bugs/show_bug.cgi?id=24639 - -llvm-mc-fuzzer --------------- - -This tool fuzzes the MC layer. Currently it is only able to fuzz the -disassembler but it is hoped that assembly, and round-trip verification will be -added in future. - -When run in dissassembly mode, the inputs are opcodes to be disassembled. The -fuzzer will consume as many instructions as possible and will stop when it -finds an invalid instruction or runs out of data. - -Please note that the command line interface differs slightly from that of other -fuzzers. The fuzzer arguments should follow ``--fuzzer-args`` and should have -a single dash, while other arguments control the operation mode and target in a -similar manner to ``llvm-mc`` and should have two dashes. For example: - -.. code-block:: console - - llvm-mc-fuzzer --triple=aarch64-linux-gnu --disassemble --fuzzer-args -max_len=4 -jobs=10 - -Buildbot --------- - -A buildbot continuously runs the above fuzzers for LLVM components, with results -shown at http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fuzzer . - FAQ ========================= @@ -808,4 +750,4 @@ Trophies .. _`value profile`: #value-profile .. _`caller-callee pairs`: http://clang.llvm.org/docs/SanitizerCoverage.html#caller-callee-coverage .. _BoringSSL: https://boringssl.googlesource.com/boringssl/ -.. _`fuzz various parts of LLVM itself`: `Fuzzing components of LLVM`_ + diff --git a/docs/index.rst b/docs/index.rst index 212143ac79e..955607a751c 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -183,6 +183,7 @@ For developers of applications which use LLVM as a library. 
ProgrammersManual Extensions LibFuzzer + FuzzingLLVM ScudoHardenedAllocator OptBisect @@ -228,6 +229,9 @@ For developers of applications which use LLVM as a library. :doc:`LibFuzzer` A library for writing in-process guided fuzzers. +:doc:`FuzzingLLVM` + Information on writing and using Fuzzers to find bugs in LLVM. + :doc:`ScudoHardenedAllocator` A library that implements a security-hardened `malloc()`. -- 2.50.1