[llvm] [DevPolicy] Add guidelines for fuzzer-generated issue reports (PR #112372)

Nikita Popov via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 15 10:23:11 PDT 2024


https://github.com/nikic updated https://github.com/llvm/llvm-project/pull/112372

From f47ae5f3d47890becc03c3334841ce7f60b94fb3 Mon Sep 17 00:00:00 2001
From: Nikita Popov <npopov at redhat.com>
Date: Tue, 15 Oct 2024 16:49:09 +0200
Subject: [PATCH 1/2] Add guidelines for fuzzer-generated issue reports

---
 llvm/docs/DeveloperPolicy.rst | 51 +++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/llvm/docs/DeveloperPolicy.rst b/llvm/docs/DeveloperPolicy.rst
index caa4b31b949c92..f433c8c0c530f3 100644
--- a/llvm/docs/DeveloperPolicy.rst
+++ b/llvm/docs/DeveloperPolicy.rst
@@ -853,6 +853,57 @@ their patch with every possible configuration.
 * 3rd step: If Galina could not help you, please escalate to the
   `Infrastructure Working Group <mailto:iwg at llvm.org>`_.
 
+Guidelines for fuzzer-generated issues
+--------------------------------------
+
+Fuzzing is a valuable tool for finding compiler bugs, and the LLVM project
+welcomes fuzzer-generated test cases. However, some additional guidelines
+should be followed to make such reports maximally useful.
+
+Fuzzer-generated issues should indicate that they are such, either in the
+issue description, or (for organization members) by applying the
+``fuzzer-generated`` label. This helps us prioritize issues. The remaining
+guidelines depend on the type of issue the fuzzer detects.
+
+**For miscompilations:** These issues are usually detected by looking for
+different results when using ``-O0`` and ``-O2``, or similar. When reporting
+miscompilations, please make sure that your fuzzing methodology can only
+generate well-defined, deterministic code. Results between optimization levels
+can legitimately differ if the code invokes undefined behavior, or includes
+non-deterministic operations. Note that running cleanly under sanitizers is
+not sufficient to establish absence of undefined behavior.
+
+Reports using ``-Ofast``, ``-ffast-math``, or other flags that permit
+floating-point reassociation/approximation must include a credible root cause
+analysis, as behavior differences are likely to be caused by legal transforms.
+
+**For crashes / assertion failures:** Crashes that occur on valid code are more
+valuable than crashes on invalid code. Both can be reported, but the former is
+more likely to see a timely fix.
+
+Fuzzing can be performed at multiple levels, where higher levels are less likely
+to produce false positives. For example, a crash triggered by valid C code will
+generally indicate a real bug. However, a crash triggered by syntactically
+well-formed LLVM IR may not. For instance, a target that does not support
+scalable vectors may break when provided IR using them. When fuzzing at a lower
+level, it is encouraged to verify the plausibility of the results.
+
+Fatal errors that do not generate a stack trace should not be reported. They
+indicate an incorrect use of LLVM, rather than a bug.
+
+**For missed optimizations:** There is an infinite number of optimizations that
+*could* be implemented, but only a small subset of them is relevant for
+real-world code. As such, fuzzer-generated reports for missed optimizations are
+only accepted if plausible real-world usefulness can be shown.
+
+For example, a valid strategy is to take a corpus of real-world code and use a
+super-optimizer to find missed optimization opportunities. An invalid strategy
+is to generate random code and check whether GCC generates less code than Clang.
+
+Fuzzer-generated missed optimization reports that are not derived from
+real-world code must include a root-cause analysis, and an explanation for why
+you believe that the missed optimization has real-world relevance.
+
 .. _new-llvm-components:
 
 Introducing New Components into LLVM

From c46f01e340c07cd7e91eacc7cd279a27f771ea68 Mon Sep 17 00:00:00 2001
From: Nikita Popov <npopov at redhat.com>
Date: Tue, 15 Oct 2024 19:22:47 +0200
Subject: [PATCH 2/2] minimized reproducer, deduplication, llvm main

---
 llvm/docs/DeveloperPolicy.rst | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/llvm/docs/DeveloperPolicy.rst b/llvm/docs/DeveloperPolicy.rst
index f433c8c0c530f3..65caf5120c14f3 100644
--- a/llvm/docs/DeveloperPolicy.rst
+++ b/llvm/docs/DeveloperPolicy.rst
@@ -862,8 +862,16 @@ should be followed to make such reports maximally useful.
 
 Fuzzer-generated issues should indicate that they are such, either in the
 issue description, or (for organization members) by applying the
-``fuzzer-generated`` label. This helps us prioritize issues. The remaining
-guidelines depend on the type of issue the fuzzer detects.
+``fuzzer-generated`` label.
+
+Issues should include a minimized reproducer (including both the necessary code
+and command line arguments) both as part of the issue description and as a
+godbolt.org link. An effort should be made to deduplicate issues that likely
+have the same root cause, and to check whether a similar issue has already been
+reported. Reports should always be submitted against current LLVM ``main``,
+not a released version.
+
+The remaining guidelines depend on the type of issue the fuzzer detects.
 
 **For miscompilations:** These issues are usually detected by looking for
 different results when using ``-O0`` and ``-O2``, or similar. When reporting


