[llvm] [DevPolicy] Add guidelines for fuzzer-generated issue reports (PR #112372)

Tue Oct 15 09:44:21 PDT 2024

================
@@ -853,6 +853,57 @@ their patch with every possible configuration.
 * 3rd step: If Galina could not help you, please escalate to the
   `Infrastructure Working Group <mailto:iwg at llvm.org>`_.
 
+Guidelines for fuzzer-generated issues
+--------------------------------------
+
+Fuzzing is a valuable tool for finding compiler bugs, and the LLVM project
+welcomes fuzzer-generated test cases. However, some additional guidelines
+should be followed to make such reports maximally useful.
+
+Fuzzer-generated issues should indicate that they are such, either in the
+issue description, or (for organization members) by applying the
+``fuzzer-generated`` label. This helps us prioritize issues. The remaining
+guidelines depend on the type of issue the fuzzer detects.
+
+**For miscompilations:** These issues are usually detected by looking for
+different results when using ``-O0`` and ``-O2``, or similar. When reporting
+miscompilations, please make sure that your fuzzing methodology can only
+generate well-defined, deterministic code. Results between optimization levels
+can legitimately differ if the code invokes undefined behavior, or includes
+non-deterministic operations. Note that running cleanly under sanitizers is
+not sufficient to establish the absence of undefined behavior.
+
+Reports using ``-Ofast``, ``-ffast-math``, or other flags that permit
+floating-point reassociation/approximation must include a credible root cause
+analysis, as behavior differences are likely to be caused by legal transforms.
+
+**For crashes / assertion failures:** Crashes that occur on valid code are more
+valuable than crashes on invalid code. Both can be reported, but the former is
+more likely to see a timely fix.
+
+Fuzzing can be performed at multiple levels, where higher levels are less likely
+to produce false positives. For example, a crash triggered by valid C code will
+generally indicate a real bug. However, a crash triggered by syntactically
+well-formed LLVM IR may not. For example, a target that does not support
+scalable vectors may break when provided IR using them. When fuzzing at a lower
+level, it is encouraged to verify the plausibility of the results.
+
+Fatal errors that do not generate a stack trace should not be reported. They
----------------
Endilll wrote:

It is probably worth clarifying that C source, especially valid C source, that triggers a fatal error in LLVM is still interesting, because it might indicate a Clang IRGen bug.

https://github.com/llvm/llvm-project/pull/112372