[llvm-dev] How to find the root causes of compiler bugs in practice?

Robinson, Paul via llvm-dev llvm-dev at lists.llvm.org
Mon Nov 9 06:24:33 PST 2020

I skimmed the first paper.  I’m assuming they picked mis-compile bugs (bad code generation).  It looks like a technique to triangulate on the likely buggy module by mutating the example code provided with a bug report, and comparing coverage traces for “good” and “bad” sources.  It’s very black-box, assuming no diagnostic aids from the software under test, but that’s reasonable for an automated technique.
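
To make the "compare coverage traces" idea concrete, here is a minimal sketch of spectrum-based ranking using the Ochiai formula. This is an assumption on my part: Ochiai is one common suspiciousness score, and nothing in this thread says the paper uses exactly this formula. The function name `rank_files` and the file names in the demo are made up for illustration.

```python
import math
from collections import Counter


def rank_files(failing_runs, passing_runs):
    """Rank compiler source files by Ochiai suspiciousness.

    failing_runs / passing_runs are lists of sets; each set holds the
    compiler source files covered while compiling one test program.
    """
    fail_hits = Counter(f for run in failing_runs for f in run)
    pass_hits = Counter(f for run in passing_runs for f in run)
    total_fail = len(failing_runs)

    scores = {}
    for f in set(fail_hits) | set(pass_hits):
        ef = fail_hits[f]  # failing runs that touched this file
        ep = pass_hits[f]  # passing runs that touched this file
        # Ochiai: ef / sqrt((ef + nf) * (ef + ep)), where ef + nf == total_fail
        denom = math.sqrt(total_fail * (ef + ep))
        scores[f] = ef / denom if denom else 0.0

    # Highest score first; ties broken by file name for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Illustrative data only: one "bad" program and two mutated "good" ones.
failing = [{"parser.cpp", "gvn.cpp", "instcombine.cpp"}]
passing = [{"parser.cpp", "gvn.cpp"}, {"parser.cpp"}]
for name, score in rank_files(failing, passing):
    print(f"{score:.3f}  {name}")
```

Files touched only by the failing compile float to the top, while files that every compile touches (the parser here) sink. That is the triangulation effect in miniature.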

In my experience, many bugs have fairly obvious origins (at least in terms of likely source modules) to someone experienced in a given area.  But I could see this being a useful tool for bugs with less obvious origins, and certainly a lot less tedious than wading through lots of diagnostic output.  Also could be useful to people less familiar with LLVM, as one way to narrow down the search for a bug without needing to learn a lot about LLVM’s own diagnostic tools.

FTR, the paper says the benchmark and code are available at the project webpage: https://github.com/JunjieCheck/DiWi if anyone wants to try it out.

From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Min-Yih Hsu via llvm-dev
Sent: Friday, November 6, 2020 12:29 PM
To: Zhide Zhou <cszide at 163.com>
Cc: llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] How to find the root causes of compiler bugs in practice?

In general, finding a root cause in LLVM is not very different from debugging any other software. Depending on the scenario, if it’s a crash, then running it under gdb is probably the first step you want to take, and that usually gives you the answer pretty fast.
Trickier scenarios usually require developers to leverage LLVM-specific diagnostic features, the `-print-after-all` CLI option in opt, to name one, which provide more insight into the intermediate steps and help you narrow down the problematic region. If the input is too big, more advanced tools like bugpoint (also an LLVM-specific tool) can help you bisect and trim the input. After this (pre)processing, normal debugging tricks like gdb or even good old printf can be applied easily.
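
As a rough illustration of working with `-print-after-all` output, the sketch below splits a captured dump into per-pass chunks and reports the first pass whose output matches a check you supply. It assumes the dump was saved to a string (opt writes it to stderr) and that each dump starts with a header line of the form `*** IR Dump After <name> ***`, optionally prefixed with `; `. The exact header wording varies between LLVM versions, so treat the regex as an assumption. The helper names `split_dumps` and `first_pass_where` and the sample text are made up.

```python
import re

# Assumed header shape; adjust for the LLVM version you are using.
HEADER = re.compile(r"^(?:; )?\*\*\* IR Dump After (.+?) \*\*\*\s*$")


def split_dumps(text):
    """Return a list of (pass_name, ir_text) pairs in pipeline order."""
    dumps = []
    name, body = None, []
    for line in text.splitlines():
        m = HEADER.match(line)
        if m:
            if name is not None:
                dumps.append((name, "\n".join(body)))
            name, body = m.group(1), []
        elif name is not None:
            body.append(line)
    if name is not None:
        dumps.append((name, "\n".join(body)))
    return dumps


def first_pass_where(dumps, looks_wrong):
    """Name of the first pass whose output satisfies looks_wrong, else None."""
    for name, ir in dumps:
        if looks_wrong(ir):
            return name
    return None


# Synthetic sample, not real opt output.
SAMPLE = """*** IR Dump After PassA ***
define i32 @f() {
  ret i32 1
}
*** IR Dump After PassB ***
define i32 @f() {
  ret i32 2
}
"""

dumps = split_dumps(SAMPLE)
print(first_pass_where(dumps, lambda ir: "ret i32 2" in ir))
```

The check is deliberately a plain callback: in practice it might look for a suspicious instruction or a value you know to be wrong, and the first pass it flags is where to start reading.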

IMHO the efficiency of finding a root cause depends heavily on your engineering experience. And I 100% agree that it sometimes takes a lot of time. So it kind of makes sense that people want to automate it, but I’m not an expert on this matter. All I know is that there has been a lot of research and effort on finding bugs; there might be some overlap between these two topics, I’m not really sure. But you might want to check out techniques like fuzzing and sanitizers. LLVM has pretty mature implementations of both.


On Nov 6, 2020, at 6:16 AM, Zhide Zhou via llvm-dev <llvm-dev at lists.llvm.org> wrote:

Hi, developers,
Recently, I read two papers [1], [2] about finding the root causes of compiler bugs. However, I did not find any information in these papers about how compiler developers find the root causes of compiler bugs in practice, so I am curious whether these techniques are useful in practice. In my experience, the outputs of compilers, such as the IR after each pass or the backtrace, are always used to isolate the causes of compiler bugs.
I am a newbie to LLVM, so I am curious how developers of LLVM or GCC find the root causes of compiler bugs in practice.


[1] Junjie Chen, Jiaqi Han, Peiyi Sun, Lingming Zhang, Dan Hao, and Lu Zhang. 2019. Compiler bug isolation via effective witness test program generation. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2019). Association for Computing Machinery, New York, NY, USA, 223–234. DOI: https://doi.org/10.1145/3338906.3338957
[2] Junjie Chen, Haoyang Ma, Lingming Zhang, Enhanced Compiler Bug Isolation via Memoized Search, Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering.
