[llvm] [NVPTX] Skip processing BasicBlocks with single unreachable instruction in `nvptx-lower-unreachable` pass. (PR #72641)

Artem Belevich via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 5 13:32:06 PST 2024


================
@@ -138,7 +138,19 @@ bool NVPTXLowerUnreachable::runOnFunction(Function &F) {
   InlineAsm *Exit = InlineAsm::get(ExitFTy, "exit;", "", true);
 
   bool Changed = false;
-  for (auto &BB : F)
+
+  // In scenarios where a BasicBlock contains only one unreachable instruction,
+  // the joint action of nvptx-isel and unreachable-mbb-elimination
+  // effectively optimizes the BasicBlock out. However, adding an exit
+  // command to such a BasicBlock, as suggested by this pass, preserves it
+  // within the Control Flow Graph (CFG), thereby negatively impacting size and
+  // performance. To counteract this undesirable consequence, we choose to
+  // refrain from processing BasicBlocks with just one unreachable instruction
+  // in this pass.
+
----------------
Artem-B wrote:

>  and the patch does not offer any evidence of a performance improvement 

This is true. It was also not intended to buy us any performance gains, so this part is WAI.

> or resolution of a specific issue elsewhere,

I strongly disagree with this assertion. The patch does resolve the issue we've been struggling with for a very long time. We currently do not have any other mechanism to avoid miscompilation in ptxas, which makes this pass *essential* to guarantee correctness.

While I understand your frustration with the performance regression, undoing the pass is not the way forward. I've already proposed few options that would help (escape hatch option to disable this pass if/when it's needed, working around the issue on the source code level), but for some reason you keep hammering on the "disable to fix part", without providing strong enough reasons for that. "let's break things for everyone so my code would run fast" is not a very strong argument.

> I recommend removing the problematic pass altogether since the proposed solutions to address performance issues of the pass have not satisfied reviewers.

I recommend alternative options that do not require reintroducing the miscompilation for everyone with NVIDIA GPUs.


https://github.com/llvm/llvm-project/pull/72641


More information about the llvm-commits mailing list