[llvm] [NVPTX] Skip processing BasicBlocks with single unreachable instruction in `nvptx-lower-unreachable` pass. (PR #72641)

Artem Belevich via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 15 16:30:47 PST 2024


================
@@ -138,7 +138,19 @@ bool NVPTXLowerUnreachable::runOnFunction(Function &F) {
   InlineAsm *Exit = InlineAsm::get(ExitFTy, "exit;", "", true);
 
   bool Changed = false;
-  for (auto &BB : F)
+
+  // In scenarios where a BasicBlock contains only one unreachable instruction,
+  // the joint action of nvptx-isel and unreachable-mbb-elimination
+  // effectively optimizes the BasicBlock out. However, adding an exit
+  // command to such a BasicBlock, as suggested by this pass, preserves it
+  // within the Control Flow Graph (CFG), thereby negatively impacting size and
+  // performance. To counteract this undesirable consequence, we choose to
+  // refrain from processing BasicBlocks with just one unreachable instruction
+  // in this pass.
+
----------------
Artem-B wrote:

I'm not disagreeing that it does make a difference in your use case, and that in your case not having an exit is benign and produces faster code.

The trouble is that I can not say that the patch is benign for *all* NVPTX users.

As far as LLVM is concerned you patch is correct, but in case of NVIDIA GPUs, we also need to take into account NVIDIA's assembler and we do know that it miscompiles sufficiently convoluted code, unless we explicitly annotate unreachable code with an `exit` instead of allowing it to fall through.  ptxas does not have the same info as LLVM and does not know that the code is unreachable. The fall-through effectively looks like a new CFG edge which confuses it, and results in thread mis-convergence.

I may be wrong and would be happy to be *proven* wrong. And example of the "works for me here" is not sufficient. I'll readily admit that it does, and will get back to my point above -- can we guarantee it to work for all users? Until we can, we should keep generating `exit`.

I would be open to resurrecting a hidden compiler flag to disable generation of `exit`, instead. 

I personally do not know how to prove correctness in this case. Based on the past experience with this issue (~8 years of looking for the fix: https://bugs.llvm.org/show_bug.cgi?id=27738), empirical testing on this particular issue would also be insufficient. The issue  tends to pop up in odd and unpredictable places and is very hard to reproduce and diagnose in most of those cases. So, please, take my word that there's a very good reason for the abundance of caution I'm advocating for.

https://github.com/llvm/llvm-project/pull/72641


More information about the llvm-commits mailing list