[llvm] [NVPTX] Skip processing BasicBlocks with single unreachable instruction in `nvptx-lower-unreachable` pass. (PR #72641)

Thu Mar 7 11:29:33 PST 2024

================
@@ -138,7 +138,19 @@ bool NVPTXLowerUnreachable::runOnFunction(Function &F) {
   InlineAsm *Exit = InlineAsm::get(ExitFTy, "exit;", "", true);
 
   bool Changed = false;
-  for (auto &BB : F)
+
+  // In scenarios where a BasicBlock contains only one unreachable instruction,
+  // the joint action of nvptx-isel and unreachable-mbb-elimination
+  // effectively optimizes the BasicBlock out. However, adding an exit
+  // command to such a BasicBlock, as suggested by this pass, preserves it
+  // within the Control Flow Graph (CFG), thereby negatively impacting size and
+  // performance. To counteract this undesirable consequence, we choose to
+  // refrain from processing BasicBlocks with just one unreachable instruction
+  // in this pass.
+
----------------
Artem-B wrote:

> every unreachable block ends with a trap terminator (which I guess is the current behavior).

I think we can enable unreachable->trap lowering with a flag, while by default unreachable is still lowered to nothing. AFAICT, a trap will likely cause the same performance regression as exit does in this case.

For NVPTX, we will continue to need `trap` or `exit` to avoid confusing ptxas about the intended control flow. If someone consciously wants/needs to trade off miscompilation vs performance, we'll still need to have an explicit "shoot this foot, please" option for that.
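
As a rough illustration of the skip condition described in the patch comment, here is a standalone toy model. It deliberately does not use LLVM's API: `ToyBlock`, `isLoneUnreachable`, and `lowerUnreachable` are hypothetical names, and instructions are modeled as plain opcode strings. It only sketches the control flow of "insert `exit` before `unreachable`, except in blocks that contain nothing else".

```cpp
#include <string>
#include <vector>

// Toy stand-in for a basic block (hypothetical, not llvm::BasicBlock).
// Instructions are opcode names; the terminator is the last element.
struct ToyBlock {
  std::vector<std::string> insts;
};

// True when the block consists of a single `unreachable` and nothing else.
bool isLoneUnreachable(const ToyBlock &BB) {
  return BB.insts.size() == 1 && BB.insts.front() == "unreachable";
}

// Inserts an "exit" marker before each trailing `unreachable`, but leaves
// lone-unreachable blocks untouched so later passes can still drop them.
// Returns the number of blocks that were modified.
int lowerUnreachable(std::vector<ToyBlock> &F) {
  int Changed = 0;
  for (ToyBlock &BB : F) {
    if (BB.insts.empty() || BB.insts.back() != "unreachable")
      continue; // Block does not end in unreachable.
    if (isLoneUnreachable(BB))
      continue; // The case this patch chooses to skip.
    BB.insts.insert(BB.insts.end() - 1, "exit");
    ++Changed;
  }
  return Changed;
}
```

In this model, a block holding only `unreachable` is left as-is, while a block such as `call; unreachable` becomes `call; exit; unreachable`.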

https://github.com/llvm/llvm-project/pull/72641