[llvm] [NVPTX] Only run LowerUnreachable when necessary (PR #109868)
via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 24 14:48:16 PDT 2024
llvmbot wrote:
<!--LLVM PR SUMMARY COMMENT-->
@llvm/pr-subscribers-backend-nvptx
Author: Justin Fargnoli (justinfargnoli)
<details>
<summary>Changes</summary>
[NVPTX: Lower unreachable to exit to allow ptxas to accurately reconstruct the CFG.](https://github.com/llvm/llvm-project/commit/1ee4d880e8760256c606fe55b7af85a4f70d006d) added the LowerUnreachable pass to NVPTX. Based on the PR description:
> Finally, although I expect this to fix most of
> https://bugs.llvm.org/show_bug.cgi?id=27738, I do still encounter
> miscompilations with Julia's unreachable-heavy code when targeting these
> older GPUs using an older ptxas version (specifically, from CUDA 11.4 or
> below). This is likely due to related bugs in ptxas which have been fixed
> since, as I have filed several reproducers with NVIDIA over the past couple of
> years. I'm not inclined to look into fixing those issues over here, and will
> instead be recommending our users to upgrade CUDA to 11.5+ when using these GPUs.
this pass is only necessary when targeting Pascal or earlier via ptxas from CUDA 11.4 or earlier. This PR updates NVPTXTargetMachine.cpp to reflect that.
CC @<!-- -->maleadt
---
Full diff: https://github.com/llvm/llvm-project/pull/109868.diff
3 Files Affected:
- (modified) llvm/lib/Target/NVPTX/NVPTXSubtarget.h (+3)
- (modified) llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp (+7-3)
- (modified) llvm/test/CodeGen/NVPTX/unreachable.ll (+9-6)
``````````diff
diff --git a/llvm/lib/Target/NVPTX/NVPTXSubtarget.h b/llvm/lib/Target/NVPTX/NVPTXSubtarget.h
index 8b9059bd60cbd4..e2ce088cacdf53 100644
--- a/llvm/lib/Target/NVPTX/NVPTXSubtarget.h
+++ b/llvm/lib/Target/NVPTX/NVPTXSubtarget.h
@@ -95,6 +95,9 @@ class NVPTXSubtarget : public NVPTXGenSubtargetInfo {
bool hasDotInstructions() const {
return SmVersion >= 61 && PTXVersion >= 50;
}
+ bool hasPTXASUnreachableBug() const {
+ return SmVersion < 70 && PTXVersion <= 74;
+ }
bool hasCvtaParam() const { return SmVersion >= 70 && PTXVersion >= 77; }
unsigned int getFullSmVersion() const { return FullSmVersion; }
unsigned int getSmVersion() const { return getFullSmVersion() / 10; }
diff --git a/llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp b/llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp
index 57b7fa783c14a7..b79b4ff93efe49 100644
--- a/llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp
+++ b/llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp
@@ -368,9 +368,13 @@ void NVPTXPassConfig::addIRPasses() {
addPass(createSROAPass());
}
- const auto &Options = getNVPTXTargetMachine().Options;
- addPass(createNVPTXLowerUnreachablePass(Options.TrapUnreachable,
- Options.NoTrapAfterNoreturn));
+ if (ST.hasPTXASUnreachableBug()) {
+ // Run LowerUnreachable to WAR a ptxas bug. See the commit description of
+ // 1ee4d880e8760256c606fe55b7af85a4f70d006d for more details.
+ const auto &Options = getNVPTXTargetMachine().Options;
+ addPass(createNVPTXLowerUnreachablePass(Options.TrapUnreachable,
+ Options.NoTrapAfterNoreturn));
+ }
}
bool NVPTXPassConfig::addInstSelector() {
diff --git a/llvm/test/CodeGen/NVPTX/unreachable.ll b/llvm/test/CodeGen/NVPTX/unreachable.ll
index 011497c4e23401..4219f0f3b47fc9 100644
--- a/llvm/test/CodeGen/NVPTX/unreachable.ll
+++ b/llvm/test/CodeGen/NVPTX/unreachable.ll
@@ -1,11 +1,15 @@
; RUN: llc < %s -march=nvptx -mcpu=sm_20 -verify-machineinstrs \
-; RUN: | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-NOTRAP
+; RUN: | FileCheck %s --check-prefixes=CHECK,CHECK-NOTRAP
; RUN: llc < %s -march=nvptx64 -mcpu=sm_20 -verify-machineinstrs \
-; RUN: | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-NOTRAP
+; RUN: | FileCheck %s --check-prefixes=CHECK,CHECK-NOTRAP
+; RUN: llc < %s -march=nvptx64 -mcpu=sm_20 -verify-machineinstrs -mattr=+ptx75 \
+; RUN: | FileCheck %s --check-prefixes=CHECK-BUG-FIXED
+; RUN: llc < %s -march=nvptx64 -mcpu=sm_70 -verify-machineinstrs \
+; RUN: | FileCheck %s --check-prefixes=CHECK-BUG-FIXED
; RUN: llc < %s -march=nvptx -mcpu=sm_20 -verify-machineinstrs -trap-unreachable \
-; RUN: | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-TRAP
+; RUN: | FileCheck %s --check-prefixes=CHECK,CHECK-TRAP
; RUN: llc < %s -march=nvptx64 -mcpu=sm_20 -verify-machineinstrs -trap-unreachable \
-; RUN: | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-TRAP
+; RUN: | FileCheck %s --check-prefixes=CHECK,CHECK-TRAP
; RUN: %if ptxas && !ptxas-12.0 %{ llc < %s -march=nvptx -mcpu=sm_20 -verify-machineinstrs | %ptxas-verify %}
; RUN: %if ptxas %{ llc < %s -march=nvptx64 -mcpu=sm_20 -verify-machineinstrs | %ptxas-verify %}
@@ -21,12 +25,11 @@ define void @kernel_func() {
; CHECK-TRAP: trap;
; CHECK-NOTRAP-NOT: trap;
; CHECK: exit;
+; CHECK-BUG-FIXED-NOT: exit;
unreachable
}
attributes #0 = { noreturn }
-
!nvvm.annotations = !{!1}
-
!1 = !{ptr @kernel_func, !"kernel", i32 1}
``````````
</details>
https://github.com/llvm/llvm-project/pull/109868
More information about the llvm-commits
mailing list