[llvm] 323bfde - AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns
Connor Abbott via llvm-commits
llvm-commits at lists.llvm.org
Wed Jan 29 06:29:23 PST 2020
Author: Connor Abbott
Date: 2020-01-29T15:08:46+01:00
New Revision: 323bfde20c5f3e63db3d6b385b394ed38542abe6
URL: https://github.com/llvm/llvm-project/commit/323bfde20c5f3e63db3d6b385b394ed38542abe6
DIFF: https://github.com/llvm/llvm-project/commit/323bfde20c5f3e63db3d6b385b394ed38542abe6.diff
LOG: AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns
Summary:
In a few places the code assumed that a function with only one exit
must exit through a normal return, which is not true. The single exit
could be an infinite loop, in which case we still need to insert the
usual fake edge so that the null export happens. This fixes shaders that
end with an infinite loop that discards.
Reviewers: arsenm, nhaehnle, critson
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71192
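
As a reading aid for the two hunks below, here is a minimal, standalone sketch of the early-exit decisions before and after this change. It is not the pass itself: `ExitSummary`, `proceedsBefore` and `proceedsAfter` are hypothetical names invented for illustration, and the real pass does additional work between these checks (for example handling unreachable blocks) that the sketch leaves out. Only `InsertExport` and the three conditions are taken from the diff.

```cpp
#include <cstddef>

// Hypothetical summary of what the pass knows about a function's exits.
// The field names are illustrative; only InsertExport mirrors a variable
// that appears in the diff.
struct ExitSummary {
  std::size_t PostDomRoots;    // stands in for PDT.getRoots().size()
  std::size_t ReturningBlocks; // stands in for ReturningBlocks.size()
  bool InsertExport;           // stands in for the InsertExport flag
};

// Checks as they were before this commit. Returns true when none of the
// early "return false" exits shown in the diff would fire.
inline bool proceedsBefore(const ExitSummary &S) {
  if (S.PostDomRoots <= 1)
    return false; // removed by this commit
  if (S.ReturningBlocks == 0)
    return false; // no blocks return
  if (S.ReturningBlocks == 1)
    return false; // assumed to be a single normal return
  return true;
}

// Checks after this commit: a single returning block no longer ends the
// pass early when a null export still has to be inserted.
inline bool proceedsAfter(const ExitSummary &S) {
  if (S.ReturningBlocks == 0)
    return false; // no blocks return
  if (S.ReturningBlocks == 1 && !S.InsertExport)
    return false; // already has a single return block
  return true;
}
```

Assuming a shader like the new `only_kill` test is summarized as `ExitSummary{1, 1, true}` (one post-dominator root, one returning block, export required), `proceedsBefore` returns false while `proceedsAfter` returns true, which is the behavior change the summary describes.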
Added:
Modified:
llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp
llvm/test/CodeGen/AMDGPU/kill-infinite-loop.ll
Removed:
################################################################################
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp b/llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp
index 01bb60f07f2e..f7bd478d73e6 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.cpp
@@ -195,8 +195,6 @@ static BasicBlock *unifyReturnBlockSet(Function &F,
 bool AMDGPUUnifyDivergentExitNodes::runOnFunction(Function &F) {
   auto &PDT = getAnalysis<PostDominatorTreeWrapperPass>().getPostDomTree();
-  if (PDT.getRoots().size() <= 1)
-    return false;
   LegacyDivergenceAnalysis &DA = getAnalysis<LegacyDivergenceAnalysis>();
@@ -321,7 +319,7 @@ bool AMDGPUUnifyDivergentExitNodes::runOnFunction(Function &F) {
   if (ReturningBlocks.empty())
     return false; // No blocks return
-  if (ReturningBlocks.size() == 1)
+  if (ReturningBlocks.size() == 1 && !InsertExport)
     return false; // Already has a single return block
   const TargetTransformInfo &TTI
diff --git a/llvm/test/CodeGen/AMDGPU/kill-infinite-loop.ll b/llvm/test/CodeGen/AMDGPU/kill-infinite-loop.ll
index 30280b967ad8..a2358f3a80f4 100644
--- a/llvm/test/CodeGen/AMDGPU/kill-infinite-loop.ll
+++ b/llvm/test/CodeGen/AMDGPU/kill-infinite-loop.ll
@@ -45,6 +45,22 @@ end:
ret void
}
+; test the case where there's only a kill in an infinite loop
+; CHECK-LABEL: only_kill
+; CHECK: exp null off, off, off, off done vm
+; CHECK-NEXT: s_endpgm
+; SIInsertSkips inserts an extra null export here, but it should be harmless.
+; CHECK: exp null off, off, off, off done vm
+; CHECK-NEXT: s_endpgm
+define amdgpu_ps void @only_kill() #0 {
+main_body:
+  br label %loop
+
+loop:
+  call void @llvm.amdgcn.kill(i1 false) #3
+  br label %loop
+}
+
; In case there's an epilog, we shouldn't have to do this.
; CHECK-LABEL: return_nonvoid
; CHECK-NOT: exp null off, off, off, off done vm