[PATCH] D23417: AMDGPU/SI: Avoid moving PHIs to VALU when phi values are defined in scalar branches

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 11 13:19:28 PDT 2016


arsenm added inline comments.

================
Comment at: lib/Target/AMDGPU/SIFixSGPRCopies.cpp:101
@@ -96,2 +100,3 @@
   void getAnalysisUsage(AnalysisUsage &AU) const override {
+    AU.addRequired<MachineDominatorTree>();
     AU.setPreservesCFG();
----------------
Should also preserve it (it might be redundant with preserves CFG but I'm not sure if you actually use it)

================
Comment at: lib/Target/AMDGPU/SIFixSGPRCopies.cpp:248
@@ -238,1 +247,3 @@
 
+static bool hasUniformTerminator(const MachineBasicBlock *MBB) {
+  MachineBasicBlock::const_iterator Term = MBB->getFirstTerminator();
----------------
Reference

================
Comment at: lib/Target/AMDGPU/SIFixSGPRCopies.cpp:249-266
@@ -239,1 +248,20 @@
+static bool hasUniformTerminator(const MachineBasicBlock *MBB) {
+  MachineBasicBlock::const_iterator Term = MBB->getFirstTerminator();
+
+  // No terminator means this is a fall-through which is a uniform branch.
+  if (Term == MBB->end())
+    return true;
+
+  switch (Term->getOpcode()) {
+  default:
+    return false;
+  case AMDGPU::S_BRANCH:
+  case AMDGPU::S_CBRANCH_SCC0:
+  case AMDGPU::S_CBRANCH_SCC1:
+  case AMDGPU::S_CBRANCH_VCCNZ:
+  case AMDGPU::S_CBRANCH_VCCZ:
+    return true;
+  }
+}
+
 bool SIFixSGPRCopies::runOnMachineFunction(MachineFunction &MF) {
----------------
This is also missing s_cbranch_exec[n]z, s_endpgm, si_return.

This function seems generally wrong to me. What exactly are you looking for by uniform terminator? The way I've been thinking about it, any branch terminator is uniform. 

This only looks at a single terminator, so it won't correctly handle something like:

s_cbranch_scc0 BB1
s_branch BB2

I think what you really want is to look for is si_mask_branch (or at this early point one of the pseudo terminators that modify exec, none of which should be branches).

But you can also have something like:
si_mask_branch BB1
s_cbranch_vccz BB1
s_branch BB2

Coming directly out of isel I think this is the most likely scenario since BranchFolding is supposed to cleanup unneeded unconditional branches. I think what you probably want to be looking for is really a terminator that modifiers exec.

================
Comment at: lib/Target/AMDGPU/SIFixSGPRCopies.cpp:307
@@ -278,1 +306,3 @@
 
+        // We don't need to fix the PHI if the common denominator of the
+        // two incoming blocks terminates with a uniform branch.
----------------
s/denominator/dominator


https://reviews.llvm.org/D23417





More information about the llvm-commits mailing list