[PATCH] D16603: AMDGPU/SI: Detect uniform branches and emit s_cbranch instructions

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 29 21:08:38 PST 2016


arsenm added inline comments.

================
Comment at: lib/Target/AMDGPU/SIFixSGPRLiveRanges.cpp:115
@@ +114,3 @@
+                                          E = MBB->end(); I != E; ++I) {
+      if (!TII->isSOPP(I->getOpcode()))
+        return false;
----------------
tstellarAMD wrote:
> arsenm wrote:
> > You should use TII->isSOPP(*I) here.
> > 
> > I think there should be an assert that I->isBranch()
> You mean the assert should be inside the branch?
Actually since this is from getFirstTerminator, I think the MachineVerifier would already catch this so it's not important

================
Comment at: lib/Target/AMDGPU/SIInstrInfo.cpp:2924-2925
@@ +2923,4 @@
+
+    if (I->getOpcode() == AMDGPU::S_CBRANCH_SCC0 ||
+        I->getOpcode() == AMDGPU::S_CBRANCH_SCC1)
+      Worklist.push_back(I);
----------------
tstellarAMD wrote:
> arsenm wrote:
> > Why is this limited to branches? What if you had another user like a s_cselect? I would expect any user to need processing
> If we don't limit this to branch instructions, then any instructions in the block which read scc will be moved to VALU, which we don't want.  For most instructions the decision to move them to the VALU is based on explicit uses.  This function needs to handle the cases where changing and implicit uses would force the instruction to go to the VALU.
But if the SCC def instruction was moved, any non-branch users would also need to be moved. We don't want all SCC users, just the ones until the next SCC def which is what this loop does. I don't think being implicit avoids the problem of needing to move the users.

Also I think this function should be only called for isDef() below. I think this works because the def check is still done for the original def instruction, but is more work to follow. This will end up getting called twice if you happen to have an addc to branch on overflow which has both a use and a def, and it will end up moving more instructions from the use.

The loop should also probably start at SCCDefInst + 1 also.



http://reviews.llvm.org/D16603





More information about the llvm-commits mailing list