[PATCH] D68092: [AMDGPU] Invert the handling of skip insertion.

Tue Oct 1 06:17:33 PDT 2019

cdevadas marked 2 inline comments as done.
cdevadas added a comment.

You are right, we need to redesign the function shouldRetainSkips, especially in computing the cost. It is not guaranteed that the order of 'From' to 'To' blocks is a fall-through. 
There could essentially be a nested control-flow which makes the cost computation a little complex. We can only approximate the number of instructions in the region. 
We have talked about it earlier and trying to make the current design more close to how SIInsertSkip works now.

================
Comment at: lib/Target/AMDGPU/SILowerControlFlow.cpp:247-248

   // Insert a pseudo terminator to help keep the verifier happy. This will also
   // be used later when inserting skips.
+  MachineInstr *NewBr = BuildMI(MBB, I, DL, TII->get(AMDGPU::S_CBRANCH_EXECZ))
----------------
nhaehnle wrote:
> Comment needs to be updated.
I will change it.

================
Comment at: lib/Target/AMDGPU/SIRemoveShortExecBranches.cpp:10-12
+/// This pass removes the unwanted exec mask instructions inserted during
+/// SILowerControlFlow. It removes the exec masks for the short branches and
+/// tries to retain it for the long branches.
----------------
nhaehnle wrote:
> This should say that it removes EXECZ branches for short branches, right?
> 
> Also... "try to retain"? I do hope we always succeed at that :)
Yes, it is misleading. I will fix the comment.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68092/new/

https://reviews.llvm.org/D68092