[PATCH] D31350: AMDGPU : Fix common dominator of two incoming blocks terminates with uniform branch issue.

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 12 12:35:14 PDT 2017


arsenm added inline comments.


================
Comment at: lib/Target/AMDGPU/SIFixSGPRCopies.cpp:333
+                               const TargetRegisterInfo *TRI) {
+  DenseSet<MachineBasicBlock*> Visited;
+  SmallVector<MachineBasicBlock*, 4> Worklist(MBB->pred_begin(), 
----------------
wdng wrote:
> arsenm wrote:
> > You could maybe make this a map and cache the result so that every single block that needs to be visited doesn't need to walk all the way up each time
> Visited is used to check whether the visiting node has been visited before, once is has been visited before it's predecessors won't be processed again. For example:
> 
> ```
>     D
>   /   \
>  A     C
>  |  \  |
>  B     E 
>   \   /
>     F
> ```
> Assume starting from node F,  trace down F->B->A->D, when visiting from F to E, A and D won't be processed again. So I don't think there is a need to save results like A and D.
If you consider the entire function, the same predecessors will be visited for phis in other blocks


================
Comment at: lib/Target/AMDGPU/SIFixSGPRCopies.cpp:78
 #include "llvm/Target/TargetMachine.h"
+#include "llvm/ADT/DenseSet.h"
 
----------------
This should be the first included llvm header


================
Comment at: test/CodeGen/AMDGPU/sgprcopies.ll:6
+; GCN: v_add
+define void @checkTwoBlocksWithUniformBranch(i32 addrspace(1)* nocapture %out, i32 %width, float %xPos, float %yPos, float %xStep, float %yStep, i32 %maxIter) {
+entry:
----------------
This should be marked amdgpu_kernel


Repository:
  rL LLVM

https://reviews.llvm.org/D31350





More information about the llvm-commits mailing list