[PATCH] D31350: AMDGPU : Fix common dominator of two incoming blocks terminates with uniform branch issue.
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Apr 12 12:35:14 PDT 2017
arsenm added inline comments.
================
Comment at: lib/Target/AMDGPU/SIFixSGPRCopies.cpp:333
+ const TargetRegisterInfo *TRI) {
+ DenseSet<MachineBasicBlock*> Visited;
+ SmallVector<MachineBasicBlock*, 4> Worklist(MBB->pred_begin(),
----------------
wdng wrote:
> arsenm wrote:
> > You could maybe make this a map and cache the result so that every single block that needs to be visited doesn't need to walk all the way up each time
> Visited is used to check whether the visiting node has been visited before, once is has been visited before it's predecessors won't be processed again. For example:
>
> ```
> D
> / \
> A C
> | \ |
> B E
> \ /
> F
> ```
> Assume starting from node F, trace down F->B->A->D, when visiting from F to E, A and D won't be processed again. So I don't think there is a need to save results like A and D.
If you consider the entire function, the same predecessors will be visited for phis in other blocks
================
Comment at: lib/Target/AMDGPU/SIFixSGPRCopies.cpp:78
#include "llvm/Target/TargetMachine.h"
+#include "llvm/ADT/DenseSet.h"
----------------
This should be the first included llvm header
================
Comment at: test/CodeGen/AMDGPU/sgprcopies.ll:6
+; GCN: v_add
+define void @checkTwoBlocksWithUniformBranch(i32 addrspace(1)* nocapture %out, i32 %width, float %xPos, float %yPos, float %xStep, float %yStep, i32 %maxIter) {
+entry:
----------------
This should be marked amdgpu_kernel
Repository:
rL LLVM
https://reviews.llvm.org/D31350
More information about the llvm-commits
mailing list