[PATCH] D60834: [AMDGPU] Uniform values being used outside loop marked non-divergent

Wed Apr 17 13:15:01 PDT 2019

rtaylor added a comment.

We have a test case in which a value that is uniform inside a loop is used outside the loop, at a point where threads might have diverged.

For example:

define amdgpu_ps void @_amdgpu_ps_main(<4 x i32> inreg %desc, float %divergent, <2 x i32> %ptrish) {
bb59:
  br label %.preheader

.preheader:
  ; %tmp62 is uniform inside the loop
  %tmp62 = phi i32 [ %tmp105, %bb104 ], [ 0, %bb59 ]
  ; cmp and branch here

bb104:
  %tmp105 = add nuw nsw i32 %tmp62, 1
  ; cmp and branch here

.loopexit:
  ; %tmp62 is used here, outside the loop
  %load2 = tail call i32 @llvm.amdgcn.s.buffer.load.i32(<4 x i32> %desc, i32 %tmp62, i32 0)
}

This patch runs LCSSA after StructurizeCFG. LCSSA inserts PHI nodes into the exit block for this type of value, which allows divergence analysis (DA) to handle the value properly after the loop.
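
To illustrate why such a value stops being uniform once the loop has exited, here is a small Python model. It is only a sketch and not part of the patch: the names run_wave and trip_counts are hypothetical, and it assumes lanes leave the loop on different iterations because of a divergent exit condition. A single shared counter plays the role of %tmp62.

```python
def run_wave(trip_counts):
    """Model one wave of lanes running the same counting loop.

    trip_counts[i] is the (hypothetical) iteration on which lane i leaves
    the loop. Returns the counter value each lane observes after the loop.
    """
    active = set(range(len(trip_counts)))
    seen_after_loop = {}
    counter = 0  # models %tmp62: one value shared by all active lanes

    while active:
        # Inside the loop every active lane sees the same counter value,
        # so the counter is uniform here.
        for lane in sorted(active):
            if counter >= trip_counts[lane]:  # divergent exit condition
                # The lane leaves the loop and keeps the value it saw last.
                seen_after_loop[lane] = counter
                active.remove(lane)
        counter += 1

    return [seen_after_loop[lane] for lane in range(len(trip_counts))]


if __name__ == "__main__":
    # All lanes exit together: the value is still uniform after the loop.
    print(run_wave([2, 2, 2]))
    # Lanes exit on different iterations: the value differs per lane.
    print(run_wave([1, 3, 0]))
```

When every lane exits on the same iteration, the value after the loop is still the same across lanes. When the exit iterations differ, each lane carries a different value out of the loop, which is the case a PHI in the exit block lets the analysis see.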

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60834/new/

https://reviews.llvm.org/D60834