[PATCH] D73815: AMDGPU: Fix divergence analysis of control flow intrinsics

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 3 15:24:38 PST 2020


arsenm added a comment.

I think the only obstacle to eliminating requiresUniformRegister is the treatment of phis with always-uniform inputs. In this case, for example, the LCSSA phi in the return block was incorrectly concluded to be divergent, despite there being only one always-uniform input:

  Printing analysis 'Legacy Divergence Analysis' for function 'atomic_nand_i32_lds':
  DIVERGENT: i32 addrspace(3)* %ptr
  
             :
  DIVERGENT:       %1 = load i32, i32 addrspace(3)* %ptr, align 4
                   br label %atomicrmw.start
  
             atomicrmw.start:
                   %phi.broken = phi i64 [ %4, %atomicrmw.start ], [ 0, %0 ]
  DIVERGENT:       %loaded = phi i32 [ %1, %0 ], [ %newloaded, %atomicrmw.start ]
  DIVERGENT:       %2 = and i32 %loaded, 4
  DIVERGENT:       %new = xor i32 %2, -1
  DIVERGENT:       %3 = cmpxchg i32 addrspace(3)* %ptr, i32 %loaded, i32 %new seq_cst seq_cst
  DIVERGENT:       %success = extractvalue { i32, i1 } %3, 1
  DIVERGENT:       %newloaded = extractvalue { i32, i1 } %3, 0
                   %4 = call i64 @llvm.amdgcn.if.break.i64(i1 %success, i64 %phi.broken)
  DIVERGENT:       %5 = call i1 @llvm.amdgcn.loop.i64(i64 %4)
  DIVERGENT:       br i1 %5, label %atomicrmw.end, label %atomicrmw.start
  
             atomicrmw.end:
  DIVERGENT:       %newloaded.lcssa = phi i32 [ %newloaded, %atomicrmw.start ]
  DIVERGENT:       %.lcssa = phi i64 [ %4, %atomicrmw.start ]
  DIVERGENT:       call void @llvm.amdgcn.end.cf.i64(i64 %.lcssa)
  DIVERGENT:       ret i32 %newloaded.lcssa


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73815/new/

https://reviews.llvm.org/D73815




