[PATCH] D147408: [AMDGPU] Iterative scan implementation for atomic optimizer.

Thu May 18 06:02:15 PDT 2023

foad added inline comments.

================
Comment at: llvm/test/CodeGen/AMDGPU/atomic_optimizations_buffer.ll:459-462
+; GFX8-NEXT:    s_ff1_i32_b32 s5, s3
+; GFX8-NEXT:    s_ff1_i32_b32 s6, s2
+; GFX8-NEXT:    s_add_i32 s5, s5, 32
+; GFX8-NEXT:    s_min_u32 s5, s6, s5
----------------
pravinjagtap wrote:
> pravinjagtap wrote:
> > foad wrote:
> > > Not your fault, but we really ought to be able to select s_ff1_i32_b64 here.
> > 
> > I am not sure how to address this. Maybe we need to teach ISel this specific pattern.
> Hello @foad,
> Can we treat generating s_ff1_i32_b64 and s_bitset0_b64 as an independent task (a future enhancement) and unblock this patch, since we need to submit this work in stages?
> Can we treat generating s_ff1_i32_b64 and s_bitset0_b64 as an independent task (a future enhancement)
Yes of course, it does not need to block this patch.
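
For reference, a minimal sketch of why the four-instruction GFX8 sequence quoted above could in principle be a single s_ff1_i32_b64. It is a plain C++ model, not LLVM or ISel code. The helper names (ff1_b32, ff1_b64, ff1_split) are hypothetical, and the model assumes that s[2:3] holds the 64-bit mask with s2 as the low half and s3 as the high half, and that s_ff1 returns -1 when no bit is set.

```cpp
#include <cstdint>

// Assumed model of s_ff1_i32_b32: index of the lowest set bit,
// or -1 (0xFFFFFFFF) when the input is zero.
static uint32_t ff1_b32(uint32_t x) {
  if (x == 0) return 0xFFFFFFFFu;
  uint32_t i = 0;
  while ((x & 1u) == 0) { x >>= 1; ++i; }
  return i;
}

// Assumed model of s_ff1_i32_b64: same operation on a 64-bit input.
static uint32_t ff1_b64(uint64_t x) {
  if (x == 0) return 0xFFFFFFFFu;
  uint32_t i = 0;
  while ((x & 1u) == 0) { x >>= 1; ++i; }
  return i;
}

// Model of the sequence in the test output, one statement per instruction.
static uint32_t ff1_split(uint64_t x) {
  uint32_t s2 = static_cast<uint32_t>(x);        // low half (assumed)
  uint32_t s3 = static_cast<uint32_t>(x >> 32);  // high half (assumed)
  uint32_t s5 = ff1_b32(s3);                     // s_ff1_i32_b32 s5, s3
  uint32_t s6 = ff1_b32(s2);                     // s_ff1_i32_b32 s6, s2
  s5 = s5 + 32;                                  // s_add_i32 s5, s5, 32 (wraps)
  return s6 < s5 ? s6 : s5;                      // s_min_u32 s5, s6, s5
}
```

Under these assumptions ff1_split and ff1_b64 agree for every non-zero input. They differ only when the input is zero, where the wrapping add makes the split form return 31 instead of -1, so the equivalence relies on the mask being non-zero at this point.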

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147408/new/

https://reviews.llvm.org/D147408