[PATCH] D130729: [InferAddressSpaces] [AMDGPU] Add inference for flat_atomic intrinsics

Thu Aug 4 14:00:30 PDT 2022

b-sumner added inline comments.

================
Comment at: llvm/test/CodeGen/AMDGPU/gep-const-offset-address-space.ll:158
+; CHECK-NEXT:    v_pk_mov_b32 v[0:1], s[6:7], s[6:7] op_sel:[0,1]
+; CHECK-NEXT:    global_atomic_add_f64 v2, v[0:1], s[0:1]
+; CHECK-NEXT:    s_endpgm
----------------
rampitec wrote:
> arsenm wrote:
> > rampitec wrote:
> > > arsenm wrote:
> > > > jrbyrnes wrote:
> > > > > arsenm wrote:
> > > > > > I wouldn't expect this transform to happen. I would expect us to emit the flat instruction for the flat atomic despite the address space.
> > > > > Not for this PHI test in particular, but for all these tests in which we lower to a global_atomic, right? 
> > > > Yes. I expect the flat atomic intrinsic to give the flat instruction regardless of address space
> > > If we know the AS exactly, why not do it? Especially since we widely use code specialization with AS checks when the flat atomic is unavailable.
> > I thought the whole reason we had these address-space-specific intrinsics in the first place was the painfully divergent behaviors in the instructions.
> Added @b-sumner. There is some divergence between DS and VMEM, but I do not recall any between global and flat within the same GPU. Then again, I believe these intrinsics exist to use what the target can offer, so mostly because of the divergence between the GPUs themselves.
Agreed.  These intrinsics are used to expose HW capabilities when available, and users will be pleased if we can specialize to a known address space.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130729/new/

https://reviews.llvm.org/D130729