[PATCH] D101591: [AMDGPU] Improve global SADDR selection

Thu Apr 29 17:49:29 PDT 2021

arsenm added a comment.

In D101591#2727536 <https://reviews.llvm.org/D101591#2727536>, @rampitec wrote:

> In D101591#2727519 <https://reviews.llvm.org/D101591#2727519>, @arsenm wrote:
>
>> This looks like it's regressing the case where the offset is known constant, but just doesn't fit.
>
> Do you mean atomic_add_i32_huge_offset() test? Will check it.

That's one case, but it looks like the same applies to all of the cases that add an extra v_mov_b32 (e.g. global_load_saddr_i8_offset_neg4096, which I commented on).

>> Can you also make the corresponding GlobalISel change?
>
> Yes, will do.

================
Comment at: llvm/test/CodeGen/AMDGPU/global-saddr-load.ll:88-91
+; GFX10-NEXT:    v_mov_b32_e32 v0, 0
+; GFX10-NEXT:    s_add_u32 s0, s2, 0xfffff000
+; GFX10-NEXT:    s_addc_u32 s1, s3, -1
+; GFX10-NEXT:    global_load_ubyte v0, v0, s[0:1]
----------------
rampitec wrote:
> arsenm wrote:
> > This is more instructions
> It is one materialized zero; I think it is more or less a degenerate case. In the real world there is usually a VGPR holding zero already. On the other hand, this uses one VGPR less.
This is the same pattern in all of the cases where a large immediate offset is split. I think this is reasonably common, and I would lean towards fewer instructions here.
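
For reference, the s_add_u32 / s_addc_u32 pair in the quoted check lines is a 64-bit add of -4096 to the base in s[2:3], done as two 32-bit halves with a carry. A minimal Python sketch of that arithmetic follows; the function name and the sample base address are illustrative assumptions, not anything from the patch:

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def add64_via_32bit_halves(base_lo, base_hi, offset):
    """Model an s_add_u32 / s_addc_u32 pair: add a signed 64-bit offset
    to a 64-bit base that is held as two 32-bit halves."""
    off = offset & MASK64                     # two's-complement encoding
    off_lo, off_hi = off & MASK32, off >> 32  # immediates for the two adds
    lo_sum = base_lo + off_lo
    carry = lo_sum >> 32                      # carry-out of the low add
    lo = lo_sum & MASK32
    hi = (base_hi + off_hi + carry) & MASK32  # high add consumes the carry
    return lo, hi, off_lo, off_hi

# Hypothetical base address 0x1_00002000, offset -4096 as in the test.
lo, hi, off_lo, off_hi = add64_via_32bit_halves(0x00002000, 0x00000001, -4096)
print(hex(off_lo), hex(off_hi))  # 0xfffff000 0xffffffff (i.e. -1)
print(hex(lo), hex(hi))          # 0x1000 0x1
```

The low-half immediate 0xfffff000 and the high-half immediate -1 match the operands in the GFX10 check lines above. The sketch only shows the address arithmetic; it does not model which offsets fit in the instruction's immediate field.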

https://reviews.llvm.org/D101591