[PATCH] D101591: [AMDGPU] Improve global SADDR selection
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Apr 29 17:49:29 PDT 2021
arsenm added a comment.
In D101591#2727536 <https://reviews.llvm.org/D101591#2727536>, @rampitec wrote:
> In D101591#2727519 <https://reviews.llvm.org/D101591#2727519>, @arsenm wrote:
>
>> This looks like it's regressing the case where the offset is known constant, but just doesn't fit.
>
> Do you mean atomic_add_i32_huge_offset() test? Will check it.
That's one case, but it looks like all of the cases that add an extra v_mov_b32 (e.g. global_load_saddr_i8_offset_neg4096 which I commented on).
>> Can you also make the corresponding globalisel change?
>
> Yes, will do.
================
Comment at: llvm/test/CodeGen/AMDGPU/global-saddr-load.ll:88-91
+; GFX10-NEXT: v_mov_b32_e32 v0, 0
+; GFX10-NEXT: s_add_u32 s0, s2, 0xfffff000
+; GFX10-NEXT: s_addc_u32 s1, s3, -1
+; GFX10-NEXT: global_load_ubyte v0, v0, s[0:1]
----------------
rampitec wrote:
> arsenm wrote:
> > This is more instructions
> It is one materialized zero, I think it is more or less degenarate case. In a real world there is usually a vgpr with zero. On the other hand this is one VGPR less.
This is the same pattern in all of the cases with a large immediate offset split. I think this is reasonably common and I would lean towards fewer instructions here
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D101591/new/
https://reviews.llvm.org/D101591
More information about the llvm-commits
mailing list