[PATCH] D126389: [AMDGPU] Improve codegen of extractelement/insertelement in some cases

Piotr Sobczak via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue May 31 06:55:33 PDT 2022


piotr added a comment.

In D126389#3546536 <https://reviews.llvm.org/D126389#3546536>, @jpages wrote:

> In D126389#3540486 <https://reviews.llvm.org/D126389#3540486>, @rampitec wrote:
>
>> In D126389#3540098 <https://reviews.llvm.org/D126389#3540098>, @jpages wrote:
>>
>>> In D126389#3539881 <https://reviews.llvm.org/D126389#3539881>, @foad wrote:
>>>
>>>> In D126389#3538346 <https://reviews.llvm.org/D126389#3538346>, @rampitec wrote:
>>>>
>>>>> Any performance numbers? The 8 element case was driven by a specific customer program and the performance of the cmp/select was better than movrel.
>>>>
>>>> I don't know why that would be. Maybe the performance characteristics are different on GFX10+ compared to GFX9.
>>>>
>>>> Also on GFX10+ sgpr usage does not affect occupancy, so perhaps the heuristic could be tweaked to make it more likely to use s_movrel (not v_movrel) on GFX10+.
>>>
>>> I will try to get some performance numbers on specific games. Do you know if this performance problem was specific to an architecture? 
>>> Like Jay said, I could tweak the heuristic for this and only generate it for GFX10+.
>>
>> All the tests were done on gfx9 flavors. It is also worth noting gfx9 does not have movrel, but uses s_set_gpr_idx_on instead, which may be the difference.
>>
>> I.e. it may be linked to FeatureMovrel.
>
> I measured the performance with/without the patch on the original game in which we spotted this issue.
> It's a small improvement of about 0.5% in the current version compared to without the patch. This was measured on GFX1030.
>
> It seems to be a little bit faster with the patch, I can't really test on gfx9 with this game.
>
> If it's better to keep the initial version for gfx9, I could:
>
> - Only do this patch for GFX10+
> - Some flag?

Since the current heuristic was driven by gfx9 that doesn't have movrel, Stas's suggestion to make it conditional on FeatureMovrel makes perfect sense to me.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D126389/new/

https://reviews.llvm.org/D126389



More information about the llvm-commits mailing list