[PATCH] D123343: [AMDGPU] Refactor LDS alignment checks.

Wed Apr 13 07:58:32 PDT 2022

rampitec added a comment.

In D123343#3448319 <https://reviews.llvm.org/D123343#3448319>, @foad wrote:

> In D123343#3437583 <https://reviews.llvm.org/D123343#3437583>, @rampitec wrote:
>
>> I am not sure we really want to tell truth about the 'Fast' here. If we tell that DS read misaligned by 1 byte is slow vectorizer will not combine 2 of them and we will get 2 separate ds_read_b32 instead of ds_read2_b32. It is slow, but the ds_read2_b32 is still faster than 2 separate instructions equally misaligned. That is what happens then: https://reviews.llvm.org/differential/diff/421361/
>
> Maybe LoadStoreVectorizer should be changed to create slow instructions, if the instructions being combined were slow already.

I have a w/a for this in the D123634 <https://reviews.llvm.org/D123634>, but in general I do not think 'fast' or 'slow' is a right measure. Something is not fast or slow, but faster or slower than something other. This would be a big change though.

================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:1555
       // gfx8 and older.
-      bool AlignedBy16 = Alignment >= Align(16);
-      if (IsFast)
-        *IsFast = AlignedBy16;
+      RequiredAlignment = Align(16);
+      break;
----------------
foad wrote:
> You don't need this - it is already handled by using PowerOf2Ceil to initialize RequiredAlignment.
Indeed.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123343/new/

https://reviews.llvm.org/D123343