[PATCH] D123343: [AMDGPU] Refactor LDS alignment checks.

Wed Apr 13 06:32:21 PDT 2022

foad added a comment.

In D123343#3437583 <https://reviews.llvm.org/D123343#3437583>, @rampitec wrote:

> I am not sure we really want to tell the truth about 'Fast' here. If we report that a DS read misaligned by 1 byte is slow, the vectorizer will not combine two of them, and we will get two separate ds_read_b32 instructions instead of one ds_read2_b32. The ds_read2_b32 is slow, but it is still faster than two separate, equally misaligned instructions. This is what happens then: https://reviews.llvm.org/differential/diff/421361/

Maybe LoadStoreVectorizer should be changed so that it still creates a slow instruction if the instructions being combined were already slow.
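
To make the suggestion concrete, here is a minimal sketch of the decision rule, assuming the target reports "allowed" and "fast" for both the narrow accesses and the combined one. AccessInfo and shouldCombine are hypothetical names used only for illustration; they are not LoadStoreVectorizer's actual code or API.

```cpp
#include <cassert>

// Hypothetical summary of what a target reports for a memory access.
// These names are illustrative only and are not LLVM's API.
struct AccessInfo {
  bool Allowed; // the target can perform the access at all
  bool Fast;    // the target considers the access fast
};

// Decide whether to merge two narrow accesses into one wide access.
// Merge when the wide access is fast, or when the narrow accesses were
// already slow, so merging never turns a fast access into a slow one.
constexpr bool shouldCombine(AccessInfo Narrow, AccessInfo Wide) {
  return Wide.Allowed && (Wide.Fast || !Narrow.Fast);
}

// Small self-check of the rule for the case discussed above.
inline void selfCheck() {
  // Two slow misaligned ds_read_b32 may still become one slow ds_read2_b32.
  assert(shouldCombine({true, false}, {true, false}));
  // Fast narrow accesses are not merged into a slow wide access.
  assert(!shouldCombine({true, true}, {true, false}));
}
```

With a rule like this, the target could report 'Fast' truthfully and the misaligned ds_read_b32 pair would still be combined, because the combined access is no slower in kind than its parts.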

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123343/new/

https://reviews.llvm.org/D123343