[PATCH] D123343: [AMDGPU] Refactor LDS alignment checks.

Thu Apr 7 16:11:59 PDT 2022

rampitec added a comment.

I am not sure we really want to tell the truth about 'Fast' here. If we report that a DS read misaligned by 1 byte is slow, the vectorizer will not combine two of them, and we will get two separate ds_read_b32 instructions instead of a single ds_read2_b32. The misaligned access is slow, but ds_read2_b32 is still faster than two separate instructions that are equally misaligned. This is what happens in that case: https://reviews.llvm.org/differential/diff/421361/
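
To make the concern concrete, here is a toy C++ model of the interaction. It is only a sketch and is not LLVM code: `AccessInfo`, `queryMisalignedAccess`, `selectLoads`, and the `TellTruth` switch are hypothetical names invented for illustration, and the real vectorizer weighs more than a single flag.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for what a target hook reports about a
// misaligned LDS access: whether it is legal and whether it is "fast".
struct AccessInfo {
  bool Legal;
  bool Fast;
};

// Toy policy. With TellTruth set, a 32-bit access aligned to fewer than
// 4 bytes is reported as not fast. Without it, the access is always
// reported as fast, even though it is not.
AccessInfo queryMisalignedAccess(unsigned AlignBytes, bool TellTruth) {
  bool Aligned = AlignBytes >= 4;
  return {true, Aligned || !TellTruth};
}

// Toy vectorizer: combines two adjacent 32-bit loads into one paired
// load only when the access is reported as fast.
std::vector<std::string> selectLoads(unsigned AlignBytes, bool TellTruth) {
  AccessInfo Info = queryMisalignedAccess(AlignBytes, TellTruth);
  assert(Info.Legal && "toy model: the access is always legal");
  if (Info.Fast)
    return {"ds_read2_b32"};
  return {"ds_read_b32", "ds_read_b32"};
}
```

In this model, `selectLoads(1, true)` yields two `ds_read_b32` entries, while `selectLoads(1, false)` yields a single `ds_read2_b32`, which is the outcome argued for above.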

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123343/new/

https://reviews.llvm.org/D123343