[PATCH] D98491: [AMDGPU] Split GCN subtarget features for unaligned access

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 23 07:51:25 PDT 2021


rampitec added a comment.

In D98491#2643944 <https://reviews.llvm.org/D98491#2643944>, @hsmhsm wrote:

> @mbrkusanin
> My proposal was simple: prefer ds_read2_b64 (and write) over b128 if alignment < 16. I never heard of b64 performance issues though, so the rest is the same as now: create a widest load/store possible with that one exception for b128.

We have since confirmed that ds_read_b64 takes the same performance hit on memory that is not 64-bit aligned, so 64-bit operations need an alignment check too.
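
To make the rule concrete, here is a rough sketch of the check, not the code in this patch. The helper name `isWideDsAccessOk` and its parameters are made up for illustration, and treating every width other than 64 and 128 bits as unaffected is an assumption, since only b64 and b128 were discussed here.

```cpp
#include <cassert>

// Illustrative sketch only; this is NOT the code from D98491.
// Hypothetical helper: returns true when a single DS access of `widthBits`
// is naturally aligned and so avoids the misalignment penalty discussed
// in this thread.
bool isWideDsAccessOk(unsigned widthBits, unsigned alignBytes) {
  assert(alignBytes != 0 && "alignment must be nonzero");
  switch (widthBits) {
  case 128:
    // b128 wants 16-byte alignment; otherwise prefer a narrower form.
    return alignBytes >= 16;
  case 64:
    // b64 was confirmed to have the same issue below 8-byte alignment.
    return alignBytes >= 8;
  default:
    // Assumption: other widths are outside the scope of this discussion.
    return true;
  }
}
```

With this, `isWideDsAccessOk(128, 8)` is false, which is the case where the proposal above would pick ds_read2_b64 instead, and `isWideDsAccessOk(64, 4)` is false as well, which is the new case.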


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98491/new/

https://reviews.llvm.org/D98491


