[PATCH] D76567: AMDGPU: Implement getMemcpyLoopLoweringType

Thu Mar 26 09:45:47 PDT 2020

arsenm marked an inline comment as done.
arsenm added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:332
+    // introduce them.
+    if (MinAlign == 2)
+      return Type::getInt8Ty(Context);
----------------
foad wrote:
> arsenm wrote:
> > foad wrote:
> > > arsenm wrote:
> > > > foad wrote:
> > > > > `<=`? You can't do unaligned dword (or multi-dword) accesses, can you?
> > > > Yes, you can on anything remotely new. It's also not critical to get this exactly right, since the loads will still be legalized later.
> > > Then I don't understand why you have a special case for `MinAlign == 2` at all. Why not just use unaligned (multi-)dword accesses, like you would for `MinAlign == 1`?
> > See D74345. We recently discovered that 2-byte-aligned accesses end up being executed as multiple 1-byte accesses.
> So accesses are slow if `(run time address) % 4 == 2`. If `MinAlign == 2` then there's a 50% chance of the access being slow. If `MinAlign == 1` then it's only a 25% chance. Is that right?
Yes, that is my understanding.
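
To make the 50% vs. 25% arithmetic above concrete, here is a rough sketch that enumerates the possible values of `address % 4` for a given minimum alignment and counts how many hit the slow case. It is not part of the patch: `SlowFraction` and `slowFraction` are hypothetical names, and the sketch assumes that run-time addresses are spread evenly over the residues the alignment allows and that `address % 4 == 2` is the only slow case, as described in this thread.

```cpp
// Illustrative only; not code from D76567.
// Assumption: an access is slow exactly when (address % 4) == 2, and every
// residue permitted by MinAlign is equally likely at run time.
struct SlowFraction {
  unsigned Slow;  // number of residues mod 4 that hit the slow case
  unsigned Total; // number of residues mod 4 permitted by MinAlign
};

// MinAlign must be a nonzero power of two (1, 2, 4, ...).
constexpr SlowFraction slowFraction(unsigned MinAlign) {
  unsigned Slow = 0;
  unsigned Total = 0;
  // Walk the residues 0..3 that an address with this alignment can have.
  for (unsigned Residue = 0; Residue < 4; Residue += MinAlign) {
    ++Total;
    if (Residue == 2)
      ++Slow;
  }
  return {Slow, Total};
}
```

Under those assumptions, `slowFraction(1)` gives 1 slow residue out of 4 (25%), `slowFraction(2)` gives 1 out of 2 (50%), and `slowFraction(4)` gives 0 out of 1.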

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76567/new/

https://reviews.llvm.org/D76567