[PATCH] D80364: [amdgpu] Teach load widening to handle non-DWORD aligned loads.

Fri May 22 13:25:09 PDT 2020

arsenm added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:7472-7475
+    std::tie(AD, RC) =
+        MFI->getPreloadedValue(AMDGPUFunctionArgInfo::DISPATCH_PTR);
+    if (AD && AD->isRegister() && AD->getRegister() == Reg)
+      return true;
----------------
hliao wrote:
> hliao wrote:
> > hliao wrote:
> > > arsenm wrote:
> > > > This will be true for all of the preloaded SGPRs. However, I think looking for the argument copies here is the wrong approach. The original intrinsic calls should have been annotated with the align return value attribute. Currently this is a burden on the frontend/library call emitting the intrinsic. We could either annotate the intrinsic calls in one of the later passes (maybe AMDGPUCodeGenPrepare), or add a minimum alignment to the intrinsic definition which will always be applied, similar to how intrinsic declarations already get their other attributes.
> > > Do we have that attribute available? Could you elaborate in detail?
> > OK, I found that.
> In another review, D80422 (https://reviews.llvm.org/D80422), the relevant intrinsics are being annotated with an assumed alignment on the return pointer. Unfortunately, in SelectionDAG, we won't have a facility to keep track of that hint. I will enhance the `AMDGPUCodeGenPrepare` pass to do something similar.
We don't need to explicitly know the pointer alignment. If the intrinsic was properly annotated from the beginning (as would be the case after D80422), the downstream load users would have the correct alignment assigned. The optimizer already propagates alignment information.
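
As a rough illustration of why an annotated base pointer is enough (this is a standalone sketch, not code from this patch or from LLVM; the function names are hypothetical): once the base pointer's alignment is known, the alignment of a load at a constant offset from it follows from simple arithmetic, in the same spirit as LLVM's own alignment helpers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns true if Value is a nonzero power of two.
inline bool isPowerOfTwo(uint64_t Value) {
  return Value != 0 && (Value & (Value - 1)) == 0;
}

// Hypothetical helper: the alignment that can be assumed for the address
// (Base + Offset) when Base is known to be BaseAlign-aligned.
// The result is the largest power of two dividing both BaseAlign and Offset,
// i.e. the lowest set bit of (BaseAlign | Offset).
inline uint64_t alignmentAfterOffset(uint64_t BaseAlign, uint64_t Offset) {
  assert(isPowerOfTwo(BaseAlign) && "alignment must be a power of two");
  uint64_t Bits = BaseAlign | Offset;
  return Bits & (~Bits + 1); // isolate the lowest set bit
}

// Hypothetical check: a load can be treated as DWORD-aligned (4 bytes)
// only if the derived alignment is at least 4.
inline bool isDwordAligned(uint64_t BaseAlign, uint64_t Offset) {
  return alignmentAfterOffset(BaseAlign, Offset) >= 4;
}
```

For example, assuming purely for illustration that the intrinsic's return pointer is annotated as 4-byte aligned, a load at offset 12 is still 4-byte aligned, while a load at offset 6 is only 2-byte aligned and would be a candidate for the non-DWORD-aligned handling this patch is about.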

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80364/new/

https://reviews.llvm.org/D80364