[PATCH] D80364: [amdgpu] Teach load widening to handle non-DWORD aligned loads.

Wed May 27 16:56:23 PDT 2020

arsenm added a comment.

I'd still like to find a way to avoid running a whole extra pass for this. In the test here, shouldn't the LoadStoreVectorizer have vectorized these loads? Why didn't it?

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp:32-36
+static cl::opt<bool>
+    WidenLoads("amdgpu-late-codegenprepare-widen-constant-loads",
+               cl::desc("Widen sub-dword constant address space loads in "
+                        "AMDGPULateCodeGenPrepare"),
+               cl::ReallyHidden, cl::init(true));
----------------
If a separate pass is going to do the load widening, you can remove the widening from AMDGPUCodeGenPrepare.
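
For readers unfamiliar with the transform under discussion, here is a rough standalone sketch of the arithmetic behind widening a sub-dword load that is not DWORD aligned. It assumes the widening loads the enclosing 4-byte-aligned dword, shifts right by the byte offset within that dword, and truncates. The helper name `widenedLoad16` and the byte-buffer model are hypothetical and for illustration only; this is not the code in AMDGPULateCodeGenPrepare.cpp.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical illustration, not LLVM code: read a 16-bit little-endian value
// at byte offset Offset from a buffer whose base is assumed DWORD aligned.
// Assumes the value does not straddle a dword boundary (Offset % 4 <= 2).
static std::uint16_t widenedLoad16(const unsigned char *Base,
                                   std::size_t Offset) {
  // Number of bytes the load sits past the preceding DWORD boundary.
  std::size_t Adjust = Offset & 3;
  const unsigned char *P = Base + (Offset - Adjust);

  // Model the widened 32-bit load; the bytes are assembled explicitly so the
  // result does not depend on host endianness.
  std::uint32_t Dword = static_cast<std::uint32_t>(P[0]) |
                        (static_cast<std::uint32_t>(P[1]) << 8) |
                        (static_cast<std::uint32_t>(P[2]) << 16) |
                        (static_cast<std::uint32_t>(P[3]) << 24);

  // Shift the wanted bytes down to bit 0, then truncate to 16 bits.
  return static_cast<std::uint16_t>(Dword >> (Adjust * 8));
}
```

In IR terms, the shift and the cast would presumably correspond to an lshr followed by a trunc on the widened i32 load.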

================
Comment at: llvm/test/CodeGen/AMDGPU/llvm.amdgcn.dispatch.ptr.ll:59
+
 declare noalias i8 addrspace(4)* @llvm.amdgcn.dispatch.ptr() #0

----------------
Missing align?

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D80364