[PATCH] D80364: [amdgpu] Teach load widening to handle non-DWORD aligned loads.

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jun 2 11:32:28 PDT 2020


arsenm added inline comments.


================
Comment at: llvm/test/CodeGen/AMDGPU/vectorize-loads.ll:26
+; A little more complicated case where more sub-dword loads could be coalesced
+; if they are not widening earlier.
+; GCN-LABEL: {{^}}load_4i16:
----------------
s/widening/widened


================
Comment at: llvm/test/CodeGen/AMDGPU/vectorize-loads.ll:32-33
+; GCN-DAG: s_lshr_b32 s{{[0-9]+}}, s[[D1]], 16
+; GCN: s_endpgm
+define protected amdgpu_kernel void @load_4i16(i32 addrspace(1)* %out) {
+entry:
----------------
The function name should be more descriptive. This is not merely a v4i16 vectorization; there is also the constant widening to consider.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80364/new/

https://reviews.llvm.org/D80364




