[PATCH] D48537: AMDGPU: Add pass to lower kernel arguments to loads

Mon Jun 25 11:52:54 PDT 2018

rampitec added a comment.

In general I believe having distinct loads for kernel arguments is the right thing to do. Of course it would be preferable to keep only one mechanism and not have dual support, as we do for noalias and SI LDS.
A couple of notes, though:

1. I think we need to have a distinct address space for kernarg, not just constant.
2. It would be handy to set names on the argument loads, derived either from the original argument names if present, or from the argument number. This patch introduces GEPs with immediate offsets, which would render the IR unreadable.
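
To illustrate note 2, here is a rough sketch of what a named argument load could look like. It is not taken from the patch: the kernel @kern, the value names (%idx.kernarg.offset, %idx.load), the byte offset 8 and the use of addrspace(4) for the kernarg segment are all assumptions made for the example.

  ; Hypothetical lowered form; names, offset and address space are illustrative only.
  declare i8 addrspace(4)* @llvm.amdgcn.kernarg.segment.ptr()

  define amdgpu_kernel void @kern(i32 addrspace(1)* %out, i32 %idx) {
    ; Base pointer of the kernel argument segment.
    %kernarg.segment = call i8 addrspace(4)* @llvm.amdgcn.kernarg.segment.ptr()
    ; The GEP and load carry the original argument name, so the offset is traceable.
    %idx.kernarg.offset = getelementptr inbounds i8, i8 addrspace(4)* %kernarg.segment, i64 8
    %idx.kernarg.offset.cast = bitcast i8 addrspace(4)* %idx.kernarg.offset to i32 addrspace(4)*
    %idx.load = load i32, i32 addrspace(4)* %idx.kernarg.offset.cast, align 4
    ; The lowering of %out is omitted to keep the sketch short.
    store i32 %idx.load, i32 addrspace(1)* %out
    ret void
  }

With names like these, a reader can tell which argument a load at a given immediate offset corresponds to; an unnamed argument could fall back to its argument number.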

================
Comment at: test/CodeGen/AMDGPU/extract_vector_elt-i16.ll:117
 ; GCN: {{buffer|global}}_store_short
-define amdgpu_kernel void @dynamic_extract_vector_elt_v3i16(i16 addrspace(1)* %out, <3 x i16> %foo, i32 %idx) #0 {
+define amdgpu_kernel void @dynamic_extract_vector_elt_v3i16(i16 addrspace(1)* %out, [8 x i32], <3 x i16> %foo, i32 %idx) #0 {
   %p0 = extractelement <3 x i16> %foo, i32 %idx
----------------
Why do you need all of that explicit padding in many tests?

https://reviews.llvm.org/D48537