[PATCH] D48537: AMDGPU: Add pass to lower kernel arguments to loads

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jun 25 11:52:54 PDT 2018


rampitec added a comment.

In general, I believe having distinct loads for kernel arguments is the right thing to do. Of course, it would be preferable to keep only one mechanism rather than dual support like we have for noalias and SI LDS.
A couple of notes, though:

1. I think we need to have a distinct address space for kernarg, not just constant.
2. It would be handy to set names on the argument loads, derived from the original argument names if present, or from the argument number otherwise. This patch introduces GEPs with immediate offsets, which would render the IR unreadable.
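
To illustrate note 2, here is a rough sketch of what the lowered IR might look like with and without names. The kernel `@example`, the value names (`%idx.kernarg.offset`, `%idx.load`, and so on), and the byte offset 8 are all assumptions made up for illustration; they are not taken from the patch. The sketch uses the typed-pointer syntax of this era and assumes the constant address space is 4.

```llvm
; Hypothetical example: names and offsets are illustrative, not from D48537.
declare i8 addrspace(4)* @llvm.amdgcn.kernarg.segment.ptr()

define amdgpu_kernel void @example(i32 addrspace(1)* %out, i32 %idx) {
  %kernarg.segment = call i8 addrspace(4)* @llvm.amdgcn.kernarg.segment.ptr()

  ; Unnamed form: the reader has to map byte offset 8 back to %idx by hand.
  %1 = getelementptr inbounds i8, i8 addrspace(4)* %kernarg.segment, i64 8
  %2 = bitcast i8 addrspace(4)* %1 to i32 addrspace(4)*
  %3 = load i32, i32 addrspace(4)* %2, align 4

  ; Named form: each value carries the name of the argument it came from.
  %idx.kernarg.offset = getelementptr inbounds i8, i8 addrspace(4)* %kernarg.segment, i64 8
  %idx.kernarg.offset.cast = bitcast i8 addrspace(4)* %idx.kernarg.offset to i32 addrspace(4)*
  %idx.load = load i32, i32 addrspace(4)* %idx.kernarg.offset.cast, align 4

  ret void
}
```

For an unnamed argument, a name built from the argument number (for example a hypothetical `%arg1.load`) would serve the same purpose.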



================
Comment at: test/CodeGen/AMDGPU/extract_vector_elt-i16.ll:117
 ; GCN: {{buffer|global}}_store_short
-define amdgpu_kernel void @dynamic_extract_vector_elt_v3i16(i16 addrspace(1)* %out, <3 x i16> %foo, i32 %idx) #0 {
+define amdgpu_kernel void @dynamic_extract_vector_elt_v3i16(i16 addrspace(1)* %out, [8 x i32], <3 x i16> %foo, i32 %idx) #0 {
   %p0 = extractelement <3 x i16> %foo, i32 %idx
----------------
Why do you need all of that explicit padding in many tests?


https://reviews.llvm.org/D48537




