[PATCH] D47370: AMDGPU: Round up kernel argument allocation size

Tony Tye via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri May 25 12:05:42 PDT 2018


t-tye added inline comments.


================
Comment at: lib/Target/AMDGPU/AMDGPUSubtarget.cpp:426
+  // Being able to dereference past the end is useful for emitting scalar loads.
+  return alignTo(TotalSize, 4);
 }
----------------
I believe this can be aligned to 16 rather than 4. See the HSA spec at www.hsafoundation.com/html_spec111/HSA_Library.htm#PRM/Topics/04_SyntaxSemantics/kernarg_segment.htm, which says:

"The alignment of the base address of the kernel's kernarg segment variables is the larger of 16 bytes and the maximum alignment of the kernel's kernarg segment variables."

I suspect that the OpenCL runtime simply aligns all kernarg allocations to 256, but I am not sure about other languages.
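
To make the difference concrete, here is a rough sketch of the arithmetic. The names alignToSketch and kernargBaseAlign are hypothetical, made up for this illustration; alignToSketch is only meant to mirror what alignTo does in the patch (round up to the next multiple), and kernargBaseAlign just restates the quoted spec sentence. It does not show that rounding the size to 16 is safe for every runtime.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the alignTo call in the patch: rounds Value up
// to the next multiple of Align. Assumes Align is non-zero.
constexpr uint64_t alignToSketch(uint64_t Value, uint64_t Align) {
  return (Value + Align - 1) / Align * Align;
}

// Hypothetical helper restating the quoted HSA rule: the kernarg segment base
// alignment is the larger of 16 bytes and the maximum argument alignment.
constexpr uint64_t kernargBaseAlign(uint64_t MaxArgAlign) {
  return MaxArgAlign > 16 ? MaxArgAlign : 16;
}

// A 10-byte argument block under the patch as written (round up to 4)...
static_assert(alignToSketch(10, 4) == 12, "rounds up to a multiple of 4");
// ...and under the suggestion above (round up to 16).
static_assert(alignToSketch(10, 16) == 16, "rounds up to a multiple of 16");
// Sizes that are already a multiple of 16 are left unchanged.
static_assert(alignToSketch(32, 16) == 32, "already aligned");
```

If the base address is at least 16-byte aligned as the spec says, then a size rounded up to 16 ends on a 16-byte boundary as well, which is the reasoning behind the suggestion.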


https://reviews.llvm.org/D47370




