[PATCH] D21421: [NVPTX] Improve lowering of byval args of device functions.

Artem Belevich via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 16 10:56:21 PDT 2016

tra added a comment.

Here are a few more details on what I'm trying to do. Let's take ptr_in_byval_func(), which I've added to lower-kernel-ptr-arg.ll, as an example.
Currently we produce this PTX:

  	mov.b64	%rd1, ptr_in_byval_func_param_0;
  	cvta.local.u64 	%rd2, %rd1;
  	ld.param.u64 	%rd3, [ptr_in_byval_func_param_1];
  	ld.u32 	%rd4, [%rd2+8];
  	ld.u32 	%rd5, [%rd2+12];
  	shl.b64 	%rd6, %rd5, 32;
  	or.b64  	%rd7, %rd6, %rd4;
  	ld.u32 	%r1, [%rd7];
  	st.u32 	[%rd3], %r1;

We're doing a fair amount of unnecessary work here (I also wonder why we're loading a pointer as two 32-bit words). To make things worse, taking the address of the parameter forces the PTX-to-SASS compiler to spill the byval argument into local memory, which adds even more unnecessary code at the SASS level.
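
For reference, the two ld.u32 loads followed by shl.b64/or.b64 just reassemble one 64-bit pointer from its low and high 32-bit words. A minimal C++ sketch of that arithmetic (combine_halves is a hypothetical name used only for illustration, not something in the patch):

```cpp
#include <cstdint>

// Hypothetical helper: rebuild a 64-bit value from two 32-bit words.
// `lo` plays the role of the word loaded from [%rd2+8] and `hi` the word
// loaded from [%rd2+12]; the shift and OR mirror shl.b64 / or.b64.
constexpr std::uint64_t combine_halves(std::uint32_t lo, std::uint32_t hi) {
  return (static_cast<std::uint64_t>(hi) << 32) | lo;
}

// Compile-time check of the arithmetic.
static_assert(combine_halves(0x89abcdefu, 0x01234567u) == 0x0123456789abcdefull,
              "high word lands in the upper 32 bits");
```

A single 64-bit load from the same address yields the same value on a little-endian layout, which is why the split sequence is redundant.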

With this patch things look a bit better:

  	ld.param.u64 	%rd1, [ptr_in_byval_func_param_1];
  	ld.param.u64 	%rd2, [ptr_in_byval_func_param_0+8];
  	ld.u32 	%r1, [%rd2];
  	st.u32 	[%rd1], %r1;
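
To make the PTX easier to follow, here is roughly what the function does, written as plain C++. This is a sketch reconstructed from the PTX above, not the actual test (which is LLVM IR); the struct name S, its field names, and run_example are assumptions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical layout: a struct holding two pointers. The struct and
// field names are assumptions made for illustration.
struct S {
  int *a;
  int *b;
};

// b directly follows a; with 64-bit pointers that is byte offset 8,
// which corresponds to the "param_0+8" load in the PTX.
static_assert(offsetof(S, b) == sizeof(int *), "b directly follows a");

// `input` is passed by value (byval in the IR, param_0 in the PTX);
// `output` corresponds to param_1.
void ptr_in_byval_func(S input, int *output) {
  *output = *input.b;  // load the pointer from the struct, load through it, store
}

// Small host-side check of the semantics.
int run_example() {
  int unused = 1, value = 42, result = 0;
  S s = {&unused, &value};
  ptr_in_byval_func(s, &result);
  assert(result == 42);
  return result;
}
```

The four PTX instructions after the patch line up with this directly: load output, load input.b, load through it, store.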

Comment at: lib/Target/NVPTX/NVPTXLowerKernelArgs.cpp:235
@@ +234,3 @@
+    if (Arg.getType()->isPointerTy() && Arg.hasByValAttr())
+      handleByValParam(&Arg);
+  return true;
All function arguments are passed via either .reg or .param. Quoting from the link above:
> The parameter (.param) state space is used [...] (2b) to declare locally-scoped byte array variables that serve as function call arguments, typically for passing large structures by value to a function.

We copy all arguments to .param space when we lower a call.
For your example, we get this:
	{ // callseq 0
	.reg .b32 temp_param_reg;
	.param .align 4 .b8 param0[8];
	st.param.b32	[param0+0], %r4;
	st.param.b32	[param0+4], %r3;
	.param .b32 retval0;
	call.uni (retval0), 
	ld.param.b32	%r5, [retval0+0];
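
As a rough host-side model of that call sequence (purely illustrative: Pair, ParamBuffer, marshal, and unmarshal are hypothetical names, and the real aggregate type in the example is not shown here), the caller fills an 8-byte, 4-byte-aligned buffer field by field and the callee reads the fields back out of it:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical 8-byte aggregate standing in for the by-value argument.
struct Pair {
  std::int32_t first;
  std::int32_t second;
};

// Stand-in for ".param .align 4 .b8 param0[8]".
struct ParamBuffer {
  alignas(4) unsigned char bytes[8];
};

// Caller side: mirrors the two st.param.b32 stores at offsets +0 and +4.
ParamBuffer marshal(const Pair &p) {
  ParamBuffer buf;
  std::memcpy(buf.bytes + 0, &p.first, sizeof p.first);
  std::memcpy(buf.bytes + 4, &p.second, sizeof p.second);
  return buf;
}

// Callee side: reads the fields back from the buffer, like ld.param would.
Pair unmarshal(const ParamBuffer &buf) {
  Pair p;
  std::memcpy(&p.first, buf.bytes + 0, sizeof p.first);
  std::memcpy(&p.second, buf.bytes + 4, sizeof p.second);
  return p;
}

// Round-trip check: the callee sees the same field values the caller stored.
void check_round_trip() {
  Pair in = {7, 9};
  Pair out = unmarshal(marshal(in));
  assert(out.first == 7 && out.second == 9);
}
```

The point of the model is only that the argument already lives in a buffer the callee can read directly with an offset, so no extra copy to local memory is needed just to access a field.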

