[PATCH] D21421: [NVPTX] Improve lowering of byval args of device functions.
Artem Belevich via llvm-commits
llvm-commits at lists.llvm.org
Thu Jun 16 10:56:21 PDT 2016
tra added a comment.
Here are a few more details on what I'm trying to do. Let's take ptr_in_byval_func(), which I've added to lower-kernel-ptr-arg.ll, as an example.
Currently we produce this PTX:
mov.b64 %rd1, ptr_in_byval_func_param_0;
cvta.local.u64 %rd2, %rd1;
ld.param.u64 %rd3, [ptr_in_byval_func_param_1];
ld.u32 %rd4, [%rd2+8];
ld.u32 %rd5, [%rd2+12];
shl.b64 %rd6, %rd5, 32;
or.b64 %rd7, %rd6, %rd4;
ld.u32 %r1, [%rd7];
st.u32 [%rd3], %r1;
ret;
We're doing a fair amount of unnecessary work here (I also wonder why we're loading a pointer as two 32-bit words). To make things worse, taking the address of the parameter forces the PTX-to-SASS compiler to spill the byval argument into local memory, which adds even more unnecessary code at the SASS level.
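For reference, the two ld.u32 instructions followed by shl.b64/or.b64 in the listing above just rebuild one 64-bit value from its low and high halves. A minimal host-side C++ sketch of that arithmetic (the helper name and the byte-buffer setup are hypothetical and only mirror the PTX; this is not code from the patch):

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper mirroring the PTX above: read the low 32 bits at
// `off` and the high 32 bits at `off + 4`, then combine them with a
// shift and an OR, as shl.b64 / or.b64 do.
inline std::uint64_t load_u64_as_two_u32(const unsigned char *base,
                                         std::size_t off) {
  std::uint32_t lo = 0, hi = 0;
  std::memcpy(&lo, base + off, sizeof lo);      // ld.u32 [base+off]
  std::memcpy(&hi, base + off + 4, sizeof hi);  // ld.u32 [base+off+4]
  return (static_cast<std::uint64_t>(hi) << 32) | lo;
}
```

On a little-endian layout this gives the same value as a single 64-bit load from `base + off`, which is what the shorter PTX with one ld.param.u64 does.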
With this patch things look a bit better:
ld.param.u64 %rd1, [ptr_in_byval_func_param_1];
ld.param.u64 %rd2, [ptr_in_byval_func_param_0+8];
ld.u32 %r1, [%rd2];
st.u32 [%rd1], %r1;
ret;
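To make the +8 offset easier to follow, here is a rough source-level C++ sketch of what a function like ptr_in_byval_func() does. The struct layout (two pointers) and the field names are assumptions inferred from the PTX, not copied from the test file:

```cpp
#include <cstddef>

// Assumed layout: two pointers, so on a 64-bit target the second field
// sits at byte offset 8, matching [ptr_in_byval_func_param_0+8].
struct S {
  int *a;
  int *b;
};

// The struct is passed by value (byval in the IR). The function reads
// through the pointer stored in the second field and writes the result
// through `output`, which corresponds to the ld.u32 / st.u32 pair.
void ptr_in_byval_func(S input, int *output) {
  *output = *input.b;
}
```

With this layout the improved PTX needs only one load of the pointer field directly from .param space, and never takes the address of the parameter.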
================
Comment at: lib/Target/NVPTX/NVPTXLowerKernelArgs.cpp:235
@@ +234,3 @@
+ if (Arg.getType()->isPointerTy() && Arg.hasByValAttr())
+ handleByValParam(&Arg);
+ return true;
----------------
All function arguments are passed via either .reg or .param. Quoting from the link above:
> The parameter (.param) state space is used [...] (2b) to declare locally-scoped byte array variables that serve as function call arguments, typically for passing large structures by value to a function.
We copy all arguments to .param space when we lower a call.
In the case of your example, we get this:
```
{ // callseq 0
.reg .b32 temp_param_reg;
.param .align 4 .b8 param0[8];
st.param.b32 [param0+0], %r4;
st.param.b32 [param0+4], %r3;
.param .b32 retval0;
call.uni (retval0),
_Z15device_function1S,
(
param0
);
ld.param.b32 %r5, [retval0+0];
```
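For context, the mangled name _Z15device_function1S demangles to device_function(S), and param0 is an 8-byte, 4-byte-aligned array filled by two 32-bit stores. A caller-side C++ sketch consistent with that might look like the following; the field names, the int return type, and the function body are assumptions, since the original source is not shown here:

```cpp
// Assumed struct: two 32-bit fields, matching
// `.param .align 4 .b8 param0[8]` and the stores at +0 and +4.
struct S {
  int a;
  int b;
};

static_assert(sizeof(S) == 8, "two 32-bit fields, no padding");
static_assert(alignof(S) == 4, "matches .align 4 on param0");

// Hypothetical body; only the by-value S parameter and a 32-bit
// return value (retval0) can be inferred from the PTX.
int device_function(S s) {
  return s.a + s.b;
}

// The call below is what gets lowered to the callseq: both fields are
// copied into param0 before call.uni, then retval0 is read back.
int caller(int x, int y) {
  S s{x, y};
  return device_function(s);
}
```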
http://reviews.llvm.org/D21421