[PATCH] D21421: [NVPTX] Improve lowering of byval args of device functions.

Artem Belevich via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 16 13:59:32 PDT 2016

tra added inline comments.

Comment at: lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp:668
@@ +667,3 @@
+        ReplaceNode(N, Src.getOperand(0).getNode());
+        return;
+      }
LowerKernelArgs() makes sure that the IR itself no longer touches the argument directly.
Here's a typical example of what happens:

Input IR with byval argument:
define i32 @gen_arg(%struct.S* byval align 1) #0 {
  %2 = getelementptr inbounds %struct.S, %struct.S* %0, i32 0, i32 0
  %3 = load i8, i8* %2, align 1
  %4 = sext i8 %3 to i32
  %5 = mul nsw i32 %4, 3
  ret i32 %5
}
After LowerKernelArgs:
define i32 @gen_arg(%struct.S* byval align 1) #0 {
  %2 = alloca %struct.S, align 1
  %3 = addrspacecast %struct.S* %0 to %struct.S addrspace(101)*
  %4 = load %struct.S, %struct.S addrspace(101)* %3
  store %struct.S %4, %struct.S* %2
  %5 = getelementptr inbounds %struct.S, %struct.S* %2, i32 0, i32 0
  %6 = load i8, i8* %5, align 1
  %7 = sext i8 %6 to i32
  %8 = mul nsw i32 %7, 3
  ret i32 %8
}

That guarantees that the only instruction in the IR that touches the argument is the addrspacecast. Everything else operates on the local copy of its value. That covers our goal of controlling direct access to the byval argument at the IR level.
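
To make that invariant concrete, here is a minimal sketch of the check over a toy instruction model. It does not use LLVM's real Value/User API; `Kind`, `Inst` and `onlyUsedByAddrSpaceCast` are hypothetical names made up for illustration.

```cpp
#include <vector>

// Toy stand-ins for IR instructions; these are NOT LLVM classes.
enum class Kind { Argument, AddrSpaceCast, Load, Store, GEP, Alloca };

struct Inst {
  Kind kind;
  std::vector<const Inst *> operands;
};

// Returns true if every instruction in Body that uses Arg directly
// is an addrspacecast. Uses of the cast's result are not inspected.
bool onlyUsedByAddrSpaceCast(const Inst &Arg, const std::vector<Inst> &Body) {
  for (const Inst &I : Body)
    for (const Inst *Op : I.operands)
      if (Op == &Arg && I.kind != Kind::AddrSpaceCast)
        return false;
  return true;
}
```

In this model the "After LowerKernelArgs" function above passes the check, while the input IR fails it because the getelementptr uses the argument directly.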

The argument itself is lowered in LowerFormalArguments() to MoveParam (which is just a wrapper over the argument symbol), and that MoveParam becomes the input to the addrspacecast.

Normally addrspacecast would be lowered as an intrinsic converting from generic to param space.
In this case we check whether it's addrspacecast(MoveParam), which would only ever happen for an argument, and if so lower it as the symbol of the argument. The behavior of all other addrspacecast variants is not affected.
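
As a rough illustration of that special case, here is a sketch over a toy node type. It is not the SelectionDAG API and not the code from the patch; `Opcode`, `Node` and `foldCastOfMoveParam` are hypothetical names. The patch itself does the replacement with the ReplaceNode() call quoted above.

```cpp
#include <vector>

// Toy stand-ins for SelectionDAG nodes; these are NOT LLVM's SDNode API.
enum class Opcode { ArgSymbol, MoveParam, AddrSpaceCast, Other };

struct Node {
  Opcode op;
  std::vector<Node *> operands;
};

// If N is addrspacecast(MoveParam(sym)), return sym so the caller can
// replace N with it. Return nullptr to fall through to the normal
// addrspacecast lowering.
Node *foldCastOfMoveParam(Node *N) {
  if (N->op != Opcode::AddrSpaceCast || N->operands.empty())
    return nullptr;
  Node *Src = N->operands[0];
  if (Src->op != Opcode::MoveParam || Src->operands.empty())
    return nullptr;
  return Src->operands[0];
}
```

Any addrspacecast whose source is not a MoveParam yields nullptr here, which corresponds to leaving the existing lowering path untouched.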

I think that covers lowering of the IR in the body of the function.

One remaining case is CopyToExportRegsIfNeeded(), which is called by SelectionDAGISel::LowerArguments(). In unoptimized builds it creates a CopyToReg node that copies the arg pointer directly to a register. We end up taking the address of the byval argument just to copy it to a register that's never used (AFAICT) for any purpose.

I'm not sure yet what to do about this. The options I see are: do not copy the byval pointer to export regs, copy something else to export regs (what?), or eliminate the CopyToReg later. Ideas or suggestions are welcome.

