[llvm] [NVPTX] add an optional early copy of byval arguments (PR #113384)

Artem Belevich via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 23 14:54:20 PDT 2024


================
@@ -734,3 +739,21 @@ bool NVPTXLowerArgs::runOnFunction(Function &F) {
 }
 
 FunctionPass *llvm::createNVPTXLowerArgsPass() { return new NVPTXLowerArgs(); }
+
+static bool copyFunctionByValArgs(Function &F) {
+  LLVM_DEBUG(dbgs() << "Creating a copy of byval args of " << F.getName()
+                    << "\n");
+  bool Changed = false;
+  for (Argument &Arg : F.args())
+    if (Arg.getType()->isPointerTy() && Arg.hasByValAttr()) {
+      copyByValParam(F, Arg);
----------------
Artem-B wrote:

Remains to be seen. We may consider taking `grid_constant` into account and skipping the early copy for such byval arguments.

I ran a toy test with and without the early copy using `bin/opt --nvptx-early-byval-copy=[0|1] -mtriple nvptx64 -mcpu=sm_70 -mattr=ptx77 -O3 -S < test/CodeGen/NVPTX/lower-byval-args.ll | bin/llc -mtriple nvptx64 -mcpu=sm_70 -mattr=ptx77`

https://gist.githubusercontent.com/Artem-B/f8af0b7f60668c4ca6b900c56d72ce23/raw/da4b7ec5c05e6914cda8b6be9738c1cbbb3ffe86/early-copy.diff
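For anyone who wants to repeat the comparison, here is a rough sketch of how the two runs could be scripted. It only reuses the flags from the command above; the `BIN` and `TEST` paths, the `pipeline` helper, and the `early-copy-N.ptx` output names are assumptions for illustration, not part of the patch.

```shell
#!/bin/sh
# Sketch only: BIN and TEST assume an LLVM build tree laid out like the
# command quoted above; adjust them for your checkout.
BIN=bin
TEST=test/CodeGen/NVPTX/lower-byval-args.ll
TARGET="-mtriple nvptx64 -mcpu=sm_70 -mattr=ptx77"

# Print the opt | llc pipeline for one setting of the early-copy flag
# (0 = early copy disabled, 1 = early copy enabled).
pipeline() {
  echo "$BIN/opt --nvptx-early-byval-copy=$1 $TARGET -O3 -S < $TEST | $BIN/llc $TARGET"
}

# Run both variants and diff the generated PTX, but only when the tools exist.
if [ -x "$BIN/opt" ] && [ -x "$BIN/llc" ]; then
  for v in 0 1; do
    sh -c "$(pipeline "$v")" > "early-copy-$v.ptx"
  done
  # diff exits non-zero when the files differ, which is the expected case here.
  diff -u early-copy-0.ptx early-copy-1.ptx || true
fi
```

Keeping the pipeline in a helper makes it easy to check that the flag value is the only thing that differs between the two runs.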

Results are... interesting, even on these toy examples. The final code is substantially smaller (16K -> 12K) and looks quite a bit better than the code without copies. However, `cvta.param` is gone, so that's something I'll look into in follow-up patches.


https://github.com/llvm/llvm-project/pull/113384