[llvm] [NVPTX] add an optional early copy of byval arguments (PR #113384)
Artem Belevich via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 23 14:54:20 PDT 2024
================
@@ -734,3 +739,21 @@ bool NVPTXLowerArgs::runOnFunction(Function &F) {
}
FunctionPass *llvm::createNVPTXLowerArgsPass() { return new NVPTXLowerArgs(); }
+
+static bool copyFunctionByValArgs(Function &F) {
+ LLVM_DEBUG(dbgs() << "Creating a copy of byval args of " << F.getName()
+ << "\n");
+ bool Changed = false;
+ for (Argument &Arg : F.args())
+ if (Arg.getType()->isPointerTy() && Arg.hasByValAttr()) {
+ copyByValParam(F, Arg);
----------------
Artem-B wrote:
Remains to be seen. We may consider taking `grid_constant` into account and skipping the early copy for such byval arguments.
I've run a toy test with and without the early copy: `bin/opt --nvptx-early-byval-copy=[0|1] -mtriple nvptx64 -mcpu=sm_70 -mattr=ptx77 -O3 -S < test/CodeGen/NVPTX/lower-byval-args.ll | bin/llc -mtriple nvptx64 -mcpu=sm_70 -mattr=ptx77`
https://gist.githubusercontent.com/Artem-B/f8af0b7f60668c4ca6b900c56d72ce23/raw/da4b7ec5c05e6914cda8b6be9738c1cbbb3ffe86/early-copy.diff
Results are... interesting, even on these toy examples. The final code is substantially smaller (16K -> 12K) and looks quite a bit better than the code without the copies. However, `cvta.param` is gone, so that's something I'll look into in follow-up patches.
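
To make the `grid_constant` idea above concrete, here is a minimal, self-contained sketch of the selection logic. It deliberately does not use LLVM types: `ArgInfo`, its fields, and `selectArgsForEarlyCopy` are hypothetical stand-ins, and the sketch assumes the `grid_constant` property is already known for each argument. It is an illustration of the proposed filter, not the code in this PR.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for llvm::Argument, carrying only the
// properties that matter for this decision.
struct ArgInfo {
  std::string Name;
  bool IsPointer;       // mirrors Arg.getType()->isPointerTy()
  bool HasByVal;        // mirrors Arg.hasByValAttr()
  bool IsGridConstant;  // assumption: available before the copy runs
};

// Returns the names of the arguments that would get an early local copy.
std::vector<std::string>
selectArgsForEarlyCopy(const std::vector<ArgInfo> &Args) {
  std::vector<std::string> ToCopy;
  for (const ArgInfo &Arg : Args) {
    // Same filter as the loop in the patch: only byval pointer arguments.
    if (!Arg.IsPointer || !Arg.HasByVal)
      continue;
    // Possible refinement discussed above: leave grid_constant
    // arguments alone and skip the early copy for them.
    if (Arg.IsGridConstant)
      continue;
    ToCopy.push_back(Arg.Name);
  }
  return ToCopy;
}
```

With this filter, a plain byval pointer argument is selected for copying, while a byval argument marked `grid_constant` and any non-byval argument are left untouched.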
https://github.com/llvm/llvm-project/pull/113384