[PATCH] D138531: [PATCH] [NVPTX] Backend support for variadic functions

Tue Nov 22 22:05:35 PST 2022

tra added a subscriber: yaxunl.
tra added a comment.

Nice.

I'm out of office this week and will take a closer look when I'm back next week, probably closer to the end of it.

@yaxunl Does HIP currently allow variadic functions on GPU? Of so, does that include kernels, or only regular functions?

================
Comment at: llvm/lib/Target/NVPTX/NVPTXAsmPrinter.cpp:1676
+      O << ",\n";
+    O << "\t.param .align 8 .b8 %VAParam[]";
+  }
----------------
What determines the alignment here? 
NVIDIA does not seem to specify anything regarding alignment here and their example shows `align 4`:
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#kernel-and-function-directives-func

================
Comment at: llvm/test/CodeGen/NVPTX/vaargs.ll:13
+
+define i32 @foo(i32 %a, ...) {
+; CHECK-LABEL: foo(
----------------
NVCC does not seem to allow varargs for kernels, only for `__device__` functions. https://godbolt.org/z/s75vWsfbK

Not sure if we can do much about that on LLVM level, that would need to be something to be enforced in the front-end. 

================
Comment at: llvm/test/CodeGen/NVPTX/vaargs.ll:18
+; CHECK64-NEXT:  .local .align 8 .b8 __local_depot0[24];
+; CHECK-NEXT:    .reg .b[[BITS]] %SP;
+; CHECK-NEXT:    .reg .b[[BITS]] %SPL;
----------------
Would it be possible to reduce the checks to the minimum number of the instruction necessary to illustrate that we've lowered varargs correctly? Everything else just obscures what is ti exactly that we're testing for here.
If the remaining checks are still verbose, it may be useful to interleave the checks with the IR itself, so it's easier to tell which IR produced particular PTX.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138531/new/

https://reviews.llvm.org/D138531