[PATCH] D138531: [PATCH] [NVPTX] Backend support for variadic functions

Pavel Kopyl via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 23 16:56:30 PST 2022


pavelkopyl added inline comments.


================
Comment at: llvm/lib/Target/NVPTX/NVPTXAsmPrinter.cpp:1676
+      O << ",\n";
+    O << "\t.param .align 8 .b8 %VAParam[]";
+  }
----------------
tra wrote:
> What determines the alignment here? 
> NVIDIA does not seem to specify anything regarding alignment here and their example shows `align 4`:
> https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#kernel-and-function-directives-func
> 
> 
The documentation seems to be slightly outdated: NVCC 11.7 generates `.align 8` for the last parameter (the unsized array). See https://godbolt.org/z/7W7YThMf8
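
For context, here is a minimal sketch of the kind of variadic function that exercises this code path. It is a hypothetical illustration, not the source behind the Godbolt link above, and `sum_ints` is a made-up name. For NVCC it would be marked `__device__`; it is written as plain C++ here so that it also builds on the host. The variadic arguments are what end up in the trailing unsized-array parameter whose alignment is discussed above.

```cpp
#include <cstdarg>

// Hypothetical example: sums `count` int arguments passed variadically.
// On the device side the variadic arguments would be read from the
// trailing unsized .b8 array parameter.
int sum_ints(int count, ...) {
  va_list args;
  va_start(args, count);
  int total = 0;
  for (int i = 0; i < count; ++i)
    total += va_arg(args, int);  // each va_arg advances through the buffer
  va_end(args);
  return total;
}
```

A call such as `sum_ints(3, 1, 2, 3)` walks the three trailing arguments in order.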



================
Comment at: llvm/test/CodeGen/NVPTX/vaargs.ll:18
+; CHECK64-NEXT:  .local .align 8 .b8 __local_depot0[24];
+; CHECK-NEXT:    .reg .b[[BITS]] %SP;
+; CHECK-NEXT:    .reg .b[[BITS]] %SPL;
----------------
tra wrote:
> Would it be possible to reduce the checks to the minimum number of instructions necessary to illustrate that we've lowered varargs correctly? Everything else just obscures what it is exactly that we're testing for here.
> If the remaining checks are still verbose, it may be useful to interleave the checks with the IR itself, so it's easier to tell which IR produced particular PTX.
> 
OK, I'll try to make it clearer.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138531/new/

https://reviews.llvm.org/D138531


