[PATCH] D138531: [PATCH] [NVPTX] Backend support for variadic functions

Sat Dec 3 04:50:01 PST 2022

pavelkopyl added a comment.

In D138531#3957954 <https://reviews.llvm.org/D138531#3957954>, @tra wrote:

>> Note that aggregates passed by value as variadic arguments are not currently supported.
>
> What happens when a user does try to pass an aggregate as a var arg?

That will trigger llvm_unreachable() at llvm/lib/CodeGen/ValueTypes.cpp:551.
But this is a common issue: aggregates are not allowed (at least for now) in variadic arguments.
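
For reference, a minimal sketch of the kind of source this refers to: an aggregate passed by value through the ellipsis. `Point` and `sumPoints` are hypothetical names made up for illustration; they are not taken from the patch or its tests.

```cpp
#include <cstdarg>

// Hypothetical aggregate type, used only for illustration.
struct Point {
  int x;
  int y;
};

// Hypothetical variadic function: reads `count` Point values that the
// caller passed by value as variadic arguments.
int sumPoints(int count, ...) {
  va_list args;
  va_start(args, count);
  int total = 0;
  for (int i = 0; i < count; ++i) {
    // Each va_arg here fetches a whole aggregate, which is the case
    // described above as unsupported.
    Point p = va_arg(args, Point);
    total += p.x + p.y;
  }
  va_end(args);
  return total;
}
```

A call such as `sumPoints(2, Point{1, 2}, Point{3, 4})` is the by-value aggregate case in question.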

================
Comment at: llvm/lib/Target/NVPTX/NVPTXAsmPrinter.cpp:1676
+      O << ",\n";
+    O << "\t.param .align 8 .b8 %VAParam[]";
+  }
----------------
tra wrote:
> pavelkopyl wrote:
> > tra wrote:
> > > What determines the alignment here? 
> > > NVIDIA does not seem to specify anything regarding alignment here and their example shows `align 4`:
> > > https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#kernel-and-function-directives-func
> > > 
> > > 
> > It seems the documentation is a little bit outdated, because NVCC 11.7 generates .align 8 for the last parameter (unsized array): https://godbolt.org/z/7W7YThMf8
> > 
> The question remains. Do we set alignment to 8 because that's what NVCC does or is there some other reason behind it?
> I.e. should it follow the alignment guarantees provided by e.g. `malloc` which returns a pointer sufficiently aligned to access any type. 
> 
> I think this should be retrieved from DataLayout or TargetInfo, instead of being hardcoded here.
> Based on `NVPTXTargetLowering::getFunctionParamOptimizedAlign`, we may have argument alignment as high as 16.
I agree, getting the alignment value from DataLayout would be the right way to do it. To be honest, it's not clear which LLVM IR type corresponds to an unsized byte array. The PTX documentation allows any alignment (1, 2, 4, 8 or 16), but it doesn't specify which one should be used in which cases. Furthermore, from a correctness point of view the exact value of the array alignment doesn't matter: both LowerCall() and LowerVAARG() insert instructions that align the va_list pointer according to the type of the value being stored/loaded (please see the vaargs.ll test). If we specify ".param .align 1 .b8 %VAParam[]", that may just lead to padding between the beginning of the array and the first variadic argument. On the other hand, ".align 16" may waste stack space. So ".align 8" seems to be an optimal value. NVCC also uses ".align 8". That's why I chose exactly this value.
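
To make the padding argument concrete, here is a small sketch of the usual round-up arithmetic. It is a simplified model with made-up addresses, not the code LowerCall() or LowerVAARG() actually emit; `alignUp` and `paddingBefore` are hypothetical helper names.

```cpp
#include <cstddef>

// Round `addr` up to the next multiple of `align`.
// Assumes `align` is a power of two (1, 2, 4, 8 or 16 in PTX).
constexpr std::size_t alignUp(std::size_t addr, std::size_t align) {
  return (addr + align - 1) & ~(align - 1);
}

// Number of padding bytes skipped when a pointer at `addr` is aligned
// for a value that requires `align`-byte alignment.
constexpr std::size_t paddingBefore(std::size_t addr, std::size_t align) {
  return alignUp(addr, align) - addr;
}

// Array base with only 1-byte alignment (illustrative address 0x1001):
// an 8-byte-aligned first argument is preceded by 7 bytes of padding.
static_assert(paddingBefore(0x1001, 8) == 7, "padding after a .align 1 base");

// Array base already 8-byte aligned: no padding before that argument.
static_assert(paddingBefore(0x1008, 8) == 0, "no padding after a .align 8 base");

// Requiring 16-byte alignment for the array itself can cost space in
// front of it: from 0x1008 the next 16-byte boundary is 8 bytes away.
static_assert(paddingBefore(0x1008, 16) == 8, "space lost to a .align 16 base");
```

In this model the array alignment only shifts where the padding ends up, which is the trade-off described above.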

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D138531