[PATCH] D119247: [NVPTX] Use align attribute for kernel pointer arg alignment

Artem Belevich via Phabricator via llvm-commits <llvm-commits at lists.llvm.org>
Wed Feb 9 12:07:18 PST 2022


tra added inline comments.


================
Comment at: llvm/test/CodeGen/NVPTX/nvcl-param-align.ll:7
+define void @foo(i64 %img, i64 %sampler, <5 x float>* align 32 %v) {
+; The parameter alignment is determined by the align attribute.
 ; CHECK: .param .u32 .ptr .align 32 foo_param_2
----------------
nikic wrote:
> tra wrote:
> > nikic wrote:
> > > tra wrote:
> > > > What's expected to happen if the alignment is not specified explicitly?
> > > > 
> > > > It may be worth adding a test case for that.
> > > If no alignment is specified, then the default is 1. I've extended the test to check for this.
> > Always defaulting to unaligned access would likely be a performance regression.
> > The LLVM IR spec's general rule is: `If the alignment is not specified, then the code generator makes a target-specific assumption.`
> > I think we do need to infer the assumed alignment from the data layout when we do know the type.
> > 
> There may be some confusion here: LLVM defaults to the ABI alignment for memory accesses like loads and stores (or rather, these accesses always carry an explicit alignment, and that default is only used when parsing textual IR). However, for pointer arguments the default alignment is 1 if no `align` attribute is provided (and the parameter is not `byval`, but that case is explicitly excluded here).
> 
> There should be no regressions here as long as the frontend adds the necessary alignment annotations, which was done for OpenCL kernels in D118894. As this code is only used for NVCL and not CUDA, I assume that is sufficient, but please correct me if I'm wrong.
The nominal goal of this patch is to allow dealing with opaque pointer types. That does not require defaulting to an alignment of 1. Sticking with a known-working default alignment for non-opaque pointers seems like the better choice.
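
For concreteness, a minimal sketch of the fallback order I have in mind, written against the current C++ API (the helper name is made up and this is not code from the patch):

  #include "llvm/IR/Argument.h"
  #include "llvm/IR/DataLayout.h"
  #include "llvm/IR/DerivedTypes.h"
  #include "llvm/Support/Alignment.h"
  using namespace llvm;

  // Hypothetical helper, for illustration only: honor an explicit `align`
  // attribute; for typed (non-opaque) pointers fall back to the pointee's
  // ABI alignment from the data layout; only default to 1 as a last resort.
  static Align getKernelPtrParamAlign(const Argument &Arg,
                                      const DataLayout &DL) {
    if (MaybeAlign ExplicitAlign = Arg.getParamAlign())
      return *ExplicitAlign;
    auto *PtrTy = cast<PointerType>(Arg.getType());
    if (!PtrTy->isOpaque()) // pointee type is still available
      return DL.getABITypeAlign(PtrTy->getPointerElementType());
    return Align(1);
  }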

I agree that it's more or less a no-op when the IR is generated by clang. However, there are other LLVM users that generate IR themselves, and I have no idea whether TensorFlow's XLA, Julia, or any other LLVM-based JIT always passes alignment info for pointer arguments.
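
To make that concrete, such an IR producer using the C++ API would have to attach the attribute itself, along these lines (illustrative snippet, not taken from any of those projects; the function name and the 32-byte value are made up):

  #include "llvm/IR/Attributes.h"
  #include "llvm/IR/Function.h"
  #include "llvm/Support/Alignment.h"

  // Hypothetical example: mark kernel pointer parameter `ArgNo` as 32-byte
  // aligned, so the backend can rely on the attribute instead of guessing
  // from the pointee type.
  void annotateKernelPtrArg(llvm::Function &F, unsigned ArgNo) {
    F.addParamAttr(ArgNo, llvm::Attribute::getWithAlignment(F.getContext(),
                                                            llvm::Align(32)));
  }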

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119247/new/

https://reviews.llvm.org/D119247
