[PATCH] D120129: [NVPTX] Enhance vectorization of ld.param & st.param

Tue Mar 22 06:11:39 PDT 2022

kovdan01 marked 4 inline comments as done.
kovdan01 added a comment.

@tra Added appropriate tests for kernels. Looks like that kernel functions get external linkage regardless static specifier, so their parameters alignment is kept untouched.
To ensure that it's true, altered clang/test/CodeGenCUDA/device-fun-linkage.cu and added appropriate assertion in `getFunctionParamOptimizedAlign` - it checks module metadata and identifies if the function is marked as kernel or not.

================
Comment at: llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp:4322
+/// internal or private linkage as for other linkage types callers may already
+/// rely on default linkage. This function returns maximum from alignment 16
+/// and default alignment. Alignment 16 is chosen because it's maximum size
----------------
tra wrote:
> Did you mean "rely on default alignment" instead of "linkage"?
> 
> "To allow using 128-bit vectorized loads/stores. his function ensures that alignment is 16 or greater."
Sure, I meant that :)

================
Comment at: llvm/lib/Target/NVPTX/NVPTXLowerArgs.cpp:235
+// NVPTXTargetLowering::LowerCall.
+static void tryIncreaseByValAlign(Argument *Arg, Value *ArgInParamAS,
+                                  const NVPTXTargetLowering *TLI) {
----------------
tra wrote:
> "tryIncreaseByValAlign" does not sound quite right.  
> "adjustByValArgAlignment()" ?
> 
> I'd also mention that we traverse all loads from byval pointer and adjust their alignment, if those were using known offset.
> 
I agree, your variant sound better. Fixed.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120129/new/

https://reviews.llvm.org/D120129