[all-commits] [llvm/llvm-project] 006534: [NVPTX] Improve device function byval parameter lo...

Alex MacLean via All-commits all-commits at lists.llvm.org
Fri Feb 28 14:15:46 PST 2025


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 006534315972728390d82fc8381c9ab1bf6e6490
      https://github.com/llvm/llvm-project/commit/006534315972728390d82fc8381c9ab1bf6e6490
  Author: Alex MacLean <amaclean at nvidia.com>
  Date:   2025-02-28 (Fri, 28 Feb 2025)

  Changed paths:
    M llvm/lib/Target/NVPTX/CMakeLists.txt
    M llvm/lib/Target/NVPTX/NVPTX.h
    A llvm/lib/Target/NVPTX/NVPTXForwardParams.cpp
    M llvm/lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp
    M llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
    M llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
    M llvm/lib/Target/NVPTX/NVPTXLowerArgs.cpp
    M llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp
    A llvm/test/CodeGen/NVPTX/forward-ld-param.ll
    M llvm/test/CodeGen/NVPTX/i128-array.ll
    M llvm/test/CodeGen/NVPTX/lower-args-gridconstant.ll
    M llvm/test/CodeGen/NVPTX/lower-args.ll
    M llvm/test/CodeGen/NVPTX/variadics-backend.ll
    M llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/nvptx-basic.ll.expected

  Log Message:
  -----------
  [NVPTX] Improve device function byval parameter lowering (#129188)

PTX supports 2 methods of accessing device function parameters:

- "simple" case: If a parameters is only loaded, and all loads can
address the parameter via a constant offset, then the parameter may be
loaded via the ".param" address space. This case is not possible if the
parameters is stored to or has it's address taken. This method is
preferable when possible.

- "move param" case: For more complex cases the address of the param may
be placed in a register via a "mov" instruction. This mov also
implicitly moves the param to the ".local" address space and allows for
it to be written to. This essentially defers the responsibilty of the
byval copy to the PTX calling convention.

The handling of these cases in the NVPTX backend for byval pointers has
some major issues. We currently attempt to determine if a copy is
necessary in NVPTXLowerArgs and either explicitly make an additional
copy in the IR, or insert "addrspacecast" to move the param to the param
address space. Unfortunately the criteria for determining which case is
possible are not correct, leading to miscompilations
(https://godbolt.org/z/Gq1fP7a3G). Further, the criteria for the
"simple" case aren't enforceable in LLVM IR across other transformations
and instruction selection, making deciding between the 2 cases in
NVPTXLowerArgs brittle and buggy.

This patch aims to fix these issues and improve address space related
optimization. In NVPTXLowerArgs, we conservatively assume that all
parameters will use the "move param" case and the local address space.
Responsibility for switching to the "simple" case is given to a new
MachineIR pass, NVPTXForwardParams, which runs once it has become clear
whether or not this is possible. This ensures that the correct address
space is known for the "move param" case allowing for optimization,
while still using the "simple" case where ever possible.



To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications


More information about the All-commits mailing list