[all-commits] [llvm/llvm-project] af8ebf: [NVPTX] Allow the ctor/dtor lowering pass to emit ...

Fri Nov 10 07:33:43 PST 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: af8ebfdcd9d633572ea25eb12407eb92c18e563a
      https://github.com/llvm/llvm-project/commit/af8ebfdcd9d633572ea25eb12407eb92c18e563a
  Author: Joseph Huber <jhuber6 at vols.utk.edu>
  Date:   2023-11-10 (Fri, 10 Nov 2023)

  Changed paths:
    M llvm/lib/Target/NVPTX/NVPTXCtorDtorLowering.cpp
    M llvm/test/CodeGen/NVPTX/lower-ctor-dtor.ll

  Log Message:
  -----------
  [NVPTX] Allow the ctor/dtor lowering pass to emit kernels (#71549)

Summary:
This pass emits the new "nvptx$device$init" and "nvptx$device$fini"
kernels that are callable by the device. This intends to mimic the
method of lowering for AMDGPU where we emit `amdgcn.device.init` and
`amdgcn.device.fini` respectively. These kernels simply iterate a symbol
called `__init_array_start/stop` and `__fini_array_start/stop`.
Normally, the linker provides these symbols automatically. In the AMDGPU
case we only need call the kernel and we call the ctors / dtors.
However, for NVPTX we require the user initializes these variables to
the associated globals that we already emit as a part of this pass.

The motivation behind this change is to move away from OpenMP's handling
of ctors / dtors. I would much prefer that the backend / runtime handles
this. That allows us to handle ctors / dtors in a language agnostic way,

This approach requires that the runtime initializes the associated
globals. They are marked `weak` so we can emit this per-TU. The kernel
itself is `weak_odr` as it is copied exactly.

One downside is that any module containing these kernels elicitis the
"stack size cannot be statically determined warning" every time from
`nvlink` which is annoying but inconsequential for functionality. It
would be nice if there were a way to silence this warning however.