[all-commits] [llvm/llvm-project] af8ebf: [NVPTX] Allow the ctor/dtor lowering pass to emit ...
Joseph Huber via All-commits
all-commits at lists.llvm.org
Fri Nov 10 07:33:43 PST 2023
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: af8ebfdcd9d633572ea25eb12407eb92c18e563a
https://github.com/llvm/llvm-project/commit/af8ebfdcd9d633572ea25eb12407eb92c18e563a
Author: Joseph Huber <jhuber6 at vols.utk.edu>
Date: 2023-11-10 (Fri, 10 Nov 2023)
Changed paths:
M llvm/lib/Target/NVPTX/NVPTXCtorDtorLowering.cpp
M llvm/test/CodeGen/NVPTX/lower-ctor-dtor.ll
Log Message:
-----------
[NVPTX] Allow the ctor/dtor lowering pass to emit kernels (#71549)
Summary:
This pass emits the new "nvptx$device$init" and "nvptx$device$fini"
kernels that are callable by the device. This is intended to mimic the
method of lowering for AMDGPU, where we emit `amdgcn.device.init` and
`amdgcn.device.fini` respectively. These kernels simply iterate over the
ranges delimited by `__init_array_start/stop` and `__fini_array_start/stop`.
Normally, the linker provides these symbols automatically. In the AMDGPU
case we only need to call the kernel, which then calls the ctors / dtors.
However, for NVPTX we require that the user initialize these variables to
the associated globals that we already emit as a part of this pass.
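The iteration described above can be sketched on the host in plain C++. This is only an illustration of walking a start/stop range of callbacks, not the code the pass emits: `run_callbacks`, `Callback`, `ctor_a`, `ctor_b`, `call_log`, and `example_init_array` are hypothetical names introduced here, and the front-to-back order is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Signature of a constructor/destructor callback: no arguments, no result.
using Callback = void (*)();

// Hypothetical host-side stand-in for the emitted init kernel: call every
// entry in the half-open range [start, stop) from front to back and return
// how many callbacks were invoked.
inline std::size_t run_callbacks(const Callback *start, const Callback *stop) {
  assert(start <= stop && "range must not be reversed");
  std::size_t called = 0;
  for (const Callback *it = start; it != stop; ++it) {
    (*it)();
    ++called;
  }
  return called;
}

// Example callbacks that record the order in which they ran.
inline std::vector<int> &call_log() {
  static std::vector<int> log;
  return log;
}
inline void ctor_a() { call_log().push_back(1); }
inline void ctor_b() { call_log().push_back(2); }

// Stand-in for the per-module array of callbacks. In the scheme described
// above, the runtime would point the start/stop variables at such an array.
static const Callback example_init_array[] = {ctor_a, ctor_b};
```

Calling `run_callbacks(example_init_array, example_init_array + 2)` invokes `ctor_a` and then `ctor_b`. An empty range (start equal to stop) invokes nothing, which corresponds to a module without ctors.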
The motivation behind this change is to move away from OpenMP's handling
of ctors / dtors. I would much prefer that the backend / runtime handles
this. That allows us to handle ctors / dtors in a language-agnostic way.
This approach requires that the runtime initialize the associated
globals. They are marked `weak` so we can emit this per-TU. The kernel
itself is `weak_odr` as it is copied exactly.
One downside is that any module containing these kernels elicits the
"stack size cannot be statically determined" warning every time from
`nvlink`, which is annoying but inconsequential for functionality. It
would be nice if there were a way to silence this warning, however.