[Openmp-commits] [PATCH] D110180: [OpenMP] Add support for changing stack size in device RTL
Johannes Doerfert via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Fri Oct 1 10:39:26 PDT 2021
jdoerfert accepted this revision.
jdoerfert added a comment.
This revision is now accepted and ready to land.
LG. Two comments.
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:932
We should add a TODO here. It's unreasonable that we copy stuff from the device even though the host has the image with the information. I know this is how we do it for other things too, but in general it seems sub-optimal.
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:1241
+ RoundUp(KernelInfo->StackSize, 8) +
+ RoundUp(CudaThreadsPerBlock, DeviceData[DeviceId].WarpSize);
Put these things in separate variables, with an explanation of what they mean and how the size is computed. As currently written, this is just magic.
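A minimal sketch of what the suggestion could look like, not the code from the patch. The quoted diff does not say what each term represents, so the names `roundUp`, `LaunchSizes`, `computeLaunchSizes`, `AlignedStackSize`, and `PaddedThreadsPerBlock` are all hypothetical, and the comments only describe the arithmetic; the real explanation of why the two terms are added would have to come from the patch author.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the RoundUp used in the patch: rounds Value up
// to the next multiple of Boundary. Boundary must be non-zero.
inline uint64_t roundUp(uint64_t Value, uint64_t Boundary) {
  assert(Boundary > 0 && "Boundary must be non-zero");
  return (Value + Boundary - 1) / Boundary * Boundary;
}

// Hypothetical container so each intermediate value has a name.
struct LaunchSizes {
  uint64_t AlignedStackSize;      // stack size rounded up to 8
  uint64_t PaddedThreadsPerBlock; // thread count rounded up to whole warps
  uint64_t Total;                 // sum of the two terms above
};

inline LaunchSizes computeLaunchSizes(uint64_t StackSize,
                                      uint64_t ThreadsPerBlock,
                                      uint64_t WarpSize) {
  // First term of the quoted expression: the stack size, rounded up to a
  // multiple of 8.
  const uint64_t AlignedStackSize = roundUp(StackSize, 8);

  // Second term: the threads per block, rounded up to a multiple of the
  // warp size.
  const uint64_t PaddedThreadsPerBlock = roundUp(ThreadsPerBlock, WarpSize);

  // The quoted lines add the two terms; what the sum is used for should be
  // documented here by the author.
  return {AlignedStackSize, PaddedThreadsPerBlock,
          AlignedStackSize + PaddedThreadsPerBlock};
}
```

With named intermediates, each rounding step gets its own comment, and a reader can check the values separately, for example `computeLaunchSizes(1028, 100, 32)`.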
Repository: rG LLVM Github Monorepo