[Openmp-commits] [PATCH] D110006: [OpenMP] Add support for dynamic shared memory in new RTL

Joseph Huber via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Mon Oct 4 06:33:23 PDT 2021


jhuber6 added inline comments.


================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:102
 /// file later.
 struct omptarget_device_environmentTy {
   int32_t debug_level;
----------------
JonChesterfield wrote:
> This struct is also defined in deviceRTLs/common/device_environment.h, which doesn't seem to have the extra field present. Any objection to moving that header up a couple of levels then including it from the various plugins and device runtimes?
> 
> uint64_t seems overly ambitious for shared size. The amdgpu hardware is limited to ~ 64k, can cuda go over 32 bits for this?
It would probably be a good idea to keep the device environment stuff consistent. Currently there might be some difference between the new and old runtime so we should make sure that they all share the same size at least. 

And you're probably right, the argument in `cuLaunchKernel` that it's passed to only uses an `unsigned int` which is most likely 32 bits. Not likely someone is going to allocate a few billion bytes of shared memory.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110006/new/

https://reviews.llvm.org/D110006



More information about the Openmp-commits mailing list