[Openmp-commits] [PATCH] D45326: [OpenMP] [CUDA plugin] Add support for teams reduction via scratchpad
    George Rokos via Phabricator via Openmp-commits 
    openmp-commits at lists.llvm.org
       
    Thu Apr  5 11:13:19 PDT 2018
    
    
  
grokos added a comment.
One caveat regarding Alexey's proposal: According to the CUDA programming guide <https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#dynamic-global-memory-allocation-and-operations>, `malloc` on the device allocates space from a fixed-size heap. The default size of this heap is 8 MB. If the reduction scratchpad requires more than 8 MB, allocating it from the device will fail. The heap size can be set from the host (via `cudaDeviceSetLimit` with `cudaLimitMallocHeapSize`, before any kernel that uses device `malloc` is launched), but for that to happen the host must know how large the scratchpad needs to be, which defeats the purpose of moving scratchpad allocation from the plugin to the nvptx runtime.
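
To make the size concern concrete, here is a minimal C++ sketch of the arithmetic. It is only an illustration: the helper names `scratchpadBytes` and `exceedsDefaultHeap`, and the "one fixed-size slot per team" sizing model, are assumptions made for this example and are not taken from the patch or the nvptx runtime. Only the 8 MB default comes from the programming guide.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Default size of the device malloc heap per the CUDA programming guide (8 MB).
constexpr std::size_t kDefaultDeviceHeapBytes = 8u * 1024u * 1024u;

// Hypothetical sizing model: one fixed-size slot per team.
// The real scratchpad layout in the nvptx runtime may differ.
std::size_t scratchpadBytes(std::size_t numTeams, std::size_t bytesPerTeam) {
  // Guard against overflow in the multiplication below.
  assert(bytesPerTeam == 0 ||
         numTeams <= std::numeric_limits<std::size_t>::max() / bytesPerTeam);
  return numTeams * bytesPerTeam;
}

// True when a device-side malloc of this size cannot succeed with the
// default heap. Smaller requests may still fail because of allocator
// overhead or other allocations sharing the heap.
bool exceedsDefaultHeap(std::size_t bytes) {
  return bytes > kDefaultDeviceHeapBytes;
}
```

Under this model, 1024 teams with 1 KB each need 1 MB and fit, while 65536 teams with 256 bytes each need 16 MB and do not. In the second case the host would have to raise the limit with `cudaDeviceSetLimit(cudaLimitMallocHeapSize, bytes)` before the launch, which is exactly the host-side knowledge the proposal was trying to avoid.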
Repository:
  rOMP OpenMP
https://reviews.llvm.org/D45326
    
    