[PATCH] D153883: [Clang][OpenMP] Enable use of __kmpc_alloc_shared for VLAs defined in AMD GPU offloaded regions
Joseph Huber via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Tue Jun 27 08:22:58 PDT 2023
jhuber6 added a comment.
So this replaces the usual `stacksave`-based VLA allocation with `__kmpc_alloc_shared`? That makes sense, since the OpenMP standard expects stack variables to be shareable between threads. I wonder how this interacts with `-fopenmp-cuda-mode`.
================
Comment at: clang/lib/CodeGen/CGDecl.cpp:1603
+ // deallocation call of __kmpc_free_shared() is emitted later.
+ if (getLangOpts().OpenMP && getTarget().getTriple().isAMDGCN()) {
+ // Emit call to __kmpc_alloc_shared() instead of the alloca.
----------------
Does NVPTX handle this already? If not, is there a compelling reason to exclude NVPTX? Otherwise we should check whether we are compiling for the OpenMP device instead of checking for the AMDGCN triple.
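To make the difference between the two guards concrete, here is a minimal stand-alone sketch. It is not Clang code: `CompileContext`, `Arch`, `IsOpenMPDevice`, and both predicate names are hypothetical stand-ins, and the real check would go through Clang's own language options and target info. It only models which configurations each guard would select.

```cpp
// Stand-alone model of the guard under discussion; NOT Clang's real API.
// Every type, field, and function name here is a hypothetical stand-in.
enum class Arch { X86_64, AMDGCN, NVPTX };

struct CompileContext {
  bool OpenMPEnabled;   // models getLangOpts().OpenMP being non-zero
  bool IsOpenMPDevice;  // assumed flag: this is the device-side compilation
  Arch TargetArch;      // models the architecture of the target triple
};

// Guard as written in the patch: OpenMP is enabled and the triple is AMDGCN.
constexpr bool usesSharedAllocAsPatched(const CompileContext &C) {
  return C.OpenMPEnabled && C.TargetArch == Arch::AMDGCN;
}

// Guard as suggested in the review: any OpenMP device compilation.
constexpr bool usesSharedAllocAsSuggested(const CompileContext &C) {
  return C.OpenMPEnabled && C.IsOpenMPDevice;
}

// Three representative configurations.
constexpr CompileContext AmdgcnDevice{true, true, Arch::AMDGCN};
constexpr CompileContext NvptxDevice{true, true, Arch::NVPTX};
constexpr CompileContext X86Host{true, false, Arch::X86_64};

// Both guards agree on AMDGCN device code and on host code...
static_assert(usesSharedAllocAsPatched(AmdgcnDevice), "patched: AMDGCN");
static_assert(usesSharedAllocAsSuggested(AmdgcnDevice), "suggested: AMDGCN");
static_assert(!usesSharedAllocAsPatched(X86Host), "patched: host");
static_assert(!usesSharedAllocAsSuggested(X86Host), "suggested: host");

// ...but only the suggested guard also covers NVPTX device code.
static_assert(!usesSharedAllocAsPatched(NvptxDevice), "patched: NVPTX");
static_assert(usesSharedAllocAsSuggested(NvptxDevice), "suggested: NVPTX");
```

Under these assumptions the only configuration where the two guards disagree is NVPTX device code, which is exactly the case the question above is about.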
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D153883/new/
https://reviews.llvm.org/D153883