[PATCH] D95976: [OpenMP] Simplify offloading parallel call codegen
Michael Kruse via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Wed Apr 21 10:54:38 PDT 2021
Meinersbur accepted this revision.
Meinersbur added a comment.
This revision is now accepted and ready to land.
This test seems to pass on Windows now. Please still fix the clang-format remarks, such as lines exceeding 80 characters.
================
Comment at: clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp:1150
llvm::Value *IsMaster =
Bld.CreateICmpEQ(RT.getGPUThreadID(CGF), getMasterThreadID(CGF));
Bld.CreateCondBr(IsMaster, MasterBB, EST.ExitBB);
----------------
ggeorgakoudis wrote:
> Meinersbur wrote:
> > There seem to be more unordered codegen calls, such as this one.
> Some previously emitted values can be re-used, e.g., GPUThreadID in line 1150 can re-use the value from line 1140 instead of being re-emitted. I've kept emitting them as was previously done. What is the preferred way to handle those?
`getGPUThreadID`/`getMasterThreadID` could cache the value if it is used multiple times, but that would also require emitting it in the entry block so that it is available anywhere in the function.
Otherwise, make a best effort to minimize the overhead for the cases where the optimizer cannot unify the duplicates, or in debug builds.
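To illustrate the caching idea, here is a minimal standalone sketch. It does not use the LLVM or Clang APIs: `ToyBlock`, `ToyFunctionEmitter`, `@toy_thread_id` and the helper function are hypothetical names made up for this example, and instructions are modelled as plain strings. The point it tries to show is only that the value is emitted once, into the entry block, and every later request in any block gets the cached name back.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Toy stand-in for a basic block: an ordered list of instruction strings.
// Hypothetical type, not llvm::BasicBlock.
struct ToyBlock {
  std::string Name;
  std::vector<std::string> Insts;
};

// Toy stand-in for per-function codegen state. All names are hypothetical.
class ToyFunctionEmitter {
public:
  ToyFunctionEmitter() : Entry{"entry", {}} {}

  // Returns the name of the thread ID value. The call is emitted at most
  // once, and always into the entry block, so the value is available in
  // every block emitted later.
  const std::string &getThreadID() {
    if (!CachedThreadID) {
      Entry.Insts.push_back("%tid = call i32 @toy_thread_id()");
      CachedThreadID = "%tid";
    }
    return *CachedThreadID;
  }

  // Emits a comparison against the (cached) thread ID into an arbitrary block.
  void emitIsMasterCheck(ToyBlock &BB, const std::string &MasterID) {
    const std::string &Tid = getThreadID();
    BB.Insts.push_back("%is_master." + std::to_string(NextId++) +
                       " = icmp eq i32 " + Tid + ", " + MasterID);
  }

  const ToyBlock &entry() const { return Entry; }

private:
  ToyBlock Entry;
  std::optional<std::string> CachedThreadID;
  unsigned NextId = 0;
};

// Emits two checks in two different blocks and returns how many
// instructions ended up in the entry block.
std::size_t countEntryInstsAfterTwoChecks() {
  ToyFunctionEmitter E;
  ToyBlock A{"a", {}}, B{"b", {}};
  E.emitIsMasterCheck(A, "%master");
  E.emitIsMasterCheck(B, "%master");
  assert(A.Insts.size() == 1 && B.Insts.size() == 1);
  return E.entry().Insts.size();
}
```

In the actual codegen the insertion point handling would be more involved (for example, inserting before the entry block's terminator), which this sketch leaves out.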
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D95976/new/
https://reviews.llvm.org/D95976