[Openmp-commits] [PATCH] D113602: [OpenMP] Fix master thread barrier for Pascal and amdgpu
Joel E. Denny via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Thu Nov 11 20:07:07 PST 2021
jdenny marked an inline comment as done.
jdenny added inline comments.
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:3463
+ // BlockSize = BlockHwSize - WarpSize;
+ // bool IsWorker = InitCB >= 0 && InitCB < BlockSize;
// if (IsWorker) {
----------------
jdoerfert wrote:
> This doesn't quite work because now every thread in the last warp will execute the user code.
> I think the minimal addition is
> ```
> if (InitCB <u BlockSize)
>   return;
> ```
> and then whatever we had before.
Ah, you're right, that's not what I meant to do. It only happened to work for my test because it eliminated the divergence in the last warp.
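
To make the problem concrete, here is a minimal host-side sketch that counts how many threads fail the quoted `IsWorker` check and would therefore fall through to the user code. It is only a model, not the code in OpenMPOpt.cpp. It assumes that the main thread is the first thread of the last warp, and that the init call returns -1 for the main thread and the hardware thread id for every other thread. The helpers `modelInitCB`, `isWorker`, and `countUserCodeThreads` are hypothetical names introduced for illustration.

```cpp
#include <cstdint>

// Hypothetical model of the value the init call hands to each thread.
// Assumption: the main thread is the first thread of the last warp and
// gets -1; every other thread gets its own hardware thread id.
constexpr int32_t modelInitCB(int32_t Tid, int32_t BlockHwSize,
                              int32_t WarpSize) {
  return Tid == BlockHwSize - WarpSize ? -1 : Tid;
}

// The worker check as written in the quoted snippet:
//   BlockSize = BlockHwSize - WarpSize;
//   bool IsWorker = InitCB >= 0 && InitCB < BlockSize;
constexpr bool isWorker(int32_t InitCB, int32_t BlockHwSize,
                        int32_t WarpSize) {
  const int32_t BlockSize = BlockHwSize - WarpSize;
  return InitCB >= 0 && InitCB < BlockSize;
}

// Counts the threads that are not classified as workers and so would
// continue into the user code under this model.
constexpr int32_t countUserCodeThreads(int32_t BlockHwSize,
                                       int32_t WarpSize) {
  int32_t Count = 0;
  for (int32_t Tid = 0; Tid < BlockHwSize; ++Tid)
    if (!isWorker(modelInitCB(Tid, BlockHwSize, WarpSize), BlockHwSize,
                  WarpSize))
      ++Count;
  return Count;
}
```

Under these assumptions, for a block of 128 hardware threads with 32-wide warps, `countUserCodeThreads(128, 32)` evaluates to 32: the main thread plus the other 31 threads of the last warp, rather than the main thread alone.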
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D113602/new/
https://reviews.llvm.org/D113602