[Openmp-commits] [PATCH] D113602: [OpenMP] Fix master thread barrier for Pascal and amdgpu

Joel E. Denny via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Thu Nov 11 20:07:07 PST 2021

jdenny marked an inline comment as done.
jdenny added inline comments.

Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:3463
+    //                       BlockSize = BlockHwSize - WarpSize;
+    //                       bool IsWorker = InitCB >= 0 && InitCB < BlockSize;
     //                       if (IsWorker) {
jdoerfert wrote:
> This doesn't quite work because now every thread in the last warp will execute the user code.
> I think the minimal addition is
> ```
> if (InitCB <u BlockSize)
>   return;
> ```
> and then whatever we had before.
Ah, you're right; that's not what I meant to do.  It only worked in my test because it eliminated the divergence in the last warp.
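
To make the issue concrete, here is a minimal sketch of how the predicate in the diff partitions threads. It is not the code from the patch. It assumes `InitCB` is -1 for the main thread and the hardware thread id for every other thread; the `Role` enum and both helper functions are hypothetical names introduced only for illustration.

```cpp
#include <cstdint>

// Hypothetical roles, used only for this illustration.
enum class Role { Worker, Main, IdleLastWarp };

// The predicate as written in the diff under review:
//   BlockSize = BlockHwSize - WarpSize;
//   IsWorker  = InitCB >= 0 && InitCB < BlockSize;
constexpr bool isWorkerPatched(int32_t InitCB, int32_t BlockHwSize,
                               int32_t WarpSize) {
  return InitCB >= 0 && InitCB < BlockHwSize - WarpSize;
}

// Three-way split under the assumption stated above: InitCB is -1 for the
// main thread and the hardware thread id otherwise.
constexpr Role classify(int32_t InitCB, int32_t BlockHwSize,
                        int32_t WarpSize) {
  return InitCB < 0 ? Role::Main
         : isWorkerPatched(InitCB, BlockHwSize, WarpSize)
             ? Role::Worker
             : Role::IdleLastWarp;
}

// Example: 128 hardware threads and a warp size of 32, so BlockSize is 96.
static_assert(isWorkerPatched(95, 128, 32), "thread 95 is a worker");
// Thread 100 sits in the last warp. The predicate alone says "not a worker",
// so it would fall through to the user code together with the main thread.
static_assert(!isWorkerPatched(100, 128, 32), "thread 100 is not a worker");
static_assert(classify(100, 128, 32) == Role::IdleLastWarp,
              "thread 100 should exit instead of running user code");
static_assert(classify(-1, 128, 32) == Role::Main, "-1 marks the main thread");
```

Under these assumptions, the extra last-warp threads need their own early exit before the worker check, which is what the suggested addition is meant to provide.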
