[Openmp-commits] [PATCH] D113602: [OpenMP] Fix master thread barrier for Pascal and amdgpu

Johannes Doerfert via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Thu Nov 11 17:40:21 PST 2021


jdoerfert added inline comments.


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:3463
+    //                       BlockSize = BlockHwSize - WarpSize;
+    //                       bool IsWorker = InitCB >= 0 && InitCB < BlockSize;
     //                       if (IsWorker) {
----------------
This doesn't quite work: with this check, every thread in the last warp (not just the main thread) counts as a non-worker and will execute the user code.
I think the minimal addition is:
```
if (InitCB <u BlockSize)
  return;
```
and then whatever we had before.
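
To make the concern concrete, here is a small host-side sketch (not the code in OpenMPOpt.cpp) that counts how many threads of one block would fall through to the user code under the `IsWorker` check quoted above. The helper names are hypothetical. It assumes that `InitCB` is -1 for the main thread and the thread id for every other thread, that the main thread is the first thread of the last warp, and that the hardware block size is a multiple of the warp size.

```cpp
#include <cstdint>

// The check from the patch under review. Assumption: InitCB is -1 for the
// main thread and the thread id for every other thread.
constexpr bool isWorkerUnderPatch(int32_t InitCB, int32_t BlockHwSize,
                                  int32_t WarpSize) {
  const int32_t BlockSize = BlockHwSize - WarpSize;
  return InitCB >= 0 && InitCB < BlockSize;
}

// Counts the threads of one block that are not workers under that check and
// would therefore branch to the user code.
constexpr int32_t userCodeThreads(int32_t BlockHwSize, int32_t WarpSize) {
  // Assumption: the main thread is the first thread of the last warp.
  const int32_t MainTid = BlockHwSize - WarpSize;
  int32_t Count = 0;
  for (int32_t Tid = 0; Tid < BlockHwSize; ++Tid) {
    const int32_t InitCB = (Tid == MainTid) ? -1 : Tid;
    if (!isWorkerUnderPatch(InitCB, BlockHwSize, WarpSize))
      ++Count;
  }
  return Count;
}

// With 128 hardware threads and a warp size of 32, all 32 threads of the
// last warp reach the user code, although only the main thread should.
static_assert(userCodeThreads(128, 32) == 32,
              "the whole last warp reaches the user code");
```

The remaining threads of the last warp need an early exit of their own before the worker check, which is what the extra `return` is for.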


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:3547
+    IsWorker->setDebugLoc(DLoc);
+    BranchInst::Create(StateMachineBeginBB, UserCodeEntryBB, IsWorker, InitBB);
 
----------------
Nit: please say "main" instead of "master" (-master +main).


================
Comment at: openmp/libomptarget/DeviceRTL/src/Kernel.cpp:103
+  if (UseGenericStateMachine &&
+      mapping::getThreadIdInBlock() < mapping::getBlockSize())
     genericStateMachine(Ident);
----------------
Nit: please say "main" instead of "master" (-master +main).


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113602/new/

https://reviews.llvm.org/D113602


