[PATCH] D106033: [OpenMP] Folding threadLimit and numThreads when single value in kernels

Shilei Tian via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jul 20 14:35:51 PDT 2021


tianshilei1992 added inline comments.


================
Comment at: llvm/include/llvm/Frontend/OpenMP/OMPKinds.def:208
 
+__OMP_RTL(__kmpc_get_hardware_num_blocks, false, Int32, )
+__OMP_RTL(__kmpc_get_hardware_num_threads_in_block, false, Int32, )
----------------
Probably worth a separate patch to add them.


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:3632
+          return indicatePessimisticFixpoint();
+        } else {
+          CurrentNumTeams = NumT;
----------------
No need to have `else` here because the code above already returns.


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:3635
+        }
+      } else {
+        // TODO: No attribute, then default?
----------------
no need for `else` as well


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:3636
+      } else {
+        // TODO: No attribute, then default?
+      }
----------------
Directly indicate pessimistic state because we don't know clearly what the number is.


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:3646
+
+    return getState() == StateBefore ? ChangeStatus::UNCHANGED
+                                     : ChangeStatus::CHANGED;
----------------
This has to be updated as `SimplifiedValue` is not part of `BooleanState`. Refer to `foldIsSPMDExecMode`.


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:3652
+  /// The value is an attribute in the kernel
+  ChangeStatus foldHardwareNumThreads(Attributor &A) {
+    // Specialize only if all the calls agree with the number of threads
----------------
Update accordingly


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:3705
+    A.getOrCreateAAFor<AAFoldRuntimeCall>(
+        IRPosition::callsite_function(*CI), /* QueryingAA */ nullptr,
+        DepClassTy::NONE, /* ForceUpdate */ false,
----------------
This part needs to be changed. Refer to the trunk for more details. Basically it should be `IRPosition::callsite_returned(*CI)`.


================
Comment at: openmp/libomptarget/deviceRTLs/common/src/reduction.cu:200
   uint32_t TeamId = GetBlockIdInKernel();
-  uint32_t NumTeams = GetNumberOfBlocksInKernel();
+  uint32_t NumTeams = __kmpc_get_hardware_num_blocks();
   static unsigned SHARED(Bound);
----------------
Since this part is moved to another patch, they should go in this patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106033/new/

https://reviews.llvm.org/D106033



More information about the llvm-commits mailing list