[Openmp-commits] [PATCH] D106033: [OpenMP] Folding threadLimit and numThreads when single value in kernels
Shilei Tian via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Tue Jul 20 14:35:51 PDT 2021
tianshilei1992 added inline comments.
================
Comment at: llvm/include/llvm/Frontend/OpenMP/OMPKinds.def:208
+__OMP_RTL(__kmpc_get_hardware_num_blocks, false, Int32, )
+__OMP_RTL(__kmpc_get_hardware_num_threads_in_block, false, Int32, )
----------------
Probably worth a separate patch to add them.
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:3632
+ return indicatePessimisticFixpoint();
+ } else {
+ CurrentNumTeams = NumT;
----------------
No need to have `else` here because the code above already returns.
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:3635
+ }
+ } else {
+ // TODO: No attribute, then default?
----------------
no need for `else` as well
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:3636
+ } else {
+ // TODO: No attribute, then default?
+ }
----------------
Directly indicate pessimistic state because we don't know clearly what the number is.
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:3646
+
+ return getState() == StateBefore ? ChangeStatus::UNCHANGED
+ : ChangeStatus::CHANGED;
----------------
This has to be updated as `SimplifiedValue` is not part of `BooleanState`. Refer to `foldIsSPMDExecMode`.
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:3652
+ /// The value is an attribute in the kernel
+ ChangeStatus foldHardwareNumThreads(Attributor &A) {
+ // Specialize only if all the calls agree with the number of threads
----------------
Update accordingly
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:3705
+ A.getOrCreateAAFor<AAFoldRuntimeCall>(
+ IRPosition::callsite_function(*CI), /* QueryingAA */ nullptr,
+ DepClassTy::NONE, /* ForceUpdate */ false,
----------------
This part needs to be changed. Refer to the trunk for more details. Basically it should be `IRPosition::callsite_returned(*CI)`.
================
Comment at: openmp/libomptarget/deviceRTLs/common/src/reduction.cu:200
uint32_t TeamId = GetBlockIdInKernel();
- uint32_t NumTeams = GetNumberOfBlocksInKernel();
+ uint32_t NumTeams = __kmpc_get_hardware_num_blocks();
static unsigned SHARED(Bound);
----------------
Since this part is moved to another patch, they should go in this patch.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D106033/new/
https://reviews.llvm.org/D106033
More information about the Openmp-commits
mailing list