[PATCH] D104928: [OMPIRBuilder] add minimalist reduction support

Michael Kruse via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jul 23 12:08:08 PDT 2021


Meinersbur added a comment.

In D104928#2852415 <https://reviews.llvm.org/D104928#2852415>, @ftynse wrote:

> I expect this to double the complexity and triple the length of the code. One reduction clause (with potentially several reductions in it) is mapped to one runtime call, which takes an array of pointers to reduction variables. Adding reductions progressively to that requires to modify many details in the IR, in particular the size of allocas and the signature of the outlined function, among other things. On the other hand, I expect the list of reductions to be fully available when processing a directive. I can propose using some ReductionInfo struct that contains all the relevant input pieces and have one array instead of four co-indexed arrays.

I was rather thinking of collecting the reductions and manifesting them in `OpenMPIRBuilder::finalize` like `createParallel` does. However, as you pointed out implementations likely know all the reductions, so there might not be useful. However, I like the idea of `ReductionInfo`.



================
Comment at: llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h:551-552
+  ///
+  /// \param Loc                The location where the reduction was
+  ///                           encountered.
+  /// \param AllocaIP           An insertion point suitable for allocas usable
----------------
What are the requirement for the location? Such as: Has to be between the last local store of a reduced location and the end of the thread.


================
Comment at: llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp:1037-1039
+  assert(ReductionGen.size() == Variables.size());
+  assert(AtomicReductionGen.size() == Variables.size() ||
+         AtomicReductionGen.empty());
----------------
assert descriptions missing.

Seems like the caller has to decide whether to use atomics or not. I think the API should itself decide whether atomics are possible or not. That is, if all `ReductionInfo::AtomicReductionGen` have a value we can use atomics, otherwise we fall back to `ReductionGen`.


================
Comment at: llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp:1044-1047
+  BasicBlock *ContinuationBlock =
+      Loc.IP.getBlock()->splitBasicBlock(Loc.IP.getPoint(), "reduce.finalize");
+  Loc.IP.getBlock()->getTerminator()->eraseFromParent();
+  Builder.SetInsertPoint(Loc.IP.getBlock(), Loc.IP.getBlock()->end());
----------------
`Loc.IP.getBlock()` is used orften enough to store it in a variable.


================
Comment at: llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp:1054-1056
+    IRBuilderBase::InsertPointGuard Guard(Builder);
+    Builder.restoreIP(AllocaIP);
+    RedArray = Builder.CreateAlloca(RedArrayTy, nullptr, "red.array");
----------------
This could be done before the previous `Builder.SetInsertPoint` without an `InsertPointGuard`


================
Comment at: llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp:1082
+  const DataLayout &DL = Module->getDataLayout();
+  unsigned RedArrayByteSize = divideCeil(DL.getTypeSizeInBits(RedArrayTy), 8);
+  Constant *RedArraySize = Builder.getInt64(RedArrayByteSize);
----------------
Consider using `getTypeStoreSize`


================
Comment at: llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp:1084-1086
+  FunctionCallee ReductionFunc = Module->getOrInsertFunction(
+      ".omp.reduction.func", Builder.getVoidTy(), Builder.getInt8PtrTy(),
+      Builder.getInt8PtrTy());
----------------
[serious] This will try to reuse the same `.omp.reduction.func` for all reductions in the module. I think you will want to add a new function with private linkage.


================
Comment at: llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp:1114
+  for (unsigned i = 0, e = Variables.size(); i < e; ++i) {
+    Value *RedValue = Builder.CreateLoad(Variables[i], "red.value." + Twine(i));
+    Value *PrivateRedValue = Builder.CreateLoad(
----------------
This produces a deprecated warning: Use the version that explicitly specifies the loaded type instead


================
Comment at: llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp:2625-2626
+  // Store initial values of reductions into global variables.
+  Builder.CreateStore(ConstantFP::get(Builder.getFloatTy(), 0.0), SumReduced);
+  Builder.CreateStore(Builder.getInt32(1), XorReduced);
+
----------------
Could you give these a name just such they are easier to identifier in the dump?


================
Comment at: llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp:2695-2710
+  auto SumReduction = [&](InsertPointTy IP, Value *LHS, Value *RHS,
+                          Value *&Result) {
+    IRBuilderBase::InsertPointGuard Guard(Builder);
+    Builder.restoreIP(IP);
+    Result = Builder.CreateFAdd(LHS, RHS, "red.add");
+    return Builder.saveIP();
+  };
----------------
Should OMPIRBuilder provide a set of standard reduction callbacks so users don't have to declare them?


================
Comment at: llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp:2819
+  BasicBlock *NonAtomicBB = Switch->case_begin()->getCaseSuccessor();
+  BasicBlock *AtomicBB = (++Switch->case_begin())->getCaseSuccessor();
+
----------------
Use `std::next` (or `case_begin() + 1`). `++` is modifying its argument which suggests that that `case_begin()` returns a reference. I am somewhat surprised that this even works.

>From cppreference.com:
> Although the expression ++c.begin() often compiles, it is not guaranteed to do so: c.begin() is an rvalue expression, and there is no LegacyInputIterator requirement that specifies that increment of an rvalue is guaranteed to work. In particular, when iterators are implemented as pointers or its operator++ is lvalue-ref-qualified, ++c.begin() does not compile, while std::next(c.begin()) does. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104928/new/

https://reviews.llvm.org/D104928



More information about the llvm-commits mailing list