[Openmp-commits] [PATCH] D132045: [OpenMPOpt] Improving memory transfer latency hiding

Johannes Doerfert via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Wed Aug 24 09:58:51 PDT 2022


jdoerfert added a comment.

Mostly minor comments that should be addressed. Also, test are missing.
Maybe split it into the runtime part and the compiler part to make the patches smaller.



================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4186
 };
 
+bool splitMapperToIssueAndWait(CallInst *RuntimeCall,
----------------
Put all of this into an anonymous namespace or make them member functions of OpenMPOpt, or a mix of the two. 

Also add documentation describing what is happening briefly.


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4193
+  Instruction *FirstInst = &(F->getEntryBlock().front());
+  AllocaInst *Handle = new AllocaInst(IRBuilder.AsyncInfo, F->getAddressSpace(),
+                                      "handle", FirstInst);
----------------
Use `DL.getAllocaAddrSpace()` to get the AddressSpace, with DL being the data layout (`const DataLayout &DL = M->getDataLayout();`)


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4213
+      RuntimeCall->getArgOperand(OffloadArray::DeviceIDArgNum), // device_id.
+      Handle // handle to wait on.
+  };
----------------
Put the comment in the line before the value.
Also add some assertions to verify their type matches what you expect.


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4222
+
+SmallVector<std::pair<Instruction *, Instruction *>>
+getUseTreeDuplicated(Instruction *I) {
----------------
Don't return a container, pass it in by reference instead.


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4241
+
+  UseTreeDuplicated.erase(UseTreeDuplicated.begin());
+  return UseTreeDuplicated;
----------------
This operation is going to be costly, can we avoid it?


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4246
+bool reorganizeDuplicatedInstructions(
+    SmallVector<std::pair<Instruction *, Instruction *>> UseTreeDuplicated,
+    Instruction *I) {
----------------
Pass containers by reference to avoid copies.


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4256
+  return true;
+}
+
----------------
Why didn't we do this right away when we created the UseTree vector? 


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4260
+  BasicBlock *RuntimeCallBB = RuntimeCall.getParent();
+  CallInst *IssueRuntimeCall = &RuntimeCall;
+
----------------
Avoid duplication. RuntimeCall is fine to be used instead below.


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4279
+  BasicBlock *NextBB =
+      MapperBB->splitBasicBlock(MapperBB->front().getNextNode(), "next.bb");
+
----------------
isn't `front()` just `RuntimeCall`? If so, use it.


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4303
+  Function *OutlinedFunc = CE.extractCodeRegion(CEAC);
+  OutlinedFunc->setName("mapper_issue_wrapper");
+
----------------
Use a protected namespace name, e.g.,

`"__openmp_mapper_issue_wrapper_" + MapperBB->getParent()->getName()`


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4319
+  return true;
+}
+
----------------
Move this code into the above function.


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4333
+  return RTCall;
+}
+
----------------
This is not a good idea. Instead, return the calls as part of the splitting. This won't work if you split multiple times in the same function and it also is really costly.


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4337
+bool canMoveThrough(CallInst *CI, Instruction *I, AliasAnalysis &AA) {
+  // TODO: check again for mayReadFromMemory
+  if (!(I->mayHaveSideEffects()) &&
----------------
I don't recall why we have the callbase + read condition below. Maybe remove it for now?


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4343-4346
+  if (isNoModRef(MR))
+    return true;
+
+  return false;
----------------



================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4367
+  return false;
+}
+// check if a bb exists in a vector
----------------
You need to keep track of the blocks you visited or loops will cause this to continue forever.

Once you have a visited set, you can also pop the last SuccVec member always as you don't need all of them.


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4374
+  return false;
+}
+
----------------
Linear searches are generally not a good idea. I'll comment on the call site.


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4406
+bool moveIssueRTCInOrigBB(CallInst *IssueWrapperCall, AliasAnalysis &AA) {
+  Instruction *IssuMovePoint;
+  Instruction *I = IssueWrapperCall;
----------------



================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4432-4435
+  if (!I) {
+    RTCallWait->moveBefore(RTCallWait->getParent()->getTerminator());
+    return true;
+  }
----------------



================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4467
+  bool Changed = false;
+  auto AsyncMemTransfers = [&](Use &U, Function &Decl) {
+    auto *RTCall = OpenMPOpt::getCallIfRegularCall(U, &RFI);
----------------
The return value of this function is defined as:
```
    ┊ /// Run the callback \p CB on each use within the function \p F and forget
    ┊ /// the use if the result is true.
```
The `Changed` variable is good to keep track overall but each use that is removed has to be reported via the return value of the callback.


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4481
+    Function *IssueWrapper = outlineMapperIssueRT(*RTCallIssue);
+    if (!IssueWrapper) // cannot outline the function for some reasons
+      return Changed;
----------------
Avoid these trailing comments.


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4486
+    CallInst *IssueWrapperCall =
+        getRTFunctionCall(IssueWrapper->getName().str(), F);
+
----------------
Also here, return the call from `outlineMapperIssueRT` or go through the users of `IssueWrapper`, don't "search" for the call by name.


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4488
+
+    MergeBlockIntoPredecessor(IssueWrapperCall->getParent());
+    BasicBlock *IssueBB = IssueWrapperCall->getParent();
----------------
This can be done in the outlineMapperIssueRT function.


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4493
+    moveWaitRTCInOrigBB(IssueWrapperCall, RTCallWait, AA);
+    bool issueMovedInOrigBB = moveIssueRTCInOrigBB(IssueWrapperCall, AA);
+
----------------
Style: IssueMoved...
Also other places, like `split` above.


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4502
+    BasicBlock *NextBB;
+    DominatorTree DT = DominatorTree(F);
+
----------------
Use FAM to get the DominatorTree.


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4520
+    if (moveIssueRTCInBB(IssueWrapperCall, CurrentBB, AA))
+      Changed = true;
+
----------------
The move code here should go into the moveIssue helper. No need to split it into two parts.


================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:5377
+    return PreservedAnalyses::none();
+  }
 
----------------
You should move this into the run method of OpenMPOpt as there is also the split part, no?
This should be combined with (or replace) the existing code in `hideMemTransfersLatency`


================
Comment at: openmp/libomptarget/include/omptarget.h:190
   DeviceTy &Device;
+  bool SyncFlag;
 
----------------



================
Comment at: openmp/libomptarget/include/omptarget.h:193
 public:
-  AsyncInfoTy(DeviceTy &Device) : Device(Device) {}
-  ~AsyncInfoTy() { synchronize(); }
+  AsyncInfoTy(DeviceTy &Device, bool Sf = 1) : Device(Device), SyncFlag(Sf) {}
+
----------------



================
Comment at: openmp/libomptarget/include/omptarget.h:203
   int synchronize();
-
   /// Return a void* reference with a lifetime that is at least as long as this
----------------
unrelated.


================
Comment at: openmp/libomptarget/include/omptarget.h:292
+void __tgt_target_data_begin_mapper_wait(ident_t *Loc, int64_t DeviceId,
+                                         __tgt_async_info *Handle);
 void __tgt_target_data_begin_nowait_mapper(
----------------
newline. Also, add a comment above both to describe that what they do.


================
Comment at: openmp/libomptarget/src/interface.cpp:110
+  if (Rc == OFFLOAD_SUCCESS && !Handle)
     Rc = AsyncInfo.synchronize();
+
----------------
make it 
```
if (SUCCESS) {
  if (handle)
    ...
  else
    ...
}
```
to help people read it.


================
Comment at: openmp/libomptarget/src/interface.cpp:121
+
+  DP("Entering data begin region for device %" PRId64 " with %d mappings\n",
+     DeviceId);
----------------
Modify the text to make sure we know we are in the wait now


================
Comment at: openmp/libomptarget/src/interface.cpp:128
+    return;
+  }
+
----------------
We need to extract the logic to get the DeviceId from the above function. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D132045/new/

https://reviews.llvm.org/D132045



More information about the Openmp-commits mailing list