[Openmp-commits] [PATCH] D132045: [OpenMPOpt] Improving memory transfer latency hiding
Johannes Doerfert via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Wed Aug 24 09:58:51 PDT 2022
jdoerfert added a comment.
Mostly minor comments that should be addressed. Also, test are missing.
Maybe split it into the runtime part and the compiler part to make the patches smaller.
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4186
};
+bool splitMapperToIssueAndWait(CallInst *RuntimeCall,
----------------
Put all of this into an anonymous namespace or make them member functions of OpenMPOpt, or a mix of the two.
Also add documentation describing what is happening briefly.
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4193
+ Instruction *FirstInst = &(F->getEntryBlock().front());
+ AllocaInst *Handle = new AllocaInst(IRBuilder.AsyncInfo, F->getAddressSpace(),
+ "handle", FirstInst);
----------------
Use `DL.getAllocaAddrSpace()` to get the AddressSpace, with DL being the data layout (`const DataLayout &DL = M->getDataLayout();`)
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4213
+ RuntimeCall->getArgOperand(OffloadArray::DeviceIDArgNum), // device_id.
+ Handle // handle to wait on.
+ };
----------------
Put the comment in the line before the value.
Also add some assertions to verify their type matches what you expect.
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4222
+
+SmallVector<std::pair<Instruction *, Instruction *>>
+getUseTreeDuplicated(Instruction *I) {
----------------
Don't return a container, pass it in by reference instead.
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4241
+
+ UseTreeDuplicated.erase(UseTreeDuplicated.begin());
+ return UseTreeDuplicated;
----------------
This operation is going to be costly, can we avoid it?
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4246
+bool reorganizeDuplicatedInstructions(
+ SmallVector<std::pair<Instruction *, Instruction *>> UseTreeDuplicated,
+ Instruction *I) {
----------------
Pass containers by reference to avoid copies.
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4256
+ return true;
+}
+
----------------
Why didn't we do this right away when we created the UseTree vector?
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4260
+ BasicBlock *RuntimeCallBB = RuntimeCall.getParent();
+ CallInst *IssueRuntimeCall = &RuntimeCall;
+
----------------
Avoid duplication. RuntimeCall is fine to be used instead below.
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4279
+ BasicBlock *NextBB =
+ MapperBB->splitBasicBlock(MapperBB->front().getNextNode(), "next.bb");
+
----------------
isn't `front()` just `RuntimeCall`? If so, use it.
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4303
+ Function *OutlinedFunc = CE.extractCodeRegion(CEAC);
+ OutlinedFunc->setName("mapper_issue_wrapper");
+
----------------
Use a protected namespace name, e.g.,
`"__openmp_mapper_issue_wrapper_" + MapperBB->getParent()->getName()`
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4319
+ return true;
+}
+
----------------
Move this code into the above function.
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4333
+ return RTCall;
+}
+
----------------
This is not a good idea. Instead, return the calls as part of the splitting. This won't work if you split multiple times in the same function and it also is really costly.
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4337
+bool canMoveThrough(CallInst *CI, Instruction *I, AliasAnalysis &AA) {
+ // TODO: check again for mayReadFromMemory
+ if (!(I->mayHaveSideEffects()) &&
----------------
I don't recall why we have the callbase + read condition below. Maybe remove it for now?
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4343-4346
+ if (isNoModRef(MR))
+ return true;
+
+ return false;
----------------
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4367
+ return false;
+}
+// check if a bb exists in a vector
----------------
You need to keep track of the blocks you visited or loops will cause this to continue forever.
Once you have a visited set, you can also pop the last SuccVec member always as you don't need all of them.
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4374
+ return false;
+}
+
----------------
Linear searches are generally not a good idea. I'll comment on the call site.
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4406
+bool moveIssueRTCInOrigBB(CallInst *IssueWrapperCall, AliasAnalysis &AA) {
+ Instruction *IssuMovePoint;
+ Instruction *I = IssueWrapperCall;
----------------
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4432-4435
+ if (!I) {
+ RTCallWait->moveBefore(RTCallWait->getParent()->getTerminator());
+ return true;
+ }
----------------
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4467
+ bool Changed = false;
+ auto AsyncMemTransfers = [&](Use &U, Function &Decl) {
+ auto *RTCall = OpenMPOpt::getCallIfRegularCall(U, &RFI);
----------------
The return value of this function is defined as:
```
┊ /// Run the callback \p CB on each use within the function \p F and forget
┊ /// the use if the result is true.
```
The `Changed` variable is good to keep track overall but each use that is removed has to be reported via the return value of the callback.
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4481
+ Function *IssueWrapper = outlineMapperIssueRT(*RTCallIssue);
+ if (!IssueWrapper) // cannot outline the function for some reasons
+ return Changed;
----------------
Avoid these trailing comments.
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4486
+ CallInst *IssueWrapperCall =
+ getRTFunctionCall(IssueWrapper->getName().str(), F);
+
----------------
Also here, return the call from `outlineMapperIssueRT` or go through the users of `IssueWrapper`, don't "search" for the call by name.
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4488
+
+ MergeBlockIntoPredecessor(IssueWrapperCall->getParent());
+ BasicBlock *IssueBB = IssueWrapperCall->getParent();
----------------
This can be done in the outlineMapperIssueRT function.
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4493
+ moveWaitRTCInOrigBB(IssueWrapperCall, RTCallWait, AA);
+ bool issueMovedInOrigBB = moveIssueRTCInOrigBB(IssueWrapperCall, AA);
+
----------------
Style: IssueMoved...
Also other places, like `split` above.
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4502
+ BasicBlock *NextBB;
+ DominatorTree DT = DominatorTree(F);
+
----------------
Use FAM to get the DominatorTree.
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4520
+ if (moveIssueRTCInBB(IssueWrapperCall, CurrentBB, AA))
+ Changed = true;
+
----------------
The move code here should go into the moveIssue helper. No need to split it into two parts.
================
Comment at: llvm/lib/Transforms/IPO/OpenMPOpt.cpp:5377
+ return PreservedAnalyses::none();
+ }
----------------
You should move this into the run method of OpenMPOpt as there is also the split part, no?
This should be combined with (or replace) the existing code in `hideMemTransfersLatency`
================
Comment at: openmp/libomptarget/include/omptarget.h:190
DeviceTy &Device;
+ bool SyncFlag;
----------------
================
Comment at: openmp/libomptarget/include/omptarget.h:193
public:
- AsyncInfoTy(DeviceTy &Device) : Device(Device) {}
- ~AsyncInfoTy() { synchronize(); }
+ AsyncInfoTy(DeviceTy &Device, bool Sf = 1) : Device(Device), SyncFlag(Sf) {}
+
----------------
================
Comment at: openmp/libomptarget/include/omptarget.h:203
int synchronize();
-
/// Return a void* reference with a lifetime that is at least as long as this
----------------
unrelated.
================
Comment at: openmp/libomptarget/include/omptarget.h:292
+void __tgt_target_data_begin_mapper_wait(ident_t *Loc, int64_t DeviceId,
+ __tgt_async_info *Handle);
void __tgt_target_data_begin_nowait_mapper(
----------------
newline. Also, add a comment above both to describe that what they do.
================
Comment at: openmp/libomptarget/src/interface.cpp:110
+ if (Rc == OFFLOAD_SUCCESS && !Handle)
Rc = AsyncInfo.synchronize();
+
----------------
make it
```
if (SUCCESS) {
if (handle)
...
else
...
}
```
to help people read it.
================
Comment at: openmp/libomptarget/src/interface.cpp:121
+
+ DP("Entering data begin region for device %" PRId64 " with %d mappings\n",
+ DeviceId);
----------------
Modify the text to make sure we know we are in the wait now
================
Comment at: openmp/libomptarget/src/interface.cpp:128
+ return;
+ }
+
----------------
We need to extract the logic to get the DeviceId from the above function.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D132045/new/
https://reviews.llvm.org/D132045
More information about the Openmp-commits
mailing list