[PATCH] D149773: [AMDGPU][IGLP] Add iglp_opt(1) strategy for single wave gemms

Austin Kerbow via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Jun 4 23:04:56 PDT 2023


kerbowa added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp:876
+class MFMASmallGemmSingleWaveOpt final : public IGLPStrategy {
 private:
 public:
----------------
Extra private.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp:927
+  // double count.
+  for (; I != E; I++) {
+    if (I->second)
----------------
For this type of idiom wouldn't you normally use a visited set? Could also be improved by tracking all the VEMM_R and their associated DS_WRITE with perms via some map?  


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp:934
+        continue;
+      auto FirstPred = std::find_if(
+          I->first->Preds.begin(), I->first->Preds.end(), [](const SDep &Pred) {
----------------
Could both of these find_if before the if statement be hoisted to the outer loop?


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp:940
+
+      SDep *VMEMSucc = std::find_if(
+          FirstPred->getSUnit()->Succs.begin(),
----------------
It looks like the assumption is that each V_PERM DS_WRITE pair will have one VMEM load associated with it, since the loop stops at the first VMEM found? Is this correct?


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp:946
+          });
+      if (VMEMSucc == FirstPred->getSUnit()->Succs.end())
+        continue;
----------------
 If there is no vmem successors can you mark the instruction as visited, or else just move this before the inner loop like I recommend above?


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp:968
+    }
+  }
 
----------------
I think the calculation for DSWWithSharedVMEMCount could be simplified by iterating over DSWithPerms once, while adding each found VMEM to a list or map with a count that increases by 1 for each found DSWithPerm paired with that VMEM. To avoid some of the repeated traversals of preds/succs of the same instructions?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149773/new/

https://reviews.llvm.org/D149773



More information about the llvm-commits mailing list