[llvm] [AMDGPU][SIPreEmitPeephole] mustRetainExeczBranch: estimate ThenBlock cost using MachineTraceInfo (PR #111117)

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 10 10:37:15 PDT 2024


Juan Manuel Martinez Caamaño <juamarti at amd.com>
In-Reply-To: <llvm.org/llvm/llvm-project/pull/111117 at github.com>


================
@@ -304,11 +317,24 @@ bool SIPreEmitPeephole::getBlockDestinations(
   return true;
 }
 
-bool SIPreEmitPeephole::mustRetainExeczBranch(
-    const MachineBasicBlock &From, const MachineBasicBlock &To) const {
-  unsigned NumInstr = 0;
-  const MachineFunction *MF = From.getParent();
+bool SIPreEmitPeephole::mustRetainExeczBranch(const MachineInstr &Branch,
+                                              const MachineBasicBlock &From,
+                                              const MachineBasicBlock &To) {
+
+  const MachineBasicBlock &Head = *Branch.getParent();
+  const auto *FromIt = find(Head.successors(), &From);
+  assert(FromIt != Head.succ_end());
+
+  auto BranchProb = Head.getSuccProbability(FromIt);
+  if (BranchProb.isUnknown())
+    return false;
+
+  uint64_t BranchTakenCost =
+      TII->getSchedModel().computeInstrLatency(&Branch, false);
+  constexpr uint64_t BranchNotTakenCost = 1;
----------------
arsenm wrote:

I think this needs to be bumped up. It should be at least 4 for targets older than gfx10 (< gfx10). IIRC the cost of a not-taken branch was higher than that, but I don't remember the exact number.
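
A minimal sketch of what a generation-dependent not-taken cost could look like, in place of the fixed `constexpr uint64_t BranchNotTakenCost = 1;`. The `Generation` enum and the `branchNotTakenCost` helper are hypothetical names used only for illustration, not the real subtarget API. The value 4 is the lower bound suggested above for < gfx10, and 1 is the constant from the patch; neither is a measured number.

```cpp
#include <cstdint>

// Hypothetical stand-in for the subtarget generation; the real pass would
// query the subtarget instead of using this enum.
enum class Generation { GFX9, GFX10, GFX11 };

// Assumed costs: 4 is the lower bound suggested in the review for targets
// older than gfx10; 1 is the constant currently used in the patch.
constexpr uint64_t branchNotTakenCost(Generation Gen) {
  return Gen < Generation::GFX10 ? 4 : 1;
}

static_assert(branchNotTakenCost(Generation::GFX9) == 4,
              "pre-gfx10 uses the higher not-taken cost");
static_assert(branchNotTakenCost(Generation::GFX10) == 1,
              "gfx10 and later keep the patch's constant");
```

If the actual not-taken cost on older targets turns out to be higher than 4, only the returned constant would need to change.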

https://github.com/llvm/llvm-project/pull/111117

