[llvm] [AMDGPU][SIPreEmitPeephole] mustRetainExeczBranch: use BranchProbability and TargetSchedmodel (PR #109818)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 4 02:12:06 PDT 2024
Juan Manuel Martinez Caamaño <juamarti at amd.com>
In-Reply-To: <llvm.org/llvm/llvm-project/pull/109818 at github.com>
================
@@ -326,15 +345,32 @@ bool SIPreEmitPeephole::mustRetainExeczBranch(
if (TII->hasUnwantedEffectsWhenEXECEmpty(MI))
return true;
- // These instructions are potentially expensive even if EXEC = 0.
- if (TII->isSMRD(MI) || TII->isVMEM(MI) || TII->isFLAT(MI) ||
- TII->isDS(MI) || TII->isWaitcnt(MI.getOpcode()))
- return true;
-
- ++NumInstr;
- if (NumInstr >= SkipThreshold)
+ if (TII->isWaitcnt(MI.getOpcode()))
return true;
}
+
+ if (!MinInstr)
+ MinInstr = Traces->getEnsemble(MachineTraceStrategy::TS_Local);
+
+ auto Trace = MinInstr->getTrace(&From);
+ ThenCyclesCost +=
+ std::max(Trace.getCriticalPath(), Trace.getResourceDepth(true));
+
+ // Consider `P = N/D` to be the probability of execz being false (skipping
+ // the then-block) The transformation is profitable if always executing the
+ // 'then' block is cheaper than executing sometimes 'then' and always
+ // executing s_cbranch_execz:
+ // * ThenCost <= P*ThenCost + (1-P)*BranchTakenCost + P*BranchNonTakenCost
+ // * (1-P) * ThenCost <= (1-P)*BranchTakenCost + P*BranchNonTakenCost
+ // * (D-N)/D * ThenCost <= (D-N)/D * BranchTakenCost + N/D *
+ // BranchNonTakenCost
+ uint64_t Numerator = BranchProb.getNumerator();
+ uint64_t Denominator = BranchProb.getDenominator();
+ bool IsProfitable = (Denominator - Numerator) * ThenCyclesCost <=
+ ((Denominator - Numerator) * BranchTakenCost +
+ Numerator * BranchNotTakenCost);
+ if (!IsProfitable)
+ return true;
----------------
arsenm wrote:
return !IsProfitable
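
For readers following the hunk, the integer form of the inequality and the suggested simplification can be sketched outside of LLVM as below. This is only an illustration under stated assumptions: `isRemovalProfitable` and `mustRetainBranch` are hypothetical standalone helpers, not functions from the patch, and the costs are plain cycle counts passed in by the caller. `Numerator` and `Denominator` stand for the N and D of P = N/D in the patch comment.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the patch): evaluates
//   (D-N) * ThenCost <= (D-N) * BranchTakenCost + N * BranchNotTakenCost
// which is the patch's inequality multiplied through by D, so no
// floating-point arithmetic is needed.
bool isRemovalProfitable(uint64_t Numerator, uint64_t Denominator,
                         uint64_t ThenCyclesCost, uint64_t BranchTakenCost,
                         uint64_t BranchNotTakenCost) {
  // Assumption: N/D is a valid probability, so 0 <= N <= D and D != 0.
  assert(Denominator != 0 && Numerator <= Denominator);
  uint64_t Complement = Denominator - Numerator;
  return Complement * ThenCyclesCost <=
         Complement * BranchTakenCost + Numerator * BranchNotTakenCost;
}

// Hypothetical wrapper showing the shape of the suggestion: return the
// negated condition directly instead of `if (!IsProfitable) return true;`.
bool mustRetainBranch(uint64_t Numerator, uint64_t Denominator,
                      uint64_t ThenCyclesCost, uint64_t BranchTakenCost,
                      uint64_t BranchNotTakenCost) {
  return !isRemovalProfitable(Numerator, Denominator, ThenCyclesCost,
                              BranchTakenCost, BranchNotTakenCost);
}
```

With N = 1, D = 2 and costs of 10 (then), 4 (taken) and 1 (not taken), the left side is 10 and the right side is 5, so the branch is retained. The products are computed in 64 bits; this sketch assumes the costs are small enough that they do not overflow.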
https://github.com/llvm/llvm-project/pull/109818