[llvm] [AMDGPU] Prefer v_madak_f32 over v_madmk_f32 to reduce vgpr pressure (PR #72506)
David Stuttard via llvm-commits
llvm-commits at lists.llvm.org
Thu Nov 16 04:40:00 PST 2023
================
@@ -3454,6 +3454,19 @@ bool SIInstrInfo::FoldImmediate(MachineInstr &UseMI, MachineInstr &DefMI,
   if (!Src2->isReg() || RI.isSGPRClass(MRI->getRegClass(Src2->getReg())))
     return false;
+  // If src2 is also a literal constant then we have to choose which one to
+  // fold. In general it is better to choose madak so that the other literal
+  // can be materialized in an sgpr instead of a vgpr:
+  //   s_mov_b32 s0, literal
+  //   v_madak_f32 v0, s0, v0, literal
+  // Instead of:
+  //   v_mov_b32 v1, literal
+  //   v_madmk_f32 v0, v0, literal, v1
+  MachineInstr *Def = MRI->getUniqueVRegDef(Src2->getReg());
+  if (Def && Def->isMoveImmediate() &&
+      !isInlineConstant(Def->getOperand(1)))
+    return false;
+
----------------
dstutt wrote:
Answering my own question - I'm pretty sure that's the case.
https://github.com/llvm/llvm-project/pull/72506
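
For readers skimming the hunk, here is a minimal standalone sketch of the decision the added check makes. It is a toy model, not LLVM code: `ToyOperand`, `DefKind`, `isToyInlineConstant`, and `shouldFoldAsMadmk` are hypothetical names, and the inline-constant test is a simplified assumption (integers -16..64 and +/-0.5, 1.0, 2.0, 4.0) rather than the target's real `isInlineConstant`.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for "what defines src2"; not LLVM's MachineOperand.
enum class DefKind { MoveImmediate, Other };

struct ToyOperand {
  DefKind Def;
  uint32_t Bits; // raw 32-bit immediate, meaningful when Def == MoveImmediate
};

// Simplified assumption of the 32-bit inline-constant rule: small integers
// and a handful of floats. The real rule has further target-specific cases.
inline bool isToyInlineConstant(uint32_t Bits) {
  int32_t I;
  std::memcpy(&I, &Bits, sizeof(I));
  if (I >= -16 && I <= 64)
    return true;
  float F;
  std::memcpy(&F, &Bits, sizeof(F));
  float A = F < 0 ? -F : F;
  return A == 0.5f || A == 1.0f || A == 2.0f || A == 4.0f;
}

// Mirrors the shape of the added check: decline the madmk fold when src2 is
// itself defined by a move of a non-inline literal, so that literal can
// instead become the K operand of madak and the other one can live in an sgpr.
inline bool shouldFoldAsMadmk(const ToyOperand &Src2) {
  if (Src2.Def == DefKind::MoveImmediate && !isToyInlineConstant(Src2.Bits))
    return false;
  return true;
}
```

In this model the madmk fold is declined only when src2 comes from a move of a true literal; an inline constant or an ordinary register definition for src2 leaves the existing madmk path unchanged.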