[llvm] [AMDGPU] Prefer v_madak_f32 over v_madmk_f32 to reduce vgpr pressure (PR #72506)

David Stuttard via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 16 04:40:00 PST 2023


================
@@ -3454,6 +3454,19 @@ bool SIInstrInfo::FoldImmediate(MachineInstr &UseMI, MachineInstr &DefMI,
       if (!Src2->isReg() || RI.isSGPRClass(MRI->getRegClass(Src2->getReg())))
         return false;
 
+      // If src2 is also a literal constant then we have to choose which one to
+      // fold. In general it is better to choose madak so that the other literal
+      // can be materialized in an sgpr instead of a vgpr:
+      //   s_mov_b32 s0, literal
+      //   v_madak_f32 v0, s0, v0, literal
+      // Instead of:
+      //   v_mov_b32 v1, literal
+      //   v_madmk_f32 v0, v0, literal, v1
+      MachineInstr *Def = MRI->getUniqueVRegDef(Src2->getReg());
+      if (Def && Def->isMoveImmediate() &&
+          !isInlineConstant(Def->getOperand(1)))
+        return false;
+
----------------
dstutt wrote:

Answering my own question - I'm pretty sure that's the case.

https://github.com/llvm/llvm-project/pull/72506


More information about the llvm-commits mailing list