[llvm] [WIP] Optimize S_MOV frame index elimination support (PR #101322)

Wed Jul 31 05:41:37 PDT 2024

================
@@ -2568,30 +2568,46 @@ bool SIRegisterInfo::eliminateFrameIndex(MachineBasicBlock::iterator MI,
                 } else
                   Add.addImm(Offset).addReg(TmpResultReg, RegState::Kill);
               } else {
-                BuildMI(*MBB, MI, DL, TII->get(AMDGPU::V_MOV_B32_e32),
-                        TmpResultReg)
-                    .addImm(Offset);
                 assert(Offset > 0 &&
                        isUInt<24>(2 * ST.getMaxWaveScratchSize()) &&
                        "offset is unsafe for v_mad_u32_u24");
-                // We start with a frame pointer with a wave space value, and an
-                // offset in lane-space. We are materializing a lane space
-                // value. We can either do a right shift of the frame pointer to
-                // get to lane space, or a left shift of the offset to get to
-                // wavespace. We can right shift after the computation to get
-                // back to the desired per-lane value.
-                // We are using the mad_u32_u24 primarily as an add with no
-                // carry out clobber.
-                Add = BuildMI(*MBB, MI, DL, TII->get(AMDGPU::V_MAD_U32_U24_e64),
+                if (AMDGPU::isInlinableLiteral32(Offset,
+                                                 ST.hasInv2PiInlineImm())) {
----------------
arsenm wrote:

>  V_LSHRREV_B32_e64 comes before mad in first case

It does not need to. Nothing about the order of instructions should change in the inline immediate case. You just avoid the materialize with the v_mov_b32 

https://github.com/llvm/llvm-project/pull/101322