[llvm] [MCA][X86] Pretend To Have a Stack Engine (PR #153348)

Andrea Di Biagio via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 14 03:53:15 PDT 2025


================
@@ -36,11 +36,31 @@ void X86InstrPostProcess::setMemBarriers(std::unique_ptr<Instruction> &Inst,
   }
 }
 
+void X86InstrPostProcess::useStackEngine(std::unique_ptr<Instruction> &Inst,
+                                         const MCInst &MCI) {
+  if (X86::isPOP(MCI.getOpcode())) {
+    assert(Inst->getUses().size() == 1 &&
+           "Expected pop instruction to only use stack pointer register");
+    Inst->getUses().clear();
+  }
----------------
adibiagio wrote:

PUSH instructions are essentially store operations, and they are guaranteed to always be processed in order by the LSUnit (see method dispatch() https://github.com/llvm/llvm-project/blob/main/llvm/lib/MCA/HardwareUnits/LSUnit.cpp#L69).

However, POP instructions are problematic because they behave like normal load operations, and the LSUnit allows loads to be reordered w.r.t. other loads (see code below).

```
  // A load may pass a previous load.
  MemoryGroup &Group = getGroup(CurrentLoadGroupID);
  Group.addInstruction();
  return CurrentLoadGroupID;
```

Before your patch, this was fine because POP instructions were also serialized by a register dependency on the RSP updates. So, although technically they didn't alias with each other (i.e. they were part of the same load group), they were never scheduled out of order.

A quick workaround is to force each stack load (e.g. POP) into its own load group. Groups are guaranteed to execute in order, so that would force POP instructions to always execute in the right sequence. Note, however, that this isn't an ideal solution, because it prevents unrelated non-stack loads from being reordered w.r.t. POP instructions.
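To make the trade-off concrete, here is a toy sketch of that workaround. It is not the real `llvm::mca::LSUnit`: `ToyLoadQueue`, `dispatchLoad` and the `IsStackLoad` flag are hypothetical names, and the model simply assumes that groups execute in ID order.

```cpp
#include <cassert>

// Toy model only (NOT the real LSUnit): every load is assigned a group ID,
// and groups are assumed to execute in increasing ID order.
struct ToyLoadQueue {
  unsigned CurrentLoadGroupID = 0;  // 0 means "no load group exists yet".
  unsigned NextGroupID = 1;
  bool CurrentGroupIsStack = false; // True if the current group holds a POP.

  // Returns the group assigned to the dispatched load.
  // IsStackLoad is a hypothetical flag that would be set for POP.
  unsigned dispatchLoad(bool IsStackLoad) {
    assert(NextGroupID != 0 && "group ID overflow");
    // A stack load always opens a fresh group. A normal load opens one only
    // if there is no group yet or the current group belongs to a stack load;
    // otherwise it joins the current group and may pass earlier loads in it.
    if (IsStackLoad || CurrentGroupIsStack || CurrentLoadGroupID == 0) {
      CurrentLoadGroupID = NextGroupID++;
      CurrentGroupIsStack = IsStackLoad;
    }
    return CurrentLoadGroupID;
  }
};
```

In this model two consecutive POPs land in different groups, so they stay ordered, but a normal load dispatched after a POP also lands in a later group. That is the lost reordering opportunity mentioned above.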

There may be another way to fix this issue. However, it needs to be tested to see if it actually works:

It requires adding a target hook to check whether an opcode is a known stack operation, e.g. `bool isStackOperation(opcode op)`. By default, it would always return `false`; on x86, it would return `true` for PUSH/POP.

In InstrBuilder, we check if an instruction is a known stack operation. If so, we artificially override the write latency of RSP to 0. That might work, although I didn't test it. It should prevent reordering because the dependency is not eliminated. However, it should theoretically allow scheduling multiple PUSH/POP at the same time. That's the part that needs to be tested (I didn't check the Scheduler logic).
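The intended effect can be sketched with a toy model. Everything here is hypothetical: `ToyOpcode`, the free function `isStackOperation`, and `rspReadyCycles` are illustration-only names, not existing MCA APIs. The sketch only tracks when RSP becomes available along a chain of instructions that each read and write it, and says nothing about what the real Scheduler does.

```cpp
#include <cassert>
#include <vector>

enum class ToyOpcode { Push, Pop, Other };

// Hypothetical hook: a default implementation would return false for every
// opcode; this models the x86 override for PUSH/POP.
bool isStackOperation(ToyOpcode Op) {
  return Op == ToyOpcode::Push || Op == ToyOpcode::Pop;
}

// Each instruction in Chain reads RSP and then writes it. Returns, for every
// instruction, the cycle at which its RSP input becomes available.
std::vector<unsigned> rspReadyCycles(const std::vector<ToyOpcode> &Chain,
                                     unsigned DefaultLatency, bool UseHook) {
  std::vector<unsigned> Ready;
  unsigned RSPAvailable = 0;
  for (ToyOpcode Op : Chain) {
    Ready.push_back(RSPAvailable);
    // The dependency edge is kept; only its latency is overridden to 0.
    unsigned Latency = (UseHook && isStackOperation(Op)) ? 0 : DefaultLatency;
    RSPAvailable += Latency;
  }
  return Ready;
}
```

In this model, three back-to-back POPs with a one-cycle RSP write latency become ready at cycles 0, 1 and 2 without the hook, and all at cycle 0 with it, while the ready cycles never decrease along the chain. Whether the actual Scheduler would then issue them in the same cycle is the untested part.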

I hope it helps.
-Andrea

https://github.com/llvm/llvm-project/pull/153348

