[llvm] [MCA] Extend -instruction-tables option with verbosity levels (PR #130574)

Julien Villette via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 21 03:37:06 PDT 2025


================
@@ -141,6 +241,33 @@ void InstructionInfoView::collectData(
     IIVDEntry.mayLoad = MCDesc.mayLoad();
     IIVDEntry.mayStore = MCDesc.mayStore();
     IIVDEntry.hasUnmodeledSideEffects = MCDesc.hasUnmodeledSideEffects();
+
+    if (PrintFullInfo) {
+      // Get latency with bypass
+      IIVDEntry.Bypass =
+          IIVDEntry.Latency - MCSchedModel::getBypassDelayCycles(STI, SCDesc);
+      IIVDEntry.OpcodeName = MCII.getName(Inst.getOpcode());
+      raw_string_ostream TempStream(IIVDEntry.Resources);
+      const MCWriteProcResEntry *Index = STI.getWriteProcResBegin(&SCDesc);
+      const MCWriteProcResEntry *Last = STI.getWriteProcResEnd(&SCDesc);
+      auto Sep = "";
+      for (; Index != Last; ++Index) {
+        if (!Index->ReleaseAtCycle)
+          continue;
+        const MCProcResourceDesc *MCProc =
+            SM.getProcResource(Index->ProcResourceIdx);
+        if (Index->ReleaseAtCycle > 1) {
----------------
jvillette38 wrote:

You are right! So the throughput is not constant...
```
          |  0  |  1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  |
SiFive7CQ    X           Y           Z           W
SiFive7VL          X     X     Y     Y     Z     Z     W     W
```
- 3 cycles for 1 instruction: RThroughput = 3
- 5 cycles for 2 instructions: RThroughput = 2.5
- 7 cycles for 3 instructions: RThroughput = 2.33
- 9 cycles for 4 instructions: RThroughput = 2.25

With x = number of instructions (here AcquireAtCycle = 1 and ReleaseAtCycle = 3):
- Cycles = (x * 2) + 1
- Cycles = (x * (ReleaseAtCycle - AcquireAtCycle)) + AcquireAtCycle
- RThroughput = [(x * (ReleaseAtCycle - AcquireAtCycle)) + AcquireAtCycle] / x

Do you agree with this formula?
But it does not solve our problem of defining a constant throughput...
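
To make the formula above easy to check against the timeline, here is a minimal sketch. The helper names `cyclesFor` and `rthroughputFor` are hypothetical and not part of llvm-mca; the sketch assumes identical instructions issued back to back on a single resource.

```cpp
#include <cassert>

// Hypothetical helpers for illustration only; they are not part of llvm-mca.
// Assumption: N identical instructions issue back to back on one resource,
// and each holds it from AcquireAtCycle up to ReleaseAtCycle.

// Total cycles for N instructions: N busy windows plus the initial offset.
double cyclesFor(unsigned N, unsigned AcquireAtCycle,
                 unsigned ReleaseAtCycle) {
  assert(N > 0 && "need at least one instruction");
  assert(ReleaseAtCycle >= AcquireAtCycle && "release must not precede acquire");
  return double(N) * (ReleaseAtCycle - AcquireAtCycle) + AcquireAtCycle;
}

// Average cycles per instruction over the same run.
double rthroughputFor(unsigned N, unsigned AcquireAtCycle,
                      unsigned ReleaseAtCycle) {
  return cyclesFor(N, AcquireAtCycle, ReleaseAtCycle) / N;
}
```

With AcquireAtCycle = 1 and ReleaseAtCycle = 3, `rthroughputFor(1, 1, 3)` gives 3 and `rthroughputFor(4, 1, 3)` gives 2.25, matching the list above.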

![image](https://github.com/user-attachments/assets/c6759d35-579d-4666-81d6-95d4ce5da968)

Should we consider a large number of instructions and take the limit as x goes to +oo? In that case, it would give a throughput of 2 = (ReleaseAtCycle - AcquireAtCycle).
And if we have more than one resource, the reciprocal throughput (RThroughput) would be (ReleaseAtCycle - AcquireAtCycle) / number of resources.

Finally: is it better to consider the throughput of a single instruction or of a long run of identical instructions?
Note: the maximum error between the two is equal to AcquireAtCycle.
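
For reference, the limit and the error bound can be written out as follows. This is only a sketch of the reasoning, with A and R as shorthand (introduced here) for AcquireAtCycle and ReleaseAtCycle, and n for the number of resources; the division by n assumes the instructions spread evenly over identical units.

```latex
\mathrm{RThroughput}(x) = \frac{x\,(R - A) + A}{x} = (R - A) + \frac{A}{x}

\lim_{x \to \infty} \mathrm{RThroughput}(x) = R - A
\qquad\text{(with $n$ resources: } \frac{R - A}{n}\text{)}

\mathrm{RThroughput}(1) - \lim_{x \to \infty} \mathrm{RThroughput}(x) = A
```

In the SiFive7VL example, A = 1 and R = 3, so the limit is 2 and the single-instruction value of 3 overestimates it by 1.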

https://github.com/llvm/llvm-project/pull/130574

