vfdff wrote: Thanks. According to the document, the shortest forwarding latency is **1 cycle** for **madd/msub** when their accumulator operand depends on the result of a MAC operation. LGTM https://github.com/llvm/llvm-project/pull/82343