[PATCH] D125588: [MachineCombiner] Improve MachineCombiner's cost model

Sun Jun 26 10:04:05 PDT 2022

dmgreen added a comment.

In D125588#3600680 <https://reviews.llvm.org/D125588#3600680>, @Carrot wrote:

> It sounds to me like the MADD is decoded into two uops, mul and add. Then the mul can be immediately executable once the two multipliers are available. Is this correct?

Not necessarily two microops exactly, but late-farwarding into the add operand of the MADD. The two mul operands are needed on cycle 0, but the add operand is needed on cycle 2. The schedule has this information, but it can easily get tripped up by the way it tries to specify it. Whatever happens here I think it might be worth trying to simplify that so that it is more correct more of the time, but that involves updating all codegen ever, so needs to be carefully handled.

> I agree in this case MADD is always preferred. But the problem is current MachineCombiner and latency computation doesn't model this behavior. It's an independent issue. I think it should be solved in another patch.

I think - the old costmodel was not great, but the new code is still incorrect for some cases. For the MADD case we don't handle the differing latencies of each of the input operand. (The latency for the mul operand might be 3, but for the add operand is 1. The cost just always uses "depth" + 3). There may also be differences when inserting multiple instructions too. The old model was `lastinstrdepth + sum(latencies)`. The new one is `firstinstrdepth + sum(latencies)`, which I feel may not be more accurate if the instructions in `InsInstrs` have dependencies that are not in InsInstrs.

As a concrete way forward - can you take the isCoalescableCopy and split it into a separate patch. It can also change the `if (!MI->isCopy())    return false;` to `if (!MI->isCopy())    return MI->isTransient();` and maybe rename the function to isTransient. That should be fine to go separately, I believe, and the isCoalescableCopy makes a nice change. For the rest we may need something that more accurately calculates the final depth for instructions with forwarding like the MADD case.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125588/new/

https://reviews.llvm.org/D125588