[PATCH] D125588: [MachineCombiner] Improve MachineCombiner's cost model

Sat Jun 18 14:44:44 PDT 2022

dmgreen added a comment.

Hello - I have been trying to take a look at the problems here, though I feel without a lot of success.

The codegen changes that turn madd into mul+add look worse on paper, and the performance results you've quoted seem to be a noisy decrease in performance. That matches the results I have, which don't seem fantastic.

I think in general there shouldn't be any reason for the machine combiner to choose mul+add over madd. (There might be certain times on in-order cores where the mul+add is faster due to the exact scheduling, but the machine combiner isn't considering those characteristics, and we need to consider general codegen even if -mcpu=generic is using an in-order schedule.) A MUL is really a MADD with a WZR addend register, and with late forwarding the MADD should be preferred. We may be fighting the schedule a bit - it doesn't always report the schedules correctly. I think I can look into trying to improve that by simplifying it a little, but that might take some time to get right. And I worry it might not exactly fix the issues if the cost model isn't considering that instructions can have different latencies from each operand.
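
To make the per-operand latency point concrete, here is a rough sketch of the arithmetic I have in mind. It is not MachineCombiner code and it does not use any LLVM API. The struct, the function names and the latencies (3-cycle multiply, 1-cycle add, 1 cycle from a late-forwarded accumulator) are all made up for illustration and are not taken from any real scheduling model.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical latencies in cycles; assumptions for illustration only.
struct Latencies {
  int Mul;        // latency from the multiply operands to the result
  int Add;        // latency of a plain add
  int MAddAccFwd; // latency from the MADD accumulator when it is forwarded late
};

constexpr Latencies ExampleLat{3, 1, 1};

// mul+add: the add waits for the multiply result and for the addend.
constexpr int mulAddDepth(int MulOpsReady, int AddendReady, Latencies L) {
  return std::max(MulOpsReady + L.Mul, AddendReady) + L.Add;
}

// madd with per-operand latencies: only the multiply operands pay the full
// multiply latency, the accumulator is needed late.
constexpr int maddDepth(int MulOpsReady, int AddendReady, Latencies L) {
  return std::max(MulOpsReady + L.Mul, AddendReady + L.MAddAccFwd);
}

// madd with a single latency for the whole instruction: every operand,
// including the accumulator, is charged the full multiply latency.
constexpr int maddDepthUniform(int MulOpsReady, int AddendReady, Latencies L) {
  return std::max(MulOpsReady, AddendReady) + L.Mul;
}

// Worked example: multiply operands ready at cycle 0, addend ready at cycle 4
// (for instance, the addend sits on an accumulation chain).
inline void checkWorkedExample() {
  assert(mulAddDepth(0, 4, ExampleLat) == 5);
  assert(maddDepth(0, 4, ExampleLat) == 5);        // no worse than mul+add
  assert(maddDepthUniform(0, 4, ExampleLat) == 7); // looks worse than mul+add
}
```

With these assumed numbers, a model that charges one latency for the whole instruction sees madd finishing at cycle 7 against 5 for mul+add and would prefer the split, while the per-operand view has both finishing at cycle 5. When all operands are ready at cycle 0 the madd finishes at cycle 3 in both views against 4 for mul+add.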

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125588/new/

https://reviews.llvm.org/D125588