[PATCH] D92296: [AARCH64] Improve accumulator forwarding for Cortex-A57 model

Sat Dec 26 10:09:24 PST 2020

mnadeem marked an inline comment as done.
mnadeem added a comment.

@evgeny777 Regarding the PMUL latency: AArch64SchedA57.td probably still has the older latencies. Should these be updated? The optimization guide says:

> Cortex-A57 r1p0 and later reduce the latency of ASIMD multiply and multiply-with-accumulate instructions relative to r0pX.

@dmgreen You are correct: the only MUL forwarding is from FP MUL (FMUL/FNMUL) and ASIMD FP MUL (FMUL/FMULX). I will correct that and split the patch into two.

I'll have to see if there is any negative performance impact first.
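
For readers following the forwarding discussion, here is a rough sketch of how the scheduling model treats a forwarded accumulator operand: a read advance on the consumer shortens the effective latency, but only when the producer is one of the listed writes. The function name, the producer set, and all cycle counts below are illustrative assumptions, not values taken from AArch64SchedA57.td or the optimization guide. Only forwarding from the MUL producers named above is modeled.

```python
# Illustrative model only. The cycle counts used here are made-up numbers,
# not latencies from AArch64SchedA57.td or the Cortex-A57 optimization guide.

# MUL producers that forward to an accumulate operand, per the comment above.
# (Forwarding between multiply-accumulate instructions is not modeled here.)
MUL_FORWARDING_PRODUCERS = {"FMUL", "FNMUL", "FMULX"}


def effective_latency(producer: str, producer_latency: int, read_advance: int) -> int:
    """Cycles a dependent multiply-accumulate waits on its accumulator operand.

    Mirrors the read-advance idea in LLVM scheduling models: the advance is
    subtracted from the producer latency only for listed producers, and the
    result is clamped at zero.
    """
    if producer in MUL_FORWARDING_PRODUCERS:
        return max(0, producer_latency - read_advance)
    return producer_latency


if __name__ == "__main__":
    # Hypothetical numbers: a 5-cycle producer and a 4-cycle read advance.
    for op in ("FMUL", "FMULX", "PMUL"):
        print(op, effective_latency(op, producer_latency=5, read_advance=4))
```

With these hypothetical numbers, an accumulate fed by FMUL or FMULX sees a one-cycle dependency, while one fed by PMUL sees the full five cycles, which is the distinction the corrected patch needs to capture.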

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D92296/new/

https://reviews.llvm.org/D92296