[PATCH] D99586: [AArch64] Default to zero-cycle-zeroing FP registers.

Dave Green via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 30 10:15:34 PDT 2021


dmgreen added a comment.

OK. I think I see what's wrong. According to the A55 software optimization guide, the dual-issue rules for a movi are a little more restrictive than for an fmov, which can lead to slower code. We would probably want to prefer the fmov there, and the same probably applies to other in-order CPUs.

I don't have great visibility on other CPUs. I just happen to have some very low-noise A55 tests that can show whether this kind of small change is actually beneficial.

Judging from the other optimization guides, the two instructions should be treated the same, performance-wise. I would be surprised if an fmov s0, wzr wasn't really treated like a form of "FP move, immed", although I have no evidence one way or the other as to how it actually works.
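
For reference, a minimal sketch of the kind of code where the two zeroing forms come up. The function name is hypothetical and not part of the patch, and the assembly in the comments is only illustrative of the two candidates discussed here; the exact registers and operand forms depend on the subtarget features in effect.

```cpp
// Hypothetical helper, not taken from the patch. Returning a floating-point
// zero forces the backend to materialise 0.0 in an FP register.
//
// Two candidate lowerings on AArch64 (illustrative, register choice assumed):
//   fmov s0, wzr    // move from the integer zero register
//   movi d0, #0     // vector move-immediate of zero
//
// Which one is emitted depends on whether zero-cycle zeroing of FP registers
// is enabled for the target CPU.
float zero_float() {
    return 0.0f;
}
```

Compiling something like this for an A55 with and without the feature enabled, and comparing the emitted instruction, should show which form is being picked.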


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99586/new/

https://reviews.llvm.org/D99586


