[llvm] [AArch64] Use 0-cycle reg2reg MOVs for FPR32, FPR16, FPR8 (PR #144152)
David Green via llvm-commits
llvm-commits at lists.llvm.org
Fri Jun 20 06:27:01 PDT 2025
================
@@ -5302,30 +5302,73 @@ void AArch64InstrInfo::copyPhysReg(MachineBasicBlock &MBB,
if (AArch64::FPR32RegClass.contains(DestReg) &&
AArch64::FPR32RegClass.contains(SrcReg)) {
- BuildMI(MBB, I, DL, get(AArch64::FMOVSr), DestReg)
- .addReg(SrcReg, getKillRegState(KillSrc));
+ if (Subtarget.isTargetDarwin() && Subtarget.hasZeroCycleRegMove()) {
----------------
davemgreen wrote:
The zero-latency instructions on the Arm side are listed in the SWOGs (Software Optimization Guides) and are often a little different. As far as I could tell from the code, the existing HasZeroCycleRegMove might better be described as "HasZeroCycleXRegMoveButNotWRegMove". We tend to have both if we have either, so we don't prefer the xreg version over the wreg. For FPR regs it looks like if D moves are free then S moves are free too.
My point: what do you think of representing it as the individual instructions that are expected to be free? So something like `if (Subtarget.hasZeroCycleDRegMove() && !Subtarget.hasZeroCycleSRegMove()) ...`. Hopefully that is still simple enough and not too over-engineered.
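To make the suggestion concrete, here is a rough, standalone sketch of what per-instruction predicates could look like. It is not the PR's code. `MockSubtarget`, `FPR32Copy` and `selectFPR32Copy` are hypothetical names made up for illustration, and the two accessors are the ones proposed above, not existing LLVM API. It also assumes that the non-default path widens the copy to the D super-register, which is only one plausible reading of the patch.

```cpp
// Hypothetical stand-in for AArch64Subtarget. The accessor names are the ones
// proposed in this review comment; they are not existing LLVM API.
struct MockSubtarget {
  bool ZeroCycleDRegMove;
  bool ZeroCycleSRegMove;
  constexpr bool hasZeroCycleDRegMove() const { return ZeroCycleDRegMove; }
  constexpr bool hasZeroCycleSRegMove() const { return ZeroCycleSRegMove; }
};

// Hypothetical result type naming the two ways an FPR32 copy could be emitted.
enum class FPR32Copy {
  FMOVSr,           // plain S-register move (the pre-patch behaviour)
  FMOVDrOnSuperReg  // assumption: move the enclosing D registers instead
};

// Widen only when the D move is free and the S move is not. A core that has
// both (or neither) keeps the plain FMOVSr.
constexpr FPR32Copy selectFPR32Copy(const MockSubtarget &ST) {
  return (ST.hasZeroCycleDRegMove() && !ST.hasZeroCycleSRegMove())
             ? FPR32Copy::FMOVDrOnSuperReg
             : FPR32Copy::FMOVSr;
}
```

With this shape, a core where both D and S moves are free reports both features and keeps the plain S-register move, so no target check such as `isTargetDarwin()` would be needed in the condition.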
https://github.com/llvm/llvm-project/pull/144152