[PATCH] D135208: [AArch64] Swap 'lsl(val1, small-shmt)' to right hand side for ADD(lsl(val1,small-shmt), lsl(val2,large-shmt))

Dave Green via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Oct 9 01:43:50 PDT 2022


dmgreen accepted this revision.
dmgreen added a comment.
This revision is now accepted and ready to land.

Sorry for the delay. This sounds OK to me. Like you say, it could be based on LSLFast, but it shouldn't be worse for other CPUs and would be worthwhile at least for the generic tuning.

Other than some nitpicks, LGTM
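
For context, a minimal source-level example of the pattern this combine targets; the function and the register choices in the comment are illustrative, not taken from the patch's tests:

// Lowers to ADD(shl(a, 1), shl(b, 8)). After the swap, the small shift is the
// one folded into the ADD's shifted-register operand, giving roughly:
//   lsl x8, x1, #8
//   add x0, x8, x0, lsl #1
// rather than folding the more expensive lsl #8 into the ADD.
long f(long a, long b) { return (a << 1) + (b << 8); }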



================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16618
+  // On many AArch64 processors (Cortex A78, Neoverse N1/N2/V1, etc), ADD with
+  // LSL shift (shift <= 4) has smaller latency and larger throughput than AND
+  // with LSL (shift >= 4). For the rest of processors, this is no-op for
----------------
AND -> ADD


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16619
+  // LSL shift (shift <= 4) has smaller latency and larger throughput than AND
+  // with LSL (shift >= 4). For the rest of processors, this is no-op for
+  // performance or correcteness.
----------------
`> 4`


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16620
+  // with LSL (shift >= 4). For the rest of processors, this is no-op for
+  // performance or correcteness.
+  if (isOpcWithIntImmediate(LHS.getNode(), ISD::SHL, LHSImm) &&
----------------
correctness ;)
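
For readers skimming the thread, a rough sketch of the shape of the combine under discussion; the helper name and the exact guards below are illustrative, only isOpcWithIntImmediate, the <= 4 / > 4 split, and the operand swap itself come from the patch:

// Sketch only (inside AArch64ISelLowering.cpp): given the two operands of an
// ISD::ADD, swap them when the LHS is a small left shift and the RHS is a
// large one, so the cheap shift (amount <= 4) is the one instruction
// selection folds into the ADD's shifted-register form.
static SDValue swapSmallShiftToRHS(SDNode *N, SelectionDAG &DAG, SDValue LHS,
                                   SDValue RHS) {
  uint64_t LHSImm = 0, RHSImm = 0;
  if (isOpcWithIntImmediate(LHS.getNode(), ISD::SHL, LHSImm) &&
      isOpcWithIntImmediate(RHS.getNode(), ISD::SHL, RHSImm) &&
      LHSImm <= 4 && RHSImm > 4)
    return DAG.getNode(ISD::ADD, SDLoc(N), N->getValueType(0), RHS, LHS);
  return SDValue();
}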


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135208/new/

https://reviews.llvm.org/D135208


