[llvm] [AArch64][Machine-Combiner] Split loads into lanes of neon vectors into multiple vectors when possible (PR #142941)
David Green via llvm-commits
llvm-commits at lists.llvm.org
Mon Jun 16 01:21:38 PDT 2025
================
@@ -7317,6 +7319,57 @@ static bool getMiscPatterns(MachineInstr &Root,
   return false;
 }
+/// Search for patterns where we use LD1i32 instructions to load into
+/// 4 separate lanes of a 128 bit Neon register. We can increase ILP
+/// by loading into 2 Neon registers instead.
+static bool getLoadPatterns(MachineInstr &Root,
+                            SmallVectorImpl<unsigned> &Patterns) {
+  const MachineRegisterInfo &MRI = Root.getMF()->getRegInfo();
+  const TargetRegisterInfo *TRI =
+      Root.getMF()->getSubtarget().getRegisterInfo();
+  // Enable this only on Darwin targets, where it should be profitable. Other
+  // targets can remove this check if it is profitable there as well.
+  if (!Root.getMF()->getTarget().getTargetTriple().isOSDarwin())
----------------
davemgreen wrote:
CPU tuning subtarget features are the preferred way to tune for different systems, rather than a check on the target OS. The MachineCombiner can use scheduling depths to calculate when the transform should be profitable, so you might be able to enable it more generally. It feels like it might be OK as a DAG combine to be honest, although it would add more instructions, if I understand correctly.
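
To illustrate the depth argument, here is a minimal standalone sketch of the comparison. It is a toy model, not the MachineCombiner API and not code from the patch: the `Latencies` struct, the function names, and any latency numbers passed in are hypothetical, and it assumes each lane load depends on the previous one in its chain and that a single extra instruction merges the two half-filled registers.

```cpp
#include <algorithm>

// Hypothetical latencies in cycles. Real values would come from the
// target's scheduling model, not from hand-picked constants.
struct Latencies {
  unsigned LaneLoad; // one load-into-lane step (e.g. LD1i32)
  unsigned Combine;  // the instruction that merges the two partial vectors
};

// Depth of the original sequence: every lane load reads the register
// written by the previous lane load, so the chain is fully serial.
unsigned serialDepth(unsigned NumLanes, const Latencies &L) {
  return NumLanes * L.LaneLoad;
}

// Depth after splitting the lanes across two independent registers.
// The two chains run in parallel; the longer one plus the merge
// instruction determines the critical path.
unsigned splitDepth(unsigned NumLanes, const Latencies &L) {
  unsigned FirstChain = (NumLanes + 1) / 2;
  unsigned SecondChain = NumLanes / 2;
  return std::max(FirstChain, SecondChain) * L.LaneLoad + L.Combine;
}

// The split only pays off when it shortens the critical path.
bool splitIsProfitable(unsigned NumLanes, const Latencies &L) {
  return splitDepth(NumLanes, L) < serialDepth(NumLanes, L);
}
```

With an assumed lane-load latency of 4 and merge latency of 2, four lanes give a serial depth of 16 against a split depth of 10, so the split wins. With a lane-load latency of 1 and merge latency of 3 the split depth is 5 against 4, so it loses. That is the kind of per-CPU difference a depth-based check would pick up without an OS test.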
https://github.com/llvm/llvm-project/pull/142941