[llvm] [AArch64][GlobalISel] Split offsets of consecutive stores to aid STP … (PR #66980)

Thu Sep 21 07:06:40 PDT 2023

================
@@ -492,7 +507,192 @@ bool AArch64PostLegalizerCombiner::runOnMachineFunction(MachineFunction &MF) {
                      F.hasMinSize());
   AArch64PostLegalizerCombinerImpl Impl(MF, CInfo, TPC, *KB, CSEInfo,
                                         RuleConfig, ST, MDT, LI);
-  return Impl.combineMachineInstrs();
+  bool Changed = Impl.combineMachineInstrs();
+
+  auto MIB = CSEMIRBuilder(MF);
+  MIB.setCSEInfo(CSEInfo);
+  Changed |= optimizeConsecutiveMemOpAddressing(MF, MIB);
----------------
aemerson wrote:

> Just wondering, do you need to do this because the combiner isn't powerful enough/is missing some feature to make this a normal rule (what feature?), or just because this is a separate transform that could technically be in another pass (but it's not worth creating a pass just for that)?

It's the latter; we're just piggy-backing off the existing combiner pass. There are a few reasons for doing that:

1. We need this to run after reassociations/ptradd_immed_chain since this is undoing the effect of those.
2. Since this optimization wants to look at other, non-use/def instructions in the function, doing so with manual C++ lets us make sure we have predictable linear complexity.
3. We also want a predictable top-down inst visitation order, instead of running to a fixed point.
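
To illustrate points 2 and 3, here is a rough, self-contained sketch of what a single top-down, linear pass over neighbouring stores could look like. It is a toy model and not the code in this PR: `ToyStore`, `splitOffsets`, and `kMaxPairOffset` are hypothetical names, and the rule for when to rebase a run is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for a store to `baseReg + offset` (hypothetical, not an LLVM type).
struct ToyStore {
  int baseReg;         // virtual register holding the base pointer
  std::int64_t offset; // constant byte offset folded into the address
};

// Assumed limit for this sketch: largest offset a paired store can encode.
constexpr std::int64_t kMaxPairOffset = 504;

// Visits the stores once, top-down. Each run of adjacent stores that share a
// base register and start beyond the limit is rewritten to use a fresh base
// register (standing for oldBase + firstOffset) with small relative offsets.
// Returns the number of fresh base registers created.
int splitOffsets(std::vector<ToyStore> &stores, int &nextReg) {
  int created = 0;
  std::size_t i = 0;
  while (i < stores.size()) {
    const int base = stores[i].baseReg;
    const std::int64_t first = stores[i].offset;

    // Find the end of the run of adjacent stores that use the same base.
    std::size_t end = i + 1;
    while (end < stores.size() && stores[end].baseReg == base)
      ++end;

    // Rebase only runs of two or more stores whose first offset is too large.
    if (end - i >= 2 && first > kMaxPairOffset) {
      const int newBase = nextReg++;
      for (std::size_t k = i; k < end; ++k) {
        stores[k].baseReg = newBase;
        stores[k].offset -= first;
      }
      ++created;
    }
    i = end; // every store is visited a bounded number of times
  }
  return created;
}
```

Because `i` always jumps to the end of the run it just examined, the work stays linear in the number of stores, and nothing is revisited or iterated to a fixed point.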

https://github.com/llvm/llvm-project/pull/66980