[PATCH] D99272: [AArch64] Adds a pre-indexed paired Load/Store optimization for LDR-STR.

Dave Green via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 21 03:00:10 PDT 2021


dmgreen added a comment.

I presume that everything that uses hasUnscaledLdStOffset is still OK? Each of those callers would either handle pre-indexed loads/stores or never reach them as pre loads/stores?



================
Comment at: llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:2235
+
+  switch (MI.getOpcode()) {
+  case AArch64::LDRWpre:
----------------
Can these use isPreLd?
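For illustration, a rough sketch of what this could look like, assuming the isPreLd/isPreSt helpers introduced in this patch are visible at this point and cover exactly the opcodes enumerated in the switch:

```
// Sketch only: rather than enumerating every pre-indexed opcode here,
// defer to the shared predicates (hypothetical placement in this function).
if (isPreLd(MI) || isPreSt(MI)) {
  // ... existing pre-indexed handling ...
}
```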


================
Comment at: llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:2257
   // Make sure this is a reg/fi+imm (as opposed to an address reloc).
   assert((MI.getOperand(1).isReg() || MI.getOperand(1).isFI()) &&
          "Expected a reg or frame index operand.");
----------------
Is this assert valid/sensible for PreInc?
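If it is not, one option is to pick the operand to check based on whether the instruction is pre-indexed. A minimal sketch, assuming the usual AArch64 pre-indexed operand layout (op0 = writeback def, op1 = data register, op2 = base, op3 = immediate) and that the isPreLdSt helper from this patch is visible here:

```
// Sketch only: for pre-indexed forms the base register is operand 2,
// since operand 0 is the writeback def and operand 1 the data register.
unsigned BaseIdx = isPreLdSt(MI) ? 2 : 1;
assert((MI.getOperand(BaseIdx).isReg() || MI.getOperand(BaseIdx).isFI()) &&
       "Expected a reg or frame index operand.");
```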


================
Comment at: llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp:1044
+  // Adds the pre-index operand for pre-indexed ld/st pairs.
+  if (isPreLdSt(*RtMI) && (getPreIndexedOpcode(Rt2MI->getOpcode()) == Opc))
+    MIB.addReg(BaseRegOp.getReg(), RegState::Define);
----------------
Does this need the second condition in the if? Should that always be true if the first part is a pre?
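If the second check really is implied by the first, this would reduce to something like the sketch below (illustrative only):

```
// Sketch: add the writeback def whenever the first instruction is
// pre-indexed, without re-deriving the paired opcode.
if (isPreLdSt(*RtMI))
  MIB.addReg(BaseRegOp.getReg(), RegState::Define);
```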


================
Comment at: llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp:1669
+      // above.
       if (BaseReg == MIBaseReg && ((Offset == MIOffset + OffsetStride) ||
+                                   (Offset + OffsetStride == MIOffset) ||
----------------
stelios-arm wrote:
> dmgreen wrote:
> > Why does this not already handle the combining of PreLdStPair?
> > 
> > The existing code can combine in both directions. Presumably it's only valid for forward now?
> For example, the following instructions:
> 
> ```
> str w1, [x0, #20]!
> str w2, [x0, #4]
> ```
> Can be paired to:
> 
> ```
> stp w1, w2, [x0, #20]!
> ```
> The offsets of the first and second instructions are `20` and `4`, respectively. The offset stride is `4`. Therefore, the checks `(Offset == MIOffset + OffsetStride)` and `(Offset + OffsetStride == MIOffset)` both return `false`.
> 
> That's why the extra condition is needed. And yes, for such cases it's only valid forward now, since the order of the instructions matters for this optimization.
OK, but it looks like the existing `(Offset == MIOffset + OffsetStride)` conditions could be true for preinc where we don't want them to be.  Can we switch it around to be something like:
```
bool IsPreLdSt = isPreLdStPairCandidate(..)
if (!IsPreLdSt) {
  check conditions else continue
} else {
  check pre conditions else continue
}
```

That way we don't need the extra indenting, and the conditions don't get muddled together.
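A slightly more concrete version of that sketch (illustrative only; the helper name and the exact pre-indexed offset condition are approximations, not the final code):

```
bool IsPreLdSt = isPreLdStPairCandidate(FirstMI, MI);
if (BaseReg == MIBaseReg) {
  bool CandidateOffsets;
  if (!IsPreLdSt)
    // Plain ld/st: the accesses must be exactly one stride apart, in
    // either direction.
    CandidateOffsets = Offset == MIOffset + OffsetStride ||
                       Offset + OffsetStride == MIOffset;
  else
    // Pre-indexed first instruction: only the forward direction is
    // valid, and MI's offset is relative to the already-updated base.
    CandidateOffsets = MIOffset == OffsetStride;
  if (CandidateOffsets) {
    // ... existing pairing checks ...
  }
}
```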


================
Comment at: llvm/test/CodeGen/AArch64/ldrpre-ldr-merge.mir:400
+    ; CHECK: RET undef $lr
+   /*
+      The offset of the second load is not equal to the
----------------
Comments can be added here with `;`


================
Comment at: llvm/test/CodeGen/AArch64/strpre-str-merge.mir:267
+    ; CHECK: RET undef $lr, implicit $x0
+    early-clobber renamable $x0 = STRSpre killed renamable $s0, killed renamable $x0, 257 :: (store 4)
+    STRSui killed renamable $s1, renamable $x0, 1 :: (store 4)
----------------
-257?

If the stp has to be aligned, is it worth adding a test for +/-260 too?


================
Comment at: llvm/test/CodeGen/AArch64/strpre-str-merge.mir:328
+
+    STRXui killed renamable $x1, renamable $x1, 1 :: (store 16)
+
----------------
store 4


================
Comment at: llvm/test/CodeGen/AArch64/strpre-str-merge.mir:357
+    early-clobber renamable $x0 = STRSpre killed renamable $s0, killed renamable $x0, 12 :: (store 4)
+    early-clobber renamable $x0 = STRSpre killed renamable $s1, killed renamable $x0, 12 :: (store 4)
+    RET undef $lr, implicit $x0
----------------
What happens if this is 16? (Or 4)


================
Comment at: llvm/test/CodeGen/AArch64/strpre-str-merge.mir:389
+    RET undef $lr, implicit $x0
+...
----------------
I tried to come up with a list of tests. You have most of them covered; I also came up with these, some of which might be good to make sure are covered:
Given ldrqpre a, [b, c]   and ldrqui d, [e, f]
 - q with d
 - load with store
 - a == b? Not sure what happens then
 - b != e. That's probably tested naturally anyway.
 - second instruction is ldurqi (unscaled).

There were some others but they sound less useful.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99272/new/

https://reviews.llvm.org/D99272


