[PATCH] D12116: [AArch64] Improve load/store optimizer to handle LDUR + LDR.

Mon Aug 31 18:00:23 PDT 2015

ab added a comment.

This LGTM (modulo nits), but let's see what the testsuite says first.

In http://reviews.llvm.org/D12116#231366, @mzolotukhin wrote:

> As far as I understand, currently ISelLowering splits the unaligned stores, from which we happen to get STUR and STR, which we can't combine to STP without this patch. With this patch, we'll be able to merge them back, so we'll undo that optimization.


But this issue isn't specific to this patch, right? If the unaligned store was split to STR+STR we would have generated an STP even before this change.  I agree we'll need to do something about this though, but separately, and for both mixed and non-mixed STR/STUR pairs.


================
Comment at: lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp:450
@@ +449,3 @@
+  // We're trying to pair instructions that differ in how they are scaled.
+  // If FirstMI is scaled then scale the offset of MI accordingly.
+  // Otherwise, do the opposite (i.e., make Paired's offset unscaled).
----------------
MI -> Paired, FirstMI -> I ?

================
Comment at: lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp:607-610
@@ +606,6 @@
+static bool canMergeOpc(unsigned Opc, unsigned PairOpc, LdStPairFlags &Flags) {
+  bool CanMergeOpc = Opc == PairOpc;
+  // Opcodes match nothing more to check.
+  if (CanMergeOpc)
+    return true;
+
----------------
What about removing CanMergeOpc? It's assigned to several times with different values, and all could be folded into ifs/returns.

================
Comment at: lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp:619-620
@@ +618,4 @@
+  // Opc will be the first instruction in the pair.
+  CanMergeOpc = NonSExtOpc == getMatchingNonSExtOpcode(PairOpc);
+  if (CanMergeOpc) {
+    Flags.setSExtIdx(NonSExtOpc == (unsigned)Opc ? 1 : 0);
----------------
Same: fold the condition into the if, remove CanMergeOpc?

================
Comment at: lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp:627-629
@@ +626,5 @@
+  bool IsUnscaled = isUnscaledLdSt(Opc);
+  unsigned NewOpc =
+      IsUnscaled ? getScaledFromUnscaled(Opc) : getUnscaledFromScaled(Opc);
+  CanMergeOpc = NewOpc == PairOpc;
+  if (CanMergeOpc)
----------------
I see this is the only use of the get*From* helpers.  What about:

    getMatchingPairOpcode(Opc) == getMatchingPairOpcode(PairOpc)

That'll let us get rid of the added functions.

================
Comment at: lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp:630-633
@@ +629,6 @@
+  CanMergeOpc = NewOpc == PairOpc;
+  if (CanMergeOpc)
+    return true;
+
+  return false;
+}
----------------
Same: fold the condition into the return, remove CanMergeOpc?

================
Comment at: lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp:721
@@ -619,4 +720,3 @@
         // is out of range for a pairwise instruction, bail and keep looking.
-        bool MIIsUnscaled = isUnscaledLdSt(MI);
-        if (!inBoundsForPair(MIIsUnscaled, MinOffset, OffsetStride)) {
+        if (!inBoundsForPair(IsUnscaled, MinOffset, OffsetStride)) {
           trackRegDefsUses(MI, ModifiedRegs, UsedRegs, TRI);
----------------
To make sure I understand: the previous code was "incorrect" in checking MI rather than FirstMI, but it didn't make a difference because we didn't allow mismatched opcodes, right?

================
Comment at: test/CodeGen/AArch64/ldp-stp-scaled-unscaled-pairs.ll:8
@@ +7,3 @@
+define void @test1(float* %ptr, <4 x float> %v1, <4 x float> %v2) {
+entry:
+  %tmp1 = bitcast float* %ptr to <4 x float>*
----------------
Kill the "entry:"s ?


Repository:
  rL LLVM

http://reviews.llvm.org/D12116