[PATCH] D89894: [AArch64] Backedge indexing
Sam Parker via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 21 09:19:16 PDT 2020
samparker created this revision.
samparker added reviewers: SjoerdMeijer, dmgreen, fhahn, efriedma, sanwou01.
Herald added subscribers: danielkiss, arphaman, hiraditya, kristof.beyls.
Herald added a project: LLVM.
samparker requested review of this revision.
Bit of a brain dump: I was seeing the same problems with addressing modes in unrolled loops, and this is closely related to what @SjoerdMeijer is currently working on in D89693 <https://reviews.llvm.org/D89693>, but I doubt I will have time to look into this further...
For the benchmark that I am looking at, the total size shrinks, but there seems to be a problem: we no longer generate the LDPs (which I presume is just a current limitation of the AArch64LoadStoreOptimizer?). A hypothetical source kernel is sketched after the diff:
< ldp q0, q2, [x2, #-16]
< ldp q1, q3, [x4, #-16]
< subs x5, x5, #8 // =8
< add x4, x4, #32 // =32
< add x2, x2, #32 // =32
< fmul v0.4s, v0.4s, v1.4s
< fmul v2.4s, v2.4s, v3.4s
< ldp q1, q3, [x3, #-16]
< fadd v0.4s, v1.4s, v0.4s
< fadd v1.4s, v3.4s, v2.4s
< stp q0, q1, [x3, #-16]
< add x3, x3, #32 // =32
---
> ldr q0, [x5, #32]!
> subs x27, x27, #8 // =8
> ldur q1, [x5, #-16]
> ldr q2, [x7, #32]!
> ldur q3, [x7, #-16]
> ldr q4, [x6, #32]!
> fmul v0.4s, v0.4s, v2.4s
> fmul v1.4s, v1.4s, v3.4s
> ldr q2, [x6, #16]
> fadd v1.4s, v4.4s, v1.4s
> fadd v0.4s, v2.4s, v0.4s
> stp q1, q0, [x6]
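For reference, a kernel of roughly this shape reproduces the codegen above. It is a reconstruction from the disassembly rather than the actual benchmark source, so the function and parameter names are illustrative:

  // Vectorised to <4 x float> and unrolled 2x, giving eight floats
  // (two q-register operations per pointer) per iteration.
  void madd(float *__restrict Dst, const float *__restrict A,
            const float *__restrict B, long N) {
    for (long I = 0; I < N; ++I)
      Dst[I] += A[I] * B[I];
  }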
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D89894
Files:
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
Index: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
===================================================================
--- llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
+++ llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
@@ -231,6 +231,8 @@
}
}
+ bool shouldFavorBackedgeIndex(const Loop *L) const;
+
unsigned getGISelRematGlobalCost() const {
return 2;
}
Index: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
===================================================================
--- llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -16,6 +16,7 @@
#include "llvm/CodeGen/TargetLowering.h"
#include "llvm/IR/IntrinsicInst.h"
#include "llvm/IR/IntrinsicsAArch64.h"
+#include "llvm/IR/Use.h"
#include "llvm/Support/Debug.h"
#include <algorithm>
using namespace llvm;
@@ -972,6 +973,36 @@
return Considerable;
}
+bool AArch64TTIImpl::shouldFavorBackedgeIndex(const Loop *L) const {
+  // This optimisation will generally introduce base address modifying
+  // instruction(s) into the preheader and is only really useful for
+  // unrolled loops, which we don't generate when optimising for size.
+ if (L->getHeader()->getParent()->hasOptSize() ||
+ L->getNumBlocks() != 1)
+ return false;
+
+ // Find pointers with multiple uses within the loop.
+ DenseMap<Value *, unsigned> NumPointerUses;
+ for (auto &I : *L->getHeader()) {
+ if (I.getType()->isPointerTy())
+ NumPointerUses[&I] = 0;
+
+ for (auto &Use : I.operands()) {
+ if (!Use->getType()->isPointerTy())
+ continue;
+      if (NumPointerUses.count(Use))
+        NumPointerUses[Use]++;
+      else
+        NumPointerUses[Use] = 1;
+ }
+ }
+
+  return llvm::any_of(NumPointerUses,
+                      [](const detail::DenseMapPair<Value *, unsigned> &Pair) {
+                        return Pair.second > 1;
+                      });
+}
+
bool AArch64TTIImpl::useReductionIntrinsic(unsigned Opcode, Type *Ty,
TTI::ReductionFlags Flags) const {
auto *VTy = cast<VectorType>(Ty);
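For context, shouldFavorBackedgeIndex is the existing TTI hook that LoopStrengthReduce queries; when it returns true, LSR favours solutions that update the base pointer on the loop backedge, which the backend can then fold into the pre/post-indexed forms seen above (e.g. ldr q0, [x5, #32]!). The heuristic fires on unrolled loops because each base pointer then feeds more than one access per iteration; a hypothetical source-level picture of what the use counting detects (names illustrative):

  // After 2x unrolling, Dst, A and B each appear in two memory
  // accesses per iteration, so the per-pointer use count exceeds
  // one and shouldFavorBackedgeIndex returns true for this loop.
  void madd_unrolled(float *__restrict Dst, const float *__restrict A,
                     const float *__restrict B, long N) {
    for (long I = 0; I < N; I += 2) {
      Dst[I]     += A[I]     * B[I];
      Dst[I + 1] += A[I + 1] * B[I + 1];
    }
  }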