[PATCH] Remove SCEVCache and FindConstantPointers from complete loop unrolling heuristic.

Thu Jun 4 17:14:00 PDT 2015

As Andy said, really nice use of SCEV. This is exactly the direction I was hoping for here.

Detailed comments below. Mostly this is about sorting out minor details, naming, and how to cache these things.


================
Comment at: lib/Transforms/Scalar/LoopUnrollPass.cpp:281-288
@@ +280,10 @@
+  /// \brief A cache of pointer bases corresponding to GEP (or derived from GEP)
+  /// instructions.
+  ///
+  /// In order to find the base pointer one needs to perform non-trivial
+  /// traversal of the corresponding SCEV expression, so it's good to have the
+  /// results saved.
+  /// The instruction doesn't have to be GetElementPtrInst - for instance, it
+  /// could be PHINode. That's why we use a generic Instruction type here.
+  SmallDenseMap<Instruction *, Value *> GEPPointerBases;
+
----------------
Based on this (very nice) comment, the name should just be "PointerBases"... Or even better, BasePointers. But see below...

================
Comment at: lib/Transforms/Scalar/LoopUnrollPass.cpp:315
@@ +314,3 @@
+  /// it instead.
+  bool SCEVSimplify(Instruction *I) {
+    if (!SE.isSCEVable(I->getType()))
----------------
I think a better name for this would be 'simplifyInstWithSCEV'

================
Comment at: lib/Transforms/Scalar/LoopUnrollPass.cpp:326
@@ +325,3 @@
+    if (auto *AR = dyn_cast<SCEVAddRecExpr>(S)) {
+      const SCEV *IterationNumber = SE.getConstant(APInt(64, Iteration));
+      const SCEV *ValueAtIteration =
----------------
I'd compute this as part of the constructor (and thus once).

================
Comment at: lib/Transforms/Scalar/LoopUnrollPass.cpp:338-339
@@ +337,4 @@
+        return false;
+      if (auto *Offset =
+              dyn_cast<SCEVConstant>(SE.getMinusSCEV(ValueAtIteration, Base))) {
+        GEPPointerBases[I] = Base->getValue();
----------------
For pointer types, is this the byte difference or are the units in something other than bytes?

================
Comment at: lib/Transforms/Scalar/LoopUnrollPass.cpp:340-341
@@ +339,4 @@
+              dyn_cast<SCEVConstant>(SE.getMinusSCEV(ValueAtIteration, Base))) {
+        GEPPointerBases[I] = Base->getValue();
+        SimplifiedValues[I] = Offset->getValue();
+        return true;
----------------
I don't think this quite works.

The offset value, while constant, is not the simplification of this instruction. For example, if the offset is 2, and we see 'icmp eq' of I and the constant '2', it shouldn't be able to constant fold.

I think you want to store a mapping from instruction to a pair of Value* and Constant*, "BasePointersAndOffsets" or some such. Not sure of a good name here.

================
Comment at: lib/Transforms/Scalar/LoopUnrollPass.cpp:361-362
@@ -487,2 +360,4 @@
   bool visitBinaryOperator(BinaryOperator &I) {
+    if (SCEVSimplify(&I))
+      return true;
     Value *LHS = I.getOperand(0), *RHS = I.getOperand(1);
----------------
Is this better than constant folding? It's not clear to me that it is, it seems like it might be much more expensive.

I think I would change the *end* of this function to 'if (SimpleV) return true;' and then add a final return that was essentially 'return Base::visitBinaryOperator(I);' so its clear you're just delegating.

================
Comment at: lib/Transforms/Scalar/LoopUnrollPass.cpp:387-389
@@ -511,8 +386,5 @@
     Value *AddrOp = I.getPointerOperand();
-    if (!isa<Constant>(AddrOp))
-      if (Constant *SimplifiedAddrOp = SimplifiedValues.lookup(AddrOp))
-        AddrOp = SimplifiedAddrOp;
-
     auto *GEP = dyn_cast<GetElementPtrInst>(AddrOp);
     if (!GEP)
       return false;
+
----------------
This shouldn't be needed now, right?

================
Comment at: lib/Transforms/Scalar/LoopUnrollPass.cpp:407-411
@@ -533,9 +406,7 @@
 
     // This calculation should never overflow because we bound Iteration quite
     // low and both the start and step are 32-bit integers. We use signed
     // integers so that UBSan will catch if a bug sneaks into the code.
     int ElemSize = CDS->getElementType()->getPrimitiveSizeInBits() / 8U;
-    int64_t Index = ((int64_t)OptionalGEPDesc->Start +
-                     (int64_t)OptionalGEPDesc->Step * (int64_t)Iteration) /
-                    ElemSize;
+    int64_t Index = SimplifiedAddrOp->getLimitedValue() / ElemSize;
     if (Index >= CDS->getNumElements()) {
----------------
The comment above doesn't make much sense any more. =]

I'd also consider bailing or asserting if the offset requires lots of bits here rather than limiting it.

http://reviews.llvm.org/D10205

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/