[PATCH] D15559: [SCEVExpander] Make findExistingExpansion smarter

Wed Jan 20 21:19:48 PST 2016

flyingforyou added inline comments.

================
Comment at: lib/Analysis/ScalarEvolutionExpander.cpp:1863-1864
@@ -1838,4 +1862,4 @@
     if (!match(BB->getTerminator(),
-               m_Br(m_ICmp(Pred, m_Instruction(LHS), m_Instruction(RHS)),
-                    TrueBB, FalseBB)))
+               m_Br(m_ICmp(Pred, m_Instruction(LHS), m_Value(RHSV)), TrueBB,
+                    FalseBB)))
       continue;
----------------
mzolotukhin wrote:
> I don't understand why we need to break symmetry in this code. Anyway, if we do this, we should either add an assert that the opposite situation never happens, or properly handle both cases. The fact that we've only seen one case is not a justification for not handling the other one.
> 
> IOW, why can't LHS be Value, and RHS be Instruction?
I think LHS can be Value. But what I know is above pattern can cover most of cases.


```
    if (match(BB->getTerminator(),
               m_Br(m_ICmp(Pred, m_Value(LHSV), m_Value(RHSV)), TrueBB,
                    FalseBB))) {
      dbgs() << "m_Value(LHSV), m_Value(RHSV) pattern\n";
      if (match(BB->getTerminator(),
                 m_Br(m_ICmp(Pred, m_Instruction(LHS), m_Instruction(RHS)), TrueBB,
                      FalseBB)))
        dbgs() << "m_Instruction(LHS), m_Instruction(RHS) pattern\n";
      else if (match(BB->getTerminator(),
                 m_Br(m_ICmp(Pred, m_Instruction(LHS), m_Value(RHSV)), TrueBB,
                      FalseBB)))
        dbgs() << "m_Instruction(LHS), m_Value(RHSV) pattern\n";
      else
        dbgs() << "unknown(LHS), unknown(RHS) pattern\n";

    }
```

When I test above code, I can't find "unknown(LHS), unknown(RHS) pattern\n" on building big benchmarks programs including LLVM itself.



|m_Value(LHSV), m_Value(RHSV) pattern |50650
|m_Instruction(LHS), m_Instruction(RHS) pattern|30178
|m_Instruction(LHS), m_Value(RHSV) pattern |20472

So I think it's no necessary to consider `LHS `is `Value` now.

Actually, we don't make compare instruction likes `cmp constant, reg`...

================
Comment at: lib/Transforms/Utils/LoopUnrollRuntime.cpp:346-347
@@ -343,4 +345,4 @@
   //  extra iterations = run-time trip count % (loop unroll factor + 1)
   Value *TripCount = Expander.expandCodeFor(TripCountSC, TripCountSC->getType(),
                                             PreHeaderBR);
   Value *BECount = Expander.expandCodeFor(BECountSC, BECountSC->getType(),
----------------
mzolotukhin wrote:
> This should be in the same patch, otherwise the change will only paper over the issue, and lead to actually worse code in the end. If `expandCodeFor` isn't updated, it'll fail to reuse the existing value, so it'll generate yet another division. The idea of the patch is to teach SCEV to reuse existing expressions, not just hack around the high cost of division. The division should still be considered expensive, so we definitely shouldn't generate redundant ones.
> 
Michael, I am not an expert of SCEV. I think you might want to get review from Sanjoy.

Sanjoy, How about applying below patch on this?

```
-    if (SE.getSCEV(LHS) == S && SE.DT.dominates(LHS, At))
+    const SCEV *LHSSE = SE.getSCEV(LHS);
+    if (LHSSE->getType() == S->getType() &&
+        isa<SCEVConstant>(SE.getMinusSCEV(LHSSE, S)) &&
+        SE.DT.dominates(LHS, At))
       return LHS;
 
     if (Instruction *RHS = dyn_cast<Instruction>(RHSV)) {
-      if (SE.getSCEV(RHS) == S && SE.DT.dominates(RHS, At))
+      const SCEV *RHSSE = SE.getSCEV(RHS);
+      if (RHSSE->getType() == S->getType() &&
+          isa<SCEVConstant>(SE.getMinusSCEV(RHSSE, S)) &&
+          SE.DT.dominates(RHS, At))
         return RHS;
     }
```

================
Comment at: test/Transforms/LoopUnroll/high-cost-trip-count-computation.ll:27-28
@@ -26,1 +26,4 @@
 
+;; We expect this loop to be unrolled, because IV's step is minus one.
+;; In this case, we don't need to generate dvision for computing trip count.
+
----------------
mzolotukhin wrote:
> Maybe something like this?
> ```
> Though SCEV for loop tripcount contains division, it shouldn't be considered expensive, since the division already exists in the code and we don't need to expand it once more. Thus, it shouldn't prevent us from unrolling the loop.
> ```
> Also, I think I don't completely understand why we care so much about step being -1. Why is it important?
Cool. Michael. 
I will modify comments as you said.

`-1` indicates that tripcount doesn't need to be divided for calculation when loop unrolling.
Anyway thanks for good sugesstion.

================
Comment at: test/Transforms/LoopUnroll/high-cost-trip-count-computation.ll:53
@@ -27,1 +52,2 @@
+
 !0 = !{i64 1, i64 100}
----------------
mzolotukhin wrote:
> This metadata is used as `!range !0` in `@test` and as `!tbaa !0` in `@test2`. Are you sure this is correct?
I don't think that makes errors. But I got your point. Thanks.
I will modify `!tbaa !0 ` to `!tbaa !1`
and add `!1 = !{i64 0}`

Is this ok?


http://reviews.llvm.org/D15559