[PATCH] D15559: [SCEVExpander] Make findExistingExpansion smarter
Michael Zolotukhin via llvm-commits
llvm-commits at lists.llvm.org
Wed Jan 27 16:40:14 PST 2016
mzolotukhin requested changes to this revision.
================
Comment at: lib/Transforms/Utils/LoopUnrollRuntime.cpp:346-347
@@ -343,4 +345,4 @@
// extra iterations = run-time trip count % (loop unroll factor + 1)
Value *TripCount = Expander.expandCodeFor(TripCountSC, TripCountSC->getType(),
PreHeaderBR);
Value *BECount = Expander.expandCodeFor(BECountSC, BECountSC->getType(),
----------------
We can wait for Sanjoy's opinion too, but the patch can't be landed without this issue fixed first anyway.
I slightly modified your testcase to expose the issue:
```
define i32 @test2(i64* %loc, i64 %conv7) {
entry:
%rem0 = load i64, i64* %loc, align 8
%ExpensiveComputation = udiv i64 %rem0, 42 ; <<< Extra computations are added to the trip-count expression
br label %bb1
bb1:
%div11 = udiv i64 %ExpensiveComputation, %conv7
%cmp.i38 = icmp ugt i64 %div11, 1
%div12 = select i1 %cmp.i38, i64 %div11, i64 1
br label %for.body
for.body:
%rem1 = phi i64 [ %rem0, %bb1 ], [ %rem2, %for.body ]
%k1 = phi i64 [ %div12, %bb1 ], [ %dec, %for.body ]
%mul1 = mul i64 %rem1, 48271
%rem2 = urem i64 %mul1, 2147483647
%dec = add i64 %k1, -1
%cmp = icmp eq i64 %dec, 0
br i1 %cmp, label %exit, label %for.body
exit:
%rem3 = phi i64 [ %rem2, %for.body ]
store i64 %rem3, i64* %loc, align 8
ret i32 0
}
```
If we apply the patch and run this test thru `opt`, the loop will be successfully unrolled, *but* we'll generate the code for "expensive" computations. This is IR after unrolling:
```
...
entry:
%rem0 = load i64, i64* %loc, align 8
%ExpensiveComputation = udiv i64 %rem0, 42
br label %bb1
bb1: ; preds = %entry
%div11 = udiv i64 %ExpensiveComputation, %conv7
%cmp.i38 = icmp ugt i64 %div11, 1
%div12 = select i1 %cmp.i38, i64 %div11, i64 1
%0 = udiv i64 %rem0, 42 ; <<<< This is our %ExpensiveComputation, and we just duplicated it!
%1 = udiv i64 %0, %conv7 ; <<<
%2 = icmp ugt i64 %1, 1
%umax = select i1 %2, i64 %1, i64 1
%3 = add i64 %umax, -1
%xtraiter = and i64 %umax, 7
%lcmp.mod = icmp ne i64 %xtraiter, 0
br i1 %lcmp.mod, label %for.body.prol, label %bb1.split
...
```
Please note inserted instruction `%0` and `%1` - they're completely redundant (as we compute the same value, that we have in `%div11`).
So, as I said, it's not sufficient to just fix `findExistingExpransion`, we also need to fix `SCEVExpander::expand`.
http://reviews.llvm.org/D15559
More information about the llvm-commits
mailing list