<div dir="ltr">Given the performance data and analysis, I suggest submit your original patch that is already reviewed if there are no other objections. As a follow up, please also add GEP depth limit stats and collect some data.<div><br></div><div>thanks,</div><div><br></div><div>David</div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Apr 20, 2015 at 9:41 PM, Wei Mi <span dir="ltr"><<a href="mailto:wmi@google.com" target="_blank">wmi@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I did some performance testing for the reassociategep patch proposed<br>
by Mark. The testing result was mixed -- reassociategep patch was<br>
better for some benchmarks and disabing gep merging was better for<br>
some other benchmarks, and combining the two patches worked the best.<br>
We did some perf analysis based on some testcases extracted from those<br>
benchmarks with the largest perf difference between the two patches:<br>
<br>
* The testcase 1.cc in <a href="https://llvm.org/bugs/show_bug.cgi?id=23163" target="_blank">https://llvm.org/bugs/show_bug.cgi?id=23163</a><br>
showed reassociategep patch is needed:<br>
<br>
For the testcase 1.cc, there are some sext generated for array address<br>
computation. Existing CSE/LICM<br>
cannot find the common expression separated by "sext". A special<br>
reassociation is required.<br>
<br>
loop:<br>
... = sext(a + 1) + c // a and c are loop invariant.<br>
...<br>
... = sext(a - 1) + c<br>
end loop.<br>
<br>
CSE/LICM cannot find that "sext(a) + c" is a common expression and a<br>
loop invariant expression. reassociategep does the reassociation so<br>
CSE/LICM can optimize it.<br>
<br>
When I changed the testcase a little bit to remove the sext needed,<br>
disabling gep merging patch generated the same code with<br>
reassociategep patch.<br>
<br>
< int c = p1.m_fn1(), x;<br>
---<br>
> long c = p1.m_fn1(), x;<br>
<br>
* The testcase 2.cc attached showed why disabling gep merging is needed.<br>
The command compiling 2.cc is:<br>
~/workarea/llvm-r234390/build/bin/clang++ -O2 -std=c++11<br>
-fno-omit-frame-pointer -fno-strict-aliasing -funsigned-char -S 2.cc<br>
-o 2.s<br>
<br>
For the kernel loop, the IR before Codegen Prepare using the<br>
reassociategep patch:<br>
<br>
for.body10:<br>
...<br>
%add.ptr.sum = add nsw i64 %mul14, %idx.ext<br>
%add.ptr16 = getelementptr inbounds i32, i32* %p1, i64 %add.ptr.sum<br>
%15 = load i32, i32* %add.ptr16, align 4<br>
...<br>
%add.ptr16.sum = add nsw i64 %add.ptr.sum, %idxprom<br>
%arrayidx19 = getelementptr inbounds i32, i32* %p1, i64 %add.ptr16.sum<br>
%17 = load i32, i32* %arrayidx19, align 4<br>
...<br>
%add.ptr16.sum82 = add nsw i64 %mul22, %add.ptr.sum<br>
%add.ptr24 = getelementptr inbounds i32, i32* %p1, i64 %add.ptr16.sum82<br>
%19 = load i32, i32* %add.ptr24, align 4<br>
%add.ptr24.sum = add nsw i64 %add.ptr16.sum82, %idxprom<br>
%arrayidx28 = getelementptr inbounds i32, i32* %p1, i64 %add.ptr24.sum<br>
%20 = load i32, i32* %arrayidx28, align 4<br>
...<br>
br i1 %exitcond, label %for.cond8.for.end_crit_edge, label %for.body10<br>
<br>
The IR before Codegen Prepare using the disabling gep patch:<br>
<br>
for.body10:<br>
...<br>
%add.ptr = getelementptr inbounds i32, i32* %p1, i64 %idx.ext<br>
...<br>
%add.ptr16 = getelementptr inbounds i32, i32* %add.ptr, i64 %mul14<br>
%15 = load i32, i32* %add.ptr16, align 4<br>
...<br>
%arrayidx19 = getelementptr inbounds i32, i32* %add.ptr16, i64 %idxprom<br>
%17 = load i32, i32* %arrayidx19, align 4<br>
...<br>
%add.ptr24 = getelementptr inbounds i32, i32* %add.ptr16, i64 %mul22<br>
%19 = load i32, i32* %add.ptr24, align 4<br>
%arrayidx28 = getelementptr inbounds i32, i32* %add.ptr24, i64 %idxprom<br>
%20 = load i32, i32* %arrayidx28, align 4<br>
...<br>
br i1 %exitcond, label %for.cond8.for.end_crit_edge, label %for.body10<br>
<br>
gep merging generated several add insns which cannot be removed by<br>
reassociategep patch. It is difficult for reassociation to remove<br>
various redundencies introduced by aggressive gep merging.<br>
<br>
I saw Mark already posted his patch for review:<br>
<a href="http://reviews.llvm.org/D9136" target="_blank">http://reviews.llvm.org/D9136</a>. I will update the patch in next mail to<br>
fix some testcases.<br>
<br>
Thanks,<br>
<div class="HOEnZb"><div class="h5">Wei.<br>
_______________________________________________<br>
llvm-commits mailing list<br>
<a href="mailto:llvm-commits@cs.uiuc.edu">llvm-commits@cs.uiuc.edu</a><br>
<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>
</div></div></blockquote></div><br></div>