<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<br>
<br>
<div class="moz-cite-prefix">On 12/24/2015 01:13 AM, Chandler
Carruth via llvm-commits wrote:<br>
</div>
<blockquote
cite="mid:differential-rev-PHID-DREV-wbh6cxhxv4zn2bs5uwjt-req@reviews.llvm.org"
type="cite">
<pre wrap="">chandlerc created this revision.
chandlerc added reviewers: craig.topper, majnemer.
chandlerc added a subscriber: llvm-commits.
The original movitavion was the following code pattern coming out of
Clang itself:
define i64 @_ZNK5clang4Sema12getExprRangeEPNS_4ExprE(%"class.clang::Sema"* nocapture readnone %this, %"class.clang::Expr"* %E) {
%1 = icmp eq %"class.clang::Expr"* %E, null
br i1 %1, label %6, label %2
; <label>:2 ; preds = %0
%3 = bitcast %"class.clang::Expr"* %E to %"class.clang::Stmt"*
%4 = tail call i64 @_ZNK5clang4Stmt14getSourceRangeEv(%"class.clang::Stmt"* %3)
%5 = and i64 %4, -4294967296
%phitmp = and i64 %4, 4294967295
br label %6
; <label>:6 ; preds = %0, %2
%.sroa.3.0 = phi i64 [ %5, %2 ], [ 0, %0 ]
%.sroa.0.0 = phi i64 [ %phitmp, %2 ], [ 0, %0 ]
%7 = or i64 %.sroa.0.0, %.sroa.3.0
ret i64 %7
}
This really should simplify away, but it doesn't. </pre>
</blockquote>
Hm, this is an interesting example. It's worth noting that the OR
can constant fold on the %0,%6 edge. This really looks to me like a
partial redundancy elimination case. The value of %7 is known along
one edge (via constant folding), so we can push it back into the
other incoming edge. The current performScalarPRE code in GVN.cpp
doesn't handle this, but that's the framing that immediately jumps
to mind. <br>
<br>
Another possibility is to frame this as tail duplication. In fact,
why doesn't our existing tail duplication logic get this specific
case? It definitely should. <br>
<br>
Another approach is not to hoist the or, but to sink the two ands.
In this particular case, sinking the and past the phi doesn't change
the result at all (since we know the other input is a constant
zero.) This would produce inferior code if the %0,%6 case is the
dominate one, but maybe this would work either as a PGO or a
canonical form we later reverse? It reminds me a lot of PHI to
Select canonicalization actually. <br>
<br>
I also noticed a couple of spiritually similar optimizations in
InstCombine around Selects. For instance:<br>
// add (select X 0 (sub n A)) A --> select X A n<br>
and foldOpIntoSelect in general..<br>
<blockquote
cite="mid:differential-rev-PHID-DREV-wbh6cxhxv4zn2bs5uwjt-req@reviews.llvm.org"
type="cite">
<pre wrap="">My first attempt to
help teach LLVM about these kinds of patterns is this patch. It doesn't
handle the above case (where two separate PHI nodes would both need to
be simplified at the same time), but it handles many other cases.</pre>
</blockquote>
Haven't looked at the actual patch yet. Won't get a chance to do
that until next week at the earliest.
<blockquote
cite="mid:differential-rev-PHID-DREV-wbh6cxhxv4zn2bs5uwjt-req@reviews.llvm.org"
type="cite">
<pre wrap="">
When using a Clang with this patch to build the LLVM test suite, we
fully simplify approximately 2000 phi nodes, and we hoist a user into
one predecessor (simplifying the other predecessors) 57 times.
I'm still not sure this is the right approach. It is fairly heavy weight
(as you'll see) and I don't see a good way to extend its current
strategy to handle hoisting around multiple phi nodes. =/
I'm thinking of trying a different approach which would be a generic
combine firing on users of phi nodes, and try to hoist that user into
a predecessor to expose more simplifications. This would for example be
able to handle cases where both operands were phi nodes. But this
wouldn't completely obviate the patch I have here -- there are cases
that this will fire on that the other might not.
The other overall concern I have about this is that it really needs to
be "late" in instcombine. We only want to hoist around phi nodes after
every user of the instruction has already been combined and the current
instruction's other combines have already been exhausted. Unfortunately
this is the exact opposite order of combining from what instcombine
usually does. I'm not really sure how to address this.
</pre>
</blockquote>
I'm not sure I really follow your ordering concern. At least in
principal, any of the transforms that would have triggered should be
able to hoist themselves into the predecessor and then trigger. (At
least when the other input is constant.) Do you have particular
examples in mind here? I may be missing the obvious. <br>
<blockquote
cite="mid:differential-rev-PHID-DREV-wbh6cxhxv4zn2bs5uwjt-req@reviews.llvm.org"
type="cite">
<pre wrap="">
Thoughts and suggestions on the issue at large very welcome!
<a class="moz-txt-link-freetext" href="http://reviews.llvm.org/D15768">http://reviews.llvm.org/D15768</a>
Files:
lib/Transforms/InstCombine/InstCombinePHI.cpp
test/Transforms/InstCombine/select.ll
test/Transforms/InstCombine/shift-sra.ll
</pre>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
<pre wrap="">_______________________________________________
llvm-commits mailing list
<a class="moz-txt-link-abbreviated" href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a>
<a class="moz-txt-link-freetext" href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits</a>
</pre>
</blockquote>
<br>
</body>
</html>