[llvm-commits] More aggressive speculation in SimplifyCFG

Tue Jan 22 21:34:38 PST 2013

On Tue, Jan 22, 2013 at 8:28 PM, Evan Cheng <evan.cheng at apple.com> wrote:

>
> On Jan 22, 2013, at 4:20 PM, Krzysztof Parzyszek <kparzysz at codeaurora.org>
> wrote:
>
> > On 1/22/2013 5:55 PM, Evan Cheng wrote:
> >>
> >> What are you trying to accomplish with this change? It's hard to reason
> about the cost effectiveness of select formation in LLVM IR. Have you
> looked at the EarlyIfConversion pass?
> >
> > This was written almost a year ago.  I don't know if the
> early-if-conversion existed then, and if it did, then I was unaware of it.
>  This was mainly motivated by source code analysis in the benchmarks that
> we were working on at the time.  The main thing was to linearize the
> control flow, specifically to eliminate small blocks, even if it comes at
> the price of speculatively executing some code.
> > Cost effectiveness is definitely tricky---I don't know how this could be
> made universally applicable to every target.
> >
> > The reason I'm submitting it is basically that we have it, it works for
> us, and maybe someone would find it useful.  I haven't analyzed the early
> if conversion, so I'm not sure how this code compares to it.
>
> The early if-converter is written exactly for this. It uses
> MachineTraceMetrics analysis to estimate CPU resource usage to make the
> decision to speculate and form conditional moves. It subsumes the
> functionality of your patch and has much more sophisticated analysis behind
> it. I don't think it makes sense to add the aggressive speculation code to
> SimplifyCFG.
>

I'm not really arguing for the specifics of this patch, but I do think
there is a place for more aggressive speculation at the IR-level. But
perhaps not for the common reasons people push for more speculation... ;]

There are a number of IR-level optimizations which are significantly aided
by forming a flatter CFG with simple value selection. An obvious is the DAG
building and combining being able to more effectively combine across basic
block boundaries. There are similar issues with instcombine, instsimplify,
etc, where we do a BB-local scan. When we detect very simple CFG patterns
that can be replaced with flat code, minimal speculation and selects
instead of phi nodes, I think it is often a Good Thing to do in order to
simplify downstream optimizations.

Now, that said, I completely agree that the IR optimizers are hopeless at
actually figuring out when to emit speculated code with cmov (or
predication, or whatever the technique is). The early if-converter (or any
passes built on top of the trace metrics and infrastructure you mentioned)
are the right places to do that.

What would make the most sense to me is for the IR optimizers to try to
find the right mixture of speculation to simplify the CFG for other IR
optimizations, and early if-conversion could both form a CFG for cases
where speculation and cmov is the wrong fit, and flatten the CFG when it is
the correct fit. Does the logic to expand a IR select formation into a CFG
exist yet?

For context, I'm actually working on making the IR speculation more
powerful and able to handle this, but until the above issue is clearly
handled in the backend, I hadn't planned on increasing the threshold for
speculating in the IR at all...

There also appears to be some problems in early if-conversion as code in
simple, common compression loops for clamping values seems to end up as a
CFG rather than a cmov currently, and as a consequence runs significantly
slower in my benchmarks of zlib and the like. While this might be "fixed"
by making the IR level speculation better, it seems likely to indicate an
actual issue with the underlying if-conversion pass...
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130122/1a97a8d3/attachment.html>