RFC: Tweak heuristics in SimplifyCFG

Kristof Beyls kristof.beyls at arm.com
Mon Feb 9 08:52:40 PST 2015


Hi James,

 

I would find it easier to review this if the patches were in phabricator. Can you upload them to phabricator?

 

Thanks,

 

Kristof

 

From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of James Molloy
Sent: 09 February 2015 16:28
To: Hal Finkel; Geoff Berry
Cc: LLVM Commits
Subject: Re: RFC: Tweak heuristics in SimplifyCFG

 

Hi Hal,

Please find attached two patches implementing what we discussed last week. The code changes are trivial, there is some slight churn in the tests which I have attempted to justify.

 

Cheers,

 

James

 

On Sat Feb 07 2015 at 10:07:56 AM James Molloy <james at jamesmolloy.co.uk> wrote:

Hi Geoff,

The inverse transform is done in CodeGenPrepare, see TargetLoweringInfo::isPredictableSelectExpensive()

 

Cheers,

 

James

 

On Fri Feb 06 2015 at 10:44:04 PM Hal Finkel <hfinkel at anl.gov> wrote:

----- Original Message -----
> From: "Geoff Berry" <gberry at codeaurora.org>
> To: "James Molloy" <james at jamesmolloy.co.uk>, "Hal Finkel" <hfinkel at anl.gov>
> Cc: "LLVM Commits" <llvm-commits at cs.uiuc.edu>
> Sent: Friday, February 6, 2015 4:38:27 PM
> Subject: RE: RFC: Tweak heuristics in SimplifyCFG
>
>
>
>
> Hi James, Hal,
>
>
>
> I’ve been looking into a related issue, and would appreciate your
> input on it. We currently convert branches to selects in
> SimplifyCFG, but we don’t do the opposite anywhere at the IR level
> from what I can tell.
> It seems like it might be beneficial to do
> this conversion in cases where the un-if converted code would be
> rejected by the cost heuristic that was being discussed below. Code
> sinking also comes into play, since ideally we would consider the
> branching case cost with code sunk into the then/else blocks where
> possible/profitable. Come to think of it, it seems like we would
> want to do some sinking before the SimplifyCFG case as well to make
> a better decision.
>
>
>
> Any thoughts on this?
>

I think that I understand what you're saying, but could you provide a couple quick examples? I'm somewhat skeptical because creating branches from selects generically makes optimization and analysis harder. As a result, we generally delay this until CodeGen when our modeling is more precise.

 -Hal

>
>
> Thanks,
>
>
>
>
> --
>
> Geoff Berry
>
> Employee of Qualcomm Innovation Center, Inc.
>
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
> Linux Foundation Collaborative Project
>
>
>
> From: llvm-commits-bounces at cs.uiuc.edu
> [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of James Molloy
> Sent: Friday, February 06, 2015 12:01 PM
> To: Hal Finkel
> Cc: LLVM Commits
> Subject: Re: RFC: Tweak heuristics in SimplifyCFG
>
>
>
>
> Hi Hal,
>
>
> That's a really good point, I'm on board with that. I'll cook up a
> patch soon and send it for review.
>
>
>
>
>
> Cheers,
>
>
>
>
>
> James
>
>
>
>
> On Fri Feb 06 2015 at 4:53:44 PM Hal Finkel < hfinkel at anl.gov >
> wrote:
>
>
>
> ----- Original Message -----
> > From: "James Molloy" < james at jamesmolloy.co.uk >
> > To: "LLVM Commits" < llvm-commits at cs.uiuc.edu >
> > Sent: Friday, February 6, 2015 9:00:33 AM
> > Subject: RFC: Tweak heuristics in SimplifyCFG
> >
> > Hi all,
> >
> >
> > I've been looking at why we generate poor code for idiomatic stuff
> > like clamp() and abs().
> >
> >
> > Clamp normally looks like this:
> >
> >
> > T clamp(T a, T b, T c) { return (a < b) ? b : ((a > c) ? c : a); }
> >
> >
> > We currently produce the following IR for this:
> >
> >
> > define i32 @clamp2(i32 %a, i32 %b, i32 %c) #0 {
> > entry:
> > %cmp = icmp sgt i32 %a, %c
> > br i1 %cmp, label %cond.end4, label %cond.false
> >
> >
> > cond.false:
> > %cmp1 = icmp slt i32 %a, %b
> > %cond = select i1 %cmp1, i32 %b, i32 %a
> > br label %cond.end4
> >
> >
> > cond.end4:
> > %cond5 = phi i32 [ %cond, %cond.false ], [ %c, %entry ]
> > ret i32 %cond5
> > }
> >
> >
> > This is multi-block so makes later optimizations more awkward, such
> > as loop vectorization and loop rerolling. SimplifyCFG can convert
> > this into "icmp; select; icmp; select", but doesn't because it has
> > quite a conservative heuristic - it'll only ever hoist one (cheap)
> > instruction into the dominating block.
> >
> >
> > I think this is too conservative - given the potential gains later
> > on
> > in the optimizer from flattening basic blocks (and that
> > CodegenPrepare can remove selects again!) - we should be more
> > aggressive here.
> >
> >
> > My suggestions are:
> > - Up -phi-node-folding-threshold from 1 to 3.
> > - Add "fcmp", "fadd" and "fsub" to the list of cheap instructions
> > to
> > hoist. (fadd and fsub to make abs() work!)
> >
> > Would anyone object to this? I'll have benchmark results on AArch64
> > by the end of the weekend.
>
> This sounds good to be. Regarding the second point, I'd rather that
> SimplifyCFG did not have its own list of cheap instructions (I'm
> referring to ComputeSpeculationCost in
> lib/Transforms/Utils/SimplifyCFG.cpp), but rather used the existing
> TTI interface for this. SimplifyCFG already now uses TTI for other
> things, and I think this is a natural enhancement.
>
> I think that we should call TTI.getUserCost(&I) (which is the same
> interface used by the inliner's cost analysis, the loop unroller,
> etc.), and hoist an unlimited number of instructions marked as
> TargetTransformInfo::TCC_Free and some limited number of
> instructions marked as TCC_Basic. The idea is that the total cost of
> the instructions should equal
> (phi-node-folding-threshold)*(TCC_Basic).
>
> This also provides a natural way to turn off these optimizations for
> fadd, etc. on targets that don't have hardware-implemented floating
> point.
>
> -Hal
>
> >
> >
> > Cheers,
> >
> >
> > James
> >
> >
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150209/cf7bb16b/attachment.html>


More information about the llvm-commits mailing list