[llvm-dev] [RFC] Delaying phi-to-select transformation until later in the pass pipeline
Philip Reames via llvm-dev
llvm-dev at lists.llvm.org
Wed Sep 12 14:53:57 PDT 2018
Sorry for the delayed response.
On 08/23/2018 10:01 AM, John Brawn wrote:
>
> > Here's another motivating bug where early transform to select
> prevents optimization:
>
> > https://bugs.llvm.org/show_bug.cgi?id=36760
>
> > Is that case affected by this patch?
>
> From a quick reading of that bug it looks like CVP needs to be
> enhanced to do a certain transformation, but the presence
> of a select means that even with that enhancement it wouldn’t do
> anything. With this patch we have a phi and so I think
> that if the CVP enhancement were done it should work.
>
> >>> 1. LICM does not know how to hoist or sink regions. It does know
> how to hoist and sink selects.
> >>I think we should teach LICM to do this eventually.
> >Agreed. I just don't currently see patches to do so. Once LICM
> supports region hoisting, much of my concern disappears.
>
> I’ve looked into what happens with LICM with this change, and it looks
> like everything except the phi gets hoisted, then
> later the phi is turned into a select, then a later LICM pass hoists
> the select. For the very simple tests I tried we end up with
> the same result in the end, but I wouldn’t be surprised if indeed LICM
> not doing this makes things worse in more complex
> cases.
>
> I’ve hacked together a modification to LICM to make it able to hoist
> phis (by essentially hoisting entire blocks instead of
> moving everything into a single preheader block) and it seemed to work
> OK, so I’ll get to work on getting that working
> properly and committed before I touch phi-to-select transformation.
>
This would be a really useful extension to LICM. I look forward to the
patch. I was half considering an approach recently where LICM itself
did phi-to-select if all of the operands to the resulting phi would be
loop invariant. It was just ugly enough I didn't move forward with it,
but we have seen cases where it would resolve pass ordering problems
leading to strictly better results overall. I suspect your region
hoister would have the same overall effect.
> > InstCombine has limited support for triangles/diamonds, but fairly
> extensive support for selects.
>
> Currently I have InstCombine running directly after phis have been
> turned into selects, so this should only be a problem if
> something before that relies on the select instcombine having already
> happened. I’ll look into that to see if it is indeed a
> problem.
>
> >>>EarlyCSE and GVN do not know how to eliminate fully redundant
> triangles/diamonds. PRE *may* get some of these cases, but that is by
> no means guaranteed or likely to be robust.
> >>Agreed, we’ll need a plan to deal with these issues.
> >Again, I'd love to see these issues fixed. Once that's done, the
> convert about phi vs select as canonical form is much less relevant.
>
> I’ll look into this.
>
> > For the one of the specific cases mentioned, teaching GVN to do FRE
> and PRE for loads from selects of pointers just doesn't seem that
> painful. I would be really tempted to do that
> > instead. Similarly, walking facts back from select uses in CVP
> seems generally useful since we have use dependent facts in a bunch of
> contexts, not just selects. (Call arguments for
> > instance, non-null from unconditional deref, etc..)
>
> Actually making GVN able to handle selects being painful is exactly
> the reason I’m proposing doing this. GVN has a hard
> constraint that a single block can only have a single available value,
> and attempting to change that (which you have to do
> when selecting between values defined in the same block) causes a ton
> of problems that have to be resolved.
>
> John
>
> *From:*Philip Reames [mailto:listmail at philipreames.com]
> *Sent:* 17 August 2018 21:17
> *To:* Amara Emerson; Hal Finkel; John Brawn
> *Cc:* Sanjay Patel; llvm-dev at lists.llvm.org; nd
> *Subject:* Re: [llvm-dev] [RFC] Delaying phi-to-select transformation
> until later in the pass pipeline
>
> On 08/17/2018 09:02 AM, Amara Emerson wrote:
>
> On Aug 15, 2018, at 10:57 PM, Hal Finkel via llvm-dev
> <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>
> On 08/15/2018 02:38 PM, Philip Reames via llvm-dev wrote:
>
> I'm concerned that we're focusing on one side of this.
> Let me point out a few concerns w/changing the canonical
> form here:
>
> 1. LICM does not know how to hoist or sink regions. It
> does know how to hoist and sink selects.
>
> I think we should teach LICM to do this eventually.
>
> Agreed. I just don't currently see patches to do so. Once LICM
> supports region hoisting, much of my concern disappears.
>
> 2. InstCombine has limited support for triangles/diamonds,
> but fairly extensive support for selects.
> 3. EarlyCSE and GVN do not know how to eliminate fully
> redundant triangles/diamonds. PRE *may* get some of these
> cases, but that is by no means guaranteed or likely to be
> robust.
>
> Agreed, we’ll need a plan to deal with these issues.
>
> Again, I'd love to see these issues fixed. Once that's done, the
> convert about phi vs select as canonical form is much less relevant.
>
> My personal opinion is that selects are the appropriate
> canonical form.
>
> For the one of the specific cases mentioned, teaching GVN to
> do FRE and PRE for loads from selects of pointers just doesn't
> seem that painful. I would be really tempted to do that
> instead. Similarly, walking facts back from select uses in
> CVP seems generally useful since we have use dependent facts
> in a bunch of contexts, not just selects. (Call arguments for
> instance, non-null from unconditional deref, etc..)
>
> To be clear, I am raising concerns, not actively objecting to
> this. If you want to move forward and commit to work through
> all of the issues identified I wont actively stand in the way.
>
>
> As I've expressed in the past, I think that not using select as
> part of our canonical form is potentially a superior design.
> However, it would be a major change. In addition to the issues
> that Philip mentions, there's also the fact that we'll just have
> more basic blocks and that, in itself, could lead to an increase
> in compile time. However, working through these issues will likely
> leave us with a more-robust optimizer.
>
> See also: https://bugs.llvm.org/show_bug.cgi?id=34603#c19
>
> Canonicalizing to phis is my preference too.
>
>
> -Hal
>
>
> Philip
>
> On 08/14/2018 12:39 PM, Sanjay Patel via llvm-dev wrote:
>
> I didn't look closely at the new patch, but this appears to be a
> small extension to:
>
> https://reviews.llvm.org/D38566
>
> ...and the GVN-based reasons for delaying transformation to
> 'select' are discussed in detail in the motivating bug for that patch:
>
> https://bugs.llvm.org/show_bug.cgi?id=34603
>
> So this sounds like the right direction to me. Note that there was
> objection to the implementation (a pile of pass options vs.
> uniquely named passes).
>
> Here's another motivating bug where early transform to select
> prevents optimization:
>
> https://bugs.llvm.org/show_bug.cgi?id=36760
>
> Is that case affected by this patch?
>
> On Tue, Aug 14, 2018 at 11:17 AM, John Brawn via llvm-dev
> <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>
> Summary
> =======
>
> I'm planning on adjusting SimplifyCFG so that it doesn't turn
> two-entry phi
> nodes into selects until later in the pass pipeline, to give
> passes which can
> understand phis but not selects more opportunity to optimize. The
> thing I'm
> trying to do which made me think of doing this is described below,
> but from the
> benchmarking I've done it looks like this is overall a good idea
> regardless of
> if I manage to get that done or not.
>
> Motivation
> ==========
>
> My goal is to get clang to optimize some code containing a call to
> std::min_element which is dereferenced, so something like:
>
> float min_element_example(float *data, int size)
> {
> return *std::min_element(data, data+size);
> }
>
> which, after inlining a specialization, looks like:
>
> float min_element_example_inlined(float *first, float *last)
> {
> for (float *p = first; p != last; ++p)
> {
> if (*p < *first)
> first = p;
> }
> return *first;
> }
>
> There are two loads in the loop, *p and *first, but actually the
> load *p can be
> eliminated by using either the previous load *p or the previous
> *first,
> depending on if the if-condition was taken or not. However the
> "if (*p < *first) first = p" gets turned by simplifycfg into a
> select and this
> makes optimizing this a lot harder because you no longer have
> distinct paths
> through the CFG.
>
> I have some ideas on how to do the optimization (see my previous
> RFC "Making GVN
> able to visit the same block more than once" posted in April,
> though I've
> decided that the specific idea presented there isn't the right way
> to do it),
> but I think the first step is to make sure we don't have a select
> when we try
> to optimise this.
>
> Approach
> ========
>
> I've posted a patch to https://reviews.llvm.org/D50723
> <https://reviews.llvm.org/D50723> showing what I'm
> intending to do. An extra parameter is added to SimplifyCFG to
> control whether
> two-entry phi nodes are converted into select, and this is set to
> false in all
> instances before the end of module simplification. At the end of
> module
> simplification we do SimplifyCFG, then Instcombine to optimise the
> selects that
> are introduced, then EarlyCSE to eliminate common subexpressions
> introduced by
> instcombine.
>
> Is scheduling another simplifycfg, instcombine + earlycse pass really
> necessary? I’m concerned about the compile time impact.
>
> Cheers,
>
> Amara
>
>
> Benchmark Results
> =================
>
> These are performance differences reported by LNT when
> running llvm-test-suite,
> spec2000, and spec2006 at -O3 with and without the patch
> linked above (using
> trunk llvm from a week or so ago).
>
> AArch64 results on ARM Cortex-A72:
>
> Performance Regressions - execution_time
> Change
> SingleSource/Benchmarks/Shootout/Shootout-ary3 9.48%
> MultiSource/Benchmarks/TSVC/Packing-flt/Packing-flt 3.79%
> SingleSource/Benchmarks/CoyoteBench/huffbench 1.40%
>
> Performance Improvements - execution_time
> Change
> MultiSource/Benchmarks/TSVC/Searching-dbl/Searching-dbl
> -23.74%
> External/SPEC/CINT2000/256.bzip2/256.bzip2
> -9.82%
> MultiSource/Benchmarks/TSVC/Searching-flt/Searching-flt
> -9.57%
> MultiSource/Benchmarks/TSVC/Equivalencing-flt/Equivalencing-flt
> -4.38%
> MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt
> -3.94%
> MultiSource/Benchmarks/TSVC/Packing-dbl/Packing-dbl -3.44%
> External/SPEC/CFP2006/453.povray/453.povray
> -2.50%
> SingleSource/Benchmarks/Adobe-C++/stepanov_vector -1.49%
>
> X86_64 results on Intel Xeon E5-2690:
>
> Performance Regressions - execution_time Change
> MultiSource/Benchmarks/Ptrdist/yacr2/yacr2 5.62%
>
> Performance Improvements - execution_time Change
> SingleSource/Benchmarks/Misc-C++/Large/sphereflake -4.43%
> External/SPEC/CINT2006/456.hmmer/456.hmmer -2.50%
> External/SPEC/CINT2006/464.h264ref/464.h264ref -1.60%
> MultiSource/Benchmarks/nbench/nbench -1.19%
> SingleSource/Benchmarks/Adobe-C++/functionobjects -1.07%
>
> I had a brief look at the regressions and they all look to
> be caused by
> getting bad luck with branch mispredictions: I looked into
> the Shootout-ary3 and
> yacr2 cases and in both the hot code path was the same
> with and without the
> patch, but with more mispredictions probably caused by
> changes elsewhere.
>
> John
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
>
>
> _______________________________________________
>
> LLVM Developers mailing list
>
> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
>
>
>
> _______________________________________________
>
> LLVM Developers mailing list
>
> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
>
> --
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180912/3f6912b6/attachment-0001.html>
More information about the llvm-dev
mailing list